University of Missouri, St. Louis IRL @ UMSL

Dissertations UMSL Graduate Works

5-15-2019 Phylogeographic Relationships of Coereba flaveola and Their alM aria Parasites Meghann Humphries University of Missouri-St. Louis, [email protected]

Follow this and additional works at: https://irl.umsl.edu/dissertation Part of the Population Biology Commons

Recommended Citation Humphries, Meghann, "Phylogeographic Relationships of Coereba flaveola and Their alM aria Parasites" (2019). Dissertations. 880. https://irl.umsl.edu/dissertation/880

This Dissertation is brought to you for free and open access by the UMSL Graduate Works at IRL @ UMSL. It has been accepted for inclusion in Dissertations by an authorized administrator of IRL @ UMSL. For more information, please contact [email protected]. Phylogeographic Relationships of Coereba flaveola and Their Malaria Parasites

By

Meghann Birk Humphries

Master of Natural Science – Southeast Missouri State University, 2008 Bachelor of Science, Biology – Southeast Missouri State University, 2005

A Dissertation Submitted to The Graduate School of the

UNIVERSITY OF MISSOURI – SAINT LOUIS

in partial fulfillment of the requirements for the degree

DOCTOR OF PHILOSOPHY

In Biology with an emphasis in Ecology, Evolution, and Systematics

August 2019

Advisory Committee

Robert E. Ricklefs, Ph.D. (Advisor)

Patricia G. Parker, Ph.D.

Nathan Muchhala, Ph.D.

Carlos Botero, Ph.D.

Humphries, Meghann B., 2019, UMSL, pg. 2

For the Birk and Humphries families, on whose shoulders I stand.

2

Humphries, Meghann B., 2019, UMSL, pg. 3

TABLE OF CONTENTS Acknowledgments ...... 4 Dissertation Abstract ...... 6 Chapter 1 ...... 9 Chapter 2...... 52 Chapter 3...... 89

3

Humphries, Meghann B., 2019, UMSL, pg. 4

Acknowledgments

First, I thank my advisor and mentor, Dr. Robert Ricklefs, for seeing potential in me and for your endless patience. It is the privilege of a lifetime to learn from such a brilliant and generous ecologist, I will be forever grateful for the opportunity. To Dr. Patricia Parker, thank you for the many times you saved the day, for genuinely celebrating the high times and for offering consolation and encouragement during the low times. You are an inspiration in so many ways. To Drs. Nathan Muchhala and Carlos Botero, thank you for your generous advice; your perspectives and guidance have made me a better biologist and mentor.

To Dr. Letícia Soares, who taught me how to work in the field, museum, and lab, answered endless questions, shared the best field snacks (and kept the gross ones to herself), and became a treasured friend, thank you. To Emma Young, my friend and confidant, your energy and vision have been a sustaining inspiration. Thank you for commiserating, uplifting, and celebrating. To Isabel Loza, whose encouragement, thoughtful advice, and hugs I could always count on, thank you. Thank you to Dr. Maria

(Lua) Pil, who put the very first pipette in my hand and gave me my start in population genetics. To Dr. Samoa Asigau, source of encouragement, perspective, and co- conspirator in schemes for our futures, thank you for all you shared over cups of coffee.

To the rest of my friends and colleagues too numerous to mention, thank you for building a community that feels like a family, that is demanding but never demeaning, and that allows each member the safety and support to find their voice. I am so proud to have been part of this institution.

4

Humphries, Meghann B., 2019, UMSL, pg. 5

To my mother, you never saw a limit to what I could achieve or a barrier I couldn’t overcome. You taught me to have big dreams and modeled the courage and persistence to pursue them. You are the strongest person I have ever met. To my brother, ever a source of confidence and love, your pep talks and unflagging belief in me kept me going. I don’t tell you enough what an amazing man you are. The two of you were my first and fiercest champions, and this long-held dream was sustained by your love and support. Thank you doesn’t seem enough. To my father, I carry you with me and think of you often. I strive to honor your legacy as I move through my own life. To the rest of my family who lived such brief but spectacular lives, I carry on with you in my heart and your voices in my memory, I know you were with me too.

To my wife Julie, for your boundless and perfect love, your calm through the storm, your confidence in my eventual success, and your taste for a good adventure, thank you will never be enough. Now again we plunge into the caverns of tomorrow but, as always, I know you’ll bring the flashlights.

To my son Finnick, a light in the dark since the moment you were born and my respite when weariness overcomes me, thank you for the joy you continue to give me and your mama. Perhaps my highest hope is to make you proud, Sunshine.

5

Humphries, Meghann B., 2019, UMSL, pg. 6

Dissertation Abstract

The broad aim of this dissertation is to investigate the phylogeographic relationships of and their avian malaria parasites in the West Indies, and to determine whether periods of geographic expansion and contraction of host populations are related to either parallel or complementary phases in the parasites.

I begin with an investigation of the demographic history and structure of the bananaquit (Coereba flaveola) population within the Greater Antillean island of Puerto

Rico, which is the apparent island source of several expansions of the species into the

Lesser Antilles. I investigate how these phases of geographic expansion ( cycles) of bananaquits within the West Indies relate to the structure of the bananaquit population on

Puerto Rico and describe the recent demographic history of this population inferred from contemporary genetic variation. I also compare the genetic structure of the bananaquit population to that of the closely related Puerto Rican black-faced grassquit (Tiaris bicolor), and to the population of bananaquits on in order to differentiate forces driving population structure and expansion that are intrinsic to the species from those that are intrinsic to the island. This study revealed similar signatures of population differentiation and expansion for both C. flaveola and T. bicolor on , but not for C. flaveola on Jamaica. These findings indicate that both island and taxon effects influence these demographic changes, but what initiates a period of expansion remains elusive.

Coevolutionary outcomes with enemies is suspected to be related to the observed demographic changes in bananaquits and other avian species in the West Indies, as susceptibility to antagonists is related to both host species-specific characteristics and to

6

Humphries, Meghann B., 2019, UMSL, pg. 7 geographical location. Accordingly, in the second part of this dissertation I investigate an abundant and widespread avian antagonist: the parasites causing avian malaria.

The ubiquity of avian malaria in the global avifauna makes it an accessible pathogen to investigate how the geographic distribution of parasite genetic diversity relates to patterns of virulence, the evolution of host-resistance, and the population structure and demography of hosts. I used contemporary genetic variation of a highly polymorphic nuclear gene that interacts directly with the host immune system during red blood cell invasion to characterize population structure and demographic history. I found evidence of genetic differentiation in association with a single host and among some locations, and evidence of mitochondrial introgression from one of the lineages into a second lineage. These findings have significant implications for the practice of defining parasite lineages based on mitochondrial genetic variation. Moreover, pairwise comparisons of genetic differentiation illustrate complex patterns of parasite population structure among hosts and locations which are likely influenced by migratory and vector spatial dynamics. Further investigation of both host-generalist and -specialist parasite lineages is needed to assess the influence of these interactions on host demographic cycling, but findings here provide a promising genetic marker to continue these pursuits.

In the final chapter, I return to the genetic characteristics of hosts to investigate a component of the host genetic background that has been linked to malaria resistance and tolerance in humans: hemoglobin variation. Functionally consequential variation in the globin subunits of hemoglobin can be the result of as little as a single nucleotide polymorphism, and intraspecific variation at these loci has been documented in birds across altitudinal gradients. I hypothesized that if hemoglobin variants confer resistance

7

Humphries, Meghann B., 2019, UMSL, pg. 8 or tolerance to malaria parasites in birds as they are known to do in humans, then they may explain differences in parasitism among island populations and may relate to demographic cycling of hosts. I assessed genetic variation in a postnatally expressed avian globin gene (αA -globin) in C. flaveola populations and tested whether this variation is associated with differences in parasitism among different lineages of parasites. I found no association between αA -globin haplotype and infection by particular parasite lineages among all locations, nor any protective association between globin haplotype frequency and the proportion of individuals infected within populations. Though further investigation of the alpha subunit of β-globin may yet reveal a role of hemoglobin variation in avian malaria infection, findings reported here suggest that there is no such benefit conferred by αA -globin variation. However, phylogeographic structure and genetic variation at the αA -globin locus, including a highly variable intron, is largely concordant with the mitochondrial cytochrome-b locus for these same populations, supporting it as a locus with potential application in biogeographic analyses.

8

Humphries, Meghann B., 2019, UMSL, pg. 9

Chapter 1

Historical demography of Coereba flaveola on Puerto Rico

Meghann B. Humphries,1,*, Maria W. Barbosa De Oliveira Pil,2 Steven C. Latta,3, Peter

P. Marra,4, and Robert E. Ricklefs1

1 Department of Biology, University of Missouri–Saint Louis, St. Louis, Missouri, USA

2 Federal University of Pernambuco, Zoology Department, Recife, PE,

3 National Aviary, Pittsburgh, Pennsylvania, USA

4 Migratory Center, Smithsonian Conservation Biology Institute, National

Zoological Park, Washington, D.C., USA

The Auk (2019) Vol. 32:11–16.

ABSTRACT We use contemporary genetic diversity to characterize the within-island population structure and historical demography of bananaquits (Aves: Thraupidae: Coerebinae:

Coereba flaveola) on the West Indian island of Puerto Rico (Greater Antilles). We relate periods of population expansion, from Puerto Rico across the , to the genetic architecture of the source population, and describe differentiation of populations within Puerto Rico. Lastly, we report comparable analyses of populations of bananaquits on Jamaica and of a related species, the black-faced grassquit (Coerebinae: Tiaris bicolor), on Puerto Rico. We found differentiation among contemporary populations of bananaquits within Puerto Rico and signatures of renewed demographic expansion in

9

Humphries, Meghann B., 2019, UMSL, pg. 10 eastern Puerto Rico beginning approximately 100 thousand years ago (ka) and, in the western portion of the island, approximately 40 ka. Populations of T. bicolor on Puerto

Rico exhibit similar structure to bananaquits, while bananaquits on Jamaica exhibit no differentiation among locations. Both T. bicolor and Jamaican C. flaveola provide mixed evidence of demographic expansion.

Keywords: Coereba flaveola, historical demography, population differentiation, population genetics, Puerto Rico, within-island differentiation

INTRODUCTION

Expansion and contraction of populations of birds in the West Indies have been investigated for some time (Ricklefs and Cox 1972), though the causes of these cycles remain unknown. In an effort to better understand the mechanisms that drive the so-called

“taxon cycle” (Wilson 1959, 1961), we investigated the demographic history and structure of the bananaquit (Coereba flaveola) population within the Greater Antillean island of Puerto Rico. Bananaquits are small, mostly nectarivorous birds that are widespread and abundant in the West Indies (except ), and in Central and South

America (Raffaele et al. 2010, Bellemain et al. 2012). The species has experienced a dynamic history, as suggested by the 41 recognized , although not all subspecies occur in the West Indies, and not all West Indian subspecies were supported by a previous mitochondrial RFLP analysis (Seutin et al. 1994). In contrast to the typical pattern of colonization of islands from the mainland, bananaquits originated in the

Greater Antilles and colonized continental from an island source

10

Humphries, Meghann B., 2019, UMSL, pg. 11

(Bellemain et al. 2008). Colonization of , on the Yucatan Peninsula of southern , appears to have occurred from through Cuba (Bellemain et al. 2008), though no established bananaquit populations currently exist on Cuba

(Raffaele et al. 2010). Since the earliest node in the bananaquit phylogeny supported by genetic data, 1.75 – 4 million years ago, bananaquits have undergone several phases of geographic expansion throughout the West Indies; Puerto Rican populations were the source of at least two previous expansions through the Lesser Antilles (Seutin et al. 1994,

Bellemain et al. 2008). A more recent expansion originating in the Lesser Antilles ca. 41 ka has extended northward and westward as far as the British Virgin Islands (BVI), but has not reached neighboring Puerto Rico (Pil 2015), in spite of a land connection exposed by low sea levels between Puerto Rico and BVI as recently as 8 ka (Pregill and Olson

1981).

The historical biogeography of the bananaquit in the is consistent with the taxon-cycle hypothesis, which describes alternating phases of expansion and contraction of species populations and geographic distributions (Wilson 1961, Ricklefs and Cox

1972, Ricklefs and Bermingham 2002). These phases are usually accompanied by shifts in habitat distribution, dispersal ability, and population density (Ricklefs and Cox 1978); they appear to be independent of change in the environment, in that related species might be in different stages of the cycle at the same time (Ricklefs and Cox 1972). Early-stage species undergo range expansion, typically have large populations, and tend to occupy marginal habitats (coastal or open lowlands) on recently colonized islands (Wilson 1961,

Ricklefs and Cox 1972), though some species of birds apparently have dispersed between islands through forest habitats (Dexter 2010). Following range expansion, populations

11

Humphries, Meghann B., 2019, UMSL, pg. 12 begin to differentiate among islands within their new ranges. Late-stage species undergo range contraction, become increasingly differentiated among islands, exhibit gaps in their geographic distributions (i.e., become extinct on some islands within archipelagos), and persist in interior habitats, generally mature forest. The cycle can continue from late-stage contraction via re-expansion, or it can end in global extinction (Wilson 1961, Ricklefs and Cox 1972, Ricklefs and Bermingham 2002).

Little is known about the processes that drive taxon cycles, although coevolutionary outcomes with predators and pathogens, and competitive pressure from new colonists, are likely factors (Ricklefs and Cox 1972). A better understanding of these population dynamics, both among and within islands, would inform research on species distributions and the assembly of ecological communities more generally. These processes may also have important implications for understanding how species respond to climate change, their vulnerability to extinction, and their resistance to invasive species (Ricklefs and Cox

1972). For example, late-stage species, with contracted ranges and reduced populations sizes, are likely to be less resilient than early-stage species to disturbance and habitat loss.

Genetic differentiation between and within island populations of a species allow one to estimate population divergence times and changes in population size. Island populations of species in different stages of the taxon cycle should exhibit different signatures of population expansion or contraction in haplotype networks and genetic diversity statistics. For example, populations of species in earlier stages of the taxon cycle, which are currently widespread or have recently expanded, would be expected to have lower nucleotide diversity and star-like patterns of relationship in haplotype networks (Cook, Pringle, & Hughes, 2008, Pil, 2015). Moreover, populations on islands

12

Humphries, Meghann B., 2019, UMSL, pg. 13 colonized in a recent geographic expansion should share haplotypes differing by few mutational steps, or none, from a common ancestral haplotype, reflecting a relatively short time since colonization by a small number of individuals (Avise 2000). In contrast, within islands, late-stage species should exhibit complex haplotype networks with many differences in pairwise nucleotide comparisons, as well as gaps resulting from the random loss of haplotypes.

Because mitochondrial markers have a four-fold lower effective population size compared to nuclear markers, random processes can cause mitochondrial haplotype frequencies to diverge more quickly and provide more information about population history. However, other processes, such as sex-biased dispersal and selection, also influence genetic differentiation, and so we report both mitochondrial and nuclear patterns of variation.

In the present study, we investigate how phases of geographic expansion of bananaquits within the West Indies relate to the structure of the bananaquit population on

Puerto Rico, and we describe the recent demographic history of this population inferred from contemporary genetic variation. Finally, we compare the genetic structure of the

Puerto Rican bananaquit population to that of the closely related black-faced grassquit

(Tiaris bicolor) on Puerto Rico, as well as to the genetic structure of the population of bananaquits on Jamaica. By comparing Puerto Rican bananaquits to a different species on the same island and to the same species on a different island, we endeavor to differentiate processes intrinsic to the species from those intrinsic to the island.

13

Humphries, Meghann B., 2019, UMSL, pg. 14

METHODS

Birds were captured with mist-nets, and ca. 10 µL of blood were obtained via sub- brachial venipuncture (field techniques described in Latta and Ricklefs (2010). Five locations were sampled on Puerto Rico (Figure 1): El Yunque (EY) in the Caribbean

National Forest, a moist, mid-elevation broadleaf forest in the northeastern part of the island; Refugio Boqueron (RB), in the southwest, a mangrove forest with open water and cattail marsh; Guanica Forest (GF), a lowland dry forest in the south; Casa de la Selva,

Carite forest (C), a mid-elevation second growth forest in the east; and Fajardo (F), an eastern coastal lowland area with second growth and agriculture. Samples were obtained between January 2001 and February 2003, except for C and F, which were sampled in

October 1993 (Table 1). The EY, RB, GF, F, and C samples included 31, 37, 30, 21, and

6 bananaquits, and 5, 9, 23, 13, and 0 black-faced grassquits, respectively. Jamaica was sampled in January and February of 2003 at six locations including coastal forest, second growth forest, and agricultural land (Table 1, Figure 2). Sample locations and accompanying sample sizes for bananaquits are Log I (13), Log II (8), WRC (7), Portland

Bight (18), Long Mountain (6), and Gentles Coleyville Coffee (34) (Table 1).

We extracted DNA from blood using the isopropanol precipitation technique described in Svensson and Ricklefs (2009) and amplified and sequenced two mitochondrial genes and one nuclear intron. The partial mitochondrial gene cytochrome b (CYTB, 626 bp) was amplified using primers L14990 (Kocher et al. 1989) and HCERC

(Pil 2015). The partial NADH dehydrogenase subunit II (ND2, 634 bp) gene was amplified using primers H6313 and L5219 (Johnson and Sorenson 1998). Mitochondrial genes were concatenated for a total of 1260 bp. Intron 8 of the nuclear gene Vimentin

14

Humphries, Meghann B., 2019, UMSL, pg. 15

(VIM) was amplified only for Puerto Rican bananaquits using primers VIMF and VIMR

(Kimball et al. 2009), yielding 477 bp of sequence. All PCR reactions were conducted in

25 µL total volumes with 1 µL of genomic DNA, 0.5 µL of each 10 µM primer, 0.125 µL

TM Takara Taq polymerase, 10x buffer, 10 mM dNTP, 25 mM MgCl2, and ddH2O.

Thermocycling conditions for CYTB and ND2 were as follows: 94˚C for 2 min (95˚C for

ND2), 35 cycles of 94˚C for 30 s (45 s for ND2), 52˚C for 45 s, and 72˚C for 1 min, with a 10-minute extension time at 72˚C. Thermocycling conditions for VIM were: 94˚C for 3 min, 35 cycles of 95˚C for 30 s, 57˚C for 1 min, 70˚C for 1 min, with a 1-hour extension time at 70˚C.

Negative controls were included in each PCR reaction, and products were verified by visualization on 1% TBE agarose gels with ethidium bromide. PCR products were sequenced by Beckman-Coulter Genomics (Danvers, MA) and Eurofins Genomics

(Louisville, KY). Contigs were aligned and edited in Bioedit 7.2.5 (Hall 1999) and

Mega6 (Tamura et al. 2013), and chromatograms were checked by eye to confirm polymorphisms. Nuclear sequences were coded for ambiguous bases in Sequencher v.5.4.1 (available at http://www.genecodes.com), confirmed by eye, and phased for heterozygosity in DNAsp 5 (Librado and Rozas 2009). All unique sequences were deposited in Genbank (Accession numbers MG020850 –MG020983).

To determine whether populations were differentiated among sampling locations, fixation indices (pairwise FST values; Excoffier and Lischer 2010) were calculated separately for the concatenated mitochondrial alignment and the nuclear alignment, with confidence intervals based on 1000 permutations (Arlequin 3.5; Excoffier and Lischer

2010). A median-joining haplotype network (Bandelt et al. 1999) was created using

15

Humphries, Meghann B., 2019, UMSL, pg. 16

PopArt (available at http://popart.otago.ac.nz) to compare and visualize relationships between haplotypes at each sample location. Summary statistics for genetic variation in populations were calculated separately for the mitochondrial and nuclear sequences using

DNAsp 5 (Librado and Rozas 2009). These statistics include number of segregating sites

(S), number of haplotypes (h), haplotype diversity (Hd), and nucleotide diversity (π). Hd is the probability that two haplotypes chosen at random differ (Nei 1987), while π is the average number of nucleotide differences per site for two sequences chosen at random

(Nei and Li 1979). Summary statistics were calculated for monophyletic populations identified by fixation indices and for a pooled dataset containing all individuals.

AMOVA was performed to estimate the proportion of total variation within and between populations (Arlequin 3.5, Excoffier and Lischer 2010).

Neutrality tests (Tajima’s D (Tajima 1989) and Fu’s Fs (Fu 1997)) were calculated for each sampling location as indicators of recent or ongoing population expansion (Arlequin 3.5, Excoffier and Lischer 2010). Mismatch analysis, including the sum of squared deviations (SSD) and Harpending’s raggedness index (r), was performed using Arlequin 3.5 (Excoffier and Lischer 2010). Mismatch distributions represent pairwise differences between haplotypes. Frequency histograms of these differences tend to be smooth and unimodal when populations have experienced recent expansion. In contrast, species at long-term equilibrium tend towards multimodal (“ragged”) distributions (Slatkin and Hudson 1991, Rogers and Harpending 1992, Harpending

1994). SSD tests whether a mismatch distribution differs significantly from the null prediction of a unimodal distribution, while r tests whether a mismatch distribution differs significantly from the null model of population expansion (Harpending 1994).

16

Humphries, Meghann B., 2019, UMSL, pg. 17

Lastly, we used BEAST2 and its applications (Bouckaert et al. 2014) to generate

Bayesian Skyline Plots (BSP) (Drummond et al. 2005), which estimate change in effective population size through time. The best nucleotide substitution model was HKY

+ I, determined using jModelTest2 2.1.6 (Guindon and Gascuel 2003, Darriba et al.

2015). We assumed a strict molecular clock with a nucleotide substitution rate of 1.4 % per million years for CYTB and 2.9% per million years for ND2 (Lerner et al. 2011); proportions of invariant sites were estimated by BEAST2. Ten million generations were computed and sampled every 1,000th generation. Results were checked for convergence by viewing Effective Sample Size (ESS) values. Plots were generated individually for locations with significant genetic differentiation from all other locations; as a result, eastern locations were pooled into a single population for this analysis.

RESULTS

Puerto Rican bananaquit population structure

Fixation indices based on the concatenated mitochondrial dataset (Table 2) indicated no differentiation between Carite (C) and El Yunque (EY) or between Carite and Fajardo

(F), but significant differentiation in all other pairwise comparisons. AMOVA results indicated that 26% of the variation in the mtDNA sequences occurs between populations and 74% within populations (Table 3). In contrast to the mitochondrial data, the nuclear sequences did not exhibit significant population differentiation (Table 2), and the

AMOVA analysis of the nuclear sequences indicated that more than 99% of the variation resided within populations (Table 3).

17

Humphries, Meghann B., 2019, UMSL, pg. 18

The mitochondrial haplotype network (Figure 3; individual population haplotype networks Figure 4) revealed clustering as well as a star-like pattern for RB, indicative of recent expansion. GF also exhibited a star-like pattern, but haplotypes did not cluster as distinctly as in RB; GF additionally shared several haplotypes with eastern populations.

The eastern populations exhibited a complex network with multiple nucleotide differences between haplotypes, indicative of greater age and resulting loss of intervening haplotype sequences. In contrast to the mitochondrial haplotype network, the nuclear haplotype network (Figure 5) revealed a pattern of shared haplotypes among all locations with few nucleotide differences.

Puerto Rican bananaquit genetic diversity

Comparisons of mitochondrial (mtHd) and nuclear (nuHd) haplotype diversity revealed similar values in RB and GF for both markers, and similar values for C, EY, and F for mitochondrial markers. Eastern populations (C, EY, F) had higher mtHd than RB or GF, but nuHd showed no similar clumping (Table 4). Nucleotide diversity (π) estimates were higher for nuclear sequences (nuπ) than for mitochondrial sequences (mtπ) in all populations (Table 4). Measures of mtπ were again higher for C, EY, and F than for RB and GF, which were similar to each other, and nuπ estimates again revealed no clumping.

Because haplotype diversity was similar between populations, the higher π estimates for a given population indicated that non-shared haplotypes exhibit more nucleotide differences. This suggests deeper mitochondrial divergence within the eastern populations than within the GF and RB populations.

18

Humphries, Meghann B., 2019, UMSL, pg. 19

Estimates of π for nuclear sequences should exceed those for mitochondrial sequences due to the fourfold higher effective population size of nuclear genes. To determine whether ratios of nuπ/ mtπ differed significantly from 4 for our samples, we generated a random distribution using the estimated π and associated standard deviation for each marker in each population. We then calculated nuπ/mtπ ratios for each with 1000 permutations and determined the mean and standard deviation of these ratios. In all cases,

4 falls within our confidence interval and so we determined that the ratio of nuπ/mtπ did not differ from 4 for any population.

Puerto Rican bananaquit demography

Neutrality tests based on mitochondrial sequences indicated significant negative departures from neutrality for EY, GF, and RB (Table 4) and a significant negative departure from neutrality for F detected only by Fu’s Fs. In contrast, Tajima’s D for nuclear sequences detected significant negative departures for C, EY, and RB but not F or

GF, while Fu’s Fs detected a significant negative departure for GF but no other populations. Negative departures from neutrality can indicate either population expansion or purifying selection (Tajima 1989).

SSD and raggedness (r) were not significant for any populations except RB

(Figure 6, Table 5), and so the null hypothesis of population expansion cannot be rejected for GF, F, EY, or C. These results corroborate the neutrality test results indicating recent expansion in GF, EY, and F but indicate mixed support for expansion in RB.

Population comparisons

19

Humphries, Meghann B., 2019, UMSL, pg. 20

We found contrasting patterns in the comparative analyses of bananaquits on Jamaica and black-faced grassquits on Puerto Rico. Jamaican bananaquits exhibited no significant population fixation indices (Fst) among locations (Table 6) so we pooled sample locations for further analysis. Both Tajima’s D and Fu’s Fs were significantly negative for the pooled locations (Table 6). These results together suggest island-wide demographic expansion. The haplotype network corroborated these findings with star-like clusters; most haplotypes were only a single mutational step from their nearest neighbor (Figure

7). In contrast, SSD indicated that the observed mismatch distributions are marginally significantly different from the null model of population expansion. The raggedness index (r) indicated no significant difference from the null model (Figure 6, Table 5).

These findings suggest possible island-wide expansion, but with low statistical support.

Fixation indices for black-faced grassquits sampled on Puerto Rico were significant in all pairwise comparisons except between EY and GF (Table 7). Hd and π were similar for all locations, Fu’s Fs was significantly negative for all locations, but only GF exhibited significant negative departure for Tajima’s D (Table 8). The haplotype network (Figure 8) reveals similar patterns for each location, namely that most haplotypes are 1 to 3 mutational steps from a common haplotype.

DISCUSSION

Our analysis of mitochondrial diversity patterns revealed population differentiation and signatures of recent or ongoing demographic expansion of bananaquits within Puerto

Rico. Neutrality tests indicated that four populations (EY, GF, F, and RB) are expanding;

Mismatch Distributions (and associated measures SSD and r) corroborate these findings

20

Humphries, Meghann B., 2019, UMSL, pg. 21 for EY, GF, and F, and additionally support expansion in C. Fixation indices determined that all locations are differentiated from all others except between C/EY and C/F.

Differentiation between RB and GF is especially notable because both sites are at low elevation and separated by only 30 kilometers, although they are strikingly different ecologically. In contrast, similar analysis of nuclear diversity patterns reveal no signature of differentiation by location.

Disagreement between nuclear and mitochondrial markers reflects the shallow node depth of this analysis. Resolution of recent nodes is unlikely with nuclear markers because of the four-fold slower rate of random nucleotide substitution resulting from the four-fold larger effective population size (Bellemain et al. 2008, Zink and Barrowclough

2008, Toews and Brelsford 2012). As a result neutral variation at nuclear markers, such as in nuclear introns, acts as lagging indicator of demographic processes and reflects demographic events in the intermediate past (Zink and Barrowclough 2008). Summary statistics for our data show fewer haplotypes with deeper divergences for the nuclear marker compared to the mitochondrial marker. Thus, the population expansion detected in the mitochondrial markers is likely too recent to be evident in the nuclear marker, which suggests the presence of a single undifferentiated population of bananaquits on

Puerto Rico in the intermediate past.

Alternatively, discrepancies between the two markers could reflect different dispersal behavior of the sexes (Toews and Brelsford 2012). If females typically remained near their birthplaces while males dispersed, individual movements could effectively homogenize the nuclear genome (biparental inheritance) while allowing the mitochondrial genome to differentiate spatially (maternal inheritance). species

21

Humphries, Meghann B., 2019, UMSL, pg. 22 generally exhibit female-biased dispersal (Greenwood 1980, Clarkeet al. 1997, Athrey et al. 2012, Cox and Kesler 2012), though reversals of the female-biased dispersal trend are known to occur (especially in cooperative breeders) (Crochet et al. 2003, Harrisonet al.

2014). For bananaquits, specifically, information is lacking on dispersal, although

Wunderle (1981) showed that males and females of this species moved about the same distance on average between years, while females moved greater distances within years.

Population differentiation within species of birds has not previously been recognized on Puerto Rico but within-island differentiation is common among organisms less mobile than birds. For example, different montane areas on Puerto Rico support genetically distinct populations of the frogs Eleutherodactylus antillensis and E. coqui, the result of an historic dispersal barrier (Velo-Antón et al. 2007, Barker et al. 2012).

Bananaquits do exhibit a well-documented propensity for variability at fine spatial scales, demonstrated by the morphological cline in plumage coloration on (Wunderle

1981) and in body size on Jamaica (Diamond 1973). However, examples of within-island genetic differentiation of avian populations in the West Indies have been limited to the forest thrush (Turdus lherminieri) on Guadaloupe (Arnoux et al. 2014), the Jamaican streamer-tailed (Trochilus polytmus) (Lance et al. 2009), and several species on the larger and more rugged island of where paleogeographic history as separate island blocks likely allowed allopatric divergence within genera. (e.g.,

Townsend et al. 2007, Sly et al. 2011). Beyond the West Indies, examples of differentiation at small spatial scales are known from finches on the Tristan da Cunha

Islands in the South Atlantic (Ryan et al. 2007), the Mascarene gray white-eye on

Réunion (Milá et al. 2010), and sunbirds on Mindanao in the Philippines (Hosner et al.

22

Humphries, Meghann B., 2019, UMSL, pg. 23

2013). Differentiation at loci associated with divergent shapes has also been documented for Darwin’s finches between locations on Santa Cruz Island, Galápagos, though these populations exhibit no signature of differentiation at neutral markers (De

León et al. 2010).

Haplotype networks of mitochondrial genes provide additional information about each population. The haplotype networks of EY and F show signatures of old populations, having several haplotypes that differ by several mutational steps, while the

RB haplotype network has the aspect of a young population, with most haplotypes separated by only a single mutational step. The GF population has a signature similar to

RB but with more haplotypes differing by more than one mutational step. This pattern suggests that the GF population is older than the RB population, owing to the longer time required to accumulate additional mutations, but not as old as the eastern populations.

Bayesian skyline plots (Figure 9) indicated that population expansion in the pooled eastern locations (EY, F, and C) began approximately 100 ka and, in GF and RB, approximately 40 ka. The last estimated inter-island expansion of bananaquits (Bellemain et al. 2008) occurred 240 ka before the eastern expansion, and 300 ka before that within

GF and RB.

Comparative analyses of T. bicolor on Puerto Rico reveal geographical structure similar to that of bananaquits and mixed evidence of population expansion (Figure 8,

Tables 7 and 8), though small sample sizes preclude a high level of confidence in these conclusions. In contrast, analyses of bananaquits on Jamaica revealed no within-island population differentiation and marginal support for recent island-wide population expansion (Figures 7 and 8, Table 6). The absence of population differentiation on

23

Humphries, Meghann B., 2019, UMSL, pg. 24

Jamaica is surprising. Jamaica and Puerto Rico are similar in latitude, shape, size, and topographic complexity, with Jamaica being slightly larger and having a somewhat higher central mountain range. Moreover, relative proportions of land cover type are similar (Figure 10, data source in Appendix 1). The Jamaican bananaquit population also has a history of island colonization: Bellemain et al. (2008) found evidence of an expansion colonization event from Jamaica to Hispaniola soon after the origin of the species, as well as more recent mitochondrial introgression in the same direction.

Differences in the structure and demography of populations of the same species on different islands, as in comparisons of C. flaveola on Puerto Rico and Jamaica, indicate that driving factors are related to geographic location or are intrinsic to particular populations. The similarity of within-island population trends of ecologically dissimilar but phylogenetically related species, as in comparisons between C. flaveola and T. bicolor, is unexpected. It might be relevant that populations of both species on islands of the Lesser Antilles are inversely related to the prevalence of two haemosporidian parasite lineages (OZ01 and OZ12) (Ricklefs et al. 2016), both of which are present, but currently uncommon, on Puerto Rico (Ricklefs, personal observation).

Conclusions

The bananaquit’s demographic and geographic history has been characterized by cyclic expansion and contraction of populations, independently of changes in other species or changes in the environment. Understanding factors governing these cycles as well as their impact on ecological communities remains a challenge in population ecology. It has long been established that as phases of expansion give way to contraction, populations

24

Humphries, Meghann B., 2019, UMSL, pg. 25 differentiate among islands in archipelagos (Ricklefs and Bermingham 2002). Using approaches to analyze contemporary genetic diversity, we determined that populations can additionally differentiate within islands, even in close proximity to one another. Our work adds to the growing body of evidence that population differentiation can occur at very small scales even among highly mobile organisms. However, what initiates a period of expansion still eludes us. Environmental drivers are unlikely because ecologically similar species may be in different phases of the taxon cycle at any given time. Ricklefs and Cox (1972) assessed this possibility, noting that neither ecological characteristics nor phylogenetic relationships are associated with the relative positions of species in the taxon cycle. A possible explanation for this independence might be release of populations from predators or parasites following the evolution of escape or resistance mechanisms.

Susceptibility to antagonists is related to both species-specific characteristics as well as geographical location, as suggested here, but future research will be necessary to determine the degree to which escape mechanisms govern these patterns of expansion and contraction.

ACKNOWLEDGMENTS

We thank Eldredge Bermingham, John Faaborg, and Irby Lovette for assistance in the field.

Funding Statement: We thank the National Geographic Society and the Curators of the

University of Missouri for funding. We declare that funders had no input into the content of the manuscript and did not require approval before submission for publication.

25

Humphries, Meghann B., 2019, UMSL, pg. 26

Ethics Statement: Fieldwork was carried out in compliance with the Guidelines for the

Use of Wild Birds in Research, under IACUC protocols from the University of

Pennsylvania and the University of Missouri-St. Louis, and under permits from the relevant government agencies with jurisdiction at our field sites.

LITERATURE CITED

Arnoux, E., C. Eraud, N. Navarro, C. Tougard, A. Thomas, F. Cavallo, N. Vetter, B.

Faivre, and S. Garnier (2014). Morphology and genetics reveal an intriguing

pattern of differentiation at a very small geographic scale in a bird species, the

forest thrush Turdus lherminieri. Heredity 113:514-525.

Athrey, G., R. Lance, and P. Leberg (2012). How far is too close? restricted, sex-biased

dispersal in black-capped vireos. Molecular Ecology 21:4359–4370.

Avise, J. (2000). Phylogeography: the history and formation of species. Harvard

university press, MA, USA.

Bandelt, H.., P. Forster, and A. Röhl (1999). Median-joining networks for inferring

intraspecific phylogenies. Molecular Biology and Evolution 16:37–48.

Barker, B., J. Rodriguez-Robles, V. Aran, A. Montoya, R. Waide, and J. Cook (2012).

Sea level, topography and island diversity: Phylogeography of the Puerto Rican

Red-eyed Coquı, Eleutherodactylus antillensis. Molecular Ecology 21:6033–

6052.

Bellemain, E., E. Bermingham, and R. Ricklefs (2008). The dynamic evolutionary history

of the bananaquit (Coereba flaveola) in the Caribbean revealed by a multigene

analysis. BMC Evolutionary Biology 8:240.

26

Humphries, Meghann B., 2019, UMSL, pg. 27

Bellemain, E., O. Gaggiotti, A. Fahey, E. Bermingham, and R. Ricklefs (2012).

Demographic history and genetic diversity in West Indian Coereba flaveola

populations. Genetica 140:137–148.

Bouckaert, R., J. Heled, D. Kuhnert, T. Vaughan, C.-H. Wu, D. Xie,M. Suchard, A.

Rambaut, and A. Drummond (2014). BEAST 2: A software platform for Bayesian

evolutionary analysis. PLoS Computational Biology, 10(4):1–6.

Clarke, A. L., B.-E. Sæther, and E. Røskaft (1997). Sex biases in avian dispersal: A

reappraisal. Oikos 79:429–438.

Cook, B., C. Pringle, and J. Hughes (2008). Molecular evidence for sequential

colonization and taxon cycling in freshwater decapod shrimps on a Caribbean

island. Molecular Ecology 17:1066–1075.

Cox, A., and D. Kesler (2012). Prospecting behavior and the influence of forest cover on

natal dispersal in a resident bird. Behavioral Ecology 23:1068–1077.

Crochet, P.-A., J. Chen, J.-M. Pons, J.-D. Lebreton, P. Hebert, and F. Bonhomme (2003).

Genetic differentiation at nuclear and mitochondrial loci among large white-

headed gulls: Sex-biased interspecific gene flow? Evolution 57:2865–2878.

Darriba, D., G. Taboada, R. Doallo, and D. Posada (2015). jModelTest 2: more models,

new heuristics and high-performance computing. Nature Methods 9:772-775.

De León, L., E. Bermingham, J. Podos, and A. Hendry (2010). Divergence with gene

flow as facilitated by ecological differences: Within-island variation in Darwin’s

finches. Philosophical Transactions of the Royal Society of London. Series B,

Biological Sciences 365: 1041–52.

27

Humphries, Meghann B., 2019, UMSL, pg. 28

Dexter, K. G. (2010). The influence of dispersal on macroecological patterns of Lesser

Antillean birds. Journal of Biogeography 37:2137–2147.

Diamond, A. (1973). Altitudinal variation in a resident and migrant passerine on Jamaica.

The Auk 90:610–618.

Drummond, A., A. Rambaut, B. Shapiro, and O. Pybus (2005). Bayesian coalescent

inference of past population dynamics from molecular sequences. Molecular

Biology and Evolution, 22:1185–1192.

Excoffier, L., H. Lischer (2010). Arlequin suite v 3.5: A new series of programs to

perform population genetics analyses under Linux and Windows. Molecular

Ecology Resources, 10:564–567.

Fu, Y. X. (1997). Statistical tests of neutrality of mutations against population growth,

hitchhiking and background selection. Genetics 147:915–925.

Greenwood, P. (1980). Mating systems, philopatry and dispersal in birds and mammals.

Animal Behaviour 28:1140–1162.

Guindon, S., and O. Gascuel (2003). A simple, fast, and accurate algorithm to estimate

large phylogenies by maximum likelihood. Systematic Biology 52:696–704.

Hall, T. (1999). BioEdit: a user-friendly biological sequence alignment editor and

analysis program for Windows 95/98/NT. Nucleic Acids Symposium Series

41:95-98.

Harpending, H. (1994). Signature of ancient population growth in a low-resolution

mitochondrial DNA mismatch distribution. Human Biology 66”591–600.

28

Humphries, Meghann B., 2019, UMSL, pg. 29

Harrison, X., J. York, and A. Young (2014). Population genetic structure and direct

observations reveal sex-reversed patterns of dispersal in a cooperative bird.

Molecular Ecology 23: 5740–5755.

Hosner, P., Á. Nyári, and R. Moyle (2013). Water barriers and intra-island isolation

contribute to diversification in the insular Aethopyga sunbirds (Aves:

Nectariniidae). Journal of Biogeography 40:1094–1106.

Johnson, K., and M. Sorenson (1998). Comparing molecular evolution in two

mitochondrial protein coding genes (cytochrome b and ND2) in the dabbling

ducks (Tribe: Anatini). Molecular and Evolution 10:82–94.

Kimball, R., E. Braun, F. Barker, R. Bowie, M. Braun, J. Chojnowski, S. Hackett, K.-L.

Han, J. Harshman, V. Heimer-Torres, W. Holznagel, C, Huddleston, B. Marks, K.

Miglia, W. Moore, S. Reddy, F. Sheldon, J. Smith, C. Witt, and T. Yuri (2009). A

well-tested set of primers to amplify regions spread across the avian genome.

Molecular Phylogenetics and Evolution 50:654–660.

Kocher, T., W. Thomas, A. Meyer, S. Edwards, S. Paabo, F. Villablanca, and A. Wilson

(1989). Dynamics of mitochondrial DNA evolution in : Amplification and

sequencing with conserved primers. Proceedings of the National Academy of

Sciences 86:6196–6200.

Lance, S., C. Hagen, T. Glenn, R. Brumfield, K. Stryjewski, and G. Graves (2009).

Fifteen polymorphic microsatellite loci from Jamaican streamertail

(Trochilus). Conservation Genetics 10:1195–1198.

Latta, S. and R. Ricklefs (2010). Prevalence patterns of avian haemosporida on

Hispaniola. Journal of Avian Biology 41:25–33.

29

Humphries, Meghann B., 2019, UMSL, pg. 30

Lerner, H., M. Meyer, H. James, M. Hofreiter, and R. Fleischer (2011). Multilocus

resolution of phylogeny and timescale in the extant adaptive radiation of

Hawaiian honeycreepers. Current Biology 21:1838–1844.

Librado, P., and J. Rozas (2009). DnaSP v5: A software for comprehensive analysis of

DNA polymorphism data. Bioinformatics 25:1451–1452.

Milá, B., B. Warren, P. Heeb, and C. Thébaud (2010). The geographic scale of

diversification on islands: genetic and morphological divergence at a very small

spatial scale in the Mascarene grey white-eye (Aves: Zosterops borbonicus).

BMC Evolutionary Biology, 10(158), 1–13.

Nei, M. (1987). Molecular evolutionary genetics. Columbia University Press, NY, USA.

Nei, M. and W.-H. Li (1979). Mathematical model for studying genetic variation in terms

of restriction endonucleases. Proceedings of the National Academy of Sciences of

the United States of America, 76:5269–5273.

Pil, M. (2015). Comparative phylogeography and demographic histories of West Indian

birds. Ph. D. Dissertation. University of Missouri-Saint Louis, MO, USA.

Pregill, G. and S. Olson (1981). Zoogeography of West Indian vertebrates in relation to

Pleistocene climatic cycles. Annual Review of Ecology and Systematics 12:75–

98.

Raffaele, H., J. Wiley, O. Garrido, A. Keith, and J. Raffaele (2010). Birds of the West

Indies. Princeton University Press, NJ, USA.

Ricklefs, R. and E. Bermingham (2002). The concept of the taxon cycle in biogeography.

Global Ecology and Biogeography 11:353–361.

30

Humphries, Meghann B., 2019, UMSL, pg. 31

Ricklefs, R. and G. Cox (1972). Taxon cycles of the West Indian avifauna. The American

Naturalist 106:195–219.

Ricklefs, R. and G. Cox (1978). Stage of taxon cycle, habitat distribution, and population

density in the avifauna of the West Indies. The American Naturalist 112:875–895.

Ricklefs, R., L. Soares, V. Ellis, and S. Latta (2016). Haemosporidian parasites and avian

host population abundance in the Lesser Antilles. Journal of Biogeography

43:1277–1286.

Rogers, A. and H. Harpending (1992). Population growth makes waves in the distribution

of pairwise genetic differences. Molecular Biology and Evolution 9:552–569.

Ryan, P., P. Bloomer, C. Moloney, T. Grant, and W. Delport (2007). Ecological

speciation in South Atlantic island finches. Science 315:1420–1423.

Seutin, G., N. Klein, R. Ricklefs, and E. Bermingham (1994). Historical biogeography of

the bananaquit (Coereba flaveola) in the Caribbean region: A mitchondrial DNA

assessment. Evolution 48:1041–1061.

Slatkin, M. and R. Hudson (1991). Pairwise comparisons of mitochondrial DNA

sequences in stable and exponentially growing populations. Genetics 129:555–

562.

Sly, N., A. Townsend, C. Rimmer, J. Townsend, S. Latta, and I. Lovette, I. J. (2011).

Ancient islands and modern invasions: Disparate phylogeographic histories

among Hispaniola’s endemic birds. Molecular Ecology 20:5012–5024.

Svensson, L., and R. Ricklefs (2009). Low diversity and high intra-island variation in

prevalence of avian parasites on , Lesser Antilles.

Parasitology 136:1121–31.

31

Humphries, Meghann B., 2019, UMSL, pg. 32

Tajima, F. (1989). Statistical method for testing the neutral mutation hypothesis by DNA

polymorphism. Genetics 123:585–595.

Tamura, K., G. Stecher, D. Peterson, A. Filipski, and S. Kumar (2013). MEGA6:

Molecular evolutionary genetics analysis version 6.0. Molecular Biology and

Evolution 30:2725–2729.

Toews, D. and A. Brelsford (2012). The biogeography of mitochondrial and nuclear

discordance in animals. Molecular Ecology 21: 3907–3930.

Townsend, A., C. Rimmer, S. Latta, and I. Lovette (2007). Ancient differentiation in the

single-island avian radiation of endemic Hispaniolan chat- (Aves:

Calyptophilus). Molecular Ecology 16:3634–3642.

Velo-Antón, G., P. Burrowes, R. Joglar, I. Martínez-Solano, K. Beard, and G. Parra-Olea

(2007). Phylogenetic study of Eleutherodactylus coqui (Anura: Leptodactylidae)

reveals deep genetic fragmentation in Puerto Rico and pinpoints origins of

Hawaiian populations. and Evolution 45:716–728.

Wilson, E. O. (1959). Adaptive shift and dispersal in a tropical ant fauna. Evolution

13:122–144.

Wilson, E. O. (1961). The nature of the taxon cycle in the Melanesian ant fauna. The

American Naturalist 95:169–193.

Wunderle, J. M. J. (1981). An analysis of a morph ratio cline in the Bananaquit (Coereba

flaveola) on Grenada, West Indies. Evolution 35:333–344.

Zink, R. and G. Barrowclough (2008). Mitochondrial DNA under siege in avian

phylogeography. Molecular Ecology 17:2107–2121.

32

Humphries, Meghann B., 2019, UMSL, pg. 33

APPENDIX

Data Sources

Topographic map data from Jarvis A., H. Reuter, A. Nelson, and E. Guevara (2008).

Hole-filled seamless SRTM data V4, International Centre for Tropical Agriculture

(CIAT), available from http://srtm.csi.cgiar.org.

Land cover map data from Broxton, P.D., X. Zeng, D. Sulla-Menashe, and P. A. Troch

(2014a). A global land cover climatology using MODIS data. Journal of Applied

Meteorology Climatology 53:1593—1605.

33

Humphries, Meghann B., 2019, UMSL, pg. 34

TABLES AND FIGURES

Table 1. Sampling information. T. bicolor C. flaveola mt mt and nuc Locality Coordinates Date genotypes genotypes Casa de la Selva, Carite Forest (C) 18.102036˚N, 66.044206˚W October 1993 0 6 Fajardo (F) 18.321419˚N, 65.633333˚W October 1993 13 5 nuc/21 mt El Yunque Carib. Nat. Forest (EY) 18.294522˚N, 65.780217˚W May/June 2001 5 31 Refugio Boqueron (RB) 18.007922˚N, 67.162636˚W October 2001 9 37 Guanica Forest (GF) 17.963003˚N, 66.884347˚W January 2001 23 30 Log I 18.035289˚N, 77.935186˚W Jan/Feb 2003 13 Log II 18.042381˚N, 77.941303˚W Jan/Feb 2003 8 WRC 18.390000˚N, 77.620000˚W Jan/Feb 2003 7 Portland Bight 17.883331˚N, 77.133331˚W Jan/Feb 2003 18 Long Mountain 17.983300˚N, 76.750000˚W Jan/Feb 2003 6 Gentles Coleyville Coffee 18.207300˚N, 77.507500˚W Jan/Feb 2003 34

34

Humphries, Meghann B., 2019, UMSL, pg. 35

Table 2. Pairwise FST for mitochondrial and nuclear data of C. flaveola on Puerto

Rico

Mitochondrial F GF EY C GF 0.052* EY 0.031* 0.033* C -0.038 0.15* 0.061 RB 0.32** 0.52** 0.42** 0.24** Nuclear F GF EY C GF 0.003 EY -0.034 0.031 C -0.041 0.061 -0.035 RB -0.038 0.002 0.018 0.008 Note: **indicates p value <0.001, *indicates p value <0.05.

35

Humphries, Meghann B., 2019, UMSL, pg. 36

Table 3. AMOVA results for mitochondrial (mt) and nuclear (nuc) data C. flaveola on Puerto Rico.

Source of Variation d.f. SS Proportion Variation Mt. Among populations 4 52.55 0.26 Mt. Within Populations 121 169.93 0.74 Mt. Total 125 222.48 Nuc. Among populations 4 8.4 0.005 Nuc. Within Populations 87 168.6 0.995 Nuc. Total 91 177.0

36

Humphries, Meghann B., 2019, UMSL, pg. 37

Table 4. Summary statistics and neutrality tests

Mitochondri N S h Hd π Tajima’s D Fu’s FS al F 21 21 15 0.92 +/- 0.050 0.003 +/-0.002 -1.28 -7.88* GF 30 20 16 0.84 +/- 0.065 0.001 +/-0.001 -2.25* -13.071* EY 31 31 18 0.94 +/- 0.029 0.003 +/-0.002 -1.64* -7.84* C 7 10 6 0.95 +/- 0.096 0.003+/-0.002 -0.61 -1.81 RB 37 16 14 0.85 +/- 0.042 0.001 +/-0.001 -1.79* -8.42*

Nuclear N S h Hd π Tajima’s D Fu’s FS F 5 3 3 0.7+/-0.22 0.003+/-0.002 -0.18 0.061 GF 25 8 8 0.74+/-0.079 0.003+/-0.002 -1.037 -2.91* EY 24 50 7 0.5+/-0.12 0.012+/-0.007 -2.21* 2.96 C 7 22 6 0.95+/-0.096 0.014+/-0.009 -1.51* -0.52 RB 31 31 13 0.8+/-0.064 0.008+/-0.005 -1.89* -2.57

Note: N: sample size, S: number of segregating sites, h: number of haplotypes, Hd: haplotype diversity ± s.d., π: nucleotide diversity ± s.d., Neutrality Tests (Tajima’s D, Fu’s FS): *p <0.05, **p <0.001.

37

Humphries, Meghann B., 2019, UMSL, pg. 38

Table 5. Mismatch analysis of C. flaveola populations on Puerto Rico and Jamaica

GF F EY C RB Jamaica SSD 0.003 0.003 0.005 0.027 0.003 0.011 p value 0.15 0.84 0.69 0.68 0.09 0.05 r 0.051 0.016 0.0173 0.116 0.074 0.02 p value 0.06 0.89 0.75 0.62 0.01* 0.37

38

Humphries, Meghann B., 2019, UMSL, pg. 39

Table 6. Neutrality Tests for C. flaveola on Jamaica.

Statistic Log Log I WRC GentlesCC PortlandB LongMt Pooled II Tajima's - -1.071 -0.540 -1.1535 -1.284 -0.763 -1.641* D 0.885 Fu’s FS - - - -13.100* -2.083 -3.386* - 1.858 3.921* 2.773* 25.462**

Note: * indicate p value <0.05.

39

Humphries, Meghann B., 2019, UMSL, pg. 40

Table 7. Pairwise FST for T. bicolor on Puerto Rico.

EY GF RB

GF -0.017

RB 0.209* 0.249** LP 0.226* 0.290** 0.355**

Note: * indicates p value <0.05, ** indicates p value <0.001

40

Humphries, Meghann B., 2019, UMSL, pg. 41

Table 8. Neutrality Tests for T. bicolor on Puerto Rico.

N S h Hd π Tajima’s D Fu’s FS EY 5 3 4 0.9+/-0.16 0.001+/-0.001 -1.048 -1.938* GF 23 13 9 0.78+/-0.063 0.001+/-0.001 -1.955* -3.961* RB 9 8 7 0.92+/-0.092 0.002+/-0.001 -1.284 -3.618* F 13 11 9 0.87+/-0.091 0.002+/-0.001 -1.396 -4.595*

Note: *p <0.05

41

Humphries, Meghann B., 2019, UMSL, pg. 42

Figure 1. Sampling locations on Puerto Rico. El Yunque (EY): a moist, mid-elevation broadleaf forest; Refugio Boquerón (RB): a mangrove forest with open water and cattail marsh; Guanica Forest (GF): a lowland dry seasonally deciduous forest in the south; Casa de la Selva, Carite forest (C): a mid-elevation forest with second growth; and Fajardo (F): a coastal lowland area of second growth and agriculture.

42

Humphries, Meghann B., 2019, UMSL, pg. 43

Figure 2. Jamaica Sampling Locations. Note: Log I and Log II locations have been displaced here for clarity.

43

Humphries, Meghann B., 2019, UMSL, pg. 44

Figure 3. Mitochondrial median-joining haplotype network for C. flaveola on Puerto Rico. Circle size indicates the number of individuals with a given haplotype, color indicates sampling location, and black hash marks indicate mutated positions.

44

Humphries, Meghann B., 2019, UMSL, pg. 45

Figure 4. Mitochondrial median-joining haplotype networks for individual Puerto Rican

C. flaveola populations. Circle size indicates the number of individuals with a given haplotype, color indicates sampling location, and black hash marks represent mutations.

45

Humphries, Meghann B., 2019, UMSL, pg. 46

Figure 5. Median-joining haplotype network of nuclear intron VIM from C. flaveola on

Puerto Rico. Circle size indicates the number of individuals with a given haplotype, color indicates sampling location, and black hash marks indicate mutated positions.

46

Humphries, Meghann B., 2019, UMSL, pg. 47

Figure 6. Mismatch analysis for C. flaveola populations on Puerto Rico (panels A through E) and Jamaica (panel F). Black bars indicate observed frequencies and grey lines indicate expected frequencies of mismatches, where expected values represent the null model of expansion reflected by a unimodal distribution. SSD and r depicted in inlays with accompanying p values. P values >0.05 indicate a good fit of the model and therefore support for population expansion.

47

Humphries, Meghann B., 2019, UMSL, pg. 48

Figure 7. Mitochondrial median-joining haplotype network for Jamaican C. flaveola populations. Circle size indicates the number of individuals with a given haplotype, color indicates sampling location, and black hash marks indicate mutations. Jamaican populations do not cluster distinctly from one another and most shared haplotypes are common to several locations.

48

Humphries, Meghann B., 2019, UMSL, pg. 49

Figure 8. Median-Joining Haplotype Network for Tiaris bicolor on Puerto Rico.

49

Humphries, Meghann B., 2019, UMSL, pg. 50

Figure 9. Bayesian Skyline Plots (BSP) for GF (A), RB (B), and the pooled eastern locations (C). The y-axis is number of individuals in millions on a log scale and the x-axis is time before present in millions of years.

50

Humphries, Meghann B., 2019, UMSL, pg. 51

Figure 10. Percentage land cover by island.

51

Chapter 2.

Population Structure of Avian Malaria Parasites

Meghann B. Humphries1*, Matthew T. Stacy1, Robert E. Ricklefs1

1 Department of Biology, University of Missouri–Saint Louis, St. Louis, Missouri, USA

* Corresponding author: [email protected]

Ecology and Evolution (2019); 00:1-11.

Abstract

The geographic distribution of genetic diversity in malaria parasite populations

(Apicomplexa: Haemosporida) presumably influences local patterns of virulence and the evolution of host-resistance, but little is known about population genetic structure in these parasites. We assess the distribution of genetic diversity in the partial Domain I of

Apical Membrane Antigen 1 (AMA1) in three mtDNA-defined lineages of avian

Plasmodium to determine spatial population structure and host-parasite genetic relationships. We find that one parasite lineage is genetically differentiated in association with a single host genus and among some locations, but not with respect to other hosts.

Two other parasite lineages are undifferentiated with respect to host species but exhibit geographic differentiation that is inconsistent with shared geographic barriers or with isolation-by-distance. Additional differentiation within two other lineages is unassociated with host species or location; in one case, we tentatively interpret this differentiation is as the result of mitochondrial introgression from one of the lineages into a second lineage. Humphries, Meghann B., 2019, UMSL, pg. 53

More sampling of nuclear genetic diversity within populations of avian Plasmodium is needed to rule out coinfection as a possible confounding factor. If coinfections are not responsible for these findings, further assessment is needed to determine the frequency of mitonuclear discordance and its implications for defining parasite lineages based on mitochondrial genetic variation.

Keywords: AMA1, avian malaria, Haemoproteus, Haemosporida, host-parasite genetic relationship, introgression, Plasmodium, spatial population structure

Introduction

The ubiquity of avian malaria in the global avifauna makes it an accessible pathogen to investigate how the geographic distribution of parasite genetic diversity relates to patterns of virulence, the evolution of host-resistance, and the population structure and demography of hosts. Although more than 40 morphospecies of avian

Plasmodium have been described (Valkiunas 2004) and hundreds of genetic lineages have been identified utilizing variation at the partial cytochrome b gene (Bensch et al.

2009), little is known of the genetic diversity within avian malaria parasite populations, or even concerning the boundaries of these populations. Whether individuals of a particular parasite lineage vary genetically relative to their hosts or to geographic location is poorly understood, but assessments of lineage distributions through time reveal dynamic patterns that differ among both location and host taxa (Fallon et al. 2003).

Previous work in the West Indies has revealed disjunctions in the geographic distributions of parasite lineages (Ricklefs et al. 2016) and dynamic parasite community

53

Humphries, Meghann B., 2019, UMSL, pg. 54 assembly processes, including differentiation of geographically separated parasite assemblages in as little as 2,500 years (Soares et al. 2017). However, information about the genetic diversity and structure of parasite populations, including effective population sizes and intra-lineage variation related to host species or geographic location, has not been reported. This lack of information reflects, in part, the difficulty of obtaining suitable genetic markers.

Genetic variation and population structures of two Plasmodium parasites infecting humans, P. vivax (Pv) and P. falciparum (Pf), have been characterized in detail, particularly at immunogenic loci. Among these loci is the apical membrane antigen 1

(AMA1), which plays a critical role in forming the junction between Plasmodium merozoites and host red blood cells (reviewed in Bai et al. 2005). AMA1 is also thought to interact directly with host immune systems; analyses of PfAMA1 have located one T-cell epitope and a second B/T-cell epitope in Domain I (Escalante et al. 2001), as well as several erythrocyte binding sites (Zakeri et al. 2013). Analyses of variation in the three- dimensional structure of the protein suggests that the acquisition of several highly variable loops in Domain I is related to evasion of the host immune response (Collins et al. 2009). These direct interactions support Domain I of AMA1 as a marker related to host-specific immune pressures. Accordingly, the rapid accumulation of mutations as a result of diversifying selection provides information about fine-scale population structure.

Assessments of PvAMA1 and PfAMA1 reveal distinct demographic patterns: PvAMA1 typically exhibits a differentiated structure consistent with an endemic pathogen (Neafsey et al. 2012; Taylor et al. 2013) while PfAMA1 most often exhibits an epidemic population structure with reduced diversity and little geographic differentiation (Arnott et al. 2014;

54

Humphries, Meghann B., 2019, UMSL, pg. 55

Mueller et al. 2002; Ordet al. 2008), typical of frequent clonal outbreaks. P. vivax and P. falciparum exhibit these demographic differences despite infecting the same hosts in many of the same locations and being transmitted by the same vectors. Therefore, populations of avian malaria parasites might be expected to exhibit substantial variation in the degree of population structure within lineages as a result of the more complex relationships among hosts, vectors, and locations.

To develop a better understanding of the influence of host species and geographic location on parasite genetic diversity, we assess the distribution of genetic diversity in the partial Domain 1 of AMA1 of three mitochondrial lineages commonly infecting birds in the West Indies and eastern North America. We assess genetic diversity and phylogenetic relationships within mitochondrial lineages and test whether parasite populations exhibit genetic structure related to hosts, location, or geographic distance.

Materials and Methods

DNA Extraction and Sequencing

Samples for this study were obtained over several years from diverse localities in the

Americas (see Fig. 1). Birds were captured with mist-nets and ca. 10 µL of blood was collected by sub-brachial venipuncture (field techniques described in Latta and Ricklefs

2010). DNA was extracted from blood using the isopropanol precipitation technique described in Svensson and Ricklefs (2009) and all samples were screened for avian malaria using primers 343F and 496R (Fallon et al. 2003). Samples that screened positive were genotyped to identify the cytochrome b lineage of the infection (Bensch et al. 2009) using a variety of primers and PCR conditions described in Perkins and Schall (2002),

Ricklefs et al. (2005), and Waldenström et al. (2004). For samples infected by

55

Humphries, Meghann B., 2019, UMSL, pg. 56

Plasmodium lineages OZ01 (equivalent to PADOM11 in the MalAvi database (Bensch et al. 2009)), OZ04 (MalAvi ICTCAY01), and OZ14 (MalAvi CARCAR11), approximately

400 bp (length varies by lineage) of the partial Domain 1 of AMA1 were amplified using nested primers Pg_AMA1F1/ Pg_AMA1R1 and Pg_AMA1F2/ Pg_AMA1R2 and PCR protocols described in Lauron et al. (2014). Negative controls were included in each PCR reaction and products were verified by visualization on 1% TBE agarose gels with ethidium bromide. PCR product was cleaned using the ExoSAP-IT protocol (Bell 2008) and sequenced by Eurofins Genomics (Louisville, KY). Contigs were aligned and edited in Mega6 (Tamura et al. 2013), and chromatograms were checked by eye to confirm polymorphisms. Heterozygous positions were denoted with IUPAC ambiguous base codes and haplotypes were reconstructing using PHASE in DNAsp v. 5 (Librado and

Rozas 2009).

Statistical Analyses of Variation

Summary statistics describing genetic variation within parasite populations, based either on geography or host species, were calculated using DNAsp v.5 (Librado and Rozas

2009). These statistics include number of haplotypes (h), haplotype diversity (Hd), and nucleotide diversity (π). Hd is the probability that two haplotypes chosen at random differ

(Nei 1987), while π is the average number of nucleotide differences per site for two sequences chosen at random (Nei and Li 1979). We calculated π for the entire sequence, and, to examine patterns of nucleotide substitution within the AMA1 gene, we additionally calculated π using a sliding window of 30 bp and step size of 9 bp (Lauron et al. 2014). Synonymous and nonsynonymous substitutions, and minimum number of recombination events (Rm, using the four gamete test after Hudson and Kaplan 1985),

56

Humphries, Meghann B., 2019, UMSL, pg. 57

were also calculated in DNAsp v.5. Pairwise FST values were calculated in Arlequin v. 3.5

(Excoffier and Lischer 2010) for a reduced dataset containing only sequences from locations or hosts with 10 or more samples.

Phylogeographic Analysis

A median-joining haplotype network was generated using Popart (Leigh and Bryant

2015) to compare and visualize relationships among haplotypes at each sample location and within each host species or family. Pairwise FST values were calculated with confidence intervals based on 1000 permutations (Arlequin v.3.5, (Excoffier and Lischer

2010)) for all locations with more than 10 sequences and for the two clusters identified in the haplotype networks of lineages OZ01 and OZ14. A Mantel test comparing pairwise

FST and geographic distance was employed to determine whether the population structure detected among some locations in lineage OZ01 is consistent with isolation-by-distance.

The Mantel test was implemented in the R package “ade4” version 1.7-11 with 9,999 permutations.

Results

Statistical Analyses of Variation

We sequenced 389 - 407 base pairs (depending on lineage, Table 1) of the partial Domain

1 of AMA1 corresponding to amino acid positions 169 – 295/300 of P. falciparum (3D7 isolate, Genbank accession U33274.1). A maximum likelihood phylogeny of AMA1 amino acid sequences (Fig. 2), including representatives of four avian and three mammalian Plasmodium species in addition to the three lineages assessed here, is concordant with the mitochondrial CYTB gene tree reported in Ricklefs et al. (2014,

57

Humphries, Meghann B., 2019, UMSL, pg. 58 supporting information Fig. S2) , recovering OZ04 and OZ14 as being closely related to each other relative to OZ01.

Analysis of lineage OZ01 included 232 nucleotide sequences containing 18 unique haplotypes from 31 host species and 18 locations distributed in the eastern United

States and the West Indies (Table 1, Fig. 3). Analysis of the geographically more restricted lineage OZ04 included 170 nucleotide sequences containing 22 unique haplotypes from 16 host species and 11 locations distributed in the Lesser Antilles and in the Ozarks region of Missouri (Table 1, Fig. 3). Lineage OZ14 comprised 134 nucleotide sequences representing 10 unique haplotypes from 23 host species and 12 locations distributed in the eastern United States and the West Indies (Table 1, Fig. 3). Estimates of overall nucleotide diversity (π) were 0.004, 0.173, and 0.007 for OZ01, OZ04, and OZ14, respectively, and estimates of Hd were 0.607, 0.717, and 0.430 for the same sequences

(Table 2). Lineage OZ01 exhibited the highest heterozygosity with 13 individuals possessing at least one heterozygous position; lineages OZ04 and OZ14 contained 3 and

7 such individuals, respectively (Table 3).

Sliding window analysis of π revealed congruence in the distribution of diversity among lineages, particularly within the range of nucleotides from positions 1 – 130 (Fig.

4). We detected the highest polymorphism at nucleotide positions (nt) 94 – 130. This region aligns with PfAMA1 amino acid residues 259-271 (Escalante et al. 2001), which constitute a hypervariable region encompassing a T-cell epitope. Lineage OZ14 exhibits three additional peaks in π; one is shared with OZ04 (nt 204-249 in both lineages) and corresponds to a B/T cell epitope in PfAMA1 at amino acid residues 279-288, a third is at

OZ14AMA1 nt 150 – 170, and the final peak is at OZ14AMA1 nt 276-303. The latter two

58

Humphries, Meghann B., 2019, UMSL, pg. 59 are not known to have immunogenic functions. These lineages share an amino acid insertion detected in other avian malaria parasites at PfAMA1 amino acid residue 188

(Lauron et al. 2014), and lineage OZ01 and a subset of OZ04 exhibit a second amino acid insertion at the same position. Most of the nucleotides that encode a hydrophobic trough hypothesized to contain a critical structural component of the molecule are conserved; amino acid residues at PfAMA1 tyrosine Y251, valine V169, and leucine L357 (Bai et al.

2005) are conserved in all samples, and PfAMA1 phenylalanine F183 is conserved in all samples of lineages OZ01 and OZ04, although we detected three haplotypes with a serine at this position in lineage OZ14.

Phylogeographic Analysis

Haplotype networks (Fig. 5) reveal that lineages OZ04 and OZ14 each contain a lineage- specific common and widespread AMA1 haplotype, and that OZ01 contains two common haplotypes. However, in OZ01 these haplotypes are shared with individuals belonging to the mitochondrial OZ04 lineage (visualized in Fig. 5A). One of the OZ01 haplotypes corresponds to a group of parasites infecting many hosts in many locations, and a second corresponds to a group infecting predominantly individuals of the avian genus Passerina

(buntings): 42 of 68 infections in this group were detected in P. cyanea, and 6 of 68 were detected in P. ciris, representing 70.5% of the infections in this group and 87.5% of all

Passerina OZ01 infections.

Much of the diversity of OZ04 can be attributed to the presence of a single deep division within the lineage (visualized in Fig. 5B). One subdivision of OZ04 consists primarily of a single common, widespread AMA1 haplotype; the other subdivision includes haplotypes identical or nearly identical to several haplotypes detected in lineage

59

Humphries, Meghann B., 2019, UMSL, pg. 60

OZ01 (Fig. 6). Inspection of 21 (of 31total) of the CYTB sequences of discordant samples showed no nucleotide variation from the reference sequence for lineage OZ04, indicating that the samples were correctly assigned to the lineage and supporting the absence of this divergence in the mitochondrial genome.

The Passerina-dominant group within lineage OZ01 is significantly differentiated from the other group (FST = 0.74, p <0.001), but pairwise FST values among the parasites of other well-sampled hosts were not significant in any of the lineages assessed. The smaller group of AMA1 haplotypes within OZ14 (visualized in Fig. 5C) is significantly differentiated from the larger group in this lineage (FST = 0.88, p <0.001) and does contain a preponderance of hosts in the family Icteridae (30% of this group, comprising

57% of the icterid infections sampled) but is broadly geographically distributed and exhibits no statistically significant differentiation in association with any one host species.

Support for differentiation among at least some locations comes from pairwise comparisons in each lineage. Lineage OZ01exhibits a complex pattern of differentiation and gene flow (Table 4) while both OZ04 and OZ14 are differentiated with respect to only a single location: OZ04 samples from are differentiated from those from

Dominica, , and Jamaica, but no other pairwise comparisons among these locations are significant; OZ14 samples from Chicago are differentiated with respect to samples from locations in Saint Louis, the Missouri Ozarks, Michigan, and Pennsylvania, but exhibit no other significant pairwise differences among these locations. Further investigation into the complex pattern of OZ01 revealed no shared geographic boundary

60

Humphries, Meghann B., 2019, UMSL, pg. 61 among pairwise haplotype comparisons and no signal of isolation-by-distance (Mantel statistic r: -0.188, significance 0.69).

Discussion

The avian malaria parasite lineages assessed here have been defined based on their similarity at the partial cytochrome b gene, but whether they represent good phylogenetic species has not been determined. Broad host ranges, propensity for host switching, and difficulty in linking morphological species and genetic lineages have all contributed to the problem of delimiting species in this group (Martinsen et al. 2008; Outlaw and

Ricklefs 2014; Galen et al. 2018). Moreover, a cut-off based on percentage of divergence is not a useful way to define these species because it has been demonstrated through combined geographic range, host range, and corroboration with multiple loci that good phylogenetic species can exhibit low divergence at CYTB (Galen et al. 2018; Nilsson et al. 2016; Outlaw and Ricklefs 2014). In the absence of morphological data, species delimitation is thought to be best addressed by comparing multiple gene trees (Bensch et al. 2004; Outlaw and Ricklefs 2014).

The AMA1 gene tree produced here contradicts the mitochondrial lineage assignment in some cases. Specifically, all parasites from mitochondrial lineage OZ01 possess AMA1 alleles similar to one another, but 31 parasites from mitochondrial lineage

OZ04 also possess AMA1 alleles identical or very similar to OZ01 alleles; the remaining parasites of lineage OZ04 possess AMA1 alleles that are similar to one another and more than 60 mutations divergent from the OZ01 cluster (Figure 6). We interpret these relationships as supporting mitochondrial introgression from OZ04 to OZ01, but an

61

Humphries, Meghann B., 2019, UMSL, pg. 62 alternative explanation of undetected coinfections cannot be ruled out with currently available data. Avian malaria coinfections are known to occur and are known to be underestimated by PCR as a result of preferential primer binding on variations of the template sequence (Bernotiene et al. 2016). Therefore, it is possible that the suspected introgressed parasites are instead OZ01/OZ04 coinfections in which our CYTB primers only amplified templates from OZ04 and our AMA1 primers only amplified templates from OZ01.

Individual CYTB SNP data were available for 21 (of 31) samples with suspected introgressed AMA1 variants and revealed no difference from the reference sequence for lineage OZ04 (Genbank accession GQ395669) confirming appropriate lineage assignment and supporting our inference of introgression. We did not detect similar patterns between OZ14 and either of the other lineages, though they co-occur in some of the same hosts and locations and might therefore be expected to exhibit similar rates of coinfection. Additionally, we recovered a suspected introgressed parasite in Refugio

Boquerón, PR, where OZ01 has not been detected (it has been detected at very low abundances in other locations on Puerto Rico, see Figs. 3, 7), making it less likely to be a coinfection in this case. The lineages assessed here have not yet been assigned to morphospecies (Malavi database, Bensch et al. 2009) and a morphological assessment of lineages OZ01 and OZ04 following the protocols and keys described in (Valkiunas et al.

2009; Valkiūnas and Iezhova 2018) revealed no discernible difference between lineages.

As a result, coinfections could not be identified or ruled out at present by inspecting blood smears. Though we cannot definitively rule out coinfections in these cases, we proceed in the following with a tentative interpretation of mitochondrial introgression

62

Humphries, Meghann B., 2019, UMSL, pg. 63 among lineages and encourage increased attention to this matter in future research. The introgressed OZ01 parasites detected here were most often recovered in hosts that are rarely parasitized by OZ01 but commonly parasitized by OZ04, namely Coereba flaveola and Tiaris bicolor. Thraupidae species host only 6% of detected OZ01 infections (7/116 detections, Fig. 8) but make up 71% of the cases of evident introgression (22/31 detections, Fig. 5B, Fig. 8), with C. flaveola and T. bicolor together acconting for 77% of these (8 and 6, respectively). These patterns suggest that mitochondrial introgression from OZ04 to OZ01 may have facilitated this host shift. Moreover, the higher relative abundance of introgressed OZ01 parasites on Jamaica, the co-occurrence of OZ01 and

OZ04 on the same island, and the relationship of the Jamaican haplotypes to the other introgressed haplotypes detected (Table 5, Figs. 7 and 9) supports Jamaica as the likely site of introgression. One possible hypothesis for this occurrence is that an individual mosquito vector on this island was infected by both OZ01 and OZ04 gametes (from either multiple blood meals or a single co-infected host) which hybridized during sexual reproduction in the mosquito vector. The hybrid progeny then repeatedly backcrossed with OZ01 parasites to produce the mitonuclear discordance we detect in contemporary populations.

The introgressed parasites exhibit several geographic disjunctions that might be related to migratory movement of hosts. Specifically, one or more migrants might have carried the parasite from Jamaica to the North American Midwest (e.g., Missouri

Ozarks), where it was transmitted to other hosts, which then carried it to Puerto Rico and the Lesser Antilles in following migrations, skipping the island of Hispaniola. Further dispersal within the Lesser Antilles may have been facilitated by movements of common

63

Humphries, Meghann B., 2019, UMSL, pg. 64 endemic species, including C. flaveola and T. bicolor. Other intra-lineage divisions, for example in lineage OZ14 and in P. lucens (Lauron et al. (2014), discussed below), may also represent mitonuclear discordance resulting from introgression, although further sampling will be required to substantiate this. If substantiated, these findings imply that defining lineages based on CYTB similarity might often be problematic.

The comparatively high estimate of nucleotide diversity (π) for samples of P. lucens prompted our further investigation into the relationships among AMA1 haplotypes within that parasite species. We constructed a median-joining haplotype network for P. lucens (Fig. 10) which revealed that seven of the 51 sequences (three of 12 haplotypes) cluster together more than 60 mutational steps from the other haplotypes. The pairwise nucleotide differences among haplotypes within the major cluster vary between one and three substitutions, suggesting uncertain taxonomic identity of the minor cluster. If the divergent sequences are removed from analysis, the estimate of π is 0.003 (reported in

Table 2), consistent with findings for OZ01 and OZ14. The estimate for lineage OZ04 provided here is also exceedingly high, but expectedly so because it includes the highly divergent introgressed group. Removal of the divergent group in this case produced an estimate of π = 0.006 for lineage OZ04.

We found no support for isolation-by-distance in the widespread lineage OZ01, and lineages OZ04 and OZ14 are genetically undifferentiated with respect to host species. OZ04 does, however, exhibit overall reduced host breadth compared to OZ14, with the former infecting mainly West Indian tanagers such as bananaquits (Coereba flaveola), grassquits (Tiaris bicolor), and bullfinches (Loxigilla spp.), while the latter infects extremely diverse hosts. The lack of population differentiation among some

64

Humphries, Meghann B., 2019, UMSL, pg. 65 locations in OZ01 and OZ14 is likely related to host dispersal as these lineages infect primarily migratory species. In contrast, we detected OZ04 most commonly in resident species. The presence of genetic differentiation among locations within lineages might be promoted by geographic isolation related to migratory paths of hosts or vector dispersal, but we lack information to assess these possibilities at present.

Congeneric comparisons

P. vivax and P. falciparum represent contrasting demographic patterns and impacts on hosts, and these parasites may provide an informative context in which to consider the findings presented here. In general, P. vivax causes less mortality and is more likely to persist at low densities and with a lower transmission rate than P. falciparum (Neafsey et al. 2012). P. vivax has more diverse populations, exhibits dormancy and relapse, and dominates the interepidemic periods when sympatric with P. falciparum (Arnott et al.

2014; Neafsey et al. 2012). P. falciparum exhibits a contrasting demographic pattern typified by dramatic clonal outbreaks (Mueller et al. 2002; Razakandrainibe et al. 2005) and more severe host illness. The avian parasites with the three AMA1 lineages described here exhibit population patterns inconsistent with either the epidemic pathogen structure of P. falciparum or the endemic pathogen structure of P. vivax. That is, our lineages present comparatively low diversity at AMA1, as in P. falciparum, but nonetheless exhibit local differentiation and recombination as in P. vivax. Little is known of other avian malaria parasite populations, but analysis of a taxonomically conservative subset of P. lucens AMA1 from a population of olive sunbirds (Cyanomitra olivacea) in Cameroon

(Lauron et al. 2014) is consistent with our finding that genetic diversity in the avian

65

Humphries, Meghann B., 2019, UMSL, pg. 66 parasites is lower than both P. vivax (Figtree et al. 2000; Grynberg et al. 2008) and P. falciparum (Escalante et al. 2001).

Conclusions

Analyses at more loci and with wider taxon sampling will be necessary to uncover the complexities of these relationships, but findings here provide a glimpse into the host distribution, spatial distribution, and diversity of avian malaria AMA1 in natural host communities. The complex spatial patterns and differentiation in relation to host genus described here suggest several possible influences on population structure, including host immune pressure, host dispersal, and migratory movements, as well as vector dispersal and host feeding preferences. Moreover, the mitonuclear discordance detected here warrants further investigation to assess the role of coinfections and to determine the frequency of such introgression events and implications for defining parasite lineages based on mitochondrial genetic variation.

Literature Cited

Abouie Mehrizi, Akram, Masoumeh Sepehri, Fatemeh Karimi, Navid Dinparast Djadid,

and Sedigheh Zakeri. 2013. “Infection, Genetics and Evolution Population Genetics,

Sequence Diversity and Selection in the Gene Encoding the Plasmodium falciparum

Apical Membrane Antigen 1 in Clinical Isolates from the South-East of Iran.”

Infection, Genetics and Evolution 17:51–61.

Arnott, Alicia, Johanna Wapling, Ivo Mueller, Paul A Ramsland, Peter M Siba, John C

Reeder, and Alyssa Barry 2014. “Distinct Patterns of Diversity, Population Structure

and Evolution in the AMA1 Genes of Sympatric Plasmodium falciparum and

66

Humphries, Meghann B., 2019, UMSL, pg. 67

Plasmodium vivax Populations of Papua New Guinea from an Area of Similarly

High Transmission.” Malaria Journal 13(233):1–16.

Bai, Tao, Michael Becker, Aditi Gupta, Phillip Strike, Vince J. Murphy, Robin F.

Anders, and Adrian H. Batchelor. 2005. “Structure of AMA1 from Plasmodium

falciparum Reveals a Clustering of Polymorphisms That Surround a Conserved

Hydrophobic Pocket.” Proceedings of the National Academy of Science 102:12736–

41.

Bell, Jonathan R. 2008. “A Simple Way to Treat PCR Products Prior to Sequencing

Using ExoSAP-IT ®.” Biotechniques 44(6):2008.

Bensch, Staffan, Olof Hellgren, and Javier Pérez-Tris. 2009. “MalAvi: A Public Database

of Malaria Parasites and Related Haemosporidians in Avian Hosts Based on

Mitochondrial Cytochrome b Lineages.” Molecular Ecology Resources 9(5):1353–

58.

Bensch, Staffan, Javier Pérez-Tris, Jonas Waldenström, and Olof Hellgren. 2004.

“Linkage between Nuclear and Mitochondrial DNA Sequences in Avian Malaria

Parasites: Multiple Cases of Cryptic Speciation ?” Evolution 58(7):1617–21.

Bernotiene, Rasa, Vaidas Palinauskas, Tatjana Iezhova, Dovile Murauskaite, and

Gediminas Valkiunas. 2016. “Avian Haemosporidian Parasites (Haemosporida): A

Comparative Analysis of Different Polymerase Chain Reaction Assays in Detection

of Mixed Infections.” Experimental Parasitology (163):31–37.

Collins, Christine R., Chrislaine Withers-Martinez, Fiona Hackett, and Michael J.

Blackman. 2009. “An Inhibitory Antibody Blocks Interactions between Components

of the Malarial Invasion Machinery.” PLoS Pathogens 5(1).

67

Humphries, Meghann B., 2019, UMSL, pg. 68

Escalante, Ananias A., Heather M. Grebert, Sansanee C. Chaiyaroj, Magda Magris, Sukla

Biswas, Bernard L. Nahlen, and Altaf A. Lal. 2001. “Polymorphism in the Gene

Encoding the Apical Membrane Antigen-1 (AMA-1) of Plasmodium Falciparum. X.

Asembo Bay Cohort Project.” Molecular and Biochemical Parasitology

113(2):279–87.

Excoffier, Laurent and Heidi E. L. Lischer. 2010. “Arlequin Suite Ver 3.5: A New Series

of Programs to Perform Population Genetics Analyses under Linux and Windows.”

Molecular Ecology Resources 10(3):564–67.

Fallon, Sylvia. M., Robert. E. Ricklefs, B. L. Swanson, and Eldridge Bermingham. 2003.

“Detecting Avian Malaria : An Improved Polymerase Chain Reaction Diagnostic.”

Journal of Parasitology 89(5):1044–47.

Fallon, Sylvia M., Eldredge Bermingham, and Robert E. Ricklefs. 2003. “Island and

Taxon Effects in Parasitism Revisited : Avian Malaria in the Lesser Antilles.”

Evolution 57(3):606–15.

Figtree, Melanie, Cielo J. Pasay, Robert Slade, Qin Cheng, Nicole Cloonan, John Walker,

and Allan Saul . 2000. “Plasmodium vivax Synonymous Substitution Frequencies,

Evolution and Population Structure Deduced from Diversity in AMA 1 and MSP 1

Genes.” Molecular and Biochemical Parasitology 108:53–66.

Galen, Spencer C., Janus Borner, Ellen S. Martinsen, Juliane Schaer, Christopher C.

Austin, Christopher J. West, and Susan L. Perkins. 2018. “The Polyphyly of

Plasmodium: Comprehensive Phylogenetic Analyses of the Malaria Parasites (Order

Haemosporida) Reveal Widespread Taxonomic Conflict.” Royal Society Open

Science 5(5).

68

Humphries, Meghann B., 2019, UMSL, pg. 69

Grynberg, Priscila, Cor Jesus F. Fontes, Austin L. Hughes, and Érika M. Braga. 2008.

“Polymorphism at the Apical Membrane Antigen 1 Locus Reflects the World

Population History of Plasmodium vivax.” BMC Evolutionary Biology 8:1–9.

Gunasekera, Anusha M., Thilan Wickramarachchi, Daniel E. Neafsey, Ishani Ganguli,

Lakshman Perera, Prasad H. Premaratne, Daniel Hartl, Shiroma M.

Handunnetti, Preethi V. Udagama-Randeniya, and Dyann F. Wirth. 2007. “Genetic

Diversity and Selection at the Plasmodium vivax Apical Membrane Antigen-1

(PvAMA-1) Locus in a Sri Lankan Population.” Molecular Biology and Evolution

24(4):939–47.

Hudson, R. R. and N. L. Kaplan. 1985. “Statistical Properties of the Number of

Recombination Events in the History of a Sample of DNA Sequences.” Genetics

111(1):147–64.

Latta, Steven C. and Robert E. Ricklefs. 2010. “Prevalence Patterns of Avian

Haemosporida on Hispaniola.” Journal of Avian Biology 41:25–33.

Lauron, Elvin, Khouanchy S Oakgrove, Lisa A Tell, Kevin Biskar, Scott W Roy, and

Ravinder NM Sehgal . 2014. “Transcriptome Sequencing and Analysis of

Plasmodium Gallinaceum Reveals Polymorphisms and Selection on the Apical

Membrane Antigen-1.” Malaria Journal 13(1):382.

Le, Si Quang and Olivier Gascuel. 2008. “An Improved General Amino Acid

Replacement Matrix.” Molecular Biology and Evolution 25(7):1307–20.

Leigh, Jessica W. and David Bryant. 2015. “POPART: Full-Feature Software for

Haplotype Network Construction.” Methods in Ecology and Evolution 6:1110–16.

Librado, P. and J. Rozas. 2009. “DnaSP v5: A Software for Comprehensive Analysis of

69

Humphries, Meghann B., 2019, UMSL, pg. 70

DNA Polymorphism Data.” Bioinformatics 25(11):1451–52.

Martinsen, Ellen S., Susan L. Perkins, and Jos J. Schall. 2008. “A Three-Genome

Phylogeny of Malaria Parasites (Plasmodium and Closely Related Genera):

Evolution of Life-History Traits and Host Switches.” Molecular Phylogenetics and

Evolution 47(1):261–73.

Mueller, Ivo, Japalis Kaiok, John C. Reeder, and Alfred Cortés. 2002. “The Population

Structure of Plasmodium Falciparum and Plasmodium Vivax during an Epidemic of

Malaria in the Eastern Highlands of Papua New Guinea.” The American Journal of

Tropical Medicine and Hygiene 67(5):459–64.

Neafsey, Daniel E., Kevin Galinsky, Rays H Y Jiang, Lauren Young, Sean M Sykes,

Sakina Saif, Sharvari Gujja, Jonathan M Goldberg, Sarah Young, Qiandong Zeng,

Sinéad B Chapman, Aditya P Dash, Anupkumar R Anvikar, Patrick L Sutton, Bruce

W Birren, Ananias A Escalante, John W Barnwell, and Jane M Carlton. 2012. “The

Malaria Parasite Plasmodium Vivax Exhibits Greater Genetic Diversity than

Plasmodium Falciparum.” Nature Genetics 44(9):1046–50.

Nei, Masatoshi. 1987. Molecular Evolutionary Genetics. Columbia University Press.

Nei, Masatoshi and Wen-Hsiung Li. 1979. “Mathematical Model for Studying Genetic

Variation in Terms of Restriction Endonucleases.” Proceedings of the National

Academy of Sciences 76(10):5269–73.

Nilsson, E, H. Taubert, O. Hellgren, X. Huang, V. Palinauskasm, M. Y. Markovets, G.

Valkiunas, and S. Bensch. 2016. “Multiple Cryptic Species of Sympatric Generalists

within the Avian Blood Parasite Haemoproteus Majoris.” Journal of Evolutionary

Biology 29:1812–26.

70

Humphries, Meghann B., 2019, UMSL, pg. 71

Ord, Rosalynn L., Adriana Tami, and Colin J. Sutherland. 2008. “Ama1 Genes of

Sympatric Plasmodium Vivax and P . Falciparum from Differ

Significantly in Genetic Diversity and Recombination Frequency.” PLoS ONE

3(10).

Outlaw, D. C. and R. E. Ricklefs. 2014. “Species Limits in Avian Malaria Parasites

(Haemosporida): How to Move Forward in the Molecular Era.” Parasitology

141(10):1223–32.

Perkins, Susan L. and Jos. J. Schall. 2002. “A Molecular Phylogeny of Malarial Parasites

Recovered from Cytochrome b Gene Sequences.” The Journal of Parasitology

88(5):972. Retrieved (http://www.jstor.org/stable/3285540?origin=crossref).

Razakandrainibe, F. G., P., Durand, J. C. Koella, T. De Meeüs, F. Rousset, F. J. Ayala,

and F. Renaud 2005. "Clonal" Population Structure of the Malaria Agent

Plasmodium falciparum in High-Infection Regions.” Proceedings of the National

Academy of Sciences of the USA 102:17388–93.

Ricklefs, Robert E., Bethany L. Swanson, Sylvia M. Fallon, Alejandro Martínez- Abraín,

Alexander Scheuerlein, Julia Gray and Steven C. Latta. 2005. “Community

Relationships of Avian Malaria Parasites in Southern Missouri.” Ecological

Monographs 75(4):543–59.

Ricklefs, Robert E., Diana C. Outlaw, Maria Svensson-Coelho, Matthew C. I. Medeiros,

Vincenzo A. Ellis, and Steven Latta 2014. “Species Formation by Host Shifting in

Avian Malaria Parasites.” Proceedings of the National Academy of Sciences of the

United States of America 111(41):14816–21.

Ricklefs, Robert E., Leticia Soares, Vincenzo A. Ellis, and Steven C. Latta. 2016.

71

Humphries, Meghann B., 2019, UMSL, pg. 72

“Haemosporidian Parasites and Avian Host Population Abundance in the Lesser

Antilles.” Journal of Biogeography 43:1277–86.

Soares, Leticia, Steven C. Latta, and Robert E. Ricklefs. 2017. “Dynamics of Avian

Haemosporidian Assemblages through Millennial Time Scales Inferred from Insular

Biotas of the West Indies.” Proceedings of the National Academy of Science

114(25):6635–40.

Svensson, L. Maria E. and Robert E. Ricklefs. 2009. “Low Diversity and High Intra-

Island Variation in Prevalence of Avian Haemoproteus Parasites on Barbados,

Lesser Antilles.” Parasitology 136:1121–31. Retrieved August 15, 2016

(http://journals.cambridge.org/abstract_S0031182009990497).

Tamura, Koichiro, Glen Stecher, Daniel Peterson, Alan Filipski, and Sudhir Kumar.

2013. “MEGA6: Molecular Evolutionary Genetics Analysis Version 6.0.”

Molecular Biology and Evolution 30(12):2725–29.

Taylor, Jesse M. Andreína Pacheco, David J. Bacon, Mohammad A. Beg, Ricardo Luiz

Dantas Machado, Rick M. Fairhurst, Socrates Herrera, Jung-Yeon Kim, Didier

Menard, Marinete Marins Póvoa, Leopoldo Villegas, Mulyanto, Georges Snounou,

Liwang Cui, Fadile Yildiz Zeyrek, and Ananias A. Escalante. 2013. “The

Evolutionary History of Plasmodium vivax as Inferred from Mitochondrial

Genomes: Parasite Genetic Diversity in the .” MBE Advance Access.

Valkiūnas, Gediminas. 2004. Avian Malaria Parasites and Other Haemosporidia. CRC

press.

Valkiūnas, Gediminas and Tatjana A. Iezhova. 2018. “Keys to the Avian Malaria

Parasites.” Malaria Journal 17.

72

Humphries, Meghann B., 2019, UMSL, pg. 73

Valkiūnas, Gediminas, Tatjana A. Iezhova, Claire Loiseau, and Ravinder N. M. Sehgal.

2009. “Nested Cytochrome B Polymerase Chain Reaction Diagnostics Detect

Sprozoites of Hemosporidian Parasites in Peripheral Blood of Naturally Infetected

Birds.” Journal of Parasitology 95(6):1512–15.

Waldenström, J., S. Bensch, D. Hasselquist, and Ö. Östman. 2004. “A New Nested

Polymerase Chain Reaction Method Very Efficient in Detecting Plasmodium and

Haemoproteus Infections From Avian Blood.” Journal of Parasitology 90(1):191–

94.

Zakeri, Sedigheh, Hengameh Sadeghi, Akram Abouie Mehrizi, and Navid Dinparast

Djadid. 2013. “Population Genetic Structure and Polymorphism Analysis of Gene

Encoding Apical Membrane Antigen-1 (AMA-1) of Iranian Plasmodium vivax Wild

Isolates.” Acta Tropica 126(3):269–79.

73

TABLES AND FIGURES

OZ01 OZ04 Major OZ04 Minor OZ04 OZ14 Number of Sequences 232 170 108 62 134 Number of haplotypes 18 22 10 12 10 Haplotype Diversity 0.607 0.717 0.362 0.804 0.430 Number of Host species 31 16 12 12 23 Number of locations 18 11 9 8 12

Table 1. AMA1 ample information and summary statistics for lineages OZ01, OZ04, and OZ14. OZ04 Major includes only the major cluster, and Minor OZ04 includes only the apparently introgressed samples.

OZ01 OZ04 Major OZ04 Minor OZ04 OZ14 P. lucens P. vivax P. falciparum 0.043/ ~ 0.016‡ 0.01†† π 0.004 0.173 0.006 0.004 0.007 0.003† 0.0174§ 0.0098‡‡

S % 25 12.7 21.6 21.5 20.7 22† 24 ¶ -

Rm 3 9 1 1 1 0† 9 ¶ 3§§

Table 2. Nucleotide diversity (π), percentage of synonymous variation (S%), and recombination estimates (Rm) for avian and human Plasmodium lineages. Lineage OZ04 estimates of π are for all sequences, including suspected cases of introgression, Major OZ04 includes only the major cluster, and Minor OZ04 includes only the apparently introgressed samples. Locations of samples for congeneric comparisons: †Africa (for all sequences above, for a reduced dataset excluding 7 sequences of uncertain taxonomic identity below, from Lauron et al. 2014) ‡Venezuela (Grynberg et al. 2008), § Brazil (Figtree et al. 2000), ¶ Sri Lanka (Gunasekera et al. 2007), †† Iran (Abouie Mehrizi et al. 2013), ‡‡10 locations distributed globally (Escalante et al. 2001), §§Venezuela (Ord et al. 2008).

Humphries, Meghann B., 2019, UMSL, pg. 76

OZ01 OZ04 OZ14 Introgr. Heterozygous Individuals 13 3 7 13 # of ind. With 1 heterozygous position 12 1 3 13 # of ind. With 2 heterozygous positions 1 1 4 0 # of ind. With 3 heterozygous positions 0 1 0 0

Table 3. AMA1 genotype heterozygosity for lineages OZ01, OZ04, and OZ14, and for the suspected introgressed group.

76

Humphries, Meghann B., 2019, UMSL, pg. 77

N Michigan Illinois Indiana Penn. Mexico Ozarks St. Louis Michigan 20 - + - + + + Illinois 32 0 + + + + + Indiana 34 0.484 0.5 + + + + Penn. 24 0.088 0.09 0.302 + - + Mexico 26 0.253 0.277 0.119 0.121 + - Ozarks 34 0.297 0.303 0.1 0.082 0.067 - St. Louis 22 0.289 0.315 0.083 0.137 0.019 0.048 Table 4. Pairwise FST among locations for lineage OZ01 (only locations with 20 or more detections are included in this analysis), N = sample, p-value <0.05 indicated by + and p- value >0.05 indicated by - in upper diagonal. Note: negative FST value reported as 0.

77

Humphries, Meghann B., 2019, UMSL, pg. 78

JAM OZ GU DO CF-PR RB-PR SK SL Sum Coereba flaveola 7 1 1 9 Tiaris bicolor 2 3 1 1 1 8 Euneornis campestris 3 3 Passerina cyanea 2 2 Vireo olivaceus 2 2 Cinclocerthia ruficauda 1 1 Setophaga plumbea 1 1 Icteria virens 1 1 Loxipasser anoxanthus 1 1 Loxigilla noctis 1 1 Mniotilta varia 1 1 Turdus plumbeus 1 1 Sum 13 6 6 2 1 1 1 1 31 Table 5. Host and geographic location of introgressed parasites. Location abbreviations are as follows: JAM, Jamaica; OZ, Ozarks region of Missouri; GU, Guadeloupe; DO, ; CF-PR, Carite Forest, Puerto Rico; RB-PR, Refugio Boqueron, Puerto Rico; SK, Saint Kitts; SL, Saint Lucia.

78

Humphries, Meghann B., 2019, UMSL, pg. 79

Figure 1. Overall detections of lineages OZ01 (blue triangles), OZ04 (red diamonds), and OZ14 (green circles). OZ01 is broadly distributed in the Neotropics and Nearctic; OZ04 is primarily restricted to the West Indies with limited detection in North and South America; OZ14 is distributed in the eastern United States and locally in the Greater

79

Humphries, Meghann B., 2019, UMSL, pg. 80

Antilles. All three lineages are known to co-occur on Jamaica, Hispaniola, and in the Ozarks region of Missouri.

Figure 2. . Phylogenetic relationships among the lineages assessed here and other Plasmodium species, based on a maximum likelihood phylogeny of the amino acid sequences produced in Mega6. In accordance with Mega6 model selection, the amino acid substitution model LG + G (Le and Gascuel 2008) was implemented with five gamma categories and 100 bootstrap replicates. Included in the phylogeny are representatives of nine avian and three mammalian malaria species for which AMA1 sequences were available (Genbank accessions KJ722597, KX601270, KX601271, KJ722596, KJ722593, KJ722594, KJ722595, KJ722542, KJ722547, KY905274, ACB55391, and AF352829)

80

Humphries, Meghann B., 2019, UMSL, pg. 81

Figure 3. AMA1 sampling localities depicting lineage co-occurrences and locations of suspected introgression (crosses). Introgressed parasites were detected at seven localities where OZ01 and OZ04 are known to co-occur and one locality where OZ01 has not been detected (Refugio Boquerón, PR).

81

Humphries, Meghann B., 2019, UMSL, pg. 82

Figure 4. . Sliding window analysis of nucleotide diversity for OZ01, OZ04, OZ14, and P. falciparum from Escalante et al. (2001). Window, 30 bp; step size, 9 bp

82

Humphries, Meghann B., 2019, UMSL, pg. 83

Figure 5. Median joining haplotype networks of AMA1 variation for mitochondrial lineages OZ01 (A), OZ04 (B), and OZ14 (C). Circle size indicates the number of individuals with a given haplotype, color indicates host family/superfamily, and black hash marks indicate mutations. The gray oval in panel A encloses the group primarily infecting Passerina.

83

Humphries, Meghann B., 2019, UMSL, pg. 84

Figure 6. Median joining haplotype networks of AMA1 depicting combined lineages OZ01, OZ04, and OZ14. Circle size indicates the number of individuals with a given haplotype, color indicates mitochondrial lineage designation, and black hash marks indicate mutations.

84

Humphries, Meghann B., 2019, UMSL, pg. 85

90

80

70

60

50

40

30

20

10

0

Brazil 1 Brazil 2 Brazil 3 Brazil 4 Brazil

Antigua

Jamaica

Panama

Ecuador

Illinois 1 Illinois 2 Illinois

Trinidad

Grenada

Alabama

Bahamas

Mexico 1 Mexico 2 Mexico

Dominica

Saint Kitts Saint

Venezuela

Missouri 1 Missouri 2 Missouri 3 Missouri 4 Missouri

Tennessee

Saint Lucia Saint

Mississippi

Michigan 1 Michigan 2 Michigan

Martinique

Montserrat

Connecticut

Guadeloupe

Pennsylvania

Puerto Rico 1 Rico Puerto 2 Rico Puerto 3 Rico Puerto 4 Rico Puerto 5 Rico Puerto

NorthCarolina Dominican Republic Dominican

OZ01 samples OZ04 samples

Figure 7. Number of detections by location for lineages OZ01 and OZ04.

85

Humphries, Meghann B., 2019, UMSL, pg. 86

Figure 8. . Number of detections by host species for lineages OZ01 and OZ04.

86

Humphries, Meghann B., 2019, UMSL, pg. 87

Figure 9. Median joining haplotype networks of AMA1 for only introgressed samples. Circle size indicates the number of individuals with a given haplotype, color indicates location, and black hash marks indicate mutations.

87

Figure 10. Median joining haplotype network of 51 P. lucens AMA1 sequences depicting the main group (red) and the divergent group (blue). All samples were recovered from a population of olive sunbirds (Cyanomitra olivacea) in Cameroon (Lauron et al 2014).

Humphries, Meghann B., 2019, UMSL, pg. 89

Chapter 3.

Alpha-globin variation does not predict avian malaria infection in the West Indian bananaquit, Coereba flaveola

Meghann B. Humphries1*, Robert E. Ricklefs1

1 Department of Biology, University of Missouri–Saint Louis, St. Louis, Missouri, USA

Abstract

Infection by lineages of avian malaria parasites (Apicomplexa: Haemosporida) varies geographically, and some lineages exhibit disjunct distributions. These patterns might be related to differential resistance among host populations reflecting intrinsic characteristics of both the host and the pathogen. In hematophagous parasites, in particular, the structure of host hemoglobin can influence parasite development and reproduction. Though variation in avian hemoglobin has been documented across altitudinal gradients, little is known about hemoglobin variation as it relates to infection by the parasites causing avian malaria. We sequenced the αA -globin subunit of Coereba flaveola, and related sequence variation to avian malaria infection frequency and parasite lineage identity. We found no association between αA -globin haplotype and infection by particular parasite lineages among all locations, nor any protective association between globin haplotype frequency and the proportion of individuals infected within populations.

Phylogeographic structure and genetic variation at the αA -globin locus, including a highly variable intron, is largely concordant with the mitochondrial cytochrome-b locus

89

Humphries, Meghann B., 2019, UMSL, pg. 90 for these same populations, supporting this marker as an independent and variable target with potential application in biogeographic analyses.

Introduction

The frequency of infection by a particular lineage of avian malaria parasite

(Haemosporida: Plasmodium spp. and Haemoproteus spp.) on a single host varies geographically, and some lineages exhibit disjunct distributions (Fallon et al. 2005;

Ricklefs et al. 2016). The host species Coereba flaveola (Thraupidae), familiarly known as the bananaquit, is widely distributed throughout the Caribbean Basin and continental tropical America. Despite its ubiquity, three-quarters of described parasite lineages infecting this species each occur on only one or two islands. Moreover, most local populations of C. flaveola sustain three or fewer lineages of parasite (figure 1). These patterns might be related to variation in resistance among host populations to particular parasite lineages. In humans, 10 – 15% of the risk of malaria infection and severity is determined by genetic factors that cause variations in either the immune response or in red blood cell (RBC) structure or function (Mackinnon et al. 2000). The latter has been widely investigated in humans and is considered a substantial influence on susceptibility to malarial infection (De Mendonça et al. 2012).

Hemoglobin (Hb) variation is one type of RBC variation that is especially important in interactions with hematophagous parasites. Hemoglobin is partly composed of heme, which is an iron-rich compound required by hematophagous parasites for growth, development, and reproduction. Heme is required by most life on earth, including parasites, for a variety of cellular processes including antioxidant defense, electron

90

Humphries, Meghann B., 2019, UMSL, pg. 91 transfer in mitochondrial cellular respiration, and the detection and transfer of gases, including oxygen (van Dooren et al. 2012). Because heme is highly reactive and can be toxic, intracellular levels must be carefully regulated. Hematophagous parasites process large amounts of this compound in the course of metabolizing hemoglobin, and excess heme molecules are polymerized and sequestered as hemozoin (Toh et al. 2010; De

Mendonça et al. 2012; van Dooren et al. 2012). Consequently, variation in the structure of host hemoglobin molecules that affect these processes can influence parasite fitness and disease progression.

The heme structure within the hemoglobin molecule is surrounded by globin protein subunits. Each hemoglobin molecule incorporates two α-type and two β-type globins (Hardison 2012). Most bird species have retained an identical complement of three tandemly linked alpha-type globin genes (αE-, αD-, and αA -globin) and four tandemly linked β-type globin genes (βH-, βA-, ρ-, and ε-globin), although some of these genes, especially ε-globin, are truncated or exist as pseudogenes in some species (Opazo et al. 2015). The α-type globin genes are expressed differentially during the course of development: αA- and αD-globins are typically unevenly co-expressed postnatally, with

αA-globin comprising the majority of the expressed protein. Of the β-type globins, only

βA-globin is expressed at appreciable levels postnatally (Opazo et al. 2015).

Functionally consequential variation in hemoglobin can result from as little as a single amino acid change in one of the globin genes, while other variants result from deletions or premature stop codons (De Mendonça et al. 2012; Weatherall 2008). In humans, a single amino acid substitution causes the structural change known as sickle cell (HbS), and additional human thalassemia variants exhibit either a deletion (HbE) or a

91

Humphries, Meghann B., 2019, UMSL, pg. 92 point mutation (HbC) causing a premature end to transcription. Hemoglobin variants have also been documented in orangutans (Steiper et al. 2006), macaques, and baboons

(Barnicot et al. 1966). Human thalassemia and sickle cell genes in heterozygotes have been shown to confer protection against severe malaria progression, or resistance to P. falciparum, though the homozygous genotype is deleterious (De Mendonça et al. 2012;

Weatherall 2008). Hemoglobin variation is similarly thought to confer resistance to malaria in orangutans (Steiper et al. 2006), but no adaptive benefit has been identified in other non-human primates.

Previous intraspecific surveys of avian globin molecular diversity have revealed adaptive modification of oxygen-binding affinity related to altitude (Natarajan et al.

2015; Galen et al. 2015). If hemoglobin variation additionally influenced the susceptibility of birds to avian malaria parasites, or the severity of infections, then contemporary globin variation among avian populations might explain variation in the prevalence or intensity of infection. In the present study, we assessed genetic variation of the αA -globin subunit (HBA) in bananaquit populations throughout much of the species’ range in the West Indies, to test whether associations exist between globin variants and parasitism by avian malaria parasites. If an association were to exist, we would expect parasite prevalence to be inversely related to the frequency of an advantageous HBA haplotype.

In addition to potentially being associated with disjunctions and turnover in avian malaria infections among island populations, HBA provides an independent nuclear genetic locus that can inform demographic patterns inferred from mitochondrial variation. Accordingly, we compare patterns of genetic diversity and population structure

92

Humphries, Meghann B., 2019, UMSL, pg. 93 inferred from HBA variation to patterns based on an independent mitochondrial locus, cytochrome-b (CYTB) (Pil 2015).

Methods

Coereba flaveola populations throughout the West Indies (table 1, figure 2) have been sampled over the course of several years by capturing birds with mist-nets and obtaining ca. 10 µL of blood via sub-brachial venipuncture (field techniques described by Latta &

Ricklefs (2010)). Six individuals of Loxigilla noctis and four Tiaris bicolor (both

Thraupidae), from Barbados, were included for outgroup comparisons. DNA was extracted from blood using the isopropanol precipitation technique described in Svensson and Ricklefs (2009), and all samples were screened for avian malaria using primers 343F and 496R (Fallon et al. 2003). Samples that screened positive were genotyped to identify the cytochrome-b lineage of the infection (Bensch et al. 2009) using a variety of primers and PCR conditions described in Perkins and Schall (2002), Ricklefs et al. (2005), and

Waldenström et al. (2004).

The gene encoding HBA was amplified using primers HBAF ( 5’TTC GGC AAA

ATY GGC GGC CAK GCC GA3’) and HBAR (5’CCA CGG CAC ACA SGA ACT

TGT CCA GG3’) developed for our study using HYDEN software (available at http://acgt.cs.tau.ac.il/hyden/ ) with reference sequences from Opazo et al. (2015). PCR conditions and thermocycling protocols for HBA were as follows. All polymerase chain reactions (PCRs) were performed in 25 µL volumes using Immomix Red PCR premix

(Bioline product BIO-25022). Reaction volumes were composed of 12.5 µL of

Immomix, 10 µL of ddH20, 0.5 µL of 50 mM MgCl2 Solution, 0.5 µL of each primer (10 mM), and 1 µL of genomic DNA (concentrations varied from 30 – 70 ng/ µL).

93

Humphries, Meghann B., 2019, UMSL, pg. 94

Thermocycling conditions began with a hot-start at 95˚C for 10 minutes, followed by 30 cycles of a 30-second denaturing step at 95˚C, a 45-second annealing step at 62˚C, and - second extension step at 72˚C. A final extension step was carried out at 72˚C for 5 minutes. Positive and negative controls were included in each PCR reaction, and products were verified by visualization on 1% TAE agarose gels with ethidium bromide. PCR product was cleaned using the ExoSAP-IT protocol (Bell 2008) and sequenced by

Eurofins Genomics (Louisville, KY). Contigs were aligned and edited in Mega7 (Tamura et al. 2013) and chromatograms were checked by eye to confirm polymorphisms. HBA sequences were coded for ambiguous bases in Mega7 and phased for heterozygosity in

DNAsp v.5 (Librado & Rozas 2009).

To determine whether HBA genotypes are differentiated among sampling locations, we calculated fixation indices (pairwise FST values (Excoffier & Lischer 2010)) with confidence intervals based on 1000 permutations (Arlequin v.3.5, (Excoffier et al.

2005)) for the entire gene sequence and for the sequence with the introns removed

(nucleotide positions 1-135 and 339-439, identified in Cheviron et al. (2014)). A median- joining haplotype network for the coding sequence was generated using PopArt

(available at http://popart.otago.ac.nz) to compare and visualize relationships between haplotypes at each sample location. Nucleotide diversity (π) and haplotype diversity (Hd) for each sampling locality were calculated using DNAsp v.5 (Librado & Rozas 2009).

To determine whether particular globin genotypes are associated with higher or lower rates of parasitism by particular lineages of parasite, we performed a chi-squared test on the numbers of infected individuals that carried or did not carry the allele for all full-length haplotypes detected 10 or more times (hereafter common haplotypes) and for

94

Humphries, Meghann B., 2019, UMSL, pg. 95 the four commonest coding sequence haplotypes. We regressed the infection prevalence of common globin haplotypes where they are most abundant against all other haplotypes in the same locality to determine whether infection rates on common haplotypes varied from the overall rate of a given locality. Finally, to determine the degree of concordance among genetic distances observed at HBA and CYTB we performed a Mantel test (999 permutations) and Mantel correlogram of the pairwise FST values for each marker using the R package vegan.

Results

Our samples included HBA sequences from 382 C. flaveola individuals at 19 localities across the species range in the West Indies (table 1). Of these individuals, 77 (20%) exhibited haemosporidian infections representing at least 11 parasite lineages; sample sizes per locality ranged from 5 to 48 individuals. The full-length HBA sequence (494 base pairs) was highly variable (152 unique haplotypes), in contrast to the highly conserved coding sequence. We recovered only 22 unique coding sequence haplotypes

(i.e., excluding nucleotide positions 1-135, 339-439). One HBA coding sequence haplotype, which comprised 75% of all sequences, was distributed in all localities sampled, and was shared with one L. noctis sample (figure 3). Four additional haplotypes were somewhat common (13 – 77 detections), and the remaining 17 haplotypes were detected rarely. Twenty of the twenty-two coding sequence haplotypes differed by a single mutational step from the most common haplotype, and all outgroup samples possessed haplotypes also recovered in C. flaveola. Eighteen of the 22 coding sequence haplotypes exhibited exclusively synonymous nucleotide variation and only two

95

Humphries, Meghann B., 2019, UMSL, pg. 96 nonsynonymous substitutions were detected in the remaining three haplotypes: 86V>A

(shared by haplotypes 12 and 17) and 69A>T (haplotype 8).

Comparisons of the relative abundances of the full-length haplotypes among locations revealed that 8 of the 12 haplotypes detected 10 or more times were abundant on one or two islands and were comparatively rare elsewhere (figure 4).

Population nucleotide diversity (π) and haplotype diversity (Hd) for the HBA coding sequence ranged from 0.015 to 0.066 and 0.214 to 0.644, respectively (table 2).

The majority of populations (12/20) contained either two or three HBA haplotypes, though as many as eight haplotypes were recovered from some localities, i.e., Guánica

Forest (Puerto Rico) and Grenada. Loxigilla noctis and Tiaris bicolor samples revealed three coding sequence haplotypes that were shared with the C. flaveola samples. The intron sequence exhibits substantial variation among individuals, including indels that varied in size among species.

We found no difference in the probability of malaria infection for nine of the twelve common HBA haplotypes when compared to the prevalence of all other haplotypes. The other three full-length haplotypes exhibited significantly higher than average rates of infection (table 3, figure 5). Similarly, three of the four commonest coding-sequence (CDS) haplotypes exhibited no difference in the probability of malaria infection compared to the prevalence of all other haplotypes, and the sole statistically significantly different comparison was one in which the rate of parasitism is higher than in all other haplotypes (figure 6). Additionally, malaria prevalence was not significantly lower on a particular haplotype where it was most common compared to all other haplotypes in the same location (figure 7). Finally, higher genetic diversity at HBA was

96

Humphries, Meghann B., 2019, UMSL, pg. 97 not associated with lower rates of parasitism among locations; the overall prevalence was

29.69% while prevalence in the four locations with the highest HBA diversity

(Guadeloupe, Jamaica, Nevis, and ) ranged from 8.3% to 75%.

Population structure inferred by FST was largely concordant between HBA (full- length sequence: table S3; coding sequence: table S4) and CYTB (table S5), with HBA exhibiting generally lower values, as expected in comparisons of nuclear and mitochondrial markers. A Mantel test of the FST matrices for HBA and CYTB revealed that the two are significantly correlated (Figure 8, Mantel R statistic: 0.527, p-value

0.002), but that the correlation breaks down at higher distance classes. This breakdown is attributed to the lower genetic distances at HBA for bananaquits from Jamaica, Puerto

Rico, Barbados, and Saint Vincent, relative to CYTB distances for the same populations

(Figures 8, S1).

Discussion

Findings reported here suggest that while HBA coding sequence does vary among populations of C. flaveola, consistent with findings reported for neutral mitochondrial variation, these patterns are likely not related to interactions with avian malaria parasites.

We find no support for particular HBA variants conferring resistance to infection by particular lineages of malaria parasites, either in terms of overall rates of infection or in relation to variation in infection rates among localities. We suspected that parasite prevalence might be inversely related to the frequency of an advantageous HBA haplotype but found that the haplotypes with highest frequencies had average or above- average parasite prevalence. Nonetheless, the distribution of genetic variation at HBA among locations, particularly in the intronic regions, does support this marker as a

97

Humphries, Meghann B., 2019, UMSL, pg. 98 promising independent locus for phylogeographic and population studies focused on fine- scale demographic processes. We recovered substantial individual variation and, in interspecific comparisons, we found indels of varying length.

We were unable to design suitable primers for the α subunit of β-globin (HBB) in

C. flaveola with currently available data, though this marker may warrant further investigation. β-globin makes up half the tetrad of the hemoglobin molecule and therefore we would expect it to be as consequential as α-globin to hematophagous parasite fitness.

As previously described, among the four tandemly linked β-globin genes in most avian species only HBB is expressed at appreciable levels postnatally. Support for the response of this gene to natural selection was reported by Galen et al. (2015), who found functionally consequential variation in the oxygen-binding affinity of HBB along an altitudinal gradient.

Parasite population disjunctions and lineage turnover observed in island populations of C. flaveola appear independent of globin variation and might be better explained by variation at immune markers such as the major histocompatibility complex

(MHC) and toll-like receptors (TLR). Loiseau et al. (2011) found evidence of local adaptation of Passer domesticus MHC diversity in relation to Plasmodium relictum infection and inverse associations of some alleles with parasite infection across locations.

TLR variation is reported to impact clinical outcomes of malaria in humans (Leoratti et al.

2008) and these genes, along with MHC, have recently been annotated in C. flaveola

(Antonides et al. 2017), making this avenue suitable for further investigation.

98

Humphries, Meghann B., 2019, UMSL, pg. 99

Literature Cited Antonides, J., Ricklefs, R. & DeWoody, J. 2017. The genome sequence and insights into

the immunogenetics of the bananaquit (Passeriformes: Coereba flaveola).

Immunogenetics, 69, pp. 175–186.

Barnicot, N., Huehns, E. & Jolly, C. 1966. Biochemical studies on haemoglobin variants

of the irus macaque. Proceedings of the Royal Society of London B: Biological

Sciences, 165(999), pp. 224–244.

Bell, J. 2008. A Simple Way to Treat PCR Products Prior to Sequencing Using ExoSAP-

IT ®. Biotechniques, 44(6), p. 2008.

Bensch, S., Hellgren, O. & Pérez-Tris, J. 2009. MalAvi: A public database of malaria

parasites and related haemosporidians in avian hosts based on mitochondrial

cytochrome b lineages. Molecular Ecology Resources, 9(5), pp. 1353–1358.

Cheviron, Z., Natarajan, C., Projecto-Garcia, J., Eddy, D., Jones, J., Carling, M., Witt, C.,

Mariyama, H., Weber, R., Fago, A., & Storz, J. 2014. Integrating evolutionary

and functional tests of adaptive hypotheses: a case study of altitudinal

differentiation in hemoglobin function in an Andean sparrow, Zonotrichia

capensis. Molecular Biology and Evolution, 31(11), pp.2948–2962. van Dooren, G., Kennedy, A. & Mcfadden, G. 2012. The use and abuse of heme in

apicomplexan parasites. Antioxidants and Redox Signaling, Forum Review Article

Online.

99

Humphries, Meghann B., 2019, UMSL, pg. 100

Excoffier, L., Laval, G. & Schneider, S. 2005. Arlequin (version 3.0): an integrated

software package for population genetics data analysis. Evolutionary

Bioinformatics Online, 1, pp.47–50.

Excoffier, L. & Lischer, H. 2010. Arlequin suite ver 3.5: A new series of programs to

perform population genetics analyses under Linux and Windows. Molecular

Ecology Resources, 10(3), pp. 564–567.

Fallon, S., Ricklefs, R., Swanson, B., & Bermingham, E. 2003. Detecting Avian Malaria:

An Improved Polymerase Chain Reaction Diagnostic. Journal of Parasitology,

89(5), pp. 1044–1047.

Galen, S., Natarajan, C., Moriyama, H., Weber, R., Fago, A., Benham, P., Chavez, A>,

Cheviron, Z., Storz, J., & Witt, C. 2015. Contribution of a mutational hot spot to

hemoglobin adaptation in high-altitude Andean house wrens. Proceedings of the

National Academy of Sciences, 112(45), pp.13958–13963.

Hardison, R. 2012. Evolution of hemoglobin and its genes. Cold Spring Harbor

Perspectives in Medicine, 2(12), pp.1–18.

Latta, S. & Ricklefs, R. 2010. Prevalence patterns of avian haemosporida on Hispaniola.

Journal of Avian Biology, 41, pp.25–33.

Leoratti, F. Farias, L., Alves, F., Suarez-Mútis, M., Coura, J., Kalil, J., Camargo, E.,

Meraes, S., & Ramasawny, R. 2008. Variants in the Toll-Like Receptor Signaling

Pathway and Clinical Outcomes of Malaria. The Journal of Infectious Diseases,

198, pp. 772-780.

100

Humphries, Meghann B., 2019, UMSL, pg. 101

Librado, P. & Rozas, J. 2009. DnaSP v5: A software for comprehensive analysis of DNA

polymorphism data. Bioinformatics, 25(11), pp.1451–1452.

Loiseau, C. Zoorob, R., Robert, A., Chastel, O., Julliard, R. & Sorci, G., 2011.

Plasmodium relictum infection and MHC diversity in the house sparrow (Passer

domesticus). Proceedings of the Royal Society B: Biological Sciences, 278,

pp.1264–1272.

Mackinnon, M., Gunawardena, D., Rajakaruna, J., Weerasingha, S., Mendis, K., &

Carter, R. 2000. Quantifying genetic and nongenetic contributions to malarial

infection in a Sri Lankan population. Proceedings of the National Academy of

Science, 97(23), pp.12661–12666.

De Mendonça, V., Goncalves, M., & Barral-Netto, M. 2012. The host genetic diversity in

malaria infection. Journal of Tropical Medicine, 2012, pp.1–17.

Natarajan, C., Projecto-Garcia, J., Moriyama, H., Weber, R., Munoz-Fuentes, V, Green,

A., Kopuchian, C., Tubaro, P., Alza, L., Bulgarella, M, Smith, M., Wilson, R.,

Fago, A., McCracken, K., Storz, J. 2015. Convergent Evolution of Hemoglobin

Function in High-Altitude Andean Waterfowl Involves Limited Parallelism at the

Molecular Sequence Level. PLoS Genetics, 11(12), pp.1–25.

Opazo, J., Hoffmann, F., Natarajan, C., Witt, C., Berenbrink, M., Storz, J. 2015. Gene

turnover in the avian globin gene families and evolutionary changes in

hemoglobin isoform expression. Molecular Biology and Evolution, 32(4), pp.871–

887.

101

Humphries, Meghann B., 2019, UMSL, pg. 102

Perkins, S. & Schall, J. 2002. A Molecular Phylogeny of Malarial Parasites Recovered

from Cytochrome b Gene Sequences. The Journal of Parasitology, 88(5), p.972.

Pil, M. 2015. Comparative phylogeography and demographic histories of West Indian

birds. Doctoral Dissertation. University of Missouri-Saint Louis.

Ricklefs, R. Swanson, B., Fallon, S., Martínez-Abraín, A., Scheuerlein, A., Gray, J.,

Latta, S. 2005. Community Relationships of Avian Malaria Parasites in Southern

Missouri. Ecological Monographs, 75(4), pp.543–559.

Ricklefs, R. Soares, L., Ellis, V., & Latta, S. 2016. Haemosporidian parasites and avian

host population abundance in the Lesser Antilles. Journal of Biogeography, 43,

pp.1277–1286.

Steiper, M., Wolfe, N., Karesh, W., Kilbourn, A., Bosi, E., Ruvolo, M. 2006. The

phylogenetic and evolutionary history of a novel alpha-globin-type gene in

orangutans (Pongo pygmaeus). Infection, Genetics and Evolution, 6, pp.277–286.

Svensson, L. & Ricklefs, R. 2009. Low diversity and high intra-island variation in

prevalence of avian Haemoproteus parasites on Barbados, Lesser Antilles.

Parasitology, 136, pp.1121–31.

Fallon, S., Bermingham, E. & Ricklefs, R. 2005. Host Specialization and Geographic

Localization of Avian Malaria Parasites: A Regional Analysis in the Lesser

Antilles. The American Naturalist, 165(4), pp.466–480.

Tamura, K., Stecher, G., Peterson, D., Filipski, A., Kumar, S. 2013. MEGA6: Molecular

evolutionary genetics analysis version 6.0. Molecular Biology and Evolution,

30(12), pp.2725–2729.

102

Humphries, Meghann B., 2019, UMSL, pg. 103

Toh, S.Q. Glanfield, A., Gobert, G., & Jones, M. 2010. Heme and blood-feeding

parasites: friends or foes? Parasites & vectors, 3(108), pp.1–10.

Waldenström, A., Bensch, S., Hasselquist, D., & Östman, Ö. 2004. A new nested

polymerase chain reaction method very efficient in detecting plasmodium and

haemoproteus infections from avian blood. Journal of Parasitology, 90(1),

pp.191–194.

Weatherall, D. 2008. Genetic variation and susceptibility to infection: The red cell and

malaria. British Journal of Haematology, 141, pp.276–286.

103

Humphries, Meghann B., 2019, UMSL, pg. 104

Tables and Figures

Table 1: Sample information for HBA sequence data. C. flaveola N individuals 382 N infections 77 N locations 19 N haplotypes (CDS) 152 (22) N parasite lineages 11

104

Humphries, Meghann B., 2019, UMSL, pg. 105

Table 2. Sampling and diversity statistics for the HBA coding sequence. N, number of sequences; H, number of haplotypes; π, nucleotide diversity; Hd, haplotype diversity.

N H π Hd Antigua (AN) 10 3 0.049 0.644 Barbados (BA) 26 2 0.018 0.271 Barbuda (BU) 20 2 0.022 0.337 British Virgin Islands (BVI) 50 2 0.033 0.490 Grand Cayman, (C) 34 2 0.020 0.300 Dominica (DO) 22 2 0.016 0.247 Dominican Republic (DR) 38 4 0.028 0.360 Eleuthera, Bahamas (ELE) 34 2 0.030 0.451 El Yunque, Puerto Rico (EY) 44 4 0.015 0.214 Grenada (GR) 96 8 0.024 0.330 Guánica Forest, Puerto Rico (GF) 56 8 0.039 0.494 Guadeloupe (GU) 14 4 0.051 0.571 Jamaica (J) 24 6 0.052 0.544 Carite Forest, Puerto Rico (LP) 34 3 0.018 0.266 (MA) 20 3 0.019 0.279 Montserrat (MO) 19 3 0.021 0.307 Nevis (NE) 40 5 0.057 0.631 Saint Lucia (SL) 16 3 0.049 0.633 Saint Vincent (SV) 20 5 0.050 0.626

105

Humphries, Meghann B., 2019, UMSL, pg. 106

Table 3. Chi-square and p values for tests assessing infected vs. not infected for each haplotype against all other haplotypes. Tests include the four commonest coding sequence haplotypes (CDS) and all full-length haplotypes (FL) detected ten or more times. All tests have df = 1, p-values <0.05 highlighted in bold. In all significant tests, individuals with that haplotype exhibited higher than average infection prevalence.

Chi-square p value CDS Hap1 0.4236 0.5152 CDS Hap2 23.7220 < 0.0001 CDS Hap3 0.0008 0.9770 CDS Hap4 0.5369 0.4637 FL Hap 1 0.2993 0.5843 FL Hap 2 4.3173 0.0377 FL Hap 3 0.1127 0.7371 FL Hap 6 2.6345 0.1046 FL Hap 7 10.2405 0.0014 FL Hap 8 0.2372 0.6263 FL Hap 9 0.9254 0.3361 FL Hap 11 0.3752 0.5402 FL Hap 16 0.6495 0.4203 FL Hap 19 0.0002 0.9901 FL Hap 38 5.6071 0.0179 FL Hap 39 1.1229 0.2893

106

Humphries, Meghann B., 2019, UMSL, pg. 107

100%

90%

80%

70%

60%

50%

40%

30%

20%

10%

0%

DR 03 DR 11 GA 01 JA 01 JA 02 JA 06 LA 07 LA 19 LA 22 LA01 LA02 LA07 OZ 01 OZ 02 OZ 02, unique OZ 04 OZ 06 OZ 08 OZ 19 OZ 21 OZ 32 TOC28 YU 04 unique uninfected

Figure 1. C. flaveola malaria infection by location as a percent of total captures (these analyses include more individuals than those in the HBA/CYTB analyses). Colored portions represent infection by a particular lineage, and sample sizes are in parenthesis on the x-axis. Location abbreviations available in Table S1.

107

Humphries, Meghann B., 2019, UMSL, pg. 108

Figure 2. Sampling localities in the West Indies.

108

Humphries, Meghann B., 2019, UMSL, pg. 109

Figure 3. Median joining haplotype network for the HBA coding sequence. Circle size indicates the number of individuals with a given haplotype, color indicates collection location, and black hash marks indicate mutations. Six Loxigilla noctis and four Tiaris bicolor from Barbados were included for outgroup comparisons though all outgroup samples possessed haplotypes also recovered in C. flaveola.

109

Humphries, Meghann B., 2019, UMSL, pg. 110

Figure 4. Relative abundance among locations for each haplotype recovered 10 or more times. Abundances are scaled relative to the total number of detections of a particular haplotype and an upper limit to circle size was applied in some cases for clarity.

110

Humphries, Meghann B., 2019, UMSL, pg. 111

35 * 30 25 * * 20 15 10 5 0 -5 -10

-15 OR DO NOT CARRY THE ALLELETHEDONOT CARRY OR FL Hap FL Hap FL Hap FL Hap FL Hap FL Hap FL Hap FL Hap FL Hap FL Hap FL Hap FL Hap

DIFFERENCE IN % INFECTED THAT% CARRY IN INFECTED DIFFERENCE 1 2 3 6 7 8 9 11 16 19 38 39

Figure 5. For full-length sequences detected 10 or more times, the difference in percent infected that carry (+) versus do not carry (-) a given allele. * indicates Chi-square tests with p-value < 0.05, Chi-square values reported in Table 3.

111

Humphries, Meghann B., 2019, UMSL, pg. 112

0.2 * 0.15

0.1

0.05

0 CARRY ALLELE THE CARRY

-0.05

THAT CARRY OR DO NOT DOOR NOT CARRY THAT DIFFERENCE IN % INFECTED % DIFFERENCE IN -0.1 CDS Hap1 CDS Hap2 CDS Hap3 CDS Hap4

Figure 6. For the commonest HBA coding sequences, the difference in percent infected that carry or do not carry a given allele. * indicates Chi-square tests with p-value < 0.05, Chi-square values reported in Table 3.

112

Humphries, Meghann B., 2019, UMSL, pg. 113

0.6

0.5 R2 = 0.785 0.4 p = 0.003

0.3

0.2

0.1 Prevalence Prevalence inother of CFA same locality

0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 Prevalence in abundant locality

Figure 7. Linear regression of infection prevalence of a haplotype where the haplotype is common against all other haplotypes in the same locality (includes HBA full-length sequence haplotypes detected at least 14 times total and eight or more times in the ‘abundant’ locality). Infection rates on the abundant HBA haplotype are consistent with infection rates on other haplotypes in the same locality.

113

Humphries, Meghann B., 2019, UMSL, pg. 114

Mantel R 0.5271 P-value 0.002

Figure 8. Mantel Test of CYTB and HBA pairwise FST values. Location abbreviations as follows: AN, Antigua; BA, Barbados; BU, Barbuda; BVI, British Virgin Islands; C, Grand Cayman; DO, Dominica; DR, Dominican Republic; ELE, Eleuthera; EY, El Yunque Forest Puerto Rico; GD, Grenada; GF, Guanica Forest Puerto Rico; GU, Guadeloupe; J, Jamaica; MA, Martinique; MO, Montserrat; NE, Nevis; SL, Saint Lucia; SV, Saint Vincent.

114

Humphries, Meghann B., 2019, UMSL, pg. 115

Supplementary Material

Table S1. Location Abbreviations

LC Little Cayman GC Grand Cayman CB Cayman Brac JA Jamaica DR Dominican Republic PR-RB Refugio Boquerón, Puerto Rico PR-GF Guánica Forest, Puerto Rico PR-EY El Yunque, Puerto Rico PR-LP Las Playitas, Puerto Rico PR-CS Carite Forest, Puerto Rico PR-UPR University of Puerto Rico EL Eleuthera, Bahamas BVI British Virgin Islands SA Saba, Lesser Antilles SE Saint Eustatius, Lesser Antilles SK Saint Kitts NE Nevis AN Antigua BU Barbuda MO Montserrat GU Guadeloupe DO Dominica MA Martinique SL Saint Lucia SV Saint Vincent BA Barbados GR Grenada

115

Humphries, Meghann B., 2019, UMSL, pg. 116

Table S2. Samples sizes for C. flaveola infection by location represented in Figure 1, location abbreviations in Table S1. DR DR GA JA JA JA LA LA LA LA LA LA OZ OZ OZ 02, OZ OZ OZ OZ OZ OZ TO YU uni uninf Grand 03 11 01 01 02 06 07 19 22 01 02 07 01 02 unique 04 06 08 19 21 32 C28 04 que ected Total LC 4 21 20 45

GC 20 5 2 32 59

CB 3 23 3 29

JA 6 2 35 4 96 143

DR 18 3 4 2 1 2 1 1 73 105

PR-RB 1 45 46

PR-GF 43 9 1 1 9 1 1 102 167

PR-EY 1 10 50 61

PR-LP 1 3 15 19

PR-CS 6 1 12 19

PR-UPR 3 3

EL 1 1 15 17

BVI 2 28 30

SA 1 1 5 7

SE 17 88 105

SK 1 1 1 3 27 33

NE 2 3 55 60

AN 4 12 16

BU 6 1 7 14

MO 1 1 1 3 12 18

GU 1 3 13 4 7 28

DO 2 9 7 18

MA 1 1 1 4 7

SL 1 1 14 1 6 2 36 61

SV 6 34 61 101

BA 2 18 50 70

GR 1 29 3 174 207

Grand 24 3 1 1 33 2 91 1 0 1 1 17 6 63 1 62 5 1 12 11 1 0 1 12 1039 1488 Total 0

116

Humphries, Meghann B., 2019, UMSL, pg. 117

Table S3. Pairwise FST for the full-length HBA sequence. FST values below the diagonal, significance above. + indicates p < 0.05.

AN BA BU BVI C DO DR ELE GD GF GU J LP MA MO NE SL SV

AN + + + + + + + + - - + + + - + - +

BA 0.172 + + + + + + + + - + + - - + - -

BU 0.132 0.040 + + - + + + - - + ------

BVI 0.103 0.165 0.102 + + + + + + + + + + + + + +

C 0.604 0.589 0.558 0.489 + + + + + + + + + + + + +

DO 0.272 0.116 0.000 0.146 0.546 + + + + + + - - - - - +

DR 0.184 0.289 0.282 0.228 0.578 0.354 + + + + + + + + + + +

ELE 0.547 0.582 0.565 0.452 0.400 0.589 0.494 + + + + + + + + + +

GD 0.137 0.067 0.168 0.261 0.669 0.289 0.304 0.603 + + + + + + + + +

GF 0.037 0.043 0.023 0.047 0.495 0.080 0.170 0.445 0.126 - + - - - + - +

GU 0.043 0.031 0.000 0.092 0.551 0.066 0.222 0.536 0.117 0.004 + + - - - - -

J 0.130 0.094 0.100 0.103 0.386 0.147 0.216 0.314 0.203 0.060 0.077 + + + + + +

LP 0.165 0.072 0.025 0.051 0.479 0.038 0.269 0.489 0.224 0.021 0.042 0.067 - + + - +

MA 0.175 0.020 0.000 0.106 0.544 0.004 0.288 0.562 0.188 0.023 0.000 0.077 0.006 - - - -

MO 0.061 0.046 0.000 0.119 0.581 0.074 0.215 0.555 0.115 0.013 0.000 0.107 0.073 0.020 - - -

NE 0.120 0.082 0.000 0.125 0.525 0.015 0.248 0.524 0.214 0.058 0.006 0.133 0.066 0.012 0.020 - +

SL 0.094 0.076 0.000 0.076 0.523 0.028 0.259 0.540 0.230 0.029 0.000 0.099 0.040 0.003 0.033 0.000 -

SV 0.141 -0.010 0.021 0.144 0.542 0.069 0.268 0.546 0.109 0.043 0.019 0.077 0.052 0.002 0.041 0.054 0.032

117

Humphries, Meghann B., 2019, UMSL, pg. 118

Table S4. Pairwise FST for the HBA coding sequence. FST values below the diagonal, significance above. + indicates p < 0.05.

AN BA BU BVI C DO DR EL EY GD GF GU J LP MA MO NE SL SV

AN + - - + + + + + + - - + + - - - - +

BA 0.257 + + + + + + + - + + + + - - + + -

BU 0.032 0.144 - + - + + - + - - + - - - - - +

BVI 0.000 0.292 0.052 + + + + + + + - + + + + + - +

C 0.309 0.136 0.158 0.305 + + + + + + + - + + + + + +

DO 0.115 0.108 0.000 0.118 0.123 + + - + - - + - - - - + +

DR 0.269 0.118 0.135 0.288 0.136 0.104 + + + + + + + + + + + +

EL 0.538 0.557 0.540 0.561 0.389 0.554 0.527 + + + + + + + + + + +

EY 0.240 0.059 0.034 0.221 0.106 0.000 0.097 0.586 + - + + - - - + + +

GD 0.248 0.000 0.104 0.284 0.105 0.062 0.098 0.546 0.030 + + + + - - + + -

GF 0.042 0.049 0.000 0.089 0.097 0.000 0.097 0.470 0.010 0.051 - + ------

GU 0.000 0.100 0.000 0.060 0.157 0.012 0.139 0.481 0.075 0.106 0.000 ------

J 0.139 0.080 0.062 0.205 0.000 0.047 0.095 0.301 0.064 0.092 0.057 0.048 + - - + + +

LP 0.140 0.069 0.000 0.149 0.116 0.000 0.104 0.561 0.000 0.042 0.000 0.025 0.058 - - - + -

MA 0.122 0.029 0.000 0.151 0.097 0.000 0.083 0.537 0.000 0.011 0.000 0.009 0.035 0.000 - - - -

MO 0.096 0.028 0.000 0.135 0.099 0.000 0.083 0.529 0.000 0.012 0.000 0.000 0.032 0.000 0.000 - - -

NE 0.000 0.107 0.001 0.047 0.145 0.022 0.137 0.441 0.077 0.123 0.012 0.004 0.088 0.038 0.027 0.018 - +

SL 0.000 0.211 0.033 0.000 0.278 0.097 0.250 0.522 0.203 0.222 0.045 0.000 0.143 0.119 0.100 0.079 0.008 +

SV 0.107 0.003 0.065 0.206 0.111 0.052 0.103 0.462 0.057 0.014 0.036 0.035 0.048 0.046 0.013 0.007 0.069 0.102

118

Humphries, Meghann B., 2019, UMSL, pg. 119

Table S5. Pairwise FST of CYTB from Pil (Pil 2015). FST values below the diagonal, significance above. + indicates p < 0.05.

AN BA BVI BU C DR DO EL EY GD GF GU JA MA MO NE SL SV

AN + - - + + - + + + + - + - - - - +

BA 0.693 + + + + + + + + + + + + + + + +

BVI 0.014 0.633 - + + - + + + + - + - + + - +

BU 0.004 0.707 0.052 + + - + + + + + + + + + + +

C 0.976 0.977 0.966 0.977 + + + + + + + + + + + + +

DR 0.964 0.967 0.959 0.965 0.919 + + + + + + + + + + + +

DO 0.017 0.701 0.038 0.093 0.977 0.963 + + + + - + - - - - +

EL 0.984 0.986 0.979 0.985 0.988 0.979 0.984 + + + + + + + + + +

EY 0.861 0.898 0.843 0.864 0.955 0.953 0.854 0.970 + - + + + + + + +

GD 0.758 0.827 0.761 0.760 0.940 0.927 0.744 0.958 0.808 + + + + + + + +

GF 0.899 0.924 0.872 0.901 0.969 0.964 0.896 0.980 0.006 0.831 + + + + + + +

GU 0.010 0.610 0.010 0.051 0.963 0.955 0.000 0.975 0.836 0.748 0.868 + + + - - +

JA 0.910 0.924 0.920 0.911 0.870 0.831 0.903 0.944 0.925 0.900 0.934 0.909 + + + + +

MA 0.028 0.611 0.044 0.065 0.967 0.957 0.046 0.978 0.849 0.748 0.882 0.040 0.906 + + - +

MO 0.052 0.587 0.118 0.121 0.956 0.947 0.032 0.969 0.830 0.730 0.860 0.062 0.902 0.105 + + +

NE 0.054 0.552 0.087 0.101 0.947 0.942 0.022 0.966 0.817 0.741 0.840 0.022 0.908 0.102 0.081 - +

SL 0.000 0.630 0.010 0.034 0.967 0.958 0.000 0.978 0.850 0.755 0.881 0.000 0.911 0.025 0.047 0.041 +

SV 0.683 0.748 0.705 0.687 0.912 0.898 0.659 0.936 0.765 0.159 0.786 0.684 0.874 0.675 0.669 0.694 0.690

119

Humphries, Meghann B., 2019, UMSL, pg. 120

HBA Pairwise FST 18

16

14

12

10

8

6

4

2

0 AN BA BU BVI C DO DR ELE EY GD GF GU J MA MO NE SL SV -2

AN BA BU BVI C DO DR ELE EY GD GF GU J MA MO NE SL SV

CYTB Pairwise FST 18

16

14

12

10

8

6

4

2

0 AN BA BU BVI C DO DR ELE EY GD GF GU JA MA MO NE SL SV -2

AN BA BU BVI C DO DR ELE EY GD GF GU JA MA MO NE SL SV

Figure S1. Stacked histograms of Pairwise FST values for HBA (above) and CYTB (below). Values are lower than expected for HBA on Saint Vincent, Barbados, Dominican Republic, Jamaica, and Puerto Rico.

120