BMC Evolutionary Biology BioMed Central

Research article
Open Access

Timing major conflict between mitochondrial and nuclear genes in species relationships of Polygonia butterflies (Nymphalidae: Nymphalini)

Niklas Wahlberg*1, Elisabet Weingartner2, Andrew D Warren3,4 and Sören Nylin2

Address:
1Laboratory of Genetics, Department of Biology, University of Turku, 20014 Turku, Finland
2Department of Zoology, Stockholm University, 106 91 Stockholm, Sweden
3McGuire Center for Lepidoptera and Biodiversity, Florida Museum of Natural History, University of Florida, SW 34th Street and Hull Road, PO Box 112710, Gainesville, FL 32611-2710, USA
4Museo de Zoología, Departamento de Biología Evolutiva, Facultad de Ciencias, Universidad Nacional Autónoma de México, Apdo. Postal 70-399, México, DF 04510 México

* Corresponding author: Niklas Wahlberg

Published: 7 May 2009 Received: 20 August 2008 Accepted: 7 May 2009 BMC Evolutionary Biology 2009, 9:92 doi:10.1186/1471-2148-9-92 This article is available from: http://www.biomedcentral.com/1471-2148/9/92 © 2009 Wahlberg et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background: Major conflict between mitochondrial and nuclear genes in estimating species relationships is an increasingly common finding in animals. Usually this is attributed to incomplete lineage sorting, but recently the possibility has been raised that hybridization is important in generating such phylogenetic patterns. Just how widespread ancient and/or recent hybridization is in animals, and how it affects estimates of species relationships, is still not well known.

Results: We investigate the species relationships and their evolutionary history over time in the genus Polygonia using DNA sequences from two mitochondrial gene regions (COI and ND1, total 1931 bp) and four nuclear gene regions (EF-1α, wingless, GAPDH and RpS5, total 2948 bp). We found clear, strongly supported conflict between mitochondrial and nuclear DNA sequences in estimating species relationships in the genus Polygonia. Nodes at which there was no conflict tended to have diverged at the same time when analyzed separately, while nodes at which conflict was present diverged at different times. We find that two species create most of the conflict, and attribute the conflict found in P. satyrus to ancient hybridization and the conflict found in P. oreas to recent or ongoing hybridization. In both examples, the nuclear gene regions tended to give the phylogenetic relationships of the species supported by morphology and biology.

Conclusion: Studies inferring species-level relationships using molecular data should never be based on a single locus. Here we show that the phylogenetic hypothesis generated using mitochondrial DNA gives a very different interpretation of the evolutionary history of Polygonia species compared to that generated from nuclear DNA. We show that possible cases of hybridization in Polygonia are not limited to sister species, but may be inferred further back in time. Furthermore, we provide more evidence that Haldane's rule might not be as strong a process in preventing hybridization in butterflies as has been previously thought.


Background

Phylogenetics at the species level is becoming increasingly important in the study of processes underlying speciation [1,2]. Most species-level phylogenies have until recently been based on only mitochondrial DNA (mtDNA) due to the ease of PCR amplification and its perceived suitability, e.g. due to maternal inheritance (shorter time for coalescence than nuclear DNA (nDNA) because of a smaller effective population size, Ne), lack of recombination and relatively high mutation rate. However, species phylogenies are not necessarily the same as gene phylogenies [3,4], as different genes might have different histories. Genes involved with speciation, affecting such traits as hybrid incompatibility, as well as sex chromosomes should be more differentiated between species and less likely to introgress than autosomes [5-8]. Different processes such as random sorting of homoplasy, ancient polymorphism and hybridization with gene introgression can obscure patterns of species relationships. Information from different regions of genomes such as mitochondrial DNA, nuclear DNA (from sex chromosomes as well as from autosomes) and microsatellites is thus necessary in investigating the evolutionary history of a group of closely related species.

As species-level molecular phylogenies based on both mitochondrial and nuclear markers have become more common, it has become clear that there is often well-supported conflict between the genomes for certain clades in given phylogenies [9]. Recent work is pointing to major conflict between mtDNA and nuclear DNA in species-level phylogenetic analyses [6,9-15]. The conflict is often attributed to ancient or recent hybridization [9,12-14] or incomplete lineage sorting [15]. Hybridization, a well-accepted process in plants, appears to be more common also among closely related animals than previously thought [16]. Kronforst [17] showed in Heliconius butterflies that gene flow between species can proceed for long periods of time after divergence.

Although phylogenies give us a hypothesis of species relationships, they tell us little about the processes involved in diversification on their own. More information is needed to discover reasons for diversification, such as geographic location of specimens used and times of divergence of lineages. Contemporary sympatric species might have been allopatric when the two lineages diverged, and without well-sampled species it is even harder to draw any conclusions about movements of species and populations. Knowledge about when divergences of lineages have happened in a given group of species may give insight into the processes behind the conflicts in phylogenetic signal. However, the temporal framework has rarely been studied for such conflicts.

Here we study the relationships of species in the genus Polygonia (Lepidoptera: Nymphalidae), which have been used as model taxa in numerous studies on the evolution of insect-host plant interactions [18-21], phenotypic plasticity in life-history traits [22,23], effects of environment on distribution [24] and insect physiology [25,26]. Polygonia is a genus thought to include five Palaearctic species (P. c-album, P. c-aureum, P. egea, P. gigantea and P. interposita), and nine Nearctic species, seven in the United States and Canada (P. comma, P. faunus, P. gracilis, P. interrogationis, P. oreas, P. progne and P. satyrus [27,28]) and two endemic to Mexico (P. g-argenteum and P. haroldii). The taxonomic status of some of these species is disputed. Polygonia interposita has been treated as a subspecies of P. c-album [29] but was suggested to be a species-level taxon by Churkin [30]. So far, this taxon has not been included in any earlier molecular studies. We have tentatively treated P. zephyrus as a species separate from P. gracilis [following [31]] although this status is unclear; P. zephyrus is often considered conspecific with P. gracilis [32]. These two taxa (P. zephyrus and P. gracilis) are morphologically distinguishable at the extremes of their ranges, but between those "ends of a cline" a broad zone exists where intermediate forms occur (ADW pers. obs.). This may be an example of incipient speciation or secondary contact between two species. Earlier, P. oreas was sometimes considered a subspecies of P. progne, but in a recent study [33], P. oreas was found to be closely related to P. gracilis.

According to previous analyses, the ancestor of Polygonia was distributed in the Palaearctic and there have been two colonization events into the Nearctic region [33]. The ancestral host plants were most likely "urticalean rosids" (which include the closely related plant families Urticaceae, Ulmaceae, Cannabaceae and Celtidaceae) [21,34]. Many Polygonia species are still restricted to plants from this group, but some species have included additional plant families or shifted completely to other plant families, such as Betulaceae, Ericaceae, Grossulariaceae and Salicaceae. In previous studies [21,34] phylogenetic trees have been used to infer ancestral host plant ranges used by butterflies in the subfamily Nymphalinae. The results imply that when host plant range has expanded, an increase in the rate of net diversification has followed. In order to understand in more detail the dynamics of host plant use and diversification in the Polygonia butterflies in particular, and in general, it is necessary to generate a hypothesis of the evolutionary history of the group.

Species of Polygonia have been included in several earlier phylogenetic studies [33,35,36], but the relationships of some species have not been stable and conflict between datasets has been noted [36]. In this study, we present a hypothesis of the evolution of this genus in which all currently accepted species and most subspecies have been included. We then apply a temporal framework in order to illuminate the causes of major conflicts between genomic datasets.


Results

Combined analysis with Partitioned Bremer Support (PBS) of the 25-taxon dataset showed that there was strong conflict between the mitochondrial partition and the nuclear partition at almost all nodes within the genus Polygonia (Figure 1). Analysis of the two genomic datasets separately showed that the topology was different at these conflicting nodes (Figure 2). Despite the conflict, there are several well-supported clades at which there is no conflict between datasets. The genus Polygonia without the species Kaniska canace is strongly supported by all datasets. The position of K. canace as sister to the genus is not well-supported in the cladistic analyses, but is supported by all Bayesian analyses of combined and separate data. The sister species relationships of P. egea and P. undina, as well as P. comma and P. g-argenteum, are well supported. The clade containing P. c-album, P. interposita and P. faunus, as well as the clade containing all the rest of the Nearctic species (except P. faunus), are also well-supported.

The level of conflict between the different genomic datasets is particularly evident when comparing the separate analyses, where species relationships are quite different with relatively good support (Figure 2). These topologies were not dependent on the method used for analysis, and thus the phylogenetic signal found within the mitochondrial and nuclear datasets appears to be strong. The exception is P. gigantea, which receives strong support as the sister to P. undina+P. egea with the nuclear dataset, but little or no support as sister to P. c-aureum with the mitochondrial dataset (Figure 2). Interestingly, the estimated times of divergence for clades which are common to the two datasets are similar regardless of which dataset one uses (with the caveat that the confidence intervals are very wide). Thus the Polygonia clade is estimated to have started diverging 18–19 million years ago (mya), the P. c-album to Nearctic Polygonia clade between 13 and 16 mya, the Nearctic Polygonia at 11–12 mya and the P. progne to P. gracilis clade between 5 and 6 mya (Figure 2).

Of particular interest in the separate analysis of the mitochondrial and nuclear datasets is the position of P. interposita as sister to P. c-album (indeed with almost identical COI haplotypes) based on mtDNA, but as sister to P. c-album+P. faunus based on nDNA. Also of interest is the position of P. satyrus as very closely related to P. gracilis+P. oreas+P. zephyrus based on mtDNA, but as sister to P. interrogationis+P. g-argenteum+P. comma based on nDNA. Finally, P. oreas is placed as part of the P. gracilis+P. zephyrus clade based on mtDNA, but as sister to P. progne based on nDNA (Figure 2).

Increasing the sample size for each gene region and analyzing them separately sheds some light on these patterns. The COI tends to have very little variation within species, but substantial variation between species (Figure 3). The exceptions are P. interposita, which is almost identical to P. c-album; P. g-argenteum, which is very similar to P. comma; and P. gracilis, P. zephyrus and P. oreas, which are all very similar to each other even to the point of sharing haplotypes between the three taxa. The position of P. satyrus is consistent with the 25-taxon dataset, and shows some variation within the species.
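The way Partitioned Bremer Support flags conflict at a node can be illustrated with a minimal Python sketch. It rests on the general property of PBS that the partition values at a node sum to the total Bremer support, with a positive value meaning the partition supports the node and a negative value meaning it contradicts it. The function name, the category labels and the example values are hypothetical and are not taken from Figure 1.

```python
def classify_node(mt_pbs: float, nuc_pbs: float) -> str:
    """Classify a node from the signs of its two partitioned Bremer support values."""
    if mt_pbs > 0 and nuc_pbs > 0:
        return "congruent"          # both partitions support the node
    if mt_pbs * nuc_pbs < 0:
        return "conflict"           # one partition supports, the other contradicts
    return "no joint support"       # a zero value, or neither partition supports


# Hypothetical (mitochondrial, nuclear) PBS values, for illustration only.
example_nodes = {
    "node_1": (8.0, 4.5),
    "node_2": (12.0, -6.0),
    "node_3": (0.0, 3.0),
}

for name, (mt, nuc) in example_nodes.items():
    # The partition values sum to the total Bremer support of the node.
    print(name, classify_node(mt, nuc), "total Bremer support:", mt + nuc)
```

A node such as the hypothetical node_2 keeps positive total support while one partition actively contradicts it, which is the kind of node highlighted as conflicting in a PBS analysis.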

The nuclear gene regions show quite different topologies when analyzed on their own, but several patterns are consistent between them (Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8). First of all, the haplotypes of GAPDH and wgl are very similar in the taxa P. gracilis, P. zephyrus, P. haroldii, P. oreas and P. progne (Figure 5, Figure 7 and Figure 8). The haplotypes of EF-1α, RpS5 and wgl in P. satyrus are more related to P. comma, P. g-argenteum and P. interrogationis than to the other Nearctic Polygonia (Figure 4, Figure 6 and Figure 7). The nDNA haplotypes found in P. interposita tend to not be especially close to P. c-album (Figure 4, Figure 5, Figure 6 and Figure 7). Interestingly, the haplotypes of RpS5 found in P. oreas are very closely related to those found in P. progne (Figure 6 and Figure 8d), while all other nDNA haplotypes are ambiguous about this relationship. Finally, the subspecies P. e. undina is differentiated from P. egea for all sequenced genes.

Figure 1. Combined analysis of all genes. Values above branches are the Partitioned Bremer Support (PBS) values for the combined mitochondrial gene partition and values below branches are the PBS values for the combined nuclear gene partition. Grey circles highlight nodes with strong conflict within the Polygonia clade. Pictured butterflies are from top to bottom Kaniska canace, Nymphalis polychloros, Polygonia c-album, Polygonia satyrus and Polygonia zephyrus.

Regarding the haplotype networks, we have focused on the unresolved clade of P. zephyrus, P. gracilis and P.


Figure 2. Dated phylogenies of Polygonia. a) based on mtDNA b) based on nDNA. Values below the branches are posterior probabilities for the nodes to the right of the numbers.
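The comparison of divergence times between the two dated trees can be sketched as an interval-overlap check: a node shared by both trees counts as temporally concordant when its mtDNA and nDNA age intervals overlap, and as discordant otherwise. This is only an illustrative Python sketch, not the authors' procedure. The function names, the clade labels and the intervals are hypothetical placeholders; a real comparison would use the credibility intervals from the dated analyses.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (younger bound, older bound), in million years ago


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two closed age intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def discordant_nodes(mt: Dict[str, Interval], nuc: Dict[str, Interval]) -> List[str]:
    """List nodes present in both trees whose age intervals do not overlap."""
    shared = mt.keys() & nuc.keys()
    return sorted(node for node in shared if not overlaps(mt[node], nuc[node]))


# Hypothetical age intervals, for illustration only (not values from Figure 2).
mt_ages = {"clade_A": (10.0, 14.0), "clade_B": (1.5, 3.0)}
nuc_ages = {"clade_A": (11.0, 13.0), "clade_B": (6.0, 9.0)}

print(discordant_nodes(mt_ages, nuc_ages))
```

Given the wide confidence intervals noted in the text, such a check would be conservative: only strongly differing ages would be flagged.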


oreas (in COI) as well as the relationships between these species and P. progne and P. haroldii (in RpS5, EF-1α, GAPDH and wgl) (Figure 8). None of the haplotype networks showed a star-like pattern, i.e., a "central" commonly shared haplotype from which other haplotypes deviate by only a few mutational steps, indicative of a rapid and recent diversification [37]. Most haplotypes were only represented by one individual. A few haplotypes were, however, shared, even between different species. For instance, Polygonia oreas shared the same haplotype with P. gracilis and P. zephyrus in the COI dataset (Figure 8a). In the EF-1α dataset one P. oreas haplotype is shared with P. gracilis (Figure 8b). Shared haplotypes were found for P. haroldii, P. progne and P. zephyrus in the GAPDH dataset as well as for P. progne, P. zephyrus and P. gracilis (Figure 8c). In the RpS5 dataset P. haroldii, P. zephyrus and P. gracilis shared the same haplotype (Figure 8d). Three haplotypes were shared between P. gracilis and P. zephyrus, one haplotype was shared between P. progne and P. gracilis and one haplotype was shared between P. oreas and P. zephyrus in the wgl dataset (Figure 8e). In addition, in the wgl dataset one haplotype is shared between P. comma, P. g-argenteum and all P. satyrus. However, the P. g-argenteum individual had many positions of missing data.

Discussion

Major clades in the phylogeny

Unlinked genes are expected to have independent genealogical histories [3,38], and thus combining data may not always be informative of species relationships [39]. In this study we have found that gene regions from different genomes (mtDNA and nDNA) give rather different estimates of species relationships. The phylogenetic positions of four taxa in particular need explanation: P. satyrus, P. oreas, P. haroldii and P. interposita. Each of these is strongly supported in different positions depending on which dataset is analysed. Before we discuss these four anomalous taxa, we will discuss the general findings for the other species, as this is the most complete study of Polygonia phylogeny to date, including several taxa that have never been part of a phylogenetic systematic investigation.

Previous studies have shown conflicting results on the position of the taxon "canace" [33,35,36], often placed in the monotypic genus Kaniska, but suggested to be included in Polygonia by Wahlberg & Nylin [36]. The present study does not corroborate that finding, but it should be noted that in contrast to the earlier study we did not include morphological data here. However, it is clear that the position of "canace" is not stable and its sister relationship either to Nymphalis or to Polygonia is weakly supported. In such a case, we feel it is best to retain it in the monotypic genus Kaniska, in order to highlight its "oddity" and long history of independent evolution. This is of course only valid if one accepts the validity of the genera Polygonia and Nymphalis, which some consider to be a single genus Nymphalis, along with our Aglais [e.g., [40,41]]. For reasons explained in Wahlberg & Nylin [36], we feel that the genus Polygonia should be retained, and thus we suggest that the taxon "canace" be retained in the genus Kaniska, as is frequently done in the literature [e.g., [29,42,43]].

There are several independent lineages within Polygonia. The type species of the genus, Polygonia c-aureum, is the sister to the rest of Polygonia, as has been found in previous studies [33,35,36]. Polygonia gigantea, included here for the first time in a phylogenetic study, is an independent lineage that is most likely sister to the P. egea+P. undina clade, based on the well-supported result of the nDNA dataset and the ambiguous result of the mtDNA dataset. Polygonia undina has mainly been considered to be a subspecies of P. egea, but our results show that it is genetically very distinct and that the two diverged from their common ancestor as early as 8–13 mya (Figure 2). This makes the pair older than several other species pairs in Polygonia, and we found no evidence of interbreeding (all genes were clearly diverged for this pair of species). We thus elevate P. undina to the species level (stat. nov.).

The clade containing P. c-album, P. interposita and P. faunus is well-supported and quite clearly the sister to the Nearctic clade. The interrelationships of these three species will be discussed in more detail below. The Nearctic clade including P. satyrus, P. interrogationis, P. comma, P. g-argenteum, P. progne, P. oreas, P. haroldii, P. gracilis and P. zephyrus is also well-supported. Within this clade, P. g-argenteum (here included for the first time in a phylogenetic analysis) is clearly the sister species to P. comma, and apparently these two have diverged relatively recently (2–4 mya). The position of P. interrogationis with regard to these two species is different with the two genomic datasets. Mitochondrial DNA suggests that it is sister to the rest of the Nearctic species, while nDNA suggests that it is sister to P. comma+P. g-argenteum. The latter sister relationship is in fact suggested by morphological data as well [35], giving more weight to this hypothesis of phylogeny.

Our data suggest that P. comma and P. g-argenteum have diverged within the past 2–3 million years, during which time there has been considerable morphological diversification between them. Adults of g-argenteum are among the largest of Polygonia (generally the same size as P. interrogationis), and they lack the seasonal polyphenism (expression of dark "summer" forms) seen in P. comma and P. interrogationis. As a result, size excluded, adults of P. satyrus and P. g-argenteum share a very similar superficial resemblance


Figure 3. Topology of haplotypes from Bayesian analysis based on the mtDNA COI. Details of the gracilis/zephyrus/oreas clade are shown in the haplotype network in Figure 8a. Values below the branches are posterior probabilities for the nodes to the right of the numbers.
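Haplotype sharing between species, of the kind summarized in the haplotype networks, can be tabulated by grouping identical sequences and listing the species in which each occurs. The Python sketch below is a simplified illustration under stated assumptions: it matches sequences exactly and ignores missing or ambiguous bases, which a real analysis would need to handle. The function name and the toy sequences are hypothetical and are not data from this study.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def shared_haplotypes(samples: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each sequence found in more than one species to those species.

    `samples` holds (species, sequence) pairs; sequences are compared by
    exact, case-insensitive match.
    """
    species_by_haplotype = defaultdict(set)
    for species, sequence in samples:
        species_by_haplotype[sequence.upper()].add(species)
    return {
        haplotype: sorted(species)
        for haplotype, species in species_by_haplotype.items()
        if len(species) > 1
    }


# Toy sequences, for illustration only (not sequences from this study).
toy_samples = [
    ("P. oreas", "ACGTAC"),
    ("P. gracilis", "ACGTAC"),
    ("P. zephyrus", "acgtac"),
    ("P. progne", "ACGTTC"),
    ("P. progne", "ACGTTC"),
]

print(shared_haplotypes(toy_samples))
```

A haplotype found repeatedly within a single species, like the toy P. progne sequence, is not reported, because only sharing between species is of interest here.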

(especially in the dorsal view), while adults of P. comma (especially dark forms) and P. g-argenteum appear quite different at first glance.

As an aside, it is interesting to note that apparently similar patterns of differences between mitochondrial and nuclear DNA are found in the genus Nymphalis (Figure 2). This warrants a separate study to see if similar forces have acted on the sister group of Polygonia.

Ancient mitochondrial introgression in P. satyrus

Mitochondrial DNA suggests that P. satyrus is closely related to P. gracilis, P. zephyrus and P. oreas, whereas nDNA suggests very strongly that P. satyrus is sister to P. interrogationis, P. comma and P. g-argenteum (Figure 2). Morphological and ecological features also suggest that P. satyrus is more closely related to the latter clade. In addition to great overall phenotypic similarity between the adults and immatures of P. satyrus and P. comma, larvae of


Figure 4. Topology of haplotypes from Bayesian analysis based on the nDNA EF-1α. Details of the gracilis/zephyrus/oreas/progne/haroldii clade are shown in the haplotype network in Figure 8b. Values below the branches are posterior probabilities for the nodes to the right of the numbers.

those two taxa, as well as those of P. interrogationis and P. g-argenteum [see [44]], feed on Ulmaceae, Moraceae and Urticaceae, and late-instar larvae of P. satyrus and P. comma make very similar larval nests out of altered host plant leaves [45]. Polygonia satyrus is largely parapatric with respect to P. comma, as the two fly in sympatry only in a limited portion of northeastern North America, and rarely in eastern Colorado, where P. comma is present only as uncommon vagrant individuals from the east [46].

According to our estimates of times of divergence (Figure 2), P. satyrus diverged from the ancestral populations 7–8 mya based on nDNA, whereas the result from mtDNA suggests that the divergence happened much more recently, about 2 mya. Given that the nDNA estimate of divergence time is older than that from mtDNA, it is possible that the presence of an "alien" mtDNA lineage in P. satyrus may be the result of ancient introgression from the ancestor of P. gracilis+P. zephyrus+P. haroldii, which could have happened some 2–3 mya (prior to the onset of the Pleistocene glacial periods). The current sympatric distribution of P. satyrus vs. P. gracilis+P. zephyrus (the geographic distribution of these taxa is almost identical) highlights the potential for gene exchange in the recent past and present. Given also that all mtDNA haplotypes found to date in P. satyrus are very similar, yet all nDNA haplotypes are more related to the P. interrogationis clade, it is possible that repeated population bottlenecks during the glacial cycles have wiped out the original mtDNA lineages from P. satyrus, by chance leaving the current introgressed lineage in extant populations. Lack of gene flow during the last 2 million years has


Figure 5. Topology of haplotypes from Bayesian analysis based on the nDNA GAPDH. Details of the gracilis/zephyrus/oreas/progne/haroldii clade are shown in the haplotype network in Figure 8c. Values below the branches are posterior probabilities for the nodes to the right of the numbers.

now resulted in reciprocal monophyly of P. satyrus and P. gracilis+P. zephyrus+P. haroldii. Such a speculative scenario could be corroborated by more extensive sampling of P. satyrus populations across North America, which could make possible coalescent modeling to rule out any possibility that the conflicting results can be explained by ancient polymorphisms [39,47].

In butterflies, females are the heterogametic sex and it is accepted that "Haldane's rule" [48] is an important phenomenon, i.e., the maternally inherited mtDNA will not introgress into the new gene pool due to low viability or sterility of female F1 offspring [see [49]]. Presgraves [50] showed that hybrid sterility and inviability are common in Lepidoptera and evolve gradually. In those studies of butterflies where both mtDNA and nDNA have been screened, introgression in nDNA but not mtDNA has been found between Papilio machaon and P. hospiton (Papilionidae) [7] as well as between Heliconius cydno and H. melpomene (Nymphalidae) [13,14]. However, mtDNA introgression has been found between the latter species pair in another study [8], suggesting that the wide acceptance of Haldane's rule needs to be questioned. In the case of Polygonia satyrus we have no knowledge of whether hybrid female offspring are sterile or not, but even if hybrids between contemporary P. satyrus and species from


Figure 6. Topology of haplotypes from Bayesian analysis based on the nDNA RpS5. Details of the progne/oreas clade and the gracilis/zephyrus/haroldii clade are shown in the haplotype network in Figure 8d. Values below the branches are posterior probabilities for the nodes to the right of the numbers.

the P. gracilis clade are inviable, this may not have been the case when (if) introgression occurred.

Recent mitochondrial introgression in P. oreas

The mtDNA haplotypes found in P. oreas are very similar to those found in P. gracilis and P. zephyrus, and one haplotype is shared between these species. In the nDNA datasets, haplotypes are shared between P. oreas, P. gracilis and P. zephyrus for EF-1α but not for the other genes, and in the case of RpS5, haplotypes of P. oreas are clearly more related to P. progne (Figure 4, Figure 5, Figure 6, Figure 7, and Figure 8e). Polygonia oreas has been considered a subspecies of P. progne by various authors [e.g., [51]]; thus, once again, the nDNA dataset corroborates the morphological proposals of previous authors. Interestingly, both the mtDNA and the nDNA datasets suggest that the clade including the five taxa P. progne, P. oreas, P. haroldii, P. gracilis and P. zephyrus began diverging about 5 mya at the end of the Miocene. Based on nDNA, P. oreas and P. progne began diverging about 3 mya, whereas the divergence of P. oreas mtDNA is more recent. As with P. satyrus, no P. oreas COI haplotypes were found to be more related to its probable sister species P. progne, and it may be that bottlenecks have wiped out the original mtDNA lineages, while current introgression is introducing new genetic material into P. oreas from the P. gracilis complex (most likely from western P. zephyrus). On the other hand, we have sampled only 5 individuals of P. oreas, and it may be that a denser sampling would reveal mtDNA lineages closer to P. progne. Polygonia oreas flies in sympatry and synchrony with P. zephyrus throughout the vast majority of its range, the latter usually being much more abundant locally and regionally; thus there are ample opportunities for ongoing introgression between the two taxa. It should be noted


Figure 7: Topology of haplotypes from Bayesian analysis based on the nDNA wgl. Details of the gracilis/zephyrus/oreas/progne/haroldii clade are shown in the haplotype network Figure 8e. Values below the branches are posterior probabilities for the nodes to the right of the numbers.

that adults of some subspecies of P. oreas, especially nigrozephyrus, and some individuals of threatfuli, are so similar to those of sympatric P. zephyrus that many experienced lepidopterists cannot distinguish them (without life history information), and these two taxa were not described until 1984 and 2001, respectively (adults of these taxa are still hiding in museum series of P. zephyrus all over the world).

Recent speciation of P. haroldii and incipient speciation of P. gracilis/zephyrus?
The taxa P. haroldii, P. gracilis and P. zephyrus appear to be related to one another in a complicated way. Mitochondrial DNA suggests that P. haroldii is a distinct lineage sister to the P. gracilis/zephyrus lineage (that includes the "alien" P. satyrus lineage) (Figure 3), yet the nDNA suggests that P. haroldii is not distinct from P. zephyrus (note that by chance, the P. zephyrus chosen for the 25-taxon analyses is rather different from the other P. zephyrus) (Figure 4, Figure 5, Figure 6 and Figure 7). Here the classical explanation for conflicts [52] of recent divergence with not enough time for slowly evolving nuclear genes to have segregated would appear to hold. Interestingly, P. haroldii (endemic to mainland Mexico) and some P. zephyrus (endemic to western United States and Canada, including northern Baja California, Mexico) nDNA haplotypes


appear to be more related to each other, perhaps suggesting recent gene exchange between these western taxa during the Pleistocene glacial periods. Currently, the two taxa appear to be allopatrically distributed. A close relationship between P. haroldii and P. zephyrus was suggested by Krogen [53], based on morphological similarities. Beutelspacher [54] reported an unidentified species of Urticaceae as a larval foodplant for P. haroldii (presumably in the Valley of Mexico), although this record seems unlikely, since P. haroldii is usually found in immediate association with Ribes species (Grossulariaceae) (ADW, pers. obs.), the host plant genus utilized by P. zephyrus.

Our data suggest that morphological differentiation may occur rapidly in Polygonia, once speciation has occurred. Despite the essentially identical nDNA haplotypes between P. zephyrus and P. haroldii, the latter has diversified morphologically to the point where it cannot be confused with any other member of the genus. This was perhaps achieved through evolution of a mimetic relationship with the presumably distasteful model Dione moneta (Nymphalidae: Heliconiinae: Heliconiini); in flight, adults of D. moneta and P. haroldii appear nearly indistinguishable, since the metallic ventral spots of D. moneta frequently are not visible (ADW, pers. obs.). Currently, these two species are very often found flying in sympatry and synchrony throughout the Mexican distribution of P. haroldii, although the presumed model, D. moneta, is usually much more widespread and common than P. haroldii. No obvious geographic variation in morphology has been noted in P. haroldii.

Figure 8: Minimum spanning networks. a) COI, b) EF-1α, c) GAPDH, d) RpS5 and e) wgl. The size of each circle is directly proportional to the number of individuals with that haplotype. Small white circles indicate a missing haplotype. Each branch is equivalent to one basepair change. Circles with more than one pattern show the proportion of each species. The colour coding is as follows: blue – P. zephyrus, red – P. oreas, green – P. progne, yellow – P. gracilis and orange – P. haroldii.

The taxon pair P. gracilis and P. zephyrus has been treated as two hypothetically separate species in this study, but the current consensus is that these are subspecies of the same species [27,28]. Our results are ambiguous about whether these two taxa are currently diverging or merging. Morphologically, populations of far western P. zephyrus are separable from far eastern populations of P. gracilis, but there is a clear cline between the extremes, and populations found in Alberta, Canada, consist mostly of adults that cannot be confidently assigned to one or the other taxon [55]. On the one hand, our data do not distinguish between the two taxa (haplotypes are shared regardless of gene or genome inspected), but on the other hand, haplotypes are also shared with P. progne, P. haroldii, and P. oreas, which are distinct species-level taxa. Thus, detailed elaboration of the taxonomic status of P. zephyrus and P. gracilis will only be possible once a thorough study can be conducted, considering dozens of populations from throughout the range of the complex. The current distribution of these two taxa, with an apparently broad zone in western Canada where their identities become blurred, suggests ongoing gene flow between them, and a careful study of populations in Alberta seems warranted.

Polygonia interposita, species or subspecies of P. c-album?
The rarely collected taxon P. interposita is found in central Asian mountains and has often been considered a subspecies of P. c-album [e.g., [43]]. This taxon has frequently been confused with P. undina, due to the somewhat similar ventral wing pattern and similar distribution. We were only able to get one specimen of P. interposita that gave good quality DNA. The mtDNA of this specimen was almost identical to P. c-album (which shows very little variation in COI across its entire range; Weingartner et al. in prep.). The nDNA, however, was quite distinct from P. c-album, and indeed in the 25-taxon analysis, P. interposita emerged as sister to P. c-album+P. faunus, with a divergence time estimated at about 5 mya. Could this possibly be a similar case to P. satyrus in North America? Only


more samples of P. interposita would shed light on this question, but based on the current specimen, it is possible that the mitochondrial lineage of P. c-album has invaded the genome of P. interposita in recent times, resulting in a situation where the two genomes give conflicting signals regarding phylogenetic relatedness.

Species as lineages through time
The concept of species as lineages is fast gaining support from the scientific community [56-59]. The concept takes into account that species are part of an evolutionary continuum from diverging populations to already diverged, well-defined species. Many of the multitude of proposed species concepts lie along this continuum, but are not general enough to explain the diversity we see in nature. Here we present results for a small group of well-known butterflies with a relatively stable taxonomy. Despite molecular data from 6 gene regions for a total of 4879 bp (much more than the standard in species level phylogenetic studies at the moment), we were unable to resolve the relationships of the 16 species unambiguously, mainly due to conflicts between mitochondrial and nuclear gene regions.

Considering the lineage concept, it is clear that P. c-aureum, P. gigantea, P. egea, P. undina, P. satyrus and P. interrogationis have differentiated so long ago that there is no question about their taxonomic status as species (Figure 9). The species-level status of P. comma and P. g-argenteum, as well as P. c-album, P. interposita and P. faunus also is not really a question based on our results, but they have speciated relatively recently and in the case of P. interposita, may still hybridize in nature with P. c-album. In Figure 9, P. faunus is placed out of the grey zone due to the clear separation of its populations in North America from those of P. c-album and P. interposita in Eurasia. Further down the continuum closer to the divergence events are P. progne, P. oreas and P. haroldii (Figure 9), which have speciated so recently that occasional gene flow may still occur between P. oreas and P. progne and/or P. zephyrus, but which remain taxonomic entities separate from their closest relatives. Just above the divergence line, entering into the grey zone of one or two species is the taxon pair P. gracilis and P. zephyrus (Figure 9). To really be able to say whether the two are above or below the line would require population genetic methods to see whether gene flow between the two populations is sufficiently high to consider them conspecific.

Figure 9: Species through time, a summary of our results. Species below the "grey zone" are clear independent lineages with no known closely related sister species. Species in the "grey zone" are at various stages in the speciation process. Species above the "grey zone" are closely related sister species that are separate genetic entities. They have thus by this time also become clear independent lineages ready for a future new bifurcation, should the right circumstances arise. Figure modified from de Queiroz [58].

Conclusion
In conclusion, although we now have included considerable amounts of new genetic information in an attempt to interpret the evolutionary history of Polygonia butterflies, we are still not able to fully understand the processes of speciation in this taxon. Especially within the Nearctic clade, more population genetic data are needed. However, our results graphically demonstrate, first, that species in this group evolve over time, sometimes over a very long time, and, second, that evidently even well-differentiated species can hybridize to the extent that different parts of their genome may suggest strongly conflicting patterns of relationships.

The results from the present study do not change the main conclusion from the study of host plant range in Polygonia butterflies [21]. In that paper we introduced the idea that based on the phylogeny it was possible to show that butterfly clades including species that use host plants additional to, or other than, the "urticalean rosids", included more species than the sister clade (only feeding on "urticalean rosids"). Our present results are still in agreement with the former result and there is no case in which the reverse is valid (that species restricted to "urticalean rosids" constitute more butterfly species than the sister group of species with a broader host plant range). Thus, we believe that being able to expand the host plant


Table 1: Summary of number of individuals per species sequenced for a given gene.

Number of individuals sequenced

Species                      COI  ND1  wingless  EF-1α  GAPDH  RpS5

Outgroup taxa
Aglais io                      1    1         1      1      1     1
Aglais milberti                1    1         1      1      1     1
Aglais urticae                 1    1         1      1      1     1
Nymphalis antiopa              1    1         1      1      1     1
Nymphalis californica          1    1         1      1      1     1
Nymphalis l-album              1    1         1      1      1     1
Nymphalis polychloros          1    1         1      1      1     1
Nymphalis xanthomelas          1    1         1      1      1     1

Ingroup taxa
Kaniska canace                 3    1         2      2      2     2
Polygonia c-album              4    1         3      4      2     2
Polygonia interposita          1    0         1      1      1     1
Polygonia c-aureum             3    1         3      2      2     2
Polygonia comma                3    1         3      1      2     1
Polygonia egea                 3    1         2      3      2     2
Polygonia undina               4    0         3      1      1     2
Polygonia faunus               5    1         4      4      4     4
Polygonia g-argenteum          1    0         1      1      1     1
Polygonia gigantea             1    0         1      1      1     1
Polygonia gracilis             3    1         2      1      2     2
Polygonia zephyrus            24    1         8      6      8     8
Polygonia haroldii             2    1         2      2      2     2
Polygonia interrogationis      3    1         2      1      1     1
Polygonia oreas                5    1         4      4      4     4
Polygonia progne              14    1         2      3      2     2
Polygonia satyrus             17    1         4      4      4     1

See Additional File 1 for details of individuals, including collection locality and GenBank accession numbers.

range will enhance speciation through colonizations and local adaptations, according to the oscillation hypothesis [60].

We have shown that the species-level relationships inferred from DNA sequence data may be strongly influenced by the markers that have been chosen. This then raises the question of how this phenomenon affects studies aimed at looking at higher levels of phylogenetic relationships, such as genera or families. Fortunately, our previous studies at higher levels have used the same markers as we have in this study [33,61-66], and results show that COI is generally concordant with the nuclear markers at taxonomic levels above genera. This is probably due to the same stochastic processes that lead to reciprocal monophyly at the species level, i.e. given enough time, lineages (species) will go extinct, leaving sister entities that we call genera (and by default higher taxa) reciprocally monophyletic. This is not to say that all currently described genera are monophyletic entities, simply because the majority of genera have not been rigorously tested for monophyly using phylogenetic analyses.

Methods
We sampled 96 individuals of all Polygonia species, as well as 8 outgroup species belonging to the genera Nymphalis and Aglais (see Table 1 and Additional File 1). Most individuals were collected by colleagues (see Acknowledgements) and sent dry to Stockholm. Total genomic DNA was extracted from two legs using QIAGEN's DNeasy extraction kit, according to the manufacturer's instructions, with the exception that individuals more than 2 years old at extraction were eluted into 50 μl of elution buffer, rather than the recommended 200 μl. Voucher specimens are stored at the Department of Zoology, Stockholm University and Laboratory of Genetics, University of Turku, and can be viewed at http://nymphalidae.utu.fi.

We amplified 6 loci using PCR directly from the genomic extracts. The loci were cytochrome oxidase subunit I (COI) and NADH subunit 1 (ND1) from the mitochondrial genome, and elongation factor-1α (EF-1α), wingless (wgl), glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and ribosomal protein S5 (RpS5) from different nuclear


genomes. Primers and PCR protocols were taken directly from Wahlberg and Wheat [67], except for ND1, for which we followed the protocol described in Nylin et al. [35]. PCR products were cleaned using exonuclease I and calf intestine alkaline phosphatase (Fermentas) and sequenced directly, using either the PCR primers or universal tails attached to the primers [for details, see [67]], on a Beckman-Coulter CEQ8000 capillary sequencer (Stockholm), or an ABI PRISM® 3130xl capillary sequencer (Turku) using dye terminator sequencing kits according to the recommendations of manufacturers.

All six genes were initially amplified for a selection of 25 taxa (8 outgroup species and 17 taxa of Polygonia). In order to verify patterns of strong conflict between the mitochondrial and nuclear genes [33,36], a further 77 individuals of Polygonia and four individuals of Kaniska canace were amplified and sequenced for COI, 27 individuals of Polygonia for EF-1α, 30 individuals of Polygonia for wgl, 23 individuals of Polygonia for GAPDH and 20 individuals of Polygonia for RpS5. Two individuals of Kaniska canace were amplified and sequenced for all nuclear genes.

Resulting chromatograms were examined by eye in BioEdit [68] and any heterozygous positions (two equally sized peaks observed at one position) were coded with IUPAC ambiguity codes. All sequences are from protein-coding genes and thus alignment was trivial. As noted in previous publications [33,35,36], a one-codon deletion was inferred in the wgl sequence of the three species of Aglais. Heterozygous sequences were separated manually into haplotypes. For sequences with only one heterozygous position, this was trivial. For those with two or more heterozygous positions, one haplotype was assumed to be identical to a common haplotype found in other individuals of the same species. This was possible in all cases.

The previously noted strong conflict between two mitochondrial and two nuclear genes [33,36] was investigated with a total evidence approach and Partitioned Bremer Support (PBS) on the 25-taxon dataset. Results suggested that mitochondrial and nuclear partitions continued to conflict with the addition of new nuclear gene regions. We thus analysed the combined mitochondrial genes and the combined nuclear genes to obtain estimates of relationships based on the mitochondrial genome and the nuclear genome, respectively. The two genome sets were analysed separately, but combined within each set (i.e. COI+ND1 and EF-1α+GAPDH+RpS5+wgl) and will be referred to as the mitochondrial data and the nuclear data, respectively. Parsimony analyses were conducted using a heuristic search algorithm in the program TNT [69] on the equally weighted data set. The data were subjected to 100 random addition rounds of successive Sectorial, Ratchet, Drift and Tree Fusing searches [70-72]. We evaluated the character support for the clades in the resulting cladograms using Bremer support [73,74] and Partitioned Bremer support [75,76]. The scripting feature of TNT was used to calculate these values [see [64]].

Bayesian inference of phylogeny and times of divergence were estimated using the program BEAST v1.4.6 [77]. Both datasets were analysed under the GTR+Γ model with a relaxed clock, allowing branch lengths to vary according to an uncorrelated Lognormal distribution [78]. The tree prior was set to the Yule process, and the "treeModel.RootHeight" prior (i.e., the age at the root of the tree) was set to 33 million years (with a standard deviation of 5 million years), in accordance with results from Wahlberg [79]. All other priors were left to the defaults in BEAST. Parameters were estimated using 2 independent runs of 1 million generations each (with a pre-run burn-in of 10000 generations), with parameters sampled every 1000 generations. Convergence was checked in the Tracer v1.4.6 program and summary trees were generated using TreeAnnotator v1.4.6, both part of the BEAST package.

To confirm that Bayesian analyses converged on the same topology, the data were also analyzed with MrBayes 3.1 [80]. The Bayesian analysis was performed on the combined data set with parameter values estimated separately for each gene region using the "unlink" command and the rate prior (ratepr) set to "variable". The analysis was run twice simultaneously for 2 million generations, with four chains (one cold and three heated) and every 500th tree sampled. The first 500 sampled generations were discarded as burn-in (based on a visual inspection of when log likelihood values reached stationarity), leaving 3501 sampled generations for the estimation of posterior probabilities. Results of the two simultaneous runs were compared for convergence using Tracer v1.4.6 [77].

The expanded single-gene datasets were analysed separately after separating heterozygotes into haplotypes. These datasets were analysed using both parsimony and Bayesian methods in TNT and MrBayes 3.1, respectively. Search parameters were as above, except the single datasets were not partitioned in any way.

In order to further investigate the resulting polytomies, we constructed a haplotype network in TCS [81], which shows how haplotypes are connected to each other. In this program, the gene genealogies from DNA sequences are estimated with statistical parsimony according to Templeton et al. [82]. We focused on the Nearctic Polygonia species (excluding P. faunus). Regions of missing basepairs were removed and we performed analyses of all Nearctic taxa as well as subsets of clades. The datasets comprise 1430 bp for COI, 1240 bp for EF-1α, 392 bp for wgl, 691 bp for GAPDH and 617 bp for RpS5.

Authors' contributions
NW, SN and EW conceived the study, EW did most of the labwork, NW and EW carried out the analyses and wrote the manuscript. ADW helped interpret phylogenetic patterns and wrote part of the discussion. All authors partook in discussions during analysis and writing, and read and approved the final manuscript.

Additional material

Additional file 1
List of specimens sampled in this study. Voucher codes, locality where the taxa were collected and GenBank accession numbers for genes sequenced. Photos of vouchers can be viewed at http://nymphalidae.utu.fi/db.php.
[http://www.biomedcentral.com/content/supplementary/1471-2148-9-92-S1.doc]

Acknowledgements
Thanks to Jorge Llorente, Armando Luis and Isabel Vargas (Museo de Zoología, UNAM, Mexico City) for access to specimens under their care, literature and for providing permits and assistance to ADW. We are also grateful to Anton Chichvarkin, Norbert Kondla, Zdenek Fric, Michel Tarrier, Vladimir Lukhtanov, Andrew B. Martynenko, David Threatful, Jim Beck, Bob Parsons, Runar Krogen, Katy Prudic and Mike Leski for providing specimens used in this study. We thank Bertil Borg and two anonymous referees for useful comments on a previous version of this manuscript. This study was funded by grants to SN and NW from the Swedish Research Council, and from the Academy of Finland to NW (grant number 118369). Partial funding to ADW was provided by DGAPA-UNAM (Mexico City).

References
1. Barraclough TG, Vogler AP, Harvey PH: Revealing factors that promote speciation. Philos Trans R Soc Lond B Biol Sci 1998, 353:241-249.
2. Barraclough TG, Vogler AP: Detecting the geographical pattern of speciation from species-level phylogenies. Am Nat 2000, 155:419-434.
3. Tajima F: Evolutionary relationship of DNA-sequences in finite populations. Genetics 1983, 105:437-460.
4. Nichols R: Gene trees and species trees are not the same. Trends Ecol Evol 2001, 16:358-364.
5. Ting C-T, Tsaur S-C, Wu C-I: The phylogeny of closely related species as revealed by the genealogy of a speciation gene, Odysseus. Proceedings of the National Academy of Sciences USA 2000, 97:5313-5316.
6. Bachtrog D, Thornton K, Clark A, Andolfatto P: Extensive introgression of mitochondrial DNA relative to nuclear genes in the Drosophila yakuba species group. Evolution 2006, 60:292-302.
7. Cianchi R, Ungaro A, Marini M, Bullini L: Differential patterns of hybridization and introgression between the swallowtails Papilio machaon and P. hospiton from Sardinia and Corsica islands (Lepidoptera, Papilionidae). Mol Ecol 2003, 12:1461-1471.
8. Salazar C, Jiggins CD, Taylor JE, Kronforst MR, Linares M: Gene flow and the genealogical history of Heliconius heurippa. BMC Evol Biol 2008, 8:132.
9. Chan KMA, Levin SA: Leaky prezygotic isolation and porous genomes: rapid introgression of maternally inherited DNA. Evolution 2005, 59:720-729.
10. Sota T, Vogler AP: Incongruence of mitochondrial and nuclear gene trees in the carabid beetles Ohomopterus. Syst Biol 2001, 50:39-59.
11. Sota T: Radiation and reticulation: extensive introgressive hybridization in the carabid beetles Ohomopterus inferred from mitochondrial gene genealogy. Popul Ecol 2002, 44:145-156.
12. Linnen CR, Farrell BD: Mitonuclear discordance is caused by rampant mitochondrial introgression in Neodiprion (Hymenoptera: Diprionidae) sawflies. Evolution 2007, 61:1417-1438.
13. Bull V, Beltrán M, Jiggins CD, McMillan WO, Bermingham E, Mallet J: Polyphyly and gene flow between non-sibling Heliconius species. BMC Biology 2006, 4:11.
14. Kronforst MR, Young LG, Blume LM, Gilbert LE: Multilocus analyses of admixture and introgression among hybridizing Heliconius butterflies. Evolution 2006, 60:1254-1268.
15. McCracken KG, Sorensen MD: Is homoplasy or lineage sorting the source of incongruent mtDNA and nuclear gene trees in the Stiff-tailed Ducks (Nomonyx-Oxyura)? Syst Biol 2005, 54:35-55.
16. Mallet J: Hybridization as an invasion of the genome. Trends Ecol Evol 2005, 20:229-237.
17. Kronforst MR: Gene flow persists millions of years after speciation in Heliconius butterflies. BMC Evol Biol 2008, 8:98.
18. Nylin S: Host plant specialization and seasonality in a polyphagous butterfly, Polygonia c-album (Nymphalidae). Oikos 1988, 53:381-386.
19. Janz N, Nylin S: The role of female search behaviour in determining host plant range in plant feeding insects: a test of the information processing hypothesis. Proceedings of the Royal Society of London Series B Biological Sciences 1997, 264:701-707.
20. Janz N: Sex-linked inheritance of host-plant specialization in a polyphagous butterfly. Proceedings of the Royal Society of London Series B Biological Sciences 1998, 265:1675-1678.
21. Weingartner E, Wahlberg N, Nylin S: Dynamics of host plant use and species diversity: a phylogenetic investigation in Polygonia butterflies (Nymphalidae). J Evol Biol 2006, 19:483-491.
22. Nylin S: Seasonal plasticity in life-history traits – growth and development in Polygonia c-album (Lepidoptera, Nymphalidae). Biol J Linn Soc 1992, 47(3):301-323.
23. Wiklund C, Wickman PO, Nylin S: A sex difference in the propensity to enter direct/diapause development – a result of selection for protandry. Evolution 1992, 46(2):519-528.
24. Bryant SR, Thomas CD, Bale JS: Thermal ecology of gregarious and solitary nettle-feeding nymphalid butterfly larvae. Oecologia 2000, 122:1-10.
25. Hiroyoshi S, Mitsuhashi J: Sperm reflux and its role in multiple mating in males of a butterfly Polygonia c-aureum Linnaeus (Lepidoptera: Nymphalidae). J Insect Physiol 1999, 45(2):107-112.
26. Tanaka D, Sakurama T, Mitsumasu K, Yamanaka A, Endo K: Separation of bombyxin from a neuropeptide of Bombyx mori showing summer-morph-producing hormone (SMPH) activity in the Asian comma butterfly, Polygonia c-aureum L. J Insect Physiol 1997, 43(2):197-201.
27. Opler PA, Warren AD: Butterflies of North America. 2. Scientific Names List for Butterfly Species of North America, North of Mexico. Fort Collins, Colorado, USA: Gillette Museum Publications; 2002.
28. Pelham JP: A catalogue of the butterflies of the United States and Canada, with a complete bibliography of the descriptive and systematic literature. J Res Lepid 2008, 40:.
29. Gorbunov PY: The Butterflies of Russia. Moscow, Russia: Russian Academy of Sciences; 2001.
30. Churkin SV: Taxonomic notes on Polygonia Hubner, [1818] (Lepidoptera, Nymphalidae) with description of a new subspecies. Helios (Moscow) 2003, 4:132-147.
31. Guppy CS, Shepard JH: Butterflies of British Columbia. UBC Press in collaboration with the Royal British Columbia Museum; 2001.
32. Layberry RA, Hall PW, Lafontaine JD: The Butterflies of Canada. Toronto: University of Toronto Press; 1998.


33. Wahlberg N, Brower AVZ, Nylin S: Phylogenetic relationships and historical biogeography of tribes and genera in the subfamily Nymphalinae (Lepidoptera: Nymphalidae). Biol J Linn Soc 2005, 86:227-251.
34. Nylin S, Janz N: Butterfly host plant range: an example of plasticity as a promoter of speciation? Evol Ecol 2009, 23:137-146.
35. Nylin S, Nyblom K, Ronquist F, Janz N, Belicek J, Källersjö M: Phylogeny of Polygonia, Nymphalis and related butterflies (Lepidoptera: Nymphalidae): a total-evidence analysis. Zool J Linn Soc 2001, 132:441-468.
36. Wahlberg N, Nylin S: Morphology versus molecules: resolution of the positions of Nymphalis, Polygonia and related genera (Lepidoptera: Nymphalidae). Cladistics 2003, 19:213-223.
37. Slatkin M, Hudson RR: Pairwise comparisons of mitochondrial-DNA sequences in stable and exponentially growing populations. Genetics 1991, 129:555-562.
38. Maddison WP: Gene trees in species trees. Syst Biol 1997, 46:523-536.
39. Knowles LL, Carstens BC: Delimiting species without monophyletic gene trees. Syst Biol 2007, 56:887-895.
40. Kullberg J, Albrecht A, Kaila L, Varis V: Checklist of Finnish Lepidoptera – Suomen perhosten luettelo. Sahlbergia 2001, 6:45-190.
41. Aarvik L, Berggren K, Hansen LO: Catalogus Lepidopterorum Norvegiae. Ås, Oslo: Lepidopterologisk arbeidsgruppe, Zoologisk museum, Universitet i Oslo, Norsk institutt for skogsforskning; 2000.
42. Corbet AS, Pendlebury HM: The Butterflies of the Malay Peninsula. Kuala Lumpur, 4th edition. 1992.
43. Tuzov VK: Nymphalidae part I. In Guide to the Butterflies of the Palearctic Region Volume 5. Edited by: Bozano GC. Milano: Omnes Artes; 2003:1-64.
44. Maza RGdl, Maza Jdl: Notas sobre el ciclo de vida de Polygonia gargenteum (Dbl y Hew) (Nymphalidae). Revista de la Sociedad Mexicana de Lepidopterología 1977, 3:35-41.
45. Scott JA: The Butterflies of North America. Stanford: Stanford University Press; 1986.
46. Ferris CD, Brown F: Butterflies of the Rocky Mountain States. Norman, Oklahoma, USA: University of Oklahoma Press; 1981.
47. Good JM, Hird S, Reid N, Demboski JR, Steppan SJ, Martin-Nims TR, Sullivan J: Ancient hybridization and mitochondrial capture between two species of chipmunks. Mol Ecol 2008, 17:1313-1327.
48. Haldane JBS: Sex ratio and unisexual sterility in hybrids. J Genet 1922, 12:101-109.
49. Sperling FAH: Butterfly species and molecular phylogenies. In Butterflies: Evolution and Ecology Taking Flight. Edited by: Boggs CL, Watt WB, Ehrlich PR. Chicago: University of Chicago Press; 2003:431-458.
50. Presgraves DC: Patterns of postzygotic isolation in Lepidoptera. Evolution 2002, 56:1168-1183.
51. Scott JA: A review of Polygonia progne (oreas) and P. gracilis (zephyrus) (Nymphalidae), including a new subspecies from the southern Rocky Mountains. J Res Lepid 1984, 23:197-210.
52. Avise JC: Phylogeography: The History and Formation of Species. Cambridge, MA: Harvard University Press; 2000.
53. Krogen R: Records of Polygonia haroldi (Dewitz, 1877) in Sonora, Mexico. Atalanta 2000, 31:67-70.
54. Beutelspacher CR: Mariposas diurnas del Valle de México. México City: Ediciones Científicas La Prensa Médica Mexicana; 1980.
55. Bird CD, Hilchie GJ, Kondla NG, Pike EM, Sperling FA: Alberta Butterflies. Edmonton: Provincial Museum of Alberta; 1995.
56. O'Hara RJ: Systematic generalization, historical fate, and the species problem. Syst Biol 1993, 42:231-246.
57. O'Hara RJ: Evolutionary history and the species problem. Am Zool 1994, 34:12-22.
58. de Queiroz K: The general lineage concept of species, species criteria, and the process of speciation. In Endless Forms: Species and Speciation. Edited by: Howard DJ, Berlocher SH. Oxford: Oxford University Press; 1998:57-75.
59. de Queiroz K: Species concepts and species delimitation. Syst Biol 2007, 56:879-886.
60. Janz N, Nylin S, Wahlberg N: Diversity begets diversity: host expansions and the diversification of plant-feeding insects. BMC Evol Biol 2006, 6:4.
61. Wahlberg N, Weingartner E, Nylin S: Towards a better understanding of the higher systematics of Nymphalidae (Lepidoptera: Papilionoidea). Mol Phylogenet Evol 2003, 28:473-484.
62. Wahlberg N, Braby MF, Brower AVZ, de Jong R, Lee M-M, Nylin S, Pierce N, Sperling FA, Vila R, Warren AD, et al.: Synergistic effects of combining morphological and molecular data in resolving the phylogeny of butterflies and skippers. Proc R Soc Lond B Biol Sci 2005, 272:1577-1586.
63. Wahlberg N, Freitas AVL: Colonization of and radiation in South America by butterflies in the subtribe Phyciodina (Lepidoptera: Nymphalidae). Mol Phylogenet Evol 2007, 44:1257-1272.
64. Peña C, Wahlberg N, Weingartner E, Kodandaramaiah U, Nylin S, Freitas AVL, Brower AVZ: Higher level phylogeny of Satyrinae butterflies (Lepidoptera: Nymphalidae) based on DNA sequence data. Mol Phylogenet Evol 2006, 40:29-49.
65. Peña C, Wahlberg N: Prehistorical climate change increased diversification of a group of butterflies. Biology Letters 2008, 4:274-278.
66. Warren AD, Ogawa JR, Brower AVZ: Phylogenetic relationships of subfamilies and circumscriptions of tribes in the family Hesperiidae (Lepidoptera: Hesperioidea). Cladistics 2008, 24:642-676.
67. Wahlberg N, Wheat CW: Genomic outposts serve the phylogenomic pioneers: designing novel nuclear markers for genomic DNA extractions of Lepidoptera. Syst Biol 2008, 57:231-242.
68. Hall TA: BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl Acids Symp Ser 1999, 41:95-98.
69. Goloboff PA, Farris JS, Nixon KC: T. N. T. (Tree Analysis using New Technology). Published by the authors 1.0th edition. 2004 [http://www.cladistics.com].
70. Goloboff PA: Analyzing large data sets in reasonable times: solutions for composite optima. Cladistics 1999, 15:415-428.
71. Nixon KC: The parsimony ratchet, a new method for rapid parsimony analysis. Cladistics 1999, 15:407-414.
72. Moilanen A: Searching for most parsimonious trees with simulated evolutionary optimization. Cladistics 1999, 15:39-50.
73. Bremer K: The limits of amino acid sequence data in angiosperm phylogenetic reconstruction. Evolution 1988, 42:795-803.
74. Bremer K: Branch support and tree stability. Cladistics 1994, 10:295-304.
75. Baker RH, DeSalle R: Multiple sources of character information and the phylogeny of Hawaiian drosophilids. Syst Biol 1997, 46:654-673.
76. Gatesy J, O'Grady P, Baker RH: Corroboration among data sets in simultaneous analysis: hidden support for phylogenetic relationships among higher level artiodactyl taxa. Cladistics 1999, 15:271-313.
77. Drummond AJ, Rambaut A: BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 2007, 7:214.
78. Drummond AJ, Ho SYW, Phillips MJ, Rambaut A: Relaxed phylogenetics and dating with confidence. PLoS Biology 2006, 4(5):699-710.
79. Wahlberg N: That awkward age for butterflies: insights from the age of the butterfly subfamily Nymphalinae. Syst Biol 2006, 55:703-714.
80. Ronquist F, Huelsenbeck JP: MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 2003, 19:1572-1574.
81. Posada D, Crandall KA: TCS: a computer program to estimate gene genealogies. Mol Ecol 2000, 9:1657-1660.
82. Templeton AR, Crandall KA, Sing CF: A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III. Cladogram estimation. Genetics 1992, 132:619-633.
