Liu et al. Parasites & Vectors 2010, 3:57 http://www.parasitesandvectors.com/content/3/1/57

RESEARCH Open Access TheResearch phylogeography of exustus (Gastropoda: Planorbidae) in Asia

Liang Liu1, Mohammed MH Mondal2, Mohamed A Idris3, Hakim S Lokman4, PRV Jayanthe Rajapakse5, Fadjar Satrija6, Jose L Diaz7, E Suchart Upatham8,9 and Stephen W Attwood*1,10

Abstract Background: The freshwater snail Indoplanorbis exustus is found across India, Southeast Asia, central Asia (Afghanistan), Arabia and Africa. Indoplanorbis is of economic importance in that it is responsible for the transmission of several species of the genus which infect cattle and cause reduced livestock productivity. The snail is also of medical importance as a source of cercarial dermatitis among rural workers, particularly in India. In spite of its long history and wide geographical range, it is thought that Indoplanorbis includes only a single species. The aims of the present study were to date the radiation of Indoplanorbis across Asia so that the factors involved in its dispersal in the region could be tested, to reveal potential historical biogeographical events shaping the phylogeny of the snail, and to look for signs that I. exustus might be polyphyletic. Results: The results indicated a radiation beginning in the late Miocene with a divergence of an ancestral bulinine lineage into Assam and peninsular India clades. A Southeast Asian clade diverged from the peninsular India clade late- Pliocene; this clade then radiated at a much more rapid pace to colonize all of the sampled range of Indoplanorbis in the mid-Pleistocene. Conclusions: The phylogenetic depth of divergences between the Indian clades and Southeast Asian clades, together with habitat and parasitological differences suggest that I. exustus may comprise more than one species. The timescale estimated for the radiation suggests that the dispersal to Arabia and to Southeast Asia was facilitated by palaeogeographical events and climate change, and did not require human involvement. Further samples from Afghanistan, Africa and western India are required to refine the phylogeographical hypothesis and to include the African Recent dispersal.

Background by I. exustus. Although other snails have been implicated The medical importance of Indoplanorbis in transmission of these three Schistosoma spp. (e.g., Lym- The freshwater snail Indoplanorbis exustus (Deshayes, naea luteola, S. indicum and S. nasale and Lymnaea 1834) (Planorbidae: Bulininae) is the sole member of its acuminata, S. nasale and S. spindale), I. exustus is the genus and is widely distributed across the tropics as an most important host for S. nasale and S. spindale, as well important intermediate host for several trematode para- as for S. indicum in certain regions. Indeed I. exustus may sites (: ). I. exustus is best known as be the sole natural intermediate host for these three the intermediate host responsible for the transmision of Schistosoma species on the Indian sub-continent[3]. The Schistosoma nasale Rao, 1933 and three Schistosoma species are parasites of Artiodactyla, in (Montgomery, 1906), as well as other trematodes such as particular of buffaloes, cows, goats pigs and sheep. Two Echinostoma spp. and some spirorchids (Digenea: of the species cause intestinal , whilst S. Spirorchiidae)[1,2]. A third species of Schistosoma, Schis- nasale inhabits blood vessels of the nasal mucosa and tosoma indicum (Montgomery, 1906) is also transmitted causes "snoring disease" in cattle[4]. Surveillance for cat- tle schistosomiasis is generally inadequate and the litera- * Correspondence: [email protected] ture is limited, but some idea of the problem can be 1 State Key Laboratory of Biotherapy, West China Hospital, West China Medical School, Sichuan University, Chengdu 610041, China gained from past small scale studies. Surveys in Sri Lanka Full list of author information is available at the end of the article revealed a prevalence of S. spindale of 31.2% (of 901 cat- © 2010 Liu et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons At- tribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Liu et al. Parasites & Vectors 2010, 3:57 Page 2 of 18 http://www.parasitesandvectors.com/content/3/1/57

tle)[5], whilst in Bangladesh a similarly high prevalence of per year; however, in Plio-Pleistocene humid periods this 36% has been reported[6]. More recently, in Kerala South has risen to 630 mm or more[22]. It has been recognized India, prevalences have been reported of up to 57.3% in that it is the length and severity of the dry season that cattle, 50% in buffalo and 4.7% in goats[3]. influences vegetation type (savanna, rain forest, monsoon In addition to causing disease in cattle, I. exustus has forest, etc) rather than the annual rainfall[23]. The rain- been implicated in outbeaks of cercarial dermatitis in fall in the dry season of Assam is approximately 250 mm, human populations in India[7,8], Laos[9], Malaysia[10] whereas that in Northeast Thailand is only 30-40 and Thailand[11,12]. Cercarial dermatitis results from mm[21]. These observations suggest that the Assam/Ban- the cutaneous allergic reaction in people exposed to lar- gladesh populations are subject to different ecological val schistosomes (cercariae) shed by infected snails into conditions than those of the Southeast Asian mainland freshwater bodies such as lakes, ponds, and paddy fields. (SEAM); these observations support the need to question The cercariae cause pruritis and papular eruptions, with the monophyly of Indoplanorbis. The invasive nature and often severe secondary infections, as they attempt to ecological tolerance of the snails adds to their importance infect a non-permissive definitive host and die in the in veterinary and medical science. skin[13]. Indoplanorbis exustus (see Additional file 1) is a com- The historical biogeography of Indoplanorbis mon snail across Southeast Asia and the Indian sub-con- The Lymnaeae and Planorbidae appear to share a com- tinent, where it often acts as intermediate host for S. mon ancestor in the Permian (c.a. 250 Ma (Mega annum spindale. The snail is also found in the Middle East or million years ago)) fossil record[24]. Fossil Planorbinae (Oman and Socotra) and Nigeria and the Ivory Coast; and Bulininae are known from the mid to upper Creta- these findings were attributed by Brandt[14] to recent ceous of Africa and India[25]. The Tethys sea separated introductions by human activities (Brandt's view has Africa and Laurasia until 10-5 Ma[26]. Consequently, been frequently cited in the literature on Indoplanor- Meier-Brook 1984[27] adopted an African (Gondwanan) bis[15-17]). In view of the wide geographical range of this origin for Indoplanorbis with rafting to Asia since the snail and its importance as a host for several species of Cretaceous on the northward migrating Indian craton; Schistosoma, there is a need to understand the phyloge- this author also considered a Europe to Southwest Asia netics and dispersal history of Indoplanorbis. In contrast tract or an Africa to South India dispersal. Morgan et al. to Asia, the well documented appearance of the snail in 2002 attributed the occurrence of Indoplanorbis in India Africa (e.g., Nigeria[18] and Ivory Coast[19]) and more to colonization (from Africa) via the Middle East land recently (2002) in the Lesser Antilles[16], is almost cer- connection[28]. Similarly, Attwood et al. 2007 described a tainly the result of introductions through human activi- Sinai-Levant dispersal tract, from Africa to central Asia, ties over the last 50-100 years. for Schistosoma spindale during the early Pleistocene[29]. Indoplanorbis exustus is a hermaphroditic invasive Clearly the two different dispersal mechanisms imply snail species with high fecundity. Within one year of very different chronologies; the Gondwanan vicariance introduction the snail is able to colonize habitats with hypothesis implies that proto-Indoplanorbis has been well established populations of other pulmonate and present in India since the late Eocene (35 Ma; India: Asia prosobranch snails[16]. The snail requires a water tem- collision[30]), whereas dispersal via the Sinai-Levant sug- perature in excess of 15°C for maturation. At the opti- gests a Plio-Pleistocene arrival. mum temperature of 30°C each snail can lay up to 800 The assumption that the Indoplanorbis lineage has eggs[20]. The capacity for self-fertilization and high been represented on the Indian craton since the Creta- fecundity probably underlies the invasive potential of the ceous is difficult justify in palaeogeographic terms. There species. The snail is found in small ponds, pools and less would have been numerous opportunities for dispersal of commonly in rice paddy fields. The snail may also occur Gondwanan taxa off India and into Asia throughout the in semi-permanent pools formed in flooded areas of Cenozoic. For example, during the middle Eocene (45 fields, where it can survive the dry season buried in mud. Ma) the northern corner of greater India came into con- Consequently, dispersal may occur in clumps of mud tact with western Indonesia and subsequently with adhered to the bodies of cattle or across water in flotsam Sumatra and Myanmar[30]. Even prior to the true (hard) (vegetation mats), and possibly also attached to migratory collison of India with Asia (i.e., between the Indian craton birds (although this has not been observed for I. exustus). and the Lhasa block at 35 Ma), faunal exchange was pos- In Northeast Thailand, where I. exustus is very common, sible via volcanic arc terranes (some forming exposed pla- the annual rainfall is 1541 mm. In Assam, in the region teaus) in the oceanic crust subducting at the collision where the snails were sampled for this study, the annual zone (c.a. 130 Ma onwards)[31]. Thus the absence of a rainfall is much higher at around 4200 mm (data recorded fossil record for Indoplanorbis or bulinine taxa from the 1982-2002)[21]. The current rainfall in Arabia is 340 mm Southeast Asian Palaeocene or Eocene argues against a Liu et al. Parasites & Vectors 2010, 3:57 Page 3 of 18 http://www.parasitesandvectors.com/content/3/1/57

Gonwanan-India rafting hypothesis for Indoplanorbis rrnL locus and 11 at the mt cox1 locus; there were 13 dis- and favors a post-Miocene Sinai-Levant dispersal tract to tinct haplotypes in the combined data set, with the taxa India. from Laos, Luzon and Thailand (Phitsanulok) being Earlier studies on I. exustus provide some DNA indistinguishable. The aligned cox1 data comprised 619 sequence data (cox1[32,33], nuclear 18S rRNA bps and the rrnL data 454 sites (396 bps without align- gene[32,33], mitochondrial (mt) 16S rRNA gene[32], and ment gaps); the total aligned data set was therefore 1073 nuclear 28S rRNA gene[28]); however, these studies con- sites. The PTP-test indicated that the data showed signif- sidered only one population, and in most cases did not icantly more structure than randomized data (P = report the location of the collecting site. In view of this, 0.0002). The ILD-test was also significant, indicating that no population phylogenetic study had been possible prior the two data partitions (cox1 and rrnL) should be mod- to the present work. Collection site details were reported eled separately (P = 0.0203). Tests for deviations from for one data set[33], which was recorded as being sam- neutrality were not significant: cox1 Fu * Li's D* -0.87029 pled from Loei Province northern Thailand[34]; these (P > 0.10), F* -1.10963 (P > 0.10); rrnL D* -1.29227 (P > data were for the 18S gene and cox1, but did not include 0.10), F* -1.58492 (P > 0.10), all calculations were based the 16S gene and were geographically close to the sample on total number of mutations. Tajima's D: cox1 -1.23388 from Phitsanulok taken in the present study. Conse- (P > 0.10); rrnL -1.63067 (P > 0.05). The test of Xia et quently, the Loei data were not added to the present anal- al.[35] suggested that there were no significant levels of ysis. substitution saturation at either locus (ISS < ISS.C, P < 0.0001, a lack of statistical significance here would imply Aims of the study a poor phylogenetic signal). The palaeogeographic, fossil and biogeographical disper- Uncorrected p distances were obtained from PAUP* sal data described above imply a Miocene origin for the and transformed to percentage sequence divergence esti- radiation of Indoplanorbis across Asia. Under such a mates. The use of this simple (there was no correction for model, colonization of India could have been followed transitional bias) distance measure was to afford compar- almost immediately, by radiation across Southeast Asia ison with other studies on inter and intraspecific varia- through land bridges created by low eustatic sea levels tion among freshwater pond snails. The average during Pleistocene glacial excursions. The aim of this divergence among ingroup taxa was 3.123% (range from study was therefore to collect DNA sequence data for 0.095 to 8.385%). The largest divergence was observed populations of the snails across the known range. These between the Bangladesh population and the other data could then be used to address three main questions. ingroup populations (excluding Assam), average 7.871% Firstly, the timescale for the dispersal and diversification (range from 7.617 to 8.385%). This was followed by the of these snails across Asia; the time of arrival of taxa in divergences between the Assam population and the other various geographical regions is important because it ingroup populations (excluding Bangladesh), average sheds light on how long different host:parasite assem- 6.546% (range from 6.345 to 6.724%). In contrast, the blages have been in contact and facilitates studies on phy- divergence between the Assam and Bangladesh popula- logenetic tracking or co-evolution. Secondly, the work tions was only 4.16%. The divergence between aimed to determine the historical events (climatic, tec- Biomphalaria glabrata and Radix a. rubiginosa was tonic or palaeogeogaphical) that may have influenced the 16.9%. phylogeography of these taxa. Thirdly, the taxonomic sta- tus of Indoplanorbis was considered. It seems improbable Phylogeny reconstruction that only a single species of the genus has arisen if its Maximum likelihood (ML) method radiation dates from the Miocene. A phylogeny was estimated for all taxa (with unique hap- In order to answer the above questions, a population lotypes) using PAUP*. An LRT comparing the TIM + G phylogeny for Indoplanorbis was first estimated and then and TIM + ss models for the cox1 data indicated a signifi- used to elucidate major dispersal tracts and confirm the cant difference between them (χ2 = 336.627 P < 0.0001, taxonomy of the genus (i.e., that only one species has d.f. = 2) favouring TIM + ss, consequently this model was arisen). Molecular clock methods were then used to date used for the cox1 partition. Here ss refers to the estima- the radiation of these taxa so that the cladogenic events tion of codon position specific rates. An LRT of the and dispersal tracts inferred could be related to any major molecular clock hypothesis failed to support the hypoth- historical environmental changes. esis that the different lineages had been evolving at the same rate (-ln likelihood with a clock enforced 3218.4030, Results without clock 3207.4237; χ2 = 21.96, P = 0.0247). Figure Sequence data 1A shows the phylogeny estimated by direct ML for these Of the 15 taxa included in the analysis (see Table 1), 9 dis- taxa. The most basal taxa are the northern India popula- tinct haplotypes were found at the mitochondrial (mt) tions (Assam and Bangladesh), followed by South Indian Liu et al. Parasites & Vectors 2010, 3:57 Page 4 of 18 http://www.parasitesandvectors.com/content/3/1/57

Table 1: Locations and dates of samples used in the study

Country Locality Coordinates Collection date Accession number: cox1

Indoplanorbis exustus rrnL

Assam Dibrugarh 27°30'08.2";94°57'06.7" 20/06/2002 [GenBank:GU451744] [GenBank:GU451726]

Bangladesh Mymensingh 24°43'33.4";90°25'32.7" 12/07/2003 [GenBank:GU451745] [GenBank:GU451727]

Borneo (Malaysia) Bintulu 02°52'31.0"; 112°50'19.0" 29/05/2002 [GenBank:GU451746] [GenBank:GU451728]

Indonesia Bogor (West Java) -6°27'44.5";106°42'25.1" 19/03/2005 [GenBank:GU451747] [GenBank:GU451729]

Laos Xang 19°10'15.0";102°13'30.0" 14/05/2005 [GenBank:GU451750] [GenBank:GU451751]

Malaysia (West) Kampang Pelegong 05°17'11.0";100°27'29.0" 26/05/2004 [GenBank:GU451738] [GenBank:GU451731]

Nepal Janakpur 26°48'03.6";85°56'27.9" 12/04/2003 [GenBank:GU451739] [GenBank:GU451732]

Oman1 Wadi Bani Khaled 22°37'02.2"; 59°05'36.8" 25/11/04 [GenBank:GU451740] [GenBank:GU451733]

Oman2 Wadi Qab 22°50'13.7"; 59°14'34.7" 30/11/2004 [GenBank:GU451741] [GenBank:GU451734]

Philippines (Luzon) Bulan 12°39'55.0";123°54'38.5" 12/03/2004 [GenBank:GU451748] [GenBank:GU451730]

Sri Lanka Kekirawa 08°01'04.7"; 80°36'16.3" 10/07/2005 [GenBank:GU451742] [GenBank:GU451735]

Thailand Khon Kaen 15°50'06.0";103°23'16.0" 08/03/2000 [GenBank:GU451743] [GenBank:GU451736]

Thailand Phitsanulok 16°49'45.0";100°17'30.0" 14/06/2001 [GenBank:HM104223] [GenBank:HM104222]

Radix auricularia rubiginosa

Thailand Udon 17°10'19.0"; 102°26'25" 29/05/2001 [GenBank:GU451737] [GenBank:GU451749]

Biomphalaria glabrata

Brazil - - - [GenBank:NC_005439] All sequences are new data collected for this study, with the exception of the Biomphalaria sequence which was obtained from preexisting data in GenBank. GenBank accession numbers are given for all data. Liu et al. Parasites & Vectors 2010, 3:57 Page 5 of 18 http://www.parasitesandvectors.com/content/3/1/57

(Sri Lanka). From this "Indian" root, a SEAM clade arises; Bayesian method with P4 this clade is not fully resolved and comprises a trichot- Initial runs with a burnin of up to 2.3 million generations, omy formed by a Oman/Malaysia clade (Malay-Occiden- up to 12 million sampled generations, and with TrN+I, tal, MO), a Sundaland (Sundaic) island clade (Indonesia F81, HKY, HKY + G and F81 models specified for codon and Borneo) and a "ubiquitous" clade (including Laos, positions 1, 2 and 3, rrnL and gaps, respectively, indicated both Thai populations, the Philippine population and a possible burnout problem (the likelihood reached a Nepal). Here we define Sundaland as the islands, which peak, then fell and then plateaued) with failure to reach rest on Asia's shallow continental shelf, namely the Malay true convergence. Burnout can be a sign that the Markov Peninsula and maritime Sundaic islands of Sumatra, Java, chain has been distracted from the ML region by regions Borneo, and the surrounding smaller islands. Aside from of lower likelihood but higher posterior probability mass. the placing of 2 taxa (Malaysia and Sri Lanka), this phy- The result of this can be apparent exclusion of unbiased logeny is consistent with the a priori hypothesis for a tree lengths from the 99% credible interval, where this radiation out of central Asia via India, and driven by interval becomes so wide that the chain is unlikely to solely geophysical factors. The occurrence of Nepal in the sample it evenly[36]. This problem is often caused by "ubiquitous" clade with Thai taxa was unexpected. We issues related to model specification, such as the use of doubt that this was an error, as the Nepal samples were overly complex models. To overcome problems with con- processed almost a year apart from the Thai and Philip- vergence and burnout the following changes were made. pine samples, also the sequences were not identical to any The chain temperature was lowered from 0.2 to 0.1 (thus other taxon in that clade (and no other taxon showed making cold and heated chains more able to exchange intra-population variation). states, thus increasing acceptance rates of proposals); the model was simplified by substitution of TrN (BIC

Figure 1 Phylograms estimated for the Indoplanorbis exustus populations sampled (with clade support values) A. Phylogeny estimated by maximum likelihood. Numbers assigned to each node are bootstrap support values in the maximum likelihood analysis (5000 replicates); the numbers following in parentheses represent the posterior probability that the hypothesis represented by this bi-partition, and under all parameters of the mod- el, is correct, given the observed data in a corresponding Bayesian analysis implemented using MrBayes. Clades referred to in text (and Table 2) are given in italics and their bounds indicated by grey or black shaded boxes. B. Phylogeny estimated by a Bayesian method implemented in P4 and with a polytomy prior set to e1.11. Numbers assigned to each node are posterior probabilities for the corresponding bi-partition. The posterior probabilities are scaled here to range from one to 100. Liu et al. Parasites & Vectors 2010, 3:57 Page 6 of 18 http://www.parasitesandvectors.com/content/3/1/57

1050.244) for TrN + I (BIC 1057.008) (a move also possi- repeated with gap cost = 1 and all 4 of the Tv weightings bly reducing Long-Tree (LT) error); and a user starting in turn. The cox1 and rrnL loci were input as two separate tree was specified (again potentially reducing LT error). partitions. The Radix sequences were set as the outgroup. The starting tree was based on the tree in file R1 (see A single most parsimonious tree was found. The resulting Additional file 2) but with all non-Indian taxa (and Nepal) cladogram was identical in topology to the phylogeny in one unresolved clade, branch lengths were otherwise estimated by the ML method. The same tree was chosen taken from the PAUP ML tree. The starting tree was either with or without the iterative pass search. edited to have a non-bifurcating root. Clade posterior probabilities estimated using MrBayes After implementing the above moves apparent conver- were all high; bootstrap support values from the direct gence was achieved with a burnin of 4 million genera- ML method were high (>70%) for all clades except for the tions. Posterior probabilities were then estimated over 10 Malay-Occidental (MO) clade (53-67%), the SEAM-Sun- million generations beyond the assumed point of station- daic clade (47%) and the Sri Lanka-SEAM-Sundaic clade arity. The final polytomy prior was set to 3.038 (varying (61%). Posterior probabilities for clades recovered using the prior from e to e2 had no significant effect on the phy- P4 were also high except for MO (50%) and Luzon/Nepal logeny). The relative rates of the 3 codon positions were (50%); this was with a polytomy prior set, with polyto- 0.253479, 0.283483 and 2.5462; the latter value was the mies excluded (as in MrBayes) clade support increased highest rate of any partition and corresponded to the (all above 70%) (see Figure 1). Jackknife support on the third codons, consequently there was no indication of LT maximum parsimony tree showed a similar pattern to the error. The resulting phylogeny (see Figure 1B) was identi- Mr Bayes tree. cal to that of the ML method except that the ubiquitous clade was resolved (Thai, (Luzon, Nepal)). Hypothesis testing MrBayes Shimodaira-Hasegawa test (SH-test) and topological method The final run conditions were: GTR, F81, HKY, HKY + G (TM) tests and F81 (for the cox1 codons and the rrnL partition, A constraint tree was first drawn to describe a plausible respectively); the chain temperature was 0.1; the covarion hypothesis for the radiation of Indoplanorbis based only model was applied to the first 4 partitions (invoking + I) on palaeogeographical and climatological events (see the BF comparing this move with the simpler model was Additional file 2). This tree assumes a central Asian ori- 183.46 in favour of the covarion model; the ASPSD was gin for the genus, with colonization of India and then 0.005028; the PSRF values ranged from 1.000 to 1.009 SEAM and finally the Sundaic islands. The events used to (mode 1.003); the burnin was 2.7 million generations, shape this tree were: 25 Ma Arabia-Africa separation with 10 million generations sampled post burnin; a ran- (Indoplanorbis diverges from other bulinines)[37]; 20 Ma dom starting tree was used. GTR was used for the first orginally Indian taxa enter Sundaland via extended Brah- partition because of the limited range of models specifi- maputra-Irrawaddy and become isolated there[38]; 3 Ma able in MrBayes 3. The topology of the estimated phylog- taxa isolated on Borneo/Indonesia as these terrains sepa- eny was identical to that of the ML method, with almost rate from Malaysia[39]; <0.5 Ma SEAM taxa isolated at identical relative branch lengths. end of Mid-Pleistocene Transition (MPT)[40]. The end of Parsimony method the Arabian humid interval (3.5 - 1.2 Ma[37]) was used to POY4 was used to implement a parsimony method using date the Oman radiation. The branch lengths of these dynamic homology characters with direct optimization trees were then rescaled to within the range of those (i.e., no a priori multiple sequence alignment was found in the ML tree from the PAUP* analysis. The Nepal required). In the initial sensitivity analyses the Navajo taxon was moved from the Indian clade to the Sundaland Rug indicated that no partition was overly sensitive to clade to follow the indications of the (unconstrained) ML variations in weighting model over the range tested (see tree search. Additional file 3), and that, whilst the cox1 data appeared The log likelihood for the geophysical hypothesis (tree) best for resolving more recent radiations, the rrnL data was -3218.32 P = 0.358, indicating that it is an equally appeared best for recovering the deeper phylogenetic likely explanation of the data as the unconstrained tree divisions. The maximum MRI values were obtained for (all other "best" trees were significantly worse). In TM schemes with gap cost = 1 and did not decline with Tv test (using the same hypothesis tree) the posterior proba- cost until costs exceeded 32. The maximum MRI for cox1 bility of the geophysical hypothesis was 533/47779 trees, and rrnL was greater than that seen when the cox1 first i.e., P = 0.0112. These results suggested that the true phy- and second codon positions were partitioned separately logeography might not differ significantly from the a pri- from the third positions (0.45 vs 0.41; see Additional file ori hypothesis, but that trees of exactly that topology 4). As the Tv cost weighting schemes 2, 4, 16, and 32 were were not well represented in the posterior (i.e., the phylo- equivalent in the MRI testing, the POY tree searches were geography indicated by the data is significantly different Liu et al. Parasites & Vectors 2010, 3:57 Page 7 of 18 http://www.parasitesandvectors.com/content/3/1/57

from that of the central Asia-Brahmaputra-SEAM prior on the TMRCA of the outgroup as described above. hypothesis). The outgroup was then removed and the more recent Estimation of divergence times with BEAST divergences were estimated with a prior on the TMRCA For these estimates all 15 taxa were included but gaps of the Sri Lanka population and all of the SEAM-Sundaic were excluded (as missing data). Initial runs with a Yule populations (this was the split subsequent to the out- tree prior and an uncorrelated log-normal (relaxed) clock group); this prior was the estimated value from the first model showed that the ucld.stdev was >1.0 and its confi- runs (with all 15 taxa). The outgroup was removed dence interval did not come close to zero (mean 2.84816 because its inclusion results in undesirable time depth of HPSD ranged from 1.80335 to 3.7994); this suggested that the phylogeny; this led to underestimation of the overall the data were not sufficiently clock like for analyses using mean rate and inflated time estimates for the more recent a strict molecular clock. A 2 partition model (by locus) divergences among the ingroup. The McMC was 10 mil- with TIM + G and HKY + G was compared with a by lion generations each run (3 runs were combined), with a codon model (TrN, F81, HKY, HKY + G) the latter burnin of 10%. All ESS values were >500. The posterior appeared preferable (BF 87.88). Other moves, such as distributions were examined using Tracer[45] and no dis- TrN+I or HKY, HKY did not result in significant tribution was seen to be cut off by its prior or show signs improvement. Consequently, TrN, F81, HKY, HKY + G of failure to converge (rising likelihood). The estimated was used in the main analyses. The prior on the mean TMRCA values are given in Table 2. TMRCA estimates rate was a normal prior with mean equal to the average of for an exponential distribution clock model were also published rrnL (one rate) and cox1 (2 rates) clock rates for examined to ensure they agreed with those in Table 2; freshwater snails and S.D. Set such that the 95% confi- this was because BF comparisons of the two models indi- dence interval spanned twice the range of these 3 values. cated they were equally fitting (BF 0.16 in favour of the The 3 rates were as follows (in substitutions per site per log normal model. It was found that simpler tree priors year × 10-8): 1.097 for rrnL, triculine snails (Gastropda: (Yule and Birth-Death models) gave implausible TMR- Pomatiopsidae) over the early Pleistocene[41]; 1.83 for CAs, with coalescent priors performing better. The wide early Pliocene Hydrobiidae[42]; and 1.62 also for Hydro- range of the HPDs in for the divergence time estimates in biidae[43]. These rates were estimated following the sim- Table 2 reflects the uncertainty inherent in all molecular ple point estimation method of Edwards and Beerli[44], date estimates; this is not unusual and is a realistic feature but are useful as a prior in this study. The other initial of this method of analysis. empirical prior was one on the height of the tree; this was a 100 to 350 Ma uniform prior. A more informative nor- Discussion mal prior of 200 Ma S.D. 30 Ma was also used, but there Phylogenetics was no marked difference on the estimation of mean rate The topological-method hypothesis test rejected the a or other TMRCAs (Time to Most Recent Common priori (central Asia-Brahmaputra-SEAM) model; this Ancestor). The informative prior was based on a Triassic/ suggested that the phylogeography of these snails differed Jurassic boundary date for the divergence of the Planorbi- significantly from the history of other snails such as the dae and Lymnaeidae[24]. Runs were also performed using Triculinae[46], which are also found in India and the an NPRS-transformation of the ML tree from PAUP as a SEAM. The phylogenies estimated for Indoplanorbis starting tree (scaled to a height of 250 Ma), but, as this exustus offer strong and concerted support for an origin had little effect on run performance, the final runs used of the South and Southeast Asian radiation in Northern random starting trees. Runs using only a prior on tree India; the Assamese and Bangladeshi populations form a height or on mean rate (but not both) gave implausible robust basal clade. The cratonic Indian (or at least South results. Sampling from the prior only (with conditions Indian/Sri Lankan) populations radiated later. Sri Lanka otherwise set to those of the final runs) also resulted in was intermittently linked to mainland India by extensive implausible results and TMRCA estimates very different land bridges until about 165 Ka[47], consequently the Sri from those obtained after observing the data. Each run Lankan population is considered as phylogeographically was repeated with 3 different starting random number South Indian. The most recent radiation could only be seeds the results of each run were compared and then partially resolved by the present data, resulting in a tri- combined for parameter estimation from the posterior chotomy of Malay-Occidental (MO), Sundaic and ubiqui- distribution. tous clades. The ubiquitous clade had representatives on The final runs used a coalescent model, with the prior Thailand and Laos (SEAM), Luzon (Philippines) and assumption of constant population size (throughout the Nepal (Himalayan frontal ranges). The colonization of genealogy), for the starting-tree prior and an uncorre- Arabia (and probably also Africa) appears to be most lated log normal relaxed clock model. TMRCAs were recent and this agrees with earlier assumptions based on estimated for the most basal divergences first, with a biogeography only[17,18,48]. The suggestion of an Indian Liu et al. Parasites & Vectors 2010, 3:57 Page 8 of 18 http://www.parasitesandvectors.com/content/3/1/57

Table 2: Results of a Bayesian estimation of divergence times

Parameter Mean ± S.E. ESS 95% HPD Lower/Upper

-ln(likelihood) 2018.99 ± 0.55 3517.56 2028.77/2007.54

mean substitution rate 1.37 ± 0.01 4896.57 0.02/2.43

tmrca(Outgroup) 186.99 ± 0.03 17814.11 180.58/194.00

tmrca(Ingroup) 6.93 ± 0.42 1085.77 1.09/20.42

tmrca(North India) 2.84 ± 0.13 2944.97 0.01/8.83

tmrca(Cratonic India) 2.48 ± 0.09 2796.45 0.13/9.48

tmrca(Southeast Asia) 0.96 ± 0.04 2614.02 0.07/3.37

tmrca(Malay-Occidental) 0.45 ± 0.02 3219.78 0.01/1.5

tmrca(Arabia) 0.18 ± 0.01 4481.45 0.00/0.60

tmrca(Sundaic) 0.17 ± 0.01 4426.12 0.07/0.57

Tmrca (Ubiquitous) 0.16 ± 0.01 3697.91 0.01/0.52 Time estimates are given in millions of years (Ma) for nodes representing the most recent common ancestor of relevant clades. The mean substitution rate is expressed per 100 bases per Ma. ESS, effective sample size (corrected for auto-correlation); HPD, the 9% highest posterior probability density (equivalent to a confidence interval); SE, standard error of the mean; TMRCA, time to most recent common ancestor of the taxa in the clade. Here Southeast Asia refers to SEAM (Southeast Asian mainland) and Sundaic. origin also agrees with earlier biogeographical hypothe- India (except for the Ghats) and Southeast Asia where the ses[48]. The derivation of the Luzon/Nepal clade from samples were taken (see Background, first section). the SEAM element of the ubiquitous clade was less well Although the concept of critical levels of sequence diver- supported; this clade was only supported by the P4 analy- gence (genetic distances) that can be used to delimit spe- sis; however, it is reasonable to accept that the extralimi- cies boundaries is generally inappropriate, such data can tal taxa originated on the SEAM. be used to differentiate species in combination with other The relative depths of divergences in the phylogeny kinds of observations. Typical interspecific divergences of suggest that Indoplanorbis comprises a single species, 1.9-6.9% have been reported for Biomphalaria[49] (for save for possibly the northern India lineage, which forms the mt 16S and nuclear ITS loci) and, as these snails have a cohesive basal clade. If the Assam/Bangladesh taxon is a similar ecologies to I. exustus, these values could be used distinct species this may explain the apparent stasis in the as indicators in the present study. Genetic distances for northern Indian radiation for a relatively long period, fol- the related snail Bulinus have been reported for the cox1 lowed by a more recent and sudden radiation of Indo- locus. The mean distance within B. forskalii from differ- planorbis outside India. Also worthy of note in this ent countries was 4.2% and for B. senegalensis 1.7%. The context is the observation that infection prevalences in divergence between two closely related Bulinus species, natural populations (with Schistosoma spindale) were B. camerunensis and B. forskalii, was 3.96%[50]. In view always >8% in samples (of over 200) of snails collected in of these reports, and considering the differences between Thailand and Laos (this study), but are reportedly much those and the present study, a divergence greater than lower in Indoplanorbis sampled in Assam (2.2% [8]) and 6.5% might be considered suggestive of polyphyly in the prevalences of under 1% seem typical in India[4]. In addi- present data. In the present study only the Assam and tion, the Assam/Bangladesh taxa exist in a region of cur- Bangladesh taxa showed such high divergences from rent (and historically) higher rainfall than other parts of other ingroup taxa. Such observations, taken together, Liu et al. Parasites & Vectors 2010, 3:57 Page 9 of 18 http://www.parasitesandvectors.com/content/3/1/57

may be sufficient to warrant a detailed morphological been fragmented into refugia at this time. Assam is comparison of the Assamese (and cratonic India) and regarded as a distinct from the drier Sikkim Himalaya SEAM taxa in order to confirm their conspecific status. (which includes Nepal)[58]. In view of this it is plausible In the present work morphological study was restricted that the initiation of a monsoon dominated climate trig- to only that sufficient to be sure that the snails collected gered the divergence of the Assam/Bangladesh (Table 2, were no other known species aside from I. exustus. North India) and the cratonic Indian and SEAM-Sunda- The occurrence of the Nepal sample in the SEAM land clades at 7 Ma. The dating suggests that proto-Indo- clade, and Sri Lanka in the SEAM-Sundaic clade is similar planorbis remained unchanged on the Indian craton for to other disjunct distributions reported for a variety of almost 100 Ma; thus the divergence at 7 Ma may have early Pleistocene organisms, where certain Indian taxa been a result of some adaptation to more arid conditions, show greater affinity with Southeast Asian taxa than with such as the appearance of a new ecological race (if not neighboring Indian forms. The Asian langurs[51], certain species) adapted to drier conditions. As mentioned bird species[52] and Macaca spp.[53] show such disjunct above, the Assam region, and the part of Bangladesh distributions. The phylogeny of the snail Paracrostoma where the samples were taken, has a year round more Cossmann, 1900 (Caenogastropoda: Pachychilidae) and humid climate than the areas of SEAM that experience an other pachychilids shows a similar disjunction between annual dry season. The divergence of the Assam and Ban- Indian and the SEAM[54]. Although these authors gladesh populations after 2.84 Ma could be attributed to advanced a slightly different phylogeographical hypothe- mid-Pleistocene changes in river courses associated with sis for this taxon, Paracrostoma showed similar disjunct rapid uplift of the frontal range of the Himalaya[59] and India-Southeast Asia affinities in their study as observed uplift of the Silong Plateau and parts of the Meghalaya for I. exustus in the present study . These disjunct distri- Plateau and Khasi hills which separate these regions butions are proposed to have arisen following Plio-Pleis- today[60]. tocene (5.1-1.6 Ma) aridification of southern India with a The next major divergence event is the divergence of shift in the predominant vegetation cover type from trop- the cratonic India clade (Sri Lanka) from the Sundaland ical forest to Savannah, with "wet-zone" species becoming clades; this is dated at around 2.5 Ma. It is known that sig- confined to and migrating along refugia of tropical forest nificant Asian monsoon intensification occurred prior to in the Western and Eastern Ghats and Sri Lanka[52] (Fig- the major glaciation at around 2.7 Ma; this triggered an ure 2). Wet-zone is defined as regions with annual precip- extremely arid 1.4 Ma interval[61]. This dry period is itation of >250 cm and dry-zone as areas with 50-100 cm likely to have forced Southern Indian and SEAM popula- or less, the SEAM is wet-zone. This "dry-zone refugia" tions further into dry-zone refugia, thereby promoting hypothesis would allow migration of (wet-zone) taxa additional divergence. The end of the arid period may from Southeast Asia, via Myanmar and into Sri Lanka then have led to the explosive radiation of the SEAM and and mainland India. Such migration would be facilitated Sundaic taxa at 960 Ka. Note that by 900 Ka sea levels by Quaternary glacial eustasy, as lowered sea levels were again quite high[62] and this would prevent back exposed continental shelf areas providing terranes of low migration after range expansion (thus further facilitating relief and suitable dispersal corridors. For snails radiating divergence). The next dated event is the divergence of the in the early Pleistocene such a disjunct distribution would MO clade at 450 Ka, with the Arabian populations be correlated with fluctuations in sea level during glacia- diverging from a Sundaic (Malaysian) ancestral form. tions around MIS22 (Marine Isotope Stage 22)[55]. This interval (PL4) is associated with very low sea lev- els[63], milder interglacials (with more extensive wet- Molecular dating lands) after 430 Ka, and maximum vegetation cover in The results of molecular dating using the present data Arabia around 500 Ka (MIS13)[64]. These events could and a Bayesian approach with a relaxed clock model are facilitate migration of Sundaic taxa, from the western rim given in Table 2. The mean substitution rate of 1.37 × 10-8 of the Sunda shelf (Sumatra), across exposed regions of substitutions/site/million years agrees well with pub- the Indian Ocean, then around the exposed Indus fan and lished estimates for freshwater snails at these loci[43]. into Arabia or Africa (Figure 2). A Makran coast tract, The divergence of the ingroup is dated at around 7 Ma. however, raises the question as to why the Sundaic lin- This date corresponds with accelerated Himalayan uplift eage did not establish in southern India or Pakistan/Iran, (at 8 Ma) and the associated establishment of a monsoon as it would have passed through these regions. The climate in Asia[56]. Consquently a major climatic change absence from southern India may be explained by ecolog- occurred in northern India at this time, with the appear- ical adaptation of the Sundaic form to SEAM habitats. ance and dominance of C4 plants in the flora and associ- The fact that the migration would have been across low ated deforestation[57]. Indoplanorbis requires humid lying wetland regions that were later below sea level may tropical environments and Indian populations could have explain the absence from Pakistan/Iran. When these ter- Liu et al. Parasites & Vectors 2010, 3:57 Page 10 of 18 http://www.parasitesandvectors.com/content/3/1/57

 

 "  &'             % 

$     #  % ,'        "' ++    ' ,'        #$ ! ' "#!

  * 

    

 %  () $

Figure 2 Semi-schematic summarizing a phylogeography for Indoplanorbis. The radiation is assumed to have originated in the Assam region of Northeast India and initial divergence to have been triggered by the onset of the Asian monsoon and a change from rain forest to drier savanna. More humid refugia are shaded with wavy lines. Arrows show subsequent dispersal tracts. Pleistocene land area is shown by lighter shading extending from the modern coast lines. Pleistocene coastlines are indicated by solid lines, modern coast lines by broken lines. International boundaries, scale and geo- graphical features approximate. ranes were next exposed winter conditions were much The most recent radiations appear to be the Sundaic colder than prior to the Mid-Pleistocene-Transition additions to the range; these being almost isochronous at (MPT)[64]. In addition, after 500 Ka sea levels were never around 165 Ka. This interval corresponds to the MIS6 as low for such sustained intervals[65]. The MO clade is glacial at 195-130 Ka during which sea levels reached then dated as further diverging at 180 Ka into two clades minima of around 120 m below present levels (c.a. 140 in Oman, which were represented by two populations in Ka)[65]. At such times of eustatic minima the Sunda shelf different wadi systems, separated by 30 km of extremely between the SEAM and Java and Borneo was exposed arid mountains. The date 180 Ka coincides with the end (see Figure 2) and island arcs (volcanic) provided land of the longest Pleistocene humid period in southern Ara- bridges towards the Philippines, such that rafting and bia (200-180 Ka)[66]. It is possible that the return to transport by birds could bridge the shorter stretches of much drier conditions and contraction of vegetation sea separating these islands. The presence of the Nepal cover blocked further immigration of snails to the region taxon in the SEAM clade rather than the Indian clade and thereby isolated the coastal (Wadi Qab) population could be explained by glaciation, and or hyper-aridity from the more interior (Wadi Bani Khaled) population. eliminating snail populations in Nepal. There is some evi- Humid periods also appear to have occurred at 160-130 dence for late Pleistocene glaciation in southern Nepal, Ka, 82-78 Ka and [67]10-6 Ka[68], but these were a series but the Brahmaputra valley and areas South of the Nepal: of shorter lived wet periods. The mountains between the Assam border appear to have remained unglaciated[69]. two Oman populations suggest that the two sites were There is also evidence for a hyper-arid climate in West colonized via different tracts, that probably originated India 18-13 Ka, but Assam appears to have provided a from the same source in Makran Iran (across the exposed more humid refugium[70]. If snail populations became Oman and Persian gulfs), one passing inland and one extinct in Nepal they could have been replenished by col- coastal. onization from the SEAM by populations which were not Liu et al. Parasites & Vectors 2010, 3:57 Page 11 of 18 http://www.parasitesandvectors.com/content/3/1/57

sampled in the present study (e.g., from Myanmar). As cant maritime technology until 60 Ka[74]; this would mentioned above the SEAM lineage contains the most have made the colonization of the Philippines possible. widely dispersed taxa and this suggest some adaptation in Although such human migrations are recorded for the this clade which makes it more invasive and possibly able middle palaeolithic, the domestication of livestock, and to out compete Indian clade taxa. their subsequent trade, did not begin until around 10 Ka[75]. Consequently, an anthropogenic radiation of Conclusions Indoplanorbis would be expected to accelerate and diver- The findings of the study suggest an Indian origin for the sify mostly in the last 10 Ka and a human-mediated dis- Indoplanorbis radiation at the Miocene-Pliocene bound- persal would be expected to show a phylogeny with less ary. The radiation appears to have been triggered by the phylogenetic depth than one driven by geophysical switch of the Asian climate from a planetary to a mon- events. soonal climate. The antecedent taxa were probably The timescale of the radiation, which is estimated to be adapted to life in humid and partially forested tropical to Pliocene to late Pleistocene, entirely predates the radia- sub-tropical environments, and now appear restricted to tion of the genus Homo in Asia and the human maritime refugia in Assam and Bangladesh. Further intensification culture. Consequently, the results of this study support of the Asian monsoon at 2.5 Ma appears to have triggered geophysical and climatic forces, rather than human activ- divergence of a peninsular Indian taxon; this taxon may ity, as drivers of the dispersal of Indoplanorbis across Asia have been better adapted to the savanna habitats that and its arrival in Arabia. In contrast, the results could not replaced rain forest as the dominant vegetation type in distinguish an Indian rafting hypothesis from an Africa to cratonic India at that time. Indoplanorbis then appears to India migration via central Asia (because the phylogeny have entered the Southeast Asian Mainland (SEAM) by 1 shows a complex migration out of India, rather than an Ma. The end of a very arid period then triggered diver- informative tract such as Assam-SEAM-Sri Lanka). Inter- gence of the snail on the SEAM at 960 Ka, but the SEAM estingly, the radiation of Indoplanorbis does not appear to lineage may already have acquired additional adaptations track that of Schistosoma spindale, which has been dated to aridity which allowed it to colonize new areas more at 0.25 Ma for the divergence of Bangladesh, Thai and Sri effectively and to live in regions with pronounced dry and Lankan populations[29] this suggests that the host-para- wet seasons more readily than the original Indian taxa. site association here is not tightly co-evolved. As indi- The SEAM lineage first colonized Arabia (possibly via cated by the phylogeny, Indoplanorbis exustus appears to Sumatra, island chains and the exposed Indus fan and comprise two or three ecological races or ecotypes. These Oman gulf during periods of low sea level) at around 500 races show different habitat requirements (the SEAM lin- Ka and then the Sundaic islands. Severe tradewinds in eage appears to have a wider ecological tolerance) and MIS6 could have facilitated this westward migration[71]. some parasitological differences (the SEAM lineage It is possible that there have been multiple colonization shows higher prevalences with S. spindale). The high lev- events, out of the SEAM, over the Pleistocene glacial els of sequence divergence between the Northeast India cycles, with replacement of ancestral forms by ever better populations and the other ingroup taxa also suggest that adapted (to dry conditions) forms; such loss of ancestral these taxa may be a different species from I. exusutus. polymorphism could make the colonization of Arabia Consequently, there is a case for a detailed morphological from the SEAM appear to predate that of the Sundaic study to confirm that Indoplanorbis is truly monospe- islands, when the Sundaic islands may have been colo- cific. This is in addition to the need for additional sam- nized first. The phylogeography hypothesized here is not pling of Indoplanorbis from Afghanistan, Myanmar and proposed as a definitive explanation, but only to demon- western India - to distinguish an Indian rafting hypothe- strate that the major relationships, tracts of dispersal and sis from a central Asian tract from Africa, to test theories timescale are plausible. No doubt alternative palaeogeo- on the origin of the Nepal population, and to confirm that graphical and climatic events could be put forward to the inclusion of only a Sri Lankan population correctly explain the same phylogeny. represents the cratonic Indian lineage. Under the pro- The low eustatic sea levels of the Würm glacial maxi- posed phylogeography it is conceivable that southern mum allowed human migration from central Asia, across peninsular India could have been colonized by snails with northern India and into Myanmar and the SEAM via SEAM clade adaptations distinct from that of the wetter Assam (130-70 ka). Inter pluvial low sea levels could have more forested Sri Lanka. permitted colonization of Indonesia and Borneo, even as far south as Australia[72]. Recent studies on the evolution Methods of human mtDNA haplogroups suggests that Sundaland Sampling was colonized by humans (possibly from northeast Samples were taken in 10 countries so as to cover most of SEAM) over 50 Ka[73]. Humans did not develop a signifi- the range of Indoplanorbis exustus (see Table 1 for details Liu et al. Parasites & Vectors 2010, 3:57 Page 12 of 18 http://www.parasitesandvectors.com/content/3/1/57

of taxa and sampling). No samples were taken in Africa, on the ingroup. Although often criticised for being too as these populations were assumed to be represented by liberal[85], the PTP-test is useful as a first step in phylo- the samples from Arabia. Identification of taxa followed genetic analysis[86], as rejection of the null hypothesis published accounts[17]. Whole snails were fixed in 10% here suggests that the real data may be better than ran- ethanol directly in the field. A small subsample of each domly permuted (phylogenetically uninformative) data. population was taken in 1% neutral formalin to aid identi- In addition Tajima's test for neutrality (based on the total fication. DNA was extracted from individual snails by a number of mutations) was performed using DNAsp (v. standard method[76]. Sequence variation was assessed at 5.10.01)[87]. The data were tested for substitution satura- two loci, being partial sequences of the mitochondrial tion using plots of the numbers of transitions (Ts) and (mt) cytochrome oxidase subunit I gene (cox1) and the mt transversions (Tv) against the maximum likelihood (ML) large ribosomal-RNA gene (rrnL or 16S). Sequences of genetic distance[88]. The indications of these plots were the oligonucleotide primers used in the PCR for the further evaluated using the entropy-based test[35] as amplification of these loci are published elsewhere, for found in the DAMBE (v. 4.5.20) software package[89], the cox1 amplifications the primers HCO-2198 and LCO- which provides a statistical test for saturation. Aside from 1490 were used[77]. For the rrnL locus the primers of the parsimony analyses, gaps were coded as a binary Palumbi et al. (1991) were used[78]. 2 to 8 snails were matrix, and all characters were run unordered and sequenced for each locus/population (mode 4 rrnL and 2 equally weighted. Indels for the sequence alignment cox1). Mitochondrial loci were chosen because with their appeared to be homologous (they were short and of equal maternal pattern of inheritance and therefore smaller length in all taxa) and alignment errors were unlikely (the effective population size, these loci may be expected to sequences aligned readily in Sequencer under stringent evolve more rapidly and be better suited to estimation of parameters). In addition, recurrent mutation is less likely intraspecific phylogenies and more recent (i.e., post-Mio- to significantly affect the analysis of intraspecific phylog- cene radiations). In addition, these loci have proven use- enies and there was no reliably tested approach for choos- ful in such studies of gastropods and molecular clock rate ing a particular model. Consequently, gaps were coded data are available[24,42,43,46,79,80]. Total genomic DNA simply in a binary, presence/absence, manner rather than was used as a template for PCR amplification on a Pro- using SIC or other more complex coding methods[90]. gene thermal cycler (MWG) employing standard PCR conditions[81]. Unincorporated primers and nucleotides Choice of substitution model and data partitioning scheme were removed from PCR products using the QIAQuick The incongruence length-difference (ILD) test[91] as PCR purification kit (QIAGEN). Sequences were deter- implemented in PAUP* (v. 4.0b10; [92]), was used to test mined bidirectionally, directly from the products by ther- for homogeneity between the cox1 and rrnL data parti- mal- cycle-sequencing using Big Dye fluorescent dye tions; the test was applied to informative sites only[93]. A terminators and an ABI 377 automated sequencer (Per- suitable substitution model was selected for each data kin-Elmer), following procedures recommended by the partition using an hierarchical test of alternative models manufacturers. Sequences were assembled and aligned as implemented in Modeltest (v. 3.7)[94] and MrModelt- using Sequencher (version 3.1 Gene Codes Corp. Ann est (v. 2.3.)[95]. The two applications use different sets of Arbor, Michigan). DNA sequences for both strands were hierarchies for the comparison of models[96], thus the aligned and compared to verify accuracy. Controls with- additional use of MrModeltest not only provided models out DNA template were included in all PCR runs to which could be implemeted in MrBayes, but also could exclude any cross-over contamination. highlight cases where Modeltest might fail to select a A multiple sequence alignment was generated for all class of models due to the hierarchy of comparisons. taxa for each locus and the cox1 and rrnL sequences were Models were selected first by BIC, then by second order then concatenated to form a combined data set. No AIC, then (in the case of apparent equivalence of two intrapopulation variation was found. Corresponding models) by the number of free parameters (although AIC sequences were obtained for Radix auricularia rubigi- may also penalize extraneous parameters). There was no nosa (Michelin, 1831) (Lymnaeoidea: Lymnaeidae) in this disagreement between Modeltest and MrModeltest in laboratory, as outgroup sequences. Additional outgroup cases where a model considered by both applications was data were obtained for Biomphalaria glabrata (Say, 1818) chosen by Modeltest. Molecular evolution of indels was (Planorbidae: Planorbinae) from the GenBank [Gen- modeled by an F81 model. For the calculation of the BIC, Bank:NC_005439][82]. the sample size was set as the number of characters in the alignment. Preparation of data As an initial test for phylogenetic signal in the data, a Per- Phylogeny reconstruction: parameters and priors mutation Tail Probability test (PTP)[83,84] with 50000 Phylogenetic estimation used three different approaches; replicates (each of 100 random additions), was performed this strategy was adopted to look for resilience of the Liu et al. Parasites & Vectors 2010, 3:57 Page 13 of 18 http://www.parasitesandvectors.com/content/3/1/57

hypothesis to changes in method and associated assump- (for analyses with an equivalent level of confidence) but tions. The rationale was that any clade that was repre- are also statistically superior to a solely ML method[99]. sented in phylogenies found by all methods (and well For example, such methods do not assume approximate supported) would be a reliable reconstruction of phyloge- normality or large sample sizes as would general ML netic history for these taxa. The use of a direct ML methods[100]. BMs differ from direct ML methods in method and Bayesian methods also allowed comparison that the former consider the posterior probability of the of estimated branch lengths, as Bayesian methods (as model (with parameters) and tree after observing the implemented in MrBayes) have been known to errone- data. The posterior probability of a hypothesis is propor- ously converge on LT solutions in cases where partitioned tional to the product of its prior probability and the prob- data are described by complex models with many param- ability of observing the dataset given the hypothesis (i.e., eters close to nonidentifiability[36,97]. The ILD test com- the likelihood of the hypothesis). Consequently, unlike paring the cox1 and rrnL loci was significant (P = 0.02), so direct ML, a BM allows incorporation of prior informa- the two partitions were modeled separately. The starting tion about the phylogenetic process into the analysis. P4 models indicated for the ML analyses were as follows: (v. 0.87) was used as the primary method to apply the BM; All sites: TIM + G this employs the same method as MrBayes[101] but cox1 first codon positions: TrN + I allows consideration of unresolved trees (i.e. polytomies) cox1 second codon positions: F81 and can implement all commonly used substitution mod- cox1 third codon positions: HKY els as well as any custom model the user might need to rrnL: HKY + G specify[102]. Indels: F81 The priors specified for the BM generally followed the The F81 model for gaps was not chosen by goodness of default values found in MrBayes; a flat Dirichlet distribu- fit tests, but it is the only appropriate model suitable for tion was set as the prior for the state frequency and for indels. Also, it is a monotone model fitting the similarly the rate set priors (e.g., revmat, tratio), an exponential monotone substitution of a binary character evolving prior was set on the branch lengths. A polytomy proposal along a phylogeny[98]). was set as either zero (i.e., no favouring of multifurca- The following details the methods particular to each tions) or as e, e1.11 e2 to examine the effect this has on the phylogenetic method. Analyses were time consuming and posterior probabilities of the clades found; this imple- performed on several multiprocessor machines (4 × 2.44 ments a move (proposed by Lewis et al., 2005) to counter GHz PCs), with output files pooled from multiple simul- the problem of the spuriously high posterior clade proba- taneous runs where appropriate. bilities returned by MrBayes relative to corresponding ML with heuristic search ML analyses[103]. Model parameters and relative rates Heuristic searches were performed (under the respective were set to be freely variable and unlinked; there were model and starting parameters indicated by Modeltest) four discrete rate categories for the Г-distribution. Out- using PAUP* initially with 1000 replicates, random addi- group monophyly was enforced. The McMC was tuned to tion of sequences (10 replicates) and tree-bisection- give proposal acceptance rates between 10 and 70% for reconnection branch-swapping (TBR) options in effect. each data partition. The TIM+G model was used, gaps were treated as miss- Four simultaneous Markov chains were run (one cold, ing data. The outgroup was forced to be monophyletic. three heated) and trees were sampled every 10 genera- The initial search rapidly converged on a single tree tions, two such runs were performed simultaneously. which was hit 100% of the time. Consequently, the search Convergence of the McMC was assessed by plotting split was repeated with nchuck = 10 and chuckscore = 3210 in support (at the Indonesia/Borneo node) for consensus an attempt to improve exploration of parameter space trees over different generation time windows; the genera- and find trees of better score. This was repeated with the tion of convergence was considered to be that at which monophyly constraint both on and off. A final search of the support reached a plateau. In addition, a plot of the 5000 replicates was then performed to increase the cumulative split frequencies from two simultaneous runs chance that the ML tree had been found. A likelihood (which should cluster along the line y = x when both runs ratio test for a strict molecular clock hypothesis was also are converging on the same region of parameter space) made possible by repeating the final run with a clock con- was also used to determine the point of stationarity (see straint. Nodal support was assessed by bootstrap with literature for details of rationale[104]). Generations prior 5000 replicates. to the inferred point of convergence (i.e., prior to station- Bayesian method (BM) with McMC applied by P4 arity) were discarded as a burnin. Bayesian phylogenetic methods are becoming increas- Bayesian method with McMC applied using MrBayes ingly popular in molecular systematics with the belief To afford comparison with more standard analyses and to that they are not only faster in terms of computing time confirm the findings with P4, the BM was reapplied using Liu et al. Parasites & Vectors 2010, 3:57 Page 14 of 18 http://www.parasitesandvectors.com/content/3/1/57

the popular package MrBayes (v. 3.1.2). Run conditions string as a character and each sequence as a state (i.e., the followed those of the P4 analyses, where possible. Unlike data are treated as dynamic homology characters)[108]. P4, the polytomy proposal for MrBayes is effectively fixed The data are partitioned in the sense that the homologies at zero and model specification was more limited than in are not moved between them, but are combined during P4. In order to help assess if runs had converged the aver- optimization in that each has the potential to influence age standard deviation of the split frequencies (ASDSF) the "alignment" of any other partition (strictly alignment was used to gauge the similarity of trees sampled by 2 is not performed, only implied). The data partitioning independent runs and the Potential Scale Reduction Fac- scheme and Tv/Ts weighting scheme and gap cost were tor (PSRF) was used to compare the final mean parameter determined by a MRI[109] and "Navajo rug" approach (a estimates (for the 2 runs), including the posterior proba- graphic plot where the parameter space is represented as bilities of nodes[105,101]. The McMC was considered a grid)[110]. The MRI provides only a rough measure of complete if this ASDSF was <0.01 otherwise the run (and congruence among data partitions; however, the burnin) was lengthened until the ASDSF was adequate. approach has performed adequately in empirical PSRF values close to 1.000 were also considered as indi- tests[111]. The Tv:Ts weightings considered were 1, 2, 4, cating that convergence might have been achieved. 8, 16 and 32, and the gap costs similarly ranged from 1 to As noted above, it is possible for MrBayes (and P4) to 64 (a total of 42 weight matrices). The gap opening cost converge on an incorrect LT solution where partitioned was set to one. One Navajo rug was produced for each data are used with complex models. This can occur data partition (cox1, cox1 first and second codon posi- because the default starting tree used by MrBayes often tions, cox1 third codon positions, rrnL and cox1 + rrnL) has relatively long branch lengths for most data sets (this and weighting scheme was plotted against clade, with can draw the chain towards long-trees in the early stages each clade sored as monophyletic, paraphyletic or poly- of the run, such that convergence on the correct solution phyletic. Clades were taken from the ML tree found in is never achieved). This problem may not be detected by the PAUP analyses as well as additional clades found on either the ASDSF or PSRF[36]. To reduce the risk of this trees in the most frequent 5% of trees from the posterior source of systematic error the following precautions were distribution in MrBayes. The Navajo rugs were used to taken. Branch lengths of the final trees were compared detect any unusual sensitivity of a particular data parti- with and without a user specified starting tree (the ML tion to changes in character weighting. The MRI was tree from PAUP was used for this). The relative rates for used to guide the choice of data partitioning strategy and the partitions were examined because as the chain is weighting scheme. drawn to LT solutions the relative rate of the partition The search strategy was as follows. Initially 250 ran- with fewest variable sites is often inflated (and alpha falls dom-addition (Wagner) trees were built. Alternate sub- to accommodate this). In the present study the expecta- tree pruning and TBR rounds of branch-swapping were tion was that third codon positions should have the high- then performed on all 250 trees until a local optimum est rates and rrnL among the lowest - deviations from this was achieved (using default options in POY 4). In addi- would be considered symptomatic of an LT error. tion to the most optimal trees, all the suboptimal trees Although the mean branch length prior in MrBayes was found within 5% of the best cost were also considered. left at the default 0.1, that in P4 was lowered to 0.001 and Then all but the best 100 of all optimal and topologically the branch lengths of the final trees compared (following unique trees were excluded from the analysis. The published suggestions[106]). There was an exponential remaining trees were then subjected to 15 rounds of prior on the branch lengths in MrBayes. In P4 Pinvar was ratcheting, where 20% of the characters were re-weighted also removed from the model for Cox1 first codon posi- by a factor of 2. During ratcheting, the dynamic homol- tions; this move was to reduce or detect any parameter ogy characters were transformed into static homology correlation affecting convergence[107]. characters, so that the fraction of nucleotides (rather than Parsimony method using POY of sequence fragments) could be re-weighted. 200 swap- To provide an analysis free of the assumptions underlying pings of subtrees identical in terminal composition but ML methods, and any effects particular to the alignment different in topology, were then performed between pairs inferred by Sequencher, the topology for the phylogeny of of best trees from the ratcheting step . These two steps the snail populations was also estimated using a maxi- were intended to increase the efficiency of the search of mum parsimony approach implemented by POY (v. tree space. Each resulting tree was then refined further by 4.1.1). POY was used to effect a direct optimization crite- local branch swapping under the default parameters in rion, which performs simultaneous tree searching and POY for swapping . The optimal and topologically unique sequence alignment. Direct optimization finds a set of trees were then reported. The resulting tree was then parsimonious homologies for a set of sequences of differ- used as a start point for 2 rounds of optimization using ent length by treating the full homologous nucleotide iterative pass; this constitutes an extremely effective Liu et al. Parasites & Vectors 2010, 3:57 Page 15 of 18 http://www.parasitesandvectors.com/content/3/1/57

means of determining parsimonious cladogram used the following criteria to compare and choose tree costs[112]. Clade support was estimated by jackknifing of prior and clock models. static transformed characters (5000 replicates, 50% of Beginning with a Yule tree prior and an uncorrelated characters deleted). log-normal (relaxed) clock model, and with a prior on the tree height and mean substitution rate, each available Hypothesis testing model was evaluated. Firstly, any models estimating Topological method implausible TMRCAs on the outgroup were discounted. The posterior probability of a particular phylogeographi- Implausible TMRCAs are, for example, those dated to the cal hypothesis was tested by determining the proportion pre-Cambrian. For those models which gave stable of trees with this topology in the posterior probability results, the ratio of the marginal likelihoods (with respect distribution output from a tree search using P4. This was to the prior) of alternative models (i.e., the Bayes Factor, done by writing a constraint tree describing the hypothe- BF) was used to choose between them[121] (here impor- sis of interest and reading this into PAUP*. tance sampling and the harmonic mean of the sampled SH-test [113] likelihoods are used as estimators of the marginal likeli- Again a constraint tree describing the hypothesis of inter- hoods). Where there was a tie between two models both est was written and used in the SH-test as implemented were run and the model giving the higher ESS values in PAUP* with full optimization, branch lengths used in (most efficient run) was chosen; there was no marked dif- likelihood calculation, and 1000 bootstrap replicates. The ference in parameter estimates for such ties. Choice of test compares between and among topologies from con- nucleotide substitution model followed that for the BM strained and unconstrained searches; the set of "best" described above (based on MrModeltest), followed by use trees (from the 5% confidence interval of unique trees of BF tests to look for simpler, equally suitable, models for sampled from the posterior of the P4 search) and the ML use in BEAST. tree (unconstrained) were used as the set of trees for this comparison. Additional material Relaxed molecular clock methods (BEAST) The program BEAST (v. 1.5.3)[114,115] was used to esti- Additional file 1 Indoplanorbis exustus adult specimen collected in mate the molecular clock rates and divergence dates Khon Kaen, Thailand. Scale 1 cm. (TMRCA of clades). BEAST implements a Bayesian Additional file 2 Phylogram representing a possible history for Indo- method for the simultaneous estimation of divergence planorbis driven solely by geophysical events (changes in orogenics, eustasy, climate, etc.). This phylogeny was used as the test hypothesis in times, tree topology and clock rates; this method is cur- the SH-test. rently considered superior to other approaches (e.g., non- Additional file 3 Matrix used in the initial sensitivity analysis for the parametric methods such as NPRS[116] or penalized parsimony method. Horizontal axis, weighting matrix; vertical axis, clade. likelihood methods[117], particularly for phylogenies Clade status: M, monophyletic; PA, paraphyletic; PO, polyphyletic, on best tree found by tree search using the corresponding weighting scheme. with a low time depth, because it can allow for uncer- Additional file 4 Matrix of MRI values for different data partitioning tainty in dates assigned to calibration points and does not schemes and character weighting regimes. require untested assumptions about the pattern of clock rate variation among lineages[118]. The greatest benefit Competing interests of using a Bayesian method for dating is that the specifi- The authors declare that they have no competing interests. cation of prior distributions can be used to ensure that Authors' contributions the analysis realistically incorporates the uncertainty LL Performed molecular work, parsimony analysis and wrote part of the Meth- ods. MMHM, MAI, HSL, RPVJR, FS, JLD and ESU assisted with location, collection associated with the calibration points used[119]. The and identification of taxa and provided laboratory facilities and parasitological procedure involves the user specifying both a phyloge- screening. SWA assisted with sample collection, collation, performed all netic model (a model of population history; the tree remaining analyses and wrote the remaining sections of the paper. All authors prior) and a clock model (of substitution rate variation); read and approved the final manuscript. however, the likelihood calculation is based on the clock Acknowledgements model only. In addition, as in any BM, one must specify a Sampling for this work was supported by Wellcome Trust project grant No. substitution model for each data partition. Rate variation 068706 to SWA. Thanks are due to the Natural History Museum (London) for providing access to library facilities. To Dr Peter Foster (NHM) for advice on the between adjacent branches is assumed to be uncorre- use of P4 and to Prof. Fan Bo Xiao (Sichuan University) and his students for bib- lated, as these rates did not show autocorrelation in liographic materials. Special thanks are due to Prof. Yu-Quan Wei (Sichuan Uni- recent studies[120]. BEAST can implement several com- versity) for providing facilities. binations of tree and clock models; the present study Liu et al. Parasites & Vectors 2010, 3:57 Page 16 of 18 http://www.parasitesandvectors.com/content/3/1/57

Author Details 20. Raut S, Rahman M, Samanta S: Influence of temperature on survival, 1State Key Laboratory of Biotherapy, West China Hospital, West China Medical growth and fecundity of the freshwater snail Indoplanorbis exustus School, Sichuan University, Chengdu 610041, China, 2Department of (Deshayes). Mem Inst Oswaldo Cruz 1992, 87:15-19. Parasitology, Faculty of Veterinary Science, Bangladesh Agricultural University, 21. Egashira K, Matsushita Y, Virakornphanich P, Darmawa N, Moslehuddin Mymensingh 2202, Bangladesh, 3Department of Microbiology and AZM, Al Mamun MA, Do NH: Features and trends of rainfall in recent 20 Immunology, College of Medicine, Sultan Qaboos University, Oman, years at different locations in humid tropical to subtropical Asia. J Fac 4Infectious Diseases Research Centre, IMR, Jalan Pahang, 50588 Kuala Lumpur, Agric Kyushu Univ 2003, 48:219-225. Malaysia, 5Department of Veterinary Pathobiology, Faculty of Veterinary 22. Renssen H, Brovkin V, Fichefet T, Goosse H: Simulation of the Holocene Medicine and Science, University of Peradeniya, Peradeniya 20400, Sri climate evolution in northern Africa: the termination of the African Lanka, 6Department of Animal Diseases and Veterinary Public Health, Faculty of humid period. Quat Int 2006, 150:95-102. Veterinary Medicine, Bogor Agricultural University (IPB), Jl. Agathis-Kampus IPB 23. Walter H: Vegetation of the earth in relation to climate and the eco- Darmaga, Bogor 16680, Indonesia, 7Veterinary Inspection Board, Vitas, Tondo, physiological conditions New York: Springer-Verlag; 1973. Metro Manila, Philippines, 8Department of Biology, Faculty of Science, Mahidol 24. Remigio EA, Blair D: Molecular systematics of the freshwater snail family University, Bangkok, Thailand, 9Department of Medical Science, Faculty of Lymnaeidae (Pulmonata: Basommatophora) utilising mitochondrial Science, Burapha University, Bangsaen, Chonburi, Thailand and 10Department ribosomal DNA sequences. J Moll Stud 1997, 63:173-185. of Zoology, The Natural History Museum, London, UK 25. Newton RB: On some freshwater fossils from central South Africa. The Annals and Magazine of Natural History 1920, 9:241-249. Received: 1 June 2010 Accepted: 5 July 2010 26. Smith AG, Smith DG, Funnell BM: Atlas of Mesozoic and Cenozoic coastlines Published: 5 July 2010 Cambridge, UK: Cambridge University Press; 1994. ParasitesThis© 2010 isarticle an Liu &Open Vectorsetis availableal; Access licensee 2010, from:article 3BioMed:57 http://www.parasitesandvectors.com/content/3/1/57 distributed Central Ltd.under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. 27. Meier-Brook C: A preliminary biogeography of freshwater pulmonate References gastropods. World-wide snails 1984, 1:23-27. 1. Combes C, Albaret J, Arvy L, Durette-Desset M, Gabrion C, Jourdane J, 28. Morgan JAT, DeJong RJ, Jung Y, Khallaayoune K, Kock S, Mkoji GM, Loker Lambert A, Leger N, Maillard C, Matricon M, Nassi H, Prevot G, Richard J, ES: A phylogeny of planorbid snails, with implications for the evolution Theron A: Atlas mondiales des cercaires. Mém Muséum Nat Hist Natur of Schistosoma parasites. Mol Phylogenet Evol 2002, 25:477-488. 1980, A115:1-235. 29. Attwood SW, Fatih FA, Mondal MM, Alim MA, Fadjar S, Rajapakse RP, 2. Rollinson D, Southgate VR: The genus schistosoma: a taxonomic Rollinson D: A DNA-sequence based study of the Schistosoma indicum appraisal. In The Biology of Schistosomes: From Genes to Latrines Edited by: (Trematoda: Digenea) group; population phylogeny, taxonomy and Rollinson D, Simpson AJ. London: Academic Press; 1987:1-49. historical biogeography. Parasitol 2007, 134:2009-2020. 3. Ravindran R, Lakshmanan B, Ravishankar C, Subramanian H: Visceral 30. Ali J, Aitchison J: Gondwana to Asia: plate tectonics, paleogeography schistosomiasis among domestic ruminants slaughtered in Wayanad, and the biological connectivity of the Indian sub-continent from the South India. SE Asian J Trop Med Publ Hlth 2007, 38:1008-1010. middle Jurassic through latest Eocene (166-35 ma). Earth-Sci Rev 2008, 4. Agrawal MC, Southgate VR: Schistosoma spindale and bovine 88:145-166. schistosomosis. J Vet Parasitol 2000, 14:95-107. 31. McKenna MC: The mobile Indian raft: a reply to Rage and Jaeger. Syst 5. De Bont J, Vercruysse J, Van Aken D, Southgate VR, Rollinson D, Moncrieff Biol 1995, 44:265-271. C: The epidemiology of Schistosoma spindale Montgomery, 1906 in 32. Jørgensen A, Kristensen TK, Stothard J: An investigation of the cattle in Sri Lanka. Parasitology 1991, 102(Pt 2):237-241. "Ancyloplanorbidae" (Gastropoda, Pulmonata, Hygrophila): 6. Islam K: Schistosomiasis in domestic ruminants in Bangladesh. Trop preliminary evidence from DNA sequence data. Mol Phylo Evol 2004, Anim Health Prod 1975, 7:244. 32:778-787. 7. Narain K, Mahanta J: Dermatitis associated with paddy field 33. Albrecht C, Wilke T, Kuhn K, Streit B: Convergent evolution of shell shape environment in Assam, India: a review. Man-Environment Relationship in freshwater limpets: the African genus Burnupia. Zool J Linn Soc 2004, 2000, 9:213-220. 140:577-586. 8. Narain K, Rajguru SK, Mahanta J: Incrimination of Schistosoma spindale as 34. Albrecht C, Kuhn K, Streit B: A molecular phylogeny of Planorboidea a causative agent of farmer's dermatitis in Assam with a note on liver (Gastropoda, Pulmonata): insights from enhanced taxon sampling. pathology in mice. J Commun Diseas 1998, 30:1-6. Zoo. Scripta 2007, 36:27-39. 9. Ditrich O, Nasincova V, Scholz T, Giboda M: Larval stages of medically 35. Xia X, Xie Z, Salemi M, Chen L, Wang Y: An index of substitution important flukes (Trematoda) from Vientiane Province, Laos. Part ii. saturation and its application. Mol Phylo Evol 2003, 26:1-7. Cercariae. Annal Parasitol Hum et Comp 1992, 67:75-81. 36. Brown JM, Hedtke SM, Lemmon AR, Lemmon EM: When trees grow too 10. Palmieri J, Sullivan J, Ow-Yang C: A survey of snail hosts and larval long: investigating the causes of highly inaccurate Bayesian branch- trematodes collected in Peninsular Malaysia and Singapore from 1972 length estimates. Syst Biol 2010, 59:145-161. to 1977. SE Asian J Trop Med Publ Hlth 1977, 8:275-277. 37. Ghazanfar SA, Fisher M: Vegetation of the Arabian peninsula Springer; 1998. 11. Nithiuthai S, Anantaphruti MT, Waikagul J, Gajadhar A: Waterborne 38. Davis GM: Snail hosts of Asian Schistosoma infecting man: evolution zoonotic helminthiases. Vet Parasitol 2004, 126:167-193. and coevolution. The Mekong Schistosome Malacol Rev 1980:195-238. 12. Kullavanijaya P, Wongwaisayawan H: Outbreak of cercarial dermatitis in 39. Meijaard E: Craniometric differences among Malayan sun bears (Ursus Thailand. Int J Dermatol 1993, 32:113-115. malayanus); evolutionary and taxonomic implications. Raff Bull Zool 13. Mulvihill CA, Burnett JW: Swimmer's itch: a cercarial dermatitis. Cutis 2004, 52:665-672. 1990, 46:211-213. 40. Clark PU, Archer D, Pollard D, Blum JD, Rial JA, Brovkin V, Mix AC, Pisias NG, 14. Brandt RA: The non-marine mollusca of Thailand. Arch Molluskenk 1974, Roy M: The middle Pleistocene transition: characteristics, mechanisms, 105:1-423. and implications for long-term changes in atmospheric pCo2. Quat Sci 15. Brown DS, Gallagher MD: Freshwater snails of Oman, South Eastern Rev 2006, 25:3150-3184. Arabia. Hydrobiologia 1985, 127:125-149. 41. Attwood SW, Fatih FA, Campbell IC, Upatham ES: The distribution of 16. Pointier JP, David P, Jarne P: Biological invasions: the case of planorbid Mekong schistosomiasis, past and future: preliminary indications from snails. J Helminthol 2005, 79:249-256. an analysis of genetic variation in the intermediate host. Parasitol Int 17. Brown DS: Freshwater snails of Africa and their medical importance London: 2008, 57:256-270. Taylor & Francis; 1994. 42. Wilke T: Salenthydrobia gen. nov. (Rissooidea: Hydrobiidae): a potential 18. Kristensen TK, Ogunnowo O: Indoplanorbis exustus (Deshayes, 1834), a relict of the messinian salinity crisis. Zool J Linn Soc 2003, 137:319-336. freshwater snail new for Africa, found in Nigeria (Pulmonata: 43. Liu H, Hershler R: A test of the vicariance hypothesis of western North Planorbidae). J Moll Stud 1987, 53:245-246. American freshwater biogeography. J Biogeog 2007, 34:534-548. 19. Mouchet F, Rey JL, Cunin P: Découverte d'Indoplanorbis exustus 44. Edwards SV, Beerli P: Perspective: gene divergence, population (Planorbidae, Bulininae) à Yamossoukro, Côte d'Ivoire. Bull Soc Pathol divergence, and the variance in coalescence time in phylogeographic Exot 1987, 80:811-812. studies. Evolution 2000, 54:1839-1854. 45. Rambaut A, Drummond AJ: Tracer Oxford: University of Oxford; 2004. Liu et al. Parasites & Vectors 2010, 3:57 Page 17 of 18 http://www.parasitesandvectors.com/content/3/1/57

46. Attwood SW, Upatham ES, Zhang YP, Yang ZQ, Southgate VR: A DNA- 68. Neff U: Massenspektrometrische th/u-datierung von höhlensintern aus sequence based phylogeny for triculine snails (Gastropoda: dem oman: klimaarchive des asiatischen monsuns. In PhD thesis Pomatiopsidae: Triculinae), intermediate hosts for Schistosoma Institute of Environmental Physics. Ruprecht-Karls-Universität Heidelberg; (Trematoda: Digenea): phylogeography and the origin of Neotricula. J 2001. Zool 2004, 262:47-56. 69. Owen LA: Latest Pleistocene and Holocene glacier fluctuations in the 47. Batchelor BC: Discontinuously rising late Cainozoic eustatic sea-levels, Himalaya and Tibet. Quat Sci Rev 2009, 28:2150-2164. with special reference to Sundaland, Southeast Asia. Geologie en 70. Patnaik R, Badam G, Murty M: Additional vertebrate remains from one of Mijnbouw 1979, 58:1-20. the late Pleistocene--Holocene Kurnool caves (Muchchatla 48. Blair D, Davis GM, Wu B: Evolutionary relationships between trematodes Chintamanu Gavi) of South India. Quat Int 2008, 192:43-51. and snails emphasizing schistosomes and paragonimids. Parasitol 71. Lòpez-Otálvaro G, Flores JA, Sierro FJ, Cacho I, Grimalt J, Michel E, Cortijo E, 2001, 123:S229-S243. Labeyrie L: Late Pleistocene palaeoproductivity patterns during the last 49. Dejong RJ, Morgan JA, Paraense WL, Pointier JP, Amaritsa M, Ayeh-Kumi climatic cycle in the Guyana basin as revealed by calcareous PF, Babiker A, Barbosa CS, Brémond P, Canese AP, De Souza CP, nannoplankton E. Earth 2009, 4:1-13. Dominguez C, File S, Gutierrez A, Incani RN, Kawano T, Kazibwe F, Kpikpi J, 72. Finlayson C: Biogeography and evolution of the genus Homo. TREE Lwambo NJ, Mimpfoundi R, Njiokou F, Poda JN, Sene M, Velásquez LE, 2005, 20:457-463. Yong M, Adema CM, Hofkin BV, Mkoji GM, Loker ES: Evolutionary 73. Soares P, Trejaut JA, Loo J, Hill C, Mormina M, Lee C, Chen Y, Hudjashov G, relationships and biogeography of Biomphalaria (Gastropoda: Forster P, Macaulay V, Bulbeck D, Oppenheimer S, Lin M, Richards MB: Planorbidae) with implications regarding its role as host of the human Climate change and postglacial human dispersals in Southeast Asia. bloodfluke, Schistosoma mansoni. Mol Biol & Evol 2001, 18:2225-2239. Mol Biol Evol 2008, 25:1209-1218. 50. Jones CS, Rollinson D, Mimpfoundi R, Ouma J, Kariuki HC, Noble LR: 74. Bednarik RG: Seafaring Homo erectus. The Artefact 1995, 18:91-92. Molecular evolution of freshwater snail intermediate hosts within the 75. MacHugh DE, Bradley DG: Livestock genetic origins: goats buck the Bulinus forskalii group. Parasitol 2001, 123:277-292. trend. Proc Nat Acad Sci USA 2001, 98:5382-5384. 51. Karanth KP, Singh L, Collura RV, Stewart C: Molecular phylogeny and 76. Winnepenninck B, Backeljau T, De Wachter R: Extraction of high biogeography of langurs and leaf monkeys of South Asia (Primates: molecular weight DNA from molluscs. Trend Genet 1993, 9:407. Colobinae). Mol Phylo Evol 2008, 46:683-694. 77. Folmer O, Black M, Hoeh W, Lutz R, Vrijenhoek R: DNA primers for 52. Karanth K: Evolution of disjunct distributions among wet-zone species amplification of mitochondrial cytochrome c oxidase subunit I from of the Indian subcontinent: testing various hypotheses using a diverse metazoan invertebrates. Molecular Mar Biol Biotech 1994, phylogenetic approach. Current Science 2003, 85:1276-1283. 3:294-299. 53. Delson E: Fossil macaques, phyletic relationships and a scenario of 78. Palumbi SR, Martin A, Romano S, McMillan WO, Stice L, Grabowski G: The deployment. In The Macaques: In Studies in Ecology, Behavior and Evolution simple fool's guide to PCR. version 2.0 Honolulu: University of Hawaii; 1991. Edited by: Lindburg D. San Francisco: Van Nostrand Reinhold Company; 79. Wilke T, Davis GM, Gong X, Liu H: Erhaia (Gastropoda: Rissooidea): 1980:10-30. phylogenetic relationships and the question of Paragonimus 54. Köhler F, Glaubrecht M: Out of Asia and into India: on the molecular coevolution in asia. Amer J Trop Med Hyg 2000, 62:453-459. phylogeny and biogeography of the endemic freshwater gastropod 80. Attwood SW, Lokman HS, Ong KY: Robertsiella silvicola, a new species of Paracrostoma Cossmann, 1900 (Caenogastropoda: Pachychilidae). Biol triculine snail (Caenogastropoda: Pomatiopsidae) from peninsular J Linn Soc 2007, 91:627-651. Malaysia, intermediate host of Schistosoma malayensis (Trematoda: 55. Lisiecki L: Ages of MIS boundaries. LR04 benthic stack Boston, MA: Boston Digenea). J Moll Stud 2005, 71:379-391. University; 2005. 81. Clackson T, Güssow D, Jones PT: General applications of PCR to gene 56. Zhisheng A, Kutzbach JE, Prell WL, Porter SC: Evolution of Asian cloning and manipulation. PCR: A practical approach 1991, 1:187-214. monsoons and phased uplift of the Himalaya-Tibetan plateau since 82. Dejong RJ, Emery AM, Adema CM: The mitochondrial genome of late Miocene times. Nature 2001, 411:62-66. Biomphalaria glabrata (Gastropoda: Basommatophora), intermediate 57. Osborne CP, Beerling DJ: Nature's green revolution: the remarkable host of Schistosoma mansoni. J Parasitol 2004, 90:991-997. evolutionary rise of C4 plants. Phil Trans R Soc Lond B Biol Sci 2006, 83. Archie JW: A randomization test for phylogenetic information in 361:173-194. systematic data. Syst Zool 1989, 38:239-252. 58. Troll C: The three dimensional zonation of the Himalayan system. In 84. Faith DP, Cranston PS: Could a cladogram this short have arisen by Geoecology of the high mountain systems of Asia Edited by: Troll C. chance alone?: On permutation tests for cladistic structure. Cladistics Wiesbaden Steiner; 1972:23-30. 1991, 7:1-28. 59. Sakai H, Sakai H, Yahagi W, Fujii R, Hayashi T, Upreti BN: Pleistocene rapid 85. Slowinski JB, Crother BI: Is the PTP test useful? Cladistics 1998, uplift of the Himalayan frontal ranges recorded in the Kathmandu and 14:297-302. Siwalik basins. Palaeogeogr, Palaeoclimat, Palaeoecol 2006, 241:16-27. 86. Wilkinson M, Peres-Neto PR, Foster PG, Moncrieff CB: Type 1 error rates of 60. Balasubrahmanyan MN: Geology and tectonics of India Kochi, Japan: the parsimony permutation tail probability test. Syst Biol 2002, International Association for Gondwana Research; 2006. 51:524-527. 61. Zhang YG, Ji J, Balsam W, Liu L, Chen J: Mid-Pliocene Asian monsoon 87. Librado P, Rozas J: Dnasp v5: a software for comprehensive analysis of intensification and the onset of northern hemisphere glaciation. DNA polymorphism data. Bioinformatics 2009, 25:1451-1452. Geology 2009, 37:599-602. 88. DeSalle R, Freedman T, Prager EM, Wilson AC: Tempo and mode of 62. Zazo C: Interglacial sea levels. Quat Int 1999, 55:101-113. sequence evolution in mitochondrial DNA of Hawaiian Drosophila. J 63. Sinha D: Surface circulation in the eastern Indian ocean during last 5 Mol Evol 1987, 26:157-164. million years: planktic foraminiferal evidences. Indian J Mar Sci 2007, 89. Xia X: Dambe (Data Analysis in Molecular Biology and Evolution) software, 36:342-350. version 3.7.48 Hong Kong: University of Hong Kong (Department of 64. Ziegler M: Late Permian to Holocene paleofacies evolution of the Ecology and Biodiversity); 1999. Arabian plate and its hydrocarbon occurrences. GeoArabia 2001, 90. Ogden TH, Rosenberg MS: How should gaps be treated in parsimony? a 6:445-504. comparison of approaches using simulation. Mol Phylo Evol 2007, 65. Winograd IJ, Coplen TB, Landwehr JM, Riggs AC, Ludwig KR, Szabo BJ, 42:817-826. Kolesar PT, Revesz KM: Continuous 500,000-year climate record from 91. Farris JS, Kallersjo M, Kluge AG, Bult C: Constructing a significance test for vein calcite in Devils Hole, Nevada. Science 1992, 258:255-260. incongruence. Syst Biol 1995, 44:570-572. 66. Burns SJ, Fleitmann D, Matter A, Neff U, Mangini A: Speleothem evidence 92. Swofford DL: Paup* Phylogenetic Analysis Using Parsimony (and other from Oman for continental pluvial events during interglacial periods. methods) Sunderland, Massachusetts: Sinauer Associates; 2002. Geology 2001, 29:623-626. 93. Lee MS: Uninformative characters and apparent conflict between 67. Radies D, Preusser F, Matter A, Mange M: Eustatic and climatic controls molecules and morphology. Mol Biol Evol 2001, 18:676-680. on the development of the Wahiba sand sea, Sultanate of Oman. 94. Posada D, Crandall KA: Modeltest: testing the model of DNA Sedimentol 2004, 51:1359-1385. substitution. Bioinformatics 1998, 14:817-818. Liu et al. Parasites & Vectors 2010, 3:57 Page 18 of 18 http://www.parasitesandvectors.com/content/3/1/57

95. Nylander J: Mrmodeltest v2. program distributed by the author Uppsala: Uppsala University; 2004. 96. Posada D, Crandall KA: Selecting the best-fit model of nucleotide substitution. Syst Biol 2001, 50:580-601. 97. Marshall DC: Cryptic failure of partitioned Bayesian phylogenetic analyses: lost in the land of long trees. Syst Biol 2010, 59:108-117. 98. Gascuel O, Steel M: Inferring ancestral sequences in taxon-rich phylogenies. Math Biosci in press. 99. Holder M, Lewis PO: Phylogeny estimation: traditional and Bayesian approaches. Nature Rev: Genetics 2003, 4:275-284. 100. Van Dongen S: Prior specification in Bayesian statistics: three cautionary tales. J Theor Biol 2006, 242:90-100. 101. Ronquist F, Huelsenbeck JP: Mrbayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 2003, 19:1572-1574. 102. Foster PG: Modeling compositional heterogeneity. Syst Biol 2004, 53:485-495. 103. Lewis PO, Holder ME, Holsinger KE: Polytomies and Bayesian phylogenetic inference. Sys Biol 2005, 54:241-253. 104. Nylander JA, Wilgenbusch JC, Warren DL, Swofford DL: AWTY (Are We There Yet?): A system for graphical exploration of MCMC convergence in Bayesian phylogenetics. Bioinformatics 2008, 24:581-583. 105. Gelman A, Rubin DB: Inference from iterative simulation using multiple sequences. Stat Sci 1992, 7:457-472. 106. Marshall D, Simon C, Buckley T: Accurate branch length estimation in partitioned Bayesian analyses requires accommodation of among- partition rate variation and attention to branch length priors. Syst Biol 2006, 55:993-1003. 107. Sullivan J, Swofford D, Naylor G: The effect of taxon sampling on estimating rate heterogeneity parameters of maximum-likelihood models. Mol Biol Evol 1999, 16:1347-1356. 108. Wheeler W: Fixed character states and the optimization of molecular sequence data. Cladistics 1999, 15:379-385. 109. Wheeler WC, Ramírez MJ, Aagesen L, Schulmeister S: Partition-free congruence analysis: implications for sensitivity analysis. Cladistics 2006, 22:256-263. 110. Giribet G: Stability in phylogenetic formulations and its relationship to nodal support. Syst Biol 2003, 52:554-564. 111. Aagesen L, Petersen G, Seberg O: Sequence length variation, indel costs, and congruence in sensitivity analysis. Cladistics 2005, 21:15-30. 112. Wheeler WC: Iterative pass optimization of sequence data. Cladistics 2003, 19:254-260. 113. Shimodaira H, Hasegawa M: Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol Biol Evol 1999, 16:1114-1116. 114. Drummond AJ, Rambaut A: Beast. version 1.3 Oxford: Oxford University Press; 2003. 115. Drummond AJ, Nicholls GK, Rodrigo AG, Solomon W: Estimating mutation parameters, population history and genealogy simultaneously from temporally spaced sequence data. Genetics 2002, 161:1307-1320. 116. Sanderson MJ: A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol 1997, 14:1218-1231. 117. Sanderson MJ: Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Mol Biol Evol 2002, 19:101-109. 118. Ho SYW, Phillips MJ, Cooper A, Drummond AJ: Time dependency of molecular rate estimates and systematic overestimation of recent divergence times. Mol Biol Evol 2005, 22:1561-1568. 119. Pulquério MJ, Nichols RA: Dates from the molecular clock: how wrong can we be? TREE 2006, 22:180-184. 120. Drummond AJ, Ho SYW, Philips MJ, Rambaut A: Relaxed phylogenetics and dating with confidence. PLoS Biology 2006, 4:e88. 121. Newton MA, Raftery AE: Approximate Bayesian inference by the weighted likelihood bootstrap. J R Stat Soc Lond 1994, B56:3-48.

doi: 10.1186/1756-3305-3-57 Cite this article as: Liu et al., The phylogeography of Indoplanorbis exustus (Gastropoda: Planorbidae) in Asia Parasites & Vectors 2010, 3:57