<<

J Mol Evol 12003) 56:38±45 DOI: 10.1007/s00239-002-2378-1

Structure and Evolution of the Mitochondrial Control Region of Leaf Coleoptera: Chrysomelidae): A Hierarchical Analysis of Nucleotide Sequence Variation

Patrick Mardulyn,1,2 Arnaud Termonia,1,2 Michel C. Milinkovitch1

1 Unit of Evolutionary Genetics, Free University of Brussels, Rue Jeener et Brachet 12, 6041 Gosselies, Belgium 2 Laboratoire de Biologie Animale et Cellulaire, CP 160/12, Free University of Brussels, av. F.D. Roosevelt 50, 1050 Brussels, Belgium

Received: 31 October 2001 / Accepted: 19 July 2002

Abstract. To assess the levels of variation at di€er- [T1T)A1A)] repeats, all found in the control region of ent evolutionary scales in the mitochondrial 1mt) other orders. Our analyses allowed us to control region of leaf beetles, we sequenced and identify portions of the control region with enough compared the full mt control region in two genera variation for population genetic or phylogeographic 1Chrysomela and ), in two species within a studies. genus 1Gonioctena olivacea and G. pallida), in indi- viduals from distant populations of these species in Key words: Mitochondrial DNA Ð Control re- Europe, and in individuals from populations sepa- gion Ð Repeated elements Ð Chrysomelidae Ð rated by moderate 110- to 100-km) to short 1<5-km) Sequence variation Ð Concerted evolution distances. In all individuals, a highly repetitive section consisting of the tandem repetition of 12 to 17 im- perfect copies of a 107- to 159-bp-long core sequence was observed. This repetitive fragment accounts for Introduction roughly 50% of the full control-region length. The sequence variability among repeated elements within The noncoding portion of the mitochondrial 1mt) the control region of a given individual depends on genome in is called the control region be- the species considered: the variability within any cause it is believed to control the transcription and G. olivacea individual is much higher than that within replication of the mtDNA molecule. In vertebrates, G. pallida individuals. Comparisons of the repeated the control region has been shown to contain the elements, in a phylogenetic framework, within and promoters for transcription initiation and the origin among individuals of G. olivacea and G. pallida sug- of heavy-strand DNA replication 1e.g., Shadel and gests that the repetitive section of the control region Clayton 1997). In , this region is usually called experienced recurrent duplications/deletions, leading the ``A + T-rich region'' following Fauron and to some degree of concerted evolution. Comparisons Wolstenholme 11976), because it exhibits a frequency between Chrysomela and Gonioctena control regions of deoxyadenylate and deoxythymidylate residues revealed virtually no signi®cant sequence similarity, higher than in the rest of the mt molecule. It was except for two long stretches of A's and several found that the origin of replication lies within the control region in Drosophila 1Goddard and Wol- stenholme 1978), and promoters for transcription Correspondence to: Patrick Mardulyn; email: pmarduly@ulb. have possibly been identi®ed in this segment as well ac.be 1Lewis et al. 1994). 39

The control region varies greatly in size among DNA was extracted from ethanol-preserved insects. Whole speci- insects: from ‹350 bp in Lepidoptera 1Taylor et al. mens were individually ground in an SDS homogenization bu€er 1993) to 13 kbin barkweevils 1Coleoptera) 1Boyce and incubated overnight with proteinase K 12 mg/ml) at 40°C, followed by three phenol/chloroform extractions, ethanol precipi- et al. 1989). Large di€erences in size can also be tation, and resuspension in TE bu€er 110 mM Tris, 1 mM EDTA). found among closely related taxa, as sizes from 1 to 5 Because no sequences of the control region and of the adjacent kbwere reported among Drosophila species 1Lewis et NADH dehydrogenase subunit 2 1ND2) were available in Gen- al. 1994), as well as within species. This length poly- Bank for Coleoptera, a large fragment of 6±8 kb, including the morphism in the control region often results from the control region, the ND2 gene, and portions of the cytochrome oxidase I 1COI) and of the small-subunit rRNA 112S) genes, was presence of a variable number of tandemly repeated PCR-ampli®ed from two individuals per species. This ampli®cation elements 1Rand 1993; Zhang and Hewitt 1997; Lunt was performed with the Expand Long Template PCR System kit et al. 1998). 1Roche), following the manufacturer's protocol, with an annealing Because the control region is known generally to temperature of 55°C and an extension step of 12 min at 68°C. experience high rates of evolution in both nucleotide The primers used are modi®ed versions of universal primers described by Simon et al. 11994) and are located in the COI and 12s substitution and insertion/deletion events 1e.g., Mo- genes: C1-N-2414 15¢GTGCTAATCATCTAAAAATTTTAATT ritz et al. 1987), although this is not always true for CCTG3¢) 1modi®ed reverse complement of Cl-J-2441) and modi®ed insects 1Zhang and Hewitt 1997), it is increasingly 1longer version) SR-J-14233 15¢CCAGTAAGAGYGACGGGC used as a polymorphic marker for population genetic GATGTGT3¢). These primers were chosen because they were studies or for phylogenetic analyses involving closely previously used successfully for the ampli®cation of shorter frag- ments in Gonioctena leaf beetles 1unpublished data). Shorter ver- related taxa 1e.g., McMillan and Palumbi 1997; RuÈ ber sions of these primers were used to sequence the ¯anking regions et al. 2001). However, to our knowledge, this marker 1+/)700 nucleotides) of the long PCR products 1ABI Prism BigDye has never been used for coleopterans, an insect order Terminator Cycle Sequencing Ready Reaction Kit FS; Applied for which no sequence of the control region is avail- Biosystems). Sequencing products were separated by electropho- able in international sequence databases. We present resis on an ABI 377 automated DNA sequencer. New primers were designed on the sequenced ¯anking regions and used to sequence here a hierarchical study of sequence variation in leaf the long PCR fragment further. This primer-walking strategy was beetles: we sequenced and compared the full mt con- continued until reaching the 3¢ and 5¢ boundaries of the control trol region in two genera, in two species within a ge- region. We thereafter designed new primers [and named them nus, in individuals from distant populations of these according to Simon and co-workers' 11994) nomenclature] for the species in Europe, and in individuals from popula- ampli®cation of the complete mt control region 13.3±5.5 kb): SR-J- 14766 15¢CATTATTTGTATAACCGCAACTGCTGGCAC3¢)in tions separated by moderate 110- to 100-km) to short 12S and TM-N-204 15¢TAACCTTYATAAATGGGGTATG3¢)in 1<5-km) distances. The aim of this exploratory work the methionine transfer RNA 1tRNA). This was done using the is to assess the levels of variation at di€erent evolu- Expand Long Template PCR System kit 1Roche), following the tionary scales in the mt control region of leaf beetles manufacturer's protocol, with an annealing temperature of 55°C and infer its mode and tempo of evolution. We also and an extension step of 6 min at 66°C. The ampli®ed products were puri®ed and ligated into a pBluescript II SK1+) vector compare the leaf control-region organization to 1Stratagene) and transferred to E. coli DH5a competent cells. those observed in other orders of insects. Plasmids from positive clones were isolated and subjected to in vitro transposition reactions 1TGS; Finnzymes) that randomly Materials and Methods insert one 1and only one) transposon per plasmid. The transposon contains priming sites and a gene conferring antibiotic resistance. The resulting products were again transferred to E. coli DH5a Insect Collection competent cells. Clones containing a randomly inserted transposon were selected through antibiotic resistance and sequenced using two Sequencing of the mt control region from 24 individuals 1see primers annealing to the transposon. Each clone sequenced with Table 1 for collection sites) allowed the comparison of these both primers 1sequencing in opposite directions) yielded approxi- sequences between two genera 1Gonioctena and mately 1400 bp of sequence. Contigs were assembled manually. Chrysomela, both belonging to the tribe , subfamily Control-region sequences were obtained with this transposon ), between two species within the genus Gonioctena insertion strategy for three individuals per species. On the basis of 1G. pallida and G. olivacea), among geographically distant popu- the alignments among these three sequences we designed, for each lations of these three species in Europe, among ®ve populations of species, a set of 8±12 primers allowing the sequencing 1on both G. pallida across the Vosges mountains and the Black Forest, and strands) of the entire 1or nearly so, see below) control region ini- among ®ve populations of G. olivacea inside a 5 ´ 2-km area in the tially PCR-ampli®ed with SR-J-14766 and TM-N-204. In some Belgian Ardennes. A population is de®ned here as a group of instances, the within-species variability of control region sequences beetles associated with a geographically well-delimited patch of forced us to design population-speci®c primers. The control region host plants 1Salix caprea and/or Salix aurita, and/or Corylus of all studied species includes an area of about 1500±2500 bp 1i.e., avellana for G. pallida, Sarothamnus scoparius for G. olivacea, and roughly 50% of the full control region) consisting of >10 imperfect, Salix spp.orBetula spp. for Chrysomela lapponica). tandemly arranged, copies of a >100-bp motif 1see Results). For two species, G. pallida and C. lapponica, the repeated elements were too similar in sequence for the design of speci®c sequencing DNA Sequencing primers. This problematic region was therefore sequenced in these two species using the cloning±transposon insertion strategy de- We sequenced all or part 1i.e., excluding the repetitive region) of the scribed above. At least three clones 1up to ®ve) per individual were mt control region for all individuals listed in Table 1. Genomic sequenced to avoid sequencing errors due to the cloning procedure. 40

Table 1. Sampling localities of individuals collected for which the control region was sequenced

Species Locality Date Sample size Abbreviation

Gonioctena olivacea Wibrin, Belgium 1B)a August 1999 9 GoliB1±GoliB9 Hald, Denmark 1DK) July 1992 1 GoliDK Gonioctena olivacea Sines, Portugal May 1993 1 GoliPT Tshiertshen, Swiss Alps June 1992 1 GpalS1 Gonioctena pallida Les HauteÁ res, Swiss Alps June 1993 1 GpalS2b Gonioctena pallida Oberried, Black Forest, Germany May 1999 2 GpalDO1b±GpalDO2b Gonioctena pallida Wieden, Black Forest, Germany May 1999 2 GpalDW1b±GpalDW2 Gonioctena pallida Ste-Odile, Vosges, France May 1999 2 GpalFS1b±GpalFS2b Gonioctena pallida Orbey, Vosges, France May 1999 2 GpalFO1±GpalFO2b Gonioctena pallida Giromagny, Vosges, France May 1999 1 GpalFG2b Chrysomela lapponica Queyras, France August 1995 1 ClapF Chrysomela lapponica Finland August 1995 1 ClapFi a Individuals sampled in ®ve populations located in an area of 5 ´ 2 km. b Repetitive region not sequenced for this sample.

Because the cloning±transposon insertion strategy is labor-inten- Using the same sliding window of 40 nucleotides, maximum simi- sive, the repetitive region was sequenced for only 3 G. pallida in- larity values were recorded for each of the 100 pairwise compari- dividuals of 11 1GpalS1, GpalDW2, and GpalFO1) and for the 2 C. sons of random sequences. In pairwise comparisons of original lapponica individuals. experimental sequences, any similarity value higher than the max- imum value recorded in pairwise comparisons of random sequences was considered signi®cant at the 1% level. The evolution of the Data Analyses repetitive region 1see below) was explored by inferring the phylo- genetic relationships among the repeated elements within and The edges of the newly determined control regions were identi®ed among individuals. This was done 11) among repeats from all nine by aligning the ¯anking 12S rRNA and isoleucine tRNA sequences Belgian samples of G. olivacea 1GoliB1±GoliB9), 12) among repeats with their homologues from Drosophila 1Diptera), Locusta 1Or- from the three G. pallida samples for which we have sequenced the thoptera), and Triatoma 1Hemiptera), retrieved from GenBank repetitive region 1GpalS1, GpalDW2, GpalFO1), and 13) among 1accession numbers AJ400907, NC001712, and NC002609). Con- repeats of three distantly related G. olivacea individuals 1GoliB2, trol-region sequences were edited with SEQPUP version 0.6 1Gilbert GoliDK, GoliPT). Phylogenetic networks of repeated elements 1996) and aligned using Clustal X 1Thompson et al. 1997) with were inferred with the parsimony criterion using the TCS program default parameter settings 1gap opening/extension penalties of 1see above). 15.00/6.66). The stability of these alignments was explored further using SOAP version 1.1 1LoÈ ytynoja and Milinkovitch 2001). This program automatically generates several Clustal alignments, using a Results and Discussion speci®ed range of parameter settings, and allows easy comparison of the resulting alignments to detect unstable aligned positions. The proportion of identical sites among sequences and nucleotide fre- All sequences are deposited in GenBank under ac- quencies were calculated using PAUP* 1Swo€ord 2001). Haplotype cession numbers AY117367±AY117382, and networks were inferred with the parsimony criterion using the AF530028±AF530043. The length of the control re- method described by Templeton et al. 11992) and implemented in gion is 3172±3296, 3417±3701, and 4640±5276 bp for TCS V. 1.13 1Clement et al. 2000). Phylogenetic networks are more G. olivacea, G. pallida, and C. lapponica, respectively. convenient to represent phylogenetic relationships among closely related sequences than strictly bifurcating trees, as the former allow Mt control-region length di€erences among individ- the display of all equally parsimonious hypotheses 1i.e., ambiguous uals of each species is essentially due to variation in a relationships) on a single ®gure. In addition to the Clustal align- highly repetitive section of 12 to 17 imperfect copies ments, comparisons of sequences were performed by searching of a 107- to 159-bp-long core sequence. The presence blocks of similarity across the entire control region using the pro- of one or several such repetitive segments has already gram Macaw version 2.0.5 1Schuler et al. 1991; available by FTP at ncbi.nlm.nih.gov in the directory pub/Macaw). This program lo- been reported for many species for which the cates regions of local similarity across sequences and evaluates their sequence variation of the control region has been statistical signi®cance 1Karlin and Altschul 1990). These searches investigated. These repeated elements represent 51± were performed within and among leaf beetle species, but also be- 53, 49±53, and 51±57% of the control region in tween leaf beetles and other insect orders. G. olivacea, G. pallida,and C. lapponica, respectively. Investigation of the sequence variability across the control re- gion in pairwise comparisons of individuals at several hierarchical Examples of these repeated elements for all three levels was performed with a 40-nucleotide-long sliding window, species are available from the authors upon request moving one base at a time across the pairwise alignment, each time or can be found at the following URL: http:// returning a similarity value 1i.e., the proportion of identical sites). www.ulb.ac.be/sciences/ueg/html_®les/publications. When comparing highly divergent sequences, it was dicult to html. The sequence variability among the elements distinguish between true homology and chance similarity. We therefore generated, with MacClade 3.05 1Maddison and Maddi- within the repetitive section of a given individual son 1992), 100 pairs of random sequences of size and base com- depends on the species considered: the tandemly re- position identical to the two original sequences being compared. peated elements within each G. olivacea control-re- 41 gion sequence are much more variable than those among sequences 1even outside the repetitive region) within G. pallida or C. lapponica sequences. Using from di€erent species or genera are highly unstable. Macaw 2.0.5, we were not able to ®nd nonrandom Still, we have checked that all conclusions drawn similarity among the repetitive sections from di€erent below are una€ected by the chosen alignment pa- species. All sequences reported here were obtained rameters. through procedures requiring PCR ampli®cation Control-region sequence variations at di€erent from genomic DNA. Hence, all ambiguities 1after hierarchical levels are summarized in Fig. 1. Only one direct sequencing of PCR products or in contigs of example of pairwise comparison is shown for each sequences obtained from di€erent clones of the same hierarchical level, as the same pattern is observed for PCR product) were reported with the appropriate all comparisons at one given level. The proportion of symbol. These ambiguities could correspond to het- identical sites found in Clustal pairwise alignments of eroplasmy and/or to cloning artifacts. Identi®cation sequences from di€erent genera 1Gonioctena and of the presence/absence of heteroplasmy in the in- Chrysomela) using a 40-bp sliding window is mostly vestigated leaf beetles is clearly beyond the scope of nonsigni®cantly di€erent, i.e., below 65% 1Fig. 1A), the present paper. from what can be expected between random se- Di€erent publications have recently reported that, quences. Searching for blocks of similarity using in a wide variety of animal species, large stretches of Macaw 2.0.5, we found very few signi®cant shared mitochondrial DNA 1mtDNA) have been transferred patterns: two long stretches of A's 15¢ATTA- to the nucleus 1e.g., Collura and Stewart 1995; Ben- AAAAAAAA3¢ and 5¢AWAAAAAAAAAAAA- sasson et al. 2001). Although we cannot be certain ACT3¢, similarly located in all three species) and that the DNA we have sequenced for this study is not di€erent [T1T)A1A)]6±22 repeats. Comparable pat- of nuclear origin, we believe that this is unlikely. In- terns are also found in the control region of species deed, the PCR-ampli®ed product on which all further from other insect orders 1Zhang and Hewitt 1997). primer design was based in our study is 6±8 kb in For example, in Drosophila species, two conserved length. Amplifying a large portion of the mtDNA thymidylate stretches 119±25 and 13±17 nucleotides molecule is one advised method 1see http://www. long) have been observed, one of which is adjacent to pseudogen.net/) to avoid ampli®cation of nuclear a conserved block of 300 bp that has been proposed copies of mtDNA or, at least, to increase the pro- to be involved in mtDNA replication 1Lewis et al. portion of mt relative to nuclear ampli®cations. If the 1994). Whether these patterns are functionally con- proportion of nuclear copies in the PCR product is strained in leaf beetles or correspond to chance sim- suciently small, direct sequencing of this product ilarity is unclear. No blocks of similarity other than will yield mitochondrial sequence. Only when se- the ones described above were detected in this study quencing the repetitive region of G. pallida and when comparing leaf beetle sequences with control- C. lapponica was a cloning strategy used 1see region sequences from other insect species. Materials and Methods). In this case, however, we The pairwise Clustal alignment of the full control have cloned a PCR product of the entire control region of the two Gonioctena species investigated here region 13400±5200 nucleotides). At least three clones yields an overall uncorrected similarity of 71% and per fragment were sequenced and compared, and a exhibits many blocks with signi®cant similarity, consensus sequence was built. mainly in the ®rst half of the control region 1Fig. IB). A + T account for 83, 82, and 75% of the nucle- In intraspeci®c comparisons, the proportion of otides in the G. olivacea, G. pallida, and G. lapponica identical sites found in pairwise alignments of se- mt control regions, respectively. This represents the quences using a 40-bp sliding window is always above lowest control-region A + T content of all insects the estimated level of random similarity and is mostly studied thus far 1Zhang and Hewitt 1997). above 90% 1Figs. 1C±F). Within G. olivacea, the mt The stability of Clustal alignments was tested us- control regions of the three distantly located popu- ing SOAP version 1.1. All alignments of intraspeci®c lations in Europe 1Belgium±Denmark±Portugal) have sequences, when excluding the repetitive section, are an overall similarity of 95±96%, with much of the very stable, i.e., virtually no alignment di€erences are variation located in the repeated elements region observed when using di€erent alignment parameters 1Fig. 1E; sites 1180 to 2912). This pattern of variation 1weighted matrix, gap penalties from 12 to 20 by steps across the control region is con®rmed when com- of 2, and extension penalties from 4 to 10 by steps of paring sequences from populations at a lower geo- 2). This is not surprising given the relatively low level graphic scale: variable sites within populations of divergence observed at the intraspeci®c level. On sampled in the Belgian Ardennes 1within an area of the other hand, alignment of the repetitive section is 5 ´ 2 km) are almost exclusively restricted to the re- more problematic, as it is dicult to establish or- petitive section of the control region 1Fig. IF). Par- thology among repeats from di€erent individuals 1see simony network analyses 1Fig. 2) also indicate that below). SOAP analyses identi®ed that the alignments the number of mutations between pairs of haplotypes 42

Fig. 1. Pairwise sequence similarity of mt control-region sequences indel sites, 1E) GoliB2 and GoliDK, after removal of indel sites but at di€erent taxonomic/population levels. Uncorrected pairwise se- including repeated elements, 1F) GoliB2 and GoliB4, after removal of quence similarity is measured in a sliding window of 40 bp. The scale indel sites but including repeated elements. Sequences are presented on the horizontal axis shows the nucleotide position of the ®rst site from the methionine tRNA end 1left) to the 12S end 1right). The inside the sliding window. Comparisons are made between 1A) Go- horizontal dashed line in A and B indicates the similarity value as- liB2 and ClapF, after removal of the repeated elements, 1B) GoliB2 sessed as signi®cantly di€erent from random 1p<0.01; see Materials and GpalFO1, after removal of the repeated elements, 1C) ClapF and and Methods). Arrows indicate the location of the repetitive region ClapFi, after removal of repeated elements and indel sites, 1D) when it was excluded 1A±D); brackets indicate the range spanned by GpalFS2 and GpalDO2, after removal of repeated elements and the repetitive region when it was included 1E and F). is much higher when including the repeated elements. GoliDK and GoliB2, while, within GoliDK, repeat 8 The high variability of the repetitive section of the mt is identical to repeat 5 1which is not the case in Go- control region may not be the result of point muta- liB2). These observations suggest that repeats within tions only. Indeed, when comparing individuals from the control region of a given individual might have distant populations of G. olivacea 1e.g., GoliDK and been generated through recurrent and recent 1post- GoliB2; see examples available at http://www.ulb.ac. dating the common ancestor of the individuals com- be/sciences/ueg/html_®les/publications.html), more pared) duplication/deletion events. Such a mecha- variation is often observed between repeats of two nism could lead to the partial homogenization of the individuals than among repeats within an individual. repeats within individuals 1i.e., concerted evolution) For example, there are 22 di€erences 1including point and explain the much higher conservation of repeat mutations and indels) between repeats 8 from sequences within than among G. pallida individuals. 43

Fig. 2. Networks showing the most parsimonious relationships removing the repeated elements, 1III) All Gonioctena pallida indi- among control-region sequences inferred with the TCS program of viduals collected in the Vosges mountains, the Black Forest Clement et al. 12000). 1I) Gonioctena olivacea individuals collected mountains, and the Alps; network inferred after removing the re- within ®ve populations inside a 5 ´ 24-km area. 1II) all Gonioctena peated elements. In network II, a circle is drawn around the olivacea individuals collected in Europe; network inferred after haplotypes that are displayed in network I.

Also, diagnostic single-nucleotide substitutions are resulting phylogenetic network 1which can be seen often present in more than one repeat, as if they had at http://www.ulb.ac.be/sciences/ueg/html_®les/pub- propagated across di€erent repeats within an indi- lications. html), each repeat in one individual can be vidual. These substitutions could be homoplasious grouped with the corresponding orthologous repeats and result from the presence of mutational hot spots 1here, two repeats are said to be orthologous if they in the sequence or could simply re¯ect recurrent du- have been generated by vertical descent from a plications/deletions of repeats as suggested above. common ancestral genome rather than by duplication The second explanation is more likely, as those sin- within that genome) in all other individuals. gle-nucleotide mutations are often covariating with In contrast, evidence for concerted evolution can each other. be found at a higher geographic level, i.e., when in- We tested the occurrence of concerted evolution ferring phylogenetic relationships among repeated further by inferring the phylogenetic relationships elements from three G. pallida samples in Europe or among repeated elements within and among indi- among three distantly related G. olivacea individuals. viduals 1see Materials and Methods). In the absence In the resulting networks, we can indeed ®nd several of duplication/deletion events, we would expect that examples of clusters of repeated elements from the each repeat in one individual clusters with one 1and same individual. One of the most conspicuous ex- only one) closely related repeat from each of the other amples is shown in Fig. 3 1arrow), where one block of individuals, i.e., duplication events would have oc- identical repeats 1i) includes ®ve repeats from indi- curred before the divergence of all investigated indi- vidual GpalFO1 1black rectangles), 1ii) is directly viduals from their most recent common ancestor. On connected to four or ®ve other GpalFO1 repeats, 1iii) the other hand, if concerted evolution does occur, we includes only one repeated element from GpalS1 would expect to observe some clustering of repeats 1white rectangles) and one from GpalDW2 1gray from the same individual. The inference of phyloge- rectangle), and 1iv) is directly connected to only two netic relationships among repeated elements from GpalS1 repeats and to one GpalDW2 repeat. An- all nine Belgian samples of G. olivacea led to no other obvious example can be found in the phyloge- evidence of concerted evolution. Indeed, in the netic network of the three distantly related G. olivacea 44

among distantly related individuals. The repeated elements characterized here will be of particular in- terest for population genetic and phylogenetic ana- lyses of G. olivacea involving closely related individuals because, at this taxonomic level, not only can orthology among repeats be unambiguously as- sessed, but also repeats exhibit useful levels of vari- ation, even among individuals collected inside areas as restricted as 5 ´ 2 km. However, the usefulness of the repetitive region depends on the species consid- ered. Indeed, as already mentioned, the level of repeat sequence variation strongly varies among the three species investigated here. P distances 1i.e., the proportion of sites that are di€erent between two aligned sequences) among control-region sequences of G. pallida individuals from di€erent locations 1the Alps, the Vosges Fig. 3. Network showing the most parsimonious relationships mountains, and the Black Forest) range from 0.1 to among repeated elements from three G. pallida individuals 0.9%. The repeated elements, available for only three 1s1 = GpalS1, o1 = GpalFO1, w2 = GpalDW2, from the Alps, individuals 1see Materials and Methods), were not Vosges, and Black Forest, respectively), inferred using the statis- included in this calculation, as it is dicult to estab- tical parsimony approach with the program TCS 1.13 from Clement et al. 12000). Each repeat is represented by a rectangle and lish orthology among repeats for G. pallida 1because is named after the individual it comes from, followed by a number repeats are too similar across the control region of a identifying the position of the repeat in the sequence 1e.g., w2-5 single individual). Interestingly, the haplotype net- refers to repeat number 5 in individual GpalDW2). Repeat ele- work in Fig. 2 1III) indicates no clear separation ments not separated by a branch represent identical sequences. among the individuals from the Vosges, the Black Each branch represents a single mutational step, i.e., one nucleotide substitution or one insertion/deletion event. Open ovals represent Forest, and the Alps, despite large low-elevation areas missing 1i.e., unobserved) intermediate sequences. The dashed lines 1i.e., unsuitable habitats for G. pallida) separating represent four alternative equally parsimonious ways to connect s1- these mountain ranges. Indeed, the two alpine indi- 1 to the rest of the network. The arrow identi®es a group of repeats viduals 1GpalS1 and GpalS2), for which the sampling that are referred to in the text. sites are located several hundreds of kilometers apart, are not genetically more distant from each other or individuals 1available at http://www.ulb.ac.be/sci- from the Vosges and Black Forest individuals 1all ences/ueg/html_®les/publications.html),wherefourre- sampled within a geographical area of 100 ´ 40 km) peated elements from GoliPT can be isolated from the than the latter individuals are from each other. rest of the network. This strongly suggests the occur- The pairwise alignment between the control-region rence of duplication/deletion of repeated elements in sequences of the two Chrysomela lapponica individ- the control region of Gonioctena leaf beetles, leading uals yields an uncorrected sequence similarity of to the homogenization of the repetitive segment. 96.9%. When excluding the repeated elements region, The occurrence of duplication and deletion events for which it is dicult to establish homology among of repeated elements in the mitochondrial control repeats, this value increases to 97.6%. Figure 1C region has already been encountered in many animal shows that the di€erences between the two sequences species. The molecular mechanism1s) responsible for are not homogeneously distributed, as there are many the evolution of the repeated elements is1are) cur- more di€erences located at the end of the sequence rently not fully understood, although slipped-strand than there are in the ®rst 800 bp of the control region. mispairing during replicationÐi.e., frequent mis- Although it has been shown previously in some alignment in the repeat region, prior to elongation, insect species that the substitution rate is lower in the causing length heteroplasmy 1Levinson and Gutman control region than in the third positions of some 1987; Buroker et al. 1990)Ðis believed to be involved protein coding genes 1Caccone et al. 1996; Zhang and 1Lunt et al. 1998). Homologous recombination has Hewitt 1997), the rates of evolution inferred in this also been cited as a possible mechanism 1e.g., Rand study are highly variable across the control region and Harrisson 1989), for which empirical evidence and some speci®c portions of this mtDNA fragment was recently found 1e.g., Lunt and Hyman 1997; reach levels of variation clearly sucient for popu- Awadalla et al. 1999; Ladoukakis and Zouros 2001). lation genetic or phylogeographic studies. For ex- The occurrence of concerted evolution in the leaf ample, one of us characterized a portion of about 400 beetle control region makes it particularly dicult to nucleotides 1Fig. 2D; sites 720±1120) as e€ective for di€erentiate orthologous from paralogous repeats conducting a phylogeographic study of G. pallida 45 populations within the Vosges mountains 1Mardulyn Lewis DL, Farr CL, Farquhar AL, Kaguni LS 11994) Sequence, 2001). When focusing on this 400-nucleotide region, organization, and evolution of the A+T region of Drosophila the sequence divergence among G. pallida individuals melanogaster mitochondrial DNA. Mol Biol Evol 11:523± 538 increase to 2%. LoÈ ytynoja A, Milinkovitch MC 12001) SOAP, cleaning multiple alignments from unstable blocks. Bioinformatics 17:573±574 Lunt DH, Hyman BC 11997) Animal mitochondrial DNA recom- Acknowledgments. Individuals of Chrysomela lapponica and of bination. Nature 387:247 Gonioctena olivacea from Portugal were collected by Jacques Lunt DH, Whipple LE, Hyman BC 11998) Mitochondrial DNA Pasteels, and the Gonioctena olivacea individual from Denmark by variable number of tandem repeats 1VNTRs): utility and J. Nielsen. We are grateful to two anonymous reviewers for pro- problems in molecular ecology. Mol Ecol 7:1441±1455 viding many useful comments on an early version of the manu- script. This research was supported by the FNRS 1Belgian National Maddison WP, Maddison DR 11992) MacClade: Analysis of Fund for Scienti®c Research), the Free University of Brussels phylogeny and character evolution 1version 3.05). Sinauer As- 1ULB), the Van Buuren Fund, the ``CommunauteÂFrancËaise de sociates, Sunderland, MA Belgique'' 1ARC 98/03-223), and the Defay Fund. P.M is a Post- Mardulyn P 12001) Phylogeography of the vosges mountains doctoral Researcher at the FNRS. populations of Gonioctena pallida 1Coleoptera: Chrysomelidae): a nested clade analysis of mitochondrial DNA haplotypes. Mol Ecol 10:1751±1763 References McMillan WO, Palumbi SR 11997) Rapid rate of control region evolution in Paci®c butter¯y®shes 1Chaetodontidae). J Mol Awadalla P, Eyre-Walker A, Maynard Smith J 11999) Linkage Evol 45:473±484 disequilibrium and recombination in hominid mitochondrial Moritz C, Dowling TE, Brown WM 11987) Evolution of animal DNA. Science 286:2524±2525 mitochondrial DNA: Relevance for population biology and Bensasson D, Zhang D-X, Hartl DL, Hewitt GM 12001) Mito- systematics. Annu Rev Ecol Syst 18:269±292 chondrial pseudogenes: Evolution's misplaced witnesses. TREE Rand DM 11993) Endotherms, ectotherms, and mitochondrial 16:314±321 genome size variation. J Mol Evol 37:281±295 Boyce TM, Zwick, ME, Aquadro CF 11989) Mitochondrial DNA Rand DM, Harrison RG 11989) Molecular population genetics of in the bark weevils: Size, structure and heteroplasmy. Genetics mtDNA size variation in crickets. Genetics 121:551±569 123:825±836 RuÈ ber L, Meyer A, Sturmbauers C, Verheyen E 12001) Population Buroker NE, Brown JR, Gilbert TA, O'Hara PJ, Beckenbach AT, structure in two sympatric species of the lake Tanganyika Thomas WK, Smith MJ 11990) Length heteroplasmy of stur- cichlid tribe Eretmodini: Evidence for introgression. Mol Ecol geon mitochondrial DNA: an illegitimate elongation model. 10:1207±1225 Genetics 124:157±163 Schuler GD, Altschul SF, Lipman DJ 11991) A workbench for Caccone A, Garcia BA, Powell JR 11996) Evolution of the mito- multiple alignment construction and analysis. Proteins 9:180± chondrial DNA control region in the Anopheles gambiae com- 190 plex. Insect Mol Biol 5:51±59 Shadel GS, Clayton DA 11997) Mitochondrial DNA maintenance Clement M, Posada D, Crandall KA 12000) TCS: A computer in vertebrates. Annu Rev Biochem 66:409±433 program to estimate gene genealogies. Mol Ecol 9:1657±1659 Simon C, Frati F, Beckenbach A, Crespi B, Liu H, Flook P 11994) Collura RV, Stewart C-B 11995) Insertions and duplications of Evolution, weighting, and phylogenetic utility of mitochondrial mtDNA in the nuclear genomes of Old World monkeys and gene sequences and a compilation of conserved polymerase hominoids. Nature 378:485±489 chain reaction primers. Ann Entomol Soc Am 87:651±701 Fauron CM-R, Wolstenholme DR 11976) Structural heterogeneity Swo€ord DL 12001) PAUP*: Phylogenetic analysis using parsi- of mitochondrial DNA molecules within the genus Drosophila. mony 1and other methods), 1version 4.0b8). Sinauer Associates, Proc Natl Acad Sci USA 73:3623±3627 Sunderland, MA Gilbert DG 11996) SeqPup version 0.6. Published electronically on Taylor MFJ, McKechnie SW, Pierce N, Kreitman M 11993) The the Internet; available via anonymous ftp to ftp.bio.indiana.edu lepidopteran mitochondrial control region: structure and evo- Goddard JM, Wolstenholme DR 11978) Origin and direction of lution. Mol Biol Evol 10:1259±1272 replication in mitochondrial DNA molecules from Drosophila Templeton AR, Crandall KA, Sing CF 11992) A cladistic analysis melanogaster. Proc Natl Acad Sci USA 75:3886±3890 of phenotypic associations with haplotypes inferred from re- Karlin S, Altschul SF 11990) Methods for assessing the statistical striction endonuclease mapping and DNA sequence data. III. signi®cance of molecular sequence features by using general Cladogram estimation. Genetics 132:619±633 scoring schemes. Proc Natl Acad Sci USA 87:2264±2268 Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins Ladoukakis ED, Zouros E 12001) Direct evidence for homologous DG 11997) The CLUSTAL X windows interface: Flexible recombination in mussel 1Mytilus galloprovincialis) mito- strategies for multiple sequence alignment aided by quality chondrial DNA. Mol Biol Evol 18:1168±1175 analysis tools. Nucleic Acids Res 24:4876±4882 Levinson G, Gutman GA 11987) Slipped-strand mispairing: A Zhang D-X, Hewitt GM 11997) Insect mitochondrial control major mechanism for DNA sequence evolution. Mol Biol Evol region: A review of its structure, evolution and usefulness 4:203±221 in evolutionary studies. Biochem Syst Ecol 25:99±120