Phylogeography and population dynamics of secondary contact zones of Lacerta lepida in the Iberian Peninsula

Andreia Miraldo

A thesis submitted for the degree of Doctor of Philosophy

Norwich, June 2009

© This copy of the thesis has been supplied on condition that anyone who consults it is understood to recognise that its copyright rests with the author and that no quotation from the thesis, nor any information derived therefrom, may be published without the author's prior, written consent.

Phylogeography and population dynamics of secondary contact zones of Lacerta lepida in the Iberian Peninsula

Ângela Andreia Firmino Miraldo

June 2009

Lacerta lepida is a lizard species that occurs throughout the Iberian Peninsula. Detailed phylogeographic analysis of the species using mitochondrial DNA and nuclear sequence data revealed a history of population fragmentation and diversification in allopatry. Diversification within the species was estimated to have started in the Miocene, probably related to geological events in the region; nevertheless, a strong influence of Pleistocene climatic oscillations was also detected. Several glacial refugia and demographic range expansions after diversification in allopatry were detected, leading to the establishment of several secondary contact zones. Detailed analysis of two secondary contact zones within the species was carried out.

One of the secondary contact zones was characterized by the existence of intra-individual mitochondrial polymorphism. The polymorphism was found to result from the introgression of mitochondrial DNA fragments from one lineage into the nuclear genome of the other (Numts), suggesting that hybridization between the lineages occurred. Detailed phylogeographic analysis of the identified Numts allowed the inference of the lineages' recent demographic events. Additionally, further analysis of the polymorphic samples detected within this contact zone revealed the existence of low levels of heteroplasmy and mitochondrial DNA recombination, both of which have rarely been reported for natural populations in the literature.

Gene flow dynamics were assessed in another zone of secondary contact between two very divergent mitochondrial lineages, located in south-eastern Spain. The use of mitochondrial DNA and microsatellite data allowed the detection of restricted gene flow between the lineages. It was postulated that the two lineages are on independent evolutionary paths and should therefore be considered two different species.

The molecular tools used throughout this study revealed that geological events, climatic changes, hybridization, and speciation have shaped the evolutionary history of Lacerta lepida.


Acknowledgments

I would like to thank Brent Emerson for the supervision and guidance given throughout the development of this thesis. His constant support, encouragement and friendship were an invaluable help for the successful completion of this PhD.

I am extremely thankful to Godfrey Hewitt with whom I had the pleasure to share so many exciting conversations over these four years and during which I was able to absorb his contagious enthusiasm about Science and History.

To Octavio Paulo I am very thankful for introducing me to the amazing world of reptiles and for giving me the opportunity to develop this project.

I would also like to thank Sara Goodacre for all the help given by introducing me to the lab and guiding me through my first months at UEA. Thank you for your time and for being always so happy to help.

To Paul Dear I am grateful for sharing his time and knowledge with me and for welcoming me into his lab every now and then.

This project would not have been possible without the help of all the people I had the pleasure to do fieldwork with. Their continuous enthusiasm, despite the hard task they faced every day while chasing lizards, is definitely impressive: Pedro Silveirinha, Rui Osório, Rita Jacinto, Nuno Valente, Juan Pablo de la Vega, Luís Garcia, Gabriel Marin, Eugenia Zarza-Franco and Brent Emerson.

Special thanks go to my father, Mário Miraldo, and grandfather, António Firmino, who spent many weekends developing the best lizard traps ever, which led to the successful capture of hundreds of clever lizards!

To my parents, sister and grandparents, a big thank you for all the support, affection and inspiration. Thank you for the examples of perseverance and spirit of adventure that you have always instilled in me. It is to you that I dedicate this thesis!

And finally, to you Matthew, a big thank you for your company throughout these years!

This work was funded by Fundacao para a Ciencia e Tecnologia through a PhD scholarship (SFRH/BD/1696/2004) and a research fellowship (POCTI/BSE/48365/2002).

Crisis of vocation

Part I

In a fragile skiff I go sailing,
Perched upon the towering waves,
Gazing at the moon and courting the stars,
I plough the fearsome sea of life
In search of radiant coves.

Three years, however, have now gone by
Without my having achieved my aim,
Three years of embittered illusion,
Three years already spent in darkness,
Three years carried off by the wind.

Driven by the anxiety that overcomes me,
Still thinking of a dream that drifts away,
I look into the distance and, with trembling joy,
Far off I glimpse an inviting shore.
But before my eyes it suddenly fades,
That fateful and baleful mirage.

If one day I should seem to know,
Looking by chance towards the North,
That she whom I believed did not exist
I see, at last, with a smile that can be felt,
To be mine in life and not leave me in death,
I shall forget this lament for ever
And life will no longer be a torment,
But a sky of light, of peace and good fortune.

Part II

And after so many years that have passed
Many radiant shores I found;
Some of them soothed my pains,
As many others led me to shipwreck
For not being, after all, what I dreamed.

Having braved each storm with courage
And believing that life has not brought victories alone,
If I see the sun beyond the dark cloud,
Whatever the pain or the torture,
I will no longer give up the fight.
Just as the micro-creature did not give up
That, overcoming millions in a hard struggle,
Begot me in the maternal egg.

A poem by Mário Miraldo


Table of Contents

General abstract
Acknowledgments
Poem by Mário Miraldo
Table of contents

1. General introduction
1.1. The Iberian Peninsula
1.1.1. Geological history and geographic aspects of the Iberian Peninsula
1.1.2. The role of the Iberian Peninsula during the Quaternary climatic oscillations
1.2. Lacerta lepida
1.3. Molecular tools
1.3.1. The pitfalls of mitochondrial DNA: Numts, heteroplasmy and recombination
1.4. Thesis structure
1.5. References

2. Intra-individual mitochondrial DNA polymorphism in a reptilian secondary contact zone
2.1. Abstract
2.2. Introduction
2.3. Materials and Methods
2.3.1. Sampling
2.3.2. DNA extraction, amplification and sequencing
2.3.3. Identification of polymorphic individuals and quantification of intra-individual variation
2.3.4. Amplification of the entire cytochrome b gene
2.3.5. Haplotype network construction
2.4. Results
2.4.1. Characterization of polymorphism
2.4.2. Characterization of Numts and intra-individual variation
2.4.3. Phylogeographic analysis
2.5. Discussion
2.5.1. Phylogeographic history of lineages L3 and L5
2.5.2. Origin of the polymorphism
2.5.3. Phylogeographic utility of Numts
2.6. Conclusion
2.7. References

3. Phylogeography of Lacerta lepida in the Iberian Peninsula
3.1. Abstract
3.2. Introduction
3.3. Materials and methods
3.3.1. Sampling strategy collection
3.3.2. Laboratory procedures
3.3.3. Phylogeographic and historical demographic analysis
3.3.4. Estimation of divergence times
3.4. Results
3.4.1. Mitochondrial DNA data
3.4.2. Nuclear DNA data
3.4.3. Divergence times
3.5. Discussion
3.5.1. Mitochondrial DNA data
3.5.2. Nuclear DNA data
3.5.3. Historical biogeography of Lacerta lepida
3.6. Conclusion
3.7. References

4. Genetic analysis of a secondary contact zone between Lacerta lepida lepida and Lacerta lepida nevadensis
4.1. Abstract
4.2. Introduction
4.3. Materials and methods
4.3.1. Sampling strategy collection
4.3.2. Laboratory procedures
4.3.3. Data analyses
4.4. Results
4.4.1. Mitochondrial DNA data
4.4.2. Nuclear DNA data
4.4.3. Cline analysis
4.5. Discussion
4.5.1. Genetic structure of the contact zone: tension zone vs neutral diffusion
4.5.2. The historical dynamics of lineages contact and introgression
4.5.3. Taxonomic and conservation implications
4.6. References

5. Testing for the presence of heteroplasmy in Lacerta lepida through single molecule PCR
5.1. Abstract
5.2. Introduction
5.3. Material and methods
5.3.1. Sample selection and DNA extraction
5.3.2. Estimation of the number of template copies
5.3.3. Selection of loci and design of PCR primers
5.3.4. PCR amplifications, scoring and sequencing
5.4. Results and discussion
5.4.1. Ruling out Numts
5.4.2. Ruling out contamination
5.4.3. Heteroplasmy and mtDNA recombination
5.4.4. Origin of heteroplasmy and recombination in Lacerta lepida
5.5. Conclusion
5.6. References

6. General discussion and conclusions

Chapter 1

General introduction

Photo by Nuno Valente Cerro del Mencal, Pedro Martínez, Andalucia, Spain*

*Sampling site 5 in chapter 4

1. General introduction

One hundred and fifty-one years ago Charles Darwin (Darwin, 1858) and Alfred Wallace (Wallace, 1858) presented to the world their theory of the evolution of species, which changed forever the way the natural world is perceived. The revolutionary, yet very simple, theory of evolution, later compiled in Darwin’s publication “On the Origin of Species” (Darwin, 1859), was based on two basic concepts: the concept of a “tree-of-life”, the idea that all species diverged from a common ancestor along separate pathways; and the concept of natural selection, a process by which species change and adapt to different environments and which was thereby identified as the key mechanism responsible for the branching in the “tree-of-life”. Although recent research has shown that the vertical relationship between ancestor and descendant in the Darwinian “tree-of-life” does not always adequately describe how evolution works (e.g. it does not describe the lateral transfer of genes, which is now known to have played a very important role in the evolution of archaea and some bacteria), the general concept of the evolution of species holds true, making the theory of evolution one of the most important and comprehensive scientific theories ever published. The publication of Darwin’s theory of evolution marked the emergence of evolutionary biology as a new and prosperous field of research concerned with understanding the mechanisms responsible for the isolation of populations, their differentiation and ultimately speciation. However, it is only relatively recently that genetic tools have been developed that make it possible to describe and quantify the genetic diversity found within species and to gain a detailed understanding of the evolutionary process. An important contribution of genetic studies is the recognition that almost all species show some level of genetic structure (Avise et al. 2000). It follows that studying geographic patterns in the distribution of genetic variation can provide an understanding of species evolution. Throughout this thesis, questions related to the origin and distribution of intraspecific genetic diversity, and to the historical and contemporary processes involved in creating and maintaining it, will be addressed. These questions will be specifically addressed in Lacerta lepida, a lizard species that is mainly distributed in the Iberian Peninsula, a region that is known to have played a key role in species survival and diversification within Europe through several glacial cycles.

1.1. The Iberian Peninsula

The Iberian Peninsula is extremely rich in species diversity and endemism. Although it occupies only a small part of Western Europe (<6%), as much as 50% of the European plant and terrestrial vertebrate species occur in this peninsula. Additionally, of the approximately 900 European endemic species, 31% occur in the Iberian Peninsula (Williams et al., 2000). Its importance for biodiversity conservation has been acknowledged by conservation policies: the Mediterranean biodiversity hotspot defined by Conservation International included nearly 80% of the area of the Iberian Peninsula (Myers et al., 2000), whereas the European network of important sites for conservation (Natura 2000 network) included more than 20%. Furthermore, the high levels of biodiversity and endemism found in this peninsula are accompanied by high levels of intraspecific genetic diversity, as revealed by several phylogeographic studies in the region. The reasons behind such richness are varied, and are thought to be related to long-term species persistence in the region. Nevertheless, it is known that species distributions and the distribution of genetic variation are extremely variable in space and time and are influenced by many different factors. Amongst them are major geological events, climatic oscillations and environmental changes, all of which are known to have promoted vicariant events and have therefore been invoked to explain species diversification (Avise 2000).

1.1.1. Geological history and geographic aspects of the Iberian Peninsula

The geographic position of the Iberian Peninsula within Europe is highly distinctive. The peninsula connects to the rest of Europe through a relatively narrow and mountainous region (the Pyrenees), which is known to constitute an important physical barrier to species dispersal. At the same time, the Iberian Peninsula is the region of Europe closest to the African continent, separated from northern Africa by only 14 km across the Strait of Gibraltar. These factors have important implications for the evolutionary histories and distributions of species within this peninsula. Indeed, the biogeographic history of the Iberian Peninsula is intimately associated with that of North Africa, and this is reflected in the existence of a great number of closely related species on both sides of the Mediterranean Sea, as detected by early biogeographic studies (e.g. Busack, 1986). The biogeography of Iberia and North Africa was greatly affected by common geological events associated with the Alpine orogeny in the middle Tertiary. The Alpine orogeny was mainly a mountain-building event which is known to have had an extreme effect on southern Europe and the Mediterranean basin (Rosenbaum et al., 2002a; Rosenbaum et al., 2002b). During the Oligocene (±30 Mya) the area between the Iberian Peninsula and southern France became fragmented and several land masses became disjunct from the main European continental block (Fig. 1.1.a). These land masses currently correspond to the Betic region in Spain, the Rif region in Morocco, the Kabylies in Algeria, and the Balearic Islands, Sardinia and Corsica in the Mediterranean Sea. Between the disjunction of the Betic and Rif land masses from the European continental block and their final collision with the Iberian Peninsula and North Africa, respectively, several important events occurred that allowed dispersal and vicariance between these two regions.
After disjunction and during the drifting process, the Betic and Rif regions remained part of the same land mass for most of the time. The first land corridor between Iberia and North Africa was established ±15 Mya (Weijermars 1991) (Fig. 1.1.b), when this block collided with these two regions simultaneously. Later, ±10-8 Mya, the opening of the Betic, Guadalhorce and Rif sea corridors fragmented this important land bridge between the two continents and the connection was broken (Fig. 1.1.c). The land connection between the Iberian Peninsula and North Africa was re-established (Fig. 1.1.d) with the sequential closure of the three sea corridors, starting with the Betic 7.8-7.6 Mya (Krijgsman et al., 2000; Krijgsman et al., 1999), followed by the Guadalhorce 6.8-6.7 Mya (Martín et al., 2001) and finally the Rif sea corridor 6.7-6.0 Mya (Krijgsman and Langereis, 2000).

[Figure: four map panels, a) 30 Mya, b) 15 Mya, c) 10-8 Mya, d) 8-6 Mya; panel labels: B+R, B, R, Betic corridor, Guadalhorce, Rif corridor]

Fig. 1.1. Reconstruction of the main geological events of the western Mediterranean from 30 Mya until 6 Mya. a) 30 Mya parts of the Iberian Peninsula and southern France suffer fragmentation and disjunction from the mainland. b) 15 Mya the Betic-Rif land masses establish a transient connection between Iberia and North Africa. c) 10-8 Mya the Betic massif suffers fragmentation and three sea corridors emerge: the Betic, the Guadalhorce and the Rif sea corridors. d) 8-6 Mya the sea corridors close in sequential order, reconnecting Iberia and North Africa. Adapted from Rosenbaum et al. (2002a).


More recently, ±5.96 Mya, another important event in the geological history of the Mediterranean basin began with the closure of the connection between the Atlantic Ocean and the Mediterranean Sea. This resulted in the almost complete desiccation of the Mediterranean Sea, a period known as the Messinian salinity crisis (MSC) (Hsu et al., 1977). During the MSC (5.9 to 5.3 Mya) the European and the North African Mediterranean coasts became connected through extensive land bridges, again allowing the dispersal of terrestrial species throughout the basin. With the refilling of the Mediterranean through the opening of the modern Strait of Gibraltar, terrestrial species distributed on both sides of the Mediterranean became isolated, although for certain groups whose dispersal is not strongly restricted by sea, exchange of individuals across the Strait is known to have occurred relatively recently (e.g. Late Pleistocene) (Alvarez et al., 2000; Carranza et al., 2004; Lenk et al., 1999; Schmitt et al., 2005). The series of vicariant and dispersal events that occurred during the geological history of the Mediterranean basin have left characteristic signatures in the distributions of some species within this region. The geological events mentioned above have been invoked to explain biogeographic patterns for a number of species that exhibit genetic divergences concordant with those events. For example, Buthus spp. scorpions (Gantenbein and Largiadèr, 2003) show evidence of dispersal between the Iberian Peninsula and North Africa 15-14 Mya, when these regions are first known to have been in contact (see above). Allopatric speciation events related to the fragmentation of the Betic region 10-8 Mya have been invoked to explain why species of midwife toad (Alytes spp.) inhabiting southern Iberia are more closely related to North African species than to other Iberian species (Martínez-Solano et al., 2004). The sequential closure of the three sea corridors 8-6 Mya has also recently been invoked to explain putative speciation events in the ocellated lizards (Lacerta spp.) (Paulo et al., 2008). Finally, for many species, divergences between Iberian and North African groups have been shown to have an origin coinciding with the opening of the Strait of Gibraltar, which marked the end of the MSC 5.3 Mya (e.g. Blanus worm lizards (Vasconcelos et al., 2006) and Discoglossus frogs (Zangari et al., 2006)).

1.1.2. The role of the Iberian Peninsula during the Quaternary climatic oscillations

In Europe, the effect of Quaternary climatic oscillations on the distribution of genetic variation has been well studied over recent years and its important role in the evolution of species has long been recognized (Hewitt, 1996; Hewitt, 2000; Hewitt, 2004). The accumulation of phylogeographic studies within this continent suggests a general pattern of species contraction into southern refugia during glacial periods followed by expansions to northern latitudes during warm interglacials (Hewitt, 1996; Hewitt, 1999; Hewitt, 2004; Taberlet et al., 1998), a pattern strongly supported by pollen and fossil data (Zagwijn, 1992a; Zagwijn, 1992b). These dramatic changes in the geographic distribution of species have left signatures in the geographic distribution of genetic variation, with southern populations typically exhibiting higher genetic diversity than northern ones (Hewitt, 1996; Hewitt, 2000; Hewitt, 2004). Paleontological and palynological studies from the southern European peninsulas of Iberia, Italy and the Balkans suggest that these regions remained relatively stable throughout the Quaternary, allowing species to persist through adverse climatic conditions that were more pronounced at northern latitudes (Roucoux et al., 2005; Tzedakis et al., 2002). Amongst the southern European peninsulas, Iberia is the best studied in terms of phylogeography, and several studies show that species have in fact persisted there through several ice ages (e.g. Paulo et al., 2001). This pattern of long-term persistence is also supported by the relatively high biological diversity and endemicity found within this peninsula (Blondel and Aronson, 1999; García-Barros et al., 2002; Williams et al., 2000).
The long-term persistence of populations in the Iberian Peninsula is intimately associated with its southern geographic position within Europe, which allowed it to remain almost entirely ice free during the glaciations, and with the high topographic and climatic heterogeneity that characterizes it. The latter seems to be intimately associated with the North Atlantic and Mediterranean climatic influences, which maintain distinct microclimates that change in a northwest-southeast direction across the peninsula. This climatic gradient is reflected in strong genetic differentiation within Iberia between western and eastern groups, as reported for several species (Batista et al., 2004; Carranza et al., 2006; Paulo et al., 2001; Schmitt and Seitz, 2004). However, the majority of studied species exhibit more complex phylogeographic structures beyond a simple east-west split. This complexity is associated with isolation in several distinct refugia, a pattern inferred to be related to the high physiographic complexity and multiplicity of climates that the Iberian Peninsula offers (for a review on the subject see Gomez and Lunt, 2007 and references therein). Signatures of post-glacial demographic range expansions, and of the establishment of secondary contact zones with differing degrees of admixture as a result of range expansions from distinct glacial refugia, are also reported (e.g. Godinho et al., 2008; Godinho et al., 2006; Martínez-Solano et al., 2006; Sequeira et al., 2005).

1.2. Lacerta lepida

Lacerta lepida (Daudin 1802) together with Lacerta pater (Lataste 1880), Lacerta tangitana (Boulenger 1888) and Lacerta princeps (Blanford 1874) form the ocellated lizards, a subset of the approximately 50 species within the genus Lacerta. The group has a continuous distribution in south-western Europe (L. lepida) and North Africa (L. pater and L. tangitana), with one species (L. princeps) occurring disjunctly in parts of the Middle East (Fig. 1.2.). Some authors (e.g. Arnold et al., 2007) consider that the ocellated lizards should belong to a different genus, Timon, and therefore it is common to find these four species referred to as Timon spp. As there is still some controversy around the phylogenetic relationships within the genus Lacerta (Fu, 1998; Fu, 2000; Harris et al., 1998), the classic nomenclature, Lacerta spp., will be used throughout this thesis. The ocellated lizards belong to the family Lacertidae, which in turn belongs to the order Squamata. Lacerta lepida, the largest European lacertid lizard, is the only species of ocellated lizard to occur in Europe, where it occupies almost all of the Iberian Peninsula, southern France and north-western Italy (Castroviejo and Mateo, 1998; Mateo and Castroviejo, 1990; Mateo et al., 1996). Four subspecies within Lacerta lepida are recognized: Lacerta lepida iberica, which occurs in the north-western corner of the Iberian Peninsula; Lacerta lepida nevadensis, which occurs in the south-eastern parts of Spain, mainly associated with the Betic mountain ranges; Lacerta lepida lepida, the nominate subspecies, which has the widest distribution within the group and occurs in all remaining parts of the Iberian Peninsula, southern France and north-western Italy; and finally Lacerta lepida oteroi, which is restricted to the island of Salvora in northern Spain. The subspecies designations are based on morphological patterns, allozymes and in one case (L. l. oteroi) chromosomes (Castroviejo and Mateo, 1998; Mateo, 1988; Mateo and Castroviejo, 1990; Mateo and López-Jurado, 1994; Mateo et al., 1996).

Fig. 1.2. Distribution of ocellated lizards (grey shaded area). Lacerta lepida occurs in Europe; Lacerta tangitana and Lacerta pater occur in Africa and Lacerta princeps occurs in the Middle East.

The majority of research involving L. lepida has been concerned mainly with its geographic distribution (e.g. Castroviejo and Mateo, 1998; Cheylan and Grillet, 2005; Mateo, 1988; Mateo and López-Jurado, 1994), ecology (e.g. Castilla and Bauwens, 1992; Castilla et al., 1991; Mateo, 1988), behaviour (e.g. Castilla and Bauwens, 1989; Castilla and Bauwens, 1990; Mateo and Castanet, 1994; Paulo, 1988) and morphological and karyological differences within the species (e.g. Mateo, 1988). The first study concerned with the distribution of genetic variation within Lacerta lepida was carried out in 1988 by Mateo, using allozymes as molecular markers (Mateo, 1988; Mateo et al., 1996). Although the three continental Lacerta lepida subspecies analysed in those studies formed a homogeneous group, relatively higher Nei’s genetic distances were detected between the subspecies L. l. nevadensis and the other two subspecies. Subsequently, Paulo (2001) carried out a phylogeographic study of the species using a more geographically representative sampling strategy and mitochondrial genealogies. The results suggested a west-east gradient of genetic diversity, and several mitochondrial lineages with non-overlapping geographic ranges were described. They also suggested a history of allopatric differentiation in multiple refugia during the Plio-Pleistocene, in concordance with the patterns described for other species within the Iberian Peninsula. The geographic subdivision and population divergence detected by Paulo (2001) do not correspond exactly to the patterns previously obtained by Mateo et al. (1996) with morphology and allozyme data. More recently, a phylogenetic study of Lacerta spp. involving mitochondrial and nuclear genealogies suggested elevating the subspecies L. l. nevadensis to species status, due to the high levels of mitochondrial genetic differentiation detected (Paulo et al., 2008). Despite this apparently deep historical subdivision within Lacerta lepida, as revealed by the mitochondrial genealogy (Paulo et al., 2008) and some level of differentiation at allozyme markers (Mateo et al., 1996), it remains unclear whether these divergent subspecies are evolving independently, or whether they are in fact coalescing due to high levels of gene flow at zones of secondary contact.
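As an illustration of how a Nei's genetic distance of the kind mentioned above can be computed from allozyme allele frequencies, the following is a minimal sketch of Nei's standard genetic distance, D = -ln(J_XY / sqrt(J_X J_Y)). The text does not state which of Nei's distance measures was used in the allozyme studies, so the choice of the standard distance is an assumption; the function name and the allele frequencies are hypothetical and are not taken from any of the cited studies.

```python
import math

def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    pop_x, pop_y: lists with one dict per locus, mapping allele -> frequency.
    Both lists must describe the same loci in the same order.
    """
    n_loci = len(pop_x)
    # Mean (over loci) probability that two alleles drawn from X are identical
    j_x = sum(sum(p * p for p in locus.values()) for locus in pop_x) / n_loci
    # Same quantity for population Y
    j_y = sum(sum(q * q for q in locus.values()) for locus in pop_y) / n_loci
    # Mean probability that one allele from X and one from Y are identical
    j_xy = sum(
        sum(lx.get(a, 0.0) * ly.get(a, 0.0) for a in set(lx) | set(ly))
        for lx, ly in zip(pop_x, pop_y)
    ) / n_loci
    identity = j_xy / math.sqrt(j_x * j_y)
    return -math.log(identity)

# Hypothetical allele frequencies at two allozyme loci (illustration only)
population_a = [{"fast": 0.9, "slow": 0.1}, {"fast": 1.0}]
population_b = [{"fast": 0.2, "slow": 0.8}, {"fast": 0.6, "slow": 0.4}]
print(nei_standard_distance(population_a, population_b))
```

Two populations with identical allele frequencies give a distance of zero, and the value grows as the populations share fewer alleles.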

1.3. Molecular tools

Our understanding of the genetic structure of populations and species is greatly informed by the field of phylogeography (Avise 2000). Phylogeography emerged as an integrative discipline bridging the traditionally independent fields of population genetics and molecular phylogenetics, thus providing a valuable contribution to the understanding of evolutionary pattern and process. Phylogeography, as initially formulated, has developed by integrating information from other fields including demography, historical geography, ethology and molecular genetics (Avise, 1998), which has considerably enhanced our ability to analyze and interpret the genetic structure of populations and species over space and time. The discovery of a fast pace of molecular evolution for mitochondrial DNA thirty years ago (Brown et al., 1979) strongly shaped the field of phylogeography, which recognized the great utility of this fast evolving molecule as a source of information in an evolutionary context (Avise et al., 1987; Harrison, 1989). In fact, until now phylogeography has relied mainly on data derived from this molecule (Avise, 2004; Avise, 2009). Apart from the high mutation rate, which generates enough signal to make inferences about population history over short time frames, other important characteristics have made this molecule the tool of choice in phylogeographic studies. In animals, mitochondrial DNA has an almost exclusively maternal, non-recombining mode of inheritance that enables evolutionary histories to be reconstructed without the complexities introduced by biparental recombination. Additionally, the uniparental inheritance of mtDNA and its haploid state reduce its effective population size to one quarter of that of nuclear genes, which facilitates rapid lineage sorting. Thus, the power to detect splitting events is much higher with this molecule than with nuclear genes (Moore, 1995). Nevertheless, it has become widely recognized that studies relying solely on mitochondrial DNA have important limitations (reviewed in Zhang and Hewitt, 2003). As a non-recombining molecule, mitochondrial DNA provides information from a single locus only, and the evolutionary history of that molecule might not be consistent with the evolutionary history of the organism being studied due to stochastic processes (Harrison, 1989). Furthermore, retention of ancestral polymorphism, introgressive hybridization (e.g. Alves et al., 2003), natural selection (Ballard and Whitlock, 2004; Bazin et al., 2006) and sex-biased dispersal (e.g. Piertney et al., 2000), amongst other factors, can dramatically influence phylogeographic results based entirely on mitochondrial DNA. These problems can be addressed by the combined use of nuclear and mitochondrial genes, which have different modes of inheritance and different mutation rates. Such combined approaches are likely to provide insights into complex evolutionary histories such as those that most likely characterize species that have persisted in the Iberian Peninsula through several ice ages (e.g. Godinho et al., 2008).
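The one-quarter relationship between mitochondrial and nuclear effective population size can be checked with a short calculation. The sketch below is an idealized illustration, not part of the analyses in this thesis: it assumes strictly maternal inheritance of mtDNA, a diploid autosomal nuclear locus, random mating and Wright's formula for the effective size of a population with separate sexes. The function name and the census numbers are hypothetical.

```python
from fractions import Fraction

def effective_gene_copies(n_females, n_males):
    """Return (nuclear, mitochondrial) effective numbers of gene copies.

    Assumptions (idealized population): diploid autosomal nuclear locus,
    strictly maternal haploid mtDNA, random mating.
    """
    # Wright's effective size for separate sexes: Ne = 4 * Nf * Nm / (Nf + Nm)
    ne_individuals = Fraction(4 * n_females * n_males, n_females + n_males)
    nuclear = 2 * ne_individuals          # two copies per diploid individual
    mitochondrial = Fraction(n_females)   # one transmitted copy per female
    return nuclear, mitochondrial

# Equal sex ratio: 500 breeding females and 500 breeding males (hypothetical)
nuclear, mitochondrial = effective_gene_copies(500, 500)
print(mitochondrial / nuclear)  # 1/4
```

With an equal sex ratio the ratio is exactly one quarter; a skewed sex ratio changes it (for example, an excess of breeding females raises the ratio), which is one reason the one-quarter figure is best read as an approximation.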

1.3.1 The pitfalls of mitochondrial DNA: Numts, heteroplasmy and recombination

When using mitochondrial DNA as a genetic marker, several issues have to be taken into account in order to avoid drawing erroneous conclusions from the data, i.e. one has to be aware of the pitfalls of mitochondrial DNA. One of those pitfalls is the existence of Numts, mitochondrial fragments that are incorporated in the nuclear genome. It is known that parts of the mitochondrial genome can be transferred to the nucleus, a phenomenon which has been demonstrated in various taxa (for a review on the subject see Bensasson et al., 2001; Zhang and Hewitt, 1996). The mitochondrial fragments that are incorporated in the nucleus have different evolutionary histories from their mitochondrial counterparts and can therefore seriously compromise phylogenetic and phylogeographic studies if wrongly incorporated in mitochondrial datasets (e.g. Arctander and Fjeldsa, 1994). Nevertheless, when correctly identified, Numts can be useful, as they represent relicts of mitochondrial DNA and therefore provide information regarding the ancestral state of mitochondrial genealogies (Bensasson et al., 2001). The correct identification of mitochondrial DNA sequences, and distinguishing them from Numts, are essential steps in any study involving mtDNA as a genetic marker. Other important issues concerning mitochondrial DNA relate to the standard paradigms about its mode of inheritance in animals which, if proven to be wrong, could seriously compromise the suitability of mtDNA for many of its uses. One of the paradigms concerns the lack of recombination in mitochondrial DNA. Apart from some bivalve families, which show a unique type of mtDNA inheritance and where recombination of mtDNA has been detected (Burzynski et al., 2003; Ladoukakis and Zouros, 2001a), direct evidence that recombination of mtDNA occurs is scarce and has only been reported in humans (Kraytsberg et al., 2004) and in the nematode Meloidogyne javanica (Lunt and Hyman, 1997).
In natural populations, recent strong evidence suggests that it occurred in one individual of the Australian frillneck lizard (Chlamydosaurus kingii) (Ujvari et al., 2007) and in one hybrid between Salmo salar and Salmo trutta (Ciborowski et al., 2007). Nevertheless, surveys of published data sets suggest that the phenomenon may be more widespread and frequent (Awadalla et al., 1999; Ladoukakis and Zouros, 2001b; Piganeau et al., 2004; Tsaousis et al., 2005), and recent studies report indirect evidence for its occurrence in a variety of systems: butterflies (Andolfatto et al., 2003), scorpions (Gantenbein et al., 2005), silkmoths (Arunkumar et al., 2006) and fish (Guo et al., 2006). Furthermore, in recent years an increasing number of studies have also reported bi-parental inheritance of mitochondrial DNA through paternal leakage. As a consequence, the co-occurrence of more than one type of mitochondrial DNA in the same individual (heteroplasmy) has been reported across several taxa: birds (Kvist et al., 2003), Drosophila (Kondo et al., 1990; Sherengul et al., 2006), mice (Gyllensten et al., 1991; Kaneda et al., 1995; Shitara et al., 1998), cows (Steinborn et al., 1998; Sutovsky et al., 2000), cicadas (Fontaine et al., 2007), mites (Van Leeuwen et al., 2008), fish (Hoarau et al., 2002; Magoulas and Zouros, 1993), bees (Meusel and Moritz, 1993), sheep (Zhao et al., 2004) and humans (Kraytsberg et al., 2004; Schwartz and Vissing, 2002). The discovery of heteroplasmy in such a wide range of taxa, together with the recently reported cases of clear recombinants between divergent mitochondrial genomes in natural populations (Ciborowski et al., 2007; Ujvari et al., 2007), suggests that these phenomena might be more common than previously suspected, and researchers working with mitochondrial DNA should bear this in mind.

Taking into account the many pitfalls and problems associated with analyses that rely solely on mitochondrial DNA data, this study employs both cytoplasmic (maternally inherited) and nuclear (biparentally inherited) markers to infer the historical and contemporary processes that have shaped the evolutionary history of Lacerta lepida within the Iberian Peninsula. The mitochondrial marker used throughout the thesis is the cytochrome b gene.
This gene is commonly used for phylogenetic and phylogeographic studies in vertebrates, including studies of Lacerta lepida. The nuclear marker used is intron 7 of the β-fibrinogen gene, which has also been used successfully as a marker in several vertebrate phylogeographic and phylogenetic studies (Dolman and Phillips, 2004; Pinho et al., 2008; Sequeira et al., 2006). A recent phylogenetic study of the genus Lacerta (Paulo et al., 2008) showed that it has sufficient variation within Lacerta lepida for phylogeographic inference. In addition, microsatellite DNA markers are employed to address questions on a more recent time scale, such as the assessment of recent population history.

1.4 Thesis structure

This thesis is composed of four data chapters, each dealing with different aspects of the evolutionary biology of Lacerta lepida within the Iberian Peninsula. The first data chapter (chapter 2) assesses the origin of intra-individual polymorphism at the mitochondrial level detected in individuals distributed through a putative zone of contact between two divergent mitochondrial lineages in north-western Iberia. The analysis of mitochondrial DNA sequence data (cytochrome b gene) from individuals representing the entire distribution area of the two lineages, including detailed sampling at the putative contact zone, provided a detailed understanding of the recent demographic history of the species in the north-western region of the Iberian Peninsula and allowed the identification of several Numts as the source of the intra-individual polymorphism. In chapter 3 the phylogeographic analysis is extended to assess the broader phylogeographic patterns within Lacerta lepida across its distribution. This was achieved by using genealogies derived from both mtDNA and nDNA. Their contrasting molecular and population genetic properties facilitated the description of the phylogeography of the species. Using a coalescent approach that explores the geographic distribution of ancestral and derived alleles, the probable refugial areas that facilitated the allopatric differentiation of each lineage were identified. The influence of the Quaternary climatic oscillations in generating the observed phylogeographic patterns was assessed by estimating the timing of diversification. The detailed phylogeographic study performed in chapter 3 disclosed the location of several secondary contact zones between divergent mitochondrial lineages and the demographic patterns associated with their establishment.
Particularly interesting was the detection of a contact zone between two very divergent lineages that represent the oldest split within Lacerta lepida, estimated to have occurred during the Miocene. In chapter 4, gene flow across the putative contact zone between these two lineages was assessed using eight microsatellite loci and mitochondrial DNA data. Although hybridization between the lineages was detected through the presence of F1 hybrids, the results suggest that they are on independent evolutionary trajectories. In the final data chapter (chapter 5) the polymorphic individuals detected in chapter 2 are re-analyzed using a single-molecule PCR protocol. The results led to the detection of low levels of heteroplasmy and evidence for mitochondrial DNA recombination in Lacerta lepida. The thesis finishes with a general discussion (chapter 6) in which the implications of the findings are discussed in detail and topics for future research are addressed.

1.5 References

Alvarez Y, Mateo JA, Andreu AC, Diaz-Paniagua C, Diez A, Bautista JM (2000) Brief communication. Mitochondrial DNA haplotyping of Testudo graeca on both continental sides of the Straits of Gibraltar. Journal of Heredity 91, 39-41.

Alves PC, Ferrand N, Suchentrunk F, Harris DJ (2003) Ancient introgression of Lepus timidus mtDNA into L. granatensis and L. europaeus in the Iberian Peninsula. Molecular Phylogenetics and Evolution 27 , 70-80.

Andolfatto P, Scriber JM, Charlesworth B (2003) No association between mitochondrial DNA haplotypes and a female-limited mimicry phenotype in Papilio glaucus . Evolution 57 , 305-316.

Arctander P, Fjeldsa J (1994) Andean tapaculos of the genus Scytalopus (Aves, Rhinocryptidae): a study of speciation patterns using mtDNA sequence data. In: Conservation genetics (eds. Loeschcke V, Tomiuk J, Jain SK). Birkhauser, Basel.

Arnold EN, Arribas O, Carranza S (2007) Systematics of the Palaearctic and Oriental lizard tribe Lacertini (Squamata: Lacertidae: Lacertinae), with descriptions of eight new genera. Zootoxa 1430 .

Arunkumar KP, Metta M, Nagaraju J (2006) Molecular phylogeny of silkmoths reveals the origin of domesticated silkmoth, Bombyx mori from Chinese Bombyx mandarina and paternal inheritance of Antheraea proylei mitochondrial DNA. Molecular Phylogenetics and Evolution 40 , 419-427.

Avise JC (1998) The history and purview of phylogeography: a personal reflection. Molecular Ecology 7, 371-379.

Avise JC (2004) Molecular Markers, Natural History and Evolution , 2nd edn. Sinauer Associates, Sunderland, Massachusetts.

Avise JC (2009) Phylogeography: retrospect and prospect. Journal of Biogeography 36 , 3-15.

Avise JC, Arnold J, Ball RM, Bermingham E, Lamb T, Neigel JE, Reeb CA, Saunders NC (1987) Intraspecific phylogeography: the mitochondrial DNA bridge between population genetics and systematics. Annual Review of Ecology and Systematics 18 , 489-522.

Awadalla P, Eyre-Walker A, Smith JM (1999) Linkage disequilibrium and recombination in Hominid mitochondrial DNA. Science 286 , 2524-2525.

Ballard JWO, Whitlock MC (2004) The incomplete natural history of mitochondria. Molecular Ecology 13 , 729-744.

Batista V, Harris DJ, Carretero MA (2004) Genetic variation in Pleurodeles waltl Michaelles, 1830 (Amphibia: Salamandridae) across the Strait of Gibraltar derived from mitochondrial DNA sequences. Herpetozoa 16 , 166-168.

Bazin E, Glemin S, Galtier N (2006) Population size does not influence mitochondrial genetic diversity in animals. Science 312 , 570-572.

Bensasson D, Zhang D-X, Hartl DL, Hewitt GM (2001) Mitochondrial pseudogenes: evolution's misplaced witnesses. Trends in Ecology & Evolution 16 , 314-321.

Blondel J, Aronson J (1999) Biology and wildlife of the mediterranean region Oxford University Press, New York.

Brown WM, George M, Wilson AC (1979) Rapid evolution of animal mitochondrial DNA. Proceedings of the National Academy of Sciences of the United States of America 76 , 1967-1971.

Burzynski A, Zbawicka M, Skibinski DOF, Wenne R (2003) Evidence for recombination of mtDNA in the marine mussel Mytilus trossulus from the Baltic. Molecular Biology and Evolution 20 , 388-392.

Busack SD (1986) Biogeographic analysis of the herpetofauna separated by the formation of the Strait of Gibraltar. National Geographic Research 2, 17-36.

Carranza S, Arnold EN, Wade E, Fahd S (2004) Phylogeography of the false smooth snakes, Macroprotodon (Serpentes, Colubridae): mitochondrial DNA sequences show European populations arrived recently from Northwest Africa. Molecular Phylogenetics and Evolution 33 , 523-532.

Carranza S, Harris DJ, Arnold EN, Batista V, Gonzalez de la Vega JP (2006) Phylogeography of the lacertid lizard, Psammodromus algirus , in Iberia and across the Strait of Gibraltar. Journal of Biogeography 33 , 1279-1288.

Castilla AM, Bauwens D (1989) Reproductive characteristics of the lacertid lizard Lacerta lepida . Amphibia-Reptilia 10 , 445-452.

Castilla AM, Bauwens D (1990) Reproductive and fat body cycles of the lizard, Lacerta lepida , in central Spain. Journal of Herpetology 24 , 261-266.

Castilla AM, Bauwens D (1992) Habitat selection by the lizard Lacerta lepida in a Mediterranean oak forest Herpetological Journal 2, 27-30.

Castilla AM, Bauwens D, Llorente GA (1991) Diet composition of the lizard Lacerta lepida in Central Spain. Journal of Herpetology 25 , 30-36.

Castroviejo J, Mateo JA (1998) Una nueva subespecie de Lacerta lepida Daudin 1802 (Sauria, Lacertidae) para la Isla de Salvora (España). Publicaciones de la Asociacion de Amigos de Doñana 12, 1–21.

Cheylan M, Grillet P (2005) Statut passe et actuel du lezard ocelle (Lacerta lepida, Sauriens, lacertides) en France. Implication en termes de conservation. Vie et Milieu 55, 15-30.

Ciborowski KL, Consuegra S, Garcia de Leaniz C, Beaumont MA, Wang J, Jordan WC (2007) Rare and fleeting: an example of interspecific recombination in animal mitochondrial DNA. Biology Letters 3, 554-557.

Darwin C (1859) On the origin of species by means of Natural Selection, or the preservation of favoured races in the struggle for Life J. Murray, London.

Darwin C (1858) On the tendency of species to form varieties; and on the perpetuation of varieties and species by natural means of selection. I. Extract from an unpublished work on species, II. Abstract of a letter from C. Darwin, Esq., to Prof. Asa Gray. Journal of the Proceedings of the Linnean Society of London 3, 45-53.

Dolman G, Phillips B (2004) Single copy nuclear DNA markers characterized for comparative phylogeography in Australian wet tropics rainforest skinks. Molecular Ecology Notes 4, 185-187.

Fontaine KM, Cooley JR, Simon C (2007) Evidence for paternal leakage in hybrid periodical cicadas (Hemiptera: Magicicada spp.). PLoS ONE 2, e892.

Fu J (1998) Toward the phylogeny of the family Lacertidae: implications from mitochondrial DNA 12S and 16S gene sequences (Reptilia: Squamata). Molecular Phylogenetics and Evolution 9, 118-130.

Fu J (2000) Towards the phylogeny of the family Lacertidae: Why 4708 base pairs of mtDNA sequences cannot draw the picture. Biological Journal of the Linnean Society 71 , 203-217.

Gantenbein B, Fet V, Gantenbein-Ritter IA, Balloux F (2005) Evidence for recombination in scorpion mitochondrial DNA (Scorpiones: Buthidae). Proceedings of the Royal Society B: Biological Sciences 272, 697-704.

Gantenbein B, Largiadèr CR (2003) The phylogeographic importance of the Strait of Gibraltar as a gene flow barrier in terrestrial arthropods: a case study with the scorpion Buthus occitanus as model organism. Molecular Phylogenetics and Evolution 28, 119-130.

García-Barros E, Gurrea P, Luciáñez MJ, Cano JM, Munguira ML, Moreno JC, Sainz H, Sanz MJ, Simón JC (2002) Parsimony analysis of endemicity and its application to animal and plant geographical distributions in the Ibero-Balearic region (western Mediterranean). Journal of Biogeography 29, 109-124.

Godinho R, Crespo EG, Ferrand N (2008) The limits of mtDNA phylogeography: complex patterns of population history in a highly structured Iberian lizard are only revealed by the use of nuclear markers. Molecular Ecology 17, 4670-4683.

Godinho R, Mendonca B, Crespo EG, Ferrand N (2006) Genealogy of the nuclear beta-fibrinogen locus in a highly structured lizard species: comparison with mtDNA and evidence for intragenic recombination in the hybrid zone. Heredity 96 , 454-463.

Gomez A, Lunt DH (2007) Refugia within refugia: patterns of phylogeographic concordance in the Iberian Peninsula. In: Phylogeography of Southern European Refugia (eds. Weiss S, Ferrand N). Springer, Dordrecht.

Guo X, Liu S, Liu Y (2006) Evidence for recombination of mitochondrial DNA in triploid Crucian Carp. Genetics 172 , 1745-1749.

Gyllensten U, Wharton D, Josefsson A, Wilson AC (1991) Paternal inheritance of mitochondrial DNA in mice. Nature 352, 255-257.

Harris DJ, Arnold EN, Thomas RH (1998) Relationships of lacertid lizards (Reptilia: Lacertidae) estimated from mitochondrial DNA sequences and morphology. Proceedings of the Royal Society B: Biological Sciences 265 , 1939-1948.

Harrison RG (1989) Animal mitochondrial DNA as a genetic marker in population and evolutionary biology. Trends in Ecology & Evolution 4, 6-11.

Hewitt GM (1996) Some genetic consequences of ice ages, and their role, in divergence and speciation. Biological Journal of the Linnean Society 58 , 247-276.

Hewitt GM (1999) Post-glacial re-colonization of European biota. Biological Journal of the Linnean Society 68 , 87-112.

Hewitt GM (2000) The genetic legacy of the Quaternary ice ages. Nature 405, 907-913.

Hewitt GM (2004) Genetic consequences of climatic oscillations in the Quaternary. Philosophical Transactions of the Royal Society B: Biological Sciences 359 , 183-195.

Hoarau G, Holla S, Lescasse R, Stam WT, Olsen JL (2002) Heteroplasmy and Evidence for Recombination in the Mitochondrial Control Region of the Flatfish Platichthys flesus . Molecular Biology and Evolution 19 , 2261-2264.

Hsu KJ, Montadert L, Bernoulli D, Cita MB, Erickson A, Garrison RE, Kidd RB, Melieres F, Muller C, Wright R (1977) History of the Mediterranean salinity crisis. Nature 267 , 399-403.

Kaneda H, Hayashi J, Takahama S, Taya C, Lindahl K, Yonekawa H (1995) Elimination of paternal mitochondrial DNA in intraspecific crosses during early mouse embryogenesis. Proceedings of the National Academy of Sciences 92, 4542-4546.

Kondo R, Satta Y, Matsuura ET, Ishiwa H, Takahata N, Chigusa SI (1990) Incomplete Maternal Transmission of Mitochondrial-DNA in Drosophila. Genetics 126 , 657-663.

Kraytsberg Y, Schwartz M, Brown TA, Ebralidse K, Kunz WS, Clayton DA, Vissing J, Khrapko K (2004) Recombination of human mitochondrial DNA. Science 304 , 981-981.

Krijgsman W, Garcés M, Agustí J, Raffi I, Taberner C, Zachariasse WJ (2000) The 'Tortonian salinity crisis' of the eastern Betics (Spain). Earth and Planetary Science Letters 181, 497-511.

Krijgsman W, Hilgen FJ, Raffi I, Sierro FJ, Wilson DS (1999) Chronology, causes and progression of the Messinian salinity crisis. Nature 400 , 652-655.

Krijgsman W, Langereis CG (2000) Magnetostratigraphy of the Zobzit and Koudiat Zarga sections (Taza-Guercif basin, Morocco): implications for the evolution of the Rifian Corridor. Marine and Petroleum Geology 17 , 359-371.

Kvist L, Martens J, Nazarenko AA, Orell M (2003) Paternal leakage of mitochondrial DNA in the great tit ( Parus major ). Molecular Biology and Evolution 20 , 243-247.

Ladoukakis ED, Zouros E (2001a) Direct evidence for homologous recombination in mussel (Mytilus galloprovincialis) mitochondrial DNA. Molecular Biology and Evolution 18 , 1168-1175.

Ladoukakis ED, Zouros E (2001b) Recombination in animal mitochondrial DNA: Evidence from published sequences. Molecular Biology and Evolution 18 , 2127-2131.

Lenk P, Fritz U, Joger U, Wink M (1999) Mitochondrial phylogeography of the European pond turtle, Emys orbicularis (Linnaeus 1758). Molecular Ecology 8, 1911-1922.

Lunt DH, Hyman BC (1997) Animal mitochondrial DNA recombination. Nature Genetics , 247.

Magoulas A, Zouros E (1993) Restriction-site heteroplasmy in Anchovy ( Engraulis encrasicolus ) indicates incidental biparental inheritance of mitochondrial DNA. Molecular Biology and Evolution 10 , 319-325.

Martín JM, Braga JC, Betzler C (2001) The Messinian Guadalhorce corridor: the last northern, Atlantic-Mediterranean gateway. Terra Nova 13 , 418-424.

Martínez-Solano I, Gonçalves HA, Arntzen JW, García-París M (2004) Phylogenetic relationships and biogeography of midwife toads (Discoglossidae: Alytes ). Journal of Biogeography 31 , 603-618.

Martínez-Solano I, Teixeira J, Buckley D, Garcia-Paris M (2006) Mitochondrial DNA phylogeography of Lissotriton boscai (Caudata, Salamandridae): evidence for old, multiple refugia in an Iberian endemic. Molecular Ecology 15 , 3375-3388.

Mateo JA (1988) Estudio sistematico y zoogeografico de los Lagartos Ocelados, Lacerta lepida Daudin, 1802, y Lacerta pater (Lataste, 1880), (Sauria: Lacertidae) , Universidad de Sevilla.

Mateo JA, Castanet J (1994) Reproductive strategies in three Spanish populations of the ocellated lizard, Lacerta lepida (Sauria, Lacertidae). Acta oecologica 15 , 215-229.

Mateo JA, Castroviejo J (1990) Variation morphologique et revision taxonomique de l’espece Lacerta lepida Daudin, 1802 (Sauria, Lacertidae). Bulletin du Museé de Histoire Naturele de Paris 12 , 691–706.

Mateo JA, López-Jurado LF (1994) Variaciones en el color de los lagartos ocelados; aproximacion a la distribuicion de Lacerta lepida nevadensis Buchholz 1963. Revista Espanola de Herpetologia 8, 29-35.

Mateo JA, López-Jurado LF, Guillaume CP (1996) Variabilité électrophorétique et morphologique des lézards ocellés (Lacertidae): un complexe d’espèces de part et d’autre du détroit de Gibraltar. Comptes Rendus de L’Academie des Sciences Serie iii-Sciences de la Vie-Life Sciences 319 , 737–746.

Meusel MS, Moritz RFA (1993) Transfer of paternal mitochondrial DNA during fertilization of honeybee (Apis mellifera L.) eggs. Current Genetics 24, 539-543.

Moore WS (1995) Inferring phylogenies from mtDNA variation: mitochondrial-gene trees versus nuclear-gene trees. Evolution 49 , 718-726.

Myers N, Mittermeier RA, Mittermeier CG, da Fonseca GAB, Kent J (2000) Biodiversity hotspots for conservation priorities. Nature 403 , 853-858.

Paulo OS (1988) Estudo eco-etologico da populacao de Lacerta lepida (Daudin 1802) (Sauria, Lacertidae) da ilha da Berlenga, Universidade de Lisboa.

Paulo OS (2001) The phylogeography of reptiles of the Iberian Peninsula , University of London.

Paulo OS, Dias C, Bruford MW, Jordan WC, Nichols RA (2001) The persistence of Pliocene populations through the Pleistocene climatic cycles: evidence from the phylogeography of an Iberian lizard. Proceedings of the Royal Society B: Biological Sciences 268 , 1625-1630.

Paulo OS, Pinheiro J, Miraldo A, Bruford MW, Jordan WC, Nichols RA (2008) The role of vicariance vs. dispersal in shaping genetic patterns in ocellated lizard species in the western Mediterranean. Molecular Ecology 17 , 1535-1551.

Piertney SB, MacColl ADC, Bacon PJ, Racey PA, Lambin X, Dallas JF (2000) Matrilineal genetic structure and female-mediated gene flow in Red Grouse (Lagopus lagopus scoticus): an analysis using mitochondrial DNA. Evolution 54, 279-289.

Piganeau G, Gardner M, Eyre-Walker A (2004) A broad survey of recombination in animal mitochondria. Molecular Biology and Evolution 21 , 2319-2325.

Pinho C, Harris DJ, Ferrand N (2008) Non-equilibrium estimates of gene flow inferred from nuclear genealogies suggest that Iberian and North African wall lizards (Podarcis spp.) are an assemblage of incipient species. BMC Evolutionary Biology 8, 63.

Rosenbaum G, Lister GS, Duboz C (2002a) Reconstruction of the tectonic evolution of the western Mediterranean since the Oligocene. Journal of the Virtual Explorer 8, 107-130.

Rosenbaum G, Lister GS, Duboz C (2002b) Relative motions of Africa, Iberia and Europe during Alpine orogeny. Tectonophysics 359 , 117-129.

Roucoux KH, de Abreu L, Shackleton NJ, Tzedakis PC (2005) The response of NW Iberian vegetation to North Atlantic climatic oscillations during the last 65 kyr. Quaternary Science Reviews 24 , 1637-1653.

Schmitt T, Rober S, Seitz A (2005) Is the last glaciation the only relevant event for the present genetic population structure of the Meadow Brown butterfly Maniola jurtina (Lepidoptera: Nymphalidae)? Biological Journal of the Linnean Society 85, 419-431.

Schmitt T, Seitz A (2004) Low diversity but high differentiation: the population genetics of Aglaope infausta (Zygaenidae: Lepidoptera). Journal of Biogeography 31, 137-144.

Schwartz M, Vissing J (2002) Paternal inheritance of mtDNA in a patient with mitochondrial myopathy. European Journal of Human Genetics 10 , 239-239.

Sequeira F, Alexandrino J, Rocha S, Arntzen JW, Ferrand N (2005) Genetic exchange across a hybrid zone within the Iberian endemic golden-striped salamander, Chioglossa lusitanica . Molecular Ecology 14 , 245-254.

Sequeira F, Ferrand N, Harris DJ (2006) Assessing the phylogenetic signal of the nuclear β-Fibrinogen intron 7 in salamandrids (Amphibia: Salamandridae). Amphibia-Reptilia 27 , 409-418.

Sherengul W, Kondo R, Matsuura ET (2006) Analysis of paternal transmission of mitochondrial DNA in Drosophila. Genes and Genetic Systems 81 , 399-404.

Shitara H, Hayashi J, Takahama S, Kaneda H, Yonekawa H (1998) Maternal inheritance of mouse mtDNA in interspecific hybrids: Segregation of the leaked paternal mtDNA followed by the prevention of subsequent paternal leakage. Genetics 148 , 851-857.

Steinborn R, Zakhartchenko V, Jelyazkov J, Klein D, Wolf E, Müller M, Brem G (1998) Composition of parental mitochondrial DNA in cloned bovine embryos. FEBS Letters 426 , 352-356.

Sutovsky P, Moreno RD, Ramalho-Santos J, Dominko T, Simerly C, Schatten G (2000) Ubiquitinated sperm mitochondria, selective proteolysis, and the regulation of mitochondrial inheritance in mammalian embryos. Biology of Reproduction 63 , 582-590.

Taberlet P, Fumagalli L, Wust-Saucy A-G, Cosson J-F (1998) Comparative phylogeography and postglacial colonization routes in Europe. Molecular Ecology 7, 453-464.

Tsaousis AD, Martin DP, Ladoukakis ED, Posada D, Zouros E (2005) Widespread recombination in published animal mtDNA sequences. Molecular Biology and Evolution 22 , 925-933.

Tzedakis PC, Lawson IT, Frogley MR, Hewitt GM, Preece RC (2002) Buffered tree population changes in a Quaternary refugium: evolutionary implications. Science 297 , 2044-2047.

Ujvari B, Dowton M, Madsen T (2007) Mitochondrial DNA recombination in a free-ranging Australian lizard. Biology Letters 3, 189-192.

Van Leeuwen T, Vanholme B, Van Pottelberge S, Van Nieuwenhuyse P, Nauen R, Tirry L, Denholm I (2008) Mitochondrial heteroplasmy and the evolution of insecticide resistance: Non-Mendelian inheritance in action. Proceedings of the National Academy of Sciences 105 , 5980-5985.

Vasconcelos R, Carretero MA, Harris DJ (2006) Phylogeography of the genus Blanus (worm lizards) in Iberia and Morocco based on mitochondrial and nuclear markers - preliminary analysis. Amphibia-Reptilia 27 .

Wallace AR (1858) On the tendency of species to form varieties; and on the perpetuation of varieties and species by natural means of selection. III. On the tendency of varieties to depart indefinitely from the original type. Journal of the Proceedings of the Linnean Society of London 3, 53-62.

Williams PH, Humphries C, Araujo MB, Lampinen R, Hagemeijer W, Gasc J-P, Mitchell-Jones T (2000) Endemism and important areas for representing European biodiversity: a preliminary exploration of atlas data for plants and terrestrial vertebrates. Belgian Journal of Entomology 2, 21-46.

Zagwijn WH (1992a) The beginning of the Ice Age in Europe and its major subdivisions. Quaternary Science Reviews 11 , 583-591.

Zagwijn WH (1992b) Migration of vegetation during the Quaternary in Europe. Courier Forschungsinstitut Senckenberg 153 , 9-20.

Zangari F, Cimmaruta R, Nascetti G (2006) Genetic relationships of the western Mediterranean painted frogs based on allozymes and mitochondrial markers: evolutionary and taxonomic inferences (Amphibia, Anura, Discoglossidae). Biological Journal of the Linnean Society 87 , 515-536.

Zhang D-X, Hewitt GM (1996) Nuclear integrations: challenges for mitochondrial DNA markers. Trends in Ecology & Evolution 11 , 247-251.

Zhang D-X, Hewitt GM (2003) Nuclear DNA analyses in genetic studies of populations: practice, problems and prospects. Molecular Ecology 12, 563-584.

Zhao X, Li N, Guo W, Hu X, Liu Z, Gong G, Wang A, Feng J, Wu C (2004) Further evidence for paternal inheritance of mitochondrial DNA in the sheep ( Ovis aries ). Heredity 93 , 399-403.

Chapter 2

Intra-individual mitochondrial DNA polymorphism in a reptilian secondary contact zone

Photos by Rita Jacinto: sampling Lacerta lepida in Sierra Mágina, Andalucia, Spain*

*Sampling site 6 in chapter 4

2. Intra-individual mitochondrial DNA polymorphism in a reptilian secondary contact zone

2.1. Abstract

In the north-western region of the Iberian Peninsula two divergent mitochondrial lineages of Lacerta lepida, L3 and L5, occur. Sequence traces with polymorphic nucleotide sites diagnostic for the two lineages were found in individuals distributed across the geographic range of L3, far from the probable contact zone between the lineages. This suggests that genetic exchange has occurred, involving either biparental inheritance or the introgression of mitochondrial fragments of one lineage into the nuclear genome of the other (Numts). To identify the origin of this intra-individual polymorphism, a detailed phylogeographic study of the putative contact zone between the lineages was carried out. The data suggest that the two mtDNA lineages diverged in allopatry in two different refugia. A secondary contact zone located south of the Douro River was formed between the lineages as a consequence of range expansion, predominantly in L5. The intra-individual polymorphism in L3 is concluded to be due to the presence of Numts. The phylogenetic relatedness of the Numts to the mtDNA sequences of both lineages was assessed and the geographic patterns of association are discussed in detail.

Key words: contact zone, Numts, heteroplasmy, mitochondrial lineages

2.2. Introduction

Mitochondrial DNA (mtDNA) is the most widely employed molecular marker in animal phylogenetic, phylogeographic and population genetic studies. This preferential use of mtDNA in evolutionary analysis is primarily due to its higher mutation rate compared to the nuclear genome, and to its maternal inheritance, which precludes sexual recombination. A variety of mechanisms acting at many stages of the reproductive process (for a review see Birky, 1995) are responsible for the strict maternal inheritance of mtDNA in animals. In most animals maternal inheritance of mtDNA seems to be due to a combination of several factors: much lower numbers of mitochondria in the sperm, random replication of mtDNA within cells and active elimination of paternally derived mitochondria (Birky, 1995; Birky, 2001). Under these conditions mtDNA genes within an individual typically exhibit homoplasmy, i.e. all copies of a gene within the individual are identical in sequence. However, there are two interesting exceptions to this general observation that can lead to the detection of polymorphism upon sequencing mitochondrial genes, and that can be more apparent within individuals that are the product of a hybrid history.

The first of these exceptions is heteroplasmy resulting from the inheritance of both maternal and paternal mtDNA genomes. We refer to this as Biparental Inheritance (BI) heteroplasmy (Fig. 2.1.). Although not all the mechanisms responsible for preventing paternal mtDNA transmission are yet fully understood, in recent years the processes controlling the active elimination of paternal mitochondria and their mtDNA genomes inside the egg have been studied in some detail in several mammal species (Shitara et al., 1998; Sutovsky et al., 1999; Sutovsky et al., 2000; Sutovsky et al., 2004). After fertilization, sperm mitochondria are tagged within the egg by ubiquitin (a proteolytic marker) and are later selectively destroyed.
Nevertheless, failure to eliminate all paternal mitochondria in the egg has been shown in hybrid crosses, supporting the idea that the process of active elimination of sperm mitochondria is species-specific (Kaneda et al., 1995; Sutovsky et al., 2000). When the elimination mechanism breaks down, paternal leakage can occur, generating heteroplasmic individuals. With the exception of some bivalve species, where paternal transmission of mtDNA occurs through doubly uniparental inheritance (Skibinski et al., 1994; Zouros et al., 1994), most cases of BI heteroplasmy reported in the literature are incidental and typically associated with hybridization events. Such cases of BI mtDNA heteroplasmy have been reported in birds (Kvist et al., 2003), Drosophila (Kondo et al., 1990; Sherengul et al., 2006), mice (Gyllensten et al., 1991; Kaneda et al., 1995; Shitara et al., 1998), cows (Steinborn et al., 1998; Sutovsky et al., 2000), cicadas (Fontaine et al., 2007), mites (Van Leeuwen et al., 2008), fish (Hoarau et al., 2002; Magoulas and Zouros, 1993), bees (Meusel and Moritz, 1993), sheep (Zhao et al., 2004) and humans (Kraytsberg et al., 2004; Schwartz and Vissing, 2002). Further indirect evidence for BI heteroplasmy comes from recent reports of mtDNA genomes that are the products of recombination between different species or evolutionary lineages. Ciborowski et al. (2007) analysed the mtDNA of 717 Atlantic salmon (Salmo salar) and identified a single recombinant genome between Atlantic salmon and brown trout (Salmo trutta). Ujvari et al. (2007) also identified a single recombinant genome in an analysis of 79 individuals across a contact zone between two divergent mtDNA haplotypes in the Australian frillneck lizard Chlamydosaurus kingii. While neither of these studies reported BI heteroplasmy, in both cases BI heteroplasmy must have been a precursor stage for such recombination events to occur.
The second exception to mtDNA homoplasmy that can be more apparent within individuals of hybrid ancestry is polymorphism arising from mitochondrial pseudogenes incorporated in the nucleus (Numts, as first abbreviated by Lopez et al., 1994). Numts are common and have been recorded in many taxa (for reviews of Numts see Bensasson et al., 2001; Zhang and Hewitt, 1996). They can arise through single or several independent translocations to the nucleus and vary in number and size, from very small translocations comprising only partial fragments of mitochondrial genes to the incorporation of almost the entire mitochondrial genome (Richly and Leister, 2004). In vertebrates it has been shown that the mutation rate of Numts slows down relative to the mtDNA gene regions from which they are derived (Arctander, 1995; Collura and Stewart, 1995; Fukuda et al., 1985; Lu et al., 2002; Smith et al., 1992; Zischler et al., 1995), in line with the slower rate of sequence evolution in the nuclear genome (Brown et al., 1982). Thus the detection of Numts that have arisen recently may be difficult, as the divergence between the Numt and its functional mtDNA copy could be very small or even absent, depending on the size of the fragment under study (Fukuda et al., 1985; Zischler et al., 1995). Recently Podnar et al. (2007) demonstrated in Podarcis lizards the existence of a Numt in P. sicula that is genetically more similar to the mtDNA genome of the related species P. muralis. Although their data did not allow definitive conclusions regarding the origin of the Numt in P. sicula, their study does present an interesting model whereby Numts of recent origin may be more readily detectable within individuals of hybrid ancestry (Fig. 2.1.). In essence, a recent Numt that is genetically identical to the parental mtDNA genome will remain undetected by PCR when it co-occurs with that genome in an individual. However, hybridisation involving individuals with divergent mtDNA genomes can result in offspring where the recent Numt and a divergent mtDNA genome reside in the same cells, and are thus detectable by PCR.

A recent phylogenetic analysis of the genus Lacerta (Paulo et al., 2008) revealed phylogeographic structuring within the species Lacerta lepida in the Iberian Peninsula. Clade L of Paulo et al. (2008) is composed of 4 mitochondrial lineages (L1-L4), with only low levels of divergence among them (Fig. 2.2.). These 4 mtDNA lineages were found to have non-overlapping geographic ranges, supporting the idea of allopatric differentiation in multiple Iberian refugia during the Pleistocene (Paulo et al., 2008). Of particular interest for the present study is the north-western region of Iberia, where two mitochondrial lineages (L3 and L4) occur.
Lineage L3 is distributed to the north of the Douro River in Portugal and in the regions of Galicia and Asturias in Spain. Lineage L4 has the widest distribution of all lineages, occurring from the Atlantic coast of Portugal to southern France and north-western Italy. Although the study of Paulo et al. (2008) did not go so far as to identify a contact zone between lineages L3 and L4, several polymorphic sequence traces at nucleotide sites diagnostic for the lineages (unpublished) were detected within the range of L3 but far away from the probable zone of contact between the two lineages. These polymorphic sequence traces suggest that (1) genetic exchange has occurred between these lineages, and (2) the genetic exchange has involved some level of either BI heteroplasmy or Numts.

The central aim of this study is to determine the origin of the polymorphic sequences previously detected in lineage L3 by increased sampling across the geographic ranges of L3 and L4 in north-western Iberia. To achieve this aim we have three specific objectives. The first objective is to define more precisely the geographic ranges of lineages L3 and L4 and the incidence of intra-individual mtDNA sequence polymorphism within these. The second objective is to identify a contact zone between lineages L3 and L4. The third objective is to identify the refugial areas for each lineage and to infer the probable demographic events resulting in a contact zone.

2.3. Materials and Methods

2.3.1. Sampling

Lizards were captured under licence during the years 2005 and 2006. In 2005 the north-western corner of Iberia was sampled broadly to focus on identifying the geographic limits of lineages L3 and L4 previously described by Paulo et al. (2008), particularly where they come into geographic contact. In 2006, five sites (A-E) spanning an area of contact (identified by the sequence data from samples collected in 2005) were sampled. Sites were 20km apart and sampling was targeted at 20 individuals per site. Lizards were captured using tomahawk traps or by hand, and tissue samples were taken by clipping 1cm of the tail tip that was subsequently preserved in 100% ethanol. After tissue sampling, animals were immediately released back into the wild in the place of capture. Geographic coordinates of sampling sites were recorded with a GPS.

2.3.2. DNA extraction, amplification and sequencing

Total genomic DNA was extracted from ethanol-preserved muscle tissue using a salt extraction protocol (Aljanabi and Martinez, 1997; Sunnucks and Hales, 1996). A fragment of 627 base pairs (bp) of the mitochondrial DNA (mtDNA) cytochrome b (cytb) gene was amplified using the truncated version of primer L14841 (Kocher et al., 1989) (CYTBF, 5’-CCA TCC AAC ATC TCA GCA TGA TGA AA-3’) and the modified version of primer MV16 (Moritz et al., 1992) (CYTBR, 5’-AAA TAG GAA GTA TCA CTC TGG TTT-3’). Polymerase chain reactions (PCRs) were performed in a total volume of 25 µl, containing 0.5U of Taq polymerase (BIOTAQ™), 4mM of MgCl2, 0.4mM of each nucleotide (Bioline), 0.4 µM of each primer, 2 µl of 10x NH4 reaction buffer (Bioline) and approximately 50ng of DNA. PCR amplifications were conducted as follows: DNA was initially denatured at 94°C for 3 min followed by 35 cycles of denaturation at 94°C for 45s, annealing at 51°C for 45s and extension at 72°C for 45s, plus a final extension step at 72°C for 5 min. Negative controls (no DNA) were included for all amplifications. PCR products were visualized on a 2% agarose gel and purified by filtration through QIAquick® columns (Qiagen) following the manufacturer’s recommendations. Purified PCR products were sequenced in both directions using the above primers with reaction mixes consisting of 6.35 µl of ddH2O, 1.5 µl of primer at 3.5 µM, 1 µl of BigDye Terminator v3.1™ (Applied Biosystems) and 1 µl of PCR product. Sequence reactions were performed as follows: initial incubation at 96°C for 1min; 25 cycles of incubation at 90°C for 10s, 50°C for 5s and 60°C for 4min. All PCRs and sequencing reactions were performed in a DNA Engine Tetrad 2 Peltier thermocycler, and sequences were obtained using an ABI 3700 capillary sequencer. DNA sequences were aligned by eye using BioEdit Sequence Alignment Editor 7.01 (Hall, 1999).

2.3.3. Identification of polymorphic individuals and quantification of intra-individual variation

All chromatograms were visually checked to assess sequence quality and the presence of double peaks. Sequences were classified as polymorphic if at least one double peak was detected. In order to control for contamination as the source of polymorphism, DNA from 8 polymorphic samples and 2 samples with no signs of polymorphism was re-extracted and the cytb fragment was amplified and sequenced using the same conditions as described before. A negative control was used in every step of the experiment. Eighteen polymorphic samples from the contact zone were used to quantify intra-individual variation. For those samples the cytb fragment was amplified and the fragments were cloned using the StrataClone™ PCR Cloning Kit (Stratagene), following the manufacturer’s recommendations. Three samples with no signs of polymorphism were also cloned for control purposes. For each cloned sample, 6 to 10 positive clones were purified using the QIAprep® Spin Miniprep Kit (Qiagen) and directly sequenced using the conditions described above.

2.3.4. Amplification of the entire cytochrome b gene

For all polymorphic samples the entire cytb gene was amplified, using modified versions of primers L14919 (TRNAGLU, 5’-AAC CAC CGT TGT ATT TCA ACT-3’) and L16064 (TRNATHR, 5’-CTT TGG TTT ACA AGA ACA ATG CTT TA-3’) (Burbrink et al., 2000). Primers anneal at the tRNA-Glu and tRNA-Thr genes, respectively. PCR reagent conditions were the same as described for the cytb fragment and amplifications were conducted as follows: DNA was initially denatured at 94°C for 3 min followed by 35 cycles of denaturation at 94°C for 30s, annealing at 52°C for 30s and extension at 72°C for 90s, plus a final extension step at 72°C for 3 min. Negative controls (no DNA) were included for all amplifications. PCR products were visualized on a 2% agarose gel and purified by filtration through QIAquick® columns (Qiagen) following the manufacturer’s recommendations. Purified PCR products were sequenced with internal primers specifically designed for this study: CBF (5’-AAC CTC CTC TCA GCA ATA CC-3’) and CBR (5’-CCT GTG GGG TTG TTT GAA-3’). Sequencing conditions were the same as above.

2.3.5. Haplotype network construction

Network approaches have been recommended as more appropriate than phylogenetic trees as a tool for the representation of intraspecific genetic variation, as they incorporate population processes important at the intraspecific level: persistence of ancestral haplotypes in the population, lower levels of divergence, multifurcations and reticulations (Cassens et al., 2005; Posada and Crandall, 2001). Networks also allow the representation of alternative genealogical pathways, revealing ambiguities due to homoplasies and/or recombination which would not be revealed by a strict consensus tree (Cassens et al., 2005). Intraspecific gene genealogies were inferred using two different network construction approaches: median-joining (MJ) (Bandelt et al., 1999) and statistical parsimony (SP) (Templeton et al., 1992). Both methods have been shown to perform well and give similar outcomes, but MJ seems to generate fewer errors than the SP approach when missing node haplotypes are identified (Cassens et al., 2005). The first step of MJ is to connect all haplotypes in a tree without inferring ancestral nodes and keeping the tree length to a minimum. The result is a “minimum spanning network” where all possible minimum length trees are represented in a single network. As minimum length trees are not always the most parsimonious ones, consensus sequences (ancestral nodes) are then added to the minimum spanning trees in order to increase parsimony. The consensus sequences can be biologically interpreted as extant unsampled sequences or extinct ancestral haplotypes. The MJ network was computed with the program NETWORK 4.5.0 (www.fluxus-engineering.com) keeping the parameter ε = 0, which does not allow less parsimonious pathways to be included in the analysis. A SP network was inferred using the program TCS 1.21 (Clement et al., 2000) with a parsimony confidence limit of 95%.
SP networks include less parsimonious alternatives whenever those alternatives cannot be excluded at the confidence limit chosen. Ambiguities within networks were resolved following the criteria of Crandall & Templeton (1993): rare haplotypes occur more often at the tips of cladograms, while common ones are more likely to be interior, and unique haplotypes are more likely to be connected to haplotypes from the same population than to haplotypes from different populations. Ancestral haplotypes were identified using predictions from coalescent theory that ancestral haplotypes will occur at high frequency, be represented in the greatest number of populations, have multiple connections to low frequency haplotypes, and be located at the interior of the network (Crandall & Templeton, 1993; Posada & Crandall, 2001).
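
The first step of the median-joining approach described above, connecting all haplotypes in a tree of minimum length, can be illustrated with a minimal sketch. It is not the NETWORK 4.5.0 implementation: it builds a single minimum spanning tree with Prim's algorithm rather than the full minimum spanning network, it does not add consensus (median) sequences, and the toy haplotypes and function names are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of sites at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(haplotypes: dict[str, str]) -> list[tuple[str, str, int]]:
    """Prim's algorithm on pairwise Hamming distances.

    Returns one minimum spanning tree as (from, to, mutational steps) edges.
    """
    names = list(haplotypes)
    in_tree = [names[0]]
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge joining a haplotype in the tree to one outside it.
        best = min(
            ((u, v, hamming(haplotypes[u], haplotypes[v]))
             for u in in_tree for v in names if v not in in_tree),
            key=lambda edge: edge[2],
        )
        edges.append(best)
        in_tree.append(best[1])
    return edges


# Toy aligned haplotypes, invented for illustration (not thesis data).
toy = {"H1": "AAAAAA", "H2": "AAAAAT", "H3": "AAAATT", "H4": "CAAAAA"}
tree = minimum_spanning_tree(toy)
total_length = sum(steps for _, _, steps in tree)
```

For the toy data the tree has three edges of one mutational step each; ties between equally short edges are where a minimum spanning network would retain alternative connections.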

2.4. Results

A total of 148 samples were collected: 63 from across the geographic range of lineages L3 and L4 (sample codes BS1 to BS63), and 85 from 5 populations spanning the putative contact zone between the two mitochondrial lineages (Population A, n=7; Population B, n=19; Population C, n=20; Population D, n=14 and Population E, n=24). Sampling sites and number of samples per site are detailed in Fig. 2.2. and Table 2.1. respectively. Forty samples generated polymorphic sequences involving: 2 from population A (n=7), 13 from population B (n=19), 18 from population C (n=20) and 7 from outside the transect (samples BS38 to BS44, in sampling sites 3, 6, 7, 8, 10 and 14). In all cases polymorphic sites included nucleotide positions and nucleotide states that are diagnostic between lineages L3 and L4 (Fig. 2.3.), strongly suggesting some form of introgression.

2.4.1 Characterization of polymorphism

To distinguish between BI heteroplasmy and Numts a strategy of increasing amplicon size was adopted. In the case of BI heteroplasmy a signature of heteroplasmy in sequence traces irrespective of amplicon size should be expected. Nevertheless, unless the complete mtDNA genome has been incorporated as a Numt, there should be an upper limit beyond which a Numt will not be amplified, and therefore no polymorphism should be detected. For the first step of this strategy the entire cytb gene (1143 bp) was amplified, which represents only a minor increase in amplicon size. For all samples which were previously shown to be polymorphic, homoplasmic sequences were now obtained. Chromatograms of the complete cytb sequences of all samples had no polymorphic nucleotide sites, indicating that the polymorphic signal arose from the incorporation of a mitochondrial pseudogene into the nucleus. The complete cytb sequences thus represent the authentic mtDNA and revealed that the mtDNA of all polymorphic samples belong to lineage L3.

The possibility of co-amplification of Numts during PCR is certainly determined by their abundance in the genome, but primer specificity also plays an important role (e.g. Arctander, 1995; Bensasson et al., 2001; Collura and Stewart, 1995). From the analysis of the complete L3 cytb sequences from polymorphic individuals it was found that both CYTBF and CYTBR primers have four nucleotide mismatches (Table 2.2.). To compare the specificity of these primers within the L4 lineage we amplified the entire cytb gene for a selection of individuals from this lineage. While the same single mismatches for CYTBF were present within the L4 lineage, the CYTBR primer was a substantially better match with only a single mismatch (Table 2.2.). The lower specificity of the CYTBR primer within the L3 lineage could facilitate preferential amplification of Numts, as has been noted in other studies (e.g. Sorenson and Quinn, 1998). To test this possibility we designed a new CYTBR primer to have no mismatches to the L3 lineage. Amplification of individuals that were polymorphic with the original primer pair resulted in the amplification of L3 mtDNA sequences with no evidence of polymorphism. Indeed we can identify a single nucleotide polymorphism in the CYTBR priming site within the L3 mtDNA genome that explains why some L3 samples co-amplified Numts and some did not (Table 2.2.).
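
Counting primer mismatches of the kind summarised in Table 2.2. amounts to a position-by-position comparison of the primer with its priming site. The sketch below is illustrative only: the CYTBR sequence is the one given in section 2.3.2, but the priming-site sequence is invented for the example and is not the L3 or L4 sequence, and the function name is hypothetical.

```python
def count_mismatches(primer: str, site: str) -> int:
    """Count mismatched positions between a primer and a priming site.

    Both sequences are assumed to be written 5'->3' on the same strand
    and to have the same length; spaces are ignored.
    """
    primer = primer.replace(" ", "").upper()
    site = site.replace(" ", "").upper()
    if len(primer) != len(site):
        raise ValueError("primer and priming site must have the same length")
    return sum(p != s for p, s in zip(primer, site))


# Primer sequence as given in section 2.3.2.
CYTBR = "AAA TAG GAA GTA TCA CTC TGG TTT"
# Hypothetical priming site, invented for illustration only.
EXAMPLE_SITE = "AAA TGG GAA ATA TCA TTC TGG CTT"

mismatches = count_mismatches(CYTBR, EXAMPLE_SITE)
```

A fuller treatment would weight mismatches by their distance from the 3’ end of the primer, which this sketch does not attempt.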

2.4.2. Characterization of Numts and intra-individual variation

One hundred and sixty-six clones were sequenced from 21 individuals. All clones sequenced within the 3 homoplasmic individuals (26 clones, samples B3, B5 and C1) represent either the authentic mtDNA sequences or sequences that differ from the mtDNA of each sample by 1 to 4 unique point mutations (Table 2.3.). These mutations can be attributed to Taq error, the rate of which has been estimated to vary between 1.1 × 10⁻⁴ (Tindall and Kunkel, 1988) and 2.0 × 10⁻⁴ (Saiki et al., 1988) errors per nucleotide per cycle, depending on PCR reaction conditions. According to these error rates we would expect to have between 2.4 and 4.4 errors in each amplified fragment.
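
The expected number of errors per fragment can be reproduced with a short calculation. The formula itself is not stated in the text; the sketch assumes that errors accumulate linearly, as error rate × fragment length × number of cycles, with the 627 bp fragment and 35 PCR cycles given in section 2.3.2. Under that assumption the result matches the range quoted above.

```python
def expected_taq_errors(rate: float, fragment_bp: int, cycles: int) -> float:
    """Expected errors per amplified fragment, assuming linear accumulation:
    errors per nucleotide per cycle x fragment length x number of cycles."""
    return rate * fragment_bp * cycles


FRAGMENT_BP = 627   # cytb fragment length (section 2.3.2)
CYCLES = 35         # PCR cycles (section 2.3.2)

low = expected_taq_errors(1.1e-4, FRAGMENT_BP, CYCLES)    # Tindall and Kunkel (1988)
high = expected_taq_errors(2.0e-4, FRAGMENT_BP, CYCLES)   # Saiki et al. (1988)

print(f"{low:.1f} to {high:.1f} expected errors per fragment")
```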

Amongst the 140 clones sequenced from 18 polymorphic individuals, 9 represent sequences from the mitochondrial genome of the cloned samples and 7 represent sequences that differ from the mtDNA sequence by fewer than 4 mutations. Those mutations were attributed to Taq and/or cloning errors. Among the remaining 124 clones, it was possible to detect 4 sequences (18 clones) that occur in more than one sample and thus are considered to represent four different Numts (Numts I to IV). The remaining 106 sequenced clones are either very similar to one of the Numt sequences, differing from those by 1 to 4 mutations (56 clones), or are consistent with recombinants (50 clones) between the different types of sequences present within each sample (Table 2.3.). As a conservative approach, and taking into account the above-mentioned Taq error rates, all mutations present in sequences that differ by fewer than 5 mutations from one of the described Numts will be considered potential Taq and/or cloning errors and thus will be ignored, unless the sequence occurs in two or more individuals. As in vitro recombination cannot be excluded as the origin of the recombinant sequences, and all but one of the recombinant sequences are found in single individuals, these were eliminated from further analysis. Furthermore, as it was possible to reproduce in vitro the single recombinant sequence occurring in more than one sample (3 clones in three different samples) by mixing the parental DNA sequences in a PCR (data not shown), this sequence was also eliminated from further analysis. Thus the sequences retained for further analysis are those that are of known origin from the mitochondrial genome, and those that can be definitively classified as Numts by their occurrence in two or more individuals with an origin that cannot be attributed to jumping PCR.

All Numts apart from Numt IV have open reading frames, suggesting a recent translocation to the nucleus. Pairwise comparisons of uncorrected sequence divergences (p-distance) within each sample were estimated using PAUP* version 4.0b10 (Swofford, 2002). The distances (uncorrected p-values) among the pseudogenes present in each sample and the corresponding mtDNA vary between 1.6% (Numt III) and 17.0% (Numt IV) (Table 2.3.). Numt I is the most common (54 out of 74 clones), occurring in 14 out of 18 cloned samples, and Numt II is the second most represented (9 clones in 7 samples). Numt III and IV occur in 3 (6 clones) and 5 (5 clones) samples respectively (Table 2.3.).
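
The uncorrected p-distance is the proportion of compared sites at which two aligned sequences differ. The sketch below illustrates the calculation only and is not PAUP*'s procedure: skipping sites with gaps or ambiguity codes is an assumption made here, and the toy sequences and function name are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguity code are
    skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not compared:
        raise ValueError("no comparable sites")
    differences = sum(a != b for a, b in compared)
    return differences / len(compared)


# Toy sequences, invented for illustration (not thesis data).
d = p_distance("ACGTACGTAC", "ACGTACGTAT")   # 1 difference in 10 sites
```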

2.4.3. Phylogeographic analysis

For the construction of networks 182 sequences were used (146 generated in this study and 36 sequences (GB1 to GB36, Table 2.1.) from lineages L3 and L4 from a previous study (Paulo et al., 2008)) that yielded 68 unique haplotypes. Of a total of 627 sites, 82 (13%) were variable, amongst which 48 (59%) were parsimony informative. Both methods used for network estimation (MJ and SP) resulted in a single network with the same topology and with two loops (Fig. 2.4.), which were easily resolved. Haplotype 1 can be identified as the most probable ancestral haplotype within the network based on the criteria of Crandall & Templeton (1993) and Posada & Crandall (2001). Haplotype 1 is centrally located within the network and is the most geographically widespread haplotype, occurring in eight sampling localities throughout Spain, Portugal and France. This is corroborated by a more extensive survey of genetic variation within L. lepida (chapter 3), which reveals that across the range of L. lepida haplotype 1 is the most widespread and frequently sampled haplotype, connected to numerous low frequency haplotypes, each of which typically has a restricted geographic distribution within the distribution of haplotype 1. The two mitochondrial lineages described by Paulo et al. (2008) (L3 and L4) are separated in the network by eight mutations. The more detailed sampling in this study has revealed a further lineage of haplotypes possessing a geographically distinct distribution from other haplotypes within lineage L4. This subgroup exhibits a geographically well-defined distribution within the region between the Tagus and Douro Rivers in central Portugal. This subgroup of phylogeographically distinct haplotypes will be referred to as lineage L5 (Fig. 2.5.).
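
Variable and parsimony-informative sites can be counted directly from an alignment. The sketch below uses the standard definition of a parsimony-informative site (at least two character states, each present in at least two sequences); the toy alignment and function name are hypothetical, and gaps or ambiguity codes are simply ignored. The last two lines reproduce the percentages quoted above from the reported counts.

```python
from collections import Counter


def site_summary(alignment: list[str]) -> tuple[int, int]:
    """Return (variable sites, parsimony-informative sites) for an alignment."""
    variable = 0
    informative = 0
    for column in zip(*alignment):
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) > 1:
            variable += 1
            # Informative: at least two states each seen in >= 2 sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative


# Toy alignment, invented for illustration (not thesis data).
toy_alignment = ["AAGTC", "AAGTT", "ACGCT", "ACGCA"]
variable_sites, informative_sites = site_summary(toy_alignment)

# Percentages from the counts reported in the text.
pct_variable = round(100 * 82 / 627)     # variable sites out of 627
pct_informative = round(100 * 48 / 82)   # informative out of variable
```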

2.4.3.1. L5 ancestral haplotypes and ancestral area

Within mtDNA lineage L5 it is possible to identify haplotype 13 as the most recent common ancestor (MRCA) of all sampled haplotypes. The MRCA and the closely related descendant haplotypes 14, 15 and 16 occur only in the southern limit of the distribution of lineage L5, near the Tagus River valley. In contrast the most derived haplotypes within lineage L5 (31, 32, 33, 34, 35 and 36) are nearly all (19 out of 20 samples) distributed in the northern limit of the lineage distribution, just south of the Douro River. The geographic distribution of haplotypes within lineage L5 suggests that the region around the Tagus River has most probably functioned as a refugial area for the lineage, followed by a range expansion northwards towards the Douro River valley.

2.4.3.2. L3 ancestral haplotypes and ancestral area

Lineage L3 occupies the entire region north of the Douro River in Portugal and the regions of Galicia and Asturias in Spain (Fig. 2.5.). This lineage is also found up to 20-30 km to the south of the Douro River. Within this lineage haplotype 52 is the MRCA of all sampled haplotypes. Both haplotype 52 and its immediate descendant, haplotype 46, are most frequent in the southern limit of the geographic range of L3: of the 35 samples that possess either haplotype 46 or 52, 29 (83%) are from southern sampling sites (south of the Douro River) with the remaining 6 (17%) from sampling sites at the centre of the lineage distribution (Geres (site 3), Parque Natural de Montesinho (site 4), Miranda do Douro (site 6) and Macedo de Cavaleiros (site 7); Fig. 2.2.). In contrast, haplotype 40 and its descendants (41, 42, 43, 44, 45, 50, 51, 57 and 58), all of which are derived from haplotype 46, are mostly found in more northerly sampling sites: of the 30 samples that possess one of these haplotypes, 9 (30%) are from southern populations, 9 (30%) are from populations at the centre of the distribution (Geres (site 3) and P. N. de Montesinho (sites 4 and 5)) and 12 (40%) are from populations at the northern limit of the lineage distribution (Galicia and Asturias, sites 1 and 2 respectively). This geographic pattern of a more northern distribution for derived haplotypes suggests an expansion from southern populations, where ancestral haplotypes are more frequent. All samples located to the south of the Douro River that belong to lineage L3 (apart from 2 samples that correspond to haplotype 57) represent either haplotype 46 or one of its immediate descendants, which are all represented in the network with a star-like genealogy. The remaining haplotypes forming this star-like genealogy (53, 54 and 55) are all found in the sampling sites of Macedo de Cavaleiros (site 7) and P. N. de Montesinho (site 5). This pattern suggests that populations located immediately to the south of the Douro River and the populations of P. N. de Montesinho and Macedo de Cavaleiros were formed by a range expansion, most probably from around the Douro River gorges. This implies that the range expansion from the refugial area was primarily northward, but also involved some range expansion to the south.

2.4.3.3. Contact zone between L3 and L5

Lineages L3 and L5 are geographically distinct, but fine scale sampling around the putative contact zone between both lineages has allowed for the identification of admixed populations, revealing the approximate location of a mtDNA contact zone (Fig. 2.5.). Within transect site C and the sampling sites of Castro Daire and Caramulo (sites 16 and 17, respectively) haplotypes from both mtDNA lineages occur in sympatry. The contact zone is located to the south of the Douro River from coastal Portugal to Spain and seems to extend further south in the westernmost part of its distribution, in coastal Portugal.

2.4.3.4. Phylogeographic relationships of Numts

In order to determine the number of mutational steps separating each Numt from any given haplotype of the lineages under study a network was constructed with all samples obtained in this study and the Numt sequences identified (Fig. 2.6.). The phylogenetic relatedness of Numts to mtDNA sequences suggests three different events for the incorporation of mtDNA fragments into the nuclear genome. Numts I and II are both derived from mtDNA lineage L5: Numt I corresponds to the sampled haplotype 37 and Numt II differs from I by only one mutation (Fig. 2.6.). The simplest explanation is that, rather than two independent translocations, Numt II is an allele derived from Numt I by a single point mutation, thus we refer to these as Numt I alleles “a” and “b”. Numt III is connected to an unsampled or extinct haplotype near the root of the network, and therefore has a separate mtDNA origin to Numt I. Numt IV is the most divergent Numt and could not be connected to the network under a 95% parsimony connection limit. Furthermore, it does not have an open reading frame, suggesting that it represents an older and independent translocation to the nucleus. Thus the identified Numts are descended from at least 3 independent transfers to the nucleus. The geographic range of each Numt was assessed by examining the sequence trace files for all seven polymorphic samples (Table 2.1.) found outside the contact zone. The presence of each Numt was established by subtracting the true mtDNA sequence, obtained by the amplification of the large amplicon, from the polymorphic sequence trace for each sample. Diagnostic sites between Numts and mtDNA sequences from lineage L3 can be seen in Table 2.4. and the geographic distribution of Numts is represented in Appendix 2.1. The minimum number of mutations separating Numt IV from lineage L3 is 107, thus its presence is easily recognizable from the polymorphic trace files. When Numt III is present together with Numt IV, the presence of Numt I cannot be detected with certainty as there are no independent diagnostic sites that distinguish it (unless allele “b” is present, which has a unique diagnostic site at bp 385; Table 2.4.). Numt IV is present in all seven samples examined and Numt III is present in five samples (BS38, BS39, BS41, BS42 and BS43). Numt I is present in at least two samples (BS40 and BS44). From the remaining five samples it was not possible to determine the presence of Numt I due to the presence of both Numt III and IV (and the absence of allele “b”).
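
The subtraction of the true mtDNA sequence from a polymorphic trace can be sketched as follows, assuming the trace has been base-called with IUPAC two-base ambiguity codes at double peaks. This representation, the toy sequences and the function name are assumptions made for illustration; the sketch resolves only two-state sites, so it would not separate two Numts co-amplified with the mtDNA at the same site.

```python
# IUPAC codes for single bases and two-base ambiguities.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
}


def subtract_mtdna(trace: str, mtdna: str) -> str:
    """Infer the secondary (non-mtDNA) base at each site of a polymorphic trace.

    '.' marks sites where the trace shows only the mtDNA base; '?' marks
    sites that cannot be resolved by simple subtraction.
    """
    if len(trace) != len(mtdna):
        raise ValueError("trace and mtDNA sequence must have the same length")
    secondary = []
    for call, base in zip(trace.upper(), mtdna.upper()):
        states = IUPAC.get(call, "")
        if states == base:
            secondary.append(".")
        elif len(states) == 2 and base in states:
            secondary.append(states.replace(base, ""))
        else:
            secondary.append("?")
    return "".join(secondary)


# Toy example, invented for illustration (not thesis data).
inferred = subtract_mtdna("ACRTYG", "ACATCG")
```

The bases recovered at the double-peak sites can then be compared with the diagnostic sites of each Numt.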

2.5. Discussion

2.5.1. Phylogeographic history of lineages L3 and L5

Recent phylogeographic studies within the Iberian Peninsula have revealed that a number of species have survived in several different allopatric refugia during glacial periods (see Gomez and Lunt, 2007 for a review), suggesting the existence of different refugia inside this peninsula, an idea first put forward by Cooper and Hewitt (1993). Our results conform to this scenario: in the northwestern part of its range, Lacerta lepida is structured into two divergent mtDNA lineages which have geographically distinct distributions, forming a zone of secondary contact. Although Lacerta lepida is typically considered a Mediterranean species, its presence in the northwestern part of Iberia, a region predominantly influenced by a temperate climate, indicates that populations of this species could have survived during glacial periods in refugia dominated by deciduous forests, common in temperate regions. Indeed the geographic ranges of the L3 and L5 lineages suggest this to be the case. Thus the phylogeographic patterns of Lacerta lepida L3 and L5 lineages can be compared with Iberian species that present a typically Atlantic distribution, restricted to the northwestern part of the Peninsula. These Atlantic species show remarkably concordant phylogeographic patterns characterized by the presence of northwestern and southeastern lineages or sister-species (e.g. Discoglossus galganoi (Martínez-Solano, 2004); Chioglossa lusitanica (Alexandrino et al., 2000; Alexandrino et al., 2002); Lacerta schreiberi (Paulo et al., 2001; Paulo et al., 2002); Lissotriton boscai (Martínez-Solano et al., 2006); Podarcis spp. (Pinho et al., 2007)) which are likely to have diverged in allopatry in northern and southern refugia, although sometimes over very different temporal periods. The northern refugia for some of the above-mentioned species have been inferred to be concentrated within or near the relicts of temperate forests during the last glacial maximum (LGM) (Zagwijn, 1992), probably somewhere to the west of “Serra da Estrela” in central Portugal (Alexandrino et al., 2000; Martínez-Solano et al., 2006; Paulo et al., 2001; Paulo et al., 2002). In the case of Lacerta lepida, the southern L5 lineage seems to extend further north than the southern lineages within most of the species mentioned above.
In addition to this, the more northerly southern limit for the L3 lineage suggests a refugium for this lineage further north than has been inferred for other species, most probably in the inland gorges of the Douro River. Interestingly it is in the same region (the “Duero Arribes”) that white oaks are also suggested to have had probable refugia during the full glacial maxima (Olalde et al., 2002). The geographic distributions of haplotypes within both Lacerta lepida lineages are represented by ancestral haplotypes predominating at the southern distribution limit of each lineage, and derived haplotypes distributed more to the north. This replicated pattern in both lineages is consistent with the hypothesis of range expansions from southern refugia: the Douro River valley for lineage L3 and the Tagus River valley for lineage L5. The geographic mosaic of habitats generated by the complex topographies of these regions has likely allowed the persistence of populations during adverse climatic conditions, with suitable conditions persisting along the deep gorges of both rivers towards the eastern part of Portugal. From these refugial areas, both lineages are inferred to have expanded their ranges when climatic conditions permitted, meeting at what is now a zone of secondary contact. Because climate amelioration most likely occurred earlier at southern latitudes, the northward range expansion of lineage L5 is expected to have been initiated earlier than the expansion of lineage L3. This is supported by the geographic distribution of the L5-derived Numt I, which provides further detail about the establishment of the secondary contact zone between both lineages (see below).

2.5.2. Origin of the polymorphism

From the strategy of increasing amplicon size it was possible to exclude the hypothesis of BI heteroplasmy in Lacerta lepida. Beyond this there are a number of other potential explanations for the patterns observed, and these will be evaluated in turn below. The first objective is to identify the genomic location (mitochondrial versus nuclear) of the extra cytb sequences responsible for the polymorphic signal and then explore the possible explanations for their origin and prevalence in L3.

Intra-mitochondrial gene duplication

The scenario of intra-mitochondrial gene duplication can be excluded for the sequence classified as “Numt IV” as this sequence does not have an open reading frame, and thus it would have to represent a non-functional gene in the mitochondrial genome. The other cytb pseudogene sequences would also require duplicated genes within the mitochondrial genome to have evolved in different ways. This is highly improbable, particularly if we consider “Numt I” in lineage L3. In this case, intra-mitochondrial gene duplication would have required one of the copies to remain unaltered with the other copy undergoing convergent mutations for lineage L5. The scenario of intra-mitochondrial gene duplication would also imply all individuals in lineage L3 to be polymorphic, with two different copies of the cytb gene, which is not the case. Additionally, intra-mitochondrial gene duplication would result in two differently sized PCR products, and this was not observed. Thus intra-mitochondrial gene duplication can be excluded as the source of the detected polymorphism.

Horizontal Gene Transfer (HGT) between mitochondrial genomes

With regard to “Numt I”, the fact that this extra cytb sequence is identical to a mitochondrial haplotype of lineage L5 could suggest that it was horizontally transferred from that genome into the mitochondrial genome of L3, through an external vector. Within multicellular eukaryotes, mitochondrial gene exchange through HGT has been suggested to occur in higher plants (Bergthorsson et al., 2003; Bergthorsson et al., 2004) and, although recently suggested as a possible explanation of data in bruchid beetles (Alvarez et al., 2006), it has not yet been reported to occur in animals. Thus HGT between mitochondrial genomes must be seen as a highly implausible explanation for the patterns observed within this study.

Transfer to the nuclear genome

The data presented in this study are clearly inconsistent with a mitochondrial location for all the cytb sequences identified. Therefore the polymorphism detected in some individuals must be generated by the existence of cytb sequences within the nuclear genome of Lacerta lepida. Intergenomic transfer of mtDNA fragments into the nuclear genome is a widespread phenomenon reported for a great number of taxa (Zhang and Hewitt, 1996), and is the most plausible explanation for the polymorphism observed in lineage L3. The phylogenetic relationships among the three Numts and the mtDNA sequences provide some information about where and/or when some of these translocations occurred. It is important to note that the detection of Numts only in lineage L3 might be a function of primer specificity, and Numts most likely also exist within individuals belonging to lineage L5. In fact, Numt I most probably originated within the geographic range of lineage L5, with subsequent introgression into the nuclear genome of the L3 lineage through hybridization and backcrossing when the ranges of these two mtDNA lineages came into secondary contact. Numt III, on the other hand, is phylogenetically closer to the root of the network (Fig. 2.6) and thus may be the result of an older translocation. The very high divergence between Numt IV and all mtDNA sequences indicates that this translocation is a much older event, predating the divergence events that gave rise to the mtDNA genetic diversity sampled within this study.

2.5.3. Phylogeographic utility of Numts

Once correctly identified, Numts can be used as important tools in evolutionary biology, providing a unique window on past evolutionary events (reviewed in Bensasson et al., 2001; Zhang and Hewitt, 1996). The geographic distribution of Numt I within lineage L3 provides valuable information regarding the demographic history of the two mtDNA phylogeographic lineages under study, complementing information obtained through the analysis of mtDNA sequence data. The data presented provide strong evidence for the introgression of a sequence originating from the mitochondrial genome of lineage L5 into the nuclear gene pool of lineage L3. The fact that Numt I is found as far north as Geres (sampling point 3, Fig. 2.2; Appendix 2.1), which is approximately 120 km north of the contact zone, suggests that admixture between the two lineages was established before, or coincident with, the northward expansion of lineage L3. This supports the hypothesis of an earlier, climatically mediated, northward range expansion of lineage L5 from the more southerly refugia. The zone of secondary contact between lineages L3 and L5 is therefore consistent with the leading edge of lineage L5 expanding north and contacting lineage L3 in the vicinity of the L3 refugia, prior to or coincident with the northward range expansion of L3.

2.6. Conclusion


This study provides a detailed understanding of the recent demographic history of Lacerta lepida in the north-western part of the Iberian Peninsula. The data presented here suggest that the two L. lepida mtDNA lineages L3 and L5 diverged in allopatry in two different refugia. A secondary contact zone located to the south of the Douro River valley was formed between the two lineages as a consequence of a range expansion, predominantly in lineage L5. It was shown that the polymorphism previously detected within lineage L3 is caused by the presence of at least three different Numts. Through the phylogeographic analysis of Numts within lineage L3 it was possible to conclude that genetic exchange between the lineages occurred at the time of secondary contact and before, or coincident with, the northward expansion of lineage L3. In addition to the many evolutionary features of Numts (see Bensasson et al., 2001), this study has shown that in the context of phylogeographic analysis Numts can provide evidence for past demographic events. This is an exciting prospect for the field of phylogeography.


Table 2.1. Sampled localities with name, site number and country of origin. For each sampling site the total number of samples collected, the respective sample labels and the cytochrome b haplotypes detected are presented. Polymorphic samples are denoted by bold underline font.

| Site | Location | Country | No Samples | Sample labels | Haplotype |
|---|---|---|---|---|---|
| 1 | Galicia | Spain | 11 | BS49, BS50, BS51, BS52, BS53, BS54, BS55, BS56, BS57, GB1, GB2 | 41, 43 |
| 2 | Asturias | Spain | 1 | GB3 | 42 |
| 3 | Geres | Portugal | 10 | BS34, BS40, BS45, BS46, BS48, BS61, BS62, BS63, GB4, GB5 | 41, 44, 45, 46, 50, 51, 52, 58 |
| 4 | Parque Natural de Montesinho (West) | Portugal | 3 | BS47, BS58, GB6 | 41, 46, 48 |
| 5 | Parque Natural de Montesinho (East) | Portugal | 3 | BS59, BS60, GB7 | 42, 54, 55 |
| 6 | Miranda do Douro | Portugal | 1 | BS43 | 46 |
| 7 | Macedo de Cavaleiros | Portugal | 2 | BS39, BS42 | 46, 53 |
| 8 | Foz Coa (North) | Portugal | 1 | BS41 | 49 |
| 9 | Foz Coa | Portugal | 1 | BS31 | 65 |
| 10 | Penedono | Portugal | 1 | BS44 | 40 |
| 11 | Foz Coa (South) | Portugal | 1 | GB8 | 47 |
| 12 | Fig. Castelo Rodrigo | Portugal | 3 | BS14, BS32, BS33 | 27, 63, 64 |
| 13 | Pinhel | Portugal | 1 | BS13 | 27 |
| 14 | Satao | Portugal | 1 | BS38 | 56 |
| 15 | Serra Liomil | Portugal | 1 | GB9 | 40 |
| 16 | Castro Daire | Portugal | 4 | BS22, BS26, BS35, BS37 | 17, 21, 46, 48 |
| 17 | Serra do Caramulo | Portugal | 3 | BS9, BS23, BS36 | 19, 32, 46 |
| 18 | Celorico da Beira | Portugal | 1 | BS8 | 35 |
| 19 | Lousa | Portugal | 1 | BS21 | 22 |
| 20 | Serra da Estrela | Portugal | 8 | BS11, BS12, BS15, BS16, BS20, BS25, GB10, GB11 | 17, 23, 26, 29, 30 |
| 21 | Sabugal | Portugal | 1 | BS19 | 24 |
| 22 | Nisa | Portugal | 3 | BS1, BS24, BS17 | 1, 17, 25 |
| 23 | Serra Sao Mamede | Portugal | 2 | BS18, GB12 | 25 |
| 24 | Monforte | Portugal | 1 | BS30 | 13 |
| 25 | Paul Boquilobo | Portugal | 1 | GB13 | 18 |
| 26 | Coruche | Portugal | 1 | BS28 | 15 |
| 27 | Alcochete | Portugal | 1 | GB14 | 7 |
| 28 | Samarra | Portugal | 3 | BS29, GB15, GB16 | 14, 17, 20 |
| 29 | Peniche | Portugal | 4 | BS10, GB17, GB18, GB19 | 17, 30, 33 |
| 30 | Arrabida | Portugal | 1 | GB20 | 4 |
| 31 | Pegoes | Portugal | 1 | BS7 | 1 |
| 32 | Evora | Portugal | 1 | GB21 | 2 |
| 33 | Elvas | Portugal | 2 | BS2, BS27 | 1, 16 |
| 34 | Moura | Portugal | 1 | GB22 | 6 |
| 35 | Huelva | Spain | 1 | GB23 | 10 |
| 36 | Arcos de la Frontera | Spain | 1 | GB24 | 68 |
| 37 | Ronda | Spain | 1 | GB25 | 67 |
| 38 | Cordoba | Spain | 1 | GB26 | 9 |
| 39 | Sierra Morena | Spain | 1 | GB27 | 8 |
| 40 | Sierra Madrona | Spain | 2 | GB28, GB29 | 3, 5 |
| 41 | Pontes Rodrigo | Spain | 1 | GB30 | 1 |
| 42 | Montes Toledo | Spain | 2 | GB31, GB32 | 11, 12 |
| 43 | Navahermosa | Spain | 2 | BS4, GB33 | 1 |
| 44 | Soria | Spain | 1 | GB34 | 1 |
| 45 | Crau | France | 2 | BS3, BS35 | 1 |
| 46 | Oleron | France | 3 | BS5, BS6, GB36 | 1 |
| A | Peso da Regua | Portugal | 7 | A1, A2, A3, A4, A5, A6, A7 | 46, 57, 59 |
| B | Nagosa | Portugal | 19 | B1 to B6, B7 to B19 | 46, 66 |
| C | Penedono (South) | Portugal | 20 | C1, C2, C3 to C20 | 40, 46, 52, 60, 61, 62 |
| D | Vila Franca das Naves | Portugal | 14 | D1 to D14 | 25, 28, 31, 36, 39 |
| E | Lamegal | Portugal | 24 | E1 to E24 | 25, 27, 31, 34, 37, 38, 39, 63 |


Table 2.2. Primer sequences for the amplification of a fragment of the cytb gene and their specificity to Lacerta lepida mitochondrial lineages L3 and L4. Dots represent nucleotide matches with the primer sequences.

| Primer | Lineage | Sequence (5' - 3') |
|---|---|---|
| CYTBF | | C C A T C C A A C A T C T C A G C A T G A T G A A A |
| | L3 | . . . C . A ...... C . . T ...... |
| | L4 | . . . C . A ...... C . . T ...... |
| CYTBR | | A A A T A G G A A G T A T C A C* T C T G G T T T |
| | L3 | . . . G . . . . . A . . C . . Y ...... |
| | L4 | . . . G ...... |


Table 2.3. Number of clones sequenced per sample and identification of each type of sequence obtained (mitochondrial DNA and/or Numts). Sample codes are the same as in Table 2.1. Bold italic sample codes represent control homoplasmic samples. For each sample the mtDNA haplotype is identified (mtDNA Hap.) and the divergence between the Numts and the mtDNA is shown (Divergence (%)).

Sample; No Clones; mtDNA clones; Numt clones (Numt I a, Numt II b, Numt III, Numt IV); Recombinants; mtDNA Hap.; Divergence (%)
A7 8 4 1 2 57 2.1 - 17.0
B8 9 6 3 46 2.2
B10 7 4 3 46 2.2
B12 8 4 1 3 46 2.2 - 17.0
B13 7 3 1 3 46 2.2 - 2.4
B7 8 5 1 2 46 2.2 - 2.4
B18 10 5 5 66 n.a.
B19 8 5 2 1 46 2.2 - 2.4
C3 8 2 1 1 4 40 2.6 - 16.6
C4 8 5 1 2 40 2.6 - 2.7
C5 6 4 2 46 2.2
C6 8 5 1 2 ?c ?c
C7 8 5 2 1 46 2.2 - 2.4
C8 8 2 1 5 40 2.6
C9 8 2 1 1 4 62 2.6 - 17.0
C10 8 3 1 1 3 46 1.8 - 17.0
B11 7 4 3 46 2.2
C16 6 4 2 ?c ?c
B3 10 10 46 n.a.
B5 7 7 66 n.a.
C1 9 9 25 n.a.
Total 166 42 54 9 5 5 50


Table 2.4. Diagnostic sites between each Numt identified in Lacerta lepida mitochondrial lineage L3. Dots represent nucleotide matches to Lineage L3. Light grey shaded base pairs represent the diagnostic sites for Numt II b and Numt III (see text for detailed explanation).

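| bp | 3 | 21 | 33 | 72 | 78 | 171 | 243 | 264 | 348 | 385 | 391 | 442 | 443 | 507 | 510 | 541 | 567 | 577 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| L3 | A | G | C | T | C | C | T | C | C | G | T | G | C | C | C | A | G | G |
| Numt I a | T | A | T | C | T | . | C | . | T | . | C | A | . | T | T | T | A | A |
| Numt II b | T | A | T | C | T | . | C | . | T | A | C | A | . | T | T | T | A | A |
| Numt III | T | A | T | C | T | T | . | T | . | . | C | . | T | . | T | T | . | . |
| Numt IV c | T | . | . | C | . | T | C | T | T | . | C | A | . | T | . | C | A | A |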
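As an illustration of how Table 2.4 can be read, the sketch below expands the dot notation into full character states and lists the tabulated sites at which each Numt differs from lineage L3. The states are transcribed from Table 2.4; the function and variable names are hypothetical. The counts refer only to the 18 tabulated diagnostic sites and are not the sequence divergences reported in Table 2.3.

```python
# Diagnostic positions (bp) and character states transcribed from Table 2.4.
# A "." means the state matches the lineage L3 reference at that position.
POSITIONS = [3, 21, 33, 72, 78, 171, 243, 264, 348,
             385, 391, 442, 443, 507, 510, 541, 567, 577]
L3 = "AGCTCCTCCGTGCCCAGG"
NUMTS = {
    "Numt I": "TATCT.C.T.CA.TTTAA",
    "Numt II": "TATCT.C.TACA.TTTAA",
    "Numt III": "TATCTT.T..C.T.TT..",
    "Numt IV": "T..C.TCTT.CA.T.CAA",
}


def expand(reference, dotted):
    """Replace each '.' with the reference state at the same position."""
    return "".join(r if d == "." else d for r, d in zip(reference, dotted))


def differing_sites(a, b):
    """Return the bp positions at which two expanded sequences differ."""
    return [pos for pos, x, y in zip(POSITIONS, a, b) if x != y]


# Expanded states for every Numt, keyed by name.
full = {name: expand(L3, dotted) for name, dotted in NUMTS.items()}

for name, seq in full.items():
    sites = differing_sites(L3, seq)
    print(name, len(sites), sites)

# Pairwise comparison between the two Numt I alleles.
print(differing_sites(full["Numt I"], full["Numt II"]))
```

Under this transcription, Numt I and Numt II differ only at position 385, which is consistent with their treatment as alleles “a” and “b” of the same Numt (Fig. 2.6).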

[Fig. 2.1 diagram. Columns: Parental forms, Hybridization, F1. Panel a) Biparental Inheritance (BI) heteroplasmy through paternal leakage (lineage A sperm, lineage B egg). Panel b) Numts (Numt carrier from lineage A, lineage B egg).]

Fig. 2.1 (a) Biparental Inheritance (BI) heteroplasmy - heteroplasmy is generated through paternal leakage at the time of fertilization. Hybridization of individuals from two different mitochondrial lineages results in F1 carrying two types of mitochondrial DNA. (b) Numts - incorporation of mitochondrial DNA in the nuclear genome of lineage A. When Numt carrying males from lineage A hybridize with females from lineage B, F1 will be polymorphic for the transferred mitochondrial fragments, as they harbour the complete mitochondrial genome from lineage B (through maternal inheritance of mtDNA) and fragments of mitochondrial DNA from lineage A as Numts. Somatic cells are represented as pink squares. Within each cell, mitochondria are shown as ellipses and mitochondrial DNA is represented as coloured ellipses inside mitochondria. Blue mtDNA represents mitochondrial lineage A while red mtDNA represents mitochondrial lineage B. Nuclei are represented as white circles within each cell with nuclear genomes shown as green helices.

[Fig. 2.2 maps. Panel a) Broad scale sampling; panel b) Fine scale sampling. Legend: mitochondrial lineages, broad scale sampling points and transect sampling points. Labelled countries: Portugal, Spain, France, Italy, Morocco and Algeria.]

Fig. 2.2. Map of the Iberian Peninsula, southern France and north-western Italy showing the distribution of Lacerta lepida mitochondrial lineages as described in Paulo et al. (2008). Numbers represent broad scale sampling sites and letters in b) represent transect sampling sites regarding the fine scale sampling. Sampling site numbers are the same as in Table 2.1.

Base pair (bp) 3 21 33 72 78 243 391 442 507 510 541 577

Lineage L3 A G C T C T T G C C A G

Lineage L5 T A T C T C C A T T T A

Fig. 2.3. Polymorphic sites at cytochrome b gene between Lacerta lepida mitochondrial lineage L3 and L5 and respective polymorphic sequence trace files.

[Fig. 2.4 network diagram: numbered cytochrome b haplotypes grouped into lineages L3, L4 and L5.]

Fig. 2.4. Statistical Parsimony network of Lacerta lepida cytochrome b haplotypes. Haplotype colours are the same as used in Fig. 2.5 to describe the distribution area of each lineage. Dashed lines represent ambiguities in the network. White circles with no numbers represent unsampled or extinct haplotypes.

[Fig. 2.5 diagram: panels a) and b). Labels include lineages L1 to L5, the Douro and Tagus rivers, sampling sites as in Table 2.1 and haplotype 1 (MRCA).]

Fig. 2.5. a) Geographic distribution of ancestral and derived haplotypes within each Lacerta lepida lineage (L3 and L5). Pie charts represent the proportion of ancestral and derived haplotypes from each lineage found in each sampling site. Ancestral and derived haplotypes from each lineage are represented by different colours: red represents L5 ancestral haplotypes and bright green represents L3 ancestral haplotypes. Derived haplotypes are coloured in light orange for lineage L5 and dark green for lineage L3. Numbers and letters inside pie charts represent sampling sites as in Table 2.1. b) Statistical parsimony network reduced to show haplotypes from lineages L3 and L5 and the most recent common ancestor (MRCA) of both lineages only.

[Fig. 2.6 network diagram: numbered cytochrome b haplotypes grouped into lineages L3, L4 and L5, with Numts I a, II b and III marked.]

Fig. 2.6. Statistical Parsimony network of cytochrome b haplotypes and Numts I, II and III. Numts are represented as red circles, while cytochrome b haplotypes are represented as in Fig. 2.4. (a) Numt I is also mentioned in the text as Numt I allele “a”. (b) Numt II is also mentioned in the text as Numt I allele “b”.


2.7. References

Alexandrino J, Arntzen JW, Ferrand N (2002) Nested clade analysis and the genetic evidence for population expansion in the phylogeography of the golden-striped salamander, Chioglossa lusitanica (Amphibia: Urodela). Heredity 88, 66-74.

Alexandrino J, Froufe E, Arntzen JW, Ferrand N (2000) Genetic subdivision, glacial refugia and postglacial recolonization in the golden-striped salamander, Chioglossa lusitanica (Amphibia: Urodela). Molecular Ecology 9, 771-781.

Aljanabi SM, Martinez I (1997) Universal and rapid salt-extraction of high quality genomic DNA for PCR-based techniques. Nucleic Acids Research 25, 4692-4693.

Alvarez N, Benrey B, Hossaert-McKey M, Grill A, McKey D, Galtier N (2006) Phylogeographic support for horizontal gene transfer involving sympatric bruchid species. Biology Direct 1, 21.

Arctander P (1995) Comparison of a Mitochondrial Gene and a corresponding Nuclear Pseudogene. Proceedings of the Royal Society B: Biological Sciences 262, 13-19.

Bandelt HJ, Forster P, Rohl A (1999) Median-joining networks for inferring intraspecific phylogenies. Molecular Biology and Evolution 16, 37-48.

Bensasson D, Zhang D-X, Hartl DL, Hewitt GM (2001) Mitochondrial pseudogenes: evolution's misplaced witnesses. Trends in Ecology & Evolution 16, 314-321.

Bergthorsson U, Adams KL, Thomason B, Palmer JD (2003) Widespread horizontal transfer of mitochondrial genes in flowering plants. Nature 424, 197-201.

Bergthorsson U, Richardson AO, Young GJ, Goertzen LR, Palmer JD (2004) Massive horizontal transfer of mitochondrial genes from diverse land plant donors to the basal angiosperm Amborella. Proceedings of the National Academy of Sciences of the United States of America 101, 17747-17752.

Birky C (1995) Uniparental inheritance of mitochondrial and chloroplast genes: mechanisms and evolution. Proceedings of the National Academy of Sciences 92, 11331-11338.

Birky CW (2001) The inheritance of genes in mitochondria and chloroplasts: Laws, Mechanisms, and Models. Annual Review of Genetics 35, 125-148.


Brown WM, Prager EM, Wang A, Wilson AC (1982) Mitochondrial DNA sequences of primates: Tempo and mode of evolution. Journal of Molecular Evolution 18, 225-239.

Burbrink FT, Lawson R, Slowinski JB (2000) Mitochondrial DNA Phylogeography of the Polytypic North American Rat Snake (Elaphe obsoleta): A Critique of the Subspecies Concept. Evolution 54, 2107-2118.

Cassens I, Mardulyn P, Milinkovitch MC (2005) Evaluating intraspecific network construction methods using simulated sequence data: do existing algorithms outperform the global Maximum Parsimony approach? Systematic Biology 54, 363-372.

Ciborowski KL, Consuegra S, Garcia de Leániz C, Beaumont MA, Wang J, Jordan WC (2007) Rare and fleeting: an example of interspecific recombination in animal mitochondrial DNA. Biology Letters 3, 554-557.

Clement M, Posada D, Crandall KA (2000) TCS: a computer program to estimate gene genealogies. Molecular Ecology 9, 1657-1659.

Collura RV, Stewart C (1995) Insertions and duplications of mtDNA in the nuclear genomes of Old World monkeys and hominoids. Nature 378, 485-489.

Cooper SJB, Hewitt GM (1993) Nuclear DNA sequence divergence between parapatric subspecies of the grasshopper Chorthippus parallelus. Insect Molecular Biology 2, 185-194.

Crandall KA, Templeton AR (1993) Empirical tests of some predictions from coalescent theory with applications to intraspecific phylogeny reconstruction. Genetics 134, 959-969.

Fontaine KM, Cooley JR, Simon C (2007) Evidence for paternal leakage in hybrid periodical cicadas (Hemiptera: Magicicada spp.). PLoS ONE 2, e892.

Fukuda M, Wakasugi S, Tsuzuki T (1985) Mitochondrial DNA-like sequences in the human nuclear genome. Characterization and implications in the evolution of mitochondrial DNA. Journal of Molecular Biology 186, 257-266.

Gomez A, Lunt DH (2007) Refugia within refugia: patterns of phylogeographic concordance in the Iberian Peninsula. In: Phylogeography of Southern European Refugia (eds. Weiss S, Ferrand N). Springer, Dordrecht.

Gyllensten U, Wharton D, Josefsson A, Wilson AC (1991) Paternal inheritance of mitochondrial DNA in mice. Nature 352, 255-257.


Hall TA (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows95/98/NT. Nucleic Acids Symposium Series 41, 95-98.

Hoarau G, Holla S, Lescasse R, Stam WT, Olsen JL (2002) Heteroplasmy and Evidence for Recombination in the Mitochondrial Control Region of the Flatfish Platichthys flesus. Molecular Biology and Evolution 19, 2261-2264.

Kaneda H, Hayashi J, Takahama S, Taya C, Lindahl K, Yonekawa H (1995) Elimination of paternal mitochondrial DNA in intraspecific crosses during early mouse embryogenesis. Proceedings of the National Academy of Sciences 92, 4542-4546.

Kocher TD, Thomas WK, Meyer A, Edwards SV, Paabo S, Villablanca FX, Wilson AC (1989) Dynamics of Mitochondrial-DNA Evolution in Animals - Amplification and Sequencing with Conserved Primers. Proceedings of the National Academy of Sciences of the United States of America 86, 6196-6200.

Kondo R, Satta Y, Matsuura ET, Ishiwa H, Takahata N, Chigusa SI (1990) Incomplete Maternal Transmission of Mitochondrial-DNA in Drosophila. Genetics 126, 657-663.

Kraytsberg Y, Schwartz M, Brown TA, Ebralidse K, Kunz WS, Clayton DA, Vissing J, Khrapko K (2004) Recombination of human mitochondrial DNA. Science 304, 981-981.

Kvist L, Martens J, Nazarenko AA, Orell M (2003) Paternal leakage of mitochondrial DNA in the great tit (Parus major). Molecular Biology and Evolution 20, 243-247.

Lopez JV, Yuhki N, Masuda R, Modi W, O'Brien SJ (1994) Numt, a recent transfer and tandem amplification of mitochondrial DNA to the nuclear genome of the domestic cat. Journal of Molecular Evolution 39, 174-190.

Lu X-M, Fu Y-X, Zhang Y-P (2002) Evolution of mitochondrial cytochrome b pseudogene in genus Nycticebus. Molecular Biology and Evolution 19, 2337-2341.

Magoulas A, Zouros E (1993) Restriction-site heteroplasmy in Anchovy (Engraulis encrasicolus) indicates incidental biparental inheritance of mitochondrial DNA. Molecular Biology and Evolution 10, 319-325.

Martínez-Solano I (2004) Phylogeography of Iberian Discoglossus (Anura: Discoglossidae). Journal of Zoological Systematics & Evolutionary Research 42, 298-305.


Martínez-Solano I, Teixeira J, Buckley D, Garcia-Paris M (2006) Mitochondrial DNA phylogeography of Lissotriton boscai (Caudata, Salamandridae): evidence for old, multiple refugia in an Iberian endemic. Molecular Ecology 15, 3375-3388.

Meusel MS, Moritz RFA (1993) Transfer of paternal mitochondrial DNA during fertilization of honeybee (Apis mellifera L.) eggs. Current Genetics 24, 539-543.

Moritz C, Schneider CJ, Wake DB (1992) Evolutionary relationships within the Ensatina-Eschscholtzii complex confirm the ring species interpretation. Systematic Biology 41, 273-291.

Olalde M, Herrán A, Espinel S, Goicoechea PG (2002) White oaks phylogeography in the Iberian Peninsula. Forest Ecology and Management 156, 89-102.

Paulo OS, Dias C, Bruford MW, Jordan WC, Nichols RA (2001) The persistence of Pliocene populations through the Pleistocene climatic cycles: evidence from the phylogeography of an Iberian lizard. Proceedings of the Royal Society B: Biological Sciences 268, 1625-1630.

Paulo OS, Jordan WC, Bruford MW, Nichols RA (2002) Using nested clade analysis to assess the history of colonization and the persistence of populations of an Iberian Lizard. Molecular Ecology 11, 809-819.

Paulo OS, Pinheiro J, Miraldo A, Bruford MW, Jordan WC, Nichols RA (2008) The role of vicariance vs. dispersal in shaping genetic patterns in ocellated lizard species in the western Mediterranean. Molecular Ecology 17, 1535-1551.

Pinho C, Harris DJ, Ferrand N (2007) Contrasting patterns of population subdivision and historical demography in three western Mediterranean lizard species inferred from mitochondrial DNA variation. Molecular Ecology 16, 1191-1205.

Posada D, Crandall KA (2001) Intraspecific gene genealogies: trees grafting into networks. Trends in Ecology & Evolution 16, 37-45.

Richly E, Leister D (2004) NUMTs in sequenced Eukaryotic genomes. Molecular Biology and Evolution 21, 1081-1084.

Saiki RK, Gelfand DH, Stoffel S, Scharf SJ, Higuchi R, Horn GT, Mullis KB, Erlich HA (1988) Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239, 487-491.

Schwartz M, Vissing J (2002) Paternal inheritance of mtDNA in a patient with mitochondrial myopathy. European Journal of Human Genetics 10, 239-239.

Sherengul W, Kondo R, Matsuura ET (2006) Analysis of paternal transmission of mitochondrial DNA in Drosophila. Genes and Genetic Systems 81, 399-404.


Shitara H, Hayashi J, Takahama S, Kaneda H, Yonekawa H (1998) Maternal inheritance of mouse mtDNA in interspecific hybrids: Segregation of the leaked paternal mtDNA followed by the prevention of subsequent paternal leakage. Genetics 148, 851-857.

Skibinski DOF, Gallagher C, Beynon CM (1994) Sex-limited mitochondrial DNA transmission in the marine mussel Mytilus edulis. Genetics 138, 801-809.

Smith MF, Thomas WK, Patton JL (1992) Mitochondrial DNA-like sequence in the nuclear genome of an akodontine rodent. Molecular Biology and Evolution 9, 204-215.

Sorenson MD, Quinn TW (1998) Numts: a challenge for avian systematics and population biology. Auk 115, 214-221.

Steinborn R, Zakhartchenko V, Jelyazkov J, Klein D, Wolf E, Müller M, Brem G (1998) Composition of parental mitochondrial DNA in cloned bovine embryos. FEBS Letters 426, 352-356.

Sunnucks P, Hales DF (1996) Numerous transposed sequences of mitochondrial cytochrome oxidase I-II in aphids of the genus Sitobion (Hemiptera: Aphididae). Molecular Biology and Evolution 13, 510-524.

Sutovsky P, Moreno RD, Ramalho-Santos J, Dominko T, Simerly C, Schatten G (1999) Development: Ubiquitin tag for sperm mitochondria. Nature 402, 371-372.

Sutovsky P, Moreno RD, Ramalho-Santos J, Dominko T, Simerly C, Schatten G (2000) Ubiquitinated sperm mitochondria, selective proteolysis, and the regulation of mitochondrial inheritance in mammalian embryos. Biology of Reproduction 63, 582-590.

Sutovsky P, Van Leyen K, McCauley T, Day BN, Sutovsky M (2004) Degradation of paternal mitochondria after fertilization: implications for heteroplasmy, assisted reproductive technologies and mtDNA inheritance. Reproductive Biomedicine Online 8, 24-33.

Swofford DL (2002) PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Sinauer Associates, Sunderland, Massachusetts.

Templeton AR, Crandall KA, Sing CF (1992) A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III. Cladogram estimation. Genetics 132, 619-633.


Tindall KR, Kunkel TA (1988) Fidelity of DNA synthesis by the Thermus aquaticus DNA polymerase. Biochemistry 27, 6008-6013.

Ujvari B, Dowton M, Madsen T (2007) Mitochondrial DNA recombination in a free-ranging Australian lizard. Biology Letters 3, 189-192.

Van Leeuwen T, Vanholme B, Van Pottelberge S, Van Nieuwenhuyse P, Nauen R, Tirry L, Denholm I (2008) Mitochondrial heteroplasmy and the evolution of insecticide resistance: Non-Mendelian inheritance in action. Proceedings of the National Academy of Sciences 105, 5980-5985.

Zagwijn WH (1992) Migration of vegetation during the Quaternary in Europe. Courier Forschungsinstitut Senckenberg 153, 9-20.

Zhang D-X, Hewitt GM (1996) Nuclear integrations: challenges for mitochondrial DNA markers. Trends in Ecology & Evolution 11, 247-251.

Zhao X, Li N, Guo W, Hu X, Liu Z, Gong G, Wang A, Feng J, Wu C (2004) Further evidence for paternal inheritance of mitochondrial DNA in the sheep (Ovis aries). Heredity 93, 399-403.

Zischler H, Geiser H, Haeseler A, Paabo S (1995) A nuclear "fossil" of the mitochondrial D-loop and the origin of modern humans. Nature 378, 489-492.

Zouros E, Ball A, Saavedra C, Freeman K (1994) An unusual type of mitochondrial DNA inheritance in the blue mussel Mytilus. Proceedings of the National Academy of Sciences 91, 7463-7467.

[Appendix 2.1 maps. Panel a) legend: lineages L1 to L5 and N, sampling points without Numts and sampling points with Numts. Panel b) labels sampling sites 3, 6, 7, 8, 10, 14, A, B and C with the Numt types detected (I, III, IV).]

Appendix 2.1. Geographic distribution of Numts. a) Distribution of Lacerta lepida mtDNA lineages, represented by different colours in the map, and sampling sites, represented by black and pink dots. Pink dots also represent sites where Numts were detected. The type of Numts detected in each site is shown in b), where numbers inside dots represent sampling sites as in Table 2.1. Note: In sampling sites 6, 7, 8 and 14 the presence of Numt I (allele a) cannot be excluded (see text for detailed explanation).

Chapter 3

Phylogeography of Lacerta lepida in the Iberian Peninsula

Photos by Brent Emerson Sampling Lacerta lepida nevadensis in Cabo de Gata, Andalucia, Spain*

*Sampling site 1 in chapter 4

3. Phylogeography of Lacerta lepida in the Iberian Peninsula

3.1 Abstract

Here a detailed phylogeographic study of a lizard species (Lacerta lepida) with a distribution encompassing the entire Iberian Peninsula was carried out to better understand the role of Quaternary climatic changes in generating and maintaining the phylogeographic histories of typically Iberian species. Mitochondrial and nuclear sequence data support the existence of six evolutionary lineages within L. lepida. The strong association of mtDNA genetic variation with geography suggests a history of allopatric divergence in different refugia. Using a coalescence approach and exploring the geographic distribution of ancestral and derived alleles, the refugia for each lineage were identified. A concordant pattern of spatial and demographic expansions within the lineages, most probably associated with the last post-glacial climatic oscillation, was detected. Inferences of expansion routes from the refugia, together with the detection of divergent nuclear and mitochondrial alleles in narrow zones of sympatry, allowed the identification of secondary contact zones. Although divergences between the mitochondrial phylogroups seem to have a Mio-Plio-Pleistocene origin, a strong influence of later Pleistocene events is registered, with ages for each phylogroup estimated to range from 0.45 to 0.85 Mya. Results are compared with several published phylogeographic studies in the region.

Key words: phylogeography, Pleistocene, range expansions, contact zones, refugia.

3.2. Introduction

Phylogeography, as first named and described by Avise et al. (1987), is a discipline that studies the role of historical factors in shaping the geographical distribution of genealogical lineages at the intraspecific level. Of the historical processes that have influenced the current distribution of genetic variation within species, the cyclic climatic oscillations and environmental changes during the late Quaternary are probably the most important. The ice ages of the Pleistocene are generally believed to have had a considerable influence on the genetic structure of populations and on species survival across the world (reviewed in Hewitt, 2000). In Europe, the colder and drier conditions characteristic of the Pleistocene glacial periods led to the contraction of species distribution ranges towards the southern regions as the ice sheet advanced from northern latitudes (Hewitt, 1996; Hewitt, 1999; Hewitt, 2000). Evidence from a number of phylogeographic studies suggests that the southern peninsulas of Iberia, Italy and the Balkans, as well as areas near the Caucasus and the Caspian Sea, functioned as refugial areas and as species survival pockets during periods of adverse climatic conditions (Hewitt, 2004). The process of southward range contractions and shifts into the so-called ‘glacial refugia’ led to species contraction and fragmentation into allopatric populations, promoting the diversification of evolutionary lineages among the southern refugial areas. During warmer interglacial periods the isolated populations expanded from the different refugia to re-colonize central and northern Europe. This recolonization of northern latitudes by populations isolated in different allopatric glacial refugia led to the establishment of hybrid zones, where divergent genomes came into contact.
Clusters of hybrid zones in Northern Europe and broad patterns of recolonization routes have been described across different species (Hewitt, 1999; Taberlet et al., 1998). The cyclic processes of population contractions to southern refugia and range expansions to northern territories have left strong signatures in the geographic distribution of genetic diversity within many species. Species or species-complexes that are widely distributed across Europe often exhibit a latitudinal gradient in the distribution of genetic diversity. Within such species southern populations present high diversity due to long-term persistence, while northern populations are typically less diverse, a consequence of extensive and rapid range expansions from southern refugial populations. This pattern of “northern purity versus southern richness” (Hewitt, 1996; Hewitt, 2000) is well illustrated in early European phylogeographic studies from a variety of taxa (e.g. the hedgehog, Erinaceus europaeus/concolor (Santucci et al., 1998; Seddon et al., 2001); the grasshopper, Chorthippus parallelus (Cooper et al., 1995); the bear, Ursus arctos (Taberlet and Bouvet, 1994); the crested newt, Triturus spp. (Wallis and Arntzen, 1989); amongst others). The importance of southern refugial areas as survival pockets and sources of species recolonization to more northern regions is now widely accepted. Nevertheless, recent studies also emphasize the important role that these refugial areas had in shaping the evolutionary history of species that have persisted within these regions for several ice ages. As anticipated by early studies (Cooper and Hewitt, 1993; Hewitt, 1996), the topographic complexity and geographic mosaic of habitats in southern refugial peninsulas have most probably favoured the occurrence of multiple isolated refugia, allowing the persistence of isolated populations within them during glacial periods. The cyclic fragmentation of species into different allopatric populations within the main refugial areas allowed for complex demographic and evolutionary histories.
For the Iberian Peninsula these complex histories are well researched and described for a wide variety of taxa, with some of the species showing remarkable patterns of phylogeographic concordance (see Gomez and Lunt, 2007 and references therein) involving deep genetic subdivisions, high haplotype richness, and long-term hybrid zones. Indeed, the complexity of the evolutionary histories that have been revealed within the Iberian Peninsula highlights the important role that this region has played. Not only has the Iberian Peninsula facilitated the northern redistribution of species after climatic cooling, but it has also facilitated diversification through patterns of repeated population fragmentation, contraction, expansion and admixture. Detailed phylogeographic studies at the species level for the golden-striped salamander, Chioglossa lusitanica (Alexandrino et al., 2002; Alexandrino et al., 2000; Sequeira et al., 2005) and the Schreiber’s Lizard, Lacerta schreiberi (Godinho et al., 2008; Paulo et al., 2001) amongst others, are good examples of the type of complexity that most likely typifies many species within this major Peninsular glacial refugium. The response of a given species to climatic cycles, and the extent to which it undergoes lineage diversification and extinction, will depend on the sharpness of the climatic change, the latitude and the topography of the region and the dispersal, reproductive, and adaptive capabilities of the species itself (Nichols and Hewitt, 1994). The topographical variation across the Iberian Peninsula (Fig. 3.1.) has most probably influenced the way widely distributed species within this peninsula have responded to the Quaternary climatic cycles. In northern regions of Iberia large mountain systems prevail, allowing the survival of populations by altitudinal shifts while tracking their suitable habitat as the climate changes. These mass altitudinal movements are expected to result in demographically more stable populations, and therefore smaller reductions of genetic variability are expected to occur. This is partly due to the type of colonization involved in these slow movements, with a high proportion of individuals dispersing only short distances (the "phalanx" type of colonization as described in Nichols and Hewitt 1994). Furthermore, mountain systems promote a metapopulation type of structure in species distribution, which also helps to preserve variability. In northern European latitudes it is generally accepted that the effects of glaciations have been more severe. Within northern latitudes fewer areas of suitable habitat were available to serve as refugia during glacial periods, leading to higher probabilities of local extinction resulting in decreased regional genetic diversity.
Thus, although multiple populations distributed within the mountainous regions of the Iberian Peninsula can be expected to maintain genetic diversity, the way individual populations respond to climatic changes within those regions will be influenced by their latitudinal position. Perhaps surprisingly, the latitudinal influence on the responses of species to climatic change seems to be detectable even when we consider the limited latitudinal range across the Iberian Peninsula. This pattern of lower diversity and lower number of genetic lineages within northern latitudes was recently described for a complex of sister species of Podarcis spp distributed across a latitudinal gradient in the Iberian Peninsula (Pinho et al. , 2007).


Even though the Iberian Peninsula is the best-studied glacial refugium in terms of phylogeography, the majority of the phylogeographic studies have focused on species that either have a narrow distribution within the region (e.g. Chioglossa lusitanica (Alexandrino et al., 2000), Lacerta schreiberi (Paulo et al., 2001), Lissotriton boscai (Martínez-Solano et al., 2006)) or involve species-complexes that, although distributed across the entire region, generally present a genetic structure that relates to older cladogenic events (e.g. Podarcis spp. (Harris and Sá-Sousa, 2002; Pinho et al., 2008), Alytes spp. (Martínez-Solano et al., 2004)). In order to better understand the complex phylogeographic history of Iberian species, and the way they have responded to Pleistocene climatic oscillations, a species with a distribution encompassing the entire Iberian Peninsula should be studied in detail. For this purpose the ocellated lizard, Lacerta lepida, is chosen as a model to study the impact of Pleistocene climatic changes in generating and structuring intraspecific genetic diversity across the Iberian Peninsula. The species is typically Mediterranean, with a distribution encompassing all the Iberian Peninsula, and shows apparent phylogeographic structure across the region (Paulo et al., 2008). Several mitochondrial lineages which appear to have non-overlapping geographic ranges were recently described, suggesting a history of allopatric differentiation in multiple refugia during the Plio-Pleistocene (Paulo et al., 2008). Recent detailed analysis of two of these lineages in the northwest corner of Iberia (chapter 2) has revealed the probable glacial refugial areas for each, and the region of secondary contact between them.
Here this analysis is extended to assess the broader phylogeographic patterns within Lacerta lepida with the specific aims to i) clarify the distribution of mtDNA phylogroups; ii) identify refugial areas within these phylogroups during the glacial periods; iii) identify postglacial expansion routes; iv) date the main demographic and evolutionary events within Lacerta lepida; and finally v) identify contact zones between the different phylogroups. It is generally acknowledged and accepted that phylogeographic histories recovered using only mtDNA as a marker are constrained to reveal only one genealogy which mainly reflects the maternally inherited natural history of an organism. Relationships among phylogroups inferred through mtDNA might be discordant with the inferences made based on nuclear genes (Harrison 1991; Avise 2000) and such discordances have been illustrated in several recent phylogeographic studies (e.g. Dowling et al., 2008; Leaché and McGuire, 2006; Lindell et al., 2008b; McGuire et al., 2007; Thorpe et al., 2008b; Ujvari et al., 2008; Zink and Barrowclough, 2008). Discordance is expected when time since divergence between phylogroups is not enough to allow the achievement of reciprocal monophyly for more slowly evolving nuclear DNA sequences. Discordances can also be explained by the retention of ancestral variation among populations and/or by more recent hybridization events (Avise, 2004). Within the Iberian Peninsula several studies have emphasized the importance of using different types of markers to fully recover the complex evolutionary and demographic scenarios that most likely characterize the species that have persisted there across the Quaternary. For example, in Lacerta schreiberi (Godinho et al., 2008) evidence for gene flow and ancestral introgression between apparently allopatric mtDNA lineages was only detected by the use of nuclear markers. Further, within the well defined mtDNA species boundaries of Podarcis spp. (Pinho et al., 2008) the analysis of nuclear gene genealogies allowed for the detection of extensive nuclear introgression between the species, which after detailed analysis was identified as due to incomplete lineage sorting and not to recent gene flow. In this study, both mtDNA and nDNA derived genealogies were used as their contrasting molecular and population properties (principally uniparental versus biparental mode of inheritance and contrasting population sizes) are valuable when opportunities for secondary contact, gene flow and hybridization between diverging populations have most likely occurred.

3.3. Materials and methods

3.3.1. Sampling strategy collection

Lizards were captured under licence during the years 2005, 2006 and 2007. The sampling strategy was devised in order to sample the entire distribution area of Lacerta lepida in Portugal, Spain, and France, covering all mitochondrial lineages previously described (Paulo et al., 2008). Sampling intensity was concentrated in regions of high genetic divergence within the western and south-eastern parts of Iberia (Paulo et al., 2008). Lizards were captured using tomahawk traps or by hand, and tissue samples were taken by clipping 1 cm of the tail tip that was subsequently preserved in 100% ethanol. After tissue sampling, animals were immediately released back into the wild in the place of capture. Geographic coordinates of sampling sites were recorded with a GPS.

3.3.2. Laboratory procedures

DNA extraction, amplification and sequencing

A fragment of 627 base pairs of the mtDNA cytb gene was amplified using primers CYTBF and CYTBR (chapter 2). DNA extractions, PCR amplifications and sequencing conditions were the same as described in section 2.3.2. In chapter 2 it was shown that primers CYTBF and CYTBR may also co-amplify cytb Numts in some Lacerta lepida samples, therefore all cytb chromatograms were visually assessed for sequence quality and for the presence of double peaks using BioEdit Sequence Alignment Editor 7.01 (Hall, 1999). For all samples detected to be polymorphic (with at least one double peak), the authentic mitochondrial sequence for the cytb fragment was obtained through the amplification of the complete gene (1143 bp) using the primers TRNAGLU and TRNATHR designed in chapter 2. PCR amplifications were conducted as before, but using 52ºC for primer annealing. Purified PCR products were then sequenced with the same internal primers (CBF and CBR) used in section 2.3.4. and sequencing conditions were also the same as before. Intron 7 of the β-fibrinogen gene (β-fibint7) has been successfully used as a nuclear marker in several vertebrate phylogeographic and phylogenetic studies (e.g. Dolman and Phillips, 2004; Godinho et al., 2006; Pinho et al., 2008; Prychtko and Moore, 1997; Sequeira et al., 2006). Specifically it was recently employed for a phylogenetic study of the genus Lacerta (Paulo et al., 2008) where it was shown to have sufficient variation within Lacerta lepida for phylogeographical inference. Initially, the β-fibint7 amplifications were performed using primers FIB-B17U (5’ - GGA GAA AAC AGG ACA ATG ACA ATT CAC - 3’) and FIB-B17L (5’ - TCC CCA GTA GTA TCT GCC ATT AGG GTT - 3’) (Prychtko and Moore, 1997) and the conditions described in Paulo et al. (2008). However, due to low amplification and sequencing success a nested PCR approach as suggested by Sequeira et al. (2006) was subsequently adopted. A fragment of 788 bp was first amplified from genomic DNA using primers FIB-B17U and FIB-B17L (PCRa). The product of this reaction (1 µl) was then used as a template for a subsequent PCR of 691 bp (PCRb) using primers BFXF (5’ - CAG YAC TTT YGA YAG AGA CAA YGA TGG - 3’) (Sequeira et al., 2006) and BFX8 (5’ - CAC CAC CGT CTT CTT TGG AAC ACT G - 3’) (Pinho et al., 2008). Both amplifications were performed in a total volume of 25 µl, and included reagents in the same concentrations as those specified for the cytb gene fragment (see section 2.3.2.). PCR cycle conditions were the same as described for the cytb fragment but the primer annealing temperatures were 55ºC and 56ºC, for PCRa and PCRb respectively. Negative controls (no DNA) were included for all amplifications. Purified PCRb products were then sequenced with primers BFXF and BFX8 using identical sequencing conditions as for the mtDNA cytb sequencing (see section 2.3.2.).

3.3.3. Phylogeographic and historical demographic analysis

DNA sequences were aligned by eye using BioEdit Sequence Alignment Editor 7.01 (Hall, 1999). β-fibint7 alleles of heterozygous individuals were inferred using PHASE version 2.1 (Stephens and Scheet, 2005; Stephens et al., 2001). Several tests implemented in the software RDP3 (Recombination Detection Program, Martin et al., 2005) were used to detect recombination in this nuclear gene: RDP (Martin and Rybicki, 2000), GENECONV (Padidam et al., 1999), Maximum Chi Square (Posada and Crandall, 2001a; Smith, 1992), Chimaera (Posada and Crandall, 2001a) and Sister Scanning (Gibbs et al., 2000). Due to the small size of the fragment used in the analyses (315 bp), the window size for the recombination detection methods was set to 20 bp, whenever possible.

Haplotype network construction

To correctly represent intraspecific gene genealogies, biological phenomena relevant at the population genetic level need to be taken into account. Processes such as the persistence of ancestral haplotypes in populations and lower levels of divergence usually lead to a lack of phylogenetic resolution. Representing this uncertainty is important if one wants to depict all the possible evolutionary pathways that might explain the data. Furthermore, recombination is expected at the intraspecific level, and if present will lead to reticulate relationships, thus complicating the representation of a genealogy (Cassens et al., 2005; Posada and Crandall, 2001b). Although recently it has been shown that the Maximum Parsimony (MP) method, which is a standard tree approach, outperforms most of the network approach methods (Woolley et al., 2008), one major advantage of the latter over phylogenetic trees is that they allow the representation of alternative genealogical pathways in a single graphical representation. The ability to reveal ambiguities due to homoplasy and/or recombination, which cannot be revealed by a strict consensus tree, justifies networks as a more appropriate approach to represent intraspecific evolutionary relationships (Cassens et al., 2005). In this study, intraspecific gene genealogies were inferred using the median-joining (MJ) (Bandelt et al., 1999) and the statistical parsimony (SP) (Templeton et al., 1992) network construction approaches (see section 2.3.5. for an explanation of both methods). The MJ network was computed with the program NETWORK 4.5.0 (www.fluxus-engineering.com) and the SP network was inferred using the program TCS 1.21 (Clement et al., 2000). For the MJ approach the parameter ε was set to 0, which does not allow less parsimonious pathways to be included in the analysis.
The SP network was inferred with a parsimony confidence limit of 95%, allowing therefore the inclusion of less parsimonious alternatives whenever those alternatives cannot be excluded at the confidence limit chosen. Ambiguities within networks were resolved following the criteria of Crandall & Templeton (1993).

Neutrality tests and demographic analyses

Relatively recent demographic events, such as a population growth or a range expansion, leave genetic footprints that can be detected through the analysis of DNA sequences. In order to detect departures from a constant population size under the neutral model, Tajima’s D (Tajima, 1989), Ramos-Onsins & Rozas’s R2 (Ramos-Onsins and Rozas, 2002) and Fu’s Fs (Fu, 1997) tests were applied to both types of DNA datasets, mtDNA and nDNA. It is important to stress that departures from the null hypothesis could be due either to an effect of natural selection on the markers under study or the result of past demographic expansions. Both Tajima’s D and Ramos-Onsins & Rozas’s R2 use information from the mutation (segregating sites) frequencies, but the latter also takes into account the average number of nucleotide differences between sequences. Fu’s Fs (1997) is a different type of statistical test based on haplotype distribution information. Both R2 and Fs statistics have been shown to be the best statistical tests to detect population growth (R2 has been suggested to behave better for small sample sizes whereas Fs is better for bigger ones) (Ramos-Onsins and Rozas, 2002). Population expansions have also been shown to leave particular signatures in the distribution of pairwise sequence differences (Rogers and Harpending, 1992; Slatkin and Hudson, 1991). We capitalized upon this by employing statistics based on the mismatch distribution to test for demographic expansions. The observed distribution of pairwise differences between haplotypes within each mtDNA phylogroup was compared with the expected results under a sudden-demographic and a spatial-demographic expansion model. Statistically significant differences between observed and expected simulated distributions were evaluated with the sum of the square deviations (SSD) and the Harpending’s raggedness index (hg) (Harpending, 1994; Harpending et al., 1993).
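As an illustration of the first of these statistics, the following is a minimal sketch of Tajima’s D computed directly from an alignment, following the standard formulation of Tajima (1989). It is not the ARLEQUIN implementation used for the analyses; the function name and the toy alignments in the usage note are hypothetical, and sequences are assumed to be aligned, of equal length and free of gaps.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length, gap-free sequences."""
    n = len(seqs)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (variable) sites in the alignment.
    seg_sites = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)
    if seg_sites == 0:
        return 0.0  # no variation, so D is set to zero by convention here

    # pi: mean number of pairwise nucleotide differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # D contrasts pi with the estimate of theta based on S.
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)
```

On a toy star-like sample such as `["AAAA", "CAAA", "ACAA", "AACA", "AAAC"]`, where every variant is a singleton, the function returns a negative value, which is the direction expected after a population expansion. Two equally frequent divergent haplotypes give a positive value instead.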

Tests were performed with ARLEQUIN version 3.11 (Excoffier et al., 2005) for Tajima’s D, Fu’s Fs, SSD and hg, and with DnaSP version 4.50 (Rozas et al., 2003) for R2 and expected values for the mismatch distribution.
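The descriptive part of the mismatch analysis can be sketched as follows. This is a minimal illustration, not the procedure implemented in ARLEQUIN: it computes the observed distribution of pairwise differences, Harpending’s raggedness index as the sum of squared differences between successive frequency classes, and the SSD between an observed and an expected distribution. Function names are hypothetical, and the significance testing against simulated expectations is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of sequence pairs differing at 0, 1, 2, ... sites."""
    counts = Counter(
        sum(a != b for a, b in zip(x, y)) for x, y in combinations(seqs, 2)
    )
    total = sum(counts.values())
    return [counts.get(i, 0) / total for i in range(max(counts) + 1)]


def raggedness(freqs):
    """Harpending's raggedness index: sum of (x_i - x_{i-1})^2 for i = 1..d+1."""
    x = list(freqs) + [0.0]  # class d+1 is empty by definition
    return sum((x[i] - x[i - 1]) ** 2 for i in range(1, len(x)))


def ssd(observed, expected):
    """Sum of squared deviations between two mismatch distributions."""
    size = max(len(observed), len(expected))
    obs = list(observed) + [0.0] * (size - len(observed))
    exp = list(expected) + [0.0] * (size - len(expected))
    return sum((o - e) ** 2 for o, e in zip(obs, exp))
```

A smooth, bell-shaped distribution gives a low raggedness value, whereas a multimodal distribution of the kind expected for a population of constant size gives a higher one.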


Geographical distribution of alleles and refugial areas

Using predictions from coalescent theory (haplotypes at the tips of a tree are younger than the interior haplotypes to which they are connected) ancestral and derived haplotypes within each phylogroup were identified, thus obtaining a temporal framework for haplotype origin within phylogroups. The null hypothesis of random geographic distribution of haplotypes was also tested using GEODIS version 2.5 (Posada et al., 2000) to perform statistical tests and assess their significance through permuting the data 10^6 times. When non-random associations of haplotypes with geography were detected the geographic distribution of ancestral versus derived haplotypes (interior and tip haplotypes, respectively) was further explored to identify possible refugial areas and the directionality of previously detected demographic and spatial expansions.
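The logic of a permutation test for association between haplotypes and geography can be sketched as below. This is a simplified stand-in and not the procedure implemented in GEODIS: it uses a plain chi-square statistic on a haplotype-by-site contingency table and shuffles the site labels to build the null distribution. All names, the default number of permutations and the toy data are hypothetical.

```python
import random
from collections import Counter


def chi_square(haplotypes, sites):
    """Chi-square statistic of the haplotype-by-site contingency table."""
    n = len(haplotypes)
    observed = Counter(zip(haplotypes, sites))
    hap_totals = Counter(haplotypes)
    site_totals = Counter(sites)
    stat = 0.0
    for hap, hap_n in hap_totals.items():
        for site, site_n in site_totals.items():
            expected = hap_n * site_n / n
            stat += (observed.get((hap, site), 0) - expected) ** 2 / expected
    return stat


def permutation_p_value(haplotypes, sites, n_perm=999, seed=0):
    """P-value for the null hypothesis of random geographic distribution."""
    rng = random.Random(seed)
    observed = chi_square(haplotypes, sites)
    shuffled = list(sites)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between haplotype and site
        if chi_square(haplotypes, shuffled) >= observed - 1e-12:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return (hits + 1) / (n_perm + 1)
```

With two haplotypes that are each confined to a different site, for example ten copies of one haplotype in the north and ten of another in the south, the returned p-value is small and the null hypothesis of random geographic distribution is rejected.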

3.3.4. Estimation of divergence times

Divergence times within and between phylogroups were estimated from the cytochrome b dataset using BEAST version 1.4.2 (Drummond and Rambaut, 2007). BEAST performs Bayesian statistical inference of parameters, such as divergence times, by using Markov Chain Monte Carlo (MCMC) as a framework. Input files were generated with BEAUti version 1.4.2 (Rambaut and Drummond, 2007). The nucleotide substitution model and its parameter values were selected according to the results of MODELTEST version 3.7 (Posada and Crandall, 1998), with upper and lower bounds around the values defined as 120% and 80% respectively (Emerson, 2007). Mutation rates were not fixed and an uncorrelated lognormal relaxed molecular clock was used (Drummond et al., 2006). A mean mutation rate of 0.01 (the same mutation rate described for a closely related lizard group, Gallotia spp. (Paulo et al., 2001)) with a standard deviation of 0.0015, assuming a normal distribution, was used as prior information and implemented in BEAST. No tree was selected at the start of the analysis and a constant population size tree prior was assumed. Two runs were each executed for 10^6 generations, sampling every 500 generations and discarding the first 10% as burn in.

Results of the two runs were displayed and combined in TRACER (Rambaut and Drummond, 2005) to check for stationarity and ensure that effective sample sizes (ESS) were above 100. For all analyses one sequence of Lacerta pater (GenBank accession number: AF378963) was included as an outgroup. In a second approach to estimate divergence times within phylogroups the method of Saillard et al. (2000) was employed, where each extant haplotype descending from the most recent common ancestor (MRCA) represents a time interval between the present and the MRCA. Average distances from the MRCA within each phylogroup are calculated from the number of mutation steps separating each haplotype sampled from the MRCA. The absolute timing of divergence is then calculated by dividing the observed values by the average number of mutational changes per lineage per million years (Myr). The molecular clock for cytochrome b sequences of a reptile species has been previously calculated, using Gallotia spp. and the geological origin of the most recently emerged Canary Island, El Hierro (see Paulo et al., 2001 for details). The mean pairwise sequence divergence obtained was approximately 2%. Based on this information, three mutation rates were used: 0.01, as representing the average mutation rate; a faster mutation rate of 0.0125, assuming an underestimation of the mean calibration mutation rate due to the assumption of immediate island colonization that was used in the molecular clock calibration of Paulo et al. (2001); and finally a slower mutation rate of 0.0085 to account for the longer generation time (3 years approximately) and larger body size of the Lacerta spp. group when compared to Gallotia spp. (1 year generation time). Variation in rates of nucleotide substitution among divergent taxonomic groups has been shown to be associated with differences in body size and generation time (Martin and Palumbi, 1993).
Generation time is the time it takes for germ-line DNA to reproduce itself. If most mutations are the result of errors in this process and if species have similar numbers of cell divisions per generation, then it is expected that species with longer generation times will accumulate fewer substitutions than those with shorter generations simply because there will be fewer opportunities for replication errors in the former. Furthermore, the results of Martin and Palumbi (1993) indicate that DNA substitutions accumulate at a slower rate in large animals than in small animals and that this is intimately associated with metabolic rates. Metabolic rates and the rate of germ cell division are higher for smaller species and therefore they usually have faster rates of molecular evolution.
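The conversion from mutational steps to absolute time in the second dating approach can be sketched as follows. The sketch assumes that the three rates are expressed as substitutions per site per lineage per Myr and that they apply to the 627 bp cytb fragment; both are assumptions made for illustration, and the step counts in the usage note are hypothetical rather than values from this study.

```python
def rho_age_myr(steps_from_mrca, rate_per_site_per_myr, n_sites):
    """Age of the MRCA in Myr from the mean number of steps to the MRCA.

    steps_from_mrca: mutational steps separating each sampled haplotype
    from the inferred MRCA (one entry per sampled sequence).
    """
    if not steps_from_mrca:
        raise ValueError("at least one haplotype is required")
    # rho: average distance, in mutational steps, from the MRCA.
    rho = sum(steps_from_mrca) / len(steps_from_mrca)
    # Expected number of mutations per sequence per lineage per Myr.
    mutations_per_myr = rate_per_site_per_myr * n_sites
    return rho / mutations_per_myr


# The three rates described in the text (assumed per site per lineage per Myr).
RATES = {"slow": 0.0085, "average": 0.01, "fast": 0.0125}
```

For hypothetical step counts of `[1, 2, 2, 3]`, rho equals 2 and `rho_age_myr([1, 2, 2, 3], RATES["average"], 627)` evaluates 2 / 6.27, roughly 0.32 Myr. The slower rate gives an older estimate and the faster rate a younger one, which is how the three rates bracket the age of each phylogroup.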

3.4. Results

A total of 422 lizards were sampled from 129 different sites across the distribution area of Lacerta lepida . Sampling sites and number of samples per site are shown in Fig. 3.1. and Table 3.1., respectively. A total of 390 cytb and 104 β-Fibrinogen intron 7 sequences were obtained.

3.4.1. Mitochondrial DNA data

All cytb sequences represented uninterrupted open reading frames, with no gaps or premature stop codons, suggesting they are functional mitochondrial DNA copies. One hundred and fifty two (152) unique haplotypes were obtained from the 390 sequences analysed. Of a total of 627 sites sequenced, 171 were variable, of which 123 were parsimony informative. Pairwise genetic distances (uncorrected p-values) between haplotypes ranged from 0.16% to 13.2%. According to the Bayesian Information Criterion (BIC) and the hierarchical Likelihood Ratio Tests (hLRTs), the model of nucleotide substitution identified as the best fit to the data is the HKY (Hasegawa et al., 1985) with a gamma distribution (Γ) for substitution rates across sites (shape parameter, α = 0.2889) and no category of invariable sites. According to the Akaike Information Criterion (AIC) this was not the first ranked model; nevertheless, it has an AIC difference (delta) of 1.6 and it is included within the 95% confidence limit, thus having substantial support. Pairwise genetic distances among sequences corrected with the above mentioned model ranged between 0.16% and 24.96%. The genealogical relationships between haplotypes inferred by the two approaches for network construction (MJ and SP) are highly congruent. While the MJ approach connected all haplotypes in a single network (as expected by the nature of the method) the SP approach failed to do so, due to the high number of mutational steps separating groups L2 and N from the main network. Therefore, at the 95% confidence limit, TCS calculated three unconnected networks (results not shown). Nevertheless, the three SP networks were connected in one single network when the confidence limit was reduced (to 92% to include group L2 and to 65 mutational steps for group N) (Fig. 3.2.a). The relationships inferred by the two approaches (MJ and SP) when considering the single network were identical and included 12 loops, of which 7 were easily resolved by applying the criteria of Crandall and Templeton (1993). The network reveals two very divergent groups of haplotypes (phylogroups), separated by 65 mutational steps with an average pairwise uncorrected genetic distance between groups of 11.7% (see Table 3.2. for both corrected and uncorrected genetic distances between phylogroups). The geographic distribution of phylogroup N is coincident with the Betic Mountains in south-eastern Spain while phylogroup L occupies the remaining area of the species distribution. Within phylogroup L, five geographically distinct groups of haplotypes can be identified (Fig. 3.2.a and Fig. 3.3.), which include the four mitochondrial phylogroups (L1-L4) identified by Paulo et al. (2008) and phylogroup L5 identified in chapter 2. Average genetic distances (uncorrected p distances) between these phylogroups range from 1.1% (between phylogroups L4 and L5) to 3.28% (between phylogroups L2 and L5; and L2 and L1).
Phylogroup L1 is distributed mainly across the Central Mountain system in-between the Douro and Tagus river basins in Spain. Phylogroup L2 is distributed in southern Portugal, occupying the entire region of Algarve and the south-western part of Alentejo. This phylogroup is clustered together in the network with phylogroup L3 (Fig. 3.2.a) forming a monophyletic group. L3 is distributed several hundred kilometres (300 km) to the north of L2 occupying the north-western corner of Iberia, mainly the regions to the north of Douro River in Portugal and the regions of Asturias and Galicia in Spain. The area in between phylogroups L3 and L2 is occupied by two other phylogroups, L4 and L5. Phylogroup L5 is restricted to central Portugal, occupying the region between Tagus and Douro River. L4 has the widest distribution of any phylogroup, and occupies the remaining areas of southern Portugal and Spain, passing through the Ebro valley to reach the Atlantic and Mediterranean coasts of France, and is possibly also present in north-western Italy. The root of the network can be inferred to be located somewhere along the branch that connects the very divergent clades L and N, allowing for the inference of the most recent common ancestor (MRCA) among the sampled haplotypes for each phylogroup (Fig. 3.2.a). Although this identification is straightforward for clade N (haplotype 133), the networks reveal two probable ancestral haplotypes (haplotypes 134 and 111) within clade L, which are connected to haplotype 133 through a loop. SP and MJ networks constructed with 0-fold degenerate sites only, thus reducing homoplasy within the data set (Fig. 3.2.b), result in the collapse of this reticulation, and haplotype 133 (clade N) connects unambiguously to haplotype 134 (clade L), supporting 134 as the ancestral haplotype within phylogroup L. Within phylogroups, statistically significant associations between genetic variation and geographical distribution were detected for L1, L3, L5 and N (Table 3.3.). Significant deviations from neutrality that could reflect past population expansion events were detected for all phylogroups with Tajima’s D and, with the exception of L1, the same was true for all phylogroups when more powerful statistics were applied (Table 3.3.). The distributions of pairwise differences within each phylogroup were also found to be consistent with sudden-expansion and spatial-expansion models (as seen by the SSD and hg p values in Table 3.3.), with signatures of population growth being exhibited by the bell shaped mismatch distributions (Fig. 3.4.).
However, for L1 and N it is possible to detect slightly bimodal and ragged shaped curves, suggestive of population size constancy within these two groups. Interestingly, phylogroups L1 and N are the ones which have a distribution mainly associated with mountainous areas (the Central System and the Sierra Nevada, respectively).
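The uncorrected p-distances reported in this section are simply the proportion of sites at which two aligned sequences differ. A minimal sketch is given below; the function name and the toy sequences are hypothetical. The model-corrected (HKY + Γ) distances are not reproduced here.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance: proportion of differing sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


# Two toy sequences of 627 sites that differ at a single position.
reference = "A" * 627
one_step = "C" + "A" * 626
percent = 100 * p_distance(reference, one_step)
```

Here `percent` rounds to 0.16, so the smallest distance reported between haplotypes is consistent with a single substitution across the 627 sites sequenced.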


3.4.2. Nuclear DNA data

β-fibint7 sequences were trimmed to 315 bp in order to eliminate gaps within the sequences. From the 315 bp, 15 sites were variable of which 6 were parsimony informative. Twenty unique haplotypes (B1 to B20) were identified among the 208 alleles analysed and recombination was not detected by any of the tests applied. The relationships among haplotypes inferred by the two network construction approaches (MJ and SP) were identical, resulting in one single network with 4 unresolved reticulations (Fig. 3.5.). In order to root the network a sequence of the sister species Lacerta pater (GenBank accession number: EU365413) was incorporated. Haplotype B15 is inferred to be the ancestral haplotype as it connects unambiguously to the outgroup. This haplotype is restricted to the south-west of the species distribution area (sampling sites 11, 56, 72, 94 and 112). Haplotype B1 is the most common haplotype and has the widest distribution within the group, occurring in 83% of samples. This haplotype is connected to several low frequency haplotypes, generating a star-like genealogy, which suggests a possible past range expansion, signatures of which were formally detected by the mismatch distribution and neutrality test results (Table 3.3.). Within the 50 southernmost samples, 16 alleles are registered, representing 80% of the nuclear allele diversity. From those 16 alleles, 8 are restricted to that area, not occurring further north, suggesting this southern region as the probable source for the northwards range expansion. The nuclear data failed to recover the phylogroups detected by the mtDNA dataset. Nevertheless, when the nuclear dataset is grouped according to the mtDNA phylogroups previously identified, structure in the distribution of alleles can be detected. For this analysis each of the mtDNA phylogroups was considered as a geographic region and GEODIS was used to test for geographical structure amongst the nuclear genetic variation.
The two most ancestral haplotypes, haplotype B1 and B15, occur in all phylogroups. Additional nuclear haplotypes are also shared among some of the mtDNA phylogroups (with the exception of phylogroup N): all haplotypes from L3 (B7 and B20) occur either in L1 (B7) or in L5 (B20); all haplotypes from L2 (B4, B6, B13 and B14) occur in L4 (with B13 also occurring in L5); all haplotypes from L5 (B5, B13 and B20) occur in L4 (B5 and B13), in L3 (B20) or in L2 (B13); and finally haplotypes from L1 (B7 and B17) occur either in L3 (B7) or in L4 (B17). It is important to note that only geographically close phylogroups share haplotypes. Nevertheless, it is also possible to identify private haplotypes within the geography of several mtDNA phylogroups: B2 is restricted to phylogroup L1; haplotypes B3, B9, B10, B14, B16 and B18 to L4 and haplotypes B8, B11, B12 and B19 to phylogroup N. This pattern of ancestral haplotypes (B1 and B15) being shared between all phylogroups suggests incomplete lineage sorting. Incomplete lineage sorting may also explain the pattern of geographically close phylogroups sharing more derived haplotypes, but current gene flow between phylogroups is equally plausible.

3.4.3. Divergence times

Mean ages and 95% highest posterior density (HPD) of mtDNA phylogroups are shown in Table 3.4. Divergence within the group is estimated to have started approximately 9.4 million years ago (Ma) (5.58-13.66) in the mid-late Miocene, corresponding to the cladogenic event between phylogroups N and L. Although divergence within phylogroup L is estimated to have started in the late Pliocene (1.96 Ma; 1.13-2.91) the majority of the phylogroups within this clade are estimated to have Pleistocene origins, with divergence times younger than 1.0 Ma. The oldest split within this group refers to the divergence of the monophyletic lineage composed of phylogroups L2 and L3 from the remaining phylogroups, followed by the emergence of phylogroup L1 and finally the divergence of the more recent phylogroup L5.

3.5. Discussion

The strong association of mtDNA genetic variation with geography suggests a history of allopatric divergence in different refugia within the Iberian Peninsula, a pattern that has been described across several taxa within the region (see Gomez and Lunt, 2007, and references therein). Although this pattern of differentiation of distinct evolutionary units in allopatry was less evident from the analysis of the nuclear data, the distribution of nuclear haplotypes is not in conflict with the mtDNA phylogroups. The cytb genealogy clearly defines six geographic phylogroups that, in accordance with the β-Fibint7 genealogy, are inferred to have diverged in allopatry in southern refugia, followed by demographic and spatial range expansions. The range expansions have resulted in the establishment of secondary contact zones between the phylogroups, where divergent mitochondrial haplotypes co-occur in the same populations.

3.5.1. Mitochondrial DNA data

Detailed analysis of the distribution of mitochondrial genetic variation within Lacerta lepida across the Iberian Peninsula revealed a complex phylogeographic history for the species. Lacerta lepida is structured into six mitochondrial phylogroups with generally non-overlapping geographic distributions. This association of genetic variation with geography suggests the former isolation of at least six populations in different allopatric refugia. The existence of contact zones with very divergent haplotypes occurring in some populations (Table 3.1) could be explained by a scenario of differentiation in allopatry followed by contact due to range expansions. This pattern has been well described for phylogroups L3 and L5 in the previous chapter (chapter 2), and it also seems to be the case for the other phylogroups. Within Lacerta lepida, older haplotypes are usually found at the southern limits of the phylogroups’ distributions, with derived haplotypes having more northerly distributions. The neutrality tests and mismatch distribution tests reveal signs of demographic and spatial expansions in almost all phylogroups, suggesting that populations must have been contracted into smaller refugial areas prior to expansion. Nevertheless, slightly bimodal mismatch distribution curves (Fig. 3.4.a) in phylogroups whose current distribution is more closely associated with mountainous regions (L1, L3 and N) reveal some signs of population stasis, indicating that populations have probably persisted in these areas for longer periods. The slight bimodality registered, which is more apparent in phylogroup L1, could reflect different expansions, most likely associated with the effects of different ice ages. This emphasizes the importance of mountainous regions as areas that allow long-term survival of populations, which is also suggested by the high levels of endemism usually associated with these regions (García-Barros et al., 2002). For example, the Sierra Nevada Mountains have been evaluated as encompassing the highest number of plant endemics in Europe (Gomez-Campo et al., 1984) and as the main Iberian biodiversity hotspot (Castro-Parga et al., 1996); the northern mountain ranges of Iberia, such as the Sistema Central, have been designated as the principal areas of monocotyledon endemism (Saiz et al., 1998) as well as of endemism for certain animal species. Mountainous regions located at northern latitudes within the Iberian Peninsula, such as the Iberian Central System and the regions of northern Portugal, play an even more important role, as they allow the survival of populations at higher latitudes where the impact of climatic oscillations should be more pronounced. It was recently shown that populations at northern latitudes, even across the small latitudinal range observed within the Iberian Peninsula, usually show lower diversity and a lower number of genetic lineages (Pinho et al., 2007) than southern ones. This was shown for a complex of Podarcis species distributed along a latitudinal gradient from northern Africa to north-western Iberia. The data presented here suggest that this pattern probably does not hold true for populations that are able to persist in mountainous regions at northern latitudes, which seems to be the case for phylogroups L1 and L3. This persistence is evident in the haplotype networks of phylogroups L3 and L1, which show long internal branches with several missing haplotypes (Fig. 3.2.a), suggesting long-term persistence with diversification and subsequent extinction of ancestral haplotypes.
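For readers unfamiliar with the statistic, a mismatch distribution is the frequency distribution of the number of nucleotide differences between all pairs of sequences in a sample. The sketch below shows only how the observed distribution is obtained. It is not the DnaSP or Arlequin implementation, it does not fit an expansion model, and the toy sequences and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Number of sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def mismatch_distribution(sequences):
    """Observed frequency of each pairwise difference count."""
    counts = Counter(
        pairwise_differences(a, b) for a, b in combinations(sequences, 2)
    )
    total = sum(counts.values())
    return {k: counts[k] / total for k in sorted(counts)}


# Hypothetical star-like sample: one central haplotype and four
# haplotypes that each differ from it by a single private mutation.
toy_sequences = ["AAAAAA", "AAAAAT", "AAAATA", "AAATAA", "AATAAA"]
observed = mismatch_distribution(toy_sequences)
```

In this toy sample all pairwise comparisons fall into one or two differences, giving the smooth single-peaked shape usually associated with recent expansion; additional peaks at larger difference counts would make the curve bimodal or ragged.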

3.5.2. Nuclear DNA data

Failure of the nuclear gene genealogy to reveal genetic structure concordant with the mitochondrial genealogy can be expected if we take into account the fact that nuclear genes take on average four times longer to reach monophyly than mitochondrial ones. In fact, most of the intraspecific differentiation within Lacerta lepida is of relatively recent origin, with the majority of phylogroups (L1 to L5) estimated to have diversified within the Pleistocene, which increases the possibility of mitochondrial lineages not being monophyletic when nuclear markers are considered. Nevertheless, lineage N is estimated to have diverged from the remaining phylogroups during the Plio-Miocene (5.6-13.7 Ma), representing a much older cladogenic event within the group. This older split provides longer isolation periods, therefore allowing for more complete lineage sorting at the nuclear level. This pattern is clearly evident in the composition of nuclear haplotypes of lineage N, which are almost all private, with the exception of the ancestral haplotypes (which are shared among almost all phylogroups). Thus, although lineages have not reached monophyly at the nuclear level, some level of differentiation between lineages is detected by the existence of private alleles. In fact, phylogroups that represent older cladogenic events (L1, L4 and N) are the ones that show private nuclear alleles. Although not conclusive due to the very low level of variation within the nuclear genealogy, the fact that some derived alleles, which are at the tips of the nuclear network, are shared between geographically close phylogroups could be indicative of gene flow between the differentiated phylogroups. For example, allele B20 occurs only in the very divergent phylogroups L3 and L5 near the zone of contact, but it was not detected in the ancestral phylogroup L4, suggesting that gene flow is occurring between the lineages. The same is true for allele B4 which, although more widespread within phylogroup L4, also occurs in L2 individuals near the zone of contact, but does not occur in phylogroup L3, which is phylogenetically closer to phylogroup L2.

Due to the slower evolving nature of nuclear genes when compared to the mitochondrial genome, nuclear genealogies should record older demographic events (Avise, 2004). Therefore, analysing the distribution of the oldest nuclear haplotypes allows us to access a greater temporal depth in the evolutionary history of Lacerta lepida. Haplotype B15 is the root of the nuclear gene network and therefore represents the most ancestral allele in the dataset. It currently occurs mainly in the southern regions of Iberia, where its frequency is higher than in northern regions (70% of samples with haplotype B15 are from southern latitudes), but it is also found as far north as the Spanish Central System and the Douro River basin, although much less frequently. The high nuclear haplotype richness found in the southern regions of Iberia, together with the high frequency of B15 in this region, suggests that southern populations are older and are the source of the demographic and spatial northward expansions registered.
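The fourfold difference mentioned at the start of this section follows from the neutral expectation that coalescence times scale with the effective number of gene copies in a population. The sketch below illustrates that ratio only. It assumes an idealized Wright-Fisher population, strictly maternal inheritance of mtDNA and, by default, an equal sex ratio; it uses the expected coalescence time of a pair of gene copies as a proxy for the time needed to reach monophyly, and the function names are hypothetical.

```python
def expected_pairwise_coalescence(n_gene_copies):
    """Expected coalescence time, in generations, of two gene copies in an
    idealized haploid Wright-Fisher population of n_gene_copies copies."""
    if n_gene_copies <= 0:
        raise ValueError("number of gene copies must be positive")
    return float(n_gene_copies)


def nuclear_to_mtdna_ratio(n_individuals, female_fraction=0.5):
    """Ratio of expected nuclear to mitochondrial coalescence times."""
    # Autosomal locus: two copies per individual, inherited from both sexes.
    nuclear_copies = 2 * n_individuals
    # mtDNA: one effective copy per female, transmitted by females only.
    mtdna_copies = female_fraction * n_individuals
    return (expected_pairwise_coalescence(nuclear_copies)
            / expected_pairwise_coalescence(mtdna_copies))


ratio_equal_sexes = nuclear_to_mtdna_ratio(1000)
```

Under these assumptions the ratio is independent of population size; departures from an equal sex ratio or from neutrality would change it.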

3.5.3. Historical biogeography of Lacerta lepida

Divergence within Lacerta lepida is estimated to have started in the Miocene, approximately 9.4 Mya (5.58-13.66), with divergence of phylogroup N. Within phylogroup L, estimated divergence times are much younger and are inferred to have occurred in the Plio-Pleistocene, approximately 1.96 Mya (1.13-2.91 Mya). Interestingly, although divergences between the mitochondrial phylogroups seem to have a Plio-Pleistocene (within group L) or a late Miocene (for group N) origin, haplotype diversity within each phylogroup indicates a strong influence of later Pleistocene events, with ages for each phylogroup estimated to range from 0.45 to 0.85 Ma. The importance of the Pleistocene climatic oscillations in promoting species differentiation in the Iberian Peninsula has been emphasized by previous studies (see Gomez and Lunt, 2007, for a recent review) and this clearly also seems to be the case for Lacerta lepida .

Phylogroup N

The earliest divergence within Lacerta lepida has led to the establishment of two very divergent mitochondrial lineages, phylogroups N and L. Phylogroup N is distributed across the Betic Mountain range in south-eastern Spain and its distribution roughly coincides with that of the described subspecies Lacerta lepida nevadensis. Paulo et al. (2008) have inferred that divergence between the phylogroups must have started due to overseas dispersal between what was then the Iberian mainland and the Betic Massif, which at that time existed as an island between Iberia and North Africa. Under this scenario, contact between the phylogroups must have been initiated after the merging of the Betic Massif with the Iberian mainland, due to the closing of the Betic corridor 7.6-7.8 Ma (see Paulo et al., 2008, and section 2.1.1. of this thesis for a detailed explanation of the kinematics of the western Mediterranean basin). The fact that these populations have remained genetically distinct at the mitochondrial level since the closing of the Betic corridor suggests reproductive isolation of the forms upon subsequent secondary contact. The Betic Mountain range is a region with high numbers of plant and animal endemics and has been pointed out as a refugium for several taxa (see Gomez and Lunt, 2007). Interestingly, some taxa show divergence times and distribution areas similar to those of phylogroup N (e.g. Salamandra salamandra longirostris (Garcia-Paris et al., 1998) and Alytes dickhilleni (Arntzen and Garcia-Paris, 1995)), emphasizing the strikingly similar responses of species to the history of this region. Within phylogroup N, the mismatch distribution shows a negative binomial curve, with a slightly ragged shape at the end (Fig. 3.4.a). The negative binomial curve can be indicative of recent population expansions; nevertheless, the ragged shape seems to indicate population stasis, with the extra peaks representing older ancestral polymorphisms. A history of past geographic substructuring with restricted gene flow followed by recent population expansions could result in such a pattern (Marjoram and Donnelly, 1994).

Phylogroups L2 and L3

The monophyletic group composed of phylogroups L2 and L3 is estimated to have started diverging from the remaining phylogroups in the early Pleistocene, approximately 1.5 Mya (0.82-2.27). Interestingly, these two phylogroups currently have a disjunct distribution, with phylogroup L2 occupying the south of Portugal whereas L3 occupies the north-western parts of the Iberian Peninsula. The intervening region between phylogroups L2 and L3 is occupied by phylogroup L5. A vicariant event during the Plio-Pleistocene transition (0.82-2.27 Mya) triggering divergence between the L2-L3 lineage and the remaining populations of Lacerta lepida seems probable. Interestingly, most phylogeographic studies within Iberia reveal similar phylogenetic breaks associated with the same period (e.g. Chioglossa lusitanica (Alexandrino et al., 2002; Alexandrino et al., 2000), Oryctolagus cuniculus (Biju-Duval et al., 1991), Lacerta schreiberi (Paulo et al., 2001; Paulo et al., 2002)), suggesting a common vicariant history. Such a vicariant event was most likely climatically mediated, as no apparent geographical barrier exists within the western Iberian Peninsula. The fact that phylogroups L2 and L3 are part of a monophyletic group indicates a shared glacial refugium for at least part of their early evolutionary history. The region of western Algarve in southern Portugal has been pointed to as the evolutionary centre for several species and also as a main refugial area (Fritz et al., 2006; Mesquita et al., 2005). The region harboured relicts of temperate forests during the Last Glacial Maximum (Zagwijn, 1992), probably providing suitable conditions for species survival through glacial periods. Thus the Algarve is a potential refugial area for the ancestor of L2 and L3, although the uncertain geological and ecological history of the region means this must be treated as speculative. The high genetic distances found between phylogroups L2 and L3 can be explained by further range fragmentation and divergence within this group. As L2 is inferred to descend from L3 haplotype 40 (or 44), L3 appears to be the older phylogroup, and it likely once had a more extensive distribution area than it has now. The distribution of the ancestral form, most closely related to L3, became disjunct, most likely in association with climatic cooling. This disjunction has promoted divergence between L2 and L3. These phylogroups must have remained effectively isolated for a period of time long enough to allow the levels of divergence that are observed today. When climatic conditions allowed, probably during an interglacial period, the intervening region between L2 and L3 was colonized by L5, expanding from a nearby refugium (Tagus river basin, see chapter 2).
Interestingly, a similar pattern of distribution of genetic variation is found within Discoglossus galganoi across Portugal, with two phylogenetically closer phylogroups distributed in the south and north and a less related phylogroup bisecting their distribution (Martínez-Solano, 2004). The distribution of Discoglossus galganoi phylogroups and the evolutionary relationship between them is remarkably similar to the ones just described for Lacerta lepida.

Phylogroups L1, L4 and L5

The refugium for Lacerta lepida phylogroups L1, L4 and L5 was probably located on the south-eastern side of the Guadalquivir basin. Support for this comes from the distribution of ancestral haplotypes 134, 147 and 8 within phylogroup L4, the oldest within lineage L, which are found only in this region (localities 10, 12, 32, 36, 37, 38, 39, 40, 41). The two very divergent haplotypes within L4 (67 and 68) are also restricted to this region. The occurrence of deeply differentiated taxa isolated on the southern side of the Guadalquivir basin (e.g. Arntzen and Garcia-Paris, 1995; Garcia-Paris et al., 1998) emphasizes the importance that this region may have had in divergence processes inside the Iberian Peninsula. The widespread distribution of the most frequently sampled haplotype, haplotype 1, suggests that the spatial and demographic expansion detected within L4 was of a “leading edge” type, with few individuals rapidly colonizing adjacent regions, leading to a decrease in genetic diversity in the newly colonized areas. Expansion was likely facilitated by the extensive low-altitude plains that characterize most of the distribution area of L4. The different phylogroups are most likely the result of three different expansions from the southern refugia, each dominated by different ancestral haplotypes, followed by further divergence in allopatry.

3.6. Conclusion

Mitochondrial and nuclear gene genealogies in Lacerta lepida provided evidence for a history of isolation and divergence in allopatry resulting in the diversification of six genetically and geographically distinct lineages. Although diversification within the group is largely concordant with the onset of the major glaciations at the beginning of the Pleistocene (approximately 2 Mya), an earlier event, associated with the Miocene, was also registered. This event (9 Mya), which marks the divergence of lineage N, seems to be associated with geological events related to the evolution of the Mediterranean basin. The detailed analyses of the distribution of ancestral and derived alleles within each lineage allowed the identification of six geographically distinct refugia distributed throughout the Iberian Peninsula. Signs of recent demographic and spatial expansions were registered in all lineages. As a result of spatial expansions after periods of divergence in allopatry, most lineages have established zones of secondary contact. Further analysis of these zones should provide insights into the mechanisms involved in speciation and divergence in this lizard.

[Figure 3.1: a) map of the distribution area with legend L. l. iberica, L. l. lepida and L. l. nevadensis; b) map of numbered sampling localities across Portugal, Spain and France; scale bar 0-200 kilometers.]

Fig. 3.1. a) Distribution area of Lacerta lepida and recognized continental subspecies ( L. l. iberica , L. l. lepida and L. l. nevadensis ). b) Sampled localities. Numbers are the same as in Table 3.1. Shaded areas denote altitude gradients, with darker areas representing higher altitudes.

[Figure 3.2: a) haplotype network of phylogroups L1, L2, L3, L4, L5 and N; b) network based on 0-fold degenerate sites, with haplotype groups A0-A3, B0-B2, C0-C2, D0 and E0.]

Composition of haplotype groups in network (b): A0: 1, 2, 3, 5, 6, 8, 9, 10, 11, 67, 68, 60, 70, 71, 72, 73, 75, 76, 78, 81, 86, 88, 89, 91, 92, 93, 115, 118, 119, 120, 121, 135, 136, 137, 140, 141, 142, 143, 148, 149, 150, 151 and 152; A1: 12, 79, 87 and 94; A2: 74, 82 and 146; A3: 7 and 77; B0: 17, 18, 20, 21, 23, 24, 25, 26, 28, 30, 31, 33, 34, 35, 37, 38, 95, 97 and 116; B1: 13 and 39; B2: 14 and 15; C0: 40, 41, 42, 44, 45, 46, 49, 50, 52, 54, 55, 56, 58, 60, 62, 98, 99, 100, 103, 104, 105, 106, 107, 108, 122, 123, 124, 125 and 138; C1: 101, 51 and 59; C2: 102 and 110; D0: 63, 65, 66, 111, 112 and 113; E0: 126, 127, 128, 129, 130, 131, 132 and 144.

Fig. 3.2. Statistical Parsimony network of Lacerta lepida cytochrome b haplotypes (a) using all 627 bp sites and (b) using 0-fold degenerate sites only. Dashed lines represent ambiguities in the networks. White circles with no numbers represent unsampled or extinct haplotypes. L1, L2, L3, L4, L5 and N represent different mitochondrial phylogroups. The ancestral haplotype within each phylogroup is marked with *. Phylogroup N connects to the main network through 65 mutations and the connection is represented by an interrupted line. In (b) numbers inside circles represent haplotypes from network (a) and circles with letters represent groups of haplotypes from network (a) that show no differences in 0-fold degenerate sites, being thus represented as a single haplotype in the network (b). Composition of each group of haplotypes in network (b) is shown under the network.

[Figure 3.3: map of the Iberian Peninsula showing the distribution of lineages L1, L2, L3, L4, L5 and N; numbered sampling sites shown: 59, 82, 79, 56, 78, 58, 94, 44, 22, 21, 20, 19, 53, 18, 96, 15, 6, 23, 24, 25 and 33; scale bar 0-300 km.]

Fig. 3.3. Distribution of Lacerta lepida mitochondrial phylogroups based on the cytochrome b gene. Colours are the same as in Fig. 3.2. Red areas represent contact zones between phylogroups. Contact zones were inferred from sampling sites where haplotypes from different phylogroups were detected in sympatry (see Table 3.1). Sampling sites used for inferring the spatial distribution of contact zones are represented by numbers.

[Figure 3.4: one mismatch distribution panel per phylogroup, with the following parameters.]

L1: Theta I = 1.35; Theta F = 1000; t = 0.75
L2: Theta I = 0; Theta F = 1000; t = 2.17
L3: Theta I = 1.04; Theta F = 1000; t = 1.32
L4: Theta I = 0.72; Theta F = 1000; t = 2.38
L5: Theta I = 0.6; Theta F = 1000; t = 1.51
N: Theta I = 1.51; Theta F = 1000; t = 0.54

Fig. 3.4. Mismatch distribution of mtDNA haplotypes for each of the six Lacerta lepida phylogroups. The expected frequency is based on a population growth-decline model, determined using DnaSP v4.50 (Rozas et al., 2003), and is represented by a continuous green line. The observed frequency is represented by a red dotted line.

[Figure 3.5: network of alleles B1 to B20 drawn as pie charts; legend: L1, L2, L3, L4, L5, N and L. pater.]

Fig. 3.5. Statistical Parsimony network of Lacerta lepida β-Fibrinogen intron 7 alleles. Dashed lines represent ambiguities in the network; white circles with no numbers represent unsampled or extinct haplotypes and the black circle represents the outgroup (Lacerta pater). The proportion of each allele found within each mitochondrial phylogroup is represented through pie charts. Colours in pie charts are the same as the ones used to represent the mitochondrial phylogroups in Fig. 3.2.

Table 3.1 Number of Lacerta lepida samples per site (n) and corresponding number of sequences amplified for the cytb (Cytb) and β-Fibrinogen intron 7 (β-Fib) genes. For each site, the haplotypes found for each gene are shown. Grey shaded rows indicate sites where haplotypes of two different phylogroups were found in sympatry, revealing the location of secondary contact zones (see Fig. 3.3.).

| Site | n | No. of Cytb sequences | No. of β-Fib sequences | Cytb haplotypes | β-Fibint7 alleles | mtDNA phylogroup |
|---|---|---|---|---|---|---|
| 1 | 2 | 2 | 2 | 1 | B13, B14 | L4 |
| 2 | 1 | 1 | 0 | 7 | | L4 |
| 3 | 1 | 1 | 1 | 2 | B1 | L4 |
| 4 | 1 | 1 | 0 | 1 | | L4 |
| 5 | 3 | 3 | 0 | 1, 73, 90 | | L4 |
| 6 | 3 | 3 | 1 | 6, 70, 104 | B1, B4 | L2, L4 |
| 7 | 3 | 3 | 0 | 1, 69, 70 | | L4 |
| 8 | 3 | 3 | 3 | 70, 84, 143 | B1, B3, B4, B13 | L4 |
| 9 | 2 | 2 | 2 | 10, 85 | B1, B6 | L4 |
| 10 | 1 | 1 | 0 | 8 | | L4 |
| 11 | 5 | 4 | 4 | 10, 70, 93 | B1, B3, B15, B16 | L4 |
| 12 | 1 | 1 | 0 | 8 | | L4 |
| 13 | 1 | 1 | 0 | 80 | | L4 |
| 14 | 3 | 3 | 0 | 1 | | L4 |
| 15 | 5 | 5 | 0 | 1, 25, 79, 83 | | L4, L5 |
| 16 | 3 | 3 | 0 | 69 | | L4 |
| 17 | 9 | 9 | 0 | 1, 6, 16, 69, 86, 121 | | L4, L5 |
| 18 | 6 | 6 | 0 | 1, 25, 69 | | L4, L5 |
| 19 | 5 | 5 | 1 | 25, 74, 79, 87 | B1 | L4, L5 |
| 20 | 2 | 2 | 0 | 25, 72 | | L4, L5 |
| 21 | 5 | 5 | 0 | 1, 25, 79 | | L4, L5 |
| 22 | 5 | 5 | 2 | 1, 17, 15, 116, 118 | 1, 13 | L4, L5 |
| 23 | 3 | 3 | 0 | 107, 120, 125 | | L2, L4 |
| 24 | 9 | 9 | 0 | 6, 75, 76, 81, 88, 98, 117, 119 | | L2, L4 |
| 25 | 8 | 8 | 2 | 70, 78, 98, 100, | B1, B13, B14 | L2, L4 |
| 26 | 3 | 3 | 3 | 135, 139, 142 | B1, B5, B16 | L4 |
| 27 | 6 | 5 | 4 | 10, 68, 91, 137 | B1, B5 | L4 |
| 28 | 1 | 1 | 0 | 67 | | L4 |
| 29 | 5 | 2 | 5 | 140, 141 | B1, B4, B5, B13 | L4 |
| 30 | 3 | 3 | 0 | 9, 89, 114 | | L4 |
| 31 | 4 | 3 | 3 | 136, 140, 141 | B1, B4, B18 | L4 |
| 32 | 1 | 1 | 0 | 8 | | L4 |
| 33 | 2 | 2 | 0 | 126 | | L4, N |
| 34 | 1 | 1 | 1 | 89 | B1 | L4 |
| 35 | 2 | 1 | 1 | 89 | B1, B9 | L4 |
| 36 | 4 | 4 | 0 | 89, 147, 148 | | L4 |
| 37 | 6 | 6 | 0 | 8, 89, 94, 149 | | L4 |
| 38 | 3 | 3 | 0 | 8, 151, 152 | | L4 |
| 39 | 4 | 4 | 0 | 8, 147, 150 | | L4 |
| 40 | 3 | 3 | 1 | 3, 5, 8 | B1, B17 | L4 |
| 41 | 1 | 1 | 0 | 134 | | L4 |
| 42 | 1 | 1 | 0 | 1 | | L4 |
| 43 | 3 | 3 | 1 | 1, 11, 12 | B1, B19 | L4 |
| 44 | 2 | 2 | 0 | 1 | | L4 |
| 45 | 1 | 1 | 0 | 69 | | L4 |
| 46 | 3 | 3 | 2 | 1, 69 | B1 | L4 |
| 47 | 3 | 3 | 3 | 1, 71 | B1, B17 | L4 |
| 48 | 2 | 2 | 0 | 1 | | L4 |
| 49 | 3 | 3 | 0 | 1 | | L4 |
| 50 | 1 | 1 | 0 | 4 | | L4 |
| 51 | 1 | 1 | 0 | 92 | | L4 |
| 52 | 3 | 3 | 0 | 1, 92, 146 | | L4 |
| 53 | 3 | 3 | 2 | 25, 77, 97 | B1 | L4, L5 |
| 54 | 6 | 6 | 3 | 111, 112 | B1 | L1 |
| 55 | 2 | 2 | 1 | 111 | B1 | L1 |
| 56 | 5 | 5 | 1 | 27, 63, 64 | B1, B15 | L1, L5 |
| 57 | 1 | 1 | 0 | 65 | | L1 |
| 58 | 25 | 25 | 2 | 25, 27, 31, 34, 37, 38, 39, 63 | B1 | L1, L5 |
| 59 | 19 | 19 | 1 | 46, 66 | B1 | L1, L3 |
| 60 | 1 | 1 | 1 | 115 | B1, B17 | L1 |
| 61 | 4 | 4 | 1 | 98, 104, 106, 108 | B1 | L2 |
| 62 | 5 | 5 | 0 | 98, 122 | | L2 |
| 63 | 1 | 1 | 0 | 100 | | L2 |
| 64 | 3 | 3 | 1 | 100, 104, 123 | B4 | L2 |
| 65 | 1 | 1 | 1 | 105 | B1 | L2 |
| 66 | 3 | 3 | 1 | 100, 124 | B1 | L2 |
| 67 | 1 | 1 | 0 | 98 | | L2 |
| 68 | 1 | 1 | 1 | 98 | B1 | L2 |
| 69 | 1 | 1 | 1 | 103 | B1, B13 | L2 |
| 70 | 2 | 2 | 0 | 100, 103 | | L2 |
| 71 | 1 | 1 | 1 | 110 | B1 | L2 |
| 72 | 6 | 6 | 2 | 98, 100, 101, 102, 105 | B1, B15 | L2 |
| 73 | 1 | 1 | 0 | 107 | | L2 |
| 74 | 1 | 1 | 1 | 109 | B1 | L2 |
| 75 | 1 | 1 | 1 | 100 | B1, B6 | L2 |
| 76 | 1 | 1 | 0 | 100 | | L2 |
| 77 | 1 | 1 | 1 | 99 | B1 | L2 |
| 78 | 2 | 2 | 0 | 32, 46 | | L3, L5 |
| 79 | 18 | 18 | 0 | 40, 46, 52, 60, 61, 62 | | L3 |
| 80 | 1 | 1 | 0 | 56 | | L3 |
| 81 | 1 | 1 | 0 | 40 | | L3 |
| 82 | 4 | 4 | 0 | 17, 21, 46, 48 | | L3, L5 |
| 83 | 7 | 7 | 0 | 46, 57, 59 | | L3 |
| 84 | 1 | 1 | 0 | 49 | | L3 |
| 85 | 1 | 1 | 0 | 47 | | L3 |
| 86 | 2 | 2 | 0 | 46, 53 | | L3 |
| 87 | 3 | 3 | 0 | 42, 54, 55 | | L3 |
| 88 | 1 | 1 | 0 | 46 | | L3 |
| 89 | 10 | 10 | 0 | 41, 44, 45, 46, 50, 51, 52, 58 | | L3 |
| 90 | 11 | 11 | 1 | 41, 43 | B1 | L3 |
| 91 | 1 | 1 | 0 | 42 | | L3 |
| 92 | 3 | 3 | 1 | 41, 46, 48 | B1 | L3 |
| 93 | 5 | 5 | 0 | 49 | | L3 |
| 94 | 4 | 1 | 3 | 138 | B1, B2, B7, B15 | L1, L3 |
| 95 | 3 | 3 | 1 | 14, 17, 20 | B1 | L5 |
| 96 | 2 | 2 | 0 | 13, 86 | | L4, L5 |
| 97 | 1 | 1 | 0 | 15 | | L5 |
| 98 | 1 | 1 | 0 | 18 | | L5 |
| 99 | 8 | 8 | 2 | 17, 33, 95, 96 | B1, B20 | L5 |
| 100 | 2 | 2 | 0 | 30 | | L5 |
| 101 | 7 | 7 | 3 | 23, 26, 29, 30 | B1, B5 | L5 |
| 102 | 1 | 1 | 1 | 17 | B1 | L5 |
| 103 | 2 | 2 | 1 | 19, 32 | B1 | L5 |
| 104 | 1 | 1 | 0 | 35 | | L5 |
| 105 | 1 | 1 | 1 | 24 | B13, B20 | L5 |
| 106 | 14 | 14 | 0 | 25, 28, 31, 36, 39 | | L5 |
| 107 | 1 | 1 | 0 | 25 | | L5 |
| 108 | 1 | 1 | 1 | 22 | B1 | L5 |
| 109 | 1 | 1 | 0 | 133 | | N |
| 110 | 1 | 1 | 0 | 131 | | N |
| 111 | 5 | 4 | 1 | 127, 128, 130, 132 | B1, B8 | N |
| 112 | 11 | 6 | 7 | 126 | B1, B8, B11, B12, B15, B19 | N |
| 113 | 1 | 1 | 0 | 129 | | N |
| 114 | 3 | 3 | 0 | 126 | | N |
| 115 | 3 | 3 | 0 | 126, 144 | | N |
| 116 | 3 | 3 | 0 | 126, 145 | | N |
| 117 | 1 | 0 | 1 | | B7, B20 | L3* |
| 118 | 3 | 0 | 3 | | B1 | L4* |
| 119 | 1 | 0 | 1 | | B1, B17 | L4* |
| 120 | 1 | 0 | 1 | | B1, B17 | L4* |
| 121 | 1 | 0 | 1 | | B1, B17 | L4* |
| 122 | 1 | 0 | 1 | | B1 | L4* |
| 123 | 2 | 0 | 2 | | B1 | L4* |
| 124 | 1 | 0 | 1 | | B1, B10 | L4* |
| 125 | 1 | 0 | 1 | | B4, B5 | L4* |
| 126 | 1 | 0 | 1 | | B1, B5 | L4* |
| 127 | 1 | 0 | 1 | | B1 | N* |
| 128 | 1 | 0 | 1 | | B1 | N* |
| 129 | 1 | 0 | 1 | | B5, B6 | L4* |

* When sequences were not available the phylogroup was inferred using the geographic location of samples.

Table 3.2 Pairwise genetic distances (corrected and uncorrected) between Lacerta lepida mitochondrial phylogroups. Values above the diagonal are uncorrected p distances (%) (a); values below the diagonal are HKY corrected distances (%) (b).

| mtDNA phylogroup | L1 | L2 | L3 | L4 | L5 | N |
|---|---|---|---|---|---|---|
| L1 | 0.32 (0.55) | 3.28 | 2.53 | 1.48 | 1.85 | 11.30 |
| L2 | 3.74 | 0.55 (0.53) | 2.75 | 2.92 | 3.28 | 12.44 |
| L3 | 2.80 | 3.13 | 0.67 (0.64) | 2.10 | 2.44 | 12.02 |
| L4 | 1.57 | 3.29 | 2.30 | 0.74 (0.71) | 1.11 | 11.74 |
| L5 | 2.00 | 3.77 | 2.72 | 1.17 | 0.61 (0.59) | 11.86 |
| N | 19.07 | 22.38 | 21.20 | 20.61 | 21.07 | 0.72 (0.69) |

a values above diagonal and values in diagonal inside brackets represent uncorrected genetic distances (p distances). b values under diagonal and values in diagonal outside brackets represent genetic distances corrected using the HKY + Γ model for nucleotide substitution.

Table 3.3 Results from mismatch distribution and neutrality tests for cytb mtDNA phylogroups and for the β-fibint7 nuclear gene. (p(SSD) = p value for the sum of squared deviations; p(hg) = p value for Harpending’s raggedness index; Tajima’s D (D) and respective p value; Fu’s Fs test (Fs) and respective p value; Ramos-Onsins and Rozas’ R2 (R2) and respective p value). Results for the spatial genetic structure estimated with GEODIS (χ2 and respective p value) are also shown. Statistics that do not suggest range expansion are shown in bold font.

| Locus | Phylogroup | Spatial genetic structure χ2 | p | Sudden-expansion model p(SSD) | Sudden-expansion model p(hg) | Spatial-expansion model p(SSD) | Spatial-expansion model p(hg) | D | p | Fs | p | R2 | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cytb | L1 | 83.60 | 0.00 | 0.40 | 0.76 | 0.36 | 0.67 | -1.66 | 0.03 | -2.32 | 0.07* | 0.11 | 0.26* |
| Cytb | L2 | 358.07 | 0.24* | 0.10 | 0.31 | 0.31 | 0.38 | -1.72 | 0.02 | -9.92 | 0.00 | 0.05 | 0.00 |
| Cytb | L3 | 754.10 | 0.00 | 0.64 | 0.84 | 0.51 | 0.88 | -1.77 | 0.01 | -14.78 | 0.00 | 0.04 | 0.01 |
| Cytb | L4 | 3494.93 | 0.09* | 0.92 | 0.64 | 0.94 | 0.70 | -2.35 | 0.00 | -26.33 | 0.00 | 0.02 | 0.00 |
| Cytb | L5 | 1280.03 | 0.00 | 0.10 | 0.09 | 0.06 | 0.11 | -2.19 | 0.00 | -27.19 | 0.00 | 0.03 | 0.00 |
| Cytb | N | 105.14 | 0.03 | 0.64 | 0.74 | 0.89 | 0.88 | -2.00 | 0.01 | -3.90 | 0.01 | 0.06 | 0.00 |
| β-Fib | ... | 148.25 | 0.02 | 0.03* | 0.96 | 0.58 | 0.85 | -1.58 | 0.03 | -16.52 | 0.00 | 0.04 | 0.02 |
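As background to the D column above, Tajima’s D compares the mean number of pairwise differences with the number of segregating sites; an excess of rare variants, as expected after a population expansion, gives negative values. The following is a minimal sketch of the standard formula only. It is not the implementation used to produce the table, it computes no p value, it assumes gap-free aligned sequences, and the toy sequences and function name are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for a list of aligned, gap-free sequences."""
    n = len(sequences)
    if n < 3:
        raise ValueError("at least three sequences are required")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    s = sum(1 for i in range(length) if len({seq[i] for seq in sequences}) > 1)
    if s == 0:
        raise ValueError("D is undefined without segregating sites")

    # pi: mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical star-like sample in which every mutation is a singleton.
toy_sequences = ["AAAAAA", "AAAAAT", "AAAATA", "AAATAA", "AATAAA"]
d_value = tajimas_d(toy_sequences)
```

The toy sample consists only of singleton mutations around a central haplotype, so the resulting D is negative, the same direction as all the values reported in the table.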

Table 3.4. Divergence time estimates in million years for each Lacerta lepida phylogroup using the method of Saillard et al. (2000) and for each monophyletic group using Beast (see text for explanation of each method). Saillard estimates are given as mean ± s.d.; n.a = not available.

| mtDNA phylogroups | Saillard (a), mutation rate 2% | Saillard (a), mutation rate 1.70% | Saillard (a), mutation rate 2.50% | Beast (b), mutation rate 1%: Low HPD | Beast (b): Mean | Beast (b): Upper HPD |
|---|---|---|---|---|---|---|
| L1 | 0.64 ± 0.12 | 0.75 ± 0.14 | 0.51 ± 0.10 | 0.28 | 0.76 | 1.32 |
| L2 | 0.45 ± 0.21 | 0.53 ± 0.24 | 0.36 ± 0.16 | 0.21 | 0.47 | 0.78 |
| L3 | 0.54 ± 0.34 | 0.63 ± 0.40 | 0.43 ± 0.27 | n.a | n.a | n.a |
| L4 | 0.92 ± 0.30 | 1.08 ± 0.35 | 0.73 ± 0.24 | n.a | n.a | n.a |
| L5 | 0.59 ± 0.19 | 0.70 ± 0.22 | 0.47 ± 0.15 | 0.29 | 0.61 | 0.98 |
| N | 0.85 ± 0.34 | 0.99 ± 0.41 | 0.68 ± 0.28 | n.a | n.a | n.a |
| L2+L3 | n.a | n.a | n.a | 0.82 | 1.50 | 2.27 |
| L1+L2+L3+L4+L5 | n.a | n.a | n.a | 1.13 | 1.96 | 2.91 |
| All (L+N) | n.a | n.a | n.a | 5.58 | 9.43 | 13.66 |
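The three Saillard columns differ only in the assumed mutation rate, and the tabulated means are consistent, within rounding, with ages scaling inversely with the rate. The sketch below illustrates that relationship using the 2% column as input. The inverse scaling is an assumption inferred from the table rather than a description of the original calculation, and the function name is hypothetical.

```python
def rescale_age(age_ma, rate_used, new_rate):
    """Rescale an age estimate (in Ma) from one mutation rate to another,
    assuming the age is inversely proportional to the rate."""
    if rate_used <= 0 or new_rate <= 0:
        raise ValueError("mutation rates must be positive")
    return age_ma * rate_used / new_rate


# Mean ages (Ma) under the 2% rate, taken from the table above.
ages_at_2_percent = {
    "L1": 0.64, "L2": 0.45, "L3": 0.54, "L4": 0.92, "L5": 0.59, "N": 0.85,
}

# Rates are given in percent, as in the table header.
ages_at_1_7_percent = {
    group: rescale_age(age, 2.0, 1.7) for group, age in ages_at_2_percent.items()
}
ages_at_2_5_percent = {
    group: rescale_age(age, 2.0, 2.5) for group, age in ages_at_2_percent.items()
}
```

A slower assumed rate lengthens every age by the same factor, so the relative ordering of the phylogroups is unaffected by the choice of rate.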

3.7. References

Alexandrino J, Arntzen JW, Ferrand N (2002) Nested clade analysis and the genetic evidence for population expansion in the phylogeography of the golden-striped salamander, Chioglossa lusitanica (Amphibia: Urodela). Heredity 88, 66-74.

Alexandrino J, Froufe E, Arntzen JW, Ferrand N (2000) Genetic subdivision, glacial refugia and postglacial recolonization in the golden-striped salamander, Chioglossa lusitanica (Amphibia: Urodela). Molecular Ecology 9, 771-781.

Arntzen JW, Garcia-Paris M (1995) Morphological and allozyme studies of midwife toads (genus Alytes), including the description of two new taxa from Spain. Contributions to Zoology 65, 5-34.

Avise JC (2004) Molecular Markers, Natural History and Evolution, 2nd edn. Sinauer Associates, Sunderland, Massachusetts.

Avise JC, Arnold J, Ball RM, Bermingham E, Lamb T, Neigel JE, Reeb CA, Saunders NC (1987) Intraspecific phylogeography: the mitochondrial DNA bridge between population genetics and systematics. Annual Review of Ecology and Systematics 18, 489-522.

Bandelt HJ, Forster P, Rohl A (1999) Median-joining networks for inferring intraspecific phylogenies. Molecular Biology and Evolution 16, 37-48.

Biju-Duval C, Ennafaa H, Dennebouy N, Monnerot M, Mignotte F, Soriguer R, El Gaied A, El Hili A, Monoulou JC (1991) Mitochondrial DNA evolution in Lagomorphs: origin of systematic heteroplasmy and organization of diversity in European rabbits. Journal of Molecular Evolution 33, 92-102.

Cassens I, Mardulyn P, Milinkovitch MC (2005) Evaluating intraspecific network construction methods using simulated sequence data: do existing algorithms outperform the global Maximum Parsimony approach? Systematic Biology 54, 363-372.

Castro-Parga I, Moreno JC, Christopher S, Humphries J, Williams PH (1996) Strengthening the Natural and National Park system of Iberia to conserve vascular plants. Botanical Journal of the Linnean Society 121, 189-206.

Clement M, Posada D, Crandall KA (2000) TCS: a computer program to estimate gene genealogies. Molecular Ecology 9, 1657-1659.

Cooper SJB, Hewitt GM (1993) Nuclear DNA sequence divergence between parapatric subspecies of the grasshopper Chorthippus parallelus. Insect Molecular Biology 2, 185-194.

Cooper SJB, Ibrahim KM, Hewitt GM (1995) Postglacial expansion and genome subdivision in the European grasshopper Chorthippus parallelus. Molecular Ecology 4, 49-60.

Crandall KA, Templeton AR (1993) Empirical tests of some predictions from coalescent theory with applications to intraspecific phylogeny reconstruction. Genetics 134, 959-969.

Dolman G, Phillips B (2004) Single copy nuclear DNA markers characterized for comparative phylogeography in Australian wet tropics rainforest skinks. Molecular Ecology Notes 4, 185-187.

Dowling DK, Friberg U, Lindell J (2008) Evolutionary implications of non-neutral mitochondrial genetic variation. Trends in Ecology & Evolution 23, 546-554.

Drummond AJ, Ho SYW, Phillips MJ, Rambaut A (2006) Relaxed Phylogenetics and Dating with Confidence. PLoS Biology 4, e88.

Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evolutionary Biology 7, 214.

Emerson BC (2007) Alarm bells for the Molecular Clock? No support for Ho et al.'s model of time-dependent molecular rate estimates. Systematic Biology 56, 337-345.

Excoffier L, Laval G, Schneider S (2005) Arlequin ver. 3.0: An integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online 47-50.

Fritz U, Barata M, Busack SD, Fritzsch G, Castilho R (2006) Impact of mountain chains, sea straits and peripheral populations on genetic and taxonomic structure of a freshwater turtle, Mauremys leprosa (Reptilia, Testudines, Geoemydidae). Zoologica Scripta 35, 97-108.

Fu YX (1997) Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147, 915-925.

García-Barros E, Gurrea P, Luciáñez MJ, Cano JM, Munguira ML, Moreno JC, Sainz H, Sanz MJ, Simón JC (2002) Parsimony analysis of endemicity and its application to animal and plant geographical distributions in the Ibero-Balearic region (western Mediterranean). Journal of Biogeography 29, 109-124.

Garcia-Paris M, Alcobendas M, Alberch P (1998) Influence of the Guadalquivir river basin on mitochondrial DNA evolution of Salamandra salamandra (Caudata: Salamandridae) from southern Spain. Copeia 1998, 173-176.

Gibbs MJ, Armstrong JS, Gibbs AJ (2000) Sister-Scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics 16, 573-582.

Godinho R, Crespo EG, Ferrand N (2008) The limits of mtDNA phylogeography: complex patterns of population history in a highly structured Iberian lizard are only revealed by the use of nuclear markers. Molecular Ecology 17 , 4670-4683.

Godinho R, Mendonca B, Crespo EG, Ferrand N (2006) Genealogy of the nuclear beta-fibrinogen locus in a highly structured lizard species: comparison with mtDNA and evidence for intragenic recombination in the hybrid zone. Heredity 96 , 454-463.

Gomez-Campo C, Bermudez-de-Castro L, Cagiga MG, Sanchez-Yelamo MD (1984) Endemism in the Iberian Peninsula. Webbia 38 , 709-714.

Gomez A, Lunt DH (2007) Refugia within refugia: patterns of phylogeographic concordance in the Iberian Peninsula. In: Phylogeography of Southern European Refugia (eds. Weiss S, Ferrand N). Springer, Dordrecht.

Hall TA (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows95/98/NT. Nucleic Acids Symposium Series 41 , 95–98.

Harpending HC (1994) Signature of ancient population growth in a low-resolution mitochondrial DNA mismatch distribution. Human Biology 66 , 591-600.

Harpending HC, Sherry ST, Rogers AR, Stoneking M (1993) The Genetic Structure of Ancient Human Populations. Current Anthropology 34 , 483-496.

Harris DJ, Sá-Sousa P (2002) Molecular phylogenetics of Iberian Wall lizards (Podarcis): Is Podarcis hispanica a species complex? Molecular Phylogenetics and Evolution 23 , 75-81.

Hasegawa M, Kishino H, Yano T-A (1985) Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. Journal of Molecular Evolution 22 , 160-174.

Hewitt GM (1996) Some genetic consequences of ice ages, and their role, in divergence and speciation. Biological Journal of the Linnean Society 58 , 247-276.

Hewitt GM (1999) Post-glacial re-colonization of European biota. Biological Journal of the Linnean Society 68 , 87-112.

Hewitt GM (2000) The genetic legacy of the Quaternary ice ages. Nature 405 , 907- 913.

Hewitt GM (2004) Genetic consequences of climatic oscillations in the Quaternary. Philosophical Transactions of the Royal Society B: Biological Sciences 359 , 183-195.

Leaché AD, McGuire JA (2006) Phylogenetic relationships of horned lizards (Phrynosoma) based on nuclear and mitochondrial data: Evidence for a misleading mitochondrial gene tree. Molecular Phylogenetics and Evolution 39 , 628-644.

Lindell J, Méndez-de la Cruz FR, Murphy RW (2008) Deep biogeographical history and cytonuclear discordance in the black-tailed brush lizard ( Urosaurus nigricaudus ) of Baja California. Biological Journal of the Linnean Society 94 , 89-104.

Marjoram P, Donnelly P (1994) Pairwise comparisons of mitochondrial DNA sequences in subdivided populations and Implications for early Human evolution. Genetics 136 , 673-683.

Martin AP, Palumbi SR (1993) Body size, metabolic rate, generation time, and the molecular clock. Proceedings of the National Academy of Sciences of the United States of America 90 , 4087-4091.

Martin DP, Rybicki E (2000) RDP: detection of recombination amongst aligned sequences. Bioinformatics 16 , 562-563.

Martin DP, Williamson C, Posada D (2005) RDP2: recombination detection and analysis from sequence alignments. Bioinformatics 21 , 260-262.

Martínez-Solano I (2004) Phylogeography of Iberian Discoglossus (Anura: Discoglossidae). Journal of Zoological Systematics & Evolutionary Research 42 , 298-305.

Martínez-Solano I, Gonçalves HA, Arntzen JW, García-París M (2004) Phylogenetic relationships and biogeography of midwife toads (Discoglossidae: Alytes ). Journal of Biogeography 31 , 603-618.

Martínez-Solano I, Teixeira J, Buckley D, Garcia-Paris M (2006) Mitochondrial DNA phylogeography of Lissotriton boscai (Caudata, Salamandridae): evidence for old, multiple refugia in an Iberian endemic. Molecular Ecology 15 , 3375-3388.

McGuire JA, Linkem CW, Koo MS, Hutchison DW, Kristopher A, David L, Orange I, Lemos-Espinal J, Riddle BR, Jaeger JR (2007) Mitochondrial introgression and incomplete lineage sorting through space and time: phylogenetics of Crotaphytid lizards. Evolution 61 , 2879-2897.

Mesquita N, Hanfling B, Carvalho GR, Coelho MM (2005) Phylogeography of the cyprinid Squalius aradensis and implications for conservation of the endemic freshwater fauna of southern Portugal. Molecular Ecology 14 , 1939-1954.

Nichols RA, Hewitt GM (1994) The genetic consequences of long distance dispersal during colonization. Heredity 72 , 312-317.

Padidam M, Sawyer S, Fauquet CM (1999) Possible emergence of new geminiviruses by frequent recombination. Virology 265 , 218-225.

Paulo OS, Dias C, Bruford MW, Jordan WC, Nichols RA (2001) The persistence of Pliocene populations through the Pleistocene climatic cycles: evidence from the phylogeography of an Iberian lizard. Proceedings of the Royal Society B: Biological Sciences 268 , 1625-1630.

Paulo OS, Jordan WC, Bruford MW, Nichols RA (2002) Using nested clade analysis to assess the history of colonization and the persistence of populations of an Iberian Lizard. Molecular Ecology 11 , 809-819.

Paulo OS, Pinheiro J, Miraldo A, Bruford MW, Jordan WC, Nichols RA (2008) The role of vicariance vs. dispersal in shaping genetic patterns in ocellated lizard species in the western Mediterranean. Molecular Ecology 17 , 1535-1551.

Pinho C, Harris DJ, Ferrand N (2007) Contrasting patterns of population subdivision and historical demography in three western Mediterranean lizard species inferred from mitochondrial DNA variation. Molecular Ecology 16 , 1191- 1205.

Pinho C, Harris DJ, Ferrand N (2008) Non-equilibrium estimates of gene flow inferred from nuclear genealogies suggest that Iberian and North African wall lizards (Podarcis spp.) are an assemblage of incipient species. BMC Evolutionary Biology 8, 63.

Posada D, Crandall KA (1998) MODELTEST: testing the model of DNA substitution. Bioinformatics 14 , 817-818.

Posada D, Crandall KA (2001a) Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc Natl Acad Sci USA 98 , 13757-13762.

Posada D, Crandall KA (2001b) Intraspecific gene genealogies: trees grafting into networks. Trends in Ecology & Evolution 16 , 37-45.

Posada D, Crandall KA, Templeton AR (2000) GeoDis: a program for the cladistic nested analysis of the geographical distribution of genetic haplotypes. Molecular Ecology 9, 487-488.

Prychtko TM, Moore WM (1997) The utility of DNA sequences of an intron from the beta-fibrinogen gene in phylogenetic analysis of woodpeckers (Aves: Picidae). Molecular Phylogenetics and Evolution 8, 193-204.

Rambaut A, Drummond A (2005) Tracer v1.3. Available from http://beast.bio.ed.ac.uk/Tracer .

Rambaut A, Drummond A (2007) BEAUTi v1.4.2. In: Bayesian Evolutionary Analysis Utility . Available from: http://beast.bio.ed.ac.uk/BEAUTi .

Ramos-Onsins SE, Rozas J (2002) Statistical properties of new neutrality tests against population growth. Molecular Biology and Evolution 19 , 2092-2100.

Rogers AR, Harpending H (1992) Population growth makes waves in the distribution of pairwise genetic differences. Molecular Biology and Evolution 9, 552-569.

Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R (2003) DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19 , 2496-2497.

Saillard J, Forster P, Lynnerup N, Bandelt H-J, Nørby S (2000) mtDNA variation among Greenland Eskimos: the edge of the Beringian expansion. The American Journal of Human Genetics 67 , 718-726.

Saiz JCM, Parga IC, Ollero HS (1998) Numerical analyses of distributions of Iberian and Balearic endemic monocotyledons. Journal of Biogeography 25 , 179- 194.

Santucci F, Emerson BC, Hewitt GM (1998) Mitochondrial DNA phylogeography of European hedgehogs. Molecular Ecology 7, 1163-1172.

Seddon JM, Santucci F, Reeve NJ, Hewitt GM (2001) DNA footprints of European hedgehogs, Erinaceus europaeus and E. concolor : Pleistocene refugia, postglacial expansion and colonization routes. Molecular Ecology 10 , 2187- 2198.

Sequeira F, Alexandrino J, Rocha S, Arntzen JW, Ferrand N (2005) Genetic exchange across a hybrid zone within the Iberian endemic golden-striped salamander, Chioglossa lusitanica . Molecular Ecology 14 , 245-254.

Sequeira F, Ferrand N, Harris DJ (2006) Assessing the phylogenetic signal of the nuclear β-Fibrinogen intron 7 in salamandrids (Amphibia: Salamandridae). Amphibia-Reptilia 27 , 409-418.

Slatkin M, Hudson RR (1991) Pairwise comparisons of mitochondrial DNA sequences in stable and exponentially growing populations. Genetics 129 , 555-562.

Smith JM (1992) Analyzing the mosaic structure of genes. Journal of Molecular Evolution 34 , 126-129.

Stephens M, Scheet P (2005) Accounting for decay of linkage disequilibrium in haplotype inference and missing-data imputation. The American Journal of Human Genetics 76 , 449-462.

Stephens M, Smith NJ, Donnelly P (2001) A new statistical method for haplotype reconstruction from population data. The American Journal of Human Genetics 68 , 978-989.

Taberlet P, Bouvet J (1994) Mitochondrial DNA polymorphism, phylogeography, and conservation genetics of the brown bear Ursus arctos in Europe. Proceedings of the Royal Society B: Biological Sciences 255 , 195-200.

Taberlet P, Fumagalli L, Wust-Saucy A-G, Cosson J-F (1998) Comparative phylogeography and postglacial colonization routes in Europe. Molecular Ecology 7, 453-464.

Tajima F (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123 , 585-595.

Templeton AR, Crandall KA, Sing CF (1992) A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III. Cladogram estimation. Genetics 132 , 619-633.

Thorpe RS, Surget-Groba Y, Johansson H (2008) The relative importance of ecology and geographic isolation for speciation in anoles. Philosophical Transactions of the Royal Society B: Biological Sciences 363 , 3071-3081.

Ujvari B, Dowton M, Madsen T (2008) Population genetic structure, gene flow and sex-biased dispersal in frillneck lizards ( Chlamydosaurus kingii ). Molecular Ecology 17 , 3557-3564.

Wallis GP, Arntzen JW (1989) Mitochondrial-DNA variation in the crested newt superspecies: limited cytoplasmic gene flow among species. Evolution 43 , 88-104.

Woolley SM, Posada D, Crandall KA (2008) A comparison of phylogenetic network methods using computer simulation. PLoS ONE 3, e1913.

Zagwijn WH (1992) Migration of vegetation during the Quaternary in Europe. Courier Forschungsinstitut Senckenberg 153 , 9-20.

Zink RM, Barrowclough GF (2008) Mitochondrial DNA under siege in avian phylogeography. Molecular Ecology 17 , 2107-2121.

Chapter 4

Genetic analysis of a secondary contact zone between Lacerta lepida lepida and Lacerta lepida nevadensis

Pattern of dorsal scales in Lacerta l. lepida and L. l. nevadensis*. Photos by Andreia Miraldo.

*Lacerta lepida lepida from sampling site 5 and nevadensis from sampling site 1 in chapter 4

4. Genetic analysis of a secondary contact zone between Lacerta lepida lepida and Lacerta lepida nevadensis

4.1. Abstract

Lacerta lepida has endured repeated range fragmentation that has promoted diversification within the species, and a very old split separating lineages N (subspecies Lacerta lepida nevadensis) and L (subspecies Lacerta lepida lepida) was identified in the previous chapter. Using mtDNA and microsatellite data, a population genetic analysis of an area of secondary contact between the two mitochondrial lineages was performed. Levels of gene flow across the zone were assessed to clarify whether the divergent lineages are evolving independently or will coalesce due to high levels of gene flow. Hybridization between the lineages was detected by the presence of F1 hybrids, but the overall coincidence of mitochondrial and nuclear loci and the generality of the observed narrow clines in relation to dispersal support the idea that this contact zone is acting as a barrier to gene flow. Population genetic structure within each lineage was assessed, and estimates of genetic diversity inferred from mitochondrial DNA are higher than those revealed by nuclear markers, especially within lineage L. This suggests that female dispersal might be low relative to male dispersal. Despite some low levels of similarity in allele frequencies between the lineages, results suggest that lineages N and L are on independent evolutionary trajectories.

Key words: gene flow, tension zone, hybrids, selection, dispersal, clines, speciation

4.2. Introduction

The Biological Species Concept (BSC) defines species as “groups of actually or potentially interbreeding natural populations which are reproductively isolated from other such groups” (Mayr, 1963). Under this concept, speciation results from the development of reproductive isolation between diverging taxa, which as a general rule involves the disruption of genic interactions, a process commonly known as the “Dobzhansky-Muller model” (Bateson, 1909; Dobzhansky, 1937; Muller, 1942). The disruption of genic interactions might occur when populations become geographically isolated for a certain period of time, during which mutations arise and become fixed through natural selection or genetic drift. The newly accumulated mutations can lead to epistatic incompatibilities between genes responsible for ecological, physiological and/or behavioral differentiation, which are revealed when the differentiated populations meet and hybridize (Coyne and Orr, 2004). With growth in the use of molecular markers over recent years, an increasing number of studies have reported subspecific parapatric contacts within species (Avise, 2000; Avise et al., 1987). Species that reveal these parapatric contacts within their distributions may well represent different stages of speciation, and their study allows the genetic differences that may underlie this process to be quantified (Coyne and Orr, 2004; Hewitt, 1988). For example, assessing the level of reproductive isolation between evolutionary lineages at regions of contact may help us to understand whether they will remain isolated, coalesce, or sustain some intermediate degree of gene flow. Thus, examining the degree of reproductive isolation that populations have reached through the process of differentiation is an important issue from both evolutionary and conservation perspectives (Avise, 2000; Crandall, 2000; Moritz, 2002).
According to the genic view of speciation (Wu, 2001), if the incompatibilities accumulated during the period of divergence involve only small parts of the genome, the rest may mix freely upon contact, allowing for high levels of gene flow between the diverging populations. At this point populations might still fuse and the process of differentiation can be reversed, with the result being a single genetic entity. However, if differentiation is more extensive, populations continue to diverge even upon contact and there will be a point at which gene flow is impeded. Only after this point (the “point of no return”, following Wu (2001)) is complete reproductive isolation achieved, and speciation (according to the BSC) is therefore complete. It should be noted that Wu’s view of speciation requires that reproductive isolation precedes adaptive differences, which might not always be the case (see Alphen and Seehausen, 2001; Bridle and Ritchie, 2001; Mayr, 2001; Rieseberg and Burke, 2001; Vogler, 2001 for examples and other views), but it seems to offer a plausible explanation for speciation events initiated by a period of allopatric isolation. In fact, with increasing allopatric divergence, gene flow at the time of contact is progressively impeded by increasing heterozygote unfitness (due to the accumulation of mutations), until a threshold is reached at which the genome is strongly linked and introgression greatly reduced (Barton and Hewitt, 1983; Barton and Hewitt, 1985; Barton and Hewitt, 1989). Stages of incipient speciation are being reported at parapatric contact zones, with different degrees of genetic isolation being found. Measuring the diffusion of genes between evolutionary units through zones of secondary contact allows the detection of gene flow by means of hybridization and backcrossing, providing insight into the extent and nature of the reproductive isolation that has been achieved (Barton and Hewitt, 1985; Harrison, 1990; Harrison, 1993; Hewitt, 1988). When genetically distinct populations meet and hybrids are produced, selection can act on the new combinations of alleles that result from recombination.
This will create clines at specific loci, whose shape and width will be determined by a balance between dispersal into the zone and the strength of selection acting on them. If selection pressures vary among loci, different levels of introgression through the hybrid zone are expected for different loci, resulting in clines that are neither coincident nor concordant (e.g. Butlin and Hewitt, 1985). Clines associated with neutral mixing will be wider than clines associated with selection; it is thus possible to detect genes that might be under selection, as they will introgress less than neutral markers. However, if the degree of genetic differentiation achieved is extensive and selection against hybrids is strong, changes in allele frequencies across the zone of contact will be steep and a pattern of cline coincidence and concordance among loci is expected (e.g. Szymura and Barton, 1986). With time, due to epistatic effects between loci and genome-wide linkage disequilibria, cline concordance and coincidence will spread to independent loci, resulting in a strong tension zone (Barton, 1983; Barton and Gale, 1993; Barton and Hewitt, 1985; Barton and Shpak, 2000; Gavrilets, 1997). In this case, populations in the hybrid zone act as genetic barriers to gene flow. Therefore, narrow hybrid zones with coincident and concordant clines are more likely to effectively maintain isolation and further promote divergence between already diverged forms. In previous chapters it has been shown that Lacerta lepida has endured repeated range fragmentation that has promoted diversification within this species. Mitochondrial DNA lineages identified within Lacerta lepida have non-overlapping geographic ranges, supporting the idea of allopatric differentiation in multiple refugia during the Mio-Plio-Pleistocene (chapter 3).
The oldest split within the group was estimated to have occurred around 9 Mya (chapter 3) and corresponds to the split between lineage N (subspecies Lacerta lepida nevadensis) and lineage L (subspecies Lacerta lepida lepida). These evolutionary lineages are inferred to have undergone range expansions following the last ice age, probably establishing zones of secondary contact (chapter 3). Recently, Paulo et al. (2008) suggested elevating lineage N to a new species, due to the high levels of mitochondrial genetic differentiation detected between lineages L and N and the existence of morphological differences between them (Mateo and Castroviejo, 1990; Mateo and López-Jurado, 1994; Mateo et al., 1996) (Appendix 1 shows pictures of lizards from the two lineages). Furthermore, differences between the lineages also seem to be supported by high levels of allozyme differentiation (Mateo et al., 1996). Despite the apparently deep historical subdivision within the species, as revealed by mitochondrial genealogy and differentiation at allozyme markers, it remains unclear whether these divergent lineages are evolving independently or will in fact coalesce due to high levels of gene flow at zones of secondary contact. Recent studies focusing on reptiles have revealed that deep mtDNA divergence does not always correspond to nuclear genomic divergence (Lindell et al., 2005a; Lindell et al., 2008a; Rassmann et al., 1997; Stenson et al., 2002; Thorpe et al., 2008b; Ujvari et al., 2008), with male-biased dispersal being invoked as the main explanation for such discrepancies. In such cases, males act as the main agents of gene flow, leading to reduced genetic structuring in biparentally inherited markers compared to maternally inherited mtDNA. The description of individuals with seemingly hybrid morphology across a putative zone of secondary contact between the lineages (Mateo and López-Jurado, 1994) suggests that hybridization is a feature of the interaction between these two taxa. In this chapter a population genetic analysis of an area of secondary contact between the two Lacerta lepida mitochondrial lineages, N and L, is undertaken using both mitochondrial and nuclear markers. The specific objectives of this work are: 1) to identify the exact geographic location of the putative contact zone between the mitochondrial lineages; 2) to investigate whether a barrier to gene exchange exists at the contact zone, by analyzing clinal patterns of genetic variation (shape, coincidence and concordance of clines); 3) to assess levels of gene flow across the contact zone, describing the geographical extent of introgression; and finally 4) to clarify whether both lineages can be considered good species.

4.3. Materials and methods

4.3.1. Sampling strategy and collection

Sampling was conducted in 2007 along a Southeast-Northwest transect perpendicular to the location of the putative contact zone between mitochondrial lineages L4 and N (hereafter described as the Lepida and Nevadensis lineages, respectively) (Fig. 4.1.). In order to identify the exact location of contact, nine populations were sampled along this transect, with the first sampled population corresponding to the south-eastern limit of the geographic distribution of Nevadensis. The remaining populations were located to the north-west of the first, each 20 to 50 km away from the closest sampled population. Lizards were captured using tomahawk traps or by hand, and tissue samples were taken by clipping 1 cm of the tail tip, which was subsequently preserved in 100% ethanol. After tissue sampling, animals were immediately released back into the wild at the place of capture. Geographic coordinates of sampling sites were recorded with a GPS. All lizards were captured under appropriate license.

4.3.2. Laboratory procedures

Total genomic DNA was extracted from ethanol-preserved muscle tissue using the protocol described in section 2.3.2. The entire (1143 bp) cytochrome b (cytb) gene was amplified with primers TRNAGLU and TRNATHR (see chapter 2). These primers were shown to be specific for the mitochondrial cytb gene and do not amplify Numts. Protocols for the amplification and sequencing of the entire cytb gene were as in section 2.3.4.

Microsatellites

To obtain genotypic profiles for the lineages under study, eight polymorphic microsatellite loci were amplified for all samples. As there are no microsatellites specifically characterized for Lacerta lepida, microsatellite loci characterized for other lacertid lizard species and previously shown to be polymorphic within Lacerta lepida (Nunes, unpublished data) were amplified: loci C9, B4 and D1, characterized in Podarcis muralis (Nembrini and Oppliger, 2003); loci PB66 and PB73, characterized in Podarcis bocagei (Pinho et al., 2004); locus LV-4-72, from Lacerta vivipara (Boudjemadi et al., 1999); locus LVIR17, characterized in Lacerta viridis (Böhme et al., 2005); and locus LIZ24, characterized in Lacerta schreiberi (Paulo, unpublished data). PCRs were carried out in a final volume of 10 µL containing 5x PCR buffer, 2.0 µL of 10x Go Taq® Buffer, 2.0 mM of MgCl2, 0.2 mM of each dNTP, 0.5 µM of each primer, 2 µg of BSA, 0.5 units of Go Taq® DNA polymerase and approximately 50 ng of DNA. Locus C9 was amplified together with locus PB66, and locus D1 with locus PB73, in two different PCR duplexes. All other loci were amplified individually in independent PCRs. All PCRs were performed in a DNA Engine Tetrad 2 Peltier thermal cycler using the following profile: initial incubation at 94 °C for 3 min, followed by 30 cycles of denaturing at 94 °C for 30 s, annealing for 30 s (at a locus-specific temperature, Table 4.1.) and extension at 72 °C for 30 s, plus a final extension at 72 °C for 30 min. PCR products of the 8 loci were combined in two different mixes that allow loci to be distinguished by fluorescent dye and allele size. Mix A included loci D1, LV-4-72, LVIR17 and PB73, and Mix B included loci B4, C9, LIZ24 and PB66. To determine fragment length, 1 µL of either Mix A or Mix B was added to 8.9 µL of Hi-Di™ Formamide and 0.1 µL of GeneScan™-500 ROX™ size standard. Each mix was run on an automated ABI Prism 377 and peaks were visualized with GeneMapper software version 4.0 (Applied Biosystems).

4.3.3. Data analyses

Mitochondrial DNA data

DNA sequences were aligned by eye using BioEdit Sequence Alignment Editor 7.01 (Hall, 1999). All sequences were trimmed to 627 bp before further analysis. In order to assign each sample to the correct mitochondrial lineage (Lepida versus Nevadensis), median joining (MJ) (Bandelt et al., 1999) and statistical parsimony (SP) (Templeton et al., 1992) networks were constructed. The MJ network was computed with the program NETWORK 4.5.0 (www.fluxus-engineering.com) keeping the parameter ε = 0, which does not allow less parsimonious pathways to be included in the analysis. The SP network was inferred using the program TCS 1.21 (Clement et al., 2000) with a connection limit of 70 mutational steps. In order to identify which haplotypes were new, sequences from lineages L4 and N (chapter 3) were added to the dataset and new networks were constructed. For each sampling site the frequency of each haplotype and the frequency of Lepida and Nevadensis haplotypes were calculated. For each population, haplotype diversity (H) and nucleotide diversity (π) were also calculated. Population structure within each mtDNA lineage was assessed by estimating values of ΦST (a mtDNA analogue of FST; Excoffier et al. (1992)) for each locality and performing hierarchical analyses of molecular variance (AMOVAs), using localities as groups. Pairwise ΦST values between localities were also calculated as a measure of population genetic differentiation. All tests were performed in ARLEQUIN version 3.11 (Excoffier et al., 2005).
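As an illustration of the two diversity indices mentioned above, the following minimal sketch computes haplotype diversity (H) and nucleotide diversity (π) from a set of aligned sequences. It assumes the standard sample-size-corrected form of H and the uncorrected mean pairwise proportion of differing sites for π; the function names and the toy alignment are hypothetical, and ARLEQUIN may apply additional corrections not reproduced here.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(sequences):
    """Haplotype diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(sequences)
    sum_sq = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(sequences):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(sequences, 2)
    )
    n_pairs = n * (n - 1) // 2
    return differences / (n_pairs * length)


# Hypothetical toy alignment (not data from this study).
toy = ["AAAA", "AAAA", "AAAT", "AATT"]
h = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
```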

Microsatellite data

Overall assessment of microsatellite variability

As this is the first time that this set of 8 microsatellite loci has been applied to a population-level study in Lacerta lepida, and as almost every locus was characterized in a different species in an independent study, tests for non-random associations between diploid genotypes at each pair of loci were carried out. Tests for linkage disequilibrium were performed through a log-likelihood ratio statistic (G-test) using the Markov chain algorithm as implemented in GENEPOP 4.0 (Rousset, 2008). For each locus, levels of polymorphism across and within populations were determined by assessing allele number and frequency. Deviations from expectations of Mendelian inheritance were tested for each population at each locus using exact tests (Raymond and Rousset, 1995) to check for the presence of heterozygote deficits. Departures from Hardy-Weinberg equilibrium (HWE) can be due to biological factors such as population structure, non-random mating and selection against hybrids. Nevertheless, a departure from HWE can also be due to technical issues that occur during microsatellite amplification, such as the presence of null alleles. Null alleles arise when mutations in the microsatellite flanking regions prevent the amplification of certain alleles. This can result in heterozygotes being wrongly identified as homozygotes due to the non-amplification of one of the alleles, leading to an apparent excess of homozygotes and departures from HWE. It has been shown that the frequency of null alleles in a congeneric species rapidly increases with increasing phylogenetic distance from the focal species (e.g. Li et al., 2003). As the microsatellites used in this study were not specifically characterized for L. lepida, flanking region mutations are more likely to have occurred, increasing the probability of non-amplification of alleles.
To address this issue, the presence of null alleles was assessed using the program MICRO-CHECKER version 2.2.1 (Van Oosterhout et al., 2004). When null alleles were detected, their frequency was estimated using the methodology of Dempster et al. (1977), assuming that any detected heterozygote deficit is due to the presence of null alleles and not to population structure (Wahlund effect).
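The heterozygote deficit discussed above can be illustrated with a short descriptive calculation. The sketch below is not the exact test of Raymond and Rousset (1995) nor the estimator of Dempster et al. (1977); it only contrasts observed heterozygosity with the unbiased expected heterozygosity at a single locus and reports FIS = 1 - Ho/He, where a positive value indicates fewer heterozygotes than expected. The function name and the genotypes are hypothetical.

```python
from collections import Counter


def heterozygosity_summary(genotypes):
    """Return (Ho, He, FIS) for one locus in one population.

    genotypes: list of (allele_1, allele_2) tuples, one per diploid individual.
    """
    n = len(genotypes)
    if n < 2:
        raise ValueError("at least two individuals are required")
    # Observed heterozygosity: proportion of individuals carrying two different alleles.
    observed = sum(a != b for a, b in genotypes) / n
    # Allele frequencies are estimated from the 2n sampled gene copies.
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    copies = 2 * n
    sum_sq = sum((count / copies) ** 2 for count in allele_counts.values())
    # Unbiased expected heterozygosity (gene diversity).
    expected = copies / (copies - 1) * (1.0 - sum_sq)
    fis = 1.0 - observed / expected if expected > 0 else float("nan")
    return observed, expected, fis


# Hypothetical genotypes (allele sizes in bp), not data from this study.
toy_genotypes = [(100, 100), (100, 100), (104, 104), (100, 104)]
ho, he, fis = heterozygosity_summary(toy_genotypes)
```

A locus with a null allele would tend to show this kind of pattern in every population, whereas a Wahlund effect should affect all loci in a structured sample.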

Population based analysis

Allele richness, gene diversity (expected heterozygosity) and the inbreeding coefficient within populations (FIS) were calculated. Levels of genetic differentiation amongst pairs of populations were assessed by multilocus estimates of FST. All tests were carried out in GENEPOP 4.0 (Rousset, 2008). Furthermore, isolation by distance within each mitochondrial lineage was assessed by regressing a pairwise matrix of FST values (linearised as FST/(1-FST)) against the geographic distance between localities. The statistical significance of the association was determined with a Mantel test using the online version of the software IBD (Bohonak, 2002).
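The isolation-by-distance procedure described above can be sketched as follows. This is an illustrative permutation-based Mantel test written with NumPy, not the implementation used by the IBD software; the function names, the number of permutations and the toy matrices are assumptions made for the example.

```python
import numpy as np


def linearise_fst(fst):
    """Linearise pairwise FST values as FST / (1 - FST)."""
    fst = np.asarray(fst, dtype=float)
    return fst / (1.0 - fst)


def mantel_test(x, y, permutations=999, seed=0):
    """One-tailed Mantel test between two symmetric distance matrices.

    Returns the Pearson correlation of the upper triangles and a permutation
    p-value obtained by jointly permuting rows and columns of y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    upper = np.triu_indices(n, k=1)
    observed = np.corrcoef(x[upper], y[upper])[0, 1]
    rng = np.random.default_rng(seed)
    at_least_as_large = 0
    for _ in range(permutations):
        order = rng.permutation(n)
        permuted = y[np.ix_(order, order)]
        if np.corrcoef(x[upper], permuted[upper])[0, 1] >= observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return observed, p_value


# Hypothetical example: four localities along a line (km), FST rising with distance.
positions = np.array([0.0, 20.0, 50.0, 90.0])
geo = np.abs(positions[:, None] - positions[None, :])
fst = 0.002 * geo
r, p = mantel_test(geo, linearise_fst(fst))
```

With only a handful of localities per lineage the number of distinct permutations is small, so the attainable p-values are coarse; this is a property of the test rather than of the sketch.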

Admixture estimation

STRUCTURE version 2.2 (Pritchard et al., 2000) was used to evaluate the extent of admixture between the two mitochondrial lineages under study. Two datasets were analysed: one comprising all loci, and a reduced dataset from which loci that were not in HWE for the majority of localities had been removed. In an exploratory analysis to infer the number of genetically homogeneous groups of individuals (clusters, K) along the transect, several analyses were run changing the value of K from 2 to 9. The true value of K was chosen using information from the posterior probability (Ln P(D)) given by the software and from ΔK, a quantity based on the rate of change of the posterior probability with respect to the number of clusters, as defined by Evanno et al. (2005). According to these authors, in most cases the posterior probability given by STRUCTURE does not provide the correct estimate of the number of clusters in the data, while ΔK always shows a clear peak at the true value of K. Analyses were run 3 times for each K with a burn-in period of 50,000, to minimize the effect of the starting parameter settings, followed by 500,000 repetitions per run. Consistency and convergence of parameter estimates were checked by visualizing the plots of the parameters. After choosing the true value of K, an analysis with all loci was performed to evaluate levels of admixture along the transect. For all analyses the admixture model was assumed. According to this model individuals can have a mixed ancestry, inheriting a fraction of their genome from different ancestors. The posterior mean estimate of those fractions was used to estimate the proportion of membership of each sampling locality in each of the inferred clusters.
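The ΔK statistic used to choose K can be sketched as below. The example assumes one common formulation, in which ΔK is the absolute second-order difference of the mean Ln P(D) across replicate runs divided by the standard deviation of Ln P(D) at that K; the function name and the toy likelihood values are hypothetical. Under this formula ΔK is undefined for the smallest and largest K tested, because both neighbouring values are needed.

```python
import numpy as np


def delta_k(ln_prob_by_k):
    """Return {K: delta K} from {K: [Ln P(D) of each replicate run]}."""
    means = {k: float(np.mean(v)) for k, v in ln_prob_by_k.items()}
    sds = {k: float(np.std(v, ddof=1)) for k, v in ln_prob_by_k.items()}
    result = {}
    for k in sorted(ln_prob_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # second-order difference is undefined at the ends of the range
        second_diff = abs(means[k + 1] - 2.0 * means[k] + means[k - 1])
        result[k] = second_diff / sds[k] if sds[k] > 0 else float("nan")
    return result


# Hypothetical Ln P(D) values from three replicate runs per K (not results from this study).
toy_runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-800.0, -801.0, -799.0],
    3: [-790.0, -795.0, -785.0],
    4: [-789.0, -780.0, -798.0],
}
dk = delta_k(toy_runs)
best_k = max(dk, key=dk.get)
```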

Cline analysis

Mitochondrial cline analysis was performed using information about the frequency of Lepida haplotypes in each sampled locality. For the purpose of nuclear genome cline analysis, the mean proportion of membership of each sampling locality to Lepida using all microsatellite loci data was first calculated in STRUCTURE as described above. These data were used to estimate a multilocus cline from which levels of nuclear genome introgression across the contact zone could be inferred. Single locus clines were also estimated for loci that showed diagnostic shifts in allele frequencies across the transect. Those loci were collapsed into a two allele system, representing Lepida and Nevadensis alleles. All rare and low frequency alleles were allocated to either the Lepida or the Nevadensis group according to their occurrence along the transect. Maximum likelihood clines were fitted independently and plotted with ANALYSE (Barton and Baird, 1995). Cline fitting was performed by adding geographic information of sampling localities to the allele frequency data. Localities were collapsed into a one-dimensional transect, with geographic distances measured from the southernmost sampling site (Locality 1) (Fig. 4.1.). As sampling sites 7 and 8 do not fit well with the one-dimensional transect, they were not used for cline fitting purposes. Clines were fitted to the symmetric tanh curve (Barton and Gale, 1993), and the two parameters describing each curve (centre, c, and width, w) were estimated by the program. Estimation of both parameters started from approximate values of c calculated from the data and incorporated in the program. The centre of the cline is the point where the frequency of alleles switches above 0.5. Cline width was calculated as the inverse of the maximum slope of the cline curve (1/maximum slope) as described in Szymura and Barton (1986). The proportion of membership to Lepida (or the frequency of Lepida alleles in the case of single locus clines), p, was allowed to vary between pmin and pmax (the minimum and maximum gene frequency at the tail ends of the cline), which were estimated from the data and incorporated in the program.
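The shape of the symmetric tanh cline can be made concrete with a short sketch. It uses a common parameterisation, p(x) = pmin + (pmax - pmin)(1 + tanh(2(x - c)/w))/2; whether ANALYSE scales the tails by pmin and pmax in exactly this way is an assumption, and the function name is hypothetical. The centre and width used in the example are the mtDNA estimates reported in Table 4.6.

```python
from math import tanh


def tanh_cline(x, c, w, pmin=0.0, pmax=1.0):
    """Expected frequency at transect position x (same units as c and w).

    c is the cline centre and w the cline width. With pmin=0 and pmax=1 the
    maximum slope, reached at x = c, equals 1/w.
    """
    return pmin + (pmax - pmin) * 0.5 * (1.0 + tanh(2.0 * (x - c) / w))


# mtDNA estimates from Table 4.6 (km from locality 1).
c_mt, w_mt = 106.87, 2.70

centre_freq = tanh_cline(c_mt, c_mt, w_mt)

# Numerical check that width = 1 / maximum slope (central difference at c).
h = 1e-4
max_slope = (tanh_cline(c_mt + h, c_mt, w_mt) - tanh_cline(c_mt - h, c_mt, w_mt)) / (2 * h)
width_from_slope = 1.0 / max_slope
```

Under this parameterisation the frequency is exactly halfway between pmin and pmax at the centre, and a narrower w gives a steeper transition between the two lineages.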

4.4. Results

A total of 200 samples were collected with 19 to 30 samples collected per sampling site. Sampling sites and number of samples per site are shown in Fig. 4.1.

4.4.1. Mitochondrial DNA data

All cytb sequences represented uninterrupted open reading frames, with no gaps or premature stop codons, suggesting they are functional mitochondrial DNA copies. To allow comparisons with sequences from chapter 3, all sequences were trimmed to 627 bp. Of the 627 bp analysed, a total of 104 were variable, of which 85 were parsimony informative. Fifty eight unique haplotypes were obtained. The genealogical relationships between haplotypes inferred by the two approaches for network construction (MJ and SP) were identical (Fig. 4.2.), with two very divergent groups of haplotypes identified. The two haplotype clusters correspond to the two mitochondrial lineages under study, Lepida and Nevadensis, and are connected through 67 mutational steps. Of the 58 haplotypes sampled, 17 had been previously sampled in a broader phylogeographic analysis of Lacerta lepida (chapter 3), 21 represent new Lepida haplotypes and the remaining 20 represent new Nevadensis haplotypes. No populations were found to be admixed for both mtDNA lineages, with populations 1 to 4 being fixed for Nevadensis haplotypes while populations 5 to 9 were fixed for Lepida haplotypes (Fig. 4.1.). Haplotype frequencies within each lineage are represented in Fig. 4.3. Overall genetic differentiation among mtDNA lineages is shown in Table 4.2. Within Lepida some level of genetic structure was detected, with 11.5% (ΦST=0.1145, p=0.0) of the genetic variation occurring between sampled localities. This pattern was not found in Nevadensis (ΦST=0.014, p=0.16). Pairwise ΦST values between localities are shown in Table 4.3.

4.4.2. Nuclear DNA data

Overall assessment of microsatellite variability

From all microsatellite loci used in this study only locus LV-4 failed to amplify consistently across all populations and was therefore eliminated from further analyses. All loci had moderate to high levels of polymorphism, with the number of alleles over all populations ranging from 8 (LIZ24) to 23 (B4 and LVIR17) (Table 4.1.). Changes in allele frequencies over all localities and per locality are shown in Fig. 4.4. and Fig. 4.5. respectively. Generally, the most frequent allele at each locus is different for the two mitochondrial lineages (Fig. 4.4.), with the exception of the least polymorphic locus (LIZ24), where allele 115 has a frequency of 85% in Lepida and 99% in Nevadensis. The differences in allele frequencies per locus within each mitochondrial lineage are shown in Fig. 4.6. The patterns of allele frequency and allele sharing between lineages differ across loci; nevertheless, each mitochondrial lineage exhibits private alleles at each locus. Three loci (C9, LVIR17 and PB73) exhibit a clear difference in allele frequencies between the lineages, revealing a clear clinal transition. The clearest picture of clinal variation occurs at locus C9, where lineages show almost non-overlapping allele size ranges, with few shared alleles of intermediate size and low frequency (Fig. 4.6.). A similar pattern can be detected at loci LVIR17 and PB73. Nevertheless, at locus PB73 the frequency of private alleles from each lineage is very low when compared with the frequency of intermediate size shared alleles, which have the highest frequency in each of the lineages. The remaining loci show a broad overlap in allele sizes between Nevadensis and Lepida lineages, with the majority of high frequency alleles being shared among them. Although some alleles in these loci show evidence of clinal variation across the transect (e.g. alleles represented by dark green and turquoise colour in locus B4, Fig. 4.5.), the majority of alleles do not (e.g. alleles represented by grey and mustard colour in locus B4, Fig. 4.5.). Despite both lineages sharing the same most frequent allele at locus LIZ24, some level of differentiation is still detectable, as all other alleles, although occurring at very low frequencies, are private for the Nevadensis lineage.

Significant linkage disequilibrium was detected for 11 pairs of loci in 6 localities, with most of the non-random associations being detected at locality 4 (Table 4.4.). Generally, pairs of loci in linkage disequilibrium were not detected at more than one locality. High deviations from HWE were detected in the majority of loci when considering all populations, with the exception of locus LIZ24 (χ2=13.6, df=10, p=0.19). Tests of HWE for each locus/locality combination revealed 16 cases of significant heterozygote deficit (FIS>0; p<0.05) relative to what is expected under HWE (Table 4.5.). Loci LVIR17 and D1 showed the highest percentage of localities with a heterozygote deficit. HW disequilibrium in locus LVIR17 occurred mainly within the Lepida lineage while HW disequilibrium in D1 occurred only within the Nevadensis lineage. With the exception of localities 6 and 9, which show heterozygote deficits for the majority of loci, heterozygote deficits were usually detected at only a single locus within each locality. As HW deviations are not distributed consistently across loci or localities, they are possibly the result of local population effects. Nevertheless, the presence of null alleles was detected, which could explain some of the observed heterozygote deficits. In all cases where null alleles were detected, the estimated frequency never exceeded 0.16 (Table 4.5.). The estimation of null alleles assumes that there is no population structure in the data, and therefore these values can be viewed as overestimates. Even so, the estimated values are relatively low when compared to allele frequencies within each locus, and null alleles therefore most probably did not influence the data analysis.
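As a rough illustration of how a heterozygote deficit translates into a positive FIS, the simple ratio FIS = 1 - HO/HE can be applied to the heterozygote counts in Table 4.5. This is only an approximation: GENEPOP reports estimators that include sample size corrections, so its values can differ slightly from this ratio. The function name is hypothetical.

```python
def fis_simple(expected_het, observed_het):
    """Approximate FIS as 1 - HO/HE (no sample size correction).

    Positive values indicate fewer heterozygotes than expected under HWE,
    negative values indicate an excess.
    """
    return 1.0 - observed_het / expected_het


# Locus D1 at locality 1 (Table 4.5): 16.1 expected and 11 observed heterozygotes.
fis_d1_loc1 = fis_simple(16.1, 11)

# Locus PB66 at locality 3 (Table 4.5): 16.2 expected and 18 observed heterozygotes.
fis_pb66_loc3 = fis_simple(16.2, 18)
```

The first value is close to the FIS of 0.32 reported for that locus and locality, while the second is negative, reflecting a slight heterozygote excess.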

Population based analysis

FST values between pairs of localities are shown in Table 4.3. Two groups of localities can be identified that exhibit significant differentiation from each other. One group consists of localities 1 to 4 and the other comprises localities 5 to 9, with pairwise FST values between localities from the two groups ranging from 0.09 to 0.15. These two groups correspond to the Nevadensis and Lepida mitochondrial lineages respectively. Similar levels of overall multilocus estimates of FST were detected for each mtDNA lineage (FST=0.022 for Nevadensis and FST=0.024 for Lepida). Localities within each group show little genetic differentiation between them, with pairwise FST values ranging from 0.01 to 0.04 (Table 4.3.). Generally, higher pairwise FST values are reported within Nevadensis. This lineage also shows a trend towards isolation by distance (r²=0.27) (Fig. 4.7.), although this is not statistically significant (Mantel test, r=0.52; p=0.13).

Admixture estimation

Similar results were obtained when analysing the microsatellite data with STRUCTURE, which revealed K=2 as representing the most likely number of clusters. The highest values of L(K) were obtained when K=2 and K=9; nevertheless, there was a clear peak in ΔK when K=2 (Fig. 4.8.), which according to Evanno et al. (2005) reveals the true value of K. Both analysed datasets (dataset with all loci and reduced dataset with LVIR17 and D1 removed due to deviations from HW expectations in the majority of localities) gave similar results. The proportion of each locality assigned to each cluster is shown in Fig. 4.1. Localities 1 to 4 were assigned to cluster 1 (corresponding to the Nevadensis mitochondrial lineage), while localities 5 to 9 were assigned to cluster 2 (corresponding to the Lepida mitochondrial lineage). These results are concordant with the mitochondrial data. For each individual analysed, the proportion of assignment to each lineage can be seen in Fig. 4.9. Generally, the majority of individuals have a high proportion of assignment to one of the lineages (higher than 95%), with a few individuals showing some level of admixture (Fig. 4.9.). In individuals with admixed ancestry, a high proportion of the genome is usually assigned to one of the lineages (80 to 95%). Nevertheless, in locality 5 two individuals were identified as having more extensive admixture levels. One of them represents an F1 hybrid between Lepida and Nevadensis, having half of its nuclear genome assigned to each lineage. The other lizard likely represents a backcross of an F1 hybrid with a pure Lepida form, having 24% of its genome assigned to Nevadensis and the remainder assigned to Lepida.

4.4.3. Cline analysis

Loci LVIR17, C9 and PB73 showed diagnostic shifts in allele frequencies across the transect and were therefore used for single locus cline analysis. Maximum likelihood fitted clines are presented in Fig. 4.10. Cline centres are generally coincident, being located between localities 4 and 5, at distances from locality 1 ranging from 104 to 110 km depending on the marker used (Table 4.6.). Single locus cline centres derived from microsatellite data were more distant from locality 1 than the centre of the mtDNA cline, with the exception of locus LVIR17 (Fig. 4.10.b and Table 4.6.). Although cline centres do not differ much between loci, cline widths exhibit a greater extent of variation, with locus PB73 having the widest cline (40 km) and mitochondrial DNA the steepest (2.7 km). Although microsatellite single locus clines differ from the mtDNA cline in terms of width, this difference is considerably reduced (10.7 km versus 2.7 km) when a multilocus cline approach is taken (Fig. 4.10.a and Table 4.6.).

4.5. Discussion

In chapter 3 it was shown that Lacerta lepida has been subject to historical range fragmentation that promoted diversification within the species. The oldest split within the group represents the divergence between lineages N and L and was estimated to have occurred around 9 Mya, during the Miocene. Both evolutionary lineages are inferred to have undergone range expansions following the last ice age, establishing a zone of secondary contact. Evidence for the existence of a secondary contact zone between both lineages emerged with the discovery of one population containing both mitochondrial lineages (sampling site 33 in chapter 3) located just to the west end of the Sierra Nevada Mountains. Results presented here indicate that the hybrid zone occurs in the valley north of the Sierra Nevada Mountains, between localities 4 and 5. It is plausible to infer that the contact zone runs from the coastal area of Granada and contours the northern side of the Sierra Nevada Mountains in a north-east direction, reaching the western part of Sierra de Baza (Locality 4). Further inferences about the eastern extent of the contact zone cannot be made due to lack of sampling. Nevertheless, previous studies suggest that individuals with what resembles hybrid morphology between the lineages occur in the limits between the provinces of Murcia and Albacete and also in Valencia (Mateo and López-Jurado, 1994). Several taxa show phylogenetic breaks associated with the Betic mountains caused by a history of allopatry in this region (see Gomez and Lunt, 2007 and references therein), which also seems to be the case for Lacerta lepida. The Sierra Nevada is the highest altitudinal limit to the distribution of Lacerta lepida, where it reaches 2,400 m. At the peak of the last glacial maximum (LGM), conditions throughout the Sierra Nevada and adjacent mountains were most likely unsuitable for the persistence of Lacerta lepida, resulting in a distribution restricted to much lower altitudes. It is therefore probable that contact between the lineages occurred after the LGM, when an increase in temperature allowed populations to expand their ranges from refugial areas. Contact is thus estimated to have occurred approximately 15,000 ya (years ago), or perhaps even more recently due to the effect of the Younger Dryas (YD). The Younger Dryas was a period of rapid climatic change during the interglacial characterized by a dramatic fall in temperature, re-establishing conditions similar to the last glacial period. The Younger Dryas ended approximately 10,000 ya, with the start of the pre-boreal when the climate warmed markedly. Although the climatic changes during the YD are thought to have been less dramatic in southern Spain, an increase in steppe type vegetation in the region is registered during this period, especially at higher altitudes (Carrión et al., 1998; Carrión and Dupre, 1996). It is therefore likely that the Younger Dryas had an impact on the distribution of Lacerta lepida in the region, delaying or interrupting contact between lineages L and N until the end of the climatic reversal, around 10,000 ya.

4.5.1. Genetic structure of the contact zone: tension zone vs neutral diffusion

Clines through the hybrid zone are narrowly coincident, with cline centres being located 104-110 km northwest of locality 1. Coincidence of clines is expected after secondary contact, which seems to be the case for this contact zone. At the time of contact genetic introgression is initiated, but any co-adaptation of lineage specific alleles may enhance the effect of divergence between the lineages through epistatic interactions among loci. New recombinants generated from hybridisation between the lineages may be less fit; under these conditions epistasis and linkage can promote cline coincidence (Barton and Hewitt, 1989). Nevertheless, cline widths are not concordant, with the mtDNA cline being extremely narrow (2.7 km) when compared to the consistently wider nuclear clines (between 10 and 40 km) (Table 4.6.). Interestingly, when all nuclear loci are analysed in a multilocus cline approach, the width of the nuclear cline narrows, approximating the mitochondrial one. Frequencies used when generating the multilocus nuclear cline represent the proportion of membership of each sampling locality in each of the inferred clusters (lineages), estimated using information from all microsatellite loci. Therefore the cline calculated using this approach is a representation of what is happening in the nuclear genome as a whole, rather than at a single locus. Although it is known that upon contact different parts of the genome will introgress differently depending on the selection forces that act directly or indirectly on them, leading to differences in cline widths (Butlin and Hewitt, 1985; Hewitt, 1993), the single locus cline widths should be interpreted with caution. Collapsing allele frequencies to a two allele system, with alleles being classified as belonging to either Lepida or Nevadensis, might be a source of error, particularly for alleles whose frequencies do not differ markedly between lineages. Alleles that represent persistent ancestral variation due to incomplete lineage sorting may be represented equally in both lineages, making it difficult to collapse them into the two allele system.
These factors might have affected the subsequent single locus cline analysis, most likely widening the clines. Therefore, the multilocus approach is considered a better estimate of nuclear genome cline width. The fact that only two hybrids were found in the populations near the centre of the contact zone, and that mtDNA and nuclear clines are coincident, suggests that selection against hybrids is occurring in the zone. As is the case for many hybrid zones, the Lacerta lepida hybrid zone conforms to the “tension zone” model, where clines are maintained by a balance between selection and dispersal (Barton and Hewitt, 1985). Selection forces that are influencing cline shape are probably endogenous, as there is no clear evidence for clinal environmental variation through the zone. Hybrid fitness is therefore most probably determined by genome interactions, such as heterozygote disadvantage and epistasis, independent of the environment. Prezygotic mechanisms can also be responsible for the observed pattern of narrow clines. In fact, differences in the reproductive activity between both lineages have been identified. Nevadensis shows an extended reproductive period in concordance with the longer period of male sexual activity and has the ability to produce two clutches per year, while Lepida only produces one (Castilla and Bauwens, 1989; Mateo, 1988; Mateo and Castanet, 1994). Nevertheless, the presence of an F1 hybrid and a probable backcross in locality 5 and the occurrence of individuals with intermediate morphological characters between both lineages (Mateo and López-Jurado, 1994) suggest that hybridization between lineages is relatively frequent. The time since the Younger Dryas corresponds approximately to 3,300 generations for Lacerta lepida, where the generation time is estimated to be on average 3 years (Mateo, 1988; Mateo and Castanet, 1994).
To generate a cline width of 10 km assuming neutral diffusion, a dispersal rate of 100 m per generation would have to be invoked (using the equation T = 0.35 (w/d)² (Endler, 1977), where T represents time since contact in generations, d represents dispersal rate and w represents cline width). One hundred metres dispersal per generation (3 years) is a small distance for Lacerta lepida. Dispersal rates for Chioglossa lusitanica, a relatively small salamander, have been estimated to be 120 m per generation (Sequeira et al., 2005). This species’ dispersal might be restricted due to habitat requirements such as high dependence of juveniles on water streams, which is not the case for Lacerta lepida. Furthermore, Lacerta lepida territories were estimated to be on average 3,500 m² for females and 11,000 m² for males (Salvador et al., 2004). These large territories suggest that the species' dispersal rates (variance in parent/offspring distance) might be higher than 100 m.

This is supported by the overall multilocus FST values within each lineage, which are relatively small (0.02), suggesting high levels of gene flow. Taking into account these FST values, higher dispersal between the lineages would be expected, implying wide nuclear clines, which is not observed. Further sampling between localities 4 and 5 (25 km apart) will probably reveal much narrower clines, requiring even smaller dispersal rates for the nuclear genome to conform to neutral diffusion. Although our evidence is indirect, it seems likely that selection against hybrids is responsible for the observed cline widths.
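The neutral diffusion argument above can be checked with a few lines of arithmetic. The sketch takes Endler's (1977) relation in the form T = 0.35 (w/d)², which is the form that reproduces the figure of roughly 100 m per generation, and solves it for d. Function names are hypothetical; the input values (10 km width, about 3,300 generations, 3-year generation time, 10,000 years since the Younger Dryas) are those quoted in the text.

```python
from math import sqrt


def generations_since(years, generation_time_years):
    """Number of generations elapsed in a given number of years."""
    return years / generation_time_years


def neutral_dispersal(width_m, generations):
    """Dispersal per generation d (metres) that solves T = 0.35 * (w / d)**2."""
    return width_m * sqrt(0.35 / generations)


# About 10,000 years since the end of the Younger Dryas, 3 years per generation.
t_exact = generations_since(10_000, 3)

# Dispersal needed for a 10 km neutral cline after 3,300 generations.
d_10km = neutral_dispersal(10_000, 3_300)

# Same calculation for the mtDNA cline width of 2.7 km (Table 4.6).
d_mtdna = neutral_dispersal(2_700, 3_300)
```

For the 10 km nuclear cline the required dispersal is close to 100 m per generation, and the narrower the cline, the smaller the dispersal that neutral diffusion alone would require, which is the basis for arguing that selection against hybrids is involved.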

Interestingly, overall ΦST values within lineages inferred from mitochondrial DNA are higher than the ones revealed by nuclear markers. ΦST values from mitochondrial data suggest some level of genetic structuring within the Lepida lineage (Table 4.2.). Pairwise ΦST values between localities are generally one order of magnitude higher in Lepida than in Nevadensis, suggesting higher levels of female mediated gene flow in the latter. However, the haplotype frequencies within populations of Nevadensis suggest that the lower ΦST values may be a consequence of the very high frequency of the same haplotype in all populations (Fig. 4.3.). Interestingly, this haplotype is the one identified as the ancestral haplotype within this lineage (Fig. 4.2.). Thus the lower ΦST value observed within the Nevadensis lineage may be a consequence of incomplete lineage sorting giving a signature of low population differentiation. Indeed, some level of differentiation is indicated by the existence of a number of private haplotypes (although at low frequency) within each locality (Fig. 4.3.). The evidence for greater structuring in the maternally inherited mtDNA marker compared to the biparentally inherited nuclear markers in the Lepida lineage suggests low female dispersal relative to males, which has been increasingly reported in other reptilian studies (e.g. Lindell et al., 2005b; Lindell et al., 2008b; Stenson et al., 2002; Thorpe et al., 2008a; Ujvari et al., 2008). The absence of this signature among Nevadensis populations is consistent with either a higher level of female dispersal or mutation-drift non-equilibrium conditions amongst these populations.

4.5.2. The historical dynamics of lineage contact and introgression

It should be noted that there may have been episodes of introgression during earlier interglacial periods of contact between the lineages, especially if during those previous contacts selection against hybrids was weaker or if the contact was maintained for longer. The deeply divergent mtDNA lineages, corresponding to a divergence time of approximately 9 Mya, suggest that a substantial period of geological time has been available for climatically mediated allopatry and parapatry. Earlier contact and hybridisation may have allowed for the exchange of alleles between lineages; therefore some degree of similarity in allele frequencies between the lineages is expected, concomitant with the “evolutionary filter” role played by the contact zone. This is corroborated by the FST values, which are much higher in comparisons between lineages than within lineages. It is more likely that those similarities are the result of ancestral gene flow and do not indicate contemporary gene flow amongst lineages. As postulated by Hewitt (1988; 1996), species that have persisted in southern European refugia through the ice ages have most probably established long term hybrid zones through complex patterns of contraction and expansion. Lacerta lepida lineages have most probably established contact repeatedly during the Quaternary. Earlier contacts provided opportunities for the exchange of alleles between the lineages through older hybridization events. Although the low frequency shared alleles observed in this study might be a reflection of incomplete lineage sorting, they most likely represent past introgression from one lineage to the other at the time of earlier contacts.

4.5.3. Taxonomic and conservation implications

Despite some apparent evidence for the existence of gene flow as revealed by the morphological intermediacy found across the hybrid zone (Mateo and López-Jurado, 1994), the existence of clear significant morphological differences between the pure forms (Mateo and Castroviejo, 1990; Mateo and López-Jurado, 1994; Mateo et al., 1996), clinal variation in genotype frequencies and indirect indication of hybrid inferiority (as revealed by the very low numbers of hybrids detected) suggest that Lepida and Nevadensis are on independent evolutionary trajectories. Divergence between lineages seems to have passed the threshold whereby introgression is greatly reduced and therefore coalescence is unlikely. The two mitochondrial lineages should be considered as different evolutionary units and conservation efforts should be put in place to protect them. Lacerta lepida is widely distributed across Spain and Portugal and there are no specific conservation measures for its protection. The IUCN considers the existence of only one species within the group, which is generally classified as in significant decline mainly due to habitat loss. In the last Mediterranean Red List assessment Lacerta lepida was classified as Near Threatened (NT), a status that is more alarming if the existence of two species within it is considered. More worrying is the case of the “Nevadensis” lineage (Lacerta lepida nevadensis), which has a very restricted distribution area, associated with zones of high tourist pressure, and where current changes in land use (e.g. the increasing density of greenhouses in the province of Almeria) are most likely to be detrimental for the species' survival, and the threats posed by habitat loss might be more alarming.

Table 4.1. Primer sequences, annealing temperature (TA), number of alleles (NA) and allele size range for each locus in Lacerta lepida. Further information regarding each locus can be found in the papers where they were first characterized (Source).

Locus   Source                      Primers                             TA (°C)  NA   Allele size range
LIZ24   Paulo (not published)       F: FAM - TCAGTCCAAATATCTCTACAGG     50       8    115-139
                                    R: AGATGAGCAGCATATAGTGATG
B4      Nembrini & Oppliger (2003)  F: HEX - AATCTGCAATTCTGGGATGC       61       23   122-166
                                    R: AGAAGCAGGGGATGCTACAG
C9      Nembrini & Oppliger (2003)  F: FAM - CATTGCTGGTTCTGGAGAAAG      58       14   130-169
                                    R: CCTGATGAAGGGAAGTGGTG
Pb73    Pinho et al (2004)          F: FAM - GCCCATGTCACTTCAGGTAGAAGC   58       17   120-152
                                    R: GAAAACTAGGAGTTAGGGAGAAGG
Pb66    Pinho et al (2004)          F: NED - GGACAGCTAGTCCCATGGCTTAC    58       21   148-192
                                    R: GGATTGCTGTCACCAGTCTCCCC
D1      Nembrini & Oppliger (2003)  F: NED - GAGTGCCCAAGACAGTTGTAT      58       22   134-209
                                    R: GAGGTCTTGAATCTCCAGGTG
LVIR17  Bohme et al. (2005)         F: NED - AGCTCTGGATCGAGACAACCTGG    61       23   221-265
                                    R: TCTCTGAAGGAGACCGGCTCC
LV4-72  Boudjemadi et al. (1999)    F: HEX - CCCTACTTGAGTTGCCGTC        63       ...  ...
                                    R: CTTTGCAGGTAACAGAGTAG

Table 4.2. Number of samples (N), number of haplotypes (H) and nucleotide diversity (π) for each Lacerta lepida sampled locality using information from mitochondrial DNA cytb gene sequences. ΦST with respective p values for each mitochondrial lineage is also shown.

             N    H    π        ΦST      p
Nevadensis   79   24   ...      0.0141   0.1603
  Loc 1      19   7    0.0038
  Loc 2      22   8    0.0024
  Loc 3      18   9    0.0057
  Loc 4      20   8    0.0052
Lepida       99   34   ...      0.1145   0.0000
  Loc 5      21   7    0.0024
  Loc 6      23   8    0.0028
  Loc 7      18   8    0.0039
  Loc 8      21   11   0.0029
  Loc 9      16   10   0.0040

Table 4.3. Pairwise FST values between nine Lacerta lepida localities (Loc.) for 627 bp of mtDNA cytb gene (below diagonal) and for 7 microsatellite loci (above diagonal). Statistically significant pairwise FST values (p<0.05) are denoted with grey shading.

Loc.  1      2      3      4      5      6      7      8      9
1     ...    0.019  0.016  0.039  0.089  0.115  0.136  0.133  0.095
2     0.021  ...    0.008  0.036  0.091  0.126  0.146  0.140  0.111
3     0.037  0.011  ...    0.014  0.086  0.107  0.126  0.117  0.092
4     0.013  0.025  0.007  ...    0.091  0.117  0.144  0.130  0.105
5     0.974  0.979  0.966  0.967  ...    0.022  0.041  0.032  0.016
6     0.972  0.977  0.965  0.966  0.087  ...    0.024  0.023  0.025
7     0.967  0.974  0.959  0.961  0.227  0.093  ...    0.011  0.027
8     0.971  0.977  0.964  0.965  0.125  0.012  0.135  ...    0.022
9     0.967  0.974  0.958  0.960  0.166  0.107  0.132  0.059  ...

Table 4.4. Results of tests for linkage disequilibrium for each pair of 7 microsatellite loci from Lacerta lepida, in each sampled locality. Only significant non-random associations between pairs of loci are shown.

Locality  Locus 1  Locus 2  P
1         PB66     D1       0.029
2         LVIR17   PB73     0.030
4         B4       LVIR17   0.036
4         B4       C9       0.000
4         LVIR17   C9       0.000
4         LVIR17   PB73     0.044
4         PB73     C9       0.034
4         B4       PB66     0.000
4         LVIR17   PB66     0.006
4         C9       PB66     0.004
6         LVIR17   PB73     0.014
6         C9       PB66     0.029
8         C9       PB73     0.042
8         B4       CYTB     0.002
9         LIZ24    C9       0.045

Table 4.5. Measures of genetic diversity at 7 microsatellite loci in Lacerta lepida: expected (HE) and observed (HO) heterozygotes, FIS values and null allele frequency (Null) for each locality (Loc.)/locus combination. Shaded values are statistically significant (p<0.05) and denote significant heterozygote deficits. The presence of null alleles detected by MICROCHECKER is denoted with bold font.

      Locus B4                  Locus LVIR17              Locus LIZ24               Locus C9
Loc.  HE (HO)     FIS    Null   HE (HO)     FIS    Null   HE (HO)     FIS    Null   HE (HO)     FIS    Null
1     14.3 (13)   0.10   0.00   15.0 (12)   0.20   0.07   0 (0)       -      -      17.1 (16)   0.06   0.07
2     15.4 (14)   0.09   0.00   15.8 (17)   -0.08  0.00   0 (0)       -      -      19.0 (19)   0.00   0.08
3     16.14 (13)  0.20   0.07   14.5 (11)   0.25   0.10   1 (1)       -      0.00   17.1 (15)   0.12   0.04
4     16.1 (13)   0.20   0.07   17.1 (15)   0.12   0.06   0 (0)       -      -      16.4 (17)   -0.04  0.00
5     21.1 (22)   -0.04  0.00   20.8 (18)   0.14   0.05   3.8 (4)     -0.05  0.00   17.4 (15)   0.14   0.06
6     20.6 (17)   0.18   0.07   29.8 (16)   0.20   0.09   4.6 (4)     0.14   0.14   16.2 (12)   0.26   0.11
7     15.2 (12)   0.21   0.06   18.7 (19)   -0.01  0.00   9.4 (8)     0.15   0.11   14.1 (12)   0.15   0.05
8     25.2 (26)   -0.03  0.01   27.2 (21)   0.25   0.12   7.4 (8)     -0.08  0.00   12.9 (13)   -0.01  0.00
9     16.3 (14)   0.15   0.07   17.0 (11)   0.36   0.16   5.5 (4)     0.28   0.00   12.8 (9)    0.30   0.11

Table 4.5. Continuation

      Locus PB73                Locus PB66                Locus D1
Loc.  HE (HO)     FIS    Null   HE (HO)     FIS    Null   HE (HO)     FIS    Null
1     16.9 (16)   0.06   0.04   17.2 (17)   0.01   0.00   16.1 (11)   0.32   0.12
2     18.3 (19)   -0.04  0.00   18.2 (17)   0.07   0.00   18.2 (15)   0.18   0.06
3     15.8 (15)   0.05   0.00   16.2 (18)   -0.12  0.00   16.4 (11)   0.34   0.14
4     16.5 (16)   0.03   0.01   17.8 (17)   0.05   0.03   16.1 (10)   0.39   0.16
5     19.8 (21)   -0.06  0.00   20.3 (22)   -0.09  0.00   20.4 (20)   0.02   0.00
6     17.3 (12)   0.31   0.13   20.9 (20)   0.04   0.00   19.7 (20)   -0.02  0.00
7     18.9 (18)   0.05   0.00   18.8 (19)   -0.01  0.00   19.5 (19)   0.03   0.00
8     24.5 (23)   0.06   0.01   26.8 (24)   0.11   0.04   27.2 (24)   0.12   0.05
9     16.1 (13)   0.20   0.10   16.7 (16)   0.04   0.00   17.9 (17)   0.05   0.04

Table 4.6. Maximum likelihood estimates of cline centres (c) and widths (w) for 3 nuclear loci and cytochrome b (mtDNA), estimated independently, and multilocus cline parameters estimated using all 7 microsatellite loci (nDNA).

Locus    c (km)   w (km)   Log
C9       106.75   32.94    -6.28
PB73     110.18   40.38    -2.59
LVIR17   103.77   22.27    -5.50
mtDNA    106.87   2.70     0.00
nDNA     109.72   10.73    -0.47

[Fig. 4.1, panels a) and b): maps with lineage labels (Lepida, Nevadensis), sampling localities 1 to 9, chapter 3 sampling site 33 and scale bars in kilometres; mtDNA and nDNA pie charts are shown for each locality.]

Locality  1   2   3   4   5   6   7   8   9
n         19  21  19  20  23  23  23  30  19

Fig. 4.1. Distribution of Lacerta lepida mitochondrial lineages as in chapter 3 (a) and the study area (b). Shaded areas denote altitude gradients, with darker areas representing higher altitudes. In b) numbers represent sampling localities along the transect (dashed line) and samples are represented by red dots. The putative zone of secondary contact between both mitochondrial lineages is indicated by question marks (?). Yellow numbered dot represents sampling site of chapter 3 where mitochondrial haplotypes from both phylogroups were found. Pie charts represent: mtDNA - the proportion of mtDNA haplotypes at each site derived from Lepida (red) and Nevadensis (blue) mtDNA lineages; nDNA - the proportion of each site assigned to Lepida (red) and Nevadensis (blue) estimated with the software STRUCTURE using 7 microsatellite loci. The number of samples (n) at each sampling site is also indicated.


[Figure 4.2: haplotype network. Labels: Lepida, 67 mutations, Nevadensis.]

Fig. 4.2. Statistical Parsimony network of cytochrome b haplotypes. Black circles represent unsampled or extinct haplotypes. White circles represent haplotypes already sampled in the previous chapters while coloured circles represent new haplotypes. Size of circles does not correspond to frequency.


[Figure 4.3: bar charts of haplotype frequency (F, 0 to 1), one panel per population, Pop 1 to Pop 9.]

Fig. 4.3. Frequency of mitochondrial DNA haplotypes (cytb gene) in each Lacerta lepida sampled population. Each bar represents one haplotype. Haplotypes from Nevadensis are represented in blue and from Lepida in red.


[Figure 4.4: bar charts of allele frequency, one panel per locus (B4, LVIR17, LIZ24, C9, PB73, PB66, D1); y-axis: Frequency. Letters N and L mark the most common allele in each lineage.]

Fig. 4.4. Allele frequencies per locus over all Lacerta lepida sampled localities. Alleles with overall frequency less than 1% are not represented, apart from locus LIZ24, with all alleles included. Each bar represents one allele. The most common allele in each mitochondrial lineage is represented by a letter above the bar (N, for Nevadensis lineage and L for Lepida lineage).


[Figure 4.5: bar charts of allele frequency, one panel per locus (B4, LVIR17, LIZ24, C9, PB73, PB66, D1); y-axis: Frequency (0 to 1); x-axis: Sampling locality (1 to 9).]

Fig. 4.5. Allele frequencies per locus for each Lacerta lepida sampled locality. Alleles with frequency less than 1% are not represented, apart from locus LIZ24, with all alleles included. Each bar represents one allele. Colours are the same as in Fig. 4.3.


[Figure 4.6: bar charts of allele frequency, one panel per locus (B4, LVIR17, LIZ24, C9, PB73, PB66, D1); x-axis: alleles numbered in order of size.]

Fig. 4.6. Allele frequencies per locus for each Lacerta lepida mtDNA lineage. Red bars represent Lepida (populations 5 to 9) and blue bars represent Nevadensis (populations 1 to 4). Alleles are ordered according to allele size.


[Figure 4.7: two scatter plots, panels a) and b); r2 values shown on the plots: 0.0012 and 0.27.]

Fig. 4.7. Isolation by distance analysis showing association between genetic and geographic distance in two Lacerta lepida mtDNA lineages. Genetic distances represent pairwise FST values (linearised as FST/(1 - FST)) calculated using data from 7 microsatellite loci. Geographic distances represent distances in km between sampled localities within each mitochondrial lineage. a) represents Lepida mtDNA lineage and b) represents Nevadensis mtDNA lineage.

[Figure 4.8: line plot; y-axis from 0 to 35; x-axis: K, from 2 to 9.]

Fig. 4.8. Magnitude of ΔK, as defined by Evanno et al. (2005), as a function of K.
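The statistic of Evanno et al. (2005) plotted in this figure can be illustrated with a minimal Python sketch. It assumes the commonly used form in which the absolute second difference of the mean log-likelihood L(K) across replicate STRUCTURE runs is divided by the standard deviation of L(K). The function name and the log-likelihood values are hypothetical and are not taken from the thesis.

```python
from statistics import mean, stdev

def delta_k(log_likelihoods):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    log_likelihoods maps each K to a list of L(K) values from replicate runs.
    Assumed form: |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd(L(K)).
    """
    means = {k: mean(v) for k, v in log_likelihoods.items()}
    result = {}
    for k in sorted(log_likelihoods):
        if k - 1 in means and k + 1 in means:
            second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_difference / stdev(log_likelihoods[k])
    return result

# Hypothetical replicate log-likelihoods, for illustration only.
example_runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -904.0],
    3: [-895.0, -897.0],
    4: [-893.0, -899.0],
}
example_delta_k = delta_k(example_runs)
best_k = max(example_delta_k, key=example_delta_k.get)
print(example_delta_k, best_k)
```

In this sketch the K with the largest value would be read as the most likely number of clusters; the first and last K tested have no value because they lack one neighbour.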


[Figure 4.9: stacked bar plot; y-axis: Proportion of ancestry (0.00 to 1.00); x-axis: Sampled individuals (columns) and sampled localities (numbers 1 to 9).]

Fig. 4.9. Proportion of ancestry of each sampled individual of Lacerta lepida (columns) as inferred with STRUCTURE for 7 microsatellite loci, assuming the admixture model. Dark grey represents Nevadensis ancestry whereas light grey represents Lepida. Individuals are sorted by sampled localities.


[Figure 4.10: two panels, a) and b), with fitted curves for mtDNA, nDNA, LVIR17, C9 and PB73. y-axis: P, proportion of membership to Lepida (0.0 to 1.0); x-axis in both panels: Distance from Locality 1 (km), with ticks at 0, 47, 83, 93, 118, 150 and 226.]

Fig. 4.10. Best fitted Tanh curves showing the clinal transition of mitochondrial and nuclear markers through the contact zone of two mitochondrial DNA lineages of Lacerta lepida. a) Changes in proportion of membership (P) to Lepida mitochondrial lineage along the transect based on 7 microsatellite loci (black stars) and changes in frequency of Lepida mitochondrial haplotypes (red stars in both graphs). b) Changes in northern allele frequencies along the transect, for PB73 (black stars), C9 (triangles) and LVIR17 (circles).
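To make the shape of such Tanh curves concrete, the following Python sketch evaluates the standard sigmoid cline, p(x) = 0.5 [1 + tanh(2 (x - c) / w)], where the width w is the inverse of the maximum slope. The centre and width values are the mtDNA and nDNA estimates from Table 4.6. The exact parameterisation used by the fitting software is an assumption here, and the function name is hypothetical.

```python
import math

def cline_frequency(distance_km, centre_km, width_km):
    """Expected frequency at a transect position under a tanh cline.

    Assumed parameterisation: width = 1 / maximum slope, so the slope
    at the centre equals 1 / width_km.
    """
    return 0.5 * (1.0 + math.tanh(2.0 * (distance_km - centre_km) / width_km))

# Centre (c) and width (w) in km, as listed in Table 4.6.
CLINES = {"mtDNA": (106.87, 2.70), "nDNA": (109.72, 10.73)}

# Illustrative transect positions (km from Locality 1).
for distance in (83, 93, 107, 118, 150):
    row = {name: round(cline_frequency(distance, c, w), 3)
           for name, (c, w) in CLINES.items()}
    print(distance, row)
```

Under this form, a narrow cline such as the mtDNA one switches from near 0 to near 1 within a few kilometres of its centre, whereas a wider cline changes more gradually.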


4.6. References

Alphen JJMV, Seehausen O (2001) Sexual selection, reproductive isolation and the genic view of speciation. Journal of Evolutionary Biology 14, 874-875.

Avise JC (2000) Phylogeography Harvard University Press, Cambridge, MA.

Avise JC, Arnold J, Ball RM, Bermingham E, Lamb T, Neigel JE, Reeb CA, Saunders NC (1987) Intraspecific phylogeography: the mitochondrial DNA bridge between population genetics and systematics. Annual Review of Ecology and Systematics 18 , 489-522.

Bandelt HJ, Forster P, Rohl A (1999) Median-joining networks for inferring intraspecific phylogenies. Molecular Biology and Evolution 16 , 37-48.

Barton NH (1983) Multilocus Clines. Evolution 37 , 454-471.

Barton NH, Baird SJE (1995) Analyse: an application for analysing hybrid zones. Available at www.helios.bto.ed.ac.uk/research/institutes/evolution/software/Mac/Analyse/index.html, Edinburgh.

Barton NH, Gale KS (1993) Genetic analysis of hybrid zones. In: Hybrid zones and the evolutionary process (ed. Harrison RG), pp. 13-45. Oxford University Press, Oxford.

Barton NH, Hewitt GM (1983) Hybrid zones as barriers to gene flow. In: Protein polymorphism: adaptive and taxonomic significance (eds. Oxford GS, Rollinson D), pp. 341-359. Blackwell, Oxford, UK.

Barton NH, Hewitt GM (1985) Analysis of hybrid zones. Annual Review of Ecology and Systematics 16 , 113-148.

Barton NH, Hewitt GM (1989) Adaptation, speciation and hybrid zones. Nature 341 , 497-503.

Barton NH, Shpak M (2000) The effects of epistasis on the structure of hybrid zones. Genetical Research 75 , 179-198.

Bateson W (1909) Mendel's Principles of Heredity Cambridge University Press, Cambridge, Massachusetts.

Böhme MU, Berendonk TU, Schlegel M (2005) Isolation of new microsatellite loci from the Green Lizard ( Lacerta viridis viridis ). Molecular Ecology Notes 5, 45-47.


Bohonak AJ (2002) IBD (Isolation by Distance): A Program for Analyses of Isolation by Distance. J Hered 93 , 153-154.

Boudjemadi K, Martin O, Simon J-C, Estoup A (1999) Development and cross-species comparison of microsatellite markers in two lizard species, Lacerta vivipara and Podarcis muralis. Molecular Ecology 8, 513-525.

Bridle JR, Ritchie MG (2001) Assortative mating and the genic view of speciation. Journal of Evolutionary Biology 14 , 878-879.

Butlin RK, Hewitt GM (1985) A hybrid zone between Chorthippus parallelus parallelus and Chorthippus parallelus erythropus (Orthoptera: Acrididae): morphological and electrophoretic characters. Biological Journal of the Linnean Society 26 , 269-285.

Carrión J, Munuera M, Navarro C (1998) The palaeoenvironment of Carihuela Cave (Granada, Spain): a reconstruction on the basis of palynological investigations of cave sediments. Review of Palaeobotany and Palynology 99 , 317-340.

Carrión JS, Dupre M (1996) Late Quaternary vegetational history at Navarres, Eastern Spain. A two core approach. New Phytologist 134 , 177-191.

Castilla AM, Bauwens D (1989) Reproductive characteristics of the lacertid lizard Lacerta lepida . Amphibia-Reptilia 10 , 445-452.

Clement M, Posada D, Crandall KA (2000) TCS: a computer program to estimate gene genealogies. Molecular Ecology 9, 1657-1659.

Coyne JA, Orr HA (2004) Speciation Sinauer, Sunderland, Massachusetts.

Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B 39, 1-38.

Dobzhansky T (1937) Genetics and the Origin of Species Columbia University Press, New York.

Endler JA (1977) Geographic variation, speciation, and clines Princeton University Press, Princeton.

Evanno G, Regnaut S, Goudet J (2005) Detecting the number of clusters of individuals using the software structure: a simulation study. Molecular Ecology 14 , 2611-2620.

Excoffier L, Laval G, Schneider S (2005) Arlequin ver. 3.0: An integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online 47-50.


Excoffier L, Smouse PE, Quattro JM (1992) Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to Human mitochondrial DNA restriction data. Genetics 131 , 479-491.

Gavrilets S (1997) Hybrid zones with Dobzhansky-type epistatic selection. Evolution 51 , 1027-1035.

Gomez A, Lunt DH (2007) Refugia within refugia: patterns of phylogeographic concordance in the Iberian Peninsula. In: Phylogeography of Southern European Refugia (eds. Weiss S, Ferrand N). Springer, Dordrecht.

Hall TA (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows95/98/NT. Nucleic Acids Symposium Series 41 , 95–98.

Harrison RG (1990) Hybrid zones: windows on the evolutionary process. Oxford Surveys in Evolutionary Biology 7, 69-128.

Harrison RG (1993) Hybrid zones and the Evolutionary Process Oxford University Press, New York.

Hewitt GM (1988) Hybrid zones - natural laboratories for evolutionary studies. Trends in Ecology & Evolution 3, 158-167.

Hewitt GM (1993) After the Ice: parallelus meets Erythropus in the Pyrenees. In: Hybrid zones and the evolutionary process (ed. Harrison RG). Oxford University Press, New York.

Hewitt GM (1996) Some genetic consequences of ice ages, and their role, in divergence and speciation. Biological Journal of the Linnean Society 58 , 247-276.

Li G, Hubert S, Bucklin K, Ribes V, Hedgecock D (2003) Characterization of 79 microsatellite DNA markers in the Pacific oyster Crassostrea gigas . Molecular Ecology Notes 3, 228-232.

Lindell J, Mendez-de la Cruz FR, Murphy RW (2005a) Deep genealogical history without population differentiation: Discordance between mtDNA and allozyme divergence in the zebra-tailed lizard ( Callisaurus draconoides ). Molecular Phylogenetics and Evolution 36 , 682-694.

Lindell J, Mendez-De La Cruz FR, Murphy RW (2008a) Deep biogeographical history and cytonuclear discordance in the black-tailed brush lizard (Urosaurus nigricaudus ) of Baja California. Biological Journal of the Linnean Society 94 , 89-104.

Lindell J, Méndez-de la Cruz FR, Murphy RW (2005b) Deep genealogical history without population differentiation: Discordance between mtDNA and allozyme divergence in the zebra-tailed lizard ( Callisaurus draconoides ). Molecular Phylogenetics and Evolution 36 , 682-694.


Lindell J, Méndez-de la Cruz FR, Murphy RW (2008b) Deep biogeographical history and cytonuclear discordance in the black-tailed brush lizard (Urosaurus nigricaudus ) of Baja California. Biological Journal of the Linnean Society 94 , 89-104.

Mateo JA (1988) Estudio sistematico y zoogeografico de los Lagartos Ocelados, Lacerta lepida Daudin, 1802, y Lacerta pater (Lataste, 1880), (Sauria: Lacertidae) , Universidad de Sevilla.

Mateo JA, Castanet J (1994) Reproductive strategies in three Spanish populations of the ocellated lizard, Lacerta lepida (Sauria, Lacertidae). Acta oecologica 15 , 215-229.

Mateo JA, Castroviejo J (1990) Variation morphologique et revision taxonomique de l’espece Lacerta lepida Daudin, 1802 (Sauria, Lacertidae). Bulletin du Museé de Histoire Naturele de Paris 12 , 691–706.

Mateo JA, López-Jurado LF (1994) Variaciones en el color de los lagartos ocelados; aproximacion a la distribuicion de Lacerta lepida nevadensis Buchholz 1963. Revista Espanola de Herpetologia 8, 29-35.

Mateo JA, López-Jurado LF, Guillaume CP (1996) Variabilité électrophorétique et morphologique des lézards ocellés (Lacertidae): un complexe d’espèces de part et d’autre du détroit de Gibraltar. Comptes Rendus de L’Academie des Sciences Serie iii-Sciences de la Vie-Life Sciences 319 , 737–746.

Mayr E (1963) Animal species and evolution. Harvard University Press, Cambridge, MA.

Mayr E (2001) Wu's genic view of speciation. Journal of Evolutionary Biology 14 , 866-867.

Muller HJ (1942) Isolating mechanisms, evolution and temperature. Biology Symposium 6, 71-125.

Nembrini M, Oppliger A (2003) Characterization of microsatellite loci in the wall lizard Podarcis muralis (Sauria: Lacertidae). Molecular Ecology Notes 3, 123-124.

Paulo OS, Pinheiro J, Miraldo A, Bruford MW, Jordan WC, Nichols RA (2008) The role of vicariance vs. dispersal in shaping genetic patterns in ocellated lizard species in the western Mediterranean. Molecular Ecology 17 , 1535-1551.

Pinho C, Sequeira F, Godinho R, Harris DJ, Ferrand N (2004) Isolation and characterization of nine microsatellite loci in Podarcis bocagei (Squamata: Lacertidae). Molecular Ecology Notes 4, 286-288.

Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155 , 945-959.


Rassmann K, Tautz D, Trillmich F, Gliddon C (1997) The microevolution of the Galapagos marine iguana Amblyrhynchus cristatus assessed by nuclear and mitochondrial genetic analyses. Molecular Ecology 6, 437-452.

Raymond M, Rousset F (1995) An Exact Test for Population Differentiation. Evolution 49 , 1280-1283.

Rieseberg LH, Burke JM (2001) A genic view of species integration. Journal of Evolutionary Biology 14 , 883-886.

Rousset F (2008) Genepop'007: a complete re-implementation of the genepop software for Windows and Linux. Molecular Ecology Resources 8, 103-106.

Salvador A, Veiga JP, Esteban M (2004) Preliminary data on reproductive ecology of Lacerta lepida at a mountain site in central Spain. Herpetological Journal 14, 47-49.

Sequeira F, Alexandrino J, Rocha S, Arntzen JW, Ferrand N (2005) Genetic exchange across a hybrid zone within the Iberian endemic golden-striped salamander, Chioglossa lusitanica . Molecular Ecology 14 , 245-254.

Stenson AG, Malhotra A, Thorpe RS (2002) Population differentiation and nuclear gene flow in the Dominican anole ( Anolis oculatus ). Molecular Ecology 11 , 1679-1688.

Szymura JM, Barton NH (1986) Genetic analysis of a hybrid zone between the fire-bellied toads, Bombina bombina and B. variegata, near Cracow in Southern Poland. Evolution 40, 1141-1159.

Templeton AR, Crandall KA, Sing CF (1992) A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III. Cladogram estimation. Genetics 132 , 619-633.

Thorpe RS, Surget-Groba Y, Johansson H (2008a) The relative importance of ecology and geographic isolation for speciation in anoles. Philosophical Transactions of the Royal Society B-Biological Sciences 363 , 3071-3081.

Thorpe RS, Surget-Groba Y, Johansson H (2008b) The relative importance of ecology and geographic isolation for speciation in anoles. Philosophical Transactions of the Royal Society B: Biological Sciences 363 , 3071-3081.

Ujvari B, Dowton M, Madsen T (2008) Population genetic structure, gene flow and sex-biased dispersal in frillneck lizards ( Chlamydosaurus kingii ). Molecular Ecology 17 , 3557-3564.

Van Oosterhout C, Hutchinson WF, Wills DPM, Shipley P (2004) Micro-checker: software for identifying and correcting genotyping errors in microsatellite data. Molecular Ecology Notes 4, 535-538.


Vogler AP (2001) The genic view: a useful model of the process of speciation? Journal of Evolutionary Biology 14 , 876-877.

Wu CI (2001) The genic view of the process of speciation. Journal of Evolutionary Biology 14 , 851-865.


Appendix


Chapter 5

Testing for the presence of heteroplasmy in Lacerta lepida through single molecule PCR

Photo by Andreia Miraldo Photo taken to mark the capture of lizard number 44*

* Sampling site E in chapter 2

5. Testing for the presence of heteroplasmy in Lacerta lepida through single molecule PCR

5.1. Abstract

In the last decade general assumptions regarding the inheritance of mitochondrial DNA in animals have been questioned mainly as a result of accumulating evidence for the existence of bi-parental inheritance and recombination of mitochondrial DNA across several taxa. In this chapter, polymorphic mitochondrial DNA sequences detected in several individuals from a zone of contact between two Lacerta lepida mitochondrial lineages (chapter 2) are re-analysed using a single molecule PCR approach to test if heteroplasmy and mitochondrial DNA recombination are features of this contact zone. Results indicate that low levels of heteroplasmy occur in some individuals. Strong evidence for mitochondrial DNA recombination was also detected. The origins of heteroplasmy and mitochondrial DNA recombinant haplotypes within Lacerta lepida are discussed in detail.

Key words: heteroplasmy, mitochondrial DNA, recombination, smPCR, contact zone


5.2. Introduction

Mitochondrial DNA has been the most widely used molecular marker for phylogeographic inference in animals (Avise, 2004). It has become the tool of choice in phylogeographic studies because of several properties not found in nuclear genomes. It has a maternal, non-recombining mode of inheritance that enables evolutionary histories to be reconstructed without the complexities introduced by biparental recombination, and it has a high mutation rate that generates enough signal to make inferences about population history over short time frames. Nevertheless, some of these assumptions have been questioned in the last few years. For instance, although the standard paradigm postulates that mtDNA is strictly maternally inherited, it has become increasingly apparent that more than one mtDNA type can be associated with an individual or cell, a condition known as heteroplasmy. Recent findings have shown that in organisms that normally transmit mtDNA through the female line only, heteroplasmy can be produced by occasional paternal leakage that is usually associated with interspecific crosses (Arunkumar et al., 2006; Ciborowski et al., 2007; Fontaine et al., 2007; Sherengul et al., 2006). Although intraspecific paternal leakage is thought to be less probable, as the recognition mechanisms of paternal mitochondria are more efficient when genetic divergence between individuals is small (Kaneda et al., 1995; Shitara et al., 1998; Sutovsky et al., 2000), recent studies have reported the occurrence of paternal leakage within species (Gantenbein et al., 2005; Sherengul et al., 2006; Ujvari et al., 2007). The discovery of heteroplasmy through paternal leakage in a wide range of taxa raises questions about the other important property of mtDNA, the absence of recombination in this molecule. Recombination in mtDNA is thought to be absent in animals, mainly because of a failure to observe clear cases of recombinant haplotypes in natural populations. 
However, whether mtDNA recombination occurs is a different issue from whether it produces new haplotypes. In fact, the lack of recombination in animal mtDNA is becoming very controversial as evidence accumulates suggesting that both intra and intermolecular recombination may occur (for a review of inheritance and recombination of mitochondrial genomes in other systems see Barr et al., 2005). There is evidence that human mitochondria have the required enzymatic machinery for homologous recombination to occur (Thyagarajan et al., 1996; Yaffe, 1999). Moreover, mitochondria are extremely dynamic organelles that are constantly fusing and dividing, and it has been shown that after fusion the matrix contents are mixed, providing the possibility for homologous recombination to occur (see Detmer and Chan, 2007 for a recent review on the subject). Furthermore, intramolecular mtDNA recombination has been experimentally demonstrated in the nematode Meloidogyne javanica (Lunt and Hyman, 1997), providing evidence that animal mtDNA can self-recombine. Further indirect evidence for intramolecular recombination comes from the detection of mitochondrial rearrangements (“sublimons”) found at very low levels in healthy human tissues that are suggested to be the result of homologous recombination (Holt et al., 1997; Kajander et al., 2000; Tang et al., 2000). Moreover, intermolecular recombination occurs frequently in mussels (Burzynski et al., 2006; Ladoukakis and Zouros, 2001), which are known to have a unique mitochondrial inheritance system (“doubly uniparental inheritance”), and incidentally in humans (Kraytsberg et al., 2004b). More recently, strong evidence for the occurrence of intermolecular recombination has been found in fish (Ciborowski et al., 2007) and reptiles (Ujvari et al., 2007). 
The conclusion that animal mitochondrial DNA does not recombine based on the absence of recombinant haplotypes in natural populations does not consider the probability of a mtDNA recombination event producing a detectable recombinant haplotype. This probability is likely to be very small, mainly as a consequence of the typically strict maternal inheritance of mtDNA. If a recombination event occurs, it will most likely occur in homoplasmic cells, making the detection of recombinants very difficult, if not impossible, unless it results in size heteroplasmy (reviewed in Rokas et al., 2003). Interestingly, almost all reported cases of mitochondrial recombination were detected because recombination occurred between divergent mtDNA co-occurring in the same cell. An exception is the recombination events registered in the nematode Meloidogyne javanica, where intramolecular mtDNA recombination was detected as it resulted in size heteroplasmy and variability in sequence organization (Lunt and Hyman, 1997). The detection of mtDNA recombination seems to require an initial state of heteroplasmy that can only be achieved either by paternal leakage or by mutations in the mitochondrial genome of germ-line cells. Given the increasing evidence for paternal leakage in both inter and intraspecific crosses (see above), it would seem that hybrid zones may be likely areas to detect mtDNA recombination events. This idea is further supported by two recent studies which report evidence for recombination of mtDNA in contact zones: the study of Jaramillo-Correa and Bousquet (2005) reports recombination of mtDNA in a zone of contact between two hybridizing conifers while Ujvari and collaborators (2007) report evidence of mitochondrial recombination in a hybrid zone between two mitochondrial lineages of the Australian frillneck lizard (Chlamydosaurus kingii). 
The phylogeographic study of a zone of secondary contact between two divergent mitochondrial lineages (L3 and L5) of Lacerta lepida revealed the existence of clear polymorphic trace files when sequencing the cytochrome b gene for single individuals (chapter 2). Although it was shown that the mixed signal was most probably generated by the presence of Numts, the existence of low levels of heteroplasmy within the species could not be discarded completely. Furthermore, the quantification of intra-individual variation in chapter 2 was achieved through a PCR-cloning procedure, which is known to have some inherent and significant disadvantages. The disadvantages are mainly associated with the amplification step, where PCR-derived mutations, template jumping and allelic preference are known to occur (Lin et al., 2002; Paabo et al., 1990). These disadvantages become especially problematic when PCR-cloning procedures are used to describe mutations that distinguish different gene copies. While PCR-induced errors (in vitro errors) will not be detectable upon direct sequencing, as at the most they will affect 25% of all molecules synthesized, upon cloning in vitro polymerase errors will become indistinguishable from in vivo mutations since each of the errors will affect all the molecules (100%) of a clone, just as a genuine in vivo mutation does. Results from chapter 2 show that 35% of the sequenced clones represent probable recombinants between the two divergent mitochondrial lineages. Such recombinant molecules could have originated either due to rearrangements from mixed templates, via jumping PCR, or through intramolecular recombination. Although the data suggest that recombinants could in fact have originated in vitro, the question that still remains is whether the recombinant molecules originated through recombination between two divergent mitochondrial molecules (true heteroplasmy) or between a homoplasmic mitochondrial genome and Numts. In this chapter these issues will be further analyzed using a single molecule PCR (smPCR) approach. SmPCR has been used in several types of study, being most commonly employed for sequencing and genotyping purposes (e.g. Konfortov et al., 2007; Krause et al., 2006; Kraytsberg and Khrapko, 2005; Lukyanov et al., 1996). A single molecule PCR is essentially a normal PCR in which the template DNA is diluted to a very low concentration. If Numts are the only source of the heteroplasmic signal, and are indeed the origin of the recombinant molecules found in chapter 3, performing a smPCR by limiting DNA dilution to one amplifiable mitochondrial genome should only result in the amplification of mitochondrial fragments. Therefore, using this approach it should be possible to identify whether true heteroplasmy and recombination occur in Lacerta lepida.

5.3. Material and methods

5.3.1. Sample selection and DNA extraction

SmPCR was performed using DNA of four individuals from a population at the centre of the contact zone between lineages L3 and L5, where haplotypes from both lineages were detected (individuals C3, C4, C8 and C9 from population C, chapter 2). Previous amplification of a 627 bp fragment of the cytb gene in these samples revealed the existence of clear polymorphic trace files, and cloning of the amplified fragments resulted in the detection of several recombinant molecules (chapter 3). DNA extraction, amplification of the cytb fragment and cloning procedures are explained in detail in chapter 3.

5.3.2. Estimation of the number of template copies


DNA concentration was measured using a NanoDrop® ND-1000 spectrophotometer and the DNA was diluted to a concentration of approximately 10³ mitochondria/µl. Assuming that 1 million base pairs weighs 1 pg (1x10⁻³ ng), 1 mitochondrial genome of approximately 17,000 base pairs weighs 1.7x10⁻³ ng. Fifty microlitre aliquots (single use aliquots) at a concentration of 10³ mitochondria/µl (hereafter described as stock DNA) were stored at -20 °C. Stock DNA was serially diluted and dispensed in a 96-well microtitre plate in order to obtain 5 different DNA concentrations (16 wells per concentration and 16 negative controls). In order to assess which DNA concentration conforms to expectations from smPCR, DNA was subjected to the hemi-nested smPCR protocol (see below) and the proportion of positive wells for each DNA concentration was estimated. According to a Poisson distribution, 36.8% of the amplifications from DNA at a concentration of a single molecule are not expected to contain a molecule of the desired template, another 36.8% are expected to contain a single molecule, and the remainder are expected to contain multiple molecules (Stephens et al., 1990). In order to decrease the number of false positives (positive amplifications resulting from multiple molecules), template with a final concentration of 0.3 amplifiable molecules should be used, meaning that approximately one third of the amplifications will yield a product derived from a single molecule template and less than 5% of positive amplifications will be derived from multiple molecules (Kraytsberg et al., 2004a). Therefore, only amplifications derived from DNA dilutions that resulted in a proportion of positive amplifications conforming to a DNA concentration of 0.3 amplifiable molecules were sequenced.
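The Poisson expectations used in this section can be reproduced with a short Python sketch. It assumes that amplifiable template molecules are distributed independently among wells, so that the number per well follows a Poisson distribution with mean equal to the average number of molecules per well. The function name is hypothetical.

```python
import math

def poisson_well_fractions(mean_templates_per_well):
    """Expected fractions of wells holding zero, exactly one, and more than
    one amplifiable template, under a Poisson model."""
    empty = math.exp(-mean_templates_per_well)
    single = mean_templates_per_well * empty
    multiple = 1.0 - empty - single
    return empty, single, multiple

# One molecule per well on average, then the more dilute 0.3 molecules per well.
for mean_templates in (1.0, 0.3):
    empty, single, multiple = poisson_well_fractions(mean_templates)
    print(f"mean={mean_templates}: empty={empty:.3f}, "
          f"single={single:.3f}, multiple={multiple:.3f}")
```

Lowering the mean number of templates per well reduces the share of wells that start from more than one molecule, at the cost of a larger share of empty wells.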

5.3.3. Selection of loci and design of PCR primers

In order to increase PCR specificity a hemi-nested PCR approach was used, which involves the use of two sets of primers employed in two successive PCR reactions (hereafter described as Phase I and Phase II PCR). During Phase I PCR the first set of primers, which includes the forward-external (FEXT) and the reverse (REXT) primers, is used to generate a DNA product that is longer than the final target sequence (Fig. 5.1.). The product from Phase I PCR is then used to start a second PCR (Phase II) using a set of primers that involves the same reverse primer used in Phase I and a new forward primer (forward-internal, FINT). The forward-internal primer binding site is located within the first amplified sequence, in a region near the forward-external primer binding site (Fig. 5.1.). To ensure that the low numbers of positive amplifications (which are expected in a smPCR approach) are the result of single molecule amplifications and not due to PCR inefficiency, two fragments (amplimers) from different regions of the mitochondrial genome were amplified. Co-segregation of the two markers would confirm PCR effectiveness: if an aliquot gives a positive amplification for one of the markers, then the same aliquot should give a positive amplification for the other marker. If the two markers segregate independently, then the low number of amplifications might be due to PCR inefficiency. The markers chosen were a fragment of the cytochrome b gene (CYTB amplimer) and a fragment that spans part of the 12S and 16S ribosomal genes (12S amplimer) (Fig. 5.2.). To design specific Lacerta lepida primers to amplify the CYTB amplimer, the entire cytb gene was first amplified with modified versions of primer L14919 (TRNAGLU, 5’- AAC CAC CGT TGT ATT TCA ACT - 3’) and L16064 (TRNATHR, 5’- CTT TGG TTT ACA AGA ACA ATG CTT TA - 3’) (Burbrink et al., 2000) using the conditions described in chapter 3. After aligning the entire cytb sequences, specific primers for Lacerta lepida were designed (CYTB-FEXT, 5’ - TTA CAA AAT TAT TAA CTC CTC CT - 3’; CYTB-FINT, 5’ - GCC TAT GTC TTA TTA TTC AAG - 3’ and CYTB-REXT, 5’ - GGT TTA CAA GAA CAA TGC TTT A - 3’). The final CYTB amplimer is 1143 bp long. For the 12S amplimer, published (Paulo et al., 2008) partial sequences of 12S and 16S genes from Lacerta lepida were aligned and specific primers were designed in order to amplify a fragment of similar size to that of the CYTB amplimer. 
As there are no published sequences for the entire 12S and 16S genes from Lacerta lepida the published mitochondrial genome of Lacerta viridis (Böhme et al. , 2007) was used to estimate the approximate final size of the amplimer. The primers designed for the amplification of the 12S amplimer were: 12S-FEXT (5’ - GCA AAT GTT AGG GAA GAG AT - 3’), 12S-FINT (5’ - CTA TTT TAA CAA CGC TCT GGG - 3’) and 16S-REXT (5’ - GAG TCA CTG GGC AGG CAA GA - 3’). The final 12S amplimer is 933 bp long.
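As a quick sanity check on the six primers listed above, the Python sketch below reports each primer's length, GC content and a rough melting temperature. The Wallace rule, Tm = 2(A+T) + 4(G+C), is an assumption introduced here for illustration; it is a coarse rule of thumb and is not stated in the text as the basis for the annealing temperatures used.

```python
# Primer sequences as given in the text (5' to 3'), spaces kept for readability.
PRIMERS = {
    "CYTB-FEXT": "TTA CAA AAT TAT TAA CTC CTC CT",
    "CYTB-FINT": "GCC TAT GTC TTA TTA TTC AAG",
    "CYTB-REXT": "GGT TTA CAA GAA CAA TGC TTT A",
    "12S-FEXT": "GCA AAT GTT AGG GAA GAG AT",
    "12S-FINT": "CTA TTT TAA CAA CGC TCT GGG",
    "16S-REXT": "GAG TCA CTG GGC AGG CAA GA",
}

def primer_summary(spaced_sequence):
    """Return (length, GC count, Wallace-rule Tm in degrees C) for a primer."""
    sequence = spaced_sequence.replace(" ", "").upper()
    gc_count = sum(sequence.count(base) for base in "GC")
    at_count = sum(sequence.count(base) for base in "AT")
    wallace_tm = 2 * at_count + 4 * gc_count  # rough estimate only
    return len(sequence), gc_count, wallace_tm

for name, spaced in PRIMERS.items():
    length, gc_count, tm = primer_summary(spaced)
    print(f"{name}: {length} nt, GC {100 * gc_count / length:.0f}%, Tm ~{tm} C")
```

A more realistic estimate would use a nearest-neighbour model with the salt and primer concentrations of the reaction.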


5.3.4. PCR amplifications, scoring and sequencing

Phase I PCR consisted of the amplification of both markers using a multiplex approach in a mix containing 1x PCR Gold buffer (Perkin-Elmer), 4 mM MgCl2, 200 mM each of dATP, dCTP, dGTP and dTTP (Applied Biosystems), 0.2 mM of forward-external and reverse-external primers (Operon Technologies) for both markers (see above for the primer selection for each locus) and 0.5U Taq Gold DNA polymerase (Perkin-Elmer). Five microlitres of this mix were added to 5 µl of template DNA, which was previously prepared by performing serial dilutions from stock DNA and dispensed in a 96-well plate under 1 drop of oil (see section 5.3.2.). Negative controls (no DNA) were included for all amplifications. In order to avoid contamination, Phase I PCR was prepared in a room separated from any PCR products and DNA solutions at high concentration. Amplifications were conducted in the normal lab as follows: 93 °C for 9 min, then 28 cycles of 94 °C for 20 s, 50 °C for 30 s, 72 °C for 90 s. Phase I PCR products were diluted to 1000 µl with bi-distilled water (ddH2O) and 5 µl aliquots were used to perform the Phase II PCR. Phase II PCR consisted of amplifying the two markers independently in two monoplex PCRs. The 5 µl aliquots of diluted Phase I PCR products were supplemented to give a 10 µl final volume containing 1 mM each of the relevant forward-internal and reverse primers, 1x PCR Gold buffer, 4 mM MgCl2, 200 mM each dNTP and 0.2U Taq Gold DNA polymerase. Amplifications were conducted as follows: 93 °C for 9 min, then 33 cycles of 94 °C for 20 s, 54 °C for 30 s, 72 °C for 90 s. PCR products were analyzed by agarose gel electrophoresis (2%), scoring presence and absence of the expected PCR product. Positive products that could be identified as obtained from a single molecule were purified by filtration through QIAquick® columns (Qiagen) following the manufacturer’s recommendations and the CYTB amplimer was sequenced in both directions using Phase II PCR primers. 
Sequencing reaction mixes consisted of 6.35 µl of ddH2O, 1.5 µl of primer at 3.5 µM, 1 µl of BigDye Terminator v3.1™ (Applied Biosystems) and 1 µl of PCR product. Sequencing reactions were performed as follows: initial incubation at 96 °C for 1 min, then 25 cycles of 90 °C for 10 s, 50 °C for 5 s and 60 °C for 4 min. PCR and sequencing reactions were performed in a DNA Engine Tetrad 2 Peltier thermal cycler, and sequences were obtained using an ABI 3700 capillary sequencer.
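The three thermal cycling programmes described in this section can be written down as data, which makes them easier to compare and check. The sketch below is only an illustration: the class and variable names are hypothetical, and the computed durations are programmed hold times only, ignoring the ramping time between temperatures, which depends on the instrument.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Step:
    temp_c: float   # hold temperature in degrees Celsius
    seconds: int    # hold time in seconds


@dataclass(frozen=True)
class Program:
    name: str
    initial: Step                  # initial incubation, performed once
    cycles: int                    # number of repeated cycles
    cycle_steps: Tuple[Step, ...]  # steps repeated in every cycle

    def hold_seconds(self) -> int:
        """Total programmed hold time (ramping between steps is ignored)."""
        per_cycle = sum(step.seconds for step in self.cycle_steps)
        return self.initial.seconds + self.cycles * per_cycle


# Temperatures and times as stated in the text of section 5.3.4.
PHASE_I = Program("Phase I PCR", Step(93, 9 * 60), 28,
                  (Step(94, 20), Step(50, 30), Step(72, 90)))
PHASE_II = Program("Phase II PCR", Step(93, 9 * 60), 33,
                   (Step(94, 20), Step(54, 30), Step(72, 90)))
SEQUENCING = Program("Cycle sequencing", Step(96, 60), 25,
                     (Step(90, 10), Step(50, 5), Step(60, 4 * 60)))

if __name__ == "__main__":
    for program in (PHASE_I, PHASE_II, SEQUENCING):
        minutes = program.hold_seconds() / 60
        print(f"{program.name}: {minutes:.1f} min of programmed holds")
```

The two PCR phases differ only in the annealing temperature (50 °C versus 54 °C) and the number of cycles (28 versus 33), which is easy to see once the programmes are laid out side by side.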

5.4. Results and discussion

SmPCR was successfully implemented in all samples. The proportion of positive amplifications was always lower (between 0.29 and 0.43, Table 5.1.) than expected from a single molecule PCR protocol. Both amplimers were successfully amplified, supporting the efficacy of the protocol, and no contamination was detected, as shown by the absence of amplification in the negative controls. To infer the phylogenetic relatedness of all smPCR sequences, a statistical parsimony network (see chapter 3, section 3.3.3. for a detailed explanation of the method) was constructed using all 68 mitochondrial haplotypes from chapter 2 and the 55 sequences obtained by smPCR (Fig. 5.3.). Forty-eight smPCR sequences correspond to haplotype 40, six to haplotype 62, one to haplotype 1 and one to a new haplotype (153) not found before. All smPCR sequences from individuals C3 and C4 correspond to the expected mitochondrial DNA haplotype (haplotype 40) previously identified by the amplification of the entire cytb gene (chapter 3). This was not the case for individuals C8 and C9. Although the majority of smPCR sequences in individuals C8 and C9 also correspond to the expected mitochondrial DNA haplotype (haplotype 40 in C8 and 62 in C9), one sequence in each individual corresponds to a different haplotype. Individual C8 carries haplotypes 40 and 1, while individual C9 carries haplotypes 62 and 153.
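The proportions of positive wells in Table 5.1 can be related to the number of amplifiable templates per well with standard limiting-dilution (Poisson) reasoning. The sketch below is not part of the original analysis. It assumes that amplifiable molecules are distributed independently among wells and that all scored wells of a sample received the same dilution, which the text does not state; the function names are hypothetical.

```python
import math


def mean_templates_per_well(fraction_positive: float) -> float:
    """Poisson mean number of amplifiable templates per well.

    Under a Poisson model the fraction of empty wells is exp(-mean),
    so mean = -ln(1 - fraction_positive).
    """
    if not 0.0 <= fraction_positive < 1.0:
        raise ValueError("fraction_positive must be in [0, 1)")
    return -math.log(1.0 - fraction_positive)


def single_template_share(fraction_positive: float) -> float:
    """Expected share of positive wells seeded by exactly one template."""
    mean = mean_templates_per_well(fraction_positive)
    if mean == 0.0:
        return 1.0  # limit as the dilution becomes infinitely high
    p_exactly_one = mean * math.exp(-mean)
    p_at_least_one = 1.0 - math.exp(-mean)
    return p_exactly_one / p_at_least_one


if __name__ == "__main__":
    # Proportions of positive amplifications taken from Table 5.1.
    for sample, fraction in (("C8", 0.37), ("C9", 0.29),
                             ("C4", 0.43), ("C3", 0.33)):
        print(sample,
              round(mean_templates_per_well(fraction), 2),
              round(single_template_share(fraction), 2))
```

Under these assumptions, lower proportions of positive wells correspond to a larger share of positive wells that started from a single molecule, which is the condition the protocol aims for.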

5.4.1. Ruling out Numts

It is highly unlikely that the sequences obtained through smPCR represent Numts rather than real mitochondrial copies, given the level of dilution to which the DNA of each sample was subjected prior to amplification. The DNA in each sample was diluted to a level that only allows fragments of DNA smaller than 17,000 bp to be present in each aliquot, therefore excluding the nuclear genome. Although the single molecularity of the smPCR protocol refers to “a single amplifiable molecule, which means a continuous DNA with no impassable adducts/modifications” (Kraytsberg et al., 2004a), additional broken, “non-amplifiable” molecules might be present in the DNA mixture. Nevertheless, it is highly improbable that dilution would yield an aliquot containing a fragment of nuclear DNA representing exactly the portion under analysis (a cytb Numt). It is therefore more likely that the sequences obtained by smPCR do in fact represent true mitochondrial copies that exist in heteroplasmy in the individuals analysed.

5.4.2. Ruling out contamination

SmPCR is highly prone to contamination, and several measures were therefore taken to avoid it during smPCR preparation. The step most prone to contamination is the preparation of Phase I PCR, as it is during this step that the DNA is most highly diluted. All smPCR Phase I preparations were done in a separate lab (clean lab), which is located in a different building from the main lab. Furthermore, the clean lab was never exposed to high DNA concentrations, PCR products or any work involving reptile DNA, substantially reducing the possibility of contamination. For contamination to be the source of the heteroplasmy detected, DNA representative of haplotype 1 or of the new haplotype 153 would have to have been carried to the clean lab, either through the handling of reagents and materials or through contamination of the diluted DNA prior to its transfer to the clean lab. All reagents and materials used in the clean lab were bought specifically for this purpose and were never in contact with the main lab, where contamination could occur. The only DNA carried to the clean lab was diluted DNA from the 4 individuals analysed, none of which corresponds to haplotype 1 or to the new haplotype (153) detected. Therefore no direct sources of contamination were present in the clean lab. Furthermore, each Phase I PCR preparation was done under a UV hood, which was turned on after the assembly of the Phase I PCR plate so that any DNA present in the hood was eliminated, thus avoiding cross contamination.

The best evidence against contamination in the clean lab is that no amplifications were ever detected in the negative controls, which always represented 16.6% of the wells of each plate. These facts suggest that if contamination was the source of heteroplasmy, the DNA would have to have been contaminated prior to the Phase I PCR setup, either during DNA extraction or during dilution in the main lab. This explanation is only applicable to haplotype 1, as haplotype 153 had never been amplified before. Nevertheless, the DNA carried to the clean lab was already extremely diluted, and if it had already been contaminated a higher frequency of amplifications of haplotype 1 would be expected, close to the frequency of the other amplified haplotype. The very low frequency of one of the haplotypes in both individuals is therefore more consistent with true low levels of heteroplasmy.
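The frequency argument above can be made concrete with a small binomial calculation. This is a sketch under a stated assumption rather than an analysis from the original study: it reads "close to the frequency of the other amplified haplotype" as roughly equal shares (0.5) of the two haplotypes among amplifiable molecules, and it uses the counts for individual C8 from Table 5.1 (1 sequence of haplotype 1 among 13 positive amplifications). The function name is hypothetical.

```python
from math import comb


def prob_at_most(k: int, n: int, p: float) -> float:
    """Binomial probability of observing at most k successes in n trials."""
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


if __name__ == "__main__":
    # Individual C8: 13 positive amplifications, 1 of them haplotype 1.
    # Assumed share of haplotype 1 under the contamination scenario: 0.5.
    print(prob_at_most(k=1, n=13, p=0.5))
```

With equal shares, observing haplotype 1 in at most 1 of 13 single-molecule amplifications has a probability of 14/8192, or about 0.002. The result is sensitive to the assumed share: a contaminant present at a much lower share would not be excluded by this calculation.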

5.4.3. Heteroplasmy and mtDNA recombination

The co-occurrence of mitochondrial DNA haplotypes 1 and 40 in individual C8, and of haplotypes 153 and 62 in C9, confirms the existence of low levels of heteroplasmy in these individuals. An increasing number of species have been shown to harbour some level of heteroplasmy (see section 6.2) and Lacerta lepida seems to be no exception. Haplotype 153 is connected in the network to haplotype 40 by 4 mutations (Fig. 5.3., branch a) and to an ancestral unsampled haplotype * by 3 mutations (Fig. 5.3., branch b). The mutations involved in branches a and b are the same 7 mutations that separate the ancestral unsampled haplotype * from haplotype 40, resulting in the loop. The phylogenetic relationship of haplotype 153 with the remaining haplotypes suggests that it resulted from a recombination event between two divergent mitochondrial haplotypes from the network. An alternative explanation for the origin of haplotype 153 is homoplasy. This would imply that all sites involved either in branch a or in branch b have undergone recurrent mutations, which seems less likely. The recombinant haplotype 153 most likely derived from a recombination event following paternal leakage, resulting in the fusion of paternal and maternal mitochondrial DNA. Recombination through paternal leakage has also been inferred as the most likely explanation for the recombinants detected in the contact zone of two conifers, black spruce (Picea mariana) and red spruce (Picea rubens) (Jaramillo-Correa and Bousquet, 2005), and in the contact zone between two mitochondrial lineages of the Australian frillneck lizard (Chlamydosaurus kingii) (Ujvari et al., 2007).
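The mosaic pattern described for haplotype 153 (sharing the state of one haplotype at 4 of the 7 differing sites and the state of the other at the remaining 3) can be illustrated with a short script. The sequences below are invented toy data, not the Lacerta lepida cytb sequences, and the arrangement of the 7 sites in two contiguous blocks is an assumption made for the illustration; the text does not report the positions of the mutations. All names are hypothetical.

```python
def parental_pattern(parent_a: str, parent_b: str, candidate: str):
    """Label each site where the parents differ by the parent the candidate matches.

    Returns a list of (position, label) with label "A", "B" or "?" (matches neither).
    """
    if not (len(parent_a) == len(parent_b) == len(candidate)):
        raise ValueError("sequences must be aligned to the same length")
    pattern = []
    for pos, (a, b, c) in enumerate(zip(parent_a, parent_b, candidate)):
        if a == b:
            continue  # uninformative site: parents share the same state
        label = "A" if c == a else "B" if c == b else "?"
        pattern.append((pos, label))
    return pattern


def count_switches(pattern) -> int:
    """Number of changes of parental origin along the informative sites."""
    labels = [label for _, label in pattern if label != "?"]
    return sum(1 for x, y in zip(labels, labels[1:]) if x != y)


def mutate(seq: str, sites) -> str:
    """Return a copy of seq with a transition at each listed site (toy data helper)."""
    swap = {"A": "G", "G": "A", "C": "T", "T": "C"}
    return "".join(swap[base] if i in sites else base for i, base in enumerate(seq))


# Toy data: an "ancestral" sequence, a derived haplotype 7 mutations away,
# and a candidate carrying only the last 3 of those 7 mutations.
SITES = [1, 4, 7, 10, 13, 16, 19]
ANCESTOR = "ACGTACGTACGTACGTACGT"
DERIVED = mutate(ANCESTOR, set(SITES))
CANDIDATE = mutate(ANCESTOR, set(SITES[4:]))

if __name__ == "__main__":
    pattern = parental_pattern(ANCESTOR, DERIVED, CANDIDATE)
    print("".join(label for _, label in pattern), count_switches(pattern))
```

A single switch of parental origin along the sequence is what one recombination breakpoint would produce, whereas the homoplasy explanation requires independent recurrent mutations at every site of one branch.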

5.4.4. Origin of heteroplasmy and recombination in Lacerta lepida

In animals, heteroplasmy can arise through the accumulation of somatic mutations (e.g. Khrapko et al., 1997), paternal leakage (e.g. Fontaine et al., 2007) or intramolecular recombination (e.g. Kajander et al., 2000; Lunt and Hyman, 1997). In the case of Lacerta lepida, the heteroplasmy documented in individuals C8 and C9 is most consistent with paternal leakage. In both individuals the differences between the heteroplasmic haplotypes are too large to be explained by the accumulation of somatic mutations within an individual (11 mutational steps between the haplotypes present in C8 and 9 between those in C9). Moreover, it is very unlikely that accumulated mutations would produce a previously sampled haplotype, as is the case for haplotype 1 in individual C8. In fact, most cases of heteroplasmy reported to date in animals originated through paternal leakage, which has been reported in birds (Kvist et al., 2003), arthropods (Fontaine et al., 2007; Kondo et al., 1990; Meusel and Moritz, 1993; Sherengul et al., 2006; Van Leeuwen et al., 2008), fish (Hoarau et al., 2002; Magoulas and Zouros, 1993) and mammals (Gyllensten et al., 1991; Kaneda et al., 1995; Shitara et al., 1998; Steinborn et al., 1998; Sutovsky et al., 2000; Zhao et al., 2004), including humans (Kraytsberg et al., 2004b; Schwartz and Vissing, 2002). It is not possible to determine whether the leakage responsible for the detected heteroplasmy and recombination occurred from the father of the heteroplasmic individuals at the time of fertilization, or whether it occurred several generations ago and was transmitted to these individuals through the maternal line. The latter hypothesis would imply that heteroplasmy persisted in the population for a long period of time. Heteroplasmy can be resolved within one or a few generations through a reduction of mtDNA copies during early oogenesis, as first reported in bovines (Ashley et al., 1989; Hauswirth and Laipis, 1982; Koehler et al., 1991). Nevertheless, the re-establishment of homoplasmy seems to differ amongst taxa and is influenced by the type of mutations involved in the heteroplasmy and, in the case of neutral polymorphisms, by the effective population size. For example, reports show that in mice neutral heteroplasmy can persist for as long as 14 generations (Gyllensten et al., 1991), while in insects the number of generations needed to re-establish homoplasmy might reach 500 (Rand and Harrison, 1986; Solignac et al., 1984). In mammals it is known that mitochondrial genotypes segregate differently in the offspring due to a mitochondrial bottleneck and random segregation of organelles into early embryonic cells, which is seen as a mechanism to prevent the accumulation of deleterious mutations and the “mutational meltdown” that would otherwise occur via Muller’s ratchet (Bergstrom and Pritchard, 1998). This decrease in mtDNA per cell during embryogenesis is followed by a dramatic increase during oogenesis, which means that only a subset of maternal mtDNA will populate the next generation. This can lead to a return to homoplasmy from a heteroplasmic state, but it can also lead to strong founder effects (Bergstrom and Pritchard, 1998). So if heteroplasmy in Lacerta lepida was generated in the past and has persisted in the population through several generations, we should expect to detect haplotypes 1 and 153 at higher frequencies in the sampled area due to random segregation of mtDNA, which is not the case. If heteroplasmy in Lacerta lepida does persist across multiple generations, this would imply that some form of selection is maintaining haplotypes 1 and 153 at low frequencies in the population. Recently, evidence for strong purifying selection has been found in heteroplasmic mice (Stewart et al., 2008), although this is still very controversial.
It seems, therefore, more plausible that the heteroplasmy is recent, resulting from hybridization of diverged mitochondrial phylogroups. The occurrence of haplotypes 1 and 153 in the areas surrounding phylogroup L3 cannot be completely excluded, and thorough sampling could reveal their presence nearby, which would allow paternal leakage to occur through hybridization.
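The effect of a germline bottleneck on a neutral heteroplasmic variant can be sketched with a simple random-sampling simulation. This is a toy model for illustration only, not an analysis from this study: the bottleneck sizes and the starting frequency are hypothetical values, selection is ignored, and each generation is reduced to one binomial sampling step along a single maternal line.

```python
import random


def generations_to_homoplasmy(start_fraction: float, bottleneck_size: int,
                              rng: random.Random,
                              max_generations: int = 100_000):
    """Follow one maternal line until one mtDNA type is fixed.

    Each generation, `bottleneck_size` mtDNA copies are drawn at random from
    the current mix. Returns (generations elapsed, final variant frequency).
    """
    if not 0.0 <= start_fraction <= 1.0:
        raise ValueError("start_fraction must be in [0, 1]")
    if bottleneck_size < 1:
        raise ValueError("bottleneck_size must be at least 1")
    freq = start_fraction
    for generation in range(1, max_generations + 1):
        copies = sum(1 for _ in range(bottleneck_size) if rng.random() < freq)
        freq = copies / bottleneck_size
        if freq in (0.0, 1.0):
            return generation, freq
    return max_generations, freq


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so that the run is reproducible
    for size in (10, 50, 200):  # hypothetical bottleneck sizes
        runs = [generations_to_homoplasmy(0.05, size, rng) for _ in range(200)]
        mean_generations = sum(g for g, _ in runs) / len(runs)
        fixed = sum(1 for _, f in runs if f == 1.0) / len(runs)
        print(size, round(mean_generations, 1), fixed)
```

In this kind of model a rare variant is usually lost within a few generations when the bottleneck is narrow and persists longer when it is wide, and in a small fraction of lines it drifts to fixation instead, which corresponds to the founder effects mentioned above.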

5.5. Conclusion

In this study several important issues regarding the inheritance of mitochondrial DNA in Lacerta lepida were uncovered. It was shown that paternal leakage occurs in this species, giving rise to low-frequency heteroplasmy. Furthermore, evidence for recombination of the mitochondrial genome of Lacerta lepida was also detected. Screening more individuals using smPCR is likely to increase the number of heteroplasmic and recombinant molecules detected and therefore allow a better understanding of these phenomena in Lacerta lepida. Despite the widespread occurrence of heteroplasmy reported in the literature and the incidental cases of mitochondrial DNA recombination, this is the first case in which both phenomena are reported to occur in the same species and in a natural population. Therefore, Lacerta lepida seems to be an excellent system in which to further investigate issues related to mitochondrial DNA heteroplasmy and recombination.

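Fig. 5.1. Schematic representation of smPCR nested design. [Figure not reproduced. Template DNA is amplified in the Phase I PCR with primers FEXT and REXT; the Phase I PCR product is the template for the Phase II PCR with primers FINT and REXT, which yields the final amplimer.]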

Fig. 5.2. Schematic representation of the two amplimers to be amplified by smPCR and the position of the primers used in the nested PCR. Base pair numbers in the first scheme are set according to the published Lacerta viridis 12S and 16S genes, and in the second scheme according to the Lacerta lepida cytb gene. [Figure not reproduced. First scheme (bp 0 to bp 2540): 12S (943 bp), tRNA-val (65 bp) and 16S (1532 bp); Fext_12S starts at bp 702, Fint_12S at bp 811 and Rext_16S at bp 1873; amplimer of 1062 bp. Second scheme (bp 0 to bp 1143): CytB (1143 bp); Fext_CytB starts at bp 33, Fint_CytB at bp 116 and Rext_CytB at bp 1049; amplimer of 933 bp. Legend: Lacerta lepida sequences available at GenBank; fragment to be amplified by smPCR (amplimers).]

Fig. 5.3. Statistical parsimony network of Lacerta lepida cytochrome b haplotypes. Dashed lines represent ambiguities in the network. White circles with no numbers represent unsampled or extinct haplotypes and yellow circles represent haplotypes detected by smPCR. Grey shaded area shows the loop that connects haplotype 153 to the network. The mutations involved in that loop are shown to the left of the network, where a and b represent alternative branches to connect haplotype 153 to the network. [Figure not reproduced. Phylogroup labels shown in the network: L1, L3, L4 and L5. Legend entries: smPCR haplotypes; extinct/unsampled haplotypes.]

Table 5.1. Number of smPCR amplifications performed in Lacerta lepida samples (Total), with scoring of positive (+) and negative (-) amplifications and respective percentage of positive amplifications (%). The absolute frequency of each mitochondrial haplotype detected in each sample is also shown.

Sample code   mtDNA haplotype   (+)   (-)   Total   % (+)   smPCR haplotypes (frequency)
C8            H40               13    22    35      0.37    H40 (12), H1 (1)
C9            H62               6     15    21      0.29    H62 (6), H153 (1)
C4            H40               15    20    35      0.43    H40 (15)
C3            H40               21    42    63      0.33    H40 (21)

5.6. References

Arunkumar KP, Metta M, Nagaraju J (2006) Molecular phylogeny of silkmoths reveals the origin of domesticated silkmoth, Bombyx mori from Chinese Bombyx mandarina and paternal inheritance of Antheraea proylei mitochondrial DNA. Molecular Phylogenetics and Evolution 40 , 419-427.

Ashley MV, Laipis PJ, Hauswirth WW (1989) Rapid segregation of heteroplasmic bovine mitochondria. Nucleic Acids Research 17, 7325-7331.

Avise JC (2004) Molecular Markers, Natural History and Evolution , 2nd edn. Sinauer Associates, Sunderland, Massachusetts.

Barr CM, Neiman M, Taylor DR (2005) Inheritance and recombination of mitochondrial genomes in plants, fungi and animals. New Phytologist 168 , 39-50.

Bergstrom CT, Pritchard J (1998) Germline bottlenecks and the evolutionary maintenance of mitochondrial genomes. Genetics 149 , 2135-2146.

Böhme MU, Fritzsch G, Tippmann A, Schlegel M, Berendonk TU (2007) The complete mitochondrial genome of the Green Lizard Lacerta viridis viridis (Reptilia: Lacertidae) and its phylogenetic position within squamate reptiles. Gene 394 , 69-77.

Burbrink FT, Lawson R, Slowinski JB (2000) Mitochondrial DNA Phylogeography of the Polytypic North American Rat Snake (Elaphe obsoleta): A Critique of the Subspecies Concept. Evolution 54 , 2107-2118.

Burzynski A, Zbawicka M, Skibinski DOF, Wenne R (2006) Doubly uniparental inheritance is associated with high polymorphism for rearranged and recombinant control region haplotypes in Baltic Mytilus trossulus . Genetics 174 , 1081-1094.

Ciborowski KL, Consuegra S, Garcia de Leaniz C, Beaumont MA, Wang J, Jordan WC (2007) Rare and fleeting: an example of interspecific recombination in animal mitochondrial DNA. Biology Letters 3, 554-557.

Detmer SA, Chan DC (2007) Functions and dysfunctions of mitochondrial dynamics. Nature Reviews Molecular Cell Biology 8, 870-879.

Fontaine KM, Cooley JR, Simon C (2007) Evidence for paternal leakage in hybrid periodical cicadas (Hemiptera: Magicicada spp.). PLoS ONE 2, e892.

Gantenbein B, Fet V, Gantenbein-Ritter IA, Balloux F (2005) Evidence for recombination in scorpion mitochondrial DNA (Scorpiones: Buthidae). Proceedings of the Royal Society B: Biological Sciences 272, 697-704.

Gyllensten U, Wharton D, Josefsson A, Wilson AC (1991) Paternal inheritance of mitochondrial DNA in mice. Nature 352, 255-257.

Hauswirth WW, Laipis PJ (1982) Mitochondrial DNA polymorphisms in a maternal lineage of Holstein cows. Proc Natl Acad Sci USA 79, 4686-4690.

Hoarau G, Holla S, Lescasse R, Stam WT, Olsen JL (2002) Heteroplasmy and Evidence for Recombination in the Mitochondrial Control Region of the Flatfish Platichthys flesus . Molecular Biology and Evolution 19 , 2261-2264.

Holt IJ, Dunbar DR, Jacobs HT (1997) Behaviour of a population of partially duplicated mitochondrial DNA molecules in cell culture: segregation, maintenance and recombination dependent upon nuclear background. Human Molecular Genetics 6, 1251-1260.

Jaramillo-Correa JP, Bousquet J (2005) Mitochondrial genome recombination in the zone of contact between two hybridizing conifers. Genetics 171 , 1951-1962.

Kajander OA, Rovio AT, Majamaa K, Poulton J, Spelbrink JN, Holt IJ, Karhunen PJ, Jacobs HT (2000) Human mtDNA sublimons resemble rearranged mitochondrial genomes found in pathological states. Human Molecular Genetics 9, 2821-2835.

Kaneda H, Hayashi J, Takahama S, Taya C, Lindahl K, Yonekawa H (1995) Elimination of paternal mitochondrial DNA in intraspecific crosses during early mouse embryogenesis. Proceedings of the National Academy of Sciences 92 , 4542-4546.

Khrapko K, Coller HA, André PC, Li X-C, Hanekamp JS, Thilly WG (1997) Mitochondrial mutational spectra in human cells and tissues. Proc Natl Acad Sci USA 94 , 13798-13803.

Koehler CM, Lindberg GL, Brown DR, Beitz DC, Freeman AE, Mayfield JE, Myers AM (1991) Replacement of bovine mitochondrial DNA by a sequence variant within one generation. Genetics 129 , 247-255.

Kondo R, Satta Y, Matsuura ET, Ishiwa H, Takahata N, Chigusa SI (1990) Incomplete Maternal Transmission of Mitochondrial-DNA in Drosophila. Genetics 126 , 657-663.

Konfortov BA, Bankier AT, Dear PH (2007) An efficient method for multi-locus molecular haplotyping. Nucleic Acids Research 35, e6.

Krause J, Dear PH, Pollack JL, Slatkin M, Spriggs H, Barnes I, Lister AM, Ebersberger I, Paabo S, Hofreiter M (2006) Multiplex amplification of the mammoth mitochondrial genome and the evolution of Elephantidae. Nature 439, 724-727.

Kraytsberg Y, Khrapko K (2005) Single-molecule PCR: an artifact-free PCR approach for the analysis of somatic mutations. Expert Review of Molecular Diagnostics 5, 809-815.

Kraytsberg Y, Nekhaeva E, Chang C, Ebralidse K, Khrapko K (2004a) Analysis of somatic mutations via long-distance single molecule PCR. In: DNA amplification. Current technologies and applications (eds. Demidov VV, Broude NE), p. 335. Horizon Bioscience, Wymondham.

Kraytsberg Y, Schwartz M, Brown TA, Ebralidse K, Kunz WS, Clayton DA, Vissing J, Khrapko K (2004b) Recombination of human mitochondrial DNA. Science 304 , 981-981.

Kvist L, Martens J, Nazarenko AA, Orell M (2003) Paternal leakage of mitochondrial DNA in the great tit ( Parus major ). Molecular Biology and Evolution 20 , 243-247.

Ladoukakis ED, Zouros E (2001) Direct evidence for homologous recombination in mussel (Mytilus galloprovincialis) mitochondrial DNA. Molecular Biology and Evolution 18 , 1168-1175.

Lin MT, Simon DK, Ahn CH, Kim LM, Beal MF (2002) High aggregate burden of somatic mtDNA point mutations in aging and Alzheimer's disease brain. Human Molecular Genetics 11 , 133-145.

Lukyanov KA, Matz MV, Bogdanova EA, Gurskaya NG, Lukyanov SA (1996) Molecule by molecule PCR amplification of complex DNA mixtures for direct sequencing: an approach to in vitro cloning. Nucleic Acids Research 24 , 2194-2195.

Lunt DH, Hyman BC (1997) Animal mitochondrial DNA recombination. Nature 387, 247.

Magoulas A, Zouros E (1993) Restriction-site heteroplasmy in Anchovy ( Engraulis encrasicolus ) indicates incidental biparental inheritance of mitochondrial DNA. Molecular Biology and Evolution 10 , 319-325.

Meusel MS, Moritz RFA (1993) Transfer of paternal mitochondrial DNA during fertilization of honeybee (Apis mellifera L.) eggs. Current Genetics 24, 539-543.

Paabo S, Irwin DM, Wilson AC (1990) DNA damage promotes jumping between templates during enzymatic amplification. Journal of Biological Chemistry 265 , 4718-4721.

Paulo OS, Pinheiro J, Miraldo A, Bruford MW, Jordan WC, Nichols RA (2008) The role of vicariance vs. dispersal in shaping genetic patterns in ocellated lizard species in the western Mediterranean. Molecular Ecology 17 , 1535-1551.

Rand DM, Harrison RG (1986) Mitochondrial DNA transmission in crickets. Genetics 114 , 955-970.

Rokas A, Ladoukakis E, Zouros E (2003) Animal mitochondrial DNA recombination revisited. Trends in Ecology & Evolution 18 , 411-417.

Schwartz M, Vissing J (2002) Paternal inheritance of mtDNA in a patient with mitochondrial myopathy. European Journal of Human Genetics 10 , 239-239.

Sherengul W, Kondo R, Matsuura ET (2006) Analysis of paternal transmission of mitochondrial DNA in Drosophila. Genes and Genetic Systems 81 , 399-404.

Shitara H, Hayashi J, Takahama S, Kaneda H, Yonekawa H (1998) Maternal inheritance of mouse mtDNA in interspecific hybrids: Segregation of the leaked paternal mtDNA followed by the prevention of subsequent paternal leakage. Genetics 148 , 851-857.

Solignac M, Génermont J, Monnerot M, Mounolou J-C (1984) Genetics of mitochondria in Drosophila: mtDNA inheritance in heteroplasmic strains of D. mauritiana . Molecular and General Genetics MGG 197 , 183-188.

Steinborn R, Zakhartchenko V, Jelyazkov J, Klein D, Wolf E, Müller M, Brem G (1998) Composition of parental mitochondrial DNA in cloned bovine embryos. FEBS Letters 426 , 352-356.

Stephens JC, Rogers J, Ruano G (1990) Theoretical underpinning of the single-molecule-dilution (SMD) method of direct haplotype resolution. American Journal of Human Genetics 46, 1149-1155.

Stewart JB, Freyer C, Elson JL, Wredenberg A, Cansu Z, Trifunovic A, Larsson N-G (2008) Strong purifying selection in transmission of mammalian mitochondrial DNA. PLoS Biology 6, e10.

Sutovsky P, Moreno RD, Ramalho-Santos J, Dominko T, Simerly C, Schatten G (2000) Ubiquitinated sperm mitochondria, selective proteolysis, and the regulation of mitochondrial inheritance in mammalian embryos. Biology of Reproduction 63 , 582-590.

Tang Y, Manfredi G, Hirano M, Schon EA (2000) Maintenance of Human rearranged mitochondrial DNAs in long-term cultured transmitochondrial cell lines. Molecular Biology of the Cell 11 , 2349-2358.

Thyagarajan B, Padua RA, Campbell C (1996) Mammalian mitochondria possess homologous DNA recombination activity. Journal of Biological Chemistry 271 , 27536-27543.

Ujvari B, Dowton M, Madsen T (2007) Mitochondrial DNA recombination in a free-ranging Australian lizard. Biology Letters 3, 189-192.

Van Leeuwen T, Vanholme B, Van Pottelberge S, Van Nieuwenhuyse P, Nauen R, Tirry L, Denholm I (2008) Mitochondrial heteroplasmy and the evolution of insecticide resistance: Non-Mendelian inheritance in action. Proceedings of the National Academy of Sciences 105 , 5980-5985.

Yaffe MP (1999) The machinery of mitochondrial inheritance and behavior. Science 283 , 1493-1497.

Zhao X, Li N, Guo W, Hu X, Liu Z, Gong G, Wang A, Feng J, Wu C (2004) Further evidence for paternal inheritance of mitochondrial DNA in the sheep ( Ovis aries ). Heredity 93 , 399-403.

Appendix

Schematic representation of smPCR protocol

Part I

Template DNA dilution and smPCR PHASE I – “Clean lab”

Step 1 : Prepare a 96 well plate (Plate A) with template DNA serially diluted starting from the stock solution (10^3 G/µl). Five different concentrations will be tested, with 16 wells per concentration and 16 negative control wells. Serial DNA dilution scheme:

Rows 1 & 2 = 200 µl at 25 G/µl (195 µl ddH2O + 5 µl Stock DNA)
Rows 3 & 4 = 150 µl at 5 G/µl (120 µl ddH2O + 30 µl of A)
Rows 5 & 6 = 150 µl at 1 G/µl (120 µl ddH2O + 30 µl of B)
Rows 7 & 8 = 150 µl at 0.20 G/µl (120 µl ddH2O + 30 µl of B)
Rows 9 & 10 = 150 µl at 0.04 G/µl (120 µl ddH2O + 30 µl of B)
Rows 11 & 12 = 200 µl ddH2O

[Diagram label: Step 1, Plate A, DNA template (serially diluted).]
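The concentrations in the Step 1 dilution scheme follow from the transfer and final volumes, and can be checked with a few lines of arithmetic. In the sketch below the unit "G" is read as genome copies, which is an interpretation the protocol does not spell out, and each dilution is assumed to be prepared from the one before it, as the stated concentrations imply. The function names are hypothetical.

```python
def dilute(concentration: float, transfer_ul: float, final_ul: float) -> float:
    """Concentration after bringing `transfer_ul` of a solution up to `final_ul`."""
    if transfer_ul <= 0 or final_ul < transfer_ul:
        raise ValueError("volumes must satisfy 0 < transfer_ul <= final_ul")
    return concentration * transfer_ul / final_ul


def dilution_series(stock: float, steps):
    """Apply successive (transfer_ul, final_ul) dilutions, each from the previous one."""
    concentrations = []
    current = stock
    for transfer_ul, final_ul in steps:
        current = dilute(current, transfer_ul, final_ul)
        concentrations.append(current)
    return concentrations


STOCK_G_PER_UL = 1000.0  # stock solution, 10^3 G/µl
# (transfer volume, final volume) in µl, as listed in Step 1 of the protocol.
STEPS = [(5, 200), (30, 150), (30, 150), (30, 150), (30, 150)]
TEMPLATE_UL_PER_WELL = 5  # volume of template transferred to each PCR well

if __name__ == "__main__":
    for conc in dilution_series(STOCK_G_PER_UL, STEPS):
        per_well = conc * TEMPLATE_UL_PER_WELL
        print(f"{conc:g} G/µl -> about {per_well:g} G per 5 µl aliquot")
```

The series reproduces the listed concentrations (25, 5, 1, 0.2 and 0.04 G/µl). With 5 µl of template per well, the two lowest concentrations correspond to an expected 1 and 0.2 copies per well, which is the range in which single-molecule amplification becomes likely.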

Step 2 : In a new plate (Plate B) dispense 1 drop of mineral oil in each well.

Step 3 : With a multi-channel pipette transfer 5 µl of DNA template from Plate A to Plate B, to obtain Plate B1. (Release the DNA template underneath the oil.)

Step 4 : Prepare Phase I reaction mix for 120 reactions in an Eppendorf tube and dispense 75 µl of this solution to the first column wells of a new plate (Plate C).

Step 5 : With a multi-channel pipette dispense 5 µl of PCR Phase I mix to the wells of Plate B1, obtaining Plate B2. Release the PCR mix at the top of the walls without touching the mineral oil. (Start dispensing from the lowest concentration to the highest concentration wells.)

Step 6 : PCR plate B2.

[Diagram labels: Plate B, mineral oil; Plate B1, mineral oil and DNA template; Plate C, Phase I PCR mix; Plate B2, mineral oil + DNA template + PCR mix.]

Part II

smPCR Dilution of Phase I PCR products – Normal lab

Step 7 : Add 60 µl of ddH2O to the wells of Phase I PCR product to obtain Plate B3. Centrifuge.

Step 8 : In a deep well plate (Plate D) dispense 250 µl of ddH2O in each well.

Step 9 : With a multi-channel pipette transfer 15 µl of DNA template from Plate B3 to Plate D, obtaining a final 100x dilution of Phase I PCR products (Plate D1).

[Diagram labels: Plate B3, Phase I PCR product + 60 µl ddH2O; Plate D, 250 µl of ddH2O; Plate D1, 100x dilution of Phase I PCR products.]

Part III

smPCR Phase II – Normal lab

Step 10 : Dispense 1 drop of mineral oil in a new plate (Plate E).

Step 11 : With a multi-channel pipette transfer 5 µl of Phase I diluted PCR products from Plate D1 to the bottom of Plate E, obtaining Plate E1. Centrifuge.

Step 12 : Prepare Phase II reaction mix (for one marker) for 120 wells in an Eppendorf tube and dispense 75 µl of this solution to the first column wells of a new plate (Plate F).

Step 13 : With a multi-channel pipette dispense 5 µl of PCR Phase II mix to the wells of Plate E1, obtaining Plate E2. Release the PCR mix at the top of the walls without touching the mineral oil. Centrifuge.

Step 14 : PCR plate E2.

Step 15 : Repeat steps 11-14 for the second marker, changing the primers used in the reaction mix.

[Diagram labels: Plate D1, 100x dilution of Phase I PCR products; Plate E, mineral oil; Plate E1, mineral oil + diluted Phase I PCR products; Plate F, Phase II PCR mix; Plate E2, mineral oil + DNA template + PCR mix.]

Chapter 6

General discussion and conclusions

Photos by Andreia Miraldo: car and traps used during 3 years of fieldwork

6. General discussion and conclusions

By studying a species with a distribution that encompasses the entire Iberian Peninsula it was possible to obtain a broader and more complete picture of the role of this peninsula as a diversification hotspot. Using mitochondrial and nuclear genealogies it became clear that Lacerta lepida, like other species in the region, has endured repeated processes of fragmentation that have promoted the diversification of six genetically and geographically distinct lineages. Estimating the dates of divergence between the different evolutionary lineages revealed that diversification within Lacerta lepida is largely concordant with the onset of the major glaciations at the beginning of the Pleistocene, approximately 2 Mya. The earliest divergence, during the Miocene, represents a deep split within the species, marking the divergence of a lineage (lineage N) associated with the Betic Mountains in south-eastern Spain. Both climatically mediated events during the Quaternary and geological events associated with the evolution of the Mediterranean basin are inferred to have triggered intraspecific diversification within Lacerta lepida.

The majority of phylogeographic studies within Iberia reveal similar diversification events across several taxa that are usually attributed to allopatric differentiation in several refugia within the Peninsula, although sometimes at different temporal scales. These studies are, however, typified by the absence of detailed analyses of the distribution of ancestral and derived alleles within each lineage. This approach has been shown to be extremely valuable for the delimitation of refugial areas (Emerson and Hewitt, 2005), and in the context of this work it has identified six geographically distinct refugia within the Iberian Peninsula. The identified refugia occur throughout the region: in north-western Iberia, around the gorges of the Douro River; in central Spain, around the central mountain system; in inland central Portugal, in the Tagus River region; in the south-western corner of Portugal, in the Algarve region; in southern Spain, around the Guadalquivir area; and finally in the Betic Mountains in south-eastern Spain.

Of particular interest are the refugia detected around the gorges of the Douro River, in the region between Portugal and Spain and in the central system mountains. The detection of such northerly located refugia, for what is considered a Mediterranean species, suggests that suitable ecological conditions have existed at these northern latitudes during glacial maxima. Although northern refugia in Iberia have been previously detected they are typically associated with species with ecological requirements intimately associated with Atlantic influences (e.g. Lacerta schreiberi and Chioglossa lusitanica ). This thesis reveals that Lacerta lepida is likely to have persisted in these northerly refugia as well, emphasizing the importance of these regions in the survival of species with very different ecological requirements throughout adverse climatic conditions.

The phylogeographic analysis of Lacerta lepida has also revealed areas of secondary contact between divergent lineages, formed mainly as a result of demographic range expansions. Detailed analysis of two different contact zones between Lacerta lepida mitochondrial lineages was carried out, revealing very different dynamics for each. The contact zone in the north-western part of Iberia is relatively recent. Evidence for hybridization was inferred from the detection of a Numt within one of the lineages that originated from the mitochondrial genome of the other. The detection of additional Numts from different introgression events is consistent with the past existence of other mitochondrial lineages that are now extinct. Although Numts have been described in a wide range of taxa, their function in the genome, if any, is unknown. However, their utility as a tool in evolutionary biology is recognized, as they provide a unique window on past evolutionary events (Bensasson et al., 2001). Despite their potential as important sources of information, very few studies to date have taken advantage of Numts. Once detected, Numts are typically discarded from further analysis. This thesis has demonstrated that Numts can be extremely valuable in the context of phylogeographic analysis, as they can provide evidence for past demographic events. The indiscriminate discarding of Numts from analysis may result in researchers losing a valuable source of information, and in this study Numts reveal that hybridization between lineages has occurred within L. lepida.

An unexpected outcome of the detailed analysis of the northern contact zone was the detection of low levels of heteroplasmy and mitochondrial DNA recombination in one of the mitochondrial lineages. These findings add to the already extensive list of studies reporting exceptions to the general assumptions regarding mitochondrial DNA inheritance in animals. Heteroplasmy and mtDNA recombination were detected in only one of the lineages, and it remains unknown whether these phenomena are widespread in Lacerta lepida. A wide range of mechanisms, which can act at any stage of the reproductive process, are responsible for enforcing the strict maternal inheritance of mtDNA in animals (for a review see Birky, 1995). These mechanisms vary from the complete lack of mitochondria in the sperm to the active elimination of paternally derived mitochondria at fertilization. For example, in some tunicates paternal mitochondria fail to enter the egg, whereas in honey bees more than a quarter of all mitochondria in an egg are reported to be paternally derived, although their mtDNA replicates defectively and becomes undetectable by the larval stage. In most animals, though, strictly maternal inheritance of mtDNA seems to result from a combination of factors (Birky, 1995; Birky, 2001), involving the limited number of sperm mitochondria that enter the oocyte during fertilization and their active elimination by a ubiquitin-dependent mechanism (Sutovsky et al., 1999). This process secures the homoplasmy of the embryo. Nevertheless, the recognition of paternal mtDNA apparently depends on phylogenetic relatedness. As the degree of genetic divergence between species increases, the probability that sperm mitochondria are recognized and eliminated decreases, but so does the probability that F1 hybrids are viable and fertile.
Therefore, the path to producing heteroplasmy and recombinant haplotypes in a population through hybridization might be narrow. Earlier studies in Drosophila suggested that paternal leakage is more likely to occur if the (uncorrected) genetic divergence between taxa is approximately 2.5% or higher (Kondo et al., 1990), but more recent studies detected leakage between Drosophila subspecies that show much lower levels of divergence (Sherengul et al., 2006). In cicadas, leakage was also demonstrated in crosses spanning a wide range of genetic divergence, from almost none to 8% (Fontaine et al., 2007). Divergence levels between Lacerta lepida lineages range from 1% to almost 13%, so it is possible that paternal leakage could occur between most lineages that form zones of secondary contact.

The findings of this thesis have implications for evolutionary analyses using mtDNA, but their significance depends on the capacity of the detected heteroplasmy and recombination to leave a footprint at the population level. To be consequential for future generations, heteroplasmy must persist via the germ line and remain in the oocyte long enough for recombination to occur. The establishment of a fertile female hybrid carrying a recombinant haplotype may then result in the transmission of that haplotype to the next generation through backcrossing with a male from either species. Repeated backcrossing may then fix the recombinant mtDNA haplotype against the nuclear background of one or the other parental species. In summary, even if biparental recombination is detected in an individual, further prerequisites must be met for this recombination to leave a footprint at the population level.

The dynamics of gene flow were also assessed in a contact zone between the Lepida and Nevadensis mitochondrial lineages in south-eastern Spain. Microsatellite analysis of this contact zone revealed very restricted gene flow between the lineages, and it was postulated that the lineages are on independent evolutionary paths and should therefore be considered two different species. For future work it would be interesting to assess the mechanisms that are driving speciation in these lizards. According to Jiggins and Mallet (2000), premating isolation is likely to be more effective than hybrid incompatibility in maintaining species differences despite gene flow. In Heliconius butterflies it seems that speciation occurred after the evolution of ecological divergence and mate choice differences, and well before hybrid unfitness (Mallet et al., 1998). Although F1 hybrids were detected in the contact zone between Lepida and Nevadensis, the data suggest that some form of hybrid incompatibility might be responsible for the reduced gene flow observed. Nevertheless, prezygotic mechanisms are also quite likely to be involved in the dynamics of this contact zone. The lineages show extreme variation in colour patterns (Mateo and Castroviejo, 1990; Mateo and López-Jurado, 1994; Mateo et al., 1996), and visual cues may play an important role in mate recognition in these lizards, as has been observed in other lacertid lizards (Molina-Borja, 1987). Courtship behaviour in Lacerta lepida usually involves overt displays of the lateral blue spots (Paulo, 1988; personal observation), which differ substantially between the lineages in contact. Other prezygotic barriers could result from differences in the timing of reproductive activity, as Nevadensis shows an extended reproductive period (Castilla and Bauwens, 1989; Mateo, 1988; Mateo and Castanet, 1994). Both prezygotic and postzygotic mechanisms could well be important in maintaining the isolation between these two lineages, and distinguishing which are more important will require detailed analysis.

This thesis represents a major contribution to our understanding of the evolutionary history of Lacerta lepida, providing extensive information about the history, distribution and dynamics of genetic variation within the species. It also contributes to the understanding of the evolutionary dynamics of Iberian Peninsula biota in general, in particular by describing areas of importance for species diversification and survival in response to historical and contemporary events. This study also provides the first detailed analysis of secondary contact zones within the species, offering insights into the hybridization and speciation processes that have shaped its evolutionary history. The detection of Numts that originated through hybridization between divergent lineages, together with the recognition of their utility in phylogeographic studies, is an exciting prospect for the field of phylogeography. Also exciting is the strong evidence for mitochondrial DNA recombination, which until now has rarely been reported for natural populations.

6.1. References

Bensasson D, Zhang D-X, Hartl DL, Hewitt GM (2001) Mitochondrial pseudogenes: evolution's misplaced witnesses. Trends in Ecology & Evolution 16, 314-321.

Birky CW (1995) Uniparental inheritance of mitochondrial and chloroplast genes: mechanisms and evolution. Proceedings of the National Academy of Sciences 92, 11331-11338.

Birky CW (2001) The inheritance of genes in mitochondria and chloroplasts: laws, mechanisms, and models. Annual Review of Genetics 35, 125-148.

Castilla AM, Bauwens D (1989) Reproductive characteristics of the lacertid lizard Lacerta lepida. Amphibia-Reptilia 10, 445-452.

Emerson BC, Hewitt GM (2005) Phylogeography. Current Biology 15, 367-371.

Fontaine KM, Cooley JR, Simon C (2007) Evidence for paternal leakage in hybrid periodical cicadas (Hemiptera: Magicicada spp.). PLoS ONE 2, e892.

Jiggins CD, Mallet J (2000) Bimodal hybrid zones and speciation. Trends in Ecology & Evolution 15, 250-255.

Kondo R, Satta Y, Matsuura ET, Ishiwa H, Takahata N, Chigusa SI (1990) Incomplete maternal transmission of mitochondrial DNA in Drosophila. Genetics 126, 657-663.

Mallet J, McMillan WO, Jiggins CD (1998) Mimicry and warning color at the boundary between races and species. In: Endless forms: species and speciation (eds. Howard D, Berlocher SH), pp. 390-403. Oxford University Press, Oxford.

Mateo JA (1988) Estudio sistematico y zoogeografico de los Lagartos Ocelados, Lacerta lepida Daudin, 1802, y Lacerta pater (Lataste, 1880), (Sauria: Lacertidae), Universidad de Sevilla.

Mateo JA, Castanet J (1994) Reproductive strategies in three Spanish populations of the ocellated lizard, Lacerta lepida (Sauria, Lacertidae). Acta Oecologica 15, 215-229.

Mateo JA, Castroviejo J (1990) Variation morphologique et revision taxonomique de l’espece Lacerta lepida Daudin, 1802 (Sauria, Lacertidae). Bulletin du Muséum National d’Histoire Naturelle de Paris 12, 691-706.

Mateo JA, López-Jurado LF (1994) Variaciones en el color de los lagartos ocelados; aproximación a la distribución de Lacerta lepida nevadensis Buchholz 1963. Revista Española de Herpetología 8, 29-35.

Mateo JA, López-Jurado LF, Guillaume CP (1996) Variabilité électrophorétique et morphologique des lézards ocellés (Lacertidae): un complexe d’espèces de part et d’autre du détroit de Gibraltar. Comptes Rendus de l’Académie des Sciences Série III-Sciences de la Vie-Life Sciences 319, 737-746.

Molina-Borja M (1987) Spatio-temporal distribution of aggressive and courting behaviors in the lizard Gallotia galloti from Tenerife, the Canary Islands. Journal of Ethology 5, 11-15.

Paulo OS (1988) Estudo eco-etologico da populacao de Lacerta lepida (Daudin 1802) (Sauria, Lacertidae) da ilha da Berlenga, Universidade de Lisboa.

Sherengul W, Kondo R, Matsuura ET (2006) Analysis of paternal transmission of mitochondrial DNA in Drosophila. Genes and Genetic Systems 81, 399-404.
