The spatial scale of speciation and patterns of diversity

Yael Kisel

A thesis submitted for the degree of Doctor of Philosophy

of the University of London

Division of Biology

Imperial College London

Silwood Park Campus

December 2010

Abstract

Many environmental factors and taxon traits have been studied as potential controllers of diversification, but there is no consensus as to which are most important or how to link them into a general theory of diversification. I hypothesise that diversification is strongly controlled by the interaction between area and clades’ spatial scales of speciation, or the minimum amount of area they require for speciation to occur. Furthermore, I hypothesise that the spatial scale of speciation is controlled by population genetic characteristics of clades, as speciation is ultimately a process of population divergence. In this thesis, I quantify taxonomic variation in the spatial scale of speciation, test whether it can be explained by variation in population genetics and evaluate whether it can explain taxonomic patterns of diversity. Using a survey of speciation events on isolated oceanic islands, I show that the spatial scale of speciation varies greatly between birds, lizards, snails, bats, carnivorous mammals, lepidoptera, angiosperms and ferns. I also use a meta-analysis of population genetic data collected from the literature to show that the minimum area for speciation of these groups correlates strongly with their average level of gene flow. I then test the link between population genetics and diversification by comparing population genetic characteristics of sister clades of tropical orchids that differ greatly in species richness. Contrary to expectation, levels of gene flow, genetic drift and local adaptation do not correlate directly with rates of diversification. However, there is some evidence for an interaction between species range size and gene flow in controlling diversification. This thesis supports a framework based on the interaction between area and the spatial scale of speciation as a useful foundation for general theories of diversification. 
It also highlights the potential for using a comparative population genetic approach in macroevolutionary studies.


Acknowledgements

First, I would like to thank Tim Barraclough and Mark Chase for all their support and patience. I am also grateful to NSF, the Imperial College Deputy Rector's Scholarship, the University of London Central Research Fund, the Kew Bentham-Moxon Trust, and Sigma Xi for funding my work. In Costa Rica, I especially thank Diego Bogarín, who helped me with everything. I am very grateful to Jorge Warner for making me welcome at the Jardín Botánico Lankester and helping me with permits. Thanks also to many others at JBL: Franco Pupulin for advice; Rei and Rafa and Adam Karremans for coming on field trips; Allan and Enzo Salas for helping me settle in; and Socorro for dealing with paperwork. Thanks to the helpful staff of SINAC, especially Javier Guevara, Roger Blanco and Oscar Masis, and to UCR reserve director Ronald Sanchez. Thanks also to the owners of Bosque de Paz Reserve, and Melania Muñoz for organizing my visit there; to the owners of the Rara Avis Reserve; and to Freddy and Katia who allowed me to collect on their land. Great thanks to my fieldwork volunteers, Ryan Phillips, Julia Hu, Paul Renshaw and Kath Castillo. Finally, thanks to Fanny Bonilla and Carlos Piedra for hosting us so warmly. In the lab, my thanks first to Martyn Powell, who took on the task of teaching me AFLPs and answered my never-ending questions thereafter. I am also very grateful to Vincent Savolainen for allowing me to use his group’s lab facilities. Thanks to Robyn Cowan and Ovidiu Paun for in-depth talks about AFLP methodology and troubleshooting. Special thanks to Helen Hipperson for troubleshooting help, support and empathy. Thanks to Thomas Hahn for trying hard to figure out my AFLP troubles over email. Finally, thanks to everyone in the Savolainen lab for helping me find my feet with labwork, in particular Haris Saslis-Lagoudakis, Hanno Schaefer, Guillaume Besnard, Silvana del Vecchio, Paul Rymer and Alex Papadopulos. 
Thanks to Christian Lexer, Ally Phillimore and Alex Pigot for many illuminating discussions, and to all my friendly labmates, especially Diego Fontaneto. Thanks to Diana Anderson and Christine Short for cheerful help with everything administrative, and John Williams and the security crew for opening doors and solving problems. Thanks to the whole friendly Silwood community for making my PhD years so enjoyable, with special mention to Kat, Martina, Irka, Geraldine, Sally, Susanne, and Alice. Immense thanks to the house: Lynsey, Ellie and Lena for everything, but especially keeping me sane. Thanks to Bruce Tiffney, for inspiring me to be a passionate scientist and giving me the idea to go to the UK. Thanks to my parents for their unwavering support, even after I decided to move to a country 8 time zones away. And finally, thanks to Martin, for field help, lab help, R help, formatting help, numerous bloody marys and much more.


Declaration of originality

I declare that all the work presented in this thesis is my own original research, with the following acknowledgements for each chapter:

Chapter 2 has been published in slightly modified form in American Naturalist. It was written in collaboration with Tim Barraclough and made use of unpublished checklist data generously provided by Shai Meiri, Ana M. C. Santos, Roberto S. Gómez, Tod F. Stuessy and Christophe Thebaud. It was also greatly improved by suggestions from Jonathan Davies, Joaquin Hortal, Christian Lexer, Shai Meiri, Lynsey McInnes, Ally Phillimore, Andy Purvis, Vincent Savolainen and two anonymous reviewers.

Chapters 3 and 4 made use of data-formatting and analysis scripts for R written by Martin Turjak.


Table of Contents

Abstract

Acknowledgements

Declaration of originality

Table of Contents

List of Figures

List of Tables

List of Equations

Chapter 1. Introduction
    Comparative methods for studying variation in diversity
    What controls variation in diversity? Organism traits versus environmental variables
    A proposed framework for understanding variation in diversification
    Approach and aims
    Summary of aims

Chapter 2. Using oceanic islands to measure the spatial scale of speciation and its association with gene flow
    Introduction
    Materials and Methods
    Island selection and data collection
    Island species data collection
    Identification of speciation events
    Adding phylogenetic information
    Statistical analysis of the speciation-area relationship
    Gene flow data
    Gene flow analyses
    Results
    Data availability and quality
    Quantifying the speciation-area relationship
    Measuring minimum areas for speciation
    Testing the importance of area when other environmental variables are included
    The effect of gene flow
    Discussion
    Main findings
    The effects of other island characteristics on speciation probability
    The spatial scale of speciation and gene flow
    Evolutionary explanations for the observed patterns
    The effects of taxonomic practice and surveying effort
    Implications for evolutionary studies of diversity patterns

Chapter 3. The relationship between gene flow and clade diversification rates in Costa Rican orchids
    Introduction
    Methods
    Study group selection
    Study species phylogeny reconstruction
    Sample collection
    AFLP genotyping
    AFLP scoring
    Finalising AFLP datasets
    Analysing Fst patterns
    Testing the influences of species range size and ecology
    Results
    Discussion
    Findings
    Study limitations
    Ideas for future work
    Conclusion

Chapter 4. The effects of genetic drift and local adaptation on clade diversification rates in Costa Rican orchids
    Introduction
    Methods
    Results
    Discussion
    Main findings
    Study limitations
    Conclusion

Chapter 5. Conclusion
    Summary of results
    Directions for future work
    General conclusions

Bibliography

Appendix I. Kisel et al. (in review). How diversification rates and diversity limits combine to create large-scale species-area relationships.

Appendix II. Supplementary tables and figures for chapter three.


List of Figures

Figure 1.1. Hypothesised framework linking geographic area and the spatial scale of speciation as controllers of speciation and diversification.

Figure 2.1. Patterns of diversification on islands.

Figure 2.2. Relationship between Fst and the geographic extent of study.

Figure 2.3. Relationship between the probability of speciation and area.

Figure 2.4. Minimum island size for speciation versus the average level of gene flow when measured over geographic ranges of 10-100 km.

Figure 2.5. Results of an alternative gene flow analysis - the relationship between the minimum area for speciation and the spatial scale of neutral population differentiation.

Figure 3.1. Photos of study species.

Figure 3.2. Sampling locations for study species from (a) the Masdevallia-Trisetella and (b) the Lepanthes-Lepanthopsis clade pairs.

Figure 3.3. Sampling locations for study species from (a) the Platystele- and (b) the Epidendrum-Brassavola clade pairs.

Figure 3.4. Sampling locations for study species from the Scaphyglottis-Jacquiniella clade pair.

Figure 3.5. Relationship between Fst and distance for all study species for the full dataset.

Figure 3.6. Relationship between Fst and distance for all study species for the reduced dataset.

Figure 3.7. Associations of Fst and species range size with species phylogeny (data from the full dataset).

Figure 3.8. Relationships between species elevation range and number of habitats and whether any pair-wise Fst value is over 0.2, for the full dataset.

Figure 3.9. Associations between species range size and elevation range and number of habitats, over all species native to Costa Rica from the study clades.

Figure 4.1. Examples of digitised leaf outlines.

Figure 4.2. Associations of (a) gene diversity, (b) overall Pst and (c) difference between Pst and Fst with species phylogeny for the full dataset.

Figure 4.3. Associations of (a) gene diversity, (b) overall Pst and (c) difference between Pst and Fst with species phylogeny for the reduced dataset.

Figure 4.4. Relationship between gene diversity and species elevation range for the reduced dataset.


List of Tables

Table 2.1. Area and speciation statistics, by taxonomic group.

Table 2.2. Area-only models for the probability of speciation.

Table 2.3. Model-averaged parameter estimates and relative importance values for analyses including archipelagos.

Table 2.4. Model-averaged parameter estimates and relative importance values for analyses excluding archipelagos.

Table 3.1. Study clade pairs, with genera they include and currently accepted species richness.

Table 3.2. Species collected, with distributions and habitats.

Table 3.3. Accession numbers for matK sequences used to build study species phylogeny.

Table 3.4. Details of AFLP method used for each study species.

Table 3.5. Fst values for all study species.

Table 4.1. Measures of levels of genetic drift, local adaptation and overall phenotypic divergence for all study species.


List of Equations

Equation 4.1. Nei’s gene diversity

Equation 4.2. Pst


Chapter 1. Introduction

One of the biggest mysteries in biology is why groups vary so much in diversity. Species are unevenly distributed among groups of organisms at all hierarchical levels (Dial and Marzluff 1989; Marzluff and Dial 1991). The existence of this pattern, also called imbalance, is not surprising, as it is predicted by null models of diversification (Raup et al. 1973; Farris 1976). However, not all imbalance seen in nature can be explained by chance, as many taxa are more imbalanced than expected under null models (Dial and Marzluff 1989; Marzluff and Dial 1991). This must be the result of variation between groups in speciation or extinction rates or in limits to group diversity, and so understanding what controls speciation, extinction and diversity limits is the key to understanding variation in diversity. This question has been a focus of research for decades but clear answers are still lacking.
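To make concrete how a null model can produce imbalance by chance alone, the following is a minimal sketch of an equal-rates, pure-birth simulation. It is an illustration only, not an analysis from this thesis: the function names, the total of 100 species and the 90% threshold are all assumptions chosen for the example. Two sister clades start with one species each, and every existing lineage is equally likely to speciate next.

```python
import random


def simulate_sister_clade_sizes(total_species, rng):
    """Grow two sister clades under an equal-rates pure-birth model.

    Each clade starts with one species; at every step one lineage is
    chosen uniformly at random across both clades and splits in two.
    """
    size_a, size_b = 1, 1
    while size_a + size_b < total_species:
        # A lineage in clade A is chosen with probability size_a / total.
        if rng.random() < size_a / (size_a + size_b):
            size_a += 1
        else:
            size_b += 1
    return size_a, size_b


def fraction_strongly_imbalanced(total_species=100, threshold=0.9,
                                 replicates=10000, seed=1):
    """Fraction of simulated sister pairs in which the larger clade
    holds at least `threshold` of all species (hypothetical cut-off)."""
    rng = random.Random(seed)
    imbalanced = 0
    for _ in range(replicates):
        size_a, size_b = simulate_sister_clade_sizes(total_species, rng)
        if max(size_a, size_b) >= threshold * total_species:
            imbalanced += 1
    return imbalanced / replicates


if __name__ == "__main__":
    print(fraction_strongly_imbalanced())
```

Under this model the size of either clade is uniformly distributed between 1 and N - 1, so with 100 species roughly one sister pair in five should show a 90:10 split or worse even though all lineages share the same speciation rate. Imbalance beyond what such a simulation produces is what calls for a biological explanation.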

Comparative methods for studying variation in diversity

The study of diversity patterns is fundamentally a comparative field, and the comparative methods used have advanced greatly over the last decades (reviewed in Sanderson and Donoghue 1996; Ricklefs 2007a). The earliest studies simply compared current species richness of taxa (for instance, families), and tested for an association between species richness and taxon traits (e.g. Herrera 1989; Tiffney and Mazer 1995). It was soon realised, however, that this approach has many drawbacks: the taxa compared are likely to be nonmonophyletic (if defined based on morphology), nonindependent (because relationships between them are not taken into account) and noncomparable (because genera, families, etc. are not defined consistently and vary in age) (Isaac et al. 2003). The solution, first argued for by Felsenstein (1985), was to take account of phylogenetic relationships, and this has been a part of comparative studies of diversity ever since. One of the simplest methods of taking phylogeny into account is to use sister group comparisons, in which the relationship between a trait and group diversity is assessed over pairs of clades that are each other’s closest relatives. Sister group comparisons have a number of advantages: the sister pairs are independent, the use of replicated pairs increases the power of the test, diversification rates are truly compared because sister pairs are the same age by definition, and many confounding variables are controlled for because sister pairs share most of their history (Barraclough et al. 1998).
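One simple way to analyse a set of sister group comparisons is a sign test, which asks whether the clade carrying the trait is the more species-rich member of its pair more often than expected by chance. The sketch below is illustrative only: the function name and the species richness values are hypothetical, and the sign test is just one of several tests that could be applied to sister pairs.

```python
from math import comb


def sign_test_p_value(pairs):
    """Two-sided sign test over sister-clade pairs.

    `pairs` is a list of (richness_with_trait, richness_without_trait)
    tuples. Tied pairs carry no directional information and are dropped.
    """
    wins = sum(1 for with_trait, without in pairs if with_trait > without)
    losses = sum(1 for with_trait, without in pairs if with_trait < without)
    n = wins + losses
    if n == 0:
        return 1.0
    k = max(wins, losses)
    # Probability of a split at least this uneven in one direction
    # if each direction is equally likely in every pair.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical species richness values, invented for illustration.
    example_pairs = [(120, 15), (40, 38), (9, 3), (300, 22), (17, 60),
                     (55, 12), (8, 2), (73, 70), (26, 5), (14, 4)]
    print(sign_test_p_value(example_pairs))
```

Because each pair contributes only the direction of the difference, the test is robust to the very skewed distribution of species richness, at the cost of ignoring the size of each contrast.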

However, sister pair comparisons do not make use of all information in a phylogenetic tree, and they cannot separate the effects of speciation and extinction (Barraclough and Nee 2001). The first problem can be dealt with by making sister group comparisons over all nodes in a tree (Felsenstein 1985; Isaac et al. 2003) or by using whole-tree approaches to test whether appearances of a trait are associated with shifts in diversification rate (e.g. Nee et al. 1992; Chan and Moore 2002). Separating the effects of speciation and extinction is a more difficult problem. It has been addressed by using lineage through time plots (Harvey et al. 1994; Barraclough and Nee 2001) and likelihood models that include parameters for both speciation and extinction (Nee et al. 1994; Nee 2006; Ricklefs 2007a), but recent evidence suggests fossil data may always be required (Quental and Marshall 2010; Rabosky 2010) and the best approach is likely to combine phylogenetic and fossil data (Purvis 2008). In summary, there is now a wide array of methods available for measuring diversification rates and testing their association with factors hypothesised to affect diversification, giving the study of diversity patterns a solid and rigorous foundation.

What controls variation in diversity? Organism traits versus environmental variables

The first studies testing hypotheses for taxonomic variation in diversity searched for “key innovations” (single traits that trigger bursts of increased diversification) by comparing the diversification of taxa with and without traits of interest. This focus developed from Simpson’s (1953) idea that increased diversification is driven by ecological opportunity resulting from availability of a new region, extinction of competitors or colonisation of a new “adaptive zone” as the result of a newly evolved key innovation (Futuyma 1986). However, the idea of adaptive zones was later criticised as tautological because they are identified by the niches that organisms occupy (Cracraft 1982), and the range of traits studied with this method broadened to include those that might affect rates of speciation or extinction directly, rather than by opening new adaptive zones. Such trait-based studies are still carried out today, and many have found significant associations (reviews in Coyne and Orr 2004; Jablonski 2008). For example, phytophagy in insects is strongly associated with increased diversification (Mitter et al. 1988) and increased diversification in angiosperms is associated with floral nectar spurs (Hodges and Arnold 1995). More recently investigated traits include biotic pollination (Kay and Sargent 2009) and the lability of ornamentation in birds (Cardoso and Mota 2008). However, no individual trait studied so far has had great success in explaining variation in diversity; even multivariate trait models are able to explain only 10-24% of variation in species richness among groups (Phillimore et al. 2006). Furthermore, key innovation-type traits are usually specific to one taxon, and so cannot help in establishing general theories of diversification.

Another approach to studying variation in diversity emerged from the field of ecology, where there has long been an interest in identifying environmental correlates of regional species richness. Eventually, it was recognised that these environmental correlates might affect diversification rates as well as ecological limits to species coexistence. One of the earliest theories of how environmental factors could affect diversification rates was proposed by Cracraft (1982, 1985), who placed greatest importance on geological complexity (which he hypothesised would affect speciation rates through its association with numbers of barriers) and “environmental harshness” (which would control extinction rates). Since then, much research has explored effects of environmental variables on diversification, and the same variables identified as important in structuring regional diversity have also proved important in structuring taxonomic diversity. Foremost on the list in both cases are area, time, latitude, habitat diversity and topographical diversity (Rosenzweig 1995; Mayhew 2007; Ricklefs 2007b). Environmental variables typically explain much more variation in species richness than do taxon traits (for example, area alone explains 40% of variation in angiosperm sister family diversity, Davies et al. 2004). However, like taxon traits, they are not sufficient on their own for explaining patterns of diversity.

Researchers are increasingly realising that diversification is best understood as the result of the interaction of environmental variables and taxon traits. In angiosperms, for example, biotic pollination is correlated with increased diversification (as mentioned above), but usually only in concert with topographical, edaphic or climatic diversity (Kay and Sargent 2009). In birds, species richness is thought to be heavily influenced by the size, geographic complexity, latitude and age of regions occupied by clades, but also by their mating system and levels of sexual dimorphism (Ricklefs 2003, 2007a, b). The diversification histories of taxa at lower hierarchical levels are also often best explained by a combination of environmental variables and organism traits. For example, in the gentian genus Halenia, nectar spurs are associated with increased diversification, but only in clades that colonised a new biotic environment, South America (von Hagen and Kadereit 2003). In addition, some traits may influence diversification mainly via their effect on clade or species range size (Vamosi and Vamosi 2010).

One factor that, until recently, has rarely been explicitly discussed is the difficulty in distinguishing whether variation in species richness is the result of variation in diversification rates or in ecological limits to diversity (Rabosky 2009a, b). With perhaps some exceptions (e.g. the propensity for sexual selection, which is likely to affect diversity only through speciation rates), all ecological variables and taxon traits that have been studied so far could affect diversity either through diversification rates or diversity limits. This topic is discussed in more detail in Kisel et al. (in review) (Appendix I).

A proposed framework for understanding variation in diversification

For general models of variation in diversity, environmental variables that have been studied have an advantage over most taxon traits studied because they are relevant to all taxa. Whereas all taxa occur in regions of a particular area, latitude and topographic complexity, not all taxa can be winged, phytophagous or biotically pollinated. However, some general taxon traits have been studied, and these traits tend to show similar levels of importance across disparate taxa in structuring diversity. For example, molecular rates have only weak support in driving diversification in both plants (Jobson and Albert 2002; Davies et al. 2004) and birds (Cardillo et al. 2005), whereas poorer dispersal ability is significantly associated with increased diversification in both birds (Bohonak 1999; Belliure et al. 2000) and marine invertebrates (Jablonski 1986). This suggests that general taxon traits have promise for constructing a general theory of diversification.

I propose that a simple way to understand taxonomic variation in diversification rates is to consider them the product of available area and a general taxon trait, the spatial scale of speciation. The spatial scale of speciation is defined here as the minimum size of region (or amount of geographic isolation) required for speciation to occur. Area is known to be a major factor influencing diversification (discussed further below), and if clades vary in their spatial scale of speciation, this should strongly affect the rate at which they can diversify within a region of a particular area. Further, I predict that the spatial scale of speciation is controlled by levels of population genetic processes (particularly natural selection, gene flow, mutation and genetic drift) because these control the process of speciation (Grant 1981). This conceptual framework is illustrated in Figure 1.1.


Figure 1.1. Hypothesised framework linking geographic area and the spatial scale of speciation as controllers of speciation and diversification. Population genetic processes that control population divergence determine the spatial scale of speciation; the interaction of area and spatial scale of speciation determines the speciation rate, which influences diversification rate and total diversity.

Many lines of evidence support a strong role for geographic area in controlling diversification. First, many taxa show bursts of diversification after dispersing to new areas. This is best exemplified by adaptive radiations on oceanic archipelagos (e.g. Parent and Crespi 2006), but the same pattern has also been seen in the histories of mainland groups (e.g. Kodandaramaiah and Wahlberg 2007). This finding is supported by phylogenetic studies showing increased diversification in groups with greater dispersal ability, where this indicates the ability to colonise new areas (e.g. in birds, Phillimore et al. 2006). Second, there is palaeontological evidence for a link between changes in species richness through time and changes in area of suitable habitat available. One of the best-studied examples of this is in the near-shore marine environment, in which global fossil species richness through time correlates with changes in continental shelf area (although it is difficult to untangle the effects of increased area and increased fragmentation, and there is debate over whether this is only the result of variation in the amount of fossil-bearing rock preserved; Valentine and Moores 1970; Smith 1988; Smith 2007). Third, speciation rates in birds, snails and cichlid fish are known to increase with the area available (eSARs; Losos and Schluter 2000; Seehausen 2006; Losos and Parent 2009). Finally, there is evidence that larger clade geographic ranges are associated with higher diversification rates and/or species richness (Gaston and Blackburn 1997; Cardillo et al. 2003; Price and Wagner 2004; Phillimore et al. 2007; Vamosi and Vamosi 2010).

The idea of a spatial scale of speciation has not been set out before, but some recent papers studying diversification on islands suggest that it exists and varies between taxa. Coyne and Price (2000) showed that there is no evidence of bird speciation on any isolated oceanic island smaller than 10,000 km². Although they used this as evidence that sympatric speciation is rare or nonexistent in birds, it is also evidence that no barriers sufficient for allopatric speciation are available for birds in regions of this size. Losos and Schluter (2000), studying Anolis lizards on Caribbean islands, found evidence of speciation, but never on islands smaller than 3,000 km². Losos and Parent (2009) found that the threshold area for speciation was only 18.1 km² for snails in the Galapagos, while Seehausen (2006) found evidence for cichlid fish speciation even in lakes with surface areas smaller than 1 km². These studies suggest that birds have a coarse scale of speciation, lizards a medium scale of speciation, and snails and cichlid fish, relatively fine scales of speciation.

The spatial scale of speciation should affect speciation rates by determining both whether speciation can occur in a given region and the number of speciation events that can occur simultaneously in regions large enough for it to occur. In a group with a fine spatial scale of speciation, populations are able to diverge even if separated by short distances, so even in a small region many populations could be diverging at any point in time and the overall rate of speciation should be high. In contrast, in a group with a coarse spatial scale of speciation, populations are only able to diverge if separated by great distances, which means that even in larger regions only a few populations will be diverging at any time and the overall rate of speciation should be low. Furthermore, in such a group speciation in small regions will occur only rarely, under unusual circumstances.
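The reasoning above can be sketched as a toy calculation. The threshold areas are the values quoted earlier from Coyne and Price (2000), Losos and Schluter (2000) and Losos and Parent (2009); everything else is an assumption made for illustration, in particular the idea that one speciation event can proceed per block of minimum area and that thresholds act as sharp cut-offs. The function and variable names are hypothetical.

```python
# Threshold areas (km^2) quoted in the text. Treating them as sharp
# cut-offs that apply in any region is a deliberate simplification.
MIN_AREA_KM2 = {
    "birds": 10_000.0,
    "Anolis lizards": 3_000.0,
    "Galapagos snails": 18.1,
}


def simultaneous_speciation_events(region_area_km2, min_area_km2):
    """Toy assumption: one speciation event can proceed per block of
    `min_area_km2`, so a region smaller than that supports none."""
    if min_area_km2 <= 0:
        raise ValueError("min_area_km2 must be positive")
    return int(region_area_km2 // min_area_km2)


def compare_taxa(region_area_km2):
    """Number of concurrent speciation events per taxon in one region."""
    return {
        taxon: simultaneous_speciation_events(region_area_km2, min_area)
        for taxon, min_area in MIN_AREA_KM2.items()
    }


if __name__ == "__main__":
    # A hypothetical island of 5,000 km^2.
    for taxon, events in compare_taxa(5_000.0).items():
        print(f"{taxon}: {events}")
```

In this sketch the same island allows no speciation in a taxon with a coarse spatial scale, a single event in a taxon with a medium scale, and hundreds of concurrent events in a taxon with a fine scale, which is the qualitative contrast the framework predicts.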

Population divergence and speciation are affected by many environmental factors other than area or geographic isolation, and so the spatial scale of speciation should be modulated by additional environmental factors as well. For example, given a particular area, the spatial scale of speciation should be finer in more habitat-rich regions where there is greater opportunity for population divergence as the result of divergent selection. As a result, the spatial scale of speciation of each taxon is likely to vary among regions. This complicates studies of the spatial scale of speciation, as regional characteristics must be considered in addition to taxon traits. However, this could also mean that the spatial scale of speciation mediates the effects of many environmental factors on diversification, in addition to area, which would make a theoretical framework based on the spatial scale of speciation even more useful.

Population genetic characteristics of clades should be closely associated with the spatial scale of speciation because speciation always requires population divergence (Gavrilets 2003). Although there are numerous concepts of what “speciation” means, all of these centre on a scenario of population divergence within a species resulting in two new units that are sufficiently separated to be considered new species (in the biological species concept, by development of reproductive isolating mechanisms; Mayr 1942). There is much debate over roles of selection (Schluter 2000, 2009; Sobel et al. 2010) and geography/gene flow in driving speciation (Butlin et al. 2008), and whether speciation requires divergence in the whole genome or only some genes (Wu and Ting 2004), but all models of speciation nonetheless require some amount of divergence in at least one trait. Ultimately, this divergence is controlled by population genetic processes; it is generated by divergent selection, genetic drift or mutation, and it is prevented by gene flow and balancing selection (Grant 1981). For example, in the classic allopatric model of speciation, selection in geographically isolated populations drives divergence that incidentally results in reproductive isolation (Mayr 1942); in polyploid speciation, chromosomal multiplication produces (usually) instant and simultaneous genetic divergence and reproductive isolation (Stebbins 1950); and in speciation by sexual selection, selection in separated populations drives divergence in mate choice traits that cause reproductive isolation as well (Coyne and Orr 2004). For this reason, I hypothesise that the population genetic characteristics of clades should be associated with their spatial scale of speciation and rates of speciation and diversification. Groups with, on average, stronger divergent selection, increased genetic drift, faster mutation rates and/or reduced gene flow within species should have finer spatial scales of speciation (and faster rates of diversification).

Approach and aims

I use two complementary approaches to investigate the spatial scale of speciation and its relationship with diversification.

First, in chapter two, I investigate taxonomic variation in the spatial scale of speciation and test whether it can be explained by variation between taxa in the level of gene flow. I do this by surveying speciation events on isolated oceanic islands worldwide for a wide range of animal and plant taxa, and using a large set of gene flow data compiled from the population genetic literature. I hypothesise that groups with increased average levels of gene flow within species should have coarser spatial scales of speciation.

Second, I test the association between diversification rates of clades and population genetic characteristics of species, which I expect to affect diversification via their effect on the spatial scale of speciation. In chapter three I test whether the level of gene flow has an effect on rates of diversification and overall clade diversity; in chapter four I test the effects of the levels of local adaptation and genetic drift. These analyses use population genetic data I generate for species from sister groups of Costa Rican orchids using AFLP genotyping and morphometric analyses of their leaves. I hypothesise that groups with reduced gene flow, increased genetic drift and/or increased local adaptation within species should have higher rates of diversification and greater species richness.

Throughout, I use molecular data to estimate the population genetic characteristics of clades. To my knowledge, molecular data have never been used in comparative studies of diversification. Instead, such studies have used morphological proxies, such as sexual dichromatism to estimate the past strength of sexual selection. In large part this is due to the previous difficulty in generating population genetic data. However, PCR-based approaches for genotyping individuals, such as AFLP and RAPD, have made population genetic data much easier to generate, and many more are now available in the literature. Comparative population genetics is a growing field and has the potential to contribute to the study of diversification patterns.

A final note: in discussing my proposed framework, I have focused on the contribution of speciation to variation in diversity, but extinction is also important in controlling diversity. It is clear that extinction rates vary nonrandomly between taxa (Purvis 2008). Furthermore, population genetic characteristics of clades are expected to be associated with extinction as well as speciation rates (Frankham et al. 2010), so an effect of population genetics on diversification would not alone be evidence for my framework. However, if population genetics affect diversification rates via speciation rates, the associations should be opposite to those resulting from population genetics acting via extinction. While increased speciation rates are expected with smaller populations (and thus increased genetic drift) and reduced gene flow, decreased extinction rates are expected with larger populations and increased gene flow. Furthermore, whereas increased speciation is expected with increased local adaptation, there is no simple expectation for any association between selection and extinction. Thus, if I find an association between population genetics and diversification, I should be able to distinguish whether this is the result of variation in speciation (thus matching the expectations of my proposed framework) or the result of variation in extinction.

Summary of aims

• To measure taxonomic variation in the spatial scale of speciation
• To test whether population genetic characteristics of species control the spatial scale of speciation
• To test whether population genetic characteristics of species affect diversification rates of clades

23 Chapter 2. Measuring the spatial scale of speciation

Chapter 2. Using oceanic islands to measure the spatial scale of speciation and its association with gene flow

Introduction

Although area is generally expected to affect speciation rates (MacArthur and Wilson 1967; Endler 1977; Rosenzweig 1995), most work on the spatial context of speciation has focused on patterns of range overlap between emerging species, ignoring questions of geographical scale (Mayr 1942; Butlin et al. 2008). Geographical theories of speciation predict that the probability of speciation occurring within a given region should (1) increase with the size of the region - because of greater opportunity for divergence to occur within the region (MacArthur and Wilson 1967; Endler 1977; Rosenzweig 1995; Gavrilets and Vose 2005; Losos and Parent 2009) - and (2) increase as the level of gene flow decreases, for example among organisms with shorter dispersal distances. Gene flow is the main process opposing population differentiation (Mayr 1963), and so the level of gene flow between populations is expected to be an important determinant of the spatial scale at which genetic divergence and speciation can occur (Slatkin 1973, 1985; Doebeli and Dieckmann 2003). However, despite the potential of this body of theory for explaining taxonomic and geographic variation in biodiversity (Ricklefs 2007b), the extent to which the scale of speciation varies among taxa and the causes of such variation remain unknown.

This chapter was published in slightly modified form as Kisel, Y. and T. G. Barraclough. 2010. Speciation has a spatial scale that depends on levels of gene flow. American Naturalist 175: 316-334.


Oceanic islands are useful for studying speciation because their well-defined boundaries and isolation make it easier to distinguish within island (in situ) speciation from immigration than in continental regions. Several studies have used islands to explore the relationship between speciation rates and area. Diamond (1977) noted the lack of bird speciation in Pacific landmasses smaller than New Zealand, but also observed that insects, lizards and ferns had diversified within smaller islands, such as New Caledonia. Coyne and Price (2000) found no evidence for speciation in birds within oceanic islands worldwide smaller than 10,000 km², setting a lower bound for their minimum area for speciation. Losos and Schluter (2000) estimated the minimum area for speciation in Caribbean Anolis lizards as 3,000 km² and found that speciation rates increased linearly with island area above this limit. Similar relationships were found in cichlid fish in African lakes (Seehausen 2006) and Bulimulus snails in the Galapagos (Parent and Crespi 2006; Losos and Parent 2009), but with different minimum areas for speciation (in cichlids, < 1 km²; in snails, 18.1 km²). Together with case studies of speciation on small islands (cichlids in crater lakes, Schliewen et al. 1994; Barluenga et al. 2006; palms on Lord Howe, Savolainen et al. 2006), these studies suggest that the spatial scale of speciation varies widely among taxa. However, only a few taxa have been investigated, and those on different sets of islands: comparison of several taxa across a broad range of island sizes is needed to quantify taxonomic variation in the spatial scale of speciation and identify its cause. To address this, I survey speciation events for a broad range of taxa on oceanic islands from around the world.

Islands also vary in many other factors that might affect rates of diversification (Carlquist 1974; Bauer 1988; Paulay 1994; Rosenzweig 1995). Even if a speciation-area relationship exists, it need not be the result of area directly - for instance, larger islands tend to have higher habitat diversity, which could foster higher rates of ecological speciation (Losos and Parent 2009). Island age might also affect diversification, by increasing the time over which speciation can have occurred or through other effects related to the dynamics of island ageing (Emerson and Oromi 2005; Sequeira et al. 2008; Gillespie 2004; discussed in Whittaker et al. 2009). In addition, the degree of isolation from other landmasses might affect speciation rates, if lower colonisation rates to more isolated islands leave more niches open to be filled by in situ speciation (Gillespie and Baldwin 2009). My aim here is to use islands as a model to study the spatial scale of speciation, rather than to explain island diversification in all its detail. However, because these other factors - especially habitat diversity and age - are likely to be correlated with island area, I include them in my analysis to be able to separate out the effects of area itself.

Another complication is the existence of archipelagos. As extreme examples of habitat fragmentation, archipelagos are expected to promote higher levels of diversification, especially for taxa that disperse well over land but not water (Diamond 1977; Losos and Parent 2009). However, the degree to which the rate of speciation will be increased within an archipelago should depend on the dispersal ability of the taxon and the size of the water gaps between islands. For some taxa, barriers within islands may be sufficiently strong isolating factors already that rates of speciation are no higher in archipelagos than in single islands of comparable size. Therefore, I repeat my analyses both including and excluding isolated archipelagos, to test their effect on the probability of speciation.


Many traits of organisms and species have been hypothesised to affect rates of speciation (Jablonski 2008), but viewed in a geographic context, dispersal ability is expected to be key. This has especially been argued in the specific context of oceanic islands (Diamond 1977; Paulay 1994; Ranker et al. 1994; Parent and Crespi 2006; Whittaker and Fernandez-Palacios 2007; Gillespie and Baldwin 2009; Givnish et al. 2009), where there are many examples of spectacular radiations of taxa with normally poor dispersal abilities but a propensity for passive long-distance dispersal (for example, weevils on Rapa, Paulay 1985, and snails in Bonin, Chiba 1999). Diamond (1977) argued more specifically that dispersal ability might determine the threshold island area necessary for within island speciation to take place, but this idea remains untested.

In this study I use a comparative approach to measure the extent of variation in the spatial scale of speciation and to test the importance of gene flow in controlling this variation. To quantify the speciation-area relationship and the spatial scale of speciation, I survey the probability of in situ speciation on islands of different sizes for angiosperms, bats, birds, mammals of the order Carnivora, ferns, lizards, Macrolepidoptera (large butterflies and moths) and land snails. These taxa were chosen based on the availability of required data (see Methods) but they also represent a broad taxonomic range that varies in presumed dispersal ability. As a measure of the probability of in situ speciation on islands, I use the proportion of endemic lineages derived from single immigration events that have diverged within an island into two or more descendent species (Coyne and Price 2000; Figure 2.1). I also test the relationship between the probability of in situ speciation and other island factors that could potentially confound my analysis of the speciation-area relationship.


After establishing the extent of variation in the spatial scale of speciation, I test the importance of gene flow in setting the spatial scale of speciation by correlating the minimum area for speciation in each group with an independent measure of the average level of gene flow derived from the population genetic literature. To get comparable estimates of the level of gene flow for the study taxa, Fst values are compiled from the molecular ecology literature for each taxon along with a measure of the geographical scale of each study. Fst provides a measure of the genetic differentiation of populations within a species (0 = no differentiation; 1 = complete differentiation; Wright 1931) that should be robust to variation in the spatial arrangement of populations and the type of genetic marker used for analysis (Beaumont and Nichols 1996). It correlates strongly with broad-scale taxonomic variation in dispersal ability, and more consistently so than other population genetic measures such as Nm (Bohonak 1999). When estimated from appropriate data - namely, from neutral loci unaffected by selection, and from populations that are currently connected by gene flow and that have not undergone dramatic historical movements (Barton 2001) - Fst values provide a measure of the level of gene flow that is comparable between species.
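As a rough illustration of what an Fst value summarises, the sketch below computes Fst for a single biallelic locus as (HT - HS)/HT, where HS is the mean expected heterozygosity within populations and HT is the expected heterozygosity of the pooled population. The function name, the equal weighting of populations and the example allele frequencies are assumptions made for illustration only; the studies compiled here use a range of estimators.

```python
def fst_biallelic(freqs):
    """Fst for one biallelic locus from per-population allele frequencies.

    Assumption (illustrative only): populations are equally weighted.
    """
    if not freqs:
        raise ValueError("need at least one population")
    # Mean expected heterozygosity within populations (HS)
    hs = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    # Expected heterozygosity of the pooled population (HT)
    p_bar = sum(freqs) / len(freqs)
    ht = 2 * p_bar * (1 - p_bar)
    if ht == 0:
        return 0.0  # locus fixed everywhere: no differentiation to measure
    return (ht - hs) / ht


# Hypothetical allele frequencies in two populations
print(fst_biallelic([0.5, 0.5]))  # identical populations
print(fst_biallelic([0.0, 1.0]))  # populations fixed for different alleles
```

The two calls show the endpoints of the scale described above: identical allele frequencies give no differentiation, and populations fixed for alternative alleles give complete differentiation.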

Here I test two main hypotheses: that across a variety of major taxa, there should be a positive relationship between area and the probability of in situ speciation; and that variation between taxa in the minimum area required for speciation is controlled by variation in the average level of gene flow within species.


Figure 2.1. Patterns of diversification on islands. A and D are two species in the same genus, both native to the mainland. B, C, E, F and G (in italics) are endemic island species. (a) Illustration of the relationship between patterns of diversification and numbers of endemic species. (1) shows in situ, cladogenetic speciation (Stuessy et al. 2006), in which a mainland species reaches an island and subsequently splits within the island into two new species. (2) shows how multiple colonisation followed by anagenetic change, with no diversification within the island, can create the same pattern of multiple endemic species within one genus (Coyne and Price 2000; Stuessy et al. 2006). (3) shows in situ, anacladogenetic speciation (Stuessy et al. 2006), in which a mainland species reaches an island and remains unchanged while budding off an endemic daughter species. (b) Illustration of the phylogenetic patterns resulting from these three modes of island diversification. Each phylogeny is presented underneath the diversification mode that produces it. Species native to islands are circled; tree nodes within the circles represent within island speciation events.


Materials and Methods

Island selection and data collection

Islands were selected on the basis of their level of isolation - only islands at least 100 km away from any other landmass, including continents and other islands, were used. I used only isolated islands in order to minimise the chance of continuing gene flow from outside populations of colonising species or of multiple colonisation leading to apparent speciation when no diversification has occurred within the island. Of the islands for which appropriate data were available, only Easter Island (Rapa Nui) was excluded, because of its long history of habitat degradation and human-caused extinction before discovery by European taxonomists (Diamond 2005).

Isolated archipelagos were used as units of study when the longest leg in the minimum network connecting all islands was less than 100 km long, following the logic of Coyne and Price (2000). In all cases, distances between islands were less than half the distance of the archipelago from the nearest other landmass. The area of each archipelago was calculated as the area of a 15-19 point minimum spanning polygon enclosing all islands in the archipelago, using the Analyzing Digital Images software package (Pickle and Kirtley 2008). All analyses were repeated both with and without the inclusion of archipelagos. In the multiple regression models including archipelagos, I tested whether speciation is more likely within archipelagos than single islands by including a term, “ArchYN”, coded as 0 (single island) or 1 (archipelago). Except where indicated, the results from analyses including archipelagos are presented.
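The grouping rule above can be sketched as follows: build the minimum network (a minimum spanning tree) connecting all islands and check that its longest leg is under 100 km. This is a minimal sketch, assuming a precomputed symmetric matrix of inter-island distances in km; the function names and the example distances are hypothetical and are not taken from the dataset.

```python
def longest_mst_leg(dist):
    """Longest edge of the minimum spanning tree, via Prim's algorithm.

    dist: symmetric matrix (list of lists) of inter-island distances in km.
    """
    n = len(dist)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    in_tree[0] = True
    # Cheapest known connection from the tree to each island outside it
    best = list(dist[0])
    longest = 0.0
    for _ in range(n - 1):
        # Pick the island outside the tree that is closest to it
        nxt = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        longest = max(longest, best[nxt])
        in_tree[nxt] = True
        for i in range(n):
            if not in_tree[i] and dist[nxt][i] < best[i]:
                best[i] = dist[nxt][i]
    return longest


def is_single_unit(dist, max_leg_km=100.0):
    """True if the longest leg of the minimum network is below the threshold."""
    return longest_mst_leg(dist) < max_leg_km


# Hypothetical three-island group: legs of 40 km and 95 km connect all islands
example = [[0, 40, 130],
           [40, 0, 95],
           [130, 95, 0]]
print(longest_mst_leg(example), is_single_unit(example))
```

Note that the 130 km gap between the first and third islands does not matter here, because the minimum network reaches the third island through the second.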


Data on island area, elevation, isolation and latitude were largely collected from the United Nations Environment Programme (UNEP) online Island Directory database. Isolation was calculated following UNEP methods for “Isolation Index” (Dahl 2004). Island data missing from the UNEP database were collected from primary literature or government databases where possible. Island statistics and references are given in Appendix Table A1, which can be found in the Dryad online data repository (http://datadryad.org/repo/handle/10255/dryad.887).

Island species data collection

Study taxa were chosen based on the availability of sufficient and comparable data, while also aiming to represent a broad taxonomic range of plants and animals with presumed differences in dispersal ability. In the context of this study, “bats” refers to the order Chiroptera. “Ferns” refers to the class Filicopsida, phylum Pteridophyta excluding Psilotopsida, Lycopsida and Equisetopsida. “Lizards” refers to the order Squamata excluding snakes and amphisbaenians, and “snails” refers to terrestrial pulmonate snails in the orders Stylommatophora, Mesurethra, Heterurethra and Sigmurethra. “Macrolepidoptera” refers to butterflies and moths in the superfamilies Bombycoidea, Lasiocampoidea, Axioidea, Calliduloidea, Hedyloidea, Drepanoidea, Geometroidea, Hesperioidea, Mimallonoidea, Noctuoidea, Papilionoidea, Sphingoidea and Uranioidea.

Data were collected for all group/island combinations for which complete (to the best ability of the source authors) taxon lists, with endemism information, were found in the libraries of the Natural History Museum, London, and the Royal Botanic Gardens, Kew, in online sources (Avibase, Lepage 2008; Flora of Australia Online, Australian Biological Resources Study 2008) or in databases available to me (island lizards and carnivores worldwide, S. Meiri, unpublished data; endemic plants of selected islands, R. Salguero-Gomez, unpublished data; checklist of Mascarene plants, C. Thebaud, unpublished data). The large scope of the study prohibited a comprehensive survey of recently published journal articles and so I may not have found the most recent species lists in a few cases. Data were not available for all taxa on all islands included in the analysis (numbers of islands with data for each taxon are given in Table 2.1).

For each group/island combination where data were available, the name and endemism status of each native species were recorded according to the source. Therefore, in practice I used the species concepts held by the taxonomists who wrote the species lists. Species whose endemic status was in doubt were treated as non-endemic. Genera with apparent speciation events are listed in Appendix Table A2 and sources for these species data are summarised in Appendix Table A3 (both available in Dryad).


Table 2.1. Area and speciation statistics, by taxonomic group. All island sizes above 1 are shown rounded to the nearest km. Numbers in parentheses are from dataset excluding archipelagos.

Group              Minimum area for     Number of islands in     Number of islands in
                   speciation (km²)*    dataset with endemic     dataset with speciation
                                        species                  events
Snails             0.8                  30 (17)                  24 (12)
Angiosperms        15                   32 (24)                  21 (14)
Ferns              15                   17 (11)                  9 (5)
Birds              64 or 705            50 (33)                  9 (1)
Lizards            108                  27 (14)                  10 (3)
Macrolepidoptera   141,200              6 (3)                    1 (0)
Bats               416,400              14 (5)                   2 (1)
Carnivora          587,713              2 (2)                    1 (1)

*Estimated as the area of the smallest island with a speciation event: snails - Nihoa; angiosperms and ferns - Lord Howe; birds - Norfolk (smaller value) or Tristan da Cunha; lizards - Rodrigues; Macrolepidoptera - Fiji; bats - New Zealand; Carnivora - Madagascar.

Identification of speciation events

Following the method of Coyne and Price (2000), my measure of the probability of speciation within a given island for a given taxon is the proportion of endemic lineages derived from single immigration events that have diversified within an island into two or more descendent species. This approach controls for differences among islands in the number of colonisers because it divides the number of speciated lineages by the total number of lineages that colonised the island and could have speciated. I used a binary measure because my interest is in what controls the ability of lineages to speciate at all, rather than what controls the size of radiations.

I considered the number of genera with at least one endemic species to represent the number of endemic lineages, and the number of genera with two or more endemic species to represent the number of lineages which have diversified in situ (Coyne and Price 2000; Stuessy et al. 2006). Therefore, my measure of the probability of speciation on a particular island was the number of genera with two or more endemic species divided by the number of genera with one or more endemic species. I used only genera with endemic species (and not all native genera) in order to exclude lineages that have not been isolated enough from mainland populations or have not been on an island long enough to speciate within an island. Because I used only genera with endemic species, I excluded the following islands and island groups from my dataset, for which I found no record of endemic species in my study groups: the Bounty Islands, Caroline Island, Cartier Island, Cocos (Keeling), Diego Garcia, the Gilbert Islands, the Hall Islands, Heard and McDonald, Niuatoputapu and Tafahi, the Prince Edward Islands, Tokelau and Uvea.
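The genus-based proportion described above can be sketched in a few lines. This is an illustrative sketch only, assuming the input is a list of (genus, species) pairs for one taxon on one island, already restricted to endemic species; the function name and the placeholder genus names are hypothetical.

```python
from collections import Counter


def speciation_probability(endemic_species):
    """Proportion of endemic genera with two or more endemic species.

    endemic_species: iterable of (genus, species) pairs for one island
    and one taxon, already restricted to endemic species (assumption).
    """
    # Count endemic species per genus, ignoring duplicate records
    per_genus = Counter(genus for genus, _ in set(endemic_species))
    if not per_genus:
        return None  # no endemic lineages: island would be excluded
    speciated = sum(1 for n in per_genus.values() if n >= 2)
    return speciated / len(per_genus)


# Hypothetical island: three endemic genera, one with two endemic species
records = [("Aus", "a"), ("Aus", "b"), ("Bus", "c"), ("Cus", "d")]
print(speciation_probability(records))
```

Returning `None` for an island without endemic species mirrors the exclusion of such islands from the dataset, rather than scoring them as zero.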

The measure used ignores any cases of in situ speciation that have occurred through anacladogenesis (in which a daughter species diverges from an ancestral colonising species that remains unchanged; Figure 2.1; Coyne and Price 2000; Stuessy et al. 1990), because the detailed morphological and genetic data required to identify such cases were not available for most genera. The rate of anacladogenetic speciation should vary with island characteristics in the same manner as the rate of cladogenetic speciation (which I am measuring) as they are the result of essentially the same process of divergence. Thus I expect that the inclusion of such speciation events would not qualitatively change my conclusions.

The minimum area for speciation of each taxon was estimated as the area of the smallest island or archipelago within which speciation has occurred. Statistics on speciation and endemism in each group are given in Table 2.1 and Appendix Table A3 (available in Dryad).
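For concreteness, the minimum-area estimate amounts to taking the smallest area among the islands on which at least one speciated lineage was recorded. The sketch below assumes a simple list of (island name, area in km², number of speciated lineages) records for one taxon; the function name, island names and values are hypothetical.

```python
def minimum_area_for_speciation(records):
    """Smallest area (km^2) among islands with at least one speciated lineage.

    records: iterable of (island_name, area_km2, n_speciated_lineages).
    Returns (island_name, area_km2), or None if no island shows speciation.
    """
    with_speciation = [(area, name) for name, area, n in records if n > 0]
    if not with_speciation:
        return None
    area, name = min(with_speciation)
    return name, area


# Hypothetical records for one taxon
islands = [("Island A", 12.0, 0), ("Island B", 55.0, 2), ("Island C", 900.0, 5)]
print(minimum_area_for_speciation(islands))
```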

Adding phylogenetic information

My method assumes that the chance of multiple endemic species within the same genus originating by multiple colonisation events, rather than in situ diversification (Figure 2.1), is small. To validate this assumption, I searched the literature for phylogenetic information on the study genera. For each genus associated with a putative speciation event, searches were performed on TreeBase, NCBI GenBank CoreNucleotide and ISI Web of Science to find published molecular phylogenies. Any phylogenies that included more than one of the endemic species of a study genus with a putative speciation event on a particular island or archipelago were used. Cases of congeneric endemic species shown not to be each other’s closest relatives were reclassified as multiple, non-speciated lineages (see Losos and Parent 2009). In a few instances, published phylogenies indicated that multiple genera containing endemic species were all part of one larger endemic clade, and in this case the genera involved were considered as a single speciated lineage in the analyses. Also, in some cases published phylogenies indicated that the endemic species in one genus were the result of multiple colonisations followed by multiple radiations, in which case the genus was treated as multiple speciated lineages. Results of the phylogeny search are given in Appendix Tables A2 and A4 (both available in Dryad).

Statistical analysis of the speciation-area relationship

Speciation was treated as a binary response variable: each endemic lineage on each island has a value of 0 for speciation if it contains only 1 endemic species and a value of 1 if it contains 2 or more endemic species. Overall regression models between this response variable and island area, considering all taxa together, were carried out using the lme4 package in R v.2.5 (Bates 2007; R Development Core Team 2007) for generalised linear mixed effects models, using taxon as a random effect, a binomial error structure and Laplace approximation for ML (maximum likelihood) estimates. Individual regression models for each taxon were carried out in R using generalised linear models with a binomial error structure. R² values for all models were calculated with the formula (SST-SSE)/SST, where SSE is the deviance of the model and SST is the deviance of a null model (for mixed effects models, consisting only of a different intercept for each taxon, which represents the mean probability of speciation over all islands).
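The deviance-based R² above can be illustrated for the single-taxon case, where the null model is an intercept-only model that predicts the mean probability of speciation. This is a minimal sketch that takes fitted probabilities as given rather than fitting the regression itself; the function names and example values are hypothetical, and the mixed-model null (one intercept per taxon) is not reproduced here.

```python
import math


def binomial_deviance(y, p):
    """Deviance of a binary (0/1) response given fitted probabilities."""
    eps = 1e-12  # guard against log(0); an implementation assumption
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1 - eps)
        total += yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return -2.0 * total


def deviance_r2(y, fitted):
    """(SST - SSE) / SST, with deviances in place of sums of squares."""
    null_p = sum(y) / len(y)  # intercept-only model: the mean probability
    sst = binomial_deviance(y, [null_p] * len(y))
    sse = binomial_deviance(y, fitted)
    return (sst - sse) / sst


# Hypothetical lineages: 0 = not speciated, 1 = speciated
y = [0, 0, 1, 1]
print(deviance_r2(y, [0.1, 0.2, 0.8, 0.9]))
```

A model whose fitted probabilities are no better than the overall mean gives a value of 0, and values approach 1 as the fitted probabilities approach the observed outcomes.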

I also constructed multiple regression models to investigate the importance of the other island characteristics that might affect speciation probability. For both the individual taxon models and overall models considering all taxa together, I began with a maximal additive model including area, elevation as a proxy for habitat diversity in the absence of a direct measure (Ackerman et al. 2007), isolation from other landmasses and whether the unit of study was a single island or an archipelago. I then calculated the Akaike Information Criterion corrected for small sample sizes (AICc; Burnham and Anderson 2002) for each sub-model of this full model. For the overall models, I ran each sub-model once with each possible random effect term - one indicating only different intercepts for each taxon (written as (1|group)) and one for each island environmental variable, indicating different slopes for each taxon with this variable (for example, (Area|group), indicating different speciation-area slopes for each taxon). The best models for each dataset are listed in Appendix Table A5 (available in Dryad).

The significance and importance of each predictor variable in the multiple regression models were evaluated using model averaging as described in Burnham and Anderson (2002). First, for each dataset, the full set of additive models was generated. Then the relative importance of each variable, on a scale from 0 to 1, was calculated as the sum of the Akaike weights of the models in which the variable appears - better models have larger Akaike weights, and a variable which contributes more to model fit and so is included in more of the best models will have a higher relative importance value. Parameter estimates and unconditional standard errors for each term were calculated by averaging over all models in which the variable appeared, weighting values from individual models by the models’ Akaike weights. A term was considered to be significant for a particular dataset if the 95% confidence interval for its parameter estimate did not include 0.
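The quantities used in this model-averaging step can be sketched with the standard formulas from Burnham and Anderson (2002): AICc = -2 ln L + 2k + 2k(k + 1)/(n - k - 1), and Akaike weights proportional to exp(-Δ/2), where Δ is the difference from the best (lowest) AICc. The function names, the representation of each model as a set of term names and the example scores are hypothetical and chosen only for illustration.

```python
import math


def aicc(log_lik, k, n):
    """AIC corrected for small samples (k parameters, n observations)."""
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


def akaike_weights(scores):
    """Akaike weights from a list of AICc scores (lower score = better model)."""
    best = min(scores)
    rel = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]


def relative_importance(models, weights, term):
    """Sum of the weights of all models that contain the given term.

    models: list of sets of term names, one set per candidate model.
    """
    return sum(w for terms, w in zip(models, weights) if term in terms)


# Hypothetical candidate models and AICc scores
models = [{"area"}, {"area", "elevation"}, {"elevation"}]
weights = akaike_weights([10.0, 12.0, 20.0])
print(relative_importance(models, weights, "area"))
```

In this made-up example, area appears in the two best-supported models, so its relative importance is close to 1.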

Gene flow data

Fst data came primarily from the appendices in Morjan and Rieseberg (2004) and were supplemented by a search following the methods of Morjan and Rieseberg (2004) for (TS=((fern or pteridophyt* or snail or Lepidoptera* or Chiroptera or Carnivora or lizard) and ("gene flow" or Fst or Nm or Nem))) on ISI Web of Science. After this search, carnivore data were still lacking, and so data were added from another search, for (TS=(carnivor* and "population structure")). In the additional searches, only papers that presented an overall Fst value (as opposed to pair-wise Fst values only) for variation between populations (rather than regions) were used. Two estimators of Fst - Gst (Nei 1973) and Φst (Excoffier et al. 1992; Excoffier 2001) - were used where these were provided in the original sources instead of Fst.

All studies dealing with aquatic or marine species were excluded, as well as studies of recent habitat fragmentation as a result of human activities, clonality or hybridisation between species or host races. I also excluded studies including historically isolated lineages, such as those separated by a major geographical barrier (for instance, Myotis myotis on either side of the Strait of Gibraltar; Castella et al. 2000), as they are often evidence of cryptic speciation, while I wanted estimates of gene flow within species. Only studies of wild populations of native organisms were used - recent introductions and crop pests were excluded. All Fst values derived from organelle markers (mitochondrial or chloroplast DNA) were excluded, as these reflect only female dispersal and not overall population patterns of genetic differentiation. One plant study in Morjan and Rieseberg’s database (Proteum glabrum; Morjan and Rieseberg 2004) had a negative Fst value, which was interpreted as Fst = 0 (Long 1986). Gene flow data used are summarised in Appendix Table A6 (available in Dryad).

I checked for comparability of gene flow estimates from different studies. Studies using AFLPs, RAPDs and ISSRs were found to have significantly higher Fst values than studies using other marker types, even when correcting for taxon and the geographic scale of study (ANOVA, F = 7.3, p < 2.2 × 10⁻¹⁶) and so were excluded from analysis. Isozymes were also excluded because they were used in only one study in my dataset and the comparability of Fst values derived from them to Fst values derived from other markers is unclear. The Fst values from studies using allozymes and those using types of repeats (microsatellites or SSRs, minisatellites, tandem repeats) did not differ significantly (ANOVA, F = 1.7, p = 0.192) and so were lumped for the final analysis. Studies using these two marker types also did not differ significantly in number of populations used (ANOVA, F = 1.5, p = 0.229) or number of loci used (ANOVA, F = 0.60, p = 0.440). There were small differences between the two marker types in the mean geographic scale of study, but these were inconsistent between taxa in sign and so cannot explain the pattern of taxon variation in Fst. Fst showed no overall relationship with the number of loci (linear regression, p = 0.867) or the number of populations (linear regression, p = 0.885) used in a study. Allele frequencies for individual markers were not consistently available and thus their relationship with Fst could not be tested, but they are not expected to be a major confounding factor.

As a measure of the geographic scale of each population genetic study, geographic range extent was measured as the greatest distance in km between any two populations in the study, following the example of Bohonak (1999). I used the maximum distance between study populations rather than the mean or median because I felt this was a simple and practical measure sufficient to resolve the wide range of geographic scales represented by the studies I used (whose maximum distances vary from 0.01 km to >14,000 km). In addition, use of alternative measures such as mean distance would be difficult because of the lack of detailed information in many of the sources. This measure does not explicitly take into account variation in species’ geographic range size or in the spatial arrangement of sampled populations, but neither was easily quantified from the sources available to me and neither is expected to have a strong independent effect on Fst (Beaumont and Nichols 1996). Distances were taken directly from papers if possible. Otherwise, they were calculated from population coordinates using the Vincenty formula (Veness 2008), measured from scaled maps given in the studies or measured using Google Earth™ v. 4.3 (Google) if only place names were provided. Data for 23 taxa were excluded because the original reference could not be found or did not contain sufficient information to calculate the geographic range extent.
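The range-extent measure can be sketched as the maximum pairwise distance over a study's population coordinates. Note that this sketch substitutes the simpler haversine (spherical) formula for the Vincenty (ellipsoidal) formula used in the study, so distances will differ slightly; the function names, the Earth radius constant and the example coordinates are assumptions for illustration.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; assumption of the spherical model


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees.

    Haversine is used here as a simpler stand-in for the Vincenty formula.
    """
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def range_extent_km(populations):
    """Greatest distance between any two sampled populations."""
    return max((haversine_km(a, b) for a, b in combinations(populations, 2)),
               default=0.0)


# Hypothetical sampling sites given as (latitude, longitude)
sites = [(0.0, 0.0), (0.0, 1.0), (0.0, 3.0)]
print(range_extent_km(sites))
```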

Gene flow analyses

To test the effect of gene flow on the spatial scale of speciation, I correlated the natural logarithm (ln) of the minimum island size for speciation in each study group with two summary measures of the spatial scale of gene flow. One potential problem with comparing Fst values is that Fst data tend to be measured at different spatial scales in different taxa (Figure 2.2). Therefore, as summary measures I calculated the mean Fst for each group at two scales in turn: between 10 and 100 km, and between 100 and 1000 km. I chose these geographic scales for analysis because they correspond to the range of island sizes used in the speciation analysis (47 of 64 islands/island groups, 73%, have maximum linear extents between 10 and 1000 km) and because they are the only scales at which Fst data were available for all study taxa (except snails between 100 and 1000 km). To control for the effect of outliers in the Fst data I also tested the correlation of median Fst values with minimum island size for speciation.
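The summary step described above (mean and median Fst per taxon within each distance band) can be sketched as follows. The input format, the function name and the treatment of the band boundaries as half-open intervals are assumptions for illustration; the example values are hypothetical and are not data from the thesis.

```python
from statistics import mean, median

# Distance bands in km. Half-open intervals [low, high) are an assumption;
# the text only says "between 10 and 100 km" and "between 100 and 1000 km".
SCALES_KM = {"10-100 km": (10, 100), "100-1000 km": (100, 1000)}


def summarise_fst(studies):
    """Group (taxon, extent_km, fst) records by taxon and distance band.

    Returns {(taxon, band): (mean_fst, median_fst)} for bands with data.
    """
    grouped = {}
    for taxon, extent_km, fst in studies:
        for label, (low, high) in SCALES_KM.items():
            if low <= extent_km < high:
                grouped.setdefault((taxon, label), []).append(fst)
    return {key: (mean(values), median(values)) for key, values in grouped.items()}


# Hypothetical records: (taxon, maximum distance between populations, overall Fst).
example = [
    ("snails", 12.0, 0.40),
    ("snails", 55.0, 0.20),
    ("snails", 80.0, 0.30),
    ("birds", 40.0, 0.05),
    ("birds", 450.0, 0.10),
    ("birds", 900.0, 0.20),
]
summary = summarise_fst(example)
```

A taxon with no studies in a band simply has no entry for that band, which mirrors the exclusion of snails at 100-1000 km.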


For this analysis, the minimum island size for speciation for each study group was represented as the greatest distance between any two points of land within the island or archipelago. These extents were measured in Google Earth™ v. 4.3 (Google) and the Analyzing Digital Images software package (Pickle and Kirtley 2008). Linear extents were used instead of area in order to be directly comparable to the distances between populations used to represent the spatial scale of gene flow. Because of the uncertainty over bird speciation on Norfolk, this analysis was carried out twice, once with Norfolk and once with Tristan da Cunha representing the smallest island with in situ bird speciation.

Figure 2.2 (next page). Relationship between Fst and the geographic extent of study. Fst estimates are taken from a review of population genetic literature. Each point represents an estimate from a single study, and only values used in analyses are shown. The geographic extent of study is the maximum distance between populations in one study. Fst values given are overall values considering all populations in the study. Dashed lines delimit the two geographic scales that were used for the gene flow analysis. Solid black lines indicate the average Fst for each scale and solid grey lines indicate the median. Where only a grey line is visible, the mean and median values are equal. No data were available for snails at the 100 - 1000 km range.


Figure 2.2.


Results

Data availability and quality

I estimated speciation probabilities across 64 islands in total, including 38 single islands and 26 archipelagos, taking into account 471 putatively speciated genera. Phylogenies were available for 15% of these genera; an additional 15% of the genera were endemic to their island and so most parsimoniously explained by in situ speciation. These data led me to exclude only seven of the putatively speciated genera as being the result of multiple colonisation (~5% of cases where phylogenetic or genus endemism information is available), confirming that the non-phylogenetic measure is a good measure of the number of genera that have speciated in situ and is little affected by multiple colonisation events. Twelve genera were found to be part of a larger adaptive radiation already represented by another included genus, and so were removed from analysis. Three genera were found to be the result of two separate radiations, and one genus the result of three separate radiations, and were split into multiple speciated lineages accordingly. After taking all this into account, my final dataset included 457 speciated lineages. Phylogenetic data by genus are presented in Appendix Table A2 and summarised in Appendix Table A4 (both available in Dryad).
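The bookkeeping in this paragraph can be checked with a short worked calculation. The variable names are illustrative, and the share of excluded genera is only approximate because the two 15% figures in the text are rounded.

```python
putative_genera = 471
excluded_multiple_colonisation = 7   # explained by multiple colonisation
merged_into_other_radiations = 12    # already represented by another genus
genera_split_in_two = 3              # each contributes one extra lineage
genera_split_in_three = 1            # contributes two extra lineages

extra_lineages = genera_split_in_two * (2 - 1) + genera_split_in_three * (3 - 1)
final_lineages = (putative_genera
                  - excluded_multiple_colonisation
                  - merged_into_other_radiations
                  + extra_lineages)
print(final_lineages)  # 457

# Approximate share of exclusions among genera with phylogenetic or
# endemism information (about 15% + 15% of the 471 genera).
share_excluded = excluded_multiple_colonisation / (0.30 * putative_genera)
```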

Quantifying the speciation-area relationship

Across taxa, on oceanic islands and archipelagos ranging in size from <1 km² (Nihoa) to >500,000 km² (Madagascar), there is a clear positive relationship between the probability of in situ speciation and island area (p = 1.35 × 10⁻⁵, r² = 0.312; when archipelagos are excluded, p = 1.54 × 10⁻⁴, r² = 0.414; Figure 2.3). The relationship between the probability of speciation and island area is significant in all taxa with sufficient data except ferns (Table 2.2). Bats and macrolepidoptera had insufficient data to construct models for their speciation-area relationship, as they show evidence for speciation only on 2 and 1 islands, respectively, but the data for both nevertheless support the same positive relationship seen in the other taxa - both are present on many small islands on which they have not speciated, while they have speciated only on the largest islands on which they are represented. In contrast, carnivores are rarely present at all on oceanic islands due to their poor dispersal over water, and have endemic species on only two of the studied islands. Nonetheless, they also show evidence for speciation only on the largest island on which they are represented (Madagascar), and I predict that they would also show a positive speciation-area relationship if larger landmasses were considered. The lack of a speciation-area relationship in ferns, on the other hand, is not the result of a lack of data. Ferns have clearly speciated on small and large islands with similar probability, indicating that area is relatively unimportant in controlling their speciation.

Figure 2.3 (next page). Relationship between the probability of speciation and area. Each point marks the percentage of lineages of a particular taxon, on a particular island, that have speciated. Solid lines mark the regression model for each taxon alone; dashed lines represent the overall model averaged over all taxa. No individual model line is given for bats, carnivores or Macrolepidoptera, as these taxa had too few islands with speciation events to model their speciation-area relationship. Lines are based on analyses including archipelagos.


Figure 2.3.


Table 2.2. Area-only models for the probability of speciation.

| Taxon | N | S | Model | p-value | r² for area |
| --- | --- | --- | --- | --- | --- |
| Overall model | 64 | 38 | P(speciation) ~ -4.00 + 0.279 log(Area) + (Area\|group) | 1.35 × 10⁻⁵ | 0.312 |
| Overall model - excluding archipelagos | 40 | 20 | ~ -5.89 + 0.420 log(Area) + (Area\|group) | 1.54 × 10⁻⁴ | 0.414 |
| Angiosperms | 32 | 21 | ~ -2.27 + 0.173 log(Area) | 1.75 × 10⁻⁶ | 0.277 |
| Angiosperms - excluding archipelagos | 24 | 14 | ~ -3.14 + 0.350 log(Area) | 6.91 × 10⁻⁹ | 0.534 |
| Birds | 50 | 9 | ~ -5.07 + 0.310 log(Area) | 1.66 × 10⁻⁵ | 0.580 |
| Ferns | 17 | 9 | ~ -0.641 - 0.0165 log(Area) | 0.882 | 0.00158 |
| Ferns - excluding archipelagos | 11 | 5 | ~ -0.411 - 0.0844 log(Area) | 0.704 | 0.0149 |
| Lizards | 27 | 10 | ~ -4.35 + 0.391 log(Area) | 1.75 × 10⁻⁵ | 0.527 |
| Lizards - excluding archipelagos | 14 | 3 | ~ -5.39 + 0.464 log(Area) | 8.6 × 10⁻⁴ | 0.732 |
| Snails | 30 | 24 | ~ -0.700 + 0.0988 log(Area) | 0.00234 | 0.243 |
| Snails - excluding archipelagos | 17 | 12 | ~ -0.810 + 0.0951 log(Area) | 0.0219 | 0.216 |

NOTE.- N = number of islands/island groups used to construct model. S = number of islands with speciation events. Parameter values are those given by the logistic models and produce predicted values that must be logit transformed to give an estimated probability of speciation. In the overall models, the “(Area|group)” term is the random effect accounting for variation between study taxa in the slope of the speciation-area relationship. Speciation probability was not modelled for bats, birds on single islands, Carnivora or Macrolepidoptera because these groups had fewer than three islands with speciation events.
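As an illustration of how the Table 2.2 coefficients translate into probabilities, the sketch below back-transforms the linear predictor of a logistic model with the inverse-logit (logistic) function. Two points are assumptions rather than statements from the thesis: that log(Area) is a natural logarithm of area in km², and the function name itself. The example uses the angiosperm coefficients for the analysis including archipelagos.

```python
import math


def speciation_probability(intercept, slope, area_km2):
    """Inverse-logit back-transformation of intercept + slope * log(area)."""
    # Natural log is an assumption; the base of log(Area) is not stated in the table.
    linear_predictor = intercept + slope * math.log(area_km2)
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Angiosperms, including archipelagos (Table 2.2): ~ -2.27 + 0.173 log(Area)
for area in (10, 1_000, 100_000):
    print(area, round(speciation_probability(-2.27, 0.173, area), 3))
```

With a positive slope the predicted probability rises with area; with the slightly negative fern slope it would stay nearly flat.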


Measuring minimum areas for speciation

The minimum area for speciation (estimated as the area of the smallest island or archipelago within which speciation has occurred) varies widely among taxa. Land snails have speciated within even the smallest island on which they have native species (Nihoa - 0.8 km²), whereas the only example of in situ speciation in Carnivora is on Madagascar (587,713.3 km²), and bats show no evidence of in situ speciation on any islands except New Zealand (approximately 416,400 km²) and Madagascar. Macrolepidoptera also appear to require large areas for speciation - the only island unit in which they show evidence of speciation is Fiji (141,200 km²). Angiosperms and lizards are intermediate, with minimum areas for speciation of 14.6 km² and 107.8 km² (Table 2.1; more detailed summary of speciation events in Appendix Table A2, available in Dryad). The situation in birds is unclear; even after genetic analysis, it is uncertain whether a putative speciation event on Norfolk Island (64 km²) is not actually the result of multiple colonisation (Coyne and Price 2000). The next smallest island unit within which bird speciation has potentially taken place is the Tristan da Cunha archipelago (705 km²), in which some evidence even supports a history of sympatric speciation within the smaller islands of the archipelago (Ryan et al. 2007; Grant and Grant 2009); the smallest single island with firm evidence for in situ bird speciation is Jamaica (11,400 km²; Coyne and Price 2000). Irrespective of the uncertainty for birds, it is evident that taxonomic variation in the spatial scale of speciation is great, spanning 6 orders of magnitude between snails and carnivorous mammals.
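The "6 orders of magnitude" figure follows from the two extreme minimum areas quoted above, as this short worked calculation shows (variable names are illustrative).

```python
import math

snail_min_area_km2 = 0.8            # Nihoa, land snails
carnivore_min_area_km2 = 587_713.3  # Madagascar, Carnivora

# Orders of magnitude = base-10 logarithm of the ratio of the two areas.
orders_of_magnitude = math.log10(carnivore_min_area_km2 / snail_min_area_km2)
print(round(orders_of_magnitude, 2))
```

The ratio is a little under 10⁶, so the span rounds to six orders of magnitude.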


Testing the importance of area when other environmental variables are included

The relationship with area is not an artefact of area’s correlation with another environmental variable. In my dataset, island area is correlated with elevation (adjusted r² = 0.478; adj. r² = 0.365 if archipelagos are excluded), island age (adj. r² = 0.111; correlation is not significant if archipelagos are excluded) and whether an island is an archipelago or not (proportion of variance explained in ANOVA = 0.278), but in multiple regression models, both overall and for individual taxa, model-averaged parameter estimates indicate that island area is highly important and significant, independently of other island characteristics (best models are listed in Appendix Table A5, available in Dryad; model-averaged parameter estimates are given in Tables 2.3 and 2.4).

In the overall models, area has high relative importance values (0.93 when archipelagos are included; 0.97 when archipelagos are excluded; relative importance values are on a scale of 0 to 1), meaning that it is included in a high percentage of the best models. In addition, its parameter estimates are significantly greater than zero, supporting a positive speciation-area relationship. Isolation and elevation also have high relative importance values and are also significant in the overall models, especially in the dataset including archipelagos (Table 2.3). Age and whether an island unit is an archipelago or not have low relative importance values and are not significant in the overall models.
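To make the idea of a relative importance value concrete, the sketch below uses one common definition: the sum of Akaike weights over all candidate models that contain a given variable. This is an assumption for illustration only; the exact procedure used for Tables 2.3 and 2.4 is set out in the methods and is not restated in this passage. The candidate models and AIC values are hypothetical.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC values to Akaike weights that sum to 1."""
    best = min(aic_values)
    relative_likelihoods = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative_likelihoods)
    return [value / total for value in relative_likelihoods]


def relative_importance(models):
    """models: list of (set_of_variable_names, aic).

    Returns {variable: summed Akaike weight of the models containing it}.
    """
    weights = akaike_weights([aic for _, aic in models])
    importance = {}
    for (variables, _), weight in zip(models, weights):
        for variable in variables:
            importance[variable] = importance.get(variable, 0.0) + weight
    return importance


# Hypothetical candidate set, not the models fitted in the thesis.
candidates = [
    ({"area"}, 100.0),
    ({"area", "elevation"}, 100.0),
    ({"elevation"}, 110.0),
    (set(), 120.0),
]
importance = relative_importance(candidates)
```

Under this definition a value near 1 means that almost all of the model weight sits on models that include the variable, which matches the 0 to 1 scale described in the text.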

Area is also highly important and significant in most of the single taxon models. It is the most important variable, and its parameter estimate is significantly greater than zero (indicating a positive speciation-area relationship), in all single taxa except lizards in the dataset excluding archipelagos, and ferns in both the dataset including and the dataset excluding archipelagos. Elevation and isolation are also significant in most single taxon models, although in all cases, except lizards when excluding archipelagos, they are much less important than area. The parameter estimate for isolation is positive in all taxa except birds; more distant islands tend to have a higher probability of speciation. Whether an island is an archipelago or not is important and significant in angiosperms and birds, while island age is relatively important and significant only for snails, when archipelagos are excluded. Over all models, island area is the most consistently important and significant island variable.


Table 2.3. Model-averaged parameter estimates and relative importance values for analyses including archipelagos.

| Model | Statistic | Area | Age | Elevation | Isolation | ArchYN |
| --- | --- | --- | --- | --- | --- | --- |
| OVERALL | relative importance value | 0.93 | 0.26 | 0.97 | 0.97 | 0.3 |
| | estimate ± standard error | 0.20 ± 0.0044 | 0.0017 ± 0.0035 | 0.022 ± 6.21 × 10⁻⁵ | 0.010 ± 1.29 × 10⁻⁵ | -0.025 ± 0.026 |
| | (confidence interval) | (0.19, 0.21) | (-0.0051, 0.0086) | (0.022, 0.022) | (0.010, 0.010) | (-0.077, 0.026) |
| Angiosperms | relative importance value | 0.99 | 0.20 | 0.76 | 0.87 | 0.94 |
| | estimate ± standard error | 0.22 ± 0.0042 | 0.020 ± 0.016 | 0.018 ± 0.00014 | 0.011 ± 2.86 × 10⁻⁵ | -0.66 ± 0.061 |
| | (confidence interval) | (0.21, 0.23) | (-0.011, 0.051) | (0.018, 0.018) | (0.011, 0.011) | (-0.78, -0.54) |
| Birds | relative importance value | 0.98 | 0.27 | 0.27 | 0.26 | 0.79 |
| | estimate ± standard error | 0.32 ± 0.0086 | 0.040 ± 0.082 | 0.0047 ± 0.00081 | -0.00099 ± 0.00026 | 0.79 ± 0.26 |
| | (confidence interval) | (0.30, 0.34) | (-0.12, 0.20) | (0.0031, 0.0063) | (-0.0015, -0.00048) | (0.29, 1.29) |
| Ferns | relative importance value | 0.31 | 0.31 | 0.34 | 0.33 | 0.30 |
| | estimate ± standard error | -0.010 ± 0.018 | 0.095 ± 0.16 | 0.0074 ± 0.0020 | 0.0019 ± 0.00017 | 0.018 ± 0.60 |
| | (confidence interval) | (-0.045, 0.025) | (-0.23, 0.42) | (0.0036, 0.011) | (0.0016, 0.0023) | (-1.1, 1.2) |
| Lizards | relative importance value | 0.86 | 0.11 | 0.34 | 0.20 | 0.24 |
| | estimate ± standard error | 0.32 ± 0.017 | 0.015 ± 0.079 | 0.017 ± 0.0027 | 0.00051 ± 0.00051 | 0.10 ± 0.44 |
| | (confidence interval) | (0.29, 0.36) | (-0.14, 0.17) | (0.012, 0.022) | (-0.00049, 0.0015) | (-0.76, 0.97) |
| Snails | relative importance value | 0.78 | 0.28 | 0.34 | 0.21 | 0.39 |
| | estimate ± standard error | 0.074 ± 0.0020 | 0.024 ± 0.018 | 0.0039 ± 0.00024 | 1.76 × 10⁻⁵ ± 6.39 × 10⁻⁵ | 0.13 ± 0.11 |
| | (confidence interval) | (0.070, 0.077) | (-0.011, 0.059) | (0.0034, 0.0044) | (-0.00011, 0.00014) | (-0.077, 0.34) |

NOTE.- The highest parameter relative importance value for each study group is highlighted in bold. Parameter estimates for significant variables are also highlighted in bold. Speciation probability was not modelled for bats, birds on single islands, Carnivora or Macrolepidoptera because these groups had fewer than three islands with speciation events. Age was only available for a subset of islands, and so parameter estimates for age come from regression models using this reduced subset. Parameter values for all other terms come from regression models using the full set of islands, unless age was found to be significant, in which case all parameter values were estimated using the reduced subset of islands with age data.


Table 2.4. Model-averaged parameter estimates and relative importance values for analyses excluding archipelagos.

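| Model | Statistic | Area | Age | Elevation | Isolation |
| --- | --- | --- | --- | --- | --- |
| OVERALL | relative importance value | 0.97 | 0.30 | 0.47 | 0.64 |
| | estimate ± se | 0.38 ± 0.016 | -0.012 ± 0.023 | 0.011 ± 0.00045 | 0.0052 ± 2.83 × 10⁻⁵ |
| | (confidence interval) | (0.35, 0.42) | (-0.0587, 0.033) | (0.010, 0.012) | (0.0051, 0.0052) |
| Angiosperms | relative importance value | 0.9998 | 0.12 | 0.22 | 0.33 |
| | estimate ± se | 0.34 ± 0.0043 | -0.010 ± 0.029 | 0.0016 ± 0.00015 | 0.0021 ± 4.28 × 10⁻⁵ |
| | (confidence interval) | (0.33, 0.35) | (-0.067, 0.046) | (0.0013, 0.0019) | (0.0020, 0.0022) |
| Birds | | insufficient data | - | - | - |
| Ferns | relative importance value | 0.37 | 0.23 | 0.35 | 0.35 |
| | estimate ± se | -0.032 ± 0.055 | 0.010 ± 0.16 | -0.0020 ± 0.0025 | 1.1 × 10⁻⁴ ± 0.00018 |
| | (confidence interval) | (-0.14, 0.075) | (-0.31, 0.33) | (-0.0069, 0.0030) | (-0.00024, 0.00047) |
| Lizards | relative importance value | 0.16 | 0.75 | 0.91 | 0.88 |
| | estimate ± se | -3.47 ± 1.6 × 10⁸ | 2.48 ± 4.35 × 10⁹ | 65.4 ± 1.6 × 10⁸ | 54.7 ± 1.1 × 10⁸ |
| | (confidence interval) | (-3.2 × 10⁸, 3.2 × 10⁸) | (-8.5 × 10⁹, 8.53 × 10⁹) | (-3.1 × 10⁸, 3.1 × 10⁸) | (-2.2 × 10⁸, 2.2 × 10⁸) |
| Snails | relative importance value | 0.40 | 0.42 | 0.16 | 0.14 |
| | estimate ± se | 0.033 ± 0.0051 | 0.075 ± 0.024 | -0.00048 ± 0.00047 | -9.72 × 10⁻⁵ ± 0.00011 |
| | (confidence interval) | (0.023, 0.043) | (0.028, 0.12) | (-0.0014, 0.00044) | (-0.00032, 0.00012) |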

NOTE.- See notes for Table 2.3.


The effect of gene flow

The minimum island size for speciation in each group correlates with the mean level of gene flow for each taxon when gene flow is measured at the scale of 10-100 km (slope = -27.21, p = 0.00286, adj. r² = 0.763; Figure 2.4). Taxa that are able to speciate within smaller areas (indicated by a smaller minimum island size for speciation) are those with reduced gene flow (indicated by higher mean Fst values). At the scale of 100-1000 km, at which snails are excluded due to lack of data, the same relationship is found, although marginally non-significant (slope = -16.51, p = 0.0695, adj. r² = 0.417). Widely overlapping 95% confidence intervals (at 10-100 km: -38.22 to -16.21; at 100-1000 km: -30.56 to -2.46) indicate that there is no significant difference in the slope of the relationship between the two spatial scales. I found similar results using median Fst instead of mean Fst for each group (at 10-100 km scale, slope = -24.91, p = 0.0221, adj. r² = 0.545; at 100-1000 km scale, slope = -20.40, p = 0.221, adj. r² = 0.138). Using Norfolk Island instead of Tristan da Cunha for the minimum area of speciation in birds also did not change the results (at 10-100 km: slope = -28.45, p = 0.00144, adj. r² = 0.810; at 100-1000 km: slope = -14.69, p = 0.146, adj. r² = 0.246). When archipelagos are excluded, the same gene flow-minimum area relationship is again found at both spatial scales, although in this case it is significant at 100-1000 km (slope = -21.82, p = 0.0191, adj. r² = 0.729) but not at 10-100 km (slope = -24.02, p = 0.0917, adj. r² = 0.358). Again, widely overlapping confidence intervals for the slope of the relationship (at 10-100 km: -46.61 to -1.42; at 100-1000 km: -33.07 to -10.56) indicate that there is no significant difference in the gene flow-minimum area relationship between the two spatial scales.
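The comparison of slopes above rests on whether the reported confidence intervals overlap. A minimal sketch of that check, using the intervals quoted in the text, is given below; the helper names are illustrative. Interval overlap is only a rough screen for a difference between slopes, not a formal test.

```python
def intervals_overlap(a, b):
    """True if closed intervals a = (low, high) and b = (low, high) share any point."""
    return max(a[0], b[0]) <= min(a[1], b[1])


def overlap_width(a, b):
    """Length of the shared part of two intervals (0.0 if they are disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


# Slope confidence intervals reported in the text.
including_archipelagos = {"10-100 km": (-38.22, -16.21), "100-1000 km": (-30.56, -2.46)}
excluding_archipelagos = {"10-100 km": (-46.61, -1.42), "100-1000 km": (-33.07, -10.56)}

for label, cis in (("including", including_archipelagos),
                   ("excluding", excluding_archipelagos)):
    a, b = cis["10-100 km"], cis["100-1000 km"]
    print(label, intervals_overlap(a, b), round(overlap_width(a, b), 2))
```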


Figure 2.4. Minimum island size for speciation versus the average level of gene flow when measured over geographic ranges of 10-100 km.

Discussion

Main findings

These results show that the speciation-area relationship, in which speciation is more likely and more frequent within larger areas, is a general pattern common to many groups of both plants and animals. Ferns are the only group that shows no such relationship, perhaps because of their higher propensity for polyploid and hybrid speciation, the implications of which are discussed further below. The speciation-area relationship found is not just a by-product of area’s correlation with other island characteristics - island area is consistently important and significant in both overall and taxon-specific multivariate models, which also include island elevation (as a proxy for habitat diversity), age, isolation from other landmasses and whether island units are archipelagos or single islands. Though all study taxa except ferns have in common a positive speciation-area relationship, they vary over 6 orders of magnitude in the minimum area required for speciation. Furthermore, this variation in the minimum area for speciation correlates with variation among taxa in the level of gene flow. Taxa with higher rates of gene flow, measured at a common spatial scale, have a larger minimum area for speciation and a lower probability of speciation in any given area. This suggests that the population genetics of divergence directly control the incidence and rate of speciation - that there is a direct link between microevolutionary and macroevolutionary processes.

The effects of other island characteristics on speciation probability

Though there is strong evidence for area as a major controller of speciation rates, this does not rule out a role for other environmental variables. In particular, isolation and elevation are also important and significant factors in the overall models, and important and significant in most of the individual taxon models. The effect of elevation is always positive, as predicted if greater altitudinal variation increases the number of habitats and promotes greater ecological speciation (Ackerman et al. 2007; Losos and Parent 2009). In all cases except birds, the probability of in situ speciation increases with increasing isolation, consistent with predictions that lower colonisation rates of distant islands should leave more niches available for speciation (Gillespie and Baldwin 2009). In birds, my measure of speciation probability increases on islands closer to other landmasses, which is unexpected according to my theoretical predictions but might arise if a low frequency of inferred speciation events still represents multiple colonisation (because colonisation is expected to be greater on islands closer to other landmasses). However, isolation is the least important variable for birds and has a small effect on variation in my measure.

Interestingly, considering the great distinction usually made between single islands and archipelagos in island evolution theory, a significant and important effect of archipelagos on speciation probability is found only in birds and angiosperms.

Furthermore, while the parameter estimate for birds is positive, as expected if water gaps between islands act as additional dispersal barriers promoting speciation, the parameter estimate for angiosperms is negative, which is unexpected on theoretical grounds. The lack of significance and importance of the archipelago term in other taxa and in the overall models may indicate that the difference between water gaps and ecological barriers within islands in their strength as dispersal barriers is much greater for birds than the other study taxa (Diamond 1977). For the other study taxa, barriers within islands may be strong enough that diversification within a heterogeneous island is comparable to diversification within an archipelago. Most important for my aims, the speciation-area relationship holds irrespective of whether archipelagos are included or not.

Broad comparative studies such as this one necessarily rely on surrogates and proxies for some underlying variables of interest, and so a lack of correlation in my study is not conclusive evidence against any environmental factor. Further work would particularly benefit from improved data on island ages - it is difficult to evaluate the biological relevance of ages taken from the geological literature (for instance if lava flows sterilise an island some time after its actual origination and emergence; Whittaker et al. 2008) and ages are lacking for many islands and island groups.

The spatial scale of speciation and gene flow

Consistent with the importance of gene flow in population genetics-based theories of speciation, estimates of the level of gene flow explain up to 76% of the variation in the spatial scale of speciation across taxa. Taxa with lower levels of gene flow are able to speciate within smaller islands, suggesting that the level of gene flow determines the spatial scale of speciation by controlling the minimum spatial extent at which differentiation of populations can occur. This result also accounts for the existence of thresholds in evolutionary species-area relationships (Losos and Schluter 2000) - in situ speciation is expected to contribute significantly to local species richness only in areas large enough that gene flow does not prohibit population differentiation.

The main limitation for this analysis was the availability of gene flow data. Past studies have largely applied molecular markers to single species questions, and meta-analyses like this study are necessarily posterior exercises limited by available data. While disparate studies are still comparable (Bohonak 1999; Morjan and Rieseberg 2004), targeted studies generating data for a set of species using a standardised sampling design would allow more refined comparative analyses, including the use of more sophisticated measures of the spatial scale and level of gene flow, such as the mean dispersal distance predicted from the slope of an isolation by distance (IBD) regression line for each species (Kinlan and Gaines 2003) or the Sp statistic (Vekemans and Hardy 2004). It would also be useful to have gene flow data for the specific genera and species for which island data were collected, instead of averaging over each major taxon (especially given the tendency of island species to evolve reduced dispersal ability, Carlquist 1974), but these data were not available in the literature.

Because of these constraints, I could compare only a limited number of different taxa at a relatively broad taxonomic scale, while retaining enough information to provide reasonable sample sizes for estimating the study variables. Despite relatively low power, the result is robust for the sample available. The significance of the relationship varied depending on the scale used (which determined whether snails were included or not) and whether archipelagos were included or not, but in an inconsistent way that reflected low power rather than large changes in the underlying relationship. A significant relationship was also found using an alternative measure that is closer to the underlying quantity of interest but less statistically robust than mean Fst (an estimate of the minimum scale at which neutral divergence is expected to occur within species of each major taxon, Figure 2.5).

Therefore, despite the above limitations, and the relatively low power they entail, these results point to gene flow levels as a potentially important determinant of the spatial scale of speciation. It remains possible, however, that the relationship found is the result of other confounding factors that vary between the study taxa in parallel to differences in gene flow. Incorporating more taxa, resolving the chosen taxa more finely and generating more estimates of gene flow would be needed for more powerful tests of this hypothesis in future.

The negative relationship found between gene flow and the probability of speciation within a given area at first seems to contrast with ideas that either high (Eriksson and Bremer 1991; Owens et al. 1999; Phillimore et al. 2006) or intermediate (Price and Wagner 2004; Paulay and Meyer 2006) dispersal ability should lead to maximum diversification. These ideas are only incompatible, however, if every species is imagined to have a single value representing its dispersal ability. In reality, dispersal for any taxon is usually thought of as a leptokurtic probability function, with a long tail of infrequent long-distance dispersal events (Tilman and Kareiva 1997). Under this model, dispersal affects diversification in two different ways - shorter-distance dispersal within the species range maintains species cohesion, and rarer long-distance dispersal to new areas outside the species range allows the establishment of new, potentially isolated populations. By considering only lineages able to reach oceanic islands, I intentionally focused on the effect of shorter-distance dispersal ability and controlled for long-distance dispersal, namely colonisation ability.


Figure 2.5. Results of an alternative gene flow analysis - the relationship between the minimum area for speciation and the spatial scale of neutral population differentiation. Minimum island size for speciation is plotted against the minimum geographic extent for each taxon at which gene flow has been observed to be low enough (Fst high enough) to allow neutral genetic differentiation of populations (Fst = 0.2, corresponding to Nm = 1). This is estimated for each taxon by the geographic scale of the population genetic study with the smallest geographic scale and Fst ≥ 0.2. Macrolepidoptera are excluded because none of the Macrolepidoptera studies in my population genetic dataset have Fst ≥ 0.2.
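The correspondence between Fst = 0.2 and Nm = 1 in this caption matches Wright's island-model approximation, Fst = 1 / (4Nm + 1). The caption does not state which formula was used, so that link is an assumption here. The sketch below also shows how the alternative measure could be extracted from a list of studies; the function names and the example records are hypothetical.

```python
def fst_from_nm(nm):
    """Wright's island-model approximation: Fst = 1 / (4 * Nm + 1)."""
    return 1.0 / (4.0 * nm + 1.0)


def min_differentiation_scale(studies, threshold=0.2):
    """Smallest study extent (km) whose Fst meets the threshold.

    studies: iterable of (extent_km, fst) pairs for one taxon.
    Returns None if no study reaches the threshold.
    """
    extents = [extent_km for extent_km, fst in studies if fst >= threshold]
    return min(extents) if extents else None


# Hypothetical records for one taxon: (maximum distance between populations, Fst).
taxon_studies = [(5.0, 0.10), (50.0, 0.25), (20.0, 0.30)]
scale = min_differentiation_scale(taxon_studies)
```

A taxon with no study at or above the threshold returns None, which corresponds to the exclusion of Macrolepidoptera from the figure.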

Evolutionary explanations for the observed patterns

Several mechanisms could produce the speciation-area relationship observed (Gavrilets and Losos 2009). First, larger areas might offer more opportunity for geographical isolation, either by distance alone or via barriers to dispersal (MacArthur and Wilson 1967; Endler 1977; Rosenzweig 1995). Second, larger areas might encompass more habitat types, which could increase speciation rates through stronger divergent selection or by providing additional niches allowing the coexistence of newly formed species (Losos and Parent 2009). I considered habitat variation in relation to elevation, but other unmeasured aspects of habitat variation might also scale with area. Third, larger areas can support larger population sizes, which might increase the rate of adaptive evolution by increasing the rate of origin of beneficial mutations for selection to act upon (Gavrilets and Vose 2005). Including data on population sizes might allow the third mechanism to be distinguished, but in the absence of such information, I believe that the relationship between gene flow and the spatial scale of speciation is most consistent with speciation occurring through geographical isolation or ecological divergence into distinct, spatially structured habitats (Schluter 2001).

By providing the exception to the general pattern observed, ferns strengthen the support for these conclusions. Ferns are known to have a high incidence of speciation through hybridisation and polyploidy (Wagner 1969; Otto and Whitton 2000), two major processes allowing speciation to occur in the face of gene flow (Berlocher 1998). In fact, of the two fern genera in my study with speciation events supported by published phylogenies, one is thought to have diversified through hybridisation (Eastwood et al. 2004). In contrast, speciation as a result of hybridisation and polyploidy is rare in animals, and important but much less frequent in angiosperms (Otto and Whitton 2000). Thus, as expected if the speciation-area relationship is the result of gene flow-limited divergence, the group that most frequently speciates with continuing gene flow shows no significant speciation-area relationship. I conclude that pure sympatric speciation, namely in the absence of any geographical isolation and in the presence of gene flow, appears to be infrequent in all taxa except ferns (see also Barraclough and Vogler 2000; Phillimore et al. 2008).

Extinction might also influence the relationship between diversification and area - most directly because extinction rates should be higher on smaller islands with smaller populations (MacArthur and Wilson 1967). For this reason, extinction has been used in the past to explain the relationship between island area and the number of single island endemic species (Mayr 1965). The effect of extinction on the speciation-area relationship cannot be tested with the type of data presented here; it would require studies of island taxa for which comprehensive fossil data are available and extinction rates can be estimated directly (perhaps birds; Steadman 2006). However, I believe that the association between decreased gene flow and increased diversification cannot be explained easily by extinction. There are some mechanisms, such as increased pathogen spread (Thrall et al. 2000) or swamping of local adaptation (Holt and Gomulkiewicz 2004), by which increased gene flow could increase the risk of extinction (and thereby decrease net diversification rate), but neither of these is a necessary outcome of increased gene flow. It is more usually expected that decreased gene flow should increase the risk of extinction, either through increased inbreeding (Lande 1988) or decreased recolonisation rates within metapopulations (Gaggiotti and Hanski 2004). Probability of speciation, on the other hand, is clearly predicted to increase with decreased gene flow. Therefore, I believe it is more likely that the patterns I observe reflect differential rates of divergence and speciation, rather than an effect of extinction.


The effects of taxonomic practice and surveying effort

In common with most comparative studies of diversification, I assume that entities named as species represent a similar level of evolutionary divergence across all taxa considered. If different taxa had been subjected to different taxonomic practices, this could influence my conclusions regarding scales at which speciation can occur. For instance, a taxon in which species are split more finely (so that they are equivalent to subspecies of other taxa) would be counted as being able to speciate within smaller islands. On the other hand, finer splitting, causing subspecies endemic to single islands to be elevated to species status, could lead to more cases of genera with only one endemic species on an island, and thus lower calculated probabilities of speciation. As this would not affect which genera are identified as having had speciation events, this would not affect the estimation of minimum areas for speciation, but would change the slopes of the speciation-area relationships. In either case, it is unlikely that the differences in taxonomic practice among my study taxa are in the correct order (for instance, Lepidoptera lumped more than snails) to be solely responsible for the pattern of minimum areas for speciation observed.

Data quality is likely to vary among islands and taxa as a result of differences in

past surveying intensity. Total surveying effort has generally been greater for larger

islands, but on small islands less effort is necessary for complete description of their

endemic species. Therefore, I do not believe that the chance of detecting whether a genus

has speciated in situ or not is likely to vary systematically with island area.


Finally, my surveys of island characteristics, species lists and phylogenies of

study genera are not comprehensive, due to limitations on data availability. Future

availability of appropriate data could perhaps alter observed patterns. However, I believe

that the data used include a high percentage of those available and are complete enough to draw broad conclusions.

Implications for evolutionary studies of diversity patterns

These results support a general geographical model of speciation in which area

and gene flow interact via the spatial scale of speciation to control both speciation rates

and resulting diversity patterns. As a result of reduced gene flow, some organisms are

able to differentiate at finer spatial scales than others, leading to increased speciation rates

and higher taxonomic diversity within a given area. Variation among taxa in the level of

gene flow could be caused by several factors, including differences in dispersal ability, in

the degree of habitat specificity (which controls which habitats will act as barriers to

dispersal, Thorpe 1945) and in the strength of natural selection against between-population hybrids (whose survival is necessary for effective gene flow). The strength of

selection against hybrids will depend on the rate of accumulation of genetic

incompatibilities and the degree of local adaptation (Gavrilets 2004; Fuller 2008), both of

which could vary systematically among taxa. Because the above model incorporates both species traits and environmental characteristics, it should be useful for explaining both

taxonomic and regional variation in diversification rates and total diversity.

Furthermore, the strength of this model highlights more generally the potential of an evolutionary-process based framework for understanding speciation rates and higher-level patterns of species richness. Macroevolutionary studies until now have tested a

diverse range of potential correlates of diversification, with mixed results and few general conclusions (for a review of factors tested, see Jablonski 2008). In particular, macroevolutionary studies focusing on organism traits - such as animal body size or plant woodiness - have generally found only weak correlations with diversification rates, explaining no more than 10-24% of the observed variation in clade species richness, even using multivariate models (Phillimore et al. 2006). In contrast, there is stronger evidence for the link between population-level processes (including adaptive divergence, but also sexual selection and gene flow) and rates of speciation and diversification (for example,

Barraclough et al. 1995; Belliure et al. 2000; Stuart-Fox and Owens 2003; Funk et al.

2006; Seddon et al. 2008). These processes relate directly to the population genetic theory

that forms the foundation of our understanding of speciation, and a framework based on

these processes would be applicable to all organisms. Bridging the gap between population genetic theories of speciation and macroevolutionary approaches has great potential for improving our understanding of large-scale patterns of diversity.

64 Chapter 3. Gene flow and diversification

Chapter 3. The relationship between gene flow and clade diversification rates in Costa Rican orchids

Introduction

Gene flow is thought to be the most important factor preventing population divergence and speciation (Mayr 1963; Slatkin 1973; Endler 1977; Slatkin 1985; Nosil

2009). As such, reduced gene flow between populations within a species should make that species more likely to diverge and speciate. Furthermore, if levels of gene flow within species are heritable along related lineages, the level of gene flow could be a species-level trait influencing rates of diversification (Jablonski 2008).

Despite clear theory to suggest a relationship, the association between gene flow within species and clade diversification has never been tested directly using molecular

data (but see chapter two). Some studies have tested the link between dispersal ability and

diversification rates, and dispersal ability is generally a good proxy for rates of gene flow

(Zera 1981; Govindaraju 1988; Bohonak 1999). Most of these studies have found greater diversification associated with poorer dispersal, which should indicate reduced gene flow, as expected (for example, Jablonski 1986; Belliure et al. 2000). However, in some cases greater diversification has been found to be associated with greater dispersability (which likely reflects the ability of species to colonise new regions, rather than the level of gene

flow between established populations; e.g. Owens et al. 1999; Phillimore et al. 2006),

associated with intermediate dispersal (e.g. Price and Wagner 2004; Paulay and Meyer

2006) or not associated with dispersal at all (e.g. Vrba 1984; Herrera 1989). The diversity


of results obtained suggests that the relationship between dispersal and diversification is

complex. However, existing studies rely on surrogates for measuring gene flow:

quantifying the level of gene flow directly using genetic data might remove some of this

complexity and help to uncover the true relationship between gene flow and

diversification.

Here, I test whether levels of gene flow within species are heritable and whether

they affect clade diversification rates using population genetic data for tropical orchid

species from pairs of sister clades that differ greatly in species richness. Sister group

comparisons control for age in comparing diversification and allow replicated,

phylogenetically independent tests of the hypothesised relationship. In addition, sister

groups typically share most traits, minimising the number of variables that can confound

conclusions about traits of interest (Barraclough et al. 1998).

Orchids are a good study group because their many species and great ecological and morphological variability allow many independent tests of factors hypothesised to affect diversity patterns. They abound in pairs of sister clades that differ greatly in diversity - for example, the most dramatic sister pairs in this dataset compare clades with

42 and 140 species to sister clades with 1,066 and 1,387 species, respectively. They also present an unsolved biodiversity mystery: with over 26,000 described species (Dressler

2005; Govaerts et al. 2010), orchids are probably the most diverse family of angiosperms.

However, even though their exotic flowers and varied ecology have attracted a devoted research community stretching back to Darwin (1862) and earlier, the reasons for their

diversity are still highly debated (Gravendeel et al. 2004; Cozzolino and Widmer 2005;

Tremblay et al. 2005). In addition, 70% of orchids (Benzing 1987) are epiphytes (growing

on other plants for support), a life form that is poorly studied even though epiphytes make up about 10% of all vascular plant species and are a major component of tropical forest communities (Gentry and Dodson 1987). Finally, orchids deserve study as they are a high conservation priority (all orchids are listed in Appendix I or II of the Convention on International Trade in Endangered Species; http://www.cites.org) and knowledge of their genetic characteristics and population dynamics is lacking, especially for tropical species.

I measure the level of gene flow for 17 orchid species from five comparisons of species-rich and species-poor sister clades using Fst values estimated from AFLP genotypes. Fst, the proportion of total neutral genetic variation between (rather than within) populations, is the most widely used measure of gene flow. High values of Fst indicate clear population differentiation and low levels of gene flow, and low values indicate little differentiation and high levels of gene flow (Wright 1931; Slatkin 1985). Fst is robust to variation in the spatial arrangement of populations (Beaumont and Nichols

1996) and, when calculated from truly neutral loci, comparable between species (Barton

2001). When gene flow is at equilibrium, Fst should increase with the geographic distance between populations, as migration is more frequent between nearby populations

(Hutchison and Templeton 1999). This makes the scale of sampling an important factor in population genetic studies. For this reason, I evaluate patterns of genetic isolation by distance in addition to analysing overall Fst.
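As a rough illustration of the definition of Fst given above, the following sketch computes Fst for a single biallelic locus as (H_T - H_S) / H_T, where H_S is the mean expected heterozygosity within populations and H_T is the expected heterozygosity of the pooled population. The function name, the equal weighting of populations and the example allele frequencies are all illustrative assumptions; this is not the estimator applied to the AFLP data in this chapter.

```python
def fst_biallelic(freqs):
    """Fst for one biallelic locus from per-population allele frequencies.

    Assumes equally sized populations (an illustrative simplification).
    """
    if not freqs:
        raise ValueError("need at least one population")
    # Expected heterozygosity within each population, 2p(1-p), averaged.
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    # Expected heterozygosity of the pooled population.
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # locus fixed everywhere: no variation to partition
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in three populations.
similar = fst_biallelic([0.50, 0.55, 0.45])    # populations barely differ
divergent = fst_biallelic([0.05, 0.50, 0.95])  # populations differ strongly
print(f"similar: {similar:.3f}, divergent: {divergent:.3f}")
```

In this toy case the strongly differentiated set of populations gives the higher Fst, which is the pattern interpreted in the text as lower gene flow.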

I also account for two possibly confounding factors, species range size and ecology. The relationship between species range size and diversification rates is unclear, as range size may directly influence speciation and extinction rates or it may be the product of lineages’ evolutionary histories, but an association is expected (Rosenzweig


1995; Webb and Gaston 2003; Pigot et al. in press). Similarly, it is unclear to what extent ecological traits of species generally affect diversification rates, but some association is

expected (e.g. Phillimore et al. 2006), as a species’ niche and niche breadth should affect

its probability of range expansion, speciation, and extinction (Funk et al. 2002; McPeek

2008). Furthermore, there is evidence that both range size and species ecology are associated with variation in levels of gene flow or population differentiation (Hamrick

and Godt 1996; Morjan and Rieseberg 2004). For this reason, I include both a restricted

and a widespread species for as many study clades as possible and I explicitly test the

strength of associations between species ecology and species range size with gene flow

and rates of diversification.

Therefore, to test the predictions of theory linking gene flow and speciation, I test

the hypothesis that a direct association exists between levels of gene flow within species

and diversification rates of clades, taking into account the possible confounding influences of species range size and ecology. I also test whether the level of gene flow is

heritable between clades.

Methods

Study group selection

Study clades were chosen from two subtribes of tribe Epidendreae (subfamily

Epidendroideae) with recently published phylogenetic analyses with complete sampling

at the genus level: Pleurothallidinae (Pridgeon et al. 2001) and Laeliinae (van den Berg et

al. 2009). Sister clades were chosen from well-resolved portions of the phylogenetic trees,


although relationships between genera within clades were not always completely

resolved. The only major uncertainty in the composition of chosen clades was whether the

genus Meiracyllium belongs in the Brassavola clade or was placed there spuriously (van

den Berg et al. 2009). I assumed that it belongs in the Brassavola clade; Meiracyllium

contains only two species and so should not affect results greatly either way. Sister clades

were chosen that had as large differences as possible in species richness between the two

clades, using species diversities for genera taken from the World Checklist of

Orchidaceae (Govaerts et al. 2008). All pairs chosen differed in species richness by at

least five-fold. Three pleurothallid sister clade pairs were chosen, hereafter referred to as

the Masdevallia-Trisetella, Lepanthes-Lepanthopsis and Platystele-Dryadella sister pairs;

two laeliinid sister clade pairs were chosen, referred to as the Scaphyglottis-Jacquiniella

and Epidendrum-Brassavola sister pairs. Clade species richness and the genera included

in each clade are listed in Table 3.1.
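The five-fold criterion can be checked directly from the clade totals listed in Table 3.1. The short sketch below does this arithmetic; the dictionary layout and function name are illustrative assumptions, and only the species counts come from the table.

```python
# Clade species richness from Table 3.1: (species-rich clade, species-poor clade).
clade_pairs = {
    "Masdevallia-Trisetella": (751, 23),
    "Lepanthes-Lepanthopsis": (1066, 42),
    "Platystele-Dryadella": (276, 53),
    "Scaphyglottis-Jacquiniella": (77, 13),
    "Epidendrum-Brassavola": (1387, 140),
}


def richness_ratio(rich, poor):
    """Fold difference in species richness between two sister clades."""
    return rich / poor


ratios = {name: richness_ratio(*counts) for name, counts in clade_pairs.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}-fold")
```

The smallest contrast is the Platystele-Dryadella pair (276 / 53, just over five-fold) and the largest is the Masdevallia-Trisetella pair (751 / 23, over thirty-fold).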

Study species were chosen from one genus from each study clade. Only epiphytic

species were used. Species within genera were chosen to maximise the ease of locating

and identifying them in the field. Species were selected only if they are relatively well-represented in herbaria and their delimitation from closely related species is clear in the taxonomic literature. In addition to these selection criteria, two more were used. First, species within clade pairs were chosen that had as similar habitat requirements as possible

to control for the effect of habitat on Fst (for example, high elevation habitats are likely to

be more fragmented and less continuous than lower elevation habitats, which could

directly influence the amount of gene flow between populations). Species habitats were

defined according to descriptions given in the Manual de Plantas de Costa Rica, Vol. III


(MPCR; Hammel et al. 2003). Second, where possible, the species within each genus

were chosen to include both a restricted-range and a widespread species to allow

investigation of the relationship between species range size and Fst. In this case,

“restricted” was defined as occurring only in Costa Rica or Costa Rica and one other

neighbouring country; “widespread” was defined as occurring in two or more countries in

addition to Costa Rica. Species ranges were defined according to the World Checklist of

Orchidaceae (Govaerts et al. 2007).
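The range-size definitions above amount to a simple rule on the list of countries in which a species occurs. A minimal sketch of that rule follows; the function name is hypothetical, and the sketch counts countries only, so it does not check that the single additional country of a restricted species actually neighbours Costa Rica.

```python
def classify_range(countries, focal="Costa Rica"):
    """Classify a species range as 'restricted' or 'widespread'.

    'restricted': the focal country alone, or the focal country plus one other.
    'widespread': the focal country plus two or more other countries.
    """
    countries = set(countries)
    if focal not in countries:
        raise ValueError(f"species must occur in {focal}")
    others = countries - {focal}
    return "restricted" if len(others) <= 1 else "widespread"


# Illustrative country lists for three cases.
print(classify_range({"Costa Rica"}))                        # restricted
print(classify_range({"Costa Rica", "Panama"}))              # restricted
print(classify_range({"Costa Rica", "Panama", "Colombia"}))  # widespread
```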

Although the aim was to choose two species per clade, sampling for some clades

covered fewer species than others. For most of the species-poor clades, sampling multiple

species was problematic because these genera had few or no easily collected species. For

example, Brassavola is represented by only three species in Costa Rica - one of which is

rare and poorly collected and the other two of which are suspected of being synonymous.

In this case, the rare species was not collected. For Dryadella, Lepanthopsis and

Trisetella, all species were difficult to find and I was only able to collect sufficient

numbers of individuals for one species of each genus. In contrast, more than two suitable

potential study species were identified for most of the other genera, and all were collected

when found in the field. Finding populations in the field was a haphazard process and by

collecting samples this way the chance of obtaining sufficient sample individuals for at

least two species per genus was maximised. However, time limitations for the genetic

analysis meant that not all species collected could be genotyped, so no more than one

restricted and one widespread species were genotyped per genus. Some collected samples

proved difficult to identify definitively to species because their species are

morphologically similar to others or hybridise - these were all excluded. In cases where

multiple restricted or widespread species were collected from a single genus, the species with the best sampling (the most samples and covering the widest range of spatial scales) was selected for genotyping. The species collected and genotyped are listed in Table 3.2 along with their geographic and elevation ranges and habitat, and the number of populations sampled and individuals genotyped. Photos of genotyped species are given in

Figure 3.1.


Table 3.1. Study clade pairs, with genera they include and currently accepted species richness according to the World Checklist of Orchidaceae (Govaerts et al. 2010).

Species-rich clade | # spp. | Species-poor sister clade | # spp.
Masdevallia clade | 751 | Trisetella clade | 23
  Masdevallia Ruiz & Pav. | 582 | Trisetella Luer | 23
  Diodonopsis Pridgeon & M. W. Chase | 5 | |
  Dracula Luer | 126 | |
  Porroglossum Schltr. | 38 | |
Lepanthes clade | 1066 | Lepanthopsis clade | 42
  Lepanthes Sw. | 1066 | Lepanthopsis (Cogn.) Ames | 42
Platystele clade | 276 | Dryadella clade | 53
  Platystele Schltr. | 99 | Dryadella Luer | 53
  Scaphosepalum Pfitzer in H. G. A. Engler & K. A. E. Prantl (eds.) | 46 | |
  Specklinia Lindl. | 131 | |
Scaphyglottis clade | 77 | Jacquiniella clade | 13
  Scaphyglottis Poepp. & Endl. | 68 | Jacquiniella Schltr. | 12
  Dimerandra Schltr. | 9 | Acrorchis Dressler | 1
Epidendrum clade | 1387 | Brassavola clade | 140
  Epidendrum L. | 1325 | Brassavola R. Br. in W. T. Aiton | 21
  Barkeria Knowles & Westc. | 15 | Cattleya Lindl. | 111
  Caularthron Raf. | 4 | Guarianthe Dressler & W. E. Higgins | 4
  Laelia Lindl. | 24 | Meiracyllium Rchb. f. | 2
  Myrmecophila Rolfe | 10 | Rhyncholaelia Schltr. | 2
  Orleanesia Barb. Rodr. | 9 | |


Table 3.2. Species collected, with distributions and habitats. Species marked with an asterisk (*) were genotyped. Species are arranged by genus, with genera from sister clades arranged together. #L = number of locations sampled. #S = number of samples genotyped.

Genus | Species | Restricted or widespread | Geographic range | Typical habitat | Elevation range (m) | #L | #S
Masdevallia | nidifica Rchb. f. * | widespread | Nicaragua to N. Peru | Very humid, rain or cloud forest | 700-2000 | 5 | 34
 | rafaeliana Luer * | restricted | Costa Rica and Panama | Cloud or oak forest | 2600-3000 | 2 | 30
 | chontalensis Rchb. f. | widespread | throughout central America | Very humid, rain or cloud forest | 600-1800 | 6 | -
 | picturata Rchb. f. | widespread | Costa Rica to S. tropical America | Rain or cloud forest | 1200-2300 | 4 | -
Trisetella | triglochin (Rchb. f.) Luer * | widespread | Costa Rica to S. tropical America | Very humid, rain or cloud forest | 200-1900 | 2 | 70
Lepanthes | ciliisepala Schltr. * | restricted | Costa Rica and possibly Venezuela | Cloud or oak forest | 1400-2050 | 2 | 36
 | elata Rchb. f. * | widespread | Costa Rica to W. Colombia | Cloud or oak forest | 1500-2600 | 3 | 52
 | wendlandii Rchb. f. | restricted | Costa Rica and W. Panama | Very humid, rain, cloud or oak forest | 1800-3000 | 3 | -
 | turialvae Rchb. f. | widespread | Costa Rica to Panama, Brazil | Very humid, rain, cloud or oak forest | 600-2550 | 3 | -
Lepanthopsis | floripecten (Rchb. f.) Ames * | widespread | S.E. Mexico to S. tropical America | Rain forest | 1900-2000 | 5 | 20
Platystele | propinqua (Ames) Garay * | restricted | Costa Rica | Cloud or oak forest | 1400-1900 | 3 | 24
 | stenostachya (Rchb. f.) Garay * | widespread | Mexico to S. tropical America | Very humid or rain forest | 0-1900 | 3 | 50
 | microtatantha (Schltr.) Garay | restricted | Costa Rica | Cloud or oak forest | 1500-2200 | 2 | -
Dryadella | odontostele Luer * | widespread | Costa Rica, Panama, Colombia | Very humid forest | 50-150 | 3 | 12
 | guatemalensis (Schltr.) Luer | widespread | Mexico to Colombia | Very humid or rain forest | 1200-2000 | | -
Scaphyglottis | jimenezii Schltr. * | restricted | Costa Rica and W. Panama | Very humid, rain or cloud forest | 700-2400 | 3 | 47
 | fusiformis (Griseb.) R. E. Schult. * | widespread | Costa Rica to S. tropical America | Very humid or rain forest | 50-1400 | 4 | 42
 | prolifera (R. Br.) Cogn. | widespread | Mexico to northern S. America | Humid, very humid, rain or cloud forest | 0-1500 | 5 | -
Jacquiniella | aporophylla (L. O. Williams) Dressler * | restricted | Costa Rica and Panama | Rain forest | 800-1500 | 4 | 37
 | teretifolia (Sw.) Britton & P. Wilson * | widespread | Mexico to northern S. America | Humid, very humid, rain or cloud forest | 1100-1850 | 5 | 46
 | globosa (Jacq.) Schltr. | widespread | Mexico to northern S. America | Very humid or rain forest | 0-1400 | 4 | -
Epidendrum | exasperatum Rchb. f. * | restricted | Costa Rica and Panama | Very humid, rain, cloud or oak forest; pastures and slopes | 900-2500 | 3 | 27
 | laucheanum Bonhof ex Rolfe * | widespread | Mexico to Colombia | Very humid or cloud forest | 1300-2100 | 5 | 31
 | vulgoamparoanum Hágsater & L. Sánchez * | restricted | Costa Rica and Panama | Very humid or rain forest | 0-350 | 4 | 38
 | radicans Pav. ex Lindl. | widespread | Caribbean and Mexico to Colombia | Very humid, rain or cloud forest | 850-1900 | 3 | -
Brassavola | nodosa (L.) Lindl. * | widespread | Mexico to S. tropical America | Dry, humid, very humid or scrub forest; rocks or mangroves | 0-100 | 8 | 50

Figure 3.1. Photos of study species: Masdevallia nidifica, Masdevallia rafaeliana, Trisetella triglochin, Lepanthes ciliisepala, Lepanthes elata, Lepanthopsis floripecten, Platystele propinqua, Platystele stenostachya, Dryadella odontostele, Scaphyglottis jimenezii, Scaphyglottis fusiformis, Jacquiniella aporophylla, Jacquiniella teretifolia, Epidendrum exasperatum, Epidendrum laucheanum, Epidendrum vulgoamparoanum and Brassavola nodosa. Photos of Masdevallia rafaeliana and Epidendrum exasperatum thanks to Martin Turjak. Photo of Platystele stenostachya thanks to Ernesto Carman.


Study species phylogeny reconstruction

The phylogeny of study species was reconstructed using matK sequences from

GenBank. Sequences for study species were used when available; otherwise a sequence

for a single species from the study genus was chosen randomly from those available

(GenBank accession numbers given in Table 3.3). Thunia alba was included as an

outgroup to allow rooting of the tree. Sequences were aligned with MAFFT (available at

http://mafft.cbrc.jp/alignment/software; Katoh et al. 2002), using the online interface and the L-INS-i algorithm. The aligned sequences were trimmed to include only sites for which data were present for all species, and a tree was constructed using Maximum

Likelihood and a GTR substitution model with parameters estimated from the data, using

GARLI v. 1.0 (available from http://garli.googlecode.com; Zwickl 2006) with default parameters and three independent runs. All three runs recovered the same topology; branch lengths from the highest-likelihood tree were used. Finally, the tree was edited by inserting species for which no sequences had been available. These species were added as polytomies within their genus, with terminal branches of length 0.001. All relationships in the study species analyses match the relationships in the source trees used to select study clades (Pridgeon et al. 2001; van den Berg et al. 2009).
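The trimming step, which keeps only alignment columns with data for every species, can be sketched as follows. This is a minimal illustration and not the procedure or software actually used here: the function name and the toy sequences are hypothetical, and treating gaps ("-"), "?" and "N" as missing data is an assumption.

```python
def trim_to_complete_sites(alignment, missing="-?Nn"):
    """Keep only alignment columns in which every sequence has data.

    alignment: dict mapping sequence name -> aligned sequence string.
    missing: characters treated as missing data (an assumption).
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if len({len(seq) for seq in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    # Indices of columns where no sequence has a missing-data character.
    keep = [i for i in range(len(seqs[0]))
            if all(seq[i] not in missing for seq in seqs)]
    return {name: "".join(alignment[name][i] for i in keep) for name in names}


# Toy alignment with hypothetical sequences.
toy = {"sp_A": "ATG-CA", "sp_B": "ATGCCA", "sp_C": "A?GCCN"}
print(trim_to_complete_sites(toy))
```

In the toy alignment, columns 2, 4 and 6 each contain a missing character in at least one sequence, so only columns 1, 3 and 5 are retained.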


Table 3.3. Accession numbers for matK sequences used to build study species trees.

Species | GenBank accession number
Brassavola nodosa | AF263820
Dryadella edwallii | AF265454
Epidendrum campestre | AF263781
Jacquiniella aporophylla | EU214360
Jacquiniella teretifolia | AY396087
Lepanthes ciliisepala | EU214373
Lepanthes elata | EU214374
Lepanthopsis astrophora | AF265487
Masdevallia bicolor | AF265447
Platystele stenostachya | EF079326
Scaphyglottis fusiformis | EU214455
Scaphyglottis jimenezii | EU214460
Trisetella triglochin | EF065592
Thunia alba (outgroup) | AF302706

Sample collection

All samples were collected in Costa Rica with the kind help of the Lankester

Botanical Garden (University of Costa Rica). Sample collection occurred in two field seasons, April/May 2008 and March-May 2009. Whenever possible, living specimens from each population sampled were deposited at the Lankester Botanical Garden, to be maintained in their living collection and added after flowering to their herbarium and silica-gel preserved collections. This was preferred to making herbarium sheets at the time of collection because most plants were not flowering when sampled.

The sampling goal for each species was to collect samples from 20 individuals from each of 3-7 populations throughout Costa Rica. Different locations were sampled in

each field season, and no location was sampled twice for the same species. Populations


were selected, as much as possible, to represent a range of distances between populations

and a significant portion of the range of each species. At each sampling location,

coordinates were recorded using a Garmin 60CSX GPS device. From March 31 to April

22, 2009, I had no working GPS device, and instead recorded locations using mileage

along roads. These relative locations were later translated into coordinates using Google Earth™ v. 5.2 (Google). Maps of sampling locations for each genotyped species are given

in Figures 3.2-3.4 and details of sampling locations for each species are given in

Appendix Table II.1.
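Recorded coordinates can be converted into the distances between populations that an isolation-by-distance analysis requires. The sketch below uses the haversine formula on a spherical Earth; the function names, the site labels and coordinates in the example, and the 6371 km mean radius are all illustrative assumptions, and this is not necessarily how distances were calculated for this chapter.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance (km) between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # min() guards against rounding slightly above 1 for near-antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def pairwise_distances(sites):
    """Distance (km) for every pair of sites; sites maps name -> (lat, lon)."""
    return {(a, b): great_circle_km(*sites[a], *sites[b])
            for a, b in combinations(sites, 2)}


# Hypothetical sampling locations (not real data from this study).
sites = {"site_1": (9.85, -83.90), "site_2": (10.30, -84.80), "site_3": (8.70, -83.20)}
for pair, km in pairwise_distances(sites).items():
    print(pair, round(km, 1))
```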

Plant samples were put into labelled plastic bags in the field and then kept in a

refrigerator or ice chest until they could be put into silica for preservation (Chase and

Hills 1991). Samples were cut into pieces to break the cuticle, put into individual labelled

envelopes made of coffee-filter paper and then dried in larger sealed bags containing

silica gel. The silica gel was changed for fresh, dry silica gel multiple times until all

samples were completely dry. Once dry, samples were stored at room temperature with a

smaller amount of fresh silica.

Figure 3.2. Sampling locations for study species from (a) the Masdevallia-Trisetella and (b) the Lepanthes-Lepanthopsis clade pairs.

Figure 3.3. Sampling locations for study species from (a) the Platystele-Dryadella and (b) the Epidendrum-Brassavola clade pairs.

Figure 3.4. Sampling locations for study species from the Scaphyglottis-Jacquiniella clade pair. (b) is a detail of the boxed area in (a).

AFLP genotyping

DNA from all genotyped samples was extracted using the Qiagen DNeasy Plant

Kit, following the manufacturer’s protocol. DNA extractions for about a third of samples were carried out using the Plant Mini Kit, with individual extraction tubes; the rest were carried out using the 96 Plant Kit, with 96-well plates. Except when there was too little sample material available (for small species), approximately 20 mg silica-dried material was used for each extraction. Flowers were used in preference to leaves when available.

Some leaves had algae, moss or other contaminants and were scraped clean or wiped with ethanol before being used. Extractions were eluted twice with either 50 or 100 μl AE elution buffer or eluted with 50 and then 25 μl AE elution buffer, depending on amount of sample material available and ease of extracting DNA from each species. Each round of extraction included one or two blanks to check for contamination between tubes/wells. In addition, 10-20% of individuals from each population of each species were repeated, by extracting and genotyping them twice independently, to allow later quantification of genotyping error rates (Bonin et al. 2004).
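Replicated individuals allow a genotyping error rate to be estimated by comparing the two independent profiles of each repeated sample. A common formulation is the number of mismatched loci divided by the total number of locus comparisons. The sketch below implements that ratio on binary presence/absence profiles; the function name and the toy profiles are hypothetical, and it is offered as an illustration of the idea rather than the exact calculation used in this chapter.

```python
def mismatch_error_rate(replicate_pairs):
    """Proportion of locus comparisons that differ between replicate profiles.

    replicate_pairs: iterable of (profile_1, profile_2), each a sequence of
    0/1 scores for the same loci in the same order.
    """
    mismatches = 0
    comparisons = 0
    for first, second in replicate_pairs:
        if len(first) != len(second):
            raise ValueError("replicate profiles must cover the same loci")
        mismatches += sum(a != b for a, b in zip(first, second))
        comparisons += len(first)
    if comparisons == 0:
        raise ValueError("no loci to compare")
    return mismatches / comparisons


# Two hypothetical replicated individuals scored at five loci each.
pairs = [
    ([1, 0, 1, 1, 0], [1, 0, 1, 0, 0]),  # one mismatch
    ([0, 0, 1, 1, 1], [0, 0, 1, 1, 1]),  # identical
]
print(mismatch_error_rate(pairs))  # 1 mismatch in 10 comparisons
```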

Trials of potential selective primer combinations were carried out for each species individually, testing genotyping quality of six or twelve primer combinations. Primers were chosen to maximise number of peaks per sample, number of polymorphic peaks per species, evenness of spread of peak sizes (even if this required choosing a primer that produced fewer peaks) and profile repeatability.


AFLP reactions were carried out following the methods of Vos et al. (1995)

except for the following modifications.

Only 300 ng DNA was used for species that did not successfully amplify using the

original protocol (Epidendrum laucheanum, Trisetella triglochin, Lepanthes elata,

Lepanthes ciliisepala and Platystele propinqua). DNA was dried in a vacuum oven at 40-

60°C and, once dry, resuspended in 5.5 μl distilled water.

Enzyme restriction used EcoRI and MseI (Promega and New England BioLabs,

respectively); ligation used T4 Ligase (Promega). DNA restriction and subsequent

adaptor ligation were carried out simultaneously for most species by incubating each

sample at 37°C for 2 hours with 5 U EcoRI, 1 U MseI, 1.1 μl 0.5 M NaCl, 0.55 μg bovine serum albumin (BSA), 1.1 μl 10X Ligase buffer (Promega), 1 U Ligase, 1 μl each of

MseI and EcoRI adaptor pairs (Applied Biosystems) and distilled water to make up the

volume to 5.5 μl (total reaction volume including sample = 11 μl). For some species,

however, the ligation step worked only if it was carried out separately. In this case,

samples were first incubated as above but in a reaction mixture with water replacing the

ligase and adaptor pairs. Ligation was then carried out by incubating 4 μl of the diluted

restriction product (11 units product diluted with 80 units TE buffer) for 2-3 hours at

room temperature with 1 U Ligase, 1 μl each of MseI and EcoRI adaptor pairs, 1.1 μl

Ligase buffer, 0.55 μg BSA and water to make up the volume to 4 μl (total reaction

volume including sample = 8 μl). For species where restriction and ligation were carried

out together, the reaction product was diluted before the preamplification PCR (11 units

of product diluted with 189 units TE buffer).
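The two dilutions described above differ by roughly a factor of two, which is easy to miss when they are stated as unit ratios. The following sketch works out the fold dilution of each; the function name is hypothetical and the calculation simply assumes that volumes are additive.

```python
from fractions import Fraction


def dilution_fold(product_units, buffer_units):
    """Fold dilution when product is mixed with buffer (additive volumes assumed)."""
    return Fraction(product_units + buffer_units, product_units)


separate = dilution_fold(11, 80)    # restriction product before separate ligation
combined = dilution_fold(11, 189)   # combined restriction-ligation product
print(f"separate: {float(separate):.1f}-fold, combined: {float(combined):.1f}-fold")
```

That is, 11 units in 91 total is about an 8.3-fold dilution, and 11 units in 200 total is about an 18.2-fold dilution.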


All PCR primers came from the Applied Biosystems AFLP Regular Genome

Plant Mapping Kit and all PCR reactions used a Fermentas PCR mastermix. For the preamplification PCR, 2 μl restriction-ligation product were mixed with 7.5 μl PCR mastermix and 0.5 μl preamplification primers and amplified with the following program:

72°C for 2 min; cycles of 20 s at 94°C, 30 s at 56°C and 2.5 min at 72°C; 30 min at 60°C.

For some species, the 72°C extension step in each cycle was shortened from 2.5 to 2 minutes, and the number of cycles used varied between species (Table 3.4). The preamplification PCR product was then diluted (5 units product with 95 units TE buffer) for some species, but for most species the final peak profile was stronger if the preamplification PCR product was used undiluted. For the selective PCR, 1.5 μl preamplification product were mixed with 7.5 μl PCR mastermix and 0.5 μl of each of two selective primers and amplified with the following program:

72°C for 2 min; 10 cycles of 20 s at 94°C, 30 s at 66°C and 2 min at 72°C, with a decrease in annealing temperature of 1°C per cycle; 35 cycles of 20 s at 94°C, 30 s at

56°C and 2 min at 72°C; 30 min at 60°C. Three selective PCRs were carried out for each sample: with NED (Yellow), JOE (Green) and FAM (Blue) labelled primer combinations.

Finally, 1.2 μl of each of the three selective PCR products for each sample were combined with 10 μl formamide and 0.2 μl GeneScan-500 ROX size standard (Applied

Biosystems) in a single well for simultaneous genotyping on a capillary sequencer

(Applied Biosystems 3130xl Genetic Analyzer).
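The touchdown portion of the selective PCR program can be written out cycle by cycle. The sketch below lists the annealing temperature of every cycle, assuming that the first touchdown cycle anneals at 66°C and each later touchdown cycle is 1°C cooler; the function name and this reading of the program are assumptions made for illustration.

```python
def selective_pcr_annealing_schedule(start=66, step=1, touchdown_cycles=10,
                                     plateau=56, plateau_cycles=35):
    """Annealing temperature (deg C) for each cycle of the selective PCR."""
    # Touchdown phase: start temperature, dropping by `step` each cycle.
    touchdown = [start - step * i for i in range(touchdown_cycles)]
    # Plateau phase: constant annealing temperature.
    return touchdown + [plateau] * plateau_cycles


schedule = selective_pcr_annealing_schedule()
print(schedule[:10])   # touchdown cycles
print(len(schedule))   # total number of cycles
```

Under this reading the touchdown cycles run from 66°C down to 57°C, so the plateau at 56°C continues the same 1°C step, giving 45 cycles in total.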

Details of the protocol variations used for each species, including the selective primers used, are listed in Table 3.4. If possible, all samples for each species were run together on a single plate for all AFLP reactions to eliminate variability from slight variations in run temperatures etc. When this was not possible, samples were split

between two plates, each containing at least one pair of repeated samples. When samples of one species were split between plates, both plates were run on the same PCR machine for all reactions.


Table 3.4. Details of AFLP method used for each study species. A star by the number of preamplification PCR cycles indicates that a 2 min extension time was used instead of the standard 2.5 min.

Species | FAM (Blue) primers used | NED (Yellow) primers used | JOE (Green) primers used | Restriction/ligation | Preamp. PCR product diluted? | # preamp. PCR cycles | # samples genotyped | # samples repeated
M. nidifica | ACA-CAT | ACC-CTT | AAG-CAA | one reaction | yes | 25 | 34 | 4
M. rafaeliana | ACA-CAT | AGC-CAT | ACG-CAG | one reaction | yes | 25 | 30 | 2
T. triglochin | ACT-CTG | AAC-CTA | ACG-CTG | separate reactions | no | 30 | 70 | 15
L. ciliisepala | ACT-CAA | AAC-CAG | ACG-CAG | separate reactions | no | 30 | 36 | 7
L. elata | ACA-CAT | AAC-CTA | ACG-CTG | separate reactions | no | 30 | 52 | 11
L. floripecten | ACT-CTG | AGC-CTA | AAG-CTC | separate reactions | no | 30 | 20 | 7
P. propinqua | ACT-CTG | ACC-CAG | ACG-CAC | separate reactions | no | 30 | 24 | 4
P. stenostachya | ACT-CTG | AAC-CAC | ACG-CAC | separate reactions | no | 30 | 50 | 9
D. odontostele | ACT-CTA | AAC-CAC | AAG-CTG | separate reactions | no | 30 | 12 | 6
S. jimenezii | ACA-CTT | AGC-CAT | AAG-CTC | one reaction | yes | 25 | 47 | 6
S. fusiformis | | | | one reaction | no | 25* | 42 | 7
J. aporophylla | ACA-CTA | ACC-CTA | AAG-CTC | one reaction | yes | 25 | 37 | 4
J. teretifolia | ACA-CAT | AGC-CAT | AAG-CTC | one reaction | no | 25* | 46 | 5
E. exasperatum | ACA-CTT | AGC-CTA | AAG-CTC | one reaction | no | 25 | 27 | 5
E. laucheanum | ACT-CAA | AGC-CTA | ACG-CAG | separate reactions | no | 30 | 31 | 9
E. vulgoamparoanum | ACT-CAA | AGC-CTA | AAG-CTC | one reaction | no | 25* | 38 | 10
B. nodosa | ACT-CTG | AGC-CAT | AAG-CTC | one reaction | yes | 25* | 50 | 10


AFLP scoring

AFLP scoring was carried out using GeneMapper v. 4.0 (Applied Biosystems) to manually identify bins and AFLPScore v. 1.4a (available at http://www.sheffield.ac.uk/molecol/software~/aflpscore.html; Whitlock et al. 2008) to optimise scoring parameters and create a binary genotype table for each species.

First, all profiles with poor sizing or evidence of poor PCR amplification (few peaks or peak strength decreasing rapidly with fragment size) were excluded from analysis. Then, as many bins as possible were created in the 50-500 base pair (bp) range for each species dataset. Bins had to be less than 1 bp wide and non-overlapping. In addition, bins had to include at least one peak of height 100 relative fluorescence units (RFU) or greater; peaks within a single bin had to be 0.3 bp apart or less and those from different bins had to be at least 0.4 bp apart. These criteria were set to minimise the chance of homoplasy due to including peaks from more than one locus in a single bin, based on the fact that homologous peaks from repeated samples never differed in position by more than 0.3 bp, except in a few cases involving unusually wide peaks. These criteria are strict, but I preferred to exclude a few valid loci rather than score non-homologous peaks as bands from the same locus.
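The binning criteria above can be expressed as a small set of checks. The following is a minimal sketch in Python, written only for illustration: the bins in this study were created manually in GeneMapper, and the `Peak` and `Bin` types and function names are hypothetical. It assumes that "0.3 bp apart or less" refers to the total spread of peak positions within a bin, and that the 0.4 bp separation applies to the nearest peaks of neighbouring bins.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the criteria described in the text.
SIZE_RANGE_BP = (50.0, 500.0)
MAX_BIN_WIDTH_BP = 1.0          # bins must be less than 1 bp wide
MIN_TOP_PEAK_RFU = 100.0        # at least one peak of 100 RFU or greater
MAX_SPREAD_WITHIN_BIN_BP = 0.3  # peaks within one bin
MIN_GAP_BETWEEN_BINS_BP = 0.4   # peaks from different bins


@dataclass
class Peak:
    size_bp: float
    height_rfu: float


@dataclass
class Bin:
    start_bp: float
    end_bp: float
    peaks: List[Peak]


def bin_is_valid(b: Bin) -> bool:
    """Check the criteria that apply to a single bin."""
    if not b.peaks:
        return False
    if b.start_bp < SIZE_RANGE_BP[0] or b.end_bp > SIZE_RANGE_BP[1]:
        return False
    if b.end_bp - b.start_bp >= MAX_BIN_WIDTH_BP:
        return False
    if max(p.height_rfu for p in b.peaks) < MIN_TOP_PEAK_RFU:
        return False
    sizes = [p.size_bp for p in b.peaks]
    # Assumption: the 0.3 bp rule is read as the total spread of peak positions.
    return max(sizes) - min(sizes) <= MAX_SPREAD_WITHIN_BIN_BP


def bin_set_is_valid(bins: List[Bin]) -> bool:
    """Check every bin, then check overlap and spacing between neighbours."""
    ordered = sorted(bins, key=lambda b: b.start_bp)
    if not all(bin_is_valid(b) for b in ordered):
        return False
    for left, right in zip(ordered, ordered[1:]):
        if right.start_bp < left.end_bp:  # bins must not overlap
            return False
        gap = (min(p.size_bp for p in right.peaks)
               - max(p.size_bp for p in left.peaks))
        if gap < MIN_GAP_BETWEEN_BINS_BP:
            return False
    return True
```

A bin set that fails any check would need its bin boundaries redrawn, or the offending bin dropped, before scoring.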

Once the maximal bin set was created, scoring parameters were chosen using a version of AFLPScore that I modified to remove loci with no peaks in any sample after scoring and before calculating error rates. A range of locus and phenotype selection thresholds was tested for each of the four scoring methods available in AFLPScore (filtered loci/absolute thresholds, unfiltered/absolute, filtered/relative and unfiltered/relative) and thresholds were chosen that resulted in the most loci being retained with an error rate of 5% or less. These optimised thresholds were then used to generate a binary allele table using AFLPScore. For some species and primer combinations, no scoring parameters resulted in error rates of 5% or less. In these cases, the scoring parameters giving the lowest error rate were chosen, as long as this error rate was less than 10%. If an error rate lower than 10% could not be obtained for a particular species/primer dataset, that dataset was excluded from further analysis. After scoring, all loci for which the band was present in only one individual or absent in only one individual were removed, as these are likely to represent errors in the genotyping process. Details of scoring parameters used and error rates for each species and primer combination are given in Appendix Table II.2.
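The threshold-selection rule and the final locus filter can be summarised in code. This is a minimal sketch in Python and is not part of AFLPScore: `ScoringResult`, `choose_scoring` and `drop_singleton_loci` are hypothetical names, and the error rate is assumed to be expressed as a proportion between 0 and 1.

```python
from typing import List, NamedTuple, Optional


class ScoringResult(NamedTuple):
    label: str          # hypothetical identifier for one parameter combination
    loci_retained: int
    error_rate: float   # mismatch rate between repeated samples, as a proportion


def choose_scoring(results: List[ScoringResult]) -> Optional[ScoringResult]:
    """Apply the two-tier rule: most loci at <= 5% error, else lowest error < 10%."""
    acceptable = [r for r in results if r.error_rate <= 0.05]
    if acceptable:
        return max(acceptable, key=lambda r: r.loci_retained)
    fallback = [r for r in results if r.error_rate < 0.10]
    if fallback:
        return min(fallback, key=lambda r: r.error_rate)
    return None  # dataset excluded from further analysis


def drop_singleton_loci(genotypes: List[List[int]]) -> List[List[int]]:
    """Remove loci whose band is present in, or absent from, exactly one individual.

    Rows are individuals, columns are loci, values are 0 (absent) or 1 (present).
    """
    n = len(genotypes)
    if n == 0:
        return []
    keep = []
    for j in range(len(genotypes[0])):
        present = sum(row[j] for row in genotypes)
        if present not in (1, n - 1):
            keep.append(j)
    return [[row[j] for j in keep] for row in genotypes]
```

Returning `None` mirrors the exclusion of a species/primer dataset for which no parameter combination gave an error rate below 10%.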

Finalising AFLP datasets

For all species, the populations used as units for analysis were defined by distance: all samples collected within 1.5 km of one another were treated as a single population.
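One way to apply this distance rule is sketched below in Python. The sketch assumes a single-linkage reading of the rule, in which samples are joined into one population whenever a chain of samples no more than 1.5 km apart connects them; the text does not say how chains of nearby samples were handled, so this is an assumption. Great-circle distances use the haversine formula with a mean Earth radius of 6371 km, and all function names are hypothetical.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_KM = 6371.0
MAX_DISTANCE_KM = 1.5


def haversine_km(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance between two (latitude, longitude) points in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def assign_populations(coords: List[Tuple[float, float]]) -> List[int]:
    """Return a population label for each sample (single-linkage grouping)."""
    n = len(coords)
    parent = list(range(n))

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if haversine_km(coords[i], coords[j]) <= MAX_DISTANCE_KM:
                parent[find(j)] = find(i)

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]
```

Under a stricter reading, in which every pair of samples in a population must be within 1.5 km, some chained groups would be split, so the choice of linkage can change the population count for linearly arranged samples.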

Before carrying out any analyses, the AFLP datasets for each species were checked for outlier loci potentially under the influence of selection. Outlier loci are those that have much higher or lower Fst values than expected from the overall distribution of locus-specific Fst values, taking into account the heterozygosity of each locus. Loci with higher Fst than expected are potential evidence of divergent selection, whereas loci with lower Fst than expected are potential evidence of balancing selection (Beaumont and Nichols 1996; Beaumont and Balding 2004). Outlier loci were identified using the software DfDist (available from http://www.rubic.rdg.ac.uk/~mab/stuff/; modified to allow dominant data from Beaumont and Balding 2004), which calculates p-values representing the likelihood of selection influencing each locus after simulating Fst-heterozygosity distributions. Loci with p-values less than 0.005 for either balancing or divergent selection were excluded from further analyses in order to calculate Fst without bias from loci under selection. Over all species, only 26 loci were excluded.

Some final AFLP datasets included populations with only one or a few individuals due to difficulties with collecting in the field or with AFLP genotyping. To deal with this problem, all analyses were carried out both on the full dataset including all populations and on a reduced dataset that excluded populations with fewer than three individuals. In the case of Lepanthopsis floripecten, which had only one population with three or more samples, the reduced dataset only excluded populations with one individual.

Analysing Fst patterns

Except where stated otherwise, all analyses were carried out using R v. 2.8.1 (R Development Core Team 2008).

Fst values were estimated for each species using Arlequin v. 3.5 (Excoffier and Lischer 2010), as Φst values (Excoffier et al. 1992), for which the between- and within-group variances are calculated using analysis of molecular variance (AMOVA) of genetic distances between sample haplotypes. The significance of each Fst value was tested through permutation of the original haplotype table. Negative values of Fst were replaced with 0 for further analyses (Long 1986). It was also noted for each species whether any pair-wise Fst values between populations were greater than 0.2, which corresponds to a migration rate (Nm) of 1 individual per generation using the formula Fst ≈ 1/(4Nm+1) and indicates gene flow reduced enough to allow neutral divergence of populations (Wright 1931; Slatkin 1985). Because the geographic arrangement of population samples varied between species, the relationship between Fst and geographic distance was investigated. Matrices of pair-wise Fst values between all populations for each species were generated in Arlequin based on pair-wise differences between haplotypes. Matrices of pair-wise geographic distances were generated with the AFLPdat package in R (Ehrich 2006) based on the latitude and longitude for each population. For each species, the significance of the relationship between Fst and geographic distance was tested using a Mantel test with p-value calculated from 1000 simulated permutations of the original matrices, using the ade4 package in R (Dray et al. 2007). In addition, the average pair-wise Fst between populations separated by 50 km or less (Fst<50) was calculated for each species to give a measure of Fst calculated at a constant geographical scale.
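Two of the calculations above are simple enough to sketch directly. The Python below is an illustration only and does not reproduce the Arlequin or ade4 implementations: `nm_from_fst` inverts Fst ≈ 1/(4Nm+1) to give Nm = (1/Fst - 1)/4, and `mantel_test` is a basic permutation Mantel test on the Pearson correlation between the upper triangles of two symmetric matrices. The one-sided alternative (correlation greater than expected by chance), the fixed random seed and the function names are assumptions.

```python
import math
import random
from typing import List, Tuple

Matrix = List[List[float]]


def nm_from_fst(fst: float) -> float:
    """Invert Fst ~ 1/(4Nm + 1) to get the migration rate Nm."""
    if not 0 < fst <= 1:
        raise ValueError("Fst must be in (0, 1] for this conversion")
    return (1.0 / fst - 1.0) / 4.0


def upper_triangle(m: Matrix) -> List[float]:
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x: List[float], y: List[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant values")
    return sxy / math.sqrt(sxx * syy)


def mantel_test(fst: Matrix, dist: Matrix, n_perm: int = 1000,
                seed: int = 0) -> Tuple[float, float]:
    """Return (observed r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    n = len(fst)
    d = upper_triangle(dist)
    r_obs = pearson(upper_triangle(fst), d)
    count = 0
    for _ in range(n_perm):
        # Permute population labels: rows and columns are shuffled together.
        order = list(range(n))
        rng.shuffle(order)
        permuted = [[fst[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), d) >= r_obs - 1e-12:
            count += 1
    return r_obs, (count + 1) / (n_perm + 1)
```

For example, `nm_from_fst(0.2)` gives 1, matching the threshold used here. Note that the correlation is undefined when every pair-wise Fst is identical (for instance all zero), which is why the sketch raises an error in that case rather than returning a value.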

Heritability was estimated for overall Fst by calculating the phylogenetic signal using the lambda measure (Pagel 1999). Lambda varies from 0 to 1, where a value of 0 means a trait evolves independently of the phylogenetic tree (is not heritable along lineages), and a value of 1 means trait values are entirely determined by the tree (are completely heritable). The maximum likelihood value of lambda was calculated using the CAIC package in R (Orme et al. 2008) and the matK species tree described earlier. Likelihood ratio tests were used to test whether lambda was significantly different from 0 or 1, by computing the likelihood ratio between a model optimising lambda and a model fixing lambda to 0 or 1, respectively, and computing the p-value using a chi-squared distribution (Freckleton et al. 2002).
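The likelihood ratio test described above can be sketched as follows. This Python fragment is an illustration, not the CAIC implementation; it assumes one degree of freedom (the single parameter lambda is fixed in the restricted model) and takes the two maximised log-likelihoods as inputs. The function names are hypothetical.

```python
import math
from typing import Tuple


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-squared distribution with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


def likelihood_ratio_test(loglik_free: float, loglik_fixed: float) -> Tuple[float, float]:
    """Compare a model with lambda optimised against one with lambda fixed.

    Returns the test statistic 2 * (lnL_free - lnL_fixed) and its p-value.
    """
    stat = 2.0 * (loglik_free - loglik_fixed)
    stat = max(stat, 0.0)  # guard against small negative values from optimisation error
    return stat, chi2_sf_1df(stat)
```

When the optimised value of lambda lies essentially at the fixed value, the two log-likelihoods coincide, the statistic is 0 and the p-value is 1.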

The effect of gene flow on diversification was tested using a two-tailed Wilcoxon signed rank test comparing mean overall Fst values between sister clades.
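For a small number of sister-clade pairs, the signed rank test can be computed exactly by enumerating every possible assignment of signs to the ranked differences. The Python sketch below illustrates this; it is not the R routine used in the analyses, and it assumes that zero differences are dropped and tied absolute differences receive average ranks. The function names are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank_exact(x: Sequence[float],
                               y: Sequence[float]) -> Tuple[float, float]:
    """Return (W, two-tailed exact p), where W = min(W+, W-)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    ranks = average_ranks([abs(d) for d in diffs])
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_obs = min(w_plus, total - w_plus)
    # Enumerate all 2^n sign assignments under the null hypothesis.
    count = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= w_obs + 1e-12:
            count += 1
    return w_obs, count / 2 ** n
```

With five pairs and no ties there are only 32 sign assignments, so the smallest two-tailed p-value such an exact test can return is 2/32 = 0.0625, reached when all five differences have the same sign.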

Testing the influences of species range size and ecology

Measures of Fst were tested for associations with species range size, using two measures of range size. First, overall Fst values and whether any pair-wise Fst was greater than 0.2 were compared between restricted and widespread species using two-tailed Wilcoxon rank sum tests. “Restricted” and “widespread” were defined as described in the introduction. Second, overall Fst was regressed against species range size quantified as the number of TDWG Level 2 Regions (http://www.tdwg.org/standards/109/) for which species occurrence was noted in the World Checklist of Orchidaceae (Govaerts et al. 2007).

The effect of the interaction between range size and Fst on diversification rate was tested using a two-tailed Wilcoxon signed rank test comparing mean Fst values of widespread species between sister clades. Fst values from restricted species were not compared between sister clades because only one restricted species from a species-poor clade was collected.

A range of ecological variables was tested to see if they could explain variation between species in Fst values: branch circumference, elevation range and habitat range. The circumference of the branch on which each sample plant was growing was measured in the field at the time of collection. Branch circumference was included because it is a proxy for many important aspects of epiphyte niche: smaller branches are associated with shorter lifespans, lower water availability and higher light availability (Chase 1988; Gravendeel et al. 2004). Species elevation ranges and habitat descriptions were taken from the species descriptions in the MPCR. The mean and variance of branch circumference for each species were regressed against overall Fst. The minimum, maximum and range ( = maximum - minimum) of elevation for each species were also regressed against overall Fst, as was the number of habitats occupied. Variance in branch circumference, elevation range and number of habitats were included as proxies of ecological specificity/niche breadth. All these variables were also compared between differentiated (any pair-wise Fst > 0.2) and non-differentiated species using two-tailed Wilcoxon rank sum tests. In both cases, Bonferroni correction was used to correct for multiple tests. Because each measure of Fst was tested for association with 6 ecological variables, only p-values smaller than 0.0083 (0.05/6) were considered significant. The same ecological variables were tested for associations with species range size measured as number of regions using regressions, and with species range as widespread/restricted using ANOVAs. Elevation and habitat ranges of study species are given in Table 3.2. Branch circumference mean and variance for each species are given in Appendix Table II.3.
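The Bonferroni threshold used here is simply the nominal significance level divided by the number of tests. A minimal Python sketch follows; the variable names and the p-values in the usage example are hypothetical placeholders, not results from this study.

```python
from typing import Dict


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def significant_after_bonferroni(p_values: Dict[str, float],
                                 alpha: float = 0.05) -> Dict[str, bool]:
    """Flag each test whose p-value falls below alpha / number of tests."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return {name: p < threshold for name, p in p_values.items()}


# Hypothetical p-values for the six ecological variables, for illustration only.
example = {
    "branch circumference mean": 0.40,
    "branch circumference variance": 0.09,
    "elevation minimum": 0.03,
    "elevation maximum": 0.02,
    "elevation range": 0.004,
    "number of habitats": 0.20,
}
flags = significant_after_bonferroni(example)
```

With six tests the threshold is 0.05/6, or about 0.0083, so in this illustration only the hypothetical value of 0.004 would be flagged.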

Mean species range size, elevation range and number of habitats were compared between sister clades using two-tailed Wilcoxon rank sum tests to investigate whether differences in range size or ecology could explain differences between clades in diversification. For these tests, range sizes for all species in all genera in each clade were taken from the World Checklist of Orchidaceae (Govaerts et al. 2007) as described above. Species elevation and habitat ranges were compiled for all species native to Costa Rica from all genera in each clade from the MPCR. Elevation and habitat data were compiled only for species native to Costa Rica as this way all data came from a single source and were comparable. Mean range sizes, elevation ranges and habitat ranges of study clades are given in Appendix Table II.4.

Results

A total of 1341 plants were sampled from 26 consistently identifiable species in 5 pairs of sister genera; 647 of these samples from 17 species in the 5 sister clade pairs were successfully genotyped.

Overall Fst is variable between species, ranging from zero, indicating no differentiation between populations, to a maximum value of 0.358, indicating clear differentiation between populations. Similarly, Fst between populations less than 50 km apart (Fst<50) ranges from zero to 0.431, and the two measures are strongly correlated (p < 0.005 for both datasets). The relationship between Fst and geographic distance is not significant for any species, and approaches significance for only one (E. laucheanum, p = 0.0509; other p-values ranging from 0.11 to 1). Plots of Fst against geographic distance for all species are given in Figures 3.5 and 3.6. Because there is no evidence for a relationship between Fst and geographic distance, and overall Fst and Fst<50 are strongly correlated, Fst<50 is not used in further analyses. Fst values for all species are given in Table 3.5.


Table 3.5. Fst values for all study species. Significance of overall Fst is indicated by stars: *, p < 0.05; **, p < 0.005; ***, p < 0.0005. Datasets with a dash for Fst<50 had no populations separated by less than 50 km.

Species                             overall Fst   any pair-wise Fst > 0.2   Fst<50
M. nidifica - full dataset          0.36 ***      yes                       0.43
M. nidifica - reduced dataset       0.35 ***      yes                       -
M. rafaeliana                       0             no                        0
T. triglochin                       0.063 **      no                        0.063
L. ciliisepala                      0.027         no                        0.027
L. elata                            0.072 ***     no                        0.072
L. floripecten - full dataset       0.14          yes                       0.43
L. floripecten - reduced dataset    0             no                        0
P. propinqua - full dataset         0.17 ***      yes                       0
P. propinqua - reduced dataset      0.12 ***      no                        -
P. stenostachya                     0.24 ***      yes                       0.16
D. odontostele                      0.18 *        yes                       0.037
S. jimenezii                        0             no                        0.00084
S. fusiformis                       0.24 ***      yes                       0.14
J. aporophylla - full dataset       0             no                        0
J. aporophylla - reduced dataset    0             no                        0
J. teretifolia - full dataset       0             no                        0.0012
J. teretifolia - reduced dataset    0.013         no                        0.01
E. exasperatum                      0.13 **       no                        0.14
E. laucheanum                       0.045         no                        0.038
E. vulgoamparoanum                  0.019         no                        0.027
B. nodosa - full dataset            0.059 **      no                        0.057
B. nodosa - reduced dataset         0.054 **      no                        0.053


Figure 3.5. Relationship between Fst and distance for all study species for the full dataset. Data are presented by sister clade pair. Species from the species-rich clade in each case are shown in black; species from the species-poor clade are shown in red. Restricted species are shown with open symbols and widespread species with filled symbols. The Epidendrum clade includes two restricted species: E. exasperatum is represented by open circles, whereas E. vulgoamparoanum is represented by open triangles.


Figure 3.6. Relationship between Fst and distance for all study species for the reduced dataset. All plotting symbols follow the conventions described for Figure 3.5, but the vertical scale here is smaller.


Overall Fst is not heritable over the clades examined here. The maximum likelihood value of lambda for overall Fst for both datasets is 6.61 × 10^-5 and is not significantly different from 0 (for both datasets, p value for difference from 0 = 1; p value for difference from 1 < 0.001). Figure 3.7 shows how Fst maps on to the species tree (data from the full dataset; the equivalent tree with Fst values from the reduced dataset is given in Appendix Figure II.1).

There is no support for my hypothesis that gene flow is associated with diversification rate. There is no significant difference in mean overall Fst between species-rich and species-poor sister clades (full dataset, p = 0.313, reduced dataset, p = 0.125).

Instead, there is a suggestion of an inverse association between the level of gene flow and species range size. Overall Fst is higher for widespread species than restricted species, although the difference is only significant for the full dataset (full dataset, p = 0.0495, reduced dataset, p = 0.0951), and there is no significant relationship between overall Fst and species range size measured as number of regions (full dataset, p = 0.862, reduced dataset, p = 0.955). Furthermore, no difference exists between widespread and restricted species in whether any pair-wise Fst is greater than 0.2 (full dataset, p = 0.158, reduced dataset, p = 0.0734).

There is also a hint of an interaction between species range size and the level of gene flow in determining diversification rates. In both datasets, the only species with overall Fst over 0.2, indicating gene flow reduced enough to allow independent evolution of populations, are widespread species from large clades (Masdevallia nidifica, Platystele stenostachya and Scaphyglottis fusiformis). This association can be seen in Figure 3.7, which shows both range size and Fst values mapped on to the species tree. However, there are an equal number of widespread species from large clades with low Fst and there is no significant difference between the overall Fst of widespread species from small and large sister clades (full dataset, p = 0.438, reduced dataset, p = 0.125).

Figure 3.7. Associations of Fst and species range size with species phylogeny (data from the full dataset). Circles at branch tips are sized proportionally to the log of species range size measured as number of regions and shaded according to Fst. Species names are abbreviated by the first letter of genus and species.


Some variation in Fst between species can be explained by species ecology. Overall Fst shows no significant correlation with circumference of branches on which sampled plants occurred, species elevation range or number of habitats per species (p-values ranging from 0.0838 to 0.913). However, for the full dataset, whether any pair-wise Fst is greater than 0.2 is significantly associated with both decreased elevation range (p = 0.0021; Figure 3.8a) and decreased number of habitats (p = 0.0043; Figure 3.8b).

Figure 3.8. Relationships between species elevation range and number of habitats and whether any pair-wise Fst value is over 0.2, for the full dataset.

There is no association between species range size and ecology, so neither can explain the other’s association with Fst. Among the study species, branch circumference, elevation range and number of habitats do not differ between widespread and restricted species (p-values between 0.263 and 0.971), nor do they correlate with range size measured as number of regions (p-values between 0.264 and 0.842). In the wider dataset of range size and ecology data compiled from the World Checklist of Orchidaceae and the MPCR, range size measured as number of regions is positively associated with both elevation range (p = 0.038; Figure 3.9a) and with number of habitats occupied (p = 0.029; Figure 3.9b), but in both cases the correlation is weak (elevation range, r² = 0.0091; number of habitats, r² = 0.01).

Figure 3.9. Associations between species range size and elevation range and number of habitats over all species native to Costa Rica from the study clades. Shading of circles is proportional to the number of species with each combination of trait values.

Neither range size nor species ecology shows any association with clade diversification. Mean range size does not differ between large and small sister clades (p = 0.1); neither do mean elevation range (p = 0.625) or the mean number of habitats occupied (p = 0.125).


Discussion

Findings

There is no support for the hypothesis that the level of gene flow directly controls rates of diversification. Across the species studied here, the level of gene flow is not heritable and does not differ between species-poor and species-rich sister clades. However, there is weak evidence for a relationship between the level of gene flow, species range size and diversification rates: gene flow low enough to allow high population differentiation is observed only in widespread species in large clades. Some variation among species in the level of gene flow can be explained by species ecology, but this does not contribute to the associations between gene flow and range size or gene flow and diversification.

The most conservative interpretation of these results is that the level of gene flow within species does not affect clade diversification rates. Even though gene flow is a major factor limiting population divergence or speciation, there are a number of situations in which it would not be expected to affect diversification rates. First, speciation rates may be limited more by other steps in the speciation process. For these orchids, for instance, speciation may be most limited by the rate at which species colonise new regions that are separated by barriers from the ancestral range or by the diversity of potential pollinators available to drive divergent selection and local adaptation. Second, variation in diversification rates may be driven mainly by extinction rather than speciation. This is especially likely if new species tend to have small ranges or population sizes, for example as predicted by point mutation (Rosindell et al. 2010) or peripatric (Mayr 1982) models of speciation.

There is also some indication of a more complex scenario, where the level of gene flow affects rates of speciation but changes through the lifetime of a species, making it difficult to detect a link. Unlike organismal traits such as body size, the level of gene flow within species is an emergent trait at the species level (Jablonski 2008) and likely to be changed by the speciation process in a manner analogous to species range size. This is because speciation is likely to divide the parent species range where population differentiation is highest, so that most speciation events result in daughter species with decreased average population differentiation as well as decreased range size compared to the parent. As a result, younger species should tend to have both smaller ranges and less differentiation/greater gene flow between populations, although the relationship between species age, range size and gene flow will depend on how quickly ranges expand and differentiation develops after speciation (Pigot et al. in press). In this case, clades with a tendency for less gene flow might speciate more rapidly, but new species would always show reduced population differentiation, weakening any observed relationship between species gene flow and clade diversity. Two results are consistent with this scenario: the lack of heritability for the level of gene flow and the finding that well-differentiated populations only occur in widespread species from large clades, assuming that widespread species tend to be older than restricted species. However, if this scenario were the main explanation for variation within clades in level of gene flow, then widespread species with low Fst would be expected only if they were young species that had recently expanded their range. To my knowledge, the relationship between species age and population differentiation has not been tested but would be worth exploring.

It is interesting that no significant relationship between genetic differentiation and geographic distance was found for any species, as such isolation by distance relationships are expected unless dispersal ability is exceptionally high or low (Peterson and Denno 1998) or gene flow is not at equilibrium with genetic drift (Hutchison and Templeton 1999). Orchids are known to have a high propensity for long-distance dispersal as a result of their abundant, tiny seeds (Arditti and Ghani 2000), so a lack of isolation by distance in orchid species should be associated with low population differentiation (Peterson and Denno 1998); the species with high values of Fst here are therefore anomalous. This suggests that population genetics in these highly differentiated species are dominated by genetic drift within populations (Hutchison and Templeton 1999). One way this could happen is through founder-effect drift (Mayr 1963), in the context of the “everything is everywhere” hypothesis for microorganisms (Baas-Becking 1934; Finlay 2002). According to this hypothesis, the distributions of microorganisms are limited not by colonisation ability, but by habitat requirements, because of their great abundance and long-distance dispersal ability (Finlay 2002). As a result, new populations can be founded by a mix of colonists from throughout the original species range. If dispersal rates are high, the entire species range will stay homogeneous, as in any model of gene flow. However, if dispersal rates are low, populations should be randomly differentiated as a result of receiving sets of colonists from different parts of the species range and there should be no pattern of isolation by distance (Fontaneto et al. 2008). In theory, orchids might also follow this model, as their seeds are on the same scale as microorganisms covered by the “everything is everywhere” hypothesis (< 2 mm), and single seed pods can contain millions of seeds (Arditti and Ghani 2000). The data presented here are not sufficient to test this possibility, but it would be worthwhile to investigate it further by testing for isolation by distance using data from more individuals and more populations throughout the full extent of an orchid species range.

Study limitations

An important limitation of this study is the use of current diversities of clades to quantify diversification rates, as it is possible that differences in sister clade diversity are instead the result of differences in diversity limits (Rabosky 2009a). Evidence is building in the field of diversity patterns that much variation in clade richness (at least at the family scale) is not the result of variation in diversification rates, but in limits to the number of related species that can coexist in a region (Appendix I; Rabosky 2009a; Vamosi and Vamosi 2010). This possibility cannot be ruled out here, but further studies using younger sister clades or estimating diversification rates over an entire phylogenetic tree (and thus able to test explicitly for diversity-dependent diversification) would avoid this problem.

Another possible reason for the lack of evidence in favour of the original hypothesis is the small number of species used. In particular, the ability of this study to explore the interaction of range size with gene flow in affecting diversification was limited because only one restricted species from a species-poor sister clade was sampled. It is also possible that the species used here are not adequate representatives of their clades and that a different or larger set of species would have shown a different pattern. Even so, the data here are a clear indication that the level of gene flow is not consistent within clades.

Largely because most study species have patchy distributions, sampling within species was not optimal. Some species could only be found in a restricted region, precluding between-species comparisons of long-distance patterns of differentiation, and few species had consistently large populations, leading to generally low sample sizes. In addition, difficulties with AFLP genotyping reduced sample sizes for some species even further. However, Fst values were tested for significance and only one Fst value over 0.05 (indicating at least moderate population differentiation; Freeland 2005) was found to be not significant (Lepanthopsis floripecten, full dataset), even though there were a number of relatively poorly sampled species with high Fst. This suggests that despite sub-optimal sampling, the Fst values reported here reflect real patterns of differentiation in the study species.

Additional ecological data would also be helpful for better understanding diversification patterns in these orchids. Data regarding pollinators and mycorrhizal fungi, neither of which is well known, would be particularly useful, as both could be major drivers of orchid population structure. The pollen dispersal potential of different insects known to pollinate orchids, for example, gnats versus hawkmoths, varies greatly. Additionally, all orchid seeds require a mycorrhizal fungus partner in order to germinate (Benzing 1987; Rasmussen 1995), and so orchid distributions should be limited by their mycorrhizal specificity and the distributions of their mycorrhizal partners (Swarts et al. 2010). However, even anecdotal identification of pollinators is unavailable for most tropical orchid species and even less is known about their mycorrhizal associations.

Among the species studied here, only broad generalisations are known regarding pollinators: Brassavola species are pollinated by sphingid moths; Lepanthes probably by fungus gnats although pleurothallid orchids in general (including Lepanthes, Lepanthopsis, Masdevallia, Trisetella, Platystele and Dryadella) tend to be fly-pollinated; and Epidendrum species tend to be pollinated by Lepidoptera although there is a lot of variability in the group (Pijl and Dodson 1966; Cingel 2001). Mycorrhizae of most tropical orchids have not been studied at all and the few data that exist are not enough to make any conclusions regarding mycorrhizal specificity. For the study genera, the only indications in this direction are that multiple Epidendrum species tend to associate with the same fungal genus, Epulorhiza (Zettler et al. 1998; Nogueira et al. 2005; Zettler et al. 2007; Pereira et al. 2009); and that in Masdevallia, for which mycorrhizal associations were investigated using roots sampled from the same plants used in this study, the widespread species M. nidifica appears to be more specific in mycorrhizal associations than the narrow endemic M. rafaeliana (Renshaw 2010).

Ideas for future work

The difficulties encountered in this study would be most easily avoided by carrying out a similar study using a better-known taxon or using data drawn from the published literature. For better-studied taxa, such as temperate plants or birds, much more detailed ecological data would be available to provide context and more previous collection data would be available to help guide sampling. With a meta-analysis of published data, it would be easier to collect data for many species. That said, there are some obvious patterns in the orchid data that would be important to compare to patterns from other taxa. Tropical orchids have many unusual characteristics, including their tiny seeds, complex relationships with pollinators and mycorrhizae and incredibly patchy distributions, but it is unclear whether this means that they should also be unique in the context of diversification.

Conclusion

The results presented here are inconclusive. It appears that the average level of gene flow of a clade does not directly affect its diversification rate, but there is weak support for a more complex scenario in which gene flow within species affects lineage rates of diversification but changes through species’ lifetimes. Furthermore, no effect of species ecology on the level of gene flow was found, but no data were available to test the effects of pollinators and mycorrhizae, which should have strong effects on gene flow.

More data are needed to resolve these results. In addition, the lack of a strong effect of gene flow on diversification rates does not preclude the existence of an effect of genetic drift or local adaptation on diversification. I test this possibility in chapter four.


Chapter 4. The effects of genetic drift and local adaptation on clade diversification rates in Costa Rican orchids

Introduction

Just as gene flow is expected to affect speciation rates because of its role in limiting population divergence, genetic drift and local adaptation could be expected to affect speciation rates because of their role in generating population divergence (Grant 1981). In chapter three I showed that, for the set of orchid clades studied, there is no direct link between the level of gene flow within species and the diversification rates of clades. However, this does not rule out the possibility that levels of genetic drift or local adaptation within species are traits that affect clade diversification.

Genetic drift, though once given a key role in models of adaptation (Wright 1931,

1982) and speciation (primarily in a set of related models involving small founder

populations, summarised in Carson and Templeton 1984), is now generally thought to

have only a minor role relative to selection (Sobel et al. 2010). Selection and gene flow

are thought to drive most speciation events, as they are expected to act much more

effectively than drift (Coyne et al. 1997, 2000; Gavrilets 2003). Furthermore, there is no

empirical support for a strong general role of drift in speciation (Rice and Hostert 1993;

Coyne et al. 1997, 2000). However, there are special cases in which theoretical models

suggest that genetic drift may have an important role in driving divergence (Gavrilets

2003): when it acts in combination with selection, and in taxa with characteristics

producing small effective population sizes (including low population density, patchy

distributions and ephemeral populations). Epiphytic orchids may be one of the exceptional taxa strongly affected by genetic drift. Small population sizes and patchy distributions are common traits of epiphytic orchids, as is high variance in reproductive success, which also reduces effective population size (Tremblay 1997; Tremblay et al. 2005). Furthermore, there is evidence that drift drives floral colour variation in some orchid species (Tremblay et al. 2005).

Selection is central to most models of speciation, as it is believed to be the most important process driving population differentiation and divergence (Schluter 2000, 2009;

Sobel et al. 2010). Unlike gene flow and drift, its role in driving diversification rates has been tested in a number of studies. Barraclough et al. (1995) found evidence for a positive link between the strength of sexual selection (indicated by the magnitude of colour differences between sexes) and diversification rate of birds. Seddon et al. (2008) confirmed this result for birds for both song and colour, and Stuart-Fox and Owens (2003) found similar results for agamid lizards. However, all of these studies focused on the special case of sexual selection and used indirect surrogates of the likely strength of selection rather than direct measures. There is a lack of tests of the link between the degree of local adaptation and diversification rates, and of tests that use direct measures of selection derived from population-level data.

Here I estimate the level of genetic drift by measuring the amount of neutral genetic diversity within populations using AFLP data, and I estimate the level of local adaptation by measuring the level of divergence between populations in leaf shape.

Average genetic diversity within populations should be lower in species that have experienced increased genetic drift (Hedrick 2000). The level of morphological

divergence between populations, when compared to divergence in neutral genetic

markers, is a direct measure of divergent selection (Merilä and Crnokrak 2001). I measure

divergence in leaf shape using geometric morphometrics and the Pst metric. Leaf shape is

used because it is an ecologically relevant trait associated with plant water, temperature

and light relations (Takenaka 1994; Royer et al. 2005; Wang 2007) and easily measured

in the field. Pst is a phenotype-based analogue to Fst that measures the fraction of variation in phenotype that lies between populations out of the total amount of variation in the sampled individuals. As with Fst, a Pst value of 0 indicates no differentiation between populations, and the maximum Pst value of 1 indicates clear differentiation. Pst is also similar to Qst, a metric of divergence in genes for quantitative traits (Spitze 1993), except

that Pst is influenced by environmental effects (Leinonen et al. 2006). When Qst is greater

than Fst it indicates that morphological divergence between populations is greater than

that expected from drift and gene flow alone, making the difference between Qst and Fst a measure of local adaptation (Merilä and Crnokrak 2001), and the difference between Pst and Fst an estimate of local adaptation. I also analyse overall Pst independently as a

measure of total phenotypic divergence, which should be the result of the combined

effects of gene flow, drift and local adaptation on morphological variation.

I hypothesise that increased genetic drift and local adaptation drive greater

population divergence within species and, thus, higher speciation rates. I use the same

approach as in chapter three to test this hypothesis. First, I test whether the levels of

genetic drift and local adaptation are heritable across lineages. Then I test for associations

between the level of each process and diversification rates of clades, taking into account

possible confounding influences of species range size and ecology.

Methods

Study species and genetic data were the same as those used in chapter three. As in chapter three, analyses (with the exception of estimating Pst) were carried out both on the full dataset and on a reduced dataset excluding populations with fewer than three samples (fewer than two samples for Lepanthopsis floripecten). Except where stated otherwise, all analyses were carried out using R v. 2.8.1 (R Development Core Team 2008).

Genetic diversity within populations was estimated using Nei’s gene diversity, a simple measure of variation based on allele frequencies (Nei 1987). Nei’s gene diversity was calculated as

$$D = \frac{n}{n-1}\left[1 - \left(\mathrm{allelefreq}_1^{2} + \mathrm{allelefreq}_0^{2}\right)\right], \qquad \text{(Eq. 4.1)}$$

where n is the number of samples (Nei 1987), using the diversity function in the

AFLPdat package for R (Ehrich 2006). Gene diversity was only calculated for populations containing five or more samples to reduce noise from small sample sizes. I checked for a sample size bias using linear mixed effects models of genetic diversity versus sample size with species as a random effect, compared to null models lacking a sample size parameter. Because there was no evidence for a sample size bias (full dataset, p = 0.12; reduced dataset, p = 0.22), average gene diversity for each species was calculated simply as the mean over all populations.
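The calculation in Eq. 4.1 can be illustrated with a minimal sketch. It is not the AFLPdat code used for the analyses. It assumes that allelefreq_1 and allelefreq_0 are the frequencies of band presence and absence at a marker, and that the per-marker values are averaged over markers within a population. The function name nei_gene_diversity and the example scores are hypothetical.

```python
def nei_gene_diversity(bands):
    """Mean gene diversity over markers for one population.

    bands: list of samples, each a list of 0/1 AFLP scores (one per marker).
    Assumption: per-marker values of Eq. 4.1 are averaged over markers.
    """
    n = len(bands)
    if n < 2:
        raise ValueError("need at least two samples")
    n_markers = len(bands[0])
    total = 0.0
    for marker in range(n_markers):
        freq1 = sum(sample[marker] for sample in bands) / n  # band present
        freq0 = 1.0 - freq1                                  # band absent
        # Eq. 4.1 with the n / (n - 1) small-sample correction
        total += n / (n - 1) * (1.0 - (freq1 ** 2 + freq0 ** 2))
    return total / n_markers


# Hypothetical population: five samples scored at two markers.
# Marker 1 is polymorphic (3 of 5 present); marker 2 is fixed.
example = [[1, 1], [1, 1], [0, 1], [0, 1], [1, 1]]
diversity = nei_gene_diversity(example)
```

In this example the polymorphic marker contributes 5/4 × (1 − 0.52) = 0.6 and the fixed marker contributes 0, so the population value is their mean.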

To calculate Pst, leaf shapes were first recorded by photographing leaves from each sample at the time of collection. Leaf outlines were digitised by hand using the

Curves tool in TpsDig (available from the SUNY Stony Brook morphometrics webpage, http://life.bio.sunysb.edu/morph/index.html; Rohlf 2008). Full outlines were digitised for species with asymmetric leaves (Jacquiniella aporophylla) or leaves that are long, thin and rarely straight (Brassavola nodosa, Dryadella odontostele and Trisetella triglochin).

Otherwise only the left-hand side of the leaf outline was digitised. Whenever possible, two leaves per sample were digitised, and when a sample contained more than two leaves, the longest leaves were chosen. Examples of digitised leaves are shown in Figure 4.1.

Populations with only one sample were excluded from analysis, except in the case of

Lepanthopsis floripecten. The digitised leaf shapes were converted into scores for independent shape traits using eigenshape analysis (Lohmann 1983; MacLeod 1999) in

Mathematica v. 7 (Wolfram Research), with the morpho-tools notebooks kindly provided by Jonathan Krieger (Krieger and MacLeod pers. comm.). Full leaf outlines were analysed using the extended eigenshape method, with the leaf tip marked as a landmark; half outlines were analysed as simple curves, with no landmarks, using the standard eigenshape method. A different set of shape traits was generated for each species. After shape scores were generated, the scores from multiple leaves from the same sample were averaged in order to reduce noise from within-sample variability. Between and within population variance components for each shape trait for each species were then calculated from a linear mixed effects model with the shape trait as response variable and population as a random effect. Pst for each shape trait was calculated as

$$P_{\mathrm{st}} = \frac{v_b}{v_b + 2h^{2}v_w}, \qquad \text{(Eq. 4.2)}$$

where $v_b$ is the variance between populations, $v_w$ the variance within populations and $h^2$ the heritability of each trait (Leinonen et al. 2006). Because no data are available regarding the heritability of leaf shape in orchids, I conservatively assumed a heritability of 0.5 for all shape traits, meaning that half of the observed morphological variation results from environmental or nonadditive genetic effects (Leinonen et al. 2006). Overall

Pst for each species was calculated as the mean Pst over all shape traits. The difference between Pst and Fst was calculated by subtracting overall Fst for each species from overall

Pst.
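As an illustration of Eq. 4.2, the following sketch computes Pst for a single shape trait. It is a simplified stand-in, not the procedure used here: the variance components are estimated with one-way ANOVA method-of-moments estimators rather than the linear mixed effects model described above, and the function names and example scores are hypothetical.

```python
def variance_components(groups):
    """Between- and within-population variance for one trait.

    groups: list of populations, each a list of per-sample trait scores.
    Assumption: one-way ANOVA estimators stand in for the mixed model.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    # Effective group size, which allows unequal sample sizes
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)
    v_b = max(0.0, (ms_between - ms_within) / n0)  # truncate negatives at 0
    return v_b, ms_within


def pst(v_b, v_w, h2=0.5):
    """Eq. 4.2, with heritability h2 defaulting to the assumed 0.5."""
    denominator = v_b + 2.0 * h2 * v_w
    if denominator == 0:
        raise ValueError("Pst is undefined when there is no variation")
    return v_b / denominator


# Hypothetical shape scores for two populations of three samples each
v_b, v_w = variance_components([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
trait_pst = pst(v_b, v_w)
```

Under these assumptions, the species-level value would be the mean of trait_pst over all shape traits, and subtracting overall Fst from it gives the estimate of local adaptation.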

Figure 4.1. Examples of digitised leaf outlines. Red circles mark points whose coordinates were recorded. a) shows a Lepanthes ciliisepala sample; because leaves are symmetrical, only one side of each leaf was digitised. b) shows a sample of Jacquiniella aporophylla; because leaves are asymmetrical, whole leaf outlines were digitised.

Phylogenetic heritability was estimated for gene diversity, overall Pst and the difference between Pst and Fst as described in chapter three. The effects of genetic drift and local adaptation on diversification were tested using two-tailed Wilcoxon signed rank tests comparing mean trait values between sister clades. The mean gene diversity, overall

Pst and difference between Pst and Fst were calculated for each clade over all sampled species.
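The sister-clade comparison can be sketched as an exact two-tailed Wilcoxon signed rank test on the paired differences in clade means. The sketch below is illustrative rather than the R routine used for the analyses. It assumes no zero differences and no tied absolute differences, and the function name and example differences are hypothetical.

```python
from itertools import product


def signed_rank_exact_p(differences):
    """Exact two-tailed p-value for the Wilcoxon signed rank test.

    differences: paired differences (e.g. rich clade minus poor clade).
    Assumption: no zeros and no ties among the absolute differences.
    """
    if any(d == 0 for d in differences):
        raise ValueError("zero differences are not handled in this sketch")
    n = len(differences)
    # Rank the absolute differences from 1 (smallest) to n (largest)
    order = sorted(range(n), key=lambda i: abs(differences[i]))
    ranks = [0] * n
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    w_plus = sum(r for r, d in zip(ranks, differences) if d > 0)
    expected = n * (n + 1) / 4
    observed_deviation = abs(w_plus - expected)
    # Enumerate all 2^n sign assignments, equally likely under the null
    as_extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - expected) >= observed_deviation:
            as_extreme += 1
    return as_extreme / 2 ** n


# Hypothetical differences for five sister clade pairs
p_all_same_sign = signed_rank_exact_p([1, 2, 3, 4, 5])
p_one_reversal = signed_rank_exact_p([3, -1, 5, 2, 4])
```

Because only 2^n sign assignments exist, the smallest attainable two-tailed p-value is 2/2^n. With five pairs, for example, this is 0.0625, so a test on so few pairs cannot reach significance at the 0.05 level even when every pair differs in the same direction.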

Gene diversity, overall Pst and the difference between Pst and Fst were tested for association with species range size and ecology in the same way that measures of Fst were tested in chapter three. Association between all three measures and species range measured as number of regions was tested using linear regression, and the three measures were compared between restricted and widespread species using two-tailed Wilcoxon rank-sum tests. In addition, all three measures were tested for association with six ecological variables (mean branch circumference, variance in branch circumference, minimum elevation, maximum elevation, elevation range and number of habitats) using linear regression. For the tests involving ecological variables, Bonferroni correction was used to correct for multiple tests by considering only p-values less than 0.0083 (0.05/6) to be significant.

Results

Gene diversity within populations does not vary greatly between species. It ranges from 0.12 in Trisetella triglochin to 0.24 in Jacquiniella aporophylla (full dataset).

Overall Pst values vary from 0.015 in Trisetella triglochin to 0.106 in Lepanthopsis floripecten. The difference between Pst and Fst ranges from -0.33 to 0.11. Overall Pst is

greater than overall Fst, indicating local adaptation, for about one third of the study species, in all cases species with very low Fst values (< 0.03). Gene diversity, overall Pst and difference between Pst and Fst are listed for all species in Table 4.1.

The maximum likelihood value of lambda for gene diversity is 6.61 × 10⁻⁵ (p-value for difference from 0 = 1; p-value for difference from 1 < 0.005). The maximum likelihood value of lambda for overall Pst is 0.94 (p-value for difference from 0 = 0.15; p-value for difference from 1 = 0.39) and this result is robust to variation in branch lengths: the maximum likelihood value of lambda calculated with equal branch lengths is 0.27 (p-value for difference from 0 = 0.65; p-value for difference from 1 = 0.36). Unlike lambda for overall Pst, the maximum likelihood value of lambda for the difference between Pst and Fst is 6.61 × 10⁻⁵ (p-value for difference from 0 = 1; p-value for difference from 1 < 0.005). The association between all three traits with species phylogeny is shown in Figure 4.2 for the full dataset and Figure 4.3 for the reduced dataset.

Neither gene diversity nor overall Pst shows a significant association with diversification rate (gene diversity, full dataset, p = 0.813, reduced dataset, p = 0.625; overall Pst, p = 0.813). The relationship between the difference between Pst and Fst and diversification approaches significance, but only for the reduced dataset, and in the opposite direction to that expected: increased diversification is associated with smaller values of (Pst - Fst) (full dataset, p = 0.313, reduced dataset, p = 0.0625).

The only significant association between a species genetic trait and an ecological trait is between gene diversity and elevation range for the reduced dataset (p = 0.0081, r² = 0.34; Figure 4.4). Otherwise, gene diversity, overall Pst and the difference between Pst and Fst are not associated with any measure of species range size (p-values between 0.07 and 0.536) or species ecology (p-values between 0.0166 and 0.995).

Table 4.1. Measures of levels of genetic drift, local adaptation and overall phenotypic divergence for all study species. Species are separated according to sister clade pair.

Species                              Gene diversity   Overall Pst   Pst - Fst
M. nidifica - full dataset           0.221            0.024         -0.333
M. nidifica - reduced dataset        0.196            0.024         -0.326
M. rafaeliana                        0.23             0.018         0.018
T. triglochin                        0.124            0.015         -0.048
L. ciliisepala                       0.15             0.051         0.024
L. elata                             0.126            0.031         -0.041
L. floripecten - full dataset        0.219            0.106         -0.035
L. floripecten - reduced dataset     0.236            0.106         0.106
P. propinqua - full dataset          0.23             0.03          -0.144
P. propinqua - reduced dataset       0.233            0.03          -0.09
P. stenostachya                      0.173            0.031         -0.211
D. odontostele                       0.235            0.08          -0.102
S. fusiformis                        0.149            0.031         -0.206
S. jimenezii                         0.178            0.02          0.02
J. aporophylla - full dataset        0.239            0.016         0.016
J. aporophylla - reduced dataset     0.229            0.016         0.016
J. teretifolia - full dataset        0.175            0.023         0.023
J. teretifolia - reduced dataset     0.172            0.023         0.01
E. exasperatum                       0.177            0.035         -0.094
E. laucheanum                        0.217            0.042         -0.003
E. vulgoamparoanum                   0.229            0.059         0.04
B. nodosa - full dataset             0.176            0.038         -0.02
B. nodosa - reduced dataset          0.176            0.038         -0.016

Figure 4.2. Associations of (a) gene diversity, (b) overall Pst and (c) difference between Pst and Fst with species phylogeny for the full dataset. Trait values for species are represented by colour of circles at branch tips. Species names are abbreviated by the first letters of genus and species.

Figure 4.3. Associations of (a) gene diversity, (b) overall Pst and (c) difference between Pst and Fst with species phylogeny for the reduced dataset. Traits are represented and species names abbreviated as in Figure 4.2.

Figure 4.4. Relationship between gene diversity and species elevation range for the reduced dataset.

Discussion

Main findings

There is no evidence for the heritability of either the level of genetic drift or the level of local adaptation within species. Following from this finding, it is unsurprising that neither trait is associated with clade diversification rates. Both appear to be too labile to be considered as clade traits, at least at the taxonomic level investigated here.

In contrast, it is interesting that overall Pst has such a high heritability value, although there is insufficient power to say that this value is significantly different from 0.

This may mean that the total level of phenotypic divergence within species is a heritable clade trait, perhaps as a result of heritability in the variety of environments used by species or in species responses to selection. It would be worthwhile to test this possibility further with data from more species. However, like measures of drift and local adaptation, overall Pst does not show any association with clade diversification rates.

The simplest conclusion from these results is that there is no relationship between levels of local adaptation or genetic drift within species and diversification rates of higher clades. There are also a number of alternative possibilities.

First, population genetic characteristics of species may be meaningful predictors of diversification only at higher taxonomic levels, such as families, where differences between clades outweigh the variability within them or where bigger differences exist between clades. In this group of study clades, for instance, the measure of genetic drift did not vary greatly either within or between clades. This possibility could be tested by repeating this study with higher-level sister clades and data for more species.

Second, as discussed in chapter three, population genetic characteristics of species may correlate with diversification rate, but with the signal confounded by the tendency for the speciation process itself to change species population genetics. This does not seem likely for genetic drift, as it is measured within populations and will not be affected by populations being split between new daughter species. However, as local adaptation is a relative measure quantified by comparing populations, it should generally be decreased

by speciation in the same way as neutral genetic differentiation. If this were the case, an association between species range size and degree of local adaptation, as was found for gene flow in chapter three, would be expected. Although this association was not found, such a complex scenario cannot yet be ruled out and would be worth exploring by studying further the relationships between degree of local adaptation and species range size and age.

Finally, it is possible that all lineages do not contribute equally to clade diversification (i.e. there is imbalance even within the clades studied here) and that lineages that contribute more to diversification, through higher rates of speciation, differ in their population genetics from those that contribute less to diversification. This possibility could be addressed by measuring the association between population genetics and diversification rates using data for all species within a single clade.

Study limitations

The dataset used here is not perfect, and its imperfections illustrate the difficulties inherent in comparative population genetic studies. Sampling was the greatest problem. In this study there were four levels at which extensive sampling was important: number of sister clade pairs included, number of species included in each clade, number of populations included for each species and number of plants sampled from each population. Sampling at the level of individuals and populations is as important as sampling at the level of species and clades, in order to have confidence in both the population genetic data being compared and the outcome of the species comparisons themselves. Optimally, many more species and sister clade pairs would be included than were included here; at least five populations would be sampled for each species, extending across each species range; and genetic data would be available for ten individuals for each population. In reality, however, with limited resources for fieldwork and labwork, tradeoffs must be accepted. In further work, this problem would most easily be resolved with collaborative efforts such as the IntraBioDiv consortium (Gugerli et al. 2008) or with databases of population genetic data. Nevertheless, this study had sufficient sampling for a first test of the relationship between population genetics and diversification and could have found evidence for a strong direct association if it existed.

Another difficulty in this study was choosing appropriate metrics to compare between species and clades. Many metrics exist for measuring levels of genetic drift and local adaptation (Freeland 2005), but it is impossible to distil all complexities of either process into any single measure. The measures of genetic drift and local adaptation used here were chosen as the simplest available measures that also make the fewest assumptions in order to summarise the overall level of each process. It is unlikely that any other measures of drift or local adaptation would have given qualitatively different results.

Although it is unlikely that a different measure of local adaptation based on leaf shape would have given a qualitatively different result, it is possible that leaf shape is not representative of ecological divergence in these species and that calculating Pst for a different trait would have changed the results. It would have been particularly worthwhile to calculate Pst for floral traits. Orchid species are highly variable and often specialised in pollination traits, and often reproductively isolated from close relatives by pollination barriers alone (Schiestl and Schluter 2009). For these reasons, pollinator interactions are

thought to be important in driving orchid speciation (Cozzolino and Widmer 2005; Schiestl 2005), and variation in floral traits, which mediate pollinator interactions, might be expected to correlate with speciation rates. In this study, analysing floral traits was not possible because few flowering individuals were found in the field. However, future studies using cultivated plants or species that flower more frequently in the field might be able to address this.

Conclusion

Notwithstanding the obstacles discussed above, this study shows that comparative population genetics can be used to address macroevolutionary questions. Comparative studies so far have focused mainly on exploring the range of natural variation in population genetics and testing how much of this variation can be explained by species traits (Loveless and Hamrick 1984; Hamrick and Godt 1996; Morjan and Rieseberg

2004). As population genetic data become easier to generate with the advance of molecular technology, and as population genetic data accumulate, potential applications for such data increase. Rather than exploring only factors that control population genetics, it is now possible to study the extent to which population genetics control other processes.

In this study, no association was found between population genetics and diversification, but considering the potential complexities of the relationship between the two and the central role of population genetics in speciation theory, this is a research area worth exploring further.

125 Chapter 5. Conclusion

Chapter 5. Conclusion

In this thesis, I aimed to test a proposed framework in which the spatial scale of speciation, set by population genetic characteristics of taxa, interacts with region area to determine diversification rates.

Summary of results

In chapter two, I surveyed speciation events in angiosperms, bats, birds, carnivorous mammals, ferns, lizards, Macrolepidoptera and snails on isolated oceanic islands to measure taxonomic variation in the spatial scale of speciation. Using the minimum island area with evidence of in situ speciation as an estimate of the spatial scale of speciation, I found great variation between taxa: at one extreme, I found evidence for snail speciation even on Nihoa, which is only 0.8 km²; at the other extreme, I found no evidence for carnivore speciation on any island smaller than Madagascar (587,713 km²). I also found that the probability of speciation within an island increases with island area for all taxa except ferns, supporting a strong role of area in limiting speciation and diversification. Finally, using data from a survey of the population genetic literature, I found that minimum island areas for speciation are strongly correlated across taxa with their average level of gene flow, supporting a link between the population genetic characteristics of taxa and the spatial scale of speciation.

In chapters three and four, I generated population genetic data for orchid species from sister clades differing in species richness to test the link between population genetic

characteristics of taxa and their rates of diversification. I investigated the effects of three population genetic processes: gene flow, quantified using Fst, a measure of between-population differentiation in neutral genetic markers; genetic drift, quantified using gene diversity, a measure of genetic diversity within populations; and local adaptation, quantified using the difference between Fst and Pst, a measure of between-population differentiation in phenotypic traits. I found no evidence for a direct link between any of these three processes and rates of diversification. However, I did find some support for a link between gene flow and diversification rate mediated by species range size: all species with great population differentiation were widespread species from the more species-rich sister clade.

These results provide partial support for the framework I originally proposed.

Although the spatial scale of speciation is clearly linked to both gene flow and the probability of speciation within a region, there is no support for a link between the diversification of taxa and their population genetic characteristics (which I expected would affect diversification via their effect on the spatial scale of speciation). This may mean that the spatial scale of speciation is important in the context of individual speciation events, but does not affect overall diversification rates of clades. However, the results of the orchid study are not conclusive: different results might come from choosing sister clades at a different taxonomic scale or from including data for more species. Thus, the relationship between the spatial scale of speciation and diversification rates of taxa requires further study.

Directions for future work

Following on the results presented here, the most interesting question to address next is whether a direct relationship exists between the spatial scale of speciation and diversification rates. This could be tested by comparing minimum areas for speciation across sister groups or across clades with known phylogenies, while also accounting for clade range size or the amount of area available to each clade. If a strong relationship exists, groups with smaller minimum areas for speciation should contain more species.

More broadly, there is great potential for improving our understanding of diversification and diversity patterns by considering the spatial scale of speciation and population genetic characteristics of taxa in macroevolutionary studies. Comparative population genetics is a challenging approach, due to the difficulty of assembling comparable and high quality data across a suite of species. Nevertheless, as I have demonstrated throughout this thesis, it can be used effectively for investigating macroevolutionary questions, and it could be applied to a range of macroevolutionary topics. Measuring the spatial scale of speciation of taxa is also difficult, unless well-bounded regions (such as the oceanic islands I used in chapter two) can be identified in which speciation events can be inferred. However, with sufficient environmental and biogeographical information, it should also be possible to make such inferences in continental settings. This would expand the range of taxa for which the spatial scale of speciation could be measured and make it possible to use this measure in a broad range of macroevolutionary studies.

Finally, this work demonstrates the value of integrating the effects of both environmental and taxon characters in studies of diversification. In chapter two, I found that the probability of speciation within an island was best modelled taking into account both taxon identity and island environmental characteristics. This result, combined with many other studies that have found evidence for an interaction between environmental and taxon characteristics (reviewed in chapter one), suggests that studies of diversification should either explicitly control for regional effects (as I did in chapters three and four by limiting study species to natives of Costa Rica and sampling populations within Costa Rica only) or include environmental characteristics of regions in analyses.

General conclusions

The variety and complexity of diversity patterns on Earth are fascinating, yet challenging to untangle. Centuries of thought and research have identified many possible factors and mechanisms that could contribute to structuring species richness, but as yet there is no consensus as to which are the most important or most general in their effects.

The framework proposed and explored in this thesis, which links area, population genetic characteristics of clades and the spatial scale of speciation, is surely insufficient to explain the full breadth of existing diversity patterns. Nevertheless, as it is applicable to all taxa and integrates both environmental and taxon characteristics, it has potential as a backbone for theories of diversification and diversity patterns.

129 Bibliography

Bibliography

Ackerman, J. D., J. C. Trejo-Torres and Y. Crespo-Chuy. 2007. Orchids of the West Indies: predictability of diversity and endemism. Journal of Biogeography 34: 779-786.

Arditti, J. and A. K. A. Ghani. 2000. Tansley review No. 110 - Numerical and physical properties of orchid seeds and their biological implications. New Phytologist 145: 367-421.

Australian Biological Resources Study, Canberra. 2008. "Flora of Australia online: oceanic islands excluding Norfolk and Lord Howe Islands." from http://www.environment.gov.au/biodiversity/abrs/online-resources/flora/50/index.html

Baas-Becking, L. G. M. 1934. Geobiologie of inleiding tot de milieukunde. W. P. van Stockum and Zoon, The Hague, Netherlands.

Barluenga, M., K. Stolting, W. Salzburger, M. Muschick and A. Meyer. 2006. Sympatric speciation in Nicaraguan crater lake cichlid fish. Nature 439: 719-723.

Barraclough, T. G., P. H. Harvey and S. Nee. 1995. Sexual selection and taxonomic diversity in passerine birds. Proceedings of the Royal Society B: Biological Sciences 259: 211-215.

Barraclough, T. G. and S. Nee. 2001. Phylogenetics and speciation. Trends in Ecology & Evolution 16: 391-399.

Barraclough, T. G. and A. P. Vogler. 2000. Detecting the geographical pattern of speciation from species-level phylogenies. American Naturalist 155: 419-434.

Barraclough, T. G., A. P. Vogler and P. H. Harvey. 1998. Revealing the factors that promote speciation. Philosophical Transactions of the Royal Society B: Biological Sciences 353: 241-249.

Barton, N. H. 2001. The evolutionary consequences of gene flow and local adaptation: future approaches. Pages 329-340 in J. Clobert, E. Danchin, A. A. Dhondt and J. D. Nichols, eds. Dispersal. Oxford University Press, Oxford.

Bates, D. 2007. lme4: linear mixed-effects models using S4 classes. R package version 0.99875-9.

Bauer, A. M. 1988. Reptiles and the biogeographic interpretation of New Caledonia. Tuatara 30: 39-50.

Beaumont, M. A. and D. J. Balding. 2004. Identifying adaptive genetic divergence among populations from genome scans. Molecular Ecology 13: 969-980.

Beaumont, M. A. and R. A. Nichols. 1996. Evaluating loci for use in the genetic analysis of population structure. Proceedings of the Royal Society B: Biological Sciences 263: 1619-1626.

Belliure, J., G. Sorci, A. P. Moller and J. Clobert. 2000. Dispersal distances predict subspecies richness in birds. Journal of Evolutionary Biology 13: 480-487.

Benzing, D. H. 1987. Vascular epiphytism: taxonomic participation and adaptive diversity. Annals of the Missouri Botanical Garden 74: 183-204.

Berlocher, S. H. 1998. Can sympatric speciation via host or habitat shift be proven from phylogenetic and biogeographic evidence? Pages 99-113 in D. J. Howard and S. H. Berlocher, eds. Endless forms - species and speciation. Oxford University Press, New York.

Bohonak, A. J. 1999. Dispersal, gene flow, and population structure. Quarterly Review of Biology 74: 21-45.

Bonin, A., E. Bellemain, P. Bronken Eidesen, F. Pompanon, C. Brochmann and P. Taberlet. 2004. How to track and assess genotyping errors in population genetics studies. Molecular Ecology 13: 3261-3273.

Burnham, K. P. and D. R. Anderson. 2002. Model selection and multimodel inference: a practical information-theoretic approach. Springer-Verlag, New York.

Butlin, R. K., J. Galindo and J. W. Grahame. 2008. Sympatric, parapatric or allopatric: the most important way to classify speciation? Philosophical Transactions of the Royal Society B: Biological Sciences 363: 2997-3007.

Cardillo, M., J. S. Huxtable and L. Bromham. 2003. Geographic range size, life history and rates of diversification in Australian mammals. Journal of Evolutionary Biology 16: 282-288.

Cardillo, M., C. D. L. Orme and I. P. F. Owens. 2005. Testing for latitudinal bias in diversification rates: An example using New World birds. Ecology 86: 2278-2287.

Cardoso, G. C. and P. G. Mota. 2008. Speciational evolution of coloration in the genus Carduelis. Evolution 62: 753-762.

Carlquist, S. J. 1974. Island biology. Columbia University Press, New York.

Carson, H. L. and A. R. Templeton. 1984. Genetic revolutions in relation to speciation phenomena: The founding of new populations. Annual Review of Ecology and Systematics 15: 97-131.

Castella, V., M. Ruedi, L. Excoffier, C. Ibanez, R. Arlettaz and J. Hausser. 2000. Is the Gibraltar Strait a barrier to gene flow for the bat Myotis myotis (Chiroptera: Vespertilionidae)? Molecular Ecology 9: 1761-1772.

Chan, K. M. A. and B. R. Moore. 2002. Whole-tree methods for detecting differential diversification rates. Systematic Biology 51: 855-865.

Chase, M. W. 1988. Obligate twig epiphytes: a distinct subset of Neotropical orchidaceous epiphytes. Selbyana 10: 24-30.

Chase, M. W. and H. H. Hills. 1991. Silica-gel - an ideal material for field preservation of leaf samples for DNA studies. Taxon 40: 215-220.

Chiba, S. 1999. Accelerated evolution of land snails Mandarina in the oceanic Bonin Islands: evidence from mitochondrial DNA sequences. Evolution 53: 460-471.

Cingel, N. A. v. d. 2001. An atlas of orchid pollination: America, Africa, Asia and Australia. A.A. Balkema Publishers, Rotterdam.

Coyne, J. A., N. H. Barton and M. Turelli. 1997. Perspective: a critique of Sewall Wright's shifting balance theory of evolution. Evolution 51: 643-671.

Coyne, J. A., N. H. Barton and M. Turelli. 2000. Is Wright's shifting balance process important in evolution? Evolution 54: 306-317.

Coyne, J. A. and H. A. Orr. 2004. Speciation. Sinauer Associates, Inc., Sunderland, Massachusetts.

Coyne, J. A. and T. D. Price. 2000. Little evidence for sympatric speciation in island birds. Evolution 54: 2166-2171.

Cozzolino, S. and A. Widmer. 2005. Orchid diversity: an evolutionary consequence of deception? Trends in Ecology & Evolution 20: 487-494.

Cracraft, J. 1982. A non-equilibrium theory for the rate-control of speciation and extinction and the origin of macroevolutionary patterns. Systematic Zoology 31: 348-365.

Cracraft, J. 1985. Biological diversification and its causes. Annals of the Missouri Botanical Garden 72: 794-822.

Dahl, A. 2004. "United Nations Environment Program island directory." from http://islands.unep.ch/isldir.htm

Darwin, C. 1862. The various contrivances by which orchids are fertilized by insects. John Murray, London.

Davies, T. J., V. Savolainen, M. W. Chase, J. Moat and T. G. Barraclough. 2004. Environmental energy and evolutionary rates in flowering plants. Proceedings of the Royal Society B: Biological Sciences 271: 2195-2200.

Dial, K. P. and J. M. Marzluff. 1989. Nonrandom diversification within taxonomic assemblages. Systematic Zoology 38: 26-37.

Diamond, J. M. 1977. Continental and insular speciation in Pacific land birds. Systematic Zoology 26: 263-268.

Diamond, J. M. 2005. Collapse: how societies choose to fail or succeed. Viking, New York.

Doebeli, M. and U. Dieckmann. 2003. Speciation along environmental gradients. Nature 421: 259-264.

Dray, S., A. B. Dufour and D. Chessel. 2007. The ade4 package-II: Two-table and K-table methods. R News 7: 47-52.

Dressler, R. L. 2005. How many orchid species? Selbyana 26: 155-158.

Eastwood, A., Q. C. B. Cronk, J. C. Vogel, A. Hemp and M. Gibby. 2004. Comparison of molecular and morphological data on St Helena: Elaphoglossum. Plant Systematics and Evolution 245: 93-106.

Ehrich, D. 2006. AFLPDAT: a collection of R functions for convenient handling of AFLP data. Molecular Ecology Notes 6: 603-604.

Emerson, B. C. and P. Oromi. 2005. Diversification of the forest beetle genus Tarphius on the Canary Islands, and the evolutionary origins of island endemics. Evolution 59: 586-598.

Endler, J. A. 1977. Geographic variation, speciation, and clines. Princeton University Press, Princeton.

Eriksson, O. and B. Bremer. 1991. Fruit characteristics, life forms, and species richness in the plant family Rubiaceae. American Naturalist 138: 751-761.

Excoffier, L. 2001. Analysis of population subdivision. Pages 271-307 in D. J. Balding, M. Bishop and C. Cannings, eds. Handbook of statistical genetics. John Wiley & Sons, Chichester.

Excoffier, L. and H. E. L. Lischer. 2010. Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Molecular Ecology Resources 10: 564-567.

Excoffier, L., P. E. Smouse and J. M. Quattro. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131: 479-491.

Farris, J. S. 1976. Expected asymmetry of phylogenetic trees. Systematic Zoology 25: 196-198.

Felsenstein, J. 1985. Phylogenies and the comparative method. American Naturalist 125: 1-15.

Finlay, B. J. 2002. Global dispersal of free-living microbial eukaryote species. Science 296: 1061-1063.

Fontaneto, D., T. G. Barraclough, K. Chen, C. Ricci and E. A. Herniou. 2008. Molecular evidence for broad-scale distributions in bdelloid rotifers: everything is not everywhere but most things are very widespread. Molecular Ecology 17: 3136-3146.

Frankham, R., J. D. Ballou and D. A. Briscoe. 2010. Introduction to conservation genetics. Cambridge University Press, Cambridge, UK.

Freckleton, R. P., P. H. Harvey and M. Pagel. 2002. Phylogenetic analysis and comparative data: a test and review of evidence. American Naturalist 160: 712-726.

Freeland, J. R. 2005. Molecular Ecology. John Wiley & Sons, Ltd, Chichester.

Fu, Y. X. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147: 915-925.

Fuller, R. C. 2008. Genetic incompatibilities in killifish and the role of environment. Evolution 62: 3056-3068.

Funk, D. J., K. E. Filchak and J. L. Feder. 2002. Herbivorous insects: model systems for the comparative study of speciation ecology. Genetica 116: 251-267.

Funk, D. J., P. Nosil and W. J. Etges. 2006. Ecological divergence exhibits consistently positive associations with reproductive isolation across disparate taxa. Proceedings of the National Academy of Sciences of the USA 103: 3209-3213.

Futuyma, D. 1986. Evolutionary Biology. Sinauer, Sunderland.

Gaggiotti, O. E. and I. Hanski. 2004. Mechanisms of population extinction. Pages 337-366 in I. Hanski and O. E. Gaggiotti, eds. Ecology, genetics, and evolution of metapopulations. Elsevier Academic Press, Amsterdam.

Gaston, K. J. and T. M. Blackburn. 1997. Age, area and avian diversification. Biological Journal of the Linnean Society 62: 239-253.

Gavrilets, S. 2003. Perspective: Models of speciation: What have we learned in 40 years? Evolution 57: 2197-2215.

Gavrilets, S. 2004. Fitness landscapes and the origin of species. Princeton University Press, Princeton.

Gavrilets, S. and J. B. Losos. 2009. Adaptive radiation: contrasting theory with data. Science 323: 732-737.

Gavrilets, S. and A. Vose. 2005. Dynamic patterns of adaptive radiation. Proceedings of the National Academy of Sciences of the USA 102: 18040-18045.

Gentry, A. H. and C. H. Dodson. 1987. Diversity and biogeography of Neotropical vascular epiphytes. Annals of the Missouri Botanical Garden 74: 205-233.

Gillespie, R. 2004. Community assembly through adaptive radiation in Hawaiian spiders. Science 303: 356-359.

Gillespie, R. G. and B. G. Baldwin. 2009. Island biogeography of remote archipelagoes: interplay between ecological and evolutionary processes. in J. B. Losos and R. E. Ricklefs, eds. The theory of island biogeography at 40: impacts and prospects. Princeton University Press, Princeton.

Givnish, T. J., K. C. Millam, A. R. Mast, T. B. Paterson, T. J. Theim, A. L. Hipp, J. M. Henss, J. F. Smith, K. R. Wood and K. J. Sytsma. 2009. Origin, adaptive radiation and diversification of the Hawaiian lobeliads (Asterales: Campanulaceae). Proceedings of the Royal Society B: Biological Sciences 276: 407-416.

Govaerts, R., J. Pfahl, M. A. Campacci, D. Holland Baptista, H. Tigges, J. Shaw, P. Cribb, A. George, K. Kreuz and J. Wood. 2007. World Checklist of Orchidaceae. The Board of Trustees of the Royal Botanic Gardens, Kew, Published on the Internet; http://www.kew.org./wcsp/.

Govaerts, R., J. Pfahl, M. A. Campacci, D. Holland Baptista, H. Tigges, J. Shaw, P. Cribb, A. George, K. Kreuz and J. Wood. 2008. World Checklist of Orchidaceae. The Board of Trustees of the Royal Botanic Gardens, Kew, Published on the Internet; http://www.kew.org./wcsp/.

Govaerts, R., J. Pfahl, M. A. Campacci, D. Holland Baptista, H. Tigges, J. Shaw, P. Cribb, A. George, K. Kreuz and J. Wood. 2010. World Checklist of Orchidaceae. The Board of Trustees of the Royal Botanic Gardens, Kew, Published on the Internet; http://www.kew.org./wcsp/.

Govindaraju, D. R. 1988. Relationship between dispersal ability and levels of gene flow in plants. Oikos 52: 31-35.

Grant, P. R. and B. R. Grant. 2009. Sympatric speciation, immigration, and hybridization in island birds. in J. B. Losos and R. E. Ricklefs, eds. The theory of island biogeography at 40: impacts and prospects. Princeton University Press, Princeton.

Grant, V. 1981. Plant Speciation. Columbia University Press, New York.

Gravendeel, B., A. Smithson, F. J. W. Slik and A. Schuiteman. 2004. Epiphytism and pollinator specialization: drivers for orchid diversity? Philosophical Transactions of the Royal Society B: Biological Sciences 359: 1523-1535.

Gugerli, F., T. Englisch, H. Niklfeld, A. Tribsch, Z. Mirek, M. Ronikier, N. E. Zimmermann, R. Holderegger, P. Taberlet and IntraBioDiv Consortium. 2008. Relationships among levels of biodiversity and the relevance of intraspecific diversity in conservation - a project synopsis. Perspectives in Plant Ecology Evolution and Systematics 10: 259-281.

Hammel, B. E., M. H. Grayum, C. Herrera and N. Zamora, Eds. 2003. Manual de Plantas de Costa Rica Volumen III. Missouri Botanical Garden Press, St. Louis.

Hamrick, J. L. and M. J. W. Godt. 1996. Effects of life history traits on genetic diversity in plant species. Philosophical Transactions of the Royal Society B: Biological Sciences 351: 1291-1298.

Harvey, P. H., R. M. May and S. Nee. 1994. Phylogenies without fossils. Evolution 48: 523-529.

Hedrick, P. W. 2000. Genetics of populations. Jones and Bartlett Publishers, Sudbury, Massachusetts.

Herrera, C. M. 1989. Seed dispersal by animals - a role in angiosperm diversification? American Naturalist 133: 309-322.

Hodges, S. A. and M. L. Arnold. 1995. Spurring plant diversification: Are floral nectar spurs a key innovation? Proceedings of the Royal Society B: Biological Sciences 262: 343-348.

Holt, R. D. and R. Gomulkiewicz. 2004. Conservation implications of niche conservatism and evolution in heterogeneous environments. Pages 244-264 in R. Ferrière, U. Dieckmann and D. Couvet, eds. Evolutionary conservation biology. Cambridge University Press, Cambridge.

Hutchison, D. W. and A. R. Templeton. 1999. Correlation of pair-wise genetic and geographic distance measures: Inferring the relative influences of gene flow and drift on the distribution of genetic variability. Evolution 53: 1898-1914.

Isaac, N. J. B., P. M. Agapow, P. H. Harvey and A. Purvis. 2003. Phylogenetically nested comparisons for testing correlates of species richness: A simulation study of continuous variables. Evolution 57: 18-26.

Jablonski, D. 1986. Larval ecology and macroevolution in marine invertebrates. Bulletin of Marine Science 39: 565-587.

Jablonski, D. 2008. Species selection: theory and data. Annual Review of Ecology, Evolution, and Systematics 39: 501-524.

Jobson, R. W. and V. A. Albert. 2002. Molecular rates parallel diversification contrasts between carnivorous plant sister lineages. Cladistics 18: 127-136.

Katoh, K., K. Misawa, K. Kuma and T. Miyata. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research 30: 3059-3066.

Kay, K. M. and R. D. Sargent. 2009. The role of animal pollination in plant speciation: Integrating ecology, geography, and genetics. Annual Review of Ecology Evolution and Systematics 40: 637-656.

Kinlan, B. P. and S. D. Gaines. 2003. Propagule dispersal in marine and terrestrial environments: a community perspective. Ecology 84: 2007-2020.

Kisel, Y., L. M. McInnes, N. H. Toomey and C. D. L. Orme. in review. How diversification rates and diversity limits combine to create large-scale species-area relationships. Philosophical Transactions of the Royal Society B: Biological Sciences.

Kodandaramaiah, U. and N. Wahlberg. 2007. Out-of-Africa origin and dispersal-mediated diversification of the butterfly genus Junonia (Nymphalidae: Nymphalinae). Journal of Evolutionary Biology 20: 2181-2191.

Lande, R. 1988. Genetics and demography in biological conservation. Science 241: 1455-1460.

Leinonen, T., J. M. Cano, H. Makinen and J. Merila. 2006. Contrasting patterns of body shape and neutral genetic divergence in marine and lake populations of threespine sticklebacks. Journal of Evolutionary Biology 19: 1803-1812.

Lepage, D. 2008. "Avibase - the world bird database." from http://www.bsc-eoc.org/avibase/avibase.jsp

Lohmann, G. P. 1983. Eigenshape analysis of micro-fossils - a general morphometric procedure for describing changes in shape. Journal of the International Association for Mathematical Geology 15: 659-672.

Long, J. C. 1986. The allelic correlation structure of Gainj-speaking and Kalam-speaking people. 1. The estimation and interpretation of Wright's F-Statistics. Genetics 112: 629-647.

Losos, J. B. and C. E. Parent. 2009. The speciation-area relationship. in J. B. Losos and R. E. Ricklefs, eds. The theory of island biogeography at 40: impacts and prospects. Princeton University Press, Princeton.

Losos, J. B. and D. Schluter. 2000. Analysis of an evolutionary species-area relationship. Nature 408: 847-850.

Loveless, M. D. and J. L. Hamrick. 1984. Ecological determinants of genetic structure in plant populations. Annual Review of Ecology and Systematics 15: 65-95.

MacArthur, R. H. and E. O. Wilson. 1967. The theory of island biogeography. Princeton University Press, Princeton.

MacLeod, N. 1999. Generalizing and extending the eigenshape method of shape space visualization and analysis. Paleobiology 25: 107-138.

Marzluff, J. M. and K. P. Dial. 1991. Life-history correlates of taxonomic diversity. Ecology 72: 428-439.

Mayhew, P. J. 2007. Why are there so many insect species? Perspectives from fossils and phylogenies. Biological Reviews 82: 425-454.

Mayr, E. 1942. Systematics and the origin of species from the viewpoint of a zoologist. Columbia University Press, New York.

Mayr, E. 1963. Animal species and evolution. Belknap Press of Harvard University Press, Cambridge.

Mayr, E. 1965. Avifauna: turnover on islands. Science 150: 1587-1588.

Mayr, E. 1982. Processes of speciation in animals. Pages 1-19 in A. R. I. Liss, ed. Mechanisms of Speciation. Alan R. Liss Inc., New York.

McPeek, M. A. 2008. The ecological dynamics of clade diversification and community assembly. American Naturalist 172: E270-E284.

Merilä, J. and P. Crnokrak. 2001. Comparison of genetic differentiation at marker loci and quantitative traits. Journal of Evolutionary Biology 14: 892-903.

Mitter, C., B. Farrell and B. Wiegmann. 1988. The phylogenetic study of adaptive zones - has phytophagy promoted insect diversification? American Naturalist 132: 107-128.

Morjan, C. L. and L. H. Rieseberg. 2004. How species evolve collectively: implications of gene flow and selection for the spread of advantageous alleles. Molecular Ecology 13: 1341-1356.

Nee, S. 2006. Birth-death models in macroevolution. Annual Review of Ecology Evolution and Systematics 37: 1-17.

Nee, S., E. C. Holmes, R. M. May and P. H. Harvey. 1994. Extinction rates can be estimated from molecular phylogenies. Philosophical Transactions of the Royal Society B: Biological Sciences 344: 77-82.

Nee, S., A. O. Mooers and P. H. Harvey. 1992. Tempo and mode of evolution revealed from molecular phylogenies. Proceedings of the National Academy of Sciences of the USA 89: 8322-8326.

Nei, M. 1973. Analysis of gene diversity in subdivided populations. Proceedings of the National Academy of Sciences of the USA 70: 3321-3323.

Nei, M. 1987. Molecular evolutionary genetics. Columbia University Press, New York.

Nogueira, R. E., O. L. Pereira, M. C. M. Kasuya, M. C. d. S. Lanna and M. P. Mendonça. 2005. Mycorrhizal fungi associated to orchids growing in "campos rupestres" in "Quadrilatero Ferrifero" region, Minas Gerais State, Brazil. Acta Botanica Brasilica 19: 417-424.

Nosil, P. 2009. Adaptive population divergence in cryptic color-pattern following a reduction in gene flow. Evolution 63: 1902-1912.

Orme, C. D. L., R. P. Freckleton and G. H. Thomas. 2008. CAIC: Comparative Analyses using Independent Contrasts.

Otto, S. P. and J. Whitton. 2000. Polyploid incidence and evolution. Annual Review of Genetics 34: 401-437.

Owens, I., P. Bennett and P. Harvey. 1999. Species richness among birds: body size, life history, sexual selection or ecology? Proceedings of the Royal Society B: Biological Sciences 266: 933-939.

Pagel, M. 1999. Inferring the historical patterns of biological evolution. Nature 401: 877-884.

Parent, C. E. and B. J. Crespi. 2006. Sequential colonization and diversification of Galapagos endemic land snail genus Bulimulus (Gastropoda, Stylommatophora). Evolution 60: 2311-2328.

Paulay, G. 1985. Adaptive radiation on an isolated oceanic island - the Cryptorhynchinae (Curculionidae) of Rapa revisited. Biological Journal of the Linnean Society 26: 95-187.

Paulay, G. 1994. Biodiversity on oceanic islands - its origin and extinction. American Zoologist 34: 134-144.

Paulay, G. and C. Meyer. 2006. Dispersal and divergence across the greatest ocean region: do larvae matter? Integrative and Comparative Biology 46: 269-281.

Pereira, M. C., O. L. Pereira, M. D. Costa, R. B. Rocha and M. C. M. Kasuya. 2009. Diversity of mycorrhizal fungi Epulorhiza spp. isolated from Epidendrum secundum (Orchidaceae). Revista Brasileira De Ciencia Do Solo 33: 1187-1197.

Peterson, M. A. and R. F. Denno. 1998. The influence of dispersal and diet breadth on patterns of genetic isolation by distance in phytophagous insects. American Naturalist 152: 428-446.

Phillimore, A. B., R. P. Freckleton, C. D. L. Orme and I. P. F. Owens. 2006. Ecology predicts large-scale patterns of phylogenetic diversification in birds. American Naturalist 168: 220-229.

Phillimore, A. B., C. D. L. Orme, R. G. Davies, J. D. Hadfield, W. J. Reed, K. J. Gaston, R. P. Freckleton and I. P. F. Owens. 2007. Biogeographical basis of recent phenotypic divergence among birds: a global study of subspecies richness. Evolution 61: 942-957.

Phillimore, A. B., C. D. L. Orme, G. H. Thomas, T. M. Blackburn, P. M. Bennett, K. J. Gaston and I. P. F. Owens. 2008. Sympatric speciation in birds is rare: insights from range data and simulations. American Naturalist 171: 646-657.

Pickle, J. and J. Kirtley. 2008. Analyzing Digital Images 2008. Museum of Science, Boston, Boston.

Pigot, A. L., A. B. Phillimore, I. P. F. Owens and C. D. L. Orme. in press. The shape and temporal dynamics of phylogenetic trees arising from geographic speciation. Systematic Biology.

Pijl, L. v. d. and C. H. Dodson. 1966. Orchid flowers: their pollination and evolution. Fairchild Tropical Garden and University of Miami Press, Coral Gables.

Price, J. P. and W. L. Wagner. 2004. Speciation in Hawaiian angiosperm lineages: cause, consequence, and mode. Evolution 58: 2185-2200.

Pridgeon, A. M., R. Solano and M. W. Chase. 2001. Phylogenetic relationships in Pleurothallidinae (Orchidaceae): Combined evidence from nuclear and plastid DNA sequences. American Journal of Botany 88: 2286-2308.

Purvis, A. 2008. Phylogenetic approaches to the study of extinction. Annual Review of Ecology Evolution and Systematics 39: 301-319.

Quental, T. B. and C. R. Marshall. 2010. Diversity dynamics: molecular phylogenies need the fossil record. Trends in Ecology & Evolution 25: 434-441.

R Development Core Team. 2007. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna.

R Development Core Team. 2008. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna.

Rabosky, D. L. 2009a. Ecological limits and diversification rate: alternative paradigms to explain the variation in species richness among clades and regions. Ecology Letters 12: 735-743.

Rabosky, D. L. 2009b. Ecological limits on clade diversification in higher taxa. American Naturalist 173: 662-674.

Rabosky, D. L. 2010. Extinction rates should not be estimated from molecular phylogenies. Evolution 64: 1816-1824.

Ramos-Onsins, S. E. and J. Rozas. 2002. Statistical properties of new neutrality tests against population growth. Molecular Biology and Evolution 19: 2092-2100.

Ranker, T., S. Floyd and P. Trapp. 1994. Multiple colonizations of Asplenium adiantum-nigrum onto the Hawaiian archipelago. Evolution 48: 1364-1370.

Rasmussen, H. N. 1995. Terrestrial orchids from seed to mycotrophic plant. Cambridge University Press, Cambridge.

Raup, D. M., S. J. Gould, T. J. M. Schopf and D. S. Simberloff. 1973. Stochastic models of phylogeny and evolution of diversity. Journal of Geology 81: 525-542.

Renshaw, P. B. 2010. Are patterns of mycorrhizal specificity conserved across the orchid genus Masdevallia? Imperial College, London. MSc thesis.

Rice, W. R. and E. E. Hostert. 1993. Laboratory experiments on speciation - what have we learned in 40 years? Evolution 47: 1637-1653.

Ricklefs, R. E. 2003. Global diversification rates of passerine birds. Proceedings of the Royal Society B: Biological Sciences 270: 2285-2291.

Ricklefs, R. E. 2007a. Estimating diversification rates from phylogenetic information. Trends in Ecology & Evolution 22: 601-610.

Ricklefs, R. E. 2007b. History and diversity: explorations at the intersection of ecology and evolution. American Naturalist 170: S56-S70.

Rohlf, F. J. 2008. tpsDig, digitize landmarks and outlines, version 2.12, Department of Ecology and Evolution, State University of New York at Stony Brook.

Rosenzweig, M. L. 1995. Species diversity in space and time. Cambridge University Press, Cambridge.

Rosindell, J., S. J. Cornell, S. P. Hubbell and R. S. Etienne. 2010. Protracted speciation revitalizes the neutral theory of biodiversity. Ecology Letters 13: 716-727.

Royer, D. L., P. Wilf, D. A. Janesko, E. A. Kowalski and D. L. Dilcher. 2005. Correlations of climate and plant ecology to leaf size and shape: Potential proxies for the fossil record. American Journal of Botany 92: 1141-1151.

Ryan, P. G., P. Bloomer, C. L. Moloney, T. J. Grant and W. Delport. 2007. Ecological speciation in South Atlantic island finches. Science 315: 1420-1423.

Sanderson, M. J. and M. J. Donoghue. 1996. Reconstructing shifts in diversification rates on phylogenetic trees. Trends in Ecology & Evolution 11: 15-20.

Savolainen, V., M. C. Anstett, C. Lexer, I. Hutton, J. J. Clarkson, M. V. Norup, M. P. Powell, D. Springate, N. Salamin and W. J. Baker. 2006. Sympatric speciation in palms on an oceanic island. Nature 441: 210-213.

Schiestl, F. P. 2005. On the success of a swindle: pollination by deception in orchids. Naturwissenschaften 92: 255-264.

Schiestl, F. P. and P. M. Schluter. 2009. Floral isolation, specialized pollination, and pollinator behavior in orchids. Annual Review of Entomology 54: 425-446.

Schliewen, U. K., D. Tautz and S. Paabo. 1994. Sympatric speciation suggested by monophyly of crater lake cichlids. Nature 368: 629-632.

Schluter, D. 2000. The ecology of adaptive radiation. Oxford University Press, Oxford.

Schluter, D. 2001. Ecology and the origin of species. Trends in Ecology & Evolution 16: 372-380.

Schluter, D. 2009. Evidence for ecological speciation and its alternative. Science 323: 737-741.

Seddon, N., R. M. Merrill and J. A. Tobias. 2008. Sexually selected traits predict patterns of species richness in a diverse clade of suboscine birds. American Naturalist 171: 620-631.

Seehausen, O. 2006. African cichlid fish: a model system in adaptive radiation research. Proceedings of the Royal Society B: Biological Sciences 273: 1987-1998.

Sequeira, A., A. Lanteri, R. Albelo, S. Bhattacharya and M. Sijapati. 2008. Colonization history, ecological shifts and diversification in the evolution of endemic Galapagos weevils. Molecular Ecology 17: 1089-1107.

Simpson, G. G. 1953. The major features of evolution. Columbia University Press, New York.

Slatkin, M. 1973. Gene flow and selection in a cline. Genetics 75: 733-756.

Slatkin, M. 1985. Gene flow in natural populations. Annual Review of Ecology and Systematics 16: 393-430.

Smith, A. B. 2007. Marine diversity through the Phanerozoic: problems and prospects. Journal of the Geological Society 164: 731-745.

Smith, P. L. 1988. Paleoscene 11. Paleobiogeography and plate-tectonics. Geoscience Canada 15: 261-279.

Sobel, J. M., G. F. Chen, L. R. Watt and D. W. Schemske. 2010. The biology of speciation. Evolution 64: 295-315.

Spitze, K. 1993. Population structure in Daphnia obtusa: Quantitative genetic and allozymic variation. Genetics 135: 367-374.

Steadman, D. W. 2006. Extinction & biogeography of tropical Pacific birds. University of Chicago Press, Chicago.

Stebbins, G. L. 1950. Variation and evolution in plants. Columbia University Press, New York.

Stuart-Fox, D. and I. P. F. Owens. 2003. Species richness in agamid lizards: chance, body size, sexual selection or ecology? Journal of Evolutionary Biology 16: 659-669.

Stuessy, T. F., D. J. Crawford and C. Marticorena. 1990. Patterns of phylogeny in the endemic vascular flora of the Juan Fernandez Islands, Chile. Systematic Botany 15: 338-346.

Stuessy, T. F., G. Jakubowsky, R. S. Gomez, M. Pfosser, P. M. Schluter, T. Fer, B. Y. Sun and H. Kato. 2006. Anagenetic evolution in island plants. Journal of Biogeography 33: 1259-1265.

Swarts, N. D., E. A. Sinclair, A. Francis and K. W. Dixon. 2010. Ecological specialization in mycorrhizal symbiosis leads to rarity in an endangered orchid. Molecular Ecology 19: 3226-3242.

Takenaka, A. 1994. Effects of leaf blade narrowness and petiole length on the light capture efficiency of a shoot. Ecological Research 9: 109-114.

Thorpe, W. H. 1945. The evolutionary significance of habitat selection. Journal of Animal Ecology 14: 67-70.

Thrall, P. H., J. J. Burdon and B. R. Murray. 2000. The metapopulation paradigm: a fragmented view of conservation biology. Pages 75-95 in A. G. Young and G. M. Clarke, eds. Genetics, demography and viability of fragmented populations. Cambridge University Press, Cambridge.

Tiffney, B. H. and S. J. Mazer. 1995. Angiosperm growth habit, dispersal and diversification reconsidered. Evolutionary Ecology 9: 93-117.

Tilman, D. and P. M. Kareiva. 1997. Spatial ecology: the role of space in population dynamics and interspecific interactions. Princeton University Press, Princeton.

Tremblay, R. L. 1997. Distribution and dispersion patterns of individuals in nine species of Lepanthes (Orchidaceae). Biotropica 29: 38-45.

Tremblay, R. L., J. D. Ackerman, J. K. Zimmerman and R. N. Calvo. 2005. Variation in sexual reproduction in orchids and its evolutionary consequences: a spasmodic journey to diversification. Biological Journal of the Linnean Society 84: 1-54.

Valentine, J. W. and E. M. Moores. 1970. Plate-tectonic regulation of faunal diversity and sea level - a model. Nature 228: 657-659.

Vamosi, J. C. and S. M. Vamosi. 2010. Key innovations within a geographical context in flowering plants: towards resolving Darwin's abominable mystery. Ecology Letters 13: 1270-1279.

van den Berg, C., W. E. Higgins, R. L. Dressler, W. M. Whitten, M. A. Soto-Arenas and M. W. Chase. 2009. A phylogenetic study of Laeliinae (Orchidaceae) based on combined nuclear and plastid DNA sequences. Annals of Botany 104: 417-430.

Vekemans, X. and O. J. Hardy. 2004. New insights from fine-scale spatial genetic structure analyses in plant populations. Molecular Ecology 13: 921-935.

Veness, C. 2008. "Geodesic distance between two latitude/longitude points using Vincenty ellipsoid formula in JavaScript." from http://www.movable-type.co.uk/scripts/latlong-vincenty.html.

von Hagen, K. B. and J. W. Kadereit. 2003. The diversification of Halenia (Gentianaceae): Ecological opportunity versus key innovation. Evolution 57: 2507-2518.

Vos, P., R. Hogers, M. Bleeker, M. Reijans, T. v. d. Lee, M. Hornes, A. Frijters, J. Pot, J. Peleman, M. Kuiper and M. Zabeau. 1995. AFLP: a new technique for DNA fingerprinting. Nucleic Acids Research 23: 4407-4414.

Vrba, E. S. 1984. Evolutionary pattern and process in the sister-group Alcelaphini-Aepycerotini (Mammalia: Bovidae). Pages 62-79 in N. Eldredge and S. M. Stanley, eds. Living Fossils. Springer-Verlag, New York.

Wagner, W. H. 1969. The role and taxonomic treatment of hybrids. Bioscience 19: 785-795.

Wang, Y. T. 2007. Average daily temperature and reversed day/night temperature regulate vegetative and reproductive responses of a Doritis pulcherrima Lindley hybrid. Hortscience 42: 68-70.

Webb, T. J. and K. J. Gaston. 2003. On the heritability of geographic range sizes. American Naturalist 161: 553-566.

Whitlock, R., H. Hipperson, M. Mannarelli, R. K. Butlin and T. Burke. 2008. An objective, rapid and reproducible method for scoring AFLP peak-height data that minimizes genotyping error. Molecular Ecology Resources 8: 725-735.

Whittaker, R. J. and J. M. Fernandez-Palacios. 2007. Island biogeography. Oxford University Press, Oxford.

Whittaker, R. J., K. A. Triantis and R. J. Ladle. 2008. A general dynamic theory of oceanic island biogeography. Journal of Biogeography 35: 977-994.

Whittaker, R. J., K. A. Triantis and R. J. Ladle. 2009. A general dynamic theory of oceanic island biogeography: extending the MacArthur-Wilson theory to accommodate the rise and fall of volcanic islands. in J. B. Losos and R. E. Ricklefs, eds. The theory of island biogeography at 40: impacts and prospects. Princeton University Press, Princeton.

Wright, S. 1931. Evolution in Mendelian populations. Genetics 16: 97-159.

Wright, S. 1982. The shifting balance theory and macroevolution. Annual Review of Genetics 16: 1-19.

Wu, C. I. and C. T. Ting. 2004. Genes and speciation. Nature Reviews Genetics 5: 114-122.

Zera, A. J. 1981. Genetic structure of 2 species of waterstriders (Gerridae, Hemiptera) with differing degrees of winglessness. Evolution 35: 218-225.

Zettler, L. W., S. B. Poulter, K. I. McDonald and S. L. Stewart. 2007. Conservation-driven propagation of an epiphytic orchid (Epidendrum nocturnum) with a mycorrhizal fungus. Hortscience 42: 135-139.

Zettler, L. W., T. Wilson Delaney and J. A. Sunley. 1998. Seed propagation of the epiphytic green-fly orchid, Epidendrum conopseum R. Brown, using its endophytic fungus. Selbyana 19: 249-253.

Zwickl, D. J. 2006. Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion. The University of Texas at Austin. Ph.D. thesis.

Appendix I.

How diversification rates and diversity limits combine to create large-scale species-area relationships

Yael Kisel1*, Lynsey McInnes1,2*, Nicola H Toomey1, C David L Orme1,2

1 Division of Biology, Imperial College London, Silwood Park, Ascot, Berkshire, SL5 7PY, UK

2 Grantham Institute for Climate Change, Imperial College, London SW7 2AZ, UK

* Authors for correspondence: [email protected] and [email protected]; these authors contributed equally to this work.

Abstract

Species-area relationships have mostly been treated from an ecological perspective, focusing on immigration, local extinction, and resource-based limits to species coexistence. However, a full understanding across large regions is impossible without also considering speciation and global extinction. Rates of both speciation and extinction are known to be strongly affected by area and thus should contribute to spatial patterns of diversity. Here, we explore how variation in diversification rates and ecologically-mediated diversity limits among regions of different sizes can result in the formation of species-area relationships. We explain how this area-related variation in diversification can be caused by either the direct effects of area or the effects of factors that are highly correlated with area, such as habitat diversity and population size. We also review environmental, clade-specific, and historical factors that affect diversification and diversity limits but are not highly correlated with region area, and thus are likely to cause scatter in observed species-area relationships. We present new analyses using data on the distributions, ages and traits of mammalian species to illustrate these mechanisms; in doing so we provide an integrated perspective on the evolutionary processes shaping species-area relationships.

This appendix is an invited contribution for a special issue of the Philosophical Transactions of the Royal Society B: Biological Sciences, with the theme of “Global Biodiversity of Mammals”, to be published in 2011.

Introduction

The species-area relationship (SAR), which describes an increase in the number of species as region size increases, is a nearly ubiquitous pattern of biodiversity. SARs exist at a wide range of spatial scales, from local to global, and in a wide range of taxa, including mammals (Pagel et al. 1991). In the ecological literature, SARs have been explained by considering the factors that limit species from immigrating into, establishing, and persisting in a region (Arrhenius 1921; MacArthur & Wilson 1967; Preston 1960). However, at large geographic scales, in situ diversification contributes significantly to generating diversity, and so a full understanding of the generation of species-area relationships at such scales is impossible without also considering the macroevolutionary processes of speciation and extinction (Losos & Schluter 2000; Rosenzweig 1995; 1998).

Here, we explore the evolutionary underpinnings of large-scale SARs, outlining the roles of area itself, environmental variation, clade traits and historical contingency. We adopt a model of clade diversity in which clade diversification within regions is diversity-dependent and SARs are created by the scaling of both diversity limits and diversification rates with area. We support this discussion with new analyses using mammals as they are a well-known, diverse, and globally-distributed group with a wide variety of life histories, occupying a wide range of habitats and with robust data for many key traits (Jones et al. 2009).

SARs have traditionally been treated as the outcome of differences between regions in the balance between immigration and local extinction (MacArthur & Wilson 1967) and in the number of species that can coexist (Arrhenius 1921; Preston 1960). However, it was later recognized that SARs may not be controlled by the same processes at all spatial scales (Palmer & White 1994; Rosenzweig 1995). At the smallest scales, SARs result from more complete sampling of the local biota as the area sampled increases, and as such they are sampling rather than biological phenomena. At larger scales (sampling all of the local biota), classical ecological explanations apply, with SARs emerging as a result of more species being able to immigrate into and persist in larger areas. Finally, at the largest scales, differences between regions in rates of speciation and extinction should be the main factor generating SARs (Losos & Schluter 2000; Rosenzweig 1995, 1998). Here, we focus on SARs at the largest geographic scale. For mammals, this large-scale phase is likely to occur only when considering quite large regions: in Kisel & Barraclough’s (2010) study of the spatial scale of speciation, the two mammal groups represented (bats and carnivores) both required a region larger than 400,000-500,000 km² for any in situ speciation to occur.

We use a framework of diversity-dependent cladogenesis (Section 1) to explore how the area (Section 2) and environment (Section 3) of regions affect diversification and diversity limits in the generation of SARs. We also examine the role of clade traits (Section 4) and temporal patterns of diversification (Section 5) in modulating the shape of SARs. See Table 1 for a summary of the factors addressed.

Table 1. Summary of factors affecting diversification rates and diversity limits.

| Type of factor | Factor | Effects on Speciation Rate | Effects on Extinction Rate | Effects on Diversity Limits |
|---|---|---|---|---|
| Area | | ↑ potential for geographic isolation of separated populations | ↓ by ↑ survival of refuge populations | |
| Environmental factors strongly correlated with area | Population Size | ↑ rate of appearance of beneficial mutations, standing genetic variation, persistence of incipient species | ↓ by buffering populations from demographic stochasticity, environmental disasters, habitat loss | ↑ number of species with viable populations supported |
| | Habitat Diversity | ↑ population divergence through local adaptation | | ↑ niche space available |
| | Fragmentation/Topographic Diversity | ↑ isolation of populations; however past a certain point, will ↓ speciation by ↓ population size | if fragmentation results in too small patches of area or habitat, will ↑ extinction rate | ↑ by allowing more ecologically equivalent species to be supported |
| Environmental factors not strongly correlated with area | Energy availability | ↑ rate of molecular evolution, rate of co-evolutionary dynamics, size of populations supported | | ↑ by facilitating specialization to narrow niches |
| Clade traits | Life history traits | faster life cycle ↑ speciation by increasing mutation rate | faster life cycle ↓ extinction by ↑ resilience to disturbance | |
| | Range size | larger range sizes ↑ speciation by ↑ potential for isolation of populations | larger range sizes ↓ extinction by ↑ survival of refuge populations | smaller range sizes ↑ diversity limit by allowing more species to pack into same area |
| | Niche breadth | narrower niche breadths associated with ↑ speciation | narrower niche breadths associated with ↑ extinction | narrower niche breadths ↑ diversity limit by allowing finer subdivision of niche space |
| | Dispersal | ↓ by reducing potential isolation of populations, but can also ↑ speciation rate by ↑ rate at which species colonize new regions | ↓ by ↑ resilience to disturbance | ↓ if high dispersal ability is associated with large, nonoverlapping species ranges |

Methods

We used the geographic distributions of 4650 terrestrial mammal species within PanTHERIA (Jones et al. 2009) to explore the scaling of species richness with area. The choice of appropriate regions at a global scale is not obvious, so we have taken two approaches to identifying provinces. First, we used botanical sampling regions based on geopolitical units (Taxonomic Database Working Group (TDWG), Brummitt 2001) to subdivide continental landmasses, although we further separated disjunct sub-regions, such as islands. Second, we identified species presence in equal-area grid cells at a resolution (96.5 km) comparable to a 1° grid. We then used complete linkage hierarchical clustering on the Jaccard distance (Linder et al. 2005; but see Kreft & Jetz in press) between grid cells to identify approximate mammalian biotic regions. Both methods are hierarchically nested between levels but regions within the same level are not nested. The fineness of subdivision can also be varied: the TDWG standard defines four levels, ranging roughly from different biomes at the coarsest scale (level 1) to subdivisions within countries at the finest scale (level 4); the hierarchical clustering can be cut at different “heights” to give different numbers of regions and we have used 50, 100, 150 and 200 regions (mapped in Fig S1). The two region types differ in ways that are likely to affect the outcome: for example, political boundaries are likely to partition large, biotically homogeneous regions in the temperate zone more finely and to agglomerate smaller, biotically heterogeneous tropical regions. We used both methods and the variety of scales to assess the robustness of our conclusions to the details of sampling. Separating discontinuous parts of detailed polygons of TDWG regions, in combination with the imprecision in global species distribution maps, led to a large number of tiny islands and boundary regions with implausible biotas. We therefore removed all regions at the coarsest TDWG scale that did not contain at least one species endemic to that region, reducing 3974 candidate regions to 117. All nested subdivisions of these 117 regions at the finer TDWG scales were retained.
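The clustering step described above can be illustrated with a minimal sketch. It is not the code used for the analyses: the cell identifiers and species names are hypothetical, the function names are introduced here for illustration, and the sketch ignores spatial contiguity, which the text does not say was enforced. It shows the Jaccard distance between the species lists of two grid cells and a naive complete-linkage agglomeration that stops at a chosen number of regions.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two species sets: 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0  # assumption: two empty cells are treated as identical
    return 1.0 - len(a & b) / len(union)


def complete_linkage(cells, n_regions):
    """Merge grid cells until n_regions clusters remain.

    cells maps a cell id to the set of species present in that cell.
    Under complete linkage, the distance between two clusters is the
    largest Jaccard distance between any pair of their member cells.
    """
    clusters = [[cell_id] for cell_id in cells]
    while len(clusters) > n_regions:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = max(jaccard_distance(cells[x], cells[y])
                    for x in clusters[i] for y in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best  # j > i, so deleting j leaves index i valid
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical toy data: four grid cells and their species lists.
toy_cells = {
    "cell_1": {"sp_a", "sp_b", "sp_c"},
    "cell_2": {"sp_a", "sp_b"},
    "cell_3": {"sp_x", "sp_y"},
    "cell_4": {"sp_x", "sp_y", "sp_z"},
}
regions = complete_linkage(toy_cells, n_regions=2)
```

Stopping at a fixed number of clusters plays the role of cutting the dendrogram at a given height; the naive search over all cluster pairs is adequate only for small examples.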

The areas of both geopolitical and clustered regions were calculated using an equal-area projection of the land within each region (Fig S2). We recorded both the total and endemic mammalian species richness for each region and fitted SARs at each scale of subdivision using linear models on log-log axes to estimate the slope. We modeled species richness (S) as a power of area (A) as S = cA^z (Arrhenius 1921; Rosenzweig 1995): although there has been considerable debate about the shape of SARs (Lomolino 2000; Scheiner 2003), our results should be general to alternative functions. For all further analyses, we used the most finely divided regions and compared results using TDWG Level 4 and 200 biotic regions. We also explored the differences between slopes of SARs arising from species endemic to a region versus those occurring in more than one region, and the variation in slopes of SARs within mammalian orders.
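The log-log fit of the power-law SAR can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code: the base of the logarithm is taken to be 10, the function name is introduced here, and the toy regions are hypothetical values generated from a known power law so that the recovered slope z and constant c can be checked.

```python
import math


def fit_power_law_sar(areas, richness):
    """Least-squares fit of log10(S) = log10(c) + z * log10(A)."""
    x = [math.log10(a) for a in areas]
    y = [math.log10(s) for s in richness]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    z = sxy / sxx                    # slope on log-log axes
    c = 10 ** (mean_y - z * mean_x)  # back-transformed intercept
    return c, z


# Hypothetical regions generated exactly from S = 2 * A**0.25.
toy_areas = [1e3, 1e4, 1e5, 1e6]
toy_richness = [2.0 * a ** 0.25 for a in toy_areas]
c_hat, z_hat = fit_power_law_sar(toy_areas, toy_richness)
```

Regions with zero species (possible for endemic richness) cannot be log-transformed and would need to be excluded or handled separately before such a fit.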

To investigate the additional explanatory power of habitat diversity and environmental variables, we used two variables to capture different elements of habitat diversity: the diversity of land cover classes (GLCC v2.0, http://edc2.usgs.gov/glcc/glcc.php), calculated as Simpson’s diversity index (1-D) on the relative areas of the classes within each region; and the log range in elevation (GTOPO30, http://eros.usgs.gov/) within each region. We considered two environmental variables within regions: the mean annual temperature (www.worldclim.org) and the mean normalized difference vegetation index (NDVI, Los, pers comm., updated versions of Los et al. 2000). We fitted multiple regressions with log area and each of these four variables in turn as predictors of log species richness. For each variable, we tested whether it showed a significant interaction with area as well as its significance as a main effect. All covariates were mean centred and standardized to facilitate the interpretation and comparison of these models (Schielzeth in press).
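The two calculations in this paragraph, Simpson's diversity (1-D) and a regression of log richness on standardized log area, a standardized covariate and their interaction, can be sketched as below. The sketch rests on assumptions: D is taken as the sum of squared area proportions, standardization uses the sample standard deviation, and the model is fitted by ordinary least squares through the normal equations. All function names and the toy data are hypothetical; significance tests are omitted.

```python
import math


def simpson_diversity(class_areas):
    """Simpson's diversity 1 - D, with D the sum of squared area proportions."""
    total = sum(class_areas)
    return 1.0 - sum((a / total) ** 2 for a in class_areas)


def standardize(values):
    """Centre on the mean and scale to unit (sample) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_interaction_model(log_area, covariate, log_richness):
    """OLS for log S ~ 1 + area + covariate + area:covariate (standardized)."""
    za, zc = standardize(log_area), standardize(covariate)
    rows = [[1.0, a, c, a * c] for a, c in zip(za, zc)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    xty = [sum(r[i] * y for r, y in zip(rows, log_richness)) for i in range(4)]
    return solve(xtx, xty)  # [intercept, area, covariate, interaction]


# Hypothetical data: six regions built from known standardized coefficients.
toy_log_area = [3.0, 4.0, 5.0, 6.0, 3.5, 5.5]
toy_elevation = [0.2, 0.5, 0.1, 0.7, 0.6, 0.3]
za, zc = standardize(toy_log_area), standardize(toy_elevation)
toy_log_richness = [1.5 + 0.4 * a + 0.1 * c + 0.05 * a * c for a, c in zip(za, zc)]
coefs = fit_interaction_model(toy_log_area, toy_elevation, toy_log_richness)
```

Because both predictors are centred, the main-effect coefficients can be read as effects at the average value of the other predictor, which is the interpretive benefit the paragraph attributes to standardization.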

An approximate measure of habitat breadth for mammalian species was found by counting the number of GLCC habitat cover classes across all the 96.5 km cells intersecting each species' range. This number correlates strongly with the species' geographic range (Kendall's tau = 0.61) and we therefore also estimated a number of major habitats by counting only those habitats with a proportional contribution of at least 0.142. This cutoff was selected because it minimises the observed correlation between the resulting number of major habitats and the species' range size (Kendall's tau = -0.0002). We then calculated the Kendall's correlation between family species richness and both the number of habitats and number of major habitats.
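A short sketch of the habitat-breadth measure follows. It assumes that the "proportional contribution" of a habitat is the fraction of a species' range cells assigned to that class, and it uses the simplest form of Kendall's correlation (tau-a, without a tie correction); the text does not state which variant was used. The function names and the toy species are hypothetical.

```python
def count_major_habitats(class_cell_counts, cutoff=0.142):
    """Count habitat classes that make up at least `cutoff` of a range."""
    total = sum(class_cell_counts.values())
    return sum(1 for n in class_cell_counts.values() if n / total >= cutoff)


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)  # tied pairs add zero
    return score / (n * (n - 1) / 2)


# Hypothetical species: number of range cells falling in each cover class.
toy_range = {"forest": 70, "grassland": 20, "urban": 5, "wetland": 5}
n_habitats = len(toy_range)                   # all classes present
n_major = count_major_habitats(toy_range)     # classes above the cutoff

# Hypothetical paired values for checking the correlation function.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

With many tied values, as is likely for habitat counts, a tie-corrected statistic such as tau-b would give somewhat different values from this simplified version.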

In order to explore the effects of area on the temporal patterns of recent diversification within mammals, we identified two sets of monophyletic clades from the mammal supertree (Bininda-Emonds et al. 2007; 2008), excluding monotypic clades. One set had crown ages younger than 20 MY (421 clades), the other had crown ages younger than 10 MY (616 clades) and was nested inside the older set. We recorded each clade’s species richness, stem-group age and present-day area (either the total area of all TDWG level 4 provinces or of all biotic regions (finest scale) in which the component species occurred). We then fitted a suite of six models of diversification rate across each set (Rabosky 2009b; Phillimore 2010). The most complex model is an extension of those outlined in Rabosky (2009b) and Phillimore (2010) and fits an exponential decline in diversification with rate z over clade age (t) from an initial diversification rate (λ), but where log present-day area (A) contributes to both initial λ (scaling by c) and the rate of decline (scaling by p); the overall diversification rate is always scaled by the relative extinction rate (ε):

$$ r_i = (\lambda + c \log A_i)\, e^{(z + p \log A_i)\, t_i}\, (1 - \varepsilon) $$
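For illustration, the rate equation can be written as a small function. This is a sketch only: the natural logarithm and the units of area and age are assumptions, since the text does not specify them, and the function name and the numerical values below are hypothetical rather than fitted estimates.

```python
import math


def diversification_rate(area, age, lam, c=0.0, z=0.0, p=0.0, eps=0.0):
    """Evaluate r = (lam + c*log A) * exp((z + p*log A) * t) * (1 - eps)."""
    log_area = math.log(area)        # assumption: natural logarithm
    initial = lam + c * log_area     # area-scaled initial rate
    decline = z + p * log_area       # area-scaled rate of decline
    return initial * math.exp(decline * age) * (1.0 - eps)


# Hypothetical values. With c, z and p left at zero the rate is constant.
constant = diversification_rate(area=1e5, age=10.0, lam=0.3, eps=0.9)

# A negative z gives a rate that declines with clade age.
young = diversification_rate(area=1e5, age=1.0, lam=0.3, z=-0.1, eps=0.9)
old = diversification_rate(area=1e5, age=10.0, lam=0.3, z=-0.1, eps=0.9)
```

Leaving c, z and p at their default of zero reduces the expression to a constant rate, λ(1 - ε), which makes the role of each additional parameter easy to inspect.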

We also fitted five simplifications of this model by fixing sets of parameters at zero: a constant diversification rate across clades (c, z and p fixed), a constant diversification rate scaled by individual clade area (z and p fixed), an exponential decline in rate within clades (c and p fixed), an exponential decline from an initial λ scaled by area (p fixed) and an exponential decline at a rate z scaled by area (c fixed). We optimized parameter estimates for the free variables in each model by maximizing the sum of log likelihoods of the observed species richness (n) across clades given clade age and the model estimates (following Bokma 2003; Ricklefs 2009; Phillimore 2010). The models were not nested and we therefore used AIC to assess relative model support. As the two methods to define regions gave qualitatively similar results, we report only the TDWG analysis here (see ESM for biotic region results).
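The AIC comparison can be sketched using the values reported in Table 2 for the clades younger than 20 MY. Two assumptions are made here: the "Likelihood" column is read as a log likelihood, and the number of free parameters is counted from the non-dashed entries in each row (λ and ε are free in every model). The sketch uses plain AIC without a small-sample correction, so differences may deviate slightly from the tabulated ΔAICc values.

```python
def aic(log_likelihood, n_free_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_free_params - 2 * log_likelihood


# model number -> (free parameters, log likelihood), from Table 2a.
# Parameter counts are inferred from the table, not stated in the text.
models_20my = {
    1: (2, -1410.0),
    2: (3, -1366.2),
    3: (3, -1391.4),
    4: (4, -1323.3),
    5: (4, -1306.3),
    6: (5, -1295.8),
}

scores = {m: aic(ll, k) for m, (k, ll) in models_20my.items()}
best_model = min(scores, key=scores.get)
delta_aic = {m: s - scores[best_model] for m, s in scores.items()}
```

Under these assumptions the differences agree with the tabulated values to within rounding of the reported log likelihoods.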

Section 1: A verbal model for clade diversification in space

Diversity-dependent models of diversification have two main features: a growth phase, where the clade in question diversifies until it reaches an external limit; and an equilibrium phase, where species identity turns over but clade size fluctuates about that limit (Sepkoski 1978; Alroy 1998). The precise shape of diversity-dependent diversification has been debated (Nee et al. 1992; Rabosky 2009a), but the exact shape of the diversification trajectory should not change the broad-scale implications of the existence of diversity-dependent diversification. There is taxonomic, phylogenetic, and paleontological evidence to support the existence of diversity-dependent diversification in many cases, described variously as “ecological limits on diversity,” “diversification slowdowns,” and “diversity equilibria” (Sepkoski 1976; Rabosky 2009a; Alroy 1998; Nee et al. 1992; Rabosky & Lovette 2008; Vamosi & Vamosi in press).

A variety of processes could generate diversity-dependent diversification. Perhaps the most commonly referenced is a model of ecological limits wherein, as available niches are filled, speciation declines and new species are only added to a region following extinctions and release of sufficient niche space (McKinney 1998; Rabosky 2009a). Such a mechanism would provide a link between the ecological processes typically associated with SARs and the evolutionary processes being proposed here. Alternatively, reduction of both population and range sizes as diversity increases could lead to decreased rates of speciation and increased rates of extinction and thus a diversification slowdown conceivably divorced from any niche-based mechanism (Pigot et al. in press; Rosenzweig 1975).

Within our diversity-dependent framework, there are only three features of a clade’s diversification curve that can vary: the speed at which a region initially accumulates species (Fig. 1a), the diversity limit (or equilibrium species richness, Fig. 1b), and the age at which diversification begins (Fig. 1c) (see also Rabosky 2009a). Before equilibrium is reached, the richness of clades depends only on their age and their rate of diversification. In contrast, clade sizes at equilibrium depend on their diversity limits, which are controlled by the interaction of external factors with clade traits (Mallet submitted, and see below). SARs will emerge from this model whenever diversification rates and/or diversity limits are higher in larger regions (Fig. 2). When a clade inhabits multiple separate regions of different areas, the species richness of that clade will be higher in the larger regions, creating a SAR.
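The verbal model can be made concrete with a small numerical sketch. Logistic growth is used purely as one convenient form of diversity-dependent diversification (the text notes that the precise shape is debated), and the way the initial rate and the diversity limit scale with area, along with every coefficient and function name below, is hypothetical.

```python
import math


def logistic_richness(t, rate, limit, start=1.0):
    """Species richness at time t under logistic (diversity-dependent) growth."""
    return limit / (1.0 + (limit / start - 1.0) * math.exp(-rate * t))


def region_richness(area, t, rate_coef=0.05, limit_coef=2.0, z=0.3):
    """Richness of one region when rate and limit both increase with area."""
    rate = rate_coef * math.log10(area)  # hypothetical scaling of initial rate
    limit = limit_coef * area ** z       # hypothetical power-law diversity limit
    return logistic_richness(t, rate, limit)


# Three hypothetical regions sampled before and after reaching equilibrium.
toy_areas = [1e4, 1e5, 1e6]
early = [region_richness(a, t=5.0) for a in toy_areas]
late = [region_richness(a, t=500.0) for a in toy_areas]
```

In both samples richness increases with area: early on because larger regions accumulate species faster, and later because they have higher limits. Each region contributes one point to the SAR at each sampling time.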

Figure 1. Variation in patterns of clade diversification from A) initial rate of diversification, B) equilibrium diversity, C) clade age and D) reinforcing (solid grey) and opposing (dashed grey) combinations of rate and equilibrium diversity. Sampling clade diversity at the time specified by the vertical line demonstrates the variation possible.

Figure 2. The development of a species area relationship (SAR) across three regions (X, Y, Z), in which both initial rate of diversification and equilibrium diversity increase with area (A). The resulting SAR across regions (B) exhibits power law scaling both before (dashed line) and after (solid line) the regions have reached equilibrium diversity. It is important to discriminate between the clade diversification curves (A) and SARs (B); each region will follow a particular diversification trajectory but contributes a single point to the SAR.

Globally, mammalian species richness shows strong scaling with area between non-nested provinces for both TDWG and clustered regions at all four scales (Fig. 3). These relationships are well described by power laws but there are differences between the two region types (Fig 3a): clustered regions show consistent slopes across changing scales (0.41 - 0.43), whereas TDWG regions show a decline in slope from 0.47 to 0.24 with increasing subdivision. These slopes lie within the range of 65 previously reported slopes from mammal power law SARs (Fig 3b; Drakare et al. 2006), but the higher values fall toward the top of the reported range (92% quantile). The change in slope between TDWG scales is accompanied by higher intercepts (Table S1, Figures S3 and S4) and is primarily driven by small political units, such as the Vatican City and Likoma, within species-rich areas (Fig 3c); these outliers are not found in small regions based on mammalian biotas (Fig 3d). In all cases, endemic species also show significant scaling with area but with reduced slopes compared to total and non-endemic species richness (Fig 3b-c, Table S1, Figures S3 and S4).

Figure 3. A) Slopes and their standard errors of species area relationships (SARs) for 4560 terrestrial mammals at four different scales across geopolitical regions (T 1-4) and biotic regions (C 1-4). B) Distribution of power law exponents from mammalian SARs showing the range of non-nested region sizes considered (grey lines – data from Drakare et al. 2006; black lines – values from panel above). Scatterplots show the distribution and least squares fit of SARs for T4 (C) and C4 (D) for total (black) and endemic species richness (grey). See also Table S1 and Figures S3 and S4.

We also tested how well area explains variation in diversification rate across sets of mammalian clades. For both sets of clades (crown group age < 20 MY and < 10 MY, Tables 2, S2), an exponential decline in diversification rate is best supported, demonstrating apparent limits to diversity. For clades younger than 20 MY, the most complex model was best supported, with clades occupying larger areas having increased initial diversification rates and decreased rate of decline. For clades < 10 MY, a simpler model, with area affecting only the rate of decline, could not be rejected. These results suggest that for mammals the decline in diversification rate as a region fills is more strongly affected by available area than the initial rate. Nevertheless, support for an effect of available area on initial rate was still found for both clade sets and the similar likelihoods for the younger clades may simply reflect individual clade differences within the set tested (see also Cardillo et al. 2005; Linder 2008).

Table 2. Summary of diversification models fitted to mammalian clades with crown age younger than (a) 20 and (b) 10 million years before present using the TDWG Level 4 regions. Six models of diversification are fitted representing: constant rate (1), constant rate scaled by region area (2), exponential decline (3), and exponential decline with region area scaling the initial rate (4), the rate of decline (5) or both (6). In each case, the maximum likelihood estimate of the model is reported for each free parameter within the bounds shown. Dashed parameter estimates were fixed at zero. The overall best-fit model for each period is the one with ΔAICc = 0.0. Results for biotic regions are presented in Table S2.

a) TDWG Level 4, 20 MY

| Model | Lambda [-1,1] | c [-0.2,0.2] | z [-0.2,0.2] | p [-0.2,0.2] | Epsilon [0.5,0.999] | ΔAICc | Likelihood |
|---|---|---|---|---|---|---|---|
| 1 | 0.340 | --- | --- | --- | 0.990 | 222.4 | -1410.0 |
| 2 | -0.300 | 0.040 | --- | --- | 0.990 | 136.8 | -1366.2 |
| 3 | 0.790 | --- | -0.030 | --- | 0.990 | 187.2 | -1391.4 |
| 4 | -0.300 | 0.040 | -0.030 | --- | 0.610 | 53.1 | -1323.3 |
| 5 | 0.474 | --- | -0.138 | 0.007 | 0.814 | 19.1 | -1306.3 |
| 6 | -0.260 | 0.040 | -0.100 | 0.004 | 0.610 | 0.0 | -1295.8 |

b) TDWG Level 4, 10 MY

| Model | Lambda [-1,1] | c [-0.2,0.2] | z [-0.2,0.2] | p [-0.2,0.2] | Epsilon [0.5,0.999] | ΔAICc | Likelihood |
|---|---|---|---|---|---|---|---|
| 1 | 0.265 | --- | --- | --- | 0.999 | 164.2 | -1578.3 |
| 2 | -0.223 | 0.030 | --- | --- | 0.990 | 93.4 | -1541.9 |
| 3 | 0.530 | --- | -0.043 | --- | 0.999 | 110.6 | -1550.5 |
| 4 | -0.193 | 0.031 | -0.043 | --- | 0.520 | 30.0 | -1509.1 |
| 5 | 0.377 | --- | -0.232 | 0.012 | 0.711 | 0.0 | -1494.1 |
| 6 | -0.064 | 0.023 | -0.120 | 0.005 | 0.500 | 1.16 | -1493.7 |

Section 2: Generating SARs in an evolutionary framework

In explanations of SARs, area is frequently viewed as a proxy or summary variable (Hubbell 2001) acting only indirectly via other variables, such as population size and habitat diversity, that are highly correlated with area (MacArthur & Wilson 1967). The individual effects of area and such correlated factors are difficult to separate in practice (Triantis et al. 2003; Kallimanis et al. 2008), and their relative importance is likely to vary depending on the taxon concerned (Rosenzweig 1995; Ricklefs & Lovette 1999). However, we believe that area could conceivably have some direct effects, and we discuss these first.

Direct effects of area

We can see only two ways that area could control diversity directly (i.e. without invoking increased population sizes or habitat variety). Firstly, extinction rates should be lower in larger regions, in which refuge populations are more likely to survive after any catastrophic disturbance affecting only part of the region (Wiley & Wunderle 1994). Secondly, if populations are patchily distributed, speciation rates should be higher in larger areas (Losos & Schluter 2000), where distances between populations can be larger and barriers that can cause vicariant speciation are likely to be larger and more numerous (Rosenzweig 1995). It could be argued that the effect of barriers is really an indirect effect of area via fragmentation, and we discuss this point further below. Greater geographic isolation between populations will lead to higher speciation rates if: 1) there is sufficient selection pressure and/or genetic drift to drive population divergence through to reproductive isolation (although there is no evidence for speciation via genetic drift on its own: Coyne & Orr 2004); 2) gene flow is the main force preventing population divergence and speciation (Slatkin 1987); and 3) the regions considered are large enough for populations to be sufficiently isolated to permit speciation. The definition of ‘large enough’ will depend on the dispersal ability of the organism and the strength of selection relative to gene flow, as poorer dispersers will attain sufficient isolation in smaller regions (Kisel & Barraclough 2010), as will species whose populations experience stronger divergent selection (Slatkin 1973; Slatkin 1985).

Effects of area via population size

Because larger regions are able to support greater total numbers of individuals (Brown 1995), and thus are also likely to have species with larger population sizes, the effects of population size on diversification can contribute to the generation of SARs. In fact, many of the effects of population size that we describe below have previously been described as direct effects of area itself (MacArthur & Wilson 1967; Ricklefs & Lovette 1999). It is well established that larger populations are less likely to go extinct, as they are more buffered from the effects of demographic stochasticity, environmental disasters, and habitat loss (Lande 1993; Rosenzweig 1995). Additionally, there are three ways that larger population size may drive higher speciation rates. First, new beneficial mutations will arise faster in larger populations (Willi et al. 2006), allowing faster divergence between separated populations if mutation limits speciation (Schluter 2009). Second, larger populations hold more standing genetic variation (Frankham 1996; Leimu et al. 2006) for selection to work on (Schluter & Conte 2009; Weber 1990). Third, newly isolated populations resulting from the break-up of larger populations will also be larger, and therefore more likely to survive long enough to diverge into new species (Chown & Gaston 2000). In addition to effects on rates of diversification, the total abundance of individuals supported by a region places a hard limit on the number of species that the region can hold. If we assume that all species are ecologically identical and so have the same minimum viable population size (Gilpin & Soulé 1986; Hubbell 2001), then larger regions will be able to support more species at sustainable equilibrium population sizes.
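The hard limit described here reduces to simple arithmetic, sketched below. The assumption of a constant density of individuals per unit area is added for illustration and is not made in the text; the function name and all numbers are hypothetical.

```python
def max_species(area_km2, density_per_km2, min_viable_population):
    """Upper bound on species number if all share one minimum viable size."""
    total_individuals = area_km2 * density_per_km2
    return total_individuals // min_viable_population


# Hypothetical values: 50 individuals per km2, minimum viable size 5000.
small_region = max_species(1_000, 50, 5_000)
large_region = max_species(10_000, 50, 5_000)
```

Under these assumptions the limit scales linearly with area, so a tenfold increase in area permits ten times as many species at the minimum viable population size.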

Effects of area via habitat diversity and fragmentation

Some authors have suggested that SARs are only a proxy for the scaling of species richness with habitat diversity (MacArthur & Wilson 1967; Baldi 2008; Triantis et al. 2003; Losos & Parent 2010), and indeed habitat diversity and area are typically very highly correlated. Along steep environmental gradients, and in heterogeneous habitats, populations can more easily become specialised to different habitats, making ecological speciation more likely and perhaps more rapid (Schluter 2009). Regions with high habitat diversity also have a higher number of possible distinct niches or niche combinations (Hutchinson & MacArthur 1959), thus increasing the number of species that can coexist at equilibrium.

High levels of regional fragmentation can also elevate diversification rate and diversity limits, by providing a textured landscape with subunits that are physically isolated from one another but environmentally equivalent. Barrier formation can occur through many processes, including river formation, mountain building, sea-level fluctuations, volcanic uplift, and habitat fragmentation, and is more likely in larger regions. Barriers elevate diversification rate by separating previously interacting populations, which are then more likely to evolve reproductive isolation (Rosenzweig 1995). In addition, fragmentation can boost equilibrium diversity, as ecologically equivalent species can be maintained in separated sub-regions (Shmida & Wilson 1985; Orme et al. in prep). For example, Esselstyn et al. (2009) suggest that tree shrew diversity in the Philippines has arisen predominantly via speciation in allopatry on newly formed islands, with limited apparent morphological or ecological differentiation. One particularly important measure of regional fragmentation is topographic complexity, as environmental turnover along altitude gradients is a barrier to many species’ ranges (McInnes et al. 2009). The richness of uniquely adapted, restricted-range endemics found along altitudinal transects in tropical mountains is perhaps the classic example of such fine-scale spatial partitioning (Janzen 1967; Rahbek & Graves 2001).

The effects of fragmentation on species richness will show a complex relationship to the total summed area of the subunits. While greater fragmentation of a region may permit more species to exist within the same total area, it may also push the area of the component fragments below a size which can maintain viable populations (Gilpin & Soulé 1986; Maurer & Nott 1998) or generate endemics (Losos & Schluter 2000; Kisel & Barraclough 2010). Thus, plots of species richness against total area occupied may not yield significant relationships unless the degree of fragmentation is also considered and total area is scaled appropriately (see Orme et al. in prep). In addition, the dispersal ability of a clade in combination with the geographic structure of the fragments will influence the number of fragments that can be occupied. Finally, the effect of barriers will depend on the average range sizes of species in a region: if the average range size is small, barriers need not be large or bisect an entire region to cause speciation (Rosenzweig 1975).

Attesting to the importance of environmental features in the generation of SARs, increased elevational range is associated with higher diversity in both geopolitical and biotic regions; habitat diversity also drives higher diversity, but only in geopolitical regions (Table 3, Figure S4). This arises from differences between the clustering methods: areas with similar habitat are likely to be biotically homogenous and therefore form a single biotic region, whereas political boundaries are more likely to cut across such regions. As a result, Simpson’s index (1-D) of habitat diversity is low in biotic clusters and scales extremely weakly with region area (intercept: 0.227, se = 0.042, t=3.83; slope: 0.018, se = 0.014, t = 1.27; df = 148) whereas in TDWG regions it is higher and scales strongly with area (intercept: 0.356, se = 0.025, t=14.39; slope: 0.055, se = 0.005, t = 10.27; df = 578). In all these models, the high relative magnitude of the standardized parameter estimate for area also implies it is not simply acting as a proxy for either variable.


Table 3. Multivariate regressions of SARs including a) habitat diversity, b) log range in elevation, c) mean annual temperature and d) mean NDVI. The models are fitted to log 10 species richness within both geopolitical and biotic regions and the explanatory covariates in all models are centred and standardized to facilitate model comparison (* p < 0.05, ** p < 0.01, *** p < 0.001). The number of regions with available data is shown for each model.

Model  Term                 Geopolitical regions          Biotic regions
                            Estimate   SE      Sig.       Estimate   SE      Sig.
a)     n                    580                           150
       Intercept             1.7419    0.0186  ***         1.2670    0.0342  ***
       Habitat diversity     0.0234    0.0194             -0.0649    0.0340  .
       log Area              0.3407    0.0203  ***         0.7608    0.0352  ***
       Interaction           0.0410    0.0152  **         -0.0340    0.0325
b)     n                    578                           200
       Intercept             1.6810    0.0202  ***         1.0211    0.0391  ***
       log Elevation range   0.0533    0.0297  .           0.2383    0.0573  ***
       log Area              0.3956    0.0237  ***         0.5869    0.0478  ***
       Interaction           0.1114    0.0162  ***         0.1923    0.0343  ***
c)     n                    477                           130
       Intercept             1.8201    0.0160  ***         1.0802    0.0496  ***
       NDVI                  0.0142    0.0174              0.1947    0.0521  ***
       log Area              0.3075    0.0218  ***         0.8997    0.0536  ***
       Interaction           0.1007    0.0218  ***        -0.0442    0.0550
d)     n                    525                           196
       Intercept             1.7463    0.0158  ***         1.1339    0.0343  ***
       Temperature           0.0577    0.0168  ***         0.2750    0.0407  ***
       log Area              0.4110    0.0201  ***         0.8411    0.0341  ***
       Interaction           0.0979    0.0179  ***        -0.0545    0.0432
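To show how the coefficients in Table 3 can be read, the sketch below evaluates model a) for geopolitical regions at chosen values of the standardized covariates. It assumes the fitted form is log10 richness = intercept + b_cov * z_cov + b_area * z_area + b_inter * z_cov * z_area, with the "Interaction" row taken to be the covariate by log area term; the function and variable names are illustrative and are not from the original analysis.

```python
def predict_log10_richness(intercept, b_cov, b_area, b_inter, z_cov, z_area):
    """Predicted log10 species richness for standardized covariate values.

    z_cov and z_area are in standard-deviation units from their means,
    because the covariates in Table 3 are centred and standardized.
    """
    return intercept + b_cov * z_cov + b_area * z_area + b_inter * z_cov * z_area


def area_slope(model, z_cov):
    """Slope of log10 richness on standardized log area at a given covariate value."""
    return model["b_area"] + model["b_inter"] * z_cov


# Coefficients copied from Table 3, model a), geopolitical regions.
GEO_HABITAT = dict(intercept=1.7419, b_cov=0.0234, b_area=0.3407, b_inter=0.0410)

# Richness at mean habitat diversity, one standard deviation above mean log area.
print(predict_log10_richness(z_cov=0.0, z_area=1.0, **GEO_HABITAT))

# The positive interaction steepens the area slope where habitat diversity is high.
print(area_slope(GEO_HABITAT, z_cov=-1.0), area_slope(GEO_HABITAT, z_cov=1.0))
```

Because all predictors share the same standardized scale, comparing b_area with b_cov gives the relative magnitudes referred to in the text.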


Section 3: Abiotic factors modulating the species-area relationship

Some abiotic factors, such as energy availability, do not correlate closely with area but may still affect diversification rates or diversity limits of different regions, leading to departures from SARs that depend on a region’s prevailing environmental conditions.

Energy availability is one of the key variables thought to contribute to large-scale spatial patterns of diversity, and has mainly been discussed for its part in generating latitudinal differences in diversity (reviewed in Willig et al. 2003; Mittelbach et al. 2007). On average, energy availability (either ambient, e.g. temperature, or productive, e.g. plant biomass) explains 60% of the variation in broad-scale richness across a range of plant and animal groups (Hawkins et al. 2003). This variation should lead to consistent differences between SARs of high- and low-energy regions.

As expected, increases in both mean annual temperature and mean NDVI act to significantly elevate both overall mammal diversity and slopes of mammalian SARs (Table 3, Figure S5). Again though, as in analyses including habitat and topographical diversity, the relative magnitudes of standardized regression coefficients show that area is the main driver of diversity within regions.

We expect energy to affect SARs through both diversification rates and diversity limits. First, it could affect speciation rates through faster rates of molecular evolution, with increased metabolic rates in higher-energy regions leading to both shorter generation times and higher mutation rates (Rohde 1992). There has been mixed evidence for this molecular rate hypothesis, with particularly weak support in endotherms (Cardillo et al. 2005) and no support in angiosperms (although a direct effect of energy on species richness is supported: Davies et al. 2004). However, Gillman et al. (2009) recently presented evidence for higher rates of microevolution in tropical mammals and explained this as an indirect consequence of more rapid co-evolution with other tropical ectotherms (see also Fischer 1960; Schemske 2002). Energy is also expected to increase diversification rates through effects on population dynamics, as aseasonal and elevated productive energy can support larger populations, resulting in increased speciation and reduced extinction, as described above. Such an aseasonal and high-energy environment will also increase the equilibrium diversity limit by increasing resource availability, facilitating specialisation to very narrow niches, and thus increasing the number of distinct niches available (Janzen 1967). Conversely, seasonal habitats in temperate regions may select for more motile, generalist species. These traits should decrease both speciation rate and the number of species that can be supported in a region (Dynesius & Jansson 2000; Sheldon 1996). Although not attempted here, incorporating ecological covariates into our diversification models could lend insight into the effects of, for example, energy availability on the diversification trajectory of clades in different regions (Vamosi & Vamosi in press).

Section 4: Clade traits modulating the species-area relationship

So far our framework has considered species richness within a region as an outcome of solely environmental and geographic influences, taking a neutral view of the organisms themselves (MacArthur & Wilson 1967). However, there is abundant research (reviewed in Coyne & Orr 2004) indicating that species traits affect clade diversity. Any clade traits that affect diversity will give rise to clade-specific SARs, and create scatter around SARs that aggregate species richness across multiple clades. The effects of clade traits on SARs are reflected in the clear differences between mammalian orders in the scaling of species richness with area: order-specific slopes range from -1.71 to 0.59, with medians of 0.16 for clustered regions and 0.11 for geopolitical regions (Table S3; because regions are not nested, negative slopes arise simply where orders have high diversity in small regions).
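The slopes discussed above come from linear regressions of log species richness on log region area. As a minimal sketch of that kind of fit, the Python below estimates the intercept and slope by ordinary least squares on log10-transformed values. The region areas and richness values are hypothetical, and the use of base-10 logarithms for area is an assumption.

```python
import math


def fit_log_log_sar(areas, richness):
    """Fit log10(S) = intercept + slope * log10(A) by ordinary least squares."""
    if len(areas) != len(richness) or len(areas) < 2:
        raise ValueError("need at least two paired observations")
    xs = [math.log10(a) for a in areas]
    ys = [math.log10(s) for s in richness]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical regions generated from S = 2 * A ** 0.25, so the fit
# should recover a slope close to 0.25 and an intercept close to log10(2).
areas = [10.0, 100.0, 1000.0, 10000.0]
richness = [2.0 * a ** 0.25 for a in areas]

intercept, slope = fit_log_log_sar(areas, richness)
print(intercept, slope)
```

Fitting the same function separately to each order's richness values would yield order-specific slopes of the kind summarised in Table S3.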

According to our general model, clade traits can modulate SARs by modifying the net rate of diversification (Fig 1a) and/or the diversity limit (Fig 1b). It is not straightforward to assign traits to one of these mechanisms. Firstly, data are lacking: studies analysing differences between clades in diversification (reviewed in: Jablonski 2008b; Rabosky & McCune 2010) have not discriminated between effects on diversification rate and effects on diversity limits (but see Vamosi & Vamosi in press), and studies of diversification slowdowns in phylogenies (e.g. Phillimore & Price 2008) have not investigated the influence of species’ traits. Secondly, individual traits are unlikely to act solely through modification of either diversification rates or diversity limits (Mallet submitted). Finally, many clade traits are strongly correlated (for example, geographic range size, dispersal distance and body size: Jablonski 2008b; Jones et al. 2009) and so any traits acting through one mechanism are likely to be associated with traits acting through the other.
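The distinction between a trait acting on the net rate of diversification and one acting on the diversity limit can be illustrated with a simple diversity-dependent growth curve. The sketch below assumes a logistic form, dN/dt = r * N * (1 - N / K), integrated with Euler steps; this functional form and the parameter names r (initial net rate) and K (diversity limit) are assumptions made for illustration and may differ from the model fitted in this appendix.

```python
def simulate_clade(r, K, n0=1.0, t_max=50.0, dt=0.01):
    """Return clade richness at time t_max under logistic diversity dependence.

    r is the initial net diversification rate, K the diversity limit and
    n0 the starting richness. Euler integration with step dt.
    """
    n = n0
    steps = int(round(t_max / dt))
    for _ in range(steps):
        n += r * n * (1.0 - n / K) * dt
    return n


# Two clades sharing a limit but differing in rate: the faster clade is
# much closer to the limit after the same elapsed time.
print(simulate_clade(r=0.1, K=100.0, t_max=20.0))
print(simulate_clade(r=0.5, K=100.0, t_max=20.0))

# Two clades sharing a rate but differing in limit: given enough time
# they settle at different richness values.
print(simulate_clade(r=0.5, K=50.0, t_max=100.0))
print(simulate_clade(r=0.5, K=100.0, t_max=100.0))
```

In this toy setting the two mechanisms are separable only when clade age is known, which echoes the difficulty of assigning traits to one mechanism from richness data alone.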

Below, we discuss traits expected to influence SARs, with particular emphasis on those that affect species’ use of space.

While most traits are likely to influence both diversification rates and diversity limits, life history traits are perhaps the only class of traits expected to influence only diversification rate. Typically, r-selected species exhibit higher net rates of diversification than K-selected species, and several mechanisms have been proposed to explain this (Mayhew 2007; Marzluff & Dial 1991). Short generation times are associated with high rates of population increase and the ability to rapidly exploit favourable conditions (Mayhew 2007), conferring resilience to disturbance and leading to lower rates of extinction. They are also associated with increased rates of evolution due to shorter nucleotide generation times (Martin & Palumbi 1993; Mittelbach et al. 2007), and higher metabolic rates (Martin & Palumbi 1993), both leading to higher rates of speciation. In addition, the larger population sizes associated with r selection should increase speciation rates and decrease extinction rates, as discussed in Section 2.

Clade traits that determine how space is occupied within a region also affect both the generation and maintenance of SARs. Larger species ranges are associated with lower clade diversity limits as well as reduced rates of extinction (e.g. Jablonski 2008a; Payne & Finnegan 2007), and increased rates of speciation (Phillimore et al. 2006, but see Jablonski & Roy 2003). Regarding diversity limits, there is evidence from both mammals (Orme et al. in prep) and birds (Phillimore et al. 2008) that increasing species’ range overlap is a stronger predictor of increased species richness than decreased median range size.

Similarly to species’ range size, several aspects of narrow niche breadth, such as ecological specialisation, high host specificity and narrow environmental tolerances, have been associated with increased diversity limits as well as increased rates of extinction and speciation (Jablonski 2008b). Increased clade diversity is also associated with greater niche overlap rather than decreased niche breadth.


Ricklefs (2009) has shown that South American bird families of varying species richness do not differ in the average number of habitats occupied by species, suggesting that niche overlap increases. We find the same in mammals, using simple measures of the number of habitats used by species. There is no significant correlation between the richness of mammalian families and either the average total number of habitats occupied (tau = -0.076, p = 0.21) or the average number of major habitats occupied (tau = -0.043, p = 0.49), nor is there a decrease in mean species range size with increasing family richness (tau = -0.001, p = 0.99).
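The tau values reported above are rank correlations. As a minimal sketch, assuming Kendall's tau-b (the tie-corrected form) is the statistic intended, the Python below computes it by direct pair counting. The family richness and habitat values are hypothetical, and no p-value is computed.

```python
import math


def kendall_tau_b(xs, ys):
    """Kendall's tau-b rank correlation, computed by comparing all pairs."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    concordant = discordant = ties_x = ties_y = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = xs[i] - xs[j]
            dy = ys[i] - ys[j]
            if dx == 0:
                ties_x += 1  # pair tied on the first variable
            if dy == 0:
                ties_y += 1  # pair tied on the second variable
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    n_pairs = n * (n - 1) // 2
    denom = math.sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    if denom == 0:
        raise ValueError("tau is undefined when a variable is constant")
    return (concordant - discordant) / denom


# Hypothetical families: species richness and mean number of habitats per species.
family_richness = [3, 12, 45, 7, 120, 28]
mean_habitats = [2.0, 3.5, 2.5, 4.0, 3.0, 2.0]
print(kendall_tau_b(family_richness, mean_habitats))
```

A value near zero, as in the results above, indicates that richer families do not consist of species using systematically fewer habitats.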

Finally, increased dispersal ability has been found to reduce speciation and extinction rates in some cases (Xiang et al. 2004), while in others it has been shown to increase diversification rate (Phillimore et al. 2006; Phillimore & Price 2009). With respect to diversity limits, high dispersal ability may lead to low equilibrium diversity within a region if it leads to clades consisting of few species with large ranges. At the other extreme, strong philopatry, where individuals retain or return to natal locations, might both increase rates of diversification by accelerating rates of genetic differentiation (Peterson 2008) and increase equilibrium diversity by impeding range expansion and boosting the number of equivalent species that can persist in a region (Shmida & Wilson 1985; Seehausen 2006). Alternatively, high dispersal ability can increase the rate at which new regions are occupied, increasing clade richness through occupation of multiple regions. Such long-distance dispersal may significantly distort SARs if newly colonised regions harbour clades with higher diversity due to competitive release.

Section 5: Historical and temporal effects on the species-area relationship


SARs will be clearest when clades have reached equilibrium throughout their ranges, but this requires that they have had enough time to diversify to their limit in each region that they occupy. Thus, in parts of the world where the current habitat has only recently become available, current diversity is likely to be lower than expected (e.g. a recently formed island, Esselstyn et al. 2009, or a recently deglaciated region, Pielou 1979) and may be biased toward large-ranged generalists (Dynesius & Jansson 2000). In contrast, a comparison of mammalian sister taxon pairs with disjunct distributions across two realms indicated that sisters remaining in the realm unambiguously reconstructed as ancestral (DIVA: Ronquist 1997) are significantly less species rich (12 out of 41, binomial p = 0.004; Table S4) than sisters that dispersed. This suggests a diversification burst in newly colonized regions, driven by competitive release. Finally, if a region is subject to frequent extrinsic perturbations (such as an archipelago subject to repeated sea-level changes), fluctuating extinction rates make it unlikely that equilibrium diversity will ever be reached or maintained (Whittaker et al. 2008; Esselstyn et al. 2009). Indeed, explanations for high tropical diversity, such as the time-for-speciation effect (Stephens & Wiens 2003) and reduced extinction due to long-term climatic stability (Fischer 1960), are compatible with tropical regions being able to more closely approach diversity limits.
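The sister-pair comparison above is a sign test: under the null hypothesis each sister is equally likely to be the richer one, so the count of pairs favouring one side follows a binomial distribution with probability 0.5. The sketch below shows the general calculation in Python. How tied pairs were handled and which tail or test variant was used are not stated here, so this is an illustration of the method under those assumptions and is not expected to reproduce the reported p-value exactly.

```python
from math import comb


def binomial_lower_tail(k, n, p=0.5):
    """Probability of observing k or fewer successes in n Bernoulli(p) trials."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k + 1))


def sign_test_two_sided(k, n):
    """Two-sided sign test p-value for k of n pairs falling on one side."""
    tail = binomial_lower_tail(min(k, n - k), n)
    return min(1.0, 2.0 * tail)


# Counts taken from the text: 12 of 41 sister pairs.
one_sided = binomial_lower_tail(12, 41)
two_sided = sign_test_two_sided(12, 41)
print(one_sided, two_sided)
```

Exact binomial probabilities are used instead of a normal approximation because the number of pairs is small.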

Diversity may also transiently over- or under-shoot the diversity limit of a region if speciation or extinction occurs very rapidly, or if perturbations occur that suddenly alter clade diversity limits (Gavrilets & Vose 2005). Alternatively, non-ecological modes of speciation (e.g. via sexual selection or polyploidy) may produce transient species that are unable to persist in the long term given the niche space available, and thus are committed to eventual extinction (McPeek 2008; Rosenzweig 1995; Chesson 2000). This may also apply to ecologically equivalent species formed in allopatry, if the barriers separating them are themselves transient. Transient dynamics are now thought to be crucial in predicting biodiversity responses to current global change (recently reviewed in Jackson & Sax 2010); though the changes will likely not be as immediately apparent as for ecological processes such as community assembly, evolutionary clade dynamics will certainly be affected as well (Rosenzweig 2001).

Conclusions

We have presented a framework, based on a diversity-dependent model of clade diversification, for understanding how evolutionary processes contribute to the creation of large-scale SARs. This framework is supported by analyses on mammals using data from the PanTHERIA database (Jones et al. 2009). SARs themselves result from direct and positive effects of area on diversification rates and diversity limits, as well as indirect effects of area through population size, habitat diversity, and habitat fragmentation. We found that these effects are apparent in the histories of mammal diversification: clades occupying larger areas had higher initial diversification rates and lower rates of decline in diversification. We also confirmed that habitat and topographical diversity are significant predictors of regional diversity in mammals, but found that neither is a proxy for area: the most predictive models of diversity always include area as well. Environmental factors and clade traits that are not tightly correlated with area also cause systematic differences in SARs between clades or regions, and cause scatter around any general SAR generated without accounting for them. We tested the influence of energy availability on mammal diversity and showed that high energy availability significantly increases the slopes and intercepts of SARs. In addition, mammal orders vary greatly in the slopes of their SARs. Finally, we provide evidence that historical contingencies impact SARs, demonstrating that mammal clades able to colonize new, competitor-free regions are more diverse than their stay-at-home sisters.

Schoener (1976) referred to the species-area relationship as the phenomenon closest to attaining rule status in ecology, and SARs are indeed one of the most general diversity patterns, existing for a wide range of organisms across a range of spatial scales. However, we argue here that in addition to the processes most discussed in the ecological literature – immigration, local extinction and species coexistence – SARs are also influenced by macroevolutionary processes, in particular speciation and global extinction. None of these processes operates in isolation, and every SAR is the result of interplay between both ecological and evolutionary processes. Diversity limits, for instance, must ultimately result from ecological limits on the number of species that can coexist in a region, though the speed at which they are reached may depend on evolutionary processes. We suggest that a full understanding of species-area relationships will require integrating both ecological and evolutionary perspectives on the processes that generate and constrain diversity.


Acknowledgements

We thank Kate Jones for the invitation to contribute to this special issue and Tim Barraclough, Natalie Cooper, Susanne Fritz, Alex Pigot and James Rosindell for comments on previous versions of the manuscript. Special thanks are extended to Ally Phillimore for insightful discussion and comments, and R code for implementing the diversification models. YK was supported by a U.S. National Science Foundation Graduate Research Fellowship and a Deputy Rector’s Award from Imperial College London, LM by a Grantham Institute studentship, and CDLO by an RCUK fellowship.


References

Alroy, J. 1998 Equilibrial diversity dynamics in North American mammals. In Biodiversity Dynamics: Turnover of Populations, Taxa and Communities (ed. M. L. McKinney & J. A. Drake). New York: Columbia University Press.

Arrhenius, O. 1921 Species and area. Journal of Ecology 9, 95-99.

Baldi, A. 2008 Habitat heterogeneity overrides the species-area relationship. Journal of Biogeography 35, 675-681.

Bininda-Emonds, O. R. P., Cardillo, M., Jones, K. E., MacPhee, R. D. E., Beck, R. M. D., Grenyer, R., Price, S. A., Vos, R. A., Gittleman, J. L. & Purvis, A. 2007 The delayed rise of present-day mammals. Nature 446, 507-512.

Bininda-Emonds, O. R. P., Cardillo, M., Jones, K. E., MacPhee, R. D. E., Beck, R. M. D., Grenyer, R., Price, S. A., Vos, R. A., Gittleman, J. L. & Purvis, A. 2008 The delayed rise of present-day mammals (vol 446, pg 507, 2007). Nature 456, 274-274.

Bokma, F. 2003 Testing for equal rates of cladogenesis in diverse taxa. Evolution 57, 2469-2474.

Brown, J. H. 1995 Macroecology. Chicago: University of Chicago Press.

Brummit, R. K. 2001 World Geographical Scheme for Recording Plant Distributions. Pittsburgh, Pennsylvania: Hunt Institute for Botanical Documentation, Carnegie-Mellon University (for the International Working Group on Taxonomic Databases for Plant Sciences).

Cardillo, M., Orme, C. D. L. & Owens, I. P. F. 2005 Testing for latitudinal bias in diversification rates: An example using New World birds. Ecology 86, 2278-2287.

Chesson, P. 2000 Mechanisms of maintenance of species diversity. Annual Review of Ecology and Systematics 31, 343-366.

Chown, S. L. & Gaston, K. J. 2000 Areas, cradles and museums: the latitudinal gradient in species richness. Trends in Ecology & Evolution 15, 311-315.

Coyne, J. A. & Orr, H. A. 2004 Speciation. Sunderland, Massachusetts: Sinauer Associates, Inc.

Davies, T. J., Savolainen, V., Chase, M. W., Moat, J. & Barraclough, T. G. 2004 Environmental energy and evolutionary rates in flowering plants. Proceedings of the Royal Society of London Series B-Biological Sciences 271, 2195-2200.


Drakare, S., Lennon, J. J. & Hillebrand, H. 2006 The imprint of the geographical, evolutionary and ecological context on species–area relationships. Ecology Letters 9, 215-227.

Dynesius, M. & Jansson, R. 2000 Evolutionary consequences of changes in species' geographical distributions driven by Milankovitch climate oscillations. Proceedings of the National Academy of Sciences of the United States of America 97, 9115-9120.

Esselstyn, J. A., Timm, R. M. & Brown, R. M. 2009 Do geological or climatic processes drive speciation in dynamic archipelagos? The tempo and mode of diversification in southeast Asian shrews. Evolution 63, 2595-2610.

Fischer, A. G. 1960 Latitudinal variations in organic diversity. Evolution 14, 64-81.

Frankham, R. 1996 Relationship of genetic variation to population size in wildlife. Conservation Biology 10, 1500-1508.

Gavrilets, S. & Vose, A. 2005 Dynamic patterns of adaptive radiation. Proceedings of the National Academy of Sciences of the United States of America 102, 18040-18045.

Gillman, L. N., Keeling, D. J., Ross, H. A. & Wright, S. D. 2009 Latitude, elevation and the tempo of molecular evolution in mammals. Proceedings of the Royal Society B-Biological Sciences 276, 3353-3359.

Gilpin, M. E. & Soulé, M. E. 1986 Minimum viable populations: processes of species extinction. In Conservation biology: the science of scarcity and diversity (ed. M. E. Soulé), pp. 19-34. Sunderland, Massachusetts: Sinauer Associates, Inc.

Hawkins, B. A., Field, R., Cornell, H. V., Currie, D. J., Guegan, J. F., Kaufman, D. M., Kerr, J. T., Mittelbach, G. G., Oberdorff, T., O'Brien, E. M., Porter, E. E. & Turner, J. R. G. 2003 Energy, water, and broad-scale geographic patterns of species richness. Ecology 84, 3105-3117.

Hubbell, S. P. 2001 The Unified Neutral Theory of Biodiversity and Biogeography. Princeton: Princeton University Press.

Hutchinson, G. E. & MacArthur, R. H. 1959 A theoretical ecological model of size distributions among species of animals. The American Naturalist 93, 117.

Jablonski, D. 2008a Extinction and the spatial dynamics of biodiversity. Proceedings of the National Academy of Sciences of the United States of America 105, 11528-11535.


Jablonski, D. 2008b Species selection: theory and data. Annual Review of Ecology Evolution and Systematics 39, 501-524.

Jablonski, D. & Roy, K. 2003 Geographical range and speciation in fossil and living molluscs. Proceedings of the Royal Society B-Biological Sciences 270, 401-406.

Jackson, S. T. & Sax, D. F. 2010 Balancing biodiversity in a changing environment: extinction debt, immigration credit and species turnover. Trends in Ecology & Evolution 25, 153-160.

Janzen, D. 1967 Why mountain passes are higher in the tropics. American Naturalist 101, 233-249.

Jones, K. E., Bielby, J., Cardillo, M., Fritz, S. A., O'Dell, J., Orme, C. D. L., Safi, K., Sechrest, W., Boakes, E. H., Carbone, C., Connolly, C., Cutts, M. J., Foster, J. K., Grenyer, R., Habib, M., Plaster, C. A., Price, S. A., Rigby, E. A., Rist, J., Teacher, A., Bininda-Emonds, O. R. P., Gittleman, J. L., Mace, G. M., Purvis, A. & Michener, W. K. 2009 PanTHERIA: a species-level database of life history, ecology, and geography of extant and recently extinct mammals. Ecology 90, 2648-2648.

Kallimanis, A. S., Mazaris, A. D., Tzanopoulos, J., Halley, J. M., Pantis, J. D. & Sgardelis, S. P. 2008 How does habitat diversity affect the species-area relationship? Global Ecology and Biogeography 17, 532-538.

Kisel, Y. & Barraclough, T. G. 2010 Speciation has a spatial scale that depends on levels of gene flow. American Naturalist 175, 316-334.

Kreft, H. & Jetz, W. A framework for delineating biogeographical regions based on species distributions. Journal of Biogeography, in press.

Lande, R. 1993 Risks of population extinction from demographic and environmental stochasticity and random catastrophes. American Naturalist 142, 911-927.

Leimu, R., Mutikainen, P., Koricheva, J. & Fischer, M. 2006 How general are positive relationships between plant population size, fitness and genetic variation? Journal of Ecology 94, 942-952.

Linder, H. P. 2008 Plant species radiations: where, when, why? Philosophical Transactions of the Royal Society B-Biological Sciences 363, 3097-3105.


Linder, H. P., Lovett, J. C., Mutke, J., Barthlott, W., Jürgens, N., Rebelo, T. & Küper, W. 2005 A numerical re-evaluation of the sub-Saharan phytochoria of mainland Africa. Biologiske Skrifter 55, 229-252.

Lomolino, M. V. 2000 Ecology's most general, yet protean pattern: the species-area relationship. Journal of Biogeography 27, 17-26.

Los, S. O., Collatz, G. J., Sellers, P. J., Malmström, C. M., Pollack, N. H., DeFries, R. S., Bounoua, L., Parris, M. T., Tucker, C. J. & Dazlich, D. A. 2000 A global 9-year biophysical land-surface data set from NOAA AVHRR data. Journal of Hydrometeorology 1, 183-199.

Losos, J. B. & Parent, C. E. 2010 The speciation-area relationship. In The Theory of Island Biogeography Revisited (ed. J. B. Losos & R. E. Ricklefs), pp. 415-438. Oxford: Princeton University Press.

Losos, J. B. & Schluter, D. 2000 Analysis of an evolutionary species-area relationship. Nature 408, 847-850.

MacArthur, R. H. & Wilson, E. O. 1967 The Theory of Island Biogeography. Princeton: Princeton University Press.

Mallet, J. The 'struggle for existence:' why the mismatch of theory in ecology and evolution? American Naturalist, submitted.

Martin, A. P. & Palumbi, S. R. 1993 Body size, metabolic rate, generation time and the molecular clock. Proceedings of the National Academy of Sciences of the United States of America 90, 4087-4091.

Marzluff, J. M. & Dial, K. P. 1991 Life history correlates of taxonomic diversity. Ecology 72, 428-439.

Maurer, B. A. & Nott, M. P. 1998 Geographic range fragmentation and the evolution of biological diversity. In Biodiversity Dynamics: Turnover of Populations, Taxa and Communities (ed. M. L. McKinney & J. A. Drake). New York: Columbia University Press.

Mayhew, P. J. 2007 Why are there so many insect species? Perspectives from fossils and phylogenies. Biological Reviews 82, 425-454.


McInnes, L., Purvis, A. & Orme, C. D. L. 2009 Where do species' geographic ranges stop and why? Landscape impermeability and the Afrotropical avifauna. Proceedings of the Royal Society B-Biological Sciences 276, 3063-3070.

McKinney, M. L. 1998 Biodiversity dynamics: niche preemption and saturation in diversity equilibria. In Biodiversity Dynamics: Turnover of Populations, Taxa and Communities (ed. M. L. McKinney & J. A. Drake). New York: Columbia University Press.

McPeek, M. A. 2008 The ecological dynamics of clade diversification and community assembly. American Naturalist 172, E270-E284.

Mittelbach, G. G., Schemske, D. W., Cornell, H. V., Allen, A. P., Brown, J. M., Bush, M. B., Harrison, S. P., Hurlbert, A. H., Knowlton, N., Lessios, H. A., McCain, C. M., McCune, A. R., McDade, L. A., McPeek, M. A., Near, T. J., Price, T. D., Ricklefs, R. E., Roy, K., Sax, D. F., Schluter, D., Sobel, J. M. & Turelli, M. 2007 Evolution and the latitudinal diversity gradient: speciation, extinction and biogeography. Ecology Letters 10, 315-331.

Nee, S., Mooers, A. & Harvey, P. 1992 Tempo and mode of evolution revealed from molecular phylogenies. Proceedings of the National Academy of Sciences of the United States of America 89, 8322-8326.

Pagel, M. D., May, R. M. & Collie, A. R. 1991 Ecological aspects of the geographical distribution and diversity of mammalian species. American Naturalist 137, 791-815.

Palmer, M. W. & White, P. S. 1994 Scale dependence and the species-area relationship. American Naturalist 144, 717-740.

Payne, J. L. & Finnegan, S. 2007 The effect of geographic range on extinction risk during background and mass extinction. Proceedings of the National Academy of Sciences of the United States of America 104, 10506-10511.

Peterson, A. T. 2008 Philopatry and genetic differentiation in the Aphelocoma jays (Corvidae). Biological Journal of the Linnean Society 47, 249-260.

Phillimore, A. B. 2010 Subspecies origination and extinction in birds. Ornithological Monographs 67, 42-53.


Phillimore, A. B., Freckleton, R. P., Orme, C. D. L. & Owens, I. P. F. 2006 Ecology predicts large-scale patterns of phylogenetic diversification in birds. American Naturalist 168, 220-229.

Phillimore, A. B., Orme, C. D. L., Thomas, G. H., Blackburn, T. M., Bennett, P. M., Gaston, K. J. & Owens, I. P. F. 2008 Sympatric speciation in birds is rare: Insights from range data and simulations. American Naturalist 171, 646-657.

Phillimore, A. B. & Price, T. D. 2008 Density-dependent cladogenesis in birds. PLoS Biology 6, 483-489.

Phillimore, A. B. & Price, T. D. 2009 Ecological influences on the temporal pattern of speciation. In Speciation and Patterns of Diversity (ed. R. Butlin, J. R. Bridle & D. Schluter). Cambridge: Cambridge University Press.

Pielou, E. C. 1979 Biogeography. New York: Wiley.

Pigot, A. L., Phillimore, A. B., Owens, I. P. F. & Orme, C. D. L. The shape and temporal dynamics of phylogenetic trees arising from geographic speciation. Systematic Biology, in press.

Preston, F. W. 1960 Time and space and the variation of species. Ecology 41, 612-627.

Rabosky, D. L. 2009a Ecological limits and diversification rate: alternative paradigms to explain the variation in species richness among clades and regions. Ecology Letters 12, 735-743.

Rabosky, D. L. 2009b Ecological limits on clade diversification in higher taxa. American Naturalist 173, 662-674.

Rabosky, D. L. & Lovette, I. J. 2008 Density-dependent diversification in North American wood warblers. Proceedings of the Royal Society B-Biological Sciences 275, 2363-2371.

Rabosky, D. L. & McCune, A. R. 2010 Reinventing species selection with molecular phylogenies. Trends in Ecology & Evolution 25, 68-74.

Rahbek, C. & Graves, G. R. 2001 Multiscale assessment of patterns of avian species richness. Proceedings of the National Academy of Sciences of the United States of America 98, 4534-4539.


Ricklefs, R. E. 2009 Speciation, extinction and diversity. In Speciation and Patterns of Diversity (ed. R. Butlin, J. R. Bridle & D. Schluter). Cambridge: Cambridge University Press.

Ricklefs, R. E. & Lovette, I. J. 1999 The roles of island area per se and habitat diversity in the species-area relationships of four Lesser Antillean faunal groups. Journal of Animal Ecology 68, 1142-1160.

Rohde, K. 1992 Latitudinal gradients in species diversity - the search for the primary cause. Oikos 65, 514-527.

Ronquist, F. 1997 Dispersal-vicariance analysis: A new approach to the quantification of historical biogeography. Systematic Biology 46, 195-203.

Rosenzweig, M. L. 1975 On continental steady states of species diversity. In The Ecology and Evolution of Communities (ed. M. Cody & J. Diamond), pp. 121-140. Cambridge, MA: Harvard University Press.

Rosenzweig, M. L. 1995 Species Diversity in Space and Time. Cambridge: Cambridge University Press.

Rosenzweig, M. L. 1998 Preston's ergodic conjecture: the accumulation of species in space and time. In Biodiversity Dynamics: Turnover of Populations, Taxa and Communities (ed. M. L. McKinney & J. A. Drake). New York: Columbia University Press.

Rosenzweig, M. L. 2001 Loss of speciation rate will impoverish future diversity. Proceedings of the National Academy of Sciences of the United States of America 98, 5404-5410.

Scheiner, S. M. 2003 Six types of species-area curves. Global Ecology and Biogeography 12, 441-447.

Schemske, D. W. 2002 Tropical diversity: patterns and processes. In Ecological and evolutionary perspectives on the origins of tropical diversity: key papers and commentaries (ed. R. Chazdon & T. Whitmore), pp. 163-173. Chicago: University of Chicago Press.

Schielzeth, H. Simple means to improve the interpretability of regression coefficients. Methods in Ecology and Evolution, in press.


Schluter, D. 2009 Evidence for ecological speciation and its alternative. Science 323, 737-741.

Schluter, D. & Conte, G. L. 2009 Genetics and ecological speciation. Proceedings of the National Academy of Sciences of the United States of America 106, 9955-9962.

Schoener, T. W. 1976 The species-area relationship within archipelagoes: models and evidence from island birds. Proceedings of XVI International Ornithological Congress 6, 629-642.

Seehausen, O. 2006 African cichlid fish: a model system in adaptive radiation research. Proceedings of the Royal Society B: Biological Sciences 273, 1987-1998.

Sepkoski, J. J. 1978 Kinetic model of Phanerozoic taxonomic diversity, 1. Analysis of marine orders. Paleobiology 4, 223-251.

Sepkoski, J. J., Jr. 1976 Species diversity in the Phanerozoic: species-area effects. Paleobiology 2, 298-303.

Sheldon, P. R. 1996 Plus ca change - A model for stasis and evolution in different environments. Palaeogeography Palaeoclimatology Palaeoecology 127, 209-227.

Shmida, A. & Wilson, M. V. 1985 Biological determinants of species diversity. Journal of Biogeography 12, 1-20.

Slatkin, M. 1973 Gene flow and selection in a cline. Genetics 75, 733-756.

Slatkin, M. 1985 Gene flow in natural populations. Annual Review of Ecology and Systematics 16, 393-430.

Slatkin, M. 1987 Gene flow and the geographic structure of natural populations. Science 236, 787-792.

Stephens, P. R. & Wiens, J. J. 2003 Explaining species richness from continents to communities: The time-for-speciation effect in emydid turtles. American Naturalist 161, 112-128.

Triantis, K. A., Mylonas, M., Lika, K. & Vardinoyannis, K. 2003 A model for the species-area-habitat relationship. Journal of Biogeography 30, 19-27.

Vamosi, J. C. & Vamosi, S. M. Key innovations within a geographical context in flowering plants: towards resolving Darwin’s abominable mystery. Ecology Letters, in press.


Weber, K. E. 1990 Increased selection response in larger populations - 1. selection for wing-tip height in Drosophila melanogaster at 3 population sizes. Genetics 125, 579-584.

Whittaker, R. J., Triantis, K. A. & Ladle, R. J. 2008 A general dynamic theory of oceanic island biogeography. Journal of Biogeography 35, 977-994.

Wiley, J. W. & Wunderle, J. M. 1994 The effects of hurricanes on birds, with special reference to Caribbean islands. Bird Conservation International 3, 319-349.

Willi, Y., Van Buskirk, J. & Hoffmann, A. A. 2006 Limits to the adaptive potential of small populations. Annual Review of Ecology Evolution and Systematics 37, 433-458.

Willig, M., Kaufmann, D. & Stevens, R. 2003 Latitudinal gradients of biodiversity: pattern, process, scale, and synthesis. Annual Review of Ecology and Systematics 34, 273-309.

Xiang, Q. Y. J., Zhang, W. H., Ricklefs, R. E., Qian, H., Chen, Z. D., Wen, J. & Li, J. H. 2004 Regional differences in rates of plant speciation and molecular evolution: A comparison between eastern Asia and eastern North America. Evolution 58, 2175-2184.


Supplementary information

Table S1. Parameters of linear regression models of log species richness as a function of log region area for two region types at four scales and for all, endemic and non-endemic species within each region.

                    Clustering                      TDWG
                    50      100     150     200     L1      L2      L3      L4
all          int    -0.855  -0.653  -0.617  -0.488  -0.591  -0.273  0.197   0.721
             se     0.15    0.108   0.086   0.081   0.089   0.083   0.072   0.059
             p      ***     ***     ***     ***     ***     **      **      ***
             slope  0.425   0.427   0.429   0.412   0.474   0.4     0.329   0.237
             se     0.036   0.028   0.02    0.019   0.024   0.019   0.015   0.013
             p      ***     ***     ***     ***     ***     ***     ***     ***
endemic      int    -0.818  -0.915  -0.709  -0.555  -1.009  -0.762  -0.388  -0.329
             se     0.309   0.168   0.148   0.15    0.09    0.089   0.09    0.084
             p      *       ***     ***     ***     ***     ***     ***     ***
             slope  0.365   0.397   0.326   0.26    0.39    0.306   0.168   0.142
             se     0.062   0.037   0.030   0.030   0.024   0.021   0.019   0.018
             p      ***     ***     ***     ***     ***     ***     ***     ***
non-endemic  int    -0.849  -0.609  -0.610  -0.477  -0.454  -0.166  0.37    0.898
             se     0.150   0.112   0.090   0.086   0.117   0.1     0.08    0.06
             p      ***     ***     ***     ***     ***     .       ***     ***
             slope  0.406   0.402   0.416   0.402   0.421   0.368   0.293   0.201
             se     0.034   0.028   0.021   0.019   0.029   0.022   0.017   0.013
             p      ***     ***     ***     ***     ***     ***     ***     ***
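As a reading aid for Table S1, the fitted models have the form log S = int + slope × log A, so a predicted richness can be obtained by back-transforming. The short Python sketch below assumes base-10 logarithms and uses the "all species" coefficients for the 200-cluster biotic regions. The log base, the example areas and the function name are illustrative assumptions and are not stated in the table.

```python
import math

def predicted_richness(area, intercept, slope):
    """Back-transform log10(S) = intercept + slope * log10(area).

    Base-10 logs are an assumption; the table does not state the base.
    """
    return 10 ** (intercept + slope * math.log10(area))

# 'All species', clustering with 200 regions (Table S1): int = -0.488, slope = 0.412
INTERCEPT, SLOPE = -0.488, 0.412

# Hypothetical region areas, in whatever units the original analysis used.
s_small = predicted_richness(1e4, INTERCEPT, SLOPE)
s_large = predicted_richness(1e5, INTERCEPT, SLOPE)

# Under a power law, a tenfold increase in area multiplies richness by 10**slope.
ratio = s_large / s_small
```

Under this reading the slope is the exponent of a power-law species-area relationship, so the ratio depends only on the slope and not on the intercept or the units of area.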


Table S2. Summary of diversification models fitted to mammalian clades with crown age younger than (a) 20 and (b) 10 million years before present using the biotic regions (200) classification. Six models of diversification are fitted, representing: constant rate (1), constant rate scaled by region area (2), exponential decline (3), and exponential decline with region area scaling the initial rate (4), the rate of decline (5) or both (6). In each case, the maximum likelihood estimate is reported for each free parameter of the model, within the bounds shown. Dashed parameter estimates were fixed at zero. The overall best-fit model for each period is shown in bold.

Model    Lambda   c           z           p           Epsilon      ΔAICc    Likelihood
Bounds   [-1,1]   [-0.2,0.2]  [-0.2,0.2]  [-0.2,0.2]  [0.5,0.999]

A. 20MY
1        0.340    ---         ---         ---         0.990        196.3    -1412.7
2        -0.320   0.040       ---         ---         0.990        119.1    -1373.1
3        0.790    ---         -0.030      ---         0.990        161.2    -1394.2
4        -0.330   0.040       -0.030      ---         0.560        40.9     -1333.0
5        0.478    ---         -0.200      0.011       0.908        11.5     -1318.3
6        -0.260   0.040       -0.100      0.004       0.650        0.0      -1311.5

B. 10MY
1        0.265    ---         ---         ---         0.999        -160.5   -1578.3
2        -0.251   0.031       ---         ---         0.990        -86.1    -1540.1
3        0.530    ---         -0.043      ---         0.999        -106.9   -1550.5
4        -0.333   0.041       -0.046      ---         0.581        -24.4    -1508.2
5        0.377    ---         -0.233      0.011       0.712        0.6      -1495.7
6        -0.197   0.031       -0.115      0.004       0.500        0.0      -1495.0
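The model ranking in Table S2 can be roughly cross-checked from the reported likelihoods. The Python sketch below computes plain AIC differences for the 20 MY models. It uses AIC rather than AICc because the small-sample correction needs the sample size, which is not given here, so the values should be close to, but not identical with, the tabulated ΔAICc. The free-parameter counts are an assumption, inferred by counting the non-dashed entries in each row.

```python
# Log-likelihood and number of free parameters for each 20 MY model (Table S2).
# Parameter counts are inferred from the non-dashed entries (an assumption).
models = {
    1: (-1412.7, 2),
    2: (-1373.1, 3),
    3: (-1394.2, 3),
    4: (-1333.0, 4),
    5: (-1318.3, 4),
    6: (-1311.5, 5),
}

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik

scores = {m: aic(ll, k) for m, (ll, k) in models.items()}
best = min(scores, key=scores.get)

# Difference from the best-scoring model, rounded to one decimal place.
delta_aic = {m: round(s - scores[best], 1) for m, s in scores.items()}
```

Under these assumptions model 6 has the lowest AIC, and the differences for the other models fall within a few tenths of the ΔAICc column for the 20 MY clades.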


Table S3. Slopes of SARs for mammalian orders for both geopolitical and biotic regions.

                    a) TDWG                  b) Clustering
Order               Slope    SE      N       Slope    SE      N
Afrosoricida        -0.015   0.077   37      -0.733   0.382   5
Artiodactyla        0.112    0.013   482     0.304    0.033   79
Carnivora           0.106    0.009   498     0.304    0.027   92
Chiroptera          0.131    0.014   557     0.285    0.021   167
Cingulata           0.183    0.040   129     0.111    0.068   18
Dasyuromorphia      0.311    0.047   11      0.372    0.057   15
Dermoptera          0.000    0.000   16      0.000    0.000   7
Didelphimorphia     0.127    0.031   160     0.091    0.066   19
Diprotodontia       0.302    0.058   26      0.334    0.046   31
Erinaceomorpha      0.035    0.010   199     0.115    0.071   30
Hyracoidea          -0.006   0.021   72      0.130    0.067   9
Lagomorpha          0.079    0.010   416     0.175    0.042   57
Macroscelidea       0.053    0.028   39      -0.685   0.681   5
Microbiotheria      0.000    0.000   5       0.000    0.000   3
Monotremata         0.001    0.049   12      0.096    0.069   12
Notoryctemorphia    0.623    0.546   3       0.396    0.268   4
Paucituberculata    0.099    0.203   5       0.388    0.067   4
Peramelemorphia     0.174    0.035   17      0.127    0.054   18
Perissodactyla      0.022    0.020   142     0.240    0.074   30
Pholidota           0.037    0.011   132     0.079    0.030   23
Pilosa              0.212    0.032   70      0.041    0.042   9
Primates            0.120    0.023   230     0.217    0.043   51
Proboscidea         0.075    0.018   73      0.182    0.079   11
Rodentia            0.209    0.011   547     0.403    0.025   132
Scandentia          0.036    0.023   63      0.038    0.040   24
Soricomorpha        0.111    0.013   406     0.274    0.033   65
Tubulidentata       0.000    0.000   53      0.000    0.000   6


Table S4. Sister clade pairs identified using DIVA that show unambiguous reconstructed ancestral ranges and where one sister clade retains that ancestral range and the other occupies a differing new range. Italicized rows show clades with higher species richness in the new range.

Mammal family       Tree node   Ancestral area   New area                 # Spp. ancestral area   # Spp. new area
Bovidae             -1008       IndoMalay        Nearctic, Palearctic     1                       2
Bovidae             -940        Afrotropics      Palearctic               12                      22
Bovidae             -973        Afrotropics      Palearctic               4                       3
Canidae             -1217       Palearctic       Neotropics               1                       9
Canidae             -1225       Palearctic       Afrotropics              9                       1
Canidae             -1226       Palearctic       Nearctic                 9                       2
Cercopithecidae     -780        Afrotropics      IndoMalay                8                       11
Cercopithecidae     -790        IndoMalay        Australasia              2                       4
Cercopithecidae     -801        Afrotropics      IndoMalay                9                       23
Cervidae            -1029       Palearctic       Neotropics               1                       14
Cervidae            -1023       Palearctic       IndoMalay                5                       18
Emballonuridae      -1429       Australasia      Neotropics               7                       18
Emballonuridae      -1417       Australasia      Afrotropics              7                       2
Equidae             -1116       Palearctic       Afrotropics              2                       3
Erinaceidae         -1788       Palearctic       IndoMalay                11                      6
Heteromyidae        -891        Nearctic         Neotropics               4                       31
Leporidae           -878        Afrotropics      Nearctic                 6                       15
Loridae             -909        Afrotropics      IndoMalay                3                       3
Manidae             -1328       Afrotropics      IndoMalay                3                       4
Molossidae          -1513       Afrotropics      Neotropics               2                       3
Molossidae          -1524       Neotropics       Australasia, IndoMalay   15                      4
Molossidae          -1538       Neotropics       Afrotropics              15                      22
Muridae             -68         IndoMalay        Palearctic               2                       20
Muridae             -881        Afrotropics      IndoMalay                11                      4
Mustelidae          -1138       Palearctic       Neotropics               3                       2
Mustelidae          -1159       IndoMalay        Neotropics               2                       3
Mustelidae          -1164       IndoMalay        Afrotropics              1                       2
Mustelidae          -1166       IndoMalay        Neotropics               4                       8
Myoxidae            -670        Palearctic       Afrotropics              2                       14
Ochotonidae         -783        Palearctic       Nearctic                 2                       2
Pteropodidae        -1342       Australasia      IndoMalay                14                      21
Pteropodidae        -1357       Australasia      Afrotropics              60                      28
Rhinolophidae       -1474       IndoMalay        Australasia              3                       14
Rhinolophidae       -1476       Australasia      Afrotropics              1                       2
Rhinolophidae       -1491       Afrotropics      Australasia              2                       5
Sciuridae           -503        Nearctic         Palearctic               2                       11
Sciuridae           -617        IndoMalay        Nearctic                 4                       2
Tapiridae           -1111       IndoMalay        Neotropics               1                       3
Vespertilionidae    -1631       IndoMalay        Palearctic               5                       7
Viverridae          -1310       Afrotropics      IndoMalay                1                       2
Viverridae          -1312       Afrotropics      IndoMalay                1                       5


Figure S1. Geographic distributions of biotic clusters defined by between-cell Jaccard distances (a – 50, b – 100, c – 150 and d – 200 clusters).


Figure S2. Size distributions of geopolitical (TDWG) and clustered biotic regions used to measure species-area relationships.


Figure S3. Species-area relationships for geopolitical regions at four spatial scales from TDWG Level 1 (A) to TDWG Level 4 (D) for all (black circles and line, SA), widespread (grey crosses and dashed grey line, SW) and endemic (grey dots and grey solid line, SE) species in each region. See also Table S1.


Figure S4. Species-area relationships for clustered biotic regions at four spatial scales from 50 (A) to 200 (D) clusters for all (black circles and line, SA), widespread (grey crosses and dashed grey line, SW) and endemic (grey dots and grey solid line, SE) species in each region. See also Table S1.


Figure S5. Prediction surfaces from models of log species richness. Four variables measured within regions (Simpson’s diversity index of habitat diversity, log elevational range, mean NDVI and mean temperature) were each fitted in turn as a covariate alongside regional area, with each model including the interaction between the covariate and area. The coloured surface shows the predicted diversity (white – high, red – low) and the relative size of the points shows the observed diversity. Model coefficients were estimated using scaled and centred covariates (bottom and left axes), but these plots also show the variables on their original scale (top and right axes).
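The kind of model behind these surfaces can be illustrated with a minimal Python sketch: each covariate is centred on its mean and scaled to unit standard deviation, and predicted log richness is a linear function of area, the covariate and their product. All data values, coefficient values and function names below are hypothetical and chosen only for illustration; they are not taken from the fitted models.

```python
from statistics import mean, stdev

def scale_and_centre(values):
    """Return values shifted to mean 0 and scaled to unit standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def predict_log_richness(area_z, cov_z, b0, b_area, b_cov, b_inter):
    """Linear predictor with an area-by-covariate interaction term."""
    return b0 + b_area * area_z + b_cov * cov_z + b_inter * area_z * cov_z

# Hypothetical regional data (not from the thesis).
log_area = [3.0, 4.0, 5.0, 6.0, 7.0]
mean_temp = [5.0, 12.0, 18.0, 22.0, 26.0]

area_z = scale_and_centre(log_area)
temp_z = scale_and_centre(mean_temp)

# Hypothetical coefficients on the scaled-and-centred scale.
coefs = dict(b0=2.0, b_area=0.5, b_cov=0.2, b_inter=0.1)

# Prediction surface: one row per covariate value, one column per area value.
surface = [[predict_log_richness(a, t, **coefs) for a in area_z] for t in temp_z]
```

With centred covariates the intercept is the predicted log richness at mean area and mean covariate value, and the main-effect coefficients are directly comparable because both predictors are on a unit-variance scale.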


Appendix II. Supplementary figures and tables for chapter three

Appendix Table II.1. Sampling locations for all genotyped species.

Appendix Table II.2. Scoring analysis details for AFLP datasets.

Appendix Table II.3. Branch circumference mean and variance for all study species.

Appendix Table II.4. Ecological characteristics of study clades.

Appendix Figure II.1. Associations of Fst and species range size with species phylogeny (data from the reduced dataset).

Appendix Table II.1. Sampling locations for all genotyped species. Columns: Species; Location; Latitude / Longitude; Approximate elevation (m); Habitat; # samples analysed. Species listed: M. rafaeliana, M. nidifica, T. triglochin, L. ciliisepala, L. floripecten, P. propinqua, L. elata, D. odontostele, P. stenostachya, S. fusiformis, J. aporophylla, J. teretifolia, S. jimenezii, E. vulgoamparoanum, E. laucheanum, E. exasperatum, B. nodosa.

[Table body (sampling localities in Costa Rica, with coordinates, elevations, habitats and sample counts) is not recoverable from the rotated page layout.]


Appendix Table II.2. Scoring analysis details for AFLP datasets. Datasets that had unacceptably high error rates were not used and are not listed here. unf. = unfiltered; fil. = filtered; abs. = absolute scoring threshold; rel. = relative scoring threshold.

Species              Primer colour   Scoring method   Locus threshold   Phenotype threshold   Mismatch error rate   # loci scored
M. nidifica          G               unf. abs.        130               50                    6.45                  29
                     B               fil. abs.        50                50                    6.02                  26
                     Y               unf. abs.        60                190                   6.94                  8
M. rafaeliana        G               fil. rel.        160               30                    5.13                  17
                     B               fil. abs.        115               80                    4.55                  23
                     Y               unf. abs.        75                75                    5                     11
T. triglochin        B               fil. abs.        50                50                    4.87                  25
                     Y               unf. abs.        50                160                   1.19                  7
L. ciliisepala       G               fil. abs.        120               190                   4.98                  13
                     B               fil. abs.        30                75                    5.08                  18
L. elata             G               unf. abs.        30                125                   4.91                  28
                     B               unf. abs.        95                30                    4.55                  1
                     Y               unf. abs.        30                125                   4.91                  10
L. floripecten       G               fil. abs.        80                85                    4.76                  11
                     B               unf. abs.        120               105                   4.76                  6
                     Y               fil. abs.        150               100                   4.76                  3
P. propinqua         G               fil. abs.        30                30                    4.76                  28
                     B               fil. abs.        70                50                    4.88                  26
P. stenostachya      B               fil. abs.        30                30                    4.91                  51
D. odontostele       G               fil. rel.        100               64                    3.92                  5
                     B               fil. rel.        175               26                    4.6                   9
                     Y               unf. abs.        120               60                    4.55                  9
S. jimenezii         G               fil. rel.        40                36                    3.37                  49
                     B               fil. abs.        40                45                    4.24                  60
                     Y               fil. rel.        40                30                    4.92                  31
S. fusiformis        G               fil. rel.        125               30                    4.86                  23
                     B               fil. rel.        90                10                    4.64                  18
                     Y               fil. abs.        40                35                    3.65                  41
J. aporophylla       G               unf. abs.        70                70                    5.16                  31
                     B               fil. rel.        155               44                    5.95                  11
                     Y               fil. abs.        160               54                    4.81                  11
J. teretifolia       G               unf. rel.        100               90                    6.329                 17
                     B               unf. rel.        65                50                    5.977                 32
E. exasperatum       B               unf. rel.        50                50                    5                     13
                     Y               fil. abs.        30                50                    4.86                  13
E. laucheanum        Y               unf. rel.        28                70                    5.88                  6
E. vulgoamparoanum   G               fil. rel.        90                10                    4.8                   36
                     B               fil. abs.        30                120                   4.69                  13
                     Y               fil. abs.        50                150                   5                     23
B. nodosa            B               unf. rel.        35                60                    7.82                  27
                     Y               unf. abs.        35                155                   5.71                  14


Appendix Table II.3. Branch circumference mean and variance for all study species. Dryadella odontostele was only found growing on a branch once, and so no variance is given.

Species              Mean branch circumference (cm)   Variance in branch circumference
M. nidifica          23.7                             293.8
M. rafaeliana        66.3                             2717.2
T. triglochin        86.2                             2517.7
L. ciliisepala       31                               1133.3
L. elata             37                               1096.9
L. floripecten       18                               139
P. propinqua         15.6                             612.1
P. stenostachya      129.2                            1804.8
D. odontostele       6                                -
S. fusiformis        55.4                             1066.2
S. jimenezii         47.4                             3948.7
J. aporophylla       13.6                             195.3
J. teretifolia       31.3                             231.8
E. exasperatum       36.2                             363.9
E. laucheanum        27.4                             309.9
E. vulgoamparoanum   59.7                             1893.9
B. nodosa            64.5                             991

Appendix Table II.4. Ecological characteristics of study clades. Clade values are means of species values. Species range size is measured as number of regions.

Clade           Mean elevation range (m)   Mean number of habitats   Mean range size
Masdevallia     659.7                      1.9                       1.3
Trisetella      1300                       2.3                       1.7
Lepanthes       437.3                      1.6                       1.2
Lepanthopsis    33.3                       1                         2
Platystele      748.6                      1.9                       2.3
Dryadella       483.3                      1.7                       1.9
Scaphyglottis   960                        2.3                       4.8
Jacquiniella    750                        2                         6.5
Epidendrum      625.5                      1.8                       2.1
Brassavola      250                        2.2                       3.2


Appendix Figure II.5. Associations of Fst and species range size with species phylogeny (data from the reduced dataset). Circles at branch tips are sized proportionally to the log of species range size measured as number of regions, and shaded according to Fst. Species names are abbreviated by the first letter of genus and species.
