Radiation and Macroevolutionary Ecology of the African Genus Protea L.

Gail Reeves

A thesis submitted for the degree of Doctor of Philosophy

Imperial College of Science, Technology and Medicine & NERC Centre for Population Biology
University of London
January 2001

Abstract

The Cape Floristic Region (CFR) of South Africa harbors one of the highest concentrations of plant species on Earth. The aim of this thesis was to investigate factors promoting the radiation of this diverse flora using a reconstructed species-level molecular phylogeny for one of the CFR's flagship genera, Protea. Chapter 2 of this thesis describes the use of five non-coding regions from the plastid and nuclear genomes to reconstruct relationships among 88 Protea species. Extremely low levels of sequence divergence were found among species, and consequently in Chapter 3 amplified fragment length polymorphism (AFLP) markers were also employed to infer relationships. These markers were found to be extremely useful in combination with DNA sequence data for phylogenetic reconstruction in this group. Contrary to previous hypotheses, the phylogeny supports a Cape origin for the group followed by expansion into tropical Africa. In Chapter 4, the age of the root node of Protea was estimated to evaluate the widely held view that much of the diversification in the Cape has occurred since the onset of Mediterranean-type climates ca. five million years ago. Contrary to this hypothesis, the timing and the temporal dynamics of the radiation of Protea indicated that the lineage is at least 36 myr old, and that its diversification rate has declined significantly over the last 20 myr. Chapter 5 investigates the role of special characteristics of the CFR, including complex topography and a heterogeneous edaphic environment, in the diversification of the flora. In Protea, it appears that speciation has been largely allopatric, but there is no significant pattern to suggest that soil factors or subdivision have been involved in speciation. Comparison of diversification rates between lineages that re-seed and re-sprout after fire indicated higher diversification rates in the former within the Cape, but this rate is less than that for re-sprouting lineages outside of the Cape. In summary, the diversity of Protea species in the CFR may be due to high coexistence of species that diversified over a long timespan, rather than a recent rapid radiation in this lineage.

Acknowledgements

As possibly the most extensively supervised PhD student in the history of PhD students, I have a considerable list of those to whom I owe enormous gratitude. At Imperial College and the NERC Centre for Population Biology, my supervisors Tim Barraclough, John Lawton and Alfried Vogler, and also CPB staff Phil Heads and Claire Challis. At the Royal Botanic Gardens, Kew, my advisors Mark Chase and Mike Fay, and also staff of the Molecular Systematics section: Vincent Savolainen, Cassio Van den Berg, Robyn Cowan, Jeff Joseph, Martyn Powell and Tim Fulcher. In South Africa, our collaborators at the National Botanical Institute, Tony Rebelo and John Rourke, and at the Institute for Plant Conservation, UCT, Richard Cowling. Also in South Africa, Mervyn Lotter, Wendy Paisley, Suzette Foster, my beautiful field assistant Stephanie Yelenick and all those involved in the Protea Atlas Project, especially Nigel Forshaw and Val Charlton.

Extra special personal thanks go to: my surrogate family - the Cherries, M & M, private programmer Rob, fishing-partner Andrew and best friends Emma, Sarah and Anouk. Finally, my forever-supportive family: Nan, Grandma, brother Ben, soon-to-be-husband Steve, and most of all Mum and Dad.

Table of Contents

Abstract

Acknowledgements

Table of Contents

Index of Figures

Index of Tables

Chapter One - General Introduction
> Physical and ecological setting
> Causes of species richness
> Reconstructed phylogenies as tools for studying diversification
> Protea as a case study

Chapter Two - Molecular Phylogenetics of Protea: Evidence from Plastid and Nuclear DNA Sequences

Materials and Methods

Results
> ITS region
> Plastid regions
> ncpGS region
> Plastid and ncpGS regions combined

Discussion

Chapter Three - Phylogenetic Reconstruction of Protea: Combined Evidence from DNA Sequence Data and AFLP Markers

Materials and Methods

Results
> AFLPs
> AFLP and DNA sequence data combined

Discussion

Chapter Four - Timing and Temporal Dynamics of the Radiation of Protea

Materials and Methods
> Age estimation for the root node of the Protea clade
> Producing an ultrametric tree
> Temporal dynamics of the radiation of Protea

Results
> Age estimation for the root node of the Protea clade
> Temporal dynamics of the Protea radiation

Discussion
> Sources of error in age estimates
> Use of the correct tree and substitution noise
> Incorrect calibration
> Variability in substitution rate
> Age estimate for the radiation of Protea and its implications
> Temporal dynamics of the Protea radiation

Chapter Five - Investigating the Factors Promoting Diversification in Protea using Sister Group Analysis
> Topography
> Edaphic specialization
> Fire

Materials & Methods
> Topography
> Edaphic specialization
> Relationship between habitat preference and degree of sympatry
> Fire survival strategy

Results
> Topography
> Edaphic specialization
> Relationship between sympatry and habitat difference
> Diversification rate in re-seeding and re-sprouting lineages

Discussion
> Topography
> Edaphic specialization
> Fire
> Summary

Chapter 6 - Conclusions

References

APPENDIX 1: CD ROM affixed to back cover
> PAUP file 1: DNA sequence and AFLP matrices used in Chapters 2 & 3
> PAUP file 2: DNA sequence matrix used in Chapter 4, analysis 1
> PAUP file 3: DNA sequence matrix used in Chapter 4, analysis 2

Index of Figures

Chapter 1

FIGURE 1. Distribution of fynbos within the CFR (after Low & Rebelo 1998).

FIGURE 2. Geological time-scales and sequence of some of the important events in the history of the fynbos region (after Deacon et al. 1992 & Cowling & Richardson 1995).

FIGURE 3. Geographical distribution of Protea throughout Africa (after Rourke 1980).

FIGURE 4. WorldMap (Williams 1998) representation of Protea species diversity in the CFR. Grid cells are 1/8 degree on a side (approximately 12 km). Colour scale indicates Protea species diversity from zero to a maximum of 23. (Reproduced with kind permission of the 'Protea Atlas Project', National Botanical Institute, Cape Town.)

Chapter 2

FIGURE 1. One of the equally most parsimonious trees found from analysis of ITS sequences for 14 Protea and one species. Number of trees = 96, number of steps = 548, CI = 0.76, RI = 0.80. Branches not recovered in the strict consensus are indicated with a circle. Branch lengths are shown above and bootstrap percentages below the branches. Terminal taxa with identical names were sequenced from the same plant.

FIGURE 2. Adams consensus of 9490 equally most parsimonious trees found from analysis of four plastid DNA non-coding regions for 88 species of Protea. Number of steps = 238, CI = 0.84, RI = 0.91. Branches not recovered in the strict consensus are indicated with a circle. Bootstrap percentages are indicated below branches.

FIGURE 3. Adams consensus of 3280 equally most parsimonious trees found from analysis of ncpGS sequences for 77 Protea species. Branches not recovered in the strict consensus are indicated with a circle. Bootstrap percentages are indicated below the branches.

FIGURE 4. One of the equally most parsimonious trees found in the combined analysis of ncpGS and plastid data sets. Branch lengths are shown above branches. Indels scored from both data sets are indicated on the tree. Taxa in bold are found in summer rainfall regions.

FIGURE 5. Adams consensus of the 3280 equally most parsimonious trees found in the combined analysis of ncpGS and plastid data sets. Number of steps = 514, CI = 0.82, RI = 0.89. Branches not recovered in the strict consensus are indicated with a circle. Bootstrap percentages are indicated below branches.

FIGURE 6. (a) Hypothetical phylogenetic reconstruction of the relationship among tropical and Cape taxa according to Rourke (1998). (b) Relationship among tropical and Cape Protea species recovered in the DNA sequence trees.

Chapter 3

FIGURE 1. Schematic of the AFLP procedure.

FIGURE 2. Adams consensus of 26 equally most parsimonious trees found from analysis of 138 AFLP bands for 72 Protea taxa. Number of steps = 609, CI = 0.23, RI = 0.52. Branches not recovered in the strict consensus are indicated with a circle. Bootstrap percentages are indicated below branches.

FIGURE 3. Rooted neighbour joining phylogram derived from analysis of 138 AFLP bands for 72 Protea taxa.

FIGURE 4. One of the 20 equally most parsimonious trees found in a combined analysis of DNA sequence and AFLP data sets. Branch lengths are indicated above the branches.

FIGURE 5. Adams consensus of 20 equally most parsimonious trees found in a combined analysis of DNA sequence and AFLP data sets for 86 Protea species. Number of steps = 1185, CI = 0.47, RI = 0.65. Branches not recovered in the strict consensus are indicated with a circle. Bootstrap percentages are indicated below branches.

FIGURE 6. Major clades recovered from analysis of (a) DNA sequence, (b) AFLP and (c) combined data sets. corresponding to the numbered clades is detailed in Table 2. * corresponds to taxa that are missing from a clade with respect to those identified in 6c. Groups that are not recovered in the strict consensus are marked with a circle.

FIGURE 7. Percentage of missed homoplasy discovered for subsets of informative characters with respect to the best tree built from 302 characters (curve A) and 150 characters (curve B).

Chapter 4

FIGURE 1. Alternative hypotheses to explain the current distribution of Proteaceae. (a) Major lineages evolved before the separation of Africa from Australasia/S. America. (b) Major lineages evolved after the separation of Africa from Australasia/S. America.

FIGURE 2. A hypothetical log number of lineages through time plot.

FIGURE 3. Possible behavior of a log number of lineages through time plot. A: constant net speciation with background extinction, or constant net speciation with an increase in rate towards the present. B: constant speciation (pure birth process). C: constant net speciation with a slowdown in rate towards the present, or taxa missing from the sample.

FIGURE 4. One of the 2620 equally most parsimonious trees found from analysis of rbcL-atpB spacer sequences. Number of steps = 666, CI = 0.72, RI = 0.81. Nodes not recovered in the strict consensus are indicated with a circle. Branch lengths are indicated above and bootstrap percentages below branches. Colour codes indicate geographical distribution of genera.

FIGURE 5. One of the equally most parsimonious trees fitted with ML branch lengths (indicated above branches) found from analysis of trnL-F, rps16 intron and atpB-rbcL spacer sequences for 46 Proteaceae taxa.

FIGURE 6. NPRS tree from ML branch lengths for 46 Proteaceae. Calibration of the node separating Proteoideae and Grevilleoideae at 105 mya gives a date for the root node of Protea of 36 mya.

FIGURE 7. Bootstrap distribution (100 replicates) of age estimates for the root node of Protea using a 105 myr calibration for the node separating subfamilies Proteoideae and Grevilleoideae (indicated on Figure 6).

FIGURE 8. One of the equally most parsimonious trees found from analysis of all DNA sequence and AFLP data for 86 Protea species. ML branch lengths derived from the DNA sequence data only are shown above the branches.

FIGURE 9. NPRS tree from ML branch lengths for 86 Protea taxa.

FIGURE 10. Log lineages through absolute time plot for Protea.

Chapter 5

FIGURE 1. Predictions for the relationship between geographical overlap and node age.

FIGURE 2. Predictions for the relationship between habitat contrast and node age.

FIGURE 3. Predictions for the relationship between habitat contrast and degree of sympatry.

FIGURE 4. Calculation of soil type contrast between clade A and clade B.

FIGURE 5. Plot of degree of sympatry against node age for 69 Cape Protea taxa.

FIGURE 6. Soil type for 67 Cape taxa traced onto one of the equally most parsimonious combined trees.

FIGURE 7. Plot of soil contrast against node age for 69 Cape Protea taxa.

FIGURE 8. Distribution of arcsine slopes for the relationship between soil contrast and node age from 1000 trials evolving soil type randomly onto the tree.

FIGURE 9. Association between soil contrast and sympatry for 69 Cape Protea taxa.

FIGURE 10. Distribution of arcsine slopes for the relationship between soil contrast and degree of sympatry from 1000 trials evolving soil type randomly onto the tree.

FIGURE 11. Re-seeding and re-sprouting fire-survival strategy mapped onto one of the combined equally most parsimonious trees.

Index of Tables

Chapter 2

TABLE 1. Protea and outgroup species used in the molecular analyses, with sequence data and voucher information. The classification used is Rourke (in prep.) for the southern African species and Beard (1963) for the tropical African species (- indicates missing sequence data).

TABLE 2a & b. Statistics for each of the three analyses.

TABLE 3. Clades recovered in each of the three analyses.

TABLE 4. Bootstrap percentages for the clades defined in Table 3 for each of the three analyses.

Chapter 3

TABLE 1. AFLP and sequence data for each taxon. Taxonomy is that of Rourke (in prep.) and Beard (1963). Shaded rows indicate taxa for which one or more data sources are missing.

TABLE 2. Clades identified in each of the three analyses.

Chapter 4

TABLE 1. Proteaceae taxa sampled in Analysis 1. Classification is after Hoot & Douglas (1998; updated from Johnson & Briggs 1975).

TABLE 2a. Negative log likelihood scores for one of the trees produced in analysis 2 (d.f. = 1).

TABLE 2b. Negative log likelihood scores with and without a molecular clock for one of the trees produced in analysis 2 using the three-parameter model of sequence evolution (d.f. = n-2 = 44, where n equals number of taxa).

TABLE 3a. Negative log likelihood scores for one of the trees produced in analysis 3 (d.f. = 1).

TABLE 3b. Negative log likelihood scores with and without a molecular clock for one of the trees produced in analysis 3 using the three-parameter model of sequence evolution (d.f. = n-2 = 89, where n equals number of taxa).

TABLE 4. Estimated diversification rate for Protea, overall and separately for 36-20 mya and 20 mya to the present.

Chapter 5

TABLE 1. Soil type categories for Cape Protea species (after Rourke 1980 & Rebelo 1995). 1 = present on this soil type, 0 = absent.

TABLE 2. Estimated diversification rates for individual seeding and sprouting clades.

TABLE 3. Average estimated diversification rates for re-seeders and re-sprouters.

Chapter One - General Introduction

The flora of southern Africa is extraordinarily species rich, with 21 137 indigenous species, around 80% of which are endemic to the region (Goldblatt 1978). Compared with similarly sized areas, it is among the richest in the world, a remarkable phenomenon given that southern Africa is a predominantly warm temperate, semi-arid region whereas most other species-rich regions include large areas of tropical rainforest (Myers 1988). Both and richness are considerably higher in southern than in tropical Africa (Goldblatt 1978; Gibbs Russell 1985; Cowling et al. 1989), a pattern contradicting the generalisation that plant-species richness decreases and range size increases with increasing latitude and associated climatic extremes (Pianka 1966; Stevens 1989; Cowling & Samways 1996).

Within the subcontinent, the Cape floristic region (CFR) is the richest floristic hotspot. It covers less than four percent of the total area but contains 41% (8970) of the subcontinent's species; of these, 69% are endemic (Goldblatt & Manning 2000). The plant diversity of the CFR is concentrated in relatively few lineages, or 'species flocks' (sensu Rosenzweig 1995), that have radiated spectacularly. Thus, 13 genera (out of 988 total) each comprise more than 100 species, and together these account for 25% of all species in the flora (Goldblatt & Manning 2000). Similarly, of the region's 173 families, 12 comprise more than 200 species and constitute 64% of the CFR flora. In total, seven plant families and 198 genera (20% of the total) are endemic to the CFR. In summary, the CFR harbours one of the highest concentrations of endemic plant species in the world, at levels normally associated with oceanic islands. Diversity in the CFR is not reflected at the alpha level, which is only moderate (Cowling et al. 1992), but is most marked in the extremely high turnover between habitats, i.e. beta diversity (Cowling 1990), in which rainfall and soil factors play important roles in determining plant community boundaries along altitude and aspect gradients (Goldblatt 1997; Richards et al. 1995).

Faced with such overwhelming diversity and levels of endemism, the general question of how this diversity arose has often been posed (for example Kruckeberg & Rabinowitz 1985; Major 1988; Cowling & Hilton-Taylor 1997). Key questions include the age of origin of the flora and which biological traits and habitat preferences have promoted speciation. The purpose of this chapter is to outline the historical, physical and ecological characteristics of the CFR and summarise how these have been implicated as causes of species richness in the region. This chapter also outlines the phylogenetic approach to studying diversification in the CFR and introduces the focus of this thesis, Protea.

> Physical and ecological setting

The flora of southern Africa has been classified into more than 70 major vegetation units (Acocks 1953; Low & Rebelo 1998) that contribute to the subcontinent's six putatively floristically distinct biomes (Rutherford & Westfall 1994; Gibbs Russell 1987). This study concentrates on the most species-rich biome, fynbos, which is largely coincident with the CFR (Goldblatt & Manning 2000; see Figure 1 below).

FIGURE 1. Distribution of fynbos within the CFR (after Low & Rebelo 1998).

Fynbos is a vernacular term, meaning 'fine leaved', used to describe the predominant vegetation type in the biome. It is an evergreen, fire-prone shrubland, confined largely to sandy, infertile soils and characterised structurally by the presence of restioid, ericoid and proteoid elements. Proteoid fynbos, which is widespread on the deep and relatively fertile colluvial soils at the base of mountains, is replaced at higher altitudes by ericaceous fynbos where soils are permanently wet and rich in organic carbon. Warmer north-facing slopes, where soils are shallower and more drought prone, support restioid fynbos dominated by shallow-rooted restioids (Cowling & Holmes 1992).

The contemporary fynbos biome and the Mediterranean climate of the Cape region, characterised by seasonal (winter) rainfall, are believed to have become fully established in the late Pliocene (see Figure 2) after the northern hemisphere was glaciated and the symmetry of zonal climates was established (Deacon et al. 1992). Fossil pollen data show that markedly different vegetation was present in this region during the early Neogene, with subtropical forest, including palms (Coetzee et al. 1983). Also present were early representatives of taxa that now dominate fynbos, such as Proteaceae, Restionaceae and Ericaceae, which could have been part of the forest understorey vegetation (Scott et al. 1997). However, considerable climate change occurred in southern Africa during the Neogene, related to the progressive cooling of ocean waters, the growth of the Antarctic ice sheet and development of the circum-Antarctic current in the southern ocean (Deacon et al. 1992). This forced the climate overwhelmingly in the direction of greater aridity, and as a result subtropical forest cover became attenuated, opening up the opportunity for gap filling by understorey taxa (Deacon et al. 1992).

The time period from the Pliocene to the present, since the onset of Mediterranean-type climates, is widely believed to be the interval during which most of the diversification leading to the modern CFR flora occurred. Much attention has therefore been focused on the contribution of climate change to diversification in the CFR. This appears particularly important given that all five Mediterranean-type climate regions of the world are associated with higher than average levels of plant species richness (Cowling et al. 1996).

Whereas climatic conditions are geologically recent, the fynbos landscape and substrates, dominated by the rugged and steep quartzitic mountains of the Cape folded belt, are ancient and have existed in near-modern form for many millions of years (Deacon et al. 1992). Soils, which are derived from ancient Cape supergroup sediments (Table Mountain and Witteberg groups), are extremely poor in exchangeable bases and extractable phosphorus (Kruger 1979). Because nitrogen and phosphorus for protein synthesis are limiting, excess carbon produced from photosynthesis is channelled into the production of woody fibres and tannins (Rebelo 1995). As a result, sclerophylly is a structural feature of fynbos vegetation growing on nutrient-poor soils, and has a significant impact on plant-herbivore interactions (Owen-Smith & Danckwerts 1997). The high fibre content, low water content, thick cuticles and lack of trace elements in sclerophyllous leaves reduce their palatability to herbivores. Fibre, wax and cutin are poorly digested, by mammals especially (Stock et al. 1992), and the high ratio of carbon to nitrogen makes them indigestible to most insect larvae (Rebelo 1995). Therefore, although species richness of birds, mammals, frogs, reptiles and insects is high in fynbos, it sustains a low animal biomass. Despite the significant roles played by animals in pollination and seed dispersal, they play a minor part in influencing vegetation structure and composition due to the low impact of herbivory (Low & Rebelo 1998).

Fire is a dominant factor in the Cape and an important selective agent at all levels of biological organisation (Kruger 1977, 1983). Fynbos must burn at between six and 45 years of age to sustain its plant species because without fire fynbos becomes senescent, and forest and thicket elements begin to invade. Forests are generally associated with deeper, more fertile soils than fynbos (Cowling & Holmes 1992), but recent studies have suggested that forest may develop on soils identical to those that support fynbos. Manders & Richardson's (1992) model to explain the forest/fynbos boundary has suggested that forest species are excluded from fynbos because their recruitment is not coupled with fire. Many fynbos taxa are serotinous, meaning that they delay seed release by storing seed reserves in the canopy of the plant, usually in closed woody cones. All seed reserves are released after fire, in which the parent plants are killed, and none are stored in the soil until the next fire. Serotinous species therefore rely entirely on these reserves for recruitment after fires (Bond 1984), and if seedlings fail to establish, populations will become locally extinct (Cowling 1987). Other fynbos taxa are able to survive fires by re-sprouting from underground boles or rootstocks, which contain buds that are stimulated to produce more growth after a fire has killed all above-ground parts of the plant. Other, usually arborescent, taxa produce a thick bark that protects buds in the stem that are stimulated to grow when branch tips are killed by fire (Rebelo 1995).

> Causes of species richness

One of the main tasks in evolutionary biology is to question whether diversity has arisen as the by-product of chance evolutionary change within lineages, or whether there are factors responsible for the promotion of diversification. With such extraordinary levels of plant species richness in fynbos, many investigators have hypothesised as to the causes of species richness in the biome, including how both intrinsic and extrinsic factors have influenced speciation, extinction and coexistence of the flora. Given the special characteristics of the Cape Region, namely topographical and edaphic complexity, diversity, and a spatially and temporally variable fire regime, many hypotheses invoke the presence of a mosaic of diverse habitats and steep ecological gradients to explain species richness in the CFR (e.g. Vrba 1985; Cowling 1987; Johnson 1996; Goldblatt 1997; Cowling et al. 1997).

Studies on diverse taxa have suggested that natural selection caused by shifts in ecology or invasions of novel habitats may have played an important role in adaptive divergence and speciation (Carroll et al. 1997; Losos et al. 1997; Reznick et al. 1997). In the CFR, the observation that a large proportion of taxa are edaphic specialists has led several authors to highlight the role of substrate gradients in the differentiation of CFR taxa (Linder 1985; Cowling 1987; Linder & Vlok 1991; Goldblatt & Manning 1996). For example, Cowling & Holmes (1992) recorded that 69% of regional and 85% of local endemics were confined to a single substratum in a coastal lowland area of the CFR. In a phylogenetic analysis of Rhodocoma (Restionaceae), Linder & Vlok (1991) concluded that sympatric species favoured different habitats, and similarly Goldblatt & Manning (1996) concluded that speciation in Lapeirousia subgenus Lapeirousia (Iridaceae) appeared to be promoted by shifts in substrate preference when descendant species occurred in adjacent habitats. Both studies advocated that soil factors are the predominant cause of species richness in the south- and that parapatric rather than strict allopatric speciation is important, with differentiation of taxa on adjacent but different habitats.

Both of these studies complement Cowling's (1987) model that invokes recurrent fire, edaphic specialisation and short dispersal distances to explain speciation in fynbos. He has argued that fragmentation of populations as a result of fire-induced local extinction may isolate marginal populations on a soil type different from the main population. Edaphic races on the atypical soil type may then rapidly evolve into species because short dispersal distances (<10 m; Manders 1986) mean that the distances between sub-populations need not be great to ensure isolation. In contrast, Johnson (1996) has argued that if heterogeneity of the growth environment is responsible for diversification, this should be evident in the radiation of vegetative characters. Instead, many large Cape genera show radiation in floral characters, evidence which, Johnson argues, shows that adaptation to pollinators has played a major role in speciation.

Cowling (1987) has also argued that fire is an important disturbance factor that facilitates coexistence in fynbos, where, under a lottery model for recruitment (Chesson et al. 1997), every co-existing species may be dominant in establishment under some possible combination of environmental conditions. The importance of disturbance in coexistence among fynbos species stems from the parallel hypothesis that light-gap disturbances in species-rich tropical forests promote the coexistence of species having different resource use strategies and competitive abilities (Connell 1978). However, in a recent long-term study by Hubbell et al. (1999) it was found that spatial and temporal variation in gap disturbance did not explain variation in species richness. Brokaw & Busing (2000) also reported that niche partitioning contributes less, and chance events more, than expected to maintaining tree species richness via gap dynamics in tropical and temperate forests. Similarly, disturbance regime may not prove to explain coexistence in fynbos, but Cowling (1987) has highlighted an important issue in his attempt to explain diversification patterns in the CFR, namely whether high levels of coexistence and low extinction, rather than high net speciation can explain species richness in the CFR.

> Reconstructed phylogenies as tools for studying diversification

There have been numerous studies that seek to identify correlates of diversity in the CFR, but in the absence of phylogenetic information most of these have assessed vegetation trends among a range of taxa that span broad ecological gradients (e.g. Linder 1991; Cowling & Holmes 1992; McDonald et al. 1994). To assess whether specific factors have actually been involved in diversification, it is necessary to distinguish independent evolutionary origins of character states from cases of identity by descent (Harvey & Pagel 1991). To this end, reconstructed phylogenies provide an indirect record of past speciation events, and therefore offer valuable insight into processes of diversification. To rigorously assess the role of ecological factors in the promotion of diversification in the CFR, it will be necessary to ask whether they have been involved in the separation of lineages for a group of closely related Cape taxa.

Uncovering the patterns of changing biodiversity in the CFR over time has also proved to be extremely difficult because the palaeobotanic record is scarce (Scott et al. 1997). For example, there are no pollen records for fynbos vegetation in the Cape that have been dated prior to 11 000 years ago. However, a phylogenetic tree built from DNA sequence data can provide a record of the temporal dynamics of speciation, and allow the age of specific nodes to be estimated by applying a calibration from one part of the tree (estimated from either fossil ages of extant taxa or biogeographic evidence) to the remaining nodes. This can then allow inferences to be made regarding the geologic time scales and environmental conditions under which lineages diversified.
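The calibration idea described above can be illustrated with a minimal sketch. It assumes an ultrametric tree whose node depths are already expressed in relative units (tips at zero), so that a single independently dated node fixes the time scale for every other node. The function name, node labels and relative depths are hypothetical and chosen purely for illustration; they are not values or methods taken from this thesis.

```python
def calibrate_node_ages(relative_depths, calibration_node, calibration_age):
    """Scale relative node depths of an ultrametric tree to absolute ages.

    relative_depths: dict mapping node label -> depth below the tips
                     (arbitrary units, tips = 0).
    calibration_node: label of the node with an independently known age.
    calibration_age: age of that node (e.g. in millions of years).
    """
    calib_depth = relative_depths[calibration_node]
    if calib_depth <= 0:
        raise ValueError("calibration node must lie above the tips")
    # Time units represented by one unit of relative depth.
    scale = calibration_age / calib_depth
    return {node: depth * scale for node, depth in relative_depths.items()}


# Hypothetical relative depths, for illustration only.
depths = {"calibrated_split": 1.0, "ingroup_root": 0.5, "young_clade": 0.25}
ages = calibrate_node_ages(depths, "calibrated_split", 100.0)
print(ages)
```

Under these assumed inputs, every node age is simply its relative depth multiplied by one scaling factor, which is why an error in the calibration date propagates proportionally to all other age estimates on the tree.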

Both of the investigations outlined above require the use of a detailed phylogenetic tree, which, to avoid any potential circularity, should be based on an estimate derived independently of the traits involved (Givnish 1997). The phylogenetic tree used in this thesis is therefore based upon molecular data. Investigations such as this have been immeasurably aided by the rapid development of automated DNA sequencing technology, which now allows large numbers of molecular markers to be collected in a relatively short time period. There is also a wide choice of loci for which universal amplification primers are available, although many have been developed for higher level systematics and may evolve too slowly to resolve species-level relationships. The choice of molecular markers has formed an important part of this study, and the efficacy of DNA sequences and of amplified fragment length polymorphisms (AFLPs; Vos et al. 1995) for inferring species-level relationships is addressed in Chapters 2 and 3 respectively. A variety of algorithms for reconstructing trees from molecular and morphological data are available, for example maximum parsimony and maximum likelihood, although the latter approach may be time consuming to implement for data sets comprising large numbers of taxa. However, the two methods do typically give similar outcomes (Hillis 1996).

> Protea as a case study

This study recognizes the need for the incorporation of detailed phylogenetic hypotheses into the study of macroevolutionary ecology in the CFR. For the purpose of studying plant species diversity in southern Africa, and particularly fynbos, the genus Protea represents an ideal subject. Proteaceae represent an ancient eudicot lineage which, judging from their distribution across the Southern Hemisphere, predate the break up of the supercontinent Gondwana (Linder et al. 1992). In Africa, Protea is the largest and most widely distributed genus of Proteaceae, ranging from the savannah woodland of west-central and eastern Africa to the sclerophyllous heathland habitats of the western Cape (Figure 3).

FIGURE 3 Geographical distribution of Protea throughout Africa (after Rourke 1980).

Protea has radiated predominantly in the Cape region (70 of its species are endemic to this region), where its species are among those that dominate fynbos landscapes. The group displays high regional and within-habitat diversity associated with extreme variation in floral and vegetative characters, life history and habitat preference. In general, many of the tropical species are widespread, relatively similar morphologically and have been described as 'plastic' and difficult to define taxonomically (Rourke 1998). This is in stark contrast to the often narrowly endemic, morphologically diverse and taxonomically well-defined species of the western Cape (Rourke 1980).

Subtropical and tropical taxa are fairly long lived (200-250 years) resprouting savannah woodland trees with simple, nearly actinomorphic , and are presumed to be the least specialised representatives of the genus. They are usually not serotinous, shedding mature about four months after pollination. In contrast, many of the temperate Cape species are short lived (30-40 years), serotinous re-seeders, although some have developed subterranean fire-resistant rhizomes. Compared to their tropical counterparts, Cape species display a wide array of morphological diversity. Habit ranges from trees and upright shrubs to creeping shrubs and plants with underground stems; leaves include isobilateral and needle-like forms; and inflorescences vary markedly, from terminal goblet-shaped capitula with brightly coloured involucral bracts enclosing the floral parts to cryptically concealed axillary inflorescences. The latter is in response to pollination by small mammals (Wiens & Rourke 1978). Rourke (1998) stated that morphological changes in the western Cape species of Protea represent a significant evolutionary shift from the relatively long-lived, fire-resistant, re-sprouting savannah trees of tropical and south-central Africa. Cape species have a much more rapid generation turnover and consequently they are believed to have diversified greatly under the influence of edaphic and fire pressure and pollinator diversification (Cowling 1987; Johnson 1996; Rourke 1998).

A taxonomically well-studied group, Protea is also well documented from an ecological standpoint. In 1991 the Protea Atlas Project¹ was established at the initiative of the National Botanical Institute (NBI) SA, funded by the World Wildlife Fund (WWF) SA and the Department of Environmental Affairs and Tourism. The project is based at Kirstenbosch Botanic Gardens (NBI) in Cape Town. The aims of the project were to record the distribution of all members of Proteaceae throughout South Africa whilst encouraging amateur involvement in . Distribution records also include detailed ecological data relating to habitat preferences and population dynamics (Rebelo 1995). Plots are limited to an area of 0.5 x 0.5 km, randomly chosen and pinpointed to a single longitude/latitude. To date, almost 45 000 localities (Rebelo pers. comm.) have been recorded, amounting to 200 000 species records from among 375 Proteaceae species. Data submitted to the project will be compiled into the Atlas, to be published at the end of the data collection period in 2001. In the interim, data can be accessed for studies such as the one described here. The current distribution map of Protea species in the CFR, using data collected by the Protea Atlas Project, is shown below.

¹ www.nbi.ac.za/protea/


Figure 4. WorldMap¹ (Williams 1998) representation of Protea species diversity in the CFR. Grid cells are 1/8 degree on a side (~12 km). Colour scale indicates Protea species diversity from zero to a maximum of 23. (Reproduced with kind permission of the Protea Atlas Project, National Botanical Institute, Cape Town.)

Protea thus represents an ideal case study to investigate the timing and causes of plant diversification in the CFR. The outline of this thesis is as follows:

> Chapter 2 presents the reconstruction of a species-level phylogeny for Protea using DNA sequence data.
> Chapter 3 investigates the efficacy of amplified fragment length polymorphisms (AFLPs) in reconstruction of species-level relationships for Protea.
> Chapter 4 utilises the phylogenetic tree to investigate the timing and temporal dynamics of the radiation of Protea.
> Chapter 5 investigates the roles of geography, edaphic specialisation and fire-survival strategy in promoting diversification.
> Chapter 6 presents general conclusions.

¹ worldmap@nhm.ac.uk

1. Protea angustata 2. gazensis 3.

4. 5. 6. Protea lorea

7. Protea tenax 8. 9. Protea nana

10. Protea pendula 11. Protea subulifolia 12. Protea magnifica

PLATES 1-12. Representatives of the genus Protea.

Chapter Two - Molecular Phylogenetics of Protea: Evidence from Plastid and Nuclear DNA Sequences

The origins of the Cape floristic region (CFR) in South Africa have long been the subject of intense interest and debate (Levyns 1952 & 1963; Linder et al. 1992; Rourke 1998). However previous studies addressing these issues have been compromised by lack of a detailed fossil record. Phylogenetic trees offer an alternative source of information to address these questions, but detailed phylogenetic trees for Cape taxa are again lacking. The aim of this chapter is to present such a detailed species-level phylogenetic reconstruction for one of the CFR's 'flagship' genera, Protea L. using DNA sequences. This phylogenetic framework will then be used in later chapters to examine the timing and causes of species richness in this lineage.

Protea comprises some 112 species, making it the largest genus of Proteaceae on the African continent. The centre of diversity and endemism for the genus is the fynbos biome of the CFR, but Protea also extends through tropical Africa as far north as . Protea species are dominant throughout the fynbos biome of the CFR, and it is believed that they have radiated extensively in this region under the influence of edaphic pressure and pollinator diversification (Rourke 1980; Rourke 1998). Compared to species in tropical Africa the genus displays more diversity in the western Cape, with respect to floral and vegetative morphology, life histories and edaphic specialization.

The South African species of Protea have been most recently classified by Rourke (1980 & in prep.), whereas the tropical African species have been studied by Beard (1963 & 1993). Of the 83 South African species, 70 are endemic to the fynbos biome of the CFR. Nine species are found only in the wooded habitats of the summer rainfall regions of South Africa; one species, P. subvestita, is found in both summer and winter rainfall regions; and the remaining three species (P. gaguedi, P. caffra and P. welwitschii) extend northwards into tropical Africa north of the . Rourke (in prep.) split these 83 species into two subgenera: subgenus Hypocephala with five species, all with axillary inflorescences, and subgenus Protea with 18 sections, all characterized by terminal inflorescences.

Beard (1963) placed the tropical species of Protea (33; Beard 1993) into five sections (three of these are common to Rourke's classification scheme), although in a more recent revision Chisumpa and Brummitt (1987) were unable to recognize these formal sectional delimitations. Conflict among the treatments of summer rainfall taxa is likely to be due to difficulties in circumscribing these morphologically plastic species, which are in stark contrast with the distinct species of the western Cape (Rourke 1998).

To date no author has placed their taxonomic expertise in a formal phylogenetic context. Thus beyond sectional delimitations there are no current hypotheses covering inter-relationships within the genus. Particularly lacking are ideas concerning affinities among the two treatments, tropical and South African, since these have only been touched upon in cases for which ranges of widely distributed species overlap. Neither a modern taxonomy nor cladistic analysis has previously been attempted for the genus throughout its entire range.

For the purposes of reconstructing species-level relationships in Protea this study employs DNA sequence data. Although in plants numerous loci from all three genomes have been successfully targeted to infer relationships (summarized in Soltis & Soltis 1998), it is extremely difficult to judge a priori rates of sequence divergence for specific groups. The rate of sequence variation for both coding and non-coding regions varies widely across groups, even among closely related taxa (e.g. subfamilies of Iridaceae, Reeves et al. submitted). In addition, few studies have attempted reconstruction of a species-level phylogeny for all species in a monophyletic group, and there were no examples to emulate in terms of sequencing strategy for a genus of this size. This study represents the first application of molecular data to the systematic study of Protea, and it was therefore necessary to survey several DNA regions to assess their potential for reconstructing relationships within the genus. This chapter describes the use of both nuclear and plastid non-coding regions as a source of phylogenetic information. These comprised the internal transcribed spacer (ITS) region and a region of the plastid-expressed glutamine synthetase gene, both in the nuclear genome, and four regions from the plastid genome: the trnL intron, trnL-F intergenic spacer, rps16 intron and atpB-rbcL intergenic spacer. The outcome of these phylogenetic analyses highlights how investment, in terms of time and cost, can be particularly unpredictable when working at these lower taxonomic levels, and so the performance of DNA sequence data in the reconstruction of species-level relationships is discussed.

Materials and Methods

Total genomic DNA was extracted from 93 accessions, comprising 88 Protea and five Faurea species listed in Table One. DNA was extracted from 0.2-1.0 g of silica-dried material using the 2X CTAB method (Doyle & Doyle 1987), and purified by cesium-chloride/ethidium-bromide density gradient (1.55 g/ml). Purified total DNAs were dialysed in 1X TE and stored at -80°C. Successful amplification also required that all DNA extracts were further purified and concentrated using QIAquick silica columns (Qiagen Inc.) according to the manufacturer's protocol for cleaning PCR products.

For Taq-mediated amplification of both nuclear and plastid regions, 100 µl reactions contained Promega magnesium-free thermophilic buffer (50 mM KCl, 10 mM Tris-HCl, 0.1% Triton X-100), 3 mM MgCl2, 0.004% BSA (Savolainen et al. 1995), 0.2 mM each dNTP, 100 ng of each primer, 2.5 U Taq polymerase and 20-50 ng total genomic DNA.
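As a rough illustration of how such a reaction might be assembled, the sketch below applies the dilution relation C_stock x V_stock = C_final x V_total to three of the components. The final concentrations and the 100 µl volume come from the text; the stock concentrations are assumptions chosen for illustration only and are not reported in this thesis.

```python
# Final concentrations are taken from the text; every stock concentration
# below is an assumption made purely for illustration.
REACTION_VOLUME_UL = 100.0

# name: (assumed stock concentration, final concentration), same unit in both
components = {
    "MgCl2 (mM)": (25.0, 3.0),
    "each dNTP (mM)": (10.0, 0.2),
    "BSA (%)": (0.4, 0.004),
}


def stock_volume_ul(stock, final, total_ul=REACTION_VOLUME_UL):
    """Volume of stock satisfying C_stock * V_stock = C_final * V_total."""
    return final * total_ul / stock


volumes = {name: stock_volume_ul(s, f) for name, (s, f) in components.items()}

for name, vol in volumes.items():
    print(f"{name}: {vol:.1f} ul of stock per reaction")
```

With different stocks only the dictionary entries change; the relation itself is independent of the reagent.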

Amplification of the ITS region (encompassing ITS1, 5.8S and ITS2) was achieved using forward primer 17SE and reverse primer 26SE (Sun et al. 1994). These primers are located in the adjacent 18S and 26S genes, respectively, and 17SE is angiosperm specific. Direct sequencing of ITS PCR products proved impossible due to sequence and/or length heterogeneity within a single individual. Therefore, cloning of individual copies was carried out using the pGEM-T vector system (Promega) following the manufacturer's protocol. In most cases five colonies per taxon were chosen and re-amplified using the colony directly as template DNA.

Amplification of a region of nuclear glutamine synthetase (ncpGS), which encompasses four introns, was carried out using primers 687F and 994R (Emshwiller & Doyle 1999). Those accessions for which DNA was too degraded to amplify ncpGS in one piece were amplified with primers 687F/856R and 853F/994R. The spacer region between the rbcL and atpB exons (atpB-rbcL intergenic spacer) was amplified using primers atpB2F (Savolainen et al. 1994) and rbcL1R (GTT TCT GTT TGT GGT GAC AT; this is the reverse complement of the 1F primer commonly used to amplify rbcL). Primers rps16F and rps162R (Oxelman et al. 1997) were used to amplify the rps16 intron. In most cases greater than 80% overlap was achieved between the complementary strands. Primers 'c' and 'd', and 'e' and 'f' (Taberlet et al. 1991), were used to amplify the adjacent trnL intron and trnL-F intergenic spacer between the trnL and trnF exons in two non-overlapping pieces (primers 'd' and 'e' are direct complements). For each of the above regions, amplification primers were then used as sequencing primers.
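The relationship between rbcL1R and the rbcL 1F primer can be checked mechanically. The following minimal sketch takes the rbcL1R sequence given above and returns its reverse complement; the function name is illustrative and not part of any software used in this study.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, ignoring spaces."""
    seq = seq.replace(" ", "").upper()
    return seq.translate(COMPLEMENT)[::-1]


rbcL1R = "GTT TCT GTT TGT GGT GAC AT"  # primer sequence as given in the text
print(reverse_complement(rbcL1R))  # prints ATGTCACCACAAACAGAAAC
```

The printed sequence begins with ATG, as expected for a forward primer anchored at the start of the rbcL coding region.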

Amplification of each of the five plastid regions was carried out using the following program: denaturation, 94°C, one minute; annealing, 48°C, one minute; extension, 72°C, one minute for 30 cycles. Amplification of ncpGS in one piece was carried out using the following program: denaturation, 94°C, one minute; annealing, 48°C, 30 seconds; extension, 72°C, one minute for 30 cycles with a final extension time of six minutes. For some taxa amplification of ncpGS in two pieces was necessary, using the following modified touchdown PCR program: 94°C, one minute; 55°C (with a decrease of one degree per cycle), 30 seconds; 72°C, one minute for seven cycles, followed by a further 25 cycles with an annealing temperature of 48°C. Initial amplification of the ITS region was carried out using the following program: denaturation, 97°C, one minute; annealing, 50°C, one minute; extension, 72°C, three minutes, for 27 cycles with a final extension period of seven minutes. For re-amplification of cloned ITS products an annealing temperature of 55°C and 25 cycles were used.
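The touchdown program described above can be written out cycle by cycle. The sketch below lists the annealing temperature for each cycle, assuming that the first touchdown cycle anneals at 55°C and that each subsequent touchdown cycle is one degree lower; that reading of the program, and the function name, are assumptions.

```python
def touchdown_annealing(start_c=55, step_c=1, touchdown_cycles=7,
                        final_c=48, final_cycles=25):
    """Annealing temperature (degrees C) for every cycle of the program."""
    # Touchdown phase: start_c, then one step lower on each later cycle.
    temps = [start_c - step_c * i for i in range(touchdown_cycles)]
    # Constant phase at the final annealing temperature.
    temps += [final_c] * final_cycles
    return temps


schedule = touchdown_annealing()
print(schedule[:7])   # touchdown phase
print(len(schedule))  # total number of cycles
```

Under this reading the touchdown phase ends at 49°C, one degree above the 48°C used for the remaining cycles.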

For all of the above regions, amplified double-stranded DNA fragments were purified using QIAquick silica columns (Qiagen Inc.) and directly sequenced on an ABI 377 automated sequencer using standard dye-terminator chemistry following the manufacturer's protocols (Applied Biosystems Inc.). For assembly and editing of the complementary strands, 'Sequence Navigator' and 'Autoassembler' (Applied Biosystems Inc.) were used.

For all nuclear and plastid regions length variation among species was minimal, and thus all sequences were easily aligned by eye. Seven discrete gaps were coded as present/absent (A/T) from the trnL-F (1), atpB-rbcL intergenic spacer (1) and ncpGS (5) regions and added to the end of the matrix. Otherwise, gaps were coded as missing because length variation among species in the remaining indels made assignment of homology difficult.
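The gap coding described above can be sketched as follows. The toy alignment, the column range and the choice of 'A' for "bases present" and 'T' for "bases absent" are all assumptions for illustration; the text does not specify which state was assigned to which condition.

```python
def code_indel(alignment, start, end, present="A", absent="T"):
    """Append one binary character per taxon for the indel in columns [start, end).

    A taxon scores `absent` if the region consists only of gap symbols,
    otherwise `present`. The state assignment is an illustrative assumption.
    """
    coded = {}
    for taxon, seq in alignment.items():
        region = seq[start:end]
        state = absent if set(region) <= {"-"} else present
        coded[taxon] = seq + state
    return coded


# Hypothetical three-taxon alignment with a 3 bp deletion in taxon_2.
toy = {
    "taxon_1": "ACGTTTGACA",
    "taxon_2": "ACG---GACA",
    "taxon_3": "ACGTTTGATA",
}
coded = code_indel(toy, 3, 6)
for taxon, seq in coded.items():
    print(taxon, seq)
```

Appending the coded character to the end of each sequence mirrors the placement of the gap characters at the end of the matrix.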

All cladistic analyses on the following data matrices were performed using the parsimony algorithm of the software package PAUP* version 4.02b (Swofford 2000):

> ITS data set representing 14 Protea and one Faurea species.
> Four plastid data sets combined for 88 Protea and five Faurea species.
> ncpGS data set for 75 Protea and five Faurea species.
> Plastid and ncpGS data sets combined for 88 Protea and five Faurea species.

The search strategy for each of the above analyses used 1000 replicates of random taxon addition and tree bisection-reconnection (TBR) branch swapping, with MULPARS on. All character transformations were treated as equally likely (Fitch parsimony; Fitch 1971). A limit of ten trees was set for each replicate to reduce time spent swapping on large numbers of trees at or near the optimum. To assess internal support, 1000 bootstrap replicates were performed using simple taxon addition and TBR branch swapping, with a limit of ten trees per replicate.
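To make the optimality criterion concrete, the sketch below scores trees under Fitch parsimony for a toy matrix. It is a minimal illustration of how tree length is counted when all state changes cost one step, not a reimplementation of the PAUP* search; the matrix, the trees and the function names are hypothetical.

```python
def fitch(tree, states):
    """Return (state set, steps) for one character on a rooted binary tree.

    tree: nested 2-tuples with leaf names (str) at the tips.
    states: dict mapping each leaf name to its state for this character.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch(tree[0], states)
    right_set, right_steps = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Descendants can agree on a state, so no extra change is needed.
        return shared, left_steps + right_steps
    # No state in common: one change is required on this pair of branches.
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, matrix):
    """Sum Fitch steps over all characters (columns) of the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(fitch(tree, {t: s[i] for t, s in matrix.items()})[1]
               for i in range(n_chars))


# Hypothetical four-taxon matrix and two competing topologies.
toy_matrix = {"A": "AAG", "B": "AAG", "C": "ATC", "D": "GTC"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
print(tree_length(tree_1, toy_matrix), tree_length(tree_2, toy_matrix))
```

A heuristic search such as the one described above evaluates this length for many rearranged topologies and retains the shortest trees found.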

In each analysis the position of a few taxa differed among the topologies of equally parsimonious trees, which resulted in lack of resolution in the strict consensus trees. For this reason Adams consensus trees were constructed for all analyses to demonstrate the consistent patterns present in the data. An Adams consensus is designed to give the highest resolution possible between trees (Wiley et al. 1991) by recovering components of clades present in every tree and relocating taxa responsible for conflict to unresolved positions.
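The loss of resolution in a strict consensus can be illustrated with a small sketch. Note that the code below computes a strict consensus only (the clades present in every input tree); it does not implement the Adams method, which additionally retains nesting relationships when a few taxa change position. The trees and function names are hypothetical.

```python
def clades(tree):
    """Return the set of clades (frozensets of leaf names) in a nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def strict_consensus_clades(trees):
    """Clades shared by every tree: the groups a strict consensus would keep."""
    return set.intersection(*(clades(t) for t in trees))


# Two hypothetical trees that differ only in the placement of C and D.
tree_a = ((("A", "B"), "C"), ("D", "E"))
tree_b = ((("A", "B"), "D"), ("C", "E"))
shared = strict_consensus_clades([tree_a, tree_b])
print(sorted(sorted(c) for c in shared))
```

Only the pair A+B and the root group survive here, which shows how movement of one or two taxa can collapse most of a strict consensus.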

Due to the extremely low number of variable positions in each of the non-coding plastid regions I combined them into a single data set, which will be referred to from here on as the 'plastid data set'. Due to their uniparental mode of inheritance and non-coding nature, these regions would be expected to be congruent. To assess congruence between the plastid and ncpGS data sets the partition-homogeneity test was used (Farris et al. 1995; implemented in PAUP* version 4.02b). This test is a bootstrap approach that randomly partitions characters to test the null hypothesis that characters are randomly distributed across a given partition of a data set. If two data sets are highly incongruent then the summed length of their minimal trees should be significantly shorter than the summed tree lengths obtained from random partitions of the combined data, and the null hypothesis will be rejected. The partition-homogeneity test was carried out with 200 replicates, each with a full heuristic search (comprising three random taxon addition replicates per partition-homogeneity replicate) and TBR branch swapping, saving ten trees per replicate.
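The final step of the test, comparing the observed summed tree length against the summed lengths from random partitions, can be sketched as below. The observed value is the sum of the plastid and ncpGS tree lengths reported later in this chapter (238 and 245); the random-partition sums are invented placeholders, and the convention of counting the observed partition as one of the 200 replicates is an assumption rather than a description of the PAUP* implementation.

```python
def partition_homogeneity_p(observed_sum, random_sums):
    """Fraction of partitions, observed included, with summed length <= observed."""
    as_short = sum(1 for s in random_sums if s <= observed_sum)
    return (as_short + 1) / (len(random_sums) + 1)


observed = 238 + 245  # plastid + ncpGS tree lengths from the Results

# Placeholder values only: 199 made-up sums standing in for random partitions.
random_sums = [490 + (i % 10) for i in range(199)]

p_value = partition_homogeneity_p(observed, random_sums)
print(p_value)
```

A small value indicates that random partitions rarely yield trees as short as the original partition, which is the signature of incongruence.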

[TABLE 1. Protea and Faurea accessions included in this study; table contents not recoverable.]

Results

> ITS region

Between two and five ITS colonies were sequenced from 14 Protea and a single Faurea species. Analysis of the entire ITS region for these taxa (comprising 51 individual ITS copies) included 768 characters, of which 353 (46%) were variable and 155 (20%) potentially parsimony informative. Analysis resulted in 96 equally parsimonious trees of length 548, with a consistency index (CI), including autapomorphies, of 0.76 and a retention index (RI) of 0.80. Figure 1 shows one of the most parsimonious trees; branches not recovered in the strict consensus are marked with a circle.

Analysis of ITS sequences demonstrated that not all ITS copies from a given individual formed a monophyletic group in the following species: P. comptonii, P. curvata, P. canaliculata, P. restionifolia, P. parvula and P. scolopendriifolia. ITS sequences from the remaining eight species were resolved as monophyletic groups but still displayed length and sequence heterogeneity (with the exception of P. laurifolia, since only one ITS copy was sequenced for this taxon).

> Plastid regions

Due to difficulties in obtaining PCR products, P. laetans and F. rubriflora are missing from the atpB-rbcL intergenic spacer data set. Protea lorifolia is missing from the rps16 intron data set. One indel shared by all Faurea species and P. lorea was scored from the trnL-F data set; a second indel, present in all Protea species except P. pendula, P. canaliculata, P. acuminata, P. mucronifolia and P. odorata, was scored from the atpB-rbcL intergenic spacer (these indels are mapped on the tree in Figure 4). Analysis of the plastid data set included 2703 characters, 188 (7%) of which were variable and 92 (3%) potentially parsimony informative. Analysis yielded 9490 trees of length 238 with a CI of 0.84 (including all variable sites) and a RI of 0.91. Within Protea the bootstrap replicates gave >50% support to 18 clades; of these, five had bootstrap percentages exceeding 85%. Within Protea 19 (22%) nodes were resolved in the strict consensus of 9490 trees.

The Adams consensus tree for the plastid data set is shown in Figure 2; those branches not recovered in the strict consensus are marked with a circle. Clades resolved in the Adams consensus tree that are in agreement with existing taxonomy are described in Table 3. These include all members of sections Paracynaroideae and Breviflorae, and subgenus Hypocephala sensu Rourke (in prep.). Seven summer rainfall species belonging to sections Leiocephalae, Lasiocephalae and Patentiflorae comprise a clade, although the remaining species belonging to these sections are unresolved. Protea lorea was resolved as the sister species to the remainder of the genus.

> ncpGS region

Due to the poor quality of total DNA, it was not possible to amplify ncpGS from the following species: P. aspera, P. petiolaris, P. cordata, P. scabriuscula and P. angolensis subsp. divaricata. Heterozygotes were identified in 30 taxa; of these it was possible to sequence the entire ncpGS region without cloning for 24 taxa despite the presence of two slightly heterogeneous alleles. In these cases variation among alleles was manifested as either a single indel and/or single nucleotide substitutions, and it was possible to edit the sequences right up to the indel using internal sequencing primers. For the remaining six species there appeared to be different multiple indels in each allele, and thus it was not possible to sequence these directly from PCR products. The following taxa are therefore also missing from the ncpGS data set: P. susannae, P. lacticolor, P. eximia, P. magnifica, P. nubigena and P. wentzeliana.

Five indels were scored from the ncpGS matrix (mapped onto Figure 4). Two of these marked a clade comprising P. subvestita, P. venusta, P. mundii and P. punctata; the third indel defined a clade comprising P. amplexicaulis, P. decurrens, P. subulifolia and P. humiflora; the fourth indel was common to P. cynaroides, P. scolopendriifolia, P. pruinosa, P. cryophila and P. lorea; and the fifth indel was present only in Faurea species. Analysis of the ncpGS region included 844 characters, of which 205 (24%) were variable and 98 (12%) potentially parsimony informative. Analysis gave 9760 trees of length 245 with a CI of 0.90 and a RI of 0.94. Within Protea, bootstrap replicates produced >50% support for 19 clades, three of which exceeded 85%. Within Protea 20 (26%) nodes were resolved in the strict consensus of 9760 trees. The Adams consensus tree is shown in Figure 3; groups not recovered in the strict consensus are marked with a circle.

Clades recovered in the Adams consensus tree that are in agreement with existing taxonomy are described in Table 3. As for the plastid trees, all members of sections Paracynaroideae and Breviflorae, and subgenus Hypocephala sensu Rourke (in prep.), were monophyletic. The Adams consensus also defined a clade including all summer rainfall taxa with the exception of P. angolensis angolensis and P. roupelliae. This clade occupied a derived position within Cape taxa. A clade comprising P. lorea, P. cynaroides and all included members of Paracynaroideae was resolved as the sister group to the remainder of the genus. Overall resolution was greater in the Adams consensus of ncpGS trees than in the plastid Adams consensus tree. However, with respect to currently recognized sections and intuitive ideas of relationships, both data sets performed similarly in the groups present in the Adams consensus trees (as summarized in Table 3).

> Plastid and ncpGS regions combined

The partition-homogeneity test (Farris et al. 1995) indicated that the plastid and ncpGS data sets were incongruent with respect to one another. However, because many of the groups that were resolved in the Adams consensus tree for each analysis were in agreement with each other and with existing taxonomy, the two data sets were combined in a single analysis.

Analysis of plastid and ncpGS sequences combined included 190 (5%) potentially informative characters; this gave 3280 trees of length 514 with a CI of 0.82 and a RI of 0.89. To illustrate branch lengths, one of the equally most parsimonious trees was chosen at random and is shown in Figure 4. Within Protea, 1000 bootstrap replicates gave >50% support to 20 clades, six of which exceeded 85%. Within Protea, 21 (24%) nodes were resolved in the strict consensus of 3280 trees. The Adams consensus tree is shown in Figure 5; those branches not recovered in the strict consensus are marked with a circle. A summary of the statistics for each analysis is provided in Table 2.

A summary of the clades recovered in the combined Adams consensus tree is shown in Table 3. With the exception of P. nubigena and P. roupelliae, all summer rainfall taxa in the Adams consensus formed a clade that was embedded within Cape taxa. As for the plastid analysis, P. lorea is resolved as the sister taxon to the remainder of the genus. Although overall bootstrap support is still lacking in the combined tree, Table 4 shows that in almost all cases bootstrap support increased in the combined analysis for those clades defined in Table 3.

TABLE 2 a & b. Statistics for each of the three analyses.

a.
            number of     number of parsimony       steps    CI      RI      number of
            characters    informative characters                             trees
plastid     2703          92                        238      0.84    0.91    9490
ncpGS       844           98                        245      0.90    0.94    9760
combined                                            514      0.82    0.89    3280

b.
            % nodes resolved in    % nodes receiving >50%    % nodes receiving >85%
            strict consensus       bootstrap support         bootstrap support
plastid     22                     21                        6
ncpGS       26                     25                        4
combined    24                     23                        7

FIGURE 1. One of the equally most parsimonious trees found from analysis of ITS sequences for 14 Protea and one Faurea species. Number of trees = 96, number of steps = 548, CI = 0.76, RI = 0.80. Branches not recovered in the strict consensus are indicated with a circle. Branch lengths are shown above and bootstrap percentages below the branches. Terminal taxa with identical names were sequenced from the same plant.

FIGURE 2. Adams consensus of 9490 equally most parsimonious trees found from analysis of four plastid DNA non-coding regions for 88 species of Protea. Number of steps = 238, CI = 0.84, RI = 0.91. Branches not recovered in the strict consensus are indicated with a circle. Bootstrap percentages are indicated below branches.

FIGURE 3. Adams consensus of 3280 equally most parsimonious trees found from analysis of ncpGS sequences for 77 Protea species. Branches not recovered in the strict consensus are indicated with a circle. Bootstrap percentages are indicated below the branches.

FIGURE 4. One of the equally most parsimonious trees found in the combined analysis of ncpGS and plastid data sets. Branch lengths are shown above branches. Indels scored from both data sets are indicated on the tree. Taxa in bold are found in summer rainfall regions.

FIGURE 5. Adams consensus of the 3280 equally most parsimonious trees found in the combined analysis of ncpGS and plastid data sets. Number of steps = 514, CI = 0.82, RI = 0.89. Branches not recovered in the strict consensus are indicated with a circle. Bootstrap percentages are indicated below branches.

TABLE 3. Clades recovered in each of the three analyses.

    Taxonomy sensu Rourke       Clades identified in              Clades recovered in         Clades recovered in
    (in prep.) unless           combined plastid and              plastid trees               ncpGS trees
    otherwise stated            ncpGS trees

1   Leiocephalae                P. parvula                        ✓ except                    ✓ except
                                P. dracomontana                   P. dracomontana,            P. angolensis
                                P. dracomontana                   P. dracomontana             subsp. angolensis
                                subsp. inyanganiensis             subsp. inyanganiensis,
                                P. caffra                         P. caffra, P. enervis
                                P. simplex                        & P. welwitschii
                                P. nubigena
    Lasiocephalae               P. welwitschii
                                P. gaguedi
                                P. laetans
    Paludosae (Beard 1963)      P. enervis
    Patentiflorae               P. curvata
    (Beard 1963)                P. rubropilosa
                                P. angolensis subsp. angolensis
                                P. angolensis subsp. divaricata
                                P. comptonii
    Cristatae (Beard 1963)      P. wentzeliana

2   Humifusae                   P. revoluta                       except
                                P. convexa                        P. acaulos
                                P. laevis
                                P. acaulos

3   Crateriflorae               P. effusa
                                P. recondita

4   Breviflorae                 P. mucronifolia
                                P. odorata

5   Vinosae                     P. pendula                                                    except P. acuminata
                                P. canaliculata
    Fruticosae                  P. acuminata

6   Hypocephala                 P. decurrens                      ✓                           ✓
                                P. subulifolia
                                P. amplexicaulis
                                P. humiflora
                                P. cordata

7   Ligullatae                  P. laurifolia                     but also with               but also with
                                P. burchellii                     P. mundii,                  P. longifolia,
                                P. holosericea                    P. punctata,                P. pudens,
                                P. lepidocarpodendron             P. magnifica,               P. neriifolia &
                                P. lorifolia                      P. aurea & P. lacticolor    P. compacta

8   Speciosae                   P. stokoei
                                P. speciosa

9   Exertae                     P. venusta                        ✓ except                    ✓
                                P. punctata                       P. subvestita &
                                P. mundii                         P. venusta
                                P. subvestita

10  Ligullatae                  P. longifolia                     ✓                           ✓
                                P. pudens

11  Paracynaroideae             P. pruinosa                       ✓                           ✓
                                P. cryophila
                                P. scolopendriifolia
                                P. scabriuscula

TABLE 4. Bootstrap percentages for the clades defined in Table 3 for each of the three analyses.

Clade (from Table 3)    Bootstrap percentage
                        Combined    Plastid    ncpGS
2                       89          -          52
3                       74          65         -
4                       100         93         79
6                       77          72         82
8                       98          94         61
9                       59          -          86
10                      97          89         63
11                      97          100        86

- indicates the absence of a clade or bootstrap support of <50%. Percentages in bold indicate an increase in support in the combined analysis.

Discussion

One of the difficulties encountered in this study of Protea was the inability to use the two internal transcribed spacers (ITS1 and ITS2) of nuclear ribosomal DNA. These regions have become commonly exploited sources of variation for interspecific/intergeneric phylogenetic analyses in angiosperms and other eukaryotes. Despite high copy numbers, the near uniformity of ITS paralogues, attributed to rapid concerted evolution, normally allows direct sequencing of PCR products in many taxa. However, in some taxa difficulty with the use of the region stems from the existence of polymorphisms among repeat units, which may cause extensive differentiation even within a single individual (Vogler et al. 1994). Divergent paralogues were detected in all species of Protea and Faurea surveyed to date (15 species). Differentiation among copy types, manifested as both sequence heterogeneity and length variation, made direct sequencing of the ITS region from PCR products impossible.

ITS sequences, however, have been successfully obtained from representatives of all other South African Proteaceae genera without the need for cloning (Barker pers. comm.). ITS sequences obtained by cloning from several Protea species have been subjected to a 'BLAST' search in GenBank, which identified all copies as plant nuclear rRNA sequences, ruling out fungal contamination. Additionally, an angiosperm-specific primer located in the flanking 18S exon was used for amplification, which should have precluded any possible amplification of fungal contaminants. Multiple and divergent ITS copies may also be present in individuals of hybrid origin; however, given the prevalence of multiple copies in all 14 Protea species examined to date, it is unlikely that all accessions are recent hybrids. The end result was that it was impossible to assign homology to each of the copy types, and the non-monophyly of many of the ITS sequences cloned from single individuals rendered phylogenetic inference using this region untenable for Protea species. However, it is useful to mention these findings and problems since the occurrence of ITS paralogy goes largely unreported in the plant systematics literature.

This reconstruction of species-level relationships for Protea included some 2700 nucleotide characters from four non-coding plastid regions. Analysis of these data individually provided such a small number of parsimony informative characters that few groups were resolved in the strict consensus trees (data not shown). Treating all four regions as a single data set, however, provided some resolution, largely at the tips of the phylogeny because few characters defined the spine. The ncpGS gene, which encompasses four introns, provided additional information with a larger proportion (12%) of informative characters. It was not possible to obtain ncpGS sequences from all species due to the presence of two alleles with length heterogeneity in more than one location. These taxa will require cloning to characterize the individual copy types.

Parsimony analysis of all five regions combined using five Faurea species as outgroups yielded fewer than 8% parsimony informative characters. Consistency and retention indices for all matrices were high, a good indication that those characters that did change were mostly in agreement; however, the CI may be inflated simply because the pool of variable characters is small. Despite the consistency among variable characters, high bootstrap support was rare in all analyses. This is perhaps not surprising, since re-sampling of such a large matrix is unlikely to recover groups that are marked by only one or two characters.
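The effect of character number on bootstrap support can be illustrated with a short calculation. The following is a minimal sketch, assuming that each bootstrap replicate draws as many characters as the matrix contains, with replacement, and that a clade is marked by k characters that are not contradicted elsewhere; the function name is hypothetical and the matrix size (2703 plastid plus 844 ncpGS characters) is used only as an example.

```python
def prob_clade_character_sampled(n_chars: int, k_supporting: int) -> float:
    """Probability that at least one of k supporting characters is drawn
    in a bootstrap replicate of n_chars draws made with replacement."""
    if not 0 <= k_supporting <= n_chars:
        raise ValueError("k_supporting must lie between 0 and n_chars")
    # Chance that every one of the n_chars draws misses all k characters.
    p_all_missed = (1.0 - k_supporting / n_chars) ** n_chars
    return 1.0 - p_all_missed


# Example matrix size only: plastid (2703) plus ncpGS (844) characters.
n = 2703 + 844
for k in (1, 2, 3):
    print(k, round(prob_clade_character_sampled(n, k), 3))
```

Under these assumptions a clade marked by a single character can appear in roughly 63% of replicates at most, and one marked by two characters in about 86%; approximately three uncontradicted characters are needed to approach 95%. These figures are best read as upper bounds, because conflicting characters would lower support further.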

The partition homogeneity test rejected the null hypothesis that the data sets were congruent. However, separating sampling error from hard incongruence is the primary challenge in these circumstances (Huelsenbeck et al. 1996), and tests such as the partition homogeneity test will not distinguish between these fundamentally different causes of disparity among data sources (Reeves et al. submitted). In these analyses it is impossible to identify real differences among the data sets that are caused by hard incongruence because the degree of error caused by too few characters in each is large. The situation is further complicated because hard incongruence, if present, is unlikely to be a characteristic that affects placement of all taxa simultaneously (Wiens 1998) and is thus best assessed on a node-by-node basis with measures such as the bootstrap. The challenge of how to assess confidence in the combined topology is discussed further in Chapter 3 with the addition of more variable characters to infer relationships within Protea.

If it were possible to detect hard incongruence between the plastid and nuclear trees, one cause could be hybridization. If hybridization is a common phenomenon among Protea species, this could seriously affect these results because the method of analysis used cannot deal with reticulation. Natural hybrids do occur in the wild between closely related species of Protea, so this is a potential problem. It is extremely unlikely that hybrids have been sampled for this study since all specimens were collected by, or in partnership with, the authorities on the genus in Cape Town. Also, hybrids are nearly always between closely related species (i.e. within species groups of Protea; Rebelo pers. comm.), which means sequences of the parental species in a genus with such low levels of sequence divergence would be identical or nearly so. Therefore, confounding signal as a result of hybridization is unlikely to be discernible.

Despite the lack of bootstrap support and resolution in the consensus tree, which to a large extent precludes a rigorous assessment of existing taxonomic ideas, the trees do allow discussion of more general points relating to the origin and current biogeography of the group. In particular, owing to its extensive distribution through Africa, phylogenetic reconstruction of Protea allows re-evaluation of the origin of these Cape taxa in relation to their tropical and sub-tropical congeners in Africa.

On morphological grounds, it is thought that Protea species with the greatest number of primitive characters occur in the tropics and sub-tropics of Africa, whereas the morphologically specialized species are in the western Cape (Rourke 1998). Thus Rourke (1998) has stated:

"Basing one's arguments on morphological evidence it is clear that the least specialized and probably most-primitive representatives of African Proteoideae originated in a tropical forest environment... From such forms it is possible to trace the emergence of more advanced types (such as Protea) into savannah woodland habitats or montane heathlands, along the high mountain backbone of eastern, central and southern Africa, from where further groups radiated into the fynbos heathlands of the western Cape."

In a phylogenetic reconstruction Rourke's hypothesis would be expected to give a paraphyletic grade of 'primitive' tropical taxa with Cape endemics as the most derived clade (as demonstrated in the hypothetical tree below; Figure 6a):

Figure 6. (a) hypothetical phylogenetic reconstruction of the relationship among tropical and Cape taxa according to Rourke (1998). (b) relationship among tropical and Cape Protea species recovered in the DNA sequence trees.

Contrary to this, all summer rainfall taxa occupied derived positions in the DNA trees presented here (summarized in Figure 6b). Of the 19 summer rainfall taxa included in this analysis, 16 comprise a monophyletic group embedded within Cape species (Figure 4). Of the remainder, P. subvestita and P. roupelliae are monophyletic but embedded within a group of Cape endemics, and P. petiolaris ( and ) occupies an unresolved position (ncpGS sequence is missing for this taxon). Protea subvestita is the only taxon found in both winter and summer rainfall regions of South Africa. Based upon this evidence, extant Protea taxa with presumed plesiomorphic morphology are part of the same radiation that also gave rise to the present-day Cape species and do not represent ancestral lineages with respect to Cape endemics. The tree refutes Rourke's hypothesis that tropical taxa are the most primitive extant representatives of Protea, because the earliest diverging lineages in the DNA sequence tree are all Cape endemics. Whether extant sub-tropical lineages have retained, re-evolved or uniquely evolved this presumed plesiomorphic morphology remains speculative.

In the plastid trees P. lorea occupies a position as sister to the remainder of the genus, and similarly the ncpGS analysis placed P. lorea as a component of a clade including P. cynaroides and Paracynaroideae that is sister to the rest of the genus. In the combined analysis P. lorea is sister to the rest of the genus in all trees. Protea lorea is in section Microgeanthe sensu Rourke (in prep.), and at first this placement was considered extremely suspicious. I therefore extracted and sequenced DNA from a second collection of this species for the ncpGS region and found that the sequences obtained were identical. It is also worth noting that P. lorea and the outgroups lack several plastid insertions present in all other Protea species, thus excluding the former from the main Protea clade. The only indel that P. lorea shared with other Protea species was in the ncpGS matrix where one 19 bp indel was common to P. lorea, P. cynaroides and Paracynaroideae. On morphological grounds there appears to be no justification for the position of P. lorea, but the molecular data consistently separate this taxon as sister to the rest of the genus.

Although the low level of sequence divergence encountered in all DNA regions sampled here has significantly hindered robust phylogenetic reconstruction of relationships within Protea, an attempt to explain this observation should also shed light on the timing of the radiation of this group. Intuitively, the low levels of sequence divergence could be taken to indicate that many of the currently recognized species have developed over a relatively short evolutionary period. This would be in agreement with the widely held view that much of the diversification in the CFR has taken place in the last ca. five million years since the onset of Mediterranean-type climates in the region (Linder et al. 1992; Cowling & Holmes 1992). Alternatively, the Protea lineage could be older but demonstrate a relatively slow rate of sequence change. To investigate this phenomenon further, Chapter 4 uses a phylogenetic estimate for Protea to examine the timing of this radiation in detail.

Chapter Three - Phylogenetic Reconstruction of Protea: Combined Evidence from DNA Sequence Data and AFLP Markers

Detailed phylogenetic reconstruction at the species level is essential for investigating the patterns and processes of speciation within groups (e.g. Pagel 1994; Barraclough et al. 1998; Barraclough et al. 1999). The revolution in automated DNA sequencing technology has improved our ability to carry out such detailed studies, and in plants a wide range of DNA regions, from all three genomes, have been shown to have phylogenetic utility (reviewed in Soltis & Soltis 1998). However, most regions currently employed in plant systematics demonstrate levels of variability suitable for inferring relationships at generic or higher taxonomic levels. Consequently, among the many nuclear, plastid and mitochondrial sequence regions at the disposal of plant systematists, disproportionately few are suitable for the reconstruction of species-level relationships.

The most widely used regions in the reconstruction of relationships at the specific level are the internal transcribed spacers (ITS) of the nuclear ribosomal DNA. However, there are reports (usually unpublished) of an inability to use ITS in phylogenetic reconstruction in many phylogenetically distant taxa due to paralogy. This was found to be true in Protea, the focus of this study, and its nearest relative Faurea. In most instances cloned ITS copy types from a single individual were not monophyletic, and similar paralogous copies were discovered among copy types from all taxa examined (Chapter 2).

Non-coding regions of the plastid genome offer an alternative source of phylogenetic information at the specific level, and numerous regions with phylogenetic potential have been identified (reviewed in Soltis & Soltis 1998). Although plastid non-coding regions have been shown to provide useful information at the specific level most examples ultimately involve the combination of data from several regions, one of which is often ITS (Baum et al. 1998; Richardson et al. in press). For Protea four non-coding plastid regions were sequenced in addition to an alternative nuclear region totalling 3475 characters of which only 8% were parsimony informative. In the combined analysis, the strict consensus of 3430 trees resolved only 24% of nodes (Chapter 2). Speciation must have occurred rapidly relative to the rate of sequence change, and few phylogenetic splits are recorded in any single intron or spacer region. Under circumstances of such low levels of sequence variability the limits of what can be achieved with DNA sequence data appeared to have been reached. Matters may be further complicated in species-groups having undergone a rapid radiation since the rapidity of speciation increases the process of lineage sorting. Therefore, if speciation has occurred faster than the fixation of new alleles in a population, an individual gene genealogy is unlikely to represent the species phylogeny (Avise 1994).

With such low levels of sequence divergence, sequencing additional loci for Protea became prohibitively time consuming and expensive. Therefore the need to improve on the 'starburst' phylogenies that resulted from low sequence variation led to the search for more variable markers to infer species-level relationships. This chapter thus describes the application of AFLPs for phylogenetic reconstruction of species relationships within Protea.

Amplified fragment length polymorphisms (AFLPs; Vos et al. 1995) constitute a fingerprinting technique that samples restriction endonuclease sites over the entire nuclear genome (e.g. Remington et al. 1999, Vuylsteke et al. 1999, Arcade et al. 2000) by selective amplification of particular restriction fragments from a digest of total genomic DNA. The method involves restriction of genomic DNA, ligation of oligonucleotide adapters to the DNA fragments, and high stringency selective amplification of a subset of all the fragments in the digest. The ligation of the oligonucleotide adapters enables PCR to be performed for any species without prior sequence knowledge. The selective amplification uses primers of complementary sequence to the ligated adapter plus one to three additional arbitrary nucleotides. Subsequent electrophoresis of the PCR product typically reveals a complex multi-locus profile of up to 100 bands. These bands are generally treated as dominant markers, with polymorphism detected as band presence or absence. If individuals differ in sequence at one of the restriction sites, at the specific internal bases used for amplification, or in fragment length, they will have different band profiles (Figure 1).
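The way in which a single base change alters a band profile can be sketched in silico. The code below is a simplified illustration, not the laboratory protocol: it assumes a single linear sequence, considers only fragments bounded by one EcoRI site (GAATTC) and one MseI site (TTAA), and reports the length of the internal region rather than the true band size, which would also include site remnants and adapters. The function names and test sequences are hypothetical; the default selective bases follow the EcoRI-AAC and MseI-CTT primer pair.

```python
import re

ECORI = "GAATTC"  # EcoRI recognition site (six-base cutter)
MSEI = "TTAA"     # MseI recognition site (four-base cutter)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq: str):
    """Sorted (start, end, enzyme) tuples for every recognition site."""
    sites = [(m.start(), m.end(), "EcoRI") for m in re.finditer(ECORI, seq)]
    sites += [(m.start(), m.end(), "MseI") for m in re.finditer(MSEI, seq)]
    return sorted(sites)


def aflp_bands(seq: str, eco_sel: str = "AAC", mse_sel: str = "CTT"):
    """Internal lengths of EcoRI-MseI fragments whose bases adjacent to
    each site match the selective bases of the corresponding primer."""
    bands = []
    sites = find_sites(seq)
    for (_, end1, enz1), (start2, _, enz2) in zip(sites, sites[1:]):
        if enz1 == enz2:
            continue  # keep only fragments with one EcoRI and one MseI end
        internal = seq[end1:start2]
        # Each primer reads inward from its own site, so the far end of the
        # fragment is compared on the opposite strand.
        eco_side = internal if enz1 == "EcoRI" else revcomp(internal)
        mse_side = revcomp(internal) if enz1 == "EcoRI" else internal
        if eco_side.startswith(eco_sel) and mse_side.startswith(mse_sel):
            bands.append(len(internal))
    return sorted(bands)


# Hypothetical sequences differing by one substitution at a selective base.
seq_a = "CC" + ECORI + "AACGTGCAAG" + MSEI + "GG"
seq_b = "CC" + ECORI + "ATCGTGCAAG" + MSEI + "GG"
print(aflp_bands(seq_a), aflp_bands(seq_b))
```

In this toy case the substitution removes the band entirely, which is the presence/absence polymorphism that is scored in an AFLP profile.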

AFLPs are more often employed to study within-population and species-boundary questions (summarized in Mueller & Wolfenbarger 1999), so prior experience may lead us to expect that AFLP markers would be so variable among species that phylogenetic inference would be untenable. However, in a group such as Protea with remarkably low levels of sequence divergence, the possibility that these markers would prove to be phylogenetically tractable was thought worthy of investigation. Genomic markers have been successfully utilized by previous authors to resolve species-level relationships. Maguire et al. (1997) used random amplified polymorphic DNA (RAPDs) to examine the affinities of (Proteaceae) species to its sister genus Dryandra. Richardson et al. (in press) used AFLPs to resolve species complexes of Phylica (Rhamnaceae) in which DNA sequence data gave insufficient variation. Similarly, Hodkinson et al. (2000) used AFLPs to infer relationships among species of Phyllostachys (Poaceae). In summary, there are increasing examples of the use and efficacy of fingerprinting methods for cases in which DNA sequence data are unable to provide sufficient phylogenetic information. However, this is a rare example of the application of AFLPs for phylogenetic reconstruction of all species in a large and morphologically variable group.

This chapter also evaluates the contribution of AFLP characters in a combined analysis with the DNA sequence data from Chapter 2. As this is the final estimate of phylogenetic relationships to be used in later chapters it is important to address the reliability of this tree. Problems with obtaining bootstrap support under circumstances of low sequence variability have been highlighted in Chapter 2, and therefore two alternative methods are used in this chapter to assess confidence in the final tree. The first is corroboration of the final topology with existing taxonomy (which reflects morphological characters), and the second is a re-sampling approach to assess whether the amount of data collected is adequate to make a reliable phylogenetic inference.

The desired outcome in a phylogenetic analysis is to collect information until the effect of adding data no longer changes the composition of groups but serves only to increase branch lengths and internal support. However, the amount of data collected in this study is insufficient to obtain high bootstrap support for most branches, so it has been necessary to address the question of whether the data are converging on a stable topology by evaluating the proportion of missed homoplasy that is recovered as characters are added. This approach stems from the observation that if a subset of characters is fitted to the best tree topology (achieved from analysis of all available data) it will almost always give a longer tree length than analysis of those characters alone, i.e. it will reflect homoplasy missed in that character partition. However, as more data are added the proportion of missed homoplasy will diminish because tree lengths become additive as a more stable topology is approached. The objective in this chapter is to ask whether sufficient data have been collected to converge on a stable topology, i.e. whether the proportion of missed homoplasy has stabilized. This approach will not indicate if the data have phylogenetic content because random data are likely to behave in the same way, but it will determine whether a stable outcome is being reached. Phylogenetic content has then to be assessed by corroboration of the tree topology with taxonomy and other data.
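The idea of missed homoplasy can be shown on a toy example. The sketch below is illustrative only: it uses an invented four-taxon binary matrix, counts steps with Fitch parsimony on rooted nested-tuple trees, and replaces the heuristic tree search with an exhaustive comparison of the three possible topologies. All names and data are hypothetical.

```python
def fitch_length(tree, states):
    """Fitch parsimony length of one character on a binary tree given as
    nested tuples of taxon names; states maps taxon name to its state."""
    steps = 0

    def visit(node):
        nonlocal steps
        if isinstance(node, str):
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            return common
        steps += 1  # no shared state: one change is required here
        return left | right

    visit(tree)
    return steps


def tree_length(tree, matrix, columns):
    """Summed Fitch length of the chosen character columns on a tree."""
    return sum(
        fitch_length(tree, {taxon: row[i] for taxon, row in matrix.items()})
        for i in columns
    )


# Invented matrix: three characters group A+B, one character groups A+C.
matrix = {"A": [1, 1, 1, 1], "B": [1, 1, 1, 0], "C": [0, 0, 0, 1], "D": [0, 0, 0, 0]}
trees = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]

all_columns = range(4)
best_tree = min(trees, key=lambda t: tree_length(t, matrix, all_columns))

subset = [3]  # a small character partition
own_length = min(tree_length(t, matrix, subset) for t in trees)
fitted_length = tree_length(best_tree, matrix, subset)
missed = 100.0 * (fitted_length - own_length) / fitted_length
print(own_length, fitted_length, missed)
```

Here the single-character partition needs one step on its own preferred tree but two steps when fitted to the tree favoured by all the data, so half of its fitted length is homoplasy that analysis of the partition alone would have missed.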

FIGURE 1. Schematic of the AFLP procedure. Steps shown: total genomic DNA restricted with enzymes EcoRI and MseI; ligation of adaptor pairs; pre-selective amplification (selective base indicated on each primer); selective amplification. Key: restriction fragments of varying lengths; primers; 5' fluorescent dye label.

Materials and Methods

Genomic DNA used for DNA sequencing in Chapter Two was also used as template for the AFLP procedure. All extracts were further concentrated to obtain sufficient DNA for the AFLP protocol.

AFLPs were conducted according to the AFLP Plant Mapping Protocol of Applied Biosystems Inc. (Warrington, Cheshire, UK). The AFLP procedure involved four major steps:

> Restriction-ligation reaction: Genomic DNA was restricted with enzymes EcoRI (a rare six-base cutter) and MseI (a frequent four-base cutter). In the same reaction EcoRI and MseI adapter pairs were ligated to the restriction sites, generating primer binding sites for the subsequent PCR reactions.

> Preselective amplification: Preselective primers are designed to anneal to the adapter pairs with the addition of a single nucleotide. These primers thus allowed amplification of a subset of the total restriction fragments with the matching nucleotide downstream from the restriction sites. This resulted in a ca. 16-fold (Vos et al. 1995) reduction in the number of amplified fragments.

> Selective amplification: Preselective products were amplified with primers having two additional selective bases; at this stage there were 64 possible primer combinations from which to choose. A primer trial was conducted using 12 primer combinations to identify pairs of selective primers that would be useful to expand the study. Consequently primers MseI-CTT and EcoRI-AAC (yellow fluorescent label) were used to produce AFLP profiles for all Protea species since this combination yielded a suitable number of bands and variation among loci. Selective amplification resulted in a further ca. 256-fold reduction in the number of fragments amplified.

> Fragment analysis: Samples (including GS-500 ROX-labelled internal size standard) were loaded onto a 5% denaturing polyacrylamide gel and run on an ABI PRISM 377 DNA sequencer according to the manufacturer's protocols (Applied Biosystems Inc.).
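The fold reductions quoted for the two amplification steps follow from simple counting. A minimal sketch, assuming that the four nucleotides are equally likely and independent at every selective position (an idealization; real genomes deviate from it), with a hypothetical function name:

```python
def fold_reduction(eco_selective_bases: int, mse_selective_bases: int) -> int:
    """Expected fold reduction in amplified fragments when each primer
    carries the given number of selective bases, assuming equal and
    independent base frequencies."""
    return 4 ** (eco_selective_bases + mse_selective_bases)


preselective = fold_reduction(1, 1)  # one selective base on each primer
selective = fold_reduction(2, 2)     # two further bases on each primer
overall = fold_reduction(3, 3)       # three selective bases per primer in total
print(preselective, selective, overall)
```

Under this assumption the two steps multiply, so only about one fragment in 4096 of the original EcoRI-MseI pool is expected to be amplified by a given pair of three-base selective primers.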

All subsequent analysis and interpretation of AFLP fragments was carried out using Genescan (version 2.02) and Genotyper (version 1.1) analysis software (Applied Biosystems Inc.).

Fragments were scored as 0/1 binary characters and the resulting matrix was analyzed in PAUP* version 4.02b (Swofford 2000) using both the neighbor joining algorithm (NJ; Saitou & Nei 1987) and maximum parsimony (with 1000 random taxon additions, TBR branch swapping and a tree limit of ten trees saved per replicate). In both analyses trees were rooted between Paracynaroideae and the rest of the genus based upon their placement in the analysis of DNA sequence data (Chapter 2). Internal support was assessed with 1000 bootstrap replicates using simple taxon addition and TBR branch swapping, with a limit of ten trees saved per replicate.
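The 0/1 scoring described above can be illustrated with a short sketch. It is not the Genotyper workflow used here; the taxon names and band sizes are hypothetical, and a binary character is treated as potentially parsimony informative when each state occurs in at least two taxa.

```python
def score_bands(profiles):
    """Turn per-taxon sets of band sizes into a 0/1 character matrix.

    profiles: dict mapping taxon name -> set of band sizes present.
    Returns the sorted list of bands and a dict taxon -> list of 0/1 scores.
    """
    bands = sorted(set().union(*profiles.values()))
    matrix = {
        taxon: [1 if band in present else 0 for band in bands]
        for taxon, present in profiles.items()
    }
    return bands, matrix


def is_parsimony_informative(column):
    """True when both states of a binary character occur in at least two taxa."""
    ones = sum(column)
    zeros = len(column) - ones
    return ones >= 2 and zeros >= 2


# Hypothetical profiles: band sizes (in base pairs) observed in four taxa.
profiles = {
    "taxon_A": {101, 150, 210},
    "taxon_B": {101, 150},
    "taxon_C": {101, 210, 305},
    "taxon_D": {101, 305},
}
bands, matrix = score_bands(profiles)
columns = [[matrix[t][i] for t in profiles] for i in range(len(bands))]
informative = [is_parsimony_informative(col) for col in columns]
print(bands)        # [101, 150, 210, 305]
print(informative)  # [False, True, True, True]
```

The band shared by all four taxa is constant and therefore uninformative, which mirrors the single band common to all species reported in the Results.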

Data available for the combined analysis of AFLP and sequence characters are summarized in Table 1. The combined matrix of DNA sequence data (3417 nucleotides) and the AFLP data set was analyzed using maximum parsimony as described above. Internal support was determined with 1000 bootstrap replicates using simple taxon addition, TBR branch swapping with a limit of ten trees saved per replicate.
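The resampling step that underlies the bootstrap replicates mentioned above can be sketched as follows. This is only an illustration of drawing characters with replacement; the tree search on each pseudoreplicate was carried out in PAUP* and is not reproduced here, and the function name and toy matrix are hypothetical.

```python
import random


def bootstrap_matrix(matrix, rng):
    """Resample characters (columns) with replacement, keeping all taxa.

    matrix: dict mapping taxon name -> list of character states.
    Returns a pseudoreplicate with the same number of characters.
    """
    n_chars = len(next(iter(matrix.values())))
    picked = [rng.randrange(n_chars) for _ in range(n_chars)]
    return {taxon: [row[i] for i in picked] for taxon, row in matrix.items()}


# Hypothetical three-character matrix for two taxa.
toy = {"taxon_A": [0, 1, 1], "taxon_B": [1, 1, 0]}
replicate = bootstrap_matrix(toy, random.Random(1))
print(replicate)
```

Bootstrap support for a group is then the percentage of pseudoreplicates in which that group is recovered.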

To evaluate the proportion of homoplasy recovered as characters are added to improve phylogenetic estimation, 50 jackknife matrices (ten each consisting of 50, 100, 150, 200 and 250 characters drawn randomly from the pool of 302 informative characters) were constructed. Two tree scores were calculated for each matrix (see below): first, each matrix was analyzed individually with 100 random addition replicates under the maximum parsimony criterion; second, each matrix was fitted to the 'best tree' found from analyzing all 302 informative characters. The difference between the two scores was then calculated as a percentage of the tree length found from fitting the matrix to the best tree. These values were averaged across the ten replicates for each matrix size, and the standard error calculated for each. The results were presented as a graph of number of characters plotted against percentage of steps longer than the best tree.

[Flow diagram: a subset of x characters is chosen randomly from the 301 informative characters. A: a search for a tree using the random subset of characters gives tree length A. B: the tree length for the random subset of characters is calculated with the 'best tree', giving tree length B. The difference between tree length B and tree length A gives the % of steps longer than the 'best tree'.]
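The bookkeeping for the jackknife procedure described above can be sketched as follows. The tree searches themselves were run in PAUP*, so the tree lengths below are hypothetical placeholders, and the function names are illustrative only.

```python
import random
from statistics import mean, stdev


def draw_characters(n_pool, n_draw, rng):
    """Pick n_draw character indices at random, without replacement."""
    return sorted(rng.sample(range(n_pool), n_draw))


def percent_steps_longer(length_search, length_on_best_tree):
    """Extra steps on the 'best tree', as a percentage of the length on that tree."""
    return 100.0 * (length_on_best_tree - length_search) / length_on_best_tree


def mean_and_standard_error(values):
    """Mean and standard error across the replicate matrices of one size."""
    return mean(values), stdev(values) / len(values) ** 0.5


rng = random.Random(0)
subset = draw_characters(302, 50, rng)  # indices for one 50-character matrix

# Hypothetical (tree length A, tree length B) pairs for three replicates.
lengths = [(90, 100), (176, 200), (129, 150)]
percentages = [percent_steps_longer(a, b) for a, b in lengths]
average, standard_error = mean_and_standard_error(percentages)
print(percentages)  # [10.0, 12.0, 14.0]
```

Each point on the resulting graph corresponds to one such average, with its standard error, for a given matrix size.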

It is possible that the curve artificially levels off towards the endpoint; therefore the above process was repeated using a best tree constructed from 150 parsimony informative characters (drawn randomly from the pool of the total 302). Forty jackknife matrices were built, ten each of 30 and 120 characters. The differences between tree lengths found when each matrix was analyzed individually and fitted to the best tree (found from analyzing all 150 characters) were converted to percentages as described above. The results were shown on the same graph as above.

TABLE 1. AFLP and sequence data for each taxon. Taxonomy is that of Rourke (in prep.) and Beard (1963). Shaded rows indicate taxa for which one or more data source is missing.

Taxon                                       trnL-F   rps16    rbcL-atpB  ncpGS  AFLPs
                                            region   intron   spacer
Subgenus Protea
Section Leiocephalae
Protea caffra                               ✓        ✓        ✓          ✓      ✓
Protea simplex                              ✓        ✓        ✓          ✓      ✓
Protea parvula                              ✓        ✓        ✓          ✓      ✓
Protea dracomontana                         ✓        ✓        ✓          ✓      ✓
Protea dracomontana subsp. inyanganiensis   ✓        ✓        ✓          ✓      X
Protea nubigena                             ✓        ✓        ✓          X      ✓
Protea nitida                               ✓        ✓        ✓          ✓      ✓
Protea inopina                              ✓        ✓        ✓          ✓      ✓
Protea glabra                               ✓        ✓        ✓          ✓      ✓
Protea rupicola                             ✓        ✓        ✓          ✓      ✓
Protea lanceolata                           ✓        ✓        ✓          ✓      ✓
Protea petiolaris                           ✓        ✓        ✓          X      X
Section Patentiflorae
Protea rubropilosa                          ✓        ✓        ✓          ✓      ✓
[name not recovered]                        ✓        ✓        ✓          ✓      ✓
[name not recovered]                        ✓        ✓        ✓          ✓      ✓
Protea angolensis subsp. angolensis         ✓        ✓        ✓          ✓      ✓
Protea angolensis subsp. divaricarta        ✓        ✓        ✓          X      X
Section Lasiocephalae
[name not recovered]                        ✓        ✓        ✓          ✓      ✓
Protea gaguedi                              ✓        ✓        ✓          ✓      ✓
Protea laetans                              ✓        ✓        X          ✓      ✓
Section Cristatae
Protea wentzeliana                          ✓        ✓        ✓          ✓      ✓
Section Paludosae
Protea enervis                              ✓        ✓        ✓          ✓      ✓
Section Cynaroideae
[name not recovered]                        ✓        ✓        ✓          ✓      X
Section Paracynaroideae
Protea scolopendriifolia                    ✓        ✓        ✓          ✓      ✓
Protea scabriuscula                         ✓        ✓        ✓          X      X
Protea cryophila                            ✓        ✓        ✓          ✓      ✓
Protea pruinosa                             ✓        ✓        ✓          ✓      ✓
Section Melliferae
Protea aristata                             ✓        ✓        ✓          ✓      ✓
Protea aristata                             ✓        ✓        ✓          ✓      ✓
[name not recovered]                        ✓        ✓        ✓          ✓      ✓
Protea pudens                               ✓        ✓        ✓          ✓      ✓
[name not recovered]                        ✓        ✓        ✓          ✓      X
Section Ligullatae
Protea eximia                               ✓        ✓        ✓          X      ✓
[name not recovered]                        ✓        ✓        ✓          ✓      ✓
[name not recovered]                        ✓        ✓        ✓          ✓      ✓
[name not recovered]                        ✓        ✓        ✓          X      ✓
[name not recovered]                        ✓        ✓        ✓          ✓      ✓
Protea lorifolia                            ✓        X        ✓          ✓      X
Protea lepidocarpodendron                   ✓        ✓        ✓          ✓      ✓
Protea holosericea                          ✓        ✓        ✓          ✓      X
[name not recovered]                        ✓        ✓        ✓          X      X
[name not recovered]                        ✓        ✓        ✓          ✓      ✓
[name not recovered]                        ✓        ✓        ✓          ✓      ✓
Section Speciosae
[name not recovered]                        ✓        ✓        ✓          ✓      ✓
[name not recovered]                        ✓        ✓        ✓          ✓      ✓
Protea stokoei                              ✓        ✓        ✓          ✓      ✓
[name not recovered]                        ✓        ✓        ✓          ✓      ✓
Section Obvallatae
Protea caespitosa                           ✓        ✓        ✓          ✓      ✓
Section Microgeanthe
Protea scorzonerifolia                      ✓        ✓        ✓          ✓      ✓
Protea lorea                                ✓        ✓        ✓          ✓      X
Protea aspera                               ✓        ✓        ✓          X      X
Protea scabra                               ✓        ✓        ✓          ✓      ✓
Protea piscina                              ✓        ✓        ✓          ✓      ✓
Protea restionifolia                        ✓        ✓        ✓          ✓      ✓
Section Exertae
Protea subvestita                           ✓        ✓        ✓          ✓      ✓
Protea lacticolor                           ✓        ✓        ✓          X      ✓
[name not recovered]                        ✓        ✓        ✓          ✓      ✓
[name not recovered]                        ✓        ✓        ✓          ✓      ✓
[name not recovered] subsp. aurea           ✓        ✓        ✓          ✓      ✓
Protea venusta                              ✓        ✓        ✓          ✓      ✓
Section Criniflorae
Protea foliosa                              ✓        ✓        ✓          ✓      X
Protea tenax                                ✓        ✓        ✓          X      ✓
Protea vogtsiae                             ✓        ✓        ✓          ✓      ✓
Protea intonsa                              ✓        ✓        ✓          ✓      ✓
Protea montana                              ✓        ✓        ✓          ✓      ✓
Section Humifusae
[name not recovered]                        ✓        ✓        ✓          ✓      X
Protea angustata                            ✓        ✓        ✓          ✓      ✓
Protea laevis                               ✓        ✓        ✓          ✓      ✓
Protea convexa                              ✓        ✓        ✓          ✓      ✓
Protea revoluta                             ✓        ✓        ✓          ✓      ✓
Section Crateriflorae
Protea effusa                               ✓        ✓        ✓          ✓      X
Protea recondita                            ✓        ✓        ✓          X      ✓
Protea sulphurea                            ✓        ✓        ✓          ✓      ✓
Protea namaquana                            ✓        ✓        ✓          ✓      ✓
Section Fruticosae
Protea acuminata                            ✓        ✓        ✓          ✓      ✓
Protea scolymocephala                       ✓        ✓        ✓          ✓      X
Section Vinosae
Protea pendula                              ✓        ✓        ✓          ✓      ✓
Protea canaliculata                         ✓        ✓        ✓          ✓      ✓
Section Pinifoliae
Protea nana                                 ✓        ✓        ✓          ✓      ✓
Protea witzenbergiana                       ✓        ✓        ✓          ✓      ✓
Protea pityphylla                           ✓        ✓        ✓          ✓      ✓
Section Breviflorae
Protea mucronifolia                         ✓        ✓        ✓          ✓      ✓
[name not recovered]                        ✓        ✓        ✓          ✓      ✓
Subgenus Hypocephala
Protea amplexicaulis                        ✓        ✓        ✓          ✓      ✓
[name not recovered]                        ✓        ✓        ✓          X      ✓
Protea decurrens                            ✓        ✓        ✓          ✓      ✓
Protea subulifolia                          ✓        ✓        ✓          ✓      ✓
Protea humiflora                            ✓        ✓        ✓          ✓      ✓

Results

AFLPs

Performing AFLPs with one selective primer combination produced 138 scorable bands for 72 taxa. Of the 138 bands only one was common to all species, but many bands were common to a large number of taxa, which made editing of the AFLP profiles in Genotyper straightforward. However, AFLP profiles of the following species were weak and were not included in the subsequent analysis: P. effusa, P. foliosa, P. scabriuscula, P. holosericea, P. scolymocephala, P. dracomontana subsp. inyanganiensis, P. aspera, P. cynaroides, P. longifolia, P. roupelliae, P. petiolaris, P. lorea, P. magnifica, P. lorifolia, P. acaulos and P. angolensis subsp. divaricarta.

In parsimony analysis, 112 (82%) characters were identified as potentially parsimony informative, and gave 26 trees with a length of 609, CI of 0.23 and RI of 0.52. The Adams consensus of 26 trees is shown in Figure 2 (nodes not present in the strict consensus are marked with a circle). The NJ tree is shown in Figure 3. Due to the high degree of topological conformity among the trees derived from the parsimony and NJ analyses, only the parsimony tree is discussed from here on.

Clades identified in the AFLP analysis are summarized in Figure 6b and Table 2. The only clades recovered by the sequence data that are not present in the AFLP tree are clades 7 and 8, which comprise species P. nana and P. scolymocephala (but this is not recovered in the strict consensus of the sequence data trees), and all members of subgenus Hypocephala respectively. However, several clades are better resolved by the AFLP data than the DNA sequence trees. These comprise clade 4 (P. glabra, P. inopina and P. nitida), clade 9 (P. compacta, P. obtusifolia and P. susannae go elsewhere in the DNA sequence trees), clade 12 (P. roupelliae, P. lacticolor and P. aurea go elsewhere in the DNA sequence trees) and clade 13 (P. montana, P. intonsa and P. vogtsiae).

AFLP and DNA sequence data combined

Analysis of all DNA sequence and AFLP data in a combined matrix included 3685 characters and 91 taxa (86 Protea and 5 Faurea species). Protea magnifica and P. petiolaris were excluded from the analysis because these taxa were missing from both the AFLP and ncpGS data sets. Of the included characters 528 (14%) were variable and 302 (8%) variable and potentially parsimony informative. Analysis gave 20 trees of length 1185 with CI of 0.47 and RI of 0.65. One of the 20 equally parsimonious trees is shown in Figure 4 and the Adams consensus tree in Figure 5 (nodes not present in the strict consensus are marked with a circle). In the Adams consensus tree 67 (74%) nodes are resolved, eight of which are not recovered in the strict consensus. Within Protea 17 nodes received bootstrap support of greater than 50%, of which six exceeded 85%.

A summary of the clades recovered in the AFLP, DNA sequence and combined analyses is shown in Figure 6. The sections corresponding to these clades sensu Rourke (in prep.) are detailed in Table 2. The combined analysis of all data produced a topology that is closer to current ideas of relationships (Rourke, in prep.) than any of the individual analyses, and all terminal groupings identified in the combined Adams consensus tree in Figure 6c were recovered in the strict consensus. All summer rainfall (South African and tropical) taxa were resolved as monophyletic with the exception of P. roupelliae and P. subvestita; these taxa instead formed a monophyletic group with section Exertae. Protea lanceolata plus section Humifusae were monophyletic for the first time in the combined analysis. Section Ligullatae has been re-arranged on numerous occasions (Rourke, pers. comm.) but in the combined analysis formed two unrelated groups (clades 9 and 14). Sections Paracynaroideae, Breviflorae and subgenus Hypocephala were resolved exactly according to Rourke's taxonomy. Protea lorea was resolved as the sister taxon to the remainder of the genus. The remaining clades, which only require minor rearrangement of Rourke's taxonomy, are described in Table 2. The combined tree topology is the one used in Chapters 4 and 5 to investigate the temporal dynamics and factors promoting diversification in Protea respectively.

In Figure 7, curve A shows how missed homoplasy was recovered as subsets of the available data were fitted to the 'best tree' (which lies at point X). Initially there was a steep drop in the curve, which then leveled off as more data were added. This provides some evidence that the characters are converging on an outcome, because the proportion of missed homoplasy recovered by adding more data has stabilized. Curve B shows that this leveling off is not simply a result of the best tree used in these calculations lying at point X (the best tree for curve B lies at point Y). Curve B did not demonstrate the same leveling off as curve A but remained in the steep section, which indicates that a large proportion of homoplasy is being recovered at every step and that this number of characters (150) is insufficient to reach a stable topology.

[Figure 2: tree diagram]

FIGURE 2. Adams consensus of 26 equally most parsimonious trees found from analysis of 138 AFLP bands for 72 Protea taxa. Number of steps = 609, CI = 0.23, RI = 0.52. Branches not recovered in the strict consensus are indicated with a circle. Bootstrap percentages are indicated below branches.

[Figure 3: tree diagram]

FIGURE 3. Rooted neighbour joining phylogram derived from analysis of 138 AFLP bands for 72 Protea taxa.

[Figure 4: tree diagram]

FIGURE 4. One of the 20 equally most parsimonious trees found in a combined analysis of DNA sequence and AFLP data sets. Branch lengths are indicated above the branches.

[Figure 5: tree diagram]

FIGURE 5. Adams consensus of 20 equally most parsimonious trees found in a combined analysis of DNA sequence and AFLP data sets for 86 Protea species. Number of steps = 1185, CI = 0.47, RI = 0.65. Branches not recovered in the strict consensus are indicated with a circle. Bootstrap percentages are indicated below branches.

[Figure 6: summary of the clades recovered in the AFLP, DNA sequence and combined analyses]

[Figure 7: number of characters plotted against percentage of steps longer than the best tree (curves A and B)]

TABLE 2. Clades identified in each of the three analyses.

Column headings: Taxonomy sensu Rourke (in prep.) unless otherwise stated; Clades identified in combined AFLP and sequence data trees; Clades recovered in AFLP trees; Clades recovered in DNA sequence trees.

1 Leiocephalae P. parvula /except P. dracomontana P. nubigena P. dracomontana subsp. inyanganiensis P. caffra P. simplex P. nubigena Lasiocephalae P. welwitschii P. gaguedi P. laetans Paludosae (Beard 1963) P. enervis Patentiflorae (Beard 1963) P. curvata P. rubropilosa P. angolensis subsp. angolensis P. angolensis subsp. divaricarta P. comptonii Cristatae (Beard 1963) P. wentzeliana

2 Humifusae P. revoluta 1 except 1 except P. convexa P. laevis & P. angustata P. acaulos P. laevis P. lanceolata & P. lanceolata P. angustata Leiocephalae P. lanceolata

3 Crateriflorae P. effusa I I except P. rupicola P. recondita Leiocephalae P. rupicola

4 Leiocephalae P. glabra 1 P. inopina P. nitida

5 Vinosae P. pendula 1 P. canaliculata Pinifoliae P. pityphylla P. witzenbergiana Fruticosae P. acuminata

6 Breviflorae P. mucronifolia P. odorata

7 Pinifoliae P. nana Fruticosae P. scolymocephala

8 Hypocephala P. decurrens 1 P. subulifolia P. amplexicaulis P. humiflora P. cordata

9 Ligullatae P. longifolia 1 1 except P. pudens P. compacta, P. compacta P. obtusifolia & P. susannae P. susannae P. obtusifolia

10 Speciosae P. stokoei 1 1 P. speciosa

11 Microgeanthe P. restionifolia P. piscina

12 Exertae P. venusta 1 1 except P. punctata P. roupelliae, P. mundii P. lacticolor P. aurea subsp. aurea & P. aurea P. lacticolor P. subvestita Ligullatae P. roupelliae

13 Criniflorae P. montana 1 P. intonsa P. vogtsiae

14 Ligullatae P. laurifolia 1 1 P. burchellii P. holosericea P. lepidocarpodendron P. lorifolia

15 Paracynaroideae P. pruinosa 1 1 P. cryophila P. scolopendriifolia P. scabriuscula

Discussion

There are many fundamental evolutionary questions that can be addressed using well-sampled and robust phylogenetic trees. However, reconstructing suitable phylogenetic trees at the species level can be time consuming and expensive when based on DNA sequence data alone. Even after such efforts, the patterns may be well sampled taxonomically but not necessarily robust, because commonly exploited loci often do not contain sufficient phylogenetic signal to make confident statements. Thus, the aim of this chapter was to assess the efficacy of AFLP markers to augment sequence data in species-level phylogenetic reconstruction.

Analysis of this relatively small number of AFLP characters (138) alone did not provide a robust pattern for Protea, but the performance of the AFLP characters in terms of recovering the clades identified in Figure 6c was better than the five sequence data sets combined. When taking into account the phylogenetic information gained from each data source, the cost in terms of both time and expense for collection of AFLP data was much lower, an important factor to be considered in future data collection for this group. The best estimate of relationships within Protea was achieved through combination of AFLP markers with sequence data, which yielded far fewer trees (20). In the combined analysis RI is high and many more of the groups recovered agree with Rourke's taxonomy than previously achieved (Table 2); this provides some evidence that the trees are reliable even though bootstrap support is generally low.

Even though the combined trees provided the best estimate of relationships thus far, the possibility that the number of parsimony informative characters used to build the tree was too low to provide a clear signal could not be discounted. To evaluate this, the proportion of missed homoplasy recovered by subsets of the data was calculated with respect to the best tree built from all the available data. The expectation was that the curve would begin to level off as the included characters were converging on a common topology. The results confirmed this expectation, which implies that even though the data set does not contain sufficient informative characters to give high bootstrap values, the data are converging on an outcome. With insufficient data the curve will not automatically reach a point where it begins to level off simply because the calculation relies upon comparison to a best tree built from all the available data (as demonstrated in the example using only 150 characters). However, because an artificially constructed data set will also behave similarly (data not shown) there is no way of knowing from this pattern whether there is phylogenetic content in the data. Corroboration of the combined tree topologies by the taxonomic scheme therefore had to be relied upon to provide evidence of phylogenetic signal. Such agreement could not be due to chance.

There are two main concerns when using AFLP bands for phylogenetic reconstruction. The first is whether co-migrating bands are truly homologous (Arnold & Emms 1998; Wolfe & Liston 1998). The sensitivity of the automated approach makes it possible to visualize significant differences in base composition among co-migrating bands because they are sized to a fraction of a base pair. This means that even if two fragments are identical in length, their mobility is also a function of base composition, and thus careful editing of the AFLP profiles can eliminate bands for which homology appears dubious (Fay pers. comm.). The second concern is whether AFLP bands can be treated as independent characters in a parsimony analysis. This is because individual bands may not be completely independent of one another if the loss or gain of a particular band influences the makeup of others. It is difficult to assess how often this may occur without sequencing each band and searching for overlaps, but in principle non-independence could inflate support for groups if double coding were commonplace.

This study represents a preliminary assessment of the efficacy of using AFLPs for phylogenetic reconstruction within Protea, and thus far they appear to be a useful and efficient way to improve resolution in combination with DNA sequences. More studies are needed to look closely at the issue of non-independence, and additional taxon sampling should be carried out in the future to confirm species monophyly with this technique. A relatively small number of AFLP characters were sampled in this study, but it may be possible to infer species-level relationships solely from AFLP characters with the inclusion of further characters (there are 64 alternative primer combinations). In a study of cichlid fish, Albertson et al. (1999) reported increased resolution as they increased to 750 variable markers and increased bootstrap support as they increased to over 1200 variable markers. By contrast this survey included only 137 variable AFLP markers. However, although AFLPs represent a tempting alternative to DNA sequences for studying species-level relationships, a significant drawback is that they cannot be used in any subsequent analyses that rely on an ultrametric tree derived from branch lengths. In Chapters 4 and 5 branch lengths are estimated using maximum likelihood, an approach that models DNA sequence evolution to reconstruct branch lengths, and so it is necessary to exclude the AFLP characters and fit only DNA sequence data to the best topology. Unfortunately this inevitably means that although AFLPs help to fix a topology, some resolution may be lost when branch lengths are fitted using only sequence data.

Chapter Four - Timing and Temporal Dynamics of the Radiation of Protea

Important to understanding the origins of the Cape flora are the time scales over which plant speciation has occurred in the region, particularly with respect to the geologic and climatic conditions that prevailed during the evolution of this diverse and species-rich flora. Many authors have advocated that massive diversification of the CFR flora occurred relatively recently, mostly after climatic deterioration in the late Pliocene when seasonal (Mediterranean-type) climates developed and fire became an important ecological factor (within the last five million years; Cowling 1987; Linder et al. 1992). However, the great diversity of the CFR has also been attributed merely to a long history (Whittaker 1977), explained by the great age of the southern Gondwanan landscapes, which were not glaciated in the Pleistocene.

Understanding of past evolutionary processes over geological time scales has been largely deciphered from the fossil record. Due to the remarkably scarce palaeobotanical record in southern Africa there is a great need to understand the temporal evolutionary patterns of the CFR from a phylogenetic perspective. To date neither of the contrasting hypotheses outlined above regarding the build-up of species diversity has been corroborated with phylogenetic information for all species in a group from the CFR. Using the phylogenetic tree derived in Chapter 3 from DNA sequence and AFLP characters, this chapter aims to provide an estimate for the time scale over which the radiation of Protea has occurred.

Phylogenetic trees based on DNA sequence data enable estimation of divergence times if sequences have a uniform rate of evolution. In these circumstances, a known date of divergence for a given pair can then be used to calculate a rate of substitution (the calibration rate), which can be applied to dating other nodes (Rambaut & Bromham 1998). However, the use of a molecular clock to place absolute dates on lineage divergence times is limited by the validity of applying a calibration rate from one part of the tree to date other nodes. Uniform rates of change across a tree cannot be assumed, as lineage-specific rate variation has been demonstrated for many taxonomic groups including plants (Bousquet et al. 1992; Gaut et al. 1992). Incorrectly assuming the clock may lead to spurious date estimates (Takezaki et al. 1995), and thus any analysis must incorporate explicit means to evaluate rate constancy, and if necessary to deal with rate-variable data. In light of this situation various authors have proposed alternative algorithms that are designed to produce ultrametric trees without assuming a global molecular clock (Sanderson 1997; Rambaut & Bromham 1998; Thorne et al. 1998; Huelsenbeck et al. 2000). The method used in this study to produce an ultrametric tree is Sanderson's method of nonparametric rate smoothing (NPRS), which assumes that evolutionary rates are autocorrelated in time (Sanderson 1997). This means that substitution rates are assumed to be inherited and limits are put on the rate changes from an ancestral to a descendant lineage (Sanderson 1997). For each branch in a given tree the local rate of molecular evolution is estimated, and then the sum of the differences between the local estimated rates is minimized for ancestor and descendant branches across the tree.
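The quantity that rate smoothing seeks to minimize can be illustrated with a small sketch. It is not Sanderson's implementation: it assumes that the differences between ancestral and descendant rates are squared before summing, it simply skips the root (which has no subtending branch), and it only evaluates the criterion for a given set of node ages rather than searching for the ages that minimize it. The tree, branch lengths and ages are hypothetical.

```python
def local_rates(parent, branch_length, age):
    """Local rate on the branch above each node: substitutions per unit time."""
    return {
        node: branch_length[node] / (age[par] - age[node])
        for node, par in parent.items()
    }


def smoothing_criterion(parent, branch_length, age):
    """Sum of squared rate differences between each branch and its ancestor branch.

    Branches attached directly to the root are skipped as 'descendants'
    because the root has no branch of its own (a simplification).
    """
    rates = local_rates(parent, branch_length, age)
    total = 0.0
    for node, par in parent.items():
        if par in rates:  # the parent has its own subtending branch
            total += (rates[par] - rates[node]) ** 2
    return total


# Hypothetical tree: root R -> (X, C); X -> (A, B). Ages are time before present.
parent = {"X": "R", "C": "R", "A": "X", "B": "X"}
age = {"R": 10.0, "X": 4.0, "A": 0.0, "B": 0.0, "C": 0.0}
branch_length = {"X": 6.0, "C": 10.0, "A": 4.0, "B": 8.0}

print(smoothing_criterion(parent, branch_length, age))  # 1.0
```

In this toy example only the branch to B departs from the rate of its ancestor, so it alone contributes to the criterion; under a perfect clock the value would be zero.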

There are no fossils of Protea species to provide a calibration for the tree, but because Proteaceae have a Gondwanan distribution across the Southern Hemisphere it has been possible to use a minimum date for the separation of Africa from to provide the calibration point. Therefore to estimate the time scale over which the radiation of Protea occurred, it has been necessary to reconstruct relationships for the whole family. DNA sequence data have previously been employed to determine generic relationships within Proteaceae (Hoot & Douglas 1998; Barker et al. in prep.); the most comprehensive study used atpB and atpB-rbcL spacer sequences to reconstruct relationships among 46 of the 79 genera currently recognized (Hoot & Douglas 1998). This analysis resolved two major clades comprising subfamilies Grevilleoideae (89% bootstrap support with Carnarvonioideae and Sphalmioideae embedded) and Proteoideae (65% bootstrap support with Eidotheoideae embedded). The largely Australasian and South American Grevilleoideae include only one genus endemic to Africa, , whereas Proteoideae comprise a mixture of African and Australasian taxa. This taxonomic and phylogenetic evidence provides two possible scenarios that could explain the current biogeography of the family. Either the two major lineages diversified after the break up of the supercontinent with subsequent dispersal between the landmasses, or the major lineages existed before the breakup of Gondwana in the early to mid-Cretaceous. The former explanation would allow a minimum date for the final separation of Africa from South America (ca. 105 Mya, Deacon 1992) to be applied to the shared node of the two major lineages, whereas the latter would require that the age be applied to the most recent split of an African lineage from a South American lineage. These alternative scenarios are summarized in Figure 1.
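The effect of choosing one calibration node or the other can be sketched as follows. This assumes an ultrametric tree whose node depths are expressed relative to the root; the node names and relative depths are hypothetical and are not results from this study, and only the 105 mya date comes from the text above.

```python
def calibrate(relative_depth, calibration_node, calibration_age_mya):
    """Scale relative node depths (0 = present) to absolute ages in mya."""
    scale = calibration_age_mya / relative_depth[calibration_node]
    return {node: depth * scale for node, depth in relative_depth.items()}


# Hypothetical relative depths on an ultrametric tree (root depth = 1).
relative_depth = {
    "shared_node_of_major_lineages": 1.0,
    "most_recent_africa_s_america_split": 0.5,
    "focal_clade_root": 0.25,
}

# Date applied to the shared node of the two major lineages.
ages_shared = calibrate(relative_depth, "shared_node_of_major_lineages", 105.0)
# Date applied to the most recent African / South American split.
ages_split = calibrate(relative_depth, "most_recent_africa_s_america_split", 105.0)

print(ages_shared["focal_clade_root"])  # 26.25
print(ages_split["focal_clade_root"])   # 52.5
```

The same relative tree therefore yields different absolute ages for every node depending on which scenario is used to place the calibration.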

[Figure 1: panels (a) and (b), each showing the lineages Proteoideae: Australasia, Proteoideae: Africa, Grevilleoideae: Australasia/S. America and Grevilleoideae: Africa, with the separation at 105 mya marked and dispersal indicated]

FIGURE 1. Alternative hypotheses to explain the current biogeography of Proteaceae. (a) Major lineages evolved before the separation of Africa from Australasia/S.America. (b) Major lineages evolved after the separation of Africa from Australasia/S.America.

Sampling in the analysis of Hoot & Douglas (1998) covered largely Australasian and South American taxa, and only four of the 14 genera from Africa were included. I have expanded the atpB-rbcL spacer data set to include another 20 African Proteaceae species representative of a further nine genera. This framework was used to assess the biogeography of Proteaceae, but to make an estimate of the timing of the Protea radiation it was necessary to collect additional sequence data. Longer sequences are expected to give more robust estimates by reducing the noise introduced by stochastic processes of nucleotide substitution. Therefore, based upon the outcome of the family analysis, additional character sampling encompassing three non-coding regions from the plastid genome (comprising the trnL intron, trnL-F intergene spacer and the rps16 intron) was undertaken for a subset of the genera. This was then the phylogenetic tree used to make an estimate of the time period during which the radiation of Protea occurred.

In addition, a detailed phylogenetic tree of Protea can be used to investigate the dynamics of its radiation over time (Purvis 1996; Nee 1996). Often these data are presented in the form of log number of lineages through time plots (Figure 2).

[Figure 2 axes: log number of lineages (vertical) against relative time since root node (horizontal)]

FIGURE 2. A hypothetical log number of lineages through time plot.

Under the simplest null model of diversification (constant speciation rate with no extinction), the log lineage through time plot for the reconstructed phylogeny should be a straight line, where the slope is equal to the speciation rate. However, several processes may lead to significant departures from a straight line. For example, when background extinction is factored into the null model (a constant-rates birth-death model), the expected plot is a straight line with a curved region that steepens towards the present (Harvey et al. 1994), but alternatively a recent increase in speciation rate may also lead to a significant upturn in the curve. At the opposite extreme, a plateau in the curve towards the present may indicate a decrease in speciation rate, but this pattern can also be caused by taxa missing from the sample (Figure 3).

FIGURE 3. Possible behavior of a log number of lineages through time plot. Axes: relative time since root node (x) against log number of lineages (y). A: constant net speciation with background extinction, or constant net speciation with an increase in rate towards the present. B: constant speciation (pure birth process). C: constant net speciation with a slowdown in rate towards the present, or taxa missing from the sample.
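The curves sketched in Figures 2 and 3 can be built directly from the node ages of an ultrametric tree. The following is a minimal Python sketch, assuming a fully bifurcating tree; the function names and the example node ages are hypothetical and are not data from this study. Under a pure birth process the fitted slope approximates the speciation rate, whereas a plateau towards the present lowers the slope of the most recent part of the curve.

```python
import math

def lineages_through_time(node_ages):
    """Return (age, log lineage count) points for a bifurcating ultrametric tree.

    node_ages: ages (time before present) of all internal nodes, root included.
    """
    points = []
    lineages = 1  # a single lineage exists just before the root split
    for age in sorted(node_ages, reverse=True):  # oldest node first
        lineages += 1  # each bifurcation adds one lineage
        points.append((age, math.log(lineages)))
    return points

def ltt_slope(points):
    """Least-squares slope of log(lineages) against time elapsed since the root."""
    root_age = points[0][0]
    xs = [root_age - age for age, _ in points]
    ys = [y for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical node ages placed so that lineages grow exponentially at rate 0.1
true_rate = 0.1
example_ages = [40 - math.log(n / 2) / true_rate for n in range(2, 20)]
ltt = lineages_through_time(example_ages)
estimated_slope = ltt_slope(ltt)
```

Because the example ages were constructed from an exact exponential, the fitted slope recovers the rate used to generate them; real node ages would scatter around the line.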

Therefore with respect to the alternative hypotheses for Protea:

> If the speciation rate has increased since the onset of Mediterranean-type climates we would expect an upturn in the lineage through time plot starting around five mya (but note that background extinction could leave a similar pattern).
> If species have accumulated over a long period of time we would expect a straight line. If speciation rates have declined the plot may even plateau towards the present.

To evaluate these patterns, the phylogenetic tree, described in Chapter 3 based upon DNA sequence and AFLP characters for 86 Protea species (with branch lengths fitted from only the DNA sequence data), was used to investigate the temporal dynamics of the radiation of Protea.

Materials and Methods

DNA extraction, amplification and sequencing of the trnL-F region, rps16 intron and atpB-rbcL spacer region were performed as described in Chapter Two. All sequences produced here and those from Hoot & Douglas (1998) were aligned by eye, and all gaps were coded as missing data. Phylogenetic trees were estimated under Fitch parsimony with 1000 replicates of random taxon addition and MULPARS on, using PAUP* 4.0 (Swofford 2000).

> Age estimation for the root node of the Protea clade

Analysis 1:

To examine the current biogeographic patterns of the family, phylogenetic relationships were initially inferred from atpB-rbcL spacer sequences for 85 species, representing 54 genera of Proteaceae. Platanus occidentalis was specified as the outgroup taxon. These taxa, with GenBank accession and voucher information, are listed in Table 1. Of these, 41 taxa were previously published by Hoot & Douglas (1998). Seven genera already published by Hoot & Douglas (1998) were duplicated to demonstrate conformity with the DNA sequences produced here.

Analysis 2:

To make a more reliable age estimate for the radiation of Protea, three further non-coding plastid regions (the trnL intron, trnL-F intergene spacer and rps16 intron) were sequenced for a subset of the taxa sampled in analysis 1 (representative of the two major clades within the familial topology). These were combined with complementary sequence data for ten randomly chosen species of Protea (from Chapter Two) and Platanus occidentalis as the outgroup taxon (atpB-rbcL spacer only). The final matrix included 46 taxa (representing 21 Proteaceae genera).

> Producing an ultrametric tree

Branch lengths were fitted to one of the most parsimonious trees derived from analysis 2 using maximum likelihood (ML). Likelihood ratio (LR) tests were used to choose the best model from a series of three substitution models of increasing complexity. The models were: a one-parameter model with base frequencies estimated from the data, equal transition-transversion ratio and equal rates among sites; a two-parameter model with transition-transversion ratio estimated from the data; and a three-parameter model with transition-transversion ratio and a gamma distribution of rate variation among sites estimated from the data. The one-parameter model was first compared with the two-parameter model, and the two-parameter model was then compared with the three-parameter model. At each step the more complex model was chosen if it provided a significantly better fit to the data. The test statistic is twice the difference in log likelihood scores between the simpler and the more complex model. This has a χ² distribution with one degree of freedom (because one parameter is added in each case).
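The model comparison described above can be illustrated with a short Python sketch. It assumes the two models are nested and differ by exactly one parameter, so the statistic is referred to a χ² distribution with one degree of freedom; the function name and the log likelihood values are hypothetical and are not results from this study.

```python
import math

def lr_test_1df(lnl_simple, lnl_complex):
    """Likelihood ratio test for nested models differing by one parameter."""
    stat = 2.0 * (lnl_complex - lnl_simple)
    # Upper tail of a chi-square variable with 1 df: P(X > stat) = erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value

# Hypothetical log likelihoods for a simpler and a more complex model
stat, p_value = lr_test_1df(-5230.4, -5221.9)
choose_complex_model = p_value < 0.05
```

Applied in sequence (one-parameter against two-parameter, then two-parameter against three-parameter), the same function reproduces the stepwise procedure, with the more complex model retained whenever the p-value falls below the chosen threshold.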

Once the ML model had been chosen, the hypothesis of rate constancy was evaluated using a LR test (Felsenstein 1981). The test statistic is twice the difference in log likelihood between a rate-constrained tree (enforcing the molecular clock in PAUP*) and a tree with no constraints on branch lengths. The degrees of freedom are equal to the difference between the number of branches in the unrooted unconstrained tree (2n-3) and the number of internal nodes in the rooted constrained tree (n-1), i.e. n-2, where n is the number of taxa. If the test statistic is greater than the 95th percentile of the χ² distribution, the constrained hypothesis may be rejected as significantly worse than the unconstrained hypothesis. Where the molecular clock was rejected, an ultrametric tree was produced from the ML branch lengths using the NPRS method of Sanderson (1997) in TreeEdit³ version 1.0 alpha 4-61 (Rambaut & Charleston 2000). This tree was then used to calibrate the timing of the radiation based upon a South America-Africa Gondwanan split date of 105 myr (Deacon 1992). To evaluate the two scenarios outlined in the introduction, the calibration date was applied in turn to the shared node of Grevilleoideae and Proteoideae, and to the most recent split of an African lineage from a S. American lineage.
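The molecular clock test can be sketched in the same way. The Python example below is illustrative only: it computes the degrees of freedom as (2n-3) - (n-1) = n-2 and evaluates the χ² tail probability with a closed-form series for integer degrees of freedom, so that no external library is needed. The function names and the log likelihoods are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum of x^j / (1*3*...*(2j+1))
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total

def clock_lr_test(lnl_clock, lnl_free, n_taxa):
    """LR test of a clock-constrained tree against an unconstrained tree."""
    stat = 2.0 * (lnl_free - lnl_clock)
    df = (2 * n_taxa - 3) - (n_taxa - 1)  # equals n_taxa - 2
    return stat, df, chi2_sf(stat, df)

# Hypothetical log likelihoods for a 46-taxon tree
stat, df, p_value = clock_lr_test(-6120.0, -6070.0, 46)
reject_clock = p_value < 0.05
```

When the clock is rejected, as in this hypothetical case, branch lengths cannot be read directly as time and a rate-smoothing step such as NPRS is needed before calibration.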

To calculate the standard error in divergence time estimates (caused by sampling only a finite number of characters), 100 bootstrap matrices of the plastid data set were produced using the Seqboot algorithm in Phylip version 3.573 (Felsenstein 1995; PAUP* 4.0 does not have this feature). Branch lengths were then fitted to the best topology derived in analysis 2 for each of these matrices. In each case ML branch lengths were fitted as described above, and an ultrametric tree produced by transforming branch lengths with NPRS. The bootstrap distribution of divergence times for the root node of the Protea clade was obtained from the resulting 100 ultrametric trees.
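Once the 100 ultrametric trees are available, summarising the bootstrap distribution is a simple calculation. The Python sketch below uses hypothetical replicate ages, not the values obtained here, and reports both the standard deviation of the replicates and the standard error of their mean, since either may be quoted as the uncertainty of the age estimate.

```python
import math
import statistics

def summarise_bootstrap_ages(ages):
    """Return the mean, standard deviation and standard error of the mean."""
    mean_age = statistics.mean(ages)
    sd = statistics.stdev(ages)        # spread of the replicate estimates
    sem = sd / math.sqrt(len(ages))    # standard error of the mean
    return mean_age, sd, sem

# Hypothetical root-node ages (myr) from eight bootstrap replicates
replicate_ages = [33.0, 35.0, 36.0, 37.0, 37.0, 38.0, 39.0, 41.0]
mean_age, sd, sem = summarise_bootstrap_ages(replicate_ages)
```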

> Temporal dynamics of the radiation of Protea

Analysis 3:

To investigate the temporal dynamics of the radiation, one of the most parsimonious trees derived in Chapter Three using all sequence and AFLP characters for 86 Protea taxa was fitted with branch lengths using only the DNA sequence data. ML branch lengths were fitted using the three-parameter model described above, which was chosen after performing a series of LR tests. As above, the hypothesis of rate constancy was evaluated using a LR test (Felsenstein 1981). An ultrametric tree was then produced from the ML branch lengths using NPRS (Sanderson 1997) in TreeEdit v. 1.0 (Rambaut & Charleston 2000). This tree was used to produce a plot of log number of lineages through time for the radiation of Protea. Node ages were calculated in absolute time using the calibration obtained in analysis 2 for the root node of Protea.

³ http://evolve.zoo.ox.ac.uk/software/TreeEdit/

The overall per-lineage rate of species diversification within Protea was estimated from the tree using the formula below. This represents a maximum likelihood estimate of the diversification rate under a constant speciation rate model (Baldwin & Sanderson 1998).

\[ \hat{S} = \frac{N - 2}{B} \]

where N is the total number of lineages (representing the total number of reconstructed speciation events) and B is the sum of all the branch lengths calibrated in absolute time (representing the total 'lineage' time available for speciation events to occur). The estimate will have associated error due to the finite number of observations used in the calculation. Confidence intervals based on these errors are given by the following formula (Baldwin & Sanderson 1998; Nee in press):

\[ \frac{\hat{S}}{1 \pm \dfrac{1.96}{\sqrt{N - 2}}} \]

To test for significant changes in diversification rate over time the rate was also estimated separately for the two halves of the curve, before and after 'half time'. For the second half diversification rate can be calculated as:

\[ \hat{S} = \frac{N_{\mathrm{end}} - N_{\mathrm{start}}}{B} \]

where B is the sum of branch lengths in the chosen time interval.
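The rate calculation can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually used: the rate is taken as the number of speciation events divided by the summed branch length, and the 95% interval is assumed to take the form rate / (1 ± 1.96/√events). The function names and the value of B are hypothetical; B was chosen only so that the example rate is close to the overall estimate reported in the Results.

```python
import math

def diversification_rate(n_events, total_branch_length):
    """Per-lineage rate: speciation events divided by total lineage time."""
    return n_events / total_branch_length

def rate_confidence_interval(rate, n_events, z=1.96):
    """Assumed interval form: rate / (1 +/- z / sqrt(n_events))."""
    half_width = z / math.sqrt(n_events)
    if half_width >= 1.0:
        raise ValueError("too few events for this approximation")
    return rate / (1.0 + half_width), rate / (1.0 - half_width)

# Whole tree: N = 86 lineages gives N - 2 = 84 events; B is hypothetical (myr)
n_lineages = 86
total_b = 1555.6
rate = diversification_rate(n_lineages - 2, total_b)
lower, upper = rate_confidence_interval(rate, n_lineages - 2)
```

With these inputs the rate is about 0.054 and the interval runs from roughly 0.044 to 0.069. For a sub-interval of the tree the same functions apply, with the number of events taken as N_end - N_start and B restricted to the branch length falling inside that interval.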

Another source of error is the uncertainty in the divergence time estimates, as reflected in the bootstrap distribution of age estimates. This was taken into consideration by re-estimating the diversification rate and confidence intervals for the upper and lower dates generated from the standard error of the mean of the bootstrap distribution.

TABLE 1. Proteaceae taxa sampled in Analysis 1. Classification is after Hoot & Douglas (1998; updated from Johnson & Briggs 1975).

Taxon | Voucher | Literature citation / GenBank accession | Geographical distribution of genera

Proteoideae
Aulax pallasia Stapf | Rebelo & Reeves 28, K | This chapter | Africa
[…] umbellata (Thunb.) R. Br. | Rebelo & Reeves 36, K | This chapter | Africa
Faurea rubriflora Mamer | Rebelo & Reeves 84, K | This chapter | Africa
Faurea rochetiana (A. Rich.) Pic. Serm. | Rebelo & Reeves 60, K | This chapter | Africa
[…] Harv. | Rebelo & Reeves 63, K | This chapter | Africa
[…] E. Phillips | Chase & Fay 128, K | This chapter | Africa
[…] E. Phillips | Rebelo & Reeves 67, K | This chapter | Africa
[…] salignum P. J. Bergius | Rebelo & Reeves 16, K | This chapter | Africa
Leucadendron chamelaea (Lam.) I. Williams | Rebelo & Reeves 89, K | This chapter | Africa
[…] (Lam.) Fourc. | Rebelo & Reeves 99, K | This chapter | Africa
Leucadendron meridianum I. Williams | Rebelo & Reeves, No voucher | This chapter | Africa
[…] cordifolium (Salisb. ex Knight) Fourc. | Rebelo & Reeves, No voucher | This chapter | Africa
Leucospermum saxsosum S. Moore | Rebelo & Reeves 85, K | This chapter | Africa
Leucospermum pedunculatum Klotzsch | Rebelo & Reeves 97, K | This chapter | Africa
Leucospermum truncatulum (Salisb. ex Knight) Rourke | Rebelo & Reeves 96, K | This chapter | Africa
[…] dispersus Levyns | Rebelo & Reeves 72, K | This chapter | Africa
Paranomus candicans (Thunb.) Kuntze | Rebelo & Reeves 104, K | This chapter | Africa
Paranomus spathulatus (Thunb.) Kuntze | Rebelo & Reeves 101, K | This chapter | Africa
[…] parilis Salisb. ex Knight | Rebelo & Reeves 90, K | This chapter | Africa
[…] fasciflora Salisb. ex Knight | Rebelo & Reeves 91, K | This chapter | Africa
Serruria nervosa Meisn. | Rebelo & Reeves 95, K | This chapter | Africa
[…] zeyheri Pappe ex Hook. f. | Rebelo 94, photo | This chapter | Africa
[…] cucullatus (L.) R. Br. | Rebelo & Reeves 98, K | This chapter | Africa
Vexetorella amoena (Rourke) Rourke | Rebelo & Reeves 100, K | This chapter | Africa
Vexetorella obtusata (Thunb.) Rourke subsp. obtusata | Rebelo & Reeves 102, K | This chapter | Africa
Spatalla incurva (Thunb.) R. Br. | Rebelo & Reeves 105, K | This chapter | Africa
Protea inopina Rourke | Rebelo, No voucher | This chapter | Africa
Protea laurifolia Thunb. | Chase & Fay 77, K | This chapter | Africa
Protea nubigena Rourke | Rebelo, No voucher | This chapter | Africa
Protea lorifolia (Salisb. ex Knight) Fourc. | Rebelo & Reeves 14, K | This chapter | Africa
Protea odorata Thunb. | Rebelo, No voucher | This chapter | Africa
Protea recondita H. Buesk ex Meisn. | Rebelo, No voucher | This chapter | Africa
Protea comptonii Beard | Rebelo & Reeves 62, K | This chapter | Africa
Protea humiflora Andrews | Rebelo & Reeves 11, K | This chapter | Africa
Protea neriifolia R. Br. | Rebelo & Reeves 2, K | This chapter | Africa
Protea magnifica Link | Rebelo & Reeves 12, K | This chapter | Africa
Protea speciosa (L.) L. | Chase & Fay 98, K | This chapter | Africa
[…] sericea Labill. | Chase 10151, K | This chapter | Australasia
[…] Labill. | Douglas 271 (ex 125), MEL | Hoot & Douglas (1998) AF060739 | Australasia
[…] fucifolia R.Br. | Douglas 392, MEL | Hoot & Douglas (1998) AF060721 | Australasia
[…] montana (Brongn. & Gris) Virot | NSW368725 | Hoot & Douglas (1998) AF060749 | Australasia
[…] odorata R.Br. | Douglas 403, MEL | Hoot & Douglas (1998) AF060717 | Australasia
[…] montanum R.Br. | Douglas 243, MEL | Hoot & Douglas (1998) AF060733 | Australasia
[…] latifolia R.Br. (Steud.) | Douglas 655, MEL | Hoot & Douglas (1998) AF060738 | Australasia
[…] dawsonii F. Muell ex R. T. Baker | Chase 10150 | This chapter | Australasia
Isopogon buxifolia R.Br. | NSW397508 | Hoot & Douglas (1998) AF060734 | Australasia
[…] circinata Kippist ex Meisn. | Douglas 372, MEL | Hoot & Douglas (1998) AF060735 | Australasia
[…] mitchelii Meisn. | Douglas 512, MEL | Hoot & Douglas (1998) AF060728 | Australasia
[…] media A.S.George | Douglas 303, MEL | Hoot & Douglas (1998) AF060729 | Australasia

Sphalmioideae
[…] racemosum (C.T.White) B.G.Briggs, B.Hyland & L.A.S. Johnson | Douglas 635, MEL | Hoot & Douglas (1998) AF060719 | Australasia

Carnarvoniodeae
[…] araliifolia F.Muell. | Douglas 628, MEL | Hoot & Douglas (1998) AF060726 | Australasia

Grevilleoideae
[…] lancifolia F.Muell. | Douglas 335, MEL | Hoot & Douglas (1998) AF060718 | Australasia/S.America
[…] excelsa R.Br. | Douglas 366, MEL | Hoot & Douglas (1998) AF060744 | Australasia
[…] sublimis F.Muell. | Weston s.n., NSW | Hoot & Douglas (1998) AF060753 | Australasia
[…] salignus R.Br. | Douglas 331, MEL | Hoot & Douglas (1998) AF060743 | Australasia
[…] celsissima F.Muell. | Douglas 290, MEL | Hoot & Douglas (1998) AF060742 | Australasia
[…] heterophylla L.S.Sm. | Douglas 610, MEL | Hoot & Douglas (1998) AF060725 | Australasia
[…] tasmanica W. M. Curtis | Chase 558, K | This chapter | Australasia/S.America
[…] (C.F.Gaertn.) Domin | Douglas 364, MEL | Hoot & Douglas (1998) AF060722 | Australasia/S.America
[…] wickhamii (W.Hill ex F.Muell.) P.H.Weston & Cripp | Weston s.n., NSW | Hoot & Douglas (1998) AF060752 | Australasia
Telopea sp. | Weston s.n., NSW | Hoot & Douglas (1998) AF060758 | Australasia
[…] coccineum Forst. et f. | Weston s.n., NSW | Hoot & Douglas (1998) AF060754 | S.America
[…] australasica F.Muell. | Douglas 509, MEL | Hoot & Douglas (1998) AF060724 | Australasia
[…] scottianum (F.Muell.) F.Muell. | Douglas 669, MEL | Hoot & Douglas (1998) AF060741 | Australasia
[…] montana (C.T.White) Foreman | Douglas 670, MEL | Hoot & Douglas (1998) AF060740 | Australasia
[…] inaequalis (Pohl) Engler | Walter 2696, NSW | Hoot & Douglas (1998) | S.America
Euplassa occidentalis I. M. Johnston | Plana 11, K | This chapter | S.America
[…] bleasdalei (F.Muell.) Sleumer | NSW 368723 | Hoot & Douglas (1998) AF060748 | Australasia/S.America
Brabejum stellatifolium L. | Rebelo & Reeves 106, K | This chapter | Africa
[…] integrifolia Maiden & Betche | Chase 10149, K | This chapter | Australasia
[…] C.L.Gross & P.H.Weston | NSW368737 | Hoot & Douglas (1998) AF060750 | Australasia
[…] ferruginea (Meisn.) Pittier | Plana 34, K | Hoot & Douglas (1998) AF060756 | S.America
Panopsis pearcei Rusby | Plana 37, K | This chapter | S.America
[…] formosa Sm. | Douglas 201, MEL | Hoot & Douglas (1998) AF060737 | Australasia
[…] macrophylla Pohl. | Douglas 131, MEL | Hoot & Douglas (1998) AF060713 | S.America
[…] hilliana F. Muell. | Chase 10148, K | This chapter | Australasia
[…] McGillivray | Douglas 242, MEL | Hoot & Douglas (1998) AF060747 | Australasia
[…] trinervia C.T.White | Douglas 376, MEL | Hoot & Douglas (1998) AF060720 | Australasia
[…] heterophylla L.S.Smith | Douglas 599, MEL | Hoot & Douglas (1998) AF060727 | Australasia
Banksia cuneata A.S.George | Douglas 653, MEL | Hoot & Douglas (1998) AF060731 | Australasia
[…] coriaceum C.T.White & W.D.Francis | Douglas 110 (ex.262a) | Hoot & Douglas (1998) AF060712 | Australasia
[…] toru (A.Cunn.) L.A.S.Johnson & B.G.Briggs | Douglas 300 | Hoot & Douglas (1998) AF060736 | Australasia

Bellendenoideae
[…] montana R.Br. | Douglas 400 | Hoot & Douglas (1998) AF060715 | Australasia

Eidotheoideae
[…] zoexylocarya A.W.Douglas & B.Hyland | Douglas 377 | Hoot & Douglas (1998) AF060714 | Australasia

Outgroup
Platanus occidentalis L. | No voucher information | Hoot & Douglas (1998) AF060755 | Northern hemisphere

Results

> Age estimation for the root node of the Protea clade

Analysis 1:

Analysis of atpB-rbcL spacer sequences for 54 genera (85 taxa) of Proteaceae and one outgroup taxon included 995 characters, of which 380 (38%) were variable and 168 (17%) potentially parsimony informative. Analysis gave 2620 equally most parsimonious trees of length 666 with a CI of 0.72 and a RI of 0.81. One of the equally most parsimonious trees, with the geographical distribution of taxa, is shown in Figure 4. The seven genera for which sequences were already available from Hoot & Douglas (1998) and duplicated here to verify conformity among sequences form sister taxa in every case.

Where taxon sampling overlaps, the topology of the strict consensus tree agrees with the tree of Hoot & Douglas (1998). The two major clades resolved approximately equate to an African Proteoideae clade and an Australasian/South American Grevilleoideae clade (sensu Johnson & Briggs 1975). The Australasian taxa Isopogon, Adenanthos, Petrophile and Beauprea, which belong to Proteoideae, fall with the African Proteoideae, and the grevilleoid Brabejum, endemic to the CFR, is placed in the Australasian/South American grevilleoid clade. Further Proteoideae from Australasia are unresolved in this analysis but fall with the African clade in the analysis of Hoot & Douglas (1998).

This tree was not used to estimate the age of the root node of Protea because this analysis included too few variable sites to be accurate for rate estimates. The increased character sampling in analysis 2 was intended to reduce error caused by stochastic processes. In addition this phylogenetic tree is not ideal for specifically estimating the age of the root node of Protea because it mixes the Protea and Faurea species.

Analysis 2:

Analysis of four plastid regions (atpB-rbcL spacer, trnL intron, trnL-F intergene spacer and rps16 intron) for 46 taxa, using Platanus as the outgroup, included 2716 characters, of which 562 (21%) were variable and 258 (9%) potentially parsimony informative. Analysis gave 3334 equally parsimonious trees of length 808 with a CI of 0.78 and a RI of 0.88. A three-parameter model of sequence evolution was chosen to fit ML branch lengths to one of the most parsimonious trees after performing a series of LR tests. This tree with ML branch lengths is shown in Figure 5. A summary of ML log likelihood calculations is shown in Table 2a. A molecular clock was rejected with a p-value of < 0.005; ML calculations with and without a molecular clock are summarized in Table 2b.

Due to rate heterogeneity, Sanderson's method of nonparametric rate smoothing (NPRS) was applied to produce an ultrametric tree. This NPRS tree with ML branch lengths is shown in Figure 6. To calibrate the tree in absolute time, a minimum date of 105 myr was used to define the split between the largely African and S. American/Australasian clades. When applied to the shared node between Grevilleoideae and Proteoideae, this calibration gave a minimum date of 36 myr for the node representing the root of the Protea clade (indicated on Figure 6). The most recent split between an African and a S. American lineage was the node shared by Brabejum and Panopsis. Calibration at this node gave an estimated minimum age for the root node of Protea of 125 myr and an estimated age for the root node of the family of 354 myr. Because an age of 354 myr for the family is not feasible, the first calibration (36 myr for the root node of Protea), which invokes dispersal to explain the current biogeography of the family, was used in all subsequent calculations.
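Calibrating an ultrametric tree amounts to rescaling all node depths by a single factor fixed at the calibration node. The Python sketch below illustrates this with hypothetical relative node depths (arbitrary units), chosen only so that the arithmetic is easy to follow; the function name and dictionary keys are likewise illustrative.

```python
def calibrate(relative_depths, calibration_node, calibration_age):
    """Scale relative node depths so the calibration node has the given age."""
    scale = calibration_age / relative_depths[calibration_node]
    return {node: depth * scale for node, depth in relative_depths.items()}

# Hypothetical relative node depths read from an ultrametric tree
depths = {
    "Proteoideae/Grevilleoideae split": 0.070,
    "Protea root": 0.024,
}
ages = calibrate(depths, "Proteoideae/Grevilleoideae split", 105.0)
protea_root_age = ages["Protea root"]
```

Because every node is multiplied by the same factor, applying the 105 myr date to a shallower node inflates all other ages in proportion, which is why calibrating at a more recent split yields much older estimates for both Protea and the family.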

To evaluate whether the estimated date was strongly influenced by sampling error due to too few sequence data, the divergence time for the Protea radiation was recalculated 100 times using branch lengths derived from bootstrapping the four DNA regions. The resulting bootstrap distribution of age estimates (Figure 7) gave a mean age estimate of 37 myr with a standard error of 1.0. This mean age differs slightly from the date of 36 myr estimated from the tree in Figure 6; the mean and standard error from the bootstrap distribution are used to estimate diversification rates in analysis 3.

FIGURE 7. Bootstrap (100 replicates) distribution of age estimates for the root node of Protea using a 105 myr calibration for the node separating subfamilies Proteoideae and Grevilleoideae (indicated on Figure 6). Axes: age of the root node of Protea in millions of years (x, 29 to 49) against percent of bootstrap replicates (y, 0 to 25).

> Temporal dynamics of the Protea radiation

Analysis 3:

As in analysis 2, a three-parameter model of sequence evolution was chosen after performing a series of LR tests to fit ML branch lengths to one of the most parsimonious trees (comprising 86 Protea species). This tree with ML branch lengths is shown in Figure 8 and a summary of ML log likelihood calculations is shown in Table 3a. A molecular clock was rejected with a p-value of <0.005; ML calculations with and without a molecular clock are summarized in Table 3b. To produce an ultrametric tree, Sanderson's (1997) method of nonparametric rate smoothing (NPRS) was used as in analysis 2. This NPRS tree with ML branch lengths is shown in Figure 9.

For Protea, a lineage through time plot (Figure 10, with absolute time in millions of years on the x-axis) shows linear behavior until the slope decreases markedly in the last 20 myr. The average rate of diversification within Protea was estimated as Ŝ = 0.054, with an upper confidence limit of 0.068 and a lower limit of 0.044. The diversification rate estimated from the linear section of the plot (36-20 mya) was Ŝ = 0.128, with an upper confidence limit of 0.194 and a lower limit of 0.095. For the region from 20 mya to the present the diversification rate was estimated as Ŝ = 0.039, with an upper confidence limit of 0.054 and a lower limit of 0.031. The diversification rate and its confidence interval were also calculated using ages for the root node of Protea of 36 and 38 myr, corresponding to 37 ± the standard error estimated from the bootstrap distribution of age estimates. Taking this uncertainty into consideration, a summary of the estimated diversification rates for the entire plot, for 37-20 mya and for 20 mya to the present is shown in Table 4. The confidence intervals calculated for the first and second regions of the lineage through time plot do not overlap, indicating that there has been a significant slowdown in diversification rate in the last 20 myr. In addition, this plot shows no evidence of background extinction, which would be expected to steepen the lineage through time plot towards the present.

FIGURE 10. Log lineages through absolute time plot for Protea. Axes: node age in millions of years (x, -40 to 0) against log number of lineages (y, 0 to 5).


FIGURE 4. One of the 2620 equally parsimonious trees found from analysis of atpB-rbcL spacer sequences. Number of steps = 666, CI = 0.72, RI = 0.81. Nodes not recovered in the strict consensus are indicated with a circle. Branch lengths are indicated above and bootstrap percentages below branches. Colour codes indicate geographical distribution of genera (Africa; Australasia; South America; S. America & Australasia).

FIGURE 5. One of the equally most parsimonious trees fitted with ML branch lengths (indicated above branches) found from analysis of trnL-F, rps16 intron and atpB-rbcL spacer sequences for 46 Proteaceae taxa.

FIGURE 6. NPRS tree from ML branch lengths for 46 Proteaceae. Calibration of the node separating Proteoideae and Grevilleoideae at 105 mya gives a date for the root node of Protea of 36 mya.

FIGURE 8. One of the equally most parsimonious trees found from analysis of all DNA sequence and AFLP data for 86 Protea species. ML branch lengths derived from the DNA sequence data only are shown above the branches.

FIGURE 9. NPRS tree from ML branch lengths for 86 Protea taxa.

TABLE 4. Estimated diversification rate for Protea, overall and separately for 36-20 mya and 20 mya to the present.

Interval | Estimated diversification rate (S) | Upper confidence interval | Lower confidence interval
Overall | 0.054 | 0.070 | 0.043
37 - 20 mya | 0.128 | 0.232 | 0.082
20 mya - present | 0.039 | 0.054 | 0.031

Discussion

The estimated age of 36 myr for the root node of Protea in this study does not corroborate the view that much of the diversification of the Cape flora took place around five million years ago. If this finding can be corroborated further, it may prove to be extremely important for studying speciation patterns in the CFR, because an accurate temporal setting makes it possible to explore the historical and ecological settings throughout the evolution of the Cape flora. Calibration of branching events across the whole tree has also shown how species richness has accumulated gradually over these 36 myr. Rather than most of the species diversity being associated with a massive burst of speciation at the onset of Mediterranean-type climates (ca. five mya), the lineage through time plot actually indicates a significant slowdown in net speciation. This slowdown is likewise not associated with the climate change from a sub-tropical to a seasonal rainfall regime, but began well before it. The speciation rate (the probability of each lineage speciating per million years) is also lower (0.054) than that calculated for angiosperm families (average of 0.12 species per million years; Eriksson & Bremer 1992). The value estimated by Eriksson & Bremer (1992) for Proteaceae was 0.11; however, this was based upon a minimum age for the family of 67 myr, which is almost certainly an underestimate. Using the minimum age for Proteaceae of 108 myr estimated by Wikstrom et al. (submitted), this diversification rate declines to 0.067, an estimate that is closer to that for the genus Protea. Certainly in Protea there is no evidence for a burst of speciation within a short, recent time interval.

> Sources of error in age estimates

Accurate date estimation is pivotal, and therefore it is important to acknowledge that, when utilizing phylogenetic trees, errors affecting the accuracy of estimated divergence dates may arise from several sources. These include use of an incorrect tree, noise introduced from stochastic processes of substitution, incorrect calibration and an inability to correctly account for substitution rate variation among lineages. Each of these will be discussed with respect to the accuracy of estimating a divergence time for the root node of Protea.

> Use of the correct tree and substitution noise

The atpB-rbcL spacer tree gave a good indication of the generic-level biogeography of the Proteaceae across the southern continents, which is in good agreement with the two gene-region analysis of Hoot & Douglas (1998). However, the low level of sequence variability displayed by this tree (owing to the fact that it was derived from too few variable sites) did increase the possibility that age estimation would suffer from noise introduced by stochastic processes of substitution. To remedy this, character sampling was increased for a subset of taxa that included all African and Australasian Proteoideae previously sampled, as well as representative taxa from the other major clade comprising Grevilleoideae. Ideally it would have been preferable to collect further sequence data for the whole family but this was not possible within the limits of this study.

The reduced taxon tree derived from four plastid DNA regions provided sufficient information to resolve Protea as monophyletic and provided an age estimate for the root node of Protea. The bootstrap distribution of age estimates for this data matrix (Figure 9) clearly shows that this estimate is consistent around a mean of 37 ± 1.0 myr. This indicates that the estimate is not influenced too much by noise introduced by substitution processes.

> Incorrect calibration

Fossil evidence for African genera of Proteaceae is non-existent, and therefore it was necessary to use minimum ages based upon separation of the Southern Hemisphere continents to calibrate the tree. This presupposes factors relating to the biogeography of Proteaceae based upon whether the major lineages diversified prior to the separation of the southern continents or diverged afterwards and then dispersed. Dispersal between Africa and Australasia may have been possible in the late Cretaceous across a proto-Indian ocean (Raven 1983). If the latter scenario is correct then the tree should be calibrated at the node that distinguishes the two subfamilies Proteoideae and Grevilleoideae. Applying this calibration provided an age estimate of 36 myr for the root node of the Protea clade. An alternative possibility is that there has been no dispersal, but if this were the case the root node of Protea would be estimated to be 125 myr. This appears extremely unlikely given that this calibration also provides an age for the root node of the family of 354 myr. The most important point in this discussion is that the estimate of 36 myr is a minimum estimate.

> Variability in substitution rate

The DNA sequence data used here demonstrated rate variability among lineages (Table 2b). Therefore it was necessary to produce an ultrametric tree using an alternative method that compensated for rate variability among lineages. Sanderson's (1997) method of non-parametric rate smoothing (NPRS) allows rate variation under the assumption that substitution rates are autocorrelated in time. The method produces accurate estimates of divergence times when sequence lengths are sufficiently long, rates are truly non-clock-like, and rates are moderately to highly autocorrelated in time (Sanderson 1997). The data used here certainly satisfy the first two criteria: 3500 base pairs of sequence data were used and the molecular clock was convincingly rejected. Demonstrating the presence of autocorrelation is more difficult, but methods are currently being developed to evaluate this (for example Thorne et al. 1998) and could be applied to the Protea data in future.

> Age estimate for the radiation of Protea and its implications

An intuitive interpretation of the extremely low level of sequence divergence displayed by Protea would be that this group had undergone a recent and explosive diversification, which produced the high species richness but low levels of genetic differentiation displayed by extant Protea species. Similar observations have been made for other components of the Cape flora. For example, low sequence divergence in trnL-F sequences for species of Pelargonium (Geraniaceae) (Bakker et al. 1999) led the authors to conclude that many of the currently recognized species have developed over a relatively short evolutionary period. This window of evolutionary time was then assumed to be in agreement with the hypothesis that most of the diversification in the CFR has taken place as a consequence of late Tertiary climate change in the last 2-5 myr (e.g. Linder et al. 1992; Goldblatt 1997). This statement was made without taking into account rates of sequence change.

This study is one of the first to employ phylogenetic analyses to infer a divergence time for a group within the Cape flora. Perhaps with the exception of other Proteaceae, it is difficult to speculate how Protea may behave as a model for other components of the flora. However, this study suggests a much earlier radiation for the group than the Pliocene, with the root node of Protea dated at ca. 36 mya in the Palaeogene. This estimate coincides with dates for the emergence of a `proto-fynbos' in the CFR when the region experienced a drier phase (see Figure 1, Chapter 1). However, although elements of the CFR flora have been recognized in pollen cores dating from the Oligocene (Scholz 1985), many authors contend that it was probably not until after the beginning of the Pliocene (i.e. <5 mya) that the present Cape flora could be distinguished (Linder et al. 1992; Goldblatt 1997). In this analysis, forcing the root node of Protea (even excluding P. lorea) to be ca. five myr old gives an unreasonable minimum age estimate for the root node of Proteaceae of ca. 35 myr. Recent work carried out to calibrate the family tree of flowering plants has provided a minimum age estimate for Proteaceae of 108 myr (Wikstrom et al. submitted), and fossil evidence has given a minimum age for Proteaceae of 97 myr (Magallon et al. 1999). Both of these are substantially older than the age that results if the root node of Protea is calibrated at five myr.

> Temporal dynamics of the Protea radiation

The age estimate for the root node of Protea may be older than anticipated, but this date per se does not imply that an explosion of diversification was associated with this time interval. It is plausible that although the lineage arose ca. 36 mya most of the species diversity arose much later, with the inception of Mediterranean-type climates in the Cape. Investigating the temporal dynamics of speciation within Protea alone provides a window onto the accumulation of species richness in the group. However, rather than demonstrating a recent increase in net speciation rate, the log number of lineages through time plot for Protea indicates the exact opposite. It shows a marked slowdown in net rate, and if an absolute time scale is applied to this based upon 36 myr for the root node, then this slowdown has occurred in the last 20 myr.
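How a lineage through time plot reveals a slowdown can be illustrated with a minimal sketch. It assumes a fully bifurcating ultrametric tree summarized only by the ages of its internal nodes; the node ages and function names below are hypothetical and are not the Protea data.

```python
import math


def lineages_through_time(node_ages):
    """Return (node age, ln lineage count) points for a bifurcating tree.

    node_ages: ages (myr before present) of every internal node, root
    included. Each node adds one lineage, so the root gives two lineages.
    """
    points = []
    lineages = 1
    for age in sorted(node_ages, reverse=True):  # oldest node first
        lineages += 1
        points.append((age, math.log(lineages)))
    return points


def net_rate(points, older, younger):
    """Slope of ln(lineages) per myr between two node ages (older > younger)."""
    window = [(a, y) for a, y in points if younger <= a <= older]
    (a0, y0), (a1, y1) = window[0], window[-1]
    return (y1 - y0) / (a0 - a1)


# Hypothetical node ages with most branching before 20 mya
ages = [36, 30, 27, 25, 23, 21, 20, 15, 8]
ltt = lineages_through_time(ages)
early = net_rate(ltt, 36, 20)
late = net_rate(ltt, 20, 8)
print(early > late)
```

A constant net rate would give a straight line on this log scale, so a shallower slope towards the present, as in the hypothetical example, is the signature described in the text.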

However, there are alternative explanations for this pattern that should be considered. For example, a recent mass extinction of Protea species could leave this signature, but this cause seems unlikely as there have not been any known human-induced extinctions of Protea species. This behavior could also occur if the net speciation rate has remained constant over time but species are missing from the sample. There are 24 described species missing from the tree, but other than three Cape species these are all tropical taxa. Based upon the monophyly of the tropical taxa with the South African summer rainfall taxa, it is highly probable that the missing tropical taxa also belong to this clade. If this were the case, then their absence from the phylogeny would not have this effect on the lineage through time plot, because missing taxa can only be detected if they are distributed evenly throughout the tree (Harvey et al. 1994). It is also unlikely that there are extant species that have not been described, particularly with ca. 450 amateur botanists actively recording species across the CFR. It is feasible that the current species concept for Protea is drawing a veil over recent diversification. Taxonomic reappraisal may reveal more species (for example Vogts, 1982, recognized 81 distinct 'varieties' of P. cynaroides), but to extrapolate the linear section of the lineage through time plot to the present would require 1000 species to be presently undescribed. It is certain that description of so many new species cannot be justified. Perhaps the most likely biological explanation is that there has been a genuine decline in net diversification rate, possibly as a consequence of a density-dependent process such as niche filling. This would imply a limit to the number of species that can be accommodated in the current habitat, and that this limiting factor has had an effect.

Contrary to the premise that drastic climate change and the inception of Mediterranean-type climates in the CFR provided a trigger for massive diversification, the time scale estimated here for the radiation of Protea species implies the exact opposite. Not only does this result provide a minimum age for the root node of Protea in the region of 36 mya, but it also shows that the rate of diversification continued to decline through the inception of Mediterranean-type climates. In addition, the log lineages through time plot does not show the characteristic upturn towards the present associated with background extinction. This may point towards a particularly high level of coexistence and low level of extinction among species. This, rather than a high speciation rate, may explain the high level of species richness.

Chapter Five: Investigating the Factors Promoting Diversification in Protea using Sister Group Analysis

The occurrence of biodiversity hot spots, such as the Cape Floristic Region (CFR), has naturally led investigators to ask what combination of extrinsic and intrinsic factors has promoted speciation. Factors involved in the separation of lineages are expected to display high variability between closely related species (Barraclough et al. 1998, 1999), and so species-level phylogenetic trees allow hypotheses regarding correlates of species richness to be directly evaluated. Using a detailed species-level phylogenetic tree for Protea, the aim of this chapter is to investigate ecological traits that may have been important in diversification. In the fynbos biome of the CFR, to which 70 species of Protea are endemic, several factors have been hypothesized to play a significant role in promoting speciation. Notably the role of nutrient-poor substrates, Mediterranean summer-dry climates with recurrent fire, and to a lesser extent topography have all been emphasized. The potential contribution of each of these factors is discussed in greater detail below:

> Topography

Fynbos occupies a landscape that is both geologically heterogeneous and topographically complex. For this reason speciation models in fynbos often invoke allopatric speciation in which populations are isolated geographically in a mountainous terrain providing a mosaic of habitats. This template for geographic speciation ensures that broad sweeps of one vegetation type are isolated from one another by inhospitable habitats (Goldblatt 1997). Therefore, one hypothesis for speciation in the CFR is that the mountainous terrain results in high levels of isolation among populations leading to allopatric speciation.

Collection of range data by the Protea Atlas Project has resulted in thorough distribution maps for all CFR Protea species. Thus, the hypothesis of allopatric speciation can be evaluated by comparing range overlap of closely related sister species and clades. If allopatry has promoted speciation, then recent sister taxa would be expected to have a low geographic range overlap (Barraclough et al. 1998). This chapter therefore uses the approach of Barraclough et al. (1998) and Barraclough & Vogler (2000) by considering the pattern of geographic ranges for sister clades across all nodes in the phylogenetic tree (Lynch 1989; Chesser & Zink 1994). This pattern can be illustrated by a plot of degree of sympatry against node age (Figure 1), in which deeper nodes in the phylogeny may be more reflective of changes in geographic ranges over time (Barraclough & Vogler 2000).

FIGURE 1. Predictions for the relationship between geographical overlap and node age.

[Plot of degree of sympatry (0 to 1) against node age. A: sympatric distribution with no range change; B: allopatric distribution with subsequent range changes; C: allopatric distribution with no range change.]

Most models of CFR speciation focus on the assumption that speciation mechanisms have operated within a short and evolutionarily recent time frame. However, if diversification within Protea has occurred less recently than previously assumed for most neoendemic components of the modern Cape flora (as discussed in Chapter 4), then it will be more difficult to infer historical processes from contemporary biogeographical patterns. Pattern is not evidence of process, and it is conceivable that post-speciational range changes have had longer to manifest themselves than has been previously contemplated. For example, speciation may have been predominantly allopatric, but it is possible that similar and closely related species have converged on similar habitats, leaving a pattern of sympatry in modern day ranges.

> Edaphic specialization

The high species turnover between habitats in fynbos has led many authors to conclude that ecological conditions have promoted extensive inter-specific differentiation in different habitats or niches. In the CFR there are numerous genera with large clusters of closely related species flocks (Rosenzweig 1995) that have subdivided habitats on fine scales (Cowling & Holmes 1992). Fynbos soils are characteristically nutrient poor, but apart from their nutrient status the soils differ significantly in their structure and water retention properties. At low precipitation levels these factors become so limiting that they support distinctly different suites of species (Goldblatt 1997). With ample rain the effect of soil on vegetation composition is less prominent. However, almost throughout the region, rainfall is limiting, and vegetation varies conspicuously with soil and moisture availability. Local substrate specialists are therefore believed to be common in fynbos vegetation, where a mosaic of ecological niches is provided by soil differences. Thus, the hypothesis to be evaluated in this case is whether differentiation onto different soil types has promoted speciation in Cape taxa.

Soil preferences are well documented for all Protea species in fynbos (Rourke 1980; Protea Atlas Project) and so comparisons among closely related sister species with regard to edaphic preference are possible. If heterogeneity of the soil has promoted speciation, recent sister-species pairs would be expected to display preference for different soil types. The alternative is that habitat differences are not associated with the separation of lineages but evolve incidentally over time. In this case sister species would tend to be found in similar habitats due to their recent ancestry. In an approach similar to that used to assess the degree of sympatry across all nodes in a reconstructed phylogeny, degree of habitat difference (specifically soil type in this case) can be plotted against node age. In this comparison the habitat contrast between each sister clade in the tree can range from zero, signifying that sister clades occur on the same soil type, to one, signifying that sister clades occur on completely different types. The predictions outlined above are summarized in the diagrams below (Figure 2):

FIGURE 2. Predictions for the relationship between habitat contrast and node age.

[Plot of habitat contrast (0 to 1) against node age, showing two predicted patterns: habitat difference involved in speciation, and habitat difference not involved in speciation.]

An alternative hypothesis is that heterogeneity of soil type does not play a role in speciation, but facilitates coexistence of species if their ranges become regionally sympatric. Therefore evaluating the association between habitat difference and degree of sympatry can shed light upon whether niche differentiation is involved in regional coexistence. An increase in habitat difference with increasing sympatry would indicate habitat subdivision and niche partitioning among regionally co-occurring species. The opposite trend (a negative slope) would suggest that species in the same habitat have similar ranges (perhaps because that is where the habitat is found). These predictions are summarized in Figure 3 below:

FIGURE 3. Predictions for the relationship between habitat contrast and degree of sympatry.

[Plot of habitat contrast (0 to 1) against degree of sympatry (0 to 1). A: niche differentiation; B: null model, no relationship; C: habitat convergence.]

> Fire

The disturbance regime caused by fire is believed to play a significant role in speciation in fynbos. Fires normally kill all above-ground parts of fynbos plants and consume most material except thicker stems. A large proportion of fynbos plants are obligate seeders; that is, the whole plant (roots included) dies after fire and the species can only regenerate from seed (Rutherford & Westfall 1994). In terms of diversification it is believed that fire-induced mortality of re-seeding species increases generation turnover, which in turn favors the proliferation of these lineages. The potential for diversification in re-seeding lineages is therefore considered to be greater than in re-sprouting lineages (Cowling 1987). Therefore the hypothesis to be evaluated here is whether re-seeding lineages have faster diversification rates than re-sprouting lineages. The prediction is that a comparison of sister lineages would reveal re-seeding clades to have more species than re-sprouting clades. However, in the reconstructed phylogeny of Protea there are too few clades with which to test the hypothesis in this way. Instead an extension of the approach used in Chapter 4 is adopted to estimate diversification rates in re-seeding and re-sprouting lineages separately.

To rigorously evaluate the above predictions, it is important to take into account trends observed simply by chance with no causative association with cladogenesis. For example, even if differences in soil type accumulate at random over time, as a by-product of species' independent histories, it still may be possible to observe differences in soil preference between very closely related species. Therefore it is necessary to evaluate whether the observed patterns are significantly different from those expected under null models of random accumulation of ecological traits (Barraclough et al. 1999). In addition, although each prediction is evaluated separately (for example, the contribution of fire and adaptation to soil type), the hypotheses are not mutually exclusive and will therefore be synthesized in the discussion.

Materials & Methods

Each of the analyses described below was performed on the ultrametric tree with maximum likelihood branch lengths described in Chapter 4 (analysis 3) for 86 species of Protea. However, because the hypotheses to be tested relate to speciation models in fynbos, those species occurring outside of the CFR have been deleted from the tree. These analyses therefore include 69 Cape Protea species.

> Topography

As in Barraclough & Vogler (2000), the approach used here was to calculate range overlap between sister clades for all nodes in the phylogeny and assess the pattern of these measures in relation to the relative age of nodes. Species ranges for the CFR were taken from Protea Atlas distribution maps, generated in WorldMap Version 4.1 (Williams 1998). Each grid square represented one eighth of a degree on a side (~12 km; although the area of grids in km² will vary with latitude, this effect is minor for the study region, <5%, Rebelo pers. comm.). Grid squares for which there was a record were also included.

Using the shaded range maps, the area occupied by each species, and range overlaps between sister clades, were calculated using the public domain program NIH Image (developed at the U.S. National Institutes of Health). The degree of sympatry between sister clades was defined as the proportion of the more restricted clade's range overlapped by its more widespread sister (Chesser & Zink 1994):

\[ \text{degree of sympatry} = \frac{\text{area of overlap}}{\text{range size of clade with smaller range}} \]

This ranges from zero, signifying no range overlap, to one, signifying that the range of one clade is entirely overlapped by its sister. The results are presented as a plot of degree of sympatry against node age, where node age was estimated in absolute time using the calibration of 36 myr for the root node of Protea obtained in Chapter 4. A regression line was fitted to this plot after an arcsine transformation of the sympatry values (an arcsine regression was used because the values are bounded between zero and one).
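The two calculations described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure run in NIH Image: ranges are represented as sets of occupied grid cells (treating cells as equal in area), the arcsine transformation is taken to be the common arcsine square-root form, and the function names are hypothetical.

```python
import math


def degree_of_sympatry(range_a, range_b):
    """Overlap divided by the smaller range, with ranges as sets of grid cells."""
    smaller = min(len(range_a), len(range_b))
    if smaller == 0:
        raise ValueError("both clades need at least one occupied grid cell")
    return len(range_a & range_b) / smaller


def arcsine_slope(node_ages, sympatry_values):
    """Least-squares slope of arcsin(sqrt(sympatry)) on node age."""
    y = [math.asin(math.sqrt(s)) for s in sympatry_values]
    n = len(node_ages)
    mean_x = sum(node_ages) / n
    mean_y = sum(y) / n
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(node_ages, y))
    sxx = sum((x - mean_x) ** 2 for x in node_ages)
    return sxy / sxx


# Hypothetical sister clades: cells are (row, column) grid coordinates
clade_a = {(0, 0), (0, 1), (1, 1)}
clade_b = {(0, 1), (1, 1), (2, 1), (2, 2)}
overlap = degree_of_sympatry(clade_a, clade_b)  # 2 shared cells out of 3
```

A positive value from `arcsine_slope` corresponds to sympatry increasing with node age, the pattern predicted under allopatric speciation followed by range change.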

4 http://rsb.info.nih.gov/nih-image/

> Edaphic specialization

Edaphic preferences were taken from Rebelo (1995) and Rourke (1980) and refined by Rebelo (pers. comm.). The following categories were recognized: sand, neutral sand, acid sand, alkaline sand, loam, clay and peat. Where taxa occur on more than one of these soil types, multiple categories were assigned accordingly (Table 1). Contrast in soil preference was calculated as shown in Figure 4.

Soil category   0      1      2      3
Clade A         0.5    0.5    0      0
Clade B         0      0.25   0.25   0.5

FIGURE 4. Calculation of soil type contrast between clade A and clade B.

In Figure 4, clade A has one daughter lineage on soil type 0 and another on soil type 1; therefore the average for each is 0.5. Clade B has one daughter lineage on soil types 1 and 2 and another on soil type 3; the averages for each are shown above. The soil type similarity between clade A and clade B is therefore 0.25 (i.e. the overlap on soil type 1), and the soil type contrast is 1 - 0.25 = 0.75. Possible values range from one, indicating no similarity in edaphic preference, to zero, indicating identical soil preference. Soil contrast was then plotted against node age and an arcsine regression line fitted as for degree of sympatry.
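The worked example of Figure 4 can be reproduced with a short sketch. The weighting rule is inferred from that example and is an assumption: each daughter lineage spreads a weight of one equally over its soil types, daughters are averaged within a clade, and similarity is the summed overlap of the two clade profiles. Function names are hypothetical.

```python
def clade_profile(daughters):
    """Average soil-type weights over a clade's daughter lineages.

    daughters: list of sets of soil categories, one set per daughter.
    Each daughter spreads a weight of 1 equally over its soil types.
    """
    profile = {}
    for soils in daughters:
        share = 1.0 / (len(soils) * len(daughters))
        for soil in soils:
            profile[soil] = profile.get(soil, 0.0) + share
    return profile


def soil_contrast(daughters_a, daughters_b):
    """One minus the summed overlap of the two clade profiles."""
    a = clade_profile(daughters_a)
    b = clade_profile(daughters_b)
    similarity = sum(min(a[s], b[s]) for s in a if s in b)
    return 1.0 - similarity


clade_a = [{0}, {1}]      # daughters on soil type 0 and soil type 1
clade_b = [{1, 2}, {3}]   # daughters on soil types 1 and 2, and soil type 3
print(soil_contrast(clade_a, clade_b))  # 0.75
```

The only shared category is soil type 1, where the smaller of the two weights is 0.25, which gives the contrast of 0.75 stated in the text.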

To evaluate whether the observed association between soil contrast and node age is significantly different from that expected through random accumulation of ecological traits, soil type was evolved randomly onto the tree using a Visual Basic macro (Barraclough unpublished application). Soil type was treated as seven independent characters, each of which could take a value of present or absent. Of the seven possible soil categories four were kept constant (acid sand, alkaline sand, neutral sand and peat) because only single species are found on each of these soil types. The remaining three, sand, clay and loam were evolved onto the tree with a constant probability of change per unit time. For the three characters combined the average rate of change was first estimated from the data as:

\[ \text{rate of change} = \frac{\text{observed number of changes}}{\text{total branch length}} \]

However, the rate calculated in this way is an underestimate because it does not take into account undetected multiple changes. This was corrected for by increasing the rate of change incrementally, and performing 100 trials at each rate, until the number of changes detected in the trial runs centered around the average value of the observed number of changes. This was then the rate of change used in the randomization.
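The incremental correction can be illustrated with a deliberately simplified sketch. It is not the Visual Basic macro used in this study: it assumes a single binary character, treats each branch independently (a change is "detected" when the start and end states of a branch differ), and stops as soon as the mean detected count over 100 trials reaches the observed count. Branch lengths, step size and function names are hypothetical.

```python
import random


def detected_changes(branch_lengths, rate, rng):
    """Count branches whose start and end states differ for a binary trait.

    Flips arrive at a constant rate; an even number of flips on a branch
    returns the trait to its starting state and so goes undetected.
    """
    detected = 0
    for length in branch_lengths:
        flips = 0
        elapsed = rng.expovariate(rate)
        while elapsed < length:
            flips += 1
            elapsed += rng.expovariate(rate)
        detected += flips % 2
    return detected


def calibrate_rate(branch_lengths, observed, step=0.005, trials=100,
                   max_rate=1.0, seed=1):
    """Raise the rate from the naive estimate until trials match `observed`."""
    rng = random.Random(seed)
    rate = observed / sum(branch_lengths)  # naive estimate, biased low
    while rate <= max_rate:
        mean_detected = sum(detected_changes(branch_lengths, rate, rng)
                            for _ in range(trials)) / trials
        if mean_detected >= observed:
            return rate
        rate += step
    raise ValueError("no rate up to max_rate reproduces the observed count")


branches = [5.0] * 20            # hypothetical branch lengths (myr)
naive = 6 / sum(branches)        # 6 observed changes over 100 myr
corrected = calibrate_rate(branches, observed=6)
print(naive, corrected)
```

In this toy case the corrected rate exceeds the naive one because some simulated branches carry two flips that cancel out, which is the undercounting described above.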

For each of the 1000 randomizations an arcsine regression was fitted to estimate the slope of the relationship between habitat contrast calculated from the random data and observed node ages. A two-tailed test was used to evaluate whether the slope of the observed data was significantly different from those produced in the random trials. If the observed slope falls outside the central 95% of the distribution of random slopes, then it may be inferred that there is a significant association between habitat contrast and node age.
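The comparison of an observed slope with the randomized slopes can be sketched as below. This is a minimal illustration with hypothetical names; it assumes the two-tailed criterion is applied by placing half of the 5% rejection region in each tail of the null distribution.

```python
def tail_proportions(observed, null_slopes):
    """Proportion of null slopes at or below, and at or above, the observed slope."""
    n = len(null_slopes)
    lower = sum(1 for s in null_slopes if s <= observed) / n
    upper = sum(1 for s in null_slopes if s >= observed) / n
    return lower, upper


def is_significant(observed, null_slopes, alpha=0.05):
    """Two-tailed check: the observed slope lies in either extreme alpha/2 tail."""
    lower, upper = tail_proportions(observed, null_slopes)
    return min(lower, upper) <= alpha / 2


# Hypothetical null distribution of 100 slopes (integers for clarity)
null = list(range(-50, 50))
print(is_significant(-49, null))  # extreme slope in the lower tail
print(is_significant(0, null))    # slope in the middle of the distribution
```

With 1000 randomizations, an observed slope would need to be among the 25 most extreme values in one tail to be judged significant under this criterion.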

> Relationship between habitat preference and degree of sympatry

Soil type contrast was plotted against degree of sympatry and a comparison made between the slope of the observed arcsine regression and the slopes of the arcsine regressions obtained from 1000 randomizations where soil type was evolved on the tree as described above. Again a two-tailed test was performed to evaluate whether the observed slope was significantly different from those observed by chance in the random trials.

> Fire survival strategy

All taxa were coded as re-seeding or re-sprouting (Rebelo 1995; Rourke pers. comm.) and mapped onto one of the combined trees (Chapter 3) in MacClade version 3.05 (Maddison & Maddison 1992). All monophyletic groups made up exclusively of re-seeding species or exclusively of re-sprouting species were identified from the tree. Using the equation detailed in Chapter 4, diversification rate was estimated for each of these clades. Average diversification rates were then calculated for seeders and sprouters, including all lineages for which the diversification rate was zero (i.e. clades of one species). For comparison, diversification rate was also estimated for the re-sprouting clade composed of summer rainfall taxa (which includes tropical and South African representatives).
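The averaging step, including single-species clades, can be sketched as follows. The Chapter 4 equation is not reproduced in this section, so the sketch assumes a rate of the form ln(N)/t, which is consistent with a one-species clade having a rate of zero; the clade sizes and ages below are hypothetical.

```python
import math


def clade_rate(n_species, age_myr):
    """Assumed estimator ln(N)/t; a single-species clade gives a rate of zero."""
    return math.log(n_species) / age_myr


def mean_rate(clades):
    """Average rate over (n_species, age_myr) pairs, zero-rate clades included."""
    rates = [clade_rate(n, t) for n, t in clades]
    return sum(rates) / len(rates)


# Hypothetical clades, for illustration only
seeders = [(6, 20.0), (4, 15.0), (1, 10.0)]
sprouters = [(2, 25.0), (1, 12.0), (1, 8.0)]
print(mean_rate(seeders), mean_rate(sprouters))
```

Keeping the zero-rate lineages in the average matters: dropping them would inflate the mean for whichever strategy has more single-species clades.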

TABLE 1. Soil type categories for Cape Protea species (after Rourke 1980 & Rebelo 1995). 1 = present on this soil type, 0 = absent.

Species                     sand  clay  loam  acid sand  alkaline sand  neutral sand  peat
Protea nitida               1     1     0     0          0              0             0
Protea inopina              1     0     0     0          0              0             0
Protea glabra               1     0     0     0          0              0             0
Protea rupicola             1     0     0     0          0              0             0
Protea lanceolata           1     0     1     0          0              0             0
Protea cynaroides           1     0     0     0          0              0             0
Protea scolopendriifolia    1     1     0     0          0              0             0
Protea scabriuscula         1     1     0     0          0              0             0
Protea cryophila            1     0     0     0          0              0             0
Protea pruinosa             1     0     0     0          0              0             0
Protea repens               1     0     1     0          0              0             0
Protea longifolia           1     0     0     0          0              0             0
Protea pudens               0     0     1     0          0              0             0
Protea aristata             1     0     0     0          0              0             0
Protea eximia               1     0     0     0          0              0             0
Protea compacta             0     0     0     1          0              0             0
Protea obtusifolia          0     0     0     0          1              0             0
Protea susannae             0     0     0     0          0              0             0
Protea burchellii           0     1     1     0          0              0             0
Protea lorifolia            1     0     0     0          0              0             0
Protea lepidocarpodendron   0     1     1     0          0              0             0
Protea holosericea          1     0     0     0          0              0             0
Protea laurifolia           1     0     1     0          0              0             0
Protea neriifolia           1     1     1     0          0              0             0
Protea coronata             0     1     0     0          0              0             0
Protea speciosa             0     0     0     0          0              0             0
Protea stokoei              0     0     0     0          0              0             0
Protea grandiceps           1     0     0     0          0              0             0
Protea caespitosa           0     0     1     0          0              0             0
Protea scorzonerifolia      1     1     0     0          0              0             0
Protea lorea                1     1     0     0          0              0             0
Protea aspera               1     1     0     0          0              0             0
Protea scabra               1     1     0     0          0              0             0
Protea piscina              1     0     0     0          0              0             0
Protea restionifolia        0     1     0     0          0              0             0
Protea subvestita           1     0     0     0          0              0             0
Protea lacticolor           0     1     0     0          0              0             0
Protea punctata             1     1     0     0          0              0             0
Protea mundii               1     0     0     0          0              0             0
Protea aurea                1     0     0     0          0              0             0
Protea venusta              1     0     0     0          0              0             0
Protea foliosa              1     0     0     0          0              0             0
Protea tenax                1     0     0     0          0              0             0
Protea vogtsiae             1     0     0     0          0              0             0
Protea intonsa              1     0     0     0          0              0             0
Protea montana              1     0     0     0          0              0             0
Protea acaulos              1     1     1     0          0              0             0
Protea angustata            1     1     0     0          0              0             0
Protea laevis               1     1     0     0          0              0             0
Protea convexa              1     0     0     0          0              0             0
Protea revoluta             1     1     0     0          0              0             0
Protea recondita            1     0     0     0          0              0             0
Protea effusa               1     0     0     0          0              0             0
Protea sulphurea            1     0     0     0          0              0             0
Protea namaquana            0     0     1     0          0              0             0
(species name missing)      1     0     0     0          0              0             0
Protea acuminata            1     0     0     0          0              0             0
Protea pendula              1     0     0     0          0              0             0
Protea canaliculata         1     0     0     0          0              0             0
Protea nana                 1     1     0     0          0              0             0
Protea witzenbergiana       1     1     0     0          0              0             0
Protea pityphylla           0     1     0     0          0              0             0
Protea mucronifolia         0     1     0     0          0              0             0
Protea odorata              0     1     0     0          0              0             0
Protea amplexicaulis        1     0     0     0          0              0             0
Protea cordata              0     1     0     0          0              0             0
Protea decurrens            0     1     0     0          0              0             0
Protea subulifolia          1     1     0     0          0              0             0
Protea humiflora            1     0     0     0          0              0             0

Results

> Topography

Figure 5 shows a plot of sympatry against node age for 69 Cape taxa. The arcsine regression line has a positive slope, indicating an overall increase in geographical overlap between sister clades with increasing node age. This trend agrees with a pattern of allopatric speciation, but the spread of values between zero and one suggests a large degree of post-speciational range movement. No mode of speciation produces intermediate values.

FIGURE 5. Plot of degree of sympatry against node age for 69 Cape Protea taxa.

[Scatter plot of degree of sympatry against node age (myr, 0 to 40).]

> Edaphic Specialization

Figure 6 shows soil preference mapped onto the phylogenetic tree. In Figure 7 degree of soil contrast is plotted against node age for 69 Cape taxa. The arcsine regression shown on this plot has a negative slope, in the direction expected if soil difference has been involved in speciation. However, the histogram in Figure 8 shows the distribution of arcsine slopes for 1000 trials where soil type was randomly evolved onto the tree. In a two-tailed test the probability that the random trials had a slope more negative than the observed slope was 0.176. Therefore the association between soil contrast and node age is not significant and may be due to random switches onto different soil types.

FIGURE 7. Plot of soil contrast against node age for 69 Cape Protea taxa.

[Scatter plot of soil contrast (0 to 1) against node age (myr, 0 to 40).]

FIGURE 8. Distribution of arcsine slopes for the relationship between soil contrast and node age from 1000 trials evolving soil type randomly onto the tree.

[Histogram of number of occurrences (0 to 140) against slope (-0.020 to 0.034).]

> Relationship between sympatry and habitat difference

Figure 9 shows a plot of sympatry against soil contrast. The arcsine regression shown on this plot has a positive slope, a trend that would indicate that sympatric sister species occur in different habitats. The histogram in Figure 10 shows the distribution of arcsine slopes for 1000 trials where soil type was randomly evolved onto the tree. In a two-tailed test the probability that the random trials had a slope more positive than the observed slope was 0.108. Therefore the association between soil contrast and sympatry for the observed data is not significant and may be explained by chance association.

FIGURE 9. Association between soil contrast and sympatry for 69 Cape Protea taxa.

[Scatter plot of soil contrast (0 to 1) against degree of sympatry (0 to 1).]

FIGURE 10. Distribution of arcsine slopes for the relationship between soil contrast and degree of sympatry from 1000 trials evolving soil type randomly onto the tree.

[Histogram of number of occurrences (0 to 160) against slope (-0.35 to 0.65), with the observed slope marked.]

> Diversification rate in re-seeding and re-sprouting lineages

Figure 11 shows re-sprouting and re-seeding mapped onto the phylogenetic tree. It is clear that switches between traits have occurred several times within the genus. Estimated diversification rates for eight exclusively re-seeding and three exclusively re-sprouting clades are listed in Table 2. The values estimated for re-seeders are all higher than those estimated for re-sprouting clades. Including all the lineages for which diversification rate was zero, the average rate for re-sprouters was estimated as 0.013 and the average rate for re-seeders as 0.043 (Table 3). Using the non-parametric Mann-Whitney test (because the values are not normally distributed), the mean ranks of diversification rate for re-seeders and re-sprouters were found to be significantly different, with a p-value of 0.025. In Protea, increased diversification rate is certainly associated with re-seeding lineages in Cape taxa. However, diversification rate for re-sprouting summer rainfall lineages (which were monophyletic) was estimated as 0.067. This estimate is higher than all but one of the estimates for re-seeding lineages in the Cape.
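The rank-based comparison can be illustrated with a small, self-contained sketch of the Mann-Whitney test. It uses an exact two-sided p-value obtained by enumerating every way of assigning the pooled ranks to the two groups, which is an assumption made here for small samples; the text does not state which form of the test was applied. The rates below are hypothetical, not the values in Table 2.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def mann_whitney_exact(sample_a, sample_b):
    """Return (U for sample_a, exact two-sided p) by enumerating labelings."""
    ranks = midranks(list(sample_a) + list(sample_b))
    n_a, n = len(sample_a), len(sample_a) + len(sample_b)
    expected = n_a * (n + 1) / 2          # rank sum of sample_a under the null
    observed = sum(ranks[:n_a])
    sums = [sum(ranks[i] for i in idx) for idx in combinations(range(n), n_a)]
    extreme = sum(1 for s in sums
                  if abs(s - expected) >= abs(observed - expected) - 1e-12)
    return observed - n_a * (n_a + 1) / 2, extreme / len(sums)


# Hypothetical diversification rates, for illustration only
seeder_rates = [0.05, 0.06, 0.07, 0.08]
sprouter_rates = [0.01, 0.02, 0.03, 0.04]
u_stat, p_value = mann_whitney_exact(seeder_rates, sprouter_rates)
print(u_stat, p_value)
```

Because the test uses ranks only, it makes no assumption that the rates are normally distributed, which is the reason given above for choosing it.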

[Figure 6: tree diagram with soil type mapped onto the branches; tip labels are Protea species names. Legend entries include clay, loam, neutral sand, acid sand, alkaline sand, peat and equivocal.]

FIGURE 6. Soil type for 69 Cape taxa traced onto one of the equally most parsimonious combined trees.

[Figure 11: tree diagram with fire-survival strategy mapped onto the branches; tip labels are Protea species names. Legend: re-seeder, re-sprouter.]

FIGURE 11. Re-seeding and re-sprouting fire-survival strategy mapped onto one of the combined equally most parsimonious trees.


TABLE 2. Estimated diversification rates for individual seeding and sprouting clades.

Cape seeders (clade members: estimated rate)

1. P. pruinosa, P. cryophila: 0.0427
2. P. burchellii, P. laurifolia, P. lorifolia, P. holosericea, P. lepidocarpodendron: 0.0463
3. P. venusta, P. punctata, P. mundii, P. aurea subsp. aurea, P. lacticolor, P. subvestita, P. roupelliae, P. grandiceps, P. aristata, P. repens: 0.0467
4. P. longifolia, P. pudens, P. compacta, P. susannae, P. obtusifolia: 0.0556
5. P. neriifolia, P. caespitosa, P. coronata, P. cordata, P. amplexicaulis, P. humiflora, P. subulifolia, P. decurrens: 0.0452
6. P. nana, P. scolymocephala, P. mucronifolia, P. odorata: 0.0355
7. P. pendula, P. canaliculata, P. acuminata, P. witzenbergiana, P. pityphylla: 0.0648
8. P. convexa, P. laevis: 0.0694
Lineages with an estimated rate of 0: P. scabriuscula, P. montana, P. eximia, P. namaquana, P. sulphurea, P. lanceolata

Cape sprouters (clade members: estimated rate)

1. P. foliosa, P. intonsa, P. vogtsiae: 0.0278
2. P. restionifolia, P. piscina: 0.0278
3. P. glabra, P. inopina, P. nitida: 0.0285
Lineages with an estimated rate of 0: P. lorea, P. scolopendriifolia, P. cynaroides, P. aspera, P. scabra, P. speciosa, P. scorzonerifolia, P. tenax, P. angustata, P. acaulos, P. revoluta

Summer rainfall sprouters (clade members: estimated rate)

1. P. caffra, P. dracomontana subsp. inyanganiensis, P. simplex, P. parvula, P. dracomontana, P. laetans, P. comptonii, P. rubropilosa, P. enervis, P. curvata, P. gaguedi, P. welwitschii, P. nubigena, P. wentzeliana, P. angolensis subsp. divaricata, P. angolensis subsp. angolensis: 0.0667

TABLE 3. Average estimated diversification rates for re-seeders and re-sprouters.

Average speciation rate for Cape seeding taxa: 0.0426 (n = 14, s.d. = 0.027)
Average speciation rate for Cape sprouting taxa: 0.0134 (n = 14, s.d. = 0.012)
Speciation rate for tropical and South African summer rainfall taxa: 0.0667 (n = 1)

Discussion

The aim of this chapter has been to look for overall patterns that may be associated with the separation of lineages within Protea. The factors specifically targeted in this investigation (topography, edaphic specialization and fire) are commonly cited as being involved in speciation in fynbos. However, their contribution has never been considered taking phylogenetic relationships into account for all species in a group from the CFR. Tests of association that take hierarchical relationships into account can give very different results from those that ignore them, and thus phylogenetic trees are invaluable tools in studies of diversification (Purvis 1996). In addition, the finding from Chapter 4 that the Protea lineage is at least 36 million years old may put the discussion in a different perspective. If Protea is significantly older than previously estimated, the historical context for diversification may be more difficult to extrapolate from contemporary patterns.

> Topography

In theory there is no mode of speciation that could directly give rise to values between zero and one for degree of sympatry, and so the large number of observed values that fall between the two extremes requires explanation. This pattern implies a large degree of range change after speciation events, which seems likely if the genus has been diversifying over a period of 36 million years. The overall trend for Protea is an increase in geographic overlap with increasing node age, suggesting a predominantly allopatric mode of speciation. However, there are still many recent nodes with an extremely high degree of sympatry. Therefore, either generalizations are not applicable to all speciation events, or the historical mode of speciation has been erased by range movements.
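To illustrate why intermediate values call for an explanation, the sketch below computes a degree of sympatry from two sets of occupied grid cells. It assumes that the measure is the area of range overlap divided by the range of the more restricted clade, following the convention of Barraclough & Vogler (2000); the grid-cell representation, the function name and the example ranges are hypothetical.

```python
def degree_of_sympatry(range_a, range_b):
    """Overlap between two ranges divided by the size of the smaller range.

    Ranges are given as collections of occupied grid-cell identifiers.
    Returns 0.0 for fully allopatric ranges and 1.0 when the smaller
    range lies entirely within the larger one.
    """
    cells_a, cells_b = set(range_a), set(range_b)
    if not cells_a or not cells_b:
        raise ValueError("both ranges must contain at least one cell")
    return len(cells_a & cells_b) / min(len(cells_a), len(cells_b))


# Hypothetical ranges, for illustration only.
allopatric = degree_of_sympatry({1, 2, 3}, {7, 8, 9})           # no shared cells
nested = degree_of_sympatry({1, 2}, {1, 2, 3, 4, 5})            # smaller range inside larger
partial = degree_of_sympatry({1, 2, 3, 4}, {3, 4, 5, 6, 7, 8})  # two of four cells shared
print(allopatric, nested, partial)  # 0.0 1.0 0.5
```

Under this measure, a value such as 0.5 can only arise if at least one of the two ranges has shifted, expanded or contracted since the lineages split.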

> Edaphic specialization

Although the overall trend for Protea shows a negative relationship between soil contrast and node age, indicating that recent sister clades occur on different soil types, the results are not significant with respect to a null model of random accumulation of ecological traits. This indicates that soil difference is not significantly associated with the separation of lineages in Protea. In Figure 7 the only clade that displays high variability with respect to soil type preference contains the species P. obtusifolia (limestone), P. susannae (neutral sand) and P. compacta (acid sand). These species occur in sympatry at a local scale on the Agulhas Plain, a site that has received much attention from researchers because taxa belonging to a range of families display edaphic endemism between sister species pairs. Cowling & Holmes (1992) have suggested that the subtle differences in nutrient and moisture status between soil types on the Agulhas Plain probably represent a major selective force in plant speciation. However, most of the sediments and soils in this area are believed to be less than four million years old, since the whole area was inundated by transgressions during the early-mid Pliocene (Hendley, 1983). The estimated ages of P. compacta, P. obtusifolia and P. susannae from the NPRS tree for Protea are 14, 18 and 18 million years respectively (Figure 9, Chapter 4). Because these lineages are substantially older than the relatively young sediments on which they occur, this may provide some evidence that soil type per se has not been involved in speciation events. Instead, the complex edaphic environment may allow sister species to coexist in sympatry.

The positive slope observed for the relationship between degree of sympatry and soil contrast would be consistent with a pattern of habitat subdivision among closely related species. However, the association was not significant when compared to a model of random accumulation of habitat disparity. Range movement would certainly cloud interpretation of this relationship, but it may also be that the spatial scales defined in this study are too large to detect an underlying pattern at more local levels. Remedying this would require identification of ranges at a local scale, but these are likely to provide conflicting patterns from site to site.

Another factor that may allow closely related Protea species to coexist could be environmental heterogeneity (such as elevation, soil factors and moisture availability) generating significant barriers to gene flow via its effects on phenology (Linhart & Grant, 1996). Contrary to this, data on flowering times collected by the Protea Atlas Project indicate that most sister species occurring in sympatry have overlapping flowering times. However, this phenomenon requires more thorough investigation.

> Fire

In fynbos, re-seeding Protea species are associated with increased diversification rates in comparison to re-sprouting lineages. This would appear to agree with Cowling's hypothesis that a re-seeding fire-survival strategy promotes increased speciation rate in fynbos. However, re-sprouting summer rainfall Protea lineages are associated with a higher diversification rate than Cape re-seeding lineages (and this estimate is likely to be an underestimate because many tropical African taxa are missing from this tree). This would suggest that if re-sprouting lineages can escape from the Cape then they can do as well as re-seeding lineages in the Cape, if not better. In contrast, having a re-sprouting fire-survival strategy in the Cape appears to represent an evolutionary dead end. Perhaps the question should therefore be why re-sprouting lineages are more prone to extinction in the Cape, rather than why re-seeding lineages are more prone to diversification, given that re-sprouting lineages can speciate at least as fast as re-seeders if they are able to escape from the Cape.

> Summary

The model of microgeographic speciation (Cowling & Lamont 1998), in which fragmentation by fire is reinforced by topographic, edaphic or other environmental heterogeneity providing isolated microhabitats close together, relies heavily on short dispersal distances to readily isolate populations. The model is supported by the observation that most Cape species, Protea included, show no adaptations for dispersal and are regarded as passively dispersed (distances <10 m; Manders 1986). Long-distance dispersal, i.e. by birds and bats, is least common and is especially rare in plants on nutrient-poor substrates. The assumption is that plants on such soils cannot afford to allocate resources to protein-rich fleshy fruits (Bond & Slingsby, 1983). In general, genera with fleshy fruits or seeds have wide ranges and few species per genus (Goldblatt, 1997). However, no empirical studies have quantified levels of gene flow resulting from pollen flow among populations. Pollen flow may be considerable across much larger distances, particularly in Protea, where many species are pollinated by birds.

Rather than microgeographic speciation, it is likely that speciation in Protea has been predominantly allopatric, and that contemporary patterns of sympatry between closely related taxa are the result of post-speciational range movement over the group's long speciation history. Supporting evidence comes from the conclusion that edaphic factors do not appear to have been involved in speciation in Protea. Instead, there is evidence in some clades that soil factors may play a significant role in maintaining diversity by providing a variety of ecological niches in which closely related species can coexist if their ranges do become sympatric (as illustrated by the example of P. susannae, P. compacta and P. obtusifolia).

Fire-survival strategy does appear to affect diversification rates, given the observation that obligate re-seeding lineages, in which parent plants are killed by fire, display higher diversification rates than lineages that are able to re-sprout after fire. However, this observation is countered by the estimate of diversification rate for summer rainfall re-sprouting lineages, which is higher than that for Cape re-seeding lineages. This would suggest that it is wrong to look for consequences of re-seeding that may favor increased diversification rates, because it is evidently possible to attain similar or even higher speciation rates as a re-sprouter outside of the Cape. In the literature, the observation that there are more re-seeding than re-sprouting species has led to hypotheses that center on the special characteristics of a re-seeding strategy that may have resulted in increased diversification. Explanations have therefore concentrated on factors that may give rise to increased genetic novelty in re-seeding lineages, such as fragmentation caused by fire leading to genetic bottlenecks and local population extinction, increased generation turnover, and the absence of back-crossing between progeny and parents because the parents are killed by fire.

However, fire is unimportant in the tropical forest environment, and where fires do occur they are less intense, low-level grass fires. Species survive in these environments by adopting a sprawling habit and re-sprouting after fire, or by growing tall and producing a thick bark to protect against fire. In contrast to Cape re-seeding lineages, these lineages have much longer generation times, are able to back-cross with parent populations and do not experience population crashes caused by fire-induced local extinction. Despite this, they have a diversification rate that is higher than that of re-seeding lineages in the Cape.

Perhaps, therefore, the role of fire in diversification in the Cape should be approached from a different angle. It seems that the only way to survive the fire regime in fynbos is to have a re-seeding strategy, but this per se does not elevate diversification, because similar rates can be achieved by re-sprouting lineages if they manage to escape the Cape and its associated fire regime. The real question is why re-sprouting as a fire-survival strategy in the Cape represents such an evolutionary dead end. Why is re-sprouting as a syndrome so unsuited to the type of fires experienced in fynbos? It may be that the relatively long intervals between fires, which allow a large fuel load to build up, result in fires so hot that they kill outright more re-sprouting plants than we are aware of. To succumb to fire, and invest all resources in the next generation, may be the only way to prosper in the Cape.

Chapter 6 - Conclusions

Uncovering the evolutionary forces involved in speciation in the CFR has long been the focus of botanical and ecological research in South Africa, and there are numerous hypotheses that seek to explain both the origins and maintenance of diversity in this region. However, few studies have thus far incorporated detailed species-level phylogenetic hypotheses into their framework. With the advance of molecular techniques and cladistic methodology, the time is ripe for many existing hypotheses to be tested phylogenetically. Given the potential contribution of phylogenetic hypotheses to the study of macroevolution, the foundation of this thesis was therefore a reconstructed species-level phylogeny for one of the CFR's best-studied genera, Protea.

In Chapter 2 the starting point for this thesis was the reconstruction of species-level relationships using DNA sequence data. However, extremely low levels of sequence divergence among species resulted in an overall lack of resolution; on average, one parsimony-informative character was recovered for every nine base pairs of nuclear DNA sequenced, and only one for every 29 base pairs of plastid DNA. The limitations imposed by such low sequence variability inevitably led to the search for more variable markers with which to infer species-level relationships. Unfortunately, few characterized gene regions evolve at a suitable rate to be informative at the specific level, and so in Chapter 3 the efficacy of AFLP markers for reconstructing relationships was considered as an alternative to DNA sequencing. In combination with DNA sequence data, AFLPs proved to be extremely useful, and analysis of the combined evidence provided good resolution with groupings that were largely in agreement with current taxonomy. Due to time constraints it was not possible to perform AFLPs for multiple representatives of each species, but this should be carried out in future to check for species monophyly using this technique. The conclusion drawn from this study is that the AFLP approach should prove to be extremely useful for resolving species-level relationships, particularly for groups in which sufficient variation in DNA sequence data is difficult to obtain.

Whilst detailed taxonomic reappraisal of Protea may not be possible using the combined tree obtained in Chapter 3 (due to an overall lack of bootstrap support), the tree did provide convincing evidence that summer-rainfall sub-tropical taxa are deeply embedded within the topology. In contrast to existing hypotheses that suggest that sub-tropical taxa are the most primitive representatives of the genus, this tree points to a Cape origin for extant taxa, with subsequent expansion into tropical Africa.
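The ratio of base pairs to parsimony-informative characters quoted for Chapter 2 can be made concrete with a short sketch. It uses the standard definition, under which an alignment column is parsimony informative when at least two different character states are each present in at least two taxa. The toy alignment, the taxon labels and the function names are hypothetical and are not data from the thesis.

```python
from collections import Counter


def is_parsimony_informative(column, missing="-?N"):
    """True if at least two states each occur in at least two sequences."""
    counts = Counter(ch for ch in column.upper() if ch not in missing)
    return sum(1 for c in counts.values() if c >= 2) >= 2


def informative_sites(alignment):
    """Return the 0-based positions of parsimony-informative columns.

    `alignment` maps taxon labels to aligned sequences of equal length.
    """
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    return [
        i for i in range(length)
        if is_parsimony_informative("".join(s[i] for s in seqs))
    ]


# Hypothetical four-taxon alignment, for illustration only.
toy_alignment = {
    "taxon1": "ACGTA",
    "taxon2": "ACGTA",
    "taxon3": "ATGCA",
    "taxon4": "ATGCC",
}
sites = informative_sites(toy_alignment)
print(sites)  # [1, 3]
print(len(toy_alignment["taxon1"]) / len(sites))  # 2.5 aligned positions per informative site
```

Note that the last column of the toy alignment is variable but not informative, because the state C occurs in only one taxon.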

In Chapter 4, I used the molecular data to investigate the timing of the radiation of Protea. The estimated age and temporal dynamics of Protea did not corroborate the hypothesis that climate change at the beginning of the Pliocene (ca. five mya) prompted an explosion of plant diversification in the Cape. Instead, the pattern in Protea indicated a gradual increase in species richness from 36 mya, with a significant slowdown in diversification over the last 20 myr. This date represents a minimum estimate derived from the two alternative scenarios outlined in Chapter 4 for the biogeography of the family Proteaceae following the breakup of Gondwana. These observations pose a paradox, in which lineages that have been diversifying for at least 36 myr show remarkably low levels of sequence divergence. Whether this paradox holds true for other components of the Cape flora requires investigation across a range of taxa. The only other example in which a phylogenetic estimate has been used to provide an age for a group within the CFR is that of Richardson et al. (in press). In that study, the major radiation of Phylica taxa (adapted to arid environments in the Cape) was estimated to have started in the region of six million years ago. In contrast to the Protea estimate, this accords well with the inception of Mediterranean-type climates in the Cape. It is important to acknowledge that the estimate for Protea relies upon the NPRS method to produce an ultrametric tree from rate-variable sequence data. Dealing with rate heterogeneity in these circumstances is clearly difficult, and future work should include the application of alternative algorithms that are designed to produce ultrametric trees from rate-variable sequence data (e.g. Thorne et al. 1998; Huelsenbeck et al. 2000).

In Chapter 5, I used the phylogenetic tree to evaluate the roles of geography and edaphic specialization in the separation of Protea lineages. For geography, the results indicated a large amount of post-speciational range movement, which is probably not surprising if the lineage has been diversifying for a minimum of 36 million years. Overall, the trend exhibited by Protea is one of increasing range overlap with node age, a pattern that would be expected if the mode of speciation has been predominantly allopatric. This is in contrast to published studies which have concluded that local parapatric speciation, through differentiation of local populations in adjacent but different habitats, has been more important than allopatric speciation (Linder & Vlock, 1991; Goldblatt & Manning 1996). In Protea, there is a weak trend consistent with a role for edaphic specialization and coexistence in the separation of lineages, but this is not significant with respect to random accumulation of ecological disparity over time. However, there is an example of species that are estimated to be in excess of 14 million years old and are now confined to sediments and soils that are geologically young. This may instead suggest that a high level of coexistence is mediated by edaphic complexity if the ranges of closely related taxa become sympatric. The limestones, colluvial acid sands, duplex soils and calcareous coastal sands of the coastal lowlands, which support most of the endemics in the south-west and the south-east of the Cape, were only deposited or formed since the middle to late Pliocene, some four mya (Hendley 1983), and in the past it has been assumed that species confined to these surfaces must be younger than this (Cowling et al. 1992).

Also in Chapter 5, the discovery that the speciation rate of non-Cape re-sprouting lineages is higher than that of re-seeding lineages in the Cape demands re-evaluation of the factors relating to fire-survival strategy in the Cape that may promote diversification. The position of summer-rainfall taxa in the phylogenetic tree indicates that these taxa are representative of an 'escaped' lineage from the Cape. Having escaped the fire regime in the Cape, it appears that re-sprouting lineages are able to diversify as rapidly as Cape re-seeding lineages, although over larger areas. It may therefore be that factors associated with a re-seeding strategy that have been assumed to promote genetic novelty are less important than previously considered. This is because short generation times, fire-induced local population extinctions and fragmentation, and inability to back-cross with parent populations are not characteristics of the re-sprouting lineages that achieve similar diversification rates outside of the Cape.

In summary, using the data and observations collected for Protea, I propose the following hypothesis to explain speciation mechanisms in this lineage:

> An allopatric mode of speciation with a large degree of range movement over the lineage's 36 myr speciation history.
> A gradual build-up of species richness over a long time period with low levels of extinction.
> A high degree of coexistence mediated in some instances by the complex edaphic environment.
> A re-seeding fire-survival strategy associated with higher diversification rates in the Cape in comparison to re-sprouting lineages. However, this may not be attributable to factors such as increased generation turnover or population fragmentation and extinction introducing increased genetic novelty. Instead, high extinction in Cape re-sprouting lineages may be induced by the intense fires in the Cape, which may kill more individual plants than currently realised.

In answer to the question of why the CFR demonstrates such remarkable plant species richness, the conclusion from this study is that whilst diversification rates per se are not remarkably high, there is little evidence of extinction. This would suggest a high degree of coexistence among species that have diversified over a long time period in a topographically complex area. The significant slowdown in diversification rate may suggest niche filling, in that over its long speciation history the group has filled all the available habitats in what is a geographically small and well-defined area. Escaping from the Cape does not appear to be easy (this has only occurred twice in Protea according to the tree presented here), but when it is achieved even higher diversification rates are witnessed.

Finally, conservation measures in South Africa have begun to recognize the importance of understanding the contribution of the physical environment and abiotic factors to the diversification process. Consequently, conservation planning priorities are shifting to encompass the belief that it is essential to design systems of conservation areas that are representative not only of biodiversity patterns, but also of the processes that sustain them (Cowling et al. 1999). Unlike Proteus, the mythological Greek god who could see into the future, this thesis has aimed to look into the past. I hope it has gone some way towards contributing to our understanding of the factors that are important in the natural cycle of speciation and extinction in the Cape Floristic Region, and that this knowledge can be applied to its long-term conservation.

References

Acocks, J. P. H. (1953). Veld types of South Africa. Memoirs of the Botanical Survey of South Africa 28: 1-192.

Albertson, R. C., J. A. Markert, P. D. Danley & T. D. Kocher. (1999). Phylogeny of a rapidly evolving clade: the cichlid fishes of Lake Malawi, East Africa. Proceedings of the National Academy of Sciences, USA 96: 5107-5110.

Arcade, A., F. Anselin, P. Faivre Rampant, M. C. Lesage, L. E. Paques & D. Prat. (2000). Application of AFLP, RAPD and ISSR markers to genetic mapping of European and Japanese larch. Theoretical and Applied Genetics 100: 299-307.

Arnold, M. L. & S. K. Emms. (1998). Molecular markers, gene flow and natural selection. In Molecular systematics of plants II DNA sequencing (ed. D. E. Soltis, P. S. Soltis & J. J. Doyle) pp. 442-458. Kluwer Academic Publishers, MA.

Avise, J. C. (1994). Molecular markers, natural history and evolution. Chapman and Hall, New York.

Baldwin, B. G. & M. J. Sanderson. (1998). Age and rate of diversification of the Hawaiian Silversword alliance. Proceedings of the National Academy of Sciences, USA 95: 9402-9406.

Bakker, F. T., A. Culham, L. C. Daugherty & M. Gibby. (1999). A trnL-F based phylogeny for species of Pelargonium (Geraniaceae) with small chromosomes. Plant Systematics and Evolution 216: 309-324.

Barraclough, T. G., A. P. Vogler & P. H. Harvey. (1998). Revealing the factors that promote speciation. Philosophical Transactions of the Royal Society of London B Biological Sciences 353: 1271-1280.

Barraclough, T. G., J. E. Hogan & A. P. Vogler. (1999). Testing whether ecological factors promote cladogenesis in a group of tiger beetles (Coleoptera: Cicindelidae). Proceedings of the Royal Society of London Series B Biological Sciences 266: 1061-1067.

Barraclough, T. G. & A. P. Vogler. (2000). Detecting the geographical pattern of speciation from species-level phylogenies. American Naturalist 155: 419-434.

Baum, D. A., R. L. Small & J. F. Wendel. (1998). Biogeography and floral evolution of Baobabs (Adansonia, Bombacaceae) as inferred from multiple data sets. Systematic Biology 47: 181-207.

Beard, J. S. (1963). The genus Protea in tropical Africa. Kirkia 3: 138-206.

Beard, J. S. (1993). The Proteas of Tropical Africa. Kangaroo Press.

Bond, W. J. & P. Slingsby. (1983). Seed dispersal by ants in shrublands of the Cape Province and its evolutionary implications. South African Journal of Science 79: 231-233.

Bond, W. J. (1984). Fire survival of Cape Proteaceae: influence of fire season and seed predators. Vegetatio 56: 65-74.

Bousquet, J., S. H. Strauss, A. H. Doerksen & R. A. Price. (1992). Extensive variation in evolutionary rate of rbcL gene sequences among seed plants. Proceedings of the National Academy of Sciences, USA 89: 7844-7848.

Brokaw, N. & R. T. Busing. (2000). Niche versus chance and tree diversity in forest gaps. Trends in Ecology and Evolution 15: 183-188.

Carroll, S. P., H. Dingle & S. P. Klasson. (1997). Genetic differentiation of fitness-associated traits among rapidly evolving populations of the soapberry bug. Evolution 51: 1182-1188.

Chesser, R. T. & R. M. Zink. (1994). Modes of speciation in birds: a test of Lynch's method. Evolution 48: 490-497.

Chesson, P. & N. Huntley. (1997). The roles of harsh and fluctuating conditions in the dynamics of ecological communities. American Naturalist 150: 519-553.

Chisumpa, S. M. & R. K. Brummitt. (1987). Taxonomic notes on tropical African species of Protea. Kew Bulletin 42: 815-853.

Coetzee, J. A., A. Scholtz & H. J. Deacon. (1983). Palynological studies and the vegetative history of the fynbos. In: Fynbos Palaeoecology: a preliminary synthesis pages 156-17. Eds. H. J. Deacon, Q. B. Hendey & J. J. N. Lambrechts. South African National Scientific Programmes Report 75, CSIR, Pretoria.

Connell, J. H. (1978). Diversity in tropical forests and coral reefs. Science 199: 1302.

Cowling, R. M. (1987). Fire and its role in coexistence and speciation in Gondwanan shrublands. South African Journal of Botany 83: 106-112.

Cowling, R. M., G. E. Gibbs Russell, M. T. Hoffman & C. Hilton-Taylor. (1989). Patterns of plant species diversity in southern Africa. In: Biotic diversity in southern Africa: concepts and conservation. Ed. B. J. Huntley. Pp. 19-50. Cape Town: Oxford University Press.

Cowling, R. M. (1990). Diversity components in a species-rich area of the Cape Floristic Region. Journal of Vegetation Science 1: 699-710.

Cowling, R. M. & P. M. Holmes. (1992). Endemism and speciation in a lowland flora from the Cape Floristic Region. Biological Journal of the Linnean Society 47: 367-383.

Cowling, R. M., P. M. Holmes & A. G. Rebelo. (1992). Plant diversity and endemism. In: Fynbos: Nutrients, Fire and Diversity pages 62-112. Ed. R. M. Cowling. Oxford University Press, South Africa.

Cowling, R. M. & D. Richardson. (1995). Fynbos, South Africa's Unique Floral Kingdom. Fernwood Press, South Africa.

Cowling, R. M. & M. J. Samways. (1996). Predicting global patterns of endemic plant species richness. Biodiversity Letters 2: 17-21.

Cowling, R. M., P. W. Rundel, B. B. Lamont, M. Kahn Arroyo & M. Arianoutsou. (1996). Plant diversity in Mediterranean-climate regions. Trends in Ecology and Evolution 11: 362-366.

Cowling, R. M. & C. Hilton-Taylor. (1997). Phytogeography, flora and endemism. In: Vegetation of Southern Africa pages 397-420. Eds. R. M. Cowling, D. M. Richardson & S. M. Pierce. Cambridge University Press, Cambridge.

Cowling, R. M., D. M. Richardson & P. J. Mustart. (1997). Fynbos. In: Vegetation of Southern Africa pages 99-130. Eds. R. M. Cowling, D. M. Richardson & S. M. Pierce. Cambridge University Press, Cambridge.

Cowling, R. M. & B. B. Lamont. (1998). On the nature of Gondwanan species flocks: diversity of Proteaceae in Mediterranean south-western Australia and South Africa. Australian Journal of Botany 46: 335-355.

Deacon, H. J., M. R. Jury & F. Ellis. (1992). Selective regime over time. In: Fynbos: Nutrients, Fire and Diversity pages 6-22. Ed. R. M. Cowling. Oxford University Press, South Africa.

Doyle, J. J. & J. L. Doyle. (1987). A rapid DNA isolation procedure from small quantities of fresh leaf tissue. Phytochemical Bulletin, Botanical Society of America 19: 11-15.

Emshwiller, E. & J. J. Doyle. (1999). Chloroplast-expressed glutamine synthetase (ncpGS): potential utility for phylogenetic studies with an example from Oxalis (Oxalidaceae). Molecular Phylogenetics and Evolution 12: 310-319.

Eriksson, O. & B. Bremer. (1992). Population systems, dispersal modes, life forms and diversification rates in angiosperm families. Evolution 46: 258-266.

Farris, J. S., M. Kallersjo, A. G. Kluge & C. Bult. (1995). Testing the significance of congruence. Cladistics 10: 315-319.

Felsenstein, J. (1981). Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution 17: 368-378.

Felsenstein, J. (1995). PHYLIP: Phylogenetic Inference Package. University of Washington, Seattle.

Fitch, W. M. (1971). Towards defining the course of evolution: Minimum change for a specific tree topology. Systematic Zoology 20: 406-416.

Gaut, B. S., S. V. Muse, W. D. Clark & M. T. Clegg. (1992). Relative rates of nucleotide substitution at the rbcL locus of monocotyledonous plants. Journal of Molecular Evolution 35: 292-303.

Gibbs Russell, G. E. (1985). Analysis of the size and composition of the southern African flora. Bothalia 17: 213-227.

Gibbs Russell, G. E. (1987). Preliminary floristic analysis of the major biomes in southern Africa. Bothalia 17: 213-227.

Givnish, T. J. (1997). Adaptive radiation and molecular systematics: issues and approaches. In: Molecular Evolution and Adaptive Radiation. Eds. T.J. Givnish & K.J. Sytsma. Cambridge University Press.

Goldblatt, P. (1978). An analysis of the flora of southern Africa: its charactersitics, relationships and origins. Annals of the Missouri Botanic Gardens 65: 369-436.

Goldblatt, P. (1997). Floristic diversity in the Cape flora of South Africa. Biodiversity and Conservation 6: 359-377.

Goldblatt, P. & J. C. Manning. (1998). Adaptive radiation of bee-pollinated Gladiolus species (Iridaceae) in southern Africa. Annals of the Missouri Botanical Garden 85: 492-517.

Goldblatt, P. & J. C. Manning. (2000). Cape Plants: A Conspectus of the Cape Flora of South Africa. Strelitzia 9. National Botanical Institute & Missouri Botanical Garden.

Harvey, P. H. & M. D. Pagel. (1991). The Comparative Method in Evolutionary Biology. Oxford University Press.

Harvey, P. H., R. M. May & S. Nee. (1994). Phylogenies without fossils: estimating lineage birth and death rates. Evolution 48: 523-529.

Hendey, Q. B. (1983). Palaeontology and palaeoecology of the fynbos region: an introduction. In: Palaeoecology of the fynbos landscape: a preliminary synthesis pp. 100-115. Eds. H. J. Deacon, Q. B. Hendey & J. J. N. Lambrechts. South African National Scientific Programmes Report 75. CSIR, Pretoria.

Hillis, D. M. (1996). Inferring complex phylogenies. Nature 383: 130.

Hodkinson, T. R., S. A. Renvoize, G. Ni Chonghaile, C. M. A. Stapleton & M. W. Chase. (2000). A comparison of ITS nuclear rDNA sequence data and AFLP markers for phylogenetic studies in Phyllostachys (Bambusoideae, Poaceae). Journal of Plant Research 113: 259-269.

Hoot, S. B. & A. W. Douglas. (1998). Phylogeny of the Proteaceae based on atpB and atpB-rbcL intergenic spacer region sequences. Australian Systematic Botany 11: 301-320.

Hubbell, S. P., R. B. Foster, S. T. O'Brien, K. E. Harms, R. Condit, B. Wechsler, S. J. Wright & S. Loo de Lao. (1999). Light-gap disturbances, recruitment limitation, and tree diversity in a neotropical forest. Science 283: 554-557.

Huelsenbeck, J. P., J. J. Bull & C. W. Cunningham. (1996). Combining data in phylogenetic analysis. Trends in Ecology and Evolution 11: 152-158.

Huelsenbeck, J. P., B. Larget & D. Swofford. (2000). A compound Poisson process for relaxing the molecular clock. Genetics 154: 1879-1892.

Johnson, L. A. S. & B. G. Briggs. (1975). On the Proteaceae — the evolution and classification of a southern family. Botanical Journal of the Linnean Society 70: 83-182.

Johnson, S. D. (1996). Pollination, adaptation and speciation models in the Cape flora of South Africa. Taxon 45: 59-66.

Kruckeberg, A.R. & D. Rabinowitz. (1985). Biological aspects of endemism in higher plants. Annual Review of Ecology and Systematics 16: 449-479.

Kruger, F.J. (1979). South African heathlands. In: Heathlands of the world: a descriptive catalogue. Ed. R.L. Specht, 19-80. Amsterdam: Elsevier.

Kruger, F. J. (1983). Plant community diversity and dynamics in relation to fire. In: Mediterranean-Type Ecosystems: The Role of Nutrients. Eds. F. J. Kruger, D. T. Mitchell & J. M. Jarvis. Springer-Verlag, Berlin.

Levyns, M. R. (1952). Clues to the past in the Cape flora of today. South African Journal of Science 49: 155-164.

Levyns, M. R. (1963). Migrations and origin of the Cape flora. Transactions of the Royal Society of South Africa 37: 85-106.

Linder, H.P. (1985). Gene flow, speciation and species diversity patterns in a species-rich area: the Cape flora. In Species and Speciation ed. E.S. Vrba, 53-57. Pretoria: Transvaal Museum.

Linder, H. P. (1991). Environmental correlates of patterns of species richness in the south-western Cape Province of South Africa. Journal of Biogeography 18: 509-518.

Linder, H. P. & J. Vlok. (1991). The morphology, taxonomy and evolution of Rhodocoma (Restionaceae). Plant Systematics and Evolution 175: 139-160.

Linder, H. P., M. E. Meadows & R. M. Cowling. (1992). History of the Cape flora. In: Fynbos: Nutrients, Fire and Diversity pages 113-134. Ed. R. M. Cowling. Oxford University Press, South Africa.

Linhart, Y. B. & M. C. Grant. (1996). Evolutionary significance of local genetic differentiation in plants. Annual Review of Ecology and Systematics 27: 237-277.

Losos, J. B., K. B. Warheit & T. W. Schoener. (1997). Adaptive differentiation following experimental island colonisation in Anolis lizards. Nature 387: 70-73.

Low, A. B. & A. G. Rebelo. (1998). Vegetation of South Africa, Lesotho and Swaziland. DEAT, Pretoria.

Lynch, J. D. (1989). The gauge of speciation: on the frequency of modes of speciation. In: Speciation and its consequences. Pages 527-553. Eds. D. Otte & J. A. Endler. Sinauer, Sunderland, Massachusetts.

Maddison, W. P. & D. R. Maddison. (1992). MacClade 3.01. Sinauer Associates Inc., Sunderland, Mass., USA.

Magallon, S., P. R. Crane & P. S. Herendeen. (1999). Phylogenetic patterns, diversity, and diversification of eudicots. Annals of the Missouri Botanical Garden 86: 297-372.

Manders, P. T. (1986). Seed dispersal and seedling recruitment in Protea laurifolia. South African Journal of Botany 52: 421-424.

Manders, P. T. & D. M. Richardson. (1992). Colonization of Cape fynbos communities by forest species. Forest Ecology and Management 48: 277-293.

Major, J. (1988). Endemism: a botanical perspective. In Analytical Biogeography. An Integrated Study of Plant and Animal Distributions. Ed. A. A. Myers & P. S. Giller. 117-146. New York: Chapman and Hall.

McDonald, D. J., J. M. Juritz, R. M. Cowling & W. J. Knottenbelt. (1994). Modelling the biological aspects of local endemism in South African Fynbos. Plant Systematics and Evolution 195: 137-147.

Mueller, U. G. & L. L. Wolfenbarger. (1999). AFLP genotyping and fingerprinting. Trends in Ecology and Evolution 14: 389-393.

Myers, N. (1988). Threatened biotas: 'Hotspots' in tropical forests. The Environmentalist 10: 1-20.

Nee, S. in press. On inferring speciation rates from phylogenies. Evolution.

Owen-Smith, N. & J. E. Danckwerts. (1997). Herbivory. In: Vegetation of Southern Africa pages 397-420. Eds. R. M. Cowling, D. M. Richardson & S. M. Pierce. Cambridge University Press, Cambridge.

Oxelman, B., M. Liden & D. Berglund. (1997). Chloroplast rps16 intron phylogeny of the tribe Sileneae (Caryophyllaceae). Plant Systematics and Evolution 206: 393-410.

Pagel, M. (1994). Detecting correlated evolution on phylogenies: a general method for the comparative method of discrete characters. Proceedings of the Royal Society, Biological Sciences Series B 255: 37-45.

Pianka, E. R. (1966). Latitudinal gradients in species diversity: a review of concepts. The American Naturalist 100: 33-46.

Purvis, A. (1996). Using interspecies phylogenies to test macroevolutionary hypotheses. In New Uses for New Phylogenies. P. H. Harvey, A. J. Leigh Brown, J. Maynard Smith & S. Nee (Eds.). Oxford University Press, UK.

Rambaut, A. & L. Bromham. (1998). Estimating divergence dates from molecular sequences. Molecular Biology and Evolution 15: 442-448.

Rambaut, A. & M. Charleston. (2000). TreeEdit: Phylogenetic Tree Editor version 1.0.

Raven, P. H. (1983). The migration and evolution of floras in the southern hemisphere. Bothalia 14: 325-328.

Rebelo, A. G. (1995). A Field Guide to the Proteas of Southern Africa. Fernwood Press, South Africa.

Reeves, G., P. Goldblatt, P. J. Rudall, T. Souza-Chies, B. Lejeune, A. V. Cox, M. F. Fay, & M. W. Chase. Molecular systematics of Iridaceae: evidence from four plastid DNA regions. Submitted: American Journal of Botany.

Remington, D. L., R. W. Whetten, B. H. Liu & D. M. O'Malley. (1999). Construction of an AFLP genetic map with nearly complete coverage in Pinus taeda. Theoretical and Applied Genetics 98: 1279-1292.

Reznick, D. N. (1997). Evaluation of the rate of evolution in natural populations of guppies (Poecilia reticulata). Science 275: 1934-1937.

Richards, M. B., R. M. Cowling & W. D. Stock. (1995). Fynbos plant communities and vegetation-environment relationships in the Soetanysberg hills, Western Cape. South African Journal of Botany 61: 298-305.

Richardson, J. E., F. M. Weitz, M. F. Fay, Q. C. B. Cronk, H. P. Linder & M. W. Chase. (in press). Phylogenetic analysis of Phylica L. with an emphasis on island species: evidence from plastid trnL-F and nuclear internal transcribed spacer (ribosomal DNA) sequences. Taxon.

Rosenzweig, M. L. (1985). Species Diversity in Space and Time. Cambridge University Press: Cambridge.

Rourke, J. P. (1980). The Proteas of Southern Africa. Purnell: Cape Town.

Rourke, J. P. (1998). A review of the systematics and phylogeny of the African Proteaceae. Australian Systematic Botany 11: 267-285.

Rutherford, M. C. & R. H. Westfall. (1994). Biomes of southern Africa: an objective categorisation, 2nd edition. Memoirs of the Botanical Survey of South Africa 63: 1-94.

Sanderson, M. J. (1997). A nonparametric approach to estimating divergence times in the absence of rate constancy. Molecular Biology and Evolution 14: 1218-1232.

Saitou, N. & M. Nei. (1987). The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4: 406-425.

Savolainen, V., J. F. Manen, E. Douzery & R. Spichiger. (1994). Molecular phylogeny of families related to Celastrales based on rbcL 5' flanking sequences. Molecular Phylogenetics and Evolution 3: 27-37.

Savolainen, V., P. Cuenoud, R. Spichiger, M. D. P. Martinez, M. Crèvecoeur & J. F. Manen. (1995). The use of herbarium specimens in DNA phylogenetics: evaluation and improvement. Plant Systematics and Evolution 197: 87-98.

Scholtz, A. (1985). Palynology of the Upper Cretaceous lacustrine sediments of the Arnot Pipe, Banke, Namaqualand. Annals of the South African Museum 95: 1-109.

Scott, L., H. M. Anderson & R. M. Anderson. (1997). Vegetation history. In: Vegetation of Southern Africa pages 62-90. Eds. R. M. Cowling, D. M. Richardson & S. M. Pierce. Cambridge University Press, Cambridge.

Soltis, D. E. & P. S. Soltis. (1998). Approaches and genes for phylogenetic analysis. In Molecular Systematics of Plants II: DNA sequencing Ed. D. E. Soltis, P. S. Soltis & J. J. Doyle. Kluwer Academic Publishers, Boston.

Stevens, G. C. (1989). The latitudinal gradient in geographical range: how so many species coexist in the tropics. The American Naturalist 133: 240-256.

Stock, W. D., F. Van der Heydon & O. A. M. Lewis. (1992). Plant structure and function. In: Fynbos: Nutrients, Fire and Diversity pages 226-240. Ed. R. M. Cowling. Oxford University Press, South Africa.

Sun, Y., D. Z. Skinner, G. H. Liang & S. H. Hulbert. (1994). Phylogenetic analysis of Sorghum and related taxa using internal transcribed spacers of nuclear ribosomal DNA. Theoretical and Applied Genetics 89: 26-32.

Swofford, D. L. (2000). PAUP*. Phylogenetic Analysis Using Parsimony (*and other methods) Version 4. Sinauer Associates, Sunderland, Massachusetts.

Taberlet, P., L. Gielly, G. Pautou & J. Bouvet. (1991). Universal primers for amplification of three non-coding regions of chloroplast DNA. Plant Molecular Biology 17: 1105-1109.

Takezaki, N., A. Rzhetsky & M. Nei. (1995). Phylogenetic test of the molecular clock and linearized trees. Molecular Biology and Evolution 12: 823-833.

Thorne, J. L., H. Kishino & I. S. Painter. (1998). Estimating the rate of evolution of the rate of molecular evolution. Molecular Biology and Evolution 15: 1647-1657.

Vogler, A. P. & R. DeSalle. (1994). Evolution and phylogenetic information content of the ITS-1 region in the Tiger Beetle Cicindela dorsalis. Molecular Biology and Evolution 11: 393-405.

Vogts, M. (1982). South Africa's Proteaceae, know them and grow them. C. Struik, Cape Town.

Vos, P., R. Hogers, M. Bleeker, M. Reijans, T. Van de Lee, M. Hornes, A. Frijters, J. Pot, J. Peleman, M. Kuiper & M. Zabeau. (1995). AFLP: a new technique for DNA fingerprinting. Nucleic Acids Research 23: 4407-4414.

Vrba, E. S. (1985). Environment and evolution: alternative causes of the temporal distribution of evolutionary events. South African Journal of Science 81: 229-236.

Vuylsteke, M., R. Mank, R. Antonise, E. Bastiaans, M. L. Senior, C. W. Stuber, A. E. Melchinger, T. Lubberstedt, X. C. Xia, P. Sram, M. Zabeau & M. Kuiper. (1999). Two high-density AFLP linkage maps of Zea mays L.: analysis of distribution of AFLP markers. Theoretical and Applied Genetics 99: 921-935.

Weins, D. & J. P. Rourke. (1978). Rodent pollination in southern African Protea spp. Nature 276: 71-73.

Weins, J. J. (1998). Combining data sets with different phylogenetic histories. Systematic Biology 47: 568-581.

Whittaker, R. H. (1977). Evolution of species diversity in land communities. Evolutionary Biology 10: 1-67.

Wiley, E. O., D. Siegel-Causey, D. R. Brooks & V. A. Funk. (1991). The complete cladist: a primer of phylogenetic procedures. The University of Kansas, Museum of Natural History. Special Publication No. 19.

Williams, P. H. (1998). WorldMap Version 4.1 in Windows: software and help document. Privately distributed, London.

Wikstrom, N., V. Savolainen & M. W. Chase. (submitted). Evolution of angiosperms: calibrating the family tree. Evolution.

Wolfe, A. D. & A. Liston. (1998). Contributions of PCR-based methods to plant systematics and evolutionary biology. In Molecular systematics of plants II: DNA sequencing (ed. D. E. Soltis, P. S. Soltis & J. J. Doyle) pp. 43-86. Kluwer Academic Publishers, MA.
