Phylogenetic relationships and speciation in the genus Passerina L. (Thymelaeaceae) inferred from chloroplast and nuclear sequence data

by

Anemari van Niekerk

Dissertation presented in fulfilment of the requirements for the degree

MAGISTER SCIENTIAE

in BOTANY

in the FACULTY OF NATURAL SCIENCES at the UNIVERSITY OF JOHANNESBURG

Supervisor: DR. M. VAN DER BANK

DECEMBER 2005

I declare that this dissertation has been composed by myself and the work contained within, unless otherwise stated, is my own.

A. van Niekerk December 2005

Table of Contents

Table of contents i
Abstract iii
Index of Figures iv
Index of Tables vii
Acknowledgements viii

Chapter 1 General introduction and Aims of the study 1

1.1 General introduction 1
1.2 Passerina L. (Thymelaeaceae) as a case study 4
1.3 Aims of the study 8

Chapter 2 Material and Methods 10

2.1 DNA extraction 10
2.2 DNA amplification 10
2.3 DNA cycle sequencing and alignment 10
2.4 Phylogenetic analysis 11
2.5 Dating radiation of Passerina 12
2.6 Temporal dynamics of the radiation of Passerina 12
2.7 Factors promoting speciation in Passerina 14

Chapter 3 Results 28

3.1 Molecular results 28
3.1.1 Statistics 28
3.1.2 Molecular evolution 28
3.1.3 Plastid regions 30
3.1.4 Nuclear region 34
3.1.5 Total evidence 34
3.2 Dating radiation of Passerina 41
3.3 Temporal dynamics of the radiation of Passerina 43
3.4 Factors promoting speciation in Passerina 43

Chapter 4 Discussion 45

4.1 Molecular evolution 45
4.2 Combining of molecular dataset analyses 46
4.3 Morphological characters in Passerina 46
4.4 Age estimate of Passerina 49
4.5 General causes and rates of speciation within clades 52
4.5.1 Temporal dynamics of the Passerina radiation 53
4.5.2 Factors promoting speciation in Passerina 54


Chapter 5 Conclusion 57

Chapter 6 References 58

Appendix 68

A.1 The use of molecular and morphological characters in phylogenetic studies 68
A.2 Choosing an appropriate gene in the molecular phylogenetic study of Passerina 68
A.3 Analyses of phylogenetic relationships 70
A.3.1 Analysis based on distance methods 70
A.3.2 Analysis based on discrete methods 72
A.3.3 Variants of parsimony (optimality criterion) 73
A.3.4 Additive and ultrametric trees 75
A.3.5 Tree building methods 75
A.3.5.1 Exact methods 76
A.3.5.2 Heuristic methods 78
A.3.6 Credibility of the hypotheses 81
A.3.7 Consensus trees 82
A.3.8 Choice of outgroups 84
A.3.9 Choice of method used to analyse sequence data 84

Abstract

The eastern Cape is regarded as the centre of diversity for Passerina, except for two species occurring in the outliers of eastern Africa. Ten species are endemic to the Cape Floristic Region and four are regarded as near endemics. A complete species-level phylogeny for Passerina utilising sequences from three plastid loci and one nuclear locus is presented. The loci sequenced were rbcL, trnL-F, rps16 and ITS. Parsimony and Bayesian analyses yielded identical relationships, and two informal groupings are described. Passerina is well embedded within the tribe Gnidieae and not sister to it as previously suggested. The elevation of the subtribe Passerininae (under the tribe Gnidieae) to the monogeneric tribe Passerineae is thus not supported. The age of the root node of Passerina was estimated to evaluate the widely held view that much of the diversification in the Cape occurred ca. 5 Mya with the start of the Mediterranean climate. Contrary to this view, the timing and the temporal dynamics of the radiation of Passerina indicate that the lineage is at least 18 million years old and that the diversification rate has declined slightly over the past 5 million years. In Passerina, it also appears that speciation has been largely allopatric with a high frequency of range shifts.

Index of Figures

Chapter 1

Figure 1.1 The six Floral Kingdoms of the world: Boreal (Holarctic), Paleotropical, Neotropical, South African (CFK), Australian and Antarctic (Hudson, 2000). 1

Figure 1.2 The Cape Floral Kingdom (CFK; Low & Rebelo, 1996). 2

Figure 1.3 Biomes of South Africa (Marais, 1999). 3

Figure 1.4 Distribution of Thymelaeaceae (Domke, 1934; Meusel et al., 1978). 5

Figure 1.5 Phylogeny of expanded Malvales based on the hypotheses of Alverson et al. (1998) and Bayer et al. (1999). The core Malvales includes the taxa traditionally placed in Bombacaceae, Malvaceae, Sterculiaceae and Tiliaceae. 6

Figure 1.6 Distribution of the genus Passerina L. (Bredenkamp, 2002). 7

Figure 1.7 Distribution of the genus Passerina with the number of species per grid square (Bredenkamp, 2002). 7

Figure 1.8 Various Passerina species to illustrate the growth forms and flowers. 9

Chapter 2

Figure 2.1 A hypothetical log number of lineages through time plot (after Purvis, 1996; Reeves, 2001). 13

Figure 2.2 Possible behaviour of a log number of lineages through time plot. A: constant net speciation with background extinction, or constant net speciation with an increase in rate towards the present. B: constant speciation. C: constant net speciation with a slowdown in rate towards the present, or taxa missing from the sample (after Barraclough & Vogler, 2002). 13

Figure 2.3 Predictions for the relationship between geographical overlap and node age (after Reeves, 2001). 14

Chapter 3

Figure 3.1 One of the 13 most parsimonious rbcL trees (TL=233; CI=0.67; RI=0.82) for 17 Passerina taxa and outgroups. Numbers above the branches are Fitch lengths (DELTRAN optimisation), and bootstrap percentages over 50% are indicated below the branches. Solid arrowheads indicate branches not present in the strict consensus tree. 31

Figure 3.2 One of the most parsimonious trees of 188 steps (CI=0.82; RI=0.89) based on the analysis of trnL-F sequence data. Numbers above the branches are Fitch lengths (DELTRAN optimisation) and below the branches are bootstrap percentages over 50%. Branches not present in the strict consensus tree are indicated by solid arrows. 32

Figure 3.3 One of the 1250 most parsimonious trees from the analysis of the rps16 intron (TL=135; CI=0.84; RI=0.90). Numbers above the branches are Fitch lengths (DELTRAN optimisation), and bootstrap percentages over 50% are indicated below the branches. Solid arrowheads indicate branches not present in the strict consensus tree. 33

Figure 3.4 One of the 339 Fitch trees obtained from analysis of ITS sequences (TL=400; CI=0.5; RI=0.47). Fitch lengths (DELTRAN optimisation) are shown above the branches and bootstrap percentages over 50% below (SW bootstrap results are underlined). Solid arrowheads indicate branches not present in the strict consensus tree and open arrowheads indicate groups not found in both the SW and Fitch consensus trees. 36

Figure 3.5 One of the 898 most parsimonious trees (CI=0.72; RI=0.83) based on the combined plastid data (rbcL, trnL-F and rps16). Numbers above the branches are Fitch lengths (DELTRAN optimisation) and below the branches are bootstrap values greater than 50% (SW bootstrap results are underlined). Branches not present in the Fitch majority rule consensus tree are indicated by solid arrows and open arrowheads indicate groups not found in both the SW and Fitch consensus trees. 37

Figure 3.6 Comparison between the bootstrap consensus trees of the combined plastid (A) and ITS (B) data sets. Bootstrap percentages over 50% are shown below branches. 38

Figure 3.7 One of the 6241 equally most parsimonious trees (TL=1024; CI=0.62; RI=0.79) found from the combined molecular data. Numbers above the branches are Fitch lengths (DELTRAN optimisation)/Posterior probability (MrBayes), and bootstrap percentages over 50% are indicated below the branches (SW bootstrap results are underlined). Solid arrowheads indicate branches that are not recovered in the Fitch majority rule consensus tree and open arrowheads indicate groups not found in both the SW and Fitch consensus trees. 39

Figure 3.8 Bayesian analysis of combined data set. Majority rule consensus tree with Posterior probability (MrBayes) values shown above branches. 40

Figure 3.9 Bootstrap distribution of age estimates for the root node of Passerina. 41

Figure 3.10 One of the equally most parsimonious trees obtained from the combined molecular analysis. 42

Figure 3.11 Log numbers of lineages through time plot for Passerina. 43

Figure 3.12 Plot of sympatry and node age for 19 taxa of Passerina. 44

Chapter 4

Figure 4.1 Estimate of possible phylogenetic relationships in the genus Passerina obtained from cladistic analysis (Bredenkamp, 2002). 48

Figure 4.2 Radiation dates of several fynbos : (a) Proteaceae, 35 Mya; (b) Moraea, 25 Mya; (c) Pelargonium, 22 Mya; (d) Phylica, 8 Mya; (e) Lachnaea, 17.95 Mya; (f) Ehrharta, 9.8 Mya; (g) Rafnia, 18.1 Mya; (h) Restionaceae, 22 Mya (www.plantwab.co.za). 51


Figure 4.3 Accumulation of species diversity. The upper half of the diagram shows the accumulated retrojected estimates of diversity for those lineages for which the radiations have been dated. Each lineage is colour-coded, and the total estimated diversity is indicated by the upper line. The calculated starting dates for each radiation are indicated with colour-coded arrows, the points of which indicate the mean and the width of which indicates the error of the estimates (Linder, 2005). 52

Figure 4.4 Log number of lineages through time plot for Lachnaea (Robinson, 2004). 53

Figure 4.5 Sympatry versus node age plots obtained when range movements occur by large-scale shifts of entire species ranges. A: entirely allopatric; B: 50% sympatric; C: entirely sympatric speciation. The Y-axis is the degree of sympatry and the X-axis is the node age. The frequency of range shifts varies from low, to medium, to high (Barraclough & Vogler, 2000). 56

Appendix

Figure A.1 Example of an additive (A) and ultrametric tree (B; Page, 1995). 75

Figure A.2 Illustration of an exhaustive search (Swofford et al., 1996). 76

Figure A.3 A diagrammatic representation of the branch-and-bound approach (Felsenstein, 2004). 77

Figure A.4 The process of Nearest-Neighbour Interchange (NNI; Felsenstein, 2004). 79

Figure A.5 Subtree Pruning and Regrafting (SPR) rearrangement (Swell & Thollesson, 2001). 80

Figure A.6 Tree Bisection and Reconnection (TBR; Swell & Thollesson, 2001). 81

Figure A.7 Two trees (1 and 2) and their strict consensus tree (after Felsenstein, 2004). 83

Figure A.8 Three trees (1, 2 and 3) and their majority rule consensus tree (after Felsenstein, 2004). 83

Figure A.9 Two trees (1 and 2) and their Adams consensus tree (after Felsenstein, 2004). 84

Index of Tables

Chapter 2

Table 2.1 Sources of materials used in species-level phylogeny (¹Robinson, 2004; ²Van der Bank et al., 2002; ³Rautenbach, unpublished; ⁴from this study). 15

Table 2.2 Sources of plant materials used in high-level phylogeny (¹Fay et al., 1998; ²Van der Bank et al., 2002; ³Van der Bank et al., unpublished; ⁴Rautenbach, unpublished; ⁵Motsi, unpublished). 23

Table 2.3 Sequences and source of primers used for PCR amplification and sequencing. 27

Table 2.4 PCR amplification conditions. 27

Table 2.5 Unamplified taxa. 27

Chapter 3

Table 3.1 Statistics from PAUP analyses of separate and combined data matrices. 28

Table 3.2 Number of steps, CI and RI for transitions (ts) and transversions (tv) for each of the three plastid genes and ITS. 29

Table 3.3 Informal clades identified in each of the four analyses (* not present in analysis). 29

Table 3.4 Bootstrap percentage (MP) and Posterior Probabilities (PP; MrBayes) for the monophyly of Passerina and for clades defined in Table 3.3. 29

Acknowledgements

I would like to express my gratitude to the following people for their support and assistance in the completion of this dissertation:

• Dr. M. van der Bank, for her valuable guidance throughout this study.
• My parents, for their support and encouragement.
• Most importantly, to God Almighty, for giving me strength.
