A phylogeny of the Rutaceae and a biogeographic study of its subfamily Aurantioideae

Thomas Schwartz
Supervisor: Bernard Pfeil

Degree project for Master of Science In Systematics and Biodiversity, Biology 60 hec

Department of and Environmental Science
University of Gothenburg

Abstract

The classification of Rutaceae is complex and has undergone several changes. In addition to morphological studies, phylogenetic inference using molecular data has also led to changes in classification. Thus far only chloroplast data and ITS have been used, sometimes combined with morphology, to infer the phylogeny. This study adds information from a low-copy nuclear gene to test the existing phylogenetic hypothesis using a species tree framework. A biogeographic study was also performed on the subfamily Aurantioideae. A pilot study looked at the choice of genes, followed by testing and evaluation of several methods for extraction of Rutaceae DNA. Thereafter, a new method for efficient separation of alleles and paralogues was examined. The sequences obtained were analysed for recombination, positive selection and hybridisation. Trees for three loci (chloroplast, nuclear HYB and MDH) were made using MrBayes and BEAST, and a species tree was constructed with *BEAST. The *BEAST species tree was used as a template for a biogeographic study with the Lagrange geographic range likelihood analysis. A Bayesian biogeographic study was also performed using a Bayesian discrete biogeographical model (an addition to BEAST). The results were then compared with previous studies, corroborating some and rejecting others.

Sammanfattning (Summary)

The structure of the Rutaceae family is complex and changeable. In addition to morphological studies, phylogenetic studies have been used to bring order to it. So far only chloroplast genes and ribosomal DNA have been examined, in a few cases in combination with morphological characters. This study adds nuclear DNA from low-copy genes as well as a biogeographic study of its subfamily, Aurantioideae. A pilot study examines how useful the chosen genes are. This is followed by an examination of several methods for DNA extraction. Thereafter, a new method for separating alleles and paralogous genes is tested. The gene sequences obtained are analysed for recombination, positive selection and hybridisation. Phylogenetic trees for the different genes are constructed with MrBayes and BEAST. *BEAST is used to build a joint species tree. This species tree then forms the backbone of a study in which the program Lagrange calculates the likelihood of how the geographic range has developed. A Bayesian biogeographic analysis is also performed with BEAST. Finally, the results are compared with previous research, with some studies being corroborated and others rejected.

Background

Rutaceae is a varied and widespread family of mainly tropical trees and shrubs (Scott et al., 2000). At least a few of its approximately 170 genera can be found on every continent except Antarctica. It is, however, not represented north of the Alps or in Canada, and is only poorly represented in Europe, central Asia and North America outside the tropics. The most widely recognised subfamily, the Aurantioideae, includes all the Citrus species (Mabberley, 1998). Other genera of economic interest are Pilocarpus, Boronia, Choisya, Poncirus and Skimmia (Chase et al., 1999).

As can be seen in the table below, the phylogenetic history of Rutaceae is varied and uncertain. While it is recognised that the family, here broadly delimited (i.e., including Cneoraceae and Ptaeroxylaceae), is monophyletic (Chase et al., 1999, Gadek et al., 1996), the internal classifications are contested (Scott et al., 2000). The first systematic treatment of Rutaceae was made by Engler et al. (1887), who divided the family into 6 subfamilies: Aurantioideae, Dictyolomoideae, Flindersioideae, Rutoideae, Spathelioideae, and Toddalioideae. These are further subdivided into a total of 25 tribes. Engler based his classification mainly upon and fruit morphology (Chase et al., 1999). Of these subfamilies, the Aurantioideae is strongly supported as a monophyletic group (Samuel et al., 2001, Bayer et al., 2009).

Engler remains the main authority on Rutaceae and the starting point of phylogenetic work on the family (Scott et al., 2000, Groppo et al., 2008), despite more recent work. Subfamilies other than the Aurantioideae have been rearranged, generally by merging them to form larger subfamilies, as explained below. Different classifications have been suggested since the work of Engler, and many of the papers cited here have suggested changes. One classification that takes these suggestions into account is found at the Germplasm Resources Information Network (GRIN). A list of the papers whose conclusions are accepted can be found at the GRIN website. Some new findings that have not been incorporated include parts of the tribe-level conclusions of Morton (2009). I will use both the GRIN and the Engler classifications as a base for this study, with some additional support from Swingle (Swingle and Reece, 1967) for the Aurantioid subfamily. The following table [Table 1] shows the classifications mentioned above.

Previous studies have been mainly based on plastid genes (broadly defined), including rbcL, rps16, trnL-trnF, atpB-rbcL (Gadek et al., 1996, Chase et al., 1999, Scott et al., 2000, Groppo et al., 2008). Some, however, have also used nuclear genes, ITS1 and ITS2 in this case (Morton, 2009, Poon et al., 2007). As Morton's (2009) analysis focused on Aurantioideae and Poon's (2007) on Rutoideae and Toddalioideae, a family-wide phylogeny using the nuclear genome is lacking.

Rutaceae Rutaceae GRIN Engler GRIN Engler Swingle Toddalioideae- Aurantioideae- Toddalioideae Toddalieae Aurantioideae Aurantieae Toddaliinae Citreae Acronychia Balsamocitrinae Citrinae Balsamocitrinae Halfordia Halfordia Aeglopsis Aeglopsis Skimmia Skimmia Afraegle Afraegle Toddalia Toddalia Balsamocitrus Balsamocitrus Phellodendron Phellodendron Aegle Aegle Aegle Amyridinae Citreae Citrinae Amyris Citrus Citrus Citrus Pteleinae Feronia Feronia Feronia Ptelea Ptelea Feroniella Feroniella Feroniella Brombya Limoniinae Oriciinae Atalantia Atalantia Atalantia Rutoideae- Xanthoxyleae Citropsis Citropsis Citropsis Evodiinae Naringi Naringi Hesperethusa Bouchardatia Bouchardatia Paramignya Paramignya Paramignya Evodia Evodia Pleiospermium Pleiospermium Pleiospermium Fagara Fagara Severinia Severinia Severinia Geijera Geijera Triphasia Triphasia Triphasia Melicope Triphasiinae Orixa Orixa Monanthocitrus Monanthocitrus Sarcomelicope Sarcomelicope Wenzelia Wenzelia Zanthoxylum Zanthoxylum Burkillanthus Burkillanthus Boninia Swinglea Swinglea Choisyinae Microcitrus Microcitrus Choisya Choisya Poncirus Poncirus Dutaillyea Dutaillyea Clymenia Clymenia Medicosma Medicosma Oxanthera Oxanthera Decatropidinae Pamburus Pamburus Megastigma Clauseneae Lunasiinae Merrilliinae Merrilliinae Lunasia Lunasia Merrillia Merrillia Pitaviinae Clauseniae

Rutoideae-Ruteae Murraya Murraya Murraya Dinosperma Clauseneae Dictamninae Bergera Bergera Dictamnus Dictamnus L. Clausena Clausena Clausena Glycosmis Glycosmis Glycosmis Pitaviaster Micromelinae Micromelum Micromelum Micromelum Flindersioideae- Rutoideae-Boronieae Flindersioideae Flindersieae Boroniinae Flindersia Flindersia Myrtopsis Myrtopsis Rutoideae-Ruteae Boronieae Rutoideae Rutinae Acradenia Ruta Ruta Boronia Boronia Dictyolomatoideae Zieria Spathelioideae -Dictyolomateae Cneorum Eriostemoninae (CNEORACEAE) Asterolasia Asterolasia Dictyoloma Dictyoloma Harrisonia Eriostemon (SIMAROUBACEAE) Neochamaelea Phebalium (CNEORACEAE) Spathelioideae- Philotheca Spathelieae Nematolepidinae Nematolepis Correinae Correa Rutoideae-Cusparieae Diplolaeninae Diplolaena Diplolaena Cusparieae Cuspariinae Erythrochiton Erythrochiton Ravenia Pilocarpinae Pilocarpus Pilocarpus Nycticalanthus Rutoideae-Diosmeae Diosmeae Diosminae Agathosma Empleurinae Empleurum Empleurum Calodendrinae Calodendrum Calodendrum Sheilanthera lacks subfamily and tribe Boninia

Table 1: The classifications of Rutaceae made by Engler 1986, GRIN 2010, and Swingle's subfamily Aurantioideae classification. The table includes those genera that have been part of this thesis in either analysis or extraction. Family, subfamily and tribe are underlined. The tribes are in italic.

Single-gene phylogenies can be misleading (i.e., fail to track the species phylogeny) because of selection (e.g., Stefanović et al., 2009), mistaken orthology (e.g., Straub et al., 2006), lineage sorting (e.g., Avise et al., 1983), hybridisation (e.g., Cronn and Wendel, 2004) and recombination (e.g., Sanderson and Doyle, 1992). Therefore, sequence data from more than one gene, and especially from more than one linkage group, are the basic information required to infer species trees accurately (Edwards et al., 2007). Low-copy nuclear genes (those perhaps least subject to concerted evolution) may provide the best source of additional genes not linked to the chloroplast genome to complement existing phylogenetic evidence in Rutaceae.

The aim of this paper, then, is to achieve a better estimate of the phylogeny for Rutaceae, built on the low-copy nuclear genes HYB (beta-carotene hydroxylase) and MDH (malate dehydrogenase).

Species of interest

The main objective of this paper is to test the classifications proposed for the Rutaceae family, chiefly the subfamilies and tribes found in Engler et al. (1887) and GRIN. To do this successfully, several species and genera from each subfamily have to be obtained. Each subfamily should be sufficiently represented to form a clade in a final tree, which requires at least two individuals. Larger subfamilies need a wider representation than small subfamilies. Further, attempts were made to connect with previous work on the group by finding samples of the same species used before, in an effort to test those earlier studies. Suggestions gathered from these papers (Chase et al., 1999, Groppo et al., 2008, Poon et al., 2007, Pfeil and Crisp, 2008, Morton et al., 2003, Morton, 2009, Salvo et al., 2008, Mole et al., 2004, Ling et al., 2009, Samuel et al., 2001, Scott et al., 2000, Bayer et al., 2009) were influential in compiling an ideal species sample list. Many of these papers have looked at samples from the Aurantioideae subfamily, but also at the Toddaliinae and the Australian part of Boronieae. The Aurantioideae are also good points of reference when working through the wider Rutaceae, considering that many genomic resources, including primers for several low-copy nuclear genes, are available for the Aurantioid genus Citrus. Other genera that have been extensively studied are Ruta, Skimmia and Zanthoxylum.

Other questions that may be answered by this study include the suggestion by Chase et al. (1999) to look into the relationship between Flindersia and Chloroxylon. There is also a group of genera that have been moved out of Rutaceae. These include Cedrelopsis and Ptaeroxylon, moved to Ptaeroxylaceae; Cneorum and Neochamaelea, moved to Cneoraceae; and Harrisonia, moved to Simaroubaceae.
Some genera have also been placed outside of both subfamily and tribe in the GRIN classification and are therefore phylogenetic orphans in search of a home clade. These are Ivodea, Kodalyodendron, Megastigma, Tractocopevodia and Pseudoisma. Kodalyodendron is endemic to Cuba and, since I have found no contemporary references to it, possibly extinct.

Geographical concerns

A study that wishes to discuss speciation and the historical spread of species must decide whether to take geology, geography and climate into account, and how to do so. The present continental settings relevant here were more or less formed between 40 and 20 Ma ago (Irving, 1983). India had begun its collision with Eurasia at about 50 Ma, and the archipelago between Eurasia and Australia started forming around 40 Ma (Daly et al., 1991), a process that reached its conclusion when the Australian and Philippine plates met at 25 Ma (Baillie et al., 2004). The fossil-calibrated age of the Aurantioideae of around 20 Ma (Pfeil and Crisp, 2008) is therefore comfortably younger than any of the continental movements that might otherwise be related to the spread of the Aurantioids. All of the Aurantioids live south or east of the Himalayas, which suggests that vicariance through mountain formation is not an issue. The rise and fall of islands between Malaysia and Australia could possibly be related to the speciation process, even though this would probably require the simultaneous disappearance of archipelagos, which does not seem to have happened. The considerations mentioned above justify leaving major geological changes out of this study.

There is also the climate aspect to consider. Studies have shown that the Sahara desert has varied between its present dry state and tropical xerophytic -land (Pound et al.) or woodland (Pickford et al., 2006) around 11 Ma. More recently there have been green periods in parts of the desert during the archaeological time frame (Claussen and Gayler, 1997, Kuper and Kröpelin, 2006). What is known, then, is that the climate changes greatly, even during relatively short periods of time. Such climatic shifts would regularly open and close dispersal routes between Africa and Asia, and quite likely within Australia as well. However, the difficulty of simulating even the present weather conditions accurately, let alone those from thousands of years back (de Noblet-Ducoudré et al., 2000), means that trying to find the patterns and link them to phylogenetic work would surely be a task worthy of Sisyphus.

Pilot Project

Methods

To examine the utility of the HYB and MDH genes for a Rutaceae phylogeny, a pilot test was performed. Samples belonging to 13 different genera (Orixa sp., Phebalium squamulosum, Ptelea trifoliata, Tetradium daniellii, Zanthoxylum bungeanum, “Skimmia sp.”, Ruta graveolens, Phellodendron japonicum, Correa decumbens, Dictamnus albus, Ailanthus altissima, Erythrochiton brasiliensis and Murraya paniculata) grown in the Gothenburg Botanical Garden were collected and dried in silica gel. They were extracted using OMEGA Bio-Tek's EZNA Plant DNA MiniPrep kit according to the manufacturer's instructions. The resulting DNA was then amplified through PCR using a HotStartTaq DNA polymerase mix with the following volumes per sample: 10x buffer, 2.5 µl; MgCl2, 0.5 µl; dNTP, 0.5 µl; primer 1, 1 µl; primer 2, 1 µl; HotStart, 0.125 µl; H2O, 19.4 µl. The reactions were run using the following scheme: 95 °C: 5 min; 94 °C: 0.30 min; 54 °C: 0.30 min; 72 °C: 1 min; 72 °C: 5 min; 4 °C: hold. The samples were amplified for HYB with the F120 and R635 primers and for MDH with the F1 and R1 primers (Table 2). The samples were cleaned with a QIAquick spin purification procedure, and their DNA quality and concentration were assessed with a Pharmacia Genequant II before being sent to Macrogen (Korea) for sequencing.
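The per-sample PCR volumes listed above can be scaled to a master mix for a batch of reactions. The following is a minimal sketch only: the function name and the 10% pipetting overage are assumptions made for illustration, and the template DNA volume is left out because the text does not state it.

```python
# Per-reaction volumes in microlitres, copied from the list in the text.
PER_REACTION_UL = {
    "10x buffer": 2.5,
    "MgCl2": 0.5,
    "dNTP": 0.5,
    "primer 1": 1.0,
    "primer 2": 1.0,
    "HotStartTaq": 0.125,
    "H2O": 19.4,
}


def master_mix(n_samples, overage=0.10):
    """Scale the per-reaction volumes to n_samples reactions.

    The default 10% overage for pipetting loss is an assumption of this
    sketch, not a value taken from the thesis.
    """
    factor = n_samples * (1.0 + overage)
    return {name: round(vol * factor, 3) for name, vol in PER_REACTION_UL.items()}


if __name__ == "__main__":
    mix = master_mix(13)  # 13 garden samples in the pilot test
    for name, vol in mix.items():
        print(f"{name:12s} {vol:8.3f} ul")
    print(f"{'total':12s} {sum(mix.values()):8.3f} ul")
```

Keeping the recipe in a single dictionary makes it easy to check that the listed components add up to roughly 25 µl per reaction before template is added.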

Nine of the locally extracted specimens (Orixa sp., P. trifoliata, T. daniellii, Z. bungeanum, “Skimmia sp.”, R. graveolens, C. decumbens, D. albus, and M. paniculata) resulted in successful amplification of MDH and two (D. albus, M. paniculata) in amplification of HYB. Additional amplifications were performed using previously extracted DNA, mainly from Aurantioideae species (Citrus gracilis, Citrus wintersiae, Choisya ternata, Glycosmis mauritiana, Glycosmis trichanthera, Merrillia caloxylon, Atalantia monophylla, Clausena harmandiana, Paramignya lobata, Bergera koenigii, Monanthocitrus cornuta, Murraya paniculata, Aegle marmelos, Afraegle paniculata, Pamburus missionis, Balsamocitrus dawei, Aeglopsis chevalieri, Melicope elleryana, Philotheca deserti). A total of 17 samples (A. marmelos, A. chevalieri, A. monophylla, B. dawei, B. koenigii, C. ternata, C. gracilis, C. harmandiana, G. mauritiana, M. caloxylon, M. elleryana, M. cornuta, M. paniculata, P. missionis, P. lobata, and P. deserti) were amplified for the HYB gene. Sixteen of them (excluding A. paniculata) were found to be of sufficient quality to sequence, and 14 (excluding C. ternata and G. mauritiana) were successfully sequenced. The results from Bergera were of uncertain quality.

Alignments of MDH and HYB were made using Muscle (www.ebi.ac.uk/Tools/msa/muscle/), then manually edited and corrected in BioEdit. MDH was represented by 24 samples in 14 genera, whereas HYB was represented by the 13 specimens mentioned above, excluding Bergera. Two of the MDH samples were designated as outgroups, whereas HYB had a Citrus mRNA sequence added from GenBank (AY623047). Using PAUP* (Swofford, 2002), the number of parsimony-informative characters was determined with the command CStatus. Distances between the sequences were calculated using the command DScores with default settings. The HYB data were examined with two different alignments: the first was the full alignment, whereas the second excluded areas containing lengthy offset sections that were difficult to align, as well as the last 200 bases, which were also difficult to align. Every species examined had two sequences for the HYB gene, with the correct assignment of polymorphisms to alleles unknown. The alleles were compared within the clades as discussed below.
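To make the two alignment statistics used above concrete, the following sketch counts parsimony-informative sites and computes uncorrected pairwise distances for a toy alignment. It is an illustration only and does not reproduce PAUP*: the function names and the toy sequences are hypothetical, gaps, `?` and `N` are simply skipped, and PAUP*'s handling of ambiguity codes and its default distance settings may differ.

```python
from collections import Counter
from itertools import combinations

GAP_OR_MISSING = set("-?N")


def is_parsimony_informative(column):
    """True if at least two character states each occur in two or more sequences."""
    counts = Counter(c for c in column.upper() if c not in GAP_OR_MISSING)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def count_informative(alignment):
    """Count parsimony-informative columns in a dict of equal-length sequences."""
    seqs = list(alignment.values())
    return sum(is_parsimony_informative("".join(col)) for col in zip(*seqs))


def p_distance(a, b):
    """Proportion of differing sites, skipping positions with gaps or missing data."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x not in GAP_OR_MISSING and y not in GAP_OR_MISSING]
    if not pairs:
        return float("nan")
    return sum(x != y for x, y in pairs) / len(pairs)


def mean_pairwise_distance(alignment):
    """Average uncorrected distance over all pairs of sequences."""
    dists = [p_distance(a, b) for a, b in combinations(alignment.values(), 2)]
    return sum(dists) / len(dists)


if __name__ == "__main__":
    toy = {"s1": "ACGTA", "s2": "ACGTT", "s3": "ATGCA", "s4": "ATG-T"}
    n_pi = count_informative(toy)
    length = len(next(iter(toy.values())))
    print(f"{n_pi} of {length} sites are parsimony informative")
    print(f"mean p-distance: {mean_pairwise_distance(toy):.3f}")
```

A column with one variant found in a single sequence is variable but not parsimony informative, which is why the informative percentage is always at most the variable percentage.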

Results and Discussion

The analyses showed that the percentage of parsimony-informative bases was sufficient for both genes. The MDH gene had 160 parsimony-informative (p.i.) characters out of a total of 1163 (14%) in this alignment. The two HYB alignments did even better: the limited HYB alignment, which included the exons and the stable intron regions, had 287 p.i. characters out of a total of 1162 characters (25%), while the full HYB alignment had 431 p.i. characters out of a total of 2126 characters (20%). The distances were generally low for all alignments: MDH had an average of 6%, the limited HYB alignment an average of 9% and the full HYB alignment an average of 11%. Melicope, which (together with Philotheca for the HYB alignments) was the only non-Aurantioid in this alignment, stood out in all three alignments. It scored 11 to 14 percentage points above the average for MDH, 5 percentage points above the average Aurantioid for the limited HYB alignment and between 3 and 6 percentage points above average for the full HYB alignment. Melicope stood above average against Philotheca as well in the limited HYB comparison, although not as markedly as against the other species. The full HYB alignment had some further high scores that were lacking in the other alignments. One notable difference was an increase in distance levels for Atalantia and Murraya, especially against Paramignya. Here, one of the two proposed alleles of Atalantia showed a 5% difference in its distance to Paramignya compared with the other proposed allele. That allele also stood out in comparison with the clade containing Aegle, Aeglopsis and Balsamocitrus. Murraya generally got higher distance scores than Philotheca. Overall, it appeared likely that both of these genes would provide suitable information for a phylogenetic analysis of the family.

Main Project

METHODS

Species were initially sought according to the criteria mentioned above. However, the main criterion for species selection was that fresh material is better than dry material, which in practical terms limited species selection to those grown in botanical gardens. The herbarium material gathered at the Missouri Botanical Gardens herbarium at TROPICOS®, as well as the silica gel collections at the Natural History Museum in Paris, were investigated; however, requests were made too late for the material to be used in this project. A list of acquired species can be found under Tables and Trees in an appendix.

Sampling and Processing

The first group of samples, which were all gathered in the Gothenburg Botanical Garden, were extracted using (1) an Omega Bio-Tek EZNA SP Plant DNA Miniprep kit according to the manufacturer's protocol (Ext23). The kit was thereafter used for extracting 12 specimens from the Herbarium GB (Ext24).

CTAB I (2)

To get a better result with fragile herbarium specimens, as well as with those fresh samples that had resisted kit extraction, a manual CTAB extraction was performed (Ext25). Seven of the herbarium samples were extracted using a modified CTAB protocol together with four fresh leaf samples. The modifications included adding PVP-40, DIECA and ascorbic acid to the standard CTAB recipe (Doyle and Doyle, 1990). After drying, but before being re-suspended in TE, the samples were treated with proteinase K in a 1 mM TRIS solution.

Searching for extraction processes

After the poor result from the standard CTAB protocol mentioned above, a number of earlier studies regarding DNA extraction chemistry were reviewed (Kreader, 1996, Arif et al., 2010, Alzate-Marin et al., 2009, Crouse, 1987), together with extraction protocol review articles (Drábková et al., 2002, Ribeiro and Lovato, 2007) and other articles dealing with extractions from herbarium material (Rogers and Bendich, 1985, Puchooa and Khoyratty, 2004). Ribeiro and Lovato's article mentioned successful extraction from herbarium material. It also mentioned successful extraction from Flindersia and Melicope, two of the Rutaceae genera that I aimed to extract. The original article (Scott and Playford, 1996) was found, and the ingredients that differed from the original protocol were located or bought. Preparing most of the buffers caused no problems. However, lacking a 5 M NaCl solution and failing to make one myself, I used the same CTAB solution as in the previous extraction for the first Scott and Playford (S&P) extraction (Ext26) and changed the RNase purification step to a proteinase K purification (adding 100 µl 1 mM CaCl2 and 2.5 µl proteinase K, incubating for 15 minutes and deactivating at 95 °C for 10 minutes). The extraction followed the protocol in all other respects, including grinding the samples in sand. The following extraction (Ext26b) instead ground the samples in liquid nitrogen. The resulting pellets had a much too gelatinous consistency, and the following extractions therefore aimed at reducing this problem.

The two articles (Ribeiro and Lovato, 2007; Scott and Playford, 1996) presented the protocol in different ways regarding how specimens should be prepared. Therefore both liquid N2 grinding and room-temperature sand grinding were tried and compared in the next step (Ext26c), which also used a 2 M NaCl CTAB solution in which all other ingredients followed the recipe. Silica-gel-dried samples of four species growing in the botanical garden were extracted. A different phenomenon was noted regarding the EtOH precipitation: washing with 1 volume of EtOH showed generous amounts of precipitate as long as the EtOH remained in a separate phase from the extraction supernatant, but the precipitate re-entered solution once the tubes were mixed. These observations, together with generally weak results in 260/280 absorbance tests and remaining concerns regarding gelatinous pellets (see results), led to further study of the theory behind the process.

Going back to the chemistry, further adjustments to the CTAB solution were made. I identified the purpose of each ingredient in DNA extraction and compared the recipe with the listed ingredients of the kit I had used for my first extraction. Concluding from basic extraction tutorials found on the internet (e.g. basic extraction demos where adding NaCl to ground meat produces extractable DNA) that a certain level of NaCl was necessary for successful DNA extraction, I realised that the final NaCl concentration was diluted by 1/3 through adding 1 ml CTAB to 0.5 ml wash buffer.

The standard CTAB protocol uses 2 M NaCl, which means that at least 3 M NaCl is necessary for successful extraction through the Scott and Playford protocol. I came to the conclusion that NaCl was the key and made a new attempt at the CTAB solution, first making a 3 M NaCl solution and then adding NaCl until I approached 4 M NaCl, although not all of the salt in the latter entered solution. Ext26d used the 4 M NaCl CTAB solution and N2 grinding. It also tested the effect of using the initial extraction buffer twice, and the CIA cleaning was performed twice, as is common in CTAB extractions. Everything went well until the drying step after the EtOH precipitation. Despite staying in the fume hood overnight, drops were visible on the inside of each tube. I assumed that substances which should have been removed in the cleaning step had remained and prevented drying, and thus the EtOH step was repeated. I concluded that both the NaCl and EtOH concentrations needed to be sufficiently high for successful extraction, since the NaCl concentration was instrumental in producing pellets that were not encapsulated in gelatinous substances and the EtOH concentration was required for a real precipitation to occur.

The Ext26e extraction tested whether precipitating in 2 or 3 volumes of EtOH would give the best results. The leaves were ground in N2, and several of the samples (i.e. 3, 4, 5 and 6, which were Phellodendron samples) had a consistency approaching that of syrup in the extraction buffer. All samples produced pellets given enough time in the centrifuge (15 minutes proved too short to form pellets). After drying overnight, the tubes had what looked like dried salt patches around the tube mouths. (I have later realised that what I took to be salt may well have been DNA.) They were therefore washed again in EtOH.
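The dilution argument above (1 ml of CTAB buffer added to 0.5 ml of wash buffer, with 2 M NaCl as the target) can be checked with a short calculation. This is a minimal sketch assuming that the 0.5 ml already in the tube contributes no NaCl; the function names are illustrative.

```python
def final_concentration(stock_molar, stock_volume_ml, other_volume_ml):
    """C_final = C_stock * V_stock / (V_stock + V_other).

    Assumes the other liquid contains none of the solute.
    """
    return stock_molar * stock_volume_ml / (stock_volume_ml + other_volume_ml)


def required_stock(target_molar, stock_volume_ml, other_volume_ml):
    """Stock concentration needed to reach target_molar after mixing."""
    return target_molar * (stock_volume_ml + other_volume_ml) / stock_volume_ml


if __name__ == "__main__":
    # A 2 M NaCl CTAB buffer is diluted to two thirds of its strength.
    print(f"2 M stock -> {final_concentration(2.0, 1.0, 0.5):.2f} M in the tube")
    # Stock needed to end up at 2 M after the same dilution.
    print(f"stock needed for 2 M: {required_stock(2.0, 1.0, 0.5):.1f} M")
    # The approx. 4 M solution that was eventually prepared.
    print(f"4 M stock -> {final_concentration(4.0, 1.0, 0.5):.2f} M in the tube")
```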

The following extraction (Ext26f) changed the direction of my extraction experiments, going back to an initial 1-volume precipitation followed by a lengthy (23 min) centrifugation step. Two of the samples (1, 4) formed pellets and were washed in 75% EtOH, whereas the remaining specimens were centrifuged for an additional five minutes. 500 µl of liquid was then removed from the samples and replaced with 96% EtOH before another 15 min of centrifugation. This resulted in two more tubes (3, 5) having pellets, which were promptly washed. The remaining two tubes had the liquid replacement step repeated. No pellet or DNA was ever recovered from these tubes.

Using liquid N2 for extraction purposes also proved to be disadvantageous. The Phellodendron samples consistently produced an extraction buffer with the consistency of syrup, and extractions where sand grinding had been used would still result in a thick extraction buffer if it was exposed to freezing temperatures in the lab freezer. This probably reduced the extraction efficiency, which was later confirmed by K.D. Scott through personal communication.

Ext27 was performed to determine whether the white cloud that forms in the 96% EtOH phase when it is added in the precipitation step is DNA. The precipitation was performed in three steps. First, cold EtOH was carefully added, making sure that it did not mix with the supernatant from the CIA cleaning step. The EtOH was thereafter removed to a new tube. A second portion of EtOH was added. After a 30 minute incubation in the freezer, this EtOH phase was also removed to a new tube. The two tubes with EtOH and the tube containing the remaining supernatant were then pelleted.

The next extraction (Ext28) tested whether more careful removal of the supernatant following the CIA cleaning step would provide better results. Usually the samples had a white layer between the CIA and the supernatant and a green slimy layer floating on top of the supernatant. Therefore two supernatants were gathered from each sample. First the top 200 µl of the supernatant was removed to the “b” tube, followed by 500 µl being removed to the “a” tube. Then the remaining supernatant was added to the “b” tube. The tubes were initially precipitated in 2 volumes of cold 96% EtOH. All samples except number 4 failed to provide pellets, instead producing a lower transparent phase near the bottom of the tube. 800 µl of the upper phase was replaced with 70% EtOH in each of these samples. They were then centrifuged again, resulting in large pellets.

The final extraction (Ext29) was mainly a last attempt to add species to the list of obtained sequences. It implemented the two supernatant transfers as well as the adding and removing of an EtOH phase before proper precipitation. Unfortunately, I forgot to add the ammonium acetate at the beginning of the precipitation step, an oversight which caused a confused precipitation procedure. In the end, pellets were obtained for both supernatant phases as well as for the EtOH phase. Finally I arrived at the protocol below, which follows Ext28.

CTAB II (3)

A CTAB protocol designed for tropical species (Scott and Playford, 1996) was further modified and used as follows. Approx. 30–50 mg of leaf tissue was ground in 2 ml extraction buffer (50 mM Tris-HCl, pH 8.0; 5 mM EDTA; 0.35 M sorbitol; 0.1% bovine serum albumin [BSA] and 10% polyethylene glycol, mol wt 6000) with approx. 0.2 g sea sand. This mix was centrifuged at ≥ XX,000 g for 5 min, the supernatant was discarded and the pellets were re-suspended in 400 µl of wash buffer (50 mM Tris-HCl, pH 8; 25 mM EDTA and 0.35 M sorbitol), with 100 µl of 5% sarkosyl solution added immediately thereafter. The tubes were then incubated at room temperature for 15 min. 1 ml of CTAB buffer (0.05 M CTAB; 1 M Tris-HCl, pH 8.0; 0.5 M EDTA; approx. 4 M NaCl) was added and mixed by inversion. Scott and Playford's (1996) protocol lists 0.5 M CTAB at this step, but this is in error (K.D. Scott pers. comm.). The solution was then incubated at 55 °C for ≥ 30 minutes. Warming the combined solution for a few minutes makes the CTAB easier to mix with the previous buffer.

After incubation, the samples were centrifuged for 5 min, and 1 ml of supernatant was transferred to a new 2 ml tube. Chloroform:isoamyl alcohol (CIA; 1 ml) was then added to each tube and mixed by inversion 15 times. All tubes were then centrifuged for 1 min, and two 500 µl aliquots of supernatant from each tube were transferred to two new 1.5 ml tubes. Cold ≥ 96% EtOH (1 ml) was then added to each tube together with 50 µl of 7.5 M ammonium acetate. The tubes were then centrifuged for 20 minutes. At this point each tube had two clear phases: one large phase on top and one small phase with a more viscous consistency at the bottom. 800 µl of the upper phase was removed and replaced by an equal amount of 70% EtOH. After further centrifugation (15 min), the DNA formed pellets. These were washed with 70% EtOH before being dried. Each pellet was then re-suspended in 60 µl 10 mM TRIS (pH 8).

Primer      Sequence
HYB_F120    CTGCCGTCATGTCTAGTTTTGG
HYB_R635    GAAAGAGCCCATATGGAACACC
MDH_F1      GCTCCTGTGGAAGAGACCC
MDH_R1      GCTCCAGAGATGACCAAAC

Table 2: The primers used in this study.

All of the samples were tested in a standard PCR procedure based on QIAGEN's HotStartTaq set. The main primers are HYB F120 and R635 and MDH F1 and R1. Other primers were tested but were not used for any of the sequences in this paper. Most of the Aurantioideae samples gave positive bands when run in a 1% Fermentas agarose gel. A few of the other samples tested also gave positive agarose bands, but most of them did not. Both a 10% dilution and an increased concentration of the template DNA were tested without success. Samples that gave faint but visible bands in the standard PCR gave strong bands when rerun for another 30 cycles with new dNTP and MgCl2. This also produced secondary bands, making gel separation necessary for further sequencing of the genes.

Temperature Gradient Gel Electrophoresis (TGGE)

There were notable levels of polymorphism and occasional length variation in the HYB genes sampled. The TGGE method, described by Myers et al. (1985) and developed for separating paralogues of single copy genes in polyploid species by Töpel (2010), was used to identify the gene variants. Special GC-rich anchor primers were developed and DNA fragments with a GC tail at one end were produced. The first gel, designed to identify a suitable melting temperature, used Afraegle DNA. It was determined that 34-39 °C would be a good temperature range. However, in the first parallel gel where this temperature range was used, most of the sampled species ended up outside their ideal melting range. A second parallel gel with a 31-37 °C temperature range gave good bands for all sampled specimens. The bands were cut out of the gel using fresh, sterile scalpel blades, with a new blade for each gel slice. The gel slices were put in PCR tubes together with 50 μl of TE. The tubes were then held at 95 °C for 20 min in a PCR machine to release the DNA from the gel. The contents were then used as template for new PCR runs, where the original primer was used on the side of the GC-primer and an inner primer was used at the other end.
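The role of GC content in primer design, which underlies the GC-rich anchor primers mentioned above, can be illustrated with the primers of Table 2. The following is a minimal sketch, not the procedure used in this study: the Wallace rule (2 °C per A/T, 4 °C per G/C) is only a rough rule of thumb for short oligonucleotides, and the clamp sequence built by `add_gc_clamp` is hypothetical, since the actual anchor sequences are not given here. The values also cannot be compared directly with the TGGE gel temperatures.

```python
# Primer sequences copied from Table 2.
PRIMERS = {
    "HYB_F120": "CTGCCGTCATGTCTAGTTTTGG",
    "HYB_R635": "GAAAGAGCCCATATGGAACACC",
    "MDH_F1": "GCTCCTGTGGAAGAGACCC",
    "MDH_R1": "GCTCCAGAGATGACCAAAC",
}


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def add_gc_clamp(primer, clamp_length=30):
    """Prepend a HYPOTHETICAL alternating GC tail (illustration only)."""
    clamp = ("GC" * clamp_length)[:clamp_length]
    return clamp + primer


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        clamped = add_gc_clamp(seq)
        print(name, len(seq), "nt",
              "GC=%.2f" % gc_fraction(seq),
              "GC with clamp=%.2f" % gc_fraction(clamped),
              "Wallace Tm=%d" % wallace_tm(seq))
```

The clamped fragment always has a higher GC fraction than the bare primer, which is the property a GC tail relies on: one end of the fragment stays double-stranded while the rest melts in the gradient.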

The sequencing of standard PCR products was performed by Macrogen and the individual sequences were then assembled with BioEdit (Hall, 1999). Most (22/27) of the sequences had polymorphisms. Polymorphic sequences needed to be separated into the correct allelic or copy phase in order to analyse them. As we were not successful with the TGGE method, we instead assigned the polymorphisms detected as overlapping peaks in the sequence trace files as follows. The alleles of species previously inferred to be sister for cpDNA regions (Bayer et al., 2009) were compared with the polymorphic sequences to be phased, on the assumption that the gene sequences would share similarities in the nDNA regions used here. Then, all polymorphisms shared with the sister sample were assigned to one allele in the polymorphic sequence. In the case of Balsamocitrus, this resulted in one of its alleles being more closely related to Aegle than the other, although in all other cases the inferred alleles for an individual were sister in all analyses. This assumption should have no topological effect on phylogenetic inference (except for Balsamocitrus), although a slight over- or underestimation of terminal branch lengths might occur. The number of polymorphic sites within an individual's sequence ranged from nil to 20, which is generally low relative to the number of parsimony-informative (p.i.) sites across each gene. Clearly there is much uncertainty in this assumption, as the paralogues/alleles may have split from a common ancestral allele and may now be evolving separately in diverging directions.

The sequences were then aligned to each other, starting with a MUSCLE alignment followed by manual editing in BioEdit and Geneious (Drummond AJ, 2010). Parts of sequences for certain species, mainly the non-Aurantioid Melicope and Philotheca, which could not be reasonably aligned with the main body of species, were instead offset to avoid non-homologous comparisons.

DATA Analysis

MrBayes

The samples that were successfully sequenced and aligned were pooled with previous sequences prepared by Ramadugu et al. (in prep.). Chloroplast sequences by Bayer et al. (2009) were downloaded from TreeBASE and paired with the nuclear gene data. MDH and HYB alignments were prepared with Muscle. One set of alignments was further edited manually to decrease the amount of offsetting. Thus, two alignments for each gene were tested for models using ModelTest in PAUP and the Windows version of Modeltest 3.7 (Posada and Crandall, 1998). The alignment files were then prepared for MrBayes (Huelsenbeck and Ronquist, 2001; Ronquist and Huelsenbeck, 2003). Modeltest suggested TVM + Gamma as the best model for both genes and all four alignments. This model is intermediate in complexity between the GTR and HKY models in MrBayes. Files were therefore made to run MrBayes with both the GTR + Gamma and the HKY + Gamma models. MrBayes consensus trees were made using Toona as out-group (as per Bayer et al., 2009) for the MDH gene tree and Clausena as out-group for the HYB tree. While Clausena is part of the Aurantioid in-group in question, it has been found at its edges and was therefore deemed suitable as an out-group. Not specifying any out-groups would have made comparison between the two trees much more challenging.

Figure 1: The MrBayes tree of the HYB alignment. The genera are: Bergera, Clausena, Murraya, Merrillia, Pamburus, Paramignya, Afraegle, Aeglopsis, Balsamocitrus, Aegle, Monanthocitrus, Swinglea, Naringi, Atalantia, Poncirus, Citrus and Microcitrus.
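The statement that TVM lies between HKY and GTR in complexity can be made concrete by counting free substitution-rate parameters. The sketch below is an illustration under the standard definitions of these models (HKY: one transition and one transversion class; TVM: equal transition rates, four separate transversion rates; GTR: six separate rates). The dictionary layout and function names are my own and do not correspond to any MrBayes or Modeltest interface.

```python
# Each model is written as a partition of the six reversible substitution
# types into classes that share one rate. AG and CT are the transitions.
MODEL_CLASSES = {
    "HKY": [("AG", "CT"), ("AC", "AT", "CG", "GT")],
    "TVM": [("AG", "CT"), ("AC",), ("AT",), ("CG",), ("GT",)],
    "GTR": [("AC",), ("AG",), ("AT",), ("CG",), ("CT",), ("GT",)],
}


def free_rate_parameters(model):
    """Number of free relative rates (one class is fixed as the reference)."""
    return len(MODEL_CLASSES[model]) - 1


def is_nested(simpler, richer):
    """True if every rate class of `richer` fits inside a class of `simpler`."""
    return all(
        any(set(rich_class) <= set(simple_class)
            for simple_class in MODEL_CLASSES[simpler])
        for rich_class in MODEL_CLASSES[richer]
    )


if __name__ == "__main__":
    for model in ("HKY", "TVM", "GTR"):
        print(model, free_rate_parameters(model))
    print("HKY nested in TVM:", is_nested("HKY", "TVM"))
    print("TVM nested in GTR:", is_nested("TVM", "GTR"))
```

Because HKY is nested in TVM and TVM in GTR, running both GTR + Gamma and HKY + Gamma brackets the model selected by Modeltest from above and below.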

Tests for intra-genic incongruence

The trees inferred for each gene were not exactly the same. As there can be multiple causes of incongruence, several methods were used with the nuclear genes to examine possible causes. Recombination within the sequence alignments would skew tree-building efforts such as the one by *BEAST. Therefore the RDP (Recombination Detection Program) (Martin, Lemey et al., 2010) was run using all available scanning methods (they are listed in the bibliography) to identify potential recombination events. For the Bootscan and SiScan methods, different window sizes such as 150, 200 and 250 were tried.

Positive selection could affect phylogenetic inference via convergence at non-synonymous sites. To identify potential cases of positive selection, the data were examined with the Phylogenetic Analysis by Maximum Likelihood (PAML) (Yang, 2007) program package. Since PAML measures the ratio of nucleotide changes that affect amino acid changes, it requires that the alignments are in codon order. The easiest way to do this is to reduce the alignments to exon-only datasets. With the HYB data, three of the species (Atalantia ceylonica, Bergera and Paramignya lobata) had insufficient length and had to be removed for this test. For the MDH data, the two Skimmia alleles were too short and were removed. The yn00 program (Yang & Nielsen, 2000) was used to “estimate synonymous and non-synonymous substitution rates (dS and dN) in pairwise comparisons of protein-coding DNA sequences”, and the results were examined. Some species from the MDH gene alignment were further examined in SplitsTree (Huson and Bryant, 2006), where the network was made using the full sequence and compared with the exon-only network. The latter was examined as a whole and also using only 1st and 2nd codon positions versus 3rd codon positions.

Figure 2: MrBayes tree of the MDH alignment. The genera are: Toona, Skimmia, Choisya, Zanthoxylum, Ruta, Flindersia, Micromelum, Bergera, Clausena, Glycosmis, Merrillia, Murraya, Pamburus, Paramignya, Triphasia, Wenzelia, Monanthocitrus, Afraegle, Aeglopsis, Aegle, Balsamocitrus, Pleisopermium, Burkillanthus, Feronia, Swinglea, Naringi, Citropsis, Severinia, Atalantia, Poncirus, Eremocitrus, Clymenia, Oxanthera, Microcitrus, Citrus, Microcitrus, Oxanthera, Citrus and Feroniella.

Hybridization tests

The last search for sources of incongruence involved finding evidence for possible hybridization. The method described by Maureira-Butler et al. (2008) was used. This and the following analysis were done on only the Aurantioideae clade of Rutaceae. Several of the non-Aurantioid alignments had proven to be of bad quality or too short in the previous tests. They were also generally difficult to align with the majority Aurantioid sequences and were therefore removed from the MDH and CP datasets before these were run in BEAST 1.6.1 (Drummond et al., 2002; Drummond & Rambaut, 2007). A GTR + Gamma model with a relaxed lognormal clock (Drummond et al., 2006), a UPGMA starting tree and a Yule process speciation tree prior was used. The root node was calibrated to 19.8 MA (12.1-28.2) after Pfeil and Crisp (2009). However, it was found that the posterior and likelihood values soon went to infinity. A new file was therefore prepared and run with the upper and lower bounds of the mean of the uncorrelated log-normal relaxed molecular clock (ucld) fixed to 0.1 and 0.000001 (equivalent to a clock rate of 10^-7 to 10^-12 substitutions per site per year).

Twenty trees from the end of each of the MDH and CP posterior distributions of trees were taken for further analysis (taking only every 5th tree to minimize dependency between them). These trees were then imported into Mesquite, where the time scale was converted to the number of generations. Two sets of files were made: one set was calculated for a 5 year generation time and a population size of 4 000 individuals, the other for a 50 year generation time and a population size of 40 000 individuals. Thereafter each of the 20 trees was used as a surrogate species tree for a simulation of 20 new trees using a coalescent model, where lineage sorting alone can produce differences among simulated gene trees. The real generation times and historical population sizes were unknown, and it was assumed that species representing 40 different genera would have a wide range of both generation times and historical population sizes. Therefore both the lower and higher ranges were examined.
The distances between each of the trees chosen from BEAST and its 20 simulated trees were then measured using the TreeDist command in PAUP* in order to construct null distributions. Each of the CP trees was also measured against each of the MDH trees; these distances are the observed distances between the gene trees, accounting for uncertainty in their inference. The test of Maureira-Butler et al. (2008) compares these distributions to determine whether the null hypothesis, that lineage sorting alone explains the gene tree incongruence, can be rejected. We assumed a generation time of 20 years, based on observations of the time to maturation in Citrus, although this is a somewhat arbitrary value because of the occurrence of clonal reproduction (via nucellar embryony) in some species (M.L. Roose, pers. comm.). Further, the mean ancestral effective population sizes of several Citrus species have been estimated to be around 4 000-4 500, based on the diversity found within three nuclear genes (Ramadugu et al., in prep.). Therefore, a simulation set with these characteristics (20 years and a population size of 4 000) was prepared and compared with the previous results. The 20 year generation time was also used with the population size of 40 000.
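The effect of the chosen generation time and population size on the simulations can be sketched with a simple unit conversion. The code below is an illustration only and is not the Mesquite procedure itself. It assumes that branch lengths are in millions of years and that one coalescent unit equals 2Ne generations (the usual convention for a diploid nuclear locus, which is my assumption here). The function names are hypothetical.

```python
def branch_length_in_generations(length_myr, generation_time_years):
    """Convert a branch length in millions of years to generations."""
    return length_myr * 1_000_000 / generation_time_years


def branch_length_in_coalescent_units(length_myr, generation_time_years,
                                      effective_pop_size):
    """Convert to coalescent units, ASSUMING 2*Ne generations per unit."""
    generations = branch_length_in_generations(length_myr,
                                               generation_time_years)
    return generations / (2 * effective_pop_size)


# (generation time in years, population size) combinations from the text.
SCENARIOS = [(5, 4_000), (20, 4_000), (20, 40_000), (50, 40_000)]

if __name__ == "__main__":
    for gen_time, pop_size in SCENARIOS:
        units = branch_length_in_coalescent_units(1.0, gen_time, pop_size)
        print("%2d years, Ne %6d: a 1 MA branch spans %.3f coalescent units"
              % (gen_time, pop_size, units))
```

Under these assumptions the same 1 MA branch is a hundred times shorter in coalescent units for the 50 year / 40 000 setting than for the 5 year / 4 000 setting. Shorter branches leave more room for incomplete lineage sorting, so the simulated gene trees differ more from each other and the null hypothesis becomes harder to reject.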

Thereafter, ten species were chosen which had been incongruently placed when comparing the CP and MDH trees generated by either BEAST or MrBayes. These were removed from the lower range tree distance analysis one at a time and their individual effect on the test statistic was determined. To account for the uncertainty in the data, the 95% credibility interval for the CP/MDH test statistic was obtained by discarding the ten highest and ten lowest values. The null distribution from the MDH simulated trees was closer to the observed distances than the CP null distribution and therefore produces a lower type I error. We used this null in the following calculations. The highest values of the 20/4 000 and the 20/40 000 distances for simulated MDH trees were compared with the lowest remaining value of the CP/MDH distances. This resulted in two numbers for each species deletion set. These numbers were compared with the results where no deletions had taken place: three of the species gave a 4 point difference and a further three gave a 2 point difference, whereas the remaining four had less effect on the test statistic and were not considered further. The six species having the largest effect (Bergera koenigii, Oxanthera sp., Feroniella oblata, Severinia buxifolia, Swinglea glutinosa and Clymenia polyandra) were identified for combined deletion tests. They were tested in several deletion series where they were removed one at a time, adding one removal to the preceding one until all six had been removed from the full alignment. This was done for the 20/4 000 and 20/40 000 simulation sets, and several combinations of removal order were tried. The size of the distance reduction between the simulations and the original trees remained the same as in the first test. Although we cannot be certain, the first three species may be of hybrid origin and were therefore excluded from the *BEAST tree-building analysis, because this analysis assumes no hybridisation.

*BEAST

Two *BEAST (Heled and Drummond, 2010) runs were performed. The first used all chloroplast and MDH sequences for Aurantioideae that we had available except the three mentioned in the previous analysis. The second used chloroplast, MDH and HYB sequences for all Aurantioideae species that were represented at least once by all three genes. Again, Bergera koenigii, the only probable hybrid species that was available for HYB, was excluded here. The *BEAST set-up file for the CP/MDH analysis used a coalescent starting tree, which is the default option, for CP and a UPGMA starting tree for MDH. All loci in the three-gene set-up used UPGMA starting trees. The CP data used a modified mitochondrial ploidy level, where the original value of 0.5 was changed to 1 to account for the assumption that these plants are all hermaphroditic and thus may inherit the plastid from either parent. The CP/MDH *BEAST file was divided into four files that each ran for 30 million generations, whereas the CP/MDH/HYB *BEAST file was divided into three files that each ran for 40 million generations, thus producing a total of 120 million generations for each set. The resulting tree files were examined in Tracer. After excluding a 10% burn-in, the trees generated by each *BEAST run were merged into files covering the full analysis. Maximum clade credibility trees were then made using TreeAnnotator v1.6.1.
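The bookkeeping behind splitting an analysis over several files and discarding a burn-in can be sketched as below. This is an illustration only: the tool used for the merge is not stated here (LogCombiner from the BEAST package performs this kind of operation), the sampling frequency is unknown, and the function names are hypothetical.

```python
def retained_generations(n_runs, generations_per_run, burnin_percent=10):
    """Generations left across all runs after removing the burn-in."""
    kept_per_run = generations_per_run * (100 - burnin_percent) // 100
    return n_runs * kept_per_run


def drop_burnin(samples, burnin_percent=10):
    """Discard the first burnin_percent of the samples from one run."""
    n_discard = len(samples) * burnin_percent // 100
    return samples[n_discard:]


def combine_runs(runs, burnin_percent=10):
    """Concatenate the post-burn-in samples of several independent runs."""
    combined = []
    for samples in runs:
        combined.extend(drop_burnin(samples, burnin_percent))
    return combined


if __name__ == "__main__":
    print("CP/MDH:     ", retained_generations(4, 30_000_000))
    print("CP/MDH/HYB: ", retained_generations(3, 40_000_000))
    # Toy sample lists standing in for the trees logged by two runs.
    toy_runs = [list(range(10)), list(range(10, 20))]
    print(combine_runs(toy_runs))
```

Both set-ups retain the same number of post-burn-in generations, so the two species trees rest on a comparable sampling effort.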

Lagrange Geographic Range Evolution

The species trees made by *BEAST were converted into Newick format and used as a base in a Lagrange biogeography analysis. Both the 2 gene tree and the 3 gene tree were used as a base for a Lagrange biogeography. Besides the dated Newick tree, Lagrange uses a species matrix text file as input. A 10 region master matrix was made including the regions in Table 3. The matrices used for the different species configurations were then adapted from this list. Lagrange further requests information on which regions neighbour each other.

Afr1    Southern and Western Africa
Afr2    Eastern Africa
Ind     India
Pam     Pamir (including the Indian provinces bordering the Himalayas from Nepal and west)
SOA     South-East Asia (Burma and east)
Sin     China
Aus     Australia
NCD     New Caledonia
Pap     Papua (coding for all islands between Malaysia and the Australian mainland)
Svh     The Pacific Islands

Neighbour matrix (columns in the order af1 af2 ind pam soa sin aus ncd pap; "-" marks the diagonal and each "1" a neighbouring region to the right of it):

af1   - 1
af2   - 1 1
ind   - 1 1 1
pam   - 1
soa   - 1 1
sin   - 1
aus   - 1 1
ncd   - 1
pap   -

Table 3: The Lagrange region codes followed by a table showing which areas were coded as neighbours to each other.

Lastly, Lagrange requests information on the likelihood of dispersal between the different areas. Two different examinations were made for each species tree. The first limited the number of areas as suggested by the Lagrange Configurator coding program. This run also assumed a dispersal probability of 1 for neighbouring regions and a probability of 0.5 for all other areas. Range limitations for the 3 gene tree included removing Ncd and Svh, which had no representatives among the species in this tree. Further, the Pam region was also excluded, as its species were also present in India. The second round included the Pam region. For the first 2 gene tree, all of Africa was made into one region, India and Pamir were joined, Australia and New Caledonia were joined, and Papua and the Pacific Islands were joined to each other. The second 2 gene tree analysis used the regions found in the neighbouring regions table. The second round of analysis also differed from the first in using a more finely scaled dispersal probability. Neighbouring regions still had a probability of 1, but a dispersal that jumped over one region now had a probability of 0.7 and one jumping over two regions a probability of 0.4. Jumping further than that had a probability of 0.1, except if the jump took place over land in the 2 gene case, when the dispersal probability was set to 0. In practice, this meant that nothing was regarded as likely to disperse directly between China and Africa, between Australia and Pamir, or between New Caledonia and any area except Australia, Papua or China.
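The scaled dispersal scheme of the second round can be sketched as a shortest-path calculation over the neighbour relationships. The code below is an illustration under stated assumptions: the adjacency dictionary is HYPOTHETICAL and deliberately simplified (it is not the neighbour table of this study), the rule that sets over-land long jumps to 0 is left out, and the output is a plain Python dictionary rather than a Lagrange input file.

```python
from collections import deque

# HYPOTHETICAL, simplified adjacency for illustration only.
NEIGHBOURS = {
    "afr": {"ind"},
    "ind": {"afr", "soa"},
    "soa": {"ind", "sin", "pap"},
    "sin": {"soa"},
    "pap": {"soa", "aus"},
    "aus": {"pap", "ncd"},
    "ncd": {"aus"},
}

# Steps between regions -> dispersal probability, as in the second round:
# neighbours 1, one region skipped 0.7, two regions skipped 0.4.
STEP_PROBABILITY = {1: 1.0, 2: 0.7, 3: 0.4}
FAR_PROBABILITY = 0.1


def steps_from(graph, start):
    """Breadth-first search: number of steps from start to each region."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    return distance


def dispersal_matrix(graph):
    """Dispersal probability for every ordered pair of regions."""
    matrix = {}
    for source in graph:
        distance = steps_from(graph, source)
        matrix[source] = {
            target: 1.0 if target == source
            else STEP_PROBABILITY.get(distance.get(target), FAR_PROBABILITY)
            for target in graph
        }
    return matrix


if __name__ == "__main__":
    matrix = dispersal_matrix(NEIGHBOURS)
    regions = sorted(NEIGHBOURS)
    print("     " + " ".join("%4s" % r for r in regions))
    for source in regions:
        print("%4s " % source
              + " ".join("%4.1f" % matrix[source][t] for t in regions))
```

Deriving the probabilities from the adjacency table in this way keeps the matrix symmetric and guarantees that it stays consistent with the neighbour coding when regions are merged or removed.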

BEAST Phylogeography

A phylogeographic analysis was run using BEAST (Lemey, Rambaut, Drummond & Suchard, 2009). For this analysis, the A-G rate in the GTR model was fixed to "1" by unchecking it as an operator in BEAUti, thus turning the model into the TVM model. Initial attempts to obtain a working BEAST phylogeny linked the MDH and CP datasets. Doing so consistently resulted in a lack of convergence for the prior, posterior, Yule birth rate, clock rates and tree probability. Therefore the two datasets were examined unlinked, producing separate output. This analysis was also run with the three possibly hybridised species removed. Two different biogeography model runs were set up. The first was a discrete phylogeographic analysis, which used area code names. The coded areas were:

ASIA_TROP_EAST    Southeast Asia
ASIA_TROP_WEST    Southwest Asia
ASIA_TROP         Southern Asia
AFRICA_W_T        Western Tropical Africa
AFRICA_E_T        Eastern Tropical Africa
AFRICA            All of Africa
ASIA              All of Asia
PAPUA             Islands between Australia and Malaysia
AUSTRALIA         Australia
CHINA             China
NEW_CAL           New Caledonia

The other was a continuous phylogeographic analysis that used latitude and longitude. These were chosen from the centre of each species' distribution area as identified by applying the GRIN native distribution range to Google Maps. The different coding used in BEAST compared with Lagrange is due to the requirements of each program: Lagrange can accept one species inhabiting several different areas whereas BEAST cannot.

RESULTS

The species table (Table 5) shows all species that have been a part of this study through sequences, PCR or extraction. They represent all subfamilies and tribes in the GRIN classification and all of Engler's subfamilies except Spathelioideae. They also cover 22 of Engler's 23 tribes. Several further species were also acquired although no work was done with them. Therefore the potential coverage of the Rutaceae has been good.

Figure 3: Electrophoresis gels, with the Ext24 gel to the left and the Ext25 gel to the right. The Ext24 gel has two ladders, where the upper one is a GeneRuler High Range 10000-48500 bp ladder and the lower is a FastRuler Middle Range 100-5000 bp ladder. Ext25 has only the High Range ladder, in the well at its bottom.

Sample Processing

A total of twelve extractions were done during this project, see Table 4.

Omega kit (1)

The first extraction (Ext23) used silica gel dried specimens of thirteen species, and nine of them resulted in successful PCR reactions. The second extraction (Ext24) used 12 herbarium specimens, each from a different species. One of the samples resulted in a successful PCR amplification.

CTAB I (2)

Ext25 included 11 samples, where the first four were fresh silica gel dried whereas the remaining seven samples were from the herbarium. All of these samples had 50% diluted extracts quantified by measuring the 260/280 nm absorbance ratio and DNA concentration using a Pharmacia Genequant II. The ratio was generally between 1.75 and 1.86 with concentrations between 65 and 125. One sample had a concentration of 5 with a 1.75 ratio while another sample had a concentration of 170 with a ratio of 1.57. Electrophoresis (due to inexperience, all electrophoresis results presented here for extractions were obtained after Ext27 and were thus unknown during the larger part of this process) showed that the fresh leaf samples and one herbarium sample were likely candidates for successful PCR. However, only two of the fresh samples gave weak positive results in the PCR.

CTAB II (3)

The first of the extractions exploring the Scott and Playford extraction protocol (Ext26) used 18 samples, where 17 were silica gel dried and one was a herbarium specimen. Most of the pellets were transparent and had a gelatinous appearance. Nine of them gave 260/280 absorbance results between 1.55 and 1.92. The lowest measured result was 1 and the highest was 6.99. Electrophoresis suggested that only three of the samples might give positive results, and no successful PCR results were produced.

Figure 4: Ext26, it has both the high and middle range ladders as described above.

Ext26b used six samples, of which five gave gelatinous pellets. The 260/280 absorbance results ranged between 1.35 and 2.5 and the electrophoresis results suggest that only one of the samples (sample 4) had potentially extractable DNA. However, a PCR performed to test new primers gave positive results for sample 3. In Ext26c, two of the samples produced gelatinous pellets while the rest had standard pellets. The electrophoresis showed better results for the sand grinding than for the N2 grinding extractions. All samples except sample 1 resulted in bands, but only sample 2 resulted in amplification in a PCR testing the sand extractions.

Figure 5: Extractions Ext26b, Ext26c, Ext26d and Ext26f. Ext26b has a high range ladder while the other gels have both high and middle range ladders.

Ext26d showed that repeating the extraction buffer step reduced yields dramatically (absorbance concentration reductions were as follows: 184 → 33; 175 → 85). The second species tested with double extraction buffers had two different leaves used for the different samples, one young and one mature. The mature leaf had a markedly higher viscosity in the extraction buffer than the young leaf. Absorbance ratios ranged between 1 and 8.2 with a median of 2.1. Three of the samples (3, 4, 5), which formed nice pellets after one EtOH precipitation, resulted in good electrophoresis bands. The other samples had a repeated precipitation step and either produced very weak bands (1, 2) or had most of their contents remain in the well (6). Only one of the samples (1) produced a weak result in PCR. Due to the extra washing step after the samples had dried overnight, no results were gained for the Ext26e extraction. The Ext26f extraction resulted in blurry electrophoresis bands for the four samples (1, 3, 4, 5) which produced pellets. No sample resulted in successful PCR. The EtOH experiment in Ext27 resulted in only the supernatant samples showing evidence of DNA in the electrophoresis test. No PCR was performed. Ext28 showed that taking only the middle part of the supernatant after CIA cleaning produces cleaner electrophoresis bands than the end parts do. These results failed to transfer to the PCR test. The final extraction (Ext29) resulted in long and blurry electrophoresis bands for most species (3, 5 and 8 being exceptions). The samples for specimens 2 and 4 also produced bands in the EtOH wash. All samples also had their absorbance ratios examined, and no resulting ratio was close to indicating DNA. Not surprisingly, the PCR gave poor results both before and after a QIAEX II desalting protocol had been used on the samples.

Figure 6: Ext27, Ext28 with its "a" phase to the left followed by the "b" phase, Ext29 with the "a" phase to the left followed by the "b" phase and the EtOH wash phase. Ext27 has only the high range ladder, Ext29 EtOH has only the middle range ladder and the others have both ladders.

Extraction   Method used   Nr of extracted samples   PCR result   %    Electrophoresis result   %
Ext23        Omega kit     13                        10           77   -                        -
Ext24        Omega kit     12                        1            8    2                        17
Ext25        CTAB          11                        3            27   5                        45
Ext26        CTAB S&P      18                        0            0    4                        22
Ext26b       CTAB S&P      6                         2            33   1                        17
Ext26c       CTAB S&P      8                         1            13   7                        88
Ext26d       CTAB S&P      6                         1            17   5                        83
Ext26f       CTAB S&P      6                         3            50   4                        67
Ext27        CTAB S&P      8                         0            0    3                        38
Ext28        CTAB S&P      12                        0            0    10                       83
Ext29        CTAB S&P      11                        1            9    7                        64

Table 4: The extractions made during this project are listed together with their success rate. Ext26e is not listed since no results were obtained from it.
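The success rates in Table 4 are simple proportions of the number of extracted samples, and they can be re-derived as a check. The sketch below copies the counts and percentages from Table 4; the function name and data layout are my own, and rounding half up is an assumption that happens to reproduce every percentage in the table.

```python
# (extraction, samples, PCR successes, PCR %, electrophoresis successes, EP %)
# Counts and percentages copied from Table 4; None marks "not recorded".
TABLE_4 = [
    ("Ext23", 13, 10, 77, None, None),
    ("Ext24", 12, 1, 8, 2, 17),
    ("Ext25", 11, 3, 27, 5, 45),
    ("Ext26", 18, 0, 0, 4, 22),
    ("Ext26b", 6, 2, 33, 1, 17),
    ("Ext26c", 8, 1, 13, 7, 88),
    ("Ext26d", 6, 1, 17, 5, 83),
    ("Ext26f", 6, 3, 50, 4, 67),
    ("Ext27", 8, 0, 0, 3, 38),
    ("Ext28", 12, 0, 0, 10, 83),
    ("Ext29", 11, 1, 9, 7, 64),
]


def percent(successes, total):
    """Whole-number percentage, rounding half up, using integer arithmetic."""
    return (200 * successes + total) // (2 * total)


def mismatches(table):
    """Rows whose listed percentages differ from the recomputed ones."""
    bad = []
    for name, n, pcr, pcr_pct, ep, ep_pct in table:
        if percent(pcr, n) != pcr_pct:
            bad.append((name, "PCR"))
        if ep is not None and percent(ep, n) != ep_pct:
            bad.append((name, "electrophoresis"))
    return bad


if __name__ == "__main__":
    for name, n, pcr, _, ep, _ in TABLE_4:
        ep_text = "-" if ep is None else "%d%%" % percent(ep, n)
        print("%-7s PCR %3d%%  electrophoresis %s"
              % (name, percent(pcr, n), ep_text))
    print("mismatches:", mismatches(TABLE_4))
```

Integer arithmetic is used instead of Python's built-in `round`, which rounds halves to the nearest even number and would turn 12.5% into 12 rather than 13.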

TGGE results

The first parallel gel gave inconclusive results. Two specimens, Afraegle (6, counting from the left) and Philotheca (8), found melting points where they could stop in the gel. The others apparently melted already in the well, and therefore migrated straight out of it.

Figure 7: TGGE gel 1

The second parallel gel gave a more even result, showing that almost all species melt within a fairly short temperature range. The bands were generally close to each other, but some of them could be distinguished and cut out for extraction testing. These were Dinosperma (9), Choisya (11), Merrillia (12), Atalantia (13), Paramignya (15), Bergera (16) and Murraya (17). All species except Atalantia and Bergera contributed two bands. Several attempts to recover DNA from the gel slices failed and further work on the method was aborted. One possible reason, which was not explored further, is that the outer and inner primers were located next to each other with no spacing between them. It is also possible that a favourable result would have been obtained by using overlapping inner primers, thus increasing recovery of possibly broken DNA fragments. With a reliable solution for DNA extraction from the gel, this method is expected to be a fast and accurate competitor to paralogue separation by cloning.

Figure 8: TGGE gel 2

DATA Analysis MrBayes

Modeltesting

Comparing the different genes that were available resulted in 47 species that were represented by both the chloroplast data and the MDH gene. A total of 20 species were found for the combination of chloroplast, MDH and HYB sequences. Comparing the four trees resulting from running MrBayes with both the GTR and HKY models on both sets of alignments gave no significant differences, although the consensus tree based on the GTR model using the manual alignment had stronger support in some nodes than other combinations of model and alignment.

Recombination testing using RDP

Running the HYB file in RDP did not find any support for recombination. Running the MDH file in RDP resulted in a single putative recombination event for Skimmia. However, the event was at the end of the sequence, which was of poor quality, and that is the likely cause of the signal rather than true recombination. Due to its poor quality, the sequence was removed from further analysis.

Positive selection testing using PAML

In HYB, one possible case had an omega value of 0.86, suggesting that positive selection may have acted between one allele of Citrus sinensis and one allele of Clausena harmandiana. (Values higher than 1 are usually required at specific sites, but when averaged across a sequence, values lower than 1 found using Yang and Nielsen (2000) indicate sequences to investigate further.) The C. harmandiana sequence differed from the Citrus sequence in two nucleotide positions compared with the other C. harmandiana allele. Upon further examination in SplitsTree, it was discovered that the respective alleles stick together within the genus and that the alleles from each genus are always far from each other in the network.

In MDH, 110 cases had an omega value between 0.5 and 99, which identifies possible positive selection. The smallest common denominator summing up these 110 cases involved the following species: Burkillanthus, one of the Glycosmis mauritiana alleles, Paramignya lobata and Paramignya scandens, Severinia buxifolia and Swinglea glutinosa, combined with each other or with other species. The two Glycosmis mauritiana alleles were always found together in the phylogeny, forming a clade with Glycosmis trichanthera. This is fully expected given the taxon sample. The two Paramignya species were consistently found together in the full sequence network as well as in the exon network based on 1st and 2nd codon positions. Paramignya scandens also had a clear relationship to Burkillanthus in the latter SplitsTree network, which is inconsistent with the MrBayes tree structure. Further investigation of the sequence alignment revealed that this effect was caused by a single base that Paramignya scandens shared with Burkillanthus but not with Paramignya lobata. This relation between Burkillanthus and Paramignya scandens was the only one to consistently appear with Burkillanthus through all the SplitsTree network constellations that were examined. Severinia was consistently grouped with Atalantia and Citropsis in the SplitsTree network, as was expected. Swinglea did not share any noteworthy splits with other species. After thus examining the full sequences and comparing them to first and second codon positions while finding no significant differences, it was concluded that any positive selection left the phylogenetic signal of the full sequence unaltered. The concerns raised by PAML were therefore not corroborated.
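The screening step described above amounts to computing omega = dN/dS for each sequence pair and listing the pairs above a chosen threshold. The sketch below is an illustration only: it does not parse real yn00 output, the pair names and rate values in the example are invented, and the handling of dS = 0 (reported as infinity or skipped) is my own choice. The threshold of 0.5 follows the lower bound quoted for the MDH cases.

```python
import math


def omega(dn, ds):
    """Ratio of non-synonymous to synonymous rates for one sequence pair."""
    if ds == 0:
        # Undefined ratio: report infinity if any non-synonymous change exists.
        return math.inf if dn > 0 else math.nan
    return dn / ds


def flag_pairs(estimates, threshold=0.5):
    """Pairs with omega at or above the threshold, highest first.

    `estimates` maps (sequence_a, sequence_b) to a (dN, dS) tuple.
    """
    flagged = []
    for pair, (dn, ds) in estimates.items():
        value = omega(dn, ds)
        if not math.isnan(value) and value >= threshold:
            flagged.append((pair, value))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # INVENTED values for illustration; not results from this study.
    example = {
        ("seq_A", "seq_B"): (0.043, 0.050),
        ("seq_A", "seq_C"): (0.002, 0.060),
        ("seq_B", "seq_C"): (0.010, 0.000),
    }
    for pair, value in flag_pairs(example):
        print(pair, "omega = %.2f" % value)
```

Flagged pairs are only candidates: as in the analysis above, each one still has to be checked against the alignment and the networks before any conclusion about selection is drawn.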

Hybridization test results

The simulation test results showed that at the largest generation time and population size combination (50/40 000), the null hypothesis of lineage sorting alone could not be rejected. However, at every lower value combination (20/40 000 and smaller) the null hypothesis could be rejected for three or more samples. When the individual effects on the test statistic were examined, three species (Bergera koenigii, Oxanthera sp. and Feroniella oblata) were found to have the greatest and most consistent effect, regardless of removal order.

*BEAST tree results

Two species trees were produced. One was based on 37 species, the three species identified in the hybridisation test having been removed, using the CP and MDH data (figure 10). The other was based on 18 species and the CP, MDH and HYB data (figure 9). The two trees had different root ages: the 3 gene tree had a root age of 15.24 MA, whereas the 2 gene tree had a root age of 10.55 MA. Neither of the trees had a root age prior, and the set-ups differed only in the starting tree modelling.

Lagrange Biogeographic results

Both the simple and the advanced 3 gene analyses gave the same results except at the N3 node (see table 7). The N4 node, which includes Clausena, Merrillia and Murraya, and the root node (N34) were inferred to be of Papuan origin. The N26 node, which unites Afraegle, Aeglopsis, Balsamocitrus and Aegle, was inferred to be African/African-Asian, whereas all other nodes were inferred to have most likely been present in India. The two 2 gene analyses disagreed on the location of the higher nodes (N72, N70, N69, N66, N65, N62, N61) (see table 6). The simple model favoured Papua as the origin of the root and higher nodes, whereas the more advanced model favoured South-east Asia. At the N37 and N27 nodes, the advanced model has India-Papua and India as primary choices for branch radiation, whereas the simple model inferred South-east Asia in both cases. The results of the advanced models are included in tables 6 and 7 with the related cladograms.

BEAST biogeography

BEAST produced separate trees for the CP and MDH datasets. The results support a south-east Asian origin for the Aurantioids. They give a Papuan origin for the Citrus species, but almost all other developments in the higher nodes take place in south-east Asia. The two trees differ regarding the node that unites the Balsamocitrinae clade: the MDH results place these developments in Africa, while the chloroplast tree places them in south-east Asia as well. Removing the three possibly hybridized species made convergence possible.

Discussion

Extractions
The first lesson from my extraction experiments is the importance of evaluating results after each extraction. Up to Ext26e, results were improving: the level of polysaccharides was decreasing, the pellets no longer had the gelatinous properties seen in the first extraction attempts, and the EtOH precipitation was working fairly smoothly. If my hindsight is correct, and what I interpreted as dried salt was in fact dried DNA, my search for a working extraction protocol could have ended successfully here. Although there are no verifiable results for this extraction, it is nonetheless clear that the following extraction (Ext26f) produced the best PCR result (for Ruta pinnata) obtained through the CTAB extractions. The Ruta pinnata PCR was equal to the results obtained for Ruta graveolens after the kit extraction, and the CTAB extraction also gave weaker bands for both Phellodendron species, which had not extracted with the kit. Ext26b gave an almost perfect result in both PCR and absorbance ratios for Phellodendron amurense, although the gelatinous pellets of this and the other samples were problematic. The tubes from this extraction all ended up with a gel-like sediment several millimetres thick at the bottom. Much work would therefore have been saved, and possibly better results gained, by running the extractions on electrophoresis earlier, by amplifying the extracts more consistently and, not least, by taking the time to step back and consider the work from a distance once in a while. The second lesson was the value of understanding how the processes used work and what the ingredients do. Knowing what an added ingredient is intended to do is crucial for understanding how changing the recipe might affect the end result. Realising the effect of the NaCl concentration in reducing polysaccharide contamination and the importance of reaching a 75% end concentration at the EtOH step were both important lessons.
The negative impact of sub-zero temperatures on Rutaceae extraction is another lesson, one which might have led to better results had the evidence been noticed earlier. The extraction notes mention the thick consistency of the extractions but fail to relate it to the cold; the connection was first made during an email conversation with K. Scott. It is also unclear how extraction results and PCR results were related. Several of the non-Aurantioid species resisted amplification of low-copy nuclear genes even when the PCR templates were extracts made by others that had been used successfully in chloroplast-based studies. It is therefore possible that, once the above-mentioned extraction issues have been dealt with, the main problem rests not with the extractions but with the primers used. Finding new primers seems to be relatively difficult for species that have not previously been extensively sequenced, and primer identification for the non-Aurantioid species has therefore been out of reach for this project. I believe that reaching the goal of inferring a phylogeny of the Rutaceae using low-copy nuclear genes will require this genome-fishing to be done for perhaps 4 or 5 species spread through the remainder of the family.
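The two quantities named in these lessons, the NaCl molarity and the 75% EtOH end concentration, come down to simple arithmetic. The sketch below is only an illustration, not part of the protocol used: the function names are hypothetical, the ethanol stock strength is an assumed parameter (the text does not state it), and volumes are treated as additive, which ignores the slight contraction when ethanol and water mix.

```python
NACL_MOLAR_MASS = 58.44  # g/mol


def nacl_grams(molarity, volume_ml):
    """Grams of NaCl needed for a given molarity in a given buffer volume."""
    return molarity * (volume_ml / 1000.0) * NACL_MOLAR_MASS


def ethanol_volume_needed(sample_volume, target_fraction=0.75, stock_fraction=1.0):
    """Volume of ethanol stock to add so the mixture reaches target_fraction.

    Assumes additive volumes; stock_fraction is the (assumed) strength of the
    ethanol stock, e.g. 1.0 for absolute ethanol or 0.96 for 96% ethanol.
    The result is in the same unit as sample_volume.
    """
    if not 0 < target_fraction < stock_fraction <= 1:
        raise ValueError("need 0 < target_fraction < stock_fraction <= 1")
    return sample_volume * target_fraction / (stock_fraction - target_fraction)


if __name__ == "__main__":
    # 1 M NaCl in 100 ml of buffer
    print(round(nacl_grams(1, 100), 2), "g NaCl")
    # ethanol to add to a 100 ul sample, for two assumed stock strengths
    print(round(ethanol_volume_needed(100), 1), "ul of absolute EtOH")
    print(round(ethanol_volume_needed(100, stock_fraction=0.96), 1), "ul of 96% EtOH")
```

With absolute ethanol this reduces to three sample volumes of ethanol; a weaker stock needs proportionally more, which is one way the 75% end concentration can be missed unnoticed.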

Classifications
The 3 gene tree from *BEAST does not support the Balsamocitrinae / Citreae division found in the GRIN classification. Instead it moves Monanthocitrus, Pamburus and Paramignya into the Balsamocitrinae subtribe. Engler, who had a slightly different classification of the Aurantioids, has a similar problem, with Monanthocitrus moving from his Hesperethusinae subtribe into Citrinae. The second large inner clade in the 3 gene tree suggests a further breakdown of the Citrinae subtribe. The situation is not helped by the larger number of species in the 2 gene tree. There, the GRIN tribe Clauseneae and the subtribe Merrilliinae fail to form clades and instead form a grade relative to the root node. The 2 gene tree supports the 3 gene tree in moving genera to the Balsamocitrinae subtribe and adds Triphasia and Wenzelia to it. It further divides the Citreae into a clade containing Citrus proper (e.g. Citrus, Microcitrus, Poncirus, Clymenia and Oxanthera) and a clade for the remaining species. Compared to the Engler subtribes, Citrinae is now evenly divided between the three clades mentioned, whereas Hesperethusinae is divided between the two latter clades. Comparing the 2 gene tree with Samuel's (2001) atpB/rbcL tree in Fig 1, the proposed Clauseneae tribe fails to form a clade. In the Citreae tribe, Paramignya and Citropsis change places and the Atalantia/Severinia clade is lifted out altogether, compared with my results. There is better agreement with Chase (1999) in Fig 3, where the placement of Clausena next to Aegle is the main problem. Apart from the clade gathering Citrus, Clymenia, Microcitrus and Poncirus together, there is no agreement between the 2 gene tree and Morton's (2003) rps16/trnL-trnF trees. The situation is more complex in Morton's 2009 paper. Here several trees are published, starting with a trnL-trnF tree that says little more than that nodes N16 and N27 (see cladogram 2) exist. The second tree, based on rps16, is a little more detailed, giving some resolution for the N51 clade, but inexplicably joining Clausena to it. Her third tree is based on atpB-rbcL and largely has the same topology as the previous trees, with the exception of Murraya ending up sister to Pamburus. This is an anomaly that will be discussed further. The fourth tree is based on ITS and is also generally in agreement with the previous trees, which does not say much considering their very limited resolution. One important anomaly in the ITS tree is that Merrillia is placed sister to Glycosmis.

The fifth tree presented in Morton (2009) is based on a concatenation of the previously mentioned genes. The reason given for concatenating the data was the principle that if the individual gene trees do not contradict each other, there is no problem with joining the data. This requires that several theoretical and methodological issues are ignored. The assumption disregards the evolutionary differences between plastid and nuclear genomes: plastid and nuclear genomes (and unlinked nuclear loci) would be separately affected by hybridisation, and nuclear loci can be paralogues. Further, plastid and nuclear genomes have different effective population sizes, which affects lineage sorting rates and thereby also coalescence times. The concatenated tree adds a little resolution but retains the placements of Murraya with Pamburus and Merrillia with Glycosmis. Finally, in the last tree in Morton (2009, Fig. 6), which is based on the previous data combined with morphological data, the resolution has increased and only the genera Wenzelia, Merrillia and Murraya conflict with the results in this thesis. The pairing of Merrillia with Glycosmis, previously seen in the ITS tree (with a bootstrap support of 98) and the concatenated tree (91), and that of Murraya with Pamburus, which occurred weakly in the atpB-rbcL tree (59 bootstrap support) and hardly any better in the concatenated tree (61), now show an overwhelming support of 100 and 96 respectively. This is a huge increase for the latter. Morton (2009) concluded that neither the Clauseneae nor the Citreae tribe is monophyletic. The basis for Citreae failing to be monophyletic is the Murraya-Pamburus connection, which can hardly be said to be supported without the morphological data. The validity of her conclusion thus rests on whether concatenating molecular data with morphological data is sound or not.
Lastly, a comparison with Bayer, with which this study shares some of the sequences and the method of analysis (Bayesian analysis), reveals that the placement of Eremocitrus glauca and the added high-level resolution are the only differences. The trees are otherwise in agreement regarding the clades.

Biogeography
The Citrus clade is divided into two parts, with most of the classic Citrus in one subclade, N15, and most of the new Citrus in another subclade, N10 (see cladogram 2). The N15 clade gathers the Chinese Poncirus, the Papua New Guinean Citrus “amboin” and Citrus sinensis, which is of uncertain but possibly Chinese origin. The N10 clade has 6 species, half of which are Australian. Another species is from New Caledonia, and the two remaining species occur in Papua New Guinea and on the nearby island of New Ireland. The same results are found in the BEAST biogeographical analysis. As all of the Papuan species in this group are found on the Australasian side of the Wallace line, this clade is only found in Australasia. Beattie et al. (2008) suggested that the Citrus clade of the Aurantioids originated in Australasia, which they define as Australia, New Guinea and other smaller islands east as far as Fiji. They also suggest that this is a more likely direction of dispersal than migration southwards from Asia when the westward direction of ocean currents and terrane movements is considered (Hall, 1997). They may well be right concerning the ocean currents. However, the dating of the Aurantioideae by Pfeil (2009), which used fossil evidence to calibrate a chloroplast phylogeny, and the dates obtained here through *BEAST's analysis of the alignment (informed by Pfeil's date calibration applied to the species tree) both suggest that the geological movements are so old that island movements can hardly be part of the explanation. As the Lagrange result table shows at N38 and N16, the question regarding the origin of the Citrus clade is not whether it originated in Australasia but how soon it started spreading north to China.
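The reading of N38 and N16 above can be checked directly against the split notation used in the Lagrange tables. The following is a minimal sketch, assuming only the `[left|right]` format and `+`-joined area codes shown in Table 6; the function names are hypothetical and are not part of Lagrange. It also assumes that the range at a node is the union of the two daughter ranges, which holds for the splits listed here.

```python
def parse_range(text):
    """Turn an area string such as 'sin+pap' into a set of area codes."""
    return frozenset(text.split("+"))


def parse_split(split):
    """Parse a Lagrange-style split '[left|right]' into two area sets."""
    inner = split.strip()[1:-1]          # drop the surrounding brackets
    left, right = inner.split("|")
    return parse_range(left), parse_range(right)


def node_range(split):
    """Range at the node itself, taken as the union of both daughter ranges."""
    left, right = parse_split(split)
    return left | right


# Splits listed for the two nodes in Table 6 (2 gene analysis)
n16_splits = ["[pap|sin+pap]", "[pap|pap]", "[pap|ind+sin+pap]"]
n38_splits = ["[pap|soa+pap]", "[pap|pap]", "[pap|ind+pap]"]

if __name__ == "__main__":
    for name, splits in (("N16", n16_splits), ("N38", n38_splits)):
        for s in splits:
            print(name, s, "->", "+".join(sorted(node_range(s))))
```

Every listed split at both nodes keeps pap in the upper daughter branch; the alternatives differ only in whether the other branch already includes China, India or South-east Asia, which is the "how soon" question raised above.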

Despite the assumptions that must be made in Lagrange regarding probable dispersal routes and the choice of suitable geographical areas, it nonetheless seems to be the better route to take in this kind of study compared with the discrete BEAST phylogeography model. The reason is that Lagrange accepts a species occurring in several different areas, whereas BEAST does not. This leads to the confusing situation in which a region such as Tropical_Asia equals a combination of two smaller regions, Tropical_Asia_East and Tropical_Asia_West. All three of these regions are in turn part of the general region Asia, which also includes China. BEAST knows nothing about this, and it therefore seems likely that its results can easily be confused in this regard. A continuous BEAST phylogeography model with range-wide sampling and unique geographical information for each sample would perhaps solve this problem. This would involve gathering wild samples from several parts of each species' range, with more widespread species requiring more samples than species endemic to small islands. It would provide geographic information on each species' area and could also identify a smaller root node area, the place where the species first developed, for each species. The collected data would also have to provide a greater range of genetic information than is currently available.

Tables and Trees

Figure 9: The 3 gene *BEAST tree. The node labels indicate age (in italics) and the branch labels indicate posterior probability (in bold).

2.0 Figure 10: The 2 gene *BEAST tree. The node labels indicate age (in italics) and the branch labels indicate posterior probability (in bold). Aegle marmelos (L.) Corrêa H&M USDA PI 539142 Merrillia caloxylon Swingle H&M USDA PI 539733 Aeglopsis chevalieri Swingle H&M USDA PI 539143 Monanthocitrus cornuta Tanaka H&M CANB TJ Hoe s.n. Afraegle paniculata(Schumach.) Engl. H&M USDA PI 103107 Murraya paniculata ( L.) Jack H&M CANB 743224 Atalantia monophylla (L.) DC. H&M USDA PI 109613 Pamburus missionis Swingle H&M USDA PI 095350 Balsamocitrus dawei Stapf H&M USDA PI 539147 Paramignya lobata Burkill H&M USDA PI 600642 Bergera koenigii L. H&M USDA PI 539745 Melicope elleryana ( F.Muell.) HYB PIF 34003 - 18/7/8 T.G.Hartley Citrus gracilis Mabb. H&M CANB 644758 Philotheca deserti (E. Pritz.) P.G HYB MJB 1919 Wilson var. deserti Clausena harmandiana Pierre ex H&M USDA PI 600640 Melicope micrococca ( F.Muell. ) MDH ANBG 8501279 Guillaumin T.G.Hartley Acronychia acidula F. Muell. MDH ANBG c664755 Microcitrus australasica Swingle MDH USDA PI 312872 Ailanthus altissima (Mill.) Swingle MDH GotBot 1988-999 Micromelum minutum ( G.Forst.) MDH USDA PI 600637 G, BRUNS Wight & Arn. Asterolasia hexapetala Druce MDH ANBG c9505139 Murraya paniculata ( L.) Jack MDH GotBot 1933-3691sG Atalantia citroides Pierre ex MDH USDA PI 539145 Murraya paniculata ( L.) Jack MDH Perth 853900 ((bot. g. Guillaumin Perth)) Atalantia ceylanica (Arn.) Oliv. MDH CRC 3287 Naringi crenulata ( Roxb.) Nicolson MDH USDA PI539748 Bergera koenigiiL. MDH CANB 743217 ( Labill.) Paul MDH ANBG c629571-46 G.Wilson Boronia anemonifolia A.Cunn. MDH ANBG 9506114 Orixa japonica Thunb. MDH GotBot 1918-0012 G, Hesse Burkillanthus malaccensis (Ridl.) MDH CHM78–extraction Oxanthera neocaledonica Tanaka MDH USDA PI 539671 Swingle (CSIRO Plant Industry) Choisya ternata Kunth MDH CANB 743233 Oxanthera Montrouz. sp MDH Veillon 7758 ((sp. in canberra)) Citropsis daweana Swingle & Kellerm. 
MDH USDA PI 247137 Paramignya scandens Craib MDH USDA PI 109758 Citropsis schweinfurthii Swingle & MDH USDA PI 231240 Phebalium squamulosum Vent. MDH ANBG c653838 Kellerman Citrus amboiensis (Citrus sp. “ MDH Merbein CO054 Pleiospermium latialatumSwingle MDH USDA PI 600643 Amboin, New Guinea ”) Citrus australis( Sweet ) Planch. MDH Crisp 10432 Poncirus trifoliata ( L.) Raf. MDH CRC 1717 Citrus glauca ( Lindl.) Burkill MDH USDA PI 539717 Poncirus trifoliata( L.) Raf. MDH CRC 3330 Citrus sinensis Osbeck MDH CRC 1241 Ptelea trifoliata L. MDH GotBot 1900-3888nU Citrus wintersii Mabb. MDH USDA PI 410943 Ruta graveolens L. MDH GotBot 1997-3115 G, SÄVE Clausena excavataBurm.f. MDH USDA PI 235419 Sarcomelicope simplicifolia( Endl. ) MDH ANBG 8501648 T.G.Hartley ssp Clymenia polyandra ( Tanaka ) MDH USDA PI 263640 Severinia buxifolia Ten. MDH USDA PI 539793 Swingle Cneorum tricoccon L. MDH 303GU Malaga, SkimmiaThunb. sp. MDH CHM75–extraction Spain, 1968-04-16 (CSIRO Plant Industry) Correa decumbens F.Muell. MDH GotBot 1986-1296 *”Skimmia” (?) Unknown I.D. MDH ”GotBot 1952-0046” Lorentzon Dictamnus albus L. MDH GotBot 1996- Swinglea glutinosa Merr. MDH USDA PI 142571 0506sw JJM950936 Erythrochiton brasiliensis Nees & MDH GotBot 2007- Tetradium daniellii ( Benn.) MDH GotBot 2008-2718sW Mart. 1167sG T.G.Hartley Feronia limonia Swingle MDH USDA PI236991 Toona ciliata M.Roem. MDH 21B GH06-015 Feroniella oblataSwingle MDH USDA PI 539720 Toona ciliata M.Roem. MDH CANB 743215 Flindersia australis R.Br. MDH CANB 743207 Triphasia trifolia P.Wilson MDH USDA PI 539800 Glycosmis mauritiana Tanaka MDH USDA PI 600641 Wenzelia dolichophylla Tanaka MDH USDA PI 277441 Glycosmis pentaphylla Corrêa MDH Merbein CR044 Zanthoxylum bungeanum . MDH GotBot 2008-2762sW Glycosmis trichanthera Guillaumin MDH PI RRUT 12 Zieria tuberculata J.A.Armstr. MDH ANBG c665919 Kippist MJB 1964 - 29/4/8 Lindl. ANBG 723469 Agathosma adenandriflora Schltr GU 2613 P.A.Bean Geijera parviflora Lindl. 
PIF 31159 – 27/3/8 S.W.Cape Ceres 1990 Amyris elemifera L 2007-0711 A Fairchild Halfordia kendack ( Montr. ) PIF 34745 - 12/3/9 Guillaumin Boninia glabra Planch GU no8221 E.H.Wilson Harrisonia abyssinica Oliv. GU nr3289 G.Taylor Bonin Islands 1917 Buganda 1935 Boronia ternata Endl MJB 1931 Lunasia amara Blanco Sanko 2955 - 22/5/8 Bouchardatia neurococca ( F.Muell. ) Sanko 2949 - 13/5/8 Medicosma subsessilis T.G.Hartley Sanko 1572 - 29/4/8 Baill. Brombya platynema F.Muell. RDC (Pete Green) - 27/3/8 Myrtopsis novae-caledoniae Engl. GU Mckee 5041 Ile des Pims 1956 Calodendrum capense Thunb 63114a B Fairchild Neochamaelea pulverulenta ( Vent. ) GU C Persson 1064 Erdtm. Citrus wintersii Mabb. CANB 155240 Nycticalanthus speciosus Ducke GU Antonelli 499 DUCKE #3578 Correa lawrenceana var. Grampiana MJB 1988 Phebalium squamulosum Vent. GotBot 1994-V0055 G, Paul G. Wilson AARHUS bot. Trädg. Dictyoloma incanescens DC. GU 4613 Vicosa Ynes Phellodendron amurense Rupr. GotBot 1946-3851 G, Mexia 1930 WEIBULL Dictyoloma vandellianum A.Juss. GU 0039124 G.Hatschbach Phellodendron japonicum Maxim. GotBot 1954-0009 W, Murai 49448 1985 Dinosperma melanophloia MJB 1887 - 11/7/7 Pilocarpus pennatifolius Lem. BELGIUM (XX-0-BR- ( C.T.White) T.G.Hartley 19121462) Diplolaena drummondii ( Benth. ) MJB 1956 Pitaviaster haplophyllus ( F.Muell.) MJB 1964 Ostenf. T.G.Hartley Diplolaena dampieri Desf. GU S/N GU w.A. Australia Ravenia spectabilis ( Lindl.) Planch. 65336A Fairchild Sept 1936 Ex Griseb. Dutaillyea sessilifoliola Guillaumin GU Mckee 3848 New Ruta pinnata L.f. GU 1407 Caledonia 1956 Empleurum serrulatum Sol. GU Delagoa bay 21XI Sheilanthera pubens I.Williams GU nr 27899 E.Esterhuysen 1898 Africa australi- Cape Province 1958 occidentali Eriostemon australasius Pers MJB 1869 Skimmia × confusa N.P.Taylor GotBot 1982-0281 sU Erythrochiton brasiliensis Nees & GotBot 1962-v2882pG Skimmia japonica Thunb. GotBot 1976-3854 pG Mart. Euodia pubifolia T.G.Hartley cult. 
ex PIF 25751 - Skimmia japonica Thunb. GotBot 1985-0166 pG 31/10/8 Euodia ridleyi Hochr 91587B Fairchild Skimmia japonica x intermedia GotBot Nitz. P-70 Fagara gilletii De Wild. BELGIUM (XX-0-BR- Toddalia asiatica Baill. GU nr 1324 K.Å.Dahlstrand 19391845) Transvaal 1963 Flindersia australis R.Br. RJBGH06-007 Zanthoxylum bonifaziae Cornejo & GU 7141 Reynel Flindersia bennettiana F.Muell. Ex ANGB 8301757 Zanthoxylum brachyacanthum Sanko 2955 Benth. F.Muell.

Table 5: The species which have HYB, H&M or MDH marked between the species name and its source information have been extracted for either of the two genes or both of them.

Cladogram 2 gene Lagrange biogeography. Based on the *BEAST 2 gene species tree. (branch lengths not to scale; nesting shown by indentation, upper branch listed first):

N72
  N70
    N66
      N62
        N58
          N38
            N16
              N10
                [pap] Clymenia_polyandra
                N9
                  [ncd] Oxanthera_neocaledonica
                  N8
                    N6
                      [aus] Eremocitrus_glauca
                      N5
                        [aus] Citrus_gracilis
                        [pap] Microcitrus_papuana
                    [aus] Microcitrus_australasica
              N15
                [sin] Poncirus_trifoliata
                N14
                  [ind] Citrus_sinensis
                  [pap] Citrus_amboiensis
            N37
              N27
                N21
                  N19
                    [af1+af2] Citropsis_schweinfurthii
                    [af1+af2] Citropsis_daweana
                  [ind+pam+soa+sin] Naringi_crenulata
                N26
                  [pap] Swinglea_glutinosa
                  N25
                    [soa+pap] Burkillanthus_malaccensis
                    [pap] Pleisopermium_latialatum
              N36
                N34
                  [soa+sin+pap] Severinia_buxifolia
                  N33
                    N31
                      [ind+soa] Atalantia_monophylla
                      [soa] Atalantia_citroides
                    [ind] Atalantia_ceylonica
                [ind+pam+soa+sin] Feronia_limonia
          N57
            N43
              N41
                [pap] Wenzelia_dolichophylla
                [pap] Monanthocitrus_cornuta
              [soa+pap] Triphasia_trifolia
            N56
              [ind] Pamburus_missionis
              N55
                N51
                  N49
                    N47
                      [af1] Afraegle_paniculata
                      [af1] Aeglopsis_chevalieri
                    [af2] Balsamocitrus_dawei
                  [ind+pam+soa] Aegle_marmelos
                N54
                  [soa] Paramignya_lobata
                  [ind+soa] Paramignya_scandens
        N61
          [soa+pap] Merrillia_caloxylon
          [ind+pam+soa+sin+aus+pap] Murraya_paniculata
      N65
        [soa+pap] Clausena_harmandiana
        [ind+pam+soa+sin+pap] Clausena_excavata
    N69
      [ind+soa+sin+pap] Glycosmis_mauritiana
      [soa+pap] Glycosmis_trichanthera
  [ind+soa+aus+pap] Micromelum_minutum

Node  Split                             lnL     Rel.Prob
N5    [aus|pap]                         -124    1
N6    [aus|aus+pap]                     -124.3  0.7178
N6    [aus|aus]                         -125.3  0.2666
N8    [aus+pap|aus]                     -124.3  0.7041
N8    [aus|aus]                         -125.2  0.2882
N9    [ncd|aus+pap]                     -124.6  0.5302
N9    [ncd|aus]                         -125.2  0.2932
N10   [pap|pap]                         -124.8  0.4503
N10   [pap|aus+ncd]                     -125.9  0.1517
N14   [ind|pap]                         -124    1
N15   [sin|pap]                         -124.6  0.5507
N15   [sin|ind+pap]                     -124.8  0.4493
N16   [pap|sin+pap]                     -125.4  0.2514
N16   [pap|pap]                         -125.5  0.2174
N16   [pap|ind+sin+pap]                 -125.5  0.2162
N19   [af2|af2]                         -124.4  0.6269
N19   [af1+af2|af1]                     -126.4  0.0921
N21   [af2|ind+soa]                     -125.5  0.2101
N21   [af2|ind]                         -125.9  0.1472
N21   [af2|ind+pam+soa]                 -125.9  0.1425
N25   [pap|pap]                         -124.1  0.9268
N26   [pap|pap]                         -124    0.9501
N27   [ind|pap]                         -125.1  0.3357
N27   [soa|pap]                         -125.9  0.1502
N31   [ind+soa|soa]                     -124    0.9553
N33   [ind+soa|ind]                     -124.8  0.4368
N33   [ind|ind]                         -124.8  0.4259
N34   [soa|soa]                         -125.2  0.2953
N34   [soa|ind]                         -125.5  0.2086
N36   [soa|soa]                         -124.9  0.404
N36   [ind|ind]                         -125.2  0.2905
N37   [ind+pap|ind]                     -125.6  0.2015
N37   [soa+pap|soa]                     -125.7  0.1728
N38   [pap|soa+pap]                     -125.8  0.1558
N38   [pap|pap]                         -125.9  0.151
N38   [pap|ind+pap]                     -125.9  0.143
N41   [pap|pap]                         -124    1
N43   [pap|pap]                         -124.4  0.6825
N43   [pap|soa+pap]                     -125.5  0.2079
N47   [af1|af1]                         -124    1
N49   [af1|af2]                         -124    1
N51   [af2|ind+pam+soa]                 -124.7  0.4626
N51   [af1+af2|ind]                     -125.5  0.2141
N54   [soa|ind+soa]                     -124.4  0.6856
N54   [soa|soa]                         -125.5  0.2162
N55   [ind|ind]                         -124.4  0.623
N55   [ind+soa|soa]                     -126.2  0.1031
N56   [ind|ind]                         -124.3  0.7415
N56   [ind|ind+soa]                     -125.7  0.1789
N57   [pap|ind]                         -124.4  0.6368
N57   [soa|soa]                         -126.2  0.1029
N58   [pap|pap]                         -125.7  0.1773
N58   [pap|ind+pap]                     -126    0.1273
N61   [soa|soa]                         -127.1  0.0441
N61   [soa|ind+pam+soa+sin+aus+pap]     -127.2  0.0405
N61   [soa|ind+pam+soa+sin+pap]         -127.3  0.0373
N62   [soa|soa]                         -125.5  0.2109
N62   [pap|pap]                         -125.5  0.2087
N65   [soa|soa]                         -125.6  0.1952
N65   [pap|pap]                         -126.6  0.0743
N66   [soa|soa]                         -124.9  0.3777
N66   [pap|pap]                         -125.5  0.2189
N69   [soa|soa]                         -124.9  0.3906
N69   [pap|pap]                         -125.9  0.1426
N70   [soa|soa]                         -124.9  0.4157
N70   [pap|pap]                         -125.6  0.2047
N72   [soa|soa]                         -125.7  0.1844
N72   [pap|pap]                         -126.3  0.09571

Table 6: Lagrange relative probabilities for the 2 gene analysis. The table includes the first 1 to 3 values to illustrate where the node definition is certain and where it is uncertain.
* Split format: [left|right], where 'left' and 'right' are the ranges inherited by each descendant branch (on the printed tree, 'left' is the upper branch, and 'right' the lower branch).
Global ML at root node: -lnL = 124, dispersal = 0.06009, extinction = 9.832e-12

Cladogram 3 gene Lagrange biogeography. Based on the *BEAST 3 gene species tree. (branch lengths not to scale; nesting shown by indentation, upper branch listed first):

N34
  N4
    [soa+pap] Clausena_harmandiana
    N3
      [ind+pam+soa+sin+aus+pap] Murraya_paniculata
      [soa+pap] Merrillia_caloxylon
  N33
    N19
      [pap] Swinglea_glutinosa
      N18
        [ind+pam+soa+sin] Naringi_crenulata
        N17
          N13
            [sin] Poncirus_trifoliata
            N12
              [ind] Citrus_sinensis
              N11
                [aus] Citrus_gracilis
                [aus] Microcitrus_australasica
          N16
            [ind+soa] Atalantia_monophylla
            [ind] Atalantia_ceylonica
    N32
      N26
        N22
          [af1] Afraegle_paniculata
          [af1] Aeglopsis_chevalieri
        N25
          [af2] Balsamocitrus_dawei
          [ind+pam+soa] Aegle_marmelos
      N31
        [pap] Monanthocitrus_cornuta
        N30
          [soa] Paramignya_lobata
          [ind] Pamburus_missionis

Node  Split                             lnL     Rel.Prob
N3    [pap|pap]                         -64.17  0.0460
N3    [soa|soa]                         -64.34  0.0388
N3    [ind+pam+soa+sin+aus+pap|soa]     -64.44  0.0348
N3    [ind+pam+soa+sin+aus+pap|pap]     -64.51  0.0327
N4    [pap|pap]                         -62.5   0.243
N4    [soa|soa]                         -62.65  0.2101
N11   [aus|aus]                         -61.26  0.846
N11   [aus+pap|aus]                     -63.73  0.071
N12   [ind+pap|aus]                     -61.91  0.4384
N12   [ind|aus+pap]                     -62.32  0.2925
N13   [sin|ind+aus+pap]                 -61.77  0.5057
N13   [sin|ind+pap]                     -62.64  0.2124
N16   [ind|ind]                         -61.26  0.8434
N16   [ind+soa|ind]                     -63.49  0.0909
N17   [ind|ind]                         -62.05  0.3837
N17   [ind+pap|ind]                     -62.91  0.1619
N18   [ind|ind]                         -62.34  0.2856
N18   [ind|ind+pap]                     -63.17  0.1247
N19   [ind|ind]                         -63     0.1472
N19   [pap|ind+pap]                     -63.16  0.1253
N17, N18 or N19 (node not recoverable from the source layout):
      [pap|ind]                         -63.26  0.1137
      [pap|pap]                         -63.33  0.1067
N22   [af1|af1]                         -61.09  0.9974
N25   [af2|ind+pam+soa]                 -61.24  0.8547
N25   [af2|ind+pam]                     -64.14  0.0473
N26   [af1|af2+ind+pam+soa]             -61.72  0.532
N26   [af1|af2+ind+pam]                 -63.27  0.113
N30   [ind|ind]                         -62.22  0.3209
N30   [soa|ind]                         -62.5   0.2435
N31   [ind|ind]                         -62     0.4031
N31   [soa|soa]                         -62.54  0.2333
N32   [ind|ind]                         -61.95  0.4202
N32   [soa|soa]                         -62.99  0.1494
N33   [ind|ind]                         -62.58  0.225
N33   [ind+pap|ind]                     -63.97  0.0559
N34   [pap|ind+pap]                     -64.51  0.0325
N34   [pap|pap]                         -64.85  0.0231
N34   [soa|soa]                         -64.96  0.0208

Table 7: Lagrange relative probability table for the 3 gene analysis. The table includes the first 1 to 4 values to illustrate where the division is certain and where it is uncertain.
* Split format: [left|right], where 'left' and 'right' are the ranges inherited by each descendant branch (on the printed tree, 'left' is the upper branch, and 'right' the lower branch).
Global ML at root node: -lnL = 61.09, dispersal = 0.03715, extinction = 0.03039
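The two numeric columns in the Lagrange tables are related: the relative probability of a split appears to be its likelihood expressed as a fraction of the global likelihood, i.e. exp(lnL_split - lnL_global). The sketch below checks this against a few Table 7 values. It is an interpretation of the printed numbers rather than a statement of Lagrange's internals, the function name is hypothetical, and because the printed lnL values are rounded the agreement is only approximate.

```python
import math


def relative_probability(split_lnl, global_neg_lnl):
    """Fraction of the global likelihood carried by one split.

    split_lnl      -- lnL of the split as printed in the table (negative)
    global_neg_lnl -- the 'Global ML at root node: -lnL' value (positive)
    """
    return math.exp(split_lnl + global_neg_lnl)


GLOBAL_NEG_LNL_3GENE = 61.09  # from the Table 7 footer

# (node, split, lnL, Rel.Prob) as printed in Table 7
table7_rows = [
    ("N3", "[pap|pap]", -64.17, 0.0460),
    ("N4", "[pap|pap]", -62.5, 0.243),
    ("N16", "[ind|ind]", -61.26, 0.8434),
]

if __name__ == "__main__":
    for node, split, lnl, printed in table7_rows:
        computed = relative_probability(lnl, GLOBAL_NEG_LNL_3GENE)
        print(f"{node} {split}: computed {computed:.4f}, printed {printed}")
```

Read this way, a node such as N34, whose best split carries only about 3% of the global likelihood, is far less settled than a node such as N22 or N16.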

[CTAB mixing
To make 100 ml of the CTAB buffer mentioned in the above protocol (3), dissolve 8.85 g of TRIS-HCl and 5.3 g of TRIS-base in MQ (Milli-Q water) and fill up to 45 ml. In a different beaker, dissolve 1.82 g of CTAB in MQ and then add MQ to a total volume of 45 ml. The CTAB dissolves quickly with either a heating stirrer or the help of a microwave. The buffer needs 18.6 g of EDTA. Take half of this amount and mix it with the TRIS solution. When it has dissolved, add some of the CTAB solution and some more of the EDTA until everything is in solution. Each M of NaCl needed weighs 5.84 g. Add the NaCl one M at a time to let it dissolve. NaCl releases gas in the liquid as it dissolves, so the viscosity should be kept as low as possible. Warm the solution to 65 °C while stirring in order to dissolve the NaCl well. Continue until the desired NaCl molarity has been reached.]

Acknowledgements
I would like to thank Bernard Pfeil for excellent supervision of my thesis work. I would also like to thank Mats Töpel, Stephan Nylinder and Vivian Alden, as well as the rest of the Molecular Systematics Research Group, for introducing me to various parts of the land of Systematica Botanica. I am grateful to Gothenburg Botanical Garden, the Australian National Botanic Gardens, Fairchild Tropical Botanic Garden, the National Botanic Garden of Belgium, and Claes Persson and Alex Antonelli (University of Gothenburg and Gothenburg Botanical Garden), who graciously provided me with leaf samples. I also thank Cathy Miller (CSIRO, Canberra) and Mike Bayly (University of Melbourne) for providing some extracted DNA samples. The following herbaria also provided material for study and in some cases DNA extraction: GB, MO. My thanks to them as well. Part of this work was carried out using the resources of the Computational Biology Service Unit at Cornell University, which is partially funded by Microsoft Corporation.

References

Web pages:

Tropicos http://www.tropicos.org/Home.aspx

GRIN http://www.ars-grin.gov/cgi-bin/npgs/html/family.pl?979

TreeBASE http://www.treebase.org/treebase-web/home.html

Computer software:

Bouckaert RR (2010) DensiTree: making sense of sets of phylogenetic trees. Bioinformatics 26(10), 1372-1373.

Drummond AJ, Ashton B, Buxton S, Cheung M, Cooper A, Heled J, Kearse M, Moir R, Stones-Havas S, Sturrock S, Thierer T, Wilson A (2010) Geneious v5.1, Available from http://www.geneious.com

Drummond AJ, Ho SYW, Phillips MJ & Rambaut A (2006) PLoS Biology 4, e88.

Drummond AJ, Nicholls GK, Rodrigo AG & Solomon W (2002) Genetics 161, 1307-1320.

Drummond AJ & Rambaut A (2007) "BEAST: Bayesian evolutionary analysis by sampling trees." BMC Evolutionary Biology 7, 214

Hall, T.A. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl. Acids. Symp. Ser. 41:95-98.

Heled J & Drummond AJ (2010) Bayesian inference of species trees from multilocus data. Mol. Biol. Evol. 27, 570-580.

Huson DH & Bryant D (2006) Application of phylogenetic networks in evolutionary studies. Molecular Biology and Evolution 23(2), 254-267.

Martin DP, Lemey P, Lott M, Moulton V, Posada D, Lefeuvre P. (2010). RDP3: a flexible and fast computer program for analyzing recombination. Bioinformatics 26, 2462-2463.

RDP methods citations:

Lemey P, Rambaut A, Drummond AJ & Suchard MA (2009) PLoS Computational Biology 5, e1000520.

The RDP method: Martin, D. & Rybicki, E. (2000). RDP: detection of recombination amongst aligned sequences. Bioinformatics 16, 562-563.

The GENECONV method: Padidam, M., Sawyer, S. & Fauquet, C. M. (1999). Possible emergence of new geminiviruses by frequent recombination. Virology 265, 218-225.

The Bootscan/Recscan method: Martin, D. P., Posada, D., Crandall, K. A. & Williamson, C. (2005). A modified bootscan algorithm for automated identification of recombinant sequences and recombination breakpoints. AIDS Res Hum Retroviruses 21, 98-102.

The MaxChi method: Maynard Smith, J. (1992). Analyzing the mosaic structure of genes. J Mol Evol 34, 126-129.

The Chimaera method: Posada, D. & Crandall, K. A. (2001). Evaluation of methods for detecting recombination from DNA sequences: Computer simulations. Proc Natl Acad Sci 98, 13757-13762.

The SiScan method: Gibbs, M. J., Armstrong, J. S. & Gibbs, A. J. (2000). Sister-Scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics 16, 573-582.

The 3Seq method: Boni M.F., Posada, D. & Feldman, M.W. (2007). An exact nonparametric method for inferring mosaic structure in sequence triplets. Genetics 176, 1035-1047.

The LARD method: Holmes E.C., Worobey, M. & Rambaut,A. (1999). Phylogenetic evidence for recombination in dengue virus. Mol Biol and Evol 16, 405-409.

The Topal/DSS method: McGuire, G. & Wright,F. (2000). TOPAL 2.0: Improved detection of mosaic sequences within multiple alignments. Bioinformatics 16, 130-134.

The Phylpro method: Weiller, G.F. (1998). Phylogenetic profiles: a graphical method for detecting genetic recombinations in homologous sequences. Mol Biol Evol 15, 326-335.

The VisRD methods: Lemey, P., Lott, M., Martin, D.P. & Moulton, V. (2009). Identifying recombinants in human and primate immunodeficiency virus sequence alignments using quartet scanning. BMC Bioinformatics 10, 126.

Recombination count matrices: Lefeuvre, P., Lett, J.M., Reynaud, B., Martin, D.P. (2007). Avoidance of protein fold disruption in natural virus recombinants. PLoS Pathog 11:e181.

Recombination hotspot test: Heath, L., van der Walt, E., Varsani, A. & Martin D.P. (2006). Recombination patterns in aphthoviruses mirror those found in other picornaviruses. J Virol 80, 11827-11832.

Recombination rate plots: McVean, G. A. T., Myers, S. R., Hunt, S., Deloukas, P., Bentley, D. R. & Donnelly, P. (2004). The Fine-Scale Structure of Recombination Rate Variation in the Human Genome. Science 304, 581-584.

RMin/LD matrices: McVean, G., Awadalla, P. & Fearnhead, P. (2002). A Coalescent-Based Method for Detecting and Estimating Recombination From Gene Sequences. Genetics 160, 1231-1241.

Neighbor joining or least squares trees: Felsenstein, J. (1989). PHYLIP – Phylogeny inference package (version 3.2). Cladistics 5, 164–166.

Maximum likelihood trees: Guindon, S. & Gascuel, O. (2003). A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 52, 696-704.

Bayesian trees: Ronquist, F. & Huelsenbeck, J.P. (2003). MRBAYES 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572-1574.

Posada D and Crandall KA 1998. Modeltest: testing the model of DNA substitution. Bioinformatics 14(9): 817-818.

Swofford, D. L. 2002. PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sinauer Associates, Sunderland, Massachusetts.

Yang, Z. 2007. PAML 4: a program package for phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution 24: 1586-1591 (http://abacus.gene.ucl.ac.uk/software/paml.html).

Yang, Z., and R. Nielsen. 2000. Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Molecular Biology and Evolution 17:32-43.

Huelsenbeck, J. P. and F. Ronquist. 2001. MRBAYES: Bayesian inference of phylogeny. Bioinformatics 17:754-755.

Ronquist, F. and J. P. Huelsenbeck. 2003. MRBAYES 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572-1574.

Cited Papers:

ALZATE-MARIN, A. L., GUIDUGLI, M. C., SORIANI, H. H., MARTINEZ, C. A. & MESTRINER, M. A. 2009. An efficient and rapid DNA minipreparation procedure suitable for PCR/SSR and RAPD analyses in tropical forest tree species. Brazilian Archives of Biology and Technology, 52, 1217-1224.

ARIF, I. A., BAKIR, M. A., KHAN, H. A., AHAMED, A., FARHAN, A. H. A., HOMAIDAN, A. A. A., SADOON, M. A., BAHKALI, A. H. & SHOBRAK, M. 2010. A Simple Method for DNA Extraction from Mature Date Palm Leaves: Impact of Sand Grinding and Composition of Lysis Buffer. International Journal of Molecular Sciences, 11, 3149-3157.

AVISE, J. C., SHAPIRA, J. F., DANIEL, S. W., AQUADRO, C. F. & LANSMAN, R. A. 1983. Mitochondrial DNA differentiation during the speciation process in Peromyscus. Molecular Biology and Evolution, 1, 38-56.

BAILLIE, P., FRASER, T., HALL, R. & MYERS, K. 2004. Geological development of Eastern Indonesia and the northern Australia collision zone: a review. In: Ellis, G.K., Baillie, P.W. & Munson, T.J. (Eds), Proceedings of the Timor Sea Symposium, Darwin, Northern Territory, Australia 19-20 June, 2003. Northern Territory Geological Survey, Special Publication, 1, 539-550.

BAYER, R. J., MABBERLEY, D. J., MORTON, C., MILLER, C. H., SHARMA, I. K., PFEIL, B. E., RICH, S., HITCHCOCK, R. & SYKES, S. 2009. A molecular phylogeny of the orange subfamily (Rutaceae: Aurantioideae) using nine cpDNA sequences. American Journal of Botany, 96, 668-685.

BEATTIE, G. A. C., HOLFORD, P., MABBERLEY, D. J., HAIGH, A. M. & BROADBENT, P. 2008. On the Origins of Citrus, Huanglongbing, Diaphorina citri and Trioza erytreae. In: GOTTWALD, T. R. & GRAHAM, J. H., eds. International Research Conference on Huanglongbing, 2008, Orlando, Florida. http://www.plantmanagementnetwork.org/proceedings/irchlb/2008/presentations/default.asp#keynotes: IRCHLB Proceedings Compilation Copyright © 2009 Plant Management Network.

CHASE, M. W., MORTON, C. M. & KALLUNKI, J. A. 1999. Phylogenetic relationships of Rutaceae: a cladistic analysis of the subfamilies using evidence from RBC and ATP sequence variation. Am. J. Bot., 86, 1191-1199.

CLAUSSEN, M. & GAYLER, V. 1997. The Greening of the Sahara during the Mid-Holocene: Results of an Interactive Atmosphere-Biome Model. Global Ecology and Biogeography Letters, 6, 369-377.

CRONN, R. & WENDEL, J. F. 2004. Cryptic trysts, genomic mergers, and plant speciation. New Phytologist, 161, 133-142.

CROUSE 1987. Ethanol Precipitation: Ammonium Acetate as an Alternative to Sodium Acetate. Focus, 9, 2-5.

DALY, M. C., COOPER, M. A., WILSON, I., SMITH, D. G. & HOOPER, B. G. D. 1991. Cenozoic plate tectonics and basin evolution in Indonesia. Marine and Petroleum Geology, 8, 2-21.

DE NOBLET-DUCOUDRÉ, N., CLAUSSEN, M. & PRENTICE, C. 2000. Mid-Holocene greening of the Sahara: first results of the GAIM 6000 year BP Experiment with two asynchronously coupled atmosphere/biome models. Climate Dynamics, 16, 643-659.

DOYLE, J. J. & DOYLE, J. L. 1990. Isolation of plant DNA from fresh tissue. Focus, 12, 13-15.

DRÁBKOVÁ, L., KIRSCHNER, J. & VLČEK, Č. 2002. Comparison of seven DNA extraction and amplification protocols in historical herbarium specimens of Juncaceae. Plant Molecular Biology Reporter, 20, 161-175.

EDWARDS, S. V., LIU, L. & PEARL, D. K. 2007. High-resolution species trees without concatenation. Proceedings of the National Academy of Sciences, 104, 5936-5941.

ENGLER, A., KRAUSE, K., PILGER, R. K. F. & PRANTL, K. A. E. 1887. Die Natürlichen Pflanzenfamilien nebst ihren Gattungen und wichtigeren Arten, insbesondere den Nutzpflanzen, unter Mitwirkung zahlreicher hervorragender Fachgelehrten begründet von A. Engler und K. Prantl, fortgesetzt von A. Engler, Leipzig, W. Engelmann.

GADEK, P. A., FERNANDO, E. S., QUINN, C. J., HOOT, S. B., TERRAZAS, T., SHEAHAN, M. C. & CHASE, M. W. 1996. : Molecular Delimitation and Infraordinal Groups. American Journal of Botany, 83, 802-811.

GROPPO, M., PIRANI, J. R., SALATINO, M. L. F., BLANCO, S. R. & KALLUNKI, J. A. 2008. Phylogeny of Rutaceae based on two noncoding regions from cpDNA. American Journal of Botany, 95, 985-1005.

HALL, R. 1997. Cenozoic plate tectonic reconstructions of SE Asia. Geological Society, London, Special Publications, 126, 11-23.

IRVING, E. 1983. Fragmentation and assembly of the continents, Mid-carboniferous to present. Surveys in Geophysics, 5, 299-333.

KREADER, C. 1996. Relief of amplification inhibition in PCR with bovine serum albumin or T4 gene 32 protein. Appl. Environ. Microbiol., 62, 1102-1106.

KUPER, R. & KRÖPELIN, S. 2006. Climate-Controlled Holocene Occupation in the Sahara: Motor of Africa's Evolution. Science, 313, 803-807.

LING, K.-H., WANG, Y., POON, W.-S., SHAW, P.-C. & BUT, P. P.-H. 2009. The Relationship of Fagaropsis and Luvunga in Rutaceae. Taiwania, 54, 338-342.

MABBERLEY, D. J. 1998. Australian Citreae with notes on other Aurantioideae (Rutaceae). Telopea, 7, 333-344.

MAUREIRA-BUTLER, I. J., PFEIL, B. E., MUANGPROM, A., OSBORN, T. C. & DOYLE, J. J. 2008. The Reticulate History of Medicago (Fabaceae). Systematic Biology, 57, 466-482.

MOLE, B. J., UDOVICIC, F., LADIGES, P. Y. & DURETTO, M. F. 2004. Molecular phylogeny of Phebalium (Rutaceae: Boronieae) and related genera based on the nrDNA regions ITS 1+2. Plant Systematics and Evolution, 249, 197-212.

MORTON, C. M. 2009. Phylogenetic relationships of the Aurantioideae (Rutaceae) based on the nuclear ribosomal DNA ITS region and three noncoding chloroplast DNA regions, atpB-rbcL spacer, rps16, and trnL-trnF. Organisms Diversity & Evolution, 9, 52-68.

MORTON, C. M., GRANT, M. & BLACKMORE, S. 2003. Phylogenetic relationships of the Aurantioideae inferred from chloroplast DNA sequence data. Am. J. Bot., 90, 1463-1469.

MYERS, R. M., FISCHER, S. G., LERMAN, L. S. & MANIATIS, T. 1985. Nearly all single base substitutions in DNA fragments joined to a GC-clamp can be detected by denaturing gradient gel electrophoresis. Nucleic Acids Research, 13, 3131-3145.

PFEIL, B. E. & CRISP, M. D. 2008. The age and biogeography of Citrus and the orange subfamily (Rutaceae: Aurantioideae) in Australasia and New Caledonia. American Journal of Botany, 95, 1621-1631.

PICKFORD, M., WANAS, H. & SOLIMAN, H. 2006. Indications for a humid climate in the Western Desert of Egypt 11-10 Myr ago: evidence from Galagidae (Primates, Mammalia), Paris, France, Elsevier.

POON, W.-S., SHAW, P.-C., SIMMONS, M. P. & BUT, P. P.-H. 2007. Congruence of Molecular, Morphological, and Biochemical Profiles in Rutaceae: a Cladistic Analysis of the Subfamilies Rutoideae and Toddalioideae. Systematic Botany, 32, 837-846.

POUND, M. J., HAYWOOD, A. M., SALZMANN, U., RIDING, J. B., LUNT, D. J. & HUNTER, S. J. A Tortonian (Late Miocene, 11.61-7.25 Ma) global vegetation reconstruction. Palaeogeography, Palaeoclimatology, Palaeoecology, In Press, Accepted Manuscript.

PUCHOOA, D. & KHOYRATTY, S.-U. 2004. Genomic DNA extraction from amazonica. Plant Molecular Biology Reporter, 22, 195-196.

RIBEIRO, R. A. & LOVATO, M. B. 2007. Comparative analysis of different DNA extraction protocols in fresh and herbarium specimens of the genus Dalbergia. Genetics and Molecular Research, 6, 173-187.

ROGERS, S. O. & BENDICH, A. J. 1985. Extraction of DNA from milligram amounts of fresh, herbarium and mummified plant tissues. Plant Molecular Biology, 5, 69-76.

SALVO, G., BACCHETTA, G., GHAHREMANINEJAD, F. & CONTI, E. 2008. Phylogenetic relationships of Ruteae (Rutaceae): New evidence from the chloroplast genome and comparisons with non-molecular data. Molecular Phylogenetics and Evolution, 49, 736-748.

SAMUEL, R., EHRENDORFER, F., CHASE, M. W. & GREGER, H. 2001. Phylogenetic Analyses of Aurantioideae (Rutaceae) Based on Non-Coding Plastid DNA Sequences and Phytochemical Features. Plant Biology, 3, 77-87.

SANDERSON, M. J. & DOYLE, J. J. 1992. Reconstruction of Organismal and Gene Phylogenies from Data on Multigene Families: Concerted Evolution, Homoplasy, and Confidence. Systematic Biology, 41, 4-17.

SCOTT, K. D., MCINTYRE, C. L. & PLAYFORD, J. 2000. Molecular analyses suggest a need for a significant rearrangement of Rutaceae subfamilies and a minor reassessment of species relationships within Flindersia. Plant Systematics and Evolution, 223, 15-27.

SCOTT, K. D. & PLAYFORD, J. 1996. DNA Extraction Technique for PCR in Rain Forest Plant Species. Biotechniques, 20, 974-978.

STEFANOVIĆ, S., PFEIL, B. E., PALMER, J. D. & DOYLE, J. J. 2009. Relationships Among Phaseoloid Legumes Based on Sequences from Eight Chloroplast Regions. Systematic Botany, 34, 115-128.

STRAUB, S. C. K., PFEIL, B. E. & DOYLE, J. J. 2006. Testing the polyploid past of soybean using a low-copy nuclear gene - Is Glycine (Fabaceae: Papilionoideae) an auto- or allopolyploid? Molecular Phylogenetics and Evolution, 39, 580-584.

SWINGLE, W. T. & REECE, P. C. 1967. The Botany of Citrus and Its Wild Relatives. In: REUTHER, W., WEBBER, H. J. & BATCHELOR, L. D. (eds.) The Citrus Industry, Revised Edition. University of California, Division of Agricultural Sciences.

TÖPEL, M. 2010. Phylogenetic and Phyloclimatic Inference of the Evolution of Potentilleae (Rosaceae). Doctoral Thesis, University of Gothenburg.