Aristotle University of Thessaloniki, School of Biology, Postgraduate Studies Program «Conservation of Biodiversity and Sustainable Exploitation of Native Plants (BNP)»

Postgraduate Diploma Thesis

“DNA barcoding in native plants of the Labiatae (Lamiaceae) family from Chios Island (Greece) and the adjacent Çeşme-Karaburun Peninsula (Turkey)”

by

Theodoridis Spyros

Supervisor: Konstantinos Vlachonasios, Lecturer

Thessaloniki, January 2011

Contents

Abstract
Introduction
Materials and Methods
  Taxonomic sampling
  DNA extraction and amplification
  Barcoding analysis
Results
  PCR amplification and DNA sequencing
  Intra- and interspecific barcode variation
  Resolving power
Discussion
  PCR and sequencing success (universality)
  Barcode variation and resolution
  Contribution to conservation
References
Appendix 1
Appendix 2
Appendix 3

Abstract

The family Labiatae (Lamiaceae) is known for its fine ornamental and culinary herbs, such as lavender, mint, oregano, sage and thyme, and is a rich source of essential oils for the flavouring and perfume industries. Besides its great economic importance, the Labiatae family contributes significantly to the endemic flora of Greece and Turkey. Owing to its economic and ecological significance, and to the difficulty of identifying several of its taxa from morphological characters, the Labiatae family is an ideal case for developing DNA barcodes. The purpose of this study is to evaluate the utility of DNA barcoding on a local scale in discriminating species of the Labiatae family on Chios Island (Greece) and the adjacent Çeşme-Karaburun Peninsula (Turkey). We chose three cpDNA regions (matK, rbcL, trnH-psbA) proposed by previous studies and tested them as single-region and multiregion barcodes against the criteria set by CBOL (Consortium for the Barcode of Life). Our results show that, for the taxa we examined, matK and trnH-psbA are as useful in discriminating species of the Labiatae as any multiregion combination and could serve as single-region barcodes, contributing to the conservation and trade control of Labiatae species occurring in the East Aegean region.


Introduction

The plant family Labiatae Adans. (= Lamiaceae Martynov, the mint family) is almost cosmopolitan, centered chiefly in the Mediterranean area, and contains about 7173 species in approximately 236 genera, which are classified in seven subfamilies: Ajugoideae Kostel., Lamioideae Harley, Nepetoideae (Dumort.) Luerss., Prostantheroideae Luerss., Scutellarioideae (Dumort.) Caruel, Symphorematoideae Briq. and Viticoideae Briq. (Harley et al. 2004; Kokkini et al. 2003). The family is known for its fine ornamental and culinary herbs, such as basil, lavender, mint, oregano, rosemary, sage and thyme, and is a rich source of essential oils for the flavouring and perfume industries (Wagstaff et al. 1996). Plants of the Labiatae family also have medicinal or ceremonial uses and have been used since antiquity (Kokkini et al. 2000; Matkowski & Piotrowska 2006).

Both Greece and Turkey host a high number of Labiatae taxa (species and subspecies) with variable distribution patterns across their territories (Davis 1982; Kokkini et al. 1988). Besides its great economic importance, the Labiatae family contributes significantly to the endemic flora of both countries, with the rate of endemism reaching 44.2% in Turkey, which is regarded as an important gene center for the Labiatae (Kokkini et al. 1988; Baser 1994). Moreover, several Labiatae taxa in Greece and Turkey present taxonomic difficulties at the intrageneric level, e.g. Micromeria Bentham (Davis 1982) and Mentha L. Sect. Spicatae (Harley 1982; Kokkini 1983), or at the intraspecific level, e.g. Acinos alpinus L. (Davis & Leblebici 1982). Owing to its economic and ecological significance, and to the difficulty of identifying several of its taxa from morphological characters, the Labiatae family is an ideal case for developing DNA barcodes.

Taxon identification has traditionally been the special field of taxonomists, whose numbers cannot cover the ever-increasing needs of non-specialists (Frézal & Leblois 2008). DNA barcoding represents a promising approach to taxon discrimination and identification using short genomic regions (Hebert et al. 2003). Ideally, a DNA sequence (barcode) from such a standardized genomic region can be retrieved from small amounts of tissue of an unidentified organism (Kress & Erickson 2007). This sequence is then compared against a database of reference sequences from identified individuals: either the query sequence matches one in the library, which leads to identification, or the new record specifies a novel barcode for a given species (Hajibabaei et al. 2007). Additionally, the absence of a match to any record in the database could indicate the existence of a new species (Hebert et al. 2004).

While in animal species the success of DNA barcoding based on the mitochondrial gene cytochrome c oxidase subunit 1 (cox1/COI) has been relatively high, the ability of mitochondrial DNA to discriminate between plant species has proven more limited (Fazekas et al. 2009). In land plants, differences in the rates of evolution between taxa (Smith & Donoghue 2008) and the widespread occurrence of interspecific and intergeneric hybrids (Riesenberg et al. 2006) complicate the development of universal barcoding markers (Ford et al. 2009; Kress et al. 2009). Moreover, in most land plants mitochondrial DNA evolves too slowly (Chase & Fay 2009) and undergoes structural rearrangements (Palmer et al. 2000), which limits the utility of mitochondrial genes as universal barcodes (Kelly et al. 2010).

For a DNA region to be useful for plant DNA barcoding, the following basic criteria must be met: (i) conserved flanking regions that enable routine amplification across highly divergent taxa; (ii) short sequence length that allows easy single-pass sequencing; (iii) significant interspecific genetic variability, yet sufficient sequence consistency within species to ensure that intraspecific variability is lower than interspecific variability (Ford et al. 2009; Hollingsworth et al. 2009; Kress & Erickson 2008). Additionally, an effective barcode should be recoverable from herbarium and other degraded DNA samples (Kress et al. 2005). The search for an alternative to the COI region that meets these criteria has focused on different loci of the plastid genome (Chase et al. 2007; Kress & Erickson 2007, 2008; Fazekas et al. 2008; Lahaye et al. 2008; Ford et al. 2009; Hollingsworth et al. 2009). Furthermore, several studies have suggested that a DNA barcode should combine more than one locus (Kress & Erickson 2007; Fazekas et al. 2008; CBOL Plant Working Group 2009).

Given the difficulties in developing a universal barcode capable of discriminating among species in all land plant groups, Kress et al. (2009) suggested that DNA barcodes would be most effectively applied to taxa that occur together in a floristic region or ecological community. Moreover, a regional approach to DNA barcoding is consistent with the approach used for centuries to create morphology-based floras at local and regional levels (Clerc-Blain et al. 2010). Rechinger (1943), in his floristic division of the Aegean region, joined the East Aegean Islands and the adjacent Anatolian peninsulas in one phytogeographic region named “Ostägäis” (“East Aegean”). More recently, Stefanaki et al. (2010), based on numerical calculation of similarity and on chorological data for the Labiatae of Chios Island (East Aegean Islands, Greece) and the Çeşme-Karaburun Peninsula (Anatolia, Turkey), concluded that the family shows close affinities between the two regions and proposed once more that the East Aegean Islands and the neighbouring Anatolian mainland be studied as one phytogeographic entity.

In this study, we evaluate the utility of a regional approach to DNA barcoding in discriminating species of the Labiatae family on Chios Island (Greece) and the adjacent Çeşme-Karaburun Peninsula (Turkey). From the variety of plastid DNA regions that have been proposed as suitable plant DNA barcodes, we chose rbcL (a conserved coding region), a section of matK (a rapidly evolving coding region) and trnH-psbA (a rapidly evolving, length-variable intergenic spacer) and tested them alone and in combination, as suggested by several studies (CBOL Plant Working Group 2009; Hollingsworth et al. 2009; Kress & Erickson 2007). Based on the criteria determined by CBOL (www.barcoding.si.edu/PDF/Guidelines for non-CO1 selection FINAL.pdf), we tested the utility of the proposed regions in terms of (i) amplification and sequencing success (universality); (ii) intra- versus interspecific variation; (iii) resolving power.

Materials and Methods

Taxonomic sampling

Plant collections were carried out throughout Chios Island and the Çeşme-Karaburun Peninsula (Fig. 1) at intervals during 2005-2006 and 2008 by Anastasia Stefanaki (herbarium sample code abbreviation: SA) or by Anastasia Stefanaki and Konstantinos Vlachonasios (SAVK). Collected specimens were press-dried and deposited in the Herbarium of Aristotle University of Thessaloniki (TAU). The specimens were taxonomically identified using basic floristic books, monographs and taxonomic papers (Bothmer 1969, 1985; Tutin et al. 1972; Ietswaart 1980; Davis 1982; Kokkini 1983; Morales 1987; Mennema 1989; Strid & Tan 1991). For the barcoding analysis we used a total of 81 specimens (Table 1). For the majority of the species occurring in both regions, two specimens were used, one from Chios and one from Çeşme-Karaburun. For species occurring in only one region, we used one specimen (Appendix 1).

Figure 1. Chios Island (Greece) and Çeşme-Karaburun Peninsula (Turkey).

DNA extraction and amplification

Total genomic DNA was extracted from herbarium specimens by grinding 10-20 mg of tissue in liquid nitrogen and then using either the protocol described by Dellaporta et al. (1983) or the DNeasy Plant Mini Kit (Qiagen, Hilden, Germany) following the manufacturer’s instructions.

We attempted to obtain sequences for all 81 individuals used in this study. Primer sequences for the two coding regions, matK and rbcL, and for the noncoding region trnH-psbA were obtained from the protocols proposed by the CBOL Plant Working Group (http://www.barcoding.si.edu/plant_working_group.html). A part of matK was amplified using the primers matk390F and matk1326R (Cuenoud et al. 2002), which showed high amplification success in previous studies (Lahaye et al. 2008). For the rbcL region, after several unsuccessful initial attempts to amplify this region in several samples, we used a second, previously proposed reverse primer (Kress & Erickson 2007; Table 1, Appendix 2).

Polymerase chain reaction (PCR) was performed in 12.5-µL reaction mixtures containing 1x buffer (including 1.5 mM MgCl2), 0.2 mM of each dNTP, 0.3 µM of each primer, 0.5 units of KAPA2G Fast DNA polymerase (Kapa Biosystems, KK5009) and 10-20 ng of template DNA. Amplifications were carried out on a Mastercycler Personal (Eppendorf) thermocycler under the following conditions: a first cycle at 94 °C for 4 min; 40 cycles of DNA denaturation at 94 °C for 30 s, primer annealing at 51 °C for 1 min and DNA strand extension at 72 °C for 1 min; and a final cycle at 72 °C for 4 min. Minor adjustments to the primer annealing temperature were sometimes necessary, depending on the primer pair used for amplification. Each PCR product was run on a 0.8% agarose gel containing ethidium bromide, and successful products were sequenced by Macrogen, Inc. (Seoul, South Korea) using the same primers as in the PCR.
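As a quick check of the thermocycling program described above, the following minimal Python sketch adds up the programmed hold times. It assumes that ramping between temperatures can be ignored, so the real run time on the instrument would be somewhat longer; the constant and function names are illustrative only.

```python
# Thermocycling program as described in the text (durations in seconds).
INITIAL_STEP = 4 * 60   # first cycle: 94 °C for 4 min
CYCLES = 40
DENATURATION = 30       # 94 °C for 30 s
ANNEALING = 60          # 51 °C for 1 min
EXTENSION = 60          # 72 °C for 1 min
FINAL_STEP = 4 * 60     # final cycle: 72 °C for 4 min


def program_hold_time_seconds() -> int:
    """Sum of programmed hold times; temperature ramping is ignored (assumption)."""
    per_cycle = DENATURATION + ANNEALING + EXTENSION
    return INITIAL_STEP + CYCLES * per_cycle + FINAL_STEP


total = program_hold_time_seconds()
print(f"{total} s = {total // 60} min")  # 6480 s = 108 min
```

Under these assumptions the program holds for 108 min in total, of which 100 min are spent in the 40 amplification cycles.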

Barcoding analysis

Sequences were initially aligned using MUSCLE (Edgar 2004) and the alignments were visually inspected. Minor manual adjustment of the alignments, conducted in Jalview (Waterhouse et al. 2009), was necessary only for the noncoding trnH-psbA region because of differences in sequence length among taxa. Because of the low sequence quality at the end of the matK region, and in order to avoid the impact of missing data on the analysis, we excluded the characters at the end of the matK alignment (positions 815 to 911). All sequences were submitted to GenBank (accession numbers: HQ902701-HQ902879, Appendix 1).
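The exclusion of the low-quality end of the matK alignment can be illustrated with a short Python sketch. This is not the procedure actually used in the study (the trimming tool is not stated); the function name, sample names and toy sequences are hypothetical.

```python
def trim_alignment(alignment: dict, keep_until: int) -> dict:
    """Keep alignment columns 1..keep_until (1-based, inclusive) for every sequence."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) > 1:
        raise ValueError("aligned sequences must all have the same length")
    return {name: seq[:keep_until] for name, seq in alignment.items()}


# Toy 12-column alignment; names and sequences are made up for illustration.
toy = {
    "sample_A": "ATGCATGCNNNN",
    "sample_B": "ATGCTTGC--NN",
}
trimmed = trim_alignment(toy, keep_until=8)
print(trimmed)
```

For the matK data set, the equivalent call would use `keep_until=814`, which removes positions 815 to 911 and leaves an alignment of 814 characters.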

We estimated the performance of both single-region and multiregion barcodes. For the multiregion barcodes, we concatenated the sequences of those samples that were successfully sequenced for both regions of a two-region combination or for all three regions (cpDNA global data set). In order to estimate genetic distances, we selected the best-fit model of evolution for each data set using jModelTest 0.1.1 (Guindon & Gascuel 2003; Posada 2008) with the Akaike Information Criterion (AIC; Akaike 1974). We calculated intra- and interspecific genetic distances using MEGA 4.0 (Tamura et al. 2007) for species that are represented by multiple individuals. To assess the presence of a barcoding gap (i.e. the gap between intra- and interspecific genetic variation), minimum interspecific distances were plotted against maximum intraspecific distances, as recommended by CBOL (Non-COI Barcode Regions: Guidelines for CBOL Approval). We also calculated the variability of each of the 7 barcodes as the number and percentage of variable characters using MEGA 4.0.
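The barcoding gap criterion can be sketched as follows. The example assumes that a matrix of pairwise genetic distances has already been computed (in the study this was done in MEGA 4.0); the function name and the toy distances are hypothetical. A species is scored as exhibiting a gap when its minimum interspecific distance minus its maximum intraspecific distance is greater than zero.

```python
from itertools import combinations


def barcoding_gaps(species, dist):
    """Return {species: (max_intra, min_inter, has_gap)} for species with >1 individual.

    species: list of species names, one per individual.
    dist: symmetric matrix (list of lists) of pairwise genetic distances.
    """
    max_intra, min_inter = {}, {}
    for i, j in combinations(range(len(species)), 2):
        d = dist[i][j]
        if species[i] == species[j]:
            sp = species[i]
            max_intra[sp] = max(max_intra.get(sp, 0.0), d)
        else:
            for sp in (species[i], species[j]):
                min_inter[sp] = min(min_inter.get(sp, float("inf")), d)
    return {
        sp: (
            max_intra[sp],
            min_inter.get(sp, float("inf")),
            min_inter.get(sp, float("inf")) - max_intra[sp] > 0,
        )
        for sp in max_intra
    }


# Toy data: two individuals each of species A and B, one individual of C.
species = ["A", "A", "B", "B", "C"]
dist = [
    [0.00, 0.01, 0.05, 0.06, 0.08],
    [0.01, 0.00, 0.05, 0.07, 0.09],
    [0.05, 0.05, 0.00, 0.03, 0.02],
    [0.06, 0.07, 0.03, 0.00, 0.04],
    [0.08, 0.09, 0.02, 0.04, 0.00],
]
result = barcoding_gaps(species, dist)
print(result)
```

In this toy example species A shows a gap (0.05 against 0.01), species B does not (0.02 against 0.03), and species C is not scored because it is represented by a single individual.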

To assess the utility of DNA barcoding for accurate species assignment in our data sets, we used the ‘best match’ and ‘best close match’ functions of the program TAXONDNA (Meier et al. 2006). TAXONDNA is an alignment-based parametric clustering software that determines the closest match of a sequence by comparing it to all other sequences in the aligned data set. Under the ‘best match’ method, the identification was considered a success if the query and its closest match were from the same species, whereas mismatched names were counted as failures. Several equally good best matches from different species were considered ambiguous. The ‘best close match’ method additionally requires a similarity threshold, defined as the value below which 95% of all intraspecific pairwise distances are found (Meier et al. 2006). All queries without a barcode match below the threshold value remained unidentified (no match). For the remaining queries, the identity was compared to the species identity of the closest barcode. If the names were identical, the query was considered a correct identification. The identification was considered a failure (incorrect) when the names were mismatched, and ambiguous when several equally good best matches were found that belonged to at least two species. These methods were applied to all 7 barcode data sets (3 single-region, 3 two-region and 1 global cpDNA data set) using uncorrected pairwise distances and a minimum sequence overlap of 300 bp. The inclusion of conspecific sequences in the data set is fundamental for correct assignments to be made. However, in our analysis we included not only species represented by multiple individuals but also several species represented by a single individual. These species are Mentha longifolia, M. suaveolens, Salvia pomifera, S. tomentosa, S. verbenaca, Thymus zygioides and Teucrium montanum; in an initial analysis they demonstrated close genetic affinities to other congeneric species in our data sets.
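The logic of the two distance-based methods can be sketched in Python as below. This is an illustration of the rules as described above, not the TAXONDNA implementation: the function names, the toy sequences, the way the 95% threshold is read off the sorted intraspecific distances, and the treatment of a distance exactly equal to the threshold as a match are all assumptions, and the 300 bp minimum overlap is not enforced.

```python
import math


def p_distance(a, b):
    """Uncorrected pairwise distance over aligned sites where both sequences have a base."""
    sites = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not sites:
        raise ValueError("sequences share no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def intraspecific_threshold(species, seqs, fraction=0.95):
    """Smallest observed distance d such that `fraction` of intraspecific distances are <= d."""
    intra = sorted(
        p_distance(seqs[i], seqs[j])
        for i in range(len(seqs))
        for j in range(i + 1, len(seqs))
        if species[i] == species[j]
    )
    if not intra:
        raise ValueError("no species is represented by more than one sequence")
    return intra[math.ceil(fraction * len(intra)) - 1]


def identify(i, species, seqs, threshold=None):
    """Classify query i against all other sequences.

    threshold=None gives 'best match'; a number gives 'best close match'.
    """
    hits = [(p_distance(seqs[i], seqs[j]), species[j])
            for j in range(len(seqs)) if j != i]
    best = min(d for d, _ in hits)
    if threshold is not None and best > threshold:
        return "no match"
    names = {name for d, name in hits if d == best}
    if len(names) > 1:
        return "ambiguous"
    return "correct" if names == {species[i]} else "incorrect"


# Toy aligned sequences (made up); species C and D are singletons.
species = ["A", "A", "B", "B", "C", "D"]
seqs = [
    "ACGTACGTAC",  # A
    "ACGTACGTAT",  # A
    "TTGTACGAAC",  # B
    "TTGTACGAAG",  # B
    "ACGAACGTAC",  # C, one difference from the first A sequence
    "GGGGGGGGGG",  # D, distant from everything
]
threshold = intraspecific_threshold(species, seqs)
best = [identify(i, species, seqs) for i in range(len(seqs))]
close = [identify(i, species, seqs, threshold) for i in range(len(seqs))]
print(threshold, best, close, sep="\n")
```

In the toy data, the two singleton species cannot be identified correctly because no conspecific sequence exists in the data set. One is misassigned under both methods, while the distant one is misassigned under ‘best match’ and left unidentified under ‘best close match’.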

Several tree-based methods were used to assess whether sequences in our data sets form species-specific clusters. Neighbor joining (NJ), UPGMA and maximum parsimony (MP) analyses were conducted in MEGA 4.0. For the distance-based NJ and UPGMA methods, the best-fit model of nucleotide substitution (TrN+G) was chosen and node support was assessed with 1000 bootstrap replicates (Felsenstein 1985). MP analysis was conducted with 100 random sequence additions and otherwise default parameters. Node support values for individual branches of the resulting trees were obtained from heuristic searches of 1000 bootstrap replicates. Bayesian inference (BI) was conducted in MRBAYES v3.1.2 (Huelsenbeck & Ronquist 2001; Ronquist & Huelsenbeck 2003) under the General Time Reversible + Gamma (GTR+G) model of nucleotide substitution, which was selected for all regions. Two independent runs, each with four Markov chain Monte Carlo (MCMC) chains (one cold and three heated), were run for 1 × 10^6 generations, with trees sampled every 1000th generation. Each chain used a random tree as its starting point and the default temperature parameter value of 0.2. The first 250 sampled trees were discarded as “burn-in”. The remaining trees were used to build a 50% majority-rule consensus tree.
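Whether a species forms a species-specific cluster in a given tree can be checked programmatically. The sketch below is a simplified illustration, not the procedure used in this study: it assumes a rooted tree written as nested tuples, leaf labels of the hypothetical form `Species_n`, and it ignores bootstrap or posterior support.

```python
def leaves(node):
    """Return the set of leaf labels below a node of a nested-tuple tree."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))


def forms_cluster(tree, members):
    """True if some node of the rooted tree contains exactly the given leaves."""
    members = set(members)
    if leaves(tree) == members:
        return True
    if isinstance(tree, str):
        return False
    return any(forms_cluster(child, members) for child in tree)


def resolved_species(tree):
    """Map each species with >1 individual to whether it forms a species-specific cluster."""
    by_species = {}
    for leaf in leaves(tree):
        by_species.setdefault(leaf.rsplit("_", 1)[0], set()).add(leaf)
    return {
        sp: forms_cluster(tree, members)
        for sp, members in by_species.items()
        if len(members) > 1
    }


# Toy tree: the two A individuals cluster together, the two B individuals do not.
toy_tree = ((("A_1", "A_2"), ("B_1", "C_1")), "B_2")
status = resolved_species(toy_tree)
rate = 100 * sum(status.values()) / len(status)
print(status, f"{rate:.1f}% resolved")
```

Here one of the two species represented by multiple individuals is resolved, giving a resolution rate of 50.0% for the toy tree.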

Results

PCR amplification and DNA sequencing

The first step in assessing the utility of the candidate barcodes was to estimate amplification and sequencing success (universality) across all 81 individuals. In this regard, trnH-psbA was the best-performing barcode. This target region was successfully amplified for 71 individuals and sequenced for 65, using only one primer pair and a single attempt for most of the samples (Table 1). matK also performed well in terms of amplification success (85.2%). However, reaching this level required multiple PCR attempts and the use of two primer pairs (Table 1). Furthermore, high-quality sequences were obtained for a relatively low proportion of the samples (67.9%). The rbcL region showed an intermediate performance, with 79% of the samples successfully amplified and 72.8% successfully sequenced using two reverse primers and multiple PCR attempts.

Table 1. Amplification and sequencing success for the three individual barcode regions.

Barcode region | Primers used | N (samples tested) | Percentage of samples successfully amplified | Percentage of samples successfully sequenced
matK | matk390F-matk1326R; matk3fkim-matk1rkim | 81 | 85.2% | 67.9%
rbcL | rbcLaF-rbcLaR; rbcLaF-rbcLaR* | 81 | 79% | 72.8%
trnH-psbA | psbA3’f-trnHf_05 | 81 | 87.6% | 80.2%
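The percentages in Table 1 can be related to sample counts with a short calculation. In the sketch below, the trnH-psbA counts (71 amplified, 65 sequenced) are stated in the text and the sequenced counts for matK (55) and rbcL (59) correspond to the sample sizes of the single-region data sets in Table 2. The amplified counts for matK (69) and rbcL (64) are back-calculated from the reported percentages and should be read as assumptions.

```python
TOTAL = 81  # individuals tested per region

# (amplified, sequenced) counts per region; 69 and 64 are back-calculated assumptions.
counts = {
    "matK": (69, 55),
    "rbcL": (64, 59),
    "trnH-psbA": (71, 65),
}


def success_rates(amplified, sequenced, total=TOTAL):
    """Return amplification and sequencing success as percentages rounded to 0.1."""
    return round(100 * amplified / total, 1), round(100 * sequenced / total, 1)


for region, (amp, seq) in counts.items():
    print(region, success_rates(amp, seq))
```

With conventional rounding, 71 of 81 gives 87.7% rather than the 87.6% shown in Table 1, which suggests that the table value was truncated rather than rounded; the other values agree with the table.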

Intra- and interspecific barcode variation

Features of the individual and combined barcode data sets are shown in Table 2. Because we were not able to obtain the same sample size for all barcoding regions, the individual and multiregion barcoding analyses differ in the individuals and taxa examined. The percentage of variable characters ranged from 10.7% (rbcL) to 49.7% (trnH-psbA). Barcode regions also varied in the proportion of species whose minimum interspecific genetic distance was larger than their maximum intraspecific distance (barcoding gap). Distances were calculated using the TrN+G (Tamura-Nei with gamma rates) model of evolution, which was the best-fit model among those implemented in the MEGA 4.0 software. Plots of maximum intraspecific versus minimum interspecific TrN+G distances are shown in Figure 2 and illustrate the presence or absence of a barcoding gap for the individual and combined barcodes. The rbcL region had the worst performance in terms of interspecific variation, with zero interspecific genetic distance detected for 8 species and only 47.1% (8/17) of the species represented by >1 individuals demonstrating a barcoding gap (Table 2, Appendix 3). Combining matK with rbcL increased the proportion of species that exhibited a barcoding gap from 47.1% to 78.6%. Although some intraspecific variation was detected for the matK region, it had the best performance among the individual and combined barcodes, with 83.3% of the species demonstrating a barcoding gap. The combination of the three regions (matK+rbcL+trnH-psbA) had lower success (69.2%) than two of the three two-region combinations, but also the lowest number of species (13) represented by multiple individuals (Table 2, Appendix 3).

Table 2. Characteristics of the 7 DNA barcodes evaluated in this study. N indicates the sample size for each barcode. Presence of a barcoding gap is based on species that are represented by >1 individuals and had a difference between minimum interspecific and maximum intraspecific genetic distance greater than zero.

Barcode | N individuals (N species) | Alignment length | Percentage of variable characters | Species represented by multiple individuals | Percentage of species exhibiting barcoding gap
matK | 55 (35) | 814 | 32.3% | 18 | 83.3%
rbcL | 59 (40) | 523 | 10.7% | 17 | 47.1%
trnH-psbA | 65 (42) | 821 | 49.7% | 21 | 76.2%
matK+rbcL | 49 (33) | 1337 | 23.3% | 14 | 78.6%
matK+trnH-psbA | 52 (34) | 1637 | 41.8% | 16 | 68.8%
rbcL+trnH-psbA | 52 (35) | 1362 | 35.1% | 15 | 80%
matK+rbcL+trnH-psbA | 47 (32) | 2133 | 33.1% | 13 | 69.2%

Figure 2. Plots of minimum interspecific TrN+G distances versus maximum intraspecific TrN+G distances for the individual (A) and the combined (B) barcode regions. Each data point represents a species for which two or more individuals were sampled. Species that fall above the 1:1 line exhibit a barcoding gap.

Resolving power

The results of the similarity tests performed in TAXONDNA are shown in Table 3. For every candidate barcode, the rate of correct identification was the same under the “best match” and “best close match” methods. The calculated 95% similarity threshold (i.e., the value below which 95% of all intraspecific distances fall) ranged from 0.58% (rbcL) to 9.77% (trnH-psbA). Among the single-region barcodes, trnH-psbA had the highest rate of correct classification (77.6%), which was also the highest among all barcodes, followed by matK (65.9%) and rbcL (48.8%). Although rbcL had a low success rate as a single-region barcode, it achieved the second-highest percentage of correct classification when combined with trnH-psbA (75.7%). The three other multiregion barcodes showed only marginal differences in classification success. The matK region had lower classification success as an individual barcode than in combination with rbcL (66.7%), with trnH-psbA (69.2%) or with both (69.7%).

Ambiguous classification was also the same under the two methods, while incorrect classification differed slightly owing to the “no match” option of the best close match method (Table 3). The relatively high rates of incorrect classification in all barcodes (7.3% to 30.3%) arose from the absence of a conspecific sequence for several species (see Materials and Methods). Discrepancies between the classification success of multiregion barcodes and that of individual regions are the result of differences in sample sizes.

Results of the tree-based methods are shown in Table 4. Resolving power ranged from 66.7% to 83.3% for the matK region and from 61.9% to 81% for trnH-psbA. The rbcL region resolved the lowest proportion of species (47.1% to 52.9%) among the single-region and multiregion barcodes. However, when combined with matK, rbcL discriminated 85.7% of the species under the MP method, the best performance among the multiregion barcodes under this method, followed by the matK+rbcL+trnH-psbA barcode, which discriminated 84.6% of the species under both the MP and BI methods. The matK+trnH-psbA barcode resolved 81.3% of the species under the BI method, while the resolution rate for rbcL+trnH-psbA was 73.3% under the NJ, UPGMA and BI methods. Resolution with bootstrap support ≥70% (NJ, UPGMA, MP) or posterior probability ≥0.95 (BI) was generally poorer. However, combining regions increased species resolution in terms of clade support. For the three-region combined barcode (matK+rbcL+trnH-psbA), all species-specific clusters had bootstrap support ≥70% or posterior probability ≥0.95 under the four different methods (Table 4).

Table 3. Identification success of the 7 barcodes using the TAXONDNA program under the “best match” (BM) and “best close match” (BCM) methods. Threshold distances were calculated from the observed sequences. Numbers in parentheses indicate the number of sequences belonging to species that are represented by >1 individuals.

Barcode | N sequences | BM correct | BM ambiguous | BM incorrect | BCM correct | BCM ambiguous | BCM incorrect | BCM no match | Threshold (%)
matK | 44 (38) | 65.9% | 9.1% | 25.0% | 65.9% | 9.1% | 25.0% | 0 | 1.36
rbcL | 41 (36) | 48.8% | 43.9% | 7.3% | 48.8% | 43.9% | 7.3% | 0 | 0.58
trnH-psbA | 49 (44) | 77.6% | 4.1% | 18.4% | 77.6% | 4.1% | 16.3% | 2.04% | 9.77
matK+rbcL | 36 (30) | 66.7% | 5.6% | 27.8% | 66.7% | 5.6% | 27.8% | 0% | 1.05
matK+trnH-psbA | 39 (34) | 69.2% | 2.6% | 28.2% | 69.2% | 2.6% | 25.6% | 2.56% | 4.55
rbcL+trnH-psbA | 37 (32) | 75.7% | 0% | 24.3% | 75.7% | 0% | 21.6% | 2.7% | 5.28
matK+rbcL+trnH-psbA | 33 (28) | 69.7% | 0% | 30.3% | 69.7% | 0% | 27.3% | 3.03% | 2.95

While the percentage of species-specific clusters varied among barcodes and tree-based methods, species were consistently clustered in subfamily-specific clades, in agreement with Harley et al. (2004), for the majority of the barcodes and methods. Here we present cladograms (Figs. 3-5) for the two individual barcodes (matK, trnH-psbA) that included the highest number of species represented by multiple individuals, and for the three-region combination barcode (matK+rbcL+trnH-psbA), which generally had the highest rates of statistical support (bootstrap, posterior probability).

Table 4. Resolving power (%) of the 7 barcodes under four different tree-based methods for species (N) that are represented by >1 individuals. Numbers in parentheses indicate the percentage of species-specific clusters with bootstrap support ≥70% (NJ, UPGMA, MP) or posterior probability ≥0.95 (BI).

Barcode | N species | NJ | UPGMA | MP | BI
matK | 18 | 77.8 (50) | 83.3 (72.2) | 83.3 (66.7) | 66.7 (61.1)
rbcL | 17 | 52.9 (35.3) | 52.9 (35.3) | 52.9 (41.2) | 47.1 (47.1)
trnH-psbA | 21 | 81 (52.4) | 81 (76.2) | 66.7 (66.7) | 61.9 (61.9)
matK+rbcL | 14 | 78.6 (64.3) | 78.6 (64.3) | 85.7 (85.7) | 64.3 (57.1)
matK+trnH-psbA | 16 | 68.7 (68.7) | 68.7 (68.7) | 75 (75) | 81.3 (75)
rbcL+trnH-psbA | 15 | 73.3 (66.7) | 73.3 (73.3) | 66.7 (53.3) | 73.3 (66.7)
matK+rbcL+trnH-psbA | 13 | 76.9 (76.9) | 76.9 (76.9) | 84.6 (84.6) | 84.6 (84.6)

UPGMA analysis of the matK region had the highest discrimination rates. One of the species that did not form a species-specific cluster is Salvia virgata: the individual collected from Chios Island clustered with the individual of S. verbenaca collected from the same island. All other unresolved species belong to the genus Mentha (Fig. 3).

NJ analysis of the trnH-psbA barcode had the same resolution rate as the UPGMA method, but a lower rate when bootstrap support ≥70% is used as the criterion. In this analysis, neither Salvia virgata nor S. fruticosa formed species-specific clusters; the Turkish sample of S. fruticosa clustered with the sample of S. pomifera, which was also collected in Turkey. However, the clades that include S. virgata and S. verbenaca are not well supported (Fig. 4).

Under the BI method, which discriminated the same number of species as the MP method for the three-region combination barcode (matK+rbcL+trnH-psbA), S. fruticosa did not form a monophyletic clade, whereas S. virgata did. For this barcode, however, we had the lowest number of sequences and species available for analysis (Fig. 5).

Figure 3. UPGMA cladogram for the matK region based on the TrN+G model of nucleotide substitution. Asterisks indicate bootstrap node support ≥70%. Species names are followed by specimen accession numbers and, in parentheses, abbreviations for the country in which the sample was collected.

Figure 4. NJ cladogram for the trnH-psbA region based on the TrN+G model of nucleotide substitution. Asterisks indicate bootstrap node support ≥70%. Species names are followed by specimen accession numbers and, in parentheses, abbreviations for the country in which the sample was collected.

Figure 5. Fifty percent majority-rule consensus cladogram inferred by Bayesian analysis of the global cpDNA data set based on the GTR+G model of nucleotide substitution. Species names are followed by specimen accession numbers and, in parentheses, abbreviations for the country in which the sample was collected.

Discussion

PCR and sequencing success (universality)

Important criteria for evaluating the suitability of a DNA barcode are amplification and sequencing success (CBOL Plant Working Group 2009). In this regard, the non-coding trnH-psbA spacer performed best among the three regions tested here (80.2% of samples successfully sequenced). This result is consistent with previous studies on land plants, which have reported rates from 78.5% to 99% (Kress & Erickson 2007; Fazekas et al. 2008; Gonzalez et al. 2009; Hollingsworth et al. 2009; Kress et al. 2009). However, Kelly et al. (2010) reported 0% sequencing success, despite using more than one primer pair.

The rbcL region showed intermediate performance compared with matK and trnH-psbA (Table 1). Higher recovery rates than the one obtained in our study have been reported in the literature: many studies that have tested the utility of rbcL report amplification and sequencing success rates ≥90% using one or two primer pairs (Kress et al. 2005; Fazekas et al. 2008; Kress et al. 2009).

The matK region was clearly the most difficult to amplify and sequence. Although we used two widely used primer pairs (Table 1) and made several attempts under various conditions, we obtained a relatively low proportion of good-quality sequences (67.9%) compared with the other two regions. Recovery rates for matK vary widely in the literature. Sass et al. (2007) reported 24% success in Cycadales, while Kress et al. (2009) and Gonzalez et al. (2009), both working on tropical species, reported 69% and 68% recovery success, respectively. Furthermore, the initial length of the matK amplicon in our study was 911 bp (varying slightly depending on the primers used), which exceeds the ~700 bp upper limit suggested for an ideal barcode and makes it difficult to obtain good-quality sequence along the whole length of the region (Cowan et al. 2006).
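The success rates quoted above can be cross-checked against Appendix 1. The sketch below assumes that each deposited GenBank accession corresponds to one successfully sequenced specimen, that accession numbers are contiguous within each region (HQ902701 to HQ902755 for matK, HQ902756 to HQ902814 for rbcL, HQ902815 to HQ902879 for trnH-psbA), and that 81 specimens were sampled in total. These figures are read off the appendix and are assumptions of this example, as is the function name `success_rate`.

```python
# Accession ranges and the specimen total are read off Appendix 1 (assumptions).
TOTAL_SPECIMENS = 81
ACCESSION_RANGES = {
    "matK": (902701, 902755),
    "rbcL": (902756, 902814),
    "trnH-psbA": (902815, 902879),
}


def success_rate(first, last, total=TOTAL_SPECIMENS):
    """Percentage of specimens with a deposited sequence, to one decimal."""
    n_sequences = last - first + 1
    return round(100 * n_sequences / total, 1)


rates = {region: success_rate(*bounds) for region, bounds in ACCESSION_RANGES.items()}
for region, rate in rates.items():
    print(f"{region}: {rate}%")
```

Under these assumptions the matK and trnH-psbA values reproduce the 67.9% and 80.2% reported in the text.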

Although we are not aware of any extensive barcoding study focused on the Labiatae family, we suspect that the abundance of secondary metabolites in many Labiatae species may affect DNA extraction and amplification and, consequently, the overall performance of DNA barcoding (Friar 2005).

Barcode variation and species resolution

The second criterion we used to evaluate the performance of DNA barcoding in this study was the assessment of intra- and interspecific variation. As a single-region barcode, rbcL performed worst, showing the lowest level of variability (10.7%) among the Labiatae species. Furthermore, fewer than 50% of the species presented a barcoding gap for rbcL, which makes this region unsuitable as a stand-alone DNA barcode for the Labiatae family. Variability of trnH-psbA was high owing to the presence of indels and homopolymer runs, which have the potential to overweight distance measures (Newmaster & Ragupathy 2009). Both trnH-psbA and matK presented the desirable barcoding gap. Our results are consistent with previous studies that have tested the variability of these three regions in land plants (Lahaye et al. 2008; Newmaster et al. 2008; Newmaster & Ragupathy 2009; Pettengill & Neel 2010). Using multiregion barcodes did not significantly increase the proportion of species showing a barcoding gap (Table 2).
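A barcoding gap is commonly taken to be present for a species when its minimum interspecific distance exceeds its maximum intraspecific distance. The sketch below applies that rule to the rbcL distances listed in Appendix 3. The strict inequality and the function name `has_barcoding_gap` are assumptions of this example and may differ from the exact criterion behind Table 2.

```python
# (max intraspecific, min interspecific) TrN+G distances for rbcL, from Appendix 3.
RBCL_DISTANCES = {
    "Lavandula stoechas": (0.0, 0.02),
    "Salvia viridis": (0.0, 0.002),
    "Salvia fruticosa": (0.0, 0.0),
    "Salvia verbenaca": (0.0, 0.0),
    "Salvia virgata": (0.002, 0.0),
    "Thymus sipyleus": (0.006, 0.0),
    "Acinos alpinus": (0.0, 0.0),
    "Mentha spicata": (0.0, 0.0),
    "Mentha pulegium": (0.0, 0.0),
    "Teucrium polium": (0.0, 0.0),
    "Teucrium divaricatum": (0.0, 0.004),
    "Teucrium scordium": (0.002, 0.004),
    "Ajuga orientalis": (0.0, 0.006),
    "Lamium amplexicaule": (0.0, 0.012),
    "Ballota acetabulosa": (0.0, 0.006),
    "Prasium majus": (0.0, 0.006),
    "Stachys cretica": (0.002, 0.002),
}


def has_barcoding_gap(max_intra, min_inter):
    """Assumed rule: the closest other species is farther than the farthest conspecific."""
    return min_inter > max_intra


with_gap = [
    species
    for species, (max_intra, min_inter) in RBCL_DISTANCES.items()
    if has_barcoding_gap(max_intra, min_inter)
]
share = 100 * len(with_gap) / len(RBCL_DISTANCES)
print(f"{len(with_gap)} of {len(RBCL_DISTANCES)} species ({share:.1f}%) show a barcoding gap")
```

With these values 8 of the 17 species show a gap, which agrees with the statement that the proportion for rbcL is below 50%.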

To assess our final criterion, resolving power, we used tree-based and similarity methods. With the similarity methods in TAXONDNA, trnH-psbA had the highest rate of correct identifications, followed by the multiregion barcode rbcL+trnH-psbA. The fact that the multiregion matrices included fewer conspecific sequences than the single-region matrices may have affected the rate of correct identifications. On the basis of tree-based methods, matK had greater resolving power under UPGMA, MP and BI than any other single-region barcode, while trnH-psbA outperformed matK only under the NJ method. The combination matK+rbcL, which has been proposed by CBOL as a universal plant barcode, had the highest rate of resolution under the MP method, although it included fewer species. In practice, our results suggest that a significant proportion of the samples will not be identified using matK+rbcL owing to amplification failure. matK is one of the most rapidly evolving plastid coding regions and has consistently shown high levels of discrimination among angiosperm species (CBOL Plant Working Group 2009). Our results indicate that matK could serve adequately as a single-region barcode in the Labiatae family, which is consistent with the conclusion of Lahaye et al. (2008) that matK is a suitable barcode region for angiosperms. Nevertheless, the relatively low amplification and sequencing success, despite the use of more than one primer pair, indicates that further primer development is needed. The trnH-psbA intergenic spacer also performed well, both in terms of amplification and sequencing success and in terms of resolving power. Kress et al. (2005) suggest that trnH-psbA is the best plastid option for a DNA barcode sequence, with good priming sites, length and interspecific variation. The main problem with noncoding regions such as trnH-psbA is the difficulty of alignment (Kress et al. 2005; Newmaster & Ragupathy 2009). However, in our study, only minor alignment adjustments were necessary. Based on our results, we suggest that matK as well as trnH-psbA are as useful in discriminating species of the Labiatae family from Chios Island (Greece) and the adjacent Çeşme-Karaburun Peninsula (Turkey) as any multiregion combination for the taxa we examined.
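The similarity-based identifications can be illustrated with a "best match" rule in the spirit of Meier et al. (2006): each query takes the species of its closest sequence, and the result is counted as correct if that species matches, incorrect if it does not, and ambiguous when equally close sequences belong to different species. The sketch below is not TAXONDNA itself; the sample names, the distances and the function name `best_match` are hypothetical.

```python
def best_match(query, distances, species_of):
    """Classify one query as 'correct', 'incorrect' or 'ambiguous'."""
    others = {s: d for s, d in distances[query].items() if s != query}
    closest = min(others.values())
    nearest_species = {species_of[s] for s, d in others.items() if d == closest}
    if len(nearest_species) > 1:
        return "ambiguous"
    return "correct" if nearest_species == {species_of[query]} else "incorrect"


# Hypothetical samples and symmetric pairwise distances, for illustration only.
species_of = {"A1": "sp. A", "A2": "sp. A", "B1": "sp. B", "B2": "sp. B", "C1": "sp. C"}
pairs = {
    ("A1", "A2"): 0.001, ("A1", "B1"): 0.020, ("A1", "B2"): 0.022, ("A1", "C1"): 0.030,
    ("A2", "B1"): 0.021, ("A2", "B2"): 0.023, ("A2", "C1"): 0.031,
    ("B1", "B2"): 0.004, ("B1", "C1"): 0.004, ("B2", "C1"): 0.003,
}
distances = {sample: {} for sample in species_of}
for (a, b), d in pairs.items():
    distances[a][b] = d
    distances[b][a] = d

results = {sample: best_match(sample, distances, species_of) for sample in species_of}
for sample, outcome in results.items():
    print(sample, species_of[sample], outcome)
```

A sample with no conspecific in the matrix, such as C1 here, can never be scored as correct under this rule, which is one way in which matrices with fewer conspecific sequences can lower the rate of correct identifications.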

The three species of the genus Mentha used in this study were consistently unresolved in both single- and multiregion analyses under the majority of the tree-based methods. These are the main species that decreased the resolution rates. M. spicata is the hybrid between M. suaveolens and M. longifolia, and introgression and hybridization among these species are possible where they co-occur (Harley & Brighton 1977; Harley 1982). Identifying hybrids would be desirable for a DNA barcode, and in the case of long-established natural hybrid species this should not be problematic (Cowan et al. 2006). However, in cases of recent hybridization or ongoing introgression it is not possible to make a reliable identification using one or even two plastid DNA regions (Chase et al. 2005). In plants, plastid genes are uniparentally inherited (most often maternally); therefore, identification of hybrids would necessitate the inclusion of multiple single-copy nuclear genes in the barcode (Cowan et al. 2006). Nevertheless, several challenges remain in developing nuclear gene barcodes (Fazekas et al. 2009).

Voucher specimen SA1464 belongs to the genus Origanum, section Majorana, and was found on Chios co-occurring with Origanum onites. Morphologically, it resembles two species of sect. Majorana, i.e. O. onites (acuminate leaves with prominent veins on the lower surface) and O. majorana (spikes arranged in panicles) (Ietswaart 1980; Ietswaart 1982). The latter species, a native of Cyprus, is doubtfully naturalized in Greece, where it is widely cultivated under the vernacular name “marjoram” and may be found in the wild as a garden escape (Ietswaart 1982; Greuter et al. 1986). The specimen collected from Chios may be either a garden escape or a hybrid between O. onites and O. majorana. According to Ietswaart (1982), hybridization is frequent in Origanum and hybrids are likely to be found where species co-occur. Unfortunately, we were able to amplify only the trnH-psbA region for this sample, and its clustering outside the Origanum clade provides insufficient information about its identity.

Contribution to conservation

Plants are a vital part of the world’s biological diversity and an essential resource for human well-being. Besides the crop plants that provide our basic food and fibres, many thousands of wild plants have great economic and cultural importance and potential, providing food, medicine, fuel, clothing and shelter for vast numbers of people throughout the world (GSPC 2002). Of particular concern is the fact that many species are in danger of extinction, threatened by habitat transformation, over-exploitation, alien invasive species, pollution and climate change (Krupnick & Kress 2005). The ultimate and long-term objective of the Global Strategy for Plant Conservation (GSPC), as well as of the European Strategy for Plant Conservation (ESPC), is to halt the current and ongoing loss of plant diversity. More specifically, Target 11 focuses on the biodiversity of wild flora endangered by international trade. More than 400,000 tonnes of medicinal and aromatic plants are traded globally each year, with 80% harvested from the wild (Planta Europa 2008).

Because of the great economic importance of many species of the Labiatae family, the reliable discrimination and identification of these species is critical. The cost and time-effectiveness of DNA barcoding enables easy species identification (Frézal & Leblois 2008), even from small amounts of plant tissue. Therefore, DNA barcoding can be useful not only for but also for the control of the tradable plant genetic resources contributing to the protection and maintenance of plant diversity.

Our study suggests that matK and trnH-psbA could serve as single region barcodes and contribute to the conservation and the trade control of species of the Labiatae family occurring in the East Aegean.

References

Akaike H (1974) A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723.

Baser KHC (1994) Essential oils of Labiatae from Turkey: Recent results. Newsletter, 3, 6–11.

Bothmer Rv (1969) Studies in the Aegean Flora XIV. Studies in Scutellaria Section Vulgares Subsection Peregrinae from Greece and Adjacent Turkey. Botaniska Notiser, 122, 38–56.

Bothmer Rv (1985) Differentiation patterns in the Scutellaria albida group (Lamiaceae) in the Aegean area. Nordic Journal of Botany, 5, 421–439.

CBOL Plant Working Group (2009) A DNA barcode for land plants. Proceedings of the National Academy of Sciences of the United States of America, 106, 12794–12797.

Chase MW, Cowan RS, Hollingsworth PM et al. (2007) A proposal for a standardised protocol to barcode all land plants. Taxon, 56, 2, 295–299.

Chase MW, Fay MF (2009) Barcoding of plants and fungi. Science, 325, 5941, 682–683.

Clerc-Blain JLE, Starr JR, Bull RD, Saarela JM (2010) A regional approach to plant DNA barcoding provides high species resolution of sedges (Carex and Kobresia, Cyperaceae) in the Canadian Arctic Archipelago. Molecular Ecology Resources, 10, 1, 69–91.

Cowan RS, Chase MW, Kress WJ, Savolainen V (2006) 300,000 species to identify: problems, progress, and prospects in DNA barcoding of land plants. Taxon, 55, 3, 611–616.

Cuenoud P, Savolainen V, Chatrou LW et al. (2002) Molecular phylogenetics of Caryophyllales based on nuclear 18S rDNA and plastid rbcL, atpB, and matK DNA sequences. American Journal of Botany, 89, 1, 132–144.

Davis PH (1982) Flora of Turkey and the East Aegean Islands, vol 7. Edinburgh University Press, Edinburgh.

Davis PH, Leblebici E (1982) Acinos Miller. In: Flora of Turkey and the East Aegean Islands (ed. Davis PH), pp. 331–335. Edinburgh University Press, Edinburgh.

Davis PH (1982) Micromeria Bentham. In: Flora of Turkey and the East Aegean Islands (ed. Davis PH), pp. 335–346. Edinburgh University Press, Edinburgh.

Dellaporta SL, Wood J, Hicks JB (1983) A plant DNA minipreparation: version II. Plant Molecular Biology Reporter, 1, 19–21.

Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32, 1792–1797.

Fazekas AJ, Burgess KS, Kesanakurti PR et al. (2008) Multiple multilocus DNA barcodes from the plastid genome discriminate plant species equally well. PLoS One, 3, e2802.

Fazekas AJ, Kesanakurti PR, Burgess KS et al. (2009) Are plant species inherently harder to discriminate than animal species using DNA barcoding markers? Molecular Ecology Resources, 9, 130–139.

Felsenstein J (1985) Confidence limits on phylogenies: An approach using the bootstrap. Evolution, 39, 783–791.

Ford CS, Ayres KL, Toomey N et al. (2009) Selection of candidate coding DNA barcoding regions for use on land plants. Botanical Journal of the Linnean Society, 159, 1–11.

Friar EA (2005) Isolation of DNA from plants with large amounts of secondary metabolites. In Molecular evolution: producing the biochemical data (eds Zimmer EA, Roalson EH), pp. 3–14. Elsevier Academic Press, San Diego.

Frézal L, Leblois R (2008) Four years of DNA barcoding: Current advances and prospects. Infection, Genetics and Evolution, 8, 5, 727–736.

Global strategy for plant conservation (2002). Convention on Biological Diversity, Montreal.

Gonzalez MA, Baraloto C, Engel J, Mori SA, Pétronelli P, Riéra B, Roger A, Thébaud C, Chave J (2009) Identification of Amazonian trees with DNA barcodes. PLoS ONE, 4, e7483.

Greuter W, Burdet HM, Long G (1986) Med-Checklist, Vol 3. Conservatoire et Jardin Botaniques, Genève.

Guindon S, Gascuel O (2003) A simple, fast and accurate method to estimate large phylogenies by maximum-likelihood. Systematic Biology, 52, 696–704.

Hajibabaei M, Singer GAC, Hebert PDN, Hickey DA (2007) DNA barcoding: how it complements taxonomy, molecular phylogenetics and population genetics. Trends in Genetics, 23, 167–172.

Harley RM, Brighton CA (1977) Chromosome numbers in the genus Mentha L. Botanical Journal of the Linnean Society, 74, 71–96.

Harley RM (1982) Mentha L. In: Flora of Turkey and the East Aegean Islands (ed. Davis PH) pp. 384–394. Edinburgh University Press, Edinburgh.

Harley RM, Atkins S, Budantsev A, Cantino PD et al. (2004) Labiatae. In: The Families and Genera of Vascular Plants, vol. 7. Flowering Plants, Dicotyledons: Lamiales (Except Acanthaceae including Avicenniaceae) (ed. Kubitzki K) pp. 167–275. Springer–Verlag, Berlin.

Hebert PDN, Cywinska A, Ball SL, DeWaard JR (2003) Biological identifications through DNA barcodes. Proceedings of the Royal Society B: Biological Sciences, 270, 313–321.

Hebert PDN, Penton EH, Burns JM, Janzen DH, Hallwachs W (2004) Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator. Proceedings of the National Academy of Sciences of the United States of America, 101, 14812–14817.

Hollingsworth ML, Andra Clark A, Forrest LL et al. (2009) Selecting barcoding loci for plants: Evaluation of seven candidate loci with species-level sampling in three divergent groups of land plants. Molecular Ecology Resources, 9, 439–457.

Huelsenbeck JP, Ronquist F (2001) MrBayes: Bayesian inference of phylogeny. Bioinformatics, 17, 754–755.

Ietswaart JH (1980) A taxonomic revision of the genus Origanum (Labiatae). Leiden University Press, Hague, Boston, London.

Ietswaart JH (1982) Origanum L. In: Flora of Turkey and the East Aegean Islands (ed. Davis PH) pp. 297–313. Edinburgh University Press, Edinburgh.

Kelly LJ, Ameka GK, Chase MW (2010) DNA barcoding of African Podostemaceae (river-weeds): A test of proposed barcode regions. Taxon, 59, 251–260.

Kokkini S (1983) Taxonomic studies of the genus Mentha L. in Greece. PhD Dissertation, Aristotle University of Thessaloniki, Thessaloniki.

Kokkini S, Karagiannakidou V, Hanlidou E, Vokou D (1988) Geographical and altitudinal distribution of the Lamiaceae in Greece. Phyton, 28, 215–228.

Kokkini S, Hanlidou E, Karousou R (2000) Smell and essential oil variation in Labiatae: does it deserve a taxonomist’s appreciation. Botanika Chronika, 13, 187–199.

Kokkini S, Karousou R, Hanlidou E (2003) Herbs of the Labiatae. In: Encyclopedia of Food Sciences and Nutrition, Herbs (ed. Benjamin Caballero) pp. 3082–3090 Academic Press, Oxford.

Kress WJ, Wurdack KJ, Zimmer EA, Weigt LA, Janzen DH (2005) Use of DNA barcodes to identify flowering plants. Proceedings of the National Academy of Sciences of the United States of America, 102, 8369–8374.

Kress WJ, Erickson DL (2007) A Two-Locus Global DNA Barcode for Land Plants: The Coding rbcL Gene Complements the Non-Coding trnH-psbA Spacer Region. PLoS ONE, 2, e508.

Kress WJ, Erickson DL (2008) DNA barcodes: Genes, genomics, and bioinformatics. Proceedings of the National Academy of Sciences of the United States of America, 105, 2761–2762.

Kress WJ, Erickson DL, Jones FA et al. (2009) Plant DNA barcodes and a community phylogeny of a tropical forest dynamics plot in Panama. Proceedings of the National Academy of Sciences, 106, 18621–18626.

Krupnick GA, Kress WJ (2005) Plant Conservation: A Natural History Approach. University of Chicago Press, Chicago.

Lahaye R, Van Der Bank M, Bogarin D et al. (2008) DNA barcoding the floras of biodiversity hotspots. Proceedings of the National Academy of Sciences of the United States of America, 105, 2923–2928.

Matkowski A, Piotrowska M (2006) Antioxidant and free radical scavenging activities of some medicinal plants from the Lamiaceae. Fitoterapia, 77, 346–353.

Meier R, Shiyang K, Vaidya G, Ng PKL (2006) DNA barcoding and taxonomy in Diptera: a tale of high intraspecific variability and low identification success. Systematic Biology, 55, 715–728.

Mennema J (1989) A taxonomic revision of Lamium. PhD Dissertation, Leiden University Press.

Morales VR (1987) El género Thymbra L. (Labiatae). Anales del Jardin Botanico de Madrid, 44, 349–380.

Newmaster SG, Fazekas AJ, Steeves RAD, Janovec J (2008) Testing candidate plant barcode regions in the Myristicaceae. Molecular Ecology Resources, 8, 480–490.

Newmaster SG, Ragupathy S (2009) Testing plant barcoding in a sister species complex of pantropical Acacia (Mimosoideae, Fabaceae). Molecular Ecology Resources, 9, 172–180.

Palmer JD, Adams KL, Cho Y, Parkinson CL, Qiu YL, Song K (2000) Dynamic evolution of plant mitochondrial genomes: mobile genes and introns and highly variable mutation rates. Proceedings of the National Academy of Sciences of the United States of America, 97, 6960–6966.

Posada D (2008) jModelTest: phylogenetic model averaging. Molecular Biology and Evolution, 25, 1253–1256.

Rechinger KH (1943) Flora Aegaea. Springer–Verlag, Wien.

Rieseberg LH, Wood TE, Baack E (2006) The nature of plant species. Nature, 440, 524–527.

Ronquist F, Huelsenbeck JP (2003) MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics, 19, 1572–1574.

Sass C, Little DP, Stevenson DW, Specht CD (2007) DNA barcoding in the Cycadales: testing the potential of proposed barcoding markers for species identification of cycads. PLoS One, 2, e1154.

Smith SA, Donoghue MJ (2008) Rates of molecular evolution are linked to life history in flowering plants. Science, 322, 86–89.

Stefanaki A, Aki C, Vlachonasios K, Kokkini S (2010) Phytogeographic versus political borders: European Union's Lifelong Learning Programme towards a common concept in the East Aegean (E. Greece, W. Turkey). Fresenius Environmental Bulletin, 19, 696–703.

Strid A, Tan K (1991) Mountain Flora of Greece, Vol. 2. Edinburgh University Press, Edinburgh.

Tamura K, Dudley J, Nei M, Kumar S (2007) MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Molecular Biology and Evolution, 24, 1596–1599.

Tutin TG, Heywood VH, Burges NA et al. (1972) Flora Europaea, Vol 3. Cambridge University Press, Cambridge.

Wagstaff SJ, Hickerson L, Spangler R, Reeves PA, Olmstead RG (1998) Phylogeny in Labiatae sl, inferred from cpDNA sequences. Plant Systematics and Evolution, 209, 265–274.

Waterhouse AM, Procter JB, Martin DMA, Clamp M, Barton GJ (2009) Jalview Version 2 – a multiple sequence alignment editor and analysis workbench. Bioinformatics, 25, 1189–1191.

Appendix 1. Labiatae taxa sampled, voucher specimen number, collection coordinates and GenBank accession numbers.

| Taxon | Voucher specimen | Latitude/Longitude | matK | rbcL | trnH-psbA |
|---|---|---|---|---|---|
| Chios Island | | | | | |
| Acinos alpinus (L.) Moench | SA1745 | 38.33N/25.60E | HQ902701 | HQ902756 | HQ902815 |
| A. rotundifolius Pers. | SA1669 | 38.34N/25.54E | HQ902702 | HQ902757 | HQ902816 |
| Ajuga orientalis L. | SA1672 | 38.33N/25.56E | HQ902724 | HQ902758 | HQ902817 |
| Ballota acetabulosa (L.) Bentham | SA1820 | 38.31N/25.53E | HQ902731 | HQ902759 | HQ902818 |
| B. nigra L. | SA1318 | 38.20N/26.08E | - | HQ902786 | - |
| Calamintha incana (Sm.) Boiss. | SA1321 | 38.24N/26.07E | - | - | - |
| Coridothymus capitatus (L.) Reichenb. fil. | SA1853 | 38.09N/26.01E | HQ902732 | - | HQ902819 |
| Lamium amplexicaule L. | SA1653 | 38.29N/25.59E | HQ902722 | HQ902760 | HQ902820 |
| L. moschatum Miller | SA1668 | 38.34N/25.53E | HQ902703 | HQ902761 | HQ902821 |
| Lavandula stoechas L. | SA1852 | 38°11N/26°01E | HQ902704 | HQ902787 | HQ902822 |
| Marrubium vulgare L. | SA1664 | 38.34N/25.52E | HQ902744 | HQ902762 | HQ902823 |
| Melissa officinalis L. | SA1724 | 38.29N/26.06E | - | - | HQ902824 |
| Mentha longifolia (L.) Hudson | SA1579 | 38.30N/25.59E | HQ902745 | HQ902788 | HQ902825 |
| M. spicata L. | SA1533 | 38.32N/25.54E | HQ902746 | HQ902789 | HQ902826 |
| M. spicata L. | SA1536 | 38.32N/25.54E | HQ902733 | HQ902763 | HQ902827 |
| M. pulegium L. | SA1875 | 38.32N/25.60E | HQ902747 | HQ902790 | HQ902828 |
| M. suaveolens Ehrh. | SA1732 | 38.32N/26.07E | HQ902734 | - | HQ902829 |
| Micromeria graeca (L.) Bentham ex Reichenb. | SA1674 | 38.35N/26.04E | - | - | HQ902830 |
| M. juliana (L.) Reichenb. | SA1783 | 38.21N/26.04E | - | - | HQ902831 |
| M. myrtifolia Boiss. & Hohen. | SA1241 | 38.11N/26.02E | - | - | - |
| M. nervosa (Desf.) Bentham | SA1647 | 38.17N/25.56E | - | - | HQ902832 |
| Nepeta italica L. | SA1899 | 38.31N/26.01E | HQ902725 | HQ902791 | HQ902833 |
| Origanum onites L. | SA1807 | 38.12N/25.58E | HQ902752 | - | HQ902834 |
| Origanum sp. | SA1464 | 38.17N/26.06E | - | - | HQ902835 |
| O. sipyleum L. | SA1418 | 38.34N/26.01E | - | HQ902792 | HQ902836 |
| Phlomis cretica C. Presl | SA1685 | 38.11N/26.01E | HQ902705 | HQ902764 | HQ902837 |
| Prasium majus L. | SA1639 | 38.12N/26.01E | - | HQ902765 | HQ902838 |
| Prunella vulgaris L. | SA1287 | 38.35N/26.00E | - | - | HQ902839 |
| Salvia fruticosa Miller | SA1850 | 38.10N/26.02E | HQ902726 | HQ902793 | HQ902840 |
| S. sclarea L. | SA1748 | 38.23N/26.05E | - | HQ902794 | - |
| S. verbenaca L. | SA1757 | 38.20N/26.05E | HQ902753 | HQ902795 | HQ902841 |
| S. virgata Jacq. | SA1782 | 38.22N/26.04E | HQ902735 | HQ902796 | HQ902842 |
| S. viridis L. | SA1648 | 38.17N/25.56E | HQ902706 | HQ902766 | HQ902843 |
| Scutellaria velenovskyi Rech. fil. | SA1854 | 38.10N/26.01E | - | HQ902797 | - |
| Scutellaria velenovskyi Rech. fil. | SA1626 | 38.31N/26.00E | - | HQ902780 | - |
| Sideritis sipylea Boiss. | SA1746 | 38.34N/25.60E | HQ902707 | - | - |
| Stachys cretica L. | SA1761 | 38.17N/26.06E | HQ902708 | HQ902767 | HQ902844 |
| Teucrium brevifolium Schreber | SA1812 | 38.17N/25.56E | HQ902727 | HQ902781 | HQ902845 |
| T. divaricatum Sieber | SA1481 | 38.17N/26.01E | HQ902709 | HQ902782 | HQ902846 |
| T. polium L. | SA1855 | 38.10N/26.01E | HQ902710 | HQ902768 | HQ902847 |
| T. scordium L. | SA1825 | 38.34N/25.53E | HQ902711 | HQ902769 | - |
| Thymbra spicata L. | SA1769 | 38.15N/25.59E | HQ902748 | HQ902798 | HQ902848 |
| Thymus sipyleus Boiss. | SA1744 | 38.33N/25.60E | HQ902749 | HQ902799 | HQ902849 |
| Th. zygioides Griseb. | SA1787 | 38.25N/26.04E | HQ902750 | HQ902800 | - |
| Çeşme-Karaburun Peninsula (Turkey) | | | | | |
| Acinos alpinus (L.) Moench | SAVK2004 | 38.36N/26.30E | HQ902728 | HQ902770 | HQ902850 |
| Ajuga chamaepitys (L.) Schreber | SA1959 | 38.16N/26.32E | HQ902712 | HQ902771 | HQ902851 |
| A. orientalis L. | SAVK1979 | 38.16N/26.31E | HQ902729 | HQ902772 | HQ902870 |
| Ballota acetabulosa (L.) Bentham | SAVK1972 | 38.16N/26.31E | HQ902716 | HQ902783 | HQ902868 |
| B. nigra L. | SA2030 | 38.17N/26.19E | - | - | - |
| Coridothymus capitatus (L.) Reichenb. fil. | SAVK1991 | 38.27N/26.32E | HQ902723 | HQ902801 | HQ902879 |
| Lamium amplexicaule L. | SA1930 | 38.18N/26.19E | HQ902754 | HQ902802 | HQ902852 |
| Lavandula stoechas L. | SA1955 | 38.39N/26.24E | HQ902755 | HQ902803 | HQ902866 |
| Marrubium vulgare L. | SA1929 | 38.18N/26.19E | HQ902730 | - | HQ902853 |
| Melissa officinalis L. | SAVK2027 | 38.17N/26.41E | - | - | - |
| Mentha pulegium L. | SAVK2001 | 38.35N/26.28E | HQ902718 | HQ902804 | HQ902869 |
| M. spicata L. | SA2039 | 38.17N/26.41E | HQ902736 | HQ902805 | HQ902854 |
| M. suaveolens Ehrh. | SA2034 | 38.17N/26.41E | HQ902737 | HQ902806 | HQ902855 |
| Micromeria graeca (L.) Bentham ex Reichb. | SA2059 | 38.40N/26.25E | - | - | - |
| M. juliana (L.) Bentham ex Reichb. | SAVK2005 | 38.34N/26.30E | - | - | HQ902874 |
| M. myrtifolia Boiss. & Hohen. | SA1918 | 38.19N/26.17E | - | - | - |
| Nepeta italica L. | SA2052 | 38.35N/26.30E | - | - | - |
| Origanum onites L. | SA1945 | 38.28N/26.37E | - | HQ902807 | HQ902856 |
| Phlomis fruticosa L. | SA1925 | 38.19N/26.17E | HQ902713 | HQ902773 | HQ902857 |
| Prasium majus L. | SA1921 | 38.17N/26.14E | - | HQ902774 | HQ902858 |
| Salvia fruticosa Miller | SA1951 | 38.40N/26.29E | HQ902738 | HQ902784 | HQ902859 |
| S. pomifera L. | SA2035 | 38.17N/26.41E | HQ902751 | - | HQ902860 |
| S. tomentosa Miller | SAVK2003 | 38.36N/26.30E | HQ902739 | HQ902808 | HQ902873 |
| S. verbenaca L. | SA1920 | 38.17N/26.14E | - | HQ902809 | - |
| S. virgata Jacq. | SAVK1986 | 38.12N/26.36E | HQ902717 | HQ902810 | HQ902871 |
| S. viridis L. | SA1940 | 38.28N/26.35E | HQ902714 | HQ902811 | HQ902861 |
| Satureja thymbra L. | SA1943 | 38.28N/26.36E | - | - | - |
| Sideritis sipylea Boiss. | SAVK2007 | 38.34N/26.30E | HQ902740 | HQ902775 | HQ902876 |
| Stachys cretica L. | SA1923 | 38.16N/26.15E | - | HQ902776 | HQ902862 |
| Teucrium divaricatum Sieber | SA1919 | 38.17N/26.16E | HQ902741 | HQ902777 | HQ902863 |
| T. montanum L. | SAVK2008 | 38.34N/26.30E | HQ902719 | HQ902785 | HQ902877 |
| T. polium L. | SAVK1965 | 38.18N/26.29E | HQ902721 | HQ902778 | HQ902867 |
| T. scordium L. | SAVK2021 | 38.26N/26.34E | HQ902715 | HQ902779 | HQ902872 |
| Thymbra spicata L. | SA1926 | 38.18N/26.19E | - | - | HQ902864 |
| Thymus sipyleus Boiss. | SAVK2009 | 38.35N/26.30E | HQ902742 | HQ902812 | HQ902878 |
| Ziziphora taurica Bieb. | SAVK2006 | 38.34N/26.30E | HQ902720 | HQ902813 | HQ902875 |
| Mentha spicata L. | SA1558 | 38.32N/26.07E | HQ902743 | HQ902814 | HQ902865 |

Appendix 2. PCR primers used for amplification of the plastid regions.

| Plastid locus | Primer name | Sequence 5’–3’ |
|---|---|---|
| matK | matk390F | CGATCTATTCATTCAATATTTC |
| | matk1326R | TCTAGCACACGAAAGTCGAAGT |
| | matk3fkim- | CGTACAGTACTTTTGTGTTTACGAG |
| | matk1rkim | ACCCAGTCCATCTGGAAATCTTGGTTC |
| rbcL | rbcLaF | ATGTCACCACAAACAGAGACTAAAGC |
| | rbcLaR | CTTCTGCTACAAATAAGAATCGATCTC |
| | rbcLaR* | GTAAAATCAAGTCCACC(AG)CG |
| trnH-psbA | psbA3’f | GTTATGCATGAACGTAATGCTC |
| | trnHf_05 | CGCGCATGGTGGATTCACAATCC |
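The rbcLaR* primer contains a degenerate position written as (AG), i.e. either A or G at that site. The sketch below expands such notation into the concrete oligonucleotides it stands for. It assumes that a parenthesised group lists the alternative bases for a single position; the function name `expand_degenerate` is hypothetical.

```python
import re
from itertools import product


def expand_degenerate(primer):
    """Expand 'ACC(AG)CG'-style notation into all concrete sequences."""
    # Split the primer into fixed stretches and parenthesised alternatives.
    parts = re.findall(r"\(([ACGT]+)\)|([ACGT]+)", primer)
    # A parenthesised group contributes one base per choice; a fixed stretch is a single choice.
    choices = [list(alt) if alt else [fixed] for alt, fixed in parts]
    return ["".join(combo) for combo in product(*choices)]


variants = expand_degenerate("GTAAAATCAAGTCCACC(AG)CG")
for seq in variants:
    print(seq, len(seq))
```

For rbcLaR* this yields two 20-nucleotide sequences that differ only at the degenerate site.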

Appendix 3. Maximum intraspecific and minimum interspecific TrN+G distances.

matK

Maximum intraspecific Minimum interspecific

Lavandula stoechas 0.001 0.061

Thymus sipyleus 0.006 0.008

Coridothymus capitatus 0.001 0.004

Acinos alpinus 0 0.001

Mentha pulegium 0 0.003

M. spicata 0.005 0

M. suaveolens 0.001 0

Salvia viridis 0.003 0.018

S. fruticosa 0.001 0.003

S. virgata 0.008 0.006

Ajuga orientalis 0.001 0.01

Teucrium scordium 0.001 0.015

T. polium 0 0.001

T. divaricatum 0.001 0.013

Sideritis sipylea 0.002 0.025

Lamium amplexicaule 0.001 0.023

Ballota acetabulosa 0 0.017

Marrubium vulgare 0.001 0.017

rbcL

Maximum intraspecific Minimum interspecific

Lavandula stoechas 0 0.02

Salvia viridis 0 0.002

Salvia fruticosa 0 0

Salvia verbenaca 0 0

Salvia virgata 0.002 0

Thymus sipyleus 0.006 0

Acinos alpinus 0 0

Mentha spicata 0 0

Mentha pulegium 0 0

Teucrium polium 0 0

Teucrium divaricatum 0 0.004

Teucrium scordium 0.002 0.004

Ajuga orientalis 0 0.006

Lamium amplexicaule 0 0.012

Ballota acetabulosa 0 0.006

Prasium majus 0 0.006

Stachys cretica 0.002 0.002

trnH-psbA

Maximum intraspecific Minimum interspecific

Ajuga orientalis 0.005 0.064

Teucrium divaricatum 0 0.078

Teucrium polium 0.061 0.076

Acinos alpinus 0 0.02

Thymbra spicata 0.003 0.014

Thymus sipyleus 0.046 0.02

Mentha spicata 0.097 0

Mentha suaveolens 0 0

Mentha pulegium 0 0.012

Micromeria juliana 0 0.003

Origanum onites 0 0.005

Coridothymus capitatus 0 0.014

Salvia virgata 0.03 0.015

Salvia viridis 0 0.033

Salvia fruticosa 0.025 0.003

Lavandula stoechas 0 0.093

Ballota acetabulosa 0 0.028

Marrubium vulgare 0.019 0.028

Lamium amplexicaule 0 0.065

Prasium majus 0 0.067

Stachys cretica 0.006 0.067

matK+rbcL

Maximum intraspecific Minimum interspecific

Lavandula stoechas 0.001 0.045

Salvia viridis 0.002 0.012

Salvia fruticosa 0.001 0.002

Salvia virgata 0.005 0.004

Acinos alpinus 0 0.001

Mentha pulegium 0 0.002

Mentha spicata 0.003 0.001

Thymus sipyleus 0.006 0.005

Ajuga orientalis 0.001 0.008

Teucrium scordium 0.002 0.013

Teucrium polium 0 0.001

Teucrium divaricatum 0.001 0.011

Ballota acetabulosa 0 0.012

Lamium amplexicaule 0.001 0.019

matK+trnH-psbA

Maximum intraspecific Minimum interspecific

Lavandula stoechas 0.001 0.077

Salvia viridis 0.002 0.023

Salvia fruticosa 0.009 0.003

Salvia virgata 0.014 0.009

Acinos alpinus 0 0.006

Coridothymus capitatus 0.001 0.006

Mentha spicata 0.031 0

Mentha pulegium 0 0.004

Mentha suaveolens 0.001 0

Thymus sipyleus 0.019 0.014

Ajuga orientalis 0.001 0.021

Teucrium polium 0.022 0.028

Teucrium divaricatum 0.007 0.051

Ballota acetabulosa 0 0.024

Marrubium vulgare 0.006 0.024

Lamium amplexicaule 0.001 0.033

rbcL+trnH-psbA

Maximum intraspecific Minimum interspecific

Lavandula stoechas 0 0.048

Salvia viridis 0 0.013

Salvia virgata 0.008 0.012

Salvia fruticosa 0.01 0.001

Acinos alpinus 0 0.008

Thymus sipyleus 0.022 0.014

Mentha spicata 0.038 0

Mentha pulegium 0 0.005

Ajuga orientalis 0 0.037

Teucrium polium 0.029 0.042

Teucrium divaricatum 0 0.067

Ballota acetabulosa 0 0.014

Lamium amplexicaule 0 0.025

Prasium majus 0.006 0.054

Stachys cretica 0.01 0.054

matK+rbcL+trnH-psbA

Maximum intraspecific Minimum interspecific

Lavandula stoechas 0.001 0.058

Salvia viridis 0.001 0.016

Salvia fruticosa 0.006 0.002

Salvia virgata 0.01 0.006

Acinos alpinus 0 0.005

Mentha spicata 0.021 0.001

Mentha pulegium 0 0.004

Thymus sipyleus 0.015 0.015

Ajuga orientalis 0.008 0.022

Teucrium polium 0.016 0.02

Teucrium divaricatum 0.001 0.034

Lamium amplexicaule 0.006 0.026

Ballota acetabulosa 0 0.025
