Molecular Ecology Resources (2009) 9 (Suppl. 1), 172–180 doi: 10.1111/j.1755-0998.2009.02642.x

BARCODING

Testing barcoding in a sister species complex of pantropical Acacia (Mimosoideae, Fabaceae)

STEVEN G. NEWMASTER and SUBRAMANYAM RAGUPATHY Floristic Diversity Research Group, Biodiversity Institute of Ontario Herbarium (OAC), University of Guelph, Guelph, Ontario, Canada N1G 2W1

Abstract

Acacia species are quite difficult to differentiate using morphological characters. Routine identification of Acacia samples is important in order to distinguish invasive species from rare species or those of economic importance, particularly in the forest industry. The genus Acacia is abundant and diverse, comprising approximately 1355 species, and is currently divided into three subgenera: subg. Acacia (c. 161 species), subg. Aculiferum (c. 235 species), and subg. Phyllodineae (c. 960 species). It would be prudent to utilize DNA barcoding in the accurate and efficient identification of Acacia. The objective of this research is to test barcoding in discriminating multiple populations among a sister-species complex in pantropical Acacia subg. Acacia, across three continents. Based on previous research, we chose three cpDNA regions (rbcL, trnH-psbA and matK). Our results show that all three regions (rbcL, matK and trnH-psbA) can distinguish and support the newly proposed genus Vachellia Wight & Arn. from Acacia Mill., discriminate sister species within either genus and differentiate biogeographical patterns among populations from India, Africa and Australia. A morphometric analysis confirmed the cryptic nature of these sister species and the limitations of a classification based on phenetic data. These results support the claim that DNA barcoding is a powerful tool for taxonomy and biogeography with utility for identifying cryptic species, biogeographic patterns and resolving classifications at the rank of genera and species.

Keywords: Acacia, biodiversity, biogeographical pattern, DNA barcoding, sister species complex, trees, Vachellia

Received 14 November 2008; revision received 16 December 2008; accepted 20 January 2009

Correspondence: Steven G. Newmaster, Integrative Biology, College of Biological Sciences, Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada N1G 2W1. Fax: (519) 767-1656; E-mail: [email protected]

Introduction

DNA barcoding may provide a tool for identifying cryptic plant species. Hebert et al. (2003) developed DNA barcoding as a method of species identification and recognition using specific regions of DNA sequence data (Ratnasingham & Hebert 2007). Barcoding in animals is well documented and can be reviewed online via the Canadian Barcode of Life (http://www.bolnet.ca) and the Consortium for the Barcode of Life (CBOL, http://www.barcoding.si.edu). Although the difficulties of plant barcoding have been debated (Chase et al. 2005; Kress et al. 2005; Cowan et al. 2006; Pennisi 2007), detailed studies (Newmaster et al. 2006, 2008; Kress & Erickson 2007, 2008; Fazekas et al. 2008; Lahaye et al. 2008) have demonstrated the utility of barcoding as an effective tool for plant identification.

One of the challenges for plant barcoding is the ability to resolve sister species within a large geographical range. It is expected that a system based on any one, or a small number, of chloroplast genes will fail in certain taxonomic groups with extremely low amounts of plastid variation while performing well in other groups. Newmaster et al. (2008) recently focused on a Neotropical group (Myristicaceae, or nutmeg family) with low molecular divergence, containing some recently evolved species that might be expected to be at the limit of resolution for several of the proposed regions. Two of the regions (matK and trnH-psbA) had significant variation and show promise for barcoding in nutmegs. This research demonstrated that a two-gene
approach utilizing a moderately variable region (matK) and a more variable region (trnH-psbA) provides resolution among all the Compsoneura species we sampled, including the recently evolved C. sprucei and C. mexicana. Our research was limited to Central and South America and, to our knowledge, there are no published comprehensive studies that test the ability of plant barcodes to discriminate multiple populations of sister species that span several continents, such as a pantropical distribution.

The genus Acacia comprises approximately 1350 species, of which there are many cryptic sister species with pantropical distributions (Maslin et al. 2003). In fact, many Acacia species are quite difficult to differentiate using morphological characters (Bentham 1842; Wardill et al. 2005). Identification is important in order to distinguish invasive species (Kriticos et al. 2003) from rare species (Byrne et al. 2001) or those of economic importance (Midgley & Turnbull 2003). Acacia species are well adapted to dry conditions (Ross 1981) and have great potential in agroforestry. They have wide-ranging utility as fuelwood, timber, fibre, medicine, food, handicrafts, domestic utensils, environmental amelioration, soil fertility, shade, game refuge, livestock fodder, ornamental planting, gum, and (Wickens et al. 1995; McDonald et al. 2001; Midgley & Turnbull 2003).

The genus Acacia is divided into three subgenera: subg. Acacia (pantropical, c. 161 species), subg. Aculiferum (pantropical, c. 235 species) and subg. Phyllodineae (pantropical, c. 960 species) (Maslin et al. 2003). However, current morphological and genetic differences separating the subgenera of Acacia, and molecular evidence that the genus Acacia is polyphyletic, necessitate transfer of many taxa to different genera (Maslin et al. 2003). This proposal is under debate among systematists. In the most likely scenario, the majority of the Australian taxa would remain as Acacia Mill., with a significant number of name changes to Senegalia (203 spp.) and Vachellia Wight & Arn. (161 spp.) in Asia, Africa, Australia and the Americas (Maslin et al. 2003). Vachellia is actually the earliest legitimate generic name for species currently ascribed to Acacia subg. Acacia, based originally on morphological characters. The majority of the molecular support for this revision has been reported from populations in Australia, with a few from South Africa (Miller & Bayer 2001; Luckow et al. 2003; Miller & Bayer 2003; Seigler et al. 2006). No populations have been included from India or North Africa.

Our objective is to test plant barcoding in discriminating taxonomic affinities among species and multiple populations among a sister species complex in pantropical Acacia subg. Acacia, across three continents. Specifically, we will test the ability of DNA barcoding to: (i) distinguish the newly proposed genus Vachellia from Acacia, (ii) discriminate sister species within the genera Acacia and Vachellia, and (iii) differentiate biogeographical patterns among populations from India, North Africa and Australia.

Methods

Selecting sister species

Two pairs of pantropical sister species of Acacia were selected that have been well documented in several published systematic studies. The sister species Acacia melanoxylon R. Br. and Acacia longifolia (Andrews) Willd. from the Mimosoideae gummiferae-spicateae are supported by a phylogenetic analysis of chloroplast sequence data (trnL-F, trnK, and matK: Luckow et al. 2003). The sister species Vachellia farnesiana (L.) Wight & Arnott (sensu lato Acacia farnesiana (L.) Willd.) and Vachellia nilotica (L.) P. Hurter & Mabb. (sensu lato Acacia nilotica (L.) Willd. ex Delile) from the Mimosoideae gummiferae-globiferae are supported by a phylogenetic analysis of chloroplast RFLP data (Bukhari et al. 1999).

Selecting barcode regions

Several DNA regions were selected for barcoding the sister species. Previous DNA barcoding analyses of molecular data (Newmaster et al. 2006, 2008; Kress & Erickson 2007, 2008; Fazekas et al. 2008; Lahaye et al. 2008) suggest that several DNA regions are suitable for barcoding plants. Based on these studies, we chose three regions (rbcL, trnH-psbA and matK) for barcoding Acacia.

Sampling

Representative specimens for all four sister species were collected from Australia, northeast Africa and India (Fig. 1). Four to five populations were collected for each sister species on each of the three continents (4–5 populations × 4 spp. × 3 continents = 56 voucher samples). The criteria for selecting a population were that it was at least 300 km from the next nearest collection site and that it was robust and healthy (not diseased). Each of the 56 collection sites required the collection of a pressed herbarium voucher and a leaf sample, which was stored in sealed plastic bags with silica gel to ensure rapid drying and minimal DNA degradation. The herbarium vouchers were used for a morphometric analysis and deposited at the Biodiversity Institute of Ontario Herbarium (OAC), University of Guelph, Ontario, Canada. Leaf samples from the respective 56 population vouchers were used for DNA barcoding.

DNA sequencing protocols

Total genomic DNA was isolated from approximately 10 mg of dried leaf material from each sample using the NucleoSpin 96 Plant II (MACHEREY-NAGEL). Extracted DNA was stored in sterile microcentrifuge tubes at –20 °C. The selected loci were amplified by polymerase chain reaction (PCR; see primers in Table 1) on a PTC-100 thermocycler (Bio-Rad).

© 2009 Blackwell Publishing Ltd

Fig. 1 Collection sites for Acacia and Vachellia in India, Africa and Australia (the names of the six localities are listed).
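The sampling criterion that each population lie at least 300 km from the next nearest collection site can be checked from site coordinates. The following is a minimal sketch, not the authors' procedure: it assumes great-circle (haversine) distance on a spherical Earth, and the site names, coordinates and function names are hypothetical placeholders introduced only for illustration.

```python
from itertools import combinations
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def sites_too_close(sites, min_km=300.0):
    """Return the pairs of site names that are closer together than min_km."""
    return [
        (name1, name2)
        for (name1, p1), (name2, p2) in combinations(sites.items(), 2)
        if haversine_km(p1, p2) < min_km
    ]


# Hypothetical coordinates (lat, lon), not the localities of this study.
example_sites = {
    "site_A": (11.0, 77.0),
    "site_B": (15.0, 82.0),
    "site_C": (11.5, 77.5),
}

for pair in sites_too_close(example_sites):
    print("Closer than 300 km:", pair)
```

An empty list from `sites_too_close` would mean that every pair of sites satisfies the spacing rule; in the placeholder data only `site_A` and `site_C` violate it.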

DNA was amplified in 20-μL reaction mixtures containing 1 U AmpliTaq Gold Polymerase with GeneAmp 10× PCR Buffer II (100 mM Tris-HCl pH 8.3, 500 mM KCl) and 2.5 mM MgCl2 (Applied Biosystems), 0.2 mM dNTPs, 0.1 mM of each primer (0.5 mM for matK), and 20 ng/μL template DNA. Amplification products were sequenced directly in both directions with the primers used for amplification, following the protocols of the University of Guelph Genomics facility (http://www.uoguelph.ca/ib/facilities/Genomics/GenomicsFacility.shtml). Sequence products from each specimen were cleaned on Sephadex columns and run on an ABI 3730 sequencer (Applied Biosystems). Bidirectional sequence reads were obtained for all PCR products. Sequencher 4.7 (Gene Codes Corp.) was used to assemble and base-call sequences, and alignment was completed manually using BioEdit version 7.0.9 (37). In order to obtain an estimate of variation in the regions examined, we calculated pairwise uncorrected p-distance for each region using MEGA 3.1 (Kumar et al. 2004). The sequences were submitted to BOLD and GenBank.

Table 1 PCR primers used for amplification of plastid DNA sequences

Plastid locus    Primer name    Sequences
matK             matK X F       TAATTTACGATCAATTCATTC
                 matK 5r        GTTCTAGCACAAGAAAGTCG
trnH-psbA        trnH-F         CGCGCATGGTGGATTCACAATCC
                 psbA-R         GTTATGCATGAACGTAATGCTC
rbcL             rbcL-aF        ATGTCACCACAAACAGAGACTAAAGC
                 ajf634R        GAAACGGTCTCTCCAACGCAT

DNA barcoding analyses

This analysis utilized the raw sequence data from each of the regions in a matrix with all 56 Acacia specimens. Bray–Curtis average linkage was used to create three distance matrices of the 56 specimens using all matK sequences. The relationship of classification structure in the species data to the molecular characters was analysed with nonmetric multidimensional scaling (NMS; Kruskal 1964; Primer 2002). In NMS, the Bray–Curtis distance measure was used because of its robustness for both large and small scales on the axes (Minchin 1987). Data were standardized by species maxima, and two-dimensional solutions were chosen based on plotting a measure of fit ('stress') against the number of dimensions. Stress represents distortion in the data, and a stress value over 0.15 is high enough that the results are invalidated (Primer 2002). One thousand iterations were used for each NMS run, using random start coordinates. The first two ordination axes were rotated to enhance interpretability with the different axes. As an independent check, detrended correspondence analysis (DCA; ter Braak 1998)
was used to evaluate the NMS classification. In order to test whether accurate species assignments can be made among the samples in our data set, we used the 'best match' and 'best close match' functions of the program TaxonDNA (Meier et al. 2006). This program determines the closest match of a sequence from comparisons to all other sequences in an aligned data set. It establishes a similarity threshold based on the frequency distribution of the intraspecific pairwise distances. The threshold is set at a value below which 95% of all intraspecific pairwise distances are found (Meier et al. 2006). Unlike the ordinations we calculated, TaxonDNA ignores indels when calculating distance. These sequence identification methods were performed on rbcL, matK, trnH-psbA, and all possible two-region combinations using uncorrected pairwise distances and a minimum sequence overlap of 300 bp. The inclusion of conspecific individuals is a key component of this type of analysis, as the query sequence is removed from the data set prior to determining its best or closest match.

Fig. 2 Nonmetric multidimensional scaling ordination of 56 individuals of Acacia species using matK sequence data (33 variable sites). Grey circles represent intraspecific species variation (shading indicates geographic location: white, Australia; grey, Africa; black, India).

Morphometric classification analysis

Twenty-six morphological variables were recorded from the 56 Acacia specimens used in the multivariate phenetic analysis. A matrix of 56 specimens and 26 morphological characters was used in a multivariate analysis using Canoco 4 (ter Braak 1998). Canonical ordination was used to detect groups of specimens and to estimate the contribution of each variable to the ordination. A principal component analysis (PCA; ter Braak 1998) was used to identify the length of the ordination axis. Unimodal, indirect ordination detrended correspondence analysis (DCA) was used to explore variation in species scores. A cluster analysis was used to classify the specimens, as it is better at representing distances among similar specimens, whereas DCA is better at representing distances among groups of specimens (Sneath & Sokal 1973). Cluster analysis was performed with NTSYS (Rohlf 2000). A distance matrix of average taxonomic distance was generated from standardized data and clustered with the unweighted pair-group method with arithmetic averages (UPGMA). A discriminant function analysis (DFA; SPSS 1999) was used to rigorously test the classification of specimens provided in the cluster analysis. The object of DFA is to predict multivariate responses that best discriminate subjects among different groups (Ramsey & Schafer 1997). A total of 26 morphological characters for each of the 56 specimens, each coded as belonging to one of the a priori groups, were used as input for a DFA, which: (i) determined whether the classification was accurate, (ii) provided discriminant functions for the classification of the taxa, and (iii) indicated whether there are important morphological characters for each of the canonical discriminant functions.

Results

Barcoding supports the generic split in Acacia

A single-region DNA barcode using rbcL, matK or trnH-psbA can distinguish Vachellia species from Acacia species. The ordination using matK sequence data clearly defines well-separated groups of taxa for each of the respective genera (Fig. 2). Variation in rbcL could also be
used to differentiate Vachellia species from Acacia species. These results are supported by our phenetic analysis (defined below), which resulted in a morphometric ordination that clearly displays the generic split in Acacia; Vachellia species are grouped on the top and Acacia species on the bottom of the ordination (Fig. 3).

Fig. 3 Morphometric ordination of 56 individuals of Acacia species using 26 morphological variables. Dotted circles represent intraspecific species variation (shading indicates geographic location: white, Australia; grey, Africa; black, India).

Barcoding discriminates sister species of Acacia

DNA barcoding identified the four species in question and the identity of 10 specimens that were misidentified, as later verified by taxonomic experts. All three regions (rbcL, trnH-psbA and matK) had considerable interspecific divergence (Table 2). A considerable portion (67%) of the species can be distinguished using rbcL or trnH-psbA alone (Table 2). trnH-psbA was highly variable because of the presence of indels and homopolymer runs, which have the potential to overweight distance measures. Some of the indels are associated with the homopolymer runs and it is improbable that all indels of a given length in these regions are homologous. matK was quite variable, easy to align and could distinguish all of the Acacia species. We chose to use matK for the multivariate classification analysis. A single-region DNA barcode (matK) discriminates the two sets of sister species among 56 of the specimens: Acacia melanoxylon, A. longifolia, Vachellia farnesiana and V. nilotica. The NMS sequence classification resulted in an ordination that discriminates 56 specimens into the two sets of sister species (Fig. 2). Interspecific variation (0.0023–0.0291) is greater than intraspecific variation (0–0.0026), which includes haplotypes from 4–5 populations per species distributed across three continents. This is consistent with other results that record nucleotide diversity for Acacia (Byrne et al. 2001). The slight overlap in inter/intraspecific variation found in our study is supported in other plant barcoding studies (Fazekas et al. 2008; Kress & Erickson 2008; Newmaster et al. 2008).

Table 2 Summary statistics for coding and noncoding DNA (rbcL, matK and trnH-psbA) of acacias examined from 56 populations

Parameters                       rbcL       matK       trnH-psbA
Resolution*                      67%        100%       67%
Size (in bp)                     670–720    750–808    630–700
No. of variable sites (S)        25         33         63
Mean interspecific p-distance    0.014      0.015      0.201

*Resolution, number of species with haplotypes not found in any other species.

A DNA barcode (using only matK) was used to make accurate species assignments and identify samples that were misidentified by the field taxonomists. We used the 'best match' and 'best close match' functions of the program TaxonDNA (Meier et al. 2006) to differentiate all 56 samples into six distinct groups of barcodes. In our study, no two individuals of different species share identical sequences and the percentage of correct identifications of all pairwise comparisons was 100% for six distinct groups of taxa. This identified a problem because only 83% of the samples matched the sister-species taxa that were targeted in our study: A. melanoxylon, A. longifolia, V. farnesiana and V. nilotica. DNA barcodes identified 10 specimens from Africa that were misidentified; specimens from Africa thought to be A. melanoxylon and A. longifolia were not grouped with those respective taxa on the ordination (Fig. 2). These specimens were re-examined by taxonomists and sent to Acacia experts for determination. These are difficult species to identify morphologically, but all the taxonomists confirmed that these misidentified specimens can be grouped into two species: Acacia saligna (Labill.) H.L. Wendl. and Acacia arabica (Lam.) Willd. In the ordination (Fig. 2), A. arabica is grouped on the left side with all other species in the genus Vachellia. We are conducting further research, which suggests these taxa are closely allied, and a formal name change will be made separately (i.e. new combination Vachellia arabica (sensu lato Acacia arabica (Lam.) Willd.)) following a more detailed examination of more populations.

Our morphometric analysis did not clearly discriminate all the Acacia samples. A discriminant function analysis (DFA) used 26 quantitative characters to classify heterogeneity in 56 specimens into what is currently considered six known taxa of Acacia and Vachellia (A. melanoxylon, A. longifolia, V. farnesiana, V. nilotica, A. saligna and A. arabica). The canonical correlation from the discriminant functions is the ratio of the between-groups sums of squares to the total sums of squares. Thus, the first discriminant function is responsible for 46% of the between-group differences (variability in the discriminant scores). The second function is responsible for an additional 11% of the between-group variance and the third function is responsible for an
additional 11% of the variance. Wilks' lambda was used to test the hypothesis that there are no differences in variance (P < 0.001) between the groups of taxa which represent different species (SPSS 1999). There were significant differences (P < 0.001) for the first two canonical functions. Seventy-two per cent of the samples were correctly classified into six groups (representing the six species). The ordination analyses utilized DCA in the separation of six species from the 60 specimens that were analysed and provided a measure of the important morphological variables in the classification. Principal components analysis (PCA) provided a character gradient that was unimodal [3.2 standard deviations (SD)], violating the assumption of a linear model (ter Braak 1998). Consequently, a DCA was used to classify the 56 specimens into indistinct groups representing the six Acacia species (Fig. 3). The eigenvalues for the x-axis (0.713) and the y-axis (0.518) indicated that the gradient axes were of considerable length and justified the use of DCA.

Barcoding differentiates biogeographical patterns in Acacia

A single-region barcode (matK) discriminates all 56 specimens into their respective continental haplotypes. In the NMS ordination, all 56 samples are correctly classified by species and by the continent where they were collected: India, North Africa or Australia (Fig. 2). We used the 'best match' and 'best close match' functions of the program TaxonDNA (Meier et al. 2006) to differentiate all 56 samples into six species and their respective haplotypes among three continents.

Discussion

Barcoding and phenetic studies support the generic split in Acacia

Our research shows that a single-region DNA barcode can distinguish Vachellia species from Acacia species. This was supported by our morphometric analyses, which resulted in an ordination with separation between Vachellia species and Acacia species. These results are supported by phylogenetic studies in which Vachellia species are placed in a separate clade (100% bootstrap support); all species other than those of Vachellia are placed in a different clade (66% bootstrap support), indicating that Vachellia is relatively distantly related to other members of Acacia s.l. (Luckow et al. 2003; Miller et al. 2003; Seigler et al. 2006). Early in the taxonomic Acacia literature, Bentham (1840) and later Wight & Arnott (1834) recognized Vachellia (Acacieae, Acacia subg. Acacia) as a distinct genus from the 'true' Acacia (Acacieae, Acacia, subg. Phyllodineae). This distinction was based on several morphological characters including differences in the pods, phyllodes, involucre on , pollen, seeds and endosperm. Bentham (1864, 1875) later redefined Acacia into six series based on foliage characters and the nature of the spines, with inflorescence characters being of less importance. These six series are accepted by most taxonomists as the primary divisions of the genus (Ross 1973). Our barcoding results support the earlier classification that recognizes Vachellia, which is the earliest legitimate generic name for species currently ascribed to Acacia subg. Acacia. This supports a growing body of morphological and genetic differences separating the subgenera of Acacia s.l. and molecular evidence that the genus Acacia s.l. is polyphyletic, which requires several new generic combinations (Miller & Bayer 2001; Maslin et al. 2003; Miller et al. 2003; Seigler et al. 2006).

Barcoding discriminates sister species of Acacia

We used DNA barcoding to identify the four sister species and the identity of 10 voucher specimens that were misidentified, as later verified by taxonomic experts. We chose to barcode cryptic sister species that are difficult for taxonomists to differentiate using morphological characters. The defining characters of many acacias are found in the small flowers that appear during short periods of the year. Vegetative characters are more variable and less reliable for identification. In our study, we found that the misidentified specimens were those that only had vegetative characters, underscoring the difficulty of identifying these species. Given the important economic value of acacias, it would be very useful to have a reliable identification tool that can differentiate Acacia species using only the , which are easily accessible.

Other studies have utilized diagnostic sequences to classify previously undetermined specimens due to lack of available morphological characters and as a classification tool where specimens have proven difficult to classify (Wardill et al. 2005). For example, ITS1 and trnL DNA fragments have been used to identify seven of the described subspecies of A. nilotica (Brenan 1983; Fagg & Greaves 1990). This has been particularly useful for identifying cryptic specimens previously undetermined by herbarium taxonomists. Wardill et al. (2005) created an ITS1 genotype library that was used as an identification tool to be matched exactly to genotypes of herbarium specimens identified by taxonomists and provide a subspecies classification. Although this ITS1 genotype library is a useful tool for acacias, it would be desirable to have a standard region to build a library for all plants. We propose that DNA barcoding will be a universal tool used by taxonomists to identify cryptic species and therefore expedite the identification process.

We suggest that DNA barcoding may reveal biogeographical patterns due to intraspecific variation, particularly at large spatial scales. Our study revealed that intraspecific variation identifies haplotypes that are associated with a particular continent. Wardill et al. (2005) used DNA fragments to identify three distinct genotypes of Vachellia nilotica representing three distinct geographical regions: Pakistan, northern
Africa and southern Africa. They suggested that Australian V. nilotica populations may have originated from India or Pakistan and recommended further analysis of Indian samples (not included in their study) to determine the genetic diversity profile and origins of Australian populations (Wardill et al. 2005). Our study did include Indian populations, which supports their claim that the origin of diversity may be within India or Pakistan. We are completing a more comprehensive barcoding campaign (including the Americas and Asia) of Acacia in order to resolve this and other hypotheses concerning the biogeography of Acacia.

At least in Acacia, a two-region barcode may be sufficient for identifying plants. Earlier research identified that a multiregion approach to barcoding plants will be required (Chase et al. 2005; Kress et al. 2005; Cowan et al. 2006; Newmaster et al. 2006). Kress & Erickson (2007) have proposed a two-locus barcode based on rbcL and the trnH-psbA intergenic spacer. Newmaster et al. (2008), in a study that investigated the utility of seven regions, demonstrated that a two-gene approach utilizing a moderately variable region (matK) and a more variable region (trnH-psbA) provides resolution for barcoding in nutmegs. Comprehensive studies of many temperate land plants revealed that combining more variable plastid markers provided clear benefits for resolving species (Fazekas et al. 2008; Ford et al. 2009). However, they found that all combinations assessed using four to seven regions had only marginally better success rates than some two- or three-region combinations. In our study of Acacia, we found that the rbcL region could distinguish subgenera groups (which we propose should be renamed as genera) and many (67%) of the species. The rbcL region could play an important role in barcoding. Previously (Chase et al. 2005; Newmaster et al. 2006), rbcL was evaluated as a possible region because of its universality, ease of amplification, ease of alignment, and because there is a significant body of data available for evaluation. It has also been shown to differentiate a large percentage of congeneric plant species (Newmaster et al. 2006). These studies discussed that some barcode applications will require a minimal complement of primers (e.g. ecology or applied projects) to identify cryptic plant material such as or leaves. When presented with a completely unknown sample, it will be highly desirable to run it with the smallest possible number of primer sets in order to place the unknown sample in a genus. Newmaster et al. (2006) presented the 'tiered approach', which is based on the use of a common, easily amplified and aligned region (such as rbcL) that can act as a scaffold on which to place data from a more variable region such as matK or a noncoding region. The utility of matK in barcoding has been explored in several studies (Fazekas et al. 2008; Kress & Erickson 2008; Lahaye et al. 2008; Newmaster et al. 2008). Although matK provides considerable interspecific variation, it also requires at least two primer pairs, and sometimes up to 10 primers when considering a diverse group of taxa, as in the study by Fazekas et al. (2008). A tiered approach allows an unknown sample to be placed in a genus, where a successful pair of primers can be targeted. The problems with noncoding regions, such as trnH-psbA, have been discussed previously (Kress et al. 2005; Cowan et al. 2006); alignment between species of different genera is the largest problem. In a tiered approach, an unknown sample can be aligned among a smaller group of taxa (i.e. within a genus). It has been shown that other noncoding regions such as atpF–atpH and psbK–psbI provide no additional species resolution (Fazekas et al. 2008). In our study, matK alone could be used to distinguish all species. Other researchers have also successfully utilized DNA barcoding in plant studies, and the time has come for a unified approach (Kress & Erickson 2008; Lahaye et al. 2008; Newmaster et al. 2008). The Plant Working Group of the Consortium for the Barcode of Life (PWG-CBOL) is collaborating with an international body of researchers in proposing a 'universal DNA barcode for flowering plants', which will engage DNA barcoding as a tool for biologists.

Conclusion

We are at the threshold of a taxonomy renaissance inspired by DNA barcoding (Hollingsworth 2007; Miller 2007). Already, we have seen many studies in animals that have revealed cryptic diversity, corrected classifications and aided in the discovery of new species (Hebert et al. 2004a, b; Webb et al. 2006; Costa & Carvalho 2007; Yassin et al. 2008). In plants, there are several reports, including this study, of the utility of barcoding for identifying cryptic species, new species, ethnotaxa and biogeographic patterns, and for resolving classifications at the rank of genera and species (Newmaster et al. 2008, 2009; Kress & Erickson 2008; Lahaye et al. 2008; Ragupathy et al. 2009). The next move is toward Automated Identification Technology (AIT), a state-of-the-art system that will revolutionize biology and have considerable impacts on society (Newmaster et al. 2008). Molecular and bioinformatics technology have both advanced to the point where AIT is available as a tool for biologists and eventually everyone in society who has a need for biodiversity information. Scientists (Tautz et al. 2002; Pennisi 2007) have discussed the application of computer identifications and recent innovations in animal and plant barcoding. Research has now shifted from an exploratory phase to a high-throughput phase. Recent research indicates that the efficacy of an AIT system for plants equates with savings in time and funding, allowing us to save resources for alpha taxonomy (Newmaster et al. 2008). We expect that field biologists in research studies or as environmental consultants in business will soon be using barcoding as a tool for quick surveys. Border control stations could search for invasive species within materials such as products entering the country. Industry could implement protocols to identify
contaminants in food or health products. Farmers and gardeners could quickly identify a weed and learn how to control it. Naturalists could explore wetlands and children could explore their school yard. All of the biodiversity data from these applications could be fed into a data repository that would help us to understand, appreciate and conserve our natural world (Erickson et al. 2008; Newmaster et al. 2008; Pupulin et al. 2008).

Acknowledgements

This research was supported by Genome Canada through the Ontario Genomics Institute and the Canadian Foundation for Innovation. We thank Trevor Wilson, Dr B. Jackes, E. Marry, Professor Ramadan Fauda, Dr M. Murugesan, Dr V. Balu and Professor Jana Janakiraman for helping to identify and collect the Acacia samples and voucher specimens for this study. We would like to thank Dr Bruce Maslin (Western Australian Herbarium, Australia) and Dr K. Thothathiri (Madras Herbarium, India) for their help in verification of the species. Dr K.T. is also acknowledged for reviewing an earlier version of the manuscript. We thank Drs A. Fazekas and P. Kesanakurti (Canadian Plant Barcoding Group) for reviewing the sequence results. Finally, we would like to thank members of the Floristic Diversity Research Group, Biodiversity Institute of Ontario, for their support and assistance, particularly Royce Steeves, Carole Ann Lacroix, Neil Webster and the CBS sequencing facility.

Conflict of interest statement

The authors have no conflict of interest to declare and note that the funders of this research had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

Bentham G (1840) Contributions towards a flora of South America: enumeration of plants collected by Mr. Schomburgk in British Guiana. Journal of Botany (Hooker), 2, 127–146.
Bentham G (1842) Notes on Mimoseae, with a short synopsis of species. Journal of Botany (Hooker), 4, 323–418.
Bentham G (1864) Flora Australiensis, vol. 2. London.
Bentham G (1875) Revision of the suborder Mimoseae. Transactions of the Linnean Society of London, 30, 335–664.
ter Braak CJF (1998) Canoco Version 4.0. Centre for Biometry CPRO-OLO, Wageningen, The Netherlands.
Brenan JPM (1983) Manual on Taxonomy of Acacia Species: Present Taxonomy of Four Species of Acacia (A. Albida, A. Senegal, A. Nilotica and A. Tortilis), pp. 20–35. Food and Agriculture Organization of the United Nations, Rome, Italy.
Bukhari YM, Koivu K, Tigerstedt PMA (1999) Phylogenetic analysis of Acacia (Mimosaceae) as revealed from chloroplast RFLP data. Theoretical and Applied Genetics, 98, 291–298.
Byrne M, Tischler G, Macdonald B, Coates DJ, McComb J (2001) Phylogenetic relationships between two rare acacias and their common, widespread relatives in south-western Australia. Conservation Genetics, 2, 157–166.
Chase MW, Salamin N, Wilkinson M et al. (2005) Land plants and DNA barcodes: short-term and long-term goals. Philosophical Transactions of the Royal Society B: Biological Sciences, 360, 1889–1895.
Costa FO, Carvalho GR (2007) The Barcode of Life Initiative: synopsis and prospective societal impacts of DNA barcoding of fish. Genomics, Society and Policy, 3, 29–40.
Cowan RS, Chase MW, Kress WJ, Savolainen V (2006) 300 000 species to identify: problems, progress, and prospects in DNA barcoding of land plants. Taxon, 55, 611–616.
Erickson DL, Spouge J, Resch A, Weigt LE, Kress WJ (2008) DNA barcoding in land plants: developing standards to quantify and maximize success. Taxon, 57, 1304–1316.
Fagg CW, Greaves A (1990) Acacia Nilotica 1869–1988, Annotated Bibliography No. F42 (ed. Langdon K), CAB International, published in collaboration with the Oxford Forestry Institute, Wallingford, UK.
Fazekas AJ, Burgess KS, Kesanakurti PR et al. (2008) Multiple multilocus DNA barcodes from the plastid genome discriminate plant species equally well. Public Library of Science, ONE, 3(7), e2802. doi: 10.1371/journal.pone.0002802.
Ford CS, Ayres KL, Toomey N, Haider N, Van Alphen Stahl J (2009) Selection of candidate coding DNA barcoding regions for use on land plants. Botanical Journal of the Linnean Society, 159, 1–11.
Hebert PDN, Cywinska A, Ball SL, DeWaard JR (2003) Biological identification through DNA barcodes. Proceedings of the Royal Society B: Biological Sciences, 270, 313–321.
Hebert PD, Penton EH, Burns JM, Janzen DH, Hallwachs W (2004a) Ten species in one: DNA barcoding reveals cryptic species in the Neotropical skipper butterfly Astraptes fulgerator. Proceedings of the National Academy of Sciences, USA, 101(41), 14812–14817.
Hebert PDN, Stoeckle MY, Zemlak TS, Francis CM (2004b) Identification of birds through DNA barcodes. Public Library of Science, Biology, 2(10), e312 (www.plosbiology.org).
Hollingsworth PM (2007) DNA barcoding: potential users. Genomics, Society and Policy, 3, 44–47.
Kress WJ, Erickson DL (2007) A two-locus global DNA barcode for land plants: the coding rbcL gene complements the non-coding trnH-psbA spacer region. Public Library of Science ONE, 2, e508. doi: 10.1371/journal.pone.0000508.
Kress J, Erickson DL (2008) DNA barcodes: genes, genomics, and bioinformatics. Proceedings of the National Academy of Sciences, USA, 105, 2761–2762.
Kress WJ, Wurdack KJ, Zimmer EA, Weigt LA, Janzen DH (2005) Use of DNA barcodes to identify flowering plants. Proceedings of the National Academy of Sciences, USA, 102, 8369–8374.
Kriticos DJ, Sutherst RW, Brown JR, Adkins SW, Maywald GF (2003) Climate change and the potential distribution of an invasive alien plant: Acacia nilotica ssp. indica in Australia. Journal of Applied Ecology, 40, 111–124.
Kruskal JB (1964) Non-metric multidimensional scaling: a numerical method. Psychometrika, 29, 115–129.
Kumar S, Tamura K, Nei M (2004) Mega 3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Briefings in Bioinformatics, 5, 150–163.
Lahaye R, van der Bank M, Bogarin D et al. (2008) DNA barcoding the floras of biodiversity hotspots. Proceedings of the National Academy of Sciences USA, 105, 2923–2928. doi: 10.1073/pnas.0709936105.
Luckow M, Miller JT, Murphy DJ, Livshultz T (2003) A phylogenetic analysis of the Mimosoideae (Leguminosae) based on chloroplast DNA sequence data. In: Advances in Legume Systematics, Part 10, Higher Level Systematics (eds Klitgaard BB, Bruneau A), pp. 197–220. Royal Botanic Gardens, Kew, UK.


Maslin BR, Miller JT, Seigler DS (2003) Overview of the generic status of Acacia (Leguminosae: Mimosoideae). Australian Systematic Botany, 16, 1–18.
McDonald MW, Maslin BR, Butcher PA (2001) Utilisation of acacias. In: , Volume 11A, Mimosaceae, Acacia Part 1 (eds Orchard AE, Wilson AJG), pp. 30–40. ABRS/CSIRO Publishing, Melbourne, Australia.
Meier R, Shiyang K, Vaidya G, Ng PKC (2006) DNA barcoding and taxonomy in Diptera: a tale of high intraspecific variability and low identification success. Systematic Biology, 55, 715–728.
Midgley SJ, Turnbull JW (2003) Domestication and use of Australian Acacias: an overview. Australian Systematic Botany, 16(1), 89–102.
Miller SE (2007) DNA barcoding and the renaissance of taxonomy. Proceedings of the National Academy of Sciences, USA, 104, 4775–4776.
Miller JT, Bayer RJ (2001) Molecular phylogenetics of Acacia (Fabaceae: Mimosoideae) based on the chloroplast matK coding sequence and flanking trnK intron spacer region. American Journal of Botany, 88(4), 697–705.
Miller JT, Bayer RJ (2003) Molecular Phylogenetics of Acacia subgenera Acacia and Aculeiferum (Fabaceae: Mimosoideae) based on the chloroplast matK coding sequence and flanking trnK intron spacer regions. Australian Systematic Botany, 16, 27–33.
Minchin P (1987) An evaluation of the relative robustness of techniques for ecological ordination. Vegetatio, 69, 89–107.
Newmaster SG, Fazekas AJ, Ragupathy S (2006) DNA barcoding in the land plants: evaluation of rbcL in a multigene tiered approach. Canadian Journal of Botany, 84, 335–341.
Newmaster SG, Fazekas A, Steeves R, Janovec J (2008) Testing candidate plant barcode regions in the Myristicaceae. Molecular Ecology Resources, 8, 480–490.
Newmaster SG, Ragupathy S, Janovec J (2009) Automated identification technology (AIT) systems and database for identification of plants. International Journal of Computer Applications in Technology (Invited article-in press).
Pennisi E (2007) Wanted: a barcode for plants. Science, 318, 190–191.
Primer Software (2002) Primer Multivariate Software Version 529. PRIMER-E Ltd, Hedingham Gardens, Roborough Plymouth, UK.
Pupulin F, Gigot G, Maurin O, Duthoit S, Barraclough TG, Savolainen V (2008) DNA barcoding the floras of biodiversity hotspots. Proceedings of the National Academy of Sciences, USA, 105, 2923–2928.
Ragupathy S, Newmaster SG, Murugesan M, Balasubramaniam V (2009) DNA barcoding discriminates a new cryptic grass species revealed in an ethnobotany study by the hill tribes of the Western Ghats in southern India. Molecular Ecology Resources, 9 (Suppl. 1), 164–171.
Ramsey FL, Schafer DW (1997) The Statistical Sleuth: A Course in Methods of Data Analysis. Duxbury Press, Belmont, California.
Ratnasingham S, Hebert PDN (2007) BOLD: the Barcode of Life Data System (www.barcodinglife.org). Molecular Ecology Notes, 7(3), 355–364.
Rohlf F (2000) NTSYS: Numerical Taxonomy and Multivariate Analysis System, Version 2.1. Exeter Software, New York.
Ross JH (1973) Towards a classification of African acacias. Bothalia, 11, 107–113.
Ross JH (1981) An analysis of the African Acacia species: their distribution, possible origins and relationships. Bothalia, 13, 389–413.
Seigler DS, Ebinger JE, Miller JT (2006) The genus Senegalia (Fabaceae: Mimosoideae) from the New World. Phytologia, 88(1), 38–93.
Sneath P, Sokal R (1973) Numerical Taxonomy. W.H. Freeman, San Francisco, California.
SPSS (1999) Professional Base System Software for Statistical Analysis. SPSS Inc., Chicago.
Tautz D, Arctander P, Minelli A, Thomas RH, Vogler AP (2002) DNA points the way ahead in taxonomy. Nature, 418, 479.
Wardill TJ, Graham GC, Zalucki M et al. (2005) The importance of species identity in the biocontrol process: identifying the subspecies of Acacia nilotica (Leguminosae: Mimosoideae) by genetic distance and the implications for biological control. Journal of Biogeography, 32, 2145–2159.
Webb KE, Barnes DKA, Clark MS, Bowden DA (2006) DNA barcoding: a molecular tool to identify Antarctic marine larvae. Deep-Sea Research, II(53), 1053–1060.
Wickens GE, Seif-El-Din AG, Sita G, Nahal I (1995) Role of Acacia species in the rural economy of dry Africa and the Near East. FAO Conservation Guide No. 27. Food and Agriculture Organization, Rome, Italy.
Wight R, Arnott GAW (1834) Prodromus Florae Peninsulae Indiae Orientalis. London.
Yassin A, Capy P, Madi-Ravazzi L, Ogereau D, David JR (2008) DNA barcode discovers two cryptic species and two geographical radiations in the invasive drosophilid Zaprionus indianus. Molecular Ecology Resources, 8, 491–501.
