Molecular Identification of Species in Juglandaceae: a Tiered Method
Journal of Systematics and Evolution 49 (3): 252–260 (2011)
doi: 10.1111/j.1759-6831.2011.00116.x

Research Article

1,2Xiao-Guo XIANG, 1,2Jing-Bo ZHANG, 1An-Ming LU, 1Rui-Qi LI*
1 (State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China)
2 (Graduate University of Chinese Academy of Sciences, Beijing 100049, China)

Abstract
DNA barcoding is a method of species identification and recognition using DNA sequence data. A tiered or multilocus method has been recommended for barcoding plant species. In this study, we sampled 196 individuals representing 9 genera and 54 species of Juglandaceae to investigate the utility of the four potential barcoding loci (rbcL, matK, trnH-psbA, and internal transcribed spacer (ITS)). Our results show that all four DNA regions are easy to amplify and sequence. In the four tested DNA regions, ITS has the most variable information, and rbcL has the least. At generic level, seven of nine genera can be efficiently identified by matK. At species level, ITS has higher interspecific p-distance than the trnH-psbA region. Difficult to align in the whole family, ITS showed heterogeneous variability among different genera. Except for the monotypic genera (Cyclocarya, Annamocarya, Platycarya), ITS appeared to have limited power for species identification within the Carya and Engelhardia complex, and have no power for Juglans or Pterocarya. Overall, our results confirmed that a multilocus tiered method for plant barcoding was applicable and practicable. With higher priority, matK is proposed as the first-tier DNA region for genus discrimination, and the second locus at species level should have enough stable variable characters.

Key words
DNA barcoding, ITS, Juglandaceae, matK, rbcL, trnH-psbA.
Received: 9 November 2010 Accepted: 14 November 2010
* Author for correspondence. E-mail: [email protected]; Tel.: 86-10-62836447; Fax: 86-10-62590843.
© 2011 Institute of Botany, Chinese Academy of Sciences

DNA barcoding is a system to aid species recognition and identification through the characteristics of a standard gene region across all organisms (Hebert et al., 2003). For animals, the mitochondrial gene cox1 as a barcode shows great promise in identifying cryptic species, accelerating biodiversity inventories and identifying species from degraded material (Smith et al., 2005; Hajibabaei et al., 2006; Rubinoff, 2006). However, mitochondrial DNAs evolve too slowly to provide enough informative sites to discriminate species in plants. Identification of a locus or a combination of loci that can serve as an effective DNA barcode for plant species is challenging.

Most of the previous plant barcoding studies were carried out on a large scale, attempting to find universal and consistent markers for all angiosperms or land plants. For example, Kress et al. (2005) compared nine DNA markers and suggested that internal transcribed spacer (ITS) and trnH-psbA were potentially usable DNA regions for barcoding flowering plants. Newmaster et al. (2006) analyzed more than 10 000 rbcL sequences and suggested that rbcL can be used at generic level, and an implementation of a second locus would be more applicable for land plants. Kress & Erickson (2007) suggested that a combination of the non-coding trnH-psbA spacer region and the coding rbcL gene was a two-locus global barcode for land plants. Chase et al. (2007) outlined two combinations of three regions (rpoC1, rpoB, and matK; rpoC1, matK, and trnH-psbA) as candidate markers for land plant barcoding. Lahaye et al. (2008a, 2008b) suggested matK could be used as a universal and variable DNA barcode for flowering plants. Recently, CBOL Plant Working Group (2009) formally suggested the combination of rbcL and matK as the core barcode for land plants. Yao et al. (2010) and Chen et al. (2010) proposed ITS2 as a powerful DNA barcoding marker for plants and animals. However, some authors used one or several candidate markers to test their utility through dense sampling within a single family or genus. For instance, Sass et al. (2007) used seven cpDNA regions and ITS to discriminate cycads (Cycadaceae), showing that ITS had the most variability. Newmaster et al. (2008) suggested that matK and trnH-psbA provided resolution among all the Compsoneura species (Myristicaceae). Ren et al. (2010) combined trnH-psbA and ITS to discriminate 88.5% species of Alnus (Betulaceae).

Recently, a tiered or multilocus method has been proposed for plant DNA barcoding (Newmaster et al., 2006). That is, one region is selected for discriminating genera or families, and the other(s) for identifying species. However, there are few empirical studies that have tested this method in a specific taxonomic group.

The family Juglandaceae consists of nine genera and ca. 60 species, and is distributed mainly in the Northern Hemisphere (Manchester, 1987, 1999). Annamocarya, Cyclocarya, Engelhardia, Platycarya, and Pterocarya are restricted to the Old World. Alfaroa and Oreomunnea are restricted to the New World. Carya and Juglans occur in both the Old World and the New World (Lu, 1982). There are seven genera and ca. 27 species distributed in China. Taxonomy of the genera was largely based on flower and fruit characters (Manning, 1978; Manchester, 1987). Although the genera are easy to identify by morphological characters, species identification in certain genera is difficult using morphological characters alone. For instance, species of Carya distributed in eastern North America are almost indistinguishable because of their highly similar morphological characters. Juglandaceae is confirmed to be monophyletic based on molecular analyses (Manos & Steele, 1997; Li et al., 2004). Recent phylogenetic studies based on morphology and DNA sequences (ITS, trnL-F, atpB-rbcL) have partially resolved relationships among genera and some groups within genera of Juglandaceae (Manos & Stone, 2001; Manos et al., 2007).

In this study, we use four DNA regions (rbcL, matK, trnH-psbA, and ITS) that were frequently recommended as potential DNA barcodes in previous research to differentiate species of Juglandaceae. Our objectives are: (i) to test the universality of the four DNA regions in Juglandaceae; (ii) to evaluate the potential of barcodes to broadly identify genera and species across the family; and (iii) to provide some suggestions on future DNA barcoding regions for plants.

1 Material and methods

1.1 Taxon sampling
Multiple samples of each species recognized in Flora of China (Lu & Stone, 1999) were included in this study, covering both the morphological and geographical range of each taxon. Most of the species distributed in the Americas and Europe were collected. In total, we sampled 196 individuals representing nine genera and 54 species. A total of 88 sequences of rbcL, matK, trnH-psbA, and ITS were downloaded from GenBank. Detailed information is given in Table S1.

1.2 DNA extraction, amplification, and sequencing
Total DNAs were isolated from silica gel-dried leaves following the protocol of Doyle & Doyle (1987). The amplification of DNA regions was carried out using standard polymerase chain reaction (PCR). Primer sequences for amplification and sequencing are presented in Table S2. The PCR products were sequenced on an ABI 3730 DNA sequencer (Applied Biosystems, Foster City, CA, USA). The primers for ITS are from Baldwin (1992); primers for trnH-psbA are from http://www.kew.org/barcoding/protocols.html; primers for matK are from Cuénoud et al. (2002); and primers for rbcL are from Kress & Erickson (2007).

1.3 Data analysis
The sequences were first aligned using ClustalX (Thompson et al., 1997) and then manually adjusted in BioEdit version 7 (Hall, 1999).

In order to obtain an estimate of variation in the four regions examined, pairwise Kimura 2-parameter (K2P) distances were calculated in MEGA 4.0 (Kumar et al., 2008). Indels were coded with the simple indel coding method of Simmons & Ochoterena (2000). Neighbor-joining trees of ITS, matK, rbcL, and trnH-psbA were generated using MEGA version 4 with K2P (Kumar et al., 2008) with 1000 replicates to determine clade support and tree resolution.

To ensure accurate species assignments in our datasets, we used the "Best Match", "Best Close Match", and "All Species Barcodes" methods of the program TaxonDNA (Meier et al., 2006). This program determines the closest match of a sequence from comparisons to all other sequences in an aligned dataset. Compared with "Best Match", the "Best Close Match" strategy requires a threshold similarity value that defines how similar a sequence match needs to be before it can be identified. This value can be estimated for a given dataset by obtaining a frequency distribution of all intraspecific pairwise distances and determining the threshold distance below which 95% of all intraspecific distances are found. These sequence identification methods were carried out on the rbcL, matK, trnH-psbA, and ITS datasets.

2 Results

2.1 Characteristics of the four DNA regions
Of the 196 individuals included in this study, PCR amplification was successful for all loci, except ITS for Engelhardia hainanensis Chen. Success rates for bidirectional sequencing were highest for rbcL (100%) and trnH-psbA (100%), followed by ITS (94%) and
Table 1 Evaluation of four DNA markers

                                          rbcL        matK        trnH-psbA   ITS         ITS2
Universal ability to primer               Yes         Yes         Yes         Yes         —
Percentage PCR success                    100         90          100         94          —
Percentage sequencing success             100         95          100         100         —
Aligned sequence length (bp)              742         792         299         749         240
Indels length (bp)                        —           6           1–29        1–157       1–25
No. information sites/variable sites      20/88       51/123      26/39       198/273     29/115
Distribution of variable sites            Dispersive  Dispersive  Dispersive  Dispersive  —
No.