American Journal of Botany: e372–e374. 2011.

AJB Primer Notes & Protocols in the Plant Sciences

CHLOROPLAST MICROSATELLITE PRIMERS FOR CACAO (THEOBROMA CACAO) AND OTHER MALVACEAE 1

Ji Y. Yang 2,5, Lambert A. Motilal 3, Hannes Dempewolf 2, Kamaldeo Maharaj 4, and Q. C. B. Cronk 2

2 Department of Botany, University of British Columbia, 6270 University Boulevard, Vancouver, British Columbia, V6T 1Z4 Canada; 3 Cocoa Research Unit, University of the West Indies, St. Augustine, Trinidad, Republic of Trinidad and Tobago, West Indies; and 4 Ministry of Agriculture, Land, and Marine Resources, Central Experiment Station, Centeno, Via Arima P.O., Republic of Trinidad and Tobago, West Indies

• Premise of the study: Chloroplast microsatellites were developed in Theobroma cacao to examine the genetic diversity of cacao cultivars in Trinidad and Tobago.
• Methods and Results: Nine polymorphic microsatellites were designed from the chloroplast genomes of two T. cacao accessions. These microsatellites were tested in 95 hybrid accessions from Trinidad and Tobago. An average of 2.9 alleles per locus was found.
• Conclusions: These chloroplast microsatellites, particularly the highly polymorphic pentameric repeat (CaCrSSR1), were useful in assessing genetic variation in T. cacao. These markers should also prove useful for population genetic studies in other species of Malvaceae.

Key words: chloroplast haplotype; chocolate; Malvaceae; microsatellites; Theobroma cacao; Trinitario.

1 Manuscript received 21 June 2011; revision accepted 26 July 2011.
The authors thank Dapeng Zhang for sharing his knowledge on Theobroma cacao and Nolan Kane and Saimi Sveinsson for their help with bioinformatics. The work was supported by a World Bank Development Marketplace grant awarded to Q. C. B. Cronk and J. M. M. Engels.
5 Author for correspondence: [email protected]
doi:10.3732/ajb.1100306

Theobroma cacao L. (Malvaceae) is one of the most economically important crops grown in tropical regions of the world. There are three major varietal groups of cacao: the Forastero, the Criollo, and the Trinitario. The Forastero is the most common; the Criollo is the most prized and expensive, but has a lower yield and is more susceptible to diseases than the Forastero; and the Trinitario is thought to be of hybrid origin with Criollo and Forastero parents. The Trinitario is considered to combine the high flavor quality of Criollo with the high-yield and disease-resistant properties of Forastero (http://www.icco.org/about/growing.aspx; Wood and Lass, 1985). As the name implies, Trinitario varieties arose through accidental hybridization in Trinidad and Tobago, and it is the most common varietal group planted in the country. Furthermore, there is a large germplasm collection of Trinitarios at the International Cocoa Genebank, Trinidad (ICG, T). However, field genebanks of cacao contain many mislabeled and misidentified accessions (Motilal et al., 2009; Irish et al., 2010). There is a need for a standardized, reliable method to genetically identify different Trinitario varieties and to estimate their overall genetic diversity.

Chloroplast simple sequence repeat (cpSSR) markers, or microsatellites, have proved useful as genetic markers for distinguishing among different cultivars in several crop species, including barley (Provan et al., 1999) and grapevine (Arroyo-García et al., 2002). In addition, because of their nonrecombinant and uniparentally inherited nature, cpSSRs are ideal markers for cultivar origin and pedigree studies. Thus, cpSSR markers should be useful to unravel the history of crosses that produced the many Trinitario accessions found today. In this study, we designed and characterized nine cpSSR markers to estimate the genetic variation of Trinitario cultivars in Trinidad and Tobago.

METHODS AND RESULTS

Using the complete cpDNA genome sequences from two accessions of T. cacao (Sveinsson et al., 2010; Jansen et al., 2011), we selected polymorphic cpSSR markers. The most common repeat motifs found were mononucleotides, and we selected those with the highest repeat numbers. We also selected one each of polymorphic penta-, hexa-, and octanucleotide repeats because their alleles are easier to score. The nine primer pairs were designed using Primer3 software (Rozen and Skaletsky, 2000) with an expected product size range from 150 to 450 bp.

Genomic DNA was extracted from fresh and silica gel–dried leaves using a QIAamp DNA Stool Mini Kit (QIAGEN, Valencia, California, USA). PCR amplifications were performed in a final volume of 15 μL, containing 10 ng of DNA, 1× PCR reaction buffer (10 mM Tris-HCl pH 8.3, 50 mM KCl), 200 μM of each dNTP, 1.5 mM MgCl2, 1 U of Taq DNA polymerase (Fermentas Canada, Burlington, Ontario, Canada), 0.05 μM forward primer, 0.5 μM reverse primer, and 0.5 μM dye-labeled universal primer. Four dyes (VIC, FAM, NED, and PET) were used (Applied Biosystems, Carlsbad, California, USA). PCR reactions were performed using a BIO-RAD thermocycler with a touch-down program: 95°C for 3 min; followed by nine cycles of 94°C for 30 s, 65°C for 30 s (temperature decreased by 1°C for every cycle), and 72°C for 45 s; followed by 29 cycles of 94°C for 30 s, 55°C for 30 s, 72°C for 45 s; and a final extension at 72°C for 20 min. Each microsatellite marker was amplified singly and then pooled based on differences in fluorescent labeling and expected fragment size. The nine markers were arranged in two multiplex sets


Table 1. Characterization of nine chloroplast microsatellite primers in Theobroma cacao. For each locus, the primer name, forward and reverse primer sequence, repeat motif, product size range, number of alleles, unbiased haploid diversity index (h), annealing temperature (Ta), and GenBank accession number are shown.

Primers   Primer sequence (5′–3′)       Repeat motif   Size range (bp)  No. of alleles  h     Ta (°C)  GenBank accession no.
CaCrSSR1  F: CATGGATTCAGCAGCAGTTC       (CTTTA)11      359–409          7               0.78  55       JF979116
          R: CGAGTCCGCTTATCTCCAAC
CaCrSSR2  F: TTCAACCCAATCGCTCTTTT       (A)12          237–239          3               0.63  55       JF979117
          R: AATTTGAATGATTACCCGATCT
CaCrSSR3  F: AGAACGAATCCGCTCCTCTT       (TAAAAG)2      208–214          2               0.46  55       JF979118
          R: GGTCACGGCAACATAACAAC
CaCrSSR4  F: GAACGACGGGAATTGAACC        (T)10          180–181          2               0.44  55       JF979119
          R: TGATGAATCGTAGAAATGGAAA
CaCrSSR5  F: TCACTTTCACTCCTTTTCCATTT    (T)13          199–209          4               0.5   55       JF979120
          R: TTTTGACTCCGTTTAGACATAGG
CaCrSSR6  F: CGAATCCCTTTCTTCATACAAA     (C)8           175–176          2               0.47  55       JF979121
          R: TTTCATGTTTTGATTGCATCG
CaCrSSR7  F: AGGGCTCCGTAAGATCCAGT       (T)11          285–286          2               0.46  55       JF979122
          R: GTCTTAGGCCTTGCGATTCA
CaCrSSR8  F: TTTCTGATTCACCGGCTCTT       (T)11          299–300          2               0.46  55       JF979123
          R: TGGTGGAATTCTTTGCATTG
CaCrSSR9  F: TCCACTCAGCCATCTCTCCT       (TACTTTAT)11   338–346          2               0.73  55       JF979124
          R: GTCCCTTTTGAGCGAAATCA

Note: h = unbiased haploid diversity index; Ta = annealing temperature.
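As a quick check on the summary statistic quoted for these loci, the allele counts in Table 1 can be averaged directly. The following is a minimal sketch, not part of the original analysis; the dictionary and function names are illustrative, and the counts are simply transcribed from the table.

```python
# Number of alleles observed per locus, transcribed from Table 1.
# The names below are illustrative, not from the original study.
ALLELES_PER_LOCUS = {
    "CaCrSSR1": 7, "CaCrSSR2": 3, "CaCrSSR3": 2,
    "CaCrSSR4": 2, "CaCrSSR5": 4, "CaCrSSR6": 2,
    "CaCrSSR7": 2, "CaCrSSR8": 2, "CaCrSSR9": 2,
}


def mean_alleles(counts):
    """Return the mean number of alleles per locus."""
    return sum(counts.values()) / len(counts)


if __name__ == "__main__":
    total = sum(ALLELES_PER_LOCUS.values())
    print(f"{total} alleles over {len(ALLELES_PER_LOCUS)} loci")
    print(f"mean alleles per locus: {mean_alleles(ALLELES_PER_LOCUS):.1f}")
```

With 26 alleles over nine loci, the mean rounds to the 2.9 alleles per locus reported for this marker set.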

(1) CaCrSSR1 (FAM), CaCrSSR2 (FAM), CaCrSSR3 (VIC), CaCrSSR8 (NED) and (2) CaCrSSR10 (VIC), CaCrSSR4 (PET), CaCrSSR6 (NED), CaCrSSR5 (FAM), CaCrSSR10 (FAM). An ABI 3730 automated DNA Sequencer (Applied Biosystems) was used to genotype the multiplex sets. The software GeneMapper version 3.2 (Applied Biosystems) was used to call the allele sizes.

Marker CaCrSSR1, the pentameric repeat, differed by eight repeats in the two T. cacao accessions from which we designed the cpSSR primers. This marker was the only one to exhibit two alleles. These two alleles differed in amplification intensity and level of polymorphism. The lower intensity allele (362 bp) had about a third of the concentration. This allele did not create a problem in scoring, as it was monomorphic in all 95 samples in the current study with a product size that was easily distinguished from the real pentameric product. The higher intensity allele was highly polymorphic. We hypothesize that the monomorphic allele may be a cross-amplifying product found in nuclear DNA. Further investigation is, however, necessary to establish definitively the monomorphic nature of this extra product.

We tested for microsatellite polymorphism in 95 accessions from Trinidad and Tobago (see Appendix 1). Sixty-four individuals were collected from 33 different farmers from Trinidad; eight individuals were collected from six different farmers from Tobago. The remaining 23 came from the ICG, T collection (one sample from Grenada Selection [GS], 14 samples from Imperial College Selection [ICS], seven samples from Trinidad [TRD], and one sample from Dominica [DOM]). All nine loci were polymorphic, ranging from two to seven alleles with an average of 2.9 alleles per locus (Table 1). The unbiased haploid diversity per locus, calculated using GenAlEx 6.1 software (Peakall and Smouse, 2006), ranged from 0.444 to 0.797. The pentameric cpSSR was the most polymorphic with seven alleles and was the most informative in identifying cultivars within the Trinitario varietal group. We detected eight haplotypes out of the 95 Trinitarios sampled by using all nine cpSSR markers (Table 2). The eight haplotypes varied in frequency, ranging from 0.011 (1 out of 95) to 0.326 (31 out of 95). These cpSSRs may also prove useful in identifying cultivars within and among the 10 population groups said to be present in T. cacao (Motamayor et al., 2008).

The high degree of sequence conservation found in chloroplast genomes (Provan et al., 1999) should enable the primers to be used in other species of Malvaceae. We tested the nine cpSSR primers on Gossypium tomentosum Nutt. ex Seem. Out of the nine cpSSRs, six amplified successfully (CaCrSSR1, CaCrSSR2, CaCrSSR5, CaCrSSR6, CaCrSSR7, and CaCrSSR8). A BLAST search of the amplified pentameric cpSSR sequence (CaCrSSR1), conducted on the National Center for Biotechnology Information website (http://www.ncbi.nlm.nih.gov), returned nearly perfect hit scores to chloroplasts of three species of Gossypium L. The primer sequences matched completely, but the number of pentanucleotide repeats differed. Overall, Gossypium had fewer pentanucleotide repeats than T. cacao. GenBank accessions of G. thurberi Tod. and G. barbadense L. both showed three pentanucleotide repeats, while G. hirsutum L. had two. In addition, we tested the primers on four species of Sidalcea A. Gray (S. glaucescens Greene, S. hendersonii S. Watson, S. nelsoniana Piper, and S. oregana (Nutt. ex Torr. & A. Gray) A. Gray). Four primers amplified successfully (CaCrSSR1, CaCrSSR2, CaCrSSR3, and CaCrSSR8). Except for primer CaCrSSR3, all of these were polymorphic. These cpSSRs may also prove to be useful in population genetic studies of other species in the Malvaceae.
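The touch-down program given with the PCR conditions can be written out cycle by cycle to make the annealing profile explicit. The sketch below is illustrative only. It assumes that the first touch-down cycle anneals at 65°C and that the 1°C decrement takes effect from the second cycle onward; the function name and parameters are hypothetical and do not correspond to any thermocycler software.

```python
def touchdown_schedule(start_anneal=65.0, step=1.0, touchdown_cycles=9,
                       final_anneal=55.0, final_cycles=29):
    """Return the annealing temperature (deg C) for every PCR cycle.

    Assumption: the first touch-down cycle anneals at start_anneal and each
    later touch-down cycle is `step` degrees cooler than the one before.
    """
    touchdown = [start_anneal - step * i for i in range(touchdown_cycles)]
    plateau = [final_anneal] * final_cycles
    return touchdown + plateau


if __name__ == "__main__":
    schedule = touchdown_schedule()
    print("total cycles:", len(schedule))
    print("touch-down phase:", schedule[:9])
    print("plateau phase annealing temperature:", schedule[9])
```

Under these assumptions the touch-down phase runs from 65°C down to 57°C over nine cycles, followed by 29 cycles at 55°C, for 38 amplification cycles in total.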

Table 2. Initial screening of the nine cpSSRs on 95 accessions of Trinitario cultivars. Identified chloroplast haplotypes, haplotype frequencies, and the alleles pertaining to each haplotype are shown.

Ha  Frequency  CaCrSSR1  CaCrSSR2  CaCrSSR3  CaCrSSR4  CaCrSSR5  CaCrSSR6  CaCrSSR7  CaCrSSR8  CaCrSSR9
A   0.189      359       239       285       214       338       175       180       300       199
B   0.105      364       237       286       208       338       174       181       299       199
C   0.179      369       237       286       208       338       174       181       300       204
D   0.031      374       238       286       208       346       174       181       299       209
E   0.011      384       239       286       208       346       174       181       299       209
F   0.021      399       239       285       208       338       175       180       300       200
G   0.137      399       239       285       214       338       175       180       300       200
H   0.326      409       239       286       208       346       174       181       299       209

Note: Ha = identified chloroplast haplotypes.
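The per-locus diversity values can be recomputed from the haplotype frequencies in Table 2. The sketch below does this for CaCrSSR1. It assumes that the unbiased haploid diversity is h = [n/(n − 1)](1 − Σp_i²), which is our reading of the statistic reported by GenAlEx for haploid data, and that each haplotype count is the tabulated frequency multiplied by 95 and rounded. All names in the code are illustrative.

```python
from collections import Counter

N_ACCESSIONS = 95  # accessions genotyped in this study

# Haplotype: (frequency, CaCrSSR1 allele size in bp), transcribed from Table 2.
HAPLOTYPES = {
    "A": (0.189, 359), "B": (0.105, 364), "C": (0.179, 369),
    "D": (0.031, 374), "E": (0.011, 384), "F": (0.021, 399),
    "G": (0.137, 399), "H": (0.326, 409),
}


def allele_counts(haplotypes, n):
    """Convert haplotype frequencies to allele counts (assumes count = round(freq * n))."""
    counts = Counter()
    for freq, allele in haplotypes.values():
        counts[allele] += round(freq * n)
    return counts


def unbiased_haploid_diversity(counts):
    """Assumed formula: h = n / (n - 1) * (1 - sum of squared allele frequencies)."""
    n = sum(counts.values())
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_p2)


if __name__ == "__main__":
    counts = allele_counts(HAPLOTYPES, N_ACCESSIONS)
    print("alleles at CaCrSSR1:", len(counts))
    print(f"unbiased haploid diversity: {unbiased_haploid_diversity(counts):.3f}")
```

Under these assumptions the haplotype counts sum to 95, CaCrSSR1 shows seven alleles, and h comes out at about 0.797, which agrees with the upper end of the diversity range reported in the text.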

CONCLUSION

The Theobroma cacao cpSSRs revealed a high degree of genetic variation within the Trinitario cultivars. Eight haplotypes were identified out of 95 accessions sampled. The pentameric repeat (CaCrSSR1) was the most polymorphic and should prove to be a useful complementary tool for genetic identification of hybrid cultivars between Forastero and Criollo accessions and indeed among all cacao accessions. Furthermore, the pentameric repeat may be useful for population genetic applications in Gossypium and other species in the Malvaceae.

LITERATURE CITED

Arroyo-García, R., F. Lefort, M. T. Andrés, J. Ibáñez, J. Borrego, N. Jouve, F. Cabello, and J. M. Martínez-Zapater. 2002. Chloroplast microsatellite polymorphisms in Vitis species. Genome 45: 1142–1149.

Irish, B. I., R. Goenaga, D. Zhang, R. Schnell, S. Brown, and J. C. Motamayor. 2010. Microsatellite fingerprinting of the USDA-ARS tropical agriculture research station cacao (Theobroma cacao L.) germplasm collection. Crop Science 50: 656–667.

Jansen, R. K., C. Saski, S. B. Lee, A. K. Hansen, and H. Daniell. 2011. Complete plastid genome sequences of three Rosids (Castanea, Prunus, Theobroma): Evidence for at least two independent transfers of rpl22 to the nucleus. Molecular Biology and Evolution 28: 835–847.

Motamayor, J. C., P. Lachenaud, J. W. da Silva e Mota, R. Loor, D. N. Kuhn, J. S. Brown, and R. J. Schnell. 2008. Geographic and genetic population differentiation of the Amazonian chocolate tree (Theobroma cacao L). PLoS ONE 3: e3311.

Motilal, L. A., D. Zhang, P. Umaharan, S. Mischke, M. Boccara, and S. Pinney. 2009. Increasing accuracy and throughput in large-scale microsatellite fingerprinting of cacao field germplasm collections. Tropical Plant Biology 2: 23–37.

Peakall, R., and P. E. Smouse. 2006. GenAlEx 6: Genetic analysis in Excel. Population genetic software for teaching and research. Molecular Ecology Notes 6: 288–295.

Provan, J., J. R. Russell, A. Booth, and W. Powell. 1999. Polymorphic chloroplast simple sequence repeat primers for systematic and population studies in the genus Hordeum. Molecular Ecology 8: 505–511.

Rozen, S., and H. Skaletsky. 2000. Primer3 on the WWW for general users and for biologist programmers. In S. Krawetz and S. Misener [eds.], Bioinformatics methods and protocols: Methods in molecular biology, 365–386. Humana Press, Totowa, New Jersey, USA.

Sveinsson, S., N. C. Kane, H. Dempewolf, D. Zhang, and Q. C. B. Cronk. 2010. Theobroma cacao chloroplast, complete genome. Website http://www.ncbi.nlm.nih.gov/nucleotide/309321245 [accessed 15 May 2011].

Wood, G. A. R., and R. A. Lass. 1985. Cocoa, 4th ed. Longman Group Ltd., Blackwell Science Ltd., Osney Mead, Oxford, United Kingdom.

Appendix 1. Accessions used in this study.

Location                      Accessions
Aripo, Trinidad               K087, K088, K089, K107
Betsy's Hope, Tobago          K064, K130
Biche, Trinidad               K017, K018, K024, K033, K034, K051, K052, K092, K152, K153
Brasso Seco, Trinidad         K011, K012, K013, K094, K110, K111, K112, K113, K114, K117, K131
Brasso Venado, Trinidad       K025, K108
Carapal, Trinidad             K050
Coromandel, Trinidad          K035, K049, K054
Cumana, Trinidad              K128, K129
Gran Couva, Trinidad          K026, K028, K073, K079
ICG, Trinidad                 DOM 13, GS 28, ICS 15, ICS 16, ICS 17, ICS 40, ICS 43, ICS 45, ICS 46, ICS 60, ICS 84, ICS 85, ICS 92, ICS 95, ICS 97, ICS 111, TRD 115, TRD 23, TRD 35, TRD 52, TRD 66, TRD 66b, TRD 86
Lopinot, Trinidad             K056, K071, K097, K109, K120
Moruga, Trinidad              K059, K070, K106, K121, K141, K142, K143, K146, K147, K148, K149, K151
Moriah, Tobago                K053, K060, K075
Roxborough, Tobago            K080
Runnemede, Tobago             K067
Tabaquite, Trinidad           K133
Tableland, Trinidad           K062, K134, K135, K136, K137
Tamana, Trinidad              K055
Vega de Oropouche, Trinidad   K015, K016, K029

Note: ICG = International Cocoa Genebank.