J. Appl. Entomol.

ORIGINAL CONTRIBUTION Development of genic-SSRs markers from soybean aphid sequences generated by high-throughput sequencing of cDNA library T.-H. Jun1,2, M. A. R. Mian1,2,3, K. Freewalt3, O. Mittapalli1 & A. P. Michel1

1 Department of Entomology, Ohio Agricultural Research and Development Center, The Ohio State University, Wooster, OH, USA 2 Department of Horticulture and Crop Science, Ohio Agricultural Research and Development Center, The Ohio State University, Wooster, OH, USA 3 USDA-ARS, Wooster, OH, USA

Keywords host-plant resistance, microsatellite, soybean Abstract aphid, simple sequence repeats The soybean aphid (Aphis glycines Matsumura) is one of the most impor- tant insect pests of soybean [Glycine max (L.) Merr.] in North America, Correspondence Dr. Andy Michel (corresponding author), and three biotypes of the aphid have been confirmed. Genetic studies of Department of Entomology, Ohio Agricultural the soybean aphid are needed to determine genetic diversity, movement Research and Development Center, The Ohio pattern, biotype distribution and mapping of virulence genes for efficient State University, 1680 Madison Avenue, control of the pest. Simple sequence repeats (SSR) markers are useful Wooster, OH 44691, USA. E-mail: for population and classical genetic studies, but few are currently avail- [email protected] able for the soybean aphid. In this study, we designed primers for 342 genic-SSR markers from a dataset of more than 102 024 transcript reads Received: June 14, 2011; accepted: November 20, 2011. generated by 454 GS FLX sequencing of a cDNA library of the soybean aphid. Two hundred forty-six markers generated PCR products of doi: 10.1111/j.1439-0418.2011.01697.x expected size and 26 were polymorphic among four pooled aphid DNA samples. An additional five markers that were fixed for two alleles among the pooled samples were found to be polymorphic when tested on 96 individual aphids. Sequencing of the PCR products generated by two polymorphic SSR markers revealed that the polymorphisms were strictly because of variations in the SSR repeats among the aphids tested. The genetic diversity among 96 soybean aphids, 24 each from two field collections (South Dakota and Michigan, biotype unknown) and two laboratory colonies [biotype 1 (B1) and biotype 2 (B2)], was assessed with 29 polymorphic SSR markers. These markers discriminated labora- tory colonies from field collections and field collections from different states. The genic-SSR markers developed and validated in this study will be a significant addition to the limited number of SSR markers, mostly genomic-SSR, currently available for the soybean aphid. These markers will be useful for genetic studies, including population genetics, genetic and QTL mapping, migration and biotype differentiation of the soybean aphid.

dale et al. 2011). Severe damage by heavy infesta- Introduction tions of soybean aphid has resulted in significant The soybean aphid (Aphis glycines Matsumura) is yield loss of more than 50% in USA and up to 58% native to southeastern and eastern Asia but now has in China (Wang et al. 1996; Ostlie 2002). Direct become a major insect pest of soybean [Glycine max feeding also leads to several phenotypic symptoms (L.) Merr.] in North America (Wu et al. 2004; Rags- such as leaf distortion or curling, stunting, wilting

ª 2011 Blackwell Verlag, GmbH 1 Aphis glycines genic-SSRs T.-H. Jun et al. and yellowing (Hill et al. 2004; Krupke et al. 2005; et al. 2000; Li et al. 2004; Choudhary et al. 2009). Mensah et al. 2005). Soybean aphids can also trans- Expressed sequence tags databases are very useful mit several plant viruses [e.g. Alfalfa mosaic virus, Soy- resources to identify SSRs by using various bioinfor- bean dwarf virus and Soybean mosaic virus] to soybean matic tools (Varshney et al. 2005). A large number of and cause the development of sooty mould because SSR markers from ESTs have been developed and of honeydew excreted by soybean aphid, which utilized in diverse species such as cotton (Gossypium inhibits photosynthesis (Iwaki et al. 1980; Hartman spp., Guo et al. 2007), western corn rootworm et al. 2001; Hill et al. 2001; Mueller and Grau 2007). (Diabrotica virgifera, Kim et al. 2008b), apple maggot Host-plant resistance is an effective method for (Rhagoletis pomonella, Schwarz et al. 2009), chickpea controlling insect pests, which also reduces the use (Cicer arietinum L., Choudhary et al. 2009) and coral of chemical insecticide applications and is suitable for (Acropora millepora, Wang et al. 2009). In comparison organic soybean production. Recently, at least three with genomic-SSRs having no relation to the tran- soybean aphid resistance genes (Rag1, Rag2 and scribed regions, genic or EST-SSRs have an advantage Rag3) have been identified and mapped on three dif- of being directly associated with expressed regions of ferent soybean chromosomes (7, 13 and 16) (Li et al. the genes, because the markers are designed from the 2007; Mian et al. 2008; Zhang et al. 2010). However, coding regions of the genome that are subject to selec- biotypes of soybean aphid have been identified in tion pressures (Li et al. 2004; Choudhary et al. 2009; North America that can overcome host-plant resis- Garg et al. 2011). In addition, genic-SSRs are useful tance provided by these genes. As Rag resistance was for comparative analysis across related species because found from Asian varieties, and soybean aphid’s the coding regions of the genome are more conserved native range is Asia, these biotypes were likely part than non-coding genomic regions (Varshney et al. of the initial invasion. For example, biotype 2 of soy- 2005; Jakse et al. 2010; Zeng et al. 2010). bean aphid, capable of defeating the Rag1 gene, has Traditionally, generating a large number of genic- been found in Ohio and biotype 3, able to colonize SSRs has been technically challenging, expensive plants with the Rag2 resistance gene, was identified and time-consuming (Eujayl et al. 2004; Wang from the overwintering host Frangula alnus in Indi- et al. 2010a). Recently developed next-generation ana (Kim et al. 2008a; Hill et al. 2010). As a result, sequencing platforms such as Illumina’s Genome the presence of such soybean aphid biotypes can Analyzer (GA), ABI’s SOLiD and Roche’s 454 GS reduce efficacy and sustainability of host-plant resis- FLX have been used in various studies, including tance. Therefore, successful deployment of soybean large-scale resequencing in well-characterized spe- aphid resistance genes depends on using them in cies, transcriptional profiling and transcriptome areas where certain biotypes are non-existent, less sequencing (Ossowski et al. 2008; Varshney et al. likely to appear or develop. However, the knowledge 2009; Wang et al. 2010b; Cullum et al. 2011). Addi- of overall genetic diversity, population structure, bio- tionally, de novo transcriptome sequencing for species type frequencies or abundance among North Ameri- without reference sequences have succeeded in gen- can soybean aphid populations is very limited, erating and assembling draft sequences using mainly because of the lack of informative molecular advanced programming, which enables the develop- markers (Michel et al. 2009a,b). Thus, development ment of a large number of genetic markers (Varsh- of PCR-based co-dominant molecular markers is ney et al. 2009; Bai et al. 2010; Li et al. 2010; Wang needed to clarify relationships among the soybean et al. 2010a; Garg et al. 2011). The objectives of this aphid populations and monitor biotype migration. study were (i) to identify and validate genic-SSR Molecular genetic markers have been developed markers from soybean aphid transcript sequences and utilized for the study of population genetic struc- generated by 454 GS FLX transcript sequencing and ture of many insect species. In particular, simple (ii) to assess their polymorphism and usefulness in sequence repeats (SSRs) or microsatellites are widely determining genetic diversity within and among used markers because of their high abundance in soybean aphid populations. the genome, locus-specific co-dominant inheritance, multiallelic nature, high polymorphism and high Materials and Methods rates of transferability across species (Aggarwal et al. 2007; Choudhary et al. 2009; Dutta et al. 2011). The Sequence data sources SSR markers can be found either in non-coding or coding regions of the genome, including protein-cod- Information regarding sampling and 454 GS FLX ing genes and expressed sequence tags (ESTs) (To´ th sequencing is published elsewhere (Bai et al. 2010).

2 ª 2011 Blackwell Verlag, GmbH T.-H. Jun et al. Aphis glycines genic-SSRs

Briefly, total RNA was isolated from 50 biotype 1 were collected from a soybean field in SD in 2009 aphids using Trizol Reagent (Invitrogen). Approxi- (GPS: 44.324, )96.775). The four pooled DNA sets mately 10–20 lg total RNA were sent to the Purdue (B1, B2, OH-09 and SD) each consisted of about 20 University Genome Center for 454 sequencing fol- adult soybean aphids. Each of the adult aphids was lowing manufacturer’s instructions. A total of collected from a different soybean plant to limit 7 366 599 bp forming 19 293 high-quality transcript chances of collecting clones. This pooling of DNA sequences were obtained through two rounds of was performed only for the initial validation for assembly using Newbler and Phrap program (Bai SSR markers (e.g. PCR amplification condition, et al. 2010). Of the 19 293 transcript sequences, a product size estimation and potential for polymor- total of 881 (4.6%) transcript sequences containing phism). SSR regions were detected using the program msat For the population-level genetic diversity and dif- finder (Thurston and Field 2005) with a minimal ferentiation analyses, 24 individual soybean aphids repeat number of 8 (dinucleotide repeats) and 5 (all were sampled from biotype 1 and biotype 2 colo- other repeats). Of these, 501 (2.6%) transcript nies and from two field sites, Michigan (MI, GPS: sequences (mean length 615, range 119–2771 bp) 42.691, )84.486) and South Dakota (SD, same as that had at least 40 bp sequences in both flanking above). Infested leaves were collected throughout regions of each SSR were used to develop genic-SSR the field but only using one aphid per leaf per markers in the present study. plant to control for the possibility of collecting clones. Development of SSR markers PCR conditions and fragment analysis of SSRs The 501 transcripts from Bai et al. (2010) were searched again using BatchPrimer3 software (http:// DNA was extracted using the QuickExtract Seed probes.pw.usda.gov/batchprimer3/index.html) by DNA Extraction Solution (Epicentre, Madison, WI) changing the minimum number of penta- and hexa- according to the manufacturer’s protocol. PCR nucleotide repeats to 4. An additional six loci (3 amplifications were performed in 20-ll reactions penta- and 3 hexa-nucleotide repeats) were found containing approximate 50 ng of template DNA, 1· among the 501 contigs that contained microsatel- PCR buffer, 2.5 mm Mg2+, 200 lm dNTP, 40 nm of lites. The primary parameters for primer design were forward primer with M13 tail, 160 nm reverse prim- the following: PCR product size ranging from 100 to ers, 160 nm of dye labelled M13 primer and 1.0 unit 400 bp for multiplex analysis, primer length from 18 of Taq DNA polymerase (GenScript USA Inc., Piscat- to 25 bp with 20 bp as the optimum, optimum away, NJ). The PCR cycles consisted of initial dena- annealing temperature 60C having difference turation at 95C for 5 min, followed by 32 cycles of within 1C between forward and reverse primers, 30-s denaturation at 94C, 30-s annealing tempera- and GC content from 40% to 60% with 50% as ture between 48 and 65C depending on the opti- optimum. All forward primers had the universal mum annealing temperature for each primer pair M13 sequence (CACGACGTTGTAAAACGAC) at (see Table S2), and 30-s extension at 72C, followed their 5¢ end for use in fluorescent primer labelling by eight cycles of 30-s denaturation at 94C, 45-s (Schuelke 2000). The primers were synthesized by annealing at 53C and 45-s extension at 72C. The Integrated DNA Technologies Inc (Coraville, IA). optimum annealing temperature for each primer pair was determined empirically, using the theoreti- cal melting temperature as the starting point and Soybean aphid samples adjustments to annealing temperatures (increase The adult soybean aphids collected from biotype 1 and decrease) as needed to obtain clean PCR prod- (B1), biotype 2 (B2), an Ohio field in 2009 (OH- ucts. The PCR was finished with a final 12-min 09) and a South Dakota field in 2009 (SD) were extension at 72C on a thermal cycler (PTC-225; MJ used as pooled DNA templates. Both biotype 1 and Research Inc., Waltham, MA, USA; model TC-512; biotype 2 aphids were collected from separate and Techne Inc., Burlington, NJ, USA). Multiplex frag- isolated laboratory colonies that were started with ment analysis and allele sizing for the SSR markers limited (<50) individuals (see Michel et al. 2010). were performed using the CEQ 8800 genetic ana- The OH-09 aphids were collected from a soybean lyzer and software, respectively (Beckman Coulter, field near the OARDC campus during the summer Fullerton, CA). Allele scoring was checked for the of 2009 (GPS: 40.768, )81.906), and the SD aphids presences of null alleles and stutter peaks with

ª 2011 Blackwell Verlag, GmbH 3 Aphis glycines genic-SSRs T.-H. Jun et al.

Microchecker (Van Oosterhout et al. 2004), and no Results significant evidence for either was found. Frequency distribution and development of genic- Sequencing analysis of PCR products SSRs Two genic-SSR markers (Ae-SSR32 and Ae-SSR331) A total of 621 SSRs were identified in 501 transcript that revealed length polymorphism with a single sequences, which are derived from the total of 19 293 clear and specific band in the initial screening of the transcript sequences containing 11 866 singletons and four pooled aphid samples were selected to identify 7427 contigs (fig. 1). Among the sequences used for the source of length variation among PCR products. the discovery of SSRs, 409 (81.6%) had single SSR, Direct sequencing of PCR products amplified by the while 92 (18.4%) sequences had 2 (13.8%), 3 (3.4%) two genic-SSR markers in four aphid DNA samples or 4 (1.2%) SSRs each. The trinucleotide motifs were (one each from B1, B2, OH-09 and SD) were per- the most abundant repeat motif (79.5%), followed by formed. For the BigDye terminator cycle sequencing, di- (16.2%), tetra- (2.4%), penta-(1.4%) and hex- the PCR products were directly purified using the anucleotide motifs (0.5%) (fig. 1). The most abun- mixture of exonuclease I and shrimp alkaline phos- dant dinucleotide repeat unit detected was the AT/AT phatase (ExoSAP), which were conducted in 12-ll representing (10.1%) of the total 621 SSRs, followed reactions containing 10 ll of PCR product, 2.0 units by AC/GT (4.8%) and CA/TG (0.8%). The AAT/ATT of exonuclease I and 1.8 units of shrimp alkaline motif was the most abundant trinucleotide repeat phosphatase. Incubation was performed at 37C for type (20.8%) of the total SSRs followed by ATA/TAT 1 h and 72C for 15 min. Sequencing was performed (19.0%) and TAA/TTA (16.3%) (Table S1). in both forward and reverse directions with PCR A total of 342 primer pairs were designed from amplification primers using ABI Prism 3100 genetic the total of 621 SSRs (fig. 1). The primers for the analyzer (Functional Biosciences, Madison, WI), and remaining 279 SSRs could not be designed because the sequences from each aphid sample were aligned of flanking sequence being too short or poor base using CodonCode Aligner 3.5.1 demo version composition for primer design. (http://www.codoncode.com/aligner/) and Clustal W multiple alignment of BioEdit Sequence Alignment Validation of genic-SSR markers Editor 7.0.9 software (Hall 1999). Of the 342 primer pairs that were tested for initial validation for SSR markers using the four pooled Genetic diversity analysis aphid DNA templates (B1, B2, OH-09 and SD), a

For each marker, allelic richness (Rs), observed (Ho) total of 246 (71.9%) primer pairs generated PCR and expected heterozygosity (He) were calculated products with expected size, whereas 72 (21.0%) using the program FSTAT (Goudet 1995, 2001). primers pairs did not amplify even after retests under Hardy–Weinberg disequilibrium was also calculated multiple PCR conditions (table 1; Table S2). In addi- using FSTAT, using 10 000 random permutations, tion, 18 (5.3%) primers pairs produced PCR frag- with Bonferroni corrected P-values. FSTAT was also ments much larger than the expected size and 6 used to calculate the genetic differentiation statistic, (1.8%) primers pairs amplified multiple products

FST, with significance determined by 10 000 permuta- (table 1; Table S2). Of the 246 primers pairs vali- tions (analyses were performed with and without dated in the initial test, 208 (84.6%) primers pairs repeated genotypes, as common for aphid population each generated a single (allele) whereas 38 (15.4%) genetic studies). The SSR allelic data were converted primers pairs produced two alleles. A total of 26 into binary code as 1 or 0 for each aphid individual to (7.6%) primer pairs were polymorphic among the evaluate the genetic relationship. The genetic similar- four pooled aphid DNA templates (table 1; Table S2). ity among the aphids was determined using Dice The remaining 12 primer pairs each showed the coefficient (Dice 1945). The UPGMA (Unweighted presence of two identical-sized bands among the Pair-Group Method with Arithmetic mean) based four DNA pools, indicating either fixation of the het- dendrogram was constructed based on similarity erozygous state or the presence of two alleles in the coefficients using SAHN module of the NTSYS-pc soft- pooled DNA. These 12 markers were further tested ware package ver 2.2. The dendrogram was subjected using 96 individual aphids from four aphid collec- to 1000 bootstraps using the FreeTree software tions (see below). Some individual aphids showed a (Hampl et al. 2001). second fainter band for each marker and some did

4 ª 2011 Blackwell Verlag, GmbH T.-H. Jun et al. Aphis glycines genic-SSRs

102 024 transcript reads (Total size: 30 438 043 bp)

Assembly using Newbler and Phrap program

19 293 transcript sequences (Total size: 7 366 599)

11 866 singletons 7427 contigs

SSR identification using BatchPrimer3

621 SSRs in 501 sequences Di-nucleotide : 100 (16.2%) Tri-nucleotide : 494 (79.5%) Tetra-nucleotide : 15 (2.4%) Penta-nucleotide : 9 (1.4%) Hexa-nucleotide : 3 (0.5%)

Primer design using BatchPrimer3

342 primer pairs designed

Fig. 1 Summary information on the database, simple sequence repeats (SSR) identification and primer design for development of genic- SSRs from soybean aphid transcript 49 (14.3%) 281 (82.2%) 5 (1.5%) 6 (1.7%) 1 (0.3%) Di-nucleotide Tri-nucleotide Tetra-nucleotide Penta-nucleotide Hexa-nucleotide sequences.

Table 1 Summary of genic-SSR marker validation using 342 primers four aphid samples (fig. 2). Both B2 and SD aphid for four different pooled soybean aphid DNA samples (biotype 1, bio- samples possessed 10 TA repeats for the Ae-SSR32, type 2, Ohio-09 and South Dakota) whereas OH-09 and B1 had 8 TA repeats each. For Ae-SSR331, OH-09 and B1 samples had 13 ATA Percent of motif units indicating two more repeats than the Category Numbers total (%) other two aphids that differed by 6 bp in size based Number of primers with 246 71.9 on genotyping data. The SSR flanking sequences in expected allele size Ae-SSR32 and Ae-SSR331 each were identical across Number of primers 26 7.6 all aphids tested. Therefore, the results from the showing polymorphism sequence alignments confirm that the differences in Number of primers failed to 72 21.0 number of motif repeats resulted in allele size varia- amplify PCR products tions for the two SSR markers. Number of primers with larger 18 5.3 PCR bands than expected Number of primers with 6 1.8 Genetic relationship among soybean aphid popula- multiple bands tions SSR, Simple sequence repeats. Twenty-nine polymorphic genic-SSRs markers were used to assess the genetic diversity and differentia- not have the second band, indicating non-specific tion in a set of 96 soybean aphids collected from B1, primer-binding sites. Under more stringent PCR con- B2, MI and SD with 24 individual aphids tested ditions (higher annealing temperatures), five primer from each location (fig. 3). A total of 58 alleles were pairs produced only a single band (implying previous amplified from 29 SSRs with an average of 2.0 per co-amplification), two produced the same two bands locus. Within population polymorphism, statistics are in all 96 aphids tested (i.e. all individuals fixed for shown in table 2. In both biotype colonies, Hardy– the heterozygote state) and the remaining five each Weinberg equilibrium could not be calculated for produced polymorphic results (Table S2). several loci, as fixation of either the homozygote or Multiple alignments were performed using nucleo- heterozygote genotype occurred, but represents tide sequences containing the SSR regions from PCR deviations nonetheless. Fixation of genotypes in products amplified with two genic-SSR markers in these populations is likely due to continued asexual

ª 2011 Blackwell Verlag, GmbH 5 Aphis glycines genic-SSRs T.-H. Jun et al.

(a) Ae-SSR32

B2 GGTCTAACCGTTATGCGTTTAAACGTAATGATATGTATTACTTAACATTATATAATATATATATATATATATATAAATTGTTTCTAAAGTATACGCTAGCCGAGTGAACAACAGAA (116 bp) OH(09) GGTCTAACCGTTATGCGTTTAAACGTAATGATATGTATTACTTAACATTATATAATATATATATATATATA AATTGTTTCTAAAGTATACGCTAGCCGAGTGAACAACAGAA (112 bp) B1 GGTCTAACCGTTATGCGTTTAAACGTAATGATATGTATTACTTAACATTATATAATATATATATATATATA AATTGTTTCTAAAGTATACGCTAGCCGAGTGAACAACAGAA (112 bp) SD GGTCTAACCGTTATGCGTTTAAACGTAATGATATGTATTACTTAACATTATATAATATATATATATATATATATAAATTGTTTCTAAAGTATACGCTAGCCGAGTGAACAACAGAA (116 bp)

(b) Ae-SSR331

B2 TTATCGGTGAGTGAGGAATGTGAAAACGTCCCTAATATTTAATAATTAATATTTATATAATAATAATAATAATAATAATAATAATAATA AATTACTAAAACACTGGAGATT (111 bp) OH(09) TTATCGGTGAGTGAGGAATGTGAAAACGTCCCTAATATTTAATAATTAATATTTATATAATAATAATAATAATAATAATAATAATAATAATAATAAATTACTAAAACACTGGAGATT (117 bp) B1 TTATCGGTGAGTGAGGAATGTGAAAACGTCCCTAATATTTAATAATTAATATTTATATAATAATAATAATAATAATAATAATAATAATAATAATAAATTACTAAAACACTGGAGATT (117 bp) SD TTATCGGTGAGTGAGGAATGTGAAAACGTCCCTAATATTTAATAATTAATATTTATATAATAATAATAATAATAATAATAATAATAATA AATTACTAAAACACTGGAGATT (111 bp)

B2 ATAATTAATCGTAAAAATTACAAGATTTTAAAATTATGTTTTAAAGTGTAAAACAACAAATTATTTATAATATAATTTTTCATACTTAAGATTATTATTATTATTTTTATATCATATTACC (232 bp) OH(09) ATAATTAATCGTAAAAATTACAAGATTTTAAAATTATGTTTTAAAGTGTAAAACAACAAATTATTTATAATATAATTTTTCATACTTAAGATTATTATTATTATTTTTATATCATATTACC (238 bp) B1 ATAATTAATCGTAAAAATTACAAGATTTTAAAATTATGTTTTAAAGTGTAAAACAACAAATTATTTATAATATAATTTTTCATACTTAAGATTATTATTATTATTTTTATATCATATTACC (238 bp) SD ATAATTAATCGTAAAAATTACAAGATTTTAAAATTATGTTTTAAAGTGTAAAACAACAAATTATTTATAATATAATTTTTCATACTTAAGATTATTATTATTATTTTTATATCATATTACC (232 bp)

B2 ATAATTGTATATATAATAATATAATATGTTAAGATGAAAACATTGCTAGCAAAAACTGAAAAGAAAAAATTTTAATATTTAAAAAAGAAAAAACAC GTTGCCGTCGGATAGTTG (346 bp) OH(09) ATAATTGTATATATAATAATATAATATGTTAAGATGAAAACATTGCTAGCAAAAACTGAAAAGAAAAAATTTTAATATTTAAAAAAGAAAAAACAC GTTGCCGTCGGATAGTTG (352 bp) B1 ATAATTGTATATATAATAATATAATATGTTAAGATGAAAACATTGCTAGCAAAAACTGAAAAGAAAAAATTTTAATATTTAAAAAAGAAAAAACAC GTTGCCGTCGGATAGTTG (352 bp) SD ATAATTGTATATATAATAATATAATATGTTAAGATGAAAACATTGCTAGCAAAAACTGAAAAGAAAAAATTTTAATATTTAAAAAAGAAAAAACAC GTTGCCGTCGGATAGTTG (346 bp)

Fig. 2 Partial sequence alignment of alleles derived from four soybean aphid samples; biotype 1 (B1), biotype 2 (B2), Ohio field in 2009 [OH (09)] and South Dakota field in 2009 (SD) using two simple sequence repeats (SSR) markers: (a) Ae-SSR32 and (b) Ae-SSR331. Repeat regions are in bold- face, and alignment gaps are represented by dash. Primer-binding sites are indicated by underlined letters. reproduction and clonal amplification, where 15 and 17) grouped under cluster VI revealing a favoured genotypes quickly outnumber and outcom- very high genetic dissimilarity with other 21 aphids pete less fit genotypes. In the field samples, only from MI (fig. 3). Similarly, 23 of the SD individuals Ae-SSR 46 and Ae-SSR 11, Ae-SSR 152 and Ae-SSR grouped together under cluster III with minor differ- 167 showed significant deviation in Hardy–Weinberg ences, but 1 individual (SD-16) showed a different equilbirum in both field samples. A null allele effect genotype and was placed separately under cluster is likely not the cause given that the direction of IV. The overall clustering found in the dendrogram deviation was heterozygote excess (Ho>He). As these was also corroborated by FST values (table 3). All FST samples were collected late during the year after values were significant (P = 0.006 after Bonferroni many asexual generations occurred, deviations are corrections). likely due to clonal amplification, similar to labora- tory populations. Nonetheless, the microsatellites Discussion provided good resolution, as genotypic diversity was high in field populations. Each individual aphid Frequency and distribution of genic-SSRs represented a unique genotype among the 48 total individuals, and no genotype was found in more Mining of genic-SSRs in soybean aphid transcript than one population. sequences from the next generation sequencing (454 A dendrogram calculated by dice similarity coeffi- GS FLX) was successfully carried out in the present cient placed the aphid individuals into six clusters at study. In an earlier study, a total of 308 genomic- a cut-off point of 0.86 similarity (fig. 3). Cluster II SSRs were detected from the 1240 contigs through included all 24 aphids from biotype 2 with no Illumina GAII sequencing of soybean aphid (Jun within-cluster differences, and cluster V included all et al. 2011). These SSR markers from soybean aphid individuals of biotype 1 with minor differences demonstrate that the Illumina GAII technology could among individuals. Twenty-one of the 24 aphids be very useful for developing molecular markers for from Michigan grouped in cluster I with minor organisms with limited genomic resources. A total of differences, while the remaining three aphids (MI-5, 621 genic-SSRs were obtained in our study, and the

6 ª 2011 Blackwell Verlag, GmbH T.-H. Jun et al. Aphis glycines genic-SSRs

MI1 MI2 MI3 MI4 MI8 MI9 MI10 MI11 MI12 MI13 MI14 MI16 MI18 MI21 I MI22 99 MI23 MI24 MI6 MI7 MI19 MI20 BT2-1 BT2-2 BT2-3 BT2-4 BT2-5 BT2-6 BT2-7 BT2-8 BT2-9 BT2-10 II 99 BT2-11 BT2-12 BT2-13 BT2-14 BT2-15 BT2-16 BT2-17 BT2-18 BT2-19 BT2-20 BT2-21 BT2-22 BT2-23 BT2-24 SD12 SD23 SD2 SD3 SD4 SD5 SD6 SD7 SD8 SD9 SD10 SD18 SD19 SD20 III 96 SD21 SD22 SD1 SD13 SD14 SD15 SD17 IV SD11 SD24 SD16 BT1-1 BT1-4 BT1-5 BT1-7 BT1-10 BT1-11 BT1-12 BT1-13 BT1-16 BT1-17 BT1-18 100 V 99 BT1-19 BT1-21 BT1-22 BT1-23 BT1-24 BT1-2 BT1-3 BT1-6 BT1-8 BT1-9 BT1-14 BT1-15 VI 99 BT1-20 MI5 MI15 MI17 0.78 0.84 0.89 0.95 1.00 Similarity coefficient

Fig. 3 The UPGMA dendrogram of the genetic diversity and differentiation analyses among the 96 soybean aphids collected from biotype 1 (BT1- 1 to BT1-24) and biotype 2 (BT2-1 to BT2-24) laboratory colonies and from two field sites, Michigan (MI-1 to MI-24) and South Dakota (SD-1 to SD- 24), 24 individual samples from each. Bootstrap values over 50% are shown at the inner nodes. frequency of genic-SSRs was approximately 1 SSR/ 2007; Gong et al. 2008; Dutta et al. 2011). The fre- 11.9 kb. Higher or lower frequencies of genic-SSRs quency and distribution of genic-SSRs can vary have been reported in various studies, for instance, depending on (i) various factors in the SSR search rice (1 SSR/3.4 kb), pigeonpea (1 SSR/8.4 kb) and such as software, search criteria and size of data set cotton (1 SSR/20.0 kb). However, such different fre- and (ii) genomic regions (e.g. coding regions, 5¢- quencies could be due to the application of different UTRs and 3¢-UTRs) used for the detection of genic- repeat unit thresholds and repeat length, the amount SSRs (Li et al. 2004; Varshney et al. 2005; Garg et al. of database searched and SSR identification tools 2011). Yu et al. (2004) reported different frequency used (Varshney et al. 2005; Dutta et al. 2011). of di- and trinucleotide repeats in coding regions and Trinucleotide motifs were the most abundant UTRs – 74% of the trinucleotides repeats were in repeat type in these genic-SSRs, representing 79.5% coding regions and 26% in UTRs, whereas 19% of of the total SSRs detected. This result is in agreement dinucleotides repeats were observed in coding with a number of previous reports that the trinu- regions, and 42% and 39% were in 5¢ and 3¢ UTRs, cleotide motifs are the most frequent genic repeats respectively. in most species, which might be due to the tendency towards negative selection against frame-shift muta- Development and validation of genic-SSRs tions in the coding regions (Metzgar et al. 2000; Cloutier et al. 2009). However, in some species such Approximately 72% of the 342 SSR primers used for as coffee, pigeonpea, and pumpkin dinucleotide initial validation generated the PCR product with the repeats were the most frequent (Aggarwal et al. expected product size, while 96 primers (28%) failed

ª 2011 Blackwell Verlag, GmbH 7 Aphis glycines genic-SSRs T.-H. Jun et al.

Table 2 Polymorphism statistics of genic-SSR markers from A. glycines

Biotype 1 Biotype 2 Michigan South Dakota

Locus Rs Ho He FIS Rs Ho He FIS Rs Ho He FIS Rs Ho He FIS

22 2.00 1.00 0.52 )1.00 2.00 1.00 0.53 )1.00 2.00 0.88 0.51 )0.78 2.00 1.00 0.52 )1.00 32 1.00 0.00 0.00 NA 1.00 0.00 0.00 NA 2.00 0.81 0.50 )0.67 1.00 0.00 0.00 NA 13 2.00 1.00 0.53 )1.00 2.00 1.00 0.53 )1.00 2.00 1.00 0.52 )1.00 1.72 0.00 0.13 1.00 46 2.00 1.00 0.52 )1.00 1.00 0.00 0.00 NA 2.00 1.00 0.51 )1.00 2.00 1.00 0.52 )1.00 116 2.00 1.00 0.53 )1.00 2.00 1.00 0.53 )1.00 2.00 1.00 0.51 )1.00 2.00 0.94 0.51 )0.88 136 2.00 1.00 0.53 )1.00 2.00 1.00 0.54 )1.00 2.00 0.86 0.50 )0.74 1.47 0.07 0.07 0.00 202 2.00 1.00 0.53 )1.00 2.00 1.00 0.53 )1.00 1.72 0.14 0.14 )0.05 1.00 0.00 0.00 NA 247 1.00 0.00 0.00 NA 1.00 0.00 0.00 NA 1.00 0.00 0.00 NA 2.00 1.00 0.52 )1.00 292 2.00 1.00 0.53 )1.00 2.00 1.00 0.54 )1.00 2.00 0.83 0.50 )0.70 2.00 0.93 0.51 )0.87 338 1.00 0.00 0.00 NA 2.00 1.00 0.53 )1.00 2.00 0.83 0.50 )0.70 2.00 0.92 0.52 )0.85 26 1.64 0.09 0.09 0.00 1.00 0.00 0.00 NA 1.00 0.00 0.00 NA 1.00 0.00 0.00 NA 161 1.00 0.00 0.00 NA 1.00 0.00 0.00 NA 2.00 0.84 0.50 )0.71 1.00 0.00 0.00 NA 342 2.00 1.00 0.52 )1.00 2.00 1.00 0.53 )1.00 2.00 0.90 0.51 )0.81 2.00 0.94 0.51 )0.88 81 1.00 0.00 0.00 NA 1.00 0.00 0.00 NA 1.78 0.17 0.16 )0.06 2.00 0.94 0.51 )0.88 168 1.00 0.00 0.00 NA 1.00 0.00 0.00 NA 1.00 0.00 0.00 NA 1.72 0.00 0.13 1.00 213 2.00 1.00 0.52 )1.00 1.00 0.00 0.00 NA 2.00 1.00 0.51 )1.00 2.00 0.94 0.51 )0.88 63 1.00 0.00 0.00 NA 2.00 1.00 0.53 )1.00 1.00 0.00 0.00 NA 1.00 0.00 0.00 NA 307 2.00 1.00 0.53 )1.00 1.00 0.00 0.00 NA 1.00 0.00 0.00 NA 1.69 0.13 0.12 )0.03 16 2.00 1.00 0.53 )1.00 1.00 0.00 0.00 NA 1.94 0.00 0.26 1.00 2.00 1.00 0.52 )1.00 74 1.00 0.00 0.00 NA 2.00 1.00 0.53 )1.00 2.00 0.85 0.50 )0.73 2.00 0.75 0.49 )0.57 209 1.00 0.00 0.00 NA 1.00 0.00 0.00 NA 1.86 0.11 0.19 0.46 1.00 0.00 0.00 NA 11 2.00 1.00 0.51 )1.00 1.00 0.00 0.00 NA 2.00 1.00 0.51 )1.00 2.00 1.00 0.51 )1.00 28 1.00 0.00 0.00 NA 2.00 1.00 0.51 )1.00 2.00 0.13 0.12 )0.05 2.00 1.00 0.51 )1.00 331 2.00 1.00 0.51 )1.00 1.00 0.00 0.00 NA 2.00 0.91 0.51 )0.83 2.00 0.05 0.05 0.00 113 1.00 0.00 0 NA 2.00 1 0.51 )1.00 1.80 0.13 0.12 )0.05 2.00 1.00 0.51 )1.00 152 2.00 0.00 0.43 1.00 2.00 0.95 0.51 )1.00 2.00 1.00 0.51 )1.00 2.00 1.00 0.51 )1.00 176 2.00 1.00 0.51 )1.00 2.00 1.00 0.51 )1.00 1.45 0.05 0.05 )0.05 1.97 0.10 0.18 0.45 258 1.60 0.04 0.04 0 2.00 1.00 0.51 )1.00 1.80 0.13 0.12 )0.05 2.00 1.00 0.51 )1.00 167 1.00 0.00 0.00 NA 2.00 1.00 0.51 )1.00 2.00 1.00 0.51 )1.00 2.00 1.00 0.51 )1.00

SSR, Simple sequence repeats; Rs, allelic richness; Ho, observed heterozygosity; He, expected heterozygosity. Bold values represent loci with signif- icant FIS values in field populations (all except locus 16 in MI are significant heterozygosity deficits, P < 0.006 after Bonferroni corrections). Itali- cized values represent loci in HWE (i.e. neutral loci) in field populations. Data shown after removal of repeated MLGs. NA: test not performed.

Table 3 Pairwise FST values of soybean aphid populations at 29 various tests under multiple PCR conditions. These microsatellites failures could be caused because of the following: (i) flanking primers designed across a splice site, (ii) the MI B1 B2 SD existence of large introns within the PCR target region, (iii) the existence of null alleles because of MI – mutation at the primer-binding sites and (iv) the use B1 0.39 – B2 0.36 0.35 – of inaccurate sequences because of assembly errors SD 0.30 0.41 0.28 – in de novo transcriptome sequencing for species without reference sequences (Pemberton et al. 1995; MI, Michigan; B1, Biotype 1; B2, Biotype 2; SD, South Dakota. Varshney et al. 2005; Cloutier et al. 2009; Wang All values significant after Bonferroni correction (P < 0.006). et al. 2010a). In the present study, the genic-SSR markers to amplify the expected PCR products. Among the showed a low level of polymorphism (8.5%), com- 96 primers, 18 (5.3%) produced much larger than pared with the result (about 20%) of previous expected band sizes. These results might be due to marker validation using genomic-SSR markers (Jun presence of a small intron within the flanking region et al. 2011) and cross-species amplification of micro- of each of the primer pairs. In contrast, 72 (21%) of satellites from Aphis gossypii and Aphis fabae (Michel the 96 primers amplified no PCR fragments despite et al. 2009a). Our observation was in agreement

8 ª 2011 Blackwell Verlag, GmbH T.-H. Jun et al. Aphis glycines genic-SSRs with other studies, showing that genomic-SSRs were tion were very diverse compared with the rest of the more polymorphic compared with genic-SSRs, which aphids in the collection while only one aphid may be due to greater levels of sequence conserva- showed divergence from the rest of the aphids in the tion within transcribed or gene-rich regions (Eujayl SD collection. These results match with the compar- et al. 2001; Chabane et al. 2005; Varshney et al. ative genetic diversity expected of the soybean 2005). Nevertheless, it is possible that the low poly- aphids in these two states. Michigan is one of the morphism rate of genic-SSRs developed in the pres- major soybean producing states with high popula- ent study may be partly due to small number of tion of buckthorn (winter host of the soybean samples tested. Cloutier et al. (2009) suggested that aphid), and the aphid has been present in this state the relatedness or the number of genotypes tested since its first discovery in 2000. On buckthorn, can partly influence the polymorphism rates. Addi- aphids undergo sexual reproduction (and hence tionally, polymorphism could be decreased due to a recombination) and overwinter as eggs; thus, the bottleneck during the initial North American inva- amount of buckthorn in a given area is potentially sion, use of field samples from few locations col- correlated with soybean aphid genetic diversity. lected in early in the season and clonally propagated South Dakota is at the western fringe of the US soy- laboratory colonies. Sequencing of the PCR products bean belt with very few buckthorn present in the of two SSR markers showed that the polymorphism state. The initial aphid colonization of soybean fields occurred strictly because of variations in the repeat in SD each year may require migration of the aphids units in the aphid samples used. However, SSR from distant overwintering sites creating a genetic polymorphism can also occur from variations in bottleneck. Thus, given a smaller resident popula- the flanking regions (Eujayl et al. 2004; Saha et al. tion, smaller genetic diversity would be expected in 2006). a field collection of soybean aphids from SD com- pared with a collection from MI. The usefulness of the 29 polymorphic genic-SSR Genetic differentiation within/among different markers for genetic diversity evaluation of soybean soybean aphid populations aphids has been clearly demonstrated in this study. The 29 polymorphic SSR markers tested proved very These markers discriminated laboratory colonies useful for genetic differentiation among the aphid from field collections, field collections from different samples used in the genetic diversity test (fig. 3, states and among individuals in a field collection. table 3). The aphids of the two laboratory colonies These attributes of these markers will be very useful formed their separate clusters with little or no for entomologists and soybean breeders working within-colony diversity. Michel et al. (2010) with soybean aphids. reported similar results using FIVE laboratory and 12 In contrast with genomic-SSR markers, the genic- field populations of soybean aphids. They suggested SSRs are generally derived from the transcribed that the loss of genotypic diversity because of the regions of the genome. Thus, some of these genic- fixation of homozygous or heterozygous genotypes SSRs can be associated with the gene expressions of in laboratory populations resulted in higher genetic soybean aphid. In our study, top-level Gene Ontol- differentiation among laboratory than field popula- ogy terms were assigned to the three transcript tions. The observed lack of polymorphism among sequences containing polymorphic SSR region individual aphids within a laboratory colony is also (Table S3). Such genic-SSRs could be used for the expected (Michel et al. 2010; Jun et al. 2011). evaluation of functional diversity among soybean Unfortunately, the studies cannot be directly com- aphid populations, although further research will be pared with the current study because different indi- needed to confirm their true functions. viduals were used. More detailed study will need to In conclusion, here we report 246 high-quality be conducted to determine whether this set of genic- genic-SSR markers experimentally validated from a SSR markers can actually differentiate between field set of 342 primer pairs designed from EST sequences collections and biotype 1 and 2 of the soybean generated by 454 GS FLX sequencing of a cDNA aphids. library of the soybean aphid. Twenty-nine of the Most of the aphid samples from MI and SD were 246 markers were polymorphic among aphids used grouped into clusters of their own. Although both in this study; however, more markers have the MI and SD had equal numbers of genotypes, based potential to be polymorphic among more soybean on the dendrogram, MI contained more divergent aphid populations. The usefulness of these markers genotypes. Three of the 24 aphids in the MI collec- in genetic diversity and population genetics has also

ª 2011 Blackwell Verlag, GmbH 9 Aphis glycines genic-SSRs T.-H. Jun et al. been demonstrated in a limited number of soybean KB, Varshney RK, Cook DR, Singh NK, 2011. aphid collections. These markers should be useful Development of genic-SSR markers by deep transcrip- for genetic studies, including population genetics, tome sequencing in pigeonpea [Cajanus cajan (L.) genetic mapping, migration and biotype differentia- Millspaugh]. BMC Plant Biol. 20, 11–17. tion of the soybean aphid and are significant addi- Eujayl I, Sorrells M, Baum M, Wolters P, Powell W, tions to the limited number of SSR markers that are 2001. Assessment of genotypic variation among culti- currently available for the soybean aphid from the vated durum wheat based on EST-SSRS and genomic published literature. SSRS. Euphytica 119, 39–43. Eujayl I, Sledge MK, Wang L, May GD, Chekhovskiy K, Zwonitzer JC, Mian MAR, 2004. Medicago truncatula Acknowledgements EST-SSRs reveal cross-species genetic markers for Medicago spp. Theor. Appl. Genet. 108, 414–422. We thank Wei Zhang, Lucia Orantes, Jane Todd, Garg R, Patel RK, Tyagi AK, Jain M, 2011. De novo Ronald Fioritto and Tim Mendiola for their technical assembly of chickpea transcriptome using short reads help in this study. We also thank the employees at for gene discovery and marker indentification. DNA the MCIC for their help in sequencing and genotyp- Res. 18, 53–63. ing. We appreciate Dr. Kelley Tilmon and Dr. Chris Gong L, Stift G, Kofler R, Pachner M, Lelley T, 2008. DiFonzo for their help in soybean aphid collection Microsatellites for the genus Cucurbita and an from South Dakota and Michigan, respectively. Two SSR-based genetic linkage map of Cucurbita pepo anonymous reviews provided critical insight to L. Theor. Appl. Genet. 117, 37–48. improve the manuscript. This study was supported Goudet J, 1995. FSTAT (Version 1.2): a computer by UDSA-ARS, the OSU/OARDC and a grant from program to calculate F-statistics. J. Hered. 86, 485–486. the Ohio Soybean Council (10-2-03). Goudet J, 2001. FSTAT: a program to estimate and test gene diversities and þxation indices (version 2.9.3). [WWW document]. URL http://www.unil.ch/izea/soft- References wares/fstat.html. Aggarwal RK, Hendre PS, Varshney RK, Bhat PR, Krish- Guo WZ, Sang ZQ, Zhou BL, Zhang TZ, 2007. Genetic nakumar V, Singh L, 2007. Identification, characteriza- relationships of D-genome species based on two types tion and utilization of EST-derived genic microsatellite of EST-SSR markers derived from G. arboretum and markers for genome analyses of coffee and related spe- G. raimondii in Gossypium. Plant Sci. 172, 808–814. cies. Theor. Appl. Genet. 114, 359–372. Hall TA, 1999. BioEdit: a user-friendly biological Bai X, Zhang W, Orantes L, Jun TH, Mittapalli O, Mian sequence alignment editor and analysis program for MA, Michel AP, 2010. Combining next-generation Windows 95/98/NT. Nucl. Acids Symp. Ser. 41, 95–98. sequencing strategies for rapid molecular resource Hampl V, Pavlicek A, Flegr J, 2001. Construction and development from an invasive aphid species, Aphis bootstrap analysis of DNA fingerprinting-based phylo- glycines. PLoS ONE 5, e11370. genetic trees with the freeware program FreeTree: Chabane K, Ablett GA, Cordeiro GM, Valkoun J, Henry application to trichomonad parasites. Int. J. Syst. Evol. RJ, 2005. EST versus genomic derived microsatellite Microbiol. 51, 731–735. markers for genotyping wild and cultivated barley. Hartman GL, Domier LL, Wax LM, Helm CG, Onstad Genet. Res. Crop Evol. 52, 903–909. DW, Shaw JT, Solter LF, Voegtlin DJ, D’Arcy CJ, Gray Choudhary S, Sethy NK, Shokeen B, Bhatia S, 2009. ME, SteVey KL, Isard SA, Orwick PL, 2001. Occurrence Development of chickpea EST-SSR markers and analy- and distribution of Aphis glycines on soybeans in Illinois sis of allelic variation across related species. Theor. in 2000 and its potential control. [WWW document]. Appl. Genet. 118, 591–608. URL http://www.plantmanagementnetwork.org/pub/ Cloutier S, Niu Z, Datla R, Duguid S, 2009. Development php/brief/aphisglycines/ and analysis of EST-SSRs for flax (Linum usitatissimum Hill JH, Alleman R, Hogg DB, Grau CR, 2001. First report L.). Theor. Appl. Genet. 119, 53–63. of transmission of Soybean mosaic virus and Alfalfa Cullum R, Alder O, Hoodless PA, 2011. The next genera- mosaic virus by Aphis glycines in the New World. Plant tion: using new sequencing technologies to analyse Dis. 85, 561. gene regulation. Respirology 16, 210–222. Hill CB, Li Y, Hartman GL, 2004. Resistance to the Dice LR, 1945. Measures of the amount of ecologic asso- soybean aphid in soybean germplasm. Crop Sci. 44, ciation between species. Ecology 26, 297–302. 98–106. Dutta S, Kumawat G, Singh BP, Gupta DK, Singh S, Dog- Hill CB, Crull L, Herman TK, Voegtlin DJ, Hartman GL, ra V, Gaikwad K, Sharma TR, Raje RS, Bandhopadhya 2010. A new soybean aphid (Hemiptera: Aphididae) TK, Datta S, Singh MN, Bashasab F, Kulwal P, Wanjari biotype identified. J. Econ. Entomol. 103, 509–515.

10 ª 2011 Blackwell Verlag, GmbH T.-H. Jun et al. Aphis glycines genic-SSRs

Iwaki M, Roechan M, Hibino H, Tochihara H, Tantera virus epidemics on soybean in wisconsin. Plant Dis. 91, DM, 1980. A persistent aphid borne virus of soybean, 266–272. Indonesian Soybean dwarf virus transmitted by Aphis Ossowski S, Schneeberger K, Clark RM, Lanz C, Warth- glycines. Plant Dis. 64, 1027–1030. mann N, Weigel D, 2008. Sequencing of natural strains Jakse J, Stajner N, Luthar Z, Jean-Marc Jeltsch J, of Arabidopsis thaliana with short reads. Genome Res. Javornik B, 2011. Development of transcript-associated 18, 2024–2033. microsatellite markers for diversity and linkage map- Ostlie K, 2002. Managing soybean aphid, University of ping studies in hop (Humulus lupulus L.). Mol. Breeding Minnesota Extension Service. [WWW document]. URL 28, 227–239. http://www.soybeans.umn.edu/crop/insects/aphid/ Jun TH, Michel AP, Mian MAR, 2011. Development of aphid_publication_managingsba.html. soybean aphid genomic SSR markers using next gener- Pemberton JM, Slate J, Bancroft DR, Barrett JA, 1995. ation sequencing. Genome 54, 360–367. Nonamplifying alleles at microsatellite loci: a caution Kim KS, Hill CB, Hartman GL, Mian MAR, Diers BW, for parentage and population studies. Mol. Ecol. 4, 2008a. Discovery of soybean aphid biotypes. Crop Sci. 249–252. 48, 923–928. Ragsdale DW, Landis DA, Brodeur J, Heimpel GE, Des- Kim KS, Ratcliffe ST, Wade french B, Liu L, Sappington neux N, 2011. Ecology and management of the soy- TW, 2008b. Utility of EST-derived SSRs as population bean aphid in North America. Annu. Rev. Entomol. genetic markers in a beetle. J. Hered. 99, 112–124. 56, 375–399. Krupke CH, Obermeyer JL, Bledsoe LW, 2005. Soybean Saha MC, Cooper JD, Mian MAR, Chekhovskiy K, May aphid, E-217-W Purdue Extension, Purdue University. GD, 2006. Tall fescue genomic SSR markers: develop- [WWW document]. URL http://extension.entm.pur- ment and transferability across multiple grass species. due.edu/publications/E-217.pdf. Theor. Appl. Genet. 113, 1449–1458. Li YC, Korol AB, Fahima T, Nevo E, 2004. Microsatellites Schuelke M, 2000. An economic method for the Xuores- within genes: structure, function, and evolution. Mol. cent labeling of PCR fragments. Nat. Biotechnol. 18, Biol. Evol. 21, 991–1007. 233–234. Li Y, Hill CB, Carlson SR, Diers BW, Hartman GL, 2007. Schwarz D, Robertson HM, Feder JL, Varala K, Hudson Soybean aphid resistance genes in the soybean culti- ME, Ragland GJ, Hahn DA, Berlocher SH, 2009. Sym- vars Dowling and Jackson map to linkage group M. patric ecological speciation meets pyrosequencing: sam- Mol. Breeding 19, 25–34. pling the transcriptome of the apple maggot Rhagoletis Li R, Fan W, Tian G, Zhu H, He L, Cai J, Huang Q, Cai Q, Li pomonella. BMC Genomics 10, 633. B, Bai Y et al., 2010. The sequence and de novo assem- Thurston MI, Field D, 2005. Msatfinder: detection and bly of the giant panda genome. Nature 463, 311–317. characterisation of microsatellites. 3SR: Distributed by Mensah C, DiFonzo C, Nelson RL, Wang D, 2005. Resis- the authors at http://www.genomics.ceh.ac.uk/msat- tance to soybean aphid in early maturing soybean finder /. CEH Oxford, Mansfield Road, Oxford OX1. germplasm. Crop Sci. 45, 2228–2233. To´ th G, Ga´spa´ri Z, Jurka J, 2000. Microsatellites in differ- Metzgar D, Bytof J, Wills C, 2000. Selection against ent eukaryotic genomes: survey and analysis. Genome frameshift mutations limits microsatellite expansion in Res. 10, 967–981. coding DNA. Genome Res. 10, 72–80. Van Oosterhout C, Hutchinson WF, Wills DPM, Shipley Mian MAR, Kang ST, Beil SE, Hammond RB, 2008. P, 2004. MICRO-CHECKER: software for identifying Genetic linkage mapping of the soybean aphid resis- and correcting genotyping errors in microsatellite data. tance gene in PI 243540. Theor. Appl. Genet. 117, Mol. Ecol. Notes 4, 535–538. 955–962. Varshney RK, Graner A, Sorrells ME, 2005. Genic micro- Michel AP, Zhang W, Jung JK, Kang ST, Mian MAR, satellite markers in plants: features and applications. 2009a. Cross-species amplification and polymorphism Trends Biotechnol. 23, 48–55. of microsatellite loci in the soybean aphid, Aphis gly- Varshney RK, Nayak SN, May GD, Jackson SA, 2009. cines. J. Econ. Entomol. 102, 1389–1392. Next-generation sequencing technologies and their Michel AP, Zhang W, Jung JK, Kang ST, Mian MAR, implications for crop genetics and breeding. Trends 2009b. Population genetic structure of Aphis glycines. Biotechnol. 27, 522–530. Environ. Entomol. 38, 1301–1311. Wang SY, Bao XZ, Sun YJ, Chen RL, Zhai BP, 1996. Michel AP, Zhang W, Mian MAR, 2010. Genetic diversity Study on the effects of the population dynamics of and differentiation among laboratory and field popula- soybean aphid (Aphis glycines) on both growth and tions of the soybean aphid, Aphis glycines. B. Entomol. yield of soybean. Soybean Science 15, 243–247. Res. 100, 727–734. Wang S, Zhang L, Matz M, 2009. Microsatellite charac- Mueller EE, Grau CR, 2007. Seasonal progression, symp- terization and marker development from public EST tom development, and yield effects of alfalfa mosaic and WGS databases in the reef-building coral Acropora

ª 2011 Blackwell Verlag, GmbH 11 Aphis glycines genic-SSRs T.-H. Jun et al.

millepora (Cnidaria, Anthozoa, Scleractinia). J. Hered. Supporting Information 100, 329–337. Wang Z, Fang B, Chen J, Zhang X, Luo Z, Huang L, Chen Additional Supporting Information may be found in X, Li Y, 2010a. De novo assembly and characterization the online version of this article: of root transcriptome using Illumina paired-end Table S1. Distribution and frequency of different sequencing and development of cSSR markers in motif types identified in the 501 transcript sequences sweetpotato (Ipomoea batatas). BMC Genomics 11, 726. from soybean aphid. Wang X-W, Luan J-B, Li J-M, Bao Y-Y, Zhang C-X, Liu Table S2. The list of 342 soybean aphid genic-SSR S-S, 2010b. De novo characterization of a whitefly primer pairs with marker name, forward and reverse transcriptome and analysis of its gene expression primer sequences, expected product size (bp), moti during development. BMC Genomics 11, 400. and aphid samples used for PCR amplification [B1 = Wu Z, Schenk-Hamlin D, Zhan W, Ragsdale DW, Heimpel Biotype1, B2 = Biotype2, OH(09) = Ohio in 2009, GE, 2004. The soybean aphid in China: a historical and SD = South Dakota). review. Ann. Entomol. Soc. Am. 97, 209–218. Yu JK, La Rota M, Kantety RV, Sorrells ME, 2004. EST Table S3. Summary of top-level GO terms derived SSR markers for comparative mapping in assigned to the three A. glycines B1 transcript wheat and rice. Mol. Gen. Genomics. 271, 742–751. sequences containing polymorphic SSR region; P: Zeng S, Xiao G, Guo J, Fei Z, Xu Y, Roe BA, Wang Y, Biological Process, C: Cellular Component, and F: 2010. Development of a EST dataset and characteriza- Molecular Function. tion of EST-SSRs in a traditional Chinese medicinal Please note: Wiley-Blackwell are not responsible plant, Epimedium sagittatum (Sieb. Et Zucc.) Maxim. for the content or functionality of any supporting BMC Genomics 11, 94. materials supplied by the authors. Any queries Zhang G, Gu C, Wang D, 2010. A novel locus for (other than missing material) should be directed to soybean aphid resistance. Theor. Appl. Genet. 120, the corresponding author for the article. 1183–1191.

12 ª 2011 Blackwell Verlag, GmbH