Characterization and Development of EST-SSR Markers Derived from Transcriptome of Yellow Catfish
Total Page:16
File Type:pdf, Size:1020Kb
Molecules 2014, 19, 16402-16415; doi:10.3390/molecules191016402 OPEN ACCESS molecules ISSN 1420-3049 www.mdpi.com/journal/molecules Article Characterization and Development of EST-SSR Markers Derived from Transcriptome of Yellow Catfish Jin Zhang 1, Wenge Ma 1, Xiaomin Song 1, Qiaohong Lin 1, Jian-Fang Gui 1,2,* and Jie Mei 1,* 1 Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, Freshwater Aquaculture Collaborative Innovation Center of Hubei Province, College of Fisheries, Huazhong Agricultural University, Wuhan 430070, China 2 State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, University of the Chinese Academy of Sciences, Wuhan 430072, China * Author to whom correspondence should be addressed; E-Mails: [email protected] (J.-F.G.); [email protected] (J.M.); Tel./Fax: +86-27-6878-0707 (J.-F.G.); +86-27-8728-2113 (J.M.). External Editor: Derek J. McPhee Received: 6 August 2014; in revised form: 28 September 2014 / Accepted: 29 September 2014 / Published: 13 October 2014 Abstract: Yellow catfish (Pelteobagrus fulvidraco) is one of the most important freshwater fish due to its delicious flesh and high nutritional value. However, lack of sufficient simple sequence repeat (SSR) markers has hampered the progress of genetic selection breeding and molecular research for yellow catfish. To this end, we aimed to develop and characterize polymorphic expressed sequence tag (EST)–SSRs from the 454 pyrosequencing transcriptome of yellow catfish. Totally, 82,794 potential EST-SSR markers were identified and distributed in the coding and non-coding regions. Di-nucleotide (53,933) is the most abundant motif type, and AC/GT, AAT/ATT, AAAT/ATTT are respective the most frequent di-, tri-, tetra-nucleotide repeats. We designed primer pairs for all of the identified EST-SSRs and randomly selected 300 of these pairs for further validation. Finally, 263 primer pairs were successfully amplified and 57 primer pairs were found to be consistently polymorphic when four populations of 48 individuals were tested. The number of alleles for the 57 loci ranged from 2 to 17, with an average of 8.23. The observed heterozygosity (HO), expected heterozygosity (HE), polymorphism information content (PIC) and fixation index (FIS) values ranged from 0.04 to 1.00, 0.12 to 0.92, 0.12 to 0.91 and −0.83 to 0.93, Molecules 2014, 19 16403 respectively. These EST-SSR markers generated in this study could greatly facilitate future studies of genetic diversity and molecular breeding in yellow catfish. Keywords: EST-SSRs; yellow catfish; 454 pyrosequencing; genetic diversity 1. Introduction Molecular marker systems, such as simple sequence repeats (SSRs) or microsatellites [1], single nucleotide polymorphism (SNPs) [2], amplified fragment length polymorphisms (AFLPs) [3] and random amplification of polymorphic DNAs (RAPDs) [4] have been developed and are applied to fisheries and aquaculture. Yellow catfish is an important freshwater fish for its delicious flesh and high market value, whereas overfishing is decreasing its number and genetic diversity [5]. Applying genomic tools in the selection of elite broodstock has the potential to improve the productivity and commercial value of this species. In populations of yellow catfish, males grow faster than females by two to three folds. For this reason, an all-male monosex population has been massively produced for commercial purpose [3,6,7]. However, genetic resources and suitable molecular markers are still scarce in yellow catfish. SSRs are tandem repeating sequences of 1–6 nucleotides and distributed throughout vertebrate genomes [8]. Based on their locations, SSRs can be classified into genomic SSRs (gSSRs) and Expressed Sequence Tag-SSRs (EST-SSRs) [9]. Because of high level of polymorphism, SSRs have wide applications in population genetics, such as parentage analysis [10], Quantitative Trait Locus (QTL) mapping [11], marker assisted selection (MAS) [12], and phylogenetic studies [13]. Traditional methods of developing gSSR markers require fragmented genomic DNA and are usually time-consuming and labor-intensive. With the advent of high-throughput sequencing technology, the development of EST-SSRs has become a fast, efficient, and low-cost option for economical fish species [14,15]. The transcriptome of yellow catfish was acquired using a 454 GS-FLX Titanium platform and 540 Mbp of raw data were generated. In this study, we analyze the frequency and distribution of 82,794 potential EST-SSRs in the yellow catfish transcriptome. Sixty of 300 validated primer pairs were selected and further characterized for polymorphism analysis. Recently, we have performed genetic selection breeding on four wild populations of yellow catfish collected from Chang Lake (Jingzhou), Hong Lake (Honghu), South Lake (Zhongxiang) and Dongting Lake (Hunan) as previously reported [16]. These EST-SSR markers should provide a promising genetic resource for molecular breeding of yellow catfish. 2. Results and Discussion 2.1. Characterization of EST-SSRs in the Yellow Catfish Transcriptome Putative open reading frames (ORFs) of all the assembled contigs and singletons were predicted by EMBOSS software. After analyzing the transcriptome by MISA software, we identified 82,794 SSRs, among which 23,085 SSRs (27.9%) are located in the coding region, 18,954 SSRs (22.9%) in the 5'-UTR, and 18,537 SSRs (22.4%) in the 3'-UTR (Figure 1A). Then, we analyzed the distribution of Molecules 2014, 19 16404 SSRs that have 2–6 bp repeat motif and are widely used. Of the 14,090 SSR identified in the coding region, dinucleotide accounts for 72.2% (10,180), tri-nucleotide is 17.6% (2478), tetra-nucleotide is 9.3% (1309), followed by penta-nucleotide 0.7% (98) and hexa-nucleotide 0.2% (25). Of the 10,584 SSR identified in the 5'-UTR, the most abundant is also dinucleotide accounting for 74.3% (7868), followed by tri-, tetra-, penta- and hexa-nucleotide with 14.5% (1532), 10% (1061), 1.1% (118) and 0.04% (5), respectively. Of the 11,654 SSR in the 3'-UTR, the percentage (and number) of di-, tri-, tetra-, penta- and hexa-nucleotide is 77.4% (9015), 13.4% (1559), 8.2% (961), 0.9% (107) and 0.1% (12), respectively (Figure 1B). Different locations of SSR markers in ESTs may suggest their possible for gene expression and functions [17]. The SSR insertions inside the promoter region of genes could modulate their expression levels [18]. Figure 1. Distribution of EST-SSRs across the 5' UTR, CDS and 3' UTR in yellow catfish. Number of SSRs located on non-coding and coding region (A) and the distributions of SSRs with different motif sizes (B). Among the 82,794 SSRs, di-nucleotide is the most abundant type of repeat motif that is accounting for 65.14% (53,933) of the total SSRs, while hexa-nucleotide is the least type (84, 0.10%). Furthermore, the percentages of mono-, tri-, tetra-, and penta-nucleotide are 17.11% (14,168), 9.79% (8104), 7.28% (6027) and 0.58% (478) in respective. Most of SSRs had 6–36 repeat units, and six repeat units (15,004, 18.12%) and ten repeat units (9784, 11.82%) were the most represented types (Table 1). In the di-nucleotide repeat SSRs, AC/GT (39,554, 73.3%) and AG/CT (11,460, 21.2%) are the dominant types (Figure 2A). Similar to other fishes [19], (GC)n repeats are extremely rare in yellow catfish. Two most frequent repeats in the tri- nucleotide are AAT/ATT (3645, 45.0%) and ATC/GAT (1353, 16.7%) (Figure 2B). Among the tetra- nucleotide, the top two types of repeat motifs are AAAT/ATTT (1412, 23.4%) and ACAG/CTGT (943, 15.6%) (Figure 2C). Molecules 2014, 19 16405 Table 1. Frequency of different repeat motifs among the EST-SSRs of yellow catfish. Repeats Mo Di Tri Tetra Penta Hexa Total Percentage (%) 5 - 0 2654 1843 253 43 4793 5.79 6 - 12,561 1347 994 80 22 15,004 18.12 7 - 7110 893 632 44 8 8687 10.49 8 - 4411 537 421 16 5 5390 6.51 9 - 3248 384 316 18 3 3969 4.79 10 6769 2429 276 289 19 2 9784 11.82 11 3055 1972 263 225 15 0 5530 6.68 12 1805 1628 244 194 4 1 3876 4.68 13 995 1418 207 144 14 0 2778 3.36 14 602 1260 206 129 6 0 2203 2.66 15 392 1112 173 132 2 0 1811 2.19 16 174 1008 186 96 2 0 1466 1.77 17 136 896 141 110 1 0 1284 1.55 18 80 846 113 64 0 0 1103 1.33 19 53 806 128 60 3 0 1050 1.27 20 26 799 90 46 1 0 962 1.16 21 18 731 81 58 0 0 888 1.07 22 13 688 54 44 0 0 799 0.97 23 12 713 44 48 0 0 817 0.99 24 5 709 30 26 0 0 770 0.93 25 3 655 23 30 0 0 711 0.86 26 4 634 12 23 0 0 673 0.81 27 1 648 9 20 0 0 678 0.82 28 3 573 3 12 0 0 591 0.71 29 0 594 1 12 0 0 607 0.73 30 3 563 1 12 0 0 579 0.70 31 5 521 0 6 0 0 532 0.64 32 2 479 2 7 0 0 490 0.59 33 0 462 2 2 0 0 466 0.56 34 0 432 0 3 0 0 435 0.53 35 1 421 0 5 0 0 427 0.52 36 0 394 0 5 0 0 399 0.48 >36 11 3212 0 19 0 0 3242 3.92 Total 14,168 53,933 8104 6027 478 84 82,794 100.00 Percentage (%) 17.11 65.14 9.79 7.28 0.58 0.10 100.00 2.2.