Analyses of Porcine Public SNPs in Coding-Gene Regions by Re-Sequencing and Phenotypic Association Studies
Mol Biol Rep
DOI 10.1007/s11033-010-0496-1

Xiaoping Li • Sang-Wook Kim • Kyoung-Tag Do • You-Kyoung Ha • Yun-Mi Lee • Suk-Hee Yoon • Hee-Bal Kim • Jong-Joo Kim • Bong-Hwan Choi • Kwan-Suk Kim

Received: 22 April 2010 / Accepted: 11 November 2010
© Springer Science+Business Media B.V. 2010

X. Li · S.-W. Kim · K.-T. Do · Y.-K. Ha · K.-S. Kim (✉)
Department of Animal Science, Chungbuk National University, Cheongju, Chungbuk 361-763, South Korea
e-mail: [email protected]

X. Li
Department of Animal Technology, Huazhong Agricultural University, Wuhan 430070, China

Y.-M. Lee · J.-J. Kim (✉)
School of Biotechnology, Yeungnam University, Gyeongsan, South Korea
e-mail: [email protected]

S.-H. Yoon · H.-B. Kim
Department of Food and Animal Biotechnology, Seoul National University, Seoul, South Korea

B.-H. Choi (✉)
National Institute of Animal Science, Suwon, South Korea
e-mail: [email protected]

Electronic supplementary material The online version of this article (doi:10.1007/s11033-010-0496-1) contains supplementary material, which is available to authorized users.

Abstract The porcine SNP database holds a huge number of SNPs, but most of them were found by computational data-mining procedures and have not been well characterized. We re-sequenced 1,439 porcine public SNPs in four commercial pig breeds and one Korean domestic breed (Korean Native pig, KNP), using two DNA pools from eight unrelated animals in each breed. These SNPs came from 419 protein-coding genes covering the 18 autosomes, and the re-sequencing confirmed 690 public SNPs (47.9%) and revealed 226 novel mutations (173 SNPs and 53 insertions/deletions). In total, our study found 916 variations. Of these, 148 SNPs (16.2%) were found across all five breeds, and 199 SNPs (21.7%) were breed-specific polymorphisms. According to their locations in the gene sequences, the 916 variations were categorized into 802 non-coding SNPs (785 in introns, 17 in the 3′-UTR) and 114 coding SNPs (86 synonymous SNPs, 28 non-synonymous SNPs). The nucleotide substitution analyses for these SNPs revealed that 70.2% were from transitions, 20.0% from transversions, and the remaining 5.79% were deletions or insertions. Subsequently, we genotyped 261 SNPs from 180 genes in an experimental KNP × Landrace F2 cross using the Sequenom MassARRAY system. A total of 33 traits, including growth, carcass composition and meat quality, were analyzed in the phenotypic association tests using the 132 SNPs in 108 genes with minor allele frequency (MAF) > 0.2. The association results showed that five marker-trait combinations were significant at the 5% experiment-wise level (ADCK4 for rear leg; MYH3 for rear leg, Hunter B, loin weight and shear force) and four at the 10% experiment-wise level (DHX38 for average daily gain at live weight, LGALS9 for crude lipid, NGEF for front leg and LIFR for pH at 24 h). In addition, 49 SNPs in 44 genes showing significant association with the traits were detected at the 1% comparison-wise level. A large number of genes that function as enzymes, transcription factors or signalling molecules were considered as genetic markers for pig growth (RNF103, TSPAN31, DHX38, ABCF1, ABCC10, SCD5, KIAA0999 and FKBP10), muscling (HSPA5, PTPRM, NUP88, ADCK4, PLOD1, DLX1 and GRM8), fatness (PTGIS, IDH3B, RYR2 and NOL4) and meat quality traits (DUSP4, LIFR, NGEF, EWSR1, ACTN2, PLXND1, DLX3, LGALS9, ENO3, EPRS, TRIM29, EHMT2, RBM42, SESN2 and RAB4B). The SNPs or genes reported here may be beneficial to future marker-assisted selection breeding in pigs.
Keywords Pig public SNPs · Resequencing · Protein-coding genes · Association study

Introduction

Single nucleotide polymorphisms (SNPs) are one of the most abundant genetic variations and are widespread throughout the whole genome. In the human genome, one SNP with a minor allele frequency (MAF) greater than 1% occurs on average every 300 base pairs [1, 2]. SNPs have also been found to be highly variable even in the coding regions of genes [3–7]. Compared with other polymorphisms such as simple sequence repeats, SNPs have the advantage of relatively high stability. Thus, due to their high availability and stability, SNPs are becoming an important marker of choice for applications in a variety of fields such as population genomics, evolutionary analysis and disease research [8–11]. In farm animals, SNPs can be used to identify genome regions or genes influencing important economic traits, such as growth, fatness, muscling, meat quality and reproduction, as well as disease resistance [12, 13]. So far, several causative mutations affecting these traits have been identified in pigs, including the earliest reported mutations within the meat quality genes (HAL, RN) [14, 15], a missense mutation (Asp298Asn) in MC4R affecting feed intake, growth and backfat [16, 17], and the IGF2-intron3-G3072A substitution affecting muscle growth and backfat thickness, detected in a cross between Meishan and European White breeds [18, 19].

The pig NCBI SNP database (dbSNP) includes huge numbers of pig SNPs found by direct sequencing, in silico data mining or experimental studies [6, 7]. These publicly available SNPs are a valuable resource for gene linkage mapping and association studies in pigs. However, insufficient annotation of the SNPs, such as gene origins, genome locations and allele distributions in breeds, has dramatically limited their practical utility. Therefore, characterizing and validating these SNPs in breeds will benefit their further application in pig genetics and breeding. In addition, the increasing availability of high-throughput genotyping technologies makes it possible to conduct large-scale phenotypic association analyses and identify multiple causal variants for complex traits at one time.

Previously, we constructed an in silico coding-gene SNP map in which we consolidated 465 SNP-containing sequences from the NCBI pig SNP database (dbSNP) and assigned them onto the pig QTL map based on BLAST analyses [20]. In total, 449 sequences corresponding to 419 protein-coding genes were submitted to the NCBI STS database, and each received an access ID. These sequences contained 1,439 putative SNPs covering the 18 autosomes. In this study, we annotated these sequences into coding or non-coding regions, re-sequenced them in five different breeds using direct PCR sequencing, and performed large-scale phenotypic association analyses to identify useful DNA markers for pig growth, carcass and meat quality traits. The flow chart of the whole experiment is shown in Fig. 1.

Fig. 1 Steps and processes of SNP selection and validation. ins insertion, del deletion, MAF minor allele frequency. The associated SNPs were detected at a level of P < 0.01. Flow chart steps: 1,439 public SNPs; PCR and sequencing in five breeds (339 homozygous + 690 SNPs + 226 novel + 402 failed to be detected), giving 863 SNPs + 53 ins/del; 261 SNPs genotyped (23 failed + 33 homozygous + 68 MAF<0.2 + 132 MAF>0.2); association analyses; 94 associated SNPs (P < 0.01): 25 meat quality, 17 growth, 23 carcass composition

Materials and methods

Prediction of coding and non-coding SNPs

The putative SNPs in 449 sequence tagged sites (STSs) representing 419 pig protein-coding genes were analyzed according to their genome locations (the STS sequences are available in the NCBI STS database). These STSs are genomic sequences that may include intronic SNPs (iSNPs) and exonic SNPs, and the SNPs in exons can be assigned to coding regions (cSNPs) or non-coding regions (UTR). cSNPs that change the encoded amino acid residue are known as non-synonymous SNPs (nsSNPs), and cSNPs that do not change the amino acid residue are called synonymous SNPs (sSNPs). In this study, we used iSNP, sSNP, nsSNP and UTR to define the SNP locations and their characteristics. Because the NCBI database has a huge number of porcine ESTs, the STS sequences were first manually compared by BLAST with the porcine ESTs (identity score > 90%) to find the intronic regions of the STS sequences, using the "nucleotide blast" program with the "nr" option under the "basic blast" category in NCBI. If porcine ESTs were unavailable, the STS sequences were compared by BLAST directly against the mRNA of the human homologs, and the intronic regions of the STSs were identified on the basis of the highly conserved intron–exon boundaries between humans and pigs [21]. Then blastx was performed using the STS sequences with the detected porcine ESTs against the protein sequences of the human homologs to distinguish the non-coding and coding SNPs. The sSNPs and nsSNPs were evaluated by calculating the frame of the consensus sequence translation. The newly detected mutations were analyzed in the same way.

PCR amplification, sequencing and SNP discovery

In order to check the polymorphism status of the publicly available SNPs in different pig breeds, we designed 449 sets of polymerase chain reaction (PCR) primers using Oligo 6 (see Table 1 and additional file 2) based on the

Association analyses

Resource population and phenotype collections

A three-generation resource population developed from a cross between five Korean native boars and ten Landrace sows was used for the association study. A total of 404 F2 animals with phenotypic records were genotyped for 261 SNPs. The phenotypic traits we analyzed included four growth traits (birth weight, 21-day weight, average daily gain at weaning, average daily gain on test), 13 carcass composition traits (loin eye area, kalbi area, sirloin weight, galmegi weight, backfat thickness, live weight, hot carcass weight, bone weight, loin weight, front leg weight, rear leg weight, leather weight, samgyup weight) and 16 meat quality traits (crude ash, crude protein, crude lipid, moisture, lipid, drip loss, water holding capacity, cooking loss, shear force, loin pH at 24 h, CIE-L, CIE-A, CIE-B, Hunter L, Hunter A, Hunter B); the means and standard deviations for the 33 traits are listed in additional file 5.
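According to Fig. 1, only genotyped SNPs with MAF > 0.2 were carried into the association analyses. The filter can be illustrated with a minimal sketch. It assumes each genotype call is stored as a two-letter string such as "AG" and that failed calls are stored as None or an empty string. The function names, the data layout and the SNP names are hypothetical and do not represent the authors' actual pipeline.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the minor allele frequency of one biallelic SNP.

    genotypes: iterable of two-letter calls (e.g. "AG"); None or "" marks a
    failed call and is skipped. Returns None when no call succeeded.
    """
    allele_counts = Counter()
    for call in genotypes:
        if not call:
            continue  # skip failed genotype calls
        allele_counts.update(call)  # count both alleles of the diploid call

    total = sum(allele_counts.values())
    if total == 0:
        return None
    if len(allele_counts) > 2:
        raise ValueError("more than two alleles observed; SNP is not biallelic")
    if len(allele_counts) == 1:
        return 0.0  # monomorphic (homozygous in every animal)
    return min(allele_counts.values()) / total


def passes_maf_filter(genotypes, threshold=0.2):
    """True when the SNP has a defined MAF strictly above the threshold."""
    maf = minor_allele_frequency(genotypes)
    return maf is not None and maf > threshold


# Hypothetical toy data: SNP name -> genotype calls across animals.
genotypes_by_snp = {
    "snp_a": ["AA", "AG", "GG", "AG"],
    "snp_b": ["CC", "CC", "CT", "CC", None],
}
kept = [name for name, calls in genotypes_by_snp.items() if passes_maf_filter(calls)]
print(kept)  # ['snp_a']
```

In the toy data, snp_a has equal allele counts (MAF 0.5) and is kept, while snp_b carries one T allele among eight called alleles (MAF 0.125) and is dropped. The strict inequality mirrors the "MAF > 0.2" wording of the text.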