Prediction of Genes Related to Positive Selection Using Whole-Genome Resequencing in Three Commercial Pig Breeds

eISSN 2234-0742
Genomics Inform 2015;13(4):137-145
G&I Genomics & Informatics
http://dx.doi.org/10.5808/GI.2015.13.4.137

ORIGINAL ARTICLE

HyoYoung Kim1, Kelsey Caetano-Anolles2, Minseok Seo3, Young-jun Kwon3, Seoae Cho4, Kangseok Seo5*, Heebal Kim1,4,6**

1Department of Agricultural Biotechnology, Seoul National University, Seoul 08826, Korea
2Department of Animal Sciences, University of Illinois, Urbana, IL 61801, USA
3Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
4C&K Genomics Inc., Seoul National University Research Park, Seoul 08826, Korea
5Department of Animal Science and Technology, College of Life Science and Natural Resources, Sunchon National University, Suncheon 57922, Korea
6Department of Agricultural Biotechnology, Animal Biotechnology Major, and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul 08826, Korea

Selective sweep can cause genetic differentiation across populations, which allows for the identification of possible causative regions/genes underlying important traits. The pig has experienced a long history of allele frequency changes through artificial selection in the domestication process. We obtained an average of 329,482,871 sequence reads for 24 pigs from three pig breeds: Yorkshire (n = 5), Landrace (n = 13), and Duroc (n = 6). An average read depth of 11.7× was obtained using whole-genome resequencing on an Illumina HiSeq2000 platform. In this study, cross-population extended haplotype homozygosity and cross-population composite likelihood ratio tests were implemented to detect genes experiencing positive selection for the genome-wide resequencing data generated from three commercial pig breeds. In our results, 26, 7, and 14 genes from Yorkshire, Landrace, and Duroc, respectively, were detected by two kinds of statistical tests.
Significant evidence for positive selection was identified in the genes ST6GALNAC2 and EPHX1 in Yorkshire, PARK2 in Landrace, and BMP6, SLA-DQA1, and PRKG1 in Duroc. These genes are reportedly relevant to lactation, reproduction, meat quality, and growth traits. To understand how the single nucleotide polymorphisms (SNPs) related to positive selection affect protein function, we analyzed the effect of non-synonymous SNPs. Three SNPs (rs324509622, rs80931851, and rs80937718) in the SLA-DQA1 gene were significant in the enrichment tests, indicating strong evidence for positive selection in Duroc. Our analyses identified genes under positive selection for lactation, reproduction, and meat-quality and growth traits in Yorkshire, Landrace, and Duroc, respectively.

Keywords: non-synonymous, swine, positive selection, re-sequencing, single nucleotide polymorphism

Received July 29, 2015; Revised November 21, 2015; Accepted November 21, 2015
*Corresponding author: Tel: +82-61-750-3232, Fax: +82-61-750-3230, E-mail: [email protected]
**Corresponding author: Tel: +82-2-880-4803, Fax: +82-2-883-8812, E-mail: [email protected]
Copyright © 2015 by the Korea Genome Organization. It is identical to the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/).

Introduction

Artificial selection is an aspect of the process of domestication development [1] that is reflected in the evolution of domestic animals. Positive selection increases the fitness of adaptive traits [2]. Previous research has demonstrated that the selection and mating of domestic animals is determined by human-generated pressures, and the genetic differences resulting from artificial selection have led to economic development [3]. Among domestic animals, the pig has experienced a particularly long history of haplotype changes through artificial selection in the process of domestication [2, 4]. For most of their history, studies have found positive selection in Yorkshire, Landrace, and Duroc to be associated with specific genes related to lactation [5, 6], reproduction [7], meat quality [8], and growth [3] traits. Given the importance of these traits to the pig farming industry, it is necessary to detect the signals of positive selection in pigs.

Identifying genomic selection using high-density single nucleotide polymorphism (SNP) arrays or sequencing data is useful for the discovery of putative trait-related genes. A variant under selection pressure between and/or within populations will show an increase in the frequency of particular alleles or in the linkage disequilibrium (LD) of SNPs. Selective sweep can cause genetic differentiation across populations [9]. The cross-population extended haplotype homozygosity (XPEHH) [10] and cross-population composite likelihood ratio (XPCLR) [9] tests are widely used to search for signals of selection between populations. The XPEHH test detects the occurrence of selection by measuring LD, while the XPCLR test considers the spatial patterns of allele frequencies of SNPs [9]. The XPCLR test is a likelihood method that uses differentiation of multi-locus allele frequencies between two populations to detect selective sweep. Selection with significant signals can break up the haplotype structures more rapidly than mutation or recombination processes [11].

In this study, the XPEHH and XPCLR between-population methods were implemented to detect positive selection in three pig breeds through resequencing using the Illumina HiSeq2000 platform (Illumina, Inc., San Diego, CA, USA). Based on the findings of positive selection, enrichment analysis was performed to examine the biological functions of genes in detected regions.

Methods

Sampling and whole-genome sequencing

We used genomic DNA samples gathered from 12 males and 12 females of three pig breeds: Yorkshire, 2 males and 3 females; Landrace, 7 males and 6 females; and Duroc, 3 males and 3 females. Blood samples were collected for DNA extraction. Sample collections and DNA quality check procedures were performed according to the manufacturer's instructions. Next, we constructed genomic DNA libraries for each sample using TruSeq DNA Library kits (Illumina). The paired-end library was sequenced on an Illumina HiSeq 2000 sequencing platform.

The paired-end sequence reads were aligned to the reference pig genome sequence from University of California, Santa Cruz (UCSC; http://genome.ucsc.edu/; susScr3) using Bowtie2 with the default setting. We used the following open-source software: Bowtie2, Picard tools 1.94 (http://picard.sourceforge.net), Samtools 0.1.19 [12], Genome Analysis Toolkit (GATK) 2.6.4 [13], and VCFtools 4.0 [14] for resequencing data processing and SNP calling. Substitution calling was performed using GATK UnifiedGenotyper. We phased the haplotypes for the entire pig populations using BEAGLE [15]. Picard tools was used for duplicate read removal and confirmation of all mate-pair information. Samtools was used for indexing the results from bam files and calculating the mapped reads using the flagstat option. GATK was used for realignment and SNP calling from resequencing data, and VCFtools was used when VCF files were handled. After filtering, non-biallelic SNPs were excluded.

Detection of selective sweep

To detect selective sweep, we implemented two between-population methods (XPEHH and XPCLR) for each pairwise breed contrast. These statistics used the phased data based on the SNP genotypes obtained from the whole-genome resequencing. For each contrast, XPEHH and XPCLR were implemented across all SNPs between the three pig breeds to identify potential positive selection. We used a p-value of <0.01 for XPEHH and the top 1,000 genes in XPCLR as the cutoff criteria. These tests revealed that some SNPs might have been under positive selection in the Yorkshire, Landrace, and Duroc breeds, respectively.

Bioinformatic analysis of genes under positive selection

To further explain the biological implications of positively selected genes, functional enrichment analysis was performed. In this study, regions with SNPs under positive selection were extended approximately 10 kb upstream and downstream [16]. We assembled genes located within the extended region using the RefGene from the UCSC Genome Browser (http://genome.ucsc.edu/; ver. hg19). Gene enrichment analysis was performed for GO terms [17], including biological process, molecular function, and cellular component, and for Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways [18], using the DAVID (http://david.abcc.ncifcrf.gov/) tool [19].

Analysis of SNP effects

To identify non-synonymous coding SNPs (nsSNPs), we analyzed SNP effects using the SNPeffect database, which detects phenotypic effects of variation (http://snpeffect.vib.be/) [20]. For more exact results, a Fisher's exact test was performed to identify breed-specific amino acids in genes with nsSNPs. The statistical test was used to analyze 2 × 2 contingency tables composed of two factors. Considering our purpose, one factor was breed information, and the other was amino acid information. From the 2 × 2 contingency tables, we expected to observe enrichment of specific amino acids in specific breeds at each position. All data parsing and calculations were performed using Python (ver. 2.5) and R (ver. 3.1.2).

Results

Selective sweep detected by between-population methods

To detect signals of positive selection in the three pig breeds, two between-population
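
The Fisher's exact test described under "Analysis of SNP effects" in the Methods can be illustrated with a minimal sketch. It assumes a 2 × 2 table whose rows are breed (one breed versus the other two) and whose columns are the amino acid observed at a single nsSNP position (variant residue versus reference residue). The function name fisher_exact_2x2 and the counts below are hypothetical and are not taken from the study; the text states only that Python and R were used, so this standard-library version is an illustration rather than the authors' actual code.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over every table that is no more probable than the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )


# Hypothetical haplotype counts at one nsSNP position (illustration only):
#                 variant residue   reference residue
# focal breed            10                 2
# other breeds            4                32
p_value = fisher_exact_2x2(10, 2, 4, 32)
print(f"two-sided p-value: {p_value:.3g}")
```

A small p-value for such a table would suggest that the residue is enriched in the focal breed at that position, which is the kind of breed-specific amino acid signal the Methods describe.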
