doi:10.1111/j.1365-2052.2011.02213.x
A whole-genome association study for pig reproductive traits

S. K. Onteru*, B. Fan*,†, Z-Q. Du*, D. J. Garrick*, K. J. Stalder* and M. F. Rothschild* *Department of Animal Science and Center for Integrated Animal Genomics, Iowa State University, Ames, IA 50011, USA. †Key Laboratory of Agricultural Genetics, Breeding and Reproduction, Ministry of Education & College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, China.

Summary A whole-genome association study was performed for reproductive traits in commercial sows using the PorcineSNP60 BeadChip and Bayesian statistical methods. The traits included total number born (TNB), number born alive (NBA), number of stillborn (SB), number of mummified foetuses at birth (MUM) and gestation length (GL) in each of the first three parities. We report the associations of informative QTL and the genes within the QTL for each reproductive trait in different parities. These results provide evidence of gene effects having temporal impacts on reproductive traits in different parities. Many QTL identified in this study are new for pig reproductive traits. Around 48% of total genes located in the identified QTL regions were predicted to be involved in placental functions. The genomic regions containing genes important for foetal development (e.g. MEF2C) and uterine functions (e.g. PLSCR4) were associated with TNB and NBA in the first two parities. Similarly, QTL in other foetal developmental (e.g. HNRNPD and AHR) and placental (e.g. RELL1 and CD96) genes were associated with SB and MUM in different parities. The QTL with genes related to utero-placental blood flow (e.g. VEGFA) and hematopoiesis (e.g. MAFB) were associated with GL differences among sows in this population. Pathway analyses using genes within QTL identified some modest underlying biological pathways, which are interesting candidates (e.g. the nucleotide metabolism pathway for SB) for pig reproductive traits in different parities. Further validation studies on large populations are warranted to improve our understanding of the complex genetic architecture for pig reproductive traits.

Keywords biological pathways, parity, reproductive traits, whole-genome association.

Address for correspondence
M. F. Rothschild, Department of Animal Science, 2255 Kildee Hall, Ames, IA 50011, USA.
E-mail: [email protected]
Accepted for publication 12 February 2011

© 2011 The Authors, Animal Genetics © 2011 Stichting International Foundation for Animal Genetics, 43, 18–26

Introduction

The pig, being a highly prolific mammal, could be one of the best species to study the genetic complexity of lowly heritable reproductive traits. Around 30% of culling in pig production systems has primarily been because of reproductive problems (Stalder et al. 2004). Reproductive performance in commercial pig production systems is usually quantified by numerous economically important production traits. These traits include total number born (TNB), number born alive (NBA), number of stillborn (SB), number of mummified foetuses at birth (MUM), and the gestation length (GL) for each parity. Many linkage (Cassady et al. 2001; King et al. 2003; Tribout et al. 2008) and candidate gene studies (Rothschild et al. 1996; Vallet et al. 2005; Spotter et al. 2009) have been conducted to find the QTL and associated genes for these traits. A recent candidate gene study established that the genes associated with pig reproductive traits are primarily involved in energy metabolism (Rempel et al. 2010). However, genomic improvement in pig reproductive traits requires detailed whole-genome association studies (WGAS) to explore the chromosomal regions and genetic markers that explain the variation in these traits.

The pig genome project (http://www.sanger.ac.uk/Projects/S_scrofa/) and the development of the Illumina PorcineSNP60 BeadChip (Ramos et al. 2009) via the efforts of the International Swine Genome Sequencing Consortium have provided an opportunity to carry out WGAS in the pig. Advanced statistical methods (Meuwissen et al. 2001; Kizilkaya et al. 2010) and tools (GENSEL software at http://bigs.ansci.iastate.edu) based on Bayesian approaches are available to analyse the large quantities of SNP chip data for genomic selection and WGAS in domestic animal populations (Fernando & Garrick 2008). During recent years, several WGAS have been performed in humans (http://www.genome.gov/admin/gwascatalog.txt), cattle (Feugang et al. 2009) and sheep (Becker et al. 2010). However, WGAS studies using SNP chips are just now being reported for the pig (Duijvesteijn et al. 2010). Therefore, a WGAS study was carried out using the PorcineSNP60 BeadChip, which is the most powerful genomic platform for studying pig reproductive traits, including TNB, NBA, SB, MUM and GL for the first three parities.

Materials and methods

Animals and phenotypes

A total of 683 female pigs born over a period of 6 months were included in the study. The sows were from a commercial operation which utilized breeding stock from Newsham Choice Genetics (West Des Moines, IA, USA). These animals belonged to a Large White grandparent maternal line and a Large White × Landrace parent maternal line and were used for an earlier candidate gene study (Fan et al. 2009). To understand the genetic differences between these two lines, population stratification was examined using an identical-by-state distance clustering method in the PLINK program (Purcell et al. 2007). This analysis clustered both of the lines into one cluster, suggesting that there are limited genetic differences between these lines. However, based on the pedigree information and the desire to account for the limited differences, line was considered in the models for analyses. All 683 sows produced a first parity litter, and subsets of 558 and 442 sows produced litters in parities 2 and 3, respectively. The reproductive traits, including TNB, NBA, SB, MUM and GL, were recorded in these three parities.

DNA isolation, SNP array genotyping and quality control

The methods for DNA isolation and quantification have been outlined in an earlier publication (Fan et al. 2009). DNA samples of 700–1000 ng with a ratio of A260/280 higher than 1.50 and a concentration >20 ng/µl were used for PorcineSNP60 BeadChip genotyping. Genotyping was performed commercially at GeneSeek, Inc. (Lincoln, NE, USA). The SNPs with call rate ≤80%, GenTrain score ≤40%, minor allele frequency ≤0.001 and P-value <0.0001 for a χ² test for Hardy–Weinberg equilibrium were excluded from the data set. After these quality control measures, a total of 57 814 SNPs out of a total of 64 232 SNPs qualified for association analyses.

Genome-wide association analyses

The analyses were implemented with a Bayes C model averaging approach using the GENSEL software (http://bigs.ansci.iastate.edu) for each trait in individual parities. The Bayes C method is derived from the Bayes B approach (Meuwissen et al. 2001). The Bayes B method assumes a different variance for every SNP and is heavily influenced by the prior, whereas Bayes C uses a common variance that is reliably estimated from the SNP data. The Bayes B method is more sensitive to the given priors than is Bayes C. The Bayes C approach has been explained previously by Kizilkaya et al. (2010). Briefly, the basic model of Bayes C is as follows:

$$y = \mu + \sum_{j=1}^{K} x_j \beta_j \delta_j + e$$

where $y$ is the phenotype vector, $\mu$ is the overall mean, $K$ is the total number of SNPs, $x_j$ is the column vector of a covariate SNP at locus $j$, $\beta_j$ is the substitution effect of a SNP at locus $j$, and $\delta_j$ is a random 0/1 variable that represents the absence (with the selected prior probability $\pi$) or presence (with the probability $1 - \pi$) of the locus $j$ in the model. $\beta_j$ is conditional on $\sigma_\beta^2$ and is considered to be normally distributed $N(0, \sigma_\beta^2)$. The $e$ is the vector of random residuals assumed to be normally distributed. In this study, the following modified statistical model from Kizilkaya et al. (2010) was used:

$$y = Xb + Zu + e$$

where $y$ is the vector of phenotypes, $X$ is an incidence matrix of fixed effects ($b$), $Z$ is a matrix of SNP genotypes that were fitted as random effects ($u$) distributed $N(0, \sigma_u^2)$, and $e$ is the vector of random residual effects assumed to be normally distributed $N(0, \sigma_e^2)$. The fixed factors used in this statistical model were gilt line, cohort group based on animal entry date on farm, and season for each trait in each parity; $\mu$ was set as an intercept. Most of the phenotypic data were normally distributed for TNB, NBA and GL in this population. However, the phenotypic distributions for SB and MUM were not normal, and hence they were analysed by ordered categorical threshold analysis using GENSEL software. Individual SNP effects were estimated from a mixture model with a probability of 0.995 that any SNP would have a zero effect such that approximately 250–300 non-zero SNP effects were fitted per iteration of a Markov chain. This probability (0.995) was selected on the assumption that 250–300 SNP markers (0.005 of 57 814 SNP markers) may explain the variation in the pig reproductive traits. This high value of probability (0.995) has been shown to give faster convergence in the model averaging procedures, yet still results in every SNP being included in some small proportion of the models. A total of 50 000 iterations in a Markov chain with burn-in of 1000 iterations were run for the analyses. The results from this analysis included posterior distributions for the effects of each of the 57 814 markers, adjusted for the portfolio of all the other fitted marker effects in the model, which were updated in each iteration of the chain.
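The link between the prior probability of a zero effect (0.995) and the 250–300 SNPs fitted per iteration is simple arithmetic, sketched below in Python. The function and constant names are hypothetical and chosen only for illustration; this is not part of the GENSEL software, just a check of the expectation implied by the mixture prior.

```python
# Values quoted in the text
N_SNPS = 57_814    # SNPs that passed quality control
PI_ZERO = 0.995    # prior probability that any SNP has a zero effect


def expected_nonzero_snps(n_snps: int, pi_zero: float) -> float:
    """Expected number of SNPs with a non-zero effect in one iteration.

    Each SNP is treated as an independent 0/1 draw that is 'present'
    with probability (1 - pi_zero), so the expectation is n * (1 - pi).
    """
    return n_snps * (1.0 - pi_zero)


expected = expected_nonzero_snps(N_SNPS, PI_ZERO)
print(f"Expected non-zero SNP effects per iteration: {expected:.0f}")
```

Under these assumptions the expectation is about 289 SNPs, which falls inside the 250–300 range stated in the text.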


The effect of a QTL may be distributed across numerous SNPs in linkage disequilibrium with the QTL, resulting in individual SNP effects that tend to underestimate the real QTL effect. Accordingly, the posterior means of the SNP effects were collectively used to predict the genomic merit of sliding regions including five consecutive SNPs based on the physical map order. The variation in genomic merit for this chromosome fragment as a proportion of total genetic variance across the animals in the population was used to identify the informative regions. There are 11 563 unique SNP windows with five consecutive SNPs in the whole pig genome. Therefore, the expected proportion of variance accounted for by one window is 8.6E-05 (1/11 563). The estimated proportion of genetic variance contributed by sliding windows of consecutive SNPs was plotted against genomic marker locations using standard packages available in the R 2.11.1 software (http://www.r-project.org). The genomic locations with large contributions were considered to be a QTL. In this way, a portfolio of 12–56 putative QTL were identified for bootstrap analyses for each trait.

Bootstrap analysis for hypothesis testing

To construct the distribution of the test statistic for each putative QTL, bootstrap samples were produced using the posterior means of the 57 814 SNPs. This involved creating 1000 bootstrap data sets for each trait. A bootstrap sample $y_j$ for replicate $j$ was created using the posterior means of the fixed $\hat{b}$ and SNP $\hat{u}_i$ effects, except that all those SNPs contained in the window that formed the QTL were excluded, and a vector of simulated residuals was added, formed by sampling a vector of independent standard normal deviates, $\varepsilon_j$, with one deviate for each animal, scaled by the posterior mean of the residual standard deviation $\hat{\sigma}_e$, according to:

$$y_j = X\hat{b} + \sum_{\substack{i=1 \\ i \notin \mathrm{QTL}}}^{57\,814} z_i \hat{u}_i + \hat{\sigma}_e \varepsilon_j$$

These bootstrap samples are constructed according to the null hypothesis of no QTL in the identified SNP window. Each bootstrap sample was reanalysed using the Bayes C model used for the real data, and the genetic variance values of the SNP window corresponding to the QTL were accumulated, for comparison with the test statistic represented by the proportion of genetic variance of the SNP window identified in the analysis of the real data. If just one bootstrap statistic from the 1000 simulated exceeded the test statistic from the real data, the comparison-wise P-value was determined to be 0.001 < P < 0.002. Only QTL with P < 0.01 were considered for gene searching and functional annotation for all reproductive traits except MUM, for which P < 0.05 was considered, because many of the MUM putative QTL reached significance near to 0.01 but not exactly at P < 0.01. Multiple testing was taken into account using the proportion of false positives (PFP) as in the work of Fernando et al. (2004). This approach controls the probability of false-positive conclusions across all the tests undertaken, rather than the probability of making one mistake over all tests, as would be the interpretation of an experiment-wise error correction. Given the assumptions that the experiment had 50% power and a 99% probability of the null hypothesis of no QTL in the SNP window, then the PFP from Fernando et al. (2004) is 0.66 for P < 0.01 and 0.16 for P < 0.001. This method of bootstrap analysis has also been used in other studies (Fan et al. 2011; Onteru et al. 2011).

Gene search, functional annotation and pathway analyses

Candidate QTL regions were identified for each parity for all reproductive traits. A QTL is a five SNP window or combination of consecutive five SNP windows with a bootstrap P-value <0.01 (for MUM P < 0.05) and a higher proportion of genetic variance than the expected proportion of variance accounted for by one window (8.6E-05). Gene searches were carried out within these QTL using the Sus scrofa 9 genome build; references in this manuscript are to HGNC gene symbols. For associated gene-poor QTL regions, a NCBI-BLAST search was performed against the human genome for identification of their homologous human genomic regions. The human genes within the homologous sequences (E-value < 1e-09) were considered to be present in the associated gene-poor pig QTL regions. If no genes were identified in these gene-poor regions, then the genes upstream and downstream of the region were considered to possibly represent the locus. These genes were not considered for further functional annotation and pathway analysis. The genes included non-randomly within the associated QTL were used for functional annotation and pathway analysis. Functional annotation clustering was performed based on all the available annotation categories present in the DAVID software (http://david.abcc.ncifcrf.gov). As the genes were non-randomly selected for functional clustering, the gene enrichment clusters related to reproductive functions and reproductive tissues were taken into consideration irrespective of the DAVID Fisher exact test P-value. To identify the biological pathways or partial gene networks by which the genes located in the different QTL were interacting among themselves, pathway analyses were performed through Pathway Studio (Nikitin et al. 2003) using Ariadne metabolic pathways and signalling pathways available in the ResNet 7.0 mammalian database (developed by Ariadne Genomics, Inc, Rockville, MD, USA). Previously identified QTL were also evaluated in the associated QTL regions using PigQTLdb (http://www.animalgenome.org/cgi-bin/gbrowse/pig/) for each trait.
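The sliding-window statistic described above can be illustrated with a short Python sketch. It assumes non-overlapping windows of five consecutive SNPs in map order (which reproduces the 11 563 windows quoted in the text) and assumes that total genetic variance is the variance of whole-genome merit across animals. The function name, the toy genotypes and the toy effects are hypothetical; the actual analysis used GENSEL and R.

```python
import math
from statistics import pvariance

WINDOW = 5         # consecutive SNPs per window, as in the text
N_SNPS = 57_814    # SNPs that passed quality control

# Number of five-SNP windows and the expected share of variance per window
n_windows = math.ceil(N_SNPS / WINDOW)
expected_proportion = 1.0 / n_windows


def window_variance_proportions(genotypes, effects, window=WINDOW):
    """Proportion of genetic variance attributed to each SNP window.

    genotypes: one list per animal of SNP covariates (0/1/2) in map order
    effects:   posterior mean effect of each SNP (same order)
    """
    n_snps = len(effects)
    # Whole-genome merit per animal (assumed definition of total genetic merit)
    total_merit = [sum(g * e for g, e in zip(animal, effects)) for animal in genotypes]
    total_var = pvariance(total_merit)
    proportions = []
    for start in range(0, n_snps, window):
        idx = range(start, min(start + window, n_snps))
        # Genomic merit of this chromosome fragment for every animal
        merit = [sum(animal[i] * effects[i] for i in idx) for animal in genotypes]
        proportions.append(pvariance(merit) / total_var)
    return proportions


# Toy data: 4 animals, 7 SNPs (hypothetical values)
toy_genotypes = [
    [0, 1, 2, 1, 0, 2, 1],
    [2, 1, 0, 1, 2, 0, 1],
    [1, 1, 1, 0, 2, 2, 0],
    [0, 2, 1, 2, 1, 0, 2],
]
toy_effects = [0.1, -0.2, 0.05, 0.0, 0.3, -0.1, 0.2]
props = window_variance_proportions(toy_genotypes, toy_effects)

print(n_windows, f"{expected_proportion:.1e}")
print(props)
```

Because merits of different windows can covary, the window proportions computed this way need not sum to one; a window is of interest when its proportion is well above `expected_proportion`.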

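The PFP values quoted from Fernando et al. (2004) can be reproduced with a few lines of Python, assuming that PFP is the expected number of false positives divided by the expected number of all positives. That formula is our reading of the definition rather than a quotation from the paper, and the function names are hypothetical. The second helper brackets the comparison-wise P-value from the count of bootstrap statistics that exceed the observed one, following the rule stated in the methods.

```python
def proportion_false_positives(alpha: float, power: float, prob_null: float) -> float:
    """PFP = expected false positives / expected total positives (assumed form)."""
    false_pos = alpha * prob_null          # null windows declared significant
    true_pos = power * (1.0 - prob_null)   # real QTL windows detected
    return false_pos / (false_pos + true_pos)


def comparison_wise_p_bounds(n_exceeding: int, n_bootstrap: int = 1000):
    """Lower and upper bound on P when n_exceeding bootstrap statistics
    exceed the test statistic from the real data."""
    return n_exceeding / n_bootstrap, (n_exceeding + 1) / n_bootstrap


# Assumptions stated in the text: 50% power, 99% prior probability of no QTL
for alpha in (0.05, 0.01, 0.001):
    pfp = proportion_false_positives(alpha, power=0.5, prob_null=0.99)
    print(f"P < {alpha}: PFP = {pfp:.3f}")

print(comparison_wise_p_bounds(1))
```

Under these assumptions the sketch gives roughly 0.908, 0.664 and 0.165 for the three thresholds, in line with the 0.90, 0.66 and 0.16 reported in the text.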

Results and discussion

In the present study, a WGAS using the PorcineSNP60 BeadChip was performed by means of Bayes C model averaging with random SNP effects for pig reproductive traits recorded in the first three parities. The new genomic selection method Bayes C was used instead of Bayes B to analyse the present whole-genome data because it is not influenced greatly by the priors for the genetic and residual variances (Kizilkaya et al. 2010).

Most reproductive traits for the first three parities were lowly (for TNB, NBA and MUM) to moderately (for SB and GL) heritable, with only a small proportion of the phenotypic variance explained by the genome markers (0.001–0.40) (Table S1). This confirms that small genetic effects are observed in the presence of considerable farm management influences and other environmental effects, which make it difficult to rapidly improve these traits by selection (Serenius et al. 2008). The earlier reported heritabilities using large populations were also very low for TNB (0.062 ± 0.023; Siewerdt et al. 1995) and NBA (0.049 ± 0.023 in purebred and 0.091 ± 0.054 in crossbred populations; Cecchinato et al. 2010). These findings support our heritability estimates using the PorcineSNP60 BeadChip in a small population. The phenotypic correlations among these traits across the first three parities for sows were 0.16–0.2 for TNB, 0.15–0.18 for NBA, 0.09–0.21 for SB, 0.009–0.06 for MUM and 0.33–0.38 for GL. Genetic correlation estimates either were not able to be obtained from this data set or converged at zero (results not shown). Given that it is known that the genetic correlations are low for several of these traits, in keeping with the phenotypic correlations, these respective traits in different parities should be considered as different traits. A similar conclusion was previously reported based on multivariate Bayesian analyses for litter size for different parities (Noguera et al. 2002). Earlier published studies providing correlations among the traits have been conducted in well-managed experimental herds or in nucleus-level breeding farms; these produced higher correlations (Haneberg et al. 2001). The results found in the present study may also reflect the level of management and environmental influences more commonly observed in commercial pork production systems that are, as a rule, not as well managed when compared with research and breeding herds.

In the present WGAS, identification of QTL was performed by a SNP sliding window approach, because it accounts for linkage disequilibrium between neighbouring SNPs and may be better at discriminating important chromosomal effects from spurious effects of single SNPs. A similar sliding window approach using high-density SNP chip data was also followed to successfully identify QTL with few false positives for the QTLMAS 2010 data containing a complex pedigree (Sun et al. 2010). This study used genomic prediction approaches rather than single marker analyses, and it has been shown that these genomic approaches are at least as successful as other competing methods (Sun et al. 2010). The QTL regions with a high proportion of variance (higher than 8.6E-05, which is an expected proportion of variance accounted for by one window) in the genomic prediction of merit, accounted for by using a five consecutive SNP window, were considered to be associated with the traits studied if their P-values from bootstrap testing were < 0.01 (0.05 for MUM). As mentioned in the materials and methods, the PFP from Fernando et al. (2004) is 0.66 for P < 0.01 and 0.16 for P < 0.001. For an experiment with 50% power where a 99% probability of the null hypothesis of no QTL in the SNP window exists, we expect at least half of the reported QTL to be real for all studied traits, except for MUM (PFP is 0.90 for P < 0.05) in this study.

The SNP sliding window approach and bootstrap analyses identified several different chromosomal regions as being significantly associated (P < 0.01) with reproductive traits (TNB, NBA, SB, MUM (P < 0.05) and GL) in different parities (Table 1 and Tables S2–S6). For instance, the number of candidate chromosomal regions for TNB, which is a combination of NBA and SB, were 14 (on SSC2, 3, 4, 7, 8, 14 and 16), 33 (on SSC3, 7, 8, 9, 11, 12, 13, 14, 15, 16 and 17) and 28 (on SSC1, 2, 3, 4, 6, 8, 9, 12, 13, 14, 15, 18 and X) in the first, second and third parities, respectively. These results indicate that different genomic regions are involved and that there are possible temporal gene effects depending upon each sow's parity. They further confirm that different parities should be considered as different traits, as these traits had a low agreement of SNPs between parities.

The DAVID functional annotation showed that an average of 68.3% (Fig. 1) of total genes present in the identified QTL regions for all traits except MUM were clustered as reproductive genes involved in pituitary (e.g. NOS1AP), ovarian (e.g. IRF1), uterine (e.g. CALCA), placental (e.g. VEGFA) and embryological functions (e.g. HNRNPD). In addition, an average of 48.9% of the total genes present in the identified QTL regions for all of the studied traits except MUM were involved in placental functions (Fig. 1). The observed percentages of reproductive and placental genes in the identified QTL appeared to markedly depart from the 1.2% of reproductive genes (n = 166) in the Sus scrofa 9 genome build. The 1.2% estimate was based on the clustering of 166 genes as being involved in reproductive process. This clustering was carried out by DAVID online using human species for annotation of 13 985 pig genes obtained from the Ensembl biomart using the Sus scrofa 9 genome build. This reinforces the role of placental function genes in pig reproductive traits in the present population. However, some important genes like MEF2C on SSC2, and PLSCR4 and PLSCR5 on SSC13 were consistently located in the QTL regions highly significantly (P < 0.01) associated with the first and second parities, respectively, for TNB and NBA (Figs 2 and S1). The gene MEF2C on SSC2 has never been


Table 1 The summary of significantly (P < 0.01) associated QTL regions and some important genes within the regions for reproductive traits in maternal pig lines.

Trait | No. of QTL regions | SSC for candidate regions | Some important genes in the QTL regions (SSC)1 | Previously identified QTL in the highly associated QTL regions (SSC)2
TNB1 | 14 | 2, 3, 4, 7, 8, 14, 16 | MEF2C (2), RASA1 (2), HTR1A (16) | Meat quality (2)
TNB2 | 33 | 3, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17 | PLSCR4 (13), PLSCR5 (13), PTX3 (13), SEC23B (17) | Non-functional nipples and fat deposition (13)
TNB3 | 28 | 1, 2, 3, 4, 6, 8, 9, 12, 13, 14, 15, 18, X | BCL7B (3), NIPAL2 (4), A Novel protein (15) | Ovulation rate, non-functional nipples and fat deposition (15)
NBA1 | 11 | 1, 2, 3, 4, 12, 14, 16 | IGFBPL1 (1), RASA1 (2), MEF2C (2), HTR1A (16) | Meat quality (2)
NBA2 | 22 | 1, 5, 7, 10, 11, 12, 13, 14 | PLSCR4 (13), PLSCR5 (13), AGTR1 (13), TBX3 (14) | Non-functional nipples and fat deposition (13)
NBA3 | 9 | 2, 3, 4, 6, 12, 15 | BCL7B (3), ROR1 (6), A Novel protein (15) | Ovulation rate, non-functional nipples and fat deposition (15)
SB1 | 25 | 2, 4, 6, 8, 9, 10, 12, 15, 16, 17, 18 | EYA3 (6), RPLP0 (8), HNRNPD (8) | TNB, total NBA and weight of ovary (8)
SB2 | 17 | 1, 3, 4, 5, 6, 8, 10, 14, 16 | CDH20 (1), SS18 (6), TAF4B (6), KCTD1 (6) | Abdominal fat weight (6)
SB3 | 21 | 1, 2, 3, 4, 5, 6, 8, 12, 13, 15, 17, 18 | FGGY (6), RELL1 (8), ACCN1 (12) | Number of corpora lutea, teat number and non-functional nipples (8)
MUM1(3) | 37 | 1, 2, 4, 6, 9, 10, 13, 14, 15, 17 | ESR1 (1), AHR (9), AQP7 (10) | Ovulation rate, fat deposition and haptoglobin concentration (9)
MUM2(3) | 26 | 1, 2, 3, 4, 5, 8, 9, 10, 13, 14, 15, 16, 17 | EEA1 (5), ACAD11 (13), NPHP3 (13), CCRL1 (13), USB5 (13) | Non-functional nipples (13)
MUM3(3) | 41 | 1, 2, 3, 4, 6, 7, 9, 10, 11, 13, 14, 15, 16, 17, 18, X | ECDHE2 (6), HSPH1 (11), CD96 (13), ZBED2 (13) | Non-functional nipples and fat deposition (13)
GL1 | 21 | 2, 4, 5, 6, 9, 11, 13, 15, 16 | FSHB (1), CRSP2 (2), CALCA (2), PTH (2) | Non-functional nipples and fat deposition (2)
GL2 | 12 | 3, 6, 7, 9, 10, 11, 17 | MATN3 (3), EPS15 (6), MAFB (17) | Fat deposition and haptoglobin concentration (17)
GL3 | 20 | 1, 6, 7, 9, 13, 14, 18 | FGF7 (1), CHGA (7), VEGFA (7) | Age at puberty (7)

Genes at highly associated regions and their (SSC) are in bold letters.
TNB, total number born; NBA, number born alive; SB, number of stillborn; MUM, mummified foetuses at birth; GL, gestation length; 1, 2 and 3 after a trait abbreviation represent parity 1, 2 and 3, respectively.
1 The expansions of gene symbols can be obtained from http://uswest.ensembl.org/Sus_scrofa/Info/Index/.
2 The QTL information was obtained from PigQTLDB http://www.animalgenome.org/cgi-bin/gbrowse/pig/.
(3) The bootstrap significance for MUM QTL regions is P < 0.05.

previously reported to be important and is a transcription factor for the Mef2 family genes. Its altered function is known to cause foetal death as a result of heart and vascular dysfunctions (Bi et al. 1999). PLSCR4 in the rat uterus has a role in the regulation of aminophospholipid translocation, which modulates the activity of many membrane required for inflammation and coagulation events in the uterus (Phillippe et al. 2006). The region on SSC15 containing a novel protein coding gene was highly significantly (P < 0.01) associated with litter size in parity 3 (Fig. 2 and Fig. S1, Table 1 and Tables S2–S3). The QTL containing embryonic developmental genes on SSC8 (HNRNPD and RPLP0) and SSC6 (SS18) were highly significantly (P < 0.01) associated with SB in the first and second parities, respectively. In the third parity, a QTL with placental genes (C4orf19, RELL1) was highly significantly (P < 0.01) associated with SB (Fig. S2 and Table 1 and Table S4). We were not able to collect stillborn piglets and assess their phenotypes. Another developmental gene (AHR) was contained in a QTL on SSC9, and a QTL on SSC13 included placental genes (ACAD11, NPHP3, CCRL1 and USB5). These QTL were significantly (P < 0.05) associated with MUM in the first and second parities (Fig. S3 and Table 1 and Table S5). Similarly, another QTL on SSC13 containing both placental (CD96) and embryological developmental (ZBED2) genes was very highly significantly (P < 0.001) associated with MUM in the third parity (Fig. S3 and Table 1 and Table S5). Overall, the genes important for foetal development and placental functions were found to be associated with number of stillbirths and mummified foetuses delivered in different parities in this study. The genes CRSP2 and CALCA on SSC2 are calcitonin gene-related peptide family genes that were found to be very highly significantly (P < 0.001) associated with the first parity GL. The genes including MAFB on SSC17 and VEGFA on SSC7 were also highly significantly (P < 0.01) associated with GL in the second and third parities, respectively (Fig. S4 and Table 1 and Table S6). Members of the CRSP and VEGF gene families are involved in utero-placental blood flow during pregnancy (Yallampalli et al. 2002; Zhou et al. 2003). The MAFB gene is a leucine zipper transcription factor involved in the regulation of lineage specific hematopoiesis (Sarrazin et al. 2009). Taken together, the genes related to blood flow and maintenance of blood cells appear to be associated with GL in the pig population evaluated in this study.

Figure 1 The distribution of associated total QTL, all genes within QTL, and total reproductive and placental genes within QTL for each pig reproductive trait in the first three parities. The X-axis represents reproductive trait, and the Y-axis represents number of counts. The annotation of genes is based on DAVID functional annotation. Among total genes in QTL, average percentages of total reproductive genes and genes clustered to placenta are 68.3 and 48.9, respectively, for all traits (The trait MUM is not represented in this figure, as 90% of the associated QTL with MUM were possible false positives). TNB, total number born; NBA, number born alive; SB, number of stillborn; MUM, mummified foetuses at birth; GL, gestation length; 1, 2 and 3 represent parity 1, 2 and 3, respectively.

Complex traits are the result of molecular networks that are affected by genetic loci and the environment (Chen et al. 2008). Association of different QTL regions and important genes within them may explain genetic variation in a trait because of their interaction through the networks of genes among the different QTL. The functional annotation clustering shows the physiological functions that the selected genes are involved in. However, the pathway analysis specifically identifies the biochemical pathways in which the selected genes participate. Knowing the biochemical pathway is useful to allow the consideration of the importance of other genes in addition to the genes identified by genome-wide association studies. Therefore, to identify the potential gene pathways for individual traits, the genes within the associated QTL for each trait were used for pathway analysis in Pathway Studio. Significantly (P < 0.05) over-represented pathways were considered to be important (Table S7). Identification of such networks will help researchers to explore the underlying biological pathways that cannot be uncovered by statistical association analysis. Exploration of such pathways will not only complement the association analysis results but will also be useful to help focus on new lists of genes that do not reach a significant level in association analysis. In addition, recognition of biological pathways among QTL will be useful for implementing simple management strategies in commercial farms to modulate the physiological networks of animals, as a complement to applying genetic selection in order to achieve better production. For example, in the present analysis, nucleotide metabolism pathways were found to be important for SB in all three parities (Table S7). This indicates that the modulation of nucleotide metabolism by adding nucleotides to pig diets may prevent stillborn piglets, because sows in commercial conditions are under high production pressures that disrupt the nucleotide pool required for growth, maintenance and reproduction (http://www.chemoforma.com/uploads/1269254132496.pdf). Therefore, even though the identified pathways in this study are modest, they will be important to consider in future studies.


Figure 2 Whole-genome analyses for total number born (TNB) in the first three parities for maternal pig lines. The X-axis is genomic location of SNPs, and the Y-axis represents proportion of genetic variance. Different colours represent SNPs on different chromosomes from SSC1 to X and unmapped markers. Each spot indicates the proportion of genetic variance by a SNP window of five consecutive SNPs. SSC, pig chromosome; AGTR1, angiotensin II receptor, type 1; CCNH, cyclin H; MEF2C, myocyte enhancer factor 2C; PTX3, pentraxin 3, long; PLSCR4, phospholipid scramblase 4; PLSCR5, phospholipid scramblase family, member 5; RASA1, RAS p21 protein activator (GTPase activating protein) 1; TMEM161B, transmembrane protein 161B.

Most of the QTL identified in this study are new genomic regions (Table 1, Tables S2–S6), rather than previously reported QTL for TNB, NBA, SB, MUM and GL (PigQTLDB http://www.animalgenome.org/cgi-bin/gbrowse/pig/). Hence, the genomic regions identified here should be considered for future QTL studies in other large sow populations. Although several important genes, QTL regions, and some biological pathways were found to be associated with reproductive traits in pigs in the present population, further validation studies are needed. These studies need to be conducted using large populations with different genetic backgrounds, before finalizing a list of genes to include in selection programmes for improving reproductive performance in the pig and to better understand the biology of economically important reproductive traits in pigs.

In conclusion, the important findings in the present WGAS study are (i) pig reproductive traits are lowly (for TNB, NBA and MUM) to moderately (for SB and GL) heritable under farm management and environmental influences; (ii) different chromosomal regions and thus different genes appear to be associated with each reproductive trait in different parities, which supports the presence of temporal gene effects in different parities; (iii) pathway analyses identified some modest underlying biological pathways associated with pig reproductive traits in different parities; and (iv) future validation studies in very large populations are recommended for pig reproductive traits, as these are lowly heritable traits regulated by many genes, and the interaction between genes and environment may be significant.

Acknowledgements

The authors appreciate the financial support provided by National Pork Board, Newsham Choice Genetics, the State of Iowa, the Iowa State University College of Agriculture and Life Sciences and Hatch funds. The authors are thankful to Marja T. Nikkila for providing phenotype records.

References

Becker D., Tetens J., Brunner A., Bürstel D., Ganter M., Kijas J., International Sheep Genomics Consortium & Drögemüller C. (2010) Microphthalmia in Texel sheep is associated with a missense mutation in the paired-like homeodomain 3 (PITX3) gene. PLoS ONE 13, e8689.

© 2011 The Authors, Animal Genetics © 2011 Stichting International Foundation for Animal Genetics, 43, 18–26



Supporting information

Additional supporting information may be found in the online version of this article.

Figure S1 Whole-genome analyses for number born alive (NBA) in the first three parities for maternal pig lines.
Figure S2 Whole-genome analyses for number of stillborn (SB) in the first three parities for maternal pig lines.
Figure S3 Whole-genome analyses for number of mummified piglets at birth (MUM) in the first three parities for maternal pig lines.
Figure S4 Whole-genome analyses for gestation length (GL) in the first three parities for maternal pig lines.
Table S1 Posterior mean of variance components explained by whole-genome SNP markers for reproductive traits in a study using maternal pig lines.
Table S2 The detailed information about QTL regions significantly (P < 0.01) associated with total number born (TNB) after bootstrap analysis.
Table S3 The detailed information about QTL regions significantly (P < 0.01) associated with number born alive (NBA) after bootstrap analysis.
Table S4 The detailed information about QTL regions significantly (P < 0.01) associated with number of stillborn (SB) after bootstrap analysis.
Table S5 The detailed information about QTL regions significantly (P < 0.05) associated with number of mummified foetuses at birth (MUM) after bootstrap analysis.
Table S6 The detailed information about QTL regions significantly (P < 0.01) associated with gestation length (GL) after bootstrap analysis.
Table S7 DAVID enriched reproductive genes and over-represented pathways by Pathway studio using genes within the very significantly (P < 0.01) associated QTL for pig reproductive traits.

As a service to our authors and readers, this journal provides supporting information supplied by the authors. Such materials are peer-reviewed and may be re-organized for online delivery, but are not copy-edited or typeset. Technical support issues arising from supporting information (other than missing files) should be addressed to the authors.
