A Genome Scan for Quantitative Trait Loci Affecting Average Daily Gain and Kleiber

Ratio in Baluchi Sheep

Running Title: GWAS for Growth rate

Majid Pasandideh*, Ghodrat Rahimi-Mianji, Mohsen Gholizadeh

Laboratory for Molecular Genetics and Animal Biotechnology, Faculty of Animal and Aquatic

Science, Sari Agricultural Sciences and Natural Resources University,

Sari, P.O. Box -578, Iran

Postal address: Majid Pasandideh, No 1, Valiasr 8 Alley, Valiasr (Farhangian) Town, Bojnord City,

North Khorasan Province, Iran.

Postal Code: 94178-94815

Corresponding Author Email: [email protected]

[email protected]

[email protected]

Keywords: Baluchi sheep, Genome-wide association study, Growth rate, Single nucleotide polymorphism

Abstract

1

Genome-wide association study (GWAS) is an efficient tool for detection SNPs and candidate in quantitative traits. Growth rate is an important trait for increasing of meat production in sheep. A total of 96 Baluchi sheep were genotyped using Illumina OvineSNP50 BeadChip to run a GWAS for average daily gain (ADG) and Kleiber ratio (KR) traits in different periods of age in sheep. Traits included were average daily gain from birth to 3 months (ADG0-3), from 3 months to 6 months (ADG3-

6), from 6 months to 9 months (ADG6-9), from 9 months to yearling (ADG9-12), from birth to 6 months

(ADG0-6), from 3 months to 9 months (ADG3-9), from 3 months to yearling (ADG3-12) and corresponding Kleiber ratios (KR0-3, KR3-6, KR6-9, KR9-12, KR0-6, KR3-9 and KR3-12, respectively). A total of 42,243 SNPs passed the quality-control filters and were analyzed by PLINK software in a linear mixed model. Two SNPs were identified on two at the 5% genome-wide significance level for KR(3-9) and KR(6-9). Two candidate genes namely MAGI1 and ZNF770 were identified correspondingly harboring and close to these QTL. Also, a total of 21 SNPs were found on chromosomes 2, 3, 5, 6, 7, 10, 17, 19, 20 and 25 at the 5% -wide significance level for

ADG and KR traits. Thus, we suggest more studies to discovery of causative variants for growth traits in sheep.

Keywords: Baluchi sheep, Genome-wide association study, Growth rate, Single nucleotide polymorphism

Introduction

In Iran, the consumption of lamb and mutton as sources are more prevalent than meat from beef cattle and goats. Due to the fact that meat production from sheep is not enough, it is necessary to increase the efficiency of sheep production by improving litter size, lamb weight and growth rate.

The Baluchi sheep is one of the most important native breeds in Iran, constituting about 30% of the total sheep population (Yazdi et al. 1997). This breed has medium size, fat-tailed with the

2

predominant white coat color, whereas the muzzle is black. These animals are mostly kept on pasture and adapted to the harsh environment and dry and hot climate in the eastern part of Iran with low quality pasture.

Increasing in size and number of cells is applied as primarily definition of growth. However, this does not comprise the phenomenology and etiology of growth. Because, growth is a continuous function during life of animal, it is better to use growth rate or increasing in weight and size during different stages of life for measuring growth (Nkrumah et al. 2007). Therefore, the average daily gain (ADG) in weight has been one of the most valuable traits and plays an essential role in sheep breeding programs. A portion of feed intake is served for animal maintenance requirements and the rest can be used for production (milk, wool and meat). Due to maintenance need is higher for larger animals, they must obtain more feed intake to have the same body mass gain as the smaller animal (Kleiber

1936). Therefore, a method must be used for comparison of larger and smaller animals. Metabolic mass (W0.75) in kleiber ratio provides the possibility of comparison the large and the small animals without to have an advantage to either (Kleiber 1947). Selection for mass or growth rate can result in undesirable correlated responses such as increasing of fat deposit in the animal body (Badenhorst

1990). The fat deposit can disrupt the homeostatic balance leading to lower fertility (Badenhorst

2011). Selection for efficiency will not decrease fertility as a result of increasing fat deposits in the body. Therefore, efficiency of feed conversion is a more acceptable selection criterion than mass gain for evaluation growth rate (Badenhorst 2011). However, it is difficult to determine the efficiency of feed conversion for each animal individually in field condition. The Kleiber ratio is based on the fact that there is a direct relationship between the animal weight and the maintenance requirements and production. Therefore, Kleiber ratio is used as an alternative selection criterion for efficiency (Kleiber

1947).

3

Mapping of quantitative trait loci (QTL) is a set of procedures to detect the genetic variation using linkage mapping and traditional statistical and quantitative genetic approaches. There have been two approaches for QTL mapping since many years ago: candidate approach and linkage mapping analyses by microsatellite markers (Hirschhorn and Daly 2005). It is difficult to use candidate gene approach for quantitative traits, because it is limited to how much of the biology of the investigated trait is recognized (Andersson and Georges 2004; Al-Mamun et al. 2015). The limitations of QTL mapping by microsatellite markers are the low map resolution and the large confidence interval (CI).

Therefore, it is difficult to identify the important genes and the underlying mutations for the interested traits using the marker information (Goddard and Hayes 2009). Recently, due to the accessibility of a high-density SNP chip, the genome wide association study (GWAS) has become a useful technique for the identification of causal genes for the economical traits in livestock. The GWAS with the applying of large scales of SNPs in whole genome investigates the correlation between genetic variations and phenotypic variations in trait. The goal of GWAS is using of linkage disequilibrium

(LD) information between markers and causative mutations that tend to be inherited together to next generation (Bush and Moore 2012). GWAS has been widely applied to detect of causative mutation in human in order to improve treatment or produce useful diagnostic for diseases (Scott et al. 2007;

Lasky-Su et al. 2008; Lee et al. 2008). In animal, especially cattle, this method has emphasized on economically important traits to expedite the genetic improvement (Cole et al. 2011; Zanella et al.

2011). A GWAS analysis in Australian Merino sheep has detected 39 SNPs associated with body weight and a major QTL region (included 13 SNPs) on OAR6 (Al-Mamun et al. 2015). Gholizadeh et al. (2015) identified one SNP with genome wide significance effect within SYNE1 gene on chromosome 8 to be associated with yearling weight in Baluchi sheep. A GWA study in purebred sheep reported 36 significant SNPs for growth and meat production traits and 5 most crucial candidate genes associated with post-weaning gain (Zhang et al. 2013). Our objectives in this study were to

4

perform a GWAS in Baluchi sheep to identify the significant SNPs associated with ADG and KR traits in different periods of age using the ovine SNP50 BeadChip (Illumina, San Diego, USA) and to explore the candidate genes around these SNPs.

Materials and methods

Ethical statement

This study has been performed with the approval of Abbasabad Sheep Breeding Station of Mashhad,

Iran and Iran national science foundation (INSF) (number 89,002,382). All institutional and national guidelines for the care and use of laboratory animals were followed.

Animal resources and measurement of traits

The sheep population used in this study consisted of 96 Baluchi sheep (93 females and 3 males) collected from Abbasabad Sheep Breeding Station, Mashhad, Iran. The animals were kept with conventional industry practices. The mating period was between late summer (August) and early autumn (September) and included at most three estrous cycles (51 days). Lambing started in early

February and ended in late March. The lambs were kept with their mothers until the mean weaning age of all lambs at about 3 months. During spring and summer, the animals were kept on pasture and kept indoors in winter. This study focused on growth rate traits included: average daily gain from birth to 3 months (ADG0-3), from 3 months to 6 months (ADG3-6), from 6 months to 9 months (ADG6-

9), from 9 months to yearling (ADG9-12), from birth to 6 months (ADG0-6), from 3 months to 9 months

(ADG3-9), from 3 months to yearling (ADG3-12) and corresponding Kleiber ratios (KR0-3, KR3-6, KR6-

9, KR9-12 ,KR0-6, KR3-9 and KR3-12, respectively).

Average daily gain and Kleiber ratios for each period were calculated as follow:

ADG = (end weight – first weight) / number of days in period

KR = ADG for each period / (end weight) 0.75

Sampling, genotyping and data quality control

5

Venous jugular blood samples were collected using vacuum tubes treated with 0.25% Ethylene

Diamine Tetracetic Acid (EDTA) as an anticoagulant. Genomic DNA was extracted from blood samples using a modified salting out protocol following Miller et al. (1988). DNA was genotyped using the Illumina OvineSNP50 BeadChip containing 54,241 SNPs at the Sci-Life Lab in Uppsala,

Sweden. PLINK software version 1.07 (Purcell et al. 2007) was used for data quality control. SNPs that passed the quality control criteria (genotyping frequency >95%, minor allele frequency >0.05 and Hardy–Weinberg equilibrium P>0.001) were included for further analysis. Population stratification was assessed by a quantile-quantile (Q-Q) plot using the ggplot package in R software and genomic inflation factor (based on median chi-squared) in PLINK software.

Genome-wide association analysis

Using PLINK software, linear regression analyses in GWAS were performed with a model included

SNPs effect and sex, year of birth, month of birth, herd and type of birth as covariates. Least square analyses were conducted using the general linear model (GLM) procedure (SAS Institute 2002) to test the significance of fixed effects to be included in the model. The significant fixed effects were considered in the final model for each trait. All interactions between fixed effects were not significant and therefore were excluded from the model. The full model was as follows:

yijklmn = µ + SNPi + Sexj + Yeark + Monthl + Herdm + Type of birthn + eijklmn

Where:

yijklmn: vector of the observed traits; µ: overall mean; SNPi: SNP genotype; Sexj: fixed effect of sex

(2 levels); Yeark: fixed effect of year of birth (4 levels); Monthl: fixed effect of month of birth (4 levels); Herdm: fixed effect of herd (2 levels); Type of birthn: fixed effect of type of birth, single or twin lamb in birth (2 levels); eijklmn: residual effect to each observation.

Significance test

6

A Bonferroni correction was used to control the family-wise error rate (FWER). The genome-wide significant level is for SNPs with a raw P-value less than 0.05/N, where N is the total number of SNPs on all chromosomes. Also, chromosome-wide significant level refers for SNPs with a raw P-value less than 0.05/n, where n is number of SNPs tested for each chromosome in this study. Haploview version 4.1 software (Barrett et al. 2005) was used to plot Manhattan plots for all SNP markers of studied traits.

Study of genes and QTLs in the candidate regions

We used the latest sheep genome Ovis_aries_v3.1

(http://www.livestockgenomics.csiro.au/sheep/oar3.1.php permanent.), UCSC Genome

Bioinformatics (http://genome.ucsc.edu, permanent.) and National Center for Biotechnology

Information (NCBI) (http://www.ncbi.nlm.nih.gov/. permanent.) for identifying relationship between significant SNPs and ovine genes. A BLAST search was also performed using the bovine UCSC

Genome Browser to assess genes already mapped to the bovine genome. QTL database

(http://www.animalgenome.org/QTLdb/cattle.html) was used for detection of QTL in the candidate regions.

Results and discussion

Phenotype statistics and quality control

Descriptive statistics for records of growth traits are presented in Table 1. After quality control, we removed 213 SNPs with call rates less than 95%, 6865 SNPs with minor allele frequency (MAF) less than 0.05 and 24 SNPs with Hardy–Weinberg (HWE) less than 0.001. A total of 42,243 SNPs passed the quality-control criteria and were included in the dataset.

Table 1. Descriptive statistics of examined traits in Baluchi sheep.

7

The Q-Q plots for some of traits were shown in Figure 1. In these plots, X and Y axes are the expected p-values under null hypothesis and the observed p-values, respectively. The genomic inflation factor

(lambda) is defined as the ratio of the median of the observed distribution of the test statistic to the expected median. The lambda value for ADG(0-3) and ADG(0-6) was 1.03 and 1.1 respectively and for other traits was 1. Thus, these scales were close to their expectations values under the null hypothesis of 1. Based on Q-Q plots and lambda value, there was no population stratification.

GWAS analyses

The Manhattan plots for some of traits were shown in figure 2. These plots shows P-values [in terms of -log(P-value)] for all SNP markers of studied traits (Figure 2). We identified two SNPs at the 5% genome-wide significance level (indicated in bold in Table 2) associated with KR(6-9) and KR(3-9) and 21 SNPs at the 5% chromosome-wide significance level for ADG and KR traits. No significant

SNPs were identified for ADG(3-6). The details of significant SNPs, including their positions in the genome, the nearest known ovine genes, their distance from genes, and the raw P-values, are given in Table 2.

Table 2. Significant SNPs associated with ADG and KR.

ADG(0-3) was nominally correlated with ADG(3-9) and KR(0-3). We found three significant SNPs for ADG(0-3) located 450,981, 53,760 and 134,185 bp apart from the nearest known ovine genes

(KCNIP4, ARHGEF40 and ZNF770). OAR6_45438896 and OAR7_31307164 SNPs had significantly association with KR(0-3) and ADG(3-9), respectively. KCNIP4 gene that also known as CALP encodes a member of the family of voltage-gated potassium (Kv) channel-interacting (KCNIPs), which belong to the recovering branch of the EF-hand superfamily (Huang and

Forsberg, 1998). CALP (Calpain) mediated degradation of myofibrillar proteins and is related to tenderness of slaughtered meat. Also it plays an important role in skeletal muscle growth in beef, 8

lamb and pork (Morgan et al. 1993). ARHGEF40 (Rho guanine nucleotide exchange factor 40) encodes Rho protein belongs to the family of Rho GTPases. Rho protein regulates a variety of biological response pathways including cell motility, cell division, cell transformation, gene transcription, organelle development and cytoskeletal dynamics (van Unen et al. 2015). ZNF770 encodes a member of zinc finger proteins family as one of most abundant proteins in eukaryotic genomes. These proteins have diverse functions including DNA recognition, RNA packaging, transcriptional activation, regulation of apoptosis, protein folding and assembly, and lipid binding

(Laity et al. 2001).

ADG(6-9) showed a nominal relationship with KR(6-9). There were 3 significant SNPs associated with ADG(6-9). Also, two SNPs showed significant effect on KR(6-9) that OAR19_37632125 SNP reached 5% Bonferroni chromosome-wide significance level (P=0.0466). This marker was located within the MAGI1 gene whereas OAR5_96992290 and OAR19_55346948 were 674810 bp and

10856 bp apart from the nearest known ovine genes ARRDC3 and SETD2, respectively. MAGI1

(Membrane-Associated Guanylate Kinase Inverted 1) encodes a protein of membrane-associated guanylate kinase homologue (MAGUK) family. Internal surface of the plasma membrane at regions where cells contact each other is made of multiprotein complexes assembled by contribution of

MAGUK proteins. Also, the encoded protein by this gene regulates scaffolding protein at cell-cell junctions (Zaric et al. 2012). Arrestin Domain Containing 3 (ARRDC3) encodes a member of mammalian α-arrestin family of proteins that plays an important role in metabolism and the development of obesity. ARRDC3 gene effects on β-adrenergic receptors so that the loss of ARRDC3 leads to increasing response to β-adrenergic stimulation in isolated adipose tissue (Tian et al. 2016).

Patwari et al. (2011) reported that the mice with Inactivated Arrdc3 were resistance to obesity due to increasing in energy consumption through increased activity levels and increased thermogenesis of both brown and white adipose tissues. SETD2 (SET domain containing protein 2) is a histone

9

methyltransferase which regulates the chromatin structure through covalent modification of histones.

SETD2 is involved in transcriptional activation and repression, DNA repair, and cell cycle regulation

(Turner 2000).

ADG(9-12) was found to be closely correlated with KR(9-12). In this study, 3 SNPs were significantly associated with both ADG(9-12) and KR(9-12). OAR19_28027752, s10229 and

OAR19_32337762 were located 586,566, 60,256 and 33,949 bp apart from the known ovine genes

CHL1, PPP4R2 and FOXP1, respectively. CHL1 (Cell Adhesion Molecule L1 Like) encodes a protein of the L1 gene family of neural cell adhesion molecules. This protein is a neural recognition molecule that may plays a role in signal transduction pathways and maintaining a high fidelity of chromosome transmission during mitosis (Holloway 2000). PPP4R2 (Protein Phosphatase 4

Regulatory Subunit 2) is one of the regulatory subunits of protein phosphatase 4 (PPP4C). This protein is expressed in the nucleation of microtubules at centrosomes that may strongly affect cell survival (Shui et al. 2007). Bosio et al. (2012) showed that PPP4R2 displays a specific and very dynamic localization in differentiating neuronal cells and thus has a role in muscular paralysis.

FOXP1 (Forkhead Box P1) belongs to Forkhead Box P (FOXP) subfamily of transcription factors.

The role of Forkhead box transcription factors is in the regulation of tissue- and cell type- specific gene transcription during both development and adulthood (Bacon et al. 2015). In a research was generated a conditional Foxp1 mutant mouse with deletion Foxp1 specifically in neural tissues using the Cre-Lox system. They reported these mice are viable but have a significantly reduced body weight compared with wild-type littermates. Thus we think that Foxp1 gene might affect body weight (Bacon et al. 2015).

In this study, two SNPs significantly associated with ADG(0-6) were located within FUT8 and 22346 bp apart from NDFIP2. The fucosyltransferase 8 gene (FUT8) encodes an enzyme that transfers fucose from GDP-fucose to glycoconjugates such as glycoproteins and glycolipids. Wang et al.

10

(2005) generated 1,6-fucosyltransferase (Fut8)-null mice and found that disruption of FUT8 induces severe growth retardation and death during postnatal development. NDFIP2 (Nedd4 Family

Interacting Protein 2) encodes a protein that bind to and activate members of the Nedd4 family of E3 ubiquitin ligases (Mund and Pelham 2010). A study reported that NDFIP2 can promote secretion of the hallmark cytokine of Th1 cell differentiation, IFN-γ and plays a role in the Th1 cell differentiation process (Lund et al. 2007).

We found 3 significant SNPs affecting ADG(3-12) and KR(3-12) that OAR17_54155064 SNP was associated with both these traits. The nearest known ovine genes to these SNPs were RPL10A,

TMEM132B and RASGEF1B. The RPL10A gene (Ribosomal Protein L10a) encodes a component of the 60S subunit for ribosomal protein L10a. It seems RPL10a is involved in embryogenesis and organogenesis and cell proliferation (Wonglapsuwan et al. 2010). TMEM132B (Transmembrane

Protein 132B) is a member of the TMEM132 gene family, which has 5 subtypes (TMEM132A-

E). The function of TMEM132B protein is relatively uncharacterized (Farlow et al. 2015).

RASGEF1B (RasGEF Domain Family Member 1B) is a GEF. Guanine Nucleotide Exchange Factors

(GEFs( activate small GTPases and promote the formation of active Ras–GTP. These proteins are involved in cell proliferation, survival, differentiation, vesicular trafficking, and gene expression

(Andrade et al. 2010).

OAR3_19747277 SNP located within the known ovine MBOAT2 gene was significantly associated with KR(3-6). MBOAT2 (Membrane Bound O-Acyltransferase Domain Containing 2) encodes a protein of Acyltransferase family. This protein mediates the conversion of lysophosphatidylethanolamine into phosphatidylethanolamine (Gijon et al. 2008).

Our GWAS identified five SNPs associated with KR(0-6), and they were distributed on two autosomes. One of the SNPs was located within the PSAP (Prosaposin) gene and the others were located 15,825 bp, 146,168 bp, 15,896 and 823,671 apart from the nearest known ovine genes SCGN,

11

HDGF, SCN4A and ZP4, respectively. SCGN (Secretagogin, EF-Hand Calcium Binding

Protein) gene encodes a calcium-binding protein secreted in the cytoplasm (Maj et al. 2010). This protein is similar to calmodulin and calbindin-D28K and involved in KCL-stimulated calcium flux and cell proliferation. SCGN is strongly expressed in pancreatic β-cells, other neuroendocrine cells and distinct neurons of the CNS (Rogstam et al. 2007). HDGF (Hepatoma-Derived Growth Factor) is an acidic heparin-binding protein originally isolated from the conditional medium of human hepatoma cells. It is a member of a new family of growth factors called HDGFrelated proteins (HRPs)

(Tsai et al. 2013). HDGFs have growth factor activity for hepatoma cells, fibroblasts, smooth muscle cells, and endothelial cells (Dietz et al. 2002). Protein encoded by PSAP gene is involved in a number of biological functions including the development of the nervous system and the reproductive system.

When prosaposin is cleaved it generates four main products including saposins A, B, C, and D. The individual saposins help to lysosomal enzymes for breaking down fatty substances (Morales et al.

2000). SCN4A (Sodium Voltage-Gated Channel Alpha Subunit 4) encodes NaV1.4 that it mediates the voltage-dependent sodium ion permeability of excitable membranes. This protein expresses at the neuromuscular junction of striated muscles including laryngeal muscles. Some studies suggest that SCN4A-related disorders such as generalized muscle hypertrophy with clinical or electrical myotonia evolved later in life (Lion-Francois et al. 2010; Matthews et al. 2008). ZP4 (Zona Pellucida

Glycoprotein 4) is a member of oocyte-specific genes family. These genes encode glycoprotein which is a part of the extracellular matrix surrounds growing oocytes, ovulated eggs and pre-implantation embryos (Lefievre et al. 2004).

We found two significant SNPs associated with KR(3-9) that OAR7_31307164 SNP reached 5%

Bonferroni chromosome-wide significance level (P=0.0061). This marker was related in ADG(0-3) and ADG(3-9) too. The identified SNPs for KR(3-9) were located 134,185 and 25,561 bp apart from the nearest known ovine genes (ZNF770 and MEIS2). MEIS2 (Meis Homeobox 2) encodes a

12

homeobox protein belonging to the TALE ('three amino acid loop extension') family. TALE group comprise from homeodomain-containing transcription factors that regulate cell proliferation and differentiation during development. Also, they are involved in proximal-distal limb patterning and skeletal muscle differentiation (Crowley et al. 2010).

In this study, the identified SNP markers for growth rate traits were located within or close to 21 known ovine genes in total. Of these genes, KCNIP4, ARHGEF40, ARRDC3, FOXP1, FUT8,

SCGN, HDGF, SCN4A and MEIS2 genes are directly involved in processes related to muscle growth and fat deposit. Other identified genes may mediate in some processes such as development of the reproductive system, cell proliferation and differentiation, protein folding and levels of gene transcription thereupon affect muscle growth and fat deposit in sheep. In different periods of ADG and KR traits, some of significant markers were same and some of them were different. The records related to ADG and KR traits are actually repeated observations measured along a trajectory and the mean and covariance between measurements change gradually along the trajectory (Mrode 2014).

Also, due to the diverse expression of genes in different periods of age, heritability and genetic correlation between repeated measurements are varied along the trajectory (Mrode 2014). Therefore, detection of different SNP markers for ADG or KR traits in different periods of age is reasonable and acceptable.

Raadsma et al. (2009) identified QTL controlling ADG in Awassi and Merino sheep on OAR3, OAR6 and OAR7. In agreement with those findings reported by Raadsma et al. (2009), we detected some

QTL affecting growth rate traits in sheep on OAR3 (one SNP), OAR6 (two SNPs) and OAR7 (three

SNPs). Also, we identified one SNP located on OAR2, one SNP located on OAR5, two SNPs located on OAR17 and five SNPs located on OAR19. The results of our study are consistent with findings reported by Zhang et al. (2013) that identified QTL controlling ADG in Sunit, German Mutton, and

Dorper sheep on OAR2, OAR5, OAR17, OAR19 and OAR20. Hadjipavlou and Bishop (2009) using

13

microsatellite markers found one QTL for ADG on OAR20 in Scottish Blackface sheep. We identified three SNPs located on OAR20 consistent with those findings. Furthermore, we found one novel SNP on OAR10 and two not previously reported SNPs on OAR25 affecting ADG in sheep.

We performed BLAST for regions harboring the significant SNPs using the cow genome and found the orthologous regions. The results of BLAST for the significant SNPs in sheep were shown in Table

3. The results showed that 9 out of 21 detected genes were found relevant in cattle (ARRDC3, CHL1,

FOXP1, FUT8, KCNIP4, MAGI1, MEIS2, NDFIP2 and SCGN). According to the theory of conserving of genomic functional regions in different species and due to the fact that the genetic mechanisms are fundamentally similar in all living organisms (Bai et al. 2012), the results of BLAST are reasonable and acceptable. Also, it provides an opportunity to perform deep sequencing of these genes for further investigation. The results of QTL survey for regions with the significant SNPs in sheep using the cow genome were shown in Table 4. Most of identified regions in this study overlapped with QTLs that had previously been identified as affecting body weight traits in cattle

(Kneeland et al. 2004; Nkrumah et al. 2007; Sherman et al. 2009; McClure et al. 2010).

Table 3. BLAST results for the significant SNPs in sheep using the cow genome.

Table 4. The results of QTL survey for regions with the significant SNPs in sheep using the cow genome

We did not find any GWAS for ADG and KR traits in sheep, but there are some reports on other traits related to growth. Kominakis et al. (2017) performed a GWAS for body size traits in Frizarta dairy sheep. They reported 11 significant SNPs, at the chromosome level, and 39 candidate genes that were associated with gene networks relevant to body size traits. Matika et al. (2016) conducted a GWAS for growth traits in Scottish Blackface lambs. They identified a genomic region on chromosome 6 associated with bone weight, bone area, fat area, fat density, fat weight and muscle density traits as 14

well as one SNP on chromosome 1 at the genome-wide significance level related to muscle density.

A GWAS analysis on 329 Chinese sheep identified 36 significant SNPs for growth traits and 5 candidate genes associated with post-weaning weight gain (Zhang et al. 2013). Santana et al. (2014) performed a GWAS for ADG in 720 Nellore cattle. The results showed a genomic region with six significant SNPs on chromosome 3 harboring some important genes such as PDE4B, LEPR, CYP2J2 and FGGY. Bolormaa et al. (2011) conducted a GWAS to identify genomic regions affecting growth traits in 7 breeds of cattle. They reported the association of SNPs with the RFI and ADG on chromosomes 5 and 8. Snelling et al. (2010) performed a GWAS for growth traits in 7 breeds of cattle. They reported that most of the SNPs associated with direct growth were located on BTA6. Our

BLAST study showed that these findings are relevant to regions we found in the present study.

Conclusions

In this study, 21 significant SNPs were identified for all traits that 2 of them reached 5% genome- wide significance level. Five of these SNPs were located within ovine genes and the rests were located close to ovine genes (10,856 bp - 823,671 bp apart). The most of the candidate genes and SNP markers were found novel. We suggest more studies to verify the causal mutations and candidate genes affecting growth rate traits in sheep. The results of this study could be used for increasing efficiency of selection and identification of the physiological processes of growth rate in sheep.

References

Al-Mamun H. A., Kwan P., Clark S. A., Ferdosi M. H., Tellam R. and Gondro C. 2015. Genome-wide association study of body weight in Australian Merino sheep reveals an orthologous region on OAR6 to human and bovine genomic regions affecting height and weight. Genet. Sel. Evol. 47, 1-11.

Andersson L. and Georges M. 2004. Domestic-animal genomics: deciphering the genetics of complex traits. Nature Rev. Genet. 5, 202-212.

Andrade W. A., Silva A. M., Alves V. S., Salgado A. P. C., Melo M. B., Andrade H. M. et al. 2010. Early endosome localization and activity of RasGEF1b, a toll-like receptor-inducible Ras guanine-nucleotide exchange factor. Genes Immun. 11, 447-457. 15

Bacon C., Schneider M., Le Magueresse C., Froehlich H., Sticht C., Gluch C. et al. 2015. Brain-specific Foxp1 deletion impairs neuronal development and causes autistic-like behaviour. Mol. Psychiatr. 20, 632-639.

Bai Y., Sartor M. and Cavalcoli J. 2012. Current status and future perspectives for sequencing livestock genomes. J. anim. sci. biotechnol. 3, 1-6.

Badenhorst M. A. 1990. The kleiber ratio as a possible selection for sire selection. The shepherd. 35, 18-19.

Badenhorst M. A. 2011. The Kleiber Ratio as a possible selection for Afrino Sire Selection. Grootfontein Agricultural Development Institute. 4, 9-12.

Barrett J. C., Fry B., Maller J. D. M. J. and Daly M. J. 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 21, 263-265.

Bolormaa S., Hayes B. J., Savin K., Hawken R., Barendse W., Arthur P. F. et al. 2011. Genome-wide association studies for feedlot and growth traits in cattle. J. Anim. Sci. 89, 1684-1697.

Bosio Y., Berto G., Camera P., Bianchi F., Ambrogio C., Claus P. et al. 2012. PPP4R2 regulates neuronal cell differentiation and survival, functionally cooperating with SMN. Eur. J. Cell Biol. 91, 662-674.

Bush W. S. and Moore J. H. 2012. Genome-wide association studies. PLoS Comput Biol. 8, e1002822.

Cole J. B., Wiggans G. R., Ma L., Sonstegard T. S., Lawlor T. J., Crooker B. A. et al. 2011. Genome-wide association analysis of thirty one production, health, reproduction and body conformation traits in contemporary US Holstein cows. BMC genomics. 12, 1-17.

Crowley M. A., Conlin L. K., Zackai E. H., Deardorff M. A., Thiel B. D. and Spinner N. B. 2010. Further evidence for the possible role of MEIS2 in the development of cleft palate and cardiac septum. Am. J. Med. Genet. Part A. 152, 1326- 1327.

Dietz F., Franken S., Yoshida K., Nakamura H., Kappler J. and Gieselmann V. 2002. The family of hepatoma-derived growth factor proteins: characterization of a new member HRP-4 and classification of its subfamilies. Biochem. J. 366, 491-500.

Farlow J. L., Lin H., Sauerbeck L., Lai D., Koller D. L., Pugh E. et al. 2015. Lessons learned from whole exome sequencing in multiplex families affected by a complex genetic disorder, intracranial aneurysm. PloS one. 10, e0121104.

Gholizadeh M., Rahimi-Mianji G. and Nejati-Javaremi A. 2015. Genomewide association study of body weight traits in Baluchi sheep. J. Genet. 94, 143-146.

Gijon M. A., Riekhof W. R., Zarini S., Murphy R. C. and Voelker D. R. 2008. Lysophospholipid acyltransferases and arachidonate recycling in human neutrophils. J. Biol. Chem. 283, 30235-30245.

Goddard M. E. and Hayes B. J. 2009. Mapping genes for complex traits in domestic animals and their use in breeding programmes. Nature Rev. Genet. 10, 381-391.

Hadjipavlou G. and Bishop S. C. 2009. Age‐dependent quantitative trait loci affecting growth traits in Scottish Blackface sheep. Anim. Genet. 40, 165-175.

Hirschhorn J. N. and Daly M. J. 2005. Genome-wide association studies for common diseases and complex traits. Nature Rev. Genet. 6, 95-108.

Holloway S. L. 2000. CHL1 is a nuclear protein with an essential ATP binding site that exhibits a size-dependent effect on chromosome segregation. Nucleic Acids Res. 28, 3056-3064.

Huang J. and Forsberg N. E. 1998. Role of calpain in skeletal-muscle protein degradation. Proc. Natl. Acad. Sci. USA 95, 12100-12105. 16

Kleiber M. 1936. Problems involved in breeding for efficiency of food utilization. Proc. Am. Soc. Anim. Prod., Madison, WI, 247–258.

Kleiber M. 1947. Body size and metabolic rate. Physiol. rev. 27, 511-541.

Kneeland J., Li C., Basarab J., Snelling W. M., Benkel B., Murdoch B. et al. 2004. Identification and fine mapping of quantitative trait loci for growth traits on bovine chromosomes 2, 6, 14, 19, 21, and 23 within one commercial line of. J. Anim. Sci. 82, 3405-3414.

Kominakis A., Hager-Theodorides A. L., Zoidis E., Saridaki A., Antonakos G. and Tsiamis G. 2017. Combined GWAS and ‘guilt by association’-based prioritization analysis identifies functional candidate genes for body size in sheep. Genet. Sel. Evol. 49, 1-16.

Laity J. H., Lee B. M. and Wright P. E. 2001. Zinc finger proteins: new insights into structural and functional diversity. Curr. Opin. Struc. Biol. 11, 39-46.

Lasky‐Su J., Anney R. J. L., Neale B. M., Franke B., Zhou K., Maller J. B. et al. 2008. Genome‐wide association scan of the time to onset of attention deficit hyperactivity disorder. Am. J. Med. Genet. Part B: Neuropsychiatric Genetics. 147, 1355-1358.

Lee Y. H., Kang E. S., Kim S. H., Han S. J., Kim C. H., Kim H. J. et al. 2008. Association between polymorphisms in SLC30A8, HHEX, CDKN2A/B, IGF2BP2, FTO, WFS1, CDKAL1, KCNQ1 and type 2 diabetes in the Korean population. J. Hum. Genet. 53, 991-998.

Lefievre L., Conner S. J., Salpekar A., Olufowobi O., Ashton P., Pavlovic B. et al. 2004. Four zona pellucida glycoproteins are expressed in the human. Hum. Reprod. 19, 1580-1586.

Lion-Francois L., Mignot C., Vicart S., Manel V., Sternberg D., Landrieu P. et al. 2010. Severe neonatal episodic laryngospasm due to de novo SCN4A mutations A new treatable disorder. Neurology. 75, 641-645.

Lund R. J., Loytomaki M., Naumanen T., Dixon C., Chen Z., Ahlfors H. et al. 2007. Genome-wide identification of novel genes involved in early Th1 and Th2 cell differentiation. The J. Immunol. 178, 3648-3660.

Maj M., Gartner W., Ilhan A., Neziri D., Attems J. and Wagner L. 2010. Expression of TAU in insulin-secreting cells and its interaction with the calcium-binding protein secretagogin. J. Endocrinol. 205, 25-36.

Matika O., Riggio V., Anselme-Moizan M., Law A. S., Pong-Wong R., Archibald A. L. et al. 2016. Genome-wide association reveals QTL for growth, bone and in vivo carcass traits as assessed by computed tomography in Scottish Blackface lambs. Genet. Sel. Evol. 48, 1-15.

Matthews E., Guet A., Mayer M., Vicart S., Pemble S., Sternberg D. et al. 2008. Neonatal hypotonia can be a sodium channelopathy: recognition of a new phenotype. Neurology. 71, 1740-1742.

McClure M. C., Morsci N. S., Schnabel R. D., Kim J. W., Yao P., Rolf M. M. et al. 2010. A genome scan for quantitative trait loci influencing carcass, post‐natal growth and reproductive traits in commercial Angus cattle. Anim. Genet. 41, 597- 607.

Miller S. A., Dykes D. D. and Polesky H. F. R. N. 1988. A simple salting out procedure for extracting DNA from human nucleated cells. Nucleic Acids Res. 16, 1215.

Morales C. R., Zhao Q., Lefrancois S. and Ham D. 2000. Role of prosaposin in the male reproductive system: effect of prosaposin inactivation on the testis, epididymis, prostate, and seminal vesicles. Arch. Andrology. 44, 173-186.

Morgan J. B., Wheeler T. L., Koohmaraie M., Savell J. W. and Crouse J. D. 1993. Meat tenderness and the calpain proteolytic system in Longissimus muscle of young bulls and steers. J. Anim. Sci. 71, 1471-1476.

Mrode R. A. (2014). Linear models for the prediction of animal breeding values, Cabi. 17

Mund T. and Pelham H. R. B. 2010. Regulation of PTEN/Akt and MAP kinase signaling pathways by the ubiquitin ligase activators Ndfip1 and Ndfip2. Proc. Natl. Acad. Sci. 107, 11429-11434.

Nkrumah J. D., Sherman E. L., Li C., Marques E., Crews D. H., Bartusiak R. et al. 2007. Primary genome scan to identify putative quantitative trait loci for feedlot growth rate, feed intake, and feed efficiency of beef cattle. J. Anim Sci. 85, 3170- 3181.

Patwari P., Emilsson V., Schadt E. E., Chutkow W. A., Lee S., Marsili A. et al. 2011. The arrestin domain-containing 3 protein regulates body mass and energy expenditure. Cell Metab. 14, 671-683.

Purcell S., Neale B., Todd-Brown K., Thomas L., Ferreira M. A. R., Bender D. et al. 2007. PLINK: a tool set for whole- genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559-575.

Raadsma H. W., Thomson P. C., Zenger K. R., Cavanagh C., Lam M. K., Jonas E. et al. 2009. Mapping quantitative trait loci (QTL) in sheep. I. A new male framework linkage map and QTL for growth rate and body weight. Genet. Sel. Evol. 41, 1-17.

Rogstam A., Linse S., Lindqvist A., James P., Wagner L. and Berggard T. 2007. Binding of calcium ions and SNAP-25 to the hexa EF-hand protein secretagogin. Biochemical Journal. 401, 353-363.

Santana M. H. A., Utsunomiya Y. T., Neves H. H. R., Gomes R. C., Garcia J. F., Fukumasu H. et al. 2014. Genome-wide association study for feedlot average daily gain in Nellore cattle (Bos indicus). J. Anim. Breed. Genet. 131, 210-216.

SAS Institute. 2002. SAS User’s Guide v. 9.1. Statistics. SAS Institute, Inc., Cary, NC.

Scott J., Chant D., Andrews G., Martin G. and McGrath J. 2007. Association between trauma exposure and delusional experiences in a large community-based sample. Brit. J. Psychiat. 190, 339-343.

Sherman E. L., Nkrumah J. D., Li C., Bartusiak R., Murdoch B. and Moore S. S. 2009. Fine mapping quantitative trait loci for feed intake and feed efficiency in beef cattle. J. Anim Sci. 87, 37-45.

Shui J. R. W., Hu M. C. T. and Tan T. H. 2007. Conditional knockout mice reveal an essential role of protein phosphatase 4 in thymocyte development and pre-T-cell receptor signaling. Mol. Cell. Biol. 27, 79-91.

Snelling W. M., Allan M. F., Keele J. W., Kuehn L. A., McDaneld T. P. L., Smith T. S. et al. 2010. Genome-wide association study of growth in crossbred beef cattle. J Anim Sci. 88, 837-848.

Tian X., Irannejad R., Bowman S. L., Du Y., Puthenveedu M. A., von Zastrow M. et al. 2016. The α-Arrestin ARRDC3 Regulates the Endosomal Residence Time and Intracellular Signaling of the β2-Adrenergic Receptor. J. Biol. Chem. 291, 14510-14525.

Tsai H. E., Wu J. C., Kung M. L., Liu L. F., Kuo L. H., Kuo H. M. et al. 2013. Up-regulation of hepatoma-derived growth factor facilities tumor progression in malignant melanoma. PloS one. 8, e59345.

Turner B. M. 2000. Histone acetylation and an epigenetic code. Bioessays 22, 836-845. van Unen J., Reinhard N. R., Yin T., Wu Y. I., Postma M., Gadella T. W. J. et al. 2015. Plasma membrane restricted RhoGEF activity is sufficient for RhoA-mediated actin polymerization. Sci. Rep. 5, 1-16.

Wang X., Inoue S., Gu J., Miyoshi E., Noda K., Li W. et al. 2005. Dysregulation of TGF-β1 receptor activation leads to abnormal lung development and emphysema-like phenotype in core fucose-deficient mice. P. Natl. Acad. Sci. USA. 102, 15791-15796.

Wonglapsuwan M., Miyazaki T., Loongyai W. and Chotigeat W. 2010. Characterization and biological activity of the ribosomal protein L10a of the white shrimp: Fenneropenaeus merguiensis De Man during vitellogenesis. Mar. Biotechnol. 12, 230-240.

18

Yazdi M. H., Engström G., Näsholm A., Johansson K., Jorjani H. and Liljedah L. E. 1997. Genetic parameters for lamb weight at different ages and wool production in Baluchi sheep. Anim. Sci. 65, 247-255.

Zanella R., Settles M. L., McKay S. D., Schnabel R., Taylor J., Whitlock R. H. et al. 2011. Identification of loci associated with tolerance to Johne’s disease in Holstein cattle. Anim. Genet. 42, 28-38.

Zaric J., Joseph J. M., Tercier S., Sengstag T., Ponsonnet L., Delorenzi M. et al. 2012. Identification of MAGI1 as a tumor-suppressor protein induced by cyclooxygenase-2 inhibitors in colorectal cancer cells. Oncogene. 31, 48-59.

Zhang L., Liu J., Zhao F., Ren H., Xu L., Lu J. et al. 2013. Genome-wide association studies for growth and meat production traits in sheep. PloS one. 8, e66569.

Table 1. Descriptive statistics of examined traits in Baluchi sheep.

Traits Mean Standard deviation Minimum Maximum Coefficient of variation (%) ADG(0-3) (g/day) 210.30 37.45 122.22 294.44 17.81 ADG(3-6) (g/day) 103.65 39.29 0 222.22 37.91 ADG(6-9) (g/day) 39.03 10.68 11.11 155.56 27.36 ADG(9-12) (g/day) 54.65 16.86 11.11 133.33 30.85 ADG(0-6) (g/day) 157 24.06 115.56 230.56 15.33 ADG(3-9) (g/day) 67.89 20.98 5.56 150 30.90 ADG(3-12) (g/day) 60.24 15.35 37.04 103.7 25.48 KR(0-3) 19.77 1.41 15.83 22.62 7.11 KR(3-6) 7.52 2.49 0 14.99 33.11 KR(6-9) 2.65 0.93 0.61 8.81 35.09 KR(9-12) 3.34 1.10 0.69 9.68 32.93 KR(0-6) 11.48 0.57 10.23 12.9 4.96 KR(3-9) 4.63 1.22 0.57 8.49 26.42 KR(3-12) 3.80 0.73 2.33 5.32 19.32

Table 2. Significant SNPs associated with ADG and KR.

Trait Chromosome SNP Position on The nearest Distance (bp)* P value ovine genome known ovine (bp) genes ADG(0-3) 6 OAR6_45438896 40,744,185 KCNIP4 +450981 4.53E-06 2 OAR2_206970153 195,475,557 ARHGEF40 -53760 8.81E-06 7 OAR7_31307164 27,580,032 ZNF770 +134185 1.60E-05

ADG(3------6) ADG(6-9) 19 OAR19_37632125 35,906,837 MAGI1 Within 2.95E-06 5 OAR5_96992290 88,884,839 ARRDC3 +674810 2.15E-05 19 OAR19_55346948 52,363,666 SETD2 -10856 2.70E-05

ADG(9-12) 19 OAR19_28027752 26,601,647 CHL1 +586566 1.85E-05 19 s10229 28,611,845 PPP4R2 -60256 2.08E-05 19 OAR19_32337762 30,686,012 FOXP1 +33949 3.78E-05

ADG(0-6) 7 OAR7_81656552 74,682,974 FUT8 Within 5.90E-06 10 s25185 55,154,960 NDFIP2 +22346 3.07E-05

ADG(3-9) 7 OAR7_31307164 27,580,032 ZNF770 +134185 8.28E-06

ADG(3-12) 17 OAR17_54155064 49,699,518 RPL10A +163741 3.19E-05 17 OAR17_53158609 48,863,156 TMEM132B -58917 4.02E-05

KR(0-3) 6 OAR6_45438896 40,744,185 KCNIP4 +450981 1.11E-05

KR(3-6) 3 OAR3_19747277 18,237,509 MBOAT2 Within 9.64E-06

19

KR(6-9) 19 OAR19_37632125 35,906,837 MAGI1 Within 1.11E-06 5 OAR5_96992290 88,884,839 ARRDC3 +674810 2.20E-05

KR(9-12) 19 s10229 28,611,845 PPP4R2 -60256 9.14E-06 19 OAR19_32337762 30,686,012 FOXP1 +33949 1.25E-05 19 OAR19_28027752 26,601,647 CHL1 +586566 4.45E-05

KR(0-6) 20 s54659 31,096,082 SCGN -15825 1.50E-05 20 OAR20_38075792 34,761,868 HDGF +146168 3.89E-05 25 s17548 27,854,019 PSAP Within 4.94E-05 20 s31449 15,415,171 SCN4A +15896 5.59E-05 25 DU463400_341 11,434,110 ZP4 +823671 5.86E-05

KR(3-9) 7 OAR7_31307164 27,580,032 ZNF770 +134185 1.45E-07 7 OAR7_33307842 29,539,107 MEIS2 -25561 1.42E-05

KR(3-12) 6 OAR6_105936918 96,309,709 RASGEF1B +444933 2.04E-05 17 OAR17_54155064 49,699,518 RPL10A +163741 3.75E-05 *Positive value denotes the gene located downstream of SNP, negative value denotes the gene located upstream of SNP.

Table 3. BLAST results for the significant SNPs in sheep using the cow genome.

Trait SNP Position on cattle genome(bp) The nearest known Distance*(bp) cattle genes ADG(0-3) OAR6_45438896 6:42,276,632-42,312,069 KCNIP4 +439094 OAR2_206970153 2:83,228,859-83,270,681 SLC39A10 -1463321 OAR7_31307164 10:30,643,499-30,687,187 AQR +128635

ADG(6-9) OAR19_37632125 22:35,971,507-36,009,315 MAGI1 Within OAR5_96992290 7:93,923,406-93,972,003 ARRDC3 +670312 OAR19_55346948 22:52,975,982-53,011,087 NRADD -62531

ADG(9-12) OAR19_28027752 22:26,749,230-26,785,202 CHL1 +426711 s10229 22:28,689,233-28,729,658 SHQ1 -223318 OAR19_32337762 22:30,811,743-30,855,460 FOXP1 +8060

ADG(0-6) OAR7_81656552 10:78,087,337-78,131,491 FUT8 Within s25185 12:54,985,427-55,023,687 NDFIP2 Within

ADG(3-9) OAR7_31307164 10:30,643,499-30,687,187 AQR +128635

ADG(3-12 OAR17_54155064 17:52,308,527-52,356,727 AACS -617351 OAR17_53158609 17:50,278,560-50,318,248 SLC15A4 +912548

KR(0-3) OAR6_45438896 6:42,276,632-42,312,069 KCNIP4 +439094

KR(3-6) OAR3_19747277 11:88,403,379-88,442,397 MIR2298 +74315

KR(6-9) OAR19_37632125.1 22:35,971,507-36,009,315 MAGI1 Within OAR5_96992290 7:93,923,406-93,972,003 ARRDC3 +670312

KR(9-12) s10229 22:28,689,233-28,729,658 SHQ1 -223318 OAR19_32337762 22:30,811,743-30,855,460 FOXP1 +8060 OAR19_28027752 22:26,749,230-26,785,202 CHL1 +426711

KR(0-6) s54659 23:31,911,871-31,950,906 SCGN -1085 OAR20_38075792 23:35,620,421-35,623,882 MIR2284C -23238 s17548 28:28,131,667-28,172,244 CDH23 +8218 s31449 23:15,309,751-15,350,600 FOXP4 -41390 DU463400_341 28:11,557,684-11,591,301 CHRM3 -1254890

KR(3-9) OAR7_31307164 10:30,643,499-30,687,187 AQR +128635 OAR7_33307842 10:32,591,860-32,635,026 MEIS2 -8616

KR(3-12) OAR6_105936918 6:98,442,720-98,478,600 HNRNPD -438063 OAR17_54155064 17:52,308,527-52,356,727 AACS -617351 *Positive value denotes the gene located downstream of SNP, negative value denotes the gene located upstream of SNP.

20

Table 4. The results of QTL survey for regions with the significant SNPs in sheep using the cow genome

SNP QTL in cow Chromosome QTL peak location QTL reference OAR6_45438896 Body weight (mature) 6 88.78 (cM) McClure et al. (2010) Residual feed intake 6 101.0 (cM) Sherman et al. (2009)

OAR2_206970153 Body weight (birth) 2 114.2 (cM) McClure et al. (2010)

OAR7_31307164 Body weight (yearling) 10 44.25 (cM) McClure et al. (2010) OAR7_33307842

OAR19_37632125 Body weight (yearling) 22 82.93 (cM) McClure et al. (2010) ADG 22 66.3 (cM) Nkrumah et al. (2007)

OAR19_28027752 Body weight (yearling) 22 54.05 (cM) McClure et al. (2010) s10229 OAR19_32337762

s54659 Body weight (weaning) 23 58.19 (cM) McClure et al. (2010) ADG 23 55.2 (cM) Kneeland et al. (2004) Residual feed intake 23 63.4 (cM) Sherman et al. (2009)

OAR20_38075792 Body weight (weaning) 23 58.19 (cM) McClure et al. (2010) Body weight (birth) 23 71.64 (cM) McClure et al. (2010)

s17548 Body weight (weaning) 28 49.39 (cM) McClure et al. (2010)

s31449 Body weight (weaning) 23 26.51 (cM) McClure et al. (2010)

DU463400_341 Body weight (birth) 28 24.77 (cM) McClure et al. (2010) Body weight (weaning) 10 44.25 (cM) McClure et al. (2010)

Titles and legends to figures

Figure 1. Quantile-quantile (Q-Q) plots of genome-wide association results for ADG0-3, ADG6-9, KR3-9, KR6-9. X and Y axes are the expected p-values under null hypothesis and the observed p-values, respectively.

21

Figure 2. Manhattan plot of -log10(p-values) for association of SNP loci with ADG0-3, ADG6-9, KR3-9, KR6-9. Genome- wide (p < 0.05) threshold is represented as a dashed red line and suggestive threshold as a dashed blue line.

22

23

24