
Runs of homozygosity reveal genome-wide autozygosity in Italian sheep breeds

Journal: Animal Genetics
Manuscript ID: AnGen-17-04-0069.R3
Manuscript Type: Full Paper

Date Submitted by the Author: n/a

Complete List of Authors:
Mastrangelo, Salvatore; University of Palermo, Scienze Agrarie e Forestali
Ciani, Elena; University of Bari, Biosciences, Biotechnologies and Pharmacological Sciences
Sardina, Maria Teresa; University of Palermo, Scienze Agrarie e Forestali
Sottile, Gianluca; University of Palermo, Scienze Economiche, Aziendali e Statistiche
Pilla, Fabio; University of Molise, Agricoltura Ambiente Alimenti
Portolano, Baldassare; University of Palermo, Scienze Agrarie e Forestali

Keywords: Runs of homozygosity, genomic inbreeding, Italian sheep breeds, Single Nucleotide Polymorphism


S. Mastrangelo1*, E. Ciani2, M. T. Sardina1, G. Sottile3, F. Pilla4, B. Portolano1, and the Bi.Ov.Ita Consortium

1 Dipartimento Scienze Agrarie, Alimentari e Forestali, University of Palermo, 90128 Palermo, Italy
2 Dipartimento di Bioscienze, Biotecnologie e Biofarmaceutica, University of Bari, 70124 Bari, Italy
3 Dipartimento Scienze Economiche, Aziendali e Statistiche, University of Palermo, 90128 Palermo, Italy
4 Dipartimento Agricoltura, Ambiente e Alimenti, University of Molise, 86100 Campobasso, Italy

Corresponding author
Salvatore Mastrangelo
[email protected]


Summary

The availability of dense single-nucleotide polymorphism (SNP) assays allows for the determination of autozygous segments based on runs of consecutive homozygous genotypes (ROH). The aim of the present study was to investigate the occurrence and the distribution of ROH in 21 Italian sheep breeds using medium-density SNP genotypes, in order to characterize autozygosity and identify genomic regions that frequently appeared in ROH within individuals, namely ROH islands. After filtering, the final numbers of animals and SNPs retained for analyses were 502 and 46 277, respectively. A total of 12 302 ROH were identified. The mean number of ROH per breed ranged from 10.58 (Comisana) to 44.54 (Valle del Belice). The average length of ROH across breeds was 4.55 Mb and ranged from 3.85 Mb (Biellese) to 5.51 Mb (Leccese). Valle del Belice showed the highest value of inbreeding on the basis of ROH (F_ROH) (0.099), whereas Comisana showed the lowest one (0.016), and high standard deviation values revealed high variability in autozygosity levels within each breed. Differences also existed in the length of ROH. Analysis of the distribution of ROH according to their size showed that, for all breeds, the majority of the detected ROH were less than 10 Mb in length, with few long ROH >25 Mb. The levels of ROH that we estimated here reflect the inbreeding history of the investigated sheep breeds. These results also highlight that ancient and recent inbreeding have had an impact on the genome of the Italian sheep breeds and suggest that several animals experienced recent autozygosity events. Comisana and Bergamasca appeared as the least consanguineous breeds, whereas Barbaresca, Leccese and Valle del Belice showed ROH patterns that are typically produced by recent inbreeding. Moreover, within the genomic regions most commonly associated with ROH, several candidate genes were detected.

Introduction

The abundance of single-nucleotide polymorphisms (SNPs) throughout the genome and the availability of dense SNP assays make these markers particularly suitable for the detection of genomic regions where a reduction in heterozygosity has occurred (Kim et al. 2013). Runs of homozygosity (ROH) are contiguous segments of homozygous genotypes that are present in an individual because its parents transmitted identical haplotypes to it (Gibson et al. 2006). In animal genetics, ROH are used as predictors of whole-genome inbreeding levels (Purfield et al. 2012; Marras et al. 2015; Mastrangelo et al. 2016), to characterize the genomic distribution of inbreeding depression on a phenotype (Biscarini et al. 2014; Pryce et al. 2014), and to identify genes associated with traits of economic interest present in these regions (Zhang et al. 2015; Szmatola et al. 2016; Purfield et al. 2017). Selection pressure and mating schemes can be disentangled by ROH; therefore, ROH detection can also be used to minimize inbreeding and to improve mating systems (Bosse et al. 2012; Zhang et al. 2015). Currently, whole-genome inbreeding estimated from ROH (F_ROH) is considered the most powerful method of detecting inbreeding effects among several alternative inbreeding estimates, and it allows recent and ancient inbreeding to be distinguished (Keller et al. 2011). Moreover, F_ROH outperforms inbreeding coefficients estimated from pedigree data (Purfield et al. 2012) and does not depend on allele frequencies.

ROH are not randomly distributed across the genome. Bosse et al. (2012) examined different European and Asian pig populations and found that ROH size and abundance in the genome varied considerably among breeds, and that animals of the same population showed similar ROH patterns in their genomes. They also demonstrated that demography and recombination have a major effect on the genomic distribution of ROH, while selection might have a secondary role. Other authors (Zhang et al. 2015) showed that ROH are likely the result of selection events and are not attributable only to demographic history. Recently, Purfield et al. (2017) reported a significant moderate correlation between the occurrence of a SNP in a ROH and two different statistical approaches (F_ST and hapFLK) used to identify putative selection signatures (F_ST vs. SNP in a ROH: 0.25, P < 0.0001; -log10 hapFLK p-value vs. SNP in a ROH: 0.37, P < 0.0001). In fact, genomic regions subjected to selection frequently show signatures such as reduced nucleotide diversity and increased homozygosity around the selected locus compared to the rest of the genome (Pemberton et al. 2012). Moreover, as selection is often characterized by a local reduction in haplotype diversity, the distribution of ROH patterns across the genome can inform on genomic regions that have potentially been subjected to recent and/or ancient selective pressure (Kim et al. 2013; Pryce et al. 2014; Purfield et al. 2017; Kukučková et al. 2017).

However, most studies so far have focused on cattle and pigs, and only a few have concerned other livestock species such as sheep (Al-Mamun et al. 2015; Purfield et al. 2017). Therefore, studies on a wider range of species and breeds are needed to better characterize the genomic distribution of ROH and its relationship with neutral and selective factors (Peripolli et al. 2016). The aim of the present study was to investigate the occurrence and the distribution of ROH in Italian sheep breeds using medium-density SNP genotypes, in order to characterize autozygosity and identify genomic regions with high ROH frequencies, namely ROH islands. The identification and characterization of ROH can provide insights into how population history, structure and demography have evolved over time.

Materials and methods

Genotyping

Genotyping data from the Illumina OvineSNP50 panel for 476 animals from 20 sheep breeds were available from the Italian Sheep Consortium (BIOVITA) (Ciani et al. 2014), whereas SNP genotypes of 30 animals of the Barbaresca breed were available from a recent study on this breed (Mastrangelo et al. 2017).
These data were merged, and the whole data set consisted of 506 animals from 21 Italian sheep breeds genotyped for 49 034 SNPs spread over all chromosomes. Chromosomal coordinates for each SNP were obtained from the ovine genome assembly 3.1 (OAR3.1). Markers were filtered to exclude loci assigned to unmapped contigs. Only SNPs located on autosomes were considered for further analyses. Moreover, the following filtering parameters were adopted to generate the pruned input file: (i) SNPs with a call rate of less than 95% were removed; (ii) SNPs with a minor allele frequency (MAF) of less than 1% were removed; and (iii) animals with more than 2% of missing genotypes were removed. File editing was carried out using PLINK (Purcell et al. 2007).

Genetic relationships amongst breeds

Pairwise genetic relationships were estimated using Identity by State (IBS) genetic distances calculated by PLINK (Purcell et al. 2007) and graphically represented by multidimensional scaling (MDS) analysis.

Runs of homozygosity definition

Runs of homozygosity were estimated for each animal using PLINK (Purcell et al. 2007). No pruning was performed based on linkage disequilibrium (LD), but the minimum length of a ROH was set to 1 Mb, to exclude short ROH deriving from LD. The following criteria were used to define the ROH: (i) one missing SNP and up to one heterozygous genotype were allowed in the ROH; (ii) the minimum number of SNPs that constituted the ROH was set to 30; (iii) the minimum SNP density per ROH was set to 1 SNP every 100 kb; (iv) the maximum gap between consecutive homozygous SNPs was set to 250 kb.

The mean number (MN_ROH) and the average length (AL_ROH) of ROH per individual per breed, and the sum of all ROH segments per animal, were estimated. The inbreeding coefficient on the basis of ROH (F_ROH) for each animal was calculated as follows:

F_ROH = L_ROH / L_aut

where L_ROH is the total length of all ROH in the genome of an individual, and L_aut is the specified length of the autosomal genome covered by SNPs on the chip (2 452.06 Mb).

Each ROH was categorized based on its physical length into 1 to <5 Mb, 5 to <10 Mb, 10 to <15 Mb, 15 to <20 Mb, 20 to <25 Mb, 25 to <30 Mb and ≥30 Mb. For each of the ROH length categories, the mean sum of ROH per breed was calculated by summing all ROH per animal in that category and averaging this per breed.

The genomic inbreeding coefficient based on the difference between the observed and expected numbers of homozygous genotypes (F_HOM) was also estimated using PLINK (Purcell et al. 2007). Pearson's correlation between the two measures of inbreeding was calculated.

Detection of common runs of homozygosity

To identify the genomic regions most commonly associated with ROH for the metapopulation, and for groups defined on the basis of production purposes (milk, meat, wool and "milk and meat"), the percentage of occurrences of each SNP in ROH was calculated by counting the number of times the SNP was detected in those ROH, and this was plotted against the position of the SNP along the chromosome (OAR). The genomic regions most commonly associated with ROH were identified by selecting the top 1% of the SNPs most commonly observed in ROH (Szmatola et al. 2016; Purfield et al. 2017). Adjacent SNPs with a proportion of ROH occurrences over the adopted threshold formed genomic regions, called ROH islands.

Genomic coordinates for all the identified ROH islands were used for the annotation of genes that were fully or partially contained within each selected region using the UCSC Genome Browser (http://genome.ucsc.edu/). Finally, to investigate the biological function of each annotated gene contained in ROH islands, a thorough literature survey was conducted.

Results and discussion

We analyzed animals from 21 Italian sheep breeds with different inbreeding backgrounds, selection histories and production purposes (Ciani et al. 2014; Mastrangelo et al. 2017). After filtering, the final numbers of animals and SNPs retained for analyses were 502 and 46 277, respectively. We used an MDS plot of the pairwise IBS distances to explore in detail the relationships among the Italian sheep breeds. The results showed that most sheep breeds formed non-overlapping clusters and were clearly separate populations (Figure S1). In particular, the first dimension (C1) distinguished Barbaresca from the other breeds, as also reported in a previous study (Mastrangelo et al. 2017).

The abundance, length and genomic distribution of ROH constitute a valuable source of information about the demographic history of livestock species (Bosse et al. 2012). A total of 12 302 ROH were identified in the investigated breeds: 7 846 ROH <5 Mb, 3 327 ROH ranging from 5 to 10 Mb and 1 129 ROH >10 Mb. In order to verify their genomic distribution, ROH were grouped according to their size into two classes (long ROH, >5 Mb; short ROH, <5 Mb), and these were depicted on the basis of their position along the chromosome (OAR). Although ROH were distributed along the whole length of the chromosomes, long ROH were mostly found in regions with low recombination (in the middle of the chromosome), whereas short ROH tended to be more frequent towards the telomeres (Fig. 1). These results were consistent with the genomic distribution observed by other authors (Bosse et al. 2012; Purfield et al. 2017).

Among the 502 animals, 484 (96.4%) had at least one ROH longer than 1 Mb. Descriptive statistics for ROH in each breed are reported in Table 1. The mean number of ROH per breed ranged from 10.58 (Comisana) to 44.54 (Valle del Belice). A similar range of values was reported by Al-Mamun et al. (2015) in a study on Australian sheep breeds. The average length of ROH across breeds was 4.55 Mb and ranged from 3.85 Mb (Biellese) to 5.51 Mb (Leccese). A recent study on goats reported an average length of 6.28 Mb (Manunza et al. 2016), whereas Kim et al. (2013) observed in cattle a mean ROH length per animal of about 6 Mb.
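As an illustration of the summary statistics defined in Materials and methods, the following minimal sketch computes MN_ROH, AL_ROH and F_ROH for one breed from a list of ROH lengths per animal. The function name, the input layout and the toy data are hypothetical; only the autosomal length (2 452.06 Mb) and the formula F_ROH = L_ROH / L_aut come from the text. AL_ROH is taken here as the mean over all ROH of the breed, which is an assumption about how the average was formed.

```python
# Autosomal length covered by SNPs (Mb), as given in Materials and methods.
L_AUT_MB = 2452.06


def summarise_breed(roh_by_animal):
    """Summarise ROH for one breed.

    roh_by_animal: dict mapping animal ID -> list of ROH lengths in Mb
    (an empty list for animals without any ROH).
    """
    n_animals = len(roh_by_animal)
    all_lengths = [length for lengths in roh_by_animal.values() for length in lengths]
    # F_ROH per animal: total ROH length divided by the autosomal length.
    f_roh = {animal: sum(lengths) / L_AUT_MB for animal, lengths in roh_by_animal.items()}
    return {
        "MN_ROH": len(all_lengths) / n_animals,  # mean number of ROH per animal
        "AL_ROH": sum(all_lengths) / len(all_lengths) if all_lengths else 0.0,
        "F_ROH": f_roh,
        "mean_F_ROH": sum(f_roh.values()) / n_animals,
    }


# Toy data (hypothetical): three animals, one of them without ROH.
toy_breed = {"animal_1": [2.0, 6.0], "animal_2": [4.0], "animal_3": []}
summary = summarise_breed(toy_breed)
print(summary["MN_ROH"], summary["AL_ROH"], summary["mean_F_ROH"])
```

Animals without ROH are kept in the denominator, so they lower MN_ROH and the mean F_ROH, as expected for the 18 animals in which no ROH was detected.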
The AL_ROH values showed low variation among Italian sheep breeds, indicating that this value is not a good descriptor of ROH, as reported by other authors (Marras et al. 2015). The distributions of F_ROH (>1 Mb) for each breed are reported in Fig. 2 and the mean values in Table 1. The Valle del Belice breed showed the highest value of F_ROH >1 Mb (0.099), whereas Comisana showed the lowest one (0.016), in agreement with the MN_ROH results. However, the high standard deviation values revealed high variability in autozygosity levels within each breed. Consistent with our results, Kim et al. (2015) found, in sheep and goats, F_ROH values ranging from 0.02 to 0.10 and from 0.02 to 0.09, respectively. In general, dairy sheep breeds (such as Leccese, Massese and Valle del Belice) showed the highest MN_ROH and F_ROH values. Similarly, Marras et al. (2015), in a study on five Italian cattle breeds, reported the highest number of ROH in dairy breeds and the lowest values in beef breeds.

The total number of ROH and the total length of genome in ROH for each individual per breed are shown in Fig. 3. The majority of the individuals clustered close to the origin of the coordinates, because each carried from 1 to 20 ROH with a total length of less than 200 Mb (Fig. 3). For several breeds, there were individuals that carried a large number of ROH (from 40 to 100) with a total length of 300 to 600 Mb, and some extreme individuals with more than 600 Mb of their autosomes covered by ROH, equivalent to almost one fourth of their genome. Consistent with our results, Manunza et al. (2016), in a genome-wide analysis of Spanish goat breeds, reported similar distributions of ROH, and Purfield et al. (2017) showed that in certain individuals ROH covered 768.65 Mb of the autosomes.

Breeds differed in terms of ROH length categories (Fig. 4). Analysis of the distribution of ROH according to their size showed that, for all breeds, the majority of the detected ROH were less than 10 Mb in length, with few long ROH >25 Mb. Similar results were reported in another study on meat sheep breeds (Purfield et al. 2017). Barbaresca, Delle Langhe and Valle del Belice had a larger mean portion of their genome (79.51 Mb, 78.46 Mb and 86.14 Mb, respectively) covered by shorter ROH (1-5 Mb), in agreement with the MN_ROH results.
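The per-category coverage discussed above can be sketched as follows: each ROH is assigned to one of the length classes listed in Materials and methods, the lengths are summed per class, and the sums are averaged over the animals of a breed. The function names, class labels and toy data are hypothetical and only illustrate the calculation; the class boundaries are those given in the text.

```python
from bisect import bisect_right

# Lower bounds (Mb) of the ROH length classes given in Materials and methods.
CLASS_BOUNDS = [1, 5, 10, 15, 20, 25, 30]
CLASS_LABELS = ["1-<5", "5-<10", "10-<15", "15-<20", "20-<25", "25-<30", ">=30"]


def length_class(length_mb):
    """Return the label of the length class that contains length_mb."""
    if length_mb < CLASS_BOUNDS[0]:
        raise ValueError("ROH shorter than the 1 Mb minimum length")
    # bisect_right counts how many lower bounds are <= length_mb.
    return CLASS_LABELS[bisect_right(CLASS_BOUNDS, length_mb) - 1]


def mean_sum_per_class(roh_by_animal):
    """Mean over the animals of one breed of the summed ROH length (Mb) per class."""
    totals = {label: 0.0 for label in CLASS_LABELS}
    for lengths in roh_by_animal.values():
        for length in lengths:
            totals[length_class(length)] += length
    n_animals = len(roh_by_animal)
    return {label: total / n_animals for label, total in totals.items()}


# Toy data (hypothetical): ROH lengths in Mb for two animals of one breed.
toy_breed = {"animal_1": [2.0, 3.0, 12.5], "animal_2": [5.0, 31.0]}
coverage = mean_sum_per_class(toy_breed)
print(coverage)
```

A ROH of exactly 5 Mb falls in the 5 to <10 Mb class, matching the half-open intervals used in the text.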
The coverage ranged from 26.68 Mb (Comisana) to 66.63 Mb (Leccese) in the remaining breeds. No animals of the Comisana breed had ROH >15 Mb, whereas eight breeds showed ROH >30 Mb and, among these, Delle Langhe, Leccese and Valle del Belice had the highest mean portion of their genome classified as ROH (>2.51 Mb) in this category. These results highlight that both ancient and recent inbreeding have had an impact on the genome of the Italian sheep breeds. ROH length is negatively correlated with time of coancestry. In fact, because recombination events interrupt long chromosome segments, long ROH (~10 Mb) arise as a result of recent inbreeding (up to five generations ago). In contrast, short ROH (~1 Mb) are produced by identical-by-descent genomic regions from old ancestors and are indicative of more ancient relatedness (up to 50 generations ago) (Howrigan et al. 2011), which is frequently unaccounted for in an individual's recorded pedigree. These results suggest that animals involved in this study experienced recent autozygosity events, and indicate that these breeds have not recently been extensively crossed with other breeds; otherwise, the long ROH would have been broken down (Mastrangelo et al. 2016).

The levels of ROH that we estimated here reflect the inbreeding history of the investigated sheep breeds. In Italy, during the last 60 years, the broad use of [...] has led to the decline of many Italian local breeds (Ciani et al. 2014). On the basis of ROH analyses, Comisana and Bergamasca appeared as the least consanguineous breeds, whereas Barbaresca, Leccese and Valle del Belice showed ROH patterns that are typically produced by recent inbreeding or strong population bottlenecks. Generally, natural mating is the common practice for local sheep breeds, and the exchange of animals among flocks is quite unusual, which increases inbreeding within the population through uncontrolled mating of related individuals (Mastrangelo et al. 2012). Therefore, the results reflect a breed management that does not always avoid the mating of related animals, whereas the low abundance of long ROH in Comisana and Bergamasca reflects proper breed management and a sufficient effective population size (N_e).
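The relationship between ROH length and time of coancestry mentioned above can be made concrete with the commonly used approximation that a segment inherited identical by descent from a common ancestor g generations back has an expected length of 100/(2g) cM. This formula is not stated explicitly in the text and is used here as an assumption, together with a rough average of 1 cM per Mb; the function names are hypothetical.

```python
def expected_roh_length_cm(generations):
    """Expected length (cM) of an IBD segment from a common ancestor
    `generations` generations back, using the 100 / (2g) approximation."""
    if generations <= 0:
        raise ValueError("generations must be positive")
    return 100.0 / (2 * generations)


def generations_from_length_mb(length_mb, cm_per_mb=1.0):
    """Approximate number of generations back to the common ancestor for a
    ROH of the given physical length. cm_per_mb = 1.0 is an assumed
    genome-wide average recombination rate, not a value from the text."""
    if length_mb <= 0:
        raise ValueError("length_mb must be positive")
    return 100.0 / (2 * length_mb * cm_per_mb)


for g in (5, 50):
    print(g, "generations ->", expected_roh_length_cm(g), "cM")
print("10 Mb ->", generations_from_length_mb(10.0), "generations")
```

Under these assumptions, ROH of about 10 Mb correspond to common ancestors roughly five generations back and ROH of about 1 Mb to roughly 50 generations back, in line with the values quoted from Howrigan et al. (2011).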
The lower ROH abundance found in Comisana and Bergamasca is consistent with the results reported by Ciani et al. (2014), who showed a larger N_e in these two breeds (571 and 557, respectively) compared to the other Italian breeds. On the other hand, the ROH patterns reported for Barbaresca, Leccese and Valle del Belice suggest a smaller N_e, as confirmed by previous studies on these breeds, which described N_e values of 183, 512 and 340, respectively (Ciani et al. 2014; Mastrangelo et al. 2017). Moreover, the ROH patterns also reflect the historical demography of the breeds that have gone through demographic declines, such as Barbaresca and Leccese, which experienced a dramatic census contraction. The connection between the current population size and ROH patterns for each breed was also analyzed. A regression plot of N_e (50 generations ago) (Ciani et al. 2014; Mastrangelo et al. 2017) against the mean number (MN_ROH) and the average length (AL_ROH) of ROH for each breed is reported in Figure S2. An association between MN_ROH and AL_ROH versus N_e was found. The association was tested using the test statistic based on Pearson's correlation coefficient. Both variables were negatively correlated with N_e: we obtained a correlation of -0.498 (p-value 0.022) using MN_ROH and a correlation of -0.555 (p-value 0.009) using AL_ROH. Therefore, the number and size of ROH are in agreement with the occurrence of bottlenecks, founder effects and inbreeding (Bosse et al. 2012). Among these three breeds, the situation of Valle del Belice is less critical because it has a larger census (~215 000) than Barbaresca (1 300) and Leccese (~800). Since excessive inbreeding can reduce the long-term fitness of a population, it is necessary to preserve the given population and prevent a further increase of inbreeding, especially in endangered breeds such as Barbaresca (Mastrangelo et al. 2017).

The average genomic inbreeding coefficients based on the difference between the observed and expected numbers of homozygous genotypes (F_HOM) were also estimated (Table 1). A high correlation was found between F_HOM and F_ROH in all breeds, ranging from 0.863 to 0.996. These results corroborate previous results observed in cattle (Zhang et al. 2015; Mastrangelo et al. 2016) and in sheep (Purfield et al. 2017). Pedigree data are not available for the animals considered in this study, but the strong correlation between the pedigree-based inbreeding coefficient and the sum of ROH previously reported by several authors (Keller et al. 2011; Purfield et al. 2012) indicates that the estimates of F_ROH are a good reflection of IBD and suggests that, in the absence of pedigree data, as may be the case for most local breeds, the ROH coverage may be used to infer aspects of recent population history, even when using relatively few samples.
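The p-values reported above for the correlations between N_e and the ROH statistics can be checked with the usual t-test for a Pearson correlation coefficient, t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. The sketch below assumes n = 21 (one data point per breed), which the text implies but does not state, and integrates the Student t density numerically with the standard library only; all function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def pearson_two_sided_p(r, n, steps=20000):
    """Return (t, p) for H0: rho = 0, given a sample correlation r from n pairs.

    The central probability P(-|t| < T < |t|) is obtained with Simpson's rule
    on [0, |t|] (steps must be even) and doubled by symmetry.
    """
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_central = total * h / 3
    return t, 1 - 2 * half_central


# Correlations reported in the text; n = 21 breeds is an assumption.
t_mn, p_mn = pearson_two_sided_p(-0.498, 21)
t_al, p_al = pearson_two_sided_p(-0.555, 21)
print(round(t_mn, 2), round(p_mn, 3))
print(round(t_al, 2), round(p_al, 3))
```

With 19 degrees of freedom, both correlations give two-sided p-values that agree with the reported 0.022 and 0.009 after rounding, which supports reading the printed coefficients as negative values.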

To identify the genomic regions that were most commonly associated with ROH in all the breeds, the top 1% of SNPs with the highest number of occurrences was chosen as an indication of a possible ROH hotspot in the genome (Fig. 5). ROH frequencies varied widely across positions in the genome. The largest extent of overlapping ROH positions among all sheep breeds was detected on OAR6, in a genomic region spanning positions from 41.39 to 42.93 Mb. Considering that the regions harbouring ROH often differ among breeds (Bosse et al. 2012; Zhang et al. 2015), the top 1% of SNPs was also plotted on the basis of production purposes (milk, meat, wool and "milk and meat") (Figure S3). These adjacent SNPs were merged into genomic regions and considered as indicators of potential autozygosity islands. This approach resulted in the identification of several genomic regions that are frequently homozygous (Table 2). Chromosome 6 in dual-purpose breeds had the ROH segment with the highest peak (Figure S3d), which consisted of 32 SNPs with an occurrence in ROH of ~45% and a length of 1.54 Mb. Analyses of human (Gibson et al. 2006) and cattle (Purfield et al. 2012) ROH have previously established a correlation between extensive LD and a high incidence of ROH. When interrogated with HAPLOVIEW (Barrett et al. 2005), the majority of SNPs in this genomic region had a low level of LD, with an average r^2 value of 0.15 (Figure S4). Moreover, because strong LD, typically extending up to about 100 kb, is common throughout the sheep genome (Mastrangelo et al. 2014), short tracts of homozygosity are very prevalent. To exclude these short and very common ROH, the minimum length for ROH was set at >1 Mb.
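The island detection described above can be sketched as follows: count, for each SNP, the percentage of animals in which it lies inside a ROH, take the occurrence of the SNP ranked at the top 1% as the threshold, and merge adjacent SNPs above it into islands. This is a simplified illustration on a single chromosome with hypothetical names and toy data; in the study the threshold was applied to the whole SNP set, and ties at the threshold may select slightly more than 1% of the SNPs here.

```python
import math


def roh_islands(positions, roh_segments, n_animals, top_fraction=0.01):
    """Detect ROH islands on one chromosome.

    positions: sorted SNP positions (bp).
    roh_segments: list of (start_bp, end_bp), one entry per ROH per animal.
    Returns (percentage of animals with each SNP in a ROH, list of islands as
    (start_bp, end_bp, n_snps)).
    """
    pct = [
        100.0 * sum(1 for start, end in roh_segments if start <= pos <= end) / n_animals
        for pos in positions
    ]
    # Occurrence of the SNP ranked at the top `top_fraction` of all SNPs.
    n_top = max(1, math.ceil(top_fraction * len(positions)))
    threshold = sorted(pct, reverse=True)[n_top - 1]

    islands, current = [], None
    for pos, value in zip(positions, pct):
        if value >= threshold and value > 0:
            if current is None:
                current = [pos, pos, 1]      # open a new island
            else:
                current[1] = pos             # extend the open island
                current[2] += 1
        elif current is not None:
            islands.append(tuple(current))   # close the island
            current = None
    if current is not None:
        islands.append(tuple(current))
    return pct, islands


# Toy data (hypothetical): 200 SNPs every 10 kb and three animals with one ROH each.
toy_positions = [i * 10_000 for i in range(200)]
toy_roh = [(400_000, 600_000), (500_000, 510_000), (495_000, 515_000)]
toy_pct, toy_islands = roh_islands(toy_positions, toy_roh, n_animals=3)
print(toy_islands)
```

In the toy data only the two SNPs at 500 kb and 510 kb are covered by all three ROH, so they form the single island returned by the function.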
Therefore, although there was a significant relationship between increased regional LD and the incidence of ROH islands, their existence was not easy to explain on the basis of LD alone (Nothnagel et al. 2009).

Chromosome position, number of SNPs, start and end position of ROH, and gene symbol for all annotated genes within the genomic regions of extended homozygosity are reported in Table S1. We found that some SNPs occurred in gene-poor regions. In fact, some of the identified genomic regions contain few annotated genes or only uncharacterized genes (Table S1). The majority of the genes were involved in multiple signaling and signal transduction pathways in a wide variety of cellular and biological processes. In what follows, we focus our discussion on the few most relevant genes within ROH previously identified as important for the four production purposes.

Genomic regions displaying autozygosity in meat sheep breeds are linked mostly with several candidate genes related to meat production traits (Table 2), such as: SST, which influences feed conversion, muscle growth and adiposity traits (Jin et al. 2011); ADIPOQ, involved in the control of fat metabolism and insulin sensitivity (Zhang et al. 2014); KCNMB3, PIK3CA and ZMAT3, which were found within a selective sweep region containing a QTL for body length, withers height and hip width in beef cattle (Makina et al. 2015); TTN, which plays key roles in skeletal muscle structural organization and in meat production traits in pigs (Braglia et al. 2013); SDC1, associated with growth traits in cattle (Sun et al. 2010); and GHR, a gene associated with growth, fat deposition and meat production (Di Stasio et al. 2005). The regions displaying autozygosity in dairy sheep breeds were linked mostly with genes involved in reproduction, such as the CCNB2 gene, which affects oocyte development (Wang et al. 2015). Other candidate genes detected within these ROH islands were TYRP1 and BNC2, which are involved in controlling coat pigmentation in sheep; moreover, the TYRP1 gene has a homolog on bovine chromosomes 2 and 8, on which the location of QTL for milk traits has been reported (Ashwell et al. 1997). PLIN2 is related to triacylglycerol synthesis and secretion in mammary gland epithelial cells of dairy goats (Shi et al. 2013).
For most of these breeds (Altamurana, Delle Langhe, Massese, Pinzirita, Sardinian Black, Valle del Belice) the breeding program for milk production is limited, which was clearly reflected by the distribution of autozygosity islands. In dual-purpose breeds (meat and milk), several genes associated with specific traits related to meat and milk production were detected: the PRKCI, PHC3 and GPR160 genes, associated with tail type in sheep (Yuan et al. 2016); LAP3 and FAM184B, reported as candidate genes for carcass or growth traits in cattle (Xia et al. 2017), with LAP3 also associated with milk production traits (Zheng et al. 2011); the NCAPG gene, related to fetal growth, body frame size (Eberlein et al. 2009) and milk production traits (Weikard et al. 2012) in cattle; and PPARGC1A, associated with milk fat yield (Weikard et al. 2005). For wool breeds, we did not identify within ROH islands relevant candidate genes that are known to affect these specific traits. We did, however, detect the SH3BP5L gene, which is known to interact with YWHAZ, a gene that was significantly associated with sheep fiber diameter (Wang et al. 2014). Other known genes detected in ROH are involved in reproduction (GDF9) (An et al. 2013), fecundity (SETDB2) (Lai et al. 2016) and maternal behavior (ADRA2B) (Michenet et al. 2016). Therefore, for most genes associated with ROH islands, a biological link to traits such as milk yield and composition, growth, fat deposition and meat production, reproduction and coat color, which are known to be under selection, can be hypothesized. Considering that genomic regions under selection tend to generate ROH islands, these hotspots can be regarded as indicators of regions of the genome that are under positive selection.

Conclusions

To the best of our knowledge, this study is the first effort to describe the occurrence and the distribution of ROH in the genome of several Italian sheep breeds. The levels of ROH that we estimated here reflect the inbreeding history of the investigated sheep breeds. These results showed that ancient and recent inbreeding have had an impact on the genome of the Italian sheep breeds. The identification of ROH improves the assessment of the impact of the current active breeding programs on inbreeding. Therefore, to increase breed performance and reveal how selection shapes the genome, it is essential to dissect the genetic architecture of each population.
Conflicts of interest

The authors declare they do not have any conflicts of interest.

Authors' contributions

SM conceived, planned and coordinated the study and drafted the manuscript; EC, FP and BP contributed to the generation of data; SM, MTS and GS analyzed the data and performed the statistical analysis. All authors discussed the results and made suggestions and corrections. All authors read and approved the final manuscript.

References

Al-Mamun H.A., Clark S.A., Kwan P. & Gondro C. (2015) Genome-wide linkage disequilibrium and genetic diversity in five populations of Australian domestic sheep. Genetics Selection Evolution 47, 90.

An X.P., Hou J.X., Zhao H.B. et al. (2013) Polymorphism identification in goat GNRH1 and GDF9 genes and their association analysis with litter size. Animal Genetics 44(2), 234–238.

Ashwell M.S., Rexroad C.E. Jr., Miller R.H., VanRaden P.M. & Da Y. (1997) Detection of loci affecting milk production and health traits in an elite US Holstein population using microsatellite markers. Animal Genetics 28, 216–222.

Barrett J.C., Fry B., Maller J. & Daly M.J. (2005) Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21, 263–265.

Biscarini F., Biffani S., Nicolazzi E.L., Morandi N. & Stella A. (2014) Applying runs of homozygosity to the detection of associations between genotype and phenotype in farm animals. Proceedings of the 10th World Congress of Genetics Applied to Livestock Production. Vancouver, BC, Canada.

Bosse M., Megens H.J., Madsen O., Paudel Y., Frantz L.A., Schook L.B., Crooijmans R.P. & Groenen M.A. (2012) Regions of homozygosity in the porcine genome: consequence of demography and the recombination landscape. PLoS Genetics 8, e1003100.

Braglia S., Davoli R., Zappavigna A., Zambonelli P., Buttazzoni L., Gallo M. & Russo V. (2013) SNPs of MYPN and TTN genes are associated to meat and carcass traits in Italian large white and Italian duroc pigs. Molecular Biology Reports 40(12), 6927–6933.

Ciani E., Crepaldi P., Nicoloso L. et al. (2014) Genome-wide analysis of Italian sheep diversity reveals a strong geographic pattern and cryptic relationships between breeds. Animal Genetics 45, 256–266.

Di Stasio L., Destefanis G., Brugiapaglia A., Albera A. & Rolando A. (2005) Polymorphism of the GHR gene in cattle and relationships with meat production and quality. Animal Genetics 36, 138–140.

Eberlein A., Takasuga A., Setoguchi K. et al. (2009) Dissection of genetic factors modulating fetal growth in cattle indicates a substantial role of the non-SMC condensin I complex, subunit G (NCAPG) gene. Genetics 183, 951–64.
Gibson J., Newton E.M. & Collins A. (2006) Extended tracts of homozygosity in outbred human populations. Human Molecular Genetics 15, 789–95.
Howrigan D.P., Simonson M.A. & Keller M.C. (2011) Detecting autozygosity through runs of homozygosity: a comparison of three autozygosity detection algorithms. BMC Genomics 12, 460.
Jin Q.J., Sun J.J., Fang X.T. et al. (2011) Molecular characterization and polymorphisms of the caprine Somatostatin (SST) and SST Receptor 1 (SSTR1) genes that are linked with growth traits. Molecular Biology Reports 38, 3129–3135.
Keller M., Visscher P. & Goddard M. (2011) Quantification of inbreeding due to distant ancestors and its detection using dense SNP data. Genetics 189, 237–49.
Kim E.S., Cole J.B., Huson H., Wiggans G.R., Van Tassell C.P., Crooker B.A., Liu G., Da Y. & Sonstegard T.S. (2013) Effect of artificial selection on runs of homozygosity in U.S. Holstein cattle. PLoS One 8, e80813.
Kim E.S., Elbeltagy A.R., Aboul-Naga A.M., Rischkowsky B., Sayre B., Mwacharo J.M. & Rothschild M.F. (2015) Multiple genomic signatures of selection in goats and sheep indigenous to a hot arid environment. Heredity 116, 255–64.
Kukučková V., Moravčíková N., Ferenčaković M., Simčič M., Mészáros G., Sölkner J., Trakovická A., Kadlečík O., Curik I. & Kasarda R. (2017) Genomic characterization of Pinzgau cattle: genetic conservation and breeding perspectives. Conservation Genetics, 1–18.
Lai F.N., Zhai H.L., Cheng M. et al. (2016) Whole-genome scanning for the litter size trait associated genes and SNPs under selection in dairy goat (Capra hircus). Scientific Reports 6.
Makina S.O., Muchadeyi F.C., Marle-Köster E., Taylor J.F., Makgahlela M.L. & Maiwashe A. (2015) Genome-wide scan for selection signatures in six cattle breeds in South Africa. Genetics Selection Evolution 47(1), 92.
Manunza A., Noce A., Serradilla J.M. et al. (2016) A genome-wide perspective about the diversity and demographic history of seven Spanish goat breeds. Genetics Selection Evolution 48(1), 52.
Marras G., Gaspa G., Sorbolini S., Dimauro C., Ajmone-Marsan P., Valentini A., Williams J.L. & Macciotta N.P. (2015) Analysis of runs of homozygosity and their relationship with inbreeding in five cattle breeds farmed in Italy. Animal Genetics 46, 110–21.

Mastrangelo S., Sardina M.T., Riggio V. & Portolano B. (2012) Study of polymorphisms in the promoter region of ovine β-lactoglobulin gene and phylogenetic analysis among the Valle del Belice breed and other sheep breeds considered as ancestors. Molecular Biology Reports 39, 745–751.
Mastrangelo S., Di Gerlando R., Tolone M., Tortorici L., Sardina M.T. & Portolano B. (2014) Genome wide linkage disequilibrium and genetic structure in Sicilian dairy sheep breeds. BMC Genetics 15, 108.
Mastrangelo S., Tolone M., Di Gerlando R., Fontanesi L., Sardina M.T. & Portolano B. (2016) Genomic inbreeding estimation in small populations: evaluation of runs of homozygosity in three local dairy cattle breeds. Animal 10, 746–54.
Mastrangelo S., Portolano B., Di Gerlando R., Ciampolini R., Tolone M. & Sardina M.T. (2017) Genome-wide analysis in endangered populations: a case study in Barbaresca sheep. Animal 1–10.
Michenet A., Saintilan R., Venot E. & Phocas F. (2016) Insights into the genetic variation of maternal behavior and suckling performance of continental beef cows. Genetics Selection Evolution 48(1), 45.
Nothnagel M., Lu T., Kayser M. & Krawczak M. (2009) Genomic and geographic distribution of SNP-defined runs of homozygosity in Europeans. Human Molecular Genetics 19, 2927–35.
Pemberton T., Absher D., Feldman M., Myers R.M., Rosenberg N.A. & Li J.Z. (2012) Genomic patterns of homozygosity in worldwide human populations. American Journal of Human Genetics 91, 275–92.
Peripolli E., Munari D.P., Silva M.V.G.B., Lima A.L.F., Irgang R. & Baldi F. (2016) Runs of homozygosity: current knowledge and applications in livestock. Animal Genetics 48, 255–71.
Pryce J.E., Haile-Mariam M., Goddard M.E. & Hayes B.J. (2014) Identification of genomic regions associated with inbreeding depression in Holstein and Jersey dairy cattle. Genetics Selection Evolution 46, 71.
Purcell S., Neale B., Todd-Brown K. et al. (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics 81, 559–75.
Purfield D.C., Berry D., McParland S. & Bradley D.G. (2012) Runs of homozygosity and population history in cattle. BMC Genetics 13, 70.
Purfield D.C., McParland S., Wall E. & Berry D.P. (2017) The distribution of runs of homozygosity and selection signatures in six commercial meat sheep breeds. PLoS One 12(5), e0176780.

Shi H., Luo J., Zhu J. et al. (2013) PPARγ regulates genes involved in triacylglycerol synthesis and secretion in mammary gland epithelial cells of dairy goats. PPAR Research 2013, 310948.
Sun J., Jin Q., Zhang C., Fang X., Gu C., Lei C., Wang J. & Chen H. (2010) Polymorphisms in the bovine ghrelin precursor (GHRL) and Syndecan-1 (SDC1) genes that are associated with growth traits in cattle. Molecular Biology Reports 38, 3153–3160.
Szmatola T., Gurgul A., Ropka-Molik K., Jasielczuck I., Zabek T. & Bugno-Poniewierska M. (2016) Characteristics of runs of homozygosity in selected cattle breeds maintained in Poland. Livestock Science 72–88.
Wang Z., Zhang H., Yang H. et al. (2014) Genome-wide association study for wool production traits in a Chinese Merino sheep population. PLoS One 9(9), e107101.
Wang H., Zhang L., Cao J. et al. (2015) Genome-Wide Specific Selection in Three Domestic Sheep Breeds. PLoS One 10(6), e0128688.
Weikard R., Kühn C., Goldammer T., Freyer G. & Schwerin M. (2005) The bovine PPARGC1A gene: molecular characterization and association of an SNP with variation of milk fat synthesis. Physiological Genomics 21, 1–13.
Weikard R., Widmann P., Buitkamp J., Emmerling R. & Kuehn C. (2012) Revisiting the quantitative trait loci for milk production traits on BTA6. Animal Genetics 43(3), 318–23.
Xia J., Fan H., Chang T. et al. (2017) Searching for new loci and candidate genes for economically important traits through gene-based association analysis of Simmental cattle. Scientific Reports 7.
Yuan Z., Liu E., Liu Z. et al. (2016) Selection signature analysis reveals genes associated with tail type in Chinese indigenous sheep. Animal Genetics 48(1), 55–66.
Zhang L., Yang M., Li C. et al. (2014) Identification and genetic effect of a variable duplication in the promoter region of the cattle ADIPOQ gene. Animal Genetics 45, 171–179.
Zhang Q., Guldbrandtsen B., Bosse M., Lund M.S. & Sahana G. (2015) Runs of homozygosity and distribution of functional variants in the cattle genome. BMC Genomics 16, 542.
Zheng X., Ju Z., Wang J. et al. (2011) Single nucleotide polymorphisms, haplotypes and combined genotypes of LAP3 gene in bovine and their association with milk production traits. Molecular Biology Reports 38(6), 4053–61.

Table 1 Descriptive statistics for runs of homozygosity and inbreeding for each sheep breed.

Breed                      NTROH  MNROH  ALROH (Mb)  F_ROH>1Mb    F_HOM        r(F_ROH, F_HOM)
Alpagota                     658  27.42  4.61        0.053±0.032  0.093±0.033  0.969***
Altamurana                   575  25.10  4.41        0.055±0.053  0.063±0.064  0.988***
Appenninica                  431  17.96  5.31        0.041±0.033  0.044±0.036  0.986***
Bagnolese                    458  19.91  4.34        0.043±0.039  0.054±0.042  0.991***
Barbaresca                  1221  40.70  4.96        0.087±0.075  0.106±0.058  0.996***
Bergamasca                   342  14.25  4.29        0.026±0.016  0.064±0.020  0.942***
Biellese                     419  19.05  3.85        0.034±0.033  0.074±0.036  0.936***
Comisana                     254  10.58  3.79        0.016±0.010  0.036±0.012  0.911***
Delle Langhe                 938  39.00  4.71        0.080±0.049  0.114±0.052  0.978***
Fabrianese                   380  16.52  4.41        0.037±0.055  0.048±0.059  0.992***
Gentile di Puglia            512  21.33  5.19        0.054±0.053  0.061±0.054  0.991***
Istrian Pramenka             391  16.29  4.58        0.031±0.023  0.060±0.028  0.897***
Laticauda                    551  22.96  4.05        0.047±0.065  0.061±0.068  0.995***
Leccese                      953  38.12  5.51        0.090±0.078  0.104±0.082  0.995***
Massese                      666  27.75  4.41        0.055±0.061  0.081±0.063  0.993***
Pinzirita                    527  21.96  4.40        0.048±0.041  0.069±0.046  0.980***
Sambucana                    533  22.21  4.54        0.042±0.022  0.066±0.027  0.863***
Sardinian Ancestral Black    404  20.20  4.42        0.046±0.052  0.079±0.054  0.992***
Sardinian White              500  20.83  4.56        0.041±0.035  0.074±0.037  0.975***
Sopravissana                 520  21.70  4.02        0.047±0.052  0.051±0.056  0.991***
Valle del Belice            1069  44.54  5.14        0.099±0.077  0.123±0.080  0.996***

NTROH = total number of ROH per breed; MNROH = mean number of ROH per individual and per breed; ALROH = average length of ROH per individual and per breed in Mb; F_ROH>1Mb = mean ROH-based inbreeding coefficient with standard deviation; F_HOM = inbreeding coefficient based on the difference between observed and expected numbers of homozygous genotypes; r = correlation between F_ROH and F_HOM; *** P-value < 0.001.
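The two inbreeding coefficients reported in Table 1 can be illustrated with a minimal Python sketch. It assumes that F_ROH is the summed ROH length of an animal divided by the autosomal genome length covered by SNPs, and that F_HOM is the excess-homozygosity estimator (O - E) / (N - E) as reported by PLINK's `--het` option. The function names, the genome length of 2400 Mb and the genotype counts below are hypothetical illustration values, not results from the study; only the SNP count (46 277) is taken from the text.

```python
def f_roh(roh_lengths_mb, autosome_length_mb):
    """ROH-based inbreeding: fraction of the autosomal genome covered by ROH.

    Assumption: the denominator is the autosomal length covered by SNPs (Mb).
    """
    if autosome_length_mb <= 0:
        raise ValueError("autosome_length_mb must be positive")
    return sum(roh_lengths_mb) / autosome_length_mb


def f_hom(observed_hom, expected_hom, n_genotyped):
    """Excess-homozygosity inbreeding: (O - E) / (N - E).

    O = observed homozygous genotypes, E = expected homozygous genotypes
    under Hardy-Weinberg proportions, N = non-missing genotypes.
    """
    if n_genotyped <= expected_hom:
        raise ValueError("n_genotyped must exceed expected_hom")
    return (observed_hom - expected_hom) / (n_genotyped - expected_hom)


# Hypothetical animal: four ROH (Mb) and an illustrative genome length.
example_roh = [4.0, 1.5, 12.5, 6.0]
example_genome_mb = 2400.0  # illustrative value, not from the study

print(f"F_ROH = {f_roh(example_roh, example_genome_mb):.3f}")
print(f"F_HOM = {f_hom(32000, 30000.0, 46277):.3f}")
```

Both coefficients are computed per animal; the breed-level values in Table 1 are means with standard deviations over the animals of each breed.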

Table 2 Genomic regions of extended homozygosity for production traits in the sheep breeds.

Production purpose  OAR  Number of SNPs  Start (Mb)  End (Mb)  Candidate genes
Meat                  1  199             195.23      207.26    SST, ADIPOQ, KCNMB3, PIK3CA, ZMAT3
                      2   48             130.02      132.36    TTN
                      3   15              15.82       16.42    -
                      3   43              27.34       29.33    SDC1
                      4   37              53.40       55.42    -
                     10    4              49.15       49.28    -
                     13    4              32.02       32.17    -
                     16   43              31.53       33.69    GHR
Milk                  1   11             131.62      131.99    -
                      2  263              76.24       90.43    TYRP1, BNC2, PLIN2
                      2   18             101.07      103.40    -
                      3   27              15.00       16.40    -
                      3   46              30.29       32.73    -
                      3   26             110.78      112.42    -
                      5   13              99.76      100.41    -
                      7   32              47.76       49.60    CCNB2
                     23   36              36.63       39.06    -
Wool                  5   86              38.74       44.27    SH3BP5L, GDF9
                      5   49              73.71       75.90    -
                      5  185              79.65       89.49    -
                     10   87              18.63       23.13    SETDB2
                     10    5              48.10       48.28    -
                     15   76              19.84       23.88    -
                     22  170              21.45       30.81    -
                     22   37              32.95       35.08    ADRA2A
                     26   13              42.57       43.33    -
Milk and Meat         1  154             178.04      188.40    -
                      1   97             214.73      228.13    PRKCI, PHC3, GPR160
                      6   45              37.08       39.16    LAP3, FAM184B, NCAPG
                      6   40              41.38       43.29    PPARGC1A
                     10  123              18.63       25.62    -

Meat (Alpagota, Appenninica, Bergamasca, Biellese, Fabrianese, Sambucana); milk (Altamurana, Comisana, Delle Langhe, Leccese, Massese, Pinzirita, Sardinian Black and White, Valle del Belice); wool (Gentile di Puglia and Sopravissana); milk and meat (Bagnolese, Istrian Pramenka, Laticauda, Barbaresca).

Figures
Figure 1 Genomic distribution of runs of homozygosity (ROH) in the autosomal chromosomes (OAR), with the length of OAR (x-axis) in Mb. The ROH segments are shown within the chromosomes in yellow (long ROH, >5 Mb) and in green (ROH <5 Mb). Each row represents one animal.

Figure 2 Distribution of ROH inbreeding coefficients for each sheep breed. (Alp=Alpagota, Alt=Altamurana, App=Appenninica, Bag=Bagnolese, Barb=Barbaresca, Berg=Bergamasca, Bie=Biellese, Com=Comisana, Lan=Delle Langhe, Fab=Fabrianese, GdP=Gentile di Puglia, IsPr=Istrian Pramenka, Lat=Laticauda, Lec=Leccese, Mas=Massese, Pin=Pinzirita, Sam=Sambucana, Sar_b=Sardinian Black, Sar_w=Sardinian White, Sop=Sopravissana, VdB=Valle del Belice)

Figure 3 Relationship between the total number of runs of homozygosity (ROH) segments (y-axis) and the total length (Mb) of genome in such ROH (x-axis) for all individuals from each breed. Each dot represents an individual. (Alp=Alpagota, Alt=Altamurana, App=Appenninica, Bag=Bagnolese, Barb=Barbaresca, Berg=Bergamasca, Bie=Biellese, Com=Comisana, Lan=Delle Langhe, Fab=Fabrianese, GdP=Gentile di Puglia, IsPr=Istrian Pramenka, Lat=Laticauda, Lec=Leccese, Mas=Massese, Pin=Pinzirita, Sam=Sambucana, Sar_b=Sardinian Black, Sar_w=Sardinian White, Sop=Sopravissana, VdB=Valle del Belice)

Figure 4 Classification of ROH in seven categories (x-axis) according to size (from 1–5 Mb to more than 30 Mb) and mean sum of ROH in Mb (y-axis) within each ROH length category per breed.

Figure 5 Incidence of each SNP in the ROH among all sheep breeds.

Supporting information

Figure S1 Genetic relationships among the Italian sheep breeds defined through multidimensional scaling (MDS) analysis.
Figure S2 Regression plot of effective population size (Ne) 50 generations ago against the mean number (MNROH) and the average length (ALROH) of ROH for each breed.
Figure S3 The percentage of SNPs residing within ROH in the sheep breeds grouped by production purpose: a) meat (Alpagota, Appenninica, Bergamasca, Biellese, Fabrianese, Sambucana); b) milk (Altamurana, Comisana, Delle Langhe, Leccese, Massese, Pinzirita, Sardinian Black and White, Valle del Belice); c) wool (Gentile di Puglia and Sopravissana); d) milk and meat (Bagnolese, Istrian Pramenka, Laticauda, Barbaresca).
Figure S4 Haplotype block (heat) map of a portion of OAR6 in dual-purpose (milk and meat) breeds.

Table S1 Chromosome position, number of SNPs, start and end position of ROH islands, and symbols for all annotated genes within these genomic regions.

[Figure 1: one panel per autosome, OAR1 to OAR26; x-axis: position along the chromosome (Mb).]

[Figure 2: x-axis: breed; y-axis: ROH-based inbreeding coefficient, 0 to 0.3.]

[Figure 3: one panel per breed, Alpagota to Valle del Belice; x-axis: total length of segments (Mb), 0 to 800; y-axis: number of ROHs, 0 to 120.]

[Figure 4: x-axis: ROH length classes (1-5, 5-10, 10-15, 15-20, 20-25, 25-30 and > 30 Mb); y-axis: mean (Mb), 0 to 90; one series per breed.]
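The size classification summarised in Figure 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the class labels come from the figure axis, but the handling of boundary values (an ROH of exactly 5 Mb is placed in the 5-10 Mb class, and one of exactly 30 Mb in the > 30 Mb class), the averaging over all animals of a breed, and the names `roh_class` and `mean_sum_per_class` are hypothetical choices made here.

```python
from bisect import bisect_right
from collections import defaultdict

# Class boundaries in Mb, taken from the Figure 4 axis labels.
EDGES = [5, 10, 15, 20, 25, 30]
LABELS = ["1-5 Mb", "5-10 Mb", "10-15 Mb", "15-20 Mb",
          "20-25 Mb", "25-30 Mb", "> 30 Mb"]


def roh_class(length_mb):
    """Return the size class of one ROH (assumed minimum ROH length: 1 Mb)."""
    if length_mb < 1:
        raise ValueError("ROH shorter than 1 Mb fall outside the seven classes")
    # bisect_right counts how many boundaries are <= length_mb,
    # which is the index of the class label.
    return LABELS[bisect_right(EDGES, length_mb)]


def mean_sum_per_class(roh_by_animal):
    """Mean per-animal sum of ROH length (Mb) in each class for one breed.

    roh_by_animal maps an animal ID to its list of ROH lengths in Mb.
    Assumption: the mean is taken over all animals of the breed, including
    animals with no ROH in a given class.
    """
    if not roh_by_animal:
        raise ValueError("at least one animal is required")
    totals = defaultdict(float)
    for lengths in roh_by_animal.values():
        for length in lengths:
            totals[roh_class(length)] += length
    n_animals = len(roh_by_animal)
    return {label: totals[label] / n_animals for label in LABELS}


# Hypothetical breed with two animals (values are illustrative only).
example_breed = {"animal_1": [2.0, 3.0, 12.0], "animal_2": [4.0, 31.0]}
for label, value in mean_sum_per_class(example_breed).items():
    print(f"{label:>9}: {value:.2f} Mb")
```

Short classes dominated by many small ROH point to older inbreeding, while a large mean sum in the longest classes is the pattern the text associates with recent inbreeding.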
