University of Veterinary Medicine Hannover

Athletic performance and conformation in Hanoverian - population genetic and genome-wide association analyses

Thesis Submitted in partial fulfillment of the requirements for the degree Doctor of Veterinary Medicine - Doctor medicinae veterinariae - (Dr. med. vet.)

by Wiebke Schröder Halle/Saale

Hannover 2010

Academic supervision: Univ.-Prof. Dr. Dr. habil. Ottmar Distl, Institut für Tierzucht und Vererbungsforschung, Bünteweg 17p, 30559 Hannover

1. Referee: Univ.-Prof. Dr. Dr. habil. Ottmar Distl
2. Referee: Univ.-Prof. Dr. Karsten Feige

Day of oral examination: 22 November 2010

To my family

Parts of this work have been accepted or submitted for publication in the following journals:
1. The Veterinary Journal
2. Livestock Production Science
3. Archiv Tierzucht

Table of contents

1 Introduction
2 A review on candidate genes for physical performance in the horse
2.1 Abstract
3 Does the proportion of genes of foreign breeds influence breeding values for performance traits in the Hanoverian warmblood horse?
3.1 Abstract
3.2 Introduction
3.3 Materials and Methods
3.3.1 Performance Data
3.3.2 Pedigree Data
3.3.3 Statistical Analyses
3.3.4 Model Development
3.3.5 Genetic Analyses
3.4 Results
3.4.1 Statistical Analyses
3.4.2 Analyses of Variance and Model Development
3.4.3 Genetic Analyses
3.5 Discussion
3.6 Conclusion
3.7 References
4 Genetic evaluation of Hanoverian warmblood horses for conformation traits considering the proportion of genes of foreign breeds
4.1 Abstract
4.2 Zusammenfassung
5 A genome wide association study for quantitative trait loci of show-jumping in Hanoverian warmblood horses
5.1 Summary
5.2 Introduction
5.3 Materials and Methods
5.3.1 Animals and Phenotypic data
5.3.2 Genotyping SNPs
5.3.3 Data analysis
5.4 Results
5.5 Discussion
5.6 References
6 Identification of quantitative trait loci for dressage in Hanoverian warmblood horses
6.1 Summary
6.2 Introduction
6.3 Materials and Methods
6.3.1 Animals and Phenotypic data
6.3.2 Genotyping SNPs
6.3.3 Data analysis
6.4 Results
6.5 Discussion
6.6 References
7 A genome wide association study for quantitative trait loci of conformation in Hanoverian warmblood horses
7.1 Abstract
7.2 Introduction
7.3 Results
7.4 Discussion
7.5 Materials and Methods
7.5.1 Animals and Phenotypic data
7.5.2 Genotyping SNPs
7.5.3 Data analysis
7.6 References
8 General Discussion
8.1 References
9 Summary
10 Zusammenfassung
11 Appendix
12 List of publications
13 Acknowledgement

Abbreviations

List of abbreviations

A adenine
ACE angiotensin converting enzyme
ACTN3 actinin, alpha 3
ADP adenosine diphosphate
ADRB2 adrenergic beta-2-receptor, surface
AF3BL2 ATPase family 3-like 2
AI auction inspection
AMPD1 adenosine monophosphate deaminase 1
ANOVA analysis of variance
Arg arginine
AT Achilles tendinopathy
ATI Achilles tendon injury
ATP adenosine triphosphate
BDKRB2 bradykinin receptor B2
BDNF brain-derived neurotrophic factor
BLUP Best Linear Unbiased Prediction
bp base pairs
BV breeding value
bv_dress breeding value for dressage
bv_jump breeding value for jumping
bv_limbs breeding value for the limbs
bv_rhp breeding value for riding horse points
C cytosine
cAMP cyclic adenosine monophosphate
cDNA complementary deoxyribonucleic acid
CHRM2 cholinergic receptor, muscarinic 2
CK creatine kinase
CKM creatine kinase, muscle
CLUSTALW2 multiple alignment program through sequence weights
COL1A1 collagen, type I, alpha 1
COL5A1 collagen, type V, alpha 1
COL15A1 collagen, type XV, alpha 1
ConFL conformation of the front legs
ConHL conformation of the hind legs
CYP27B1 cytochrome P450, family 27, subfamily B, polypeptide 1 gene
DAVID database for annotation, visualization and integrated discovery
Del deletion
Dev general impression and development
DRD4 dopamine receptor D4
ECA Equus caballus
Ensembl joint project between EMBL (European Molecular Biology Laboratory) - EBI (European Bioinformatics Institute) and the Wellcome Trust Sanger Institute
EPAS1 endothelial PAS domain 1
EquCab2 Equus caballus assembly 2
FST follistatin
G guanine
GABPA GA binding protein transcription factor, alpha subunit 60kDa
GABPB GA binding protein transcription factor, beta
GDF8 growth differentiation factor 8
Gly glycine
GNB5 guanine nucleotide binding protein (G protein), beta 5
GYS1 glycogen synthase 1
h² heritability
HAN proportion of Hanoverian warmblood genes
HB hemoglobin
HBB hemoglobin beta
Head conformation of the head
HDC histidine decarboxylase
HFE hemochromatosis
HIF1A hypoxia inducible factor 1 alpha
HOL proportion of Holsteiner warmblood genes
HPX hemopexin
HSS Hanoverian Studbook Society
IGF1 insulin-like growth factor 1
In insertion
LCORL ligand dependent nuclear receptor corepressor-like gene
LD linkage disequilibrium
LSM least square means
MAF minor allele frequency
MARKAPK mitogen-activated protein kinase-activated protein kinase 2
Mb megabase
MCPH1 microcephalin 1
MYL2 myosin, light chain 2, regulatory, cardiac, slow gene
MYL3 myosin, light chain 3, alkali; ventricular, skeletal, slow
MYLK myosin light chain kinase
MYO5A myosin VA (heavy chain 12, myoxin)
MYO7B myosin VIIB
MSTN myostatin
MMP3 matrix metalloproteinase-3
MPT mare performance test
n number
NCBI National Center for Biotechnology Information
Neck conformation of the neck
NRAP nebulin-related anchoring protein
NRF1 nuclear respiratory factor 1
NRF2 nuclear respiratory factor 2
P error probability
PAPSS2 bifunctional 3'-phosphoadenosine 5'-phosphosulfate synthetase 2
PLAGL1 pleiomorphic adenoma gene-like 1
PPARA peroxisome proliferator-activated receptor alpha
PPARGC1A peroxisome proliferator-activated receptor gamma, coactivator 1 alpha
PPARD peroxisome proliferator-activated receptor delta
PRG4 proteoglycan 4
PT performance test
PT_Walk walk under rider evaluated at mare performance test or auction inspection
PT_Trot trot under rider evaluated at mare performance test or auction inspection
PT_Canter canter under rider evaluated at mare performance test or auction inspection
PT_FJT total score for free jumping evaluated at mare performance test or auction inspection
PT_Ride rideability as judged by the judging commission evaluated at mare performance test or auction inspection
σa² additive genetic variance
σe² residual variance
σr² event variance
R arginine
RAD23B RAD23 homolog B
REML Residual Maximum Likelihood
RHP riding horse points
RNF160 ring finger protein 160
RPP Reitpferde-Points
Sad saddle position
SAS Statistical Analysis System
SCGB1A1 secretoglobin, family 1A, member 1
SD standard deviation
SE standard error
SBI studbook inspection
SBI_Walk walk at hand evaluated at studbook inspection
SBI_Imp impetus and elasticity in trot at hand evaluated at studbook inspection
SBI_Corr correctness of gaits in walk and trot at hand evaluated at studbook inspection
SLC6A4 serotonin transporter
SNP single nucleotide polymorphism
SHOX2 short stature homeobox 2
T thymine
TASSEL Trait Analysis by aSSociation, Evolution and Linkage
TB proportion of Thoroughbred genes
TBX4 T-box transcription factor 4 gene
TNC tenascin C
TRAK proportion of Trakehner genes
TRAPPC9 trafficking protein particle complex 9
TRHR thyrotropin-releasing hormone receptor
TRPC3 transient receptor potential cation channel, subfamily C, member 3
VCE Variance Component Estimation
VEGFA vascular endothelial growth factor A
VDR 1,25-dihydroxyvitamin D3 receptor
VIT Vereinigte Informationssysteme Tierhaltung w.V.
VWC2 von Willebrand factor C domain containing 2
WH height at withers
X stop codon

CHAPTER 1

Introduction


1 Introduction

Worldwide, horses play an important role within human cultures. Since domestication, the focus of breeding has been to improve the horse's usefulness to man. Hanoverian warmblood horses (Hanoverians) can be traced back to the 16th century. Back then they were primarily bred for farm work and military service, which required competitive, calm and rideable horses. In the middle of the 20th century, suitability for sport became more of a focal point. Today, the Hanoverian represents one of the most important breeds of sport horses in the world. Dressage and show jumping in particular play an economically important role in Hanoverian breeding. A favourable conformation is an additional asset for sales, especially at young ages. Hence, the Hanoverian Studbook Society (HSS) aims at selecting the animals with the best performance and the most favourable conformation values for the next generation. In order to refine the conformation of future progeny, Thoroughbreds and Trakehners are commonly used, while the intended use of Holsteiner warmblood horses (Holsteiner) is to improve show-jumping performance. Population genetic analyses of performance traits and conformation are performed regularly to facilitate performance-oriented matings. Many genes are currently discussed as influencing human athletic capability, whereas such studies in horses are rare. However, owing to the rapid development of equine molecular genetics around the second assembled genome sequence of the horse (EquCab2.0), the prospects of dissecting the genetic components of multigenic traits such as performance and conformation have increased dramatically. In addition, the strongly conserved synteny between human and equine chromosomal structure and new bioinformatic tools should facilitate the identification of genes involved in equine performance. Our approach focuses on genetic factors that play a major role in equine physical capability and conformation.

Molecular genetic studies, in particular those on heterogeneous populations, are highly sensitive to data stratification. Hence, a careful model choice is required to avoid false positive results. In particular, the proportions of genes and the relationships among animals are important sources of stratification. The model regularly used for population genetic analyses of Hanoverians does not consider the proportion of genes of foreign breeds. Hence, one aim of this study was to determine whether the inclusion of the proportion of genes of foreign breeds could improve the model for genetic evaluation for performance and conformation of Hanoverians. The second purpose of this study was to identify genomic regions harbouring candidate genes for physical performance and conformation of Hanoverians. To achieve this objective, we performed whole genome association analyses of single nucleotide polymorphisms (SNPs) using the Illumina Equine SNP50 BeadChip, with the aim of further defining quantitative trait loci (QTL) for different traits of performance and conformation.

Overview of chapter contents

The content of the thesis is presented in single papers according to § 8 Abs. 3 of the Rules of Graduation (Promotionsordnung) of the University of Veterinary Medicine Hannover. Chapter 2 reviews the literature on the particularities of equine physical performance and presents 28 candidate genes for equine performance. Chapters 3 and 4 contain an update on performance and conformation related breeding values of Hanoverian warmblood horses and investigate whether including the proportion of genes of foreign breeds could improve the model. Chapters 5, 6 and 7 present whole genome association analyses for performance and conformation using the Illumina Equine SNP50 BeadChip, in order to determine genomic regions responsible for show-jumping, dressage and conformation in Hanoverians. Chapter 8 comprises a general discussion and conclusions referring to Chapters 2-7. Chapter 9 is a concise English summary of this thesis, while Chapter 10 is an expanded, detailed German summary which takes into consideration the overall research context.


CHAPTER 2

A review on candidate genes for physical performance in the horse

Wiebke Schröder, Andreas Klostermann, Ottmar Distl

Institute for Animal Breeding and Genetics, University of Veterinary Medicine Hannover, Bünteweg 17p, 30559 Hannover, Germany

Article in Press, Corrected Proof (doi:10.1016/j.tvjl.2010.09.029)


2 A review on candidate genes for physical performance in the horse

2.1 Abstract

Intense selection for speed, endurance or pulling power in the domestic horse (Equus caballus) has resulted in a number of adaptive changes in the phenotype required for elite athletic performance. To date, studies in humans have revealed a large number of genes involved in elite athletic performance, but studies in horses are rare. The horse genome assembly and bioinformatic tools for genome analyses have been used to compare human performance genes with their equine orthologues, to retrieve pathways for these genes and to investigate their chromosomal distribution. We present 28 candidate genes for equine performance that have polymorphisms associated with human elite athletic performance and may have an impact on athletic performance in horses. A significant accumulation of candidate genes was found on horse chromosomes 4 and 12. Genes involved in pathways for focal adhesion, regulation of actin cytoskeleton, neuroactive ligand-receptor interaction, and calcium signalling were overrepresented. Genome-wide association studies for athletic performance in horses may benefit from the strongly conserved synteny of the chromosomal arrangement of genes between human and horse.

Keywords: Equus caballus; Genome; Performance; Candidate genes; Genetic polymorphisms; Pathway analysis


CHAPTER 3

Does the proportion of genes of foreign breeds influence breeding values for performance traits in the Hanoverian warmblood horse?

W. Schröder, K.F. Stock, O. Distl

Department of Animal Breeding and Genetics, University of Veterinary Medicine Hannover (Foundation), Hannover, Germany

Submitted for publication


Genetic evaluation for performance

3 Does the proportion of genes of foreign breeds influence breeding values for performance traits in the Hanoverian warmblood horse?

3.1 Abstract

Performance data of a total of 36,441 Hanoverian warmblood horses (Hanoverians) were used to determine whether genetic evaluation for performance in the Hanoverian could benefit from including the proportion of genes of foreign breeds in the model. We considered all Hanoverians born from 1992 to 2005 for which records from mare performance tests, auction inspections or studbook inspections were available. Genetic parameters were estimated univariately for five traits evaluated at mare performance tests and auction inspections (walk, trot and canter under saddle, free jumping, and rideability) and for three traits evaluated at studbook inspections (walk, elasticity, and correctness of gaits in walk and trot at hand) in a linear animal model using Residual Maximum Likelihood. Genetic evaluation was subsequently performed using Best Linear Unbiased Prediction. To investigate the effect of correcting for the proportion of genes of stallions from foreign breeds, two different models were used for the analyses. In Model 1, the fixed effects of sex (for the auction inspection data only) and age, and the random effect of the date-place interaction were considered. In Model 2, proportions of genes of Thoroughbred, Trakehner and Holsteiner stallions were additionally included as fixed effects. Heritabilities of the analyzed performance traits ranged between 0.11 and 0.34 in both models, with standard errors of 0.01. Pearson correlation coefficients between corresponding breeding values from Model 1 and Model 2 were highly positive (>0.98), indicating little effect of the model on the results of genetic evaluation. Our results indicate that a model which includes the proportions of genes of Thoroughbred, Trakehner and Holsteiner as fixed effects will not appreciably improve genetic evaluation for performance in the Hanoverian.

Keywords: Hanoverian, genetic evaluation, performance, proportion of genes.


3.2 Introduction

The Hanoverian warmblood horse is primarily bred to be a rideable and talented sport horse with good abilities for the disciplines dressage, jumping, eventing and driving (Koenen et al., 2004). Therefore, an early performance evaluation is not only beneficial for the preselection of talented young sport horses (Ducro et al., 2007; Wallin et al., 2003), but also plays an important role in the selection of breeding horses. Performance traits evaluated at mare performance tests (MPT), auction inspections (AI) and studbook inspections (SBI) represent a suitable basis for selection. Estimation of breeding values for performance traits recorded at MPT and SBI has been described for the Hanoverian warmblood as well as for other German warmblood breeds (Christmann, 1996; Lührs-Behnke et al., 2006). In the models used for genetic evaluation, environmental effects which principally influence performance, i.e. age at evaluation, sex, and place and date of evaluation, were considered separately and combined. For dairy cattle, Vanderick et al. (2009) showed that breeding values that account for breed proportions provide a theoretically better tool to evaluate crossbred dairy cattle populations. Similar results were obtained by Stewart et al. (2009) for the sport horse population of Great Britain. They considered a model including breed classes most appropriate for the estimation of breeding values for dressage performance. For the Hanoverian warmblood horse (Hanoverian), the influence of the proportion of genes of foreign breeds on performance has not been investigated in depth. Thoroughbred, Trakehner and Holstein warmblood (Holsteiner) stallions are the ones most commonly used for crossbreeding in the Hanoverian population (Hamann and Distl, 2008). Hence, the aim of our study was to determine whether genetic evaluation for performance in the Hanoverian could benefit from including the proportions of genes of these three breeds.

3.3 Materials and Methods

3.3.1 Performance Data

Results of MPT, AI and SBI of the Hanoverian Studbook Society (HSS) were made available for this study. Information on 36,441 Hanoverian warmblood horses born between 1992 and 2005 was considered. All performance data and pedigree information were made available by the HSS through the national unified animal ownership database (Vereinigte Informationssysteme Tierhaltung w.V., VIT) in Verden an der Aller, Germany.

Mare performance test. MPT data included information on 16,814 performance-tested mares. Of these mares, 14,500 completed their MPT in a one-day event in the field (MPTF), whereas only 2,314 mares completed their MPT in stationary performance tests (MPTS). Tests at station included 26 days of standardized training and a final test. If a mare participated in more than one MPT, only the last test result was considered for subsequent analyses. The included MPTs took place from 1995 to 2008. The number of performance-tested mares per year ranged between 855 and 1,286 in MPTF and between 126 and 210 in MPTS. The mares were judged for quality of gaits (walk, trot, and canter under rider), jumping talent (style and ability of free jumping), rideability, and character on a scale from 0 (not shown) to 10 (excellently shown) in steps of 0.5. Style and ability of free jumping were scored individually and subsequently averaged to a total score for free jumping. Rideability was scored separately by a judging commission and by a test rider. Character was only judged at MPTS. Most of the mares completed their MPT by the end of their fourth year of life (mean age of 3.58 ± 0.91 years in MPTF and 3.54 ± 0.78 years in MPTS). Performance tests in the field were held at 63 places with on average 26.94 ± 13.19 (range 1 to 82) judged Hanoverian warmblood mares per date and place. MPTS took place at only 5 places with on average 18.52 ± 6.47 (range 1 to 30) mares per date and place.
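The two data-preparation rules described for the MPT records (keeping only the last test per mare, and averaging style and ability into a total free jumping score) can be sketched as follows. This is a minimal illustration, not the procedure actually used; the record layout, the field names (`mare_id`, `test_date`, `style`, `ability`) and the example values are hypothetical.

```python
from statistics import mean


def last_test_per_mare(records):
    """Keep only the most recent test record of each mare.

    Each record is a dict with hypothetical keys 'mare_id' and
    'test_date' (ISO date string, so string comparison orders dates).
    """
    latest = {}
    for rec in records:
        mare = rec["mare_id"]
        if mare not in latest or rec["test_date"] > latest[mare]["test_date"]:
            latest[mare] = rec
    return list(latest.values())


def free_jumping_total(style, ability):
    """Total free jumping score as the average of style and ability."""
    return mean([style, ability])


# Illustrative records only (not data from the study).
records = [
    {"mare_id": "A", "test_date": "2001-05-12", "style": 7.0, "ability": 7.5},
    {"mare_id": "A", "test_date": "2002-06-03", "style": 8.0, "ability": 7.5},
    {"mare_id": "B", "test_date": "2001-05-12", "style": 6.5, "ability": 7.0},
]

kept = last_test_per_mare(records)
totals = {r["mare_id"]: free_jumping_total(r["style"], r["ability"]) for r in kept}
```

For mare A, only the 2002 record contributes to `totals`, since it is her later test.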

Auction inspection. Horses offered for sale at riding horse auctions of the HSS are preselected by a judging commission. Auction candidates are chosen based on their preliminary performance evaluation. Between 1999 and 2008, 8,081 Hanoverians (5,567 males, 2,514 females) were judged at auction inspections (AI) in a procedure similar to MPTF. Accordingly, the presented horses were scored for quality of gaits (walk, trot and canter under rider), jumping talent (style and ability of free jumping, total score for free jumping), and rideability. The mean age of the evaluated horses at AI was 4.21 ± 0.82 years. AI were held at 111 places, with on average 16.77 ± 18.68 (range 1 to 96) inspected horses per date and place.

Studbook inspection. All mares intended to be used for breeding under the HSS must be registered in the Hanoverian studbook. At studbook inspection (SBI) a judging commission gives individual scores for several conformation traits as well as for walk, correctness of gaits in walk and trot at hand, and impetus and elasticity in trot at hand. Scores on a scale from 0 (not shown) to 10 (excellently shown) were assigned for each trait. For more details see Stock and Distl (2006). For this study we considered the SBI results with respect to the three gait-related traits of 29,053 mares, presented between 1995 and 2008 at a mean age of 3.79 ± 1.66 years. There were 182 places of SBI with on average 11.73 ± 12.44 (range 1 to 83) inspected mares per date and place.

3.3.2 Pedigree Data

For the genetic analyses, four ancestral generations of all horses with performance data (SBI, MPT or AI results) were considered. The relationship matrix comprised 80,746 individuals, including 7,486 base animals. The 29,053 mares judged at SBI descended from 1,079 sires and 1,706 maternal grandsires. The sires were on average represented by 26.91 ± 63.5 (range 1 to 998) horses and the maternal grandsires by 17.02 ± 41.59 (range 1 to 704) horses. The 24,895 horses that performed at MPT or AI descended from 935 sires and 1,485 maternal grandsires. The sires were on average represented by 26.61 ± 62.59 (range 1 to 941) horses and the maternal grandsires by 16.76 ± 40.61 (range 1 to 686) horses. The performance-tested Hanoverian population had an average proportion of 0.58 (median = 0.59) Hanoverian genes, 0.23 (median = 0.20) Thoroughbred genes, 0.07 (median = 0.05) Trakehner genes, and 0.05 (median = 0) Holsteiner genes. The proportions of genes contributed by Thoroughbred, Trakehner and Holsteiner were calculated for the tested Hanoverian population. For this calculation, all available pedigree information was used. Details are described elsewhere (Hamann and Distl, 2008).
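The exact calculation of the proportions of genes is given in Hamann and Distl (2008) and is not reproduced here. As a rough sketch of the general idea only, one common approach assigns each base animal a proportion of 1.0 for its own breed and gives every descendant the average of its parents' proportions. The function below assumes this approach, assumes that both parents of every non-base animal are known, and uses hypothetical animal names and breed codes.

```python
def breed_proportions(pedigree, base_breed):
    """Proportions of genes per breed for every animal in a pedigree.

    pedigree:   dict animal -> (sire, dam); base animals map to (None, None).
    base_breed: dict base animal -> breed code (assumed to be known).
    Assumption: a non-base animal has both parents recorded.
    """
    cache = {}

    def props(animal):
        if animal in cache:
            return cache[animal]
        sire, dam = pedigree[animal]
        if sire is None and dam is None:
            # Base animal: all genes attributed to its own breed.
            result = {base_breed[animal]: 1.0}
        else:
            # Descendant: half of each parent's proportions.
            result = {}
            for parent in (sire, dam):
                for breed, p in props(parent).items():
                    result[breed] = result.get(breed, 0.0) + 0.5 * p
        cache[animal] = result
        return result

    return {animal: props(animal) for animal in pedigree}


# Hypothetical three-generation example.
pedigree = {
    "tb_sire": (None, None),
    "han_dam": (None, None),
    "han_sire": (None, None),
    "f1_mare": ("tb_sire", "han_dam"),
    "foal": ("han_sire", "f1_mare"),
}
base_breed = {"tb_sire": "TB", "han_dam": "HAN", "han_sire": "HAN"}

proportions = breed_proportions(pedigree, base_breed)
```

In this example the foal has one Thoroughbred grandparent and therefore a Thoroughbred proportion of 0.25 and a Hanoverian proportion of 0.75.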

3.3.3 Statistical Analyses

Statistical analyses included three traits evaluated at SBI, i.e. walk at hand (SBI_Walk), correctness of gaits in walk and trot at hand (SBI_Corr) and impetus and elasticity in trot at hand (SBI_Imp), and five performance test (PT) traits evaluated at MPT and AI, i.e. walk under rider (PT_Walk), trot under rider (PT_Trot), canter under rider (PT_Canter), total score for free jumping (PT_FJT) and rideability scored by the judging commission (PT_Ride). Because individual scores for style and ability of free jumping were not recorded for some horses, only the total score for free jumping was considered for all individuals. Unlike horses at MPT, horses evaluated at AI were scored for rideability only by a judging commission and not by a test rider. For that reason, we included only the rideability scores from the judging commission in our analyses.

3.3.4 Model Development

The following effects were tested for their influence on the distribution of performance trait scores from MPT/AI and SBI: age at MPT/AI or SBI evaluation as covariate or fixed effect (3, 4 or ≥5 years old); evaluation year (individual years from 1995 to 2008), evaluation month (individual months), evaluation season (February through April, May through July, August through October, November through January), and evaluation place (182 places of SBI, 111 places of AI, 63 places of MPTF and 5 places of MPTS) as fixed effects; combined date-place effect (2,476 levels for SBI, 482 levels for AI, 537 levels for MPTF, 126 levels for MPTS) as random effect; proportion of genes (Hanoverian or Thoroughbred, Trakehner, and Holsteiner) as covariate or fixed effect (low, moderate and high proportion of genes of the respective breed). The sex effect (male, female) was tested for the AI traits. Simple and multiple analyses of variance (ANOVA) were performed using the procedures GLM and MIXED of the Statistical Analysis System (SAS), Version 9.2 (SAS Institute Inc., Cary, NC, USA, 2010). Model choice was based on the model fit test statistics and the significance tests. In the final model (Model 1), sex (only for AI) and age group at SBI or MPT/AI evaluation were considered as fixed effects, and the combined date-place effect was considered as random effect. To investigate the impact of accounting for the proportion of genes on the results of the genetic analyses, an alternative model (Model 2) was used which additionally included the proportions of genes of Thoroughbred, Trakehner and Holsteiner as fixed effects. Classes were formed from the proportions of genes with the aim of having similar numbers of horses in each of three effect levels. Given the uneven representation of breeds, boundaries were set independently for Thoroughbred (≤0.13, >0.13 and <0.30, ≥0.30), Trakehner (≤0.2, >0.2 and <0.8, ≥0.8), and Holsteiner (0.0, >0.0 and <0.3, ≥0.3). Distributions of performance scores and residuals were analyzed, including tests for normality using the Kolmogorov-Smirnov statistics of the UNIVARIATE procedure of SAS. For all analyses, the significance limit was set to 0.05.
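The formation of three effect levels from the proportions of genes can be illustrated with a short sketch. The class boundaries are taken from the text as printed; the breed codes, the function name and the numbering of the levels (1 = low, 2 = moderate, 3 = high) are assumptions made for illustration.

```python
# Class boundaries as stated in the text: (upper limit of the low class,
# lower limit of the high class) for each breed.
BOUNDARIES = {
    "TB": (0.13, 0.30),   # Thoroughbred
    "TRAK": (0.2, 0.8),   # Trakehner
    "HOL": (0.0, 0.3),    # Holsteiner
}


def gene_class(breed, proportion):
    """Return effect level 1 (low), 2 (moderate) or 3 (high)."""
    low_upper, high_lower = BOUNDARIES[breed]
    if proportion <= low_upper:
        return 1
    if proportion < high_lower:
        return 2
    return 3


# Example: a horse with 25% Thoroughbred and no Holsteiner genes.
levels = (gene_class("TB", 0.25), gene_class("HOL", 0.0))
```

Because the low Holsteiner class consists of horses without any Holsteiner genes, its upper limit is simply 0.0.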

3.3.5 Genetic Analyses

Genetic parameters were estimated univariately in a linear animal model with Residual Maximum Likelihood (REML) using VCE-5, version 5.1.2 (Variance Component Estimation; Kovač et al., 2003). Estimates for environmental and genetic effects were obtained under the same models using Best Linear Unbiased Prediction (BLUP) with the software PEST (Groeneveld et al., 1990).

$$y_{ijnopq} = \mu + \mathrm{AGE}_{\mathrm{AI/MPT/SBI},\,i} + \mathrm{SEX}_j + (\mathrm{date}_{\mathrm{AI/MPT/SBI}} \times \mathrm{place}_{\mathrm{AI/MPT/SBI}})_{no} + a_p + e_{ijnopq} \qquad [1]$$

$$y_{ijklmnopq} = \mu + \mathrm{AGE}_{\mathrm{AI/MPT/SBI},\,i} + \mathrm{SEX}_j + \mathrm{TB}_k + \mathrm{TRAK}_l + \mathrm{HOL}_m + (\mathrm{date}_{\mathrm{AI/MPT/SBI}} \times \mathrm{place}_{\mathrm{AI/MPT/SBI}})_{no} + a_p + e_{ijklmnopq} \qquad [2]$$

with $y_{i \ldots q}$ = MPT/AI or SBI score, $\mu$ = model constant, $\mathrm{AGE}_{\mathrm{AI/MPT/SBI},\,i}$ = fixed effect of age group at performance evaluation (i = 1-3), $\mathrm{SEX}_j$ = fixed effect of sex (j = 1-2), $\mathrm{TB}_k$ = proportion of Thoroughbred genes (k = 1-3), $\mathrm{TRAK}_l$ = proportion of Trakehner genes (l = 1-3), $\mathrm{HOL}_m$ = proportion of Holsteiner genes (m = 1-3), $(\mathrm{date}_{\mathrm{AI/MPT/SBI}} \times \mathrm{place}_{\mathrm{AI/MPT/SBI}})_{no}$ = random effect of the interaction between date of performance evaluation and place of MPT/AI or SBI, $a_p$ = random additive genetic effect of the individual horse (p = 1-36,441) and $e_{i \ldots q}$ = residual.

To study the genetic correlations between the analyzed traits and to test the influence of the model on the results of genetic evaluation, Pearson correlation coefficients between breeding values were calculated using the procedure CORR of SAS.
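The comparison of breeding values from the two models rests on the Pearson correlation coefficient. The study computed it with the CORR procedure of SAS; the following standalone sketch only illustrates the calculation itself. The function name and the breeding values are made up for illustration and are not results of the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical breeding values of five horses for one trait.
bv_model1 = [100, 110, 95, 120, 105]
bv_model2 = [101, 109, 96, 119, 105]

r = pearson(bv_model1, bv_model2)
```

A coefficient close to 1 for the same trait under both models means that the horses are ranked almost identically, whichever model is used.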

3.4 Results

3.4.1 Statistical Analyses

Between 1995 and 2008, 16,814 Hanoverian mares participated in MPT, gaining 83,923 scores for the considered performance traits (PT_Walk, PT_Trot, PT_Canter, PT_FJT and PT_Ride). Between 1999 and 2008, 8,081 Hanoverians were evaluated at AI, gaining 38,976 scores for the same five traits. At SBI, 29,053 Hanoverian mares were presented between 1995 and 2008, gaining 87,037 scores for the three considered performance traits (SBI_Walk, SBI_Imp, and SBI_Corr). The development of mean scores for the gait-related performance traits (PT_Walk, PT_Trot and PT_Canter evaluated at MPT and AI; SBI_Walk, SBI_Imp and SBI_Corr evaluated at SBI) is shown in Figures 1 and 2. Mean scores for PT_Ride and PT_FJT ranged between 4.9 (PT_FJT at AI in 2008) and 7.6 (PT_Ride at MPT in 2008). Mean scores from MPT were considerably higher than those from AI. Mean scores for gaits and rideability differed by 0.49-0.76, and jumping scores differed by up to 1.92 between AI and MPT. Among the gaits, the lowest mean was determined for PT_Trot (6.2) at AI in 2002 and 2006, and the highest mean for PT_Canter at MPT in 2008 (7.5). Scores ranged between 3 and 10 at SBI, and between 1.5 and 10 at MPT/AI. Neither MPT/AI scores nor SBI scores were normally distributed (P<0.01). Skewness coefficients were in the range of |s|=0.02-0.68.


3.4.2 Analyses of Variance and Model Development

In all subsets of data (MPTF, MPTS, AI, and SBI), the fixed effects of the proportions of genes of Thoroughbred, Trakehner and Holsteiner, of age group at evaluation and, for AI, additionally of sex, as well as the random effect (combined date-place), were significant for all traits, as previously found by Stock and Distl (2007). Table 1 shows least square means (LSM) with their standard errors (SE) of performance scores evaluated at MPT/AI and SBI for the proportions of genes of Thoroughbred, Trakehner, and Holsteiner. Figure 3 shows cumulative percentages of Hanoverian, Thoroughbred, Trakehner and Holsteiner genes. The relation between performance trait scores of MPT/AI and SBI and the proportion of genes of the considered horse breeds is illustrated in Supplementary Figures 1 and 2. Proportions of genes of the considered breeds differed markedly between the upper and lower 10%-quantiles for SBI_Walk, PT_Walk and PT_FJT with respect to the Holsteiner, and for PT_Trot with respect to Holsteiner and Trakehner.

3.4.3 Genetic Analyses

Results of univariate genetic analyses of performance traits evaluated at MPT/AI and SBI without (Model 1) or with (Model 2) correction for the proportions of genes of Thoroughbred, Trakehner and Holsteiner are shown in Table 2. Heritabilities estimated for the SBI traits ranged between h² = 0.11 (SBI_Corr in both models) and h² = 0.34 (SBI_Imp in both models). Similar heritabilities were estimated for the MPT/AI traits, ranging between 0.22 (PT_Walk in Model 2) and 0.34 (PT_Trot in Model 1). The comparison of heritability estimates from the two models showed only slight differences of 0.0007 (SBI_Imp) to 0.0058 (PT_Walk). Additive genetic variances (σa²) were slightly smaller in Model 1 than in Model 2 (by 0.0007-0.0075). The event variances (σr²) were identical in both models for the performance traits evaluated at SBI and for walk at MPT/AI, and only slight differences (range 0.0001-0.0017) were seen for PT_Trot, PT_Canter, PT_FJT and PT_Ride. Residual variances (σe²) for all traits except PT_Walk were slightly higher (by 0.0002-0.0048) in Model 1 than in Model 2 (Table 2).
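How the three variance components relate to heritability can be shown with a small sketch. It assumes that heritability is the additive genetic variance divided by the sum of additive genetic, event and residual variance; whether the event variance enters the denominator in exactly this way is an assumption, since the formula is not stated in this section. The numerical values are illustrative and are not taken from Table 2.

```python
def heritability(var_a, var_r, var_e):
    """h² = σa² / (σa² + σr² + σe²), assuming the event variance σr²
    is counted as part of the phenotypic variance."""
    total = var_a + var_r + var_e
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_a / total


# Illustrative variance components (not estimates from the study).
h2 = heritability(var_a=0.20, var_r=0.10, var_e=0.30)
```

Under this definition, a slightly larger σa² combined with a slightly smaller σe² changes h² only marginally, which agrees with the small differences between the two models reported above.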


Genetic evaluation for performance

Pearson correlation coefficients calculated between the breeding values for all traits estimated using Model 1 and Model 2 are given in Table 3. Except for PT_FJT, correlations between all performance traits were significantly positive, ranging between 0.35 and 0.79. However, PT_FJT was negatively correlated with all other traits, with coefficients of -0.45 to -0.16 (Table 3). There were only minor differences of 0.0015 to 0.0928 between the Pearson correlation coefficients determined between breeding values for the analyzed traits in Model 1 and Model 2. The largest differences between the two models were seen for the correlations involving breeding values for PT_FJT. Comparisons of breeding values estimated in Model 1 and Model 2 for the same trait revealed Pearson correlation coefficients close to unity for all traits (r > 0.98; results not shown).
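The comparison of breeding values from the two models rests on the Pearson correlation coefficient. The following Python sketch shows the calculation for one trait; it is a minimal illustration using only the standard library, and the breeding values are hypothetical numbers, not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical breeding values of five horses for one trait,
# estimated without (Model 1) and with (Model 2) breed correction.
bv_model1 = [112.0, 95.5, 103.2, 88.1, 120.4]
bv_model2 = [111.2, 96.1, 102.8, 88.9, 119.7]

r = pearson_r(bv_model1, bv_model2)
print(f"r(Model 1, Model 2) = {r:.4f}")
```

A coefficient close to unity, as reported above for all traits, means that both models order the horses almost identically, so the choice of model hardly changes which animals are identified as genetically superior.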

3.5 Discussion The aim of this study was to determine whether including the proportions of genes of foreign breeds in the model could improve genetic evaluation for performance in the Hanoverian, as indicated by the results of Stewart et al. (2009) for genetic evaluation for dressage in the sport horse population of Great Britain. Possible optimization of genetic evaluation for performance through inclusion of a breed effect in the model has previously been shown by Vanderick et al. (2009) for a crossbred dairy cattle population in New Zealand. For the German warmblood horse, especially the Hanoverian, the impact of different proportions of genes of other horse breeds on the prediction of breeding values has not yet been investigated. The breeding aim of the Hanoverian is defined as follows: a rideable, noble, big framed and correct warmblood horse, which, on the basis of its natural abilities, its temperament and character, is suitable as a performance horse as well as a pleasure horse. On this basis the HSS strives for the breeding of talented sport horses for the disciplines dressage, jumping, eventing and driving. Genetic evaluation for performance in the main disciplines of riding sport, dressage and jumping, is routinely performed using records from competitions and performance tests (Von Velsen-Zerweck, 1998).


Publication of breeding values allows performance-oriented matings with regard to dressage, jumping or both. The development of MPT scores over time, as shown for the period 1995 to 2008 in this study, illustrates the breeding progress, particularly in the dressage-related traits. In this connection, possible sale benefits may have led breeders to put more weight on the dressage talent than on the jumping talent of their foals. At a young age, high prices are primarily achieved for foals with good movements.
Results from SBI, MPT and AI were used for this study. SBI is obligatory for all Hanoverian brood mares, as is MPT for dams of stallions born after 1990. For all other mares, MPT is voluntary. Training for and participation in MPT is costly and time-consuming, so breeders may present only their best mares. The much lower increase of SBI scores than of MPT scores may therefore also reflect preselection effects. Unlike MPT, AI gives a more representative cross-section of the population, because riding horse auctions are organized by the HSS as a sale platform for their breeders. Horses of any sex and at different training levels are presented. AI averages accordingly increased at a lower level than the MPT averages, but also indicated the continuing breeding progress. The highly positive additive genetic correlations between analogous performance traits evaluated at MPT and AI (Stock and Distl, 2006) justified the combined use of MPT and AI data in this study. Moderate heritabilities, mostly ranging between 0.15 and 0.58, have been reported for those performance traits evaluated at MPT/AI and SBI in large numbers of horses (e.g. Lührs-Behnke et al., 2006; Stock and Distl, 2006; Stock and Distl, 2007). Results of this study agree with previous estimates, regardless of which of the two models was used.
In the Hanoverian, breeding use of stallions and mares from other breeds is possible, provided it complies with the breeding directives. Thoroughbreds and Trakehners are commonly used to make future progeny nobler, whereas the intended use of Holsteiners is to improve jumping performance. Our data support that higher proportions of genes of Thoroughbred and Trakehner had a positive impact on performance, particularly rideability. This is in line with the current recommendation of the HSS to increase the use of Thoroughbred stallions for breeding. The Holsteiner is especially bred for show jumping (Koenen et al., 2004), so that increased proportions of genes of this breed should be beneficial for free jumping performance in the present analysis. Accordingly, extension of the model used for routine genetic evaluation by effects correcting for the influence of other breeds resulted in slightly lower estimates of genetic variances. However, the decrease of heritabilities was very small, and the impact on genetic evaluation was negligible. Correlations between corresponding breeding values of >0.98 indicate that identification of genetically superior individuals will not be relevantly improved by using the extended model.
Different results were recently obtained with regard to competition data for dressage performance (Stewart et al., 2009). However, the analyzed horse population was much smaller and more heterogeneous than the Hanoverian population analyzed here. Use of preselected data (scores better than 60% and of internationally competing horses) for a very wide range of horse types with limited pedigree information that performed under different and unknown conditions makes it plausible that stratification by breed proportions can influence the results of genetic analyses. In the Hanoverian data analyzed here, possible breed-related differences may already have been sufficiently accounted for by the relationship matrix. Therefore, genetic evaluation in the linear animal model without explicit consideration of the proportion of genes from Thoroughbred, Trakehner and Holsteiner produced almost identical results to the extended model. These results indicate that analyses of performance data collected under standardized conditions for a large number of horses from the same breed will not relevantly benefit from model extension by breed class effects.
Given the positive correlations between young horse performances in performance tests with later success in competitions (Wallin et al., 2003), breeding values estimated on the basis of MPT/AI and SBI information in the standard model should allow reliable identification of individuals for favourable breeding use. Performance-orientated mating will then ensure further breeding progress of the Hanoverian in the main disciplines of riding sport.


Consideration of the proportions of genes of foreign breeds may be required when using different data sources (e.g., competition data from horses from different breeds) or focussing on identification of genome regions influencing performance. Stratification of data by the proportion of genes of the most important foreign breeds may then facilitate using molecular genetic tools to speed up the selection response with respect to performance.

3.6 Conclusion Genetic parameters estimated for performance traits in the Hanoverian without and with the consideration of the proportion of genes of foreign breeds produced almost identical results. Thus the results of this study show that for the Hanoverian population, genetic evaluation for performance in the linear animal model will probably not benefit from including the proportion of genes from Thoroughbred, Trakehner and Holsteiner.

3.7 References
Ducro, B.J., Koenen, E.P.C., van Tartwijk, J.M.F.M., Bovenhuis, H., 2007. Genetic relations of movement and free-jumping traits with dressage and show-jumping performance in competition of horses. Livest. Sci. 107, 227-234.
Groeneveld, E., Kovač, M., Wang, T., 1990. PEST, a general purpose BLUP package for multivariate prediction and estimation. In: World Congress on Genetics Applied to Livestock Production, 488-491. Edinburgh, UK Garsi, Madrid.
Hamann, H., Distl, O., 2008. Genetic variability in Hanoverian warmblood horses using pedigree analysis. J. Anim. Sci. 86, 1503-1513.
Koenen, E.P.C., Aldridge, L.I., Philipsson, J., 2004. An overview of breeding objectives for warmblood sport horses. Livest. Prod. Sci. 88, 77-84.
Kovač, M., Groeneveld, E., Garcia-Cortez, A., 2003. VCE-5 User’s Guide and Reference Manual Version 5.1.2. Institute for Animal Science and Animal Husbandry, Federal Agricultural Research Centre (Bundesforschungsanstalt für Landwirtschaft, FAL), Mariensee / Neustadt, Germany.
Lührs-Behnke, H., Röhe, R., Kalm, E., 2006. Genetische Parameter für Zuchtstutenprüfungsmerkmale der verschiedenen deutschen Warmblutzuchtverbände. Züchtungskunde 78, 271-280.
Stewart, I.D., Woolliams, J.A., Brotherstone, S., 2009. Genetic evaluation of horses for performance in dressage competitions in Great Britain. Livest. Sci., doi:10.1016/j.livsci.2009.10.011.
Stock, K.F., Distl, O., 2006. Genetic correlations between conformation traits and radiographic findings in the limbs of German Warmblood riding horses. Gen. Sel. Evol. 38, 657-671.
Stock, K.F., Distl, O., 2007. Genetic correlations between performance traits and radiographic findings in the limbs of German Warmblood riding horses. J. Anim. Sci. 85, 31-41.
Vanderick, S., Harris, B.L., Pryce, J.E., Gengler, N., 2009. Estimation of test-day model (co)variance components across breeds using New Zealand dairy cattle data. J. Dairy Sci. 92, 1240-1252.
Von Velsen-Zerweck, A., 1998. Integrierte Zuchtwertschätzung für Zuchtpferde. Diss. agr. Georg August Universität Göttingen.
Wallin, L., Strandberg, E., Philipsson, J., 2003. Genetic correlations between field test results of Riding Horses as 4-year-olds and lifetime performance results in dressage and show jumping. Livest. Prod. Sci. 82, 61-71.


Table 1 Least square means (LSM) with their standard errors (SE) of performance scores from performance tests (PT) and studbook inspections (SBI) for the proportions of genes of Thoroughbred, Trakehner and Holsteiner, estimated in 36,441 Hanoverians from birth years 1992 to 2005.

Breed  Class  PT_Walk       PT_Trot       PT_Canter     PT_FJT        PT_Ride       SBI_Walk      SBI_Imp       SBI_Corr
TB     1      6.58 ± 0.017  6.49 ± 0.017  6.78 ± 0.016  6.80 ± 0.038  6.98 ± 0.016  6.67 ± 0.014  6.72 ± 0.014  6.63 ± 0.011
TB     2      6.72 ± 0.014  6.60 ± 0.015  6.88 ± 0.014  6.61 ± 0.035  7.07 ± 0.014  6.77 ± 0.012  6.80 ± 0.012  6.62 ± 0.010
TB     3      6.77 ± 0.017  6.55 ± 0.017  6.89 ± 0.016  6.35 ± 0.038  7.05 ± 0.016  6.80 ± 0.015  6.74 ± 0.014  6.64 ± 0.012
TRAK   1      6.66 ± 0.015  6.44 ± 0.016  6.81 ± 0.015  6.71 ± 0.037  6.98 ± 0.015  6.70 ± 0.013  6.69 ± 0.012  6.61 ± 0.010
TRAK   2      6.68 ± 0.015  6.57 ± 0.015  6.87 ± 0.014  6.59 ± 0.036  7.05 ± 0.014  6.75 ± 0.015  6.78 ± 0.012  6.65 ± 0.010
TRAK   3      6.73 ± 0.018  6.63 ± 0.017  6.87 ± 0.017  6.47 ± 0.038  7.08 ± 0.016  6.78 ± 0.015  6.80 ± 0.014  6.63 ± 0.012
HOL    1      6.77 ± 0.012  6.64 ± 0.013  6.84 ± 0.013  6.15 ± 0.034  7.06 ± 0.012  6.81 ± 0.01   6.79 ± 0.009  6.61 ± 0.008
HOL    2      6.79 ± 0.024  6.60 ± 0.023  6.88 ± 0.022  6.51 ± 0.045  7.09 ± 0.021  6.81 ± 0.02   6.79 ± 0.020  6.62 ± 0.017
HOL    3      6.52 ± 0.016  6.40 ± 0.016  6.83 ± 0.016  7.11 ± 0.037  6.97 ± 0.015  6.61 ± 0.01   6.69 ± 0.014  6.66 ± 0.011

PT_Walk = walk under rider evaluated at mare performance test (MPT) or auction inspection (AI); PT_Trot = trot under rider evaluated at MPT/AI; PT_Canter = canter under rider evaluated at MPT/AI; PT_FJT = total score for free jumping evaluated at MPT/AI; PT_Ride = rideability as judged by the judging commission evaluated at MPT/AI; SBI_Walk = walk at hand evaluated at studbook inspection (SBI); SBI_Imp = impetus and elasticity in trot at hand evaluated at SBI; SBI_Corr = correctness of gaits in walk and trot at hand evaluated at SBI; TB = Thoroughbred (1 = ≤0.13; 2 = >0.13 and <0.30; 3 = ≥0.30 of Thoroughbred genes); TRAK = Trakehner (1 = ≤0.2; 2 = >0.2 and <0.8; 3 = ≥0.8 of Trakehner genes); HOL = Holsteiner warmblood (1 = 0.0; 2 = >0.0 and <0.3; 3 = ≥0.3 of Holsteiner genes).


Table 2 Comparison of residual variances (σe²), event variances (σr²) and additive genetic variances (σa²) and of heritabilities (h²), which were univariately estimated for performance traits from performance tests (PT) and studbook inspections (SBI) of 36,415 Hanoverian Warmblood horses from birth years 1992-2005 without (Model 1) or with (Model 2) correction for the proportion of genes of Thoroughbred, Trakehner and Holsteiner in the model. Values are estimates ± standard errors.

Trait      Model  σe²              σr²              σa²              h²
PT_Walk    1      0.6126 ± 0.0087  0.0416 ± 0.0028  0.2207 ± 0.0113  0.2523 ± 0.0114
PT_Walk    2      0.6161 ± 0.0087  0.0416 ± 0.0029  0.2152 ± 0.0113  0.2465 ± 0.0115
PT_Trot    1      0.4113 ± 0.0073  0.1008 ± 0.0051  0.2627 ± 0.0099  0.3391 ± 0.0107
PT_Trot    2      0.4116 ± 0.0073  0.1012 ± 0.0051  0.2620 ± 0.0099  0.3382 ± 0.0107
PT_Canter  1      0.3889 ± 0.0077  0.1006 ± 0.0056  0.2141 ± 0.0105  0.3043 ± 0.0114
PT_Canter  2      0.3907 ± 0.0076  0.1008 ± 0.0056  0.2112 ± 0.0102  0.3006 ± 0.0113
PT_Ride    1      0.3934 ± 0.0070  0.0973 ± 0.0052  0.1461 ± 0.0089  0.2295 ± 0.0107
PT_Ride    2      0.3944 ± 0.0070  0.0974 ± 0.0052  0.1446 ± 0.0088  0.2272 ± 0.0107
PT_FJT     1      0.7561 ± 0.0084  1.1011 ± 0.0232  0.5455 ± 0.0118  0.2271 ± 0.0080
PT_FJT     2      0.7609 ± 0.0084  1.1028 ± 0.0232  0.5380 ± 0.0118  0.2240 ± 0.0080
SBI_Imp    1      0.4106 ± 0.0073  0.0428 ± 0.0024  0.2354 ± 0.0099  0.3418 ± 0.0107
SBI_Imp    2      0.4108 ± 0.0073  0.0428 ± 0.0024  0.2348 ± 0.0099  0.3411 ± 0.0107
SBI_Corr   1      0.4279 ± 0.0056  0.0228 ± 0.0019  0.0578 ± 0.0058  0.1137 ± 0.0079
SBI_Corr   2      0.4291 ± 0.0056  0.0228 ± 0.0019  0.0560 ± 0.0057  0.1104 ± 0.0079
SBI_Walk   1      0.5180 ± 0.0079  0.0565 ± 0.0027  0.1890 ± 0.0101  0.2475 ± 0.0107
SBI_Walk   2      0.5195 ± 0.0081  0.0565 ± 0.0027  0.1869 ± 0.0102  0.2450 ± 0.0109

PT_Walk = walk under rider evaluated at mare performance test (MPT) or auction inspection (AI); PT_Trot = trot under rider evaluated at MPT/AI; PT_Canter = canter under rider evaluated at MPT/AI; PT_FJT = total score for free jumping evaluated at MPT/AI; PT_Ride = rideability as judged by the judging commission evaluated at MPT/AI; SBI_Walk = walk at hand evaluated at studbook inspection (SBI); SBI_Imp = impetus and elasticity in trot at hand evaluated at SBI; SBI_Corr = correctness of gaits in walk and trot at hand evaluated at SBI.


Table 3 Pearson correlation coefficients determined between breeding values which were univariately estimated for performance traits from studbook inspections (SBI) and performance tests (PT) of 36,415 Hanoverian Warmblood horses from birth years 1992-2005 without (Model 1) or with (Model 2) correction for the proportion of genes of Thoroughbred, Trakehner and Holsteiner in the model; correlations based on breeding values of all 80,746 horses in the relationship matrix.

Model 1
Trait      SBI_Corr  SBI_Walk  PT_Walk   PT_Trot   PT_Canter  PT_Ride   PT_FJT
SBI_Elast  0.5648    0.5749    0.5313    0.7769    0.6836     0.6874    -0.2978
SBI_Corr             0.4176    0.3950    0.5130    0.4738     0.5160    -0.2487
SBI_Walk                       0.7432    0.5563    0.5490     0.5976    -0.3742
PT_Walk                                  0.6338    0.6170     0.6930    -0.4499
PT_Trot                                            0.7669     0.7892    -0.4146
PT_Canter                                                     0.7884    -0.2219
PT_Ride                                                                 -0.3143

Model 2
Trait      SBI_Corr  SBI_Walk  PT_Walk   PT_Trot   PT_Canter  PT_Ride   PT_FJT
SBI_Elast  0.5566    0.5603    0.5150    0.7670    0.6807     0.6835    -0.2557
SBI_Corr             0.3764    0.3492    0.4841    0.4804     0.5062    -0.1559
SBI_Walk                       0.7496    0.5494    0.5314     0.5924    -0.3855
PT_Walk                                  0.6223    0.6028     0.6870    -0.4484
PT_Trot                                            0.7540     0.7860    -0.3877
PT_Canter                                                     0.7878    -0.1694
PT_Ride                                                                 -0.2810

PT_Walk = walk under rider evaluated at mare performance test (MPT) or auction inspection (AI); PT_Trot = trot under rider evaluated at MPT/AI; PT_Canter = canter under rider evaluated at MPT/AI; PT_FJT = total score for free jumping evaluated at MPT/AI; PT_Ride = rideability as judged by the judging commission evaluated at MPT/AI; SBI_Walk = walk at hand evaluated at studbook inspection (SBI); SBI_Imp = impetus and elasticity in trot at hand evaluated at SBI; SBI_Corr = correctness of gaits in walk and trot at hand evaluated at SBI.


Figure 1 Development of performance-related scores from studbook inspections (SBI) between 1995 and 2008 in 29,031 Hanoverian mares from birth years 1992-2005. SBI_Walk = walk at hand evaluated at studbook inspection (SBI); SBI_Imp = impetus and elasticity in trot at hand evaluated at SBI; SBI_Corr = correctness of gaits in walk and trot at hand evaluated at SBI.


Figure 2 Development of scores from mare performance tests (MPT) and auction inspections (AI) between 1995 and 2008 in 24,882 Hanoverians from birth years 1992-2005. MPT_Walk = walk under rider evaluated at mare performance test (MPT); MPT_Trot = trot under rider evaluated at MPT; MPT_Canter = canter under rider evaluated at MPT; AI_Walk = walk under rider evaluated at auction inspection (AI); AI_Trot = trot under rider evaluated at AI; AI_Canter = canter under rider evaluated at AI.


Figure 3 Cumulative percentage of the proportion of Hanoverian warmblood (Han), Thoroughbred (TB), Trakehner (Trak) and Holsteiner warmblood (Hol) genes in 36,415 Hanoverian Warmblood horses from birth years 1992-2005.


[Chart not reproduced. Rows: SBI_Walk, SBI_Imp and SBI_Corr, each for the upper 10%, 20-80% and lower 10% of scores; axis: 0% to 100%; legend: Han, TB, Trak, Hol, Others.]

Supplementary Figure 1 Proportions of genes of Hanoverian warmblood (Han), Thoroughbred (TB), Trakehner (Trak), Holsteiner warmblood (Hol), and other breeds (Others) by quantiles of scores for correctness of gaits in walk and trot at hand (SBI_Corr), impetus and elasticity in trot at hand (SBI_Imp) and walk at hand (SBI_Walk) evaluated at studbook inspections between 1995 and 2008 in 29,031 Hanoverian mares from birth years 1992-2005.


[Chart not reproduced. Rows: PT_Walk, PT_Trot, PT_Canter, PT_FJT and PT_Ride, each for the upper 10%, 20-80% and lower 10% of scores; axis: 0% to 100%; legend: Han, TB, Trak, Hol, Others.]

Supplementary Figure 2 Proportions of genes of Hanoverian warmblood (Han), Thoroughbred (TB), Trakehner (Trak), Holsteiner warmblood (Hol), and other breeds (Others) by quantiles of scores for walk under rider (PT_Walk), trot under rider (PT_Trot), canter under rider (PT_Canter), total score for free jumping (PT_FJT), and rideability as judged by the judging commission (PT_Ride) evaluated at mare performance tests and auction inspections between 1995 and 2008 in 24,882 Hanoverian mares from birth years 1992-2005.


CHAPTER 4

Genetic evaluation of Hanoverian warmblood horses for conformation traits considering the proportion of genes of foreign breeds

WIEBKE SCHRÖDER, KATHRIN FRIEDERIKE STOCK and OTTMAR DISTL

Department of Animal Breeding and Genetics, University of Veterinary Medicine Hannover (Foundation), Hannover; Germany

Archiv Tierzucht 53 (2010) 4, 377-387, ISSN 0003-9438


Genetic evaluation for conformation

4 Genetic evaluation of Hanoverian warmblood horses for conformation traits considering the proportion of genes of foreign breeds

4.1 Abstract Conformation data of a total of 29,053 Hanoverian warmblood mares were used to determine whether genetic evaluation for conformation in the Hanoverian could benefit from the inclusion of the proportion of genes of foreign breeds in the model. For our analyses, we considered all Hanoverian mares born from 1992 to 2005 with available studbook inspection data. Genetic parameters were estimated univariately for eight routinely scored conformation traits (head, neck, saddle position, front legs, hind legs, type, frame, and general impression and development) and height at withers from studbook inspections, in a linear animal model using Residual Maximum Likelihood (REML). Genetic evaluation was subsequently performed using Best Linear Unbiased Prediction. To investigate the effect of correcting for the proportion of genes of foreign breeds, two different models were used for the analyses. In Model 1, the fixed effect age at studbook inspection and the random effect date-place interaction were considered. In Model 2, proportions of genes of Thoroughbred, Trakehner and Holsteiner were additionally included as fixed effects. Heritabilities of the analyzed conformation traits and withers height ranged between 0.10 and 0.57 in both models, with standard errors of ≤0.01. Pearson correlation coefficients determined between breeding values of corresponding traits using Model 1 and Model 2 were highly positive (>0.99), indicating little effect of the model on the results of genetic evaluation. According to our results, using a model which includes the proportion of genes of Thoroughbred, Trakehner and Holsteiner as fixed effects will not relevantly improve genetic evaluation for conformation in the Hanoverian.

Keywords: horse; Hanoverian; genetic parameters; blood proportion; breeding values; conformation; type


4.2 Zusammenfassung Genetic evaluation of conformation traits in the Hanoverian, considering proportions of genes of foreign breeds

We investigated whether taking the proportions of genes of foreign breeds into account can improve a model for genetic analyses of conformation traits. For this purpose, studbook inspection data of a total of 29,053 Hanoverian mares born from 1992 to 2005 were available from the Hannoveraner Verband. Genetic parameters were estimated univariately in a linear animal model using Residual Maximum Likelihood for eight conformation traits routinely scored at studbook inspections (head, neck, saddle position, front legs, hind legs, type, frame, and general impression and development) and for height at withers. Breeding values were subsequently estimated using Best Linear Unbiased Prediction and the estimated variances. To examine the effects of considering proportions of genes of foreign breeds on genetic analyses, two different models were used. In Model 1, age at studbook inspection was included as a fixed effect and the combination of date and place as a random effect. A second model was additionally extended by the gene proportion classes of Thoroughbred, Trakehner and Holsteiner as fixed effects. Heritabilities for the analyzed conformation traits and height at withers ranged between 0.10 and 0.57 in both models, with standard errors ≤0.01. Breeding values estimated with Model 1 and Model 2 were highly positively correlated (Pearson correlation coefficients >0.99). In summary, our results show that considering proportions of genes of foreign breeds in a model for breeding value estimation for conformation traits in the Hanoverian brings no substantial advantage.

Keywords: horse; Hanoverian; genetic parameters; blood proportions; breeding values; conformation; type


GWAS for show-jumping


CHAPTER 5

A genome wide association study for quantitative trait loci of show-jumping in Hanoverian warmblood horses

W. Schröder, A. Klostermann, K. F. Stock and O. Distl

Institute for Animal Breeding and Genetics, University of Veterinary Medicine Hannover, Bünteweg 17p, 30559 Hannover, Germany


5 A genome wide association study for quantitative trait loci of show-jumping in Hanoverian warmblood horses

5.1 Summary Show-jumping is an economically important breeding goal in Hanoverian warmblood horses. The aim of this study was to perform a genome-wide association (GWA) study for quantitative trait loci (QTL) of show-jumping in Hanoverian warmblood horses employing the Illumina equine SNP50 BeadChip. For our analyses we genotyped 115 stallions of the National State stud of Lower Saxony. The show-jumping talent of a horse includes style and ability of free jumping. To control spurious associations based on population stratification, two different mixed linear animal model approaches were employed in addition to linear models with adaptive permutations for correcting for multiple testing. Population stratification was explained best in the mixed linear animal model considering Hanoverian, Thoroughbred, Trakehner and Holsteiner genes and the marker identity-by-state relationship matrix. We identified six QTL for show-jumping on horse chromosomes (ECA) 1, 8, 9 and 26 (-log10 P-value > 5) and further putative QTL with -log10 P-values of 3-5 on ECA1, 2, 3, 11, 17 and 21. Within the six QTL regions, we identified human performance-related genes including PAPSS2 on ECA1, MYL2 on ECA8, TRHR on ECA9 and NRF2 on ECA26, and within the putative QTL regions NRAP on ECA1 and TBX4 on ECA11. The results of our GWA study suggest that genes involved in muscle structure, development and metabolism are crucial for elite show-jumping performance. Further studies are necessary to validate these QTL in larger data sets and other horse populations.

Keywords: horse, show-jumping, quantitative trait loci, GWA, single nucleotide polymorphisms


5.2 Introduction Performance in show jumping is an economically important trait in warmblood horses. An athletic horse, suitable for dressage and show-jumping, is the main aim of the Hanoverian Studbook Society (HSS). Originally bred as a horse suitable for military use, the Hanoverian warmblood (Hanoverian) has, since the foundation of the HSS as early as 1735, been intensely selected for the athletic and competitive phenotype required of an intensively used riding horse. Therefore, the Hanoverian represents one of the most important breeds of sport horses in the world today. Heritabilities for show-jumping in Hanoverian horses were estimated at 0.39 to 0.61 (Stock & Distl 2007). In 1993, a breeding program for show-jumpers, "Programm Hannoveraner Springpferdezucht", was initiated with the aim of placing more emphasis on breeding elite Hanoverian show-jumping horses. Due to the long generation interval, genetic improvement in horses needs longer time-spans to be realised, so the application of genetic markers in selection schemes to improve physical performance appears highly desirable. However, even though population genetic analyses are performed routinely nowadays, studies on QTL and candidate genes contributing to equine performance are still in their infancy. Gu et al. (2009) localized genomic regions in the Thoroughbred genome potentially containing genes that influence exercise-related phenotypes by using a hitchhiking mapping approach based on microsatellites. Recently, Hill et al. (2010) identified a single nucleotide polymorphism (SNP) in the myostatin gene (MSTN) which is strongly associated with best results over short and long race distances among Thoroughbreds. In humans, numerous studies have been performed to identify QTL and candidate genes affecting physical performance in different sports. With respect to physical performance, humans are therefore the best-studied species so far (Bray et al. 2009). For humans and a few other species, e.g. cattle and dogs, genotyping arrays containing SNP markers have been successfully used for mapping QTL for quantitative traits (Karlsson et al. 2007; Myles et al. 2008; Kolbehdari et al. 2009). With the completion of the equine genome assembly, SNP assays covering the whole equine genome have been developed to scan genetic variation in horses at a very high resolution.


The objectives of our study were to perform a genome-wide association (GWA) analysis for show-jumping in Hanoverian horses using the equine SNP50 BeadChip (Illumina, San Diego, CA, USA) and to screen the potential QTL for possible candidate genes known from human studies.

5.3 Materials and Methods 5.3.1 Animals and phenotypic data Blood samples were collected from 115 Hanoverian warmblood stallions of the National State stud of Lower Saxony. These stallions were born between 1972 and 2000 and represent a random sample from all Hanoverian stallions born in the last 20-30 years. Pedigree data were made available by the HSS through the national unified animal ownership database (Vereinigte Informationssysteme Tierhaltung w.V., VIT). Pedigree records of these stallions allowed us to assign the 115 stallions to 16 families which included a total of 798 stallions. We employed the latest breeding values (BVs) for show-jumping (May 2009) provided by the HSS. BVs for show-jumping were estimated based on results recorded at mare performance tests (MPTs) since 1987 and inspections before auctions (PAIs) since 1999, including 35,512 animals. Show-jumping is a composite trait resulting from scores for style and ability of free-jumping. At MPTs, mares are scored by a judging commission for show-jumping using a scale from 0 (not shown) to 10 (excellently shown) with 0.5 intervals. Style and ability of free jumping are scored separately and then averaged to a total score for show-jumping. Horses pre-selected for sale at riding horse auctions of the HSS are scored for show-jumping by a judging commission. Between 1999 and 2008, 8,081 Hanoverians (5,567 males, 2,514 females) were judged at PAIs. Auction candidates are scored for the same traits and on the same scale as the mares at MPTs. The same commission judges the mares at MPTs and the auction candidates to ensure comparable results. If a mare took part in a PAI as well as in an MPT, the result of the MPT is included in the BV estimation. For mares which repeated the MPT, the last result is included. BVs for show-jumping are estimated yearly by VIT employing a bivariate BLUP (best linear unbiased prediction) animal model (Christmann 1996).


$$y_{ijk} = \mu + \mathrm{TEST}_i + a_j + e_{ijk}$$

with $y_{ijk}$ = score for style and ability of free-jumping, $\mu$ = model constant, $\mathrm{TEST}_i$ = fixed effect of the individual test (MPT or PAI), defined as the interaction between place, year and season of performance evaluation, $a_j$ = random additive genetic effect of the individual horse and $e_{ijk}$ = random residual. Season was classified into two levels (January to June and July to December). BVs were standardized to a mean value of 100 points and a standard deviation of 20 points, using the horses born in the years 1999 and 2000 as reference. Every year the reference birth years move forward by one year. For the investigated stallions, the BVs for show-jumping ranged from 56 to 171 (mean 103 ± 28) (Table S2). Reliabilities ranged between 0.25 and 0.99 (mean 0.87 ± 0.13). The distributions of BVs for show-jumping were analyzed using the procedure UNIVARIATE of SAS software (Statistical Analysis System, version 9.2, SAS Institute, Cary, NC, USA, 2010). For each stallion the proportion of genes of Hanoverian (HAN), Thoroughbred (TB), Trakehner (TRAK) and Holsteiner (HOL) horses was calculated using all available pedigree information. Details are described elsewhere (Hamann & Distl 2008). Mean (median) proportions of genes in the stallions were 0.54 (0.63) for HAN, 0.28 (0.19) for TB, 0.05 (0.03) for TRAK, and 0.06 (0) for HOL. Given the uneven representation of gene proportions, classes were defined for each of the four breeds as follows: HAN: ≤0.34, >0.34 and <0.78, ≥0.78; TB: ≤0.13, >0.13 and <0.30, ≥0.30; TRAK: ≤0.20, >0.20 and <0.80, ≥0.80; HOL: 0.00, >0.00 and <0.30, ≥0.30.
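The standardization of breeding values and the grouping of gene proportions described above can be sketched as follows. This is a minimal illustration, not the routine used by VIT or the HSS: the function names, the use of the population standard deviation of the reference group, the class numbering 1 to 3, and the raw breeding values are all assumptions; only the target scale (mean 100, SD 20), the class thresholds and the median gene proportions are taken from the text.

```python
from statistics import mean, pstdev

# Class bounds (lower cut, upper cut) as given in the text for HAN, TB and TRAK.
CLASS_BOUNDS = {"HAN": (0.34, 0.78), "TB": (0.13, 0.30), "TRAK": (0.20, 0.80)}


def standardize_bvs(raw_bvs, reference_ids, target_mean=100.0, target_sd=20.0):
    """Rescale raw breeding values so the reference horses get the target mean and SD."""
    reference = [raw_bvs[horse] for horse in reference_ids]
    ref_mean = mean(reference)
    ref_sd = pstdev(reference)  # assumption: population SD of the reference group
    if ref_sd == 0:
        raise ValueError("reference breeding values must not all be equal")
    return {
        horse: target_mean + target_sd * (bv - ref_mean) / ref_sd
        for horse, bv in raw_bvs.items()
    }


def gene_class(breed, proportion):
    """Return class 1, 2 or 3 for a gene proportion (the numbering is an assumption)."""
    if breed == "HOL":
        # HOL: exactly 0.00 | >0.00 and <0.30 | >=0.30
        if proportion == 0.0:
            return 1
        return 2 if proportion < 0.30 else 3
    low, high = CLASS_BOUNDS[breed]
    if proportion <= low:
        return 1
    return 2 if proportion < high else 3


# Hypothetical raw breeding values; horses A to C stand in for the reference birth years.
raw = {"A": -0.4, "B": 0.1, "C": 0.6, "D": 1.1}
standardized = standardize_bvs(raw, reference_ids=["A", "B", "C"])

# Median gene proportions of the stallions as reported in the text.
medians = {"HAN": 0.63, "TB": 0.19, "TRAK": 0.03, "HOL": 0.0}
classes = {breed: gene_class(breed, p) for breed, p in medians.items()}

print({horse: round(bv, 1) for horse, bv in standardized.items()})
print(classes)
```

Because the scale is anchored to the reference horses only, animals outside the reference group (horse D in the example) can lie well above or below 100 points.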

5.3.2 Genotyping SNPs Genomic DNA was extracted from EDTA blood samples of the 115 Hanoverian warmblood stallions by standard ethanol fractionation with concentrated sodium chloride (6M NaCl) and sodium dodecyl sulphate (10% SDS). The concentration of the extracted DNA was determined using a Nanodrop ND 1000 (Peqlab Biotechnology, Erlangen, Germany). The DNA concentration of the samples was adjusted to 50 ng/μl.


GWAS for show-jumping

Genotyping was performed with the Illumina equine SNP50 BeadChip containing 54,602 SNP markers, using standard procedures as recommended by the manufacturer. Raw data were analysed using the genotyping module, version 3.2.32, of the BeadStudio program (Illumina). To assign the genotypes, we generated a cluster file with the same software and module.

5.3.3 Data analysis

For genome-wide mapping, we performed association analyses for all SNPs with a minor allele frequency (MAF) >0.05 and a call rate >0.90. No SNP was excluded on the basis of the missingness test. There were 7875 SNPs that did not reach a sufficient MAF and 3951 SNPs that had a call rate ≤0.90, so 43,441 SNPs were left for the association analyses. To control for spurious associations, we tested possible effects of stratification on the outcome of the GWA and employed empirical genome-wide error probabilities obtained through adaptive permutations. The models employed were parameterized using PLINK, version 1.07 (http://pngu.mgh.harvard.edu/~purcell/plink/ (Purcell et al. 2007)) and TASSEL, version 2.1 (http://www.maizegenetics.net/tassel (Bradbury et al. 2007)). First, genome-wide associations were determined without any parameters for potential data stratification (Plink1). Adaptive permutations for the correction for multiple testing were performed with a maximum of 1,000,000 permutations using the “--assoc” and the “--aperm” options of PLINK. An extended model included the gene proportions of the important founder breeds HAN, TB, TRAK and HOL to improve the results of the GWA analyses. The covariates were considered as class effects with four levels each. The adaptive permutations were done applying a linear regression model using the “--linear” and “--covar” options of PLINK (Plink2). In a third PLINK model, we tested the effect of family structure on the GWA analysis. We performed Cochran-Mantel-Haenszel (CMH) tests within the 16 families and simultaneously considered the gene proportions of HAN, TB, TRAK and HOL as covariates. Here, we utilized the “--mh”, “--within” and “--aperm” options of PLINK (Plink3).

A mixed linear animal model (MLM) was employed to control for marker-based population structure (Q-matrix) and marker identity-by-state (IBS) based kinship among all individuals (K-matrix) using TASSEL (http://www.maizegenetics.net/tassel (Bradbury et al. 2007)) (Tassel1). The data for building these two matrices came from 7375 genome-wide, equidistantly distributed SNPs with pairwise linkage disequilibrium (r2) <0.2 (Ritland 1996). This subset was generated as a pruned set of SNPs in approximate linkage equilibrium with each other, using the “--indep-pairwise 2000 500 0.2” option of PLINK (sliding window size of 2000 SNPs, window shift steps of 500 SNPs and an r2 threshold of 0.2). The Q-matrix contained three covariates for the cryptic structure of the stallions as determined by STRUCTURE, version 2.3.3 (Pritchard et al. 2000) via optimization of the likelihood of the data. Using the KIN option of TASSEL, the K-matrix was created by calculating the marker IBS coefficients. We implemented two MLM models. The first MLM model accounted for the effects of the cryptic data structure as determined via STRUCTURE and for the IBS-kinship matrix (Tassel1). In the second MLM model (Tassel2), the gene proportions of HAN, TB, TRAK and HOL as covariates and the IBS-kinship matrix were taken into account. The MLM (Yu et al. 2006) was implemented in TASSEL as described in Henderson’s notation (Bradbury et al. 2007):

$y = Q\beta + Zu + Ga + e$

where y is the BV for show-jumping; β is an unknown vector containing the fixed effects of population structure (Q-matrix) or of the proportions of genes of HAN, TB, TRAK and HOL; u is an unknown vector of random additive genetic effects from multiple background QTL for individuals (K-matrix); a contains the unknown genotype effects of the SNPs in the GWA; Q, Z and G are the known design matrices for the respective effects of population structure, polygenic effects and SNP genotype effects; and e is the unobserved vector of random residuals. Subsequently, we calculated the observed polymorphism information content (PIC) using the ALLELE procedure of SAS/Genetics (Statistical Analysis System, version 9.2, SAS Institute, Cary, NC, USA, 2010).
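As an illustration of the PIC mentioned above, the following minimal sketch computes the PIC of a biallelic SNP from its minor allele frequency. It assumes the commonly used formula PIC = 1 - (p² + q²) - 2p²q² with p + q = 1; whether the SAS procedure applies exactly this expression is an assumption, and the function name is hypothetical.

```python
def pic_biallelic(maf):
    """Polymorphism information content of a biallelic marker.

    Assumes PIC = 1 - sum(p_i^2) - 2 * p^2 * q^2 for two alleles
    with frequencies p = 1 - maf and q = maf.
    """
    p = 1.0 - maf
    q = maf
    return 1.0 - (p * p + q * q) - 2.0 * p * p * q * q


# A SNP with MAF 0.35 gives a PIC of about 0.35; PIC can never exceed
# 0.375 for a biallelic marker (reached at MAF 0.5).
pic_example = pic_biallelic(0.35)
```

For a MAF of 0.35, 0.07 and 0.47 this sketch gives rounded values of 0.35, 0.12 and 0.37, in line with the corresponding MAF and PIC pairs reported in Table 1.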

We built quantile-quantile (Q-Q) plots to visualise the observed versus expected P-value distribution for each of the models employed. The observed -log10 P-values were plotted against the -log10 P-values expected under the null hypothesis of independence (Fig. 1-5). The divergence between the expected distribution (regression line) and the distribution of the observed -log10 P-values represents the inflation of P-values, which is mainly caused by data stratification. According to the Q-Q plots, the smallest inflation of -log10 P-values was observed using the MLM with the gene proportions of HAN, TB, TRAK and HOL as covariates and the K-matrix for random additive genetic effects due to the IBS relationships among all animals (Tassel2).
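The construction of such a Q-Q plot can be sketched as follows. This is a minimal illustration, not the procedure used for Fig. 1-5: it assumes that the expected P-value of the i-th smallest of n observed P-values under the null hypothesis is i / (n + 1), which is one common choice of plotting position. The function name is hypothetical.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10 P-values for a Q-Q plot.

    Assumption: under the null hypothesis, P-values are uniform on (0, 1),
    and the i-th smallest of n P-values is expected at i / (n + 1).
    """
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    expected = sorted(-math.log10(i / (n + 1)) for i in range(1, n + 1))
    # Both lists are sorted ascending, so equal ranks are paired.
    return list(zip(expected, observed))


# Points above the diagonal (observed > expected) indicate inflation.
points = qq_points([0.5, 0.25, 0.75])
```

Plotting the pairs with the expected values on the X-axis and the observed values on the Y-axis gives the diagonal X=Y when no inflation is present.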

Based on the Q-Q plots, we defined a SNP as significant with -log10(P) > 3 and as highly significant with -log10(P) > 5 using Tassel1 or Tassel2. We found 1000 significant SNPs using Plink1, 694 significant SNPs using Plink2 and 151 significant SNPs using Plink3. Using MLM, 270 SNPs were significantly associated with the BV for show-jumping employing Tassel1 and 57 SNPs employing Tassel2. Subsequently, potential QTL were defined as genomic regions with at least one SNP marker estimated as highly significant using Tassel1 or Tassel2, at least significant using both models, and with -log10(P) > 1 using any other model. Further putative QTL were defined as genomic regions harbouring at least one SNP marker estimated as significant using both Tassel1 and Tassel2 and with -log10(P) > 1 using any other model. Estimates of the additive and dominance effects for the most significant SNP within each potential QTL were obtained using Best Linear Unbiased Prediction (BLUP) with the software PEST (Groeneveld et al. 1990) and the following model:

$y_{ijklmno} = \mu + GT_i + HAN_j + TB_k + TRAK_l + HOL_m + a_n + e_{ijklmno}$

with $y_{i \ldots o}$ = BV for show-jumping, $\mu$ = model constant, $GT_i$ = genotype of the most significant SNP within each QTL, $HAN_j$ = proportion of Hanoverian genes, $TB_k$ = proportion of Thoroughbred genes, $TRAK_l$ = proportion of Trakehner genes, $HOL_m$ = proportion of Holsteiner genes, $a_n$ = random additive genetic effect of the individual horse (o = 1-3665) and $e_{i \ldots o}$ = residual.
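The thresholds used earlier in this section to define potential and putative QTL can be written as a small decision rule. The sketch below is only an illustration: it interprets "any other model" as every one of the remaining (PLINK) models, which is an assumption, and the function and argument names are hypothetical.

```python
def classify_snp(tassel1, tassel2, other_models,
                 significant=3.0, highly_significant=5.0, support=1.0):
    """Classify a SNP from its -log10 P-values in the different models.

    tassel1, tassel2: -log10 P-values from the two MLM analyses
    other_models: -log10 P-values from the remaining models (e.g. Plink1-3)
    """
    # Assumption: each of the other models must exceed the support level.
    others_ok = all(value > support for value in other_models)
    both_significant = tassel1 > significant and tassel2 > significant
    any_highly = tassel1 > highly_significant or tassel2 > highly_significant

    if both_significant and others_ok:
        return "QTL" if any_highly else "putative QTL"
    return "none"


# Values of BIEC2-236 as listed in Table 1 (Tassel 1, Tassel 2, Plink 1-3).
label = classify_snp(5.60, 3.92, [3.32, 2.59, 2.64])
```

With the values of BIEC2-236 from Table 1 the rule returns "QTL"; a SNP exceeding 3 but not 5 in both MLM analyses would be labelled "putative QTL".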

The additive genetic effect of the most significant SNP within each QTL was estimated as half of the difference between the least squares means of the two homozygous genotypes. The dominance effect was calculated as the deviation of the least squares mean of the heterozygotes from the average of the two homozygous genotypes. Significance was tested using F-tests. The genotype-based BVs (gBV) for show-jumping were calculated for each stallion based on the observed additive or dominance effects of these SNPs.

$gBV_{1-115} = \sum_{i=1}^{6} (a + d)_i$

with gBV = genotype-based BV for show-jumping, a = additive effect of each SNP, d = dominance effect of each SNP. For better comparison, we standardized gBVs to a mean of 100 points and a standard deviation of 20 points. Subsequently, correlations between BV show-jumping and gBV show-jumping were calculated using the CORR procedure of SAS/Genetics. We performed multiple analyses of variance (ANOVA) to test the influence of gBV show-jumping on the distribution of BV show-jumping (r2) using the GLM procedure of SAS/Genetics.
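The steps from least squares means to standardized gBVs can be sketched as follows. The additive and dominance effects follow the definitions a = (m22 - m11) / 2 and d = m12 - (m11 + m22) / 2. How each genotype contributes to the sum is not spelled out in the text, so the coding used here (genotype 11 contributes -a, genotype 12 contributes d, genotype 22 contributes +a) is an assumption, as are all function names, the data layout and the use of the population standard deviation.

```python
import statistics


def additive_dominance(m11, m12, m22):
    """Additive (a) and dominance (d) effect from genotype LS means."""
    a = (m22 - m11) / 2.0
    d = m12 - (m11 + m22) / 2.0
    return a, d


def genotype_value(genotype, a, d):
    """Assumed genotype coding: '11' -> -a, '12' -> d, '22' -> +a."""
    return {"11": -a, "12": d, "22": a}[genotype]


def genotype_bvs(genotypes, effects, target_mean=100.0, target_sd=20.0):
    """Sum SNP effects per stallion and standardize across stallions.

    genotypes: dict stallion id -> list of genotypes, one per SNP
    effects: list of (a, d) tuples in the same SNP order
    """
    raw = {
        stallion: sum(genotype_value(g, a, d)
                      for g, (a, d) in zip(snps, effects))
        for stallion, snps in genotypes.items()
    }
    values = list(raw.values())
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population SD
    return {s: target_mean + target_sd * (v - mean) / sd
            for s, v in raw.items()}


# Toy example with one SNP and three stallions (hypothetical LS means).
effects = [additive_dominance(90.0, 110.0, 100.0)]
gbv = genotype_bvs({"s1": ["11"], "s2": ["12"], "s3": ["22"]}, effects)
```

In the toy example the heterozygous stallion receives the highest gBV because the assumed dominance effect exceeds the additive effect.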

5.4 Results

We detected six QTL for show-jumping on ECA1, 8, 9 and 26 (Table 1). Peak values were at 0-3.0 Mb and at 39.7-42.3 Mb on ECA1. Only one SNP each supported the QTL on ECA8 at 21.5 Mb and on ECA9 at 52.6 Mb. On ECA26 two QTL are likely, one at 11.8 Mb and one at 21.2 Mb. Further putative QTL with -log10 P-values >3 and <5 were on ECA1, 3, 11, 17 and 21 (Table S1). Peak values were at 18.0 Mb, at 120 Mb, at 159 Mb and at 176 Mb on ECA1. Only one SNP each, at 28.2 Mb (ECA3), 32.2 Mb (ECA11) and 53.1 Mb (ECA21), supported the putative QTL on ECA3, 11 and 21. On ECA17 peak values were highest at 72.8 Mb and on ECA21 at 53.1 Mb.

The largest effect, an estimated dominance effect of 17.83, was detected for BIEC2-1036317 on ECA8. A further significant dominance effect was found for BIEC2-243 (ECA1); the favourable heterozygous genotype effect was 9.05. Additive effects were significant for BIEC2-18316 (ECA1) and BIEC2-683832 (ECA26). For BIEC2-18316, the beneficial additive effect was 13.16 ± 4.04 BV points for show-jumping (P<0.01). For BIEC2-683832 (ECA26), a favourable additive effect on BV show-jumping (12.91 ± 5.56) was detected within the tested stallions. BIEC2-689886 (ECA26) had a significant dominance effect on BV show-jumping; the heterozygous genotype was 9.91 ± 4.21 (P<0.05) above the homozygote mean. Only the SNP within the QTL on ECA9 (BIEC2-1094761) had neither a significant additive nor a significant dominance effect (Table 4). The correlation coefficient estimated between gBV show-jumping and BV show-jumping was 0.72. The proportion of variance explained by the most significant SNPs within each QTL was 0.54. Table 3 shows the distribution of SNP genotypes per proportions of Hanoverian, Thoroughbred, Trakehner and Holsteiner genes for the genotyped stallions.

Within the QTL regions, the bifunctional 3'-phosphoadenosine 5'-phosphosulfate synthetase 2 gene (PAPSS2) on ECA1, the myosin light chain 2, regulatory, cardiac, slow gene (MYL2) on ECA8, the thyrotropin-releasing hormone receptor gene (TRHR) on ECA9 and the nuclear respiratory factor 2 gene (NRF2) on ECA26 could be identified as candidate genes for physical performance. Within the putative QTL regions, the nebulin-related anchoring protein gene (NRAP) on ECA1 and the T-box transcription factor 4 gene (TBX4) on ECA11 were identified as candidate genes. Table S1 shows all SNPs estimated as significant using Tassel1 or Tassel2 and potential candidate genes from human studies on physical fitness.

5.5 Discussion

The aim of our study was to map QTL associated with show-jumping in Hanoverian warmblood horses. We identified six QTL and a total of 10 putative QTL in this study. The moderate number of associated SNPs suggests that several genes are possibly involved in show-jumping, similar to physical performance in humans (MacArthur & North 2005). The Q-Q plots show the expected distribution of association test statistics (X-axis) across the SNPs compared to the observed values (Y-axis). Any systematic deviation from the X=Y line implies a consistent data stratification across the whole genome, as seen in Fig. 1-4. The Q-Q plot for the observed P-values calculated using Tassel2 shows a solid line matching X=Y until it curves at the end, representing the small number of truly associated SNPs. Hence, omitting the effects of kinship and of the proportions of genes from the GWA would result in an increased false discovery rate caused by stratification within the investigated population. Observed genotype frequencies and estimates of the additive and dominance effects of the most significant SNP within each QTL indicate that positive selection for show-jumping is already taking place in the investigated stallion population. These results are probably due to the requirements of the stallion licensing procedure of the HSS, which include a free-jumping evaluation for each stallion to ensure versatile future progeny. William and Folland (2007) developed a model to calculate a ‘total genotype score’ (TGS) for a genetic predisposition to high endurance potential based on genetic polymorphisms that are candidates to explain variation in human endurance ability. Our results indicate that the application of genetic markers could be beneficial for selection schemes to improve show-jumping and to identify the best horses for professional training at an early stage. The identified QTL and putative QTL were compared to the positively selected regions in Thoroughbreds found by Gu et al. (2009). The putative QTL on ECA11 was also found to be subject to positive selection in Thoroughbreds. However, none of the other potential QTL coincided with those regions.
In the Hanoverian horse population analyzed here, possible breed-related marker associations were deliberately accounted for in the models used, in order to reveal within-breed variation for show-jumping. In contrast, Gu et al. (2009) were searching for across-breed variation to reveal genomic regions primarily distinctive for Thoroughbreds. In addition, differences between the detected QTL regions in Thoroughbreds and Hanoverians could be due to different breeding aims. The HSS maintains a special breeding program for show-jumpers to produce horses with above-average talent for show-jumping. In contrast, Thoroughbred breeding is primarily focused on speed and endurance; jumping style and ability are only secondary traits. We reviewed 28 human performance-related genes for which genome-wide associations have been shown and that may have a critical role in the physical performance of horses, too. We enlarged that list to 53 genes by adding 25 performance-related genes without known polymorphism from the Human Gene Map for Performance and Health-Related Fitness Phenotypes (Bray et al. 2009). Show-jumping requires an excellent combination of fast and strong force over a moderate period of time, balance, and coordinative skills, in particular during the approach and takeoff. Most of the force for takeoff is provided by muscular power generated by the hindlimbs (Lopez-Rivero & Letelier 2000; Barrey 2004). High jump and hurdles are the human sports probably closest to the style and ability required for equine show-jumping. Injuries and diseases of the locomotor system are the most common reasons for horses missing training or competition (Jeffcott et al. 1982; Rossdale et al. 1985). Hence, genes associated with elite performance in power athletes and with limb health are potential candidate genes for show-jumping, too. We detected functional candidate genes within the QTL on ECA1, 8, 9 and 26 as well as further candidate genes within the putative QTL. PAPSS2 on ECA1, MYL2 on ECA8, TRHR on ECA9, NRF2 on ECA26, NRAP on ECA1, and TBX4 on ECA11 are interesting candidate genes due to their function. Sulfation is a common modification of endogenous (lipids, proteins, and carbohydrates) and exogenous (xenobiotics and drugs) compounds. In mammals, the sulfate source 3'-phosphoadenosine 5'-phosphosulfate (PAPS) is created from ATP and inorganic sulfate. PAPSS2 encodes one of the two PAPS synthetases. In humans and mice, this gene has an important role in skeletogenesis during postnatal growth.
In mice, a SNP in this gene results in a 50% reduction in limb length and a 25% reduction in axial skeletal size (Kurima et al. 1998). Positive genetic correlations between conformation traits and show-jumping have been shown in several, although not all, genetic studies (Holmström & Philipsson 1993; Koenen et al. 1995), making PAPSS2 a plausible candidate gene. We detected two of the highly associated SNPs in this QTL (BIEC2-19814, BIEC2-19816) within the sequence of the cleavage stimulation factor, 3-prime pre-RNA, subunit 2, 64-kD, tau variant gene (CSTF2T). Dass et al. (2007) found that male CSTF2T knockout mice displayed aberrant spermatogenesis, resulting in male infertility that resembled oligoasthenoteratozoospermia. MYL2 on ECA8 is in proximity to BIEC2-1036317, a SNP estimated as highly significantly associated with show-jumping using any of the applied models. MYL2 is involved in muscle contraction through cyclic interactions with actin-rich thin filaments to create contractile force (Baker et al. 2006). MYL2 is also part of the KEGG pathways “Focal adhesion” and “Regulation of actin cytoskeleton”, which were found to be overrepresented among the 53 human candidate genes and among genes within positively selected regions in Thoroughbreds (Gu et al. 2009). The only SNP that supported the QTL on ECA9 has neither a significant additive nor a significant dominance effect on show-jumping, but we could detect an interesting functional candidate gene in the QTL region. TRHR is expressed in the thyrotrope cells of the anterior pituitary and encodes the thyrotropin-releasing hormone receptor, a G protein-coupled receptor that activates the inositol phospholipid-calcium-protein kinase C transduction pathway upon the binding of thyrotropin-releasing hormone (TRH). Thyroid hormones play a particularly crucial role in brain maturation during fetal development and in the regulation of metabolism (they increase cardiac output, heart and ventilation rate and basal metabolic rate, and potentiate the effect of catecholamines). In a brother and sister with hypothyroidism and complete resistance to thyrotropin-releasing hormone, Bonomi et al. (2009) identified homozygosity for a nonsense mutation in TRHR. In horses, hypothyroidism is related to, among other things, performance intolerance and contumacy. BIEC2-1094761 is located within the sequence of the R-spondin family, member 2 gene (RSPO2).
RSPO2 encodes a protein involved in the Wnt signalling pathway, a network of proteins best known for their roles in embryogenesis and cancer, but also involved in normal physiological processes in adult animals. In purebred dogs, RSPO2 was associated with coat phenotypes (Cadieu et al. 2009).

In the human NRF2 gene, a SNP was found to positively influence endurance performance, leading to a higher training response in VO2max (Eynon et al. 2009; He et al. 2007). Nuclear respiratory factors in general function as transcription factors that activate the expression of some key metabolic genes regulating cellular growth, and of nuclear genes required for respiration, heme biosynthesis, and mitochondrial DNA transcription and replication (Kelly & Scarpulla 2004). The orthologous equine gene is located next to BIEC2-689886, which was found to be highly associated with show-jumping using any of the implemented models and has a highly significant additive effect and a significant dominance effect on show-jumping. NRAP is a heart and striated muscle-specific scaffolding protein that is involved in myofibril assembly. It is primarily associated with developing myofibrillar structures containing α-actinin (Shajia et al. 2008). For the investigated stallion population, a neighbouring SNP (BIEC2-84830) has a highly significant additive effect (P<0.01) and a significant dominance effect on show-jumping. TBX4 was found to be required in muscle connective tissue for muscle and tendon patterning of the hind limbs in mice (Hasson et al. 2010). The orthologous equine gene is located on ECA11 next to BIEC2-149160, which also has a significant additive effect (P<0.01) on show-jumping. TBX4 represents a candidate gene for show-jumping due to its potential influence on limb health. Hill et al. (2010) found a sequence polymorphism in the MSTN gene that is strongly associated with best race distance among elite racehorses. For the stallion population investigated in our study, MSTN was not found within a QTL or putative QTL for show-jumping. However, we found significantly associated SNPs 7 Mb upstream and 7 Mb downstream of MSTN. As the MSTN polymorphism itself is not contained on the equine SNP50 BeadChip, a significant association cannot be ruled out for MSTN.
Flat racing primarily demands speed and endurance, and phenotypes commonly differ between the fastest horses over short and long distances but are similar within these groups. Conversely, elite show-jumping horses show more variable phenotypes. For that reason we suppose that muscle contraction force might have more influence on elite show-jumping performance than overall muscle mass. Further analyses are required to verify the influence of MSTN on show-jumping. We were not able to detect obvious functional candidate genes within two of the six QTL regions. There were several genes with largely unknown functions within these highly associated regions. As signalling pathways have not yet been investigated in depth, particularly not in horses, these QTL may harbour genes with significant implications for show-jumping. However, we detected three positional candidate genes within putative QTL harbouring a highly significant intragenic SNP. On ECA1 we found BIEC2-243 within the neuroendocrine long coiled-coil protein 2 gene (JAMIP3). Based on the structure, expression pattern, and sublocalization of the JAMIP3 proteins, Cruz-Garcia et al. (2007) proposed that they are members of the golgin family of proteins and part of the secretory pathway, and that they may be responsible for Golgi structure or function. In addition, we found the BCL2/adenovirus E1B 19-kD protein-interacting protein 3 gene (NIP3) in the same QTL as a positional candidate gene. Bruick (2000) presented evidence that the hypoxia-inducible factor-1α (HIF1A) activates expression of NIP3, which in turn primes cells for apoptosis under conditions of persistent oxygen deprivation. This pathway may play a role in cell death resulting from cerebral and myocardial ischemia. Mason et al. (2007) found in HIF1A knockout mice that the loss of HIF1A causes an adaptive response similar to endurance training. BIEC2-683832, the only SNP that supported the QTL on ECA26, has both a significant additive and a significant dominance effect on show-jumping and was estimated as highly significant using any of the implemented models. We could not detect a functional candidate gene, but identified the roundabout, axon guidance receptor, homolog 2 gene (ROBO2) as a positional candidate gene harbouring BIEC2-683832.
This gene belongs to the ROBO family, which is part of the immunoglobulin superfamily of proteins. The encoded protein is a receptor for SLIT2, a molecule known to function in axon guidance and cell migration. Defects in this gene are the cause of vesicoureteral reflux type 2 (Lu et al. 2007). However, we cannot rule out that genes within these two QTL are involved in further metabolic or signalling pathways that have not yet been investigated in depth.

Further analyses using a larger horse population and denser SNP marker sets will be required to verify the associated regions and the potential QTL for show-jumping. Only one gene involved in human physical performance could be detected within the highly associated QTL regions for show-jumping, indicating that genes affecting equine show-jumping performance largely differ from those revealed for human performance. We suppose that only few human sports require physical conditions comparable to those required for show-jumping. We further suppose that genes involved in muscle structure and contraction force as well as in limb development, rather than overall muscle mass, are crucial for elite show-jumping performance. Our approach appeared useful as a starting point to identify QTL for show-jumping within a breed.

5.6 References

Baker P.E., Kearney J.A., Gong B., Merriam A.P., Kuhn D.E., Porter J.D. & Rafael-Fortney J.A. (2006). Analysis of differences between /-deficient vs mdx skeletal muscles reveals a specific upregulation of slow muscle genes in limb muscles. Neurogenetics 2, 81-91.
Barrey E. (2004). Biomechanics of locomotion in the athletic horse. In: Equine sports medicine and surgery (ed. by Hinchcliff K.W., Kaneps A.J., Geor R.J.), pp. 210-30. Elsevier Health, Philadelphia.
Bonomi M., Busnelli M., Beck-Peccoz P., Costanzo D., Antonica F., Dolci C., Pilotta A., Buzi F. & Persani L. (2009). A family with complete resistance to thyrotropin-releasing hormone. The New England Journal of Medicine 360, 731-4.
Bradbury P.J., Zhang Z., Kroon D.E., Casstevens T.M., Ramdoss Y. & Buckler E.S. (2007). TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23, 2633-5.
Bray M.S., Hagberg J.M., Pérusse L., Rankinen T., Roth S.M., Wolfarth B. & Bouchard C. (2009). The human gene map for performance and health-related fitness phenotypes: the 2006-2007 update. Medicine and Science in Sports and Exercise 41, 35-73.
Bruick R.K. (2000). Expression of the gene encoding the proapoptotic Nip3 protein is induced by hypoxia. Proceedings of the National Academy of Sciences of the United States of America 97, 9082-7.
Christmann L. (1996). Zuchtwertschätzung für Merkmale der Stutbuchaufnahme und der Stutenleistungsprüfung im Zuchtgebiet Hannover. Dissertation, Georg-August Universität Göttingen.
Cruz-Garcia D., Vazquez-Martinez R., Peinado J.R., Anouar Y., Tonon M.C., Vaudry H., Castano J.P. & Malagon M.M. (2007). Identification and characterization of two novel (neuro)endocrine long coiled-coil proteins. FEBS Letters 581, 3149-56.
Dass B., McDaniel L., Schultz R.A., Attaya E. & MacDonald C.C. (2002). The gene CSTF2T, encoding the human variant CstF-64 polyadenylation protein tau-CstF-64, lacks introns and may be associated with male sterility. Genomics 80, 509-14.
Eynon N., Sagiv M., Meckel Y., Duarte J.A., Alves A.J., Yamin C., Sagiv M., Goldhammer E. & Oliveira J. (2009). NRF2 intron 3 A/G polymorphism is associated with endurance athletes' status. Journal of Applied Physiology 107, 76-9.
Groeneveld E., Kovac M. & Wang T. (1990). Pest, a general purpose BLUP package for multivariate prediction and estimation. In: World Congress on Genetics Applied to Livestock Production, 488-91. Edinburgh, UK Garsi, Madrid.
Gu J., Orr N., Park S.D., Katz L.M., Sulimova G., MacHugh D.E. & Hill E.W. (2009). A Genome Scan for Positive Selection in Thoroughbred Horses. PLoS ONE 4, e5767.
Guo D.-C., Pannu H., Tran-Fadulu V., Papke C.L., Yu R.K., Avidan N., Bourgeois S., Estrera A.L., Safi H.J., Sparks E., Amor D., Ades L., McConnell V., Willoughby C.E., Abuelo D., Willing M., Lewis R.A., Kim D.H., Scherer S., Tung P.P., Ahn C., Buja L.M., Raman C.S., Shete S.S. & Milewicz D.M. (2007). Mutations in smooth muscle alpha-actin (ACTA2) lead to thoracic aortic aneurysms and dissections. Nature Genetics 39, 1488-93.
Guo D.-C., Papke C.L., Tran-Fadulu V., Regalado E.S., Avidan N., Johnson R.J., Kim D.H., Pannu H., Willing M.C., Sparks E., Pyeritz R.E., Singh M.N., Dalman R.L., Grotta J.C., Marian A.J., Boerwinkle E.A., Frazier L.Q., LeMaire S.A., Coselli J.S., Estrera A.L., Safi H.J., Veeraraghavan S., Muzny D.M., Wheeler D.A., Willerson J.T., Yu R.K., Shete S.S., Scherer S.E., Raman C.S., Buja L.M. & Milewicz D.M. (2009). Mutations in smooth muscle alpha-actin (ACTA2) cause coronary artery disease, stroke, and Moyamoya disease, along with thoracic aortic disease. American Journal of Human Genetics 84, 617-27.
Hamann H. & Distl O. (2008). Genetic variability in Hanoverian warmblood horses using pedigree analysis. Journal of Animal Science 86, 1503-13.
Hasson P., DeLaurier A., Bennett M., Grigorieva E., Naiche L.A., Papaioannou V.E., Mohun T.J. & Logan M.P.O. (2010). Tbx4 and tbx5 acting in connective tissue are required for limb muscle and tendon patterning. Developmental Cell 18, 148-56.
He Z., Hu Y., Feng L., Lu Y., Liu G., Xi Y., Wen L. & McNaughton L.R. (2007). NRF2 genotype improves endurance capacity in response to training. International Journal of Sports Medicine 28, 717-21.
Hill E.W., Gu J., Eivers S.S., Fonseca R.G., McGivney B.A., Govindarajan P., Orr N., Katz L.M. & MacHugh D. (2010). A sequence polymorphism in MSTN predicts sprinting ability and racing stamina in Thoroughbred horses. PLoS ONE 5, e8645.
Holmström M. & Philipsson J. (1993). Relationships between conformation, performance and health in 4-year-old Swedish warmblood riding horses. Livestock Production Science 33, 293-312.
Jeffcott L.B., Rossdale P.D., Freestone J., Frank C.J. & Towers-Clark P.F. (1982). An assessment of wastage in Thoroughbred racing from conception to 4 years of age. Equine Veterinary Journal 14, 185-98.
Karlsson E.K., Baranowska I., Wade C.M., Salmon Hillbertz N.H.C., Zody M.C., Anderson N., Biagi T.M., Patterson N., Pielberg G.R., Kulbokas E.J. 3rd, Comstock K.E., Keller E.T., Mesirov J.P., von Euler H., Kämpe O., Hedhammar A., Lander E.S., Andersson G., Andersson L. & Lindblad-Toh K. (2007). Efficient mapping of mendelian traits in dogs through genome-wide association. Nature Genetics 39, 1321-8.
Kelly D.P. & Scarpulla R.C. (2004). Transcriptional regulatory circuits controlling mitochondrial biogenesis and function. Genes & Development 18, 357-68.
Kolbehdari D., Wang Z., Grant J.R., Murdoch B., Prasad A., Xiu Z., Marques E., Stothard P. & Moore S.S. (2009). A whole genome scan to map QTL for milk production traits and somatic cell score in Canadian Holstein bulls. Journal of Animal Breeding and Genetics 126, 216-27.
Koenen E.P.C., van Veldhuizen A.E. & Brascamp E.W. (1995). Genetic parameters of linear scored conformation traits and their relation to dressage and show-jumping performance in the Dutch Warmblood Riding Horse population. Livestock Production Science 43, 85-94.
Lopez-Rivero J.L. & Letelier A. (2000). Skeletal muscle profile of show jumpers: physiological and pathological considerations. The elite show jumper. Conference of Equine Sports Medicine Science 57-76.
Lu S., Borst D.E. & Horowits R. (2008). Expression and alternative splicing of N-RAP during mouse skeletal muscle development. Cell Motility and the Cytoskeleton 65, 945-54.
Lu W., van Eerde A.M., Fan X., Quintero-Rivera F., Kulkarni S., Ferguson H., Kim H.-G., Fan Y., Xi Q., Li Q., Sanlaville D., Andrews W., Sundaresan V., Bi W., Yan J., Giltay J.C., Wijmenga C., de Jong T.P., Feather S.A., Woolf A.S., Rao Y., Lupski J.R., Eccles M.R., Quade B.J., Gusella J.F., Morton C.C. & Maas R.L. (2007). Disruption of ROBO2 is associated with urinary tract anomalies and confers risk of vesicoureteral reflux. American Journal of Human Genetics 80, 616-32.
Ma Y., Liu H., Tu-Rapp H., Thiesen H.-J., Ibrahim S.M., Cole S.M. & Pope R.M. (2004). Fas ligation on macrophages enhances IL-1R1-Toll-like receptor 4 signaling and promotes chronic inflammation. Nature Immunology 5, 380-7.
MacArthur D.G. & North K.N. (2005). Genes and human elite athletic performance. Human Genetics 116, 331-9.
Mason S.D., Rundqvist H., Papandreou I., Duh R., McNulty W.J., Howlett R.A., Olfert I.M., Sundberg C.J., Denko N.C., Poellinger L. & Johnson R.S. (2007). HIF-1alpha in endurance training: suppression of oxidative metabolism. American Journal of Physiology 293, 2059-69.
Myles S., Tang K., Somel M., Green R.E., Kelso J. & Stoneking M. (2008). Identification and analysis of genomic regions with large between-population differentiation in humans. Annals of Human Genetics 72, 99-110.
Pritchard J.K., Stephens M. & Donnelly P. (2000). Inference of Population Structure Using Multilocus Genotype Data. Genetics 155, 945-59.
Purcell S., Neale B., Todd-Brown K., Thomas L., Ferreira M., Bender D., Maller J., Sklar P., de Bakker P. & Daly M. (2007). PLINK: a tool set for whole-genome association and population-based linkage analyses. The American Journal of Human Genetics 81, 559-75.
Ritland K. (1996). Estimators for pairwise relatedness and individual inbreeding coefficients. Genetics Research 67, 175-85.
Rossdale P.D., Hopes R., Wingfield Digby N.J. & Offord K. (1985). Epidemiological study of wastage among racehorses 1982 and 1983. Veterinary Record 116, 66-9.
Smedley D., Haider S., Ballester B., Holland R., London D., Thorisson G. & Kasprzyk A. (2009). BioMart - biological queries made easy. BMC Genomics 10, 22.
Stock K.F. & Distl O. (2007). Genetic correlations between performance traits and radiographic findings in the limbs of German Warmblood riding horses. Journal of Animal Science 85, 31-41.
Yu J., Pressoir G., Briggs W.H., Vroh Bi I., Yamasaki M., Doebley J.F., McMullen M.D., Gaut B.S., Nielsen D.M., Holland J.B., Kresovich S. & Buckler E.S. (2006). A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nature Genetics 38, 203-8.


Table 1 Quantitative trait loci for show-jumping, their location on Equus caballus chromosomes (ECA) in bp, minor allele frequency (MAF), polymorphism information content (PIC), SNP-motif (minor allele is written in bold) and -log10 P-values using different analysis models.

ECA  SNP ID         MAF   PIC   SNP motif  Position (bp)  Plink1  Plink2  Plink3  Tassel1  Tassel2
1    BIEC2-236      0.35  0.35  C/T        1644004        3.32    2.59    2.64    5.60     3.92
1    BIEC2-243      0.35  0.35  C/T        1655065        3.34    2.81    2.58    5.89     4.29
1    BIEC2-18316    0.31  0.34  A/G        39721387       6.41    1.17    2.53    5.36     4.00
1    BIEC2-19814    0.07  0.12  C/T        42260541       5.98    5.33    1.51    6.20     4.29
1    BIEC2-19816    0.07  0.12  G/T        42263934       5.98    5.33    1.51    6.20     4.29
8    BIEC2-1036317  0.12  0.19  A/G        21474716       9.00    5.83    4.03    7.40     5.69
9    BIEC2-1094761  0.19  0.26  A/G        52584334       2.26    2.64    2.02    5.30     3.61
26   BIEC2-683832   0.17  0.25  A/C        11751736       9.08    4.98    3.69    6.27     3.53
26   BIEC2-689886   0.47  0.37  G/T        21164111       5.39    3.05    3.12    5.67     3.10

Plink1 = adaptive permutation testing.
Plink2 = adaptive permutation testing using the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates.
Plink3 = adaptive permutation testing using the CMH test within 16 core families and the proportion of Hanoverian, Thoroughbred, Trakehner and Holsteiner genes as covariates.
Tassel1 = mixed linear model (MLM) for population structure estimated using STRUCTURE (Pritchard et al. 2000) and identity by state relationship matrix among all individuals.
Tassel2 = mixed linear model (MLM) for the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates and marker identity by state based kinship among all individuals.
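The PIC column can be related to the MAF column with a short calculation. The sketch below assumes the commonly used Botstein-type definition of PIC for a biallelic marker; the function name pic_biallelic is illustrative only, and the exact computation performed by the software used for the table may differ in detail. Under this assumption, a MAF of 0.35 gives a PIC of about 0.35 and a MAF of 0.07 gives about 0.12, in line with the tabulated values.

```python
def pic_biallelic(maf: float) -> float:
    """Polymorphism information content of a biallelic SNP.

    Assumes the Botstein-type definition
    PIC = 1 - (p^2 + q^2) - 2 * p^2 * q^2, with p = MAF and q = 1 - p.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    p = maf
    q = 1.0 - maf
    expected_homozygosity = p * p + q * q
    return 1.0 - expected_homozygosity - 2.0 * p * p * q * q


if __name__ == "__main__":
    # MAF values taken from Table 1.
    for maf in (0.35, 0.07, 0.12):
        print(f"MAF {maf:.2f}: PIC {pic_biallelic(maf):.4f}")
```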


Table 2 Estimates of the additive (a) and dominance (d) effects with their standard errors (SE) and error probabilities (P) for the most significant SNPs within each QTL using an animal model for the trait show-jumping, and the distribution of genotype frequencies for the respective SNP.

                    Alleles   a(1) ± SE              d(2) ± SE             Genotype frequencies
ECA  SNP-ID         1    2    a ± SE          P      d ± SE         P      11    12    22
1    BIEC2-243      T    C    -6.22 ± 3.38    0.07   9.05 ± 4.36    <0.05  0.12  0.45  0.41
1    BIEC2-18316    A    G    -13.16 ± 4.04   <0.01  7.65 ± 4.60    0.10   0.10  0.41  0.48
8    BIEC2-1036317  A    G    7.31 ± 6.62     0.27   17.83 ± 7.45   <0.05  0.79  0.18  0.03
9    BIEC2-1094761  A    G    -0.54 ± 6.37    0.93   13.55 ± 7.33   0.07   0.03  0.32  0.65
26   BIEC2-683832   A    C    12.91 ± 5.56    <0.05  4.08 ± 6.08    <0.05  0.70  0.24  0.05
26   BIEC2-689886   T    G    8.32 ± 3.08     <0.01  9.91 ± 4.21    <0.05  0.30  0.44  0.25

(1) a = (m22 - m11) / 2
(2) d = m12 - ((m11 + m22) / 2)
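The two footnote formulas can be written out as a small function. This is a minimal sketch that assumes m11, m12 and m22 denote the mean trait values of the three genotype classes; the function name and the example means are hypothetical and are not taken from the study data.

```python
def additive_dominance(m11: float, m12: float, m22: float) -> tuple[float, float]:
    """Return (a, d) from genotype means, following the Table 2 footnotes.

    a = (m22 - m11) / 2          half the difference between the homozygotes
    d = m12 - (m11 + m22) / 2    deviation of the heterozygote from the midpoint
    """
    a = (m22 - m11) / 2.0
    d = m12 - (m11 + m22) / 2.0
    return a, d


if __name__ == "__main__":
    # Hypothetical genotype means in breeding value points.
    a, d = additive_dominance(m11=90.0, m12=105.0, m22=100.0)
    print(f"a = {a:.2f}, d = {d:.2f}")
```

A positive d in this parameterisation means that heterozygotes score above the midpoint of the two homozygotes.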


Table 3 Distribution of genotype frequencies (%) by classes of the proportion of Hanoverian (HAN), Thoroughbred (TB), Trakehner (TRAK) and Holsteiner (HOL) genes for the genotyped stallions.

Class    BIEC2-243     BIEC2-18316   BIEC2-1036317  BIEC2-1094761  BIEC2-683832  BIEC2-689886
         TT  TC  CC    AA  AG  GG    AA  AG  GG     AA  AG  GG     AA  AC  CC    TT  TG  GG
HAN 1    21  50  25    14  29  54    68  29  4      7   39  54     71  21  7     39  39  21
HAN 2    12  43  45    9   41  50    82  14  4      2   34  64     68  25  7     34  39  27
HAN 3    3   45  51    10  52  38    84  16  0      0   23  77     74  26  0     16  58  26
TB 1     7   47  47    3   53  43    80  17  3      3   33  63     67  30  3     23  50  27
TB 2     17  44  37    15  42  40    79  17  4      0   37  63     62  29  10    36  37  33
TB 3     9   45  45    9   27  64    79  21  0      6   24  70     88  12  0     36  52  12
TRAK 1   13  42  46    15  35  50    79  17  4      4   35  60     71  23  6     31  46  23
TRAK 2   11  54  32    18  36  43    68  32  0      0   21  79     64  29  7     21  39  39
TRAK 3   13  44  44    0   51  49    87  10  3      3   36  62     74  23  3     36  46  18
HOL 1    10  33  41    4   40  56    87  13  0      3   26  71     76  24  0     34  49  16
HOL 2    11  78  6     44  44  6     39  44  17     0   67  33     39  28  33    11  17  72

HAN 1: ≤0.34, 2: >0.34 and <0.78, 3: ≥0.78; TB 1: ≤0.13, 2: >0.13 and <0.30, 3: ≥0.30; TRAK 1: ≤0.20, 2: >0.20 and <0.80, 3: ≥0.80; HOL 1: 0.00, 2: >0.00 and <0.30, 3: ≥0.30.


Table S1 Quantitative trait loci (QTL) and putative QTL for show-jumping, their location on Equus caballus chromosome (ECA) in bp, minor allele frequency (MAF), polymorphism information content (PIC), SNP-motif and -log10 P-values using different analysis models.

ECA  SNP ID         MAF   PIC   Alleles  Position (bp)  Plink1  Plink2  Plink3  Tassel1  Tassel2
1    BIEC2-236      0.35  0.35  C/T      1644004        3.32    2.59    2.64    5.60     3.92
1    BIEC2-243      0.35  0.35  C/T      1655065        3.34    2.81    2.58    5.89     4.29
1    BIEC2-8178     0.39  0.36  C/T      17992316       1.55    2.31    1.89    4.58     3.45
1    BIEC2-18316    0.31  0.34  A/G      39721387       6.41    1.17    2.53    5.36     4.00
1    BIEC2-19814    0.07  0.12  C/T      42260541       5.98    5.33    1.51    6.20     4.29
1    BIEC2-19816    0.07  0.12  G/T      42263934       5.98    5.33    1.51    6.20     4.29
1    BIEC2-52293    0.21  0.28  C/T      120200513      3.64    2.61    1.56    3.46     3.74
1    BIEC2-70522    0.44  0.37  A/G      159022526      4.86    3.73    1.76    3.22     3.07
1    BIEC2-84830    0.29  0.33  C/T      176229370      2.35    3.04    1.83    3.29     3.35
3    BIEC2-775163   0.50  0.38  C/T      28275761       3.54    1.75    1.20    4.28     3.14
8    BIEC2-1036317  0.12  0.19  A/G      21474716       9.00    5.83    4.03    7.40     5.69
9    BIEC2-1094761  0.19  0.26  A/G      52584334       2.26    2.64    2.02    5.30     3.61
11   BIEC2-149160   0.50  0.38  A/G      32231193       3.68    2.00    2.69    4.40     3.24
17   BIEC2-383870   0.08  0.14  A/G      72821870       4.26    3.70    1.02    4.45     3.87
17   BIEC2-383872   0.08  0.14  A/G      72822143       4.26    3.70    1.02    4.45     3.87
17   BIEC2-383879   0.08  0.14  A/G      72830904       4.26    3.70    1.02    4.45     3.87
17   BIEC2-383880   0.08  0.14  C/T      72830974       4.26    3.70    1.02    4.45     3.87
21   BIEC2-573168   0.08  0.14  A/G      53122307       5.29    4.44    1.98    3.31     3.12
26   BIEC2-683832   0.17  0.25  A/C      11751736       9.08    4.98    3.69    6.27     3.53
26   BIEC2-689886   0.47  0.37  G/T      21164111       5.39    3.05    3.12    5.67     3.10

Plink1 = adaptive permutation testing.
Plink2 = adaptive permutation testing using the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates.
Plink3 = adaptive permutation testing using the CMH test within 16 core families and the proportion of Hanoverian, Thoroughbred, Trakehner and Holsteiner genes as covariates.
Tassel1 = mixed linear model (MLM) for population structure estimated using STRUCTURE (Pritchard et al. 2000) and identity by state relationship matrix among all individuals.
Tassel2 = mixed linear model (MLM) for the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates and marker identity by state based kinship among all individuals.


Table S2 Quantiles of the breeding value (BV) for show-jumping and the genomic BV for show-jumping in 115 Hanoverian stallions genotyped.

               BV show-jumping                      Genomic BV show-jumping
Quantiles (%)  Number of stallions genotyped  BV    Number of stallions genotyped  BV
95             4                              153   6                              142
90             20                             147   17                             130
75             28                             119   27                             110
50             27                             98    31                             100
25             15                             82    17                             87
10             15                             66    7                              77
5              1                              56    6                              67


Table S3 Families analyzed, their size, number of stallions genotyped, the mean breeding value (BV) for show-jumping, and mean proportions of Hanoverian (HAN), Thoroughbred (TB), Trakehner (TRAK) and Holsteiner (HOL) genes for the genotyped stallions.

Family  Total family  Number of stallions  BV for        Proportion of genes descending from
number  size          genotyped            show-jumping  HAN (%)  TB (%)  TRAK (%)  HOL (%)
1       63            10                   103 ± 20      51.4     33.3    1.1       0
2       19            3                    80 ± 17       55.0     43.7    1.7       0
3       98            7                    147 ± 14      16.7     23.1    0.3       54.8
4       37            8                    76 ± 10       32.3     26.0    20.4      8.1
5       65            7                    98 ± 23       71.6     27.1    1.3       0
6       47            7                    132 ± 17      74.3     14.1    4.3       6.4
7       31            4                    123 ± 19      21.3     12.8    3.3       0
8       80            13                   110 ± 22      74.8     15.7    2.5       1.5
9       56            5                    74 ± 20       24.6     74.8    0.4       0
10      33            6                    99 ± 14       36.5     3.8     3.5       0
11      70            15                   111 ± 17      79.3     16.4    3.9       0
12      15            2                    83 ± 9        41.0     29.0    5.0       0
13      17            4                    135 ± 21      47.2     38.5    4.5       5.3
14      29            2                    163 ± 10      75.0     15.0    4.0       0
15      66            16                   75 ± 13       78.4     13.9    6.1       1.2
16      72            6                    84 ± 17       0        100     0         0
Total   798           115                  102           48.7     30.45   3.9       4.8


Table S4 Single nucleotide polymorphisms (SNPs) significantly associated with the breeding value for show-jumping (-log10 (P) >3), their location on Equus caballus chromosome (ECA) in bp, -log10 error probabilities estimated with Tassel1 and Tassel2, and potential candidate genes.

ECA  SNP ID         Position (bp)  Tassel1  Tassel2  Candidate gene  Gene position in bp (start-end)
1    BIEC2-119      889481         3.06     2.08
1    BIEC2-120      889838         3.06     2.08
1    BIEC2-121      889850         3.06     2.08
1    BIEC2-228      1604288        3.99     2.57
1    BIEC2-236      1644004        5.60     3.92
1    BIEC2-243      1655065        5.89     4.29
1    BIEC2-256      1733640        3.82     2.52
1    BIEC2-365      2126664        3.51     2.51
1    BIEC2-8178     17992316       4.58     3.45     NRAP            18098978-18168334
1    BIEC2-12140    26544989       3.88     1.71
1    BIEC2-12366    26935288       3.31     1.53
1    BIEC2-13174    29772231       3.27     1.86
1    BIEC2-14522    32149200       4.53     2.96
1    BIEC2-15319    33821924       3.79     2.60
1    BIEC2-15373    33982178       3.67     2.52
1    BIEC2-15437    34124159       3.67     2.52
1    BIEC2-15452    34163704       3.67     2.52
1    BIEC2-16469    36130720       3.16     1.95
1    BIEC2-16470    36130748       3.16     1.95
1    BIEC2-18316    39721387       5.36     4.00
1    BIEC2-18916    40502663       3.31     1.33
1    BIEC2-18926    40535352       3.80     1.59     PAPSS2          40701869-40730244
1    BIEC2-19814    42260541       6.20     4.29
1    BIEC2-19816    42263934       6.20     4.29
1    BIEC2-19862    42503810       3.10     3.05
1    BIEC2-21023    47129984       3.00     1.81
1    BIEC2-28138    67809750       3.40     2.04
1    BIEC2-36014    82647550       3.35     2.52
1    BIEC2-37171    87944954       3.36     1.89
1    BIEC2-43592    100557789      3.34     1.82
1    BIEC2-44094    102806182      4.24     2.70
1    BIEC2-44113    102863676      3.07     1.76
1    BIEC2-51499    118538558      2.47     3.00
1    BIEC2-52293    120200513      3.46     3.74
1    BIEC2-52329    120249442      2.00     3.01
1    BIEC2-58470    134173973      3.25     1.62
1    BIEC2-58504    134295562      3.25     1.62
1    BIEC2-67513    155963444      3.66     2.49
1    BIEC2-69481    158312638      3.07     2.74
1    BIEC2-70522    159022526      3.22     3.07
1    BIEC2-72554    160315007      3.07     1.70
1    BIEC2-72558    160315339      3.07     1.70
1    BIEC2-78830    165356683      2.21     3.04
1    BIEC2-79100    165950704      3.03     1.81


Table S4 continued

ECA  SNP ID         Position (bp)  Tassel1  Tassel2  Candidate gene  Gene position in bp (start-end)
1    BIEC2-84830    176229370      3.29     3.35
1    BIEC2-86022    177358112      2.35     3.25
2    BIEC2-455526   10743770       4.32     2.44
2    BIEC2-455641   10846930       4.81     2.39
2    BIEC2-463834   25698465       3.45     2.03
2    BIEC2-470125   35006030       3.27     1.76
2    BIEC2-475545   43376345       2.04     3.13
2    BIEC2-475643   43440117       2.04     3.13
2    BIEC2-475770   43809291       2.64     3.20
2    BIEC2-475798   43931117       3.10     2.96
2    BIEC2-475846   44039916       3.10     2.96
2    BIEC2-476894   46480817       3.00     2.32
2    BIEC2-477471   49769093       3.47     2.68
2    BIEC2-488334   72409140       3.20     1.49
2    BIEC2-504327   103625986      3.62     2.32
3    BIEC2-772752   13763517       3.43     2.03
3    BIEC2-773958   20964733       3.05     2.07     HP              21850633-21853753
3    BIEC2-775163   28275761       4.28     3.14
3    BIEC2-775214   28555836       3.64     1.75
3    BIEC2-776206   32241051       3.82     2.49
3    BIEC2-778039   39858065       3.25     2.30
3    BIEC2-779957   50614921       3.31     1.79
3    BIEC2-795805   80987074       4.08     2.62
3    BIEC2-802664   93276965       3.08     1.81
3    BIEC2-804073   95597055       3.22     2.15
4    BIEC2-859367   39143144       3.82     2.18
4    BIEC2-874264   93920364       3.33     3.13     CHRM2           90882119-90883198
4    BIEC2-877305   98151733       3.61     2.08
4    BIEC2-877422   98289097       3.29     2.36
4    BIEC2-877683   99201376       3.81     1.81
4    BIEC2-877728   99223478       3.24     1.31
4    BIEC2-877754   99261878       3.24     1.29
4    BIEC2-878817   100893694      3.00     2.48
4    BIEC2-881825   103436292      3.05     1.70     NOS3            102601313-102619154
4    BIEC2-885148   107562252      3.27     2.38
5    BIEC2-890528   7315592        3.00     1.53
5    BIEC2-890713   7717196        3.24     1.67
5    BIEC2-890723   7747257        1.68     3.09
5    BIEC2-905836   37203420       3.75     2.74
5    BIEC2-905861   37459676       3.64     2.57     ATP1A2          37404795-37424718
5    BIEC2-907068   41486214       3.13     1.98
5    BIEC2-907076   41546213       3.30     2.43
5    BIEC2-907089   41668701       3.00     2.37
5    BIEC2-907090   41668765       3.10     2.60
5    BIEC2-907096   41671476       3.10     2.60
5    BIEC2-907099   41671640       3.10     2.60
5    BIEC2-907101   41671989       3.00     2.37
5    BIEC2-911675   59007000       3.14     1.74
5    BIEC2-926667   87755921       4.27     1.87
6    BIEC2-935503   1691413        4.37     2.04
6    BIEC2-935506   1691742        4.13     1.69
6    BIEC2-950349   41075839       3.88     2.30


Table S4 continued

ECA  SNP ID         Position (bp)  Tassel1  Tassel2  Candidate gene  Gene position in bp (start-end)
6    BIEC2-952896   46920985       3.49     2.54
7    BIEC2-983001   16469175       3.24     2.54
7    BIEC2-983311   16986735       3.26     2.28
7    BIEC2-983357   17125484       5.24     4.35
7    BIEC2-984956   20797167       3.16     2.09
7    BIEC2-986726   24147552       3.22     2.47
7    BIEC2-992560   33733480       3.27     2.77
7    BIEC2-994893   35557557       3.00     2.27
7    BIEC2-1005528  67597722       3.04     2.38     UCP2            69879779-69886120
7    BIEC2-1017308  93854160       3.36     2.70     BDNF            94040241-94073623
8    BIEC2-1021328  942506         3.61     2.96
8    BIEC2-1030877  12580862       3.00     1.70
8    BIEC2-1032231  13888931       3.22     2.16
8    BIEC2-1032255  14037298       3.90     2.55
8    BIEC2-1036108  20846666       2.92     3.10
8    BIEC2-1036317  21474716       7.40     5.69     MYL2            21089435-21096156
8    BIEC2-1050278  49070142       4.24     2.92
8    BIEC2-1057745  63224126       3.19     2.34
8    BIEC2-1057746  63224274       3.19     2.34
8    BIEC2-1064475  84049410       4.04     3.70
9    BIEC2-1073209  8669798        4.02     3.00
9    BIEC2-1074224  10748872       3.35     3.24
9    BIEC2-1074231  10754521       3.03     2.85
9    BIEC2-1074262  10789372       4.22     3.89
9    BIEC2-1084126  29755354       3.67     1.68
9    BIEC2-1084351  29950971       3.18     1.92
9    BIEC2-1085337  31667103       2.77     3.22
9    BIEC2-1094761  52584334       5.30     3.61     TRHR            53530152-53564683
9    BIEC2-1103804  70281993       3.11     2.10
9    BIEC2-1104866  72780610       3.34     2.44
9    BIEC2-1105430  74981480       3.47     2.92
10   BIEC2-111489   26444526       3.00     1.38
10   BIEC2-114419   34084834       3.30     2.37
10   BIEC2-116246   37545357       3.26     1.73
10   BIEC2-122793   50998042       3.47     2.19
10   BIEC2-126599   58414631       4.38     3.42
10   BIEC2-127817   60793440       3.82     2.46
10   BIEC2-130075   65167181       3.11     2.27
11   BIEC2-149160   32231193       4.40     3.24     TBX4            34635880-34662642
11   BIEC2-153980   42144936       3.03     1.87     SLC6A4          43855676-43876627
12   BIEC2-192003   20878555       3.93     2.57     CNTF            18907955-18909987
13   BIEC2-203227   1858408        3.04     1.64
13   BIEC2-206803   6195432        3.75     2.34
13   BIEC2-214346   18106889       4.81     2.92
13   BIEC2-214347   18107021       3.90     2.11
13   BIEC2-214358   18116105       4.81     2.92
13   BIEC2-214423   18141804       4.81     2.92
13   BIEC2-214424   18142109       4.77     2.92
13   BIEC2-230165   31063558       4.16     1.90
14   BIEC2-247357   18066309       3.57     2.48
14   BIEC2-248307   22509300       3.61     3.53
14   BIEC2-253057   30292759       3.59     2.68     ADRB2           28966505-28967756


Table S4 continued

ECA  SNP ID         Position (bp)  Tassel1  Tassel2  Candidate gene  Gene position in bp (start-end)
14   BIEC2-253066   30296571       3.98     2.96     NR3C1           33819335-33923603
14   BIEC2-260501   61417422       3.39     2.02
14   BIEC2-274735   89876915       2.60     3.07
15   BIEC2-277769   239017         3.54     1.96
15   BIEC2-277771   240170         3.06     1.42
15   BIEC2-300973   31054768       3.14     1.95
15   BIEC2-301609   33288522       1.81     3.01
15   BIEC2-304815   41132806       3.09     2.37
15   BIEC2-307495   47321090       3.57     2.10
15   BIEC2-307496   47323167       3.57     2.10
15   BIEC2-321926   80085648       3.47     2.62
16   BIEC2-328405   5107308        3.44     2.24
16   BIEC2-328407   5107537        3.24     1.85
16   BIEC2-328560   5581292        3.20     2.11
16   BIEC2-328569   5582998        3.20     2.11
16   BIEC2-328942   7037286        3.04     2.30
16   BIEC2-328948   7090342        3.04     2.30
16   BIEC2-333972   23272601       3.04     1.95
16   BIEC2-338108   30350606       3.05     2.51
16   BIEC2-364096   79630899       3.40     1.97
17   BIEC2-367347   5420696        3.52     3.02
17   BIEC2-368239   8996795        3.62     1.45
17   BIEC2-373877   22683951       4.57     2.48
17   BIEC2-373977   23255246       4.71     2.72
17   BIEC2-374088   23823129       3.09     2.11
17   BIEC2-375830   36081472       1.95     3.20
17   BIEC2-376514   40031314       3.23     3.13
17   BIEC2-377771   51450154       3.39     1.65
17   BIEC2-377805   51636934       3.56     2.21
17   BIEC2-377814   51646404       4.39     2.72
17   BIEC2-377997   52822386       3.13     1.56
17   BIEC2-379266   58592821       3.29     2.77
17   BIEC2-382969   66512056       3.89     1.37
17   BIEC2-383186   67651523       3.34     3.05
17   BIEC2-383870   72821870       4.45     3.87
17   BIEC2-383872   72822143       4.45     3.87
17   BIEC2-383879   72830904       4.45     3.87
17   BIEC2-383880   72830974       4.45     3.87
17   BIEC2-384085   74800615       3.16     1.75
18   BIEC2-401744   12474125       3.11     1.52
18   BIEC2-412334   49109516       3.02     2.47
18   BIEC2-412335   49110177       3.02     2.47
18   BIEC2-416362   59224030       2.28     3.06
18   BIEC2-416376   59243054       3.11     3.96     MSTN            66490208-66495180
18   BIEC2-418371   73284185       4.50     2.28
20   BIEC2-512604   2517236        3.13     2.52
20   BIEC2-521589   15644753       3.60     1.74
21   BIEC2-547502   3017794        3.57     1.63
21   BIEC2-565851   41466762       3.21     1.97
21   BIEC2-573009   53038106       3.01     2.47
21   BIEC2-573168   53122307       3.31     3.12
22   BIEC2-574622   572872         3.09     2.38


Table S4 continued

ECA  SNP ID         Position (bp)  Tassel1  Tassel2  Candidate gene  Gene position in bp (start-end)
22   BIEC2-575340   1498126        3.33     1.68
23   BIEC2-607492   5473131        3.58     2.72
23   BIEC2-622879   29540429       3.14     1.33
24   BIEC2-630306   6082586        3.00     2.82
24   BIEC2-630344   6227252        3.45     3.11
24   BIEC2-630350   6264670        3.45     3.11
24   BIEC2-630353   6305498        3.00     2.36
24   BIEC2-630354   6307088        3.00     2.36     HIF1A           8972034-8993323
24   BIEC2-648304   36948288       3.22     1.81
24   BIEC2-649094   37766990       3.57     2.43     BDKRB2          38694878-38695924
25   BIEC2-660983   14398489       3.45     2.17
25   BIEC2-660984   14398628       3.45     2.17
25   BIEC2-660988   14401922       3.26     1.64
26   BIEC2-682698   9226414        3.46     2.30
26   BIEC2-683832   11751736       6.27     3.53
26   BIEC2-684083   12317187       3.22     1.86
26   BIEC2-689227   20258377       3.35     2.33
26   BIEC2-689886   21164111       5.67     3.10
26   BIEC2-690523   22954773       3.96     2.40
26   BIEC2-690524   22954861       3.96     2.40
26   BIEC2-690559   23032035       3.12     2.32     NRF2            233908826-23422854
26   BIEC2-690650   23245243       3.87     2.85
26   BIEC2-690659   23282559       3.87     2.85
26   BIEC2-690662   23287358       3.84     2.89
26   BIEC2-690770   23585719       3.62     2.22
26   BIEC2-690776   23595708       3.18     1.92
26   BIEC2-690779   23617199       3.29     1.95
26   BIEC2-690788   23621246       3.29     1.95
26   BIEC2-690800   23626577       3.18     1.92
26   BIEC2-692206   28110090       3.07     1.47
26   BIEC2-692673   29803727       3.11     1.63
26   BIEC2-692767   30220853       4.26     2.74
26   BIEC2-692775   30287237       4.26     2.74
26   BIEC2-692806   30402377       3.22     1.58
26   BIEC2-696918   39725831       4.59     2.60
26   BIEC2-696967   39843337       3.18     1.80
26   BIEC2-697162   40183063       3.46     2.05
26   BIEC2-697333   40516276       3.34     2.25
26   BIEC2-697334   40516394       3.34     2.25
27   BIEC2-707024   16504695       4.00     2.16
27   BIEC2-711215   24443333       3.96     3.31
27   BIEC2-718915   34542464       3.34     2.17
29   BIEC2-754872   15306599       3.04     1.72
29   BIEC2-758288   21640418       3.31     1.09
29   BIEC2-762520   30505382       3.87     2.03
30   BIEC2-812892   1165979        4.69     2.80
30   BIEC2-822775   15304825       3.55     1.87
30   BIEC2-823330   16606208       3.51     1.97
30   BIEC2-823417   16985152       4.03     2.37
30   BIEC2-823508   17213090       3.55     2.19
30   BIEC2-825287   19513015       3.43     2.64


Table S4 continued

ECA  SNP ID         Position (bp)  Tassel1  Tassel2  Candidate gene  Gene position in bp (start-end)
30   BIEC2-826952   23799195       4.21     3.76
30   BIEC2-828225   26769212       3.53     2.11
30   BIEC2-828264   27056586       3.26     2.16
30   BIEC2-828366   27729561       4.69     2.70
31   BIEC2-831033   1918704        3.68     2.11
31   BIEC2-833094   5808753        3.10     1.86
31   BIEC2-836843   12977503       3.17     1.81
31   BIEC2-837644   14474097       3.19     2.72
X    BIEC2-1108329  1589940        3.62     3.56
X    BIEC2-1109692  9787770        2.12     3.10
X    BIEC2-1111426  14456628       3.08     2.74
X    BIEC2-1111427  14456630       3.08     2.74
X    BIEC2-1112160  16483078       3.34     2.43
X    BIEC2-1112345  16979524       3.47     2.96
X    BIEC2-1127119  61343778       3.14     1.95
X    BIEC2-1131661  71930948       3.93     2.59
X    BIEC2-1131671  71933909       3.37     1.99
X    BIEC2-1131673  71935075       3.50     2.12
X    BIEC2-1131677  71935403       3.37     1.99
X    BIEC2-1131691  71944666       3.11     1.79
X    BIEC2-1138668  82710964       3.60     2.52
X    BIEC2-1138677  82724519       3.60     2.52
X    BIEC2-1146307  96517795       3.04     1.95

Tassel1 = significance ascertained by integrating population structure (estimated using STRUCTURE (Pritchard, Stephens and Donnelly 2000)) and marker identity by state based kinship in a mixed linear model (MLM) using TASSEL.
Tassel2 = significance ascertained by integrating the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates and marker identity by state based kinship in a mixed linear model (MLM) using TASSEL.


Figure 1 Quantile-quantile plots of observed P-values estimated by adaptive permutation testing using a maximum of 1,000,000 permutations (Plink1) versus the expectation under the null hypothesis. The black line shows the expected distribution and the black points show the observed distribution.


Figure 2 Quantile-quantile plots of observed P-values estimated by adaptive permutation testing considering the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates and using a maximum of 1,000,000 permutations (Plink2) versus the expectation under the null hypothesis. The black line shows the expected distribution and the black points show the observed distribution.


Figure 3 Quantile-quantile plots of observed P-values estimated by CMH adaptive permutation testing within 16 family clusters, considering the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates and using a maximum of 1,000,000 permutations (Plink3) versus the expectation under the null hypothesis. The black line shows the expected distribution and the black points show the observed distribution.


Figure 4 Quantile-quantile plots of observed P-values estimated using a mixed linear animal model that simultaneously accounts for multiple levels of marker-based population structure (Q matrix) and relative kinship among the individuals (K matrix) using TASSEL (Tassel1) versus the expectation under the null hypothesis. The black line shows the expected distribution and the black points show the observed distribution.


Figure 5 Quantile-quantile plots of observed P-values estimated using a mixed linear animal model that simultaneously accounts for the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner and relative kinship among the individuals (K matrix) using TASSEL (Tassel2) versus the expectation under the null hypothesis. The black line shows the expected distribution and the black points show the observed distribution.


CHAPTER 6

Identification of quantitative trait loci for dressage in Hanoverian warm blood horses

Wiebke Schröder, Andreas Klostermann, Kathrin Frederike Stock, Ottmar Distl

Institute for Animal Breeding and Genetics, University of Veterinary Medicine Hannover, Bünteweg 17p, 30559 Hannover, Germany


Identification of QTL for dressage

6 Identification of quantitative trait loci for dressage in Hanoverian warm blood horses

6.1 Summary
The aim of this study was to perform a genome-wide association study (GWAS) for quantitative trait loci (QTL) of dressage in Hanoverian warmblood horses employing the Illumina equine SNP50 BeadChip. We genotyped 115 stallions of the National State Stud of Lower Saxony for our analyses. The quality of walk, trot and canter and the rideability of a horse characterize its dressage talent. To determine associations, we performed adaptive permutation testing with and without parameters to explain potential data stratification. In addition, we employed two different mixed linear animal model approaches to control for marker-based population structure and marker identity-by-state (IBS) based kinship among all individuals. Population stratification was explained best by the mixed linear animal model considering the proportions of Hanoverian, Thoroughbred, Trakehner and Holsteiner genes and the marker identity-by-state relationship matrix. We identified 12 QTL for dressage on horse chromosomes (ECA) 2, 3, 4, 6, 7, 8, 9, 10, 16, 18, 27 and 28 (-log10 P-value >4) and further putative QTL with -log10 P-values of 3-4 on ECA1, 2, 3, 5, 14, 18, 19, 20, 21 and 26. Within the QTL regions we identified functional candidate genes for dressage performance, including VWC2 on ECA4, HPX on ECA7, AF3BL2 on ECA8, TRAPPC9 on ECA9, MYL3 on ECA16 and MCPH1 on ECA27. Within the putative QTL we found MYO5A, GNB5, GABPB and HDC on ECA1, TRPC3 on ECA2, PPARGC1A on ECA3, MARKAPK on ECA5, SHOX2 on ECA19, and GABPA on ECA26 as functional candidate genes. Our results suggest that multiple genes involved in diverse processes are crucial for elite dressage performance. In particular, genes involved in coordination, ataxia and learning aptitude might play a major role in excellent dressage performance. To validate these QTL, further studies in larger data sets and other horse populations are necessary.

Keywords: horse, dressage, quantitative trait loci, GWAS, SNP


6.2 Introduction
Originally bred as a horse suitable for military use, the Hanoverian warmblood (Hanoverian) has, since the foundation of the Hanoverian Stud Book in 1735, been intensely selected for the athletic and competitive phenotype required of a good riding horse. Today the Hanoverian is therefore one of the most important sport horse breeds in the world and is well represented among the best dressage horses worldwide. Excellent movement in walk, trot and canter combined with good rideability is the major aim of dressage horse breeding. The walk should be a rhythmical and even four-beat gait that is ground-covering, energetic, elastic and well balanced. The trot requires a clear two-beat cadence, a high level of impulsion, elasticity, ground cover and balance, and an uphill-moving forehand with a freely moving shoulder. The hindlegs should be active, bend well and move with thrust under the centre of gravity, with clear activity of the musculature of the back and the thighs. The canter should be uphill with a clear three-beat rhythm, impulsion, elasticity, ground cover and balance. Every canter stride should be a powerful push with well-bent hindlegs striding under the centre of gravity. Results of mare performance tests (MPT) and inspections before auctions (PAI) of the Hanoverian studbook society (HSS) over the last decades indicate breeding progress particularly in these dressage-related traits. In this context, the prospect of higher sale prices may lead breeders to put particular weight on the dressage talent of their foals. At a young age, high prices are primarily achieved for foals with good movement, and dressage horses are often the most expensive horses offered at riding horse auctions in Europe. However, training a dressage horse to a high level takes several years and, furthermore, genetic improvement in horses is greatly slowed by the long generation interval.
Heritability estimates for walk, trot, canter and rideability range from 0.26 to 0.36 in Hanoverian warmblood horses (Stock & Distl 2007). Hence, the application of genetic markers in selection schemes to improve dressage performance and to identify the best horses early for professional training would be highly desirable. Although the physical attributes contributing to excellent dressage performance are well described, the genes contributing to dressage performance have not yet been identified. Genomic regions potentially containing genes that influence exercise-related phenotypes in Thoroughbreds were localized for the first time by Gu et al. (2009). In contrast, numerous studies have been performed to identify QTL and candidate genes potentially affecting physical performance in humans. With respect to physical performance, the human is therefore the best-studied species so far (Bray et al. 2009). Ballet, competitive dancing and gymnastics are the human sports whose physical demands are most comparable to those of dressage. In addition to endurance and strength, high levels of balance, coordination and sensitivity are required. For humans and a few other species, e.g. cattle and dogs, genotyping arrays containing SNP markers have been used successfully for mapping QTL for quantitative traits (Karlsson et al. 2007; Myles et al. 2008; Kolbehdari et al. 2009). The completion of the equine genome assembly and the availability of SNP assays covering the whole equine genome make it possible to scan for genetic variation in horses at very high resolution. The aim of this study was to perform genome-wide association (GWA) analyses for dressage, including the quality of walk, trot and canter and rideability as well as the composite trait dressage, in Hanoverian horses using the equine SNP50 BeadChip (Illumina, San Diego, CA, USA) and to screen the potential QTL for possible candidate genes known from human studies.

6.3 Materials and Methods
6.3.1 Animals and phenotypic data
Blood samples were collected from 115 Hanoverian warmblood stallions of the National State Stud of Lower Saxony. These stallions were born between 1972 and 2000 and represent a random sample of all Hanoverian stallions born in the last 20-30 years. Pedigree data were made available by the HSS through the national unified animal ownership database (Vereinigte Informationssysteme Tierhaltung w.V., VIT). Pedigree records allowed us to assign the 115 stallions to 16 families, which included a total of 798 stallions (Table S1). We employed the latest breeding values (BVs) for dressage (May 2009) provided by the HSS. BVs for dressage were estimated based on results recorded at mare performance tests (MPTs) since 1987 and at inspections before auctions (PAIs) since 1999, including 35,512 animals. Dressage is a composite trait resulting from scores for the quality of walk, trot and canter and for rideability. At MPTs, mares are scored for dressage by a judging commission using a scale from 0 (not shown) to 10 (excellently shown) in steps of 0.5. Quality of walk, trot and canter and rideability are scored separately and then averaged to a total score for dressage. Horses pre-selected for sale at riding horse auctions of the HSS are also scored for dressage by a judging commission. Between 1999 and 2008, 8,081 Hanoverians (5,567 males, 2,514 females) were judged at PAIs. Auction candidates are scored for the same traits and on the same scale as the mares at MPTs. The same commission judges the mares at MPTs and the auction candidates to ensure comparable results. If a mare took part in both a PAI and an MPT, the result of the MPT is included in the BV estimation. For mares that repeated the MPT, the last result is included. BVs are estimated yearly by the VIT by applying a multivariate BLUP (best linear unbiased prediction) animal model (Christmann 1996):

yijk = μ + TESTi + aj + eijk

with yijk = score for quality of walk, trot and canter, and rideability; μ = model constant; TESTi = fixed effect of the individual test for MPT or PAI (interaction between the place, the year and the season of performance evaluation); aj = random additive genetic effect of the individual horse; and eijk = random residual. Season was classified into two levels (January to June and July to December). BVs were standardized to a mean of 100 points and a standard deviation of 20 points, using the horses born in 1999 and 2000 as the reference. Every year the reference birth years move forward by one year. For the investigated stallions, the BV for dressage ranged from 48 to 150 (mean 97 ± 23) (Table S2). Reliabilities ranged between 0.25 and 0.99 (mean 0.87 ± 0.13). The distribution of BVs for dressage was analyzed using the UNIVARIATE procedure of SAS software (Statistical Analysis System, version 9.2, SAS Institute, Cary, NC, USA, 2010).
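The standardization of BVs to a mean of 100 and a standard deviation of 20 relative to a reference group can be sketched as follows. This is an illustration only: the function name and input values are hypothetical, and the use of the population standard deviation of the reference group is an assumption, since the exact routine used in the official evaluation is not described here.

```python
import statistics


def standardize_bv(raw_values, reference_values, target_mean=100.0, target_sd=20.0):
    """Rescale raw BLUP solutions so that the reference group has the
    target mean and standard deviation.

    Assumption: the population standard deviation (pstdev) of the reference
    group is used as the scaling factor.
    """
    ref_mean = statistics.fmean(reference_values)
    ref_sd = statistics.pstdev(reference_values)
    if ref_sd == 0.0:
        raise ValueError("reference group has no variation")
    return [target_mean + target_sd * (x - ref_mean) / ref_sd for x in raw_values]


if __name__ == "__main__":
    # Hypothetical raw solutions for the reference birth years.
    reference = [-0.8, -0.2, 0.1, 0.4, 0.9]
    # Hypothetical raw solutions for three stallions to be rescaled.
    print([round(v, 1) for v in standardize_bv([0.0, 0.5, 1.2], reference)])
```

With this scaling, a horse whose raw solution equals the reference mean receives exactly 100 points.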


For each stallion, the proportions of genes of Hanoverian (HAN), Thoroughbred (TB), Trakehner (TRAK) and Holsteiner (HOL) horses were calculated using all available pedigree information. Details are described elsewhere (Hamann & Distl 2008). Mean (median) proportions of genes in the stallions were 0.54 (0.63) for HAN, 0.28 (0.19) for TB, 0.05 (0.03) for TRAK, and 0.06 (0) for HOL (Table S1). Given the uneven representation of gene proportions, classes for each of the four breeds were defined as follows: HAN: ≤0.34, >0.34 and <0.78, ≥0.78; TB: ≤0.13, >0.13 and <0.30, ≥0.30; TRAK: ≤0.20, >0.20 and <0.80, ≥0.80; HOL: 0.00, >0.00 and <0.30, ≥0.30.
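The class limits listed above can be expressed as a small lookup. The sketch below is illustrative; the names CLASS_BOUNDS and proportion_class are hypothetical, and only the thresholds are taken from the text.

```python
# Lower and upper class limits per breed, taken from the thresholds in the text.
# Class 1: proportion <= lower; class 2: lower < proportion < upper; class 3: proportion >= upper.
CLASS_BOUNDS = {
    "HAN": (0.34, 0.78),
    "TB": (0.13, 0.30),
    "TRAK": (0.20, 0.80),
    "HOL": (0.00, 0.30),
}


def proportion_class(breed: str, proportion: float) -> int:
    """Return the class (1, 2 or 3) for a gene proportion of the given breed."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    lower, upper = CLASS_BOUNDS[breed]
    if proportion <= lower:
        return 1
    if proportion < upper:
        return 2
    return 3


if __name__ == "__main__":
    # Hypothetical stallion with 54 % HAN, 28 % TB, 5 % TRAK and no HOL genes.
    stallion = {"HAN": 0.54, "TB": 0.28, "TRAK": 0.05, "HOL": 0.0}
    print({breed: proportion_class(breed, p) for breed, p in stallion.items()})
```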

6.3.2 Genotyping SNPs
Genomic DNA was extracted from EDTA blood samples of the 115 Hanoverian warmblood stallions by standard ethanol fractionation with concentrated sodium chloride (6 M NaCl) and sodium dodecyl sulphate (10% SDS). The concentration of extracted DNA was determined using the Nanodrop ND 1000 (Peqlab Biotechnologie GmbH, Erlangen, Germany). The DNA concentration of the samples was adjusted to between 30 and 80 ng/μl. Genotyping was performed with the Illumina equine SNP50 BeadChip containing 54,602 SNP markers using standard procedures as recommended by the manufacturer. Raw data were analysed using the genotyping module version 3.2.32 of the BeadStudio program (Illumina). To assign the genotypes, we generated a cluster file with the BeadStudio software and the same genotyping module.

6.3.3 Data analysis
For genome-wide mapping we performed association analyses for all SNPs with a minor allele frequency (MAF) >0.01 and a call rate >0.90. No SNP was excluded on the basis of the missingness test. A total of 7875 SNPs did not reach a sufficient MAF and 3951 SNPs had a call rate ≤0.90, so 43,441 SNPs were left for the association analyses. To control for spurious associations, we tested the effects of possible stratification on the outcome of the GWA and employed empirical genome-wide error probabilities obtained through adaptive permutations. The models employed were parameterized using PLINK, version 1.07 (http://pngu.mgh.harvard.edu/~purcell/plink/ (Purcell et al. 2007)) and TASSEL, version 2.1 (http://www.maizegenetics.net/tassel (Bradbury et al. 2007)). First, genome-wide associations were determined without any parameters to explain potential data stratification. For this purpose, adaptive permutations for correction of multiple testing were performed using a maximum of 1,000,000 permutations and the “--assoc” and “--aperm” options of PLINK (Plink1). An extended model included the gene proportions of the important founder breeds HAN, TB, TRAK and HOL to improve the results of the GWA analyses. The covariates were considered as class effects with four levels each. The adaptive permutations were done by applying a linear regression model using the “--linear” and “--covar” options of PLINK (Plink2). In a third PLINK model, we tested the effect of family structure on the GWA analysis. We performed Cochran-Mantel-Haenszel (CMH) tests within the 16 families and simultaneously considered the gene proportions of HAN, TB, TRAK and HOL as covariates. Here, we utilized the “--mh”, “--within” and “--aperm” options of PLINK (Plink3). A mixed linear animal model (MLM) was employed to control for marker-based population structure (Q-matrix) and marker identity-by-state (IBS) based kinship among all individuals (K-matrix) using TASSEL (Tassel1). The data for building these two matrices came from 7375 genome-wide, equidistantly distributed SNPs with pair-wise linkage disequilibrium <0.2 (Ritland 1996). The Q-matrix contained three covariates for the cryptic structure of the stallions as determined by STRUCTURE, version 2.3.3 (Pritchard et al. 2000), via optimization of the likelihood of the data. The K-matrix was created with the KIN option of TASSEL by calculating the marker IBS coefficients.
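The idea behind a marker IBS-based kinship matrix can be illustrated with a minimal sketch. It assumes the generic definition of pairwise IBS similarity (the mean proportion of alleles shared identical by state over all SNPs with calls in both animals) and genotypes coded as minor allele counts 0, 1 or 2; the function names are hypothetical, and the KIN option of TASSEL may scale or handle missing data differently.

```python
def ibs_similarity(genotypes_a, genotypes_b):
    """Mean proportion of alleles shared identical by state between two animals.

    Genotypes are coded as minor allele counts (0, 1 or 2); None marks a
    missing call, and SNPs missing in either animal are skipped.
    """
    shared = 0.0
    n_snps = 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue
        shared += (2 - abs(a - b)) / 2.0  # 1.0, 0.5 or 0.0 per SNP
        n_snps += 1
    if n_snps == 0:
        raise ValueError("no SNP with calls in both animals")
    return shared / n_snps


def ibs_matrix(genotypes):
    """Symmetric matrix of pairwise IBS similarities (diagonal set to 1.0)."""
    n = len(genotypes)
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            value = ibs_similarity(genotypes[i], genotypes[j])
            matrix[i][j] = value
            matrix[j][i] = value
    return matrix


if __name__ == "__main__":
    # Three hypothetical animals genotyped at four SNPs.
    animals = [[0, 1, 2, 1], [0, 1, 1, None], [2, 1, 0, 1]]
    for row in ibs_matrix(animals):
        print([round(v, 2) for v in row])
```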
The subset of 7375 SNPs was generated as a pruned set of SNPs in approximate linkage equilibrium with each other. For this purpose, the “--indep-pairwise 2000 500 0.2” option of PLINK was used (sliding window size of 2000 SNPs, window shift of 500 SNPs and an r2 threshold of 0.2). We implemented two MLM models. The first MLM model (Tassel1) accounted for the cryptic structure as determined via STRUCTURE and for the IBS-kinship matrix. In the second MLM model (Tassel2), the proportions of genes of HAN, TB, TRAK and HOL as covariates and the IBS-kinship matrix were taken into account. The MLM (Yu et al. 2006) was implemented in TASSEL as described in Henderson’s notation (Bradbury et al. 2007):

y = Qβ + Zu + Ga + e

where y is the BV for dressage, walk, trot, canter or rideability; β is an unknown vector containing fixed effects of population structure (Q-matrix) or the proportions of genes of HAN, TB, TRAK and HOL; u is an unknown vector of random additive genetic effects from multiple background QTL for individuals (K-matrix); a is an unknown vector containing the genotype effects of the SNPs in the GWA; Q, Z and G are the known design matrices; and e is the unobserved vector of random residuals. Subsequently, we calculated the observed polymorphism information content (PIC) using the ALLELE procedure of SAS/Genetics (Statistical Analysis System, version 9.2, SAS Institute, Cary, NC, USA, 2010). We built quantile-quantile (Q-Q) plots to visualise the observed versus expected P-value distribution for each of the models employed. The observed –log10 P-values were plotted against the –log10 P-values expected under the null hypothesis of independence (Fig. 1-5). The divergence of the observed –log10 P-values from the expected distribution along the regression line represents the inflation of P-values, which is mainly caused by data stratification. According to the Q-Q plots, the smallest inflation of –log10 P-values was observed using an MLM with the gene proportions of HAN, TB, TRAK and HOL as covariates and the K-matrix for random additive genetic effects due to the IBS relationship among all animals (Tassel2).
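The construction of such a Q-Q plot can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the expected P-value of the SNP with rank i among n tests is taken as (i - 0.5)/n, which is one common convention and not necessarily the one used for Fig. 1-5, and the function name and example values are hypothetical.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10 P-values for a Q-Q plot.

    Assumption: under the null hypothesis the P-values are uniform, so the
    i-th smallest of n P-values is expected at (i - 0.5) / n.
    """
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    expected = sorted(-math.log10((rank - 0.5) / n) for rank in range(1, n + 1))
    return list(zip(expected, observed))


# Hypothetical P-values that follow the null expectation exactly
uniform = [(rank - 0.5) / 4 for rank in range(1, 5)]
on_diagonal = qq_points(uniform)

# The same P-values made ten times smaller, mimicking genome-wide inflation
inflated = qq_points([p / 10 for p in uniform])
```

Points from the first call lie on the X=Y line, whereas every point from the second call is shifted upwards by one –log10 unit, which is the pattern interpreted as stratification in the text.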

Based on the Q-Q plots, we defined a SNP as significant with -log10 (P) >3 and as highly significant with -log10 (P) >4 using Tassel1 or Tassel2. We found 1000 SNPs to be highly significant using Plink1, 694 SNPs using Plink2 and 151 SNPs using Plink3. Using MLM, 270 SNPs were associated with the BV for dressage employing Tassel1 and 57 employing Tassel2. Subsequently, potential QTL were defined as genomic regions with at least one SNP marker estimated as highly significant using Tassel1 or Tassel2 (Table 1). Further putative QTL were defined as genomic regions harbouring at least one SNP marker estimated as significant using Tassel1 and Tassel2 (Table S1). Estimates of the additive and dominance effects for each of the most significant SNPs within each potential QTL were obtained using Best Linear Unbiased Prediction (BLUP) with the software PEST (Groeneveld et al. 1990) and the following model:

$y_{ijklmno} = \mu + GT_i + HAN_j + TB_k + TRAK_l + HOL_m + a_n + e_{ijklmno}$

with $y_{ijklmno}$ = BVs for dressage, walk, trot, canter and rideability, $\mu$ = model constant, $GT_i$ = genotype of the most significant SNP within each QTL, $HAN_j$ = proportion of Hanoverian genes, $TB_k$ = proportion of Thoroughbred genes, $TRAK_l$ = proportion of Trakehner genes, $HOL_m$ = proportion of Holsteiner genes, $a_n$ = random additive genetic effect of the individual horse (n = 1-3665) and $e_{ijklmno}$ = residual. The additive genetic effect of the most significant SNP within each QTL was estimated as half of the difference between the least square means of the two homozygous genotypes. The dominance effect was calculated as the deviation of the least square mean of the heterozygotes from the average of the two homozygous genotypes. If none of the investigated stallions was homozygous for the minor allele of the SNP, the genotype effect was calculated as the deviation of the least square mean of the homozygous genotype from the least square mean of the heterozygous genotype. Significance was tested using F-tests. The genotype-based BVs (gBV) for dressage were calculated for each stallion based on the observed additive or dominance effects of these SNPs.
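The definitions of the additive, dominance and genotype effects given above can be written out as a short Python sketch. The function name, the sign convention (first homozygote minus second homozygote) and the least square means in the example are assumptions made for illustration; the actual estimates were obtained with PEST.

```python
def genotype_effects(lsm_hom1, lsm_het, lsm_hom2=None):
    """Derive SNP effects from least square means (LSM) of the genotypes.

    lsm_hom1 : LSM of the first homozygous genotype
    lsm_het  : LSM of the heterozygous genotype
    lsm_hom2 : LSM of the second homozygous genotype, or None if no animal
               carried it (then only a genotype effect is returned)
    """
    if lsm_hom2 is None:
        # deviation of the observed homozygote from the heterozygote
        return {"genotype": lsm_hom1 - lsm_het}
    additive = (lsm_hom1 - lsm_hom2) / 2.0
    dominance = lsm_het - (lsm_hom1 + lsm_hom2) / 2.0
    return {"additive": additive, "dominance": dominance}


# Hypothetical least square means in BV points
full = genotype_effects(120.0, 110.0, 90.0)
partial = genotype_effects(120.0, 110.0)
```

With these hypothetical means, the heterozygotes lie above the midpoint of the two homozygotes, so the dominance effect is positive.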

The gBV was obtained as the sum of these effects over the SNPs:

$gBV = \sum_{i=1}^{13} (a + d + b)_i$

with gBV = genotype-based BV for dressage, a = additive effect of each SNP, d = dominance effect of each SNP or b = genotype effect of the SNP. For better comparability, we standardized the gBVs to a mean value of 100 points and a standard deviation of 20 points. Subsequently, correlations between BV dressage and gBV dressage were calculated using the CORR procedure of SAS/Genetics. We performed multiple analyses of variance (ANOVA) to test the influence of gBV dressage on the distribution of BV dressage (r2) using the GLM procedure of SAS/Genetics.
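The standardization and the correlation step can be sketched as follows. This is a minimal Python illustration, not the SAS procedures used in the study: the use of the sample standard deviation (n - 1) is an assumption, and the function names and example values are hypothetical.

```python
import math


def standardize(values, target_mean=100.0, target_sd=20.0):
    """Rescale values to the given mean and (sample) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [target_mean + target_sd * (v - mean) / sd for v in values]


def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical raw gBVs (sums of SNP effects) and BVs of three stallions
raw_gbv = [-10.0, 0.0, 10.0]
bv = [95.0, 105.0, 130.0]

gbv = standardize(raw_gbv)
r = pearson(gbv, bv)
```

Because standardization is a linear transformation, the correlation between gBV and BV is the same before and after rescaling.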

6.4 Results

We were able to define 12 QTL for dressage on horse chromosomes (ECA) 2, 3, 4, 6, 7, 8, 9, 10, 16, 18, 27 and 28 (Table 1). The QTL regions were on ECA2 at 115.8 Mb, on ECA4 at 19.1 Mb, on ECA6 at 45.8 Mb, on ECA7 at 76.9 Mb, on ECA10 at 73.0 Mb, on ECA27 at 33.3 Mb and on ECA28 at 45.2 Mb. On ECA3, the highest –log10 P-values were at 89.4-89.6 Mb and on ECA8 at 39.3 Mb. Peak values were highest at 79.2-79.3 Mb on ECA9, at 40.8-41.3 Mb on ECA16 and at 0.8 Mb on ECA18. Further putative QTL with –log10 P-values >3 and <4 were on ECA1, 2, 3, 5, 14, 18, 19, 20, 21 and 26 (Table S3). The mean polymorphism information content (PIC) was 0.26 across the SNPs with –log10 P-values >4. The largest additive effects, with values of 16.1-16.3 points for BV dressage, were detected for BIEC2-734776 and BIEC2-745466 on ECA28 (Table 2). Additive effects of similar but smaller size were shown for the QTL on ECA4, 7 and 9. With the exception of the QTL on ECA9 and on ECA28 at 22 Mb, significant dominance effects could be demonstrated for all other QTL. The dominance effects ranged from -9 to -20 and from 21 to 32 points of BV dressage. For the QTL with a MAF <0.09, one homozygous genotype was missing in each case and thus additive and dominance effects were not estimable. The correlation coefficient estimated between the gBV for dressage and the BV for dressage was 0.54. The proportion of variance explained by the most significant SNPs within each QTL was 0.40. Table 3 shows the distribution of SNP genotypes per proportions of Hanoverian, Thoroughbred, Trakehner and Holsteiner genes.
Within the QTL regions, the von Willebrand factor C domain containing 2 gene (VWC2) on ECA4, the hemopexin gene (HPX) on ECA7, the ATPase family gene 3-like 2 gene (AFG3L2) on ECA8, the trafficking protein particle complex 9 gene (TRAPPC9) on ECA9, the myosin, light chain 3, alkali; ventricular, skeletal, slow gene (MYL3) on ECA16 and the microcephalin 1 gene (MCPH1) on ECA27 could be identified as functional candidate genes for physical performance. Within the putative QTL we found the myosin VA (heavy chain 12, myoxin) gene (MYO5A), the guanine nucleotide binding protein (G protein), beta 5 gene (GNB5), the GA binding protein transcription factor, beta gene (GABPB) and the histidine decarboxylase gene (HDC) on ECA1, the transient receptor potential cation channel, subfamily C, member 3 gene (TRPC3) on ECA2, the peroxisome proliferator-activated receptor gamma, coactivator 1 alpha gene (PPARGC1A) on ECA3, the mitogen-activated protein kinase-activated protein kinase 2 gene (MAPKAPK2) on ECA5, the short stature homeobox 2 gene (SHOX2) on ECA19, and the GA binding protein transcription factor, alpha subunit 60kDa gene (GABPA) on ECA26 as functional candidate genes. Table S2 shows all SNPs estimated as significant using Tassel1 or Tassel2 and potential candidate genes from human studies for physical fitness.

6.5 Discussion

The aim of our study was to reveal SNP markers associated with dressage in Hanoverian warmblood horses. We were able to define 12 QTL and 15 further putative QTL. Based on the moderate number of associated SNPs, we suggest that several genes are involved in elite dressage performance, similar to physical performance in humans (Macarthur & North 2005). The Q-Q plots show the expected distribution of association test statistics (X-axis) across the SNPs compared to the observed values (Y-axis). Any systematic deviation from the X=Y line implies consistent data stratification across the whole genome, as seen in Fig. 1-4. The Q-Q plot for observed P-values calculated using Tassel2 shows a solid line matching X=Y until it curves at the end, representing the small number of truly associated SNPs. Hence, omitting the effects of kinship and the proportions of genes in the GWA would result in an increased false discovery rate caused by stratification within the investigated population. The estimated –log10 error probabilities vary substantially between the different models used, indicating a high sensitivity of GWA analyses to the model. In particular, results estimated using PLINK differ from those estimated using TASSEL. We suppose that the algorithm used in TASSEL takes the data stratification of the investigated stallion population better into account. Our results indicate that a careful model choice is crucial for reliable GWA. However, we suppose that some of the differences between –log10 error probabilities are due to SNPs with low MAF. Such associations are questionable because they are based on only a few animals. We cannot rule out that these QTL are false positive associations, and further investigations are required to verify their influence on dressage performance. The observed genotype frequencies and the estimated dominance effect of BIEC2-800503 (ECA3), as well as the observed genotype frequencies and estimates for additive effects of BIEC2-1045942 (ECA8) and BIEC2-131285 (ECA10), indicate that positive selection for dressage is already taking place in the investigated stallion population. However, genotype frequencies and additive, dominance and genotype effects of the most significant SNPs within the other QTL provide a possibility to improve the BV for dressage in future progeny by marker-based selection for a genotype favourable for dressage. Our results indicate that there is still scope to improve dressage performance within the Hanoverian population by selective breeding. The identified QTL were compared to positively selected regions in Thoroughbreds found by Gu et al. (2009). The QTL on ECA27 and 28 and the putative QTL on ECA21 marginally overlapped with positively selected regions in Thoroughbreds. None of the other potential QTL we found coincided with the positively selected regions in the Thoroughbred. In the Hanoverian horse population analyzed here, possible breed-related marker associations were deliberately accounted for in the models used in order to reveal within-breed variation for dressage. In contrast, Gu et al. (2009) were searching for across-breed variation to reveal genomic regions distinctive primarily for Thoroughbreds. We reviewed 28 human performance-related genes for which genome-wide associated mutations have been shown in man.
Dressage requires a high level of muscular tension combined with elastic, rhythmic, regular and supple movements. Additionally, balance, coordinative skills and a high level of sensitivity and learning aptitude are crucial for elite dressage performance. On ECA2 we found TRPC3 as a potential candidate gene for dressage. TRPC3 is thought to form a receptor-activated non-selective calcium permeant cation channel.


Becker et al. (2009) found a gain-of-function substitution in exon 7 of the TRPC3 gene in mice with cerebellar ataxia, termed 'moonwalker', which show motor and coordination defects associated with progressive loss of Purkinje cells in the cerebellum. The VWC2 gene may have an impact on the brain as well. This gene encodes a secreted bone morphogenetic protein antagonist. The encoded protein is possibly involved in neural function and development and may have a role in cell adhesion. Koike et al. (2007) created a knockout of the VWC2 ortholog in zebrafish and observed impairment of brain development. The equine ortholog is located on ECA4. On ECA8 we found the AFG3L2 gene as a potential candidate gene; it encodes a protein that is localized in mitochondria and closely related to paraplegin. The paraplegin gene is responsible for an autosomal recessive form of hereditary spastic paraplegia and is a candidate gene for other hereditary spastic paraplegias or neurodegenerative disorders. In humans and mice, mutations in AFG3L2 are associated with spinocerebellar ataxia, loss of balance, tremor and cerebellar degeneration with loss of Purkinje cells and parallel fibers, and reactive astrogliosis (Di Bella et al. 2010; Martinelli et al. 2009). In the same QTL we found the ABHD3 gene harbouring BIEC2-1045942 and BIEC2-1046042 as intragenic SNPs. This gene encodes a protein containing an alpha/beta hydrolase fold, which is a catalytic domain found in a very wide range of enzymes, but the function of this protein has not been determined yet and it remains open whether ABHD3 has an influence on dressage performance. Dressage in particular demands a high level of coordination, balance and learning aptitude from a horse; for this reason, TRPC3, VWC2 and AFG3L2 are interesting candidate genes. Another gene related to brain function is TRAPPC9 on ECA9. BIEC2-1106486 and BIEC2-1106495 are part of the coding sequence of this gene.
The protein encoded by this gene functions as an activator of NF-kappa-B through increased phosphorylation of the I kappa B kinase complex and functions in neuronal cell differentiation. In humans, mutations in this gene are related to mental retardation (Mochida et al. 2009; Mir et al. 2009; Philippe et al. 2009). On ECA7 we detected the HPX gene, which encodes a plasma glycoprotein that binds heme with high affinity and transports it from the plasma to the liver and may be involved in protecting cells from oxidative stress (Takahashi et al. 1985). We were able to detect MYL3 as a potential candidate gene on ECA16. MYL3 encodes myosin light chain 3, an alkali light chain. The myosin molecule consists of two heavy chains and four associated light chains. Two of the light chains are regulatory light chains encoded by the MYL2 gene, and two are alkali light chains, or essential light chains, encoded by the MYL3 gene. The light chains stabilize the long alpha-helical neck of the myosin head. Myosins are a large family of motor proteins that share the common features of ATP hydrolysis, actin binding and potential for kinetic energy transduction. Their function in striated muscle is only partially understood (Poetter et al. 1996). In humans, mutations in MYL3 have been identified as a cause of mid-left ventricular chamber type hypertrophic cardiomyopathy (Kabaeva et al. 2002). The limiting factor for all kinds of physical performance is a sufficient oxygen supply, which is highly dependent on the cardiac output per minute and the oxygen carrying capacity of the blood. Hence, HPX and MYL3 are plausible candidate genes for dressage performance. On ECA27 we found MCPH1 as a potential candidate gene. MCPH1 encodes a DNA damage response protein. It may play a role in neurogenesis and regulation of the size of the cerebral cortex. Several studies revealed that mutations in this gene are related to microcephaly and mental retardation in humans (e.g. Jackson et al. 2002; Garshasbi et al. 2006). Within the QTL on ECA3 and 18 we found KIAA1239 and ARHGEF4 as positional candidate genes with highly associated intragenic SNPs. KIAA1239 (ECA3) encodes the leucine-rich repeat and WD repeat-containing protein KIAA1239. However, the function of the KIAA1239 gene is largely unknown (Nagase et al. 1999).
The rho guanine nucleotide exchange factor (GEF) 4 is encoded by ARHGEF4 (ECA18) and plays a fundamental role in numerous cellular processes that are initiated by extracellular stimuli that work through G protein coupled receptors (Thiesen et al. 2000). It remains open whether KIAA1239 and ARHGEF4 have an influence on dressage performance, and further investigations are required for verification. Within the putative QTL we found further potential candidate genes. On ECA1 we detected two genes associated with ataxia, one gene associated with human performance capability and another associated with behaviour and coordination. MYO5A is one of three myosin V heavy-chain genes, belonging to the myosin gene superfamily. Myosin V is a class of actin-based motor proteins involved in cytoplasmic vesicle transport and anchorage, spindle-pole alignment and mRNA translocation. The protein encoded by this gene is abundant in melanocytes and nerve cells. GNB5 encodes a beta subunit of the heterotrimeric guanine nucleotide-binding proteins (G proteins), which integrate signals between receptors and effector proteins. Jones et al. (2000) determined that mice with convulsive limb movements and ataxia persistent into adulthood express a novel gene combining the promoter and first 2 exons of GNB5 with the C-terminal exons of the closely linked MYO5A gene. On the same chromosome, we detected HDC and GABPB as functional candidate genes. The enzyme encoded by the HDC gene catalyzes the biosynthesis of histamine from histidine. The biogenic amine histamine is an important modulator of numerous physiologic processes, including neurotransmission, gastric acid secretion, and smooth muscle tone (Bruneau et al. 1992). Several animal studies found that in mice a lack of HDC results in increased locomotor and stereotypic behaviors, as well as increased anxiety (e.g. Kubota et al. 2002; Dere et al. 2004). Strong nerves and good learning aptitude are crucial for elite dressage horses; hence HDC may be a functional candidate gene. GABPB (ECA1) and GABPA (ECA26) encode the beta and alpha subunits of the GA-binding protein transcription factor, which is also referred to as nuclear respiratory factor-2 (NRF2). In the human NRF2 gene, a SNP was found to positively influence human endurance performance, leading to a higher training response in VO2max (Eynon et al. 2009; He et al. 2007).
PPARGC1A (ECA3) transcriptionally activates a complex pathway of lipid and glucose metabolism and is expressed primarily in tissues of high metabolic activity such as liver, heart and exercising oxidative skeletal muscle fibers. It is a coactivator of the subset of oxidative phosphorylation genes that control glucose and lipid transportation and oxidation, skeletal muscle fiber type formation and mitochondrial biosynthesis (Puigserver et al. 1998). Studies in Brangus steers revealed associations of SNPs in the bovine PPARGC1A with growth and meat quality traits. A SNP marker within the human PPARGC1A shows a strong association with endurance capacity. Trained individuals generally show increased PPARGC1A mRNA levels and increased resistance to muscle fatigue (Lucia et al. 2005). Recently, Eivers et al. (2009) found in Thoroughbreds a significant association between post-exercise PPARGC1A expression in equine skeletal muscle and post-exercise plasma lactate concentration. Thus PPARGC1A is a candidate gene that might have an impact on dressage by affecting an athletic and muscular phenotype. The MAPKAPK2 gene encodes a member of the Ser/Thr protein kinase family which is known to be involved in many cellular processes including stress and inflammatory responses, nuclear export, gene expression regulation and cell proliferation. Hegen et al. (2006) found that MAPKAPK2 knockout mice were protected against collagen-induced arthritis. The equine orthologous gene is in proximity to BIEC2-888469 on ECA5, which was revealed as significant for dressage. High-level dressage puts stress on joints, ligaments and tendons. Healthy limbs are required for excellent movements, and because of the long period of time it takes to train a dressage horse to high levels, limb health is an economically important factor in dressage horse breeding. Mutations in the human SHOX2 lead to growth retardation associated with Turner, Leri-Weill dyschondrosteosis, and Langer mesomelic dysplasia syndromes, which are marked by shortening of the forearms and lower legs. Confirming results could be observed in mice with inactivated SHOX2 in the developing limbs (Cobb et al. 2006). The equine orthologous gene is localized within the QTL significantly associated with dressage on ECA19. For a noble and athletic looking horse and for a good quality of gaits, long legs are crucial. Therefore, genes like SHOX2 may have an impact on limb development and are candidate genes for dressage.
None of the genes involved in human physical performance could be detected within the highly associated QTL regions for dressage, and only a few were within the putative QTL, indicating that genes affecting equine dressage performance largely differ from those detected for human physical performance. We suppose that only a few human sports, like competitive dancing or ballet, require physical conditions comparable to those required for dressage. However, in humans, primarily endurance and power sports have been investigated for candidate genes so far. Our results show that multiple genes involved in diverse processes may be crucial for elite dressage performance. In particular, genes involved in coordination, ataxia and learning aptitude may play a major role in excellent dressage performance. Scoring the quality of gaits and rideability is always partly influenced by subjective opinions, making GWA analyses more difficult. Hence, to verify the QTL regions for dressage, further analyses including larger populations and denser SNP marker sets are required. However, our approach appeared useful as a starting point to identify QTL for dressage within a breed.

6.6 References

Becker E.B.E., Oliver P.L., Glitsch M.D., Banks G.T., Achilli F., Hardy A., Nolan P.M., Fisher E.M.C. & Davies K.E. (2009) A point mutation in TRPC3 causes abnormal Purkinje cell development and cerebellar ataxia in moonwalker mice. Proceedings of the National Academy of Sciences of the United States of America 106, 6706- 11. Bradbury P.J., Zhang Z., Kroon D.E., Casstevens T.M., Ramdoss Y. & Buckler E.S. (2007) TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23, 2633-5. Bray M.S., Hagberg J.M., Pérusse L., Rankinen T., Roth S.M., Wolfrath B. & Bouchard C. (2009) The human gene map for performance and health-related fitness phenotypes: the 2006-2007 update. Medicine and science in sports and exercise 41, 35-73. Bruneau G., Nguyen V.C., Gros F., Bernheim A. & Thibault J. (1992) Preparation of a rat brain histidine decarboxylase (HDC) cDNA probe by PCR and assignment of the human HDC gene to chromosome 15. Human Genetics 90, 235–8. Christmann, L. (1996) Zuchtwertschätzung für Merkmale der Stutbuchaufnahme und\line der Stutenleistungsprüfung im Zuchtgebiet Hannover. Georg-August Universität Göttingen. Dissertation Cobb J., Dierich A., Huss-Garcia Y. & Duboule D. (2006) A mouse model for human short-stature syndromes identifies Shox2 as an upstream regulator of Runx2 during long-bone development. Proceedings of the National Academy of Sciences

98

Identification of QTL for dressage

of the United States of America 103, 4511-5. Dere E., De Souza-Silva M.A., Spieler R.E., Lin J.S., Ohtsu H., Haas H.L. & Huston J.P. (2004) Changes in motoric, exploratory and emotional behaviours and neuronal acetylcholine content and 5-HT turnover in histidine decarboxylase-KO mice. European Journal of Neuroscience 20, 1051-58. Di Bella, D., Lazzaro F., Brusco A., Plumari M., Battaglia G., Pastore A., Finardi A., Cagnoli C., Tempia F., Frontali M., Veneziano L., Sacco T., Boda E., Brussino A., Bonn F., Castellotti B., Baratta S., Mariotti C., Gellera C., Fracasso V., Magri S., Langer T., Plevani P., Di Donato S., Muzi-Falconi M. & Taroni F. (2010) Mutations in the mitochondrial protease gene AFG3L2 cause dominant hereditary ataxia SCA28. Nature Genetics 42, 313-21. Edgar A. J. & Polak J. M. (2002) Cloning and tissue distribution of three murine alpha/beta hydrolase fold protein cDNAs. BMC Genomics 292, 617-25. Eivers S.S., McGivney B.A., Fonseca R.G., Machugh D.E., Menson K., Park S.D.E., Rivero J.-L.L., Taylor C.T., Katz L.M.& Hill E.W. (2009) Alterations in oxidative gene expression in equine skeletal muscle following exercise and training. Physiological Genomics 40, 83-93 Eynon N., Sagiv M., Meckel Y., Duarte J.A., Alves A.J., Yamin C., Sagiv M., Goldhammer E. & Oliveira J. (2009) NRF2 intron 3 A/G polymorphism is associated with endurance athletes' status. Journal of Applied Physiology 107, 76- 9. Garshasbi M., Motazacker M.M., Kahrizi K., Behjati F., Abedini S.S., Nieh S.E., Firouzabadi S.G., Becker C., Ruschendorf F., Nurnberg P., Tzschach A., Vazifehmand R., Erdogan F., Ullmann R., Lenzner S., Kuss A.W., Ropers H.H. & Najmabadi H. (2006) SNP array-based homozygosity mapping reveals MCPH1 deletion in family with autosomal recessive mental retardation and mild microcephaly. Human Genetics 118, 708-15. Groeneveld E., Kovac M. & Wang T. (1990) Pest, a general purpose BLUP package for multivariate prediction and estimation. 
In World Congress on Genetics Applied to Livestock Production, 488–91. Edinburgh, UK Garsi, Madrid. Gu J., Orr N., Park S.D., Katz L.M., Sulimova G., MacHugh D.E. & Hill E.W. (2009) A

99

Identification of QTL for dressage

Genome Scan for Positive Selection in Thoroughbred Horses. PLoS ONE 4: e5767. Hamann H., & Distl O. (2008) Genetic variability in Hanoverian warmblood horses using pedigree analysis. Journal of Animal Science 86, 1503-13. He Z., Hu Y., Feng L., Lu Y., Liu G., Xi Y., Wen L. & McNaughton L.R. (2007) NRF2 genotype improves endurance capacity in response to training. International Journal of Sports Medicine 28, 717-21. Hegen M., Gaestel M., Nickerson-Nutter C.L., Lin L.-L. & Telliez J.-B. (2006) MAPKAP kinase 2-deficient mice are resistant to collagen-induced arthritis. Journal of Immunology 177, 1913-7. Hill E.W., Gu J., Eivers S.S., Fonseca R.G., McGivney B.A., Govindarajan P., Orr N., Katz L.M. & MacHugh D.E. (2010) A sequence polymorphism in MSTN predicts sprinting ability and racing stamina in Thoroughbred horses. PloS One 5: e8645. Jackson A.P., Eastwood H., Bell S.M., Adu J., Toomes C., Carr I.M., Roberts E., Hampshire D.J., Crow Y.J., Mighell A.J., Karbani G., Jafri H., Rashid Y., Mueller R.F., Markham A.F. & Woods C.G. (2002) Identification of microcephalin, a protein implicated in determining the size of the human brain. American Journal of Human Genetics 71, 136-42. Kabaeva Z.T., Perrot A., Wolter B., Dietz R., Cardim N., Correia J.M., Schulte H.D., Aldashev A.A., Mirrakhimov M.M. & Osterziel K.J. (2002) Systematic analysis of the regulatory and essential myosin light chain genes: genetic variants and mutations in hypertrophic cardiomyopathy. European Journal of Human Genetics 10, 741-8. Karlsson E.K., Baranowska I., Wade C.M., Salmon Hillbertz N.H.C., Zody M.C., Anderson N., Biagi T.M., Patterson N., Pielberg G.R., Kulbokas E.J. 3rd, Comstock K.E., Keller E.T., Mesirov J.P., von Euler H., Kämpe O., Hedhammar A., Lander E.S., Andersson G., Andersson L. & Lindblad-Toh K. ( 2007) Efficient mapping of mendelian traits in dogs through genome-wide association. Nature Genetics 39, 1321-8. Koike N., Kassai Y., Kouta Y., Miwa H., Konishi M. & Itoh N. 
(2007) Brorin, a novel secreted bone morphogenetic protein antagonist, promotes neurogenesis in

100

Identification of QTL for dressage

mouse neural precursor cells. Journal of Biological Chemistry 282, 15843-50. Kolbehdari D., Wang Z., Grant J.R., Murdoch B., Prasad A., Xiu Z., Marques E., Stothard P. & Moore S.S. (2009) A whole genome scan to map QTL for milk production traits and somatic cell score in Canadian Holstein bulls. Journal of animal breeding and genetics 126, 216-27. Kubota Y., Ito C., Sakurai E., Sakurai E., Watanabe T. & Ohtsu H. (2002) Increased methamphetamine-induced locomotor activity and behavioral sensitization in histamine-deficient mice. Journal of Neurochemistry 83, 837-45. Lucia A., Gómez-Gallego F., Barroso I., Rabadán M., Bandrés F., San Juan A.F., Chicharro J.L., Ekelund U., Brage S., Earnest C.P., Wareham N.J. & Franks P.W. (2005) PPARGC1A genotype (Gly482Ser) predicts exceptional endurance capacity in European men. Journal of Applied Physiology 99, 344-8. Macarthur D.G. & North K.N. (2005) Genes and human elite athletic performance. Human Genetics 116, 331-9. Martinelli P., La Mattina V., Bernacchia A., Magnoni R., Cerri F.,Cox G., Quattrini A., Casari G. & Rugarli E.I. (2009) Genetic interaction between the m-AAA protease isoenzymes reveals novel roles in cerebellar degeneration. Human Molecular Genetics 18, 2001-13. Mir A., Kaufman L., Noor A., Motazacker M.M., Jamil T., Azam M., Kahrizi K., Rafiq M. A., Weksberg R., Nasr T., Naeem F., Tzschach A., Kuss A.W., Ishak G.E., Doherty D., Ropers H.H., Barkovich A.J., Najmabadi H., Ayub M. & Vincent J.B. (2009) Identification of mutations in TRAPPC9, which encodes the NIK- and IKK- beta-binding protein, in nonsyndromic autosomal-recessive mental retardation. American Journal of Human Genetics 85, 909-15. Mochida G.H., Mahajnah M., Hill A.D., Basel-Vanagaite L., Gleason D., Hill R.S., Bodell A., Crosier M., Straussberg R. & Walsh C.A.A (2009) truncating mutation of TRAPPC9 is associated with autosomal-recessive intellectual disability and postnatal microcephaly. American Journal of Human Genetics 85, 897-902. 
Myles S., Tang K., Somel M., Green R.E., Kelso J. & Stoneking M. (2008) Identification and analysis of genomic regions with large between-population differentiation in humans. Annals of human genetics 72, 99-110.

101

Identification of QTL for dressage

Nagase T., Ishikawa K., Kikuno R., Hirosawa M., Nomura N. & Ohara O. (1999) Prediction of the coding sequences of unidentified human genes. XV. The complete sequences of 100 new cDNA clones from brain which code for large proteins in vitro. DNA Research 6, 337-45.
Philippe O., Rio M., Carioux A., Plaza J.-M., Guigue P., Molinari F., Boddaert N., Bole-Feysot C., Nitschke P., Smahi A., Munnich A. & Colleaux L. (2009) Combination of linkage mapping and microarray-expression analysis identifies NF-kappa-B signaling defect as a cause of autosomal-recessive mental retardation. American Journal of Human Genetics 85, 903-8.
Poetter K., Jiang H., Hassanzadeh S., Master S.R., Chang A., Dalakas M.C., Rayment I., Sellers J.R., Fananapazir L. & Epstein N.D. (1996) Mutations in either the essential or regulatory light chains of myosin are associated with a rare myopathy in human heart and skeletal muscle. Nature Genetics 13, 63-9.
Pritchard J.K., Stephens M. & Donnelly P. (2000) Inference of population structure using multilocus genotype data. Genetics 155, 945-59.
Puigserver P., Wu Z., Park C.W., Graves R., Wright M. & Spiegelman B.M. (1998) A cold-inducible coactivator of nuclear receptors linked to adaptive thermogenesis. Cell 92, 829-39.
Purcell S., Neale B., Todd-Brown K., Thomas L., Ferreira M., Bender D., Maller J., Sklar P., de Bakker P.I., Daly M.J. & Sham P.C. (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics 81, 559-75.
Ritland K. (1996) Estimators for pairwise relatedness and individual inbreeding coefficients. Genetics Research 67, 175-85.
Stock K.F. & Distl O. (2007) Genetic correlations between performance traits and radiographic findings in the limbs of German Warmblood riding horses. Journal of Animal Science 85, 31-41. doi:10.2527/jas.2005-605.
Takahashi N., Takahashi Y. & Putnam F.W. (1985) Complete amino acid sequence of human hemopexin, the heme-binding protein of serum. Proceedings of the National Academy of Sciences 82, 73-7.


Thiesen S., Kubart S., Ropers H.-H. & Nothwang H.G. (2000) Isolation of two novel human RhoGEFs, ARHGEF3 and ARHGEF4, in 3p13-21 and 2q22. Biochemical and Biophysical Research Communications 273, 364-9.
Yu J., Pressoir G., Briggs W.H., Vroh Bi I., Yamasaki M., Doebley J.F., McMullen M.D., Gaut B.S., Nielsen D.M., Holland J.B., Kresovich S. & Buckler E.S. (2006) A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nature Genetics 38, 203-8.


Table 1 Quantitative trait loci for dressage, their position on Equus caballus chromosome (ECA) in bp, minor allele frequency (MAF), polymorphism information content (PIC), SNP motif (minor allele written in bold) and -log10 P-values estimated using different models.

ECA  SNP            Position   MAF   PIC   Alleles  Plink1  Plink2  Plink3  Tassel1  Tassel2
2    BIEC2-508353   115762143  0.35  0.35  C/T      1.13    0.80    1.26    3.51     4.09
3    BIEC2-800378   89489113   0.18  0.25  C/T      0.12    0.24    0.03    3.55     4.10
3    BIEC2-800503   89619156   0.20  0.27  C/T      0.51    0.74    0.59    4.65     4.92
4    BIEC2-852129   19079166   0.22  0.29  C/T      0.58    0.10    0.38    4.75     4.61
6    BIEC2-952532   45761583   0.01  0.03  C/T      3.58    1.49    1.86    5.52     4.48
7    BIEC2-1007282  76868086   0.43  0.37  A/G      2.15    1.72    0.94    5.68     4.72
8    BIEC2-1045942  39346486   0.06  0.11  C/T      2.41    2.32    3.48    4.06     4.15
9    BIEC2-1106486  79208732   0.42  0.37  G/T      1.62    1.35    3.21    4.38     3.65
9    BIEC2-1106495  79299429   0.32  0.34  A/G      1.34    1.04    2.38    3.53     3.43
10   BIEC2-131285   73028535   0.08  0.14  A/G      4.13    4.46    2.95    3.45     4.40
16   BIEC2-342983   40815437   0.02  0.04  C/T      3.00    2.09    0.59    4.41     3.69
16   BIEC2-343385   41315164   0.02  0.04  A/G      3.00    2.09    0.59    4.41     3.69
18   BIEC2-389893   763441     0.02  0.04  C/T      3.30    2.59    0.74    4.53     4.11
18   BIEC2-389894   763444     0.02  0.04  A/G      3.30    2.59    0.74    4.53     4.11
27   BIEC2-717291   33345559   0.02  0.04  C/T      3.89    2.01    1.45    6.68     5.98
28   BIEC2-734776   21957440   0.09  0.15  A/G      2.22    0.84    2.49    3.31     3.34
28   BIEC2-745466   45168480   0.23  0.29  A/C      0.19    0.46    0.16    4.29     3.72

Plink1 = adaptive permutation testing.
Plink2 = adaptive permutation testing using the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates.
Plink3 = adaptive permutation testing using the MH-test within 16 core families and the proportion of Hanoverian, Thoroughbred, Trakehner and Holsteiner genes as covariates.
Tassel1 = mixed linear model (MLM) for population structure estimated using STRUCTURE (Pritchard et al. 2000) and identity by state relationship matrix among all individuals.
Tassel2 = mixed linear model (MLM) for the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates and marker identity by state based kinship among all individuals.
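The relation between the MAF and PIC columns of Table 1 can be illustrated with a short sketch. It assumes that PIC was computed with the standard formula for a biallelic marker, PIC = 1 - (p^2 + q^2) - 2 p^2 q^2, where q is the MAF and p = 1 - q; the table itself does not state the formula, and the function name is hypothetical.

```python
def pic_biallelic(maf: float) -> float:
    """PIC of a biallelic marker from its minor allele frequency.

    Assumption: the standard biallelic PIC formula is used; this is
    not confirmed by the table or its footnotes.
    """
    p, q = 1.0 - maf, maf
    return 1.0 - (p ** 2 + q ** 2) - 2.0 * p ** 2 * q ** 2


# MAF values taken from three rows of Table 1
for maf in (0.35, 0.18, 0.02):
    print(f"MAF={maf:.2f}  PIC={pic_biallelic(maf):.2f}")
# prints PIC values 0.35, 0.25 and 0.04
```

Because the tabulated MAF values are rounded to two decimals, a PIC recomputed this way can differ from the tabulated PIC in the last digit for very rare alleles.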


Table 2 Estimates of the additive (a) and dominance (d) or genotype (b) effects with their standard errors (SE) and error probabilities (P) for the most significant SNP within each QTL, using an animal model for the trait dressage, and the distribution of genotype frequencies for the respective SNP.

QTL  ECA  SNP-ID         Allele 1  Allele 2  a ± SE         P (a)  d or b ± SE         P (d/b)  Freq. 11  Freq. 12  Freq. 22
1    2    BIEC2-508353   T         C         2.44 ± 2.90    0.40   -9.21 ± 3.92 (d)    <0.05    0.12      0.46      0.42
2    3    BIEC2-800503   T         C         -8.51 ± 5.80   0.15   21.27 ± 6.71 (d)    <0.01    0.63      0.35      0.03
3    4    BIEC2-852129   T         C         -13.06 ± 5.01  <0.01  -15.15 ± 5.93 (d)   0.01     0.04      0.36      0.60
4    6    BIEC2-952532   T         C         -              -      32.19 ± 11.19 (b)   <0.01    0         0.03      0.97
5    7    BIEC2-1007282  A         G         10.11 ± 2.88   <0.01  -8.79 ± 3.69 (d)    <0.05    0.34      0.44      0.21
6    8    BIEC2-1045942  T         C         -              -      -18.01 ± 5.42 (b)   <0.01    0         0.12      0.88
7    9    BIEC2-1106486  A         G         -9.22 ± 2.90   <0.05  4.17 ± 3.62 (d)     0.25     0.15      0.56      0.30
8    10   BIEC2-131285   A         G         -              -      -21.47 ± 10.45 (b)  <0.01    0.83      0.17      0
9    16   BIEC2-342983   T         C         -              -      21.27 ± 10.45 (b)   <0.05    0         0.04      0.95
10   18   BIEC2-389893   T         C         -              -      24.20 ± 9.39 (b)    0.01     0         0.04      0.95
11   27   BIEC2-717291   T         C         -              -      29.25 ± 9.73 (b)    <0.01    0         0.04      0.95
12   28   BIEC2-734776   A         G         16.29 ± 6.98   <0.05  -6.55 ± 8.35 (d)    0.43     0.83      0.16      0.02
13   28   BIEC2-745466   A         C         16.14 ± 6.42   0.01   -19.62 ± 7.10 (d)   <0.01    0.55      0.39      0.03

a = (m22 - m11) / 2; d = m12 - ((m11 + m22) / 2); b = m12 - m11/22, where m11, m12 and m22 are the genotype means and m11/22 is the mean of the homozygous genotype that was observed (11 or 22). The markers (d) and (b) indicate which of the two effects is reported.
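The effect definitions in the footnote of Table 2 can be written out as a minimal sketch. The genotype means used in the example are hypothetical numbers chosen for illustration, not values from the thesis, and the function names are assumptions. The reading of b as the difference between the heterozygote mean and the mean of the only homozygous genotype present is an interpretation of the footnote.

```python
def additive_effect(m11: float, m22: float) -> float:
    """a = (m22 - m11) / 2, half the difference between homozygote means."""
    return (m22 - m11) / 2.0


def dominance_effect(m11: float, m12: float, m22: float) -> float:
    """d = m12 - (m11 + m22) / 2, heterozygote deviation from the midpoint."""
    return m12 - (m11 + m22) / 2.0


def genotype_effect(m12: float, m_hom: float) -> float:
    """b = m12 - m_hom, used when only one homozygous genotype occurs."""
    return m12 - m_hom


# Hypothetical mean breeding values per genotype (illustration only)
m11, m12, m22 = 90.0, 100.0, 104.0
print(additive_effect(m11, m22))         # 7.0
print(dominance_effect(m11, m12, m22))   # 3.0
print(genotype_effect(m12, m22))         # -4.0
```

For SNPs where one homozygous genotype has a frequency of 0, a and d cannot be estimated, which is why only b is reported in those rows.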


Table 3 Distribution of genotypes per proportions of Hanoverian (HAN), Thoroughbred (TB), Trakehner (TRAK) and Holsteiner (HOL) genes for the genotyped stallions.

Genotype frequency (%) per class of the proportion of genes (columns are breed and class, e.g. HAN-1 = Hanoverian class 1)

SNP            Genotype  HAN-1  HAN-2  HAN-3  TB-1   TB-2   TB-3   TRAK-1 TRAK-2 TRAK-3 HOL-1  HOL-2
BIEC2-508353   TT        4      14     16     13     12     12     4      18     18     13     6
BIEC2-508353   TC        50     41     52     53     44     42     58     43     33     45     50
BIEC2-508353   CC        46     45     32     33     44     45     38     39     49     41     44
BIEC2-800503   TT        68     61     61     67     67     52     65     54     67     63     61
BIEC2-800503   TC        32     34     39     30     29     48     33     39     33     34     39
BIEC2-800503   CC        0      5      0      3      4      0      2      7      0      3      0
BIEC2-852129   TT        11     4      0      0      6      6      0      0      13     4      6
BIEC2-852129   TC        43     29     42     30     38     36     50     32     21     35     39
BIEC2-852129   CC        46     68     58     70     56     58     50     68     67     61     56
BIEC2-952532   TT        0      0      0      0      0      0      0      0      0      0      0
BIEC2-952532   TC        7      2      0      0      2      6      0      4      5      2      6
BIEC2-952532   CC        93     98     100    100    98     94     100    96     95     98     94
BIEC2-1007282  AA        11     36     52     43     42     12     35     35     51     34     33
BIEC2-1007282  AG        43     48     39     53     38     45     52     46     33     43     50
BIEC2-1007282  GG        43     16     10     3      17     42     23     25     15     23     11
BIEC2-1045942  TT        0      0      0      0      0      0      0      0      0      0      0
BIEC2-1045942  CT        7      13     16     20     8      12     17     4      13     13     6
BIEC2-1045942  CC        93     88     84     80     92     89     83     96     87     87     94
BIEC2-1106486  TT        11     18     13     17     13     15     15     21     10     12     28
BIEC2-1106486  TG        54     55     58     40     63     58     52     64     54     57     50
BIEC2-1106486  GG        36     27     29     43     23     27     33     14     36     31     22
BIEC2-131285   AA        93     80     81     77     81     94     92     75     79     82     89
BIEC2-131285   AG        7      20     19     23     19     6      8      25     21     18     11
BIEC2-131285   GG        0      0      0      0      0      0      0      0      0      0      0
BIEC2-342983   TT        4      0      0      0      0      0      0      0      0      0      0
BIEC2-342983   CT        7      5      0      0      4      9      0      4      10     4      6
BIEC2-342983   CC        89     95     100    100    94     91     100    93     90     96     89
BIEC2-389893   TT        0      0      0      0      0      0      0      0      0      0      0
BIEC2-389893   TC        11     4      0      0      4      9      2      4      8      4      6
BIEC2-389893   CC        86     96     100    100    94     91     98     93     92     96     89
BIEC2-717291   TT        0      0      0      0      0      0      0      0      0      0      0
BIEC2-717291   TC        7      5      0      0      4      9      0      4      10     4      6
BIEC2-717291   CC        89     95     100    100    94     91     100    93     90     96     89
BIEC2-734776   AA        75     82     90     90     85     73     83     89     77     85     72
BIEC2-734776   AG        18     18     10     10     12     27     15     11     21     14     22
BIEC2-734776   GG        7      0      0      0      4      0      2      0      3      1      6
BIEC2-745466   AA        61     53     52     67     48     55     63     36     59     55     56
BIEC2-745466   AC        25     41     48     33     46     33     31     64     31     39     39
BIEC2-745466   CC        7      2      0      0      4      3      0      0      8      3      0

Classes of the proportion of genes:
HAN 1: ≤0.34, 2: >0.34 and <0.78, 3: ≥0.78
TB 1: ≤0.13, 2: >0.13 and <0.30, 3: ≥0.30
TRAK 1: ≤0.20, 2: >0.20 and <0.80, 3: ≥0.80
HOL 1: 0.00, 2: >0.00 and <0.30, 3: ≥0.30
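The class boundaries listed in the footnote of Table 3 can be expressed as a small lookup. This is a sketch only: the thresholds are copied from the footnote, while the function name, the dictionary and the example proportions are hypothetical.

```python
# Boundaries (lower, upper) copied from the footnote of Table 3.
# Class 1: proportion <= lower; class 3: proportion >= upper; otherwise class 2.
CLASS_BOUNDS = {
    "HAN": (0.34, 0.78),
    "TB": (0.13, 0.30),
    "TRAK": (0.20, 0.80),
}


def gene_class(breed: str, proportion: float) -> int:
    """Return class 1, 2 or 3 for a proportion of genes (0.0 to 1.0)."""
    if breed == "HOL":
        # Holsteiner: class 1 means no Holsteiner genes at all
        if proportion == 0.0:
            return 1
        return 2 if proportion < 0.30 else 3
    lower, upper = CLASS_BOUNDS[breed]
    if proportion <= lower:
        return 1
    if proportion < upper:
        return 2
    return 3


# Hypothetical stallion with 51% Hanoverian and 33% Thoroughbred genes
print(gene_class("HAN", 0.51), gene_class("TB", 0.33), gene_class("HOL", 0.0))
# prints: 2 3 1
```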


Table S1 Families analyzed, their size, number of stallions genotyped, the mean breeding value (BV) for dressage, and mean proportions of Hanoverian (HAN), Thoroughbred (TB), Trakehner (TRAK) and Holsteiner (HOL) genes for the genotyped stallions.

Proportions of genes are given as the percentage descending from each breed.

Family  Total family  Stallions   BV for     HAN (%)  TB (%)  TRAK (%)  HOL (%)
number  size          genotyped   dressage
1       63            10          89 ± 21    51.4     33.3    1.1       0
2       19            3           97 ± 11    55.0     43.7    1.7       0
3       98            7           87 ± 18    16.7     23.1    0.3       54.8
4       37            8           126 ± 19   32.3     26.0    20.4      8.1
5       65            7           88 ± 21    71.6     27.1    1.3       0
6       47            7           85 ± 14    74.3     14.1    4.3       6.4
7       31            4           96 ± 9     21.3     12.8    3.3       0
8       80            13          85 ± 19    74.8     15.7    2.5       1.5
9       56            5           113 ± 22   24.6     74.8    0.4       0
10      33            6           92 ± 13    36.5     3.8     3.5       0
11      70            15          87 ± 16    79.3     16.4    3.9       0
12      15            2           125 ± 18   41.0     29.0    5.0       0
13      17            4           89 ± 14    47.2     38.5    4.5       5.3
14      29            2           81 ± 15    75.0     15.0    4.0       0
15      66            16          120 ± 21   78.4     13.9    6.1       1.2
16      72            6           99 ± 24    0        100     0         0
Total   798           115         97         48.7     30.45   3.9       4.8


Table S2 Quantiles of breeding value (BV) and genotype breeding values (gBV) for dressage and number of genotyped stallions per quantile.

Quantile (%)  Stallions per    BV dressage  Stallions per     gBV dressage
              quantile (BV)                 quantile (gBV)
95            5                141          10                123
90            16               129          15                72
75            31               112          30                61
50            29               94           29                46
25            18               82           17                40
10            6                70           6                 29
5             6                62           6                 23


Table S3 Quantitative trait loci (QTL) and putative QTL for dressage, their position on Equus caballus chromosome (ECA) in bp, minor allele frequency (MAF), polymorphism information content (PIC), SNP motif and -log10 P-values estimated using different models.

ECA  SNP            Position   MAF   PIC   Alleles  Plink1  Plink2  Plink3  Tassel1  Tassel2
1    BIEC2-61367    139468469  0.02  0.04  A/G      2.69    0.73    2.12    3.48     3.06
2    BIEC2-461362   21008582   0.11  0.18  A/G      3.93    3.56    3.88    3.06     3.46
2    BIEC2-461736   22109951   0.32  0.33  A/C      3.67    3.38    2.86    3.88     3.10
2    BIEC2-480931   58526415   0.06  0.11  G/T      1.98    1.70    1.06    3.58     3.00
2    BIEC2-481459   59229254   0.38  0.36  C/T      5.17    3.76    1.37    3.50     3.31
2    BIEC2-504940   107039863  0.38  0.36  G/T      1.88    3.05    1.27    3.31     3.24
2    BIEC2-505906   110824134  0.15  0.22  G/T      2.61    2.60    1.16    3.42     3.30
2    BIEC2-508353   115762143  0.35  0.35  C/T      1.13    0.80    1.26    3.51     4.09
3    BIEC2-800378   89489113   0.18  0.25  C/T      0.12    0.24    0.03    3.55     4.10
3    BIEC2-800482   89567828   0.20  0.27  A/C      0.12    0.25    0.06    3.38     3.90
3    BIEC2-800503   89619156   0.20  0.27  C/T      0.51    0.74    0.59    4.65     4.92
3    BIEC2-806874   100801174  0.46  0.37  A/C      2.37    2.46    2.27    3.03     3.20
3    BIEC2-806875   100801330  0.46  0.37  A/C      2.37    2.46    2.27    3.03     3.20
3    BIEC2-811905   117829016  0.49  0.37  A/G      0.26    0.07    0.36    3.30     3.38
4    BIEC2-852129   19079166   0.22  0.29  C/T      0.58    0.10    0.38    4.75     4.61
5    BIEC2-888469   3612914    0.21  0.27  C/T      1.63    1.34    1.60    3.09     3.61
6    BIEC2-952532   45761583   0.01  0.03  A/G      3.58    1.49    1.86    5.52     4.48
7    BIEC2-1007282  76868086   0.43  0.37  A/G      2.15    1.72    0.94    5.68     4.72
8    BIEC2-1045942  39346486   0.06  0.11  C/T      2.41    2.32    3.48    4.06     4.15
8    BIEC2-1046042  40043641   0.45  0.37  G/T      0.14    0.44    0.09    4.17     3.89
9    BIEC2-1106486  79208732   0.42  0.37  G/T      1.62    1.35    3.21    4.38     3.65
9    BIEC2-1106495  79299429   0.32  0.34  A/G      1.34    1.04    2.38    3.53     3.43
10   BIEC2-131285   73028535   0.08  0.14  A/G      4.13    4.46    2.95    3.45     4.40
14   BIEC2-262404   66630266   0.43  0.37  A/C      0.88    0.84    0.84    3.53     3.37
16   BIEC2-342983   40815437   0.02  0.04  C/T      3.00    2.09    0.59    4.41     3.69
16   BIEC2-343385   41315164   0.02  0.04  A/G      3.00    2.09    0.59    4.41     3.69
18   BIEC2-389893   763441     0.02  0.04  C/T      3.30    2.59    0.74    4.53     4.11
18   BIEC2-389894   763444     0.02  0.04  C/T      3.30    2.59    0.74    4.53     4.11
18   BIEC2-396099   6633895    0.24  0.30  G/T      1.30    1.67    1.19    3.00     3.02
18   BIEC2-397558   8701876    0.43  0.37  A/G      1.70    2.01    2.68    3.51     3.22
19   BIEC2-422417   1134701    0.48  0.37  C/T      1.38    1.54    1.52    3.71     3.16
20   BIEC2-512341   1939539    0.43  0.37  A/G      0.53    0.39    0.79    3.37     3.38
20   BIEC2-512342   1939607    0.43  0.37  C/T      0.53    0.39    0.79    3.37     3.38
20   BIEC2-512522   2301617    0.21  0.28  A/G      2.80    2.47    2.72    3.31     3.11
20   BIEC2-532022   39334283   0.31  0.34  A/G      1.97    2.36    0.70    3.40     3.08
21   BIEC2-552102   11883502   0.43  0.37  C/T      3.64    3.94    3.90    3.36     3.05
26   BIEC2-691198   24412608   0.42  0.37  C/T      0.24    0.42    0.41    3.06     3.05
27   BIEC2-717291   33345559   0.02  0.04  C/T      3.89    2.01    1.45    6.68     5.98
28   BIEC2-734776   21957440   0.09  0.16  A/G      2.22    0.84    2.49    3.31     3.34
28   BIEC2-745466   45168480   0.23  0.29  A/C      0.19    0.46    0.16    4.29     3.72

Plink1 = adaptive permutation testing.
Plink2 = adaptive permutation testing using the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates.
Plink3 = adaptive permutation testing using the MH-test within 16 core families and the proportion of Hanoverian, Thoroughbred, Trakehner and Holsteiner genes as covariates.
Tassel1 = mixed linear model (MLM) for population structure estimated using STRUCTURE (Pritchard et al. 2000) and identity by state relationship matrix among all individuals.
Tassel2 = mixed linear model (MLM) for the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates and marker identity by state based kinship among all individuals.


Table S4 Single nucleotide polymorphisms (SNPs) significantly associated with the breeding value for dressage (-log10 (P) >3), their position on Equus caballus chromosome (ECA) in bp, the -log10 error probability estimated with Tassel1 and Tassel2, and potential candidate genes with their position in bp (start-end).

ECA  SNP ID         Position   Tassel1  Tassel2  Candidate gene  Gene position (start-end)
1    BIEC2-61367    139468469  3.48     3.06     MYO5A           138142462-138257171
                                                 GNB5            138372166-138410652
                                                 HDC             139917440-139938572
                                                 GABPB           139870278-139897140
2    BIEC2-459850   18796966   3.60     2.08
2    BIEC2-461362   21008582   3.06     3.46
2    BIEC2-461470   21180537   3.00     2.72
2    BIEC2-461736   22109951   3.88     3.10
2    BIEC2-462953   23822569   2.77     3.29
2    BIEC2-480907   58518774   3.42     2.96
2    BIEC2-480931   58526415   3.58     3.00
2    BIEC2-481459   59229254   3.50     3.31
2    BIEC2-497438   90811943   2.39     3.08
2    BIEC2-504940   107039863  3.31     3.24     TRPC3           106125238-106174871
2    BIEC2-505686   109839135  2.60     3.00
2    BIEC2-505687   109839370  2.60     3.00
2    BIEC2-505705   109978520  2.60     3.00
2    BIEC2-505906   110824134  3.42     3.30
2    BIEC2-508353   115762143  3.51     4.09
3    BIEC2-783140   57838137   3.17     2.20
3    BIEC2-783144   57875355   3.01     2.05
3    BIEC2-787275   64947509   3.71     2.89
3    BIEC2-799204   86451588   3.15     2.43
3    BIEC2-800051   88940197   2.80     3.12
3    BIEC2-800071   88985065   2.80     3.12
3    BIEC2-800378   89489113   3.55     4.10
3    BIEC2-800482   89567828   3.38     3.90
3    BIEC2-800503   89619156   4.65     4.92
3    BIEC2-806874   100801174  3.03     3.20     PPARGC1A        100784624-100876530
3    BIEC2-806875   100801330  3.03     3.20
3    BIEC2-810308   113536881  3.12     2.80
3    BIEC2-811905   117829016  3.30     3.38
4    BIEC2-852129   19079166   4.75     4.61
4    BIEC2-876011   96625871   3.27     2.80
5    BIEC2-885847   36327      3.00     2.48
5    BIEC2-887094   1958579    3.00     2.21
5    BIEC2-888469   3612914    3.09     3.61     MAPKAPK2        2952805-2956869
5    BIEC2-888636   3972453    2.85     3.05
5    BIEC2-897582   20301396   3.17     2.49
5    BIEC2-897738   21119272   3.42     2.44
5    BIEC2-898163   23011953   3.23     2.59
5    BIEC2-928986   91087138   3.00     1.97
5    BIEC2-929000   91106034   3.00     1.97
6    BIEC2-946258   30516410   3.45     2.32
6    BIEC2-946260   30516445   3.45     2.32
6    BIEC2-952532   45761583   5.52     4.48
7    BIEC2-1007282  76868086   5.68     4.72     HPX             75419475-75427985
8    BIEC2-1045942  39346486   4.06     4.15     AFG3L2          37825926-37856104
8    BIEC2-1046042  40043641   4.17     3.89
8    BIEC2-1051244  50820999   3.00     2.44
8    BIEC2-1063951  80751226   3.00     2.74
9    BIEC2-1071439  6013125    3.36     2.57
9    BIEC2-1071676  6471465    3.40     2.60
9    BIEC2-1073687  9664365    3.89     2.92
9    BIEC2-1081527  25840557   2.60     3.22
9    BIEC2-1103471  69412870   3.17     2.43
9    BIEC2-1106486  79208732   4.38     3.65
9    BIEC2-1106495  79299429   3.53     3.43
9    BIEC2-1106581  79659292   3.44     2.92
10   BIEC2-130897   70278303   3.05     2.82
10   BIEC2-130900   70280670   3.05     2.82
10   BIEC2-131285   73028535   3.45     4.40
11   BIEC2-157726   49040482   3.38     2.89
11   BIEC2-157945   49514091   3.19     2.21
14   BIEC2-244643   12013212   3.11     2.06
14   BIEC2-244645   12019461   3.11     2.06
14   BIEC2-262404   66630266   3.53     3.37
15   BIEC2-318239   71050260   3.09     2.82
15   BIEC2-318243   71054291   3.16     2.92
15   BIEC2-318679   72221305   3.69     2.74
16   BIEC2-342973   40811163   3.38     2.17     MYL3            39968273-39973952
16   BIEC2-342983   40815437   4.41     3.69
16   BIEC2-343385   41315164   4.41     3.69
16   BIEC2-359817   72862351   3.19     2.44
17   BIEC2-378051   53414593   3.81     2.92
17   BIEC2-378055   53436693   3.55     2.74
17   BIEC2-382246   63363901   2.62     4.01
17   BIEC2-382273   63381047   2.62     4.01
17   BIEC2-386377   78980457   2.68     3.08
18   BIEC2-389893   763441     4.53     4.11
18   BIEC2-389894   763444     4.53     4.11
18   BIEC2-390659   1694571    3.30     2.49
18   BIEC2-396099   6633895    3.00     3.02
18   BIEC2-397558   8701876    3.51     3.22
19   BIEC2-422417   1134701    3.71     3.16     SHOX2           1769247-1774673
19   BIEC2-422566   2251539    3.11     2.96
20   BIEC2-512341   1939539    3.37     3.38
20   BIEC2-512342   1939607    3.37     3.38
20   BIEC2-512522   2301617    3.31     3.11
20   BIEC2-522360   17010592   3.09     2.59
20   BIEC2-532022   39334283   3.40     3.08
20   BIEC2-533182   43890537   3.10     2.35     VEGFA           42691424-42704833
20   BIEC2-534001   45464819   3.88     2.82
20   BIEC2-535085   47181060   3.00     2.15
20   BIEC2-537914   50410240   3.00     2.14
20   BIEC2-538611   51188088   3.60     2.51
21   BIEC2-552102   11883502   3.36     3.05
21   BIEC2-554073   16204106   3.24     2.19
21   BIEC2-554074   16204360   3.36     2.52
23   BIEC2-614032   13782561   3.68     2.68
23   BIEC2-614085   13873062   3.50     2.37
23   BIEC2-614086   13873067   3.50     2.37
23   BIEC2-614144   13943057   3.72     2.64
23   BIEC2-624203   38864822   2.47     3.21
24   BIEC2-639514   22191201   2.96     3.32
25   BIEC2-662132   17657913   3.49     2.80     TNC             19705467-19766034
26   BIEC2-691198   24412608   3.06     3.05     NRF2 (GABPA)    23390826-23422854
26   BIEC2-691713   25808727   3.03     2.40
27   BIEC2-709106   20868100   3.05     2.24
27   BIEC2-710064   22033847   3.62     2.42
27   BIEC2-713079   27187797   2.92     3.17
27   BIEC2-713445   27915218   3.69     2.74
27   BIEC2-717291   33345559   6.68     5.98     MCPH1           34149682-34373827
27   BIEC2-719149   34732013   3.01     2.39
28   BIEC2-734776   21957440   3.31     3.34
28   BIEC2-735623   24316688   3.19     2.82     IGF1            26183639-26252893
28   BIEC2-743993   40740834   3.25     1.92
28   BIEC2-744767   42539177   3.03     1.81     PPARA           42093244-42113011
28   BIEC2-745466   45168480   4.29     3.72

Tassel1 = significance ascertained integrating population structure (estimated using STRUCTURE (Pritchard et al. 2000)) and marker identity by state based kinship in a mixed linear model (MLM) by using TASSEL.
Tassel2 = significance ascertained integrating the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates and marker identity by state based kinship in a mixed linear model (MLM) by using TASSEL.


Figure 1 Quantile-quantile (Q-Q) plot of observed P-values estimated by adaptive permutation testing with a maximum of 1,000,000 permutations (Plink1) versus the expectation under the null hypothesis. The black line shows the expected distribution, the black points show the absolute observed distribution and the red line shows the middle observed distribution.


Figure 2 Quantile-quantile (Q-Q) plot of observed P-values estimated by adaptive permutation testing with a maximum of 1,000,000 permutations, using the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates (Plink2), versus the expectation under the null hypothesis. The black line shows the expected distribution, the black points show the absolute observed distribution and the red line shows the middle observed distribution.


Figure 3 Quantile-quantile (Q-Q) plot of observed P-values estimated by CMH adaptive permutation testing within 16 family clusters with a maximum of 1,000,000 permutations, using the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates (Plink3), versus the expectation under the null hypothesis. The black line shows the expected distribution, the black points show the absolute observed distribution and the red line shows the middle observed distribution.


Figure 4 Quantile-quantile (Q-Q) plot of observed P-values estimated using a mixed linear animal model in TASSEL that simultaneously accounts for multiple levels of marker-based population structure (Q matrix) and relative kinship among the individuals (K matrix) (Tassel1), versus the expectation under the null hypothesis. The black line shows the expected distribution, the black points show the absolute observed distribution and the red line shows the middle observed distribution.


Figure 5 Quantile-quantile (Q-Q) plot of observed P-values estimated using a mixed linear animal model in TASSEL that simultaneously accounts for the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner and relative kinship among the individuals (K matrix) (Tassel2), versus the expectation under the null hypothesis. The black line shows the expected distribution, the black points show the absolute observed distribution and the red line shows the middle observed distribution.
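The points of a Q-Q plot such as those in Figures 1 to 5 can be derived from a list of P-values as sketched below. This is an illustration under stated assumptions, not the procedure used by PLINK or TASSEL: the expected P-value of the i-th smallest of n tests is taken as i / (n + 1), which is one common plotting position, and the function name and example P-values are hypothetical.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10 P pairs, largest values first.

    Assumptions: all P-values are > 0, and the null expectation of the
    i-th smallest P-value among n tests is i / (n + 1).
    """
    n = len(p_values)
    observed = sorted((-math.log10(p) for p in p_values), reverse=True)
    expected = [-math.log10((i + 1) / (n + 1)) for i in range(n)]
    return list(zip(expected, observed))


# Hypothetical P-values for three SNPs
for exp_val, obs_val in qq_points([0.5, 0.01, 0.2]):
    print(f"expected={exp_val:.2f}  observed={obs_val:.2f}")
```

Points lying above the diagonal (observed larger than expected) across most of the range indicate inflation of the test statistics, whereas agreement over most of the range with departures only at the largest values is the pattern expected for a well-calibrated model with a few true associations.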


CHAPTER 7

A genome wide association study for quantitative trait loci of conformation in Hanoverian warmblood horses

Wiebke Schröder, Andreas Klostermann, Ottmar Distl

Institute for Animal Breeding and Genetics, University of Veterinary Medicine Hannover, Bünteweg 17p, 30559 Hannover, Germany


QTL for conformation

7 A genome wide association study for quantitative trait loci of conformation in Hanoverian warmblood horses

7.1 Abstract

A functional conformation is crucial for elite equine performance. The conformational traits considered in population genetic analyses of the Hanoverian are grouped under two headings, riding horse points (RHP) and limbs (LIMBS). RHP includes all traits that constitute the quality of a riding horse (head, neck, saddle position, frame, type and development), whereas LIMBS comprises all traits that describe the quality of limb conformation (front legs, hind legs and correctness of gaits). Employing the Illumina equine SNP50 BeadChip, we performed genome-wide association (GWA) analyses to map quantitative trait loci (QTL) for RHP and LIMBS. We genotyped 115 stallions of the National State Stud of Lower Saxony. To control spurious associations caused by population stratification, two different mixed linear animal model (MLM) approaches were employed in addition to three general linear models with adaptive permutations to correct for multiple testing. Population stratification was best accounted for by an MLM that included the proportions of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner and a marker identity by state (IBS) based kinship. We revealed four QTL for RHP on ECA3, 15, 19 and 20 and two QTL for LIMBS on ECA5 and 18 (-log10 P-value >5).

Further putative QTL with -log10 P-values between 3 and 5 were detected for RHP on ECA3, 6, 17, 18, 19, 21 and 27 and for LIMBS on ECA1, 3, 5, 8, 10, 11, 14, 17, 18, 19, 20, 25, 26 and 31. Within the QTL regions for RHP on ECA3 we identified PPARGC1A and LCORL as candidate genes. Within the further putative QTL regions, CYP27B1 on ECA6, MYO7B on ECA18, SHOX2 on ECA19 and FST on ECA21 were identified as functional candidate genes for RHP. For LIMBS we detected PRG4 on ECA5 and MYO7B on ECA18 as functional candidate genes. Within the further putative QTL regions, SHOX2 on ECA19, COL15A1 and RAD23B on ECA25, RNF160 on ECA26 and PLAGL1 on ECA31 were identified as candidate genes for LIMBS.


7.2 Introduction

Any elite physical performance in horses is based on a favourable functional conformation. Positive genetic correlations between conformation traits and performance in equestrian sports have been shown in several, although not all, genetic studies [1], [2]. In addition, a favourable conformation can be a sales advantage, particularly at a young age. Hence, the Hanoverian studbook society (HSS) aims at selecting animals with the best conformation values for the next generation. The HSS recommends a modern and noble sport horse of varying calibre that is big-framed, with a well-defined outline, lean texture and good musculature, and with a clear sex type. Horses intended for breeding under the HSS are therefore scored beforehand by a judging commission for conformation and correctness of gaits.

To simplify conformation-oriented breeding, breeding values (BVs) for conformational traits of the Hanoverian warmblood are combined into a total BV for riding horse points (RHP) and a BV for LIMBS. RHP is a composite trait resulting from scores for conformation of the head, the neck and the saddle position, the type, the frame, and the general impression and development. BV LIMBS is also a composite trait, resulting from scores for front legs, hind legs and correctness of gaits. Heritability estimates in Hanoverian warmbloods range from 0.19 to 0.50 for conformational traits included in RHP, and from 0.09 to 0.12 for traits included in LIMBS [3].

Genetic improvement in horses is greatly slowed by the long generation interval, so the application of genetic markers in selection schemes to improve body conformation could be highly desirable. However, even though population genetic analyses are performed routinely nowadays, studies on QTL and candidate genes contributing to equine body conformation are still at an early stage. In humans, dairy cattle and dogs, genotyping arrays containing SNP markers have been used successfully to map QTL for quantitative traits. With the completion of the equine genome assembly, SNP assays spanning the whole equine genome have become available, making large-scale identification, validation and analysis of genotypic variation in horses possible, but no such study has been published yet.


The objective of this study was a genome-wide association (GWA) analysis for RHP and LIMBS in Hanoverian warmblood horses using the equine SNP50 BeadChip (Illumina, San Diego, CA, USA).

7.3 Results

The GWA analysis identified four QTL for RHP on ECA3, 15, 19 and 20 (Table 1). Peak values were at 101.3–109.0 Mb on ECA3 and at 84.4–87.9 Mb on ECA15. Only one SNP, at 2.3 Mb, supported the QTL on ECA19. On ECA20, -log10 P-values were highest at 19.3–24.1 Mb. Further putative QTL for RHP were detected on ECA2, 3, 6, 15, 17, 18, 20, 21 and 27 (Table S1). They were located at 11.5–14.5 Mb and 43.9–46.9 Mb on ECA2, at 81.4–85.6 Mb on ECA3, at 75.5–78.7 Mb on ECA6, at 102.2 Mb on ECA17, at 2.3 Mb on ECA18, at 15.4 Mb on ECA20, at 18.8 Mb on ECA21 and at 13.2 Mb on ECA27 (Table S1).

For LIMBS we detected two QTL, on ECA5 and 18 (Table 2). The QTL on ECA5 was located at 20.5–23.5 Mb. Only one SNP, at 2.3 Mb, supported the QTL on ECA18. Further putative QTL for LIMBS were located on ECA1, 3, 5, 8, 11, 14, 17, 19, 20, 25, 26 and 31 (Table S2). Peak values were at 123.8 Mb on ECA1, and at 59.2–61.3 Mb and 69.2 Mb on ECA3. On ECA5, -log10 P-values were highest at 87.8–91.1 Mb. On ECA8, -log10 P-values were highest at 9.4 and 24.9 Mb, on ECA11 at 6.4 Mb, and on ECA14 at 65.7–65.9 Mb. Peak values were at 35.3–39.5 Mb on ECA17, at 2.3 Mb on ECA19 and at 17.3 Mb on ECA20. We detected two QTL on ECA25, one at 4.8 Mb and one at 11.5–11.7 Mb. Further QTL were detected at 26.1 Mb on ECA26 and at 20.8 Mb on ECA31.

The additive effect of BIEC2-808466 (ECA3) was highly significant for RHP. BIEC2-325253 (ECA15) had a highly significant additive effect as well as a highly significant dominance effect on the BV for RHP. A highly significant dominance effect was observed for BIEC2-524152 on ECA20. For BIEC2-422566 on ECA19, the additive and dominance effects were not estimable because none of the investigated horses was homozygous for the minor allele (Table 3). For LIMBS, BIEC2-897799 (ECA5) and BIEC2-391005 (ECA18) both had highly significant additive effects (Table 4). The estimated correlation coefficient between the gBV and the BV was 0.40 for RHP and 0.33 for LIMBS. Table 3 shows the distribution of SNP genotypes per proportions of Hanoverian, Thoroughbred, Trakehner and Holsteiner genes.

Within the identified QTL region for RHP on ECA3 we identified the peroxisome proliferator-activated receptor gamma, coactivator 1 alpha gene (PPARGC1A) and the ligand dependent nuclear receptor corepressor-like gene (LCORL) as candidate genes. Within the putative QTL regions, the cytochrome P450, family 27, subfamily B, polypeptide 1 gene (CYP27B1) on ECA6, the myosin VIIB gene (MYO7B) on ECA18, the short stature homeobox 2 gene (SHOX2) on ECA19 and the follistatin gene (FST) on ECA21 were identified as functional candidate genes for RHP. Within the two QTL regions for LIMBS we detected the proteoglycan 4 gene (PRG4) on ECA5 and MYO7B on ECA18 as functional candidate genes. Within the putative QTL regions we revealed SHOX2 on ECA19, the collagen, type XV, alpha 1 gene (COL15A1) and the RAD23 homolog B gene (RAD23B) on ECA25, the ring finger protein 160 gene (RNF160) on ECA26 and the pleiomorphic adenoma gene-like 1 gene (PLAGL1) on ECA31 as candidate genes for BV LIMBS.

7.4 Discussion

The aim of this study was to detect QTL associated with conformation in Hanoverian warmblood horses. The number of QTL found on different chromosomes for RHP and LIMBS suggests that several genes are involved in growth and developmental processes. Potential QTL for RHP and LIMBS were defined as genomic regions with at least one SNP marker estimated as highly significant using Tassel1 or Tassel2, at least significant using both models, and with -log10 (P) >1 using any of the other models employed. Further putative QTL were defined as genomic regions harbouring at least one SNP marker estimated as significant using Tassel1 and Tassel2 and with -log10 (P) >1 using any other model.

According to the Q-Q plots, omitting the effects of kinship and the proportion of genes would result in an overestimation of SNP effects caused by stratification within the investigated population. The differences between the -log10 P-values observed for the same SNP using different association models are due to data inflation. In the Q-Q plot for Tassel2, expected and observed -log10 P-values agree up to -log10 (P) <3. At the same expected -log10 P-value (3), the observed -log10 P-values from any of the other implemented models are larger, which represents data inflation whose extent depends on the model used.

With the exception of the QTL on ECA18 and 19, QTL positions did not coincide between the two traits RHP and LIMBS. We suppose that these overlaps are due to the genetic correlation between RHP and LIMBS (0.74). Both RHP and LIMBS are composite traits, and the individual traits included show variable genetic correlations with each other. The strongest correlations between traits included in RHP and traits included in LIMBS were observed between the conformation of front legs and the frame, and between the correctness of gaits and the development and general impression of the individual horse. It is likely that QTL shared between RHP and LIMBS harbour genes that affect one trait and hence the other, too.

We compared the identified QTL to positively selected regions in Thoroughbreds found by Gu et al. (2009) [4]. The putative QTL for RHP on ECA17 was also found to be subject to positive selection in Thoroughbreds. For LIMBS we found overlap for the QTL on ECA5 and the putative QTL on ECA8, 11 and 17. However, none of the other potential QTL coincided with those regions. These overlaps could be due to the common use of Thoroughbred stallions in Hanoverian breeding to make progeny nobler. However, in the Hanoverian horse population analysed here, possible breed-related marker associations were deliberately accounted for in the models used, in order to reveal within-breed variation for RHP and LIMBS. In contrast, Gu et al. (2009) [4] searched for across-breed variation to reveal genomic regions distinctive primarily for Thoroughbreds.

We detected six functional candidate genes within the defined and putative QTL for RHP and seven functional candidate genes within the QTL and putative QTL for LIMBS. On ECA3 we revealed PPARGC1A and LCORL as promising candidate genes for RHP in proximity to BIEC2-808466, one of seven neighbouring SNPs estimated as highly significant using any of the applied models. PPARGC1A transcriptionally activates a complex pathway of lipid and glucose metabolism and is expressed primarily in tissues of high metabolic activity such as liver, heart and exercising oxidative skeletal muscle fibers. PPARGC1A is a coactivator of the subset of oxidative phosphorylation genes that control glucose and lipid transportation and oxidation, skeletal muscle fiber type formation and mitochondrial biosynthesis [5]. Studies in Brangus steers revealed associations of SNPs in the bovine PPARGC1A with growth and meat quality traits [6]. A SNP marker within the human PPARGC1A shows strong association with endurance capacity. Trained individuals generally show increased PPARGC1A mRNA levels and increased resistance to muscle fatigue [7]. Recently, Eivers et al. (2009) [8] found in Thoroughbreds a significant association between post-exercise PPARGC1A expression in equine skeletal muscle and post-exercise plasma lactate concentration. Thus PPARGC1A is a candidate gene that might influence RHP through an athletic and muscular phenotype.

On the same chromosome we identified LCORL as a functional candidate gene. Polymorphisms in LCORL are significantly associated with measures of skeletal frame size (trunk length) and adult height in humans and mice [9], [10]. In horses, trunk length and height at withers strongly affect the conformation of the saddle position, the frame, and general impression and development, three of the traits included in RHP.

CYP27B1 is a cytochrome P450 enzyme in the proximal tubule of the kidney that catalyzes the hydroxylation of calcidiol to calcitriol, the bioactive form of vitamin D3, which binds to the vitamin D receptor (VDR) and regulates calcium metabolism. Thus, this enzyme regulates the level of biologically active vitamin D and plays an important role in calcium homeostasis. Panda et al. (2003) [11] found that mice deficient for CYP27B1 developed features similar to those of human rickets: hypocalcemia, secondary hyperparathyroidism, retarded growth and skeletal abnormalities. The orthologous equine gene is located on ECA6 next to BIEC2-1187571, which has a significant additive effect (P<0.01) on RHP.


On ECA18 next to BIEC2-391005, and on ECA19 in proximity to BIEC2-422566, we found MYO7B and SHOX2 as functional candidate genes within QTL for RHP as well as for LIMBS. Myosins are the fundamental functional units of striated muscles: they act as molecular motors that, upon interaction with actin filaments, use energy from ATP hydrolysis to generate mechanical force. Dall'Olio et al. (2009) [12] found SNPs in the myosin heavy chain 7 gene that vary among horse breeds differing in performance and morphology traits. This indicates that genes involved in myosin formation play central roles in conformational traits and physical performance. Most conformational traits included in RHP are positively affected by a well defined, muscular outline. Scores for front and hind legs benefit if the legs are lean, well muscled and defined. Hence, we suppose that MYO7B represents a suitable candidate gene for RHP as well as for LIMBS.

Mutations in the human SHOX2 lead to growth retardation associated with Turner, Leri-Weill dyschondrosteosis, and Langer mesomelic dysplasia syndromes, which are marked by shortening of the forearms and lower legs [13]. Similar results were observed in mice with SHOX2 inactivated in the developing limbs [14]. Some traits included in RHP, in particular saddle position and frame, are influenced by the conformation of the front legs. In this context, genes like SHOX2 are suitable candidate genes for RHP as well as for LIMBS.

On ECA21, FST is located next to BIEC2-554900, which was highly associated with RHP using any of the implemented models, and represents another potential candidate gene for RHP. Follistatin is an autocrine glycoprotein that is expressed in nearly all tissues of higher animals. It is being studied for its role in the regulation of muscle growth in mice as an antagonist to myostatin (MSTN), which inhibits excessive muscle growth. Lee and McPherron (2001) [15] demonstrated that inhibition of MSTN, either by genetic elimination (knockout mice) or by increasing the amount of follistatin, resulted in greatly increased muscle mass.

For LIMBS we detected a highly significant intragenic SNP (BIEC2-898157) in PRG4 on ECA5. The protein encoded by PRG4 is a large proteoglycan specifically synthesized by chondrocytes located at the surface of articular cartilage, and also by some synovial lining cells. It functions as a boundary lubricant at the cartilage surface


and contributes to the elastic absorption and energy dissipation of synovial fluid. Mutations in this gene result in camptodactyly-arthropathy-coxa vara-pericarditis syndrome, an arthritis-like autosomal recessive disorder [16]. PRG4 has multiple functions in articulating joints and tendons that include the protection of surfaces and the control of synovial cell growth [17]. We suppose that PRG4 could influence LIMBS through its function as a lubricant in joints and tendons [18]. In addition, the intragenic SNP BIEC2-898157 highlights it as a positional candidate gene.

On ECA25, COL15A1 in proximity to BIEC2-655148 and RAD23B next to BIEC2-659406 are functional candidate genes. Both SNP markers had highly significant (P<0.01) additive effects on LIMBS. Collagen is a protein that strengthens and supports many tissues in the body, including cartilage, bone, tendon, skin and sclera. Type XV collagen has a wide tissue distribution, but its strongest expression is localized to basement membrane zones, so it may function to adhere basement membranes to the underlying connective tissue stroma. Mouse studies have shown that collagen XV deficiency is associated with muscle and microvessel deterioration [19]. In humans, the collagen type I alpha 1 and collagen type V alpha 1 genes are highly associated with tendon injuries, which makes genes coding for collagens plausible candidate genes for LIMBS. The protein encoded by RAD23B is one of two mammalian homologs of Saccharomyces cerevisiae Rad23, a protein involved in nucleotide excision repair. Ng et al. (2002) [20] created a RAD23B knockout mouse model and observed a high rate of intrauterine or neonatal death in RAD23B-deficient animals. Surviving animals displayed a variety of abnormalities, including retarded growth. Further analyses are required to investigate whether RAD23B has an impact on equine limb growth.
RNF160 on ECA26 next to BIEC2-691812, which has highly significant additive (P<0.01) and dominance effects (P<0.01) on RHP, represents another promising candidate gene. Like most RING finger proteins, RNF160 functions as a ubiquitin ligase. Chu et al. (2009) [21] identified a recessive mutation in the mouse RNF160 that manifested as a progressive movement disorder. Affected mice exhibited age-dependent and often asymmetrical progressive weakness of the hind limbs, bradykinesia, and eventually loss of locomotion.

On ECA31 next to a SNP (BIEC2-841254) that has a highly significant (P<0.01) additive effect on LIMBS, we identified PLAGL1 as another functional candidate gene. PLAGL1 encodes a C2H2 zinc finger protein with transactivation and DNA-binding activity and is a member of a network of co-regulated genes comprising other imprinted genes involved in the control of embryonic growth. Varrault et al. (2006) [22] showed that mice with PLAGL1 insufficiency had intrauterine growth restriction, altered bone formation, and neonatal lethality. Whether PLAGL1 influences equine limb growth remains open, and further analyses are required for verification.

In dogs, several QTL harbouring candidate genes involved in the regulation of size have been described [23]. In the stallion population investigated in this study, we could not detect any of the genes significantly associated with size in dog breeds. The aim of the present study was to reveal QTL for within-breed variation in RHP and LIMBS in the Hanoverian population investigated. In contrast, Jones et al. (2008) [23] searched for QTL associated with across-breed variation in conformation. Accordingly, the genes involved in the major conformational variation observed across breeds are probably not the same as those responsible for the minor variation observed within a breed.

We conclude that, above all, genes coding for muscular processes, growth, limb development and embryonic development might be important for both RHP and LIMBS. Hence, our findings are in line with most previous population genetic analyses, which found positive genetic correlations between conformation and performance traits. Our findings indicate that genes that influence conformation might also have an impact on physical performance.

Further analyses including larger populations and denser SNP marker sets are required to verify the potential QTL for RHP and LIMBS. Our approach appeared useful as a starting point to identify QTL for RHP and LIMBS within a breed.


7.5 Materials and Methods

7.5.1 Animals and phenotypic data

Blood samples were collected from 115 Hanoverian warmblood stallions of the National State Stud of Lower Saxony. These stallions were born between 1972 and 2000 and represent a random sample of all Hanoverian stallions born in the last 20-30 years. Pedigree data were made available by the Hanoverian studbook society (HSS) through the national unified animal ownership database (Vereinigte Informationssysteme Tierhaltung w.V., VIT). Pedigree records allowed us to assign the 115 stallions to 16 families, which included a total of 798 stallions (Table S3).

We employed the latest BVs for RHP and LIMBS (May 2009) provided by the HSS. BVs for RHP were estimated based on results of studbook inspections (SBI) since 1979 including 85,598 animals. RHP is a composite trait resulting from scores for the conformation of the head, the neck, the saddle position, the type, the frame, and the general impression and development. The BV for LIMBS is also a composite trait, resulting from scores for front legs, hind legs and the correctness of gaits. All mares intended to be used for breeding under the HSS must be registered in the Hanoverian studbook. At SBI a judging commission scores the mare presented for correctness at walk in hand, for the conformation of the head, the neck, the saddle position, the front legs and the hind legs, and for type, frame, and general impression and development. Scores on a scale from 0 (not shown) to 10 (excellently shown) are assigned for each of these traits. BVs are estimated yearly by the VIT for the traits included in RHP and LIMBS by employing a multivariate BLUP (best linear unbiased prediction) animal model [24]:

$$y_{ijk} = \mu + TEST_i + a_j + e_{ijk}$$

with $y_{ijk}$ = score for head, neck, saddle position, frame, type, general impression and development, front legs, hind legs and correctness of gaits, $\mu$ = model constant, $TEST_i$ = fixed effect of the individual test (interaction between the place and year of the performance evaluation), $a_j$ = random additive genetic effect of the individual horse, and $e_{ijk}$ = random residual.


Breeding values are standardized to a mean of 100 points and a standard deviation of 20 points. The basis of comparison is formed by the horses of the birth cohorts 1999 and 2000, which means that the average breeding value of the horses of these two cohorts equals 100. Every year this basis of comparison moves forward by one year. For the investigated stallions the mean BV for RHP was 105 ± 25 (range 24-164) and the mean BV for LIMBS was 106 ± 21 (range 45-160). Reliabilities ranged between 0.28 and 0.99 (mean 0.88 ± 0.11). The distribution of the BVs for RHP and LIMBS in the available stallion population was analyzed using the procedure UNIVARIATE of SAS software (Statistical Analysis System, version 9.2, SAS Institute, Cary, NC, USA, 2010) (Table S4).

For each stallion the proportions of genes of Hanoverian (HAN), Thoroughbred (TB), Trakehner (TRAK) and Holsteiner (HOL) horses were calculated using all available pedigree information. Details are described elsewhere [25]. Mean (median) proportions of genes in the stallions were 0.54 (0.63) for HAN, 0.28 (0.19) for TB, 0.05 (0.03) for TRAK, and 0.06 (0) for HOL. Given the uneven representation of gene proportions, classes were defined for each of the four breeds as follows: HAN: ≤0.34, >0.34 and <0.78, ≥0.78; TB: ≤0.13, >0.13 and <0.30, ≥0.30; TRAK: ≤0.20, >0.20 and <0.80, ≥0.80; HOL: 0.00, >0.00 and <0.30, ≥0.30.
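The standardization and the class definitions described above can be illustrated with a minimal Python sketch. It assumes that the mean and the standard deviation are both taken from the reference cohorts, which the text does not state explicitly for the standard deviation; the function names `standardize_bv` and `gene_proportion_class` are hypothetical and do not belong to the VIT evaluation software.

```python
from statistics import mean, pstdev


def standardize_bv(raw_bvs, reference_bvs, target_mean=100.0, target_sd=20.0):
    """Rescale raw BVs so that the reference group has the target mean and SD.

    Assumption: the SD is taken from the reference group as well.
    """
    ref_mean = mean(reference_bvs)
    ref_sd = pstdev(reference_bvs)
    if ref_sd == 0:
        raise ValueError("reference group has zero variance")
    return [target_mean + target_sd * (x - ref_mean) / ref_sd for x in raw_bvs]


def gene_proportion_class(proportion, lower, upper):
    """Assign class 1, 2 or 3 from the cut-offs given in the text.

    Example cut-offs: HAN (0.34, 0.78), TB (0.13, 0.30),
    TRAK (0.20, 0.80), HOL (0.00, 0.30).
    """
    if proportion <= lower:
        return 1
    if proportion < upper:
        return 2
    return 3


# Hypothetical raw values for a reference group, for illustration only.
reference = [1.0, 2.0, 3.0]
scaled = standardize_bv(reference, reference)
han_class = gene_proportion_class(0.63, 0.34, 0.78)  # median HAN proportion -> class 2
```

With these cut-offs, a stallion with the median HAN proportion of 0.63 falls into the middle HAN class, and a stallion without Holsteiner genes falls into the first HOL class.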

7.5.2 Genotyping SNPs

Genomic DNA was extracted from EDTA blood samples of the 115 Hanoverian warmblood stallions through a standard ethanol fractionation with concentrated sodium chloride (6 M NaCl) and sodium dodecyl sulphate (10% SDS). The concentration of the extracted DNA was determined using the Nanodrop ND 1000 (Peqlab Biotechnology, Erlangen, Germany). The DNA concentration of the samples was adjusted to between 30 and 80 ng/μl. Genotyping was performed with the Illumina Equine SNP50 BeadChip containing 54,602 SNP markers using standard procedures as recommended by the manufacturer. Raw data were analysed using the genotyping module version 3.2.32 of the BeadStudio program (Illumina). To assign the genotypes, we generated a cluster file with the BeadStudio software and the same genotyping module.

7.5.3 Data analysis

For genome-wide mapping we performed association analyses for all SNPs with a minor allele frequency (MAF) >0.05 and a call rate >0.90. No SNP was excluded on the basis of the missingness test. There were 7875 SNPs that did not reach a sufficient MAF and 3951 SNPs that had a call rate <0.90, so 43,441 SNPs were left for the association analyses. To control for spurious associations, we tested possible effects of stratification on the outcome of the GWA and employed empirical genome-wide error probabilities obtained through adaptive permutations. The models employed were parameterized using PLINK, version 1.07 (http://pngu.mgh.harvard.edu/~purcell/plink/ [26]) and TASSEL, version 2.1 (http://www.maizegenetics.net/tassel [27]).

First, genome-wide associations were determined without any parameters to explain potential data stratification (Plink1). Adaptive permutations for the correction for multiple testing were performed with a maximum of 1,000,000 permutations using the "--assoc" and "--aperm" options of PLINK. An extended model (Plink2) included the gene proportions of the important founder breeds HAN, TB, TRAK and HOL to improve the results of the GWA analyses. The covariates were considered as class effects, each with four levels. The adaptive permutations were done applying a linear regression model using the "--linear" and "--covar" options of PLINK. In a third PLINK model (Plink3), we tested the effect of family structure on the GWA analysis. We performed Cochran-Mantel-Haenszel (CMH) tests within the 16 families and simultaneously considered the gene proportions of HAN, TB, TRAK and HOL as covariates. Here, we used the "--mh", "--within" and "--aperm" options of PLINK.

A mixed linear animal model (MLM) was employed to control for marker-based population structure (Q-matrix) and marker identity-by-state (IBS) based kinship among all individuals (K-matrix) using TASSEL (http://www.maizegenetics.net/tassel [27]) (Tassel1).
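The two marker filters described above (MAF >0.05 and call rate >0.90) can be sketched in Python as follows. This is only an illustration of the filtering logic, not the PLINK implementation; the genotype coding (0, 1 or 2 copies of one allele, `None` for a missing call) and the function names are assumptions made for the example.

```python
def minor_allele_frequency(genotypes):
    """MAF from genotypes coded as 0/1/2 copies of one allele (None = no call)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def call_rate(genotypes):
    """Proportion of animals with a non-missing genotype call."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def passes_qc(genotypes, min_maf=0.05, min_call_rate=0.90):
    """Keep a SNP only if it exceeds both thresholds used in the text."""
    return (minor_allele_frequency(genotypes) > min_maf
            and call_rate(genotypes) > min_call_rate)


# Hypothetical genotype vectors for three SNPs in ten animals.
polymorphic = [0, 1, 2, 1, 0, 1, 1, 2, 0, 1]
monomorphic = [0] * 10
poorly_called = [0, 1, None, None, 2, 1, 0, 1, 2, 1]
kept = [passes_qc(s) for s in (polymorphic, monomorphic, poorly_called)]
```

Note that 54,602 - 43,441 = 11,161 SNPs were removed in total, whereas 7875 + 3951 = 11,826. The counts would agree if 665 SNPs failed both criteria; the text does not state this explicitly, so it remains an assumption.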
The Q-matrix and the K-matrix were built from 7375 genome-wide and equidistantly distributed SNPs with a pairwise linkage disequilibrium <0.2 [28]. This subset was generated as a pruned set of SNPs that are in approximate linkage equilibrium with each other, using the "--indep-pairwise 2000 500 0.2" option of PLINK (sliding window size of 2000 SNPs, window shift steps of 500 SNPs and an r2 threshold of 0.2). The Q-matrix contained three covariates for the cryptic structure of the stallions as determined by STRUCTURE, version 2.3.3 [29], via optimization of the likelihood of the data. The K-matrix was created with the KIN option of TASSEL by calculating the marker IBS coefficients.

We implemented two MLM models. The first MLM model (Tassel1) accounted for the effects of the cryptic structure as determined via STRUCTURE and for the IBS kinship matrix. The second MLM model (Tassel2) took into account the proportions of genes of HAN, TB, TRAK and HOL as covariates and the IBS kinship matrix. The MLM [30] was implemented in TASSEL as described in Henderson's notation [27]:

$$y = Q\beta + Zu + Ga + e$$

where y is the BV for RHP/LIMBS; β is an unknown vector containing the fixed effects of population structure (Q-matrix) or of the proportions of genes of HAN, TB, TRAK and HOL; u is an unknown vector of random additive genetic effects from multiple background QTL for individuals (K-matrix); a is the vector containing the genotype effects of the SNPs in the GWA; Q, Z, and G are the known design matrices; and e is the unobserved vector of random residuals.

Subsequently, we calculated the observed polymorphism information content (PIC) using the ALLELE procedure of SAS/Genetics (Statistical Analysis System, version 9.2, SAS Institute, Cary, NC, USA, 2010).

We built quantile-quantile (Q-Q) plots to visualise the observed versus the expected P-value distribution for each of the models employed. The observed -log10 P-values were plotted against the -log10 P-values expected under the null hypothesis of independence (Fig. 1-5). The divergence of the observed -log10 P-values from the line of expected values represents the inflation of P-values, which is mainly caused by data stratification. According to the Q-Q plots, the smallest inflation of -log10 P-values was observed using the MLM with the gene proportions of HAN, TB, TRAK and HOL as covariates and the K-matrix for random additive genetic effects due to the IBS relationship among all animals (Tassel2).
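The coordinates of such a Q-Q plot can be sketched in a few lines of Python. The sketch assumes that the expected P-values under the null hypothesis follow a uniform distribution and uses the plotting position (rank - 0.5)/n, which is one common convention and not necessarily the one used for Fig. 1-5; the function name `qq_points` is hypothetical.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10 P pairs, most significant first.

    Assumption: expected P-values are uniform under the null hypothesis,
    with plotting position (rank - 0.5) / n.
    """
    n = len(p_values)
    observed = sorted((-math.log10(p) for p in p_values), reverse=True)
    expected = [-math.log10((rank - 0.5) / n) for rank in range(1, n + 1)]
    return list(zip(expected, observed))


# Hypothetical P-values, for illustration only.
points = qq_points([0.5, 0.01, 0.2, 0.9])
```

Points lying above the diagonal (observed larger than expected) across the whole range indicate inflation of the test statistics, which is how the models were compared in the text.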


Based on the Q-Q plots, we defined a SNP as significant with -log10(P) >3 and as highly significant with -log10(P) >5 using Tassel1 or Tassel2. We found 489 SNPs for RHP and 549 SNPs for LIMBS to be highly significant using Plink1, 520 SNPs for RHP and 444 SNPs for LIMBS using Plink2, and 93 SNPs for RHP and 101 SNPs for LIMBS using Plink3. Using the alternative MLMs, 313 SNPs for RHP and 232 SNPs for LIMBS were associated employing Tassel1, and 51 SNPs for RHP and 72 SNPs for LIMBS employing Tassel2. Subsequently, potential QTL were defined as genomic regions with at least one SNP marker that was highly significant using Tassel1 or Tassel2, at least significant using both of these models, and had -log10(P) >1 using any other model (Tables 1 and 2). Further putative QTL were defined as genomic regions harbouring at least one SNP marker that was significant using both Tassel1 and Tassel2 and had -log10(P) >1 using any other model (Tables S1 and S2). The additive and dominance effects of the most significant SNP within each potential QTL were estimated using Best Linear Unbiased Prediction (BLUP) with the software PEST [31] and the following model:

$$y_{ijklmno} = \mu + GT_i + HAN_j + TB_k + TRAK_l + HOL_m + a_n + e_{ijklmno}$$

with $y_{ijklmno}$ = BV for RHP/LIMBS, $\mu$ = model constant, $GT_i$ = genotype of the most significant SNP within each QTL, $HAN_j$ = proportion of Hanoverian genes, $TB_k$ = proportion of Thoroughbred genes, $TRAK_l$ = proportion of Trakehner genes, $HOL_m$ = proportion of Holsteiner genes, $a_n$ = random additive genetic effect of the individual horse (n = 1-3665) and $e_{ijklmno}$ = residual.

The additive genetic effect of a SNP was estimated as half of the difference between the least square means of the two homozygous genotypes. The dominance effect was calculated as the deviation of the least square mean of the heterozygotes from the average of the two homozygous genotypes. If none of the investigated stallions was homozygous for the minor allele, a genotype effect was calculated instead of the additive and dominance effects, as the deviation of the least square mean of the heterozygotes from the least square mean of the homozygous genotype. Significance was tested using F-tests (Tables 3 and 4). The genotype-based BVs (gBV) for RHP and LIMBS were calculated for each stallion based on the observed additive and dominance effects, or genotype effects, of these SNPs.

$$gBV_j = \sum_{i=1}^{n} \left(a_i + d_i \ \text{or}\ b_i\right), \qquad j = 1, \ldots, 115$$

with gBV = genotype-based BV for RHP or LIMBS, n = number of QTL SNPs (4 for RHP, 2 for LIMBS), a = additive effect of each SNP, d = dominance effect of each SNP, and b = genotype effect of the SNP. For better comparison, we standardized the gBVs to a mean of 100 points and a standard deviation of 20 points. Subsequently, correlations between BV RHP and gBV RHP, and between BV LIMBS and gBV LIMBS, were calculated using the CORR procedure of SAS/Genetics. We performed analyses of variance (ANOVA) to test the influence of gBV RHP on the distribution of BV RHP, and of gBV LIMBS on the distribution of BV LIMBS (r2), using the GLM procedure of SAS/Genetics.
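The definitions of the additive, dominance and genotype effects given in this section reduce to simple contrasts of the least square means of the three genotypes. The following Python sketch restates them; the function names and the numerical least square means are hypothetical and serve only as an illustration, they are not results of this study.

```python
def additive_effect(m11, m22):
    """Half of the difference between the two homozygous least square means."""
    return (m22 - m11) / 2.0


def dominance_effect(m11, m12, m22):
    """Deviation of the heterozygote mean from the average of the homozygotes."""
    return m12 - (m11 + m22) / 2.0


def genotype_effect(m12, m_homozygous):
    """Used when only one homozygous genotype was observed among the stallions."""
    return m12 - m_homozygous


# Hypothetical least square means for genotypes 11, 12 and 22.
a = additive_effect(90.0, 110.0)          # 10.0
d = dominance_effect(90.0, 104.0, 110.0)  # 4.0
b = genotype_effect(104.0, 90.0)          # 14.0
```

A positive dominance effect, as in this example, means that the heterozygotes lie closer to the genotype with the higher mean than the midpoint of the two homozygotes would suggest.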

7.6 References

1. Koenen EPC, van Veldhuizen AE, Brascamp EW (1995) Genetic parameters of linear scored conformation traits and their relation to dressage and show-jumping performance in the Dutch Warmblood Riding Horse population. Livest Prod Sci 43: 85-94.
2. Holmström M, Philipsson J (1993) Relationships between conformation, performance and health in 4-year-old Swedish warmblood riding horse. Livest Prod Sci 33: 293-312.
3. Stock KF, Distl O (2006) Genetic correlations between conformation traits and radiographic findings in the limbs of German Warmblood riding horses. Genet Sel Evol 38: 657-671.
4. Gu J, Orr N, Park SD, Katz LM, Sulimova G, et al. (2009) A Genome Scan for Positive Selection in Thoroughbred Horses. PLoS ONE 4: e5767.
5. Kelly DP, Scarpulla RC (2004) Transcriptional regulatory circuits controlling mitochondrial biogenesis and function. Genes Dev 18: 357-368.
6. Lucia A, Gómez-Gallego F, Barroso I, Rabadán M, Bandrés F, et al. (2005) PPARGC1A genotype (Gly482Ser) predicts exceptional endurance capacity in European men. J Appl Physiol 99: 344-348.


7. Soria LA, Corva PM, Branda Sica A, Villarreal EL, Melucci LM, et al. (2009) Association of a novel polymorphism in the bovine PPARGC1A gene with growth, slaughter and meat quality traits in Brangus steers. Mol Cell Probes 23: 304-308.
8. Eivers SS, McGivney BA, Fonseca RG, Machugh DE, Menson K, et al. (2010) Alterations in oxidative gene expression in equine skeletal muscle following exercise and training. Physiol Genomics 8: 83-93.
9. Soranzo N, Rivadeneira F, Chinappen-Horsley U, Malkina I, Richards JB, et al. (2009) Meta-analysis of genome-wide scans for human adult stature identifies novel Loci and associations with measures of skeletal frame size. PLoS Genetics 5: e1000445.
10. Sovio U, Bennett AJ, Millwood IY, Molitor J, O'Reilly PF, et al. (2009) Genetic determinants of height growth assessed longitudinally from infancy to adulthood in the northern Finland birth cohort 1966. PLoS Genetics 5: e1000409.
11. Panda DK, Miao D, Tremblay ML, Sirois J, Farookhi R, et al. (2001) Targeted ablation of the 25-hydroxyvitamin D 1alpha-hydroxylase enzyme: evidence for skeletal, reproductive, and immune dysfunction. P Natl Acad Sci USA 98: 7498-7503.
12. Dall'Olio S, Davoli R, Scotti E, Fontanesi L, Russo V (2009) SNPs within the beta myosin heavy chain (MYH7) and the pyruvate kinase muscle (PKM2) genes in horse. Italian J Anim Sci 6: 421-428.
13. Clement-Jones M, Schiller S, Rao E, Blaschke RJ, Zuniga A (2000) The short stature homeobox gene SHOX is involved in skeletal abnormalities in Turner syndrome. Hum Molec Genet 9: 695-702.
14. Cobb J, Dierich A, Huss-Garcia Y, Duboule D (2006) A mouse model for human short-stature syndromes identifies Shox2 as an upstream regulator of Runx2 during long-bone development. P Natl Acad Sci USA 103: 4511-4515.
15. Lee S, McPherron AC (2001) Regulation of myostatin activity and muscle growth. P Natl Acad Sci USA 98: 9306-9311.
16. Eklund L, Piuhola J, Komulainen J, Sormunen R, Ongvarrasopone C, et al. (2001) Lack of type XV collagen causes a skeletal myopathy and cardiovascular defects in mice. P Natl Acad Sci USA 98: 1194-1199.


17. Marcelino J, Carpten JD, Suwairi WM, Gutierrez OM, Schwartz S, et al. (1999) CACP, encoding a secreted proteoglycan, is mutated in camptodactyly-arthropathy-coxa vara-pericarditis syndrome. Nature Genet 23: 319-322.
18. Rhee DK, Marcelino J, Baker M, Gong Y, Smits P, et al. (2005) The secreted glycoprotein lubricin protects cartilage surfaces and inhibits synovial cell overgrowth. J Clin Invest 115: 622-631.
19. Eklund L, Piuhola J, Komulainen J, Sormunen R, Ongvarrasopone C, et al. (2001) Lack of type XV collagen causes a skeletal myopathy and cardiovascular defects in mice. P Natl Acad Sci USA 98: 1194-1199.
20. Ng JMY, Vrieling H, Sugasawa K, Ooms MP, Grootegoed JA, et al. (2002) Developmental defects and male sterility in mice lacking the ubiquitin-like DNA repair gene mHR23B. Mol Cell Biol 22: 1233-1245.
21. Chu J, Hong NA, Masuda CA, Jenkins BV, Nelms KA, et al. (2009) A mouse forward genetics screen identifies LISTERIN as an E3 ubiquitin ligase involved in neurodegeneration. P Natl Acad Sci USA 106: 2097-2103.
22. Varrault A, Gueydan C, Delalbre A, Bellmann A, Houssami S, et al. (2006) Zac1 regulates an imprinted gene network critically involved in the control of embryonic growth. Dev Cell 11: 711-722.
23. Jones P, Chase K, Martin A, Davern P, Ostrander EA, et al. (2008) Single-nucleotide-polymorphism-based association mapping of dog stereotypes. Genetics 179: 1033-1044.
24. Christmann L (1996) Zuchtwertschätzung für Merkmale der Stutbuchaufnahme und der Stutenleistungsprüfung im Zuchtgebiet Hannover. Georg-August Universität Göttingen (Dissertation).
25. Hamann H, Distl O (2008) Genetic variability in Hanoverian warmblood horses using pedigree analysis. J Anim Sci 86: 1503-1513.
26. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira M, et al. (2007) PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. Am J Hum Genet 81: 559-575.
27. Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, et al. (2007) TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23: 2633-2635.
28. Ritland K (1996) Estimators for Pairwise Relatedness and Individual Inbreeding Coefficients. Genet Res 67: 175-185.
29. Pritchard JK, Stephens M, Donnelly P (2000) Inference of Population Structure Using Multilocus Genotype Data. Genetics 155: 945-959.
30. Yu J, Pressoir G, Briggs WH, Vroh Bi I, Yamasaki M, et al. (2006) A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat Genet 38: 203-208.
31. Groeneveld E, Kovac M, Wang T (1990) World Congress on Genetics Applied to Livestock Production, Edinburgh, UK. Garsi, Madrid 1: 488-491.


Table 1 Quantitative trait loci for riding horse points, their location on Equus caballus chromosome (ECA) in bp, minor allele frequency (MAF), polymorphism information content (PIC), SNP motif, and -log10 P-values using different analysis models.

ECA  SNP ID         MAF   PIC   Alleles  Position (bp)  Plink1  Plink2  Plink3  Tassel1  Tassel2
3    BIEC2-807931   0.43  0.37  C/T      102839312      8.38    7.17    6.00    7.90     5.82
3    BIEC2-808466   0.49  0.37  A/G      105163077      8.21    6.14    6.00    7.08     6.15
3    BIEC2-808500   0.38  0.36  G/T      105363241      8.71    7.65    6.00    6.67     5.34
3    BIEC2-808543   0.48  0.37  C/T      105547002      7.59    6.93    6.00    6.26     5.34
3    BIEC2-808608   0.41  0.37  A/G      105875809      5.64    5.96    3.72    5.59     3.86
3    BIEC2-808617   0.49  0.37  A/G      105876397      5.79    5.77    4.60    5.85     4.21
3    BIEC2-809050   0.25  0.30  A/G      107542694      4.13    4.39    3.64    5.49     4.72
15   BIEC2-325120   0.39  0.36  C/T      85962396       4.39    4.45    4.56    3.66     3.13
15   BIEC2-325188   0.49  0.37  C/T      86171192       4.75    4.40    4.52    3.51     3.17
15   BIEC2-325253   0.31  0.34  A/G      86428358       5.19    5.68    3.87    5.10     3.73
15   BIEC2-325256   0.31  0.34  C/T      86428577       5.19    5.68    3.87    5.10     3.73
19   BIEC2-422566   0.07  0.11  A/G      2251539        3.89    4.22    4.16    4.17     5.03
20   BIEC2-524152   0.25  0.31  A/G      20811935       6.63    6.81    2.60    7.08     4.85
20   BIEC2-524167   0.30  0.33  A/G      20832718       6.55    6.06    3.52    6.05     4.08
20   BIEC2-524168   0.30  0.33  A/G      20835801       6.55    6.06    3.52    6.05     4.08
20   BIEC2-524686   0.47  0.37  A/G      22665104       3.98    4.63    2.73    5.06     3.12

Plink1 = adaptive permutation testing.
Plink2 = adaptive permutation testing using the proportions of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates.
Plink3 = adaptive permutation testing using the MH-test within 16 core families and the proportions of Hanoverian, Thoroughbred, Trakehner and Holsteiner genes as covariates.
Tassel1 = mixed linear model (MLM) accounting for population structure estimated using STRUCTURE (Pritchard et al. 2000) and for the identity-by-state relationship matrix among all individuals.
Tassel2 = mixed linear model (MLM) accounting for the proportions of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates and for marker identity-by-state based kinship among all individuals.
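The PIC values reported alongside the MAF in the QTL tables can be approximately reproduced from the MAF alone. The sketch below assumes the standard formula for a biallelic marker, PIC = 1 - (p² + q²) - 2p²q²; the thesis obtained PIC from the ALLELE procedure of SAS/Genetics, so small differences caused by rounding of the MAF are to be expected. The function name `pic_biallelic` is hypothetical.

```python
def pic_biallelic(maf):
    """Polymorphism information content of a biallelic SNP from its MAF.

    Assumption: standard formula PIC = 1 - (p^2 + q^2) - 2 * p^2 * q^2.
    """
    p = maf
    q = 1.0 - maf
    return 1.0 - (p * p + q * q) - 2.0 * p * p * q * q


# MAF values taken from Table 1 (BIEC2-807931 and BIEC2-809050).
pic_a = round(pic_biallelic(0.43), 2)  # 0.37
pic_b = round(pic_biallelic(0.25), 2)  # 0.3
```

For a biallelic marker the PIC reaches its maximum of 0.375 at a MAF of 0.5, which is why none of the tabulated PIC values exceeds 0.37.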


Table 2 Quantitative trait loci for limbs, Equus caballus chromosome (ECA), minor allele frequency (MAF), polymorphism information content (PIC), observed alleles (Alleles), position in bp, and -log10 P-values estimated using Plink1, Plink2, Plink3, Tassel1 and Tassel2.

ECA  SNP ID         MAF   PIC   Alleles  Position (bp)  Plink1  Plink2  Plink3  Tassel1  Tassel2
5    BIEC2-897727   0.23  0.29  A/G      21052939       5.81    4.51    2.76    4.23     3.22
5    BIEC2-897799   0.27  0.32  A/G      21255011       6.89    5.65    3.92    4.83     4.05
5    BIEC2-897881   0.27  0.32  G/T      21634896       6.97    5.92    3.21    4.64     3.79
5    BIEC2-897883   0.27  0.32  C/T      21638160       6.97    5.92    3.21    4.64     3.79
5    BIEC2-898062   0.32  0.34  C/T      22105736       6.99    6.01    3.79    4.41     3.59
5    BIEC2-898069   0.25  0.30  A/G      22168823       6.86    5.13    2.59    4.82     3.46
5    BIEC2-898157   0.13  0.21  A/G      22947033       5.24    4.86    2.96    5.33     3.36
5    BIEC2-898176   0.13  0.21  C/T      23121913       5.24    4.86    2.96    5.33     3.36
18   BIEC2-391005   0.25  0.31  C/T      2265150        5.39    4.97    3.36    5.29     3.44

Plink1 = adaptive permutation testing.
Plink2 = adaptive permutation testing using the proportions of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates.
Plink3 = adaptive permutation testing using the MH-test within 16 core families and the proportions of Hanoverian, Thoroughbred, Trakehner and Holsteiner genes as covariates.
Tassel1 = mixed linear model (MLM) accounting for population structure estimated using STRUCTURE (Pritchard et al. 2000) and for the identity-by-state relationship matrix among all individuals.
Tassel2 = mixed linear model (MLM) accounting for the proportions of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates and for marker identity-by-state based kinship among all individuals.


Table 3 Estimates of the additive (a) and dominance (d) effects with their standard errors (SE) and error probabilities (P) for the most significant SNP within each QTL using an animal model for the trait RHP, and the distribution of genotype frequencies for the respective SNP.

QTL  SNP-ID        Allele 1  Allele 2  a¹ ± SE        P      d² or b³ ± SE     P      Freq. 11  Freq. 12  Freq. 22
1    BIEC2-808466  A         G         16.75 ± 2.98   <0.01  2.43 ± 3.69²      0.51   0.25      0.52      0.23
2    BIEC2-325253  A         G         8.24 ± 3.24    <0.01  11.38 ± 4.53²     <0.01  0.50      0.39      0.11
3    BIEC2-422566  A         G         -              -      -23.09 ± 5.77³    <0.01  0.87      0.13      -
4    BIEC2-524152  A         G         5.63 ± 5.18    0.28   16.79 ± 5.79²     <0.01  0.53      0.43      0.03

¹ a = (m22 - m11) / 2; ² d = m12 - ((m11 + m22) / 2); ³ b = m12 - m11/22


Table 4 Estimates of the additive (a) and dominance (d) effects with their standard errors (SE) and error probabilities (P) for the most significant SNP within each QTL using an animal model for the trait limbs, and the distribution of genotype frequencies for the respective SNP.

QTL  SNP-ID        Allele 1  Allele 2  a¹ ± SE        P      d² ± SE        P     Freq. 11  Freq. 12  Freq. 22
1    BIEC2-897799  A         G         14.15 ± 3.36   <0.01  0.73 ± 3.97    0.85  0.51      0.43      0.06
2    BIEC2-391005  T         C         8.67 ± 3.03    <0.01  6.44 ± 3.97    0.11  0.57      0.35      0.08

¹ a = (m22 - m11) / 2; ² d = m12 - ((m11 + m22) / 2)


Table 3 Distribution of genotypes of the most significant SNP of each QTL for RHP and limbs per class of the proportions of Hanoverian (HAN), Thoroughbred (TB), Trakehner (TRAK) and Holsteiner (HOL) genes for the genotyped stallions. Values are genotype frequencies (%).

        RHP                                                       limbs
        BIEC2-243    BIEC2-18316  BIEC2-1036317  BIEC2-1094761    BIEC2-683832  BIEC2-689886
Class   TT  TC  CC   AA  AG  GG   AA  AG  GG     AA  AG  GG       AA  AC  CC    TT  TG  GG
HAN 1   50  43   7   57  36   7   82  18   0     57  39   4       54  36  11    50  39  11
HAN 2   21  57  21   46  36  18   89  11   0     45  52   4       52  45   4    57  36   7
HAN 3   10  27  46   48  48   3   87  13   0     65  32   3       48  45   6    65  29   6
TB 1    17  50  33   47  50   3   87  13   0     53  40   7       43  47  10    73  20   7
TB 2    19  56  25   44  44  12   90  10   0     58  42   0       63  35   2    52  38  10
TB 3    42  48   9   61  21  18   82  18   0     45  48   6       39  52   9    52  42   6
TRAK 1  31  44  25   58  31  10   85  15   0     60  35   4       50  42   8    58  33   8
TRAK 2  21  50  29   43  54   4   86  14   0     43  54   4       43  54   4    57  36   7
TRAK 3  21  64  15   44  38  18   90  10   0     51  46   3       59  36   5    56  36   8
HOL 1   24  51  26   51  38  11   87  13   0     56  41   3       49  44   6    60  33   7
HOL 2   33  61   6   44  44  11   89  11   0     39  56   6       61  33   6    44  44  11

HAN 1: ≤0.34, 2: >0.34 and <0.78, 3: ≥0.78; TB 1: ≤0.13, 2: >0.13 and <0.30, 3: ≥0.30; TRAK 1: ≤0.20, 2: >0.20 and <0.80, 3: ≥0.80; HOL 1: 0.00, 2: >0.00 and <0.30, 3: ≥0.30.


Table S1 Quantitative trait loci (QTL) and putative QTL for RHP, their location on Equus caballus chromosome (ECA) in Mb, minor allele frequency (MAF), polymorphism information content (PIC), SNP motif and -log10 P-values using different analysis models.

ECA  SNP ID         MAF   PIC   Alleles  Position in Mb  Plink1  Plink2  Plink3  Tassel1  Tassel2
3    BIEC2-797083   0.40  0.37  A/G      82925586        6.34    6.25    3.02    4.48     3.09
3    BIEC2-797488   0.37  0.36  A/G      83653040        5.95    6.12    3.31    4.11     3.00
3    BIEC2-797548   0.08  0.14  A/C      83763295        5.92    5.17    2.67    4.53     3.61
3    BIEC2-797709   0.49  0.37  A/G      84116142        6.08    5.08    3.36    3.80     3.40
3    BIEC2-807931   0.43  0.37  C/T      102839312       8.38    7.17    6.00    7.90     5.82
3    BIEC2-808466   0.49  0.37  A/G      105163077       8.21    6.14    6.00    7.08     6.15
3    BIEC2-808500   0.38  0.36  G/T      105363241       8.71    7.65    6.00    6.67     5.34
3    BIEC2-808543   0.48  0.37  C/T      105547002       7.59    6.93    6.00    6.26     5.34
3    BIEC2-808608   0.41  0.37  A/G      105875809       5.64    5.96    3.72    5.59     3.86
3    BIEC2-808617   0.49  0.37  A/G      105876397       5.79    5.77    4.60    5.85     4.21
3    BIEC2-809050   0.25  0.30  A/G      107542694       4.13    4.39    3.64    5.49     4.72
6    BIEC2-1187377  0.27  0.32  C/T      77015004        4.63    4.03    3.93    3.14     3.00
6    BIEC2-1187571  0.26  0.31  C/T      77259757        4.09    3.80    3.81    3.12     3.17
15   BIEC2-325120   0.39  0.36  C/T      85962396        4.39    4.45    4.56    3.66     3.13
15   BIEC2-325188   0.49  0.37  C/T      86171192        4.75    4.40    4.52    3.51     3.17
15   BIEC2-325253   0.31  0.34  A/G      86428358        5.19    5.68    3.87    5.10     3.73
15   BIEC2-325256   0.31  0.34  C/T      86428577        5.19    5.68    3.87    5.10     3.73
17   BIEC2-368487   0.39  0.36  C/T      10216714        4.38    7.17    2.15    4.52     3.25
18   BIEC2-391005   0.25  0.31  C/T      2265150         5.59    4.95    2.85    4.79     3.11
19   BIEC2-422566   0.07  0.11  A/G      2251539         3.89    4.22    4.16    4.17     5.03
20   BIEC2-521412   0.17  0.25  A/C      15421443        2.85    2.08    1.18    4.21     3.00
20   BIEC2-524152   0.25  0.31  A/G      20811935        6.63    6.81    2.60    7.08     4.85
20   BIEC2-524167   0.30  0.33  A/G      20832718        6.55    6.06    3.52    6.05     4.08
20   BIEC2-524168   0.30  0.33  A/G      20835801        6.55    6.06    3.52    6.05     4.08
20   BIEC2-524686   0.47  0.37  A/G      22665104        3.98    4.63    2.73    5.06     3.12
21   BIEC2-554900   0.12  0.19  C/T      18811655        3.55    4.31    3.10    4.60     3.31
27   BIEC2-705454   0.11  0.18  C/T      13198799        4.42    2.80    3.03    3.23     3.12

Plink1 = significance ascertained through adaptive permutation testing (maximum of 1,000,000 permutations).
Plink2 = significance ascertained considering the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner.
Plink3 = significance ascertained through adaptive permutation testing within 16 family clusters under consideration of the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner warmblood, using a maximum of 1,000,000 permutations.
Tassel1 = significance ascertained integrating population structure (estimated using STRUCTURE) and marker-based identity-by-state kinship in a mixed linear model (MLM) using TASSEL.
Tassel2 = significance ascertained integrating the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates and marker-based identity-by-state kinship in a mixed linear model (MLM) using TASSEL.
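The PIC values in Table S1 can be related to the MAF column with a short calculation. The sketch below assumes the standard PIC formula for a biallelic marker, PIC = 1 - (p^2 + q^2) - 2 p^2 q^2 with q = 1 - p; the formula is not restated in this supplement, so treating it as the one used here is an assumption, and the function name is hypothetical. Because the tabulated MAF is rounded to two decimals, recomputed values can differ from the table in the last digit.

```python
def pic_biallelic(maf: float) -> float:
    """PIC of a biallelic marker with minor allele frequency `maf`.

    Assumes the standard formula 1 - (p^2 + q^2) - 2 * p^2 * q^2.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    p, q = maf, 1.0 - maf
    homozygosity = p * p + q * q
    return 1.0 - homozygosity - 2.0 * p * p * q * q


# MAF values taken from Table S1 (BIEC2-797488, BIEC2-797548, BIEC2-797709).
for maf in (0.37, 0.08, 0.49):
    print(f"MAF {maf:.2f} -> PIC {pic_biallelic(maf):.2f}")
```

For these three SNPs the recomputed PIC rounds to 0.36, 0.14 and 0.37, matching the tabulated values.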


Table S2 Quantitative trait loci (QTL) and putative QTL for limbs, their location on Equus caballus chromosome (ECA) in Mb, minor allele frequency (MAF), polymorphism information content (PIC), SNP motif and -log10 P-values using different analysis models.

ECA  SNP ID         MAF   PIC   Alleles  Position in Mb  Plink1  Plink2  Plink3  Tassel1  Tassel2
1    BIEC2-53188    0.10  0.16  A/G      123800096       2.48    1.47    3.82    3.21     3.27
3    BIEC2-784176   0.10  0.17  A/C      59588660        4.42    3.88    4.20    3.78     4.46
3    BIEC2-784177   0.13  0.20  C/T      59588915        4.38    3.75    4.29    3.44     3.96
3    BIEC2-784179   0.12  0.19  A/G      59599826        3.83    3.44    3.80    3.05     3.65
3    BIEC2-784182   0.13  0.20  A/G      59602599        4.34    3.63    4.30    3.44     4.03
3    BIEC2-784278   0.16  0.23  C/T      59738480        5.19    4.14    3.80    3.84     3.99
3    BIEC2-784634   0.12  0.19  A/G      60488386        4.26    3.99    3.68    3.74     4.27
3    BIEC2-784670   0.12  0.19  A/G      60564407        4.26    3.99    3.68    3.74     4.27
3    BIEC2-784763   0.12  0.19  A/G      60768911        4.16    3.69    3.74    3.61     4.13
3    BIEC2-784779   0.12  0.19  C/T      60787349        4.16    3.69    3.74    3.61     4.13
3    BIEC2-784948   0.12  0.19  C/T      60909570        4.16    3.69    3.74    3.61     4.13
3    BIEC2-784957   0.12  0.19  C/T      60941985        4.12    3.63    3.79    3.68     4.17
3    BIEC2-784974   0.12  0.19  A/G      60985887        4.16    3.69    3.74    3.61     4.13
3    BIEC2-785004   0.30  0.33  G/T      61053785        3.01    3.38    2.56    3.08     3.31
3    BIEC2-785193   0.18  0.25  A/C      61276175        3.90    3.61    3.57    3.82     3.99
3    BIEC2-789759   0.19  0.26  A/C      69166205        3.71    3.59    2.60    3.68     3.95
5    BIEC2-897727   0.23  0.29  A/G      21052939        5.81    4.51    2.76    4.23     3.22
5    BIEC2-897799   0.27  0.32  A/G      21255011        6.89    5.65    3.92    4.83     4.05
5    BIEC2-897881   0.27  0.32  G/T      21634896        6.97    5.92    3.21    4.64     3.79
5    BIEC2-897883   0.27  0.32  C/T      21638160        6.97    5.92    3.21    4.64     3.79
5    BIEC2-898062   0.32  0.34  C/T      22105736        6.99    6.01    3.79    4.41     3.59
5    BIEC2-898069   0.25  0.30  A/G      22168823        6.86    5.13    2.59    4.82     3.46
5    BIEC2-898157   0.13  0.21  A/G      22947033        5.24    4.86    2.96    5.33     3.36
5    BIEC2-898176   0.13  0.21  C/T      23121913        5.24    4.86    2.96    5.33     3.36
5    BIEC2-926759   0.15  0.22  A/G      87836901        2.49    3.80    3.61    4.85     4.10
5    BIEC2-928986   0.29  0.32  A/G      91087138        3.09    3.37    2.30    3.80     3.27
5    BIEC2-929000   0.29  0.33  C/T      91106034        3.09    3.37    2.30    3.80     3.27
8    BIEC2-1027884  0.40  0.37  A/G      9430846         3.69    3.29    1.53    4.14     3.06
8    BIEC2-1037941  0.43  0.37  C/T      24898640        3.91    4.95    1.14    4.96     3.49
11   BIEC2-136265   0.29  0.32  A/G      6358979         1.46    1.66    2.04    4.84     3.00
14   BIEC2-261914   0.16  0.23  A/G      65722376        6.60    6.05    3.81    4.01     3.78
14   BIEC2-262035   0.15  0.22  A/G      65968391        6.43    5.83    3.02    3.77     3.35
17   BIEC2-375664   0.38  0.36  C/T      35342569        4.68    4.73    1.78    4.83     3.13
17   BIEC2-375985   0.47  0.37  G/T      36846170        3.44    4.52    1.50    4.48     3.37
17   BIEC2-376057   0.28  0.32  C/T      37232491        3.63    4.18    2.12    3.53     3.02
17   BIEC2-376274   0.34  0.35  A/G      38765378        3.46    3.34    1.98    3.65     3.06
17   BIEC2-376373   0.30  0.33  C/T      39508435        3.75    3.34    1.59    4.27     3.06
18   BIEC2-391005   0.25  0.31  C/T      2265150         5.39    4.97    3.36    5.29     3.44
19   BIEC2-422566   0.07  0.11  A/G      2251539         3.46    3.11    2.99    3.50     4.12
20   BIEC2-522383   0.26  0.31  A/G      17272320        2.67    3.44    1.82    4.44     3.31
20   BIEC2-522384   0.26  0.31  A/C      17272406        2.69    3.39    1.81    4.41     3.29
20   BIEC2-522385   0.26  0.31  C/T      17272529        2.67    3.44    1.82    4.44     3.31
25   BIEC2-655148   0.14  0.22  C/T      4775985         5.48    5.07    3.41    3.70     3.28
25   BIEC2-659406   0.07  0.11  C/T      11562192        3.59    4.22    2.38    3.42     3.35
25   BIEC2-659464   0.07  0.11  G/T      11699419        2.16    2.13    2.14    3.68     3.18
26   BIEC2-691812   0.41  0.37  A/G      26108871        2.60    1.94    1.00    3.31     3.00
31   BIEC2-841254   0.36  0.35  G/T      20781519        5.65    5.93    2.04    4.77     3.37

Plink1 = significance ascertained through adaptive permutation testing (maximum of 1,000,000 permutations).
Plink2 = significance ascertained considering the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner.
Plink3 = significance ascertained through adaptive permutation testing within 16 family clusters under consideration of the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner warmblood, using a maximum of 1,000,000 permutations.
Tassel1 = significance ascertained integrating population structure (estimated using STRUCTURE) and marker-based identity-by-state kinship in a mixed linear model (MLM) using TASSEL.
Tassel2 = significance ascertained integrating the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates and marker-based identity-by-state kinship in a mixed linear model (MLM) using TASSEL.
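The adaptive permutation testing named in the Plink1 and Plink3 footnotes can be illustrated with a deliberately simplified sketch. It is not PLINK's procedure: PLINK applies its own pruning rules, which are not reproduced here. The sketch only shows the general idea that permutation for a SNP stops early once enough permuted statistics reach the observed one, so that clearly non-significant SNPs consume few permutations. The test statistic, the stopping rule (`min_hits`), the function names and the toy data are all assumptions made for illustration.

```python
import random


def abs_covariance(x, y):
    """Absolute covariance between genotype codes x and phenotypes y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return abs(sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))) / n


def adaptive_permutation_p(genotypes, phenotypes, max_perms=10_000,
                           min_hits=10, seed=0):
    """Empirical P-value with a simple early-stopping rule.

    Phenotypes are shuffled until `min_hits` permuted statistics are at
    least as large as the observed one, or `max_perms` is reached.
    Returns (p_value, permutations_used).
    """
    rng = random.Random(seed)
    observed = abs_covariance(genotypes, phenotypes)
    shuffled = list(phenotypes)
    hits = 0
    used = 0
    while used < max_perms and hits < min_hits:
        rng.shuffle(shuffled)
        used += 1
        if abs_covariance(genotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (used + 1), used


# Hypothetical toy data: 30 animals, genotypes coded as minor-allele counts.
genotypes = [0, 1, 2] * 10
linked = [100 + 10 * g for g in genotypes]      # phenotype follows genotype
unlinked = [100 + (i % 5) for i in range(30)]   # phenotype unrelated to genotype
print(adaptive_permutation_p(genotypes, linked, max_perms=2000))
print(adaptive_permutation_p(genotypes, unlinked, max_perms=2000))
```

In this toy setting the associated SNP runs to the permutation cap, while the unassociated one stops after a handful of permutations, which is the saving an adaptive scheme is meant to deliver.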


Table S3 Core families analyzed, their size, number of stallions genotyped, the mean breeding value for riding horse points (RHP) and limbs (LIMBS), and mean proportions of Hanoverian (HAN), Thoroughbred (TB), Trakehner (TRAK) and Holsteiner (HOL) genes for the genotyped stallions.

                                                                    Proportion of genes descending from
Family  Family size  Stallions genotyped  BV for RHP  BV for LIMBS  HAN (%)  TB (%)  TRAK (%)  HOL (%)
1       63           10                   97 ± 19     97 ± 13       51.4     33.3    1.1       0
2       19           3                    104 ± 3     107 ± 4       55.0     43.7    1.7       0
3       98           7                    113 ± 16    114 ± 13      16.7     23.1    0.3       54.8
4       37           8                    138 ± 15    129 ± 17      32.3     26.0    20.4      8.1
5       65           7                    102 ± 18    106 ± 15      71.6     27.1    1.3       0
6       47           7                    124 ± 21    121 ± 9       74.3     14.1    4.3       6.4
7       31           4                    98 ± 8      96 ± 5        21.3     12.8    3.3       0
8       80           13                   80 ± 24     89 ± 20       74.8     15.7    2.5       1.5
9       56           5                    134 ± 24    120 ± 19      24.6     74.8    0.4       0
10      33           6                    90 ± 22     92 ± 23       36.5     3.8     3.5       0
11      70           15                   92 ± 23     101 ± 17      79.3     16.4    3.9       0
12      15           2                    113 ± 5     112 ± 9       41.0     29.0    5.0       0
13      17           4                    99 ± 27     86 ± 14       47.2     38.5    4.5       5.3
14      29           2                    81 ± 18     106 ± 11      75.0     15.0    4.0       0
15      66           16                   115 ± 20    125 ± 17      78.4     13.9    6.1       1.2
16      72           6                    103 ± 14    90 ± 24       0        100     0         0
Total   798          115                  102         102           48.7     30.45   3.9       4.8


Table S4 Quantiles of the breeding values (BV) and genotype BVs (gBV) for riding horse points (RHP) and limbs (LIMBS).

Quantile (%)  BV for RHP  gBV for RHP  BV for LIMBS  gBV for LIMBS
95            164         163          160           184
90            139         163          135           184
75            120         154          122           179
50            104         141          106           153
25            88          129          92            153
10            73          110          81            144
5             67          103          73            118


Table S5 SNPs significantly associated with RHP (LOD > 3.0) using the Tassel1 or Tassel2 Model Single nucleotide polymorphisms (SNPs) significantly associated with the breeding value for riding horse points (-log10 (P) >3), their location on Equus caballus chromosome (ECA) in Mb and -log1 0 error probabilities estimated with Tassel1 and Tassel2 and potential candidate genes. Position in Candidate ECA SNP ID Tassel1 Tassel2 Position in Mb Mb gene 1 BIEC2-13478 30427816 3.49 2.62 BIEC2-16836 37154743 3.02 1.52 BIEC2-62809 144340953 4.00 2.46 BIEC2-65387 153033730 3.34 2.02 BIEC2-72017 159620436 3.02 2.28 2 BIEC2-457195 13084869 2.64 3.00 BIEC2-459850 18796966 4.45 3.30 BIEC2-464902 26707031 3.19 1.62 BIEC2-476637 45451604 2.59 3.00 BIEC2-481503 59410884 3.77 2.72 BIEC2-481504 59444000 3.08 1.81 BIEC2-495617 88120167 3.42 1.25 BIEC2-508605 116149167 3.01 1.76 3 BIEC2-771230 7559038 3.22 1.70 BIEC2-772769 13840698 3.32 2.68 BIEC2-774387 23976728 3.43 2.43 BIEC2-774390 23991075 3.43 2.43 BIEC2-774402 24046825 3.43 2.43 BIEC2-774821 26467499 3.05 0.95 BIEC2-779147 44643447 3.05 0.95 BIEC2-779149 44643788 3.05 0.95 BIEC2-779464 47452938 3.00 2.27 BIEC2-791010 71080355 3.70 2.21 BIEC2-796203 81569473 3.58 2.82 BIEC2-796310 81675086 3.43 1.91 BIEC2-796688 82394754 3.33 2.48 BIEC2-796886 82653963 4.18 2.96 BIEC2-797082 82925395 3.93 2.77 BIEC2-797083 82925586 4.48 3.09 BIEC2-797488 83653040 4.11 3.00 BIEC2-797548 83763295 4.53 3.61 BIEC2-797709 84116142 3.80 3.40 BIEC2-798300 84886936 3.36 0.97 BIEC2-798794 85595160 3.12 2.12 BIEC2-798799 85606322 3.12 2.12 BIEC2-798852 85794236 3.16 2.10 BIEC2-798930 85964657 3.26 1.18 BIEC2-799204 86451588 5.03 2.89 BIEC2-799389 87018911 4.25 2.85 BIEC2-799491 87596645 3.98 3.21 BIEC2-799492 87610119 4.24 2.96 BIEC2-799496 87610242 3.98 3.21 BIEC2-807931 102839312 7.90 5.82 PPARGC1A 100784624-100876530 BIEC2-808192 103461428 2.52 3.43


Table S5 continued Position in Candidate ECA SNP ID Tassel1 Tassel2 Position in Mb Mb gene BIEC2-808466 105163077 7.08 6.15 BIEC2-808500 105363241 6.67 5.34 BIEC2-808543 105547002 6.26 5.34 LCORL 105646547-105758011 BIEC2-808608 105875809 5.59 3.86 BIEC2-808617 105876397 5.85 4.21 BIEC2-808640 105947243 3.64 2.77 BIEC2-809050 107542694 5.49 4.72 4 BIEC2-844018 2914990 3.49 1.85 BIEC2-847730 10143709 3.57 1.59 BIEC2-847734 10148019 3.57 1.71 BIEC2-851102 15857380 3.00 1.85 BIEC2-853477 21548517 3.01 1.34 BIEC2-853669 21680389 3.98 1.94 BIEC2-853718 21711097 4.23 2.08 BIEC2-860945 41989527 3.06 2.16 5 BIEC2-897312 19292554 3.66 1.74 BIEC2-897596 20390001 3.83 2.13 BIEC2-897638 20457617 3.99 2.21 BIEC2-897696 20786648 3.87 2.41 BIEC2-897701 20791048 3.87 2.41 BIEC2-897892 21689996 3.71 2.23 BIEC2-898069 22168823 3.47 2.21 BIEC2-898157 22947033 4.80 2.77 BIEC2-898172 23098207 4.65 2.70 BIEC2-898176 23121913 4.80 2.77 BIEC2-898534 25154013 3.59 1.87 BIEC2-900689 31436280 3.40 1.77 BIEC2-900691 31436618 3.36 1.76 BIEC2-900692 31436670 3.36 1.76 BIEC2-900936 32548736 3.41 2.51 BIEC2-901539 34710217 3.42 2.34 BIEC2-901567 34887043 3.33 2.66 BIEC2-905857 37383845 3.79 2.51 BIEC2-912123 59890582 3.54 2.39 BIEC2-912176 60030634 3.41 1.35 BIEC2-918384 73146481 3.00 1.67 BIEC2-918388 73147945 3.00 1.67 BIEC2-920743 77733808 3.02 1.79 BIEC2-921015 78259537 3.06 1.61 BIEC2-926759 87836901 3.40 2.92 BIEC2-929141 91229049 4.15 2.55 BIEC2-929148 91241576 4.25 2.80 BIEC2-929151 91242708 3.39 2.77 BIEC2-932791 97216631 4.26 1.61 BIEC2-933501 98122103 3.63 1.48 6 BIEC2-939135 10267215 3.29 1.33 BIEC2-944925 22888476 3.34 1.55 BIEC2-946964 33782226 3.21 2.18 BIEC2-952532 45761583 3.26 2.35 BIEC2-957041 55608721 3.54 3.03 BIEC2-957769 56832213 3.04 2.11 BIEC2-958816 58657544 3.16 2.28


Table S5 continued Position in Candidate ECA SNP ID Tassel1 Tassel2 Position in Mb Mb gene BIEC2-1181633 69280891 3.03 1.87 BIEC2-1183450 71872446 3.21 2.85 BIEC2-1186448 76055578 3.64 1.87 CYP27B1 75303709-75307830 BIEC2-1186451 76057683 3.83 2.02 BIEC2-1186501 76100416 4.97 2.62 BIEC2-1187377 77015004 3.14 3.00 BIEC2-1187571 77259757 3.12 3.17 7 BIEC2-986405 23278762 3.08 1.40 BIEC2-992538 33669475 3.18 2.31 BIEC2-993145 34291268 4.00 2.47 BIEC2-1006136 70914456 5.22 3.26 BIEC2-1008133 80739520 3.27 1.17 BIEC2-1008165 80994747 4.61 1.72 8 BIEC2-1022701 2955972 4.40 2.54 BIEC2-1046042 40043641 2.64 3.45 BIEC2-1050009 47996071 3.08 1.62 BIEC2-1051244 50820999 3.41 2.49 BIEC2-1054120 57387443 3.67 1.85 BIEC2-1057054 61558721 4.37 2.06 BIEC2-1057219 61803683 4.78 2.39 BIEC2-1064307 82982694 3.06 2.62 9 BIEC2-1067731 860869 3.16 1.14 BIEC2-1071900 6743128 3.55 1.87 BIEC2-1071914 6756912 3.97 2.52 BIEC2-1090197 41806295 3.38 3.01 BIEC2-1101914 66034214 4.14 2.68 BIEC2-1103349 68529055 3.84 2.64 BIEC2-1103351 68529354 3.96 2.51 BIEC2-1104137 70859944 3.88 2.47 BIEC2-1104140 70863096 3.88 2.47 BIEC2-1104189 70889211 3.88 2.47 BIEC2-1104197 70893162 3.36 2.28 BIEC2-1104198 70893252 3.88 2.47 BIEC2-1104242 70960794 3.88 2.47 10 BIEC2-97587 7346411 3.50 2.72 BIEC2-122289 50125577 3.45 2.19 BIEC2-122428 50326303 3.11 1.62 BIEC2-122670 50718975 3.65 1.64 BIEC2-130623 68407608 3.16 2.03 BIEC2-132884 81200428 3.94 2.22 11 BIEC2-136265 6358979 3.73 2.52 BIEC2-136322 6659556 4.35 2.17 BIEC2-136343 6808395 4.23 2.15 BIEC2-136353 6903428 4.45 2.02 BIEC2-143784 22239520 3.32 2.52 BIEC2-157774 49157741 3.35 1.78 BIEC2-160750 55161299 3.07 2.20 12 BIEC2-167337 1819803 3.27 1.21 BIEC2-192112 20952733 3.25 2.18 BIEC2-194588 23714293 3.13 1.41 BIEC2-194589 23717619 3.39 1.40 BIEC2-194593 23721615 3.92 1.57


Table S5 continued Position in Candidate ECA SNP ID Tassel1 Tassel2 Position in Mb Mb gene BIEC2-194605 23724775 3.07 1.14 13 BIEC2-204483 3309820 3.48 1.19 BIEC2-235536 38534789 3.27 1.17 BIEC2-235734 39785298 3.32 1.33 14 BIEC2-238245 1132356 4.22 2.62 BIEC2-248760 24240674 3.92 3.15 BIEC2-271320 83884366 3.03 2.28 BIEC2-271570 84264354 3.15 2.12 BIEC2-274596 89634008 3.01 2.00 BIEC2-276841 92054297 3.69 2.08 BIEC2-276851 92057259 3.64 2.28 15 BIEC2-294859 22314765 3.75 1.80 BIEC2-308356 48337765 3.36 1.51 BIEC2-319719 74302488 3.03 1.75 BIEC2-320588 75960062 3.24 1.92 BIEC2-320772 76558411 3.68 2.62 BIEC2-325118 85961234 3.16 2.40 BIEC2-325120 85962396 3.66 3.13 BIEC2-325170 86165109 3.47 2.70 BIEC2-325188 86171192 3.51 3.17 BIEC2-325192 86175145 3.13 2.30 BIEC2-325224 86297142 3.36 1.21 BIEC2-325253 86428358 5.10 3.73 BIEC2-325256 86428577 5.10 3.73 BIEC2-325468 87411975 3.17 1.80 BIEC2-326108 90076417 3.14 1.64 16 BIEC2-328704 6203161 3.15 2.29 BIEC2-328730 6306983 3.00 2.04 BIEC2-328841 6579330 3.15 2.29 BIEC2-328853 6605389 3.02 2.23 BIEC2-349051 50673234 4.57 3.83 BIEC2-351800 55818835 3.49 2.55 BIEC2-356341 64646823 3.14 1.55 17 BIEC2-368487 10216714 4.52 3.25 BIEC2-374370 25963835 3.62 1.92 BIEC2-375113 30634556 3.57 2.02 BIEC2-375625 35156682 3.59 1.85 BIEC2-375641 35258708 3.59 1.85 BIEC2-375643 35259013 3.59 1.85 BIEC2-375664 35342569 4.43 2.54 BIEC2-375681 35404484 3.59 1.85 BIEC2-375682 35405565 3.59 1.85 BIEC2-375924 36644251 3.73 1.74 BIEC2-376274 38765378 3.63 2.96 BIEC2-376318 39183407 4.73 2.89 BIEC2-376373 39508435 4.60 2.80 BIEC2-376621 40886200 5.31 2.77 BIEC2-376623 40886448 4.29 2.24 BIEC2-376630 40900237 5.31 2.77 BIEC2-376631 40939096 3.34 1.80 BIEC2-376642 41029980 3.73 2.12 BIEC2-376713 41669220 3.74 2.26


Table S5 continued Position in Candidate ECA SNP ID Tassel1 Tassel2 Position in Mb Mb gene BIEC2-376714 41669334 4.26 2.96 BIEC2-376718 41690826 4.22 2.92 BIEC2-377538 49208322 4.10 2.55 BIEC2-377685 50842582 3.69 2.49 BIEC2-378928 58094803 3.25 1.86 BIEC2-380111 59891032 3.10 1.31 BIEC2-385829 78498537 4.11 2.28 18 BIEC2-390282 1053781 4.01 2.30 BIEC2-390931 2198055 4.43 2.36 BIEC2-391005 2265150 4.79 3.11 MYO7B 3154614-3221832 BIEC2-415144 56426769 3.00 1.49 BIEC2-417604 68262293 3.27 2.21 19 BIEC2-422417 1134701 3.21 2.68 BIEC2-422566 2251539 4.17 5.03 SHOX2 1769247-1774673 BIEC2-426894 8238600 3.27 0.99 BIEC2-426898 8242401 3.27 0.99 BIEC2-432221 26350892 3.14 1.67 BIEC2-434932 31846554 3.57 2.30 BIEC2-435240 32401242 3.15 2.00 BIEC2-435242 32401427 3.15 2.00 BIEC2-439699 40745903 3.42 2.82 BIEC2-440771 42299877 3.37 1.65 BIEC2-445581 53195185 3.28 1.85 20 BIEC2-514406 5268293 4.23 1.54 BIEC2-517668 10474078 3.02 2.13 BIEC2-521290 15264130 4.21 2.55 BIEC2-521412 15421443 4.21 3.00 BIEC2-522383 17272320 3.93 1.77 BIEC2-522384 17272406 3.89 1.75 BIEC2-522385 17272529 3.93 1.77 BIEC2-523868 20221316 3.07 1.66 BIEC2-524152 20811935 7.08 4.85 BIEC2-524167 20832718 6.05 4.08 BIEC2-524168 20835801 6.05 4.08 BIEC2-524624 22282065 3.64 1.97 BIEC2-524628 22293796 3.05 1.46 BIEC2-524686 22665104 5.06 3.12 BIEC2-525830 25015798 3.71 2.15 BIEC2-532655 42598631 3.50 3.48 BIEC2-532662 42633429 3.50 3.48 BIEC2-532664 42635192 3.50 3.48 BIEC2-535017 47081435 3.66 2.31 21 BIEC2-547452 2785828 3.41 1.79 BIEC2-554899 18811395 4.02 2.70 BIEC2-554900 18811655 4.60 3.31 FST 18584478-18590636 BIEC2-567211 43237107 3.05 2.22 BIEC2-570118 49480299 3.00 1.27 22 BIEC2-574239 23790 3.26 1.37 BIEC2-574487 414888 3.84 2.70 BIEC2-575282 1355408 3.38 1.71 BIEC2-575296 1401359 3.37 1.77 BIEC2-575339 1496512 3.57 2.51


Table S5 continued Position in Candidate ECA SNP ID Tassel1 Tassel2 Position in Mb Mb gene BIEC2-575361 1567586 3.18 1.68 BIEC2-576026 3136348 3.10 1.34 BIEC2-576141 3329069 3.37 1.55 BIEC2-577013 4632335 3.11 2.96 BIEC2-577154 4903099 3.37 2.59 BIEC2-577188 4937693 3.37 2.59 BIEC2-578327 6165755 3.25 2.72 BIEC2-578604 6526701 3.27 2.64 BIEC2-578742 6835408 3.25 2.72 BIEC2-580651 10402703 3.22 1.61 BIEC2-592859 34574806 3.90 2.52 23 BIEC2-621603 27143229 3.32 1.89 BIEC2-623933 36755506 3.65 2.28 BIEC2-624676 41334502 5.01 1.87 BIEC2-624725 41839692 3.07 1.77 24 BIEC2-640349 25956957 3.77 1.30 BIEC2-647889 36465581 3.64 2.24 26 BIEC2-691684 25758750 3.01 2.31 BIEC2-691713 25808727 4.69 3.74 BIEC2-691812 26108871 3.53 3.47 BIEC2-693818 34789359 3.22 1.99 BIEC2-695906 38410792 3.42 2.05 BIEC2-696276 38636450 3.22 1.95 27 BIEC2-700306 2256814 3.04 2.26 BIEC2-700308 2259858 3.04 2.26 BIEC2-702069 7707453 3.61 2.55 BIEC2-705454 13198799 3.23 3.12 BIEC2-707051 16525618 3.27 1.71 BIEC2-710994 23988191 4.05 2.92 BIEC2-717291 33345559 4.03 3.00 28 BIEC2-726879 7526456 3.77 3.15 BIEC2-730532 14015630 3.04 2.09 BIEC2-730672 14662217 3.90 2.18 BIEC2-732532 17853343 3.64 2.10 BIEC2-732543 17872600 3.64 2.10 BIEC2-732925 18650862 5.72 3.05 BIEC2-735014 22595015 3.50 1.66 BIEC2-737513 28541743 3.18 1.01 BIEC2-743993 40740834 3.80 2.72 29 BIEC2-752447 10520032 4.51 1.92 30 BIEC2-813326 1996933 3.02 2.27 BIEC2-814558 3973171 3.04 1.22 BIEC2-815049 4679614 4.00 2.52 BIEC2-815165 4839969 3.59 1.98 BIEC2-815182 4854791 3.48 1.78 BIEC2-827949 24959480 3.23 1.44 31 BIEC2-829947 238966 3.02 2.13 BIEC2-833144 6242631 3.08 2.14 BIEC2-837754 14734668 3.26 2.15 BIEC2-839688 18089361 3.21 1.28 BIEC2-827949 24959480 3.23 1.44


Tassel1 = significance ascertained integrating population structure (estimated using STRUCTURE [23]) and marker-based identity-by-state kinship in a mixed linear model (MLM) using TASSEL.
Tassel2 = significance ascertained integrating the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates and marker-based identity-by-state kinship in a mixed linear model (MLM) using TASSEL.


Table S6 Single nucleotide polymorphisms (SNPs) significantly associated with the breeding value for limbs (-log10 (P) >3), their location on Equus caballus chromosome (ECA) in Mb and -log10 error probabilities estimated with Tassel1 and Tassel2 and potential candidate genes. Position in Candidate Position in Mb ECA SNP ID Tassel1 Tassel2 Mb gene 1 BIEC2-850 3571562 3.16 2.60 BIEC2-9552 20314958 3.19 2.89 BIEC2-9585 20560257 3.37 2.74 BIEC2-9599 20670268 3.52 2.26 BIEC2-9626 21006127 3.01 2.49 BIEC2-9966 21809520 2.54 3.50 BIEC2-53188 123800096 3.21 3.27 BIEC2-73157 161026231 3.09 2.43 BIEC2-81326 170197392 3.34 2.66 BIEC2-81461 170567134 3.37 2.00 BIEC2-89066 181845129 3.24 1.27 2 BIEC2-459850 18796966 3.29 1.86 BIEC2-468992 33665077 3.64 2.24 BIEC2-479634 56150792 3.23 2.85 BIEC2-480907 58518774 3.26 2.89 BIEC2-480931 58526415 3.32 2.80 BIEC2-484506 65321569 3.06 2.60 BIEC2-484508 65321677 3.06 2.60 BIEC2-484509 65321685 3.06 2.60 BIEC2-486399 69215803 3.00 1.90 3 BIEC2-781624 53119067 3.48 2.77 BIEC2-783885 59263095 2.47 3.13 BIEC2-784098 59450356 2.39 3.06 BIEC2-784176 59588660 3.78 4.46 BIEC2-784177 59588915 3.44 3.96 BIEC2-784179 59599826 3.05 3.65 BIEC2-784182 59602599 3.44 4.03 BIEC2-784278 59738480 3.84 3.99 BIEC2-784634 60488386 3.74 4.27 BIEC2-784670 60564407 3.74 4.27 BIEC2-784763 60768911 3.61 4.13 BIEC2-784779 60787349 3.61 4.13 BIEC2-784948 60909570 3.61 4.13 BIEC2-784957 60941985 3.68 4.17 BIEC2-784974 60985887 3.61 4.13 BIEC2-785004 61053785 3.08 3.31 BIEC2-785193 61276175 3.82 3.99 BIEC2-789759 69166205 3.68 3.95 BIEC2-796336 81739558 3.19 2.77 BIEC2-799204 86451588 3.41 1.82 BIEC2-800503 89619156 3.09 2.70 BIEC2-807931 102839312 3.25 1.96 4 BIEC2-847476 9910427 3.15 1.57 BIEC2-866887 54805325 3.07 1.23 BIEC2-869323 68347968 3.14 2.11 BIEC2-873301 89366270 3.01 1.80


Table S6 continued Position in Candidate ECA SNP ID Tassel1 Tassel2 Position in Mb Mb gene BIEC2-881808 103414051 3.33 2.62 5 BIEC2-897297 19230817 3.78 1.33 BIEC2-897298 19230857 3.78 1.33 BIEC2-897302 19232577 3.78 1.33 BIEC2-897303 19235729 3.55 1.35 BIEC2-897312 19292554 3.30 1.57 BIEC2-897356 19444168 3.77 2.32 BIEC2-897596 20390001 4.36 2.80 BIEC2-897638 20457617 4.58 2.96 BIEC2-897696 20786648 3.06 2.01 BIEC2-897701 20791048 3.06 2.01 BIEC2-897727 21052939 4.23 3.22 BIEC2-897799 21255011 4.83 4.05 BIEC2-897874 21541967 3.03 1.75 BIEC2-897881 21634896 4.64 3.79 BIEC2-897883 21638160 4.64 3.79 BIEC2-897892 21689996 3.27 2.44 BIEC2-898062 22105736 4.41 3.59 BIEC2-898069 22168823 4.82 3.46 BIEC2-898157 22947033 5.33 3.36 BIEC2-898172 23098207 3.89 2.34 BIEC2-898176 23121913 5.33 3.36 BIEC2-898705 26187848 3.60 2.01 BIEC2-898710 26191932 3.67 2.06 BIEC2-898714 26215264 3.99 2.55 BIEC2-898725 26291020 3.12 1.81 BIEC2-900936 32548736 3.15 2.47 BIEC2-901045 32937329 3.27 1.97 BIEC2-926412 86984368 3.32 2.72 BIEC2-926759 87836901 4.85 4.10 BIEC2-927187 88243192 3.88 2.09 BIEC2-927189 88243274 3.45 2.09 BIEC2-928624 90369184 3.00 1.45 BIEC2-928638 90400254 3.32 1.57 BIEC2-928986 91087138 3.80 3.27 BIEC2-929000 91106034 3.80 3.27 BIEC2-929141 91229049 3.05 1.57 BIEC2-932791 97216631 3.06 0.92 6 BIEC2-939105 10161499 3.62 1.24 BIEC2-939135 10267215 4.52 2.13 BIEC2-939148 10348635 3.60 1.28 BIEC2-958816 58657544 3.10 2.06 BIEC2-1186501 76100416 3.23 1.44 7 BIEC2-1013982 90355524 3.17 1.44 8 BIEC2-1021433 1058058 3.38 2.23 BIEC2-1022617 2897484 3.32 2.42 BIEC2-1022701 2955972 3.05 1.64 BIEC2-1027884 9430846 4.14 3.06 BIEC2-1037941 24898640 4.96 3.49 BIEC2-1046715 42756432 3.33 1.91 BIEC2-1051244 50820999 3.28 2.68 9 BIEC2-1067636 718593 2.38 3.12


Table S6continued Position in Candidate ECA SNP ID Tassel1 Tassel2 Position in Mb Mb gene BIEC2-1068646 2377573 3.48 3.42 BIEC2-1068666 2412264 3.48 3.42 BIEC2-1070467 4405717 3.47 2.15 BIEC2-1070468 4405728 3.37 2.12 BIEC2-1071900 6743128 3.56 0.98 BIEC2-1071914 6756912 4.55 1.88 BIEC2-1073609 9544337 3.27 2.41 BIEC2-1075054 13643614 4.38 2.77 BIEC2-1103214 67771774 3.02 1.56 BIEC2-1103349 68529055 3.32 2.32 BIEC2-1104137 70859944 3.41 2.01 BIEC2-1104140 70863096 3.41 2.01 BIEC2-1104189 70889211 3.41 2.01 BIEC2-1104198 70893252 3.41 2.01 BIEC2-1104242 70960794 3.41 2.01 10 BIEC2-108299 21112140 3.57 1.72 BIEC2-122289 50125577 3.44 1.91 BIEC2-130896 70278029 3.77 2.85 BIEC2-130915 70365869 3.77 2.85 BIEC2-130917 70369934 3.77 2.85 BIEC2-130937 70527577 3.62 2.19 BIEC2-132355 78462701 3.78 2.46 11 BIEC2-136265 6358979 4.84 3.00 BIEC2-141258 16395242 3.37 1.86 BIEC2-155067 44671168 3.06 1.37 BIEC2-157720 49039853 3.48 2.27 13 BIEC2-203613 2336845 3.00 2.77 BIEC2-204483 3309820 3.20 1.37 BIEC2-235536 38534789 3.11 1.57 14 BIEC2-241282 7230689 3.22 1.19 BIEC2-261914 65722376 4.01 3.78 BIEC2-262035 65968391 3.77 3.35 BIEC2-262621 67999643 3.08 2.35 BIEC2-266400 75359064 4.04 2.59 BIEC2-274596 89634008 3.86 1.70 BIEC2-277370 92832873 3.58 2.11 15 BIEC2-317557 69936180 2.17 3.00 BIEC2-317830 70340963 2.11 3.11 BIEC2-319647 73863669 3.48 2.59 BIEC2-319656 73929684 3.25 2.66 16 BIEC2-345520 44696730 3.10 2.19 BIEC2-349467 51772501 3.20 1.34 BIEC2-349474 51786396 3.84 1.44 BIEC2-349476 51786601 3.84 1.44 BIEC2-349636 51992543 3.99 1.54 BIEC2-362321 76200085 2.96 3.05 17 BIEC2-366829 2931611 3.54 3.69 BIEC2-374370 25963835 3.41 2.15 BIEC2-374565 27132595 3.37 1.92 BIEC2-374566 27133182 3.36 1.89 BIEC2-374569 27133494 3.36 1.89 BIEC2-374571 27133573 3.36 1.89


Table S6 continued Position in Candidate ECA SNP ID Tassel1 Tassel2 Position in Mb Mb gene BIEC2-374572 27133687 3.36 1.89 BIEC2-374592 27199856 3.45 2.24 BIEC2-374854 29231209 3.33 2.07 BIEC2-374891 29368701 3.10 1.86 BIEC2-374900 29413033 3.30 1.85 BIEC2-374998 29952455 3.60 1.93 BIEC2-375012 30088971 3.60 1.93 BIEC2-375021 30206201 3.60 1.93 BIEC2-375027 30229247 3.52 1.72 BIEC2-375625 35156682 4.19 2.25 BIEC2-375639 35255873 3.34 1.82 BIEC2-375641 35258708 4.19 2.25 BIEC2-375643 35259013 4.19 2.25 BIEC2-375664 35342569 4.83 3.13 BIEC2-375681 35404484 4.19 2.25 BIEC2-375682 35405565 4.19 2.25 BIEC2-375759 35625286 3.97 2.17 BIEC2-375782 35659915 3.34 1.80 BIEC2-375924 36644251 4.89 2.80 BIEC2-375985 36846170 4.48 3.37 BIEC2-376053 37201102 3.50 2.39 BIEC2-376057 37232491 3.53 3.02 BIEC2-376057 37232491 3.53 3.02 BIEC2-376274 38765378 3.65 3.06 BIEC2-376318 39183407 3.81 2.92 BIEC2-376373 39508435 4.27 3.06 BIEC2-377615 49835529 3.07 2.26 BIEC2-378339 55582048 3.03 1.86 18 BIEC2-391005 2265150 5.29 3.44 MYO7B 3154614-3221832 BIEC2-396583 7182586 3.60 2.12 BIEC2-404700 14650564 3.07 0.86 BIEC2-417897 70473204 3.13 1.71 19 BIEC2-422566 2251539 3.50 4.12 SHOX2 1769247-1774673 BIEC2-439699 40745903 3.00 1.94 BIEC2-439769 40771324 3.00 2.24 BIEC2-446323 55082963 3.13 2.51 20 BIEC2-517466 10043487 3.00 1.92 BIEC2-522383 17272320 4.44 3.31 BIEC2-522384 17272406 4.41 3.29 BIEC2-522385 17272529 4.44 3.31 BIEC2-524167 20832718 3.92 2.33 BIEC2-524168 20835801 3.92 2.33 BIEC2-524686 22665104 3.00 1.44 BIEC2-525830 25015798 3.15 2.17 21 BIEC2-565004 40040336 2.92 3.35 BIEC2-565012 40126150 2.92 3.35 BIEC2-565031 40217007 2.92 3.35 BIEC2-565032 40217079 2.92 3.35 BIEC2-565034 40217182 2.92 3.35 BIEC2-568451 45721096 2.64 3.64 BIEC2-568462 45772761 2.47 3.37 BIEC2-568476 45864042 2.89 3.78


Table S6 continued Position in Candidate ECA SNP ID Tassel1 Tassel2 Position in Mb Mb gene BIEC2-568507 46125675 2.80 3.50 22 BIEC2-575296 1401359 3.11 2.40 BIEC2-575339 1496512 3.41 2.60 BIEC2-587085 21454739 3.20 2.22 BIEC2-587405 21973567 3.08 2.52 BIEC2-592859 34574806 3.17 1.87 BIEC2-596805 39463952 3.54 2.44 BIEC2-599045 42508442 3.08 2.04 23 BIEC2-612065 10629248 3.61 2.80 BIEC2-618034 19861470 3.11 1.70 BIEC2-618042 19874879 4.05 2.48 BIEC2-618908 21108119 4.04 2.38 BIEC2-621603 27143229 3.26 2.17 BIEC2-624676 41334502 5.03 2.80 BIEC2-626135 47893418 3.48 2.70 24 BIEC2-640349 25956957 4.42 1.59 BIEC2-641542 27762351 3.05 2.06 BIEC2-641566 27856416 3.37 1.87 BIEC2-643825 30690348 3.00 2.05 25 BIEC2-655148 4775985 3.70 3.28 COL15A1 5725159-5791881 BIEC2-659406 11562192 3.42 3.35 BIEC2-659464 11699419 3.68 3.18 BIEC2-659857 12494034 3.15 3.09 RAD23B 13002781-13047904 BIEC2-663052 19952311 4.44 2.08 26 BIEC2-687940 18652737 3.07 2.36 BIEC2-691607 25592024 3.35 2.31 BIEC2-691713 25808727 3.69 2.92 BIEC2-691812 26108871 3.31 3.00 RNF160 26086284-26139377 27 BIEC2-705268 12824352 3.00 2.82 BIEC2-719149 34732013 4.39 2.20 28 BIEC2-726077 6209993 3.07 2.80 BIEC2-737513 28541743 3.13 1.71 BIEC2-744574 41626801 3.13 3.07 BIEC2-745277 44354481 3.47 2.35 BIEC2-745279 44357526 3.47 2.35 BIEC2-745539 46000319 3.18 1.95 BIEC2-745550 46121975 3.12 2.19 29 BIEC2-752447 10520032 3.84 1.58 BIEC2-753387 11382768 3.17 1.50 31 BIEC2-831515 2539753 3.28 3.87 BIEC2-833228 6429918 3.09 2.72 BIEC2-841254 20781519 4.77 3.37 BIEC2-841402 21564112 3.29 1.06 PLAGL1 21988301-21995020 Tassel1 = significance ascertained integrating population structure (estimated using STRUCTURE [23]) and marker identity by state based kinship in a mixed linear model (MLM) by using TASSEL. 
Tassel2 = significance ascertained integrating the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates and marker-based identity-by-state kinship in a mixed linear model (MLM) using TASSEL.


Figure 1 Quantile-quantile plots of observed P-values for riding horse points (1) and LIMBS (2), estimated in PLINK by adaptive permutation testing with a maximum of 5,000,000 permutations, versus the expectation under the null hypothesis. The black line shows the expected distribution; the black points show the absolute observed distribution.
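How the two axes of such a quantile-quantile plot are usually obtained can be sketched briefly. Under the null hypothesis P-values are uniformly distributed, so the i-th smallest of n P-values is expected at i / (n + 1); both expected and observed values are then plotted on the -log10 scale. This is the common construction and is assumed here; the thesis does not describe the plotting step, and the function name is hypothetical.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10 P, most significant first.

    The expected value of the i-th smallest of n uniform P-values is
    taken as i / (n + 1).
    """
    if any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("P-values must lie in (0, 1]")
    observed = sorted(p_values)
    n = len(observed)
    points = []
    for rank, p in enumerate(observed, start=1):
        expected = rank / (n + 1)
        points.append((-math.log10(expected), -math.log10(p)))
    return points


# Hypothetical P-values; real input would be one P-value per tested SNP.
for expected, observed in qq_points([0.5, 0.01, 0.2]):
    print(f"expected {expected:.3f}  observed {observed:.3f}")
```

Points lying above the diagonal (observed larger than expected) indicate either true associations or inflation from unaccounted stratification, which is why the plots are compared across the analysis models.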

Figure 2 Quantile-quantile plots of observed P-values for riding horse points (1) and LIMBS (2), estimated in PLINK by adaptive permutation testing with the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates and a maximum of 1,000,000 permutations, versus the expectation under the null hypothesis. The black line shows the expected distribution; the black points show the absolute observed distribution.

Figure 3 Quantile-quantile plots of observed P-values for riding horse points (1) and LIMBS (2), estimated in PLINK by CHM adaptive permutation testing within 16 family clusters, with the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner as covariates and a maximum of 1,000,000 permutations, versus the expectation under the null hypothesis. The black line shows the expected distribution; the black points show the absolute observed distribution.

Figure 4 Quantile-quantile plots of observed P-values for riding horse points (1) and LIMBS (2), estimated in TASSEL with a mixed linear animal model that simultaneously accounts for multiple levels of marker-based population structure (Q matrix) and relative kinship among the individuals (K matrix), versus the expectation under the null hypothesis. The black line shows the expected distribution; the black points show the absolute observed distribution.
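A marker-based identity-by-state (IBS) kinship matrix of the kind referred to as the K matrix can be sketched as follows. The sketch uses a generic IBS similarity (the mean proportion of alleles shared per SNP, with genotypes coded as minor-allele counts 0, 1 or 2); it is not claimed to reproduce TASSEL's kinship computation, and the function names, the missing-value convention (`None`) and the example genotypes are assumptions.

```python
def ibs_similarity(geno_a, geno_b):
    """Mean proportion of alleles shared identical by state.

    Genotypes are minor-allele counts (0, 1, 2); None marks a missing call.
    Per SNP, two individuals share 2 - |a - b| of 2 alleles.
    """
    shared = 0
    typed = 0
    for a, b in zip(geno_a, geno_b):
        if a is None or b is None:
            continue  # skip SNPs not typed in both individuals
        shared += 2 - abs(a - b)
        typed += 1
    if typed == 0:
        raise ValueError("no SNP typed in both individuals")
    return shared / (2 * typed)


def ibs_matrix(genotypes):
    """Symmetric matrix of pairwise IBS similarities."""
    n = len(genotypes)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = ibs_similarity(genotypes[i], genotypes[j])
            matrix[i][j] = value
            matrix[j][i] = value
    return matrix


# Hypothetical genotypes of three stallions at four SNPs.
stallions = [
    [0, 1, 2, 1],
    [0, 1, 2, None],
    [2, 1, 0, 1],
]
for row in ibs_matrix(stallions):
    print([round(value, 2) for value in row])
```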

Figure 5 Quantile-quantile plots of observed P-values for riding horse points (1) and LIMBS (2), estimated in TASSEL with a mixed linear animal model that simultaneously accounts for the proportion of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner and relative kinship among the individuals (K matrix), versus the expectation under the null hypothesis. The black line shows the expected distribution, the black points show the absolute observed distribution and the red line shows the middle observed distribution.


CHAPTER 8

General discussion


8 General discussion

A capable, athletic horse, suitable for most kinds of competitive equestrian sports, in particular dressage and show-jumping is the major aim of Hanoverian warmblood (Hanoverian) breeding. Today, intense selection for specific performance traits is primary based on population genetic analyses, in particular breeding value (BV) estimations. But even though BVs represent a reliable basis for performance orientated mating, they are not able to explain all of the phenotypic variance observed. By implementing the proportion of thoroughbred, Trakehner and Holsteiner warmblood (Holsteiner) genes into the model, we tried to decrease the residual variance of genetic evaluation. However, our results indicated that for the homogenous, well documented Hanoverian population investigated here, analyses of performance data collected under standardized conditions for a large number of horses from the same breed will not relevantly benefit from model extension by breed class effects. Recently Wade et al. (2009) could show that major haplotypes are frequently shared among diverse horse breeds, increasing the facility of across-breed mapping. Their power calculations suggested that ~100,000 SNPs would be sufficient for association mapping within all breeds as well as across breeds. Our results indicate that genome-wide-association studies (GWAS) remarkably benefited from a thorough model choice. For the stallions of the National State stud of Lower Saxony tested, considering the proportion of genes of Hanoverian, thoroughbred, Trakehner and Holsteiner and in particular a marker based identity by state (IBS) kinship matrix took data stratification into account best. That disparity could be due to the considerably small number of stallions investigated for GWAS. In addition, the Illumina equine SNP50 genotyping BeadChip used for our analyses features 54,602 evenly distributed SNPs, approximate half of the SNP number recommended as sufficient. 
However, further analyses including larger populations and denser marker sets are required to verify whether the model for GWAS for performance and conformation benefits from inclusion of the proportions of genes and a marker-based IBS kinship matrix. Genetic improvement in horses is greatly slowed by the long generation interval, so the application of genetic markers in selection schemes to improve specific performance traits could be highly desirable. For dairy cattle, SNP-based genotypic BVs are already used successfully to select individuals for breeding at an early age (e.g. Meuwissen et al. 2001; Villanueva et al. 2005; Habier et al. 2007). The impact of genomic selection on identifying the most talented horses for equestrian sports and further breeding has not yet been investigated for Hanoverians or any other warmblood breed. A key finding from experiments in dairy cattle conducted to date is that the reference population must be very large for genomic estimated breeding values to be predicted accurately (VanRaden et al. 2009). To date, the relatively high cost of genotyping is a major limiting factor for genomic selection schemes in horses. However, these costs might be reduced considerably in the near future by new technologies. Next-generation sequencing (NGS), a new generation of sequencing technologies that provides unprecedented opportunities for high-throughput functional genomic research, might be a suitable solution. NGS enables whole-genome sequencing within hours to days at a fraction of the cost of the commonly used Sanger sequencing technology. This should greatly simplify the identification of markers related to breed, kinship, performance or conformation. However, because of high infrastructure costs and the need to handle the large volume of data generated, NGS has not yet been widely adopted (Schuster 2008). It is likely that the stallion population investigated in this study is too small for reliable genomic BV estimation, and further studies will be necessary to validate our results in larger data sets and other horse populations.
However, we were able to show that genomic selection based on a moderate number of SNPs is possible and might relevantly improve Hanoverian sport horse breeding. Only a few of our 28 candidate genes for equine performance that have polymorphisms associated with human elite athletic performance could be localized within the QTL and putative QTL for show-jumping and dressage. Elite performance in show-jumping and dressage requires a wide range of specific physical abilities. Show-jumping requires an excellent combination of fast and strong muscle contraction over a moderate period of time, balance, and coordinative skills, in particular during the approach and takeoff. Most of the force for takeoff has to be provided by muscular power generated by the hindlimbs (Lopez-Rivero & Letelier 2000; Barrey 2004). High jump and hurdles are probably the human sports closest to equine show-jumping. In contrast, dressage requires a combination of a high level of muscular tension and elastic, rhythmic, regular and supple movements. Additionally, balance, coordinative skills and a high level of sensitivity and learning aptitude are crucial for elite dressage performance. We found fewer but much more strongly associated QTL regions for show-jumping, riding horse points and limbs than for dressage. This could be explained by the more subjective scoring of the quality of gaits and rideability of a dressage horse, particularly in comparison with the more objective evaluation of the jumping talent of a horse. Only a few human sports, such as competitive dancing or ballet, require physical abilities comparable to those required for dressage. However, in humans, candidate genes have so far been investigated primarily for endurance and power sports. Polymorphisms within human genes that are associated with elite physical performance are mostly related to phenotypic adaptations favourable for endurance or power. Horses naturally possess several functional and structural phenotypic adaptations required for exceptional athletic performance, indicating that genes other than those in humans may be crucial for elite equine performance. In contrast to modern warmblood breeds such as the Hanoverian, thoroughbred breeding is primarily focused on acceleration, speed and endurance.
Hence, to verify whether human and equine endurance and power performance are significantly affected by homologous genes, race horses such as thoroughbreds should be investigated in further studies. Gu et al. (2009) reported candidate genes for performance in thoroughbreds using a population genetics-based hitchhiking mapping approach. Based on their results, they propose the thoroughbred as a novel in vivo large animal model that may help in understanding the complex molecular interactions of exercise-influenced metabolic disorders. Mapping and GWAS projects in the horse like ours are likely to accelerate in the coming years and will identify mutations in genes related to morphology and metabolism, which may benefit research on human performance alongside the commonly used rodent models. It can be concluded that elite physical performance, growth and development are attributable to complex molecular and environmental interactions. Our analyses are a first step towards clarifying genetic influences on equine performance and conformation, but further studies using denser marker sets (100,000 SNPs) and larger horse populations are required to validate our results.
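
The marker-based IBS kinship matrix discussed above can be illustrated with a small sketch. It is not the implementation used in this work (the analyses relied on the software listed in the Appendix); it assumes one common definition of IBS sharing, namely the average proportion of alleles two individuals share per SNP. The genotype coding (0, 1 or 2 copies of one allele, `None` for a missing call), the function names and the toy data are all hypothetical.

```python
from typing import List, Optional, Sequence

# Assumed coding: 0, 1 or 2 copies of one allele; None marks a missing call.
Genotype = Sequence[Optional[int]]


def ibs_similarity(g1: Genotype, g2: Genotype) -> float:
    """Average proportion of alleles shared identical by state,
    taken over all SNPs that are called in both individuals."""
    shared = 0.0
    compared = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip SNPs with a missing genotype
        # |a - b| is the number of differing alleles (0, 1 or 2),
        # so (2 - |a - b|) / 2 is the shared proportion at this SNP.
        shared += (2 - abs(a - b)) / 2
        compared += 1
    if compared == 0:
        raise ValueError("no SNP genotyped in both individuals")
    return shared / compared


def ibs_kinship_matrix(genotypes: Sequence[Genotype]) -> List[List[float]]:
    """Symmetric matrix of pairwise IBS similarities."""
    n = len(genotypes)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = ibs_similarity(genotypes[i], genotypes[j])
            matrix[i][j] = matrix[j][i] = value
    return matrix


# Hypothetical toy data: three individuals, four SNPs (not from this study).
toy_genotypes = [
    [0, 1, 2, 2],
    [0, 1, 2, None],
    [2, 1, 0, 2],
]
kinship = ibs_kinship_matrix(toy_genotypes)
for row in kinship:
    print([round(value, 2) for value in row])
```

In a mixed linear model such a matrix would enter as the covariance structure of the random animal effect; with 54,602 SNPs and 115 stallions the same loops apply, although a vectorized implementation would be preferable in practice.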

8.1 References

Barrey E. (2004) Biomechanics of locomotion in the athletic horse. In: Equine sports medicine and surgery (ed. by Hinchcliff K.W., Kaneps A.J., Geor R.J.), pp. 210–230. Elsevier Health, Philadelphia.

Habier D., Fernando R.L., Dekkers J.C. (2007) The impact of genetic relationship information on genome-assisted breeding values. Genetics 177, 2389-2397.

Lopez-Rivero J.L., Letelier A. (2000) Skeletal muscle profile of show jumpers: physiological and pathological considerations. The elite show jumper. Conference of Equine Sports Medicine Science 57–76.

Meuwissen T.H.E., Hayes B.J., Goddard M.E. (2001) Prediction of total genetic value using genome-wide dense marker maps. Genetics 157, 1819-1829.

Schuster S.C. (2008) Next-generation sequencing transforms today’s biology. Nature Methods 5, 16-18.

VanRaden P.M., Van Tassell C.P., Wiggans G.R., Sonstegard T.S., Schnabel R.D., Taylor J.F., Schenkel F. (2009) Invited review: Reliability of genomic predictions for North American Holstein bulls. Journal of Dairy Science 92, 16-24.

Villanueva B., Pong-Wong R., Fernandez J., Toro M.A. (2005) Benefits from marker-assisted selection under an additive polygenic genetic model. Journal of Animal Science 83, 1747-1752.

Wade C.M., Giulotto E., Sigurdsson S., Zoli M., Gnerre S., Imsland F., Lear T.L., Adelson D.L., Bailey E., Bellone R.R., Blöcker H., Distl O., Edgar R.C., Garber M., Leeb T., Mauceli E., MacLeod J.N., Penedo M.C., Raison J.M., Sharpe T., Vogel J., Andersson L., Antczak D.F., Biagi T., Binns M.M., Chowdhary B.P., Coleman S.J., Della Valle G., Fryc S., Guérin G., Hasegawa T., Hill E.W., Jurka J., Kiialainen A., Lindgren G., Liu J., Magnani E., Mickelson J.R., Murray J., Nergadze S.G., Onofrio R., Pedroni S., Piras M.F., Raudsepp T., Rocchi M., Røed K.H., Ryder O.A., Searle S., Skow L., Swinburne J.E., Syvänen A.C., Tozaki T., Valberg S.J., Vaudin M., White J.R., Zody M.C., Broad Institute Genome Sequencing Platform, Broad Institute Whole Genome Assembly Team, Lander E.S., Lindblad-Toh K. (2009) Genome sequence, comparative analysis, and population genetics of the domestic horse. Science 326, 865-867.

CHAPTER 9

Summary

9 Summary

Athletic performance and conformation in Hanoverian warmblood horses - population genetic and genome-wide association analyses

Wiebke Schröder (2010)

Intense selection for speed, endurance or pulling power in the domestic horse (Equus caballus) has resulted in a number of adaptive changes in the phenotype required for elite athletic performance. To date, studies in humans have revealed a large number of genes involved in elite athletic performance, but studies in horses are rare. The horse genome assembly and bioinformatic tools for genome analyses were used to compare human performance genes with their equine orthologues, to retrieve pathways for these genes and to investigate their chromosomal distribution. We present 28 candidate genes for equine performance that have polymorphisms associated with human elite athletic performance and may have an impact on athletic performance in horses. A significant accumulation of candidate genes was found on horse chromosomes 4 and 12. Genes involved in pathways for focal adhesion, regulation of the actin cytoskeleton, neuroactive ligand-receptor interaction and calcium signalling were overrepresented. Genome-wide association studies for athletic performance in horses may benefit from the strongly conserved synteny of the chromosomal arrangement of genes between human and horse. Performance data of a total of 36,441 Hanoverian warmblood horses (Hanoverians) were used to determine whether genetic evaluation for performance and conformation in the Hanoverian could benefit from the inclusion of the proportions of genes of foreign breeds in the model. For our analyses we considered all Hanoverians born from 1992 to 2005 for which records from mare performance tests, auction inspections or studbook inspections were available. Genetic parameters were estimated univariately in a linear animal model using Residual Maximum Likelihood. Genetic evaluation was subsequently performed using Best Linear Unbiased Prediction. To investigate the effect of correcting for the proportions of genes of stallions from foreign breeds, two different models were used for the analyses. Heritabilities of the analyzed performance traits ranged between 0.11 and 0.34 in both models, with standard errors of 0.01. Pearson correlation coefficients between corresponding breeding values from Model 1 and Model 2 were highly positive (>0.98), indicating little effect of the model on the results of genetic evaluation. Our results indicate that a model which includes the proportions of genes of Thoroughbred, Trakehner and Holsteiner as fixed effects will not relevantly improve genetic evaluation for performance and conformation in the Hanoverian. Finally, we performed a genome-wide association (GWA) study for quantitative trait loci (QTL) of performance traits (show-jumping, dressage and conformation) in Hanoverian warmblood horses using the Illumina equine SNP50 BeadChip. For our analyses we genotyped 115 stallions, which could be assigned to 16 families, as a random sample of the 798 stallions of the National State Stud of Lower Saxony. We investigated the breeding values (BV) for show-jumping, dressage, riding horse points (RHP) and limbs for GWA. The BV for show-jumping, which includes style and ability in free jumping, was investigated for association first. We were able to localize six QTL for show-jumping on horse chromosomes (ECA) 1, 8, 9 and 26 (-log10 P-value >5). Within the QTL regions, we identified genes related to human performance, including PAPSS2 on ECA1, MYL2 on ECA8, TRHR on ECA9 and NRF2 on ECA26.
The results of our GWA suggest that genes involved in muscle structure, development and metabolism are crucial for elite show-jumping performance. The dressage talent of a horse is characterized by the quality of walk, trot and canter, and by rideability. For the analyses we investigated the BV for dressage, which is composed of these traits. We located 12 QTL for dressage on horse chromosomes (ECA) 2, 3, 4, 6, 7, 8, 9, 10, 16, 18, 27 and 28 (-log10 P-value >4). Within the QTL regions we identified functional candidate genes for dressage performance, including VWC2 on ECA4, HPX on ECA7, AF3BL2 on ECA8, TRAPPC9 on ECA9, MYL3 on ECA16 and MCPH1 on ECA27. Our results suggest that multiple genes involved in diverse processes are crucial for elite dressage performance. In particular, genes involved in coordination, ataxia and learning aptitude might play a major role in the quality of dressage performance. Conformation traits that are evaluated at studbook inspection and stallion licensing are additionally grouped for population genetic analyses under two headings, riding horse points (RHP) and limbs (LIMBS). RHP include all traits that determine the quality of a riding horse (head, neck, saddle position, frame, type and development), whereas LIMBS comprises all traits that characterize the quality of limb conformation (front legs, hind legs and correctness of gaits). We detected four QTL for RHP on ECA3, 15, 19 and 20 and two QTL for LIMBS on ECA5 and 18 (-log10 P-value >5). Within the QTL regions for RHP we identified PPARGC1A and LCORL on ECA3 as functional candidate genes. For LIMBS we detected PRG4 on ECA5 and MYO7B on ECA18 as functional candidate genes. We conclude that, above all, genes coding for muscular processes, growth, limb development and embryonic development might be decisive for both RHP and LIMBS. Our results represent an initial search for QTL for performance and conformation in Hanoverians. Further studies are necessary to validate the QTL for show-jumping, dressage, RHP and LIMBS in larger data sets and other horse populations.
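
The comparison of breeding values from Model 1 and Model 2 reported in this summary rests on the Pearson correlation coefficient. The following minimal sketch shows that calculation; the function name and the breeding values are hypothetical illustrations, not data or code from this work.

```python
import math
from typing import Sequence


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least two values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical breeding values of five horses under the two models.
bv_model_1 = [102.0, 95.5, 110.2, 88.7, 120.1]
bv_model_2 = [101.5, 96.0, 109.8, 89.5, 119.6]

r = pearson_correlation(bv_model_1, bv_model_2)
print(f"Correlation between Model 1 and Model 2 breeding values: {r:.4f}")
```

A coefficient close to 1 means that both models rank the animals almost identically, which is why correlations above 0.98 are read as little effect of the model extension on the genetic evaluation.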

CHAPTER 10

Extended summary

10 Extended summary

Genome-wide association analyses for performance traits of the Hanoverian warmblood horse, including network analyses

Wiebke Schröder (2010)

Strict selection for physical performance traits such as speed, endurance and pulling power has led to a number of adaptive changes in the phenotype of the domestic horse (Equus caballus) that make these animals exceptional athletes among mammals. While current studies in humans discuss the influence of a large number of genes on physical performance, such investigations in the horse are still in their infancy. In this work we present, for the first time, 28 candidate genes that, on the basis of their function, could influence the physical performance of the horse. For each of these genes, at least one polymorphism associated with physical performance has already been described in the human sequence. The second assembly of the horse genome (EquCab2) was then used, with the help of bioinformatic tools, to compare the sequences of the human performance genes with those of their equine orthologues, to determine their chromosomal distribution and to investigate networks among these genes. A significant accumulation of these candidate genes was observed on equine chromosomes 4 and 12. Genes belonging to pathways for focal adhesion, regulation of the actin cytoskeleton, neuroactive ligand-receptor interaction and calcium signalling were overrepresented among the candidate genes identified. The results of the analysis suggest that further investigations, such as genome-wide association studies for performance traits in the horse, could benefit from the strongly conserved synteny between the chromosomal gene arrangements of human and horse.

Next, we investigated whether population genetic analyses of performance data of Hanoverian warmblood horses (Hanoverians) can benefit from considering the proportions of genes of foreign breeds in the model. For this purpose, the data of all 36,441 Hanoverians born from 1992 to 2005 were considered for which at least one result from a mare performance test, an auction inspection or a studbook inspection was documented. Genetic parameters were estimated univariately in a linear animal model using Residual Maximum Likelihood (REML) for five traits routinely recorded at mare performance tests and auction inspections (walk, trot and canter under saddle, free jumping and rideability) and for three traits assessed at studbook inspections (walk, impetus and elasticity, and correctness of gaits in walk and trot in hand). Genetic evaluation was subsequently performed using Best Linear Unbiased Prediction (BLUP) and the estimated variances. To investigate the effect of correcting for the proportions of genes of introgressed stallions of other breeds, two different models were used for the analyses. Model 1 included the fixed effects of sex (for the auction inspection data only) and age, and the random effect of the event (place-date interaction). Model 2 was additionally extended by the fixed effects of the proportions of genes of Thoroughbred, Trakehner and Holsteiner warmblood (Holsteiner). The estimated heritabilities of the analyzed traits ranged between 0.11 and 0.34 in both models, with standard errors of 0.01. Pearson correlation coefficients between the breeding values of corresponding traits from Model 1 and Model 2 were highly positive (>0.98), indicating only a small effect of the model on the results of the analyses.
The results of the present work thus show that a model for genetic evaluation of performance traits in the Hanoverian is not relevantly improved by additionally considering the proportions of genes of Thoroughbred, Trakehner and Holsteiner. While the breeding goals of most German warmblood breeds vary only marginally with regard to usability for equestrian sports, there are nevertheless some characteristic differences between the desired conformation traits of the individual breeds. A subsequent study therefore investigated whether a model for genetic analyses of conformation traits in the Hanoverian can be improved by considering the proportions of genes of foreign breeds. For this purpose, studbook inspection data of a total of 29,053 Hanoverian mares born from 1992 to 2005 were available from the Hannoveraner Verband. Genetic parameters were again estimated univariately in a linear animal model using Residual Maximum Likelihood for eight conformation traits routinely assessed at studbook inspections (head, neck, saddle position, front legs, hind legs, type, frame, and general impression and development) and for height at withers. Here too, genetic evaluation was subsequently performed using Best Linear Unbiased Prediction and the estimated variances. To test the effects of considering the proportions of genes of foreign breeds on genetic analyses, the same two models were used as in the preceding study. The estimated heritabilities ranged between 0.10 and 0.57 in both models, with standard errors ≤0.01, for the analyzed conformation traits and height at withers. Here too, the breeding values estimated with Model 1 and Model 2 for corresponding traits were highly positively correlated (Pearson correlation coefficients >0.99). In summary, the results show that considering the proportions of genes of foreign breeds in a model for breeding value estimation brings no substantial advantage for conformation traits of the Hanoverian either. Genome-wide association studies (GWAS) were then used to search for quantitative trait loci (QTL) for performance traits (show-jumping, dressage and conformation) of the Hanoverian.
For this purpose, data of 798 stallions of the State Stud of Lower Saxony in Celle, belonging to 16 different families, were available. Of these, 115 stallions were randomly selected and genotyped with a DNA chip (Illumina SNP50 BeadChip). The breeding values of the stallions for dressage, show-jumping, riding horse points and limbs were used as target traits. To minimize false-positive results due to data stratification, different models were applied in the analyses: two different mixed linear animal models and three further general linear animal models with adaptive permutation. For all traits investigated, data stratification was best accounted for by a mixed linear animal model that considered the proportions of genes of Hanoverian, Thoroughbred, Trakehner and Holsteiner and a kinship matrix based on marker similarity by allelic state (IBS, identity by state). First, we searched for QTL for jumping talent. This comprises the style and ability of a horse and is assessed in free jumping. The breeding value for show-jumping of a stallion used for the analyses combines the stallion's own performance and that of all of his progeny that took part in a mare performance test or an auction inspection. Six QTL for show-jumping were defined on equine chromosomes (ECA) 1, 8, 9 and 26 (-log10 P-value >5), and further putative QTL with -log10 P-values of 3 to 5 on ECA1, 2, 3, 11, 17 and 21. Within the QTL and putative QTL regions, several genes were identified that influence physical performance in humans, such as PAPSS2 on ECA1, MYL2 on ECA8, TRHR on ECA9 and NRF2 on ECA26, and, within the putative QTL regions, NRAP on ECA1 and TBX4 on ECA11. The results of this GWAS indicate that genes related to the development of the musculature, its structure and its energy metabolism are of particular importance for above-average jumping performance. Next, the genome was examined for QTL for the dressage potential of the Hanoverian. The suitability of a horse for dressage is essentially determined by the quality of walk, trot and canter, and by rideability.
The breeding value for dressage, in which these four traits are combined, served as the target trait. Here too, the breeding value includes the stallion's own performance as well as the performance of all of his progeny that were assessed at a mare performance test or an auction inspection. Twelve QTL for dressage were located on ECA2, 3, 4, 6, 7, 8, 9, 10, 16, 18, 27 and 28 (-log10 P-values >4), and further putative QTL with -log10 P-values of 3 to 4 on ECA1, 2, 3, 5, 14, 18, 19, 20, 21 and 26. Within the QTL and putative QTL regions there are several functional candidate genes for dressage, including VWC2 on ECA4, HPX on ECA7, AF3BL2 on ECA8, TRAPPC9 on ECA9, MYL3 on ECA16 and MCPH1 on ECA27, and, within the putative QTL regions, MYO5A, GNB5, GABPB and HDC on ECA1, TRPC3 on ECA2, PPARGC1A on ECA3, MARKAPK on ECA5, SHOX2 on ECA19 and GABPA on ECA26. These results indicate that a large number of genes, some of them functionally very different, are decisive for the dressage potential of a horse. In particular, genes that play a role in coordinative abilities, are involved in ataxia or influence learning behaviour appear to play a central role in dressage. Subsequently, QTL for conformation were defined in a further analysis. The conformation traits routinely evaluated at studbook inspections and stallion licensings can be combined into two groups, the so-called riding horse points (RHP) and the limbs (LIMBS). RHP comprise all conformation parameters that decisively influence the quality of a horse as a riding horse (head, neck, saddle position, type, frame, and general impression and development), whereas LIMBS comprises all traits that describe the quality of the limbs (front legs, hind legs and correctness of gaits). The breeding values for RHP and LIMBS, which include the stallion's own performance and that of his daughters that took part in a studbook inspection, served as target traits. Four QTL for RHP were defined on ECA3, 15, 19 and 20, and two QTL for LIMBS on ECA5 and 18 (-log10 P-values >5). Further putative QTL with -log10 P-values >3 to <5 were located on ECA3, 6, 17, 18, 19, 10, 21 and 27 for RHP and on ECA1, 3, 5, 8, 11, 14, 17, 18, 19, 20, 25, 26 and 31 for LIMBS. Within the QTL and putative QTL for RHP, the following candidate genes were located: PPARGC1A and LCORL on ECA3 (QTL), and CYP27B1 on ECA6, MYO7B on ECA18, SHOX2 on ECA19 and FST on ECA21 (putative QTL). For LIMBS, PRG4 on ECA5 and MYO7B on ECA18 were identified as candidate genes within the QTL, and SHOX2 on ECA19, COL15A1 and RAD23B on ECA25, RNF160 on ECA26 and PLAGL1 on ECA31 within the putative QTL. These results indicate that genes associated with limb development, growth and muscling are decisive for an advantageous, functional conformation. Further studies in larger populations and other breeds are necessary to verify the performance QTL identified in the present work. New methods such as next-generation sequencing, which, in comparison with conventional Sanger sequencing, allows sequence comparisons of a large number of candidate genes with reasonable effort, can make a valuable contribution to this.
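
The distinction between QTL and putative QTL in this summary is based on -log10 P-value thresholds: above 5 for show-jumping, RHP and LIMBS, above 4 for dressage, and a lower bound of 3 for putative QTL. The sketch below shows how such a classification could be expressed. The function name, the label for signals below the lower bound and the treatment of values lying exactly on a threshold are assumptions made for illustration; they are not taken from the analyses.

```python
import math


def classify_association(p_value: float,
                         qtl_threshold: float = 5.0,
                         putative_threshold: float = 3.0) -> str:
    """Label a marker by its -log10 P-value.

    qtl_threshold: 5 for show-jumping, RHP and LIMBS, 4 for dressage.
    putative_threshold: lower bound for putative QTL (3 in this work).
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must lie in the interval (0, 1]")
    score = -math.log10(p_value)
    if score > qtl_threshold:
        return "QTL"
    if score >= putative_threshold:  # boundary handling is an assumption
        return "putative QTL"
    return "below threshold"  # hypothetical label, not used in the thesis


# Hypothetical P-values, not results of this work.
for p in (2e-7, 3e-5, 0.01):
    print(p,
          classify_association(p),                     # thresholds for show-jumping
          classify_association(p, qtl_threshold=4.0))  # thresholds for dressage
```

The second call in the loop shows that the same P-value can fall into different classes depending on the trait-specific threshold.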

CHAPTER 11

Appendix

11 Appendix

Software

BioMart version 0.7                   http://www.biomart.org/index.html
CLUSTALW2                             http://www.ebi.ac.uk/Tools/clustalw2/index.html
DAVID Bioinformatics Resources 6.7    http://david.abcc.ncifcrf.gov/home.jsp
ENSEMBL                               http://www.ensembl.org
KEGG                                  http://www.genome.jp/kegg/
PEST                                  Groeneveld et al. (1990) In ‘Proceedings of the 4th World Congress of Genetics Applied to Livestock Production’. Edinburgh
PLINK version 1.07                    http://pngu.mgh.harvard.edu/~purcell/plink/
SAS/Genetics, version 9.2             Statistical Analysis System, SAS Institute, Cary, NC, USA
Structure 2.3.3                       http://pritch.bsd.uchicago.edu/structure.html
SUN Ultra Enterprise 450              Sun Microsystems GmbH, Kirchheim-Heimstetten
SUN FIRE V490                         Sun Microsystems GmbH, Kirchheim-Heimstetten
TASSEL version 2.1                    http://www.maizegenetics.net/index.php?option=com_content&task=view&id=89&Itemid=119
the Gene Ontology                     http://www.geneontology.org/
VCE-5 version 5.1.2                   Kovač et al. (2003) Institute for Animal Science and Animal Husbandry, Federal Agricultural Research Centre (Bundesforschungsanstalt für Landwirtschaft, FAL), Mariensee / Neustadt, Germany

CHAPTER 12

List of publications

12 List of publications

Journal articles

Wiebke Schröder, Andreas Klostermann, Ottmar Distl (2010) A review on candidate genes for physical performance in the horse. The Veterinary Journal, article in press, corrected proof (doi:10.1016/j.tvjl.2010.09.029)

Wiebke Schröder, Kathrin Friederike Stock and Ottmar Distl (2010) Does the proportion of genes of foreign breeds influence breeding values for performance traits in the Hanoverian warmblood horse? Livestock Production Science, in review

Wiebke Schröder, Kathrin Friederike Stock and Ottmar Distl (2010) Genetic evaluation of Hanoverian warmblood horses for conformation traits considering the proportion of genes of foreign breeds. Archiv Tierzucht 53 (2010) 4, 377-387, ISSN 0003-9438

Oral presentation

Wiebke Schröder, Andreas Klostermann, Ottmar Distl (2009) Bioinformatische Ansätze zur Unterstützung genomweiter Assoziationsanalysen beim Pferd [Bioinformatic approaches to support genome-wide association analyses in the horse]. Conference (Vortragstagung) of the DGfZ and GfT, 16-17 September 2009, Gießen

CHAPTER 13

Acknowledgements

13 Acknowledgements

First of all, I would like to thank my supervisor Prof. Dr. Dr. habil. Ottmar Distl for providing the interesting topic of my doctoral thesis and for his academic guidance and support of this work.

Many thanks to Dr. Kathrin F. Stock, Mr Jörn Wrede and Andreas Klostermann for their support during the statistical analyses and for their valuable advice and helpful suggestions.

I would also like to thank all my colleagues and friends of the Institute of Animal Breeding and Genetics of the University of Veterinary Medicine Hannover for their support, humor and the friendly atmosphere. It was a great pleasure to work with all of you.

My special gratitude for understanding, undemanding love and support in my life goes to Birte and Sonja.

Very special thanks go to my family, especially to my parents. Thank you for your support in every way throughout all the years and being there when I needed you.

Last but not least, I wish to thank Christian for accompanying me over the last years and for his undemanding, loving support every single day.
