Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 409

Chicken Genomics - Linkage and QTL mapping

PER WAHLBERG

ACTA UNIVERSITATIS UPSALIENSIS
UPPSALA 2009
ISSN 1651-6206
ISBN 978-91-554-7377-8
urn:nbn:se:uu:diva-9502


List of Papers

This thesis is based on the following papers, which are referred to throughout the text by their Roman numerals.

I Jacobsson L, Park HB, Wahlberg P, Fredriksson R, Perez-Enciso M, Siegel PB, Andersson L: Many QTLs with minor additive effects are associated with a large difference in growth between two selection lines in chickens. Genetical Research 86:115-125 (2005)

II Wahlberg P, Strömstedt L, Tordoir X, Foglio M, Heath S, Lechner D, Hellström AR, Tixier-Boichard M, Lathrop M, Gut IG, Andersson L: A high-resolution linkage map for the Z chromosome in chicken reveals hot spots of recombination. Cytogenetic and Genome Research 117:22-29 (2007)

III Groenen MAM, Wahlberg P, Foglio M, Cheng HH, Megens H-J, Crooijmans R, Besnier F, Lathrop M, Muir WM, Wong GK, Gut IG, Andersson L: A high density SNP based linkage map of the chicken genome reveals sequence features correlated with recombination rate. Genome Research (Accepted)

IV Wahlberg P, Carlborg Ö, Foglio M, Tordoir X, Syvänen A-C, Lathrop M, Gut IG, Siegel PB, Andersson L: Genetic analysis of an F2 intercross derived from two chicken lines divergently selected for body-weight. (Submitted)

Reprints were made with permission from the publishers.

Contents

Introduction
    The evolution of the domestic chicken
    The chicken genome
    Mendelian genetics
    Linkage analysis and construction of linkage maps
    Quantitative genetics
    QTL mapping
Aims of the thesis
Present investigations
    Construction and analysis of maps in chicken (Paper II and III)
        Background
        Results and Discussion
        Future perspective
    Genetic analysis of two extreme selection lines – QTL mapping (Paper I and IV)
        Background
        Results and Discussion
        Future perspective
Acknowledgements
References

Introduction

The evolution of the domestic chicken

The domestication of animals started approximately 9,000-12,000 years ago [1]. By means of selective breeding, humans have changed the phenotypic appearance of wild ancestors into that of the domestic animals seen today. In fact, Darwin used domesticated animals as a model for phenotypic evolution by means of natural selection when he proposed his, at that time controversial, evolutionary theory [2]. The genetic consequence of selection is that alleles associated with favorable effects on the traits under selection increase in frequency, and may eventually become fixed in the population [3].

The evolutionary history of the domestic chicken can for simplicity be divided into three phases: first, the emergence of the genus Gallus; second, the domestication process from its ancestor(s); and third, the expansion into the more than 100 distinct breeds of chicken that we see today (www.amerpoultryassn.com). Chicken and other birds share a common ancestor with mammals that lived about 300 million years ago [4]. Modern birds are placed in the class Aves, one of the most diverse groups of vertebrates, with approximately 9,600 extant species. Additionally, birds are the only living descendants of the dinosaurs, which otherwise became extinct approximately 65 million years ago [5]. The defining characteristic of species within Aves is that all have feathers.

Within the genus Gallus, four species have been suggested as possible progenitors of the domestic chicken: the red jungle fowl (G. gallus), the grey jungle fowl (G. sonnerati), the green jungle fowl (G. varius) and the Ceylon jungle fowl (G. lafayettei) [6]. The origin of the domestic chicken has been debated for centuries, although the consensus view has been that the domestic chicken is descended from the red jungle fowl, a species that still inhabits parts of Southeast Asia.
The monophyletic origin of the domestic chicken, from the red jungle fowl, has been supported by genetic studies of the mitochondrial genome [7, 8]. Recently, however, strong genetic evidence was put forward that contradicts this view. Eriksson and co-workers analyzed the DNA sequence of the BCDO2 (beta-carotene dioxygenase 2) gene, which is involved in the control of yellow skin pigmentation [9]. Surprisingly, sequence data obtained from several species of Gallus and from domesticated chickens showed that the yellow skin allele at the BCDO2 locus, common among domestic chickens, must have originated from the grey jungle fowl and not the red jungle fowl. To what extent genetic material from the grey jungle fowl has contributed to the genome of present-day domestic chickens, and whether other species have also contributed, remains unclear. Genomic sequencing of the four Gallus species together with domestic chicken breeds, using new high-throughput technologies, should help to resolve the question of the origin of the domestic chicken.

The process of chicken domestication is thought to have been initiated in Asia between 7,000 and 9,000 years ago [10]. From these original sites the domesticated chicken has spread and is now present in most parts of the world. During the last 50 years or so, intensive breeding has dramatically changed the domesticated chicken into a highly efficient production animal [11]. A modern egg-laying chicken (layer) produces approximately 300 eggs per year, and chickens bred for meat production (broilers) have an adult body-weight of around four kilograms (kg) [12]. In comparison, the wild ancestor(s) weigh around one kg and are seasonal egg layers, producing on average 10-12 eggs per clutch [13]. This response to selection for body-weight and egg production has been made possible through the combination of efficient breeding programs, large effective population sizes and the high genetic variability present within the domestic chicken [14-16].

The chicken is not only an important source of food, but also has a long tradition as a model organism for scientific research [17]. For example, the easily accessible chicken embryo has aided the understanding of developmental processes such as limb formation [18, 19]. Genetic pioneers, such as Bateson and Punnett, used the chicken as a model to study the principles of inheritance, whilst historically the chicken has been used as the prime model organism for studies of other avian species [20, 21].

The chicken genome

A draft genome sequence of the red jungle fowl was published in 2004 and represented the first non-mammalian amniote genome to be sequenced [4]. In an accompanying paper, results were presented from a sequence comparison between the jungle fowl reference sequence and partial sequences (0.25X coverage) generated from three domestic chicken breeds [16]. The genome-wide comparison between the wild and domestic chicken yielded 2.8 million single nucleotide polymorphisms (SNPs). The mean nucleotide diversity was estimated to be approximately 5 SNPs/kb, both in comparisons between domestic breeds and the red jungle fowl and in comparisons between and within breeds of domestic chicken [16]. The high genetic diversity within and between domestic breeds shows that, although the chicken has gone through a domestication process, vast nucleotide diversity remains in domestic populations. A coalescence time of 1.4 million years would be required to account for the observed nucleotide diversity in domestic breeds [16], far longer than the time that has elapsed since the start of domestication (7,000-9,000 years ago). Hence, most of the genetic variation observed in the domestic chicken pre-dates the start of domestication. The estimate of 5 SNPs/kb in chicken is much higher than the observed rate of about 1 SNP/kb in humans [22] and the rate in the dog [23, 24], but similar to that observed between different sub-species of mice [25]. Domestication of the chicken must have involved a large founding population and possibly a large geographical region.

The karyotype (i.e. chromosome organization) in birds differs from that in mammals. The chicken has 38 autosomal chromosome pairs (ten pairs of large macrochromosomes and 28 pairs of small microchromosomes) and two sex chromosomes (Z and W) [26]. In contrast to mammalian species, where the male is the heterogametic sex (XY), birds display the reverse arrangement, with the female as the heterogametic sex (ZW) and the male as the homogametic sex (ZZ). The avian sex chromosomes are not homologous to the X and Y chromosomes of mammals, and have evolved from a different autosomal chromosome pair [27].

Characteristic features of a typical avian karyotype include a large number of chromosomes and substantial size variation between chromosomes [26]. Macro- and microchromosomes are also found in reptiles and some fish [28-30]. The macrochromosomes are relatively few, and about the same size as an average mammalian chromosome (i.e. ~140 Mb). Microchromosomes vary in size from 2 to 15 Mb [4]. Mammalian chromosomes also show size variation, but the range is small compared with that seen in birds. There are interesting structural differences between macro- and microchromosomes; for example, GC content and gene density are higher on the microchromosomes, whilst introns are shorter [4]. In addition, repetitive sequence elements are generally less common on microchromosomes [4, 31].
The haploid size of the chicken genome is approximately 1 x 10^9 base pairs (bp) [4], compared with 3 x 10^9 bp in the human [32, 33] and 2.7 x 10^9 bp in the mouse [34]. The smaller genome size of avian species is mainly due to a reduction of repetitive sequences and, in general, less non-coding sequence [4]. The number of genes in the avian genome is similar to that in the human genome (about 20,000 genes) [4, 35]. The streamlined genomes of avian species have been suggested to have evolved as an adaptation to the high metabolic demand associated with flight [36, 37]. Consistent with this hypothesis, flightless birds have larger genomes than flying birds, and bats have smaller genomes than other mammals [36, 38]. The possibility remains, however, that the small genome size of birds was inherited from a flightless dinosaur ancestor and so merely reflects the neutral evolutionary dynamics of selfish repetitive DNA sequences [39]. Testing this hypothesis would require a comparative analysis of the genomes of the extinct predecessors of avian species, although to date such an analysis has not been possible. In a recent study by Organ and colleagues, the absence of data from extinct species was overcome by exploiting the positive correlation between genome size and cell volume [40]. First, the authors showed that bone-cell size correlates well with genome size in extant vertebrates, and then used this relationship to estimate genome size from the bone-cell cavities found in fossilized bone. They employed this method to estimate genome sizes for 31 extinct species of dinosaurs together with extinct avian species. The results of their analysis suggested that a small genome size evolved long before the appearance of the first birds.

Mendelian genetics

The basic principles of inheritance were first established by Gregor Mendel (1822-1884). Mendel studied different strains of garden peas and, after a series of experiments conducted during 1856-1863, formulated the fundamental rules concerning the transmission of hereditary characters from parent to offspring and the inheritance of those characters in subsequent generations [41]. Mendel focused his studies on phenotypic traits that are qualitative and could easily be classified into two categories, such as the color and morphology of seeds (i.e. green/yellow, round/wrinkled, etc.). This approach enabled Mendel to elucidate the basic principles of inheritance and formulate what we now know as the Mendelian laws of inheritance. However, the ideas of Mendel were not appreciated by the scientific community until the beginning of the 20th century, when his work was rediscovered by de Vries, Correns and von Tschermak [42]. The domestic chicken played a prominent role in the early days of genetic research, as it was the first animal in which the Mendelian principles were confirmed [43].

Genes are the basic units that determine phenotypic characteristics and are transmitted from parent to offspring. For example, the peas that Mendel used in his studies segregated for genetic variants that determine the color of the seeds. In diploid species, two copies of a gene are present in all cells of an adult organism, with the exception of the haploid germ cells (gametes), which carry only one copy of each gene. A gene can exist in different forms owing to variation in its DNA sequence, and the alternative forms of a gene are called alleles. The total number of alleles of a certain gene within a population is not restricted to two, and multiple alleles are common. The particular combination of the two alleles constitutes the genotype of an individual, while the observed characteristics caused by the genotype are denoted as the phenotype of an organism. Each gene has a certain position on the chromosome that is designated as a locus. An individual that carries two identical alleles at a locus is said to be homozygous, while an individual that carries two different alleles is referred to as heterozygous.

Mendel's first law, the law of segregation, states that phenotypic characters are controlled by pairs of genes (or alleles) that segregate during the formation of the gametes. Consider gene A, which has two alternative alleles (‘A’ and ‘a’). An individual with genotype ‘AA’ will only produce one type of gamete, carrying the ‘A’ allele, whereas an individual that is heterozygous ‘Aa’ can transmit either the ‘A’ or the ‘a’ allele. In the latter case the two kinds of gametes are normally produced in equal proportions. If we cross an ‘AA’ individual with an ‘Aa’ individual, the offspring will have either genotype ‘AA’ or ‘Aa’, in the proportions 1:1. Figure 1 describes the segregation of alleles at gene A in a mating between two individuals homozygous for different alleles. The genotype of an individual is determined by the two alleles at the locus; however, the phenotypic expression of the trait controlled by the gene depends on the nature of the alleles. For instance, if the phenotype of the heterozygote is identical to that of one of the homozygotes, the allele causing the phenotype is called dominant and the allele that is concealed is called recessive.

Figure 1. Segregation pattern of alleles ‘A’ and ‘a’ in a mating between two homozygous parental individuals and a subsequent mating between F1 individuals to produce an F2 generation. The segregation of alleles in the F1 parents is visualized by a Punnett square, with the expected genotype ratios shown underneath.
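The segregation described above can be reproduced with a minimal sketch that enumerates a Punnett square for a single locus. It assumes, as in the text, that each parent transmits either of its two alleles with equal probability; the function name `cross` is a hypothetical helper introduced only for illustration.

```python
from collections import Counter
from itertools import product


def cross(parent1: str, parent2: str) -> Counter:
    """Enumerate the Punnett square for one locus.

    Each parent is a two-letter genotype such as 'Aa'. Every cell of the
    square (one allele from each parent) is counted once, which assumes
    that both alleles of a parent are transmitted with equal probability.
    """
    offspring = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sort the pair so that 'aA' and 'Aa' count as the same genotype.
        genotype = "".join(sorted(allele1 + allele2))
        offspring[genotype] += 1
    return offspring


f1 = cross("AA", "aa")         # every F1 offspring is 'Aa'
f2 = cross("Aa", "Aa")         # F1 x F1 gives 1 AA : 2 Aa : 1 aa
backcross = cross("AA", "Aa")  # gives AA and Aa in proportions 1:1
```

The counts are cells of the Punnett square, so only their ratios are meaningful, not their absolute values.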

Mendel's second law, the law of independent assortment, states that the alleles of two or more genes segregate independently of each other. For simplicity, assume that we have two genes, A and B, each with two alleles (‘A’/‘a’ and ‘B’/‘b’). Under independent segregation, the two genes should form nine possible genotype combinations in total. There are, however, situations where genes do not show independent assortment and alleles tend to be inherited together. In general, genes located on different chromosomes show independent assortment, whereas genes that are sufficiently close to each other on the same chromosome exhibit genetic linkage. Co-segregation, or genetic linkage, between pairs of loci was not observed by Mendel but was later discovered by Correns, Bateson and Punnett [44].
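The count of nine genotype combinations can be checked with a short sketch. It assumes a cross between two double heterozygotes (AaBb x AaBb), so that each locus segregates 1:2:1, and it assumes independent assortment, so that two-locus probabilities are simple products. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def two_locus_offspring() -> dict:
    """Genotype probabilities among offspring of AaBb x AaBb.

    Assumes independent assortment, so the probability of a two-locus
    genotype is the product of the two single-locus probabilities.
    """
    locus_a = {"AA": Fraction(1, 4), "Aa": Fraction(1, 2), "aa": Fraction(1, 4)}
    locus_b = {"BB": Fraction(1, 4), "Bb": Fraction(1, 2), "bb": Fraction(1, 4)}
    return {
        geno_a + geno_b: prob_a * prob_b
        for (geno_a, prob_a), (geno_b, prob_b) in product(
            locus_a.items(), locus_b.items()
        )
    }


genotypes = two_locus_offspring()
n_genotypes = len(genotypes)        # nine two-locus genotypes
double_het = genotypes["AaBb"]      # Fraction(1, 4)
```

Under linkage these products no longer hold, which is the deviation that linkage analysis tests for.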

Linkage analysis and construction of linkage maps

This section summarizes the requirements of genetic linkage analysis, gives its theoretical background and describes the methods used to construct linkage maps. An in-depth theoretical discussion of the topic is given by Ott [45]. Linkage maps display the linear order of loci and the relative distances between them.

Genetic markers

In the previous section it was shown how Mendel used morphological variation in peas as genetic markers to study the principles of inheritance. Morphological markers are generally easy to observe; however, they are limited in number and therefore of restricted use. Linkage mapping was seriously handicapped by the lack of markers until 1980, when a seminal paper by Botstein and colleagues described a method to score DNA polymorphisms as genetic markers, and thereby opened up the possibility of constructing linkage maps in humans [46]. The use of anonymous DNA polymorphisms as genetic markers has since enabled the construction of comprehensive linkage maps that cover the entire genome of numerous organisms, including humans [47], domestic animals (dog [48], chicken [49], pig [50]) and model organisms (mouse [51], rat [52]). Genotyping refers to the process by which DNA is analyzed to determine which genetic variant (i.e. allele) is present at a certain marker.

A DNA marker can in essence be any type of variation found in the DNA sequence and can be located within a gene or in any other part of the genome. The ideal genetic marker for use in a mapping experiment should be easily scored, display a high level of polymorphism and preferably be co-dominant (i.e. all three genotypes can be scored: AA, Aa, aa). Two commonly used marker types are microsatellites (tandem repeats) and single nucleotide polymorphisms (SNPs).

Microsatellites, or Simple Sequence Repeats (SSRs), are defined as stretches of short sequence motifs (1-6 bp) that are tandemly repeated, for instance (CA)n. This type of repetitive DNA sequence is highly abundant and approximately randomly scattered throughout the genome. Owing to their higher mutation rate compared with SNPs, microsatellites tend to be highly variable, with multiple allele size variants usually found within a population [53]. A SNP represents a single base position in the genome at which the DNA sequence varies between individuals (e.g. tgcAtag versus tgcTtag). Today, SNPs are the favored marker type in genetic studies owing to their high abundance in the genome and the cost-effective way in which they can be genotyped. The most prominent SNP genotyping platforms are the Illumina BeadArray technology (Illumina Inc., San Diego, CA, USA) and the Affymetrix SNP array technology (Affymetrix Inc., Santa Clara, CA, USA). Both methods enable rapid genotyping of several thousand, and even up to one million, SNP markers in a single reaction.
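The two marker types can be illustrated with a toy sketch. It is not a genotyping method: it only compares two already aligned sequences of equal length to locate a SNP, and counts tandem copies of a motif to describe a microsatellite allele. Both function names and the microsatellite test sequence are hypothetical; the SNP example uses the two sequences quoted above.

```python
import re


def snp_positions(seq1: str, seq2: str) -> list:
    """Return 0-based positions at which two aligned sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        i
        for i, (base1, base2) in enumerate(zip(seq1.upper(), seq2.upper()))
        if base1 != base2
    ]


def longest_tandem_repeat(seq: str, motif: str = "CA") -> int:
    """Return the largest number of consecutive copies of `motif` in `seq`."""
    runs = re.findall(f"(?:{re.escape(motif.upper())})+", seq.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


snp = snp_positions("tgcAtag", "tgcTtag")          # [3]
repeat_units = longest_tandem_repeat("ttCACACACAgg")  # 4 copies of CA
```

Two microsatellite alleles would differ in the repeat count returned by the second function, whereas two SNP alleles differ in the base found at the reported position.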

Principles of linkage analysis by means of meiotic recombination

When ordinary somatic cells divide, they undergo a process named mitosis, in which two daughter cells with identical genetic material are produced. During the production of gametes, however, cells divide by a different process called meiosis, which produces four haploid gametes, each carrying a random set of one copy of each chromosome. In the course of meiosis, homologous chromosomes pair, form chiasmata and exchange genetic material in the process known as recombination or crossing-over (Figure 2).

Figure 2. Schematic drawing of the process of crossing-over between loci A and B. (A) Two homologous parental chromosomes, one that carries alleles ‘A’ and ‘B’, and another that carries alleles ‘a’ and ‘b’. (B) The chromosomes go through replication, generating two pairs of identical sister chromatids. (C-D) Cross-over occurs between two of the chromatids, resulting in two recombinant (R) and two non-recombinant (NR) chromatids (figure adapted from Wu et al. [54]).

The purpose of linkage analysis is to determine whether or not pairs of genetic markers show independent assortment. To illustrate genetic linkage, let us consider the formation of gametes from an individual that is a double heterozygote (Aa/Bb) for two loci (Table 1). If the two loci are located on different chromosomes, or far apart on the same chromosome, they will segregate independently during meiosis, and there is thus a 0.5 probability that a given pair of alleles at the two loci will be inherited together. The independent segregation of alleles at the two loci leads to the formation of four different gametes in equal proportions (i.e. AB, Ab, aB and ab). In contrast, if two loci are located close together on the same chromosome they tend to show genetic linkage, and the four gametes will not be produced in equal numbers. In this case we have to consider that two alternative linkage phases exist: allele A is coupled either to B or to b. In Table 1, we see that the majority of gametes correspond to either ‘AB’ or ‘ab’. Those represent the parental or non-recombinant (NR) chromosomes.

Table 1. Segregation of alleles in a double heterozygote individual. Independent segregation yields equal proportions (25%) of each possible gamete, whereas link- age causes a distortion of the proportions of the gametes and they are not formed in equal numbers. Recombinant chromosomes are denoted as R and non-recombinants as NR.
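The gamete proportions summarized in Table 1 follow directly from the recombination fraction. The sketch below assumes a double heterozygote in coupling phase (AB on one chromosome, ab on the other) and a recombination fraction `theta` between 0 and 0.5; the function name and the example value of 0.1 are illustrative assumptions, not values from the table.

```python
def gamete_frequencies(theta: float) -> dict:
    """Expected gamete proportions from an AB/ab double heterozygote.

    theta is the recombination fraction between the two loci:
    theta = 0.5 corresponds to independent segregation, and smaller
    values to increasingly tight linkage.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    non_recombinant = (1.0 - theta) / 2.0  # each of the two parental types
    recombinant = theta / 2.0              # each of the two recombinant types
    return {
        "AB": non_recombinant,
        "ab": non_recombinant,
        "Ab": recombinant,
        "aB": recombinant,
    }


independent = gamete_frequencies(0.5)  # 25% of each gamete
linked = gamete_frequencies(0.1)       # parental types dominate
```

In repulsion phase (Ab/aB) the same calculation applies with the roles of the parental and recombinant gametes exchanged.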

In practice, recombination between homologous chromosomes means that the physical linkage between marker alleles on the same chromosome will occasionally be broken up, so that the linkage between two alleles is incomplete. The more distant two markers are on a chromosome, the more likely it is that a recombination event will break up the linkage between them.

The genetic distance between a pair of genetic markers can be estimated by scoring recombinant and non-recombinant gametes transmitted from parents to offspring, and then dividing the observed number of recombinants by the total number of informative meioses. A requirement for the estimation of the recombination fraction is that recombinant and non-recombinant chromosomes can be distinguished. The genetic markers used for linkage experiments must therefore be polymorphic and heterozygous in the parental individuals of the linkage pedigree. Two markers are considered linked when the observed proportion of recombinants is significantly less than 0.5. To test whether the observed recombination fraction (θ) deviates significantly from the null hypothesis of independent assortment (θ = 0.5), a LOD score is calculated. The LOD score is the base-10 logarithm of the odds ratio obtained by dividing the likelihood of the alternative hypothesis (linkage between a pair of markers) by the likelihood of the null hypothesis (independent assortment, θ = 0.5) [55]. The threshold for rejecting the null hypothesis is usually a LOD score of 3 or more, whereas a LOD score of less than -2 is taken as evidence against linkage at a given recombination fraction. A LOD score of 3 corresponds to odds of 1000:1 in favor of linkage.

The two-point linkage test described above provides a method for inferring linkage between pairs of markers and is also used to estimate the recombination fraction between markers showing linkage. It does not provide the linear order of several markers, so two-point analysis can only be used to organize markers into linkage groups, i.e. groups of markers that each show linkage to one or several other markers within the same group.

The next step is to determine the most likely order of markers within linkage groups. The inherent problem with constructing multi-point linkage maps is the multitude of potential marker orders. For example, five markers can be arranged in a total of 120 possible ways. In reality the number of distinct orders is only 60, as half of the arrangements are the same but in reversed order (e.g. ABCDE and EDCBA). To cope with the large number of possible orders, many methods for map construction apply a step-by-step approach, starting with a small number of markers and then gradually adding markers one by one to the map. By calculating the maximum likelihood (ML) of each candidate order as the map is built, it is possible to find the most likely marker order. Computer algorithms are necessary to handle the numerous and complex calculations associated with linkage analysis in large pedigrees containing hundreds of markers. The CRI-MAP software [56, 57] calculates the ML score for a given marker order and compares the likelihood scores for different orders. This software was used in Papers II, III and IV to evaluate marker orders and to compute map distances between loci.

Whilst the recombination fraction (θ) reports the observed frequency of recombination between two loci, it cannot be related directly to the distance separating them.
Discrepancies arise because double recombinants may occur if the distance between two loci is large, and an even number of recombination events between the loci creates the appearance of a non-recombinant chromosome. Except for very closely linked loci, there is no linear relationship between the estimated recombination fraction and the actual map distance between two loci. With a marker-dense linkage map, however, the problem of double recombinants is limited, as almost every recombination event can be observed. If markers are sparse, the observed recombination fraction can be transformed using models called map functions to compensate for multiple crossing-over events. Such models were first developed by Haldane [58] and later modified by Kosambi [59]. Distances between loci on a marker map are expressed in units of centiMorgan (cM), which are additive between loci and reflect the expected number of recombination events between loci. The total length of a linkage map represents the average number of recombination events per meiotic cell.
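The quantities used in this section can be sketched in a few lines. The LOD function below assumes the simplest case of phase-known meioses that can each be scored as recombinant or non-recombinant, which is a simplification of what pedigree software such as CRI-MAP computes. The map functions are the standard closed forms attributed to Haldane (no interference) and Kosambi (partial interference). All function names are hypothetical.

```python
import math


def _count_times_log10(count: int, prob: float) -> float:
    """count * log10(prob), using the convention 0 * log10(0) = 0."""
    if count == 0:
        return 0.0
    if prob <= 0.0:
        return float("-inf")
    return count * math.log10(prob)


def lod_score(recombinants: int, non_recombinants: int, theta: float) -> float:
    """Two-point LOD score at recombination fraction theta.

    log10 of L(theta) / L(0.5) for phase-known, fully informative meioses.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    total = recombinants + non_recombinants
    log_l_linked = _count_times_log10(recombinants, theta) + _count_times_log10(
        non_recombinants, 1.0 - theta
    )
    log_l_null = total * math.log10(0.5)
    return log_l_linked - log_l_null


def distinct_marker_orders(n_markers: int) -> int:
    """Number of marker orders when an order and its reverse are the same."""
    if n_markers < 2:
        return 1
    return math.factorial(n_markers) // 2


def haldane_cm(theta: float) -> float:
    """Haldane map distance in cM (assumes no crossover interference)."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must satisfy 0 <= theta < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * theta)


def kosambi_cm(theta: float) -> float:
    """Kosambi map distance in cM (allows for partial interference)."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must satisfy 0 <= theta < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))


# Hypothetical data: 2 recombinants among 20 informative meioses.
theta_hat = 2 / 20
lod = lod_score(2, 18, theta_hat)
n_orders = distinct_marker_orders(5)  # 60, as in the text
```

With these hypothetical counts the LOD score is just above the conventional threshold of 3. For the same recombination fraction, both map functions return a distance larger than 100 x θ cM, which is the correction for unobserved double recombinants described above.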

As discussed above, recombination between the maternal and paternal chromosomes occurs during meiosis, when homologous chromosomes pair and genetic material is exchanged. At the genomic level, the exchange of genetic material between chromosomes generates diversity by shuffling allelic variants, and hence results in novel haplotype combinations that are transmitted to the next generation. Recombination therefore has considerable influence on genome diversity and on evolution. At the molecular level, the process of recombination generates a temporary connection between chromosomes (a chiasma). At least in mammals and chicken, a minimum of one chiasma appears to be necessary to ensure the proper segregation of each chromosome pair during meiosis [60]. Failure of proper segregation can lead to aneuploidy, which is highly deleterious and is the major genetic cause of miscarriage in humans [61], as well as of genetic disorders such as Down's syndrome, caused by an additional copy of chromosome 21 [61]. There are species in which recombination is absent in one sex. For example, in Drosophila melanogaster, males go through meiosis without recombination, and thus other mechanisms have to ensure proper segregation [62]. The molecular process of segregation has been reviewed in great detail by Petronczki and colleagues [63]. Although recombination appears to be tightly regulated, there is variation between individuals, between sexes and between species [60]. Moreover, recombination events appear to be non-randomly distributed along the length of chromosomes. Regions that recombine more frequently are referred to as ‘hot-spots’ and are surrounded by areas where recombination is suppressed (‘cold-spots’).

Several approaches have been used to study the frequency and the chromosomal distribution of meiotic cross-over events. Historically, recombination was studied directly, by observing chiasmata in cells going through meiotic division using cytogenetic methods. The cytogenetic approach can be used to examine the total number of cross-over events per cell as well as the number of events per chromosome. Whilst this method can reveal general patterns, it fails to provide a detailed view of the distribution of recombination events. Indirectly, recombination can be monitored by constructing genetic maps to estimate the frequency of recombination between pairs of loci. The resolution of a linkage map is dependent on the marker density and the number of observed meioses. Fine-scale patterns of recombination can be studied by observing the degree of linkage disequilibrium (LD) between marker loci in natural populations [64]. LD measurements can be used to infer the historical level of recombination in a population; however, because LD is a population-based parameter, it fails to provide information about the variation among individuals and between sexes. Genomic regions that have historically experienced high levels of recombination show lower levels of LD, whereas low levels of recombination result in higher LD. The clear advantage of the LD approach is that we are indirectly observing a vast number of recombination events that occurred over many generations, and therefore a higher resolution can be achieved. On the other hand, the pattern of LD in a population is influenced by events other than recombination, such as natural selection, migration and genetic drift [65]. Thus, LD patterns should be interpreted with caution.
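The text does not specify how the degree of LD is measured. As an illustration only, the sketch below computes two commonly used pairwise statistics, the disequilibrium coefficient D and the squared correlation r^2, from the frequencies of the four haplotypes at two biallelic loci. The function name and the example frequencies are assumptions made for this sketch.

```python
def ld_statistics(hap_freq: dict) -> tuple:
    """Return (D, r_squared) for two biallelic loci.

    hap_freq maps the haplotypes 'AB', 'Ab', 'aB' and 'ab' to their
    population frequencies; missing haplotypes are treated as frequency 0.
    """
    freq_ab_hap = hap_freq.get("AB", 0.0)
    p_a = freq_ab_hap + hap_freq.get("Ab", 0.0)  # frequency of allele A
    p_b = freq_ab_hap + hap_freq.get("aB", 0.0)  # frequency of allele B
    # D is the deviation of the AB haplotype frequency from the product
    # of allele frequencies expected under random association.
    d = freq_ab_hap - p_a * p_b
    denominator = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    r_squared = d * d / denominator if denominator > 0 else 0.0
    return d, r_squared


# Complete LD: only the two parental haplotypes exist in the population.
complete = ld_statistics({"AB": 0.5, "ab": 0.5})
# Linkage equilibrium: all four haplotypes are equally frequent.
equilibrium = ld_statistics({"AB": 0.25, "Ab": 0.25, "aB": 0.25, "ab": 0.25})
```

Historical recombination between the two loci moves a population from the first situation towards the second, which is why low LD is read as a signature of frequent recombination.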

Quantitative genetics

Up to this point, we have discussed phenotypic traits that are largely dependent on the genotype at a single locus and segregate according to Mendelian principles. However, most phenotypic traits have a quantitative or complex genetic background. Body-weight and height are typical examples of quantitative traits, because they do not form discrete categories but have a continuous, often normal, distribution within a population [66]. There is no simple relationship between genotype and phenotype for a quantitative or complex trait. This is in contrast to a Mendelian trait, where a one-to-one relationship between genotype and phenotype exists. As a consequence, the segregation of individual loci affecting quantitative traits cannot be observed as easily as for Mendelian traits. It follows that methods used to analyze simple segregation patterns cannot be applied to quantitative traits, even though these are inherited according to the same genetic principles as qualitative traits.

A locus affecting a quantitative trait is termed a quantitative trait locus (QTL). The variation observed for a quantitative trait is usually the result of the simultaneous segregation of multiple QTL, each of which contributes a small fraction of the total trait variation. In addition, non-genetic factors, usually termed ‘environmental effects’, may have a large influence on the phenotype. The difference between a locus affecting a quantitative trait and a locus determining a Mendelian character lies merely in the effect of the gene relative to the other genetic and environmental components.

The importance of quantitative genetics is obvious for plant and animal breeders, as most traits of economic value are quantitative. Much of the theoretical basis concerning the inheritance of quantitative traits was established by the work of Fisher, Haldane and Wright and has since been further developed by others (reviewed by Lynch and Walsh [67], Falconer and Mackay [66] and Roff [68]).
The phenotypic variance of quan- titative traits, are as we discussed above, affected by genetic as well as envi- ronmental factors and can be partitioned and formulated as follows:

V  V  V P G E    VG VA VD VI where we have first partitioned the phenotypic variance (VP) into compo- nents of genetic (VG) and environmental variance (VE). The genetic variance

17 can be further subdivided into parts that result from the additive effects of alleles (VA), the effect from the interactions of alleles within locus (VD), and the effects of interactions between alleles at different loci 2 (VI). The ratio of VG to VP gives the broad sense heritability (H ), and de- scribes to what degree the phenotypic variance is genetically determined. Heritability estimates can range from 0.0, when all of the phenotypic varia- tion is caused solely by environmental factors, to 1.0 when all the variance is due to genetic factors. The phenotypic effect caused by dominance or inter- action, will not be transmitted to the next generation because the effect of these components is determine by the genotypic combination in the progeny. The only genetic component that contributes to the resemblances between relatives is the additive component, as the effect of alleles is not dependent on interactions with other alleles, or other alleles at different loci. The nar- row sense heritability (h2) is the ratio of the additive variance to the total phenotypic variation (VA/ VP), and is used to quantify the degree to which the phenotype will be passed on to the offspring. The value of h2 is an impor- tant measurement for breeders as it can be used to predict the selection re- sponse for a trait in a population. Estimates of heritability (H2 and h2) are only applicable to the specific population and generation studied, and cannot be used to draw conclusions about the heritability of the same trait in another population. The reason being that genetic variance is dependent on QTL allele frequencies and environmental factors, and both of these factors may differ between populations. There exist numerous estimates of heritability (h2) both in experimental and livestock species. In general, morphological traits have higher heritabilities, whereas life history traits and behavior traits have lower heritabilities [68].
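The variance partition and the two heritability ratios described above can be illustrated with a small numerical sketch. The variance components below are invented for illustration and are not estimates from any chicken population. The function names are hypothetical, and the selection-response formula is the standard breeder's equation (response = h2 x selection differential), which is textbook material rather than something stated in this thesis.

```python
def heritabilities(v_a, v_d, v_i, v_e):
    """Return (H2, h2) from additive, dominance, epistatic and environmental variance."""
    v_g = v_a + v_d + v_i          # total genetic variance, V_G = V_A + V_D + V_I
    v_p = v_g + v_e                # phenotypic variance, V_P = V_G + V_E
    if v_p <= 0:
        raise ValueError("phenotypic variance must be positive")
    broad = v_g / v_p              # broad sense heritability, H2 = V_G / V_P
    narrow = v_a / v_p             # narrow sense heritability, h2 = V_A / V_P
    return broad, narrow


def selection_response(h2, selection_differential):
    """Breeder's equation (assumed here, not taken from the thesis): R = h2 * S."""
    return h2 * selection_differential


# Illustrative (made-up) variance components in arbitrary units.
H2, h2 = heritabilities(v_a=40.0, v_d=10.0, v_i=5.0, v_e=45.0)
R = selection_response(h2, selection_differential=100.0)
print(f"H2 = {H2:.2f}, h2 = {h2:.2f}, predicted response = {R:.1f}")
```

In this toy case more than half of the phenotypic variance is genetic, but only the additive part enters the predicted response, which is why breeders focus on the narrow sense heritability.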

QTL mapping

The development of molecular genetic techniques, which enable vast marker sets to be genotyped in hundreds or even thousands of individuals, allows the genetic variance of a quantitative trait to be dissected into the contributions from individual QTL affecting that trait. The original paradigm of QTL mapping can be traced back to the beginning of the 20th century [69]. At a basic level, genetic markers dispersed throughout the genome are genotyped in individuals from a population that has also been characterized at the phenotypic level. If a genetic marker is closely linked to a QTL, marker alleles and QTL alleles may be in linkage disequilibrium (LD) within the population, and an association between marker genotype and trait variation can be detected if the experiment has sufficient power. The requirements for identifying QTL are thus a genome-wide set of polymorphic markers and a ‘mapping’ population that shows heritable variation for a quantitative trait. The association between markers and QTL can be studied in a large random set of unrelated individuals, in existing pedigree materials or in experimental crosses [70].

The resolution of a QTL mapping experiment is proportional to the number of observed meioses, which determines the number of recombination events that can occur between marker and QTL and thereby reduce LD. In studies using pedigree material, the number of meiotic events is equal to, and restricted by, the number of individuals included in the pedigree. Association studies using unrelated samples have greater resolution than pedigree studies, because many generations of recombination events have reduced the LD between marker and QTL. As a consequence, detecting LD between marker and QTL in unrelated populations requires marker densities several orders of magnitude higher than those needed in pedigree studies. The number of association studies performed in humans has greatly increased since the HapMap project was completed and genome-wide LD patterns between millions of markers were reported [64].

By means of QTL mapping we can address questions relating to quantitative genetics, such as: ‘How many loci affect a trait?’, ‘Are there loci with large effect or is the variation caused by many loci each with a small individual effect?’ and ‘Which gene and mutation underlie the QTL effect?’. The following text provides a brief introduction to the concepts required to map QTL in experimental crosses. These theories have been described in detail by Lynch and Walsh [67]. Studies in unrelated individuals, such as association studies in humans, have been reviewed elsewhere and will not be considered further [71].

The first step in any QTL mapping experiment is to establish a mapping population in which QTL affecting a trait are segregating. In order to maximize the chance of identifying QTL, it is desirable to produce a cross between two phenotypically divergent breeds or strains.
If there is a large average trait difference between the parental lines when raised in a common environment, the observed phenotypic difference is likely to have a genetic basis. In this scenario, the only QTL that can be mapped in the cross are those that show allelic differences between the parental lines. Numerous QTL studies have used crosses between phenotypically divergent inbred lines of model organisms (Drosophila [72], mouse [73]). Crosses between outbred animals have also been commonly used (farm animals [3, 74], mouse [75], zebrafish [76]). Papers I and IV included in this thesis utilized a cross between two chicken lines divergently selected for body-weight to map QTL.

Several types of breeding schemes can be used to produce a QTL mapping population, for example recombinant inbred lines, advanced intercross lines, backcrosses and F2 intercrosses, with the latter two being the most commonly used. In a backcross, two parental lines are mated to produce an F1 generation that is crossed back to one or both of the parental strains to produce the mapping population. An F2 mapping population is formed by intercrossing F1 individuals. The advantage of using an F2 intercross over the backcross is that three QTL genotype classes (i.e. QQ, Qq and qq, where Q and q represent alternative alleles derived from the respective parental lines) will be present, whereas the backcross only generates two genotype classes for any given locus. Hence, in the F2 design it is possible to estimate both the additive effect and the degree of dominance interaction between alleles at a QTL. In a backcross design the additive and dominance effects are confounded, which decreases the power to detect dominant QTL. Furthermore, the F2 design is more suitable for detecting interaction between loci (epistasis).

A backcross or F2 intercross between inbred lines is a powerful tool for mapping QTL, as both QTL and marker alleles are fixed for different variants in the parental lines. The resulting F1 offspring from line crosses are heterozygous for marker and QTL alleles and have a known linkage phase. Analyzing crosses between divergent outbred lines is complicated by the fact that the degree of homozygosity at QTL and marker loci is unknown. In these cases it is often assumed during the analysis that QTL are fixed for alternative alleles in the outbred parental lines. This is a reasonable assumption if the parental lines have been selected for different purposes over multiple generations [77]. The assumption may not hold in all cases, and violations of it will reduce the power to detect QTL [78].

In the case of an inbred line cross, a simple way of detecting association between a marker and a QTL is to first categorize individuals into QTL genotype classes based on their marker genotypes, and then apply a statistical test to determine whether there is a significant difference between the mean trait values of the marker classes. If the difference is significant, a QTL is located in the vicinity of the genetic marker. The additive (a) and dominance (d) effects of a QTL can be defined as follows:

$$a = 0.5(\mu_1 - \mu_3)$$

$$d = 0.5(2\mu_2 - \mu_1 - \mu_3)$$

where $\mu_1$ represents the mean trait value of QTL genotype class QQ, $\mu_2$ is the mean of Qq and $\mu_3$ is the mean of qq [44].

A drawback of the single-marker approach is its inability to distinguish whether the marker associated with the phenotypic effect is tightly linked to a QTL with a small effect or loosely linked to a QTL with a major effect. A more efficient and accurate approach that overcomes the disadvantages of single-marker analysis is interval mapping. This technique was first described by Lander and Botstein for inbred line crosses [79] and later modified by Haley et al. [78, 80] for analyzing outbred line crosses. Interval mapping uses a genetic linkage map as a framework to locate QTL. The position and genetic effect of a QTL can be modeled using maximum likelihood (ML), as originally proposed by Lander and Botstein, or with regression-based methods developed by Haley and co-workers.

In the regression-based approach, QTL genotype (i.e. QQ, Qq or qq) probabilities are calculated at defined genomic locations, conditional on the marker genotypes, for each individual in the pedigree. Probabilities are usually calculated on a 1 cM grid throughout the genetic map. The conditional QTL genotype probabilities are then used to construct indicator variables for the additive and dominance effects of a putative QTL at each predefined location, and the phenotype is regressed onto these indicator variables to estimate the QTL effects at that location. An F-ratio is calculated at each position by testing the null hypothesis of no QTL against the alternative hypothesis of a QTL at that position. To visualize the results from a genome scan in which all locations in the grid are tested for QTL effects, the F-values from the regression analysis are plotted against the corresponding positions on the linkage map. The most probable QTL location is the position in the genome with the highest F-value.

Within the framework of regression-based interval mapping, the model can be extended to include parameters for estimating interaction between alleles at different positions [80]. Epistatic QTL analysis can reveal novel loci that mainly affect the phenotype through interaction with other loci and hence would not be detected in a QTL analysis that only models the effects of individual loci [81]. In addition, an epistatic model also exposes pair-wise interactions between loci that each have a marginal effect, but where the joint effect is greater than the sum of the individual effects of the two loci [82, 83].

QTL mapping involves a large number of statistical tests, as we test for the presence of QTL throughout the entire linkage map. The significance threshold therefore needs to be adjusted for multiple testing in order to control the number of false positives (type I errors).
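The regression step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the thesis (the analyses there were run in QTLExpress). It assumes that the QTL genotype probabilities at each grid position have already been computed from the marker data, and it uses the common coding in which the additive regressor is P(QQ) - P(qq) and the dominance regressor is P(Qq). The function names and the toy phenotypes are hypothetical.

```python
import numpy as np


def qtl_regressors(p_QQ, p_Qq, p_qq):
    """Additive and dominance regressors from conditional QTL genotype probabilities."""
    p_QQ, p_Qq, p_qq = (np.asarray(p, dtype=float) for p in (p_QQ, p_Qq, p_qq))
    return p_QQ - p_qq, p_Qq


def f_ratio(y, x_add, x_dom):
    """F-ratio for 'QTL with additive and dominance effect' versus 'no QTL'."""
    y = np.asarray(y, dtype=float)
    n = y.size
    X = np.column_stack([np.ones(n), x_add, x_dom])      # mean, a, d
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss_full = float(np.sum((y - X @ beta) ** 2))         # residual SS with QTL
    rss_null = float(np.sum((y - y.mean()) ** 2))         # residual SS without QTL
    F = ((rss_null - rss_full) / 2.0) / (rss_full / (n - 3))
    return F, float(beta[1]), float(beta[2])


def scan(y, positions):
    """Return the F-ratio at every grid position and the index of the maximum."""
    f_values = [f_ratio(y, *qtl_regressors(*probs))[0] for probs in positions]
    return f_values, int(np.argmax(f_values))


# Toy data: six individuals, two grid positions with fully informative genotypes.
y = [12.0, 14.0, 10.0, 12.0, 6.0, 8.0]
pos0 = ([1, 1, 0, 0, 0, 0], [0, 0, 1, 1, 0, 0], [0, 0, 0, 0, 1, 1])
pos1 = ([1, 0, 0, 0, 1, 0], [0, 1, 0, 1, 0, 0], [0, 0, 1, 0, 0, 1])

F0, a_hat, d_hat = f_ratio(y, *qtl_regressors(*pos0))
f_values, best = scan(y, [pos0, pos1])
print(f"position 0: F = {F0:.2f}, a = {a_hat:.2f}, d = {d_hat:.2f}; best position = {best}")
```

With fully informative genotypes the fitted coefficients reduce to the class-mean definitions of a and d given earlier; with partially informative markers the same regression is simply run on fractional probabilities.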
Appropriate significance levels for a QTL mapping experiment involving experimental crosses can be determined by permutation testing, as described by Churchill and Doerge [84].

The mapping resolution in a typical F2 QTL experiment is low, and the genomic region statistically associated with a QTL usually spans 10-30 cM [77]. Hence, a QTL defined in a mapping experiment does not represent a single gene locus, but a chromosomal region containing multiple genes in which one or several mutations affecting the trait reside. The precision with which a QTL can be mapped to a genomic region depends to a large degree on the individual effect of the QTL and the scale of the experiment (i.e. the size of the mapping population), and to some extent on the marker density and local recombination rate in the QTL region. The number of QTL detected in a study is mainly a function of the size of the experimental population, and the detected QTL thus only represent a minimum of the total number of QTL contributing to the variance of a trait in a mapping population [68]. Increasing the experimental population size would enable detection of additional QTL with smaller effect sizes as well as separation of closely linked QTL.

Narrowing down a QTL region generally involves further intercrossing to increase the number of meiotic events, which breaks down the LD between marker and QTL. Whilst increasing the sample size of an F2 population will enhance QTL resolution, an alternative and more efficient method involves the use of an advanced intercross line (AIL). AILs are produced by repeated intercrossing of individuals from the F2 and subsequent generations [85]. The strategies used in high-resolution QTL mapping experiments depend to a large degree on the species investigated and are dictated by factors such as generation time and the genomic resources available for the species. It should be stated that high-resolution mapping attempts, especially in animals, are at present very time-consuming and expensive. Because of this, identification of the gene and mutation underlying QTL effects detected in mapping populations has thus far been relatively unsuccessful. A small number of studies have, however, been able to clone the gene and provide extensive evidence pointing to a genetic variant responsible for influencing a quantitative trait. For example, in cattle, a missense mutation in the ABCG2 gene has been reported as the causative mutation underlying a QTL affecting milk yield [86]. Additionally, in pigs, a SNP in a conserved sequence element that influences the level of IGF2 expression has been shown to be the causative mutation underlying a QTL influencing muscle growth and fat deposition [87].
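The permutation approach to genome-wide significance thresholds mentioned earlier in this section can be sketched as follows. The idea is to shuffle phenotypes relative to genotypes, repeat the whole genome scan on each shuffled data set, record the maximum test statistic, and take an upper quantile of those maxima as the threshold. The sketch below is a simplified illustration: genotypes are simulated independently per position (real markers on a linkage map are correlated), and all names and parameter values are hypothetical rather than taken from the thesis.

```python
import numpy as np


def f_ratio(y, X):
    """F-ratio comparing a regression on the columns of X against a mean-only model."""
    n = y.size
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss_full = np.sum((y - design @ beta) ** 2)
    rss_null = np.sum((y - y.mean()) ** 2)
    df_model = design.shape[1] - 1
    return ((rss_null - rss_full) / df_model) / (rss_full / (n - design.shape[1]))


def max_f(y, regressors):
    """Largest F-ratio over all positions in the scan."""
    return max(f_ratio(y, X) for X in regressors)


def permutation_threshold(y, regressors, n_perm=200, alpha=0.05, seed=0):
    """Genome-wide threshold: the (1 - alpha) quantile of the permuted maximum F."""
    rng = np.random.default_rng(seed)
    null_max = np.array([max_f(rng.permutation(y), regressors) for _ in range(n_perm)])
    return float(np.quantile(null_max, 1.0 - alpha))


# Toy data with no true QTL: 100 individuals, 20 independent positions.
rng = np.random.default_rng(1)
n_birds, n_positions = 100, 20
regressors = []
for _ in range(n_positions):
    geno = rng.integers(0, 3, size=n_birds)            # 0 = qq, 1 = Qq, 2 = QQ
    regressors.append(np.column_stack([geno - 1.0, (geno == 1).astype(float)]))
y = rng.normal(size=n_birds)

threshold_5pct = permutation_threshold(y, regressors, alpha=0.05)
threshold_50pct = permutation_threshold(y, regressors, alpha=0.50)
print(f"5% genome-wide threshold: F = {threshold_5pct:.2f}")
```

Because every permutation keeps each individual's genotypes together across all positions, the correlation structure of the scan is preserved, which is what makes the resulting threshold appropriate for the whole genome rather than for a single test. In practice a larger number of permutations than the 200 used here would normally be run.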

Aims of this thesis

The main objectives of this thesis have been to:

• Investigate the genetic background to the eight-fold difference in body-weight between two divergently selected chicken lines (Papers I and IV).

• Construct an updated consensus linkage map for chicken and study patterns of recombination throughout the chicken genome (Papers II and III).

Present investigations

Construction and analysis of genetic linkage maps in chicken (Papers II and III)

Background

Linkage maps have for a long time provided the basis for mapping monogenic traits and QTL, and can be used to study patterns of recombination [51]. Linkage maps can also serve as reference points during whole genome sequence assemblies. The first chicken linkage map was published in 1930 and has since been superseded by maps with higher marker density (reviewed by Romanov et al. [88]). The most comprehensive linkage map prior to this thesis included 1,889 molecular markers and formed 50 linkage groups [49].

Construction of high-density linkage maps has been possible since the release of a draft sequence of the chicken genome together with a map of 2.8 million SNPs [4, 16]. A sequence assembly with 6.6x coverage was generated from an inbred female red jungle fowl and released in February 2004. The use of a female bird meant that the W chromosome would be included in the genome sequence; however, this came at the cost of only about 3x sequence coverage for each of the Z and W chromosomes. The draft assembly was by no means perfect, with only 30 of the 38 chromosomes present in the assembly and numerous gaps along the length of the chromosomes. These gaps were particularly notable on the sex chromosomes, mainly due to the reduced sequence coverage. The assembly of the W chromosome was further hampered by the repetitive structure of that chromosome [26]. To improve the draft assembly, additional sequence reads were generated and combined with new SNP linkage data and radiation hybrid data, and a second assembly was made available in May 2006. The updated genome assembly improved the sequence coverage of the Z chromosome, but the W chromosome remained partial. The genome sequence is still incomplete, as eight microchromosomes are missing from the current assembly.

Paper II included in this thesis describes a dense SNP-based linkage map of the Z chromosome and further assigns three more genes to the gene-poor W chromosome. An updated high-resolution consensus linkage map, including 9,268 genetic markers, is presented in paper III, together with an evaluation of patterns of recombination in the chicken genome. The SNP markers used in both studies were randomly selected from the 2.8 million polymorphisms previously discovered when sequence reads from domestic chickens were compared to the reference genome sequence [16]. A proportion of the selected SNP markers were located on contigs that had not been assigned to a chromosome in the draft assembly.

Results and Discussion

Paper II (A high-resolution linkage map for the Z chromosome in chicken reveals hot spots of recombination)

Genotype data from about 9,000 SNP markers, analyzed in a set of 182 birds representing several different breeds, were used to identify SNP markers that displayed sex chromosome inheritance. Based on genotype distributions in males and females, 308 markers could be placed on the Z chromosome and 13 markers on the female-specific W chromosome. Z-linked loci were identified as those SNP loci where only males were heterozygous and all female genotypes were homozygous. W-linked loci were recognized as SNPs that gave genotype calls exclusively in females. The majority of the Z-linked SNPs identified in this study were subsequently assigned to this sex chromosome in the updated assembly (May 2006), with only 12 of the 308 Z-linked SNP markers still located on unassigned contigs. A linkage map for the Z chromosome was constructed using a population of 47 F2 birds that represent a subset of a large intercross population between the red jungle fowl and White Leghorn [13]. The map presented in paper II includes 208 markers that segregated in our pedigree and has a total length of 200.9 cM, corresponding to an average cross-over frequency of two events per meiosis.

The overall quality of the linkage map reported in paper II appears to be high. This conclusion was based on a comparison with the marker order of the May 2006 assembly. Of the 208 segregating markers, only a single microsatellite marker was out of agreement. This discrepancy could be due to genotyping errors giving rise to a false marker order or, alternatively, to a mistake in the assembly process. Importantly, the dense nature of the map would have allowed suspicious double recombinants to be easily identified in the dataset. None were observed.

Rates of recombination are often expressed as cM/Mb, where the physical distance is given in units of one million base pairs (Mb). Paper II reports an average recombination rate of 2.7 cM/Mb on the Z chromosome, based on a comparison of the total map length with the physical map (May 2006), an estimate that resembles previously published results for chicken macrochromosomes [4, 89]. However, the distribution of recombination events appears to be uneven along the Z chromosome (Figure 2, Paper II), with hot and cold spots of recombination. An increased recombination rate was also observed at the telomeric ends of the chromosome, a result that has been reported in other species [60].

The results from paper II show that the May 2006 assembly of the Z chromosome is both comprehensive and accurate. A precise sequence assembly enables positional cloning of Z-linked traits and could also shed light on sex determination, the mechanism of which is currently unknown in birds [90]. Further, complete sex chromosome sequences are vital for the study of avian sex chromosome evolution [91].
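The rule used above to recognize sex-linked SNPs (Z-linked: heterozygotes only among males; W-linked: genotype calls only in females) can be written down as a small classifier. This is a simplified sketch of that rule and not the procedure used in paper II, which was based on genotype distributions across 182 birds. The function name, the genotype encoding and the category labels are hypothetical, and with few birds an autosomal SNP could by chance satisfy the Z-linked pattern.

```python
from typing import Optional, Sequence


def classify_snp(genotypes: Sequence[Optional[str]], sexes: Sequence[str]) -> str:
    """Classify one SNP from per-bird genotypes ('AA', 'AB', 'BB' or None for no call)
    and sexes ('M' = ZZ male, 'F' = ZW female)."""
    male_calls = [g for g, s in zip(genotypes, sexes) if s == "M" and g is not None]
    female_calls = [g for g, s in zip(genotypes, sexes) if s == "F" and g is not None]

    # W-linked pattern: the assay only produces calls in females.
    if female_calls and not male_calls:
        return "W-linked candidate"

    def is_het(genotype: str) -> bool:
        return genotype[0] != genotype[1]

    male_het = any(is_het(g) for g in male_calls)
    female_het = any(is_het(g) for g in female_calls)

    # Z-linked pattern: females carry one Z, so they never appear heterozygous.
    if male_het and female_calls and not female_het:
        return "Z-linked candidate"
    return "autosomal or uninformative"


sexes = ["M", "M", "M", "F", "F", "F"]
z_like = classify_snp(["AB", "AA", "AB", "AA", "BB", "AA"], sexes)
w_like = classify_snp([None, None, None, "AA", "AA", "AA"], sexes)
autosomal = classify_snp(["AB", "AA", "BB", "AB", "AA", "BB"], sexes)
print(z_like, "|", w_like, "|", autosomal)
```

A production version would need to tolerate genotyping errors, for example by allowing a small fraction of apparent female heterozygotes, which is a design choice the sketch deliberately leaves out.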

Paper III (A high density SNP based linkage map of the chicken genome reveals sequence features correlated with recombination rate)

Paper III presents an updated consensus linkage map constructed by analyzing the genotype data obtained by screening 13,000 SNPs across three separate pedigrees. The Wageningen (WU) pedigree was derived from a cross between two commercial broiler lines, and 92 F2 individuals were used in this study [92]. Eighty-eight birds from the East Lansing (EL) pedigree, a backcross between a red jungle fowl line and an inbred White Leghorn line, were also included [93]. The third pedigree, from Uppsala University (UPP), is the F2 intercross between red jungle fowl and a White Leghorn line that was used to build the Z linkage map presented in paper II and described previously in the text. From the initial 13,000 SNPs, 8,599 segregating SNPs were combined with a subset of microsatellites (669) from the previous consensus map [49]. The new consensus map has marker coverage on chromosomes 1-28, chromosome Z and, additionally, five small linkage groups not assigned to a chromosome in the current genome assembly. The number of autosomal chromosomes currently covered should therefore be 33, if each of the five small linkage groups represents a separate chromosome. This leaves five microchromosomes that still lack marker coverage.

The length of the current sex-average map is about 3,200 cM, which is considerably shorter than the previous consensus map, which had a total length of approximately 4,200 cM [49]. The reduced map length is due in large part to the high quality of the genotype data obtained from the SNP array assay compared to previous microsatellite genotyping. The previous map was inflated by genotyping errors in microsatellite and AFLP markers, which could easily be detected with the current dense marker coverage.
Variation in the rate of recombination between the sexes is common in many species, and in humans the female linkage map is almost two-fold longer than the male linkage map [47]. However, only a minor difference was observed between the sex-specific linkage maps constructed in the current analysis, which is in concordance with previous studies in chicken [13]. The same result has also been noted in a recently published zebra finch linkage map [94]. A dramatic difference was, however, observed between pedigrees, with a difference of approximately 600 cM in total sex-average map length between the domestic broiler cross (WU population) and the EL and UPP pedigrees. The latter are both crosses between red jungle fowl and White Leghorn lines. Burt and Bell (1987) reported that domestic animals have a higher chiasma count than natural populations, and further suggested that the higher rate in domestic animals could be a consequence of indirect selection pressure for increased recombination rates. A high recombination rate would allow an increased selection response for a phenotype, as unfavorable genetic correlations can be broken up by recombination.

Comparisons between genetic and physical maps clearly show that rates of recombination vary considerably between species [95]. The genome average recombination rate estimated from the current linkage map amounts to 3.1 cM/Mb. This is more than two-fold higher than in humans, where the recombination rate is about 1.2 cM/Mb [47]. Even lower rates of recombination have been reported for rodents (rat and mouse, 0.5 cM/Mb [95]). Additionally, the recombination rate observed in the chicken genome is two-fold higher than estimates from the zebra finch [94]. The difference between chicken and zebra finch is caused foremost by differences on the macrochromosomes.

As stated, the chicken genome-wide average recombination rate was 3.1 cM/Mb; however, the distribution of recombination along the chromosomes was found to be highly variable, ranging from 0 to 20 cM within 1 Mb windows. The recombination rate also varied considerably between chromosomes. Recombination rate was negatively correlated with chromosome size, similar to what has been reported in other species [96]. In concordance with studies in mammalian species, recombination rates in chicken tend to be higher at the ends of the chromosomes and lower in centromeric regions [96].
Several studies have investigated the correlation between recombination rates and different sequence parameters. For example, GC-rich genomic regions, nucleotide diversity and gene density have all shown a positive correlation with recombination rate in several species [51]. In the current study, we tested the correlation of recombination rate with GC-rich sequence motifs and with repeat element content, in 1 Mb intervals. The results showed a clear positive correlation with GC-rich sequence motifs and a negative correlation with AT-rich sequences, whilst repeat structures were negatively correlated with recombination rate. The overall features of recombination in chicken appear to be similar to observations made in mammalian species.
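The window-based analysis described above can be sketched as follows: the genetic map is interpolated at the edges of consecutive 1 Mb windows, the cM difference per window gives a local recombination rate in cM/Mb, and that rate is then correlated with a sequence feature such as GC content. The marker positions and GC values below are invented for illustration, the function name is hypothetical, and linear interpolation between markers is an assumption of this sketch rather than the method documented for paper III.

```python
import numpy as np


def window_rates(phys_mb, gen_cm, window_mb=1.0):
    """Recombination rate (cM/Mb) in consecutive windows along one chromosome.

    phys_mb and gen_cm are marker positions on the physical and genetic maps,
    both sorted in the same (increasing) order.
    """
    phys = np.asarray(phys_mb, dtype=float)
    gen = np.asarray(gen_cm, dtype=float)
    edges = np.arange(phys[0], phys[-1] + 1e-9, window_mb)   # window boundaries in Mb
    cm_at_edges = np.interp(edges, phys, gen)                # genetic position at each boundary
    return np.diff(cm_at_edges) / window_mb


# Made-up markers on a 4 Mb chromosome segment and made-up GC fractions per window.
phys_mb = [0.0, 1.0, 2.0, 3.0, 4.0]
gen_cm = [0.0, 5.0, 6.0, 6.5, 10.0]
gc_fraction = np.array([0.48, 0.40, 0.38, 0.45])

rates = window_rates(phys_mb, gen_cm)
average_rate = (gen_cm[-1] - gen_cm[0]) / (phys_mb[-1] - phys_mb[0])
r = float(np.corrcoef(rates, gc_fraction)[0, 1])             # Pearson correlation
print(f"window rates: {rates}, average: {average_rate:.2f} cM/Mb, r(GC) = {r:.2f}")
```

The same average-rate calculation, total map length divided by physical length, is what underlies chromosome-wide and genome-wide figures such as those quoted in the text.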

Future perspective

Although we screened a large number of SNPs, many of which had not been assigned to chromosomes in the draft assembly, we failed to provide a complete linkage map representing all 38 autosomal chicken chromosomes. The chromosomes missing both from the assembly and from our updated linkage map are microchromosomes. These are characterized by their small size and have a higher GC content than the macrochromosomes. A complete linkage map and sequence map are essential requirements for future detailed studies of recombination processes, in addition to providing a framework for investigating evolutionary processes that have shaped the chicken genome, such as the size differences between the macro- and microchromosomes. The incomplete map of the chicken genome also hampers attempts to discover, clone and characterize any trait loci residing on these missing chromosomes.

The absence of sequence data from a proportion of the microchromosomes is quite puzzling. The missing sequence could possibly reflect a cloning bias, perhaps due to high GC content or other sequence characteristics. The end result could have been reduced cloning success of microchromosome sequence fragments in the libraries used to generate the whole genome sequence. Such a bias could be overcome by generating sequence data with recently developed sequencing systems that do not require vector-based cloning, such as the 454 system provided by Roche. Such attempts are under way and will hopefully result in a complete chicken genome sequence.

Genetic analysis of two extreme selection lines – QTL mapping (Papers I and IV)

Background

In 1957, Prof. Paul Siegel at Virginia Polytechnic Institute and State University started a long-term selection experiment in chickens. His objective was to study the genetic and physiological effects of divergent selection on juvenile body-weight [97]. The study population was created by first intercrossing seven partially inbred White Plymouth Rock broiler lines. From this base population, two divergent lines were selected based on high and low body-weight at 56 days of age (this being the sole selection criterion) [98]. The two lines have been reproduced annually since the start of the experiment, with breeding and hatch always taking place on the same date every year to allow a valid comparison between generations. In March 2008, the 50th generation of the two selection lines was hatched.

The selection responses in the High-Weight Selected (HWS) and the Low-Weight Selected (LWS) lines have been remarkable (see Figure 1, Paper I). The sex-average gain in the HWS line up to generation thirty-seven was 19.7 g per generation, whilst within the LWS line the average body-weight reduction per generation has been reported to be 19.6 g [99]. At generation forty, an approximately eight-fold difference in body-weight was observed between the two divergently selected lines. This equated to a sex-average body-weight of about 1.5 kg for the HWS line compared to an average of about 0.2 kg for the LWS line. Selection for body-weight has also produced a series of correlated responses in various body-weight related traits. For example, appetite has increased in the HWS line birds, whereas the LWS line chickens display an anorectic phenotype with extremely low appetite. In order to maintain a breeding flock, the HWS line has to be feed-restricted after 56 days of age (the selection age) to avoid metabolic disease. In the LWS line, the selection for reduced body-weight has led to the appearance of an anorexia phenotype from generation 25 [99]. In each subsequent generation, approximately 5-20% of the LWS line birds die before the age of selection due to inadequate basal food consumption. Numerous reports have been published discussing the effects of divergent selection on these chicken lines (for a comprehensive review see Dunnington and Siegel [97]).

In order to identify the genetic components that have responded to selection and caused the observed differences in body-weight between the two lines, an intercross between the lines was produced. A reciprocal HWS/LWS intercross was bred, constituting a mapping population of about 850 F2 birds. Body-weight measurements were collected at hatch and at subsequent two-week intervals until slaughter. At 70 days of age, the F2 population was sacrificed and an additional set of body-weight related traits was collected. These included body-composition, metabolic and immune response traits. Park et al. have reported results from a QTL analysis of body-composition and metabolic traits collected from this intercross [100]. Paper I in this thesis describes results from a QTL analysis based on the intercross for body-weight traits, as well as anorexia, packed cell volume (PCV), blood protein and an immune response trait (SRBC). In paper IV, we constructed an updated linkage map by including an additional 316 SNP markers and report results from a revised QTL analysis of body-weight traits. Using the extended genetic map we also conducted a pair-wise genome-wide search for epistatic QTL pairs affecting body-weight traits.

A previous study by Carlborg et al. had suggested that interactions between loci contributed to the variation in body-weight in the two chicken lines [83]. Their analysis exposed a genetic network of interacting loci affecting body-weight at 56 days of age. In total, five locus pairs were detected. All pairs shared one common locus on chromosome 7 (Growth9), which appeared to be the key regulator of this network of epistatic QTL.

Results and Discussion

Paper I (Many QTLs with minor additive effects are associated with a large difference in growth between two selection lines in chickens)

A genome scan including 145 microsatellite markers was conducted in order to map QTL segregating in the HWS/LWS intercross. The genetic map included marker coverage on 25 autosomal linkage groups with a total map length of 2,426 cM [89]. QTL analysis was performed using a least-squares interval mapping method for outbred intercrosses [78], implemented in the web-based QTLExpress software [101].

The analysis revealed 13 chromosomal regions affecting body-weight or growth (Growth 1-13, Table 2, Paper I). Each individual QTL explained only a small proportion of the total phenotypic variance in the F2 population (1.3-3.1%). The majority of the QTL showed additive effects; however, two QTL displayed negative over-dominance (i.e. an enhanced growth in the heterozygote genotype class). Only five of the 13 QTL reported in paper I reached the 5% genome-wide significance level set by permutation testing [84]. For all QTL, the HWS line alleles were associated with enhanced body-weight. This implied that most of the QTL are real, as such an outcome would be unlikely if the majority of the QTL were false positives. In a joint regression analysis including only QTL with additive effect, we estimated that the combined effect could, at most, explain about half of the difference in body-weight at 56 days between the lines, and roughly 13% of the residual variance in the F2 population. No QTL were detected for anorexia, PCV, blood protein or the antibody response to SRBC. The main conclusion from paper I was that the divergent selection response observed in the two chicken lines was most likely the result of fixation of many alternative QTL alleles in the selected lines, each with an individually small additive effect. The genetic map used for QTL analysis in paper I only covered 24 out of the 38 autosomal chromosomes.
The incomplete map coverage could have influenced the main conclusions of this paper, as it is possible that additional QTL with large effect were segregating on the 14 uncovered chromosomes.
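The genome-wide significance level referred to above was set by permutation testing [84]. As an illustration only, the principle can be sketched in a few lines of Python. The sketch makes several simplifying assumptions that are not part of paper I: it tests single markers rather than intervals between markers, it fits only an additive effect with genotypes coded -1, 0 and 1, and the function names (f_stat, genome_scan_max, permutation_threshold) are hypothetical. It is not the QTL Express implementation.

```python
import random


def f_stat(genotypes, phenotypes):
    """F statistic (1 and n-2 df) for a simple additive regression
    of phenotype on a genotype code (assumed coding: -1, 0, 1)."""
    n = len(phenotypes)
    g_mean = sum(genotypes) / n
    p_mean = sum(phenotypes) / n
    sxx = sum((g - g_mean) ** 2 for g in genotypes)
    syy = sum((p - p_mean) ** 2 for p in phenotypes)
    sxy = sum((g - g_mean) * (p - p_mean)
              for g, p in zip(genotypes, phenotypes))
    if sxx == 0 or syy == 0:
        return 0.0  # uninformative marker or constant phenotype
    ss_reg = sxy ** 2 / sxx
    ss_res = syy - ss_reg
    if ss_res <= 0:
        return float("inf")  # perfect fit
    return ss_reg / (ss_res / (n - 2))


def genome_scan_max(markers, phenotypes):
    """Largest test statistic over all markers in the scan."""
    return max(f_stat(m, phenotypes) for m in markers)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Empirical genome-wide threshold: shuffle phenotypes among
    individuals, record the genome-wide maximum each time, and return
    the approximate (1 - alpha) quantile of those maxima."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any marker-trait association
        maxima.append(genome_scan_max(markers, shuffled))
    maxima.sort()
    index = min(n_perm - 1, int(round((1 - alpha) * n_perm)))
    return maxima[index]
```

A scan statistic is declared genome-wide significant at level alpha when it exceeds the value returned by permutation_threshold. Because the maximum over all markers is recorded in each permutation, the threshold accounts for the many tests made across the genome.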

Paper IV (Genetic analysis of an F2 intercross between two chicken lines divergently selected for body-weight)

To enhance the map coverage in the HWS/LWS intercross, 316 SNP markers distributed over marker gaps and on uncovered chromosomes were genotyped using the Illumina GoldenGate assay. A general increase in marker density was sought to enhance the precision of the QTL analysis, and to provide power to detect additional QTL. Highly informative genetic markers were chosen from a panel of 13,000 SNPs genotyped in 15 individuals of each line. The linkage map, including the previously genotyped microsatellites and the new SNP markers, comprised 434 genetic markers distributed over 31 linkage groups. The map spanned 2,954 cM, in comparison to the previous microsatellite map that covered 2,426 cM. The current map has a higher average marker density than the previous map, with one marker per 6.9 cM compared to one marker every 16.7 cM. Additionally, seven microchromosomes are now covered, leaving only seven chromosomes without markers. A comparison between the two linkage maps used for the QTL analysis in papers IV and I is presented in Figure 1, paper IV.

Using the updated genetic map, we re-analyzed the intercross data. First, we scanned the genome for QTL with marginal effects (additive and dominance) for body-weight traits and second, we extended the regression model to also account for epistatic interactions. Epistatic QTL analysis was conducted as previously described by Carlborg et al. [102, 103].

The results from the QTL analysis for marginal effects on body-weight traits were to a large degree in concordance with our previous study (Paper I). No additional QTL were detected on previously uncovered chromosomes. In total, we report seven QTL, each with small effect sizes, five of which were significant at the 5% genome-wide level. Six QTL that we reported in paper I did not reach the significance threshold set in paper IV (see discussion, paper IV).

In the epistatic QTL analysis, we identified 19 QTL pairs where the interactions were significant at the 1% level and an additional six pairs significant at a 5% interaction level. The 25 pairs represented 15 unique interacting QTL pairs (Figure 2, paper IV). Of those 15 pairs, nine involved interactions with the Growth 9 QTL. The current epistatic analysis, using the updated linkage map, supported the main results and conclusions of Carlborg et al.
who used the microsatellite-based linkage map [83]. Our results showed that epistatic interactions between loci can contribute substantially to the variation for a quantitative trait.

Epistatic effects between alleles at different loci have long been known to influence Mendelian traits, such as coat color variation in pigs [104] and comb types in chickens. To what degree epistasis is involved in regulating quantitative trait variation is not clear. Several studies have reported epistatic QTL in various model organisms, such as mammalian species [105, 106], Drosophila [107-109] and plants [110], while others have failed to detect interacting QTL effects [111]. A recent study by Hill et al. reported that most trait variation in humans was typically dependent on additive genetic variation, with limited or no effect of dominance or interaction components [112]. Whilst epistasis is unlikely to affect every trait in every population, it is clearly worth investigating and should not be ignored when studying quantitative traits [81, 82].
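Extending the regression model to account for epistasis, as described for paper IV above, amounts to asking whether a model with an interaction term between two loci fits significantly better than a model with marginal effects only. The following is a minimal sketch of that comparison under stated assumptions: only additive effects and an additive-by-additive interaction are fitted (dominance is ignored), genotypes are coded -1, 0 and 1 at two fixed marker positions, and all function names are hypothetical. It does not reproduce the genetic algorithm search or the randomization testing used by Carlborg et al. [102, 103].

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting (a is a small square matrix as nested lists)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def rss(design, y):
    """Residual sum of squares of the least-squares fit of y on the
    columns of the design matrix (normal equations)."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def interaction_f(x1, x2, y):
    """F statistic (1 and n-4 df) for adding an additive-by-additive
    interaction term to a model with two additive effects."""
    n = len(y)
    reduced = [[1.0, a, b] for a, b in zip(x1, x2)]      # marginal effects only
    full = [[1.0, a, b, a * b] for a, b in zip(x1, x2)]  # plus interaction
    rss_reduced, rss_full = rss(reduced, y), rss(full, y)
    return (rss_reduced - rss_full) / (rss_full / (n - 4))
```

A large value of interaction_f for a pair of loci indicates that the phenotype depends on the genotype combination and not only on the sum of the two marginal effects. In a genome-wide search over pairs, significance would again need to be assessed empirically, since many pairs are tested.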

The intense bi-directional selection during forty generations is expected to have resulted in allele frequency changes within and between the two chicken lines at QTL. Strong directional selection on QTL alleles will lead to selective sweeps, as favorable QTL alleles increase in frequency and replace other alleles at loci present in the population [3]. This process will lead to a reduction of nucleotide diversity around selected QTL due to hitchhiking effects with nearby loci, and result in a fixation of haplotype blocks [113]. The size of the haplotype block depends both on the time required to bring the favorable mutation to fixation in the population and on the local recombination rate. An example of a selective sweep is the IGF2 locus in domestic pigs [87]. Reduced nucleotide diversity within a genomic region could be a result of selection on favorable mutation(s), but may also be a result of demographic processes that have occurred in the population, such as bottlenecks and genetic drift [114].

At QTL that have responded to selection in the two chicken lines, we expect high allelic divergence between lines as they have been selected in different directions, coupled with increased homozygosity within lines. If such regions of low nucleotide diversity could be identified within QTL intervals, this could potentially narrow down a large QTL area to the region that harbors the causative mutation underlying a QTL effect. A search for signs of selective sweeps, using genotype data from 13,000 SNP markers scored on 15 individuals from each line, largely failed to detect homozygosity surrounding QTL regions affecting body-weight. The current marker density represented one segregating SNP per 200 kb.

Despite the large difference in body-weight between the HWS and LWS lines, we only identified a handful of QTL, each with small marginal effect on body-weight in the intercross population.
From the current studies we draw the conclusion that many loci, each with small effect, were contributing to the difference in body-weight between the two divergently selected chicken lines. This is also in agreement with the consistent progressive response to the selection on body-weight, as illustrated in Figure 1, Paper I. If several major QTL had been segregating in the base population, the response curve over generations would be expected to show more dramatic changes. No major leaps in body-weight between generations have been observed in the two populations. This also implies that no novel QTL mutations with large effect have appeared during the course of the selection experiment. The continued response to selection on body-weight throughout the experiment most likely reflects selection on the standing genetic variation present in the founder population. The failure to detect selective sweeps using the current marker set also supports the hypothesis that selection has primarily been on standing genetic variation. The epistatic effects we observed illustrate that this mechanism has played an important role in mediating the selection response. Although 31 of the 38 autosomes were covered in this experiment, it is still possible that major QTL are segregating in our intercross but remained undetected due to incomplete genome coverage.
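The search for selective sweeps described above looked for regions that combine low diversity within each line with high allelic divergence between the lines. A simple way to express that idea in code is a sliding window over consecutive SNPs. The sketch below is illustrative only and is not the procedure used in paper IV: the genotype coding (0, 1 or 2 copies of an arbitrary reference allele, None for a missing call), the window size, the cut-off values and the function names are all assumptions made for the example.

```python
def allele_freq(genotypes):
    """Frequency of the counted allele from genotypes scored 0, 1 or 2.
    Missing calls (None) are skipped; at least one call is assumed."""
    called = [g for g in genotypes if g is not None]
    return sum(called) / (2 * len(called))


def expected_het(p):
    """Expected heterozygosity at a biallelic SNP with allele frequency p."""
    return 2 * p * (1 - p)


def sweep_scan(line_a, line_b, window=5):
    """Slide a window over SNPs ordered along a chromosome.
    line_a and line_b hold one genotype list per SNP, in the same order.
    Returns tuples (start index, mean heterozygosity in line A,
    mean heterozygosity in line B, mean absolute frequency difference)."""
    stats = []
    for snp_a, snp_b in zip(line_a, line_b):
        pa, pb = allele_freq(snp_a), allele_freq(snp_b)
        stats.append((expected_het(pa), expected_het(pb), abs(pa - pb)))
    windows = []
    for start in range(len(stats) - window + 1):
        chunk = stats[start:start + window]
        windows.append((start, *(sum(col) / window for col in zip(*chunk))))
    return windows


def candidate_sweeps(windows, max_het=0.05, min_diff=0.9):
    """Start indices of windows that are nearly fixed within both lines
    and nearly fixed for different alleles between the lines."""
    return [w[0] for w in windows
            if w[1] <= max_het and w[2] <= max_het and w[3] >= min_diff]
```

In such a scheme, a window of five SNPs at one segregating SNP per 200 kb spans close to 1 Mb, so a sweep region much shorter than that would be diluted by flanking variable SNPs and would not stand out.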

Future perspective

Results from papers I and IV present an overall view of the genetic basis of the large body-weight difference observed between the HWS and LWS lines following long-term bi-directional selection. To gain further insight, a complete genome scan that covers all chromosomes must be performed. This would provide a comprehensive catalog of QTL segregating in the intercross, and would reveal if any QTL with large effect were segregating on currently uncovered microchromosomes. There are, however, limitations to the power to detect QTL using the current size of the intercross. At present, only those QTL with reasonably sized effects can be detected and therefore, the current population does not provide an accurate estimate of the true number of QTL that have responded to selection for high and low body-weight respectively. To increase the statistical power and to gain an enhanced view of the number of QTL that have been selected would require a much larger sample size than the approximately 850 individuals which were used in papers I and IV.

Identifying genes and the causative mutations affecting QTL is key to advancing our understanding of the underlying molecular mechanisms that control phenotypic variation. While there is a limitation to the number of QTL the current intercross can detect, there is also a limitation with regards to the mapping resolution gained using the current F2 population. The confidence intervals for the seven QTL that we reported in paper IV are between 10 and 30 cM, which relate to genomic regions of between 3.5 and 9.7 Mb. Within such large regions, tens or hundreds of genes are normally located. Increased map resolution can be achieved by using an Advanced Intercross Line (AIL) as described in the QTL mapping section [85]. An AIL has been maintained for the HWS/LWS intercross by further crossing F2 individuals and subsequent generations (F3, F4, etc.).
As recombination events accumulate over several generations of intercrossing, the linkage disequilibrium between marker and QTL is reduced, and hence, a higher map resolution can be obtained. For generation F8 we produced a population consisting of approximately 400 birds that will be used for fine-mapping QTL that were discovered in the F2 intercross.

Genetic research has gone through a technological revolution during the last five years. First came the development of highly efficient SNP genotyping technologies, such as the Illumina and Affymetrix systems. Second, new sequencing technologies are now available that can generate vast amounts of sequence data in a short time frame and at a relatively low cost compared to conventional Sanger sequencing.
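The gain in resolution from an advanced intercross line follows from the decay of linkage disequilibrium (LD) between a marker and a QTL under continued intercrossing. Under the textbook idealization of random mating in a large population, LD is reduced by a factor (1 - r) per generation, where r is the recombination fraction between the two loci. The short Python sketch below applies this to map distances converted with Haldane's mapping function [58]. The function names, the chosen distances and the use of six generations (F2 to F8) are assumptions for illustration, and drift in a population of a few hundred birds is ignored.

```python
import math


def haldane_r(distance_cm):
    """Recombination fraction for a map distance in cM,
    using Haldane's mapping function (no interference)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def ld_remaining(distance_cm, generations):
    """Fraction of the initial marker-QTL LD expected to remain after
    the given number of generations of random intercrossing."""
    return (1.0 - haldane_r(distance_cm)) ** generations


if __name__ == "__main__":
    # Assumed example: six further generations, i.e. F2 to F8.
    for d in (1, 5, 10, 20):
        print(f"{d:>2} cM: {ld_remaining(d, 6):.2f} of the F2 LD left in F8")
```

Loosely linked markers lose their association with the QTL much faster than tightly linked ones, which is why the region showing association with a QTL becomes narrower in later generations.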

In paper IV we used a SNP map comprising on average 1 SNP/200 kb to search for selective sweeps. With that marker density, and with the search constrained to large QTL intervals, our attempt failed to locate selective sweeps within QTL regions. This suggests that the sizes of potential selective sweep regions present in the two selection lines are small, and that the detection of sweep intervals would require an increased SNP density. Applying these new technologies, the levels of genome-wide nucleotide variation within and between the HWS and LWS lines can be determined. This information, coupled with segregation data from the AIL, can be a powerful approach to identify candidate polymorphisms that underlie the QTL effects causing the dramatic phenotypic differences between the lines.

Acknowledgements

Studies presented in this thesis were performed at the Department of Medical Biochemistry and Microbiology, Uppsala University, Sweden.

I would like to thank the following people that I have been working closely with during my graduate studies:

My main supervisor, Professor Leif Andersson, for being an excellent guide in the world of scientific research. Your sincere interest, enthusiasm and passion for science are impressive and inspiring, and it has been stimulating to work in your lab. Thanks.

Örjan Carlborg, my co-supervisor during the last years of my studies, thanks for all the help with regards to QTL analysis and for proofreading manuscripts and thesis. I would also like to thank Örjan's students Weronica Ek and Francois Besnier for collaborations with the AIL cross and building linkage maps.

Professor Paul Siegel, I will never forget my two visits to Blacksburg, thanks for the hospitality, moments of fun and for sharing your scientific knowledge. I hope to see you soon. I also would like to thank Christa, in Paul's lab, for all the help during phenotyping of chickens.

Lina Strömstedt and Xavier Tordoir, for collaboration on the Z chromosome and the Highlow papers.

Jennifer Meadows, I am grateful for the time you spent proofreading my thesis, instead of being out playing in the snow. Although I kept you busy reading, you managed to catch a cold.

I would also like to thank Martien Groenen at Wageningen University for collaboration on the consensus map, and Mario Foglio, Ivo Gut and Mark Lathrop at the CNG institute in Paris for providing genotype data.

All the current and past members of the lab, thanks for all the help and support during these years. I will miss the entertaining discussions we had together whilst enjoying a delicious and nutritious lunch from Bikupan.

People outside the lab; Anna-Stina S thanks for all the kind invitations to Skara, and Calle R, thanks for all the help with the analysis of bone structure.

Lastly, I would like to thank my friends outside the lab, my extended family, and most importantly mamma, pappa, Anna, Björn, Edwin and Rut for love and support.

References

1. Clutton-Brock J: A natural history of domesticated mammals. Cambridge: Cambridge University Press; 1995.
2. Darwin C: The Origin of Species by means of Natural Selection. London: John Murray; 1859.
3. Andersson L, Georges M: Domestic-animal genomics: deciphering the genetics of complex traits. Nat Rev Genet 2004, 5(3):202-212.
4. International Chicken Genome Sequencing Consortium: Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 2004, 432(7018):695-716.
5. Wyles JS, Kunkel JG, Wilson AC: Birds, behavior, and anatomical evolution. Proc Natl Acad Sci U S A 1983, 80(14):4394-4397.
6. Stevens L: Genetics and Evolution of the Domestic Fowl. New York: Cambridge University Press; 1991.
7. Fumihito A, Miyake T, Sumi S, Takada M, Ohno S, Kondo N: One subspecies of the red junglefowl (Gallus gallus gallus) suffices as the matriarchic ancestor of all domestic breeds. Proc Natl Acad Sci U S A 1994, 91(26):12505-12509.
8. Fumihito A, Miyake T, Takada M, Shingu R, Endo T, Gojobori T, Kondo N, Ohno S: Monophyletic origin and unique dispersal patterns of domestic fowls. Proc Natl Acad Sci U S A 1996, 93(13):6792-6795.
9. Eriksson J, Larson G, Gunnarsson U, Bed'hom B, Tixier-Boichard M, Stromstedt L, Wright D, Jungerius A, Vereijken A, Randi E et al: Identification of the yellow skin gene reveals a hybrid origin of the domestic chicken. PLoS Genet 2008, 4(2):e1000010.
10. Liu YP, Wu GS, Yao YG, Miao YW, Luikart G, Baig M, Beja-Pereira A, Ding ZL, Palanichamy MG, Zhang YP: Multiple maternal origins of chickens: out of the Asian jungles. Mol Phylogenet Evol 2006, 38(1):12-19.
11. Havenstein GB, Ferket PR, Qureshi MA: Growth, livability, and feed conversion of 1957 versus 2001 broilers when fed representative 1957 and 2001 broiler diets. Poult Sci 2003, 82(10):1500-1508.
12. Crawford R: Poultry breeding and genetics. New York: Elsevier Science; 1996.

13. Kerje S, Carlborg O, Jacobsson L, Schutz K, Hartmann C, Jensen P, Andersson L: The twofold difference in adult size between the red junglefowl and White Leghorn chickens is largely explained by a limited number of QTLs. Anim Genet 2003, 34(4):264-274.
14. Siegel PB, Dunnington EA: Genetic selection strategies--population genetics. Poult Sci 1997, 76(8):1062-1065.
15. Rishell WA: Breeding and genetics--historical perspective. Poult Sci 1997, 76(8):1057-1061.
16. Wong GK, Liu B, Wang J, Zhang Y, Yang X, Zhang Z, Meng Q, Zhou J, Li D, Zhang J et al: A genetic variation map for chicken with 2.8 million single-nucleotide polymorphisms. Nature 2004, 432(7018):717-722.
17. Burt D, Pourquie O: Genetics. Chicken genome--science nuggets to come soon. Science 2003, 300(5626):1669.
18. Brown WR, Hubbard SJ, Tickle C, Wilson SA: The chicken as a model for large-scale analysis of vertebrate gene function. Nat Rev Genet 2003, 4(2):87-98.
19. Mariani FV, Martin GR: Deciphering skeletal patterning: clues from the limb. Nature 2003, 423(6937):319-325.
20. Ellegren H: Molecular evolutionary genomics of birds. Cytogenet Genome Res 2007, 117(1-4):120-130.
21. Backstrom N, Fagerberg S, Ellegren H: Genomics of natural bird populations: a gene-based set of reference markers evenly spread across the avian genome. Mol Ecol 2008, 17(4):964-980.
22. Sachidanandam R, Weissman D, Schmidt SC, Kakol JM, Stein LD, Marth G, Sherry S, Mullikin JC, Mortimore BJ, Willey DL et al: A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature 2001, 409(6822):928-933.
23. Lindblad-Toh K, Wade CM, Mikkelsen TS, Karlsson EK, Jaffe DB, Kamal M, Clamp M, Chang JL, Kulbokas EJ, 3rd, Zody MC et al: Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature 2005, 438(7069):803-819.
24. Parker HG, Kim LV, Sutter NB, Carlson S, Lorentzen TD, Malek TB, Johnson GS, DeFrance HB, Ostrander EA, Kruglyak L: Genetic structure of the purebred domestic dog. Science 2004, 304(5674):1160-1164.
25. Lindblad-Toh K, Winchester E, Daly MJ, Wang DG, Hirschhorn JN, Laviolette JP, Ardlie K, Reich DE, Robinson E, Sklar P et al: Large-scale discovery and genotyping of single-nucleotide polymorphisms in the mouse. Nat Genet 2000, 24(4):381-386.
26. Schmid M, Nanda I, Hoehn H, Schartl M, Haaf T, Buerstedde JM, Arakawa H, Caldwell RB, Weigend S, Burt DW et al: Second report on chicken genes and chromosomes 2005. Cytogenet Genome Res 2005, 109(4):415-479.

27. Fridolfsson AK, Cheng H, Copeland NG, Jenkins NA, Liu HC, Raudsepp T, Woodage T, Chowdhary B, Halverson J, Ellegren H: Evolution of the avian sex chromosomes from an ancestral pair of autosomes. Proc Natl Acad Sci U S A 1998, 95(14):8147-8152.
28. Burt DW: Origin and evolution of avian microchromosomes. Cytogenet Genome Res 2002, 96(1-4):97-112.
29. Ohno S, Muramoto J, Stenius C, Christian L, Kittrell WA, Atkin NB: Microchromosomes in holocephalian, chondrostean and holostean fishes. Chromosoma 1969, 26(1):35-40.
30. Stock AD, Mengden GA: Chromosome banding pattern conservatism in birds and nonhomology of chromosome banding patterns between birds, turtles, snakes and amphibians. Chromosoma 1975, 50(1):69-77.
31. Primmer CR, Raudsepp T, Chowdhary BP, Moller AP, Ellegren H: Low frequency of microsatellites in the avian genome. Genome Res 1997, 7(5):471-482.
32. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W et al: Initial sequencing and analysis of the human genome. Nature 2001, 409(6822):860-921.
33. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA et al: The sequence of the human genome. Science 2001, 291(5507):1304-1351.
34. Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, Agarwal P, Agarwala R, Ainscough R, Alexandersson M, An P et al: Initial sequencing and comparative analysis of the mouse genome. Nature 2002, 420(6915):520-562.
35. Clamp M, Fry B, Kamal M, Xie X, Cuff J, Lin MF, Kellis M, Lindblad-Toh K, Lander ES: Distinguishing protein-coding and noncoding genes in the human genome. Proc Natl Acad Sci U S A 2007, 104(49):19428-19433.
36. Hughes AL, Hughes MK: Small genomes for better flyers. Nature 1995, 377(6548):391.
37. Hughes AL: Adaptive Evolution of Genes and Genomes. Oxford: Oxford University Press; 1999.
38. Van den Bussche RA, Longmire JL, Baker RJ: How bats achieve a small C-value: frequency of repetitive DNA in Macrotus. Mamm Genome 1995, 6(8):521-525.
39. Kazazian HH, Jr.: Mobile elements: drivers of genome evolution. Science 2004, 303(5664):1626-1632.
40. Organ CL, Shedlock AM, Meade A, Pagel M, Edwards SV: Origin of avian genome size and structure in non-avian dinosaurs. Nature 2007, 446(7132):180-184.

41. Mendel G: Experiments in Plant Hybridization (English version). In: Bateson W. 1909. Mendel's Principles of Heredity. London: Cambridge University Press; 1865.
42. von Tschermak E: The rediscovery of Gregor Mendel's work. Journal of Heredity 1951, 42(4):163-171.
43. Bateson W: Experiments with poultry. Repts Evol Comm Roy Soc 1902, 1:87-124.
44. Liu B-H: Statistical Genomics: Linkage, Mapping, and QTL Analysis. New York: CRC Press; 1998.
45. Ott J: Analysis of Human Genetic Linkage, 3rd edn. Baltimore: Johns Hopkins University Press; 1999.
46. Botstein D, White RL, Skolnick M, Davis RW: Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet 1980, 32(3):314-331.
47. Kong A, Gudbjartsson DF, Sainz J, Jonsdottir GM, Gudjonsson SA, Richardsson B, Sigurdardottir S, Barnard J, Hallbeck B, Masson G et al: A high-resolution recombination map of the human genome. Nat Genet 2002, 31(3):241-247.
48. Neff MW, Broman KW, Mellersh CS, Ray K, Acland GM, Aguirre GD, Ziegle JS, Ostrander EA, Rine J: A second-generation genetic linkage map of the domestic dog, Canis familiaris. Genetics 1999, 151(2):803-820.
49. Groenen MA, Cheng HH, Bumstead N, Benkel BF, Briles WE, Burke T, Burt DW, Crittenden LB, Dodgson J, Hillel J et al: A consensus linkage map of the chicken genome. Genome Res 2000, 10(1):137-147.
50. Marklund L, Johansson Moller M, Hoyheim B, Davies W, Fredholm M, Juneja RK, Mariani P, Coppieters W, Ellegren H, Andersson L: A comprehensive linkage map of the pig based on a wild pig-Large White intercross. Anim Genet 1996, 27(4):255-269.
51. Shifman S, Bell JT, Copley RR, Taylor MS, Williams RW, Mott R, Flint J: A high-resolution single nucleotide polymorphism genetic map of the mouse genome. PLoS Biol 2006, 4(12):e395.
52. Steen RG, Kwitek-Black AE, Glenn C, Gullings-Handley J, Van Etten W, Atkinson OS, Appel D, Twigger S, Muir M, Mull T et al: A high-density integrated genetic linkage and radiation hybrid map of the laboratory rat. Genome Res 1999, 9(6):AP1-8, insert.
53. Ellegren H: Microsatellites: simple sequences with complex evolution. Nat Rev Genet 2004, 5(6):435-445.
54. Wu R, Chang-Xing M, Casella G: Statistical Genetics of Quantitative Traits. New York: Springer Science; 2007.
55. Morton NE: Sequential tests for the detection of linkage. Am J Hum Genet 1955, 7(3):277-318.

56. Green P, Falls K, Crooks S: Documentation for CRI-MAP, version 2.4. Washington University School of Medicine, St Louis, MO; 1990.
57. Lander ES, Green P: Construction of multilocus genetic linkage maps in humans. Proc Natl Acad Sci U S A 1987, 84(8):2363-2367.
58. Haldane J: The combination of linkage values and the calculation of distance between the loci of linked factors. Journal of Genetics 1919, 8:299-309.
59. Kosambi D: The estimation of map distance from recombination values. Annals of Eugenics 1944, 12:172-175.
60. Lynn A, Ashley T, Hassold T: Variation in human meiotic recombination. Annu Rev Genomics Hum Genet 2004, 5:317-349.
61. Hassold T, Hall H, Hunt P: The origin of human aneuploidy: where we have been, where we are going. Hum Mol Genet 2007, 16 Spec No. 2:R203-208.
62. Hawley RS, Theurkauf WE: Requiem for distributive segregation: achiasmate segregation in Drosophila females. Trends Genet 1993, 9(9):310-317.
63. Petronczki M, Siomos MF, Nasmyth K: Un menage a quatre: the molecular biology of chromosome segregation in meiosis. Cell 2003, 112(4):423-440.
64. Frazer KA, Ballinger DG, Cox DR, Hinds DA, Stuve LL, Gibbs RA, Belmont JW, Boudreau A, Hardenbol P, Leal SM et al: A second generation human haplotype map of over 3.1 million SNPs. Nature 2007, 449(7164):851-861.
65. Sabeti PC, Varilly P, Fry B, Lohmueller J, Hostetter E, Cotsapas C, Xie X, Byrne EH, McCarroll SA, Gaudet R et al: Genome-wide detection and characterization of positive selection in human populations. Nature 2007, 449(7164):913-918.
66. Falconer DS, Mackay TFC: Introduction to Quantitative Genetics, fourth edn. Harlow: Addison Wesley Longman; 1996.
67. Lynch M, Walsh B: Genetics and Analysis of Quantitative Traits. Sunderland: Sinauer Associates, Inc.; 1998.
68. Roff DA: Evolutionary Quantitative Genetics. New York: Chapman & Hall; 1997.
69. Sax K: The Association of Size Differences with Seed-Coat Pattern and Pigmentation in Phaseolus vulgaris. Genetics 1923, 8(6):552-560.
70. Lander ES, Schork NJ: Genetic dissection of complex traits. Science 1994, 265(5181):2037-2048.
71. Hirschhorn JN, Daly MJ: Genome-wide association studies for common diseases and complex traits. Nat Rev Genet 2005, 6(2):95-108.
72. Mackay TF: Quantitative trait loci in Drosophila. Nat Rev Genet 2001, 2(1):11-20.

73. Lightfoot JT, Turner MJ, Pomp D, Kleeberger SR, Leamy LJ: Quantitative trait loci for physical activity traits in mice. Physiol Genomics 2008, 32(3):401-408.
74. Andersson L: Genetic dissection of phenotypic diversity in farm animals. Nat Rev Genet 2001, 2(2):130-138.
75. Brockmann GA, Haley CS, Renne U, Knott SA, Schwerin M: Quantitative trait loci affecting body weight and fatness from a mouse line selected for extreme high growth. Genetics 1998, 150(1):369-381.
76. Wright D, Nakamichi R, Krause J, Butlin RK: QTL analysis of behavioral and morphological differentiation between wild and laboratory zebrafish (Danio rerio). Behav Genet 2006, 36(2):271-284.
77. Haley CS, Andersson L: Linkage mapping of quantitative trait loci in plants and animals. In: Genome mapping: a practical approach. Edited by Dear PH. Oxford: Oxford University Press; 1997: 49-71.
78. Haley CS, Knott SA, Elsen JM: Mapping quantitative trait loci in crosses between outbred lines using least squares. Genetics 1994, 136(3):1195-1207.
79. Lander ES, Botstein D: Mapping mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 1989, 121(1):185-199.
80. Haley CS, Knott SA: A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 1992, 69(4):315-324.
81. Carlborg O, Haley CS: Epistasis: too often neglected in complex trait studies? Nat Rev Genet 2004, 5(8):618-625.
82. Phillips PC: Epistasis--the essential role of gene interactions in the structure and evolution of genetic systems. Nat Rev Genet 2008, 9(11):855-867.
83. Carlborg O, Jacobsson L, Ahgren P, Siegel P, Andersson L: Epistasis and the release of genetic variation during long-term selection. Nat Genet 2006, 38(4):418-420.
84. Churchill GA, Doerge RW: Empirical threshold values for quantitative trait mapping. Genetics 1994, 138(3):963-971.
85. Darvasi A, Soller M: Advanced intercross lines, an experimental population for fine genetic mapping. Genetics 1995, 141(3):1199-1207.
86. Cohen-Zinder M, Seroussi E, Larkin DM, Loor JJ, Everts-van der Wind A, Lee JH, Drackley JK, Band MR, Hernandez AG, Shani M et al: Identification of a missense mutation in the bovine ABCG2 gene with a major effect on the QTL on chromosome 6 affecting milk yield and composition in Holstein cattle. Genome Res 2005, 15(7):936-944.

87. Van Laere AS, Nguyen M, Braunschweig M, Nezer C, Collette C, Moreau L, Archibald AL, Haley CS, Buys N, Tally M et al: A regulatory mutation in IGF2 causes a major QTL effect on muscle growth in the pig. Nature 2003, 425(6960):832-836.
88. Romanov M, Sazanov A, Smirnow A: First century of chicken gene study and mapping - a look back and forward. World's Poultry Science Journal 2004, 60:19-41.
89. Jacobsson L, Park HB, Wahlberg P, Jiang S, Siegel PB, Andersson L: Assignment of fourteen microsatellite markers to the chicken linkage map. Poult Sci 2004, 83(11):1825-1831.
90. Graves JA: Sex and death in birds: a model of dosage compensation that predicts lethality of sex chromosome aneuploids. Cytogenet Genome Res 2003, 101(3-4):278-282.
91. Ellegren H, Parsch J: The evolution of sex-biased genes and sex-biased gene expression. Nat Rev Genet 2007, 8(9):689-698.
92. Groenen MA, Crooijmans RP, Veenendaal A, Cheng HH, Siwek M, van der Poel JJ: A comprehensive microsatellite linkage map of the chicken genome. Genomics 1998, 49(2):265-274.
93. Cheng HH, Levin I, Vallejo RL, Khatib H, Dodgson JB, Crittenden LB, Hillel J: Development of a genetic map of the chicken with markers of high utility. Poult Sci 1995, 74(11):1855-1874.
94. Stapley J, Birkhead TR, Burke T, Slate J: A linkage map of the zebra finch Taeniopygia guttata provides new insights into avian genome evolution. Genetics 2008, 179(1):651-667.
95. Jensen-Seaman MI, Furey TS, Payseur BA, Lu Y, Roskin KM, Chen CF, Thomas MA, Haussler D, Jacob HJ: Comparative recombination rates in the rat, mouse, and human genomes. Genome Res 2004, 14(4):528-538.
96. Coop G, Przeworski M: An evolutionary view of human recombination. Nat Rev Genet 2007, 8(1):23-34.
97. Dunnington EA, Siegel PB: Long-term divergent selection for eight-week body weight in white Plymouth rock chickens. Poult Sci 1996, 75(10):1168-1179.
98. Siegel PB: A double selection experiment for body weight and breast angle at eight weeks of age in chickens. Genetics 1962, 47:1313-1319.
99. Liu G, Dunnington EA, Siegel PB: Responses to long-term divergent selection for eight-week body weight in chickens. Poult Sci 1994, 73(11):1642-1650.
100. Park HB, Jacobsson L, Wahlberg P, Siegel PB, Andersson L: QTL analysis of body composition and metabolic traits in an intercross between chicken lines divergently selected for growth. Physiol Genomics 2006, 25(2):216-223.

101. Seaton G, Haley CS, Knott SA, Kearsey M, Visscher PM: QTL Express: mapping quantitative trait loci in simple and complex pedigrees. Bioinformatics 2002, 18(2):339-340.
102. Carlborg O, Andersson L: Use of randomization testing to detect multiple epistatic QTLs. Genet Res 2002, 79(2):175-184.
103. Carlborg O, Andersson L, Kinghorn B: The use of a genetic algorithm for simultaneous mapping of multiple interacting quantitative trait loci. Genetics 2000, 155(4):2003-2010.
104. Kijas JM, Wales R, Tornsten A, Chardon P, Moller M, Andersson L: Melanocortin receptor 1 (MC1R) mutations and coat color in pigs. Genetics 1998, 150(3):1177-1185.
105. Leamy LJ, Pomp D, Lightfoot JT: An epistatic genetic basis for physical activity traits in mice. J Hered 2008, 99(6):639-646.
106. Ankra-Badu GA, Pomp D, Shriner D, Allison DB, Yi N: Genetic influences on growth and body composition in mice: multilocus interactions. Int J Obes (Lond) 2008.
107. Fedorowicz GM, Fry JD, Anholt RR, Mackay TF: Epistatic interactions between smell-impaired loci in Drosophila melanogaster. Genetics 1998, 148(4):1885-1891.
108. Yamamoto A, Zwarts L, Callaerts P, Norga K, Mackay TF, Anholt RR: Neurogenetic networks for startle-induced locomotion in Drosophila melanogaster. Proc Natl Acad Sci U S A 2008, 105(34):12393-12398.
109. Long AD, Mullaney SL, Mackay TF, Langley CH: Genetic interactions between naturally occurring alleles at quantitative trait loci and mutant alleles at candidate loci affecting bristle number in Drosophila melanogaster. Genetics 1996, 144(4):1497-1510.
110. Li Z, Pinson SR, Park WD, Paterson AH, Stansel JW: Epistasis for three grain yield components in rice (Oryza sativa L.). Genetics 1997, 145(2):453-465.
111. Zeng ZB, Liu J, Stam LF, Kao CH, Mercer JM, Laurie CC: Genetic architecture of a morphological shape difference between two Drosophila species. Genetics 2000, 154(1):299-310.
112. Hill WG, Goddard ME, Visscher PM: Data and theory point to mainly additive genetic variance for complex traits. PLoS Genet 2008, 4(2):e1000008.
113. Smith JM, Haigh J: The hitch-hiking effect of a favourable gene. Genet Res 1974, 23(1):23-35.
114. Galtier N, Depaulis F, Barton NH: Detecting bottlenecks and selective sweeps from DNA sequence polymorphism. Genetics 2000, 155(2):981-987.

Acta Universitatis Upsaliensis Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 409

Editor: The Dean of the Faculty of Medicine

A doctoral dissertation from the Faculty of Medicine, Uppsala University, is usually a summary of a number of papers. A few copies of the complete dissertation are kept at major Swedish research libraries, while the summary alone is distributed internationally through the series Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine. (Prior to January, 2005, the series was published under the title “Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine”.)

ACTA UNIVERSITATIS UPSALIENSIS Distribution: publications.uu.se UPPSALA urn:nbn:se:uu:diva-9502 2009