Mapping Viability Loci Using Molecular Markers

Heredity (2003) 90, 459–467. doi:10.1038/sj.hdy.6800264
© 2003 Nature Publishing Group. All rights reserved. 0018-067X/03 $25.00. www.nature.com/hdy

L Luo and S Xu
Department of Botany and Plant Sciences, University of California, Riverside, CA 92521, USA

Correspondence: S Xu, Department of Botany and Plant Sciences, University of California, Riverside, CA 92521, USA. E-mail: [email protected]
Received 7 June 2002; accepted 17 January 2003

In genetic mapping experiments, some molecular markers often show distorted segregation ratios. We hypothesize that these markers are linked to viability loci that cause the observed segregation ratios to deviate from Mendelian expectations. Although statistical methods for mapping viability loci have been developed for line-crossing experiments, no such methods have yet been developed for outbred populations. In this study, we develop a method for mapping viability loci in outbred populations, using a full-sib family as an example. We develop a maximum likelihood (ML) method that uses the observed marker genotypes as data and the proportions of the genotypes of the viability locus as parameters. The ML solutions are obtained via the expectation–maximization algorithm. Application and efficiency of the method are demonstrated and tested using a set of simulated data. We conclude that viability loci can be mapped using statistical techniques similar to those used in quantitative trait locus mapping for quantitative traits.

Keywords: EM algorithm; four-way cross; maximum likelihood; segregation distortion

Introduction

The genetic consequence of selection is the change in frequencies of the genes affecting fitness. The process of evolution is reflected by the dynamic change of gene frequencies by selection and other evolutionary agents. Fitness is a complicated trait, which can be decomposed into many fitness components (Falconer and Mackay, 1996; Hartl and Clark, 1997). Therefore, the genetic variance of fitness is considered to be controlled by the segregation of multiple genes. Fitness behaves like a quantitative trait. It responds to natural selection with a response equal to the genetic variance of fitness (Fisher, 1958). To study the genetic architecture of fitness, it is important to explore the change of gene frequency of alleles at individual loci. However, only in very limited situations, for example, where allozyme markers are available, can we evaluate natural selection on individual loci. In most situations, we do not know what the genes are or where in the genome they are located.

With the rapid development of molecular technology, large amounts of molecular data are now available, which provide a great opportunity to estimate the effects and locate the chromosomal positions of loci responsible for complicated traits, for example, quantitative traits. The technology is now called quantitative trait locus (QTL) mapping. Since fitness is just another complicated trait with a polygenic background, a similar technology can be applied to map loci determining variation in fitness.

Although it does not seem easy to map fitness loci, statistical methods of mapping QTL can be adopted (Lander and Botstein, 1989). Fu and Ritland (1994a,b) first utilized a QTL mapping approach to map viability (a fitness component) loci under the maximum likelihood (ML) framework. Mitchell-Olds (1995) also proposed a similar ML method for viability mapping in F2 families. Recently, Vogl and Xu (2000) investigated a Bayesian method to map viability loci in a backcross family. All the aforementioned methods deal with line-crossing experiments that require inbred lines. Inbred lines, however, may not be available for many species, such as humans, large animals and trees (Hedrick and Muona, 1990). Mapping viability loci may be more relevant to natural populations than to line crosses. This is equivalent to the situation where mapping QTLs is more relevant to breeding populations than to designed line crosses. However, it is easier to map QTLs in line-crossing experiments because we can control the genetic background and environments. After QTLs are mapped in line crosses, the results may be extended to natural populations or used to find homologous loci in closely related species. Similarly, viability loci may be mapped in line crosses and the inference later extended to natural populations. In this study, we attempt to map viability loci directly in outbred populations. Full-sib families are the simplest outbred populations. Although not necessarily natural populations, they are one step closer to natural populations than are line crosses.

The fitness of a genotype at a locus is the average fitness of all individuals bearing this genotype. If we assign the fitness for the 'best' genotype a value of one, the selection coefficient for an arbitrary genotype is defined as the reduction in fitness from this maximum value. Therefore, we only describe the measurement of fitness (rather than the selection coefficient) in subsequent discussion. Viability is only one of many components of fitness. Fecundity is another important component. In this study, however, we focus only on loci responsible for viability selection, assuming that all surviving individuals have equal fecundity.

We develop a model of viability mapping that uses a full-sib family derived from the mating of two unrelated outbred parents. A full-sib family contains four different alleles at a single locus, rather than two as is usually assumed in inbred line crosses. Mapping in a full-sib family requires the general rule of allelic transmission from parents to children and thus the algorithm can be extended to pedigree analysis. The method can be directly applied to fitness analysis for open-pollinated plants.

Theory and methods

Genetic model of fitness

Consider a single viability locus and a full-sib family. Denote the genotypes of the sire (paternal parent) and dam (maternal parent) by $A^s_1 A^s_2$ and $A^d_1 A^d_2$, respectively. Mating between the two parents will generate progenies each with one of the four possible genotypes: $\{A^s_1 A^d_1, A^s_1 A^d_2, A^s_2 A^d_1, A^s_2 A^d_2\}$. Under the assumption of Mendelian segregation, the four genotypes will have an equal frequency, that is, $\frac{1}{4}$. If this locus is subject to viability selection, two or more genotypes will have frequencies that differ from Mendelian expectations.

To model viability selection, we define the underlying frequencies of the four genotypes in the progeny by a vector $w = [w_{11}\ w_{12}\ w_{21}\ w_{22}]$ for $0 \le w_{kl} \le 1$, $\sum_{k,l} w_{kl} = 1$ and $k, l = 1, 2$. These frequencies are now defined as the relative fitness of the four genotypes. This is a little different from the usual definition of relative fitness, in which the maximum fitness is set to one and the rest are expressed as reduced values relative to one. Deviation of $w$ from the Mendelian vector $w_0 = [\frac{1}{4}\ \frac{1}{4}\ \frac{1}{4}\ \frac{1}{4}]$ reflects the intensity of viability selection.

The fitness of a genotype can be decomposed into the product of the fitness of the two alleles that make up the genotype and a deviation reflecting the interaction between the two alleles, called the dominance effect, that is,

$$w_{kl} = w^s_k w^d_l + d_{kl} \tag{1}$$

where $w^s_k$ and $w^d_l$ denote the relative fitness of the $k$th allele of the sire and the $l$th allele of the dam, respectively, and $d_{kl}$ is the dominance effect. This partitioning of the fitness is important because we can separate gametic selection from zygotic selection using statistical technology.

[...] of the three independent parameters, as shown below:

$$
\begin{aligned}
w_{11} &= \tfrac{1}{4}(1 + w^s)(1 + w^d) + d \\
w_{12} &= \tfrac{1}{4}(1 + w^s)(1 - w^d) - d \\
w_{21} &= \tfrac{1}{4}(1 - w^s)(1 + w^d) - d \\
w_{22} &= \tfrac{1}{4}(1 - w^s)(1 - w^d) + d
\end{aligned}
\tag{3}
$$

This model is important in hypothesis tests and computer simulations that will be discussed in later sections.

ML estimation

We first assume that the four alleles of the viability locus in the parents are distinguishable and the genotypes are observable. Suppose that we sample $n$ individuals from the full-sib family in question. Let us define

$$y_j = [y_{j(11)}\ y_{j(12)}\ y_{j(21)}\ y_{j(22)}] \quad \text{for } j = 1, \ldots, n$$

where $y_{j(kl)} = 1$ and $y_{j(k'l')} = 0$ for every other genotype $(k', l') \neq (k, l)$ if individual $j$ takes genotype $A^s_k A^d_l$. We now have the data, $y$, and the parameter, $w$, which allow the construction of the log likelihood:

$$L_c(w) = \sum_{j=1}^{n} \sum_{k=1}^{2} \sum_{l=1}^{2} y_{j(kl)} \ln(w_{kl}) \tag{4}$$

The ML estimate of $w$ is simply

$$\hat{w}_{kl} = \frac{1}{n} \sum_{j=1}^{n} y_{j(kl)} \tag{5}$$

for $k, l = 1, 2$.

In fact, the genotype of a viability locus cannot be observed and we must use markers to infer the genotype. Unless the viability locus is located exactly at a fully informative marker, inference will be subject to error. The amount of error depends on the distances of the viability locus from marker loci, the level of marker polymorphism and the genotypes of the markers. As a result of the error, we are not certain about the actual genotype of the viability locus for each individual, even though we can observe the marker genotypes. The viability locus can take any one of the four genotypes, but with a different probability for each genotype given the marker information. Define the four conditional probabilities of the viability locus genotypes given the markers by $p_j = [p_{j(11)}\ p_{j(12)}\ p_{j(21)}\ p_{j(22)}]$ for $0 \le p_{j(kl)} \le 1$ and $\sum_{k=1}^{2} \sum_{l=1}^{2} p_{j(kl)} = 1$. This is a typical problem of missing values in statistics where we can use the expectation–maximization (EM) algorithm.
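As an illustration of the fully observed case only, equations (3), (4) and (5) can be sketched in a few lines of Python. This is a minimal sketch and not the authors' implementation: the function names, the coding of genotypes as the strings "11", "12", "21" and "22", and the example sample are all assumptions made here for illustration. It does not cover the missing-genotype case that requires the EM algorithm.

```python
import math
from collections import Counter

# Hypothetical coding: "kl" means sire allele k combined with dam allele l.
GENOTYPES = ("11", "12", "21", "22")


def genotype_frequencies(ws, wd, d):
    """Genotype frequencies from the three parameters of equation (3)."""
    return {
        "11": 0.25 * (1 + ws) * (1 + wd) + d,
        "12": 0.25 * (1 + ws) * (1 - wd) - d,
        "21": 0.25 * (1 - ws) * (1 + wd) - d,
        "22": 0.25 * (1 - ws) * (1 - wd) + d,
    }


def log_likelihood(observed, w):
    """Equation (4): sum of ln(w_kl) over the observed genotypes."""
    counts = Counter(observed)
    # Genotypes with zero count contribute nothing, so skip them
    # to avoid evaluating log(0) when w_kl is also zero.
    return sum(counts[g] * math.log(w[g]) for g in GENOTYPES if counts[g] > 0)


def ml_estimate(observed):
    """Equation (5): the ML estimate is the observed genotype proportion."""
    n = len(observed)
    counts = Counter(observed)
    return {g: counts[g] / n for g in GENOTYPES}


# Illustrative (made-up) sample of n = 4 fully observed genotypes.
sample = ["11", "11", "12", "22"]
w_hat = ml_estimate(sample)
w_mendel = genotype_frequencies(0.0, 0.0, 0.0)  # all frequencies equal 1/4
```

Setting all three parameters to zero recovers the Mendelian vector, and the log likelihood evaluated at `w_hat` is never lower than at `w_mendel` for the same sample, which is the comparison that underlies a likelihood-ratio test of segregation distortion.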
