A Haplotype Map for the Laboratory Mouse
NEWS AND VIEWS

[...] dogma

Sample size dictates inference
The technical revolution that made the current data boom possible has matured; genetics laboratories now routinely generate millions of genotypes per day. However, more data can be a problem when we do not have the discipline to apply the scientific method. Replication is also a problem when working with effects so small that replication success could not be reasonably expected, even if the marker is associated with the disease.

Larger sample size is the mantra of the hour. Experimentalists and funding agencies no longer scoff at theoretical analyses showing that thousands of cases and controls are needed to demonstrate modest genetic associations (Fig. 1). Rather, risk variants discovered in the recent rash of relatively large, successful genome scans settle the importance of large samples by direct demonstration. The apparent exception of complement factor H (CFH) in age-related macular degeneration escapes this problem only by the very unusual magnitude of the effect (with an odds ratio of ∼7)6. Small effect sizes (with odds ratios <1.2) require much larger sample sizes than are readily available for an initial study, considering also the many additional samples needed for multiple, independent replication studies.

Although discerning a genotype is much less difficult today than previously, collecting the samples needed is becoming increasingly difficult and expensive. Controversy begets regulation, and regulatory compliance has become predatory. Permission to perform human subject research is soaked in carefully wrought legalities that subjects generally ignore. This social choice, along with the small effect sizes anticipated, dictates that genetics projects will dedicate an increasing proportion of resources to sample collection, rather than to data generation or analysis. The open question is whether the community of funding agencies, peer reviewers and regulatory bodies will be prepared to encourage and support the assembly of subjects and materials needed to accomplish these goals.

Candidate gene success
Rather than being discovered initially through a genome-wide scan of hundreds of thousands of genetic markers, which is now the rage, IL7R is a candidate gene that refused to quit. Certainly, the first paper on IL7R polymorphisms in multiple sclerosis7 would not have satisfied the stringent replication criteria advocated by a recent National Cancer Institute–National Human Genome Research Institute working group8. Nevertheless, the current studies2–4 establishing the IL7R association with multiple sclerosis are an example of persistence paying off. The studies also show that thought and insight are valuable and important, even in an era when high-throughput data generation and brute-force approaches dominate.

The hubbub over genetic association studies, the problem of inconsistent replication and the obvious false-positive associations that accompany high-throughput genotyping technologies obscure the conceptual simplicity of the scientific method. What we cannot replicate cannot be made dogma. As scientists, to accept a relationship, we must find it again through independent, careful evaluation. As the modest effect size for IL7R in multiple sclerosis attests, those doing genetic association studies in complex diseases have been working too long in a statistically underpowered universe.

The selection of candidate genes using multiple sources of data is a strategy termed “genomic convergence”9,10. The application of different experimental approaches provides insight into mechanism and helps specify the genetic model. In fact, genomic convergence is a specific embodiment of the concept of ‘consilience of inductions’, formulated by William Whewell in the 1840s11, which says that valid inductions will be supported by data from many different experimental approaches.

Although they benefit greatly from the impressive acceleration provided by genome-wide scans, virtually all genetic association disease studies ultimately become candidate gene projects. Additional susceptibility genes in multiple sclerosis (siblings of HLA-DRB2, IL7R and, probably, IL2RA4) might now stand above the background noise so that they may be seen. We know they must be there.

COMPETING INTERESTS STATEMENT
The author declares no competing financial interests.

1. Kenealy, S.J., Pericak-Vance, M.A. & Haines, J.L. J. Neuroimmunol. 143, 7–12 (2003).
2. Gregory, S.G. et al. Nat. Genet. 39, 1083–1091 (2007).
3. Lundmark, F. et al. Nat. Genet. 39, 1108–1113 (2007).
4. The International Multiple Sclerosis Genetics Consortium. N. Engl. J. Med., published online 29 July 2007 (doi:10.1056/NEJMoa073493).
5. Silman, A.J. et al. Br. J. Rheumatol. 32, 903–907 (1993).
6. Klein, R.J. et al. Science 308, 385–389 (2005).
7. Teutsch, S.M., Booth, D.R., Bennetts, B.H., Heard, R.N. & Stewart, G.J. Eur. J. Hum. Genet. 11, 509–515 (2003).
8. NCI-NHGRI Working Group on Replication in Association Studies et al. Nature 447, 655–660 (2007).
9. Hauser, M.A. et al. Hum. Mol. Genet. 12, 671–676 (2003).
10. Noureddine, M.A. et al. Mov. Disord. 20, 1299–1309 (2005).
11. Whewell, W. in Theory of Scientific Method (ed. Butts, R.) (Hackett Publishing, Indianapolis, Indiana, 1989).
12. Skol, A.D. et al. Nat. Genet. 38, 209–213 (2006).
13. Gordon, D. et al. Hum. Hered. 54, 22–33 (2002).
14. Gordon, D. et al. Pac. Symp. Biocomput. 490–501 (2003).

© 2007 Nature Publishing Group http://www.nature.com/naturegenetics
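The sample-size argument in the preceding commentary (an odds ratio near 7 is detectable in a small study, whereas odds ratios below 1.2 call for thousands of cases and controls) can be illustrated with a rough power calculation. The sketch below is not taken from the commentary or its Figure 1. It assumes a simple allelic test comparing one allele's frequency between cases and controls, the standard normal-approximation formula for two proportions, two alleles per subject, and a hypothetical control allele frequency of 0.3. The function names and default thresholds are illustrative choices.

```python
from math import ceil, sqrt
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Allele frequency in cases implied by the control frequency and the odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)


def subjects_per_group(control_freq, odds_ratio, alpha=0.05, power=0.8):
    """Approximate number of cases (and, equally, controls) for an allelic test.

    Assumptions (not from the source text): two-sided two-proportion z-test,
    equal group sizes, and each subject contributing two independent alleles.
    """
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    alleles_per_group = numerator / (p1 - p0) ** 2
    return ceil(alleles_per_group / 2)  # two alleles per subject


if __name__ == "__main__":
    # Hypothetical control allele frequency of 0.3; alpha values are illustrative.
    for odds_ratio in (7.0, 1.5, 1.2):
        for alpha in (0.05, 1e-7):
            n = subjects_per_group(0.3, odds_ratio, alpha=alpha)
            print(f"OR={odds_ratio:<4} alpha={alpha:<7} subjects per group ~ {n}")
```

Under these assumptions the required sample grows steeply as the odds ratio approaches 1 and as the significance threshold is tightened to allow for many tested markers, and the figures cover a single study only, before any independent replication samples are counted.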
A haplotype map for the laboratory mouse
Richard Mott

Two reports present detailed analyses of the haplotype structure of widely used laboratory mice based on resequencing data from 15 inbred strains. The studies provide the deepest view thus far of the patterns of genetic variation segregating in the inbred lines, and have implications for the design of complex trait mapping studies in mice.

Richard Mott is at the Wellcome Trust Centre for Human Genetics, University of Oxford, Oxfordshire OX3 7BN, UK. e-mail: [email protected]

1054 VOLUME 39 | NUMBER 9 | SEPTEMBER 2007 | NATURE GENETICS

The mouse is the primary animal model of human disease. About 90% of human genes have an ortholog in the mouse, and in cases where a human gene is known to be associated with a mendelian disease, the knockout of the mouse ortholog will often produce a similar phenotype. Just as importantly, we gain insights into human multifactorial diseases by examining the phenotypic consequences of naturally occurring genetic variation in the mouse. This variation is captured in the polymorphisms present among inbred mouse strains that, together with the intercrosses, heterogeneous stocks, recombinant inbred lines and other genetic reference populations derived from them, are the workhorses for the dissection of complex traits in the mouse1. Recently, the National Institute of Environmental Health Sciences (NIEHS) contracted Perlegen Sciences to resequence the genomes of 15 inbred mouse strains, comprising 11 classical strains and 4 wild-derived strains corresponding to the subspecies Mus musculus domesticus, M. m. musculus and M. m. castaneus and the natural hybrid M. m. molossinus. Frazer et al.2, reporting in Nature, now present the first description and analysis of this data set, which they generated and released publicly last fall. In a related study on page 1100 of this issue, Yang et al.3 present an independent analysis of the same data set. Together, these analyses give fresh insight into the haplotype structure of widely used inbred mouse strains.

Figure 1 The genetic variation between these three classical inbred strains of mice is explained by the variation observed between wild-derived inbred strains. (a) C57BL/6J (the reference-sequence strain). (b) 129S1/SvImJ. (c) DBA/2J. All photographs were provided by Joyce Peterson at The Jackson Laboratory.

Unnatural history and haplotype structure
Mouse inbred strains are grouped into the ‘classical’ strains (Fig. 1), which include C57BL/6J, [...]

[...] haplotypes of the founders. Unlike in humans, where resequencing candidate genes among many individuals uncovers rare variants that may be disease related, the frequency of rare variants in mice descended from these strains should be negligible, limited to the few spontaneous mutations that can accumulate in a small number of generations. In other words, in these mice, the ‘common disease/common variant’ hypothesis is true by construction.

Starting with the report by Wade et al.6, the haplotype structure of the mouse inbred strains has gradually become clearer. It was known that the classical strains showed less genotypic and phenotypic variation compared to the wild strains, suggesting that the former descended from a [...]

[...] up to 45 million SNPs segregating between all the strains. Second, variation between the 11 classical strains captures only 41% of the total number of observed SNPs, which is an underestimate of the total variation in the wild-derived strains. Third, the studies estimate that roughly 76% of the genome of each classical strain originated from M. m. domesticus, roughly 5% from M. m. musculus and roughly 3% from M. m. castaneus, with the remainder being of uncertain ancestry, where the wild-derived strains show evidence of introgression. Yang et al.3 suggest that these regions are in fact 67% domesticus, raising the total fraction of the domesticus contribution to just over 90%. Lastly, patterns of ancestry vary, [...]