A germ-line-selective advantage rather than an increased mutation rate can explain some unexpectedly common human disease mutations

Soo-Kyung Choi, Song-Ro Yoon, Peter Calabrese*†, and Norman Arnheim*†

Molecular and Computational Biology Program, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089-2910

Edited by James F. Crow, University of Wisconsin, Madison, WI, and approved April 16, 2008 (received for review February 7, 2008)

Two nucleotide substitutions in the human FGFR2 gene (C755G or C758G) are responsible for virtually all sporadic cases of Apert syndrome. This condition is 100–1,000 times more common than genomic mutation frequency data predict. Here, we report on the C758G de novo Apert syndrome mutation. Using data on older donors, we show that spontaneous mutations are not uniformly distributed throughout normal testes. Instead, we find foci where C758G mutation frequencies are 3–4 orders of magnitude greater than in the remaining tissue. We conclude this nucleotide site is not a mutation hot spot even after accounting for possible Luria–Delbrück "mutation jackpots." An alternative explanation for such foci involving positive selection acting on adult self-renewing Ap spermatogonia experiencing the rare mutation could not be rejected. Further, the two youngest individuals studied (19 and 23 years old) had lower mutation frequencies and smaller foci at both mutation sites compared with the older individuals. This implies that the mutation frequency of foci increases as adults age, and thus selection could explain the paternal age effect for Apert syndrome and other genetic conditions. Our results, now including the analysis of two mutations in the same set of testes, suggest that positive selection can increase the relative frequency of premeiotic germ cells carrying such mutations, although individuals who inherit them have reduced fitness. In addition, we compared the anatomical distribution of C758G mutation foci with both new and old data on the C755G mutation in the same testis and found their positions were not correlated with one another.

Apert syndrome | paternal age | positive selection | spermatogonia | testis

Author contributions: P.C. and N.A. designed research; S.-K.C., S.-R.Y., and P.C. performed research; S.-K.C., S.-R.Y., P.C., and N.A. analyzed data; and P.C. and N.A. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Freely available online through the PNAS open access option.
*P.C. and N.A. contributed equally to this work.
†To whom correspondence may be addressed. E-mail: [email protected] or petercal@usc.edu.
This article contains supporting information online at www.pnas.org/cgi/content/full/0801267105/DCSupplemental.
© 2008 by The National Academy of Sciences of the USA

The frequency at which human germ-line nucleotide substitutions arise in each generation can be measured by analysis of sporadic cases of human autosomal dominant or sex-linked diseases (1, 2) or by DNA analysis of sperm (3–5). In a recent article (6), we described an alternative method to study germ-line mutations by dissecting the whole human testis and measuring the mutation frequency at a specific nucleotide site in individual pieces. This procedure allowed us to study the spatial distribution of mutations in this organ and to compare different models of the human mutation process.

Individuals born with Apert syndrome (Online Mendelian Inheritance in Man no. 101200) exhibit prematurely fused cranial sutures and fused fingers and toes. Most Apert syndrome cases result from a spontaneous fibroblast growth factor receptor 2 gene (FGFR2) mutation in a normal father's germ line that is transmitted to his offspring. The birth frequency for sporadic cases is between 10⁻⁵ and 10⁻⁶ (7, 8). Surprisingly, >98% of Apert syndrome cases arise by transversion mutations at only two sites (C755G and C758G), and a single copy of either mutation is sufficient to cause the disease. The birth frequency of individuals with new mutations at either of these two nucleotide sites suggests that the mutation frequency at either site is 100- to 1,000-fold greater than expected based on what is known about transversion mutations since humans and chimpanzees last had a common ancestor (9) and mutation data at many human disease loci (10). Our previous study (6) allowed us to reject the mutation hot-spot model (the nucleotide has a higher-than-average chance of undergoing a base substitution) at one of these sites (C755G). Positive selection for spermatogonial cells carrying a newly arisen mutation is an alternative explanation (4–6, 11, 12) for an apparently high nucleotide substitution frequency. In our earlier study (6), we were unable to reject the selection hypothesis as an explanation for the C755G data.

In this work, we studied the other common de novo Apert nucleotide substitution (C758G). We examined four individual testes from three older individuals and two testes from younger donors. By comparing the data on both the C755G and C758G mutations, we were able to ask whether the two mutations were dependent or independent with respect to their anatomical distribution throughout each testis. Overall, our results support the selection hypothesis as an explanation for the high frequency of both Apert syndrome mutations and for the increased chance of older fathers having an affected child (paternal age effect: refs. 1, 11, 13–15).

Results

C758G Mutation Distribution in Individual Testes. We studied the spatial distribution of de novo C758G mutations within six testes from donors who do not suffer from Apert syndrome. Four of the testes were from older donors. Two of these testes were from one donor (374, age 62) and one testis from each of two additional donors (854, 54 years; 59089, 45 years). We also examined two testes from younger individuals: one testis each from donors 60832, 23 years, and 59056, 19 years.

Each testis was dissected into 192 pieces (six slices, each with 32 pieces), the DNA was extracted, and a highly specific single-molecule PCR assay was used to estimate the number of mutant molecules in each DNA aliquot (see Methods). A sperm sample taken from the epididymis of each testis was also analyzed for C758G mutations. The summary data are shown in Table 1 and Fig. 1 [the total DNA content and mutation frequency of each piece for all testes are presented in supporting information (SI) Table S1]. The false-positive frequency of the C758G assay was 3.8 × 10⁻⁷ (based on experiments using 31.6 million control genomes).

Among the three older individuals (374, 854, and 59089), the average mutation frequency per testis ranged from 7.3 × 10⁻⁴ to 1.0 × 10⁻⁴ (Table 1). Each of the four testes is characterized by

www.pnas.org/cgi/doi/10.1073/pnas.0801267105 PNAS | July 22, 2008 | vol. 105 | no. 29 | 10143–10148

Table 1. Summary data on six testes

| Testis | Age | Epididymal sperm mutation frequency | Testis mutation frequency | Frequency range among testis pieces | No. pieces containing 95% of mutants* | Percentage of all testis genomes in those pieces |
|---|---|---|---|---|---|---|
| Mutation C758G | | | | | | |
| 374-1 | 62y | 5.6 × 10⁻⁴ | 7.3 × 10⁻⁴ | <4 × 10⁻⁶ to 0.043 | 12 | 8.2 |
| 374-2 | 62y | 3.1 × 10⁻⁶ | 1.0 × 10⁻⁴ | <4 × 10⁻⁶ to 0.039 | 1 | 0.27 |
| 854-2 | 54y | 2.7 × 10⁻⁴ | 1.2 × 10⁻⁴ | <4 × 10⁻⁶ to 0.011 | 7 | 3.6 |
| 59089-1 | 45y | 8.2 × 10⁻⁵ | 1.1 × 10⁻⁴ | <4 × 10⁻⁶ to 0.005 | 6 | 4.5 |
| 60832-1 | 23y | ‡ | 6.5 × 10⁻⁷ | <4 × 10⁻⁶ to 9.0 × 10⁻⁶ | 27 | 15.3 |
| 59056-1 | 19y | 2.0 × 10⁻⁶ | 5.8 × 10⁻⁷ | <4 × 10⁻⁶ to 3.1 × 10⁻⁵ | 14 | 9.4 |
| Mutation C755G | | | | | | |
| 374-1† | 62y | 4.5 × 10⁻⁴ | 3.8 × 10⁻⁴ | <10⁻⁶ to 0.027 | 12 | 5.7 |
| 374-2† | 62y | 3.9 × 10⁻⁵ | 6.7 × 10⁻⁵ | <10⁻⁶ to 0.007 | 5 | 2.6 |
| 854-2† | 54y | 1.1 × 10⁻⁴ | 6.8 × 10⁻⁴ | <2 × 10⁻⁶ to 0.047 | 7 | 5.0 |
| 59089-1 | 45y | 9.1 × 10⁻⁵ | 1.6 × 10⁻⁴ | <4 × 10⁻⁶ to 0.008 | 10 | 4.9 |
| 60832-1 | 23y | ‡ | 9.4 × 10⁻⁷ | <4 × 10⁻⁶ to 9 × 10⁻⁶ | 35 | 20.1 |
| 59056-1 | 19y | 3.0 × 10⁻⁶ | 1.6 × 10⁻⁵ | <4 × 10⁻⁶ to 0.004 | 20 | 17 |

*Based on a resolution of 192 pieces/testis. †Previously published data (6). ‡Not enough sperm could be collected.
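The last two columns of Table 1 summarize how concentrated the mutants are within a testis. The following Python sketch shows one way such a summary could be computed from per-piece counts. It assumes that "the minimum number of pieces containing 95% of mutants" is found by taking pieces from the highest to the lowest mutant count; the function name `concentration_statistic` and the example counts are hypothetical and are not taken from the paper or from Table S1.

```python
def concentration_statistic(mutants, genomes, coverage=0.95):
    """Return (n_pieces, genome_fraction) for one dissected testis.

    mutants[i] and genomes[i] are the mutant-molecule and genome counts
    of piece i. Pieces are taken from most to fewest mutants until they
    hold at least `coverage` of all mutants (an assumed reading of
    "minimum number of pieces containing 95% of mutants").
    """
    if len(mutants) != len(genomes):
        raise ValueError("mutants and genomes must have the same length")
    total_mutants = sum(mutants)
    total_genomes = sum(genomes)
    if total_mutants <= 0 or total_genomes <= 0:
        raise ValueError("need at least one mutant and one genome")

    # Visit pieces in order of decreasing mutant count.
    order = sorted(range(len(mutants)), key=lambda i: mutants[i], reverse=True)
    target = coverage * total_mutants
    covered = 0
    genomes_used = 0
    n_pieces = 0
    for i in order:
        if covered >= target:
            break
        covered += mutants[i]
        genomes_used += genomes[i]
        n_pieces += 1
    return n_pieces, genomes_used / total_genomes


# Hypothetical testis: 192 pieces of 1,000,000 genomes each (not real data).
GENOMES = [1_000_000] * 192
clustered = [5000, 4000] + [1] * 190   # two "hot" pieces, sparse background
uniform = [10] * 192                   # mutants spread evenly over all pieces

print(concentration_statistic(clustered, GENOMES))
print(concentration_statistic(uniform, GENOMES))
```

In the clustered example, 2 pieces holding about 1% of the genomes account for 95% of the mutants, whereas in the uniform example 183 pieces holding about 95% of the genomes are needed.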

a very small number of "hot" pieces with mutation frequencies 3–4 orders of magnitude greater than most of the remaining pieces (Fig. 1). In every case, 95% of the mutant molecules were found in no more than 12 (average 6.5) of the 192 pieces examined, whereas, according to a random spatial distribution, many more pieces, which together contain 95% of the genomes, would have been expected. In many cases, several pieces with high mutation frequencies appear to form foci adjacent to one another in the same slice or even between slices (Fig. 1).

For the two younger donors, the average C758G mutation frequencies were 2–3 orders of magnitude lower than for the older donors (Table 1). Moreover, as would perhaps be expected with such low frequencies, there is not much variation in the mutation frequencies among the different testis pieces (Fig. 1).

C755G Mutation Distribution in Individual Testes. We next analyzed the frequency of de novo C755G Apert syndrome germ-line mutations in testes 59089-1, 59056-1, and 60832-1 with a site-specific assay similar to the one used to examine the C755G mutation in the testes of donors 374 and 854 (6). The new results are shown in Table 1 and Fig. 1 along with the previously obtained data. The average C755G mutation frequency for testis 59089-1 from the 45-year-old donor (1.6 × 10⁻⁴) is very similar to the data observed for the other older donors [374 and 854 (6)]. Likewise, this testis features a small number of pieces with very high mutation frequencies among a general background of lower mutation frequencies (Fig. 1).

The average testis mutation frequency at the C755G nucleotide site in the 23-year-old donor (60832-1) is 9.4 × 10⁻⁷, 2–3 orders of magnitude lower than for the older donors. For the 19-year-old donor (59056-1), the frequency is 1.6 × 10⁻⁵. This frequency is lower than for any of the older donors; however, it is not orders of magnitude lower. Indeed, it is only a factor of 4 lower than for testis 374-2 (62 years). Further, the testis piece with the highest frequency in the 19-year-old (0.004) has a lower frequency than the comparable piece in any of the older individuals; however, it is only a factor of 1.75 lower than in testis 374-2. For the two younger donors and the two mutation sites, testis 59056-1 and mutation C755G is the only testis–mutation combination to have a piece with a mutation frequency great enough to be colored differently than gray in Fig. 1.

Comparison of Testis to Epididymal Sperm Mutation Frequencies. For 7 of the 12 testis–disease mutation pairs, the ratio of the mutation frequency in the testis to the mutation frequency in the sperm (found in that testis's epididymis) is near one, ranging from 0.29 to 1.76 (see Table 1). For two more pairs, this ratio is higher, at 5.33 and 6.18. In ref. 6, we reported that the relatively high ratio of 6.18 for the C755G mutation in 854-2 could result from a highly viscous material found only in the one epididymis, possibly contaminating that sperm sample with nongerm-line DNA; this explanation is now unlikely because, for the same testis and epididymal sperm sample, the ratio for mutation C758G is less than one (0.44). For testis 374-2 and mutation C758G, the ratio is much greater, at 32.3. This testis–mutation combination is unique in that >95% of the mutants are found in a single testis piece (Table 1, Fig. 1). It seems possible that the sperm in the epididymis results from a less-than-perfect sampling of the testis, perhaps because of pathological effects associated with aging (ref. 16 and refs. therein), leading to local tubule blockage. An underrepresentation of this one testis piece could cause the observed high ratio. Although we could detect sperm from the epididymal washes of testis 60832-1 microscopically, the amount of sperm DNA obtained was insufficient to make any mutation frequency estimates. We cannot account for this observation other than as a reflection of some aspect of this donor's epididymal or testicular function or of the sperm-extraction procedure.

Ratio of C755G to C758G Mutations. Data on the relative frequency of the two common types of Apert syndrome mutations found among affected individuals indicate that about twice as many carry the C755G as the C758G mutation (262 C755G mutations, 128 C758G mutations; reviewed in ref. 17). In our studies, pooling data from the four testes from the older individuals, there were 5,871,596 mutations at the C755G site, whereas 5,843,766 were found at the C758G site. The exact binomial test strongly rejects (P value <2.2 × 10⁻¹⁶) the hypothesis that these data conform to the 2:1 ratio reported for affected individuals. Deviations from the expected ratio could simply result from the small number of individual testes examined. In addition, the data supporting the 2:1 ratio come from children whose normal fathers, on average, are considerably younger than the older testis donors we studied. Although far fewer mutations were counted (see Table S1), the exact binomial test also rejects the 2:1 ratio (P value <2.2 × 10⁻¹⁶) for the two younger donors.

Positional Independence of C758G and C755G Mutation Occurrences. Having data on both the C758G and the C755G mutation frequencies for each individual testis allows us to examine
whether a testis piece that has a high mutation frequency for one mutation will also have a high mutation frequency for the other. We restrict our attention to those pieces with mutation frequencies >0.001 (the colors representing the four highest-frequency categories in Fig. 1) for either mutation. For five of the six testes, there are no pieces with high frequencies for both mutations. For testis 374-1, there is one piece with a frequency above the threshold for both mutations. Testis 374-1 has more pieces with high frequencies than any of the other testes, so to test whether this one overlapping piece is significant, we performed a permutation test (SI Text). This test determined that the one overlapping piece is not significant because, under a random model, 43% of permutations would have one or more overlapping pieces. We conclude there is no correlation between the high-frequency pieces for the two mutations and that, not unexpectedly, the two different mutation events arise independently of one another.

Fig. 1. Distribution of mutants in dissected human testes. Each of the six testes is depicted twice, once in the C758G column and once in the C755G column. The donor/testis identification number sits between the data for each mutation type. Each testis is shown as six slices, each divided into 32 pieces. In every case, the orientation of each testis relative to the head and tail portions of the epididymis is the same: the head portion is on the left (slice 1), the tail on the right (slice 6); the long epididymal axis runs along the upper surface. The color code shows the number of mutant molecules per million genomes. In each column, the testes are arranged by age from oldest (Upper) to youngest (Lower). The data on C755G from donors 374-1, 374-2, and 854-2 have been published (6).

Testing the Mutation Hot-Spot Model. One possible explanation for the incidence of Apert syndrome being so much higher than expected is that the C755G and C758G sites might have a higher-than-average chance of undergoing a base substitution. To test this possibility in our previous article on the C755G mutation (6), we proposed a model for mutation based on what is known about human germ-line development and maturation (1, 15, 18–25) (also see ref. 6 for more references). A key parameter in this model is the mutation rate per cell division. The mutation hot-spot model is simply that these two disease sites have much higher-than-average nucleotide substitution rates per cell division. A complete description of the model is in the previous article (6); below, we outline the main points.

The model has two phases. The first phase, called the growth phase, models the testis from zygote formation to puberty. In this phase, the male germ-line cells divide symmetrically, and the number of such cells increases exponentially. Because of this exponential increase, an early mutation will be shared by many later germ-line cells. This phenomenon is similar to the Luria and Delbrück "mutation jackpot" in bacteria (24). The primordial germ cells migrate to the site of gonad formation and form the seminiferous cords early in embryogenesis (see refs. 24 and 26). Thereafter, the germ cells are expected to remain physically close to their ancestors, so that any early mutation will result in a region of the testis with a high mutation frequency. In the previous article (6), we considered a range of growth-phase cell generations between 27 and 34. For all values, the conclusion was the same. In this article, we have fixed the number of growth-phase generations at 30. The germ-line cells of the growth phase eventually form the adult self-renewing Ap spermatogonia (SrAp).

The second phase, called the adult phase, models the testis from puberty to death. In this phase, the SrAp divide asymmetrically to produce a daughter SrAp (self-renewal) and another daughter cell whose descendants, after a few additional divisions, will produce sperm. During the adult phase, the number of SrAp is assumed to remain constant (see Discussion). Any new mutation in this phase produces only one mutant SrAp lineage (and the descendant mutant meiotic and postmeiotic cells, including sperm), unlike the clusters produced in the growth phase. Thus, mutations arising in the adult phase will be distributed uniformly throughout the testis. An experimental estimate (20) indicates that, in the adult human male, SrAp cells divide every 16 days. In the model, this division rate persists from age 13 until death (see Discussion). Based on a testis donor's age at death, we estimate the number of adult-phase generations.

To test the mutation hot-spot hypothesis, we wrote a computer program to simulate the model. Both the testing procedure and the computer program are the same as in our previous article (6), with one subtle change (see SI Text). We performed a goodness-of-fit test; the statistic we considered was the fraction of the genomes present in the minimum number of testis pieces that contain 95% of the mutant cells in the testis. For the older donors, this statistic was near 5% (Table 1). In contrast, for most of the simulations, this statistic was near 95%. Based on these simulations (see Methods), we can strongly reject (P value <10⁻⁶) the hot-spot model (Table 2). For the one younger donor–mutation combination (testis 59056-1, mutation C755G) for which there was a relatively high mutation frequency, the hot-spot model was also rejected by the goodness-of-fit test (P value <10⁻⁶). For the three other younger donor–mutation combinations, which had much lower mutation frequencies, the hot-spot model was not rejected after a Bonferroni correction for multiple tests (see Table 2). Even for these data, the statistic

Table 2. Model parameters and goodness-of-fit P values

| Testis | Age | Hot-spot model (p = 0): optimal λ, 95% C.I. | Hot-spot model (p = 0): GOF P value | Selection model (p > 0): optimal λ, 95% C.I. | Selection model (p > 0): optimal p, 95% C.I. | Selection model (p > 0): χ² P value |
|---|---|---|---|---|---|---|
| Mutation C758G | | | | | | |
| 374-1 | 62y | 6.4 × 10⁻⁷, (6.0–6.7) × 10⁻⁷ | <10⁻⁶ | 5.2 × 10⁻¹¹, (1.2–13) × 10⁻¹¹ | 0.016, (0.010–0.019) | 0.13 |
| 374-2 | 62y | 9.0 × 10⁻⁸, (8.0–10) × 10⁻⁸ | <10⁻⁶ | 5.6 × 10⁻¹², (1.1–10) × 10⁻¹² | 0.012, (0.009–0.019) | 0.30 |
| 854-2 | 54y | 1.2 × 10⁻⁷, (1.1–1.3) × 10⁻⁷ | <10⁻⁶ | 2.8 × 10⁻¹¹, (0.8–5.2) × 10⁻¹¹ | 0.011, (0.010–0.021) | 0.05 |
| 59089-1 | 45y | 1.5 × 10⁻⁷, (1.4–1.6) × 10⁻⁷ | <10⁻⁶ | 1.4 × 10⁻¹¹, (0.6–6.6) × 10⁻¹¹ | 0.016, (0.013–0.025) | 0.62 |
| 60832-1 | 23y | 2.9 × 10⁻⁹, (1.6–4.3) × 10⁻⁹ | 0.42 | 2.7 × 10⁻⁹, (0.3–4.6) × 10⁻⁹ | 0, (0–0.018) | 0.36 |
| 59056-1 | 19y | 4.1 × 10⁻⁹, (1.2–6.4) × 10⁻⁹ | 0.004 | 4.3 × 10⁻¹⁰, (1.3–40) × 10⁻¹⁰ | 0.025, (0–0.040) | 0.19 |
| Mutation C755G | | | | | | |
| 374-1 | 62y | 3.3 × 10⁻⁷, (3.1–3.5) × 10⁻⁷ | <10⁻⁶ | 2.8 × 10⁻¹¹, (1.0–8.4) × 10⁻¹¹ | 0.011, (0.009–0.018) | 0.14 |
| 374-2 | 62y | 5.9 × 10⁻⁸, (5.6–6.2) × 10⁻⁸ | <10⁻⁶ | 1.2 × 10⁻¹¹, (0.4–3.2) × 10⁻¹¹ | 0.010, (0.008–0.018) | 0.88 |
| 854-2 | 54y | 7.2 × 10⁻⁷, (6.6–7.5) × 10⁻⁷ | <10⁻⁶ | 2.0 × 10⁻¹¹, (0.4–6.0) × 10⁻¹¹ | 0.016, (0.012–0.023) | 0.93 |
| 59089-1 | 45y | 2.1 × 10⁻⁷, (2.0–2.2) × 10⁻⁷ | <10⁻⁶ | 4.8 × 10⁻¹¹, (1.4–11) × 10⁻¹¹ | 0.015, (0.011–0.028) | 0.38 |
| 60832-1 | 23y | 4.0 × 10⁻⁹, (2.0–5.9) × 10⁻⁹ | 0.28 | 4.0 × 10⁻⁹, (0.2–6.6) × 10⁻⁹ | 0, (0–0.018) | 0.49 |
| 59056-1 | 19y | 1.1 × 10⁻⁷, (0.9–1.2) × 10⁻⁷ | <10⁻⁶ | 3.6 × 10⁻¹⁰, (2.3–7.0) × 10⁻¹⁰ | 0.055, (0.040–0.070) | 0.01 |

λ is the mutation rate per cell division, which is less than the mutation frequency (see ref. 6). To correct the P values for multiple tests (Bonferroni), multiply each P value by the number of tests performed (24). For example, 0.004 becomes 0.096 and is not significant; in fact, the only uncorrected P values that remain significant after correction are those <10⁻⁶. p, probability of a symmetric division in the adult phase; GOF, goodness of fit.
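The Bonferroni correction described in the Table 2 footnote can be illustrated with a short Python sketch. The factor of 24 comes from the footnote (Table 2 lists 24 P values). The function name `bonferroni` and the 0.05 significance threshold are assumptions made for illustration; the footnote does not state a threshold. P values reported as <10⁻⁶ are represented here by their upper bound.

```python
N_TESTS = 24   # number of tests stated in the Table 2 footnote
ALPHA = 0.05   # assumed significance threshold (not stated in the footnote)


def bonferroni(p_value, n_tests=N_TESTS):
    """Return the Bonferroni-corrected P value, capped at 1."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    return min(1.0, p_value * n_tests)


# Uncorrected P values taken from Table 2.
examples = [
    ("59056-1, C758G, hot-spot GOF", 0.004),
    ("60832-1, C758G, hot-spot GOF", 0.42),
    ("59056-1, C755G, selection chi-square", 0.01),
    ("374-1, C758G, hot-spot GOF (upper bound)", 1e-6),
]

for label, p in examples:
    corrected = bonferroni(p)
    verdict = "significant" if corrected < ALPHA else "not significant"
    print(f"{label}: {p:g} -> {corrected:.3g} ({verdict})")
```

Under these assumptions, 0.004 becomes 0.096 and 0.01 becomes 0.24 (both not significant), whereas a P value below 10⁻⁶ stays below 2.4 × 10⁻⁵ after correction.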

was substantially <5%, but this was not due to "hot spots," because there were not any; rather, it was because, for these data, most testis pieces had zero observed mutants (Table S1).

Germ-Line Selection. In our previous article (6), we modified the model to incorporate selection. It is known from model organisms that stem cells can switch from an asymmetric to a symmetric division pattern and back again, and that such behavior can depend on factors that are intrinsic and extrinsic to the stem cells (27, 28). The form of selection that we introduced is that, in the adult phase, mutated SrAp cells occasionally divide symmetrically (in the hot-spot model, all adult-phase divisions are asymmetric; in the selection model, nonmutated SrAp still always divide asymmetrically). Because these new SrAp cells are expected to remain near their progenitors, these rare symmetric divisions enable mutation clusters to form and grow locally with time and increase the overall mutation frequency in the testis. Our selection model is based on the model described above, but we add a selection parameter p: at each adult-phase generation, a mutated SrAp divides symmetrically with probability p and divides asymmetrically with probability 1 − p (after a symmetric division, each daughter SrAp reverts to asymmetric divisions until the next rare symmetric division). A similar model was independently proposed by Crow (11).

Unlike the hot-spot model, the selection model qualitatively matches the distribution of mutation frequencies in the data. In simulations, foci of high mutation frequency emerge, and these foci often intersect several adjacent testis pieces. To quantitatively test the selection model, we used the χ² test to compare simulations to the actual testis data (see Methods). After applying the Bonferroni correction for multiple tests, we could not reject the selection model for any of the testis–mutation combinations. Regarding the selection parameter, for the older donors, the inferred p is ≈0.01 (on average, 1 of every 100 divisions is symmetric; Table 2). For the 23-year-old donor (60832-1), this parameter is zero, implying no selection. This is consistent with the very low testis mutation frequency that is responsible for the hot-spot model also not being rejected for this individual. For the 19-year-old donor (59056-1), this selection parameter is greater (see Discussion).

Discussion

We examined the spatial distribution of spontaneous occurrences of the second most-common Apert syndrome mutation (C758G) in pieces dissected from normal human testes. In older individuals, we found some foci where the mutation frequency was 3–4 orders of magnitude greater than in most of the other pieces. In conjunction with our modeling efforts, these data allow us to reject the hot-spot model. As was the case for the C755G mutations in the same gene (this work and ref. 6), the observed high mutation frequency is not simply due to an exceptionally high C to G transversion mutation rate per cell division.

The key insight into the hot-spot model is that for a 50- to 60-year-old man, the ratio of expected mutations in the adult phase to the growth phase is ≈500:1. (For a 19- to 23-year-old, this ratio is closer to 100:1.) In simulations with the mutation rates per cell division set to match the observed mutation frequencies, almost all of the mutations occur in the adult phase. Thus, these simulated mutations are not clustered but are scattered uniformly throughout the testis; in simulations, the "mutation jackpots" rarely occur. Two further observations argue against the mutation hot-spot hypothesis, irrespective of modeling details. First, in our previous article (6), we carried out the same experiment for a C to G transversion mutation at a neutral CpG site (on chromosome 7). We found these mutations were uniformly spread throughout the testis and, unlike the Apert syndrome disease sites, we could not reject our nonselection model (p = 0) for this neutral mutation case (as expected, the overall mutation frequency was much lower than for the Apert sites in that testis). Second, the younger testis donors have substantially lower mutation frequencies than the older donors, and they do not have significant mutation foci (except for C755G in 59056-1). Apparently, the disease sites are different from the sites of neutral mutation, and their mutation clusters are not jackpots in the classical sense but grow in the adult phase.

An alternative to the mutation hot-spot hypothesis suggests that high mutation frequencies can result if premeiotic diploid cells experience a mutation that confers a selective advantage over wild-type premeiotic cells. First, it was noted (5) that this selection hypothesis might explain why ≈99-fold more sporadic Apert syndrome mutations (and the common achondroplasia mutation; see below) arise in males than females (29, 30), whereas neutral germ-line mutations (at many different sites) show only an ≈5-fold male bias (31). Second (12), rare Apert syndrome patients born with two FGFR2 mutations were argued to be much more likely to result from selection than from two independent mutation events in the same germ-line cell. Third, when semen from normal men heterozygous for a single-nucleotide polymorphism tightly linked to the mutation site was
studied for the presence of C755G mutations, the results showed what appeared to be a nonrandom distribution of mutations among the two homologous chromosomes (ref. 4; see, however, ref. 6). A truly nonrandom distribution could be explained if positive selection acted on rare progenitor spermatogonia carrying a newly arisen C755G mutation. Finally, analysis of C755G mutation data from dissected testes of older individuals indicated this site was not a mutation hot spot, but a selection model fit the data (6). We now have shown that the same is true for an independent mutation (C758G) that causes the same disease.

Modifying our model of germ-line development by incorporating a simple selection scheme led to predictions on mutation frequency and the distribution of mutations in the testis, consistent with our data. We model selection by proposing that mutant SrAp cells occasionally divide symmetrically; the inferred rate is ≈1 every 100 divisions on average, or around once every 4 years. These divisions cause mutation clusters to grow with time (as microscopic "tumors"), thereby increasing the overall testis mutation frequency. This growth could also explain the paternal age effect (the observed exponential increase in disease incidence with the age of the father), by a model (4, 6, 11) that differs from the classical explanation (1, 13–15). For the older donors we studied, the inferred mutation rate per cell division for the selection model would produce mutation frequencies similar to the existing data on neutral transversion mutations (9, 10), if the selection parameter p were set to zero (so no selection, all adult-phase divisions asymmetric). This result suggests that the C755G and C758G mutations arise at approximately the frequency expected for neutral transversion mutations.

For the 19-year-old donor (59056) and mutation C755G, the inferred p parameter value is unrealistically high (0.055). If the selection model is simulated with this p parameter, then by the age of the older donors, most of the testis would be mutated (6). For modeling purposes, the difficulty is that, although the mutation frequency for this testis is lower than for the older testes, the observed mutation frequency is still greater than we would expect if we were to insert the parameters inferred from the older donors into the selection model. There is evidence that there are some SrAp asymmetric divisions before age 13 (24), and that the 16-day division rate, which we have assumed is constant from age 13 until death, slows with age (32). If one or both of these modifications are included in a more complicated selection model, there would be relatively more divisions early in an individual's life, and then the model could match the data on the 19-year-old with a lower and therefore more realistic selection parameter p.

We have also considered other modifications to the model. In our previous article (6), we considered replication-independent mutations, but it turned out this addition did not change our conclusions. A new variant, which in principle could match the data and for which the effect of the disease mutations is perhaps less extreme, is to allow all adult SrAp to alternate between symmetric divisions, asymmetric divisions, and sometimes not dividing at all; in this model, the mutant SrAp would divide symmetrically more often than the nonmutants. We have also considered cell death; if all cells are equally likely to die, this affects the parameter values of the models but not the conclusions; if the mutant SrAp are less likely to die, this process would be another form of selection, but because the total number of SrAp is estimated to decrease by only 25% from age 35 to 65 (33), this reduction is not great enough to produce the observed mutation clusters. Likewise, if the mutant SrAp cells were to have a higher asymmetric division rate than the nonmutant cells, this would be yet another form of selection, but the change in division rate necessary to produce the intensity of the observed mutation clusters is unrealistically high.

Is there any evidence for positive selection of mutant premeiotic spermatogonia at other loci? Virtually all new achondroplasia (the most common form of dwarfism) mutations arise at a very high frequency in the male germ line at one nucleotide site (G1138A) in the FGFR3 gene, suggesting that the high frequency may also be explained by selection (5, 11). A recent study on G1138A mutations in sperm from semen also included some testis biopsies. In two individuals >80 years old, the frequencies in two independent biopsies from the same testis were not concordant (34), just as our selection model would predict.

Some insight into how the FGFR2 and FGFR3 mutations contribute to a selective advantage comes from noting that both gene products are receptor tyrosine kinases and can influence downstream members of the signal transduction pathway (for a more detailed discussion, see refs. 35 and 36). Especially relevant may be that some mutations in FGFR2 and FGFR3 (although usually not the specific mutations that cause Apert syndrome or achondroplasia) have been associated with certain cancers (37–39). No published data yet exist on the specific molecular processes that these mutations might influence in providing a selective advantage to human spermatogonia carrying a new mutation.

Considering all of the published work (4–6, 11, 12) and our present results, we suggest that positive selection can be a driving force acting to increase the germ-line mutation frequency in humans above that at which spontaneous nucleotide substitutions arise. Germ-line selection of this kind has a root in theoretical work by Hastings published almost 20 years ago (40–42) on recessive deleterious mutations segregating in animal populations. Experimental literature on germ-line selection in premeiotic diploid cells in animals is very sparse (43). Interestingly, Hastings also realized that alleles conferring a selective advantage in the germ line may be disadvantageous in the adult and might lead to "mitotic drive" systems that increase the mutational load of a population. Both Apert syndrome and achondroplasia may be examples of such a system, and others, including those of medical interest, may also exist (see refs. 11, 13–15, 31). The testis dissection method can be useful in examining this hypothesis at any locus in a species.

Methods

Source of Testes. Testes were obtained from the National Disease Research Interchange (NDRI) with the approval of the Institutional Review Board of the University of Southern California. All samples were frozen within 12 h after death.

Testis and Epididymis Processing. Sperm cells were collected from the epididymal tail and adjacent vas deferens. For testis dissection (Fig. S1), the epididymis is removed and the testis is cut into six approximately equal-size slices at right angles to the testis' long axis. Each slice is divided into 32 approximately equal-size pieces such that piece 1 from slices 1–6 are adjacent to one another and run along the long axis on the dorsal (epididymal) surface, etc. Details of the dissections, DNA isolation, and quantitation are found in ref. 6. Variation in the number of genomes per piece (see Table S1) results from variation in slice and piece size because of testis shape.

C758G Mutation Frequency Assay. We modified the pyrophosphorolysis-activated PCR (PAP) protocol (44). Our assay is almost identical to the C755G mutation assay described in the methods and text S2 of ref. 6. Each C758G reaction contained 20 mM Hepes (pH 7.35); 30 mM KCl; 50 μM Na4PPi; 2 mM MgCl2; 80 μM of each dNTP; 160 nM of each primer; 2 μM Rox; 0.2× SYBR Green I; 0.04 unit/μl Tma31FS DNA polymerase (Roche); and 25,000 copies of testis or epididymal sperm DNA molecules. Research samples of Tma31FS DNA polymerase may be obtained from Thomas W. Myers (Roche Molecular Systems). The C758G-specific PAP primers were: 5′-CCCCACTCCTCCTTTCTTCCCTCTCTCCACCAGAGCGATGdd 3′ and 5′-TTTGCCGGCAGTCCGGCTTGGAGGATGGGCCGGTGAGGCCdd 3′. Cycling conditions were: initial denaturation 2 min, 94°C and 150 cycles of 6 s, 94°C and 40 s, 74°C. For sample PAP data, see figure S1 of ref. 6. The C755G reagents and primers in this study were the same as described in ref. 6 except for the Hepes pH (7.35), the primer concentration

Choi et al. PNAS ͉ July 22, 2008 ͉ vol. 105 ͉ no. 29 ͉ 10147 Downloaded by guest on September 29, 2021 (320 nM), the initial denaturation step (94°C), and the cycling conditions (125 (Table 2). We then performed a goodness-of-fit test; the statistic we used is the cycles of6sat94°C and 40 s at 77°C). minimum number of testis pieces required to contain 95% of the mutants in the testis. We simulated the model many times with the inferred optimal mutation Mutation Counting Strategy. Ten reactions (250,000 total genomes) were used rates, so that for each testis and each mutation, there would be one million to estimate the C758G mutation frequency for every testis piece. If Ͻ5/10 simulations with overall mutation frequency within 5% of the frequency for the reactions were positive, we took the number (after Poisson correction) as an data, and we could then compare the distribution of the statistic in the simula- estimate of the mutation frequency. If five or more reactions were positive, tions to the statistic for the data. then the experiment was repeated by using dilutions until Ͻ10/20 (or in some The selection model has two parameters. We infer the selection parameter cases Ͻ25/40) were positive. Single mutant molecule reactions were distin- p and the mutation rate per cell division ␭ by fitting both the overall testis guished from negative ones by the kinetics of fluorescence increase as a mutation frequency and the minimum number of testis pieces, which contain function of cycle number and PCR product melting profile using quantitative 95% of the mutant genomes (Table 2 and ref. 6). Separately for each testis and PCR. The estimate of the total testis mutation frequency (per million genomes) mutation, we quantitatively tested the selection model by simulating the was the average of the frequencies of the pieces weighted by the number of selection model with the inferred optimal parameters. For the simulations and genomes in those pieces (see Table S1). 
the data corresponding to the older donor’s testes at both mutations and For each experiment, 20 negative controls each contained 25,000 human testis 59056-1 at mutation C755G, we counted the number of testis pieces with blood genomes (Clontech). Twenty positive controls each had added an frequencies in the different color categories in Fig. 1; for both mutations for average of 0.5 or 1 of Apert C758G or C755G DNA (kindly provided by testis 60832-1 and for the C758G mutation for testis 59056-1, because the Mimi Jabs, Mount Sinai School of Medicine, New York). mutation frequencies were so low, we counted the testis pieces with frequen- cies in the following categories: [0–5]ϫ10Ϫ6, [5–10] ϫ10Ϫ6, [10–20] ϫ10Ϫ6, and Ͼ20 ϫ 10Ϫ6. Separately for each testis and mutation, we then performed a ␹2 Quantitative Modeling and Testing. The details of the quantitative model, the test comparing these counts for the simulation and the data. computer code, and the estimation and testing procedures are the same as in our previous article (6), with one subtle difference (see SI Text). The hot-spot model ACKNOWLEDGMENTS. We are grateful to our posthumous anonymous tissue has one free parameter: the mutation rate per cell division (the number of donors. We thank David Gelfand (formerly at Roche Molecular Systems) and growth-phase generations and adult-phase generations have been set indepen- Thomas Myers for supplying Tma31FS and Tina Hu for other assistance. This dent of the mutation data, as described in Results). Separately for each testis and work was supported in part by grants from the National Institute of General for each of the two mutations, we found the value of the mutation rate per cell Medical Sciences (N.A. and P.C.) and the Ellison Medical Research Foundation division, which maximizes the likelihood of the observed mutation frequency (N.A.).

1. Vogel F, Motulsky AG (1997) Human Genetics: Problems and Approaches (Springer, Berlin).
2. Haldane JBS (1935) The rate of spontaneous mutation of a human gene. J Genet 31:317–326.
3. Glaser RL, et al. (2003) The paternal-age effect in Apert syndrome is due, in part, to the increased frequency of mutations in sperm. Am J Hum Genet 73:939–947.
4. Goriely A, McVean GA, Rojmyr M, Ingemarsson B, Wilkie AO (2003) Evidence for selective advantage of pathogenic FGFR2 mutations in the male germ line. Science 301:643–646.
5. Tiemann-Boege I, et al. (2002) The observed human sperm mutation frequency cannot explain the achondroplasia paternal age effect. Proc Natl Acad Sci USA 99:14952–14957.
6. Qin J, et al. (2007) The molecular anatomy of spontaneous germ line mutations in human testes. PLoS Biol 5:e224.
7. Tolarova MM, Harris JA, Ordway DE, Vargervik K (1997) Birth prevalence, mutation rate, sex ratio, parents’ age, and ethnicity in Apert syndrome. Am J Med Genet 72:394–398.
8. Cohen MM, et al. (1992) Birth prevalence study of the Apert syndrome. Am J Med Genet 42:655–659.
9. Nachman MW, Crowell SL (2000) Estimate of the mutation rate per nucleotide in humans. Genetics 156:297–304.
10. Kondrashov AS (2003) Direct estimates of human per nucleotide mutation rates at 20 loci causing Mendelian diseases. Hum Mutat 21:12–27.
11. Crow JF (2006) Age and sex effects on human mutation rates: an old problem with new complexities. J Radiat Res (Tokyo) 47 Suppl B:B75–B82.
12. Goriely A, et al. (2005) Gain-of-function amino acid substitutions drive positive selection of FGFR2 mutations in human spermatogonia. Proc Natl Acad Sci USA 102:6051–6056.
13. Risch N, Reich EW, Wishnick MM, McCarthy JG (1987) Spontaneous mutation and parental age in humans. Am J Hum Genet 41:218–248.
14. Glaser RL, Jabs EW (2004) Dear old dad. Sci Aging Knowledge Environ 2004:re1.
15. Crow JF (2000) The origins, patterns and implications of human spontaneous mutation. Nat Rev Genet 1:40–47.
16. Kuhnert B, Nieschlag E (2004) Reproductive functions of the ageing male. Hum Reprod Update 10:327–339.
17. Muenke M, Wilkie AO (2007) in Syndromes: The Online Metabolic and Molecular Bases of Inherited Disease, www.ommbid.com/OMMBID/the_online_metabolic_bases_of_inherited_disease/b/fulltext/part30/ch245/9.
18. Clermont Y (1966) Spermatogenesis in man. A study of the spermatogonial population. Fertil Steril 17:705–721.
19. Clermont Y (1963) The cycle of the seminiferous epithelium in man. Am J Anat 112:35–51.
20. Heller CG, Clermont Y (1963) Spermatogenesis in man: An estimate of its duration. Science 140:184–186.
21. Drost JB, Lee WR (1995) Biological basis of germline mutation: comparisons of spontaneous germline mutation rates among drosophila, mouse, and human. Environ Mol Mutagen 25:48–64.
22. Johnson L, Varner DD (1988) Effect of daily spermatozoan production but not age on transit time of spermatozoa through the human epididymis. Biol Reprod 39:812–817.
23. Ehmcke J, Wistuba J, Schlatt S (2006) Spermatogonial stem cells: questions, models and perspectives. Hum Reprod Update 12:275–282.
24. Nistal M, Paniagua R (1984) Testicular and Epididymal Pathology (Thieme-Stratton, New York).
25. Zhengwei Y, Wreford NG, Royce P, de Kretser DM, McLachlan RI (1998) Stereological evaluation of human spermatogenesis after suppression by testosterone treatment: heterogeneous pattern of spermatogenic impairment. J Clin Endocrinol Metab 83:1284–1291.
26. Muller J, Skakkebaek NE (1992) The prenatal and postnatal development of the testis. Baillieres Clin Endocrinol Metab 6:251–271.
27. Nagano M, Avarbock MR, Brinster RL (1999) Pattern and kinetics of mouse donor spermatogonial stem cell colonization in recipient testes. Biol Reprod 60:1429–1436.
28. Morrison SJ, Kimble J (2006) Asymmetric and symmetric stem-cell divisions in development and cancer. Nature 441:1068–1074.
29. Wilkin DJ, et al. (1998) Mutations in fibroblast growth-factor receptor 3 in sporadic cases of achondroplasia occur exclusively on the paternally derived chromosome. Am J Hum Genet 63:711–716.
30. Glaser RL, et al. (2000) Paternal origin of FGFR2 mutations in sporadic cases of Crouzon syndrome and Pfeiffer syndrome. Am J Hum Genet 66:768–777.
31. Li WH, Yi S, Makova K (2002) Male-driven evolution. Curr Opin Genet Dev 12:650–656.
32. Kimura M, et al. (2003) Balance of apoptosis and proliferation of germ cells related to spermatogenesis in aged men. J Androl 24:185–191.
33. Nistal M, Codesal J, Paniagua R, Santamaria L (1987) Decrease in the number of human Ap and Ad spermatogonia and in the Ap/Ad ratio with advancing age. New data on the spermatogonial stem cell. J Androl 8:64–68.
34. Dakouane Giudicelli M, et al. (2008) Increased achondroplasia mutation frequency with advanced age and evidence for G1138A mosaicism in human testis biopsies. Fertil Steril 89:1651–1656.
35. Thisse B, Thisse C (2005) Functions and regulations of fibroblast growth factor signaling during embryonic development. Dev Biol 287:390–402.
36. Eswarakumar VP, Lax I, Schlessinger J (2005) Cellular signaling by fibroblast growth factor receptors. Cytokine Growth Factor Rev 16:139–149.
37. Bernard-Pierrot I, et al. (2006) Oncogenic properties of the mutated forms of fibroblast growth factor receptor 3b. Carcinogenesis 27:740–747.
38. Hansen RM, Goriely A, Wall SA, Roberts IS, Wilkie AO (2005) Fibroblast growth factor receptor 2, gain-of-function mutations, and tumourigenesis: investigating a potential link. J Pathol 207:27–31.
39. Easton DF, et al. (2007) Genome-wide association study identifies novel breast cancer susceptibility loci. Nature 447:1087–1093.
40. Hastings IM (1989) Potential germline competition in animals and its evolutionary implications. Genetics 123:191–197.
41. Otto SP, Hastings IM (1998) Mutation and selection within the individual. Genetica 102–103:507–524.
42. Hastings IM (1991) Germline selection: population genetic aspects of the sexual/asexual life cycle. Genetics 129:1167–1176.
43. Extavour C, Garcia-Bellido A (2001) Germ cell selection in genetic mosaics in Drosophila melanogaster. Proc Natl Acad Sci USA 98:11341–11346.
44. Liu Q, Sommer SS (2004) Detection of extremely rare alleles by bidirectional pyrophosphorolysis-activated polymerization allele-specific amplification (Bi-PAP-A): Measurement of mutation load in mammalian tissues. BioTechniques 36:156–166.

10148 | www.pnas.org/cgi/doi/10.1073/pnas.0801267105 Choi et al.