Copyright © 1990 by the Genetics Society of America

Mathematical Models of Gene Amplification With Applications to Cellular Drug Resistance and Tumorigenicity

Marek Kimmel*,1 and David E. Axelrod†

*Investigative Cytology Laboratory, Memorial Sloan-Kettering Cancer Center, New York, New York 10021, and †Waksman Institute, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08855-0759

Manuscript received July 31, 1989
Accepted for publication March 12, 1990

ABSTRACT

An increased number of copies of specific genes may offer an advantage to cells when they grow in restrictive conditions such as in the presence of toxic drugs, or in a tumor. Three mathematical models of gene amplification and deamplification are proposed to describe the kinetics of unstable phenotypes of cells with amplified genes. The models differ in details but all assume probabilistic mechanisms of increase and decrease in gene copy number per cell (gene amplification/deamplification). Analysis of the models indicates that a stable distribution of numbers of copies of genes per cell, observed experimentally, exists only if the probability of deamplification exceeds the probability of amplification. The models are fitted to published data on the loss of methotrexate resistance in cultured cell lines, due to the loss of amplified dihydrofolate reductase gene. For two mouse cell lines unstably resistant to methotrexate, the probabilities of amplification and deamplification of the dihydrofolate reductase gene on double minutes are estimated to be approximately 2% and 10%, respectively. These probabilities are much higher than widely presumed. The models explain the gradual disappearance of the resistant phenotype when selective pressure is withdrawn, by postulating that the rate of deamplification exceeds the rate of amplification. Thus it is not necessary to invoke a growth advantage of nonresistant cells, which has been the standard explanation. For another analogous process, the loss of double minute chromosomes containing the myc oncogene from SEWA tumor cells, the growth advantage model does seem to be superior to the amplification and deamplification model. In a more theoretical section of the paper, it is demonstrated that gene amplification/deamplification can result in reduction to homozygosity, such as is observed in some tumors. Other applications are discussed.

1 Present address: Department of Statistics, Rice University, Houston, Texas 77251.

THE amount of DNA per cell remains constant from one cell generation to another because during each cell cycle the entire content of DNA is duplicated and then at each mitotic cell division the DNA is evenly apportioned to two daughter cells. However, recent experimental evidence shows that for a fraction of DNA, the number of gene copies per cell may not be constant. We are concerned with the distribution of those genes which exist in variable numbers of copies.

An increase in gene copy number is referred to as gene amplification. Gene amplification can be enhanced by conditions that interfere with DNA synthesis (BROWN, TLSTY and SCHIMKE 1983; TLSTY, BROWN and SCHIMKE 1984), and is increased in some mutant (GIULOTTO, KNIGHTS and STARK 1987) and tumor cells (ALITALO and SCHWAB 1986). Increased numbers of genes may produce an increased amount of gene products, and in tumor cells confer resistance to chemotherapeutic drugs (SCHIMKE 1984). Amplification of oncogenes has been observed in many human tumor cells and may confer a growth advantage on cells which overproduce the oncogene products.

The inheritance of cellular phenotypes conferred by gene amplification differs from that conferred by classical genetic mutations in several ways. The rate at which cells with new phenotypes appear is very rapid (RATH, TLSTY and SCHIMKE 1984), the phenotypes are unstable and rapidly reversible, subpopulations with a broad range of phenotypes can coexist, and each subpopulation can regenerate the entire range of subpopulations (KAUFMAN, BROWN and SCHIMKE 1981).

In this communication we present mathematical models of gene amplification that may help to understand the inheritance of cellular phenotypes associated with variation in gene copy number. We focus on the kinetics of phenomena rather than on molecular mechanisms. All the models assume that at each cell generation, there is a probabilistic mechanism of increasing or decreasing the number of gene copies per cell. We analyze the behavior of these models in order

Genetics 125: 633-644 (July, 1990)

TABLE 1

Comparison of the essential features of the models of gene amplification

| Feature | Independent amplification, model 1 | Geometric amplification, model 2 | Joint amplification, model 3 |
|---|---|---|---|
| Replication of gene copies | Each double minute replicated independently w.p. $p$; not replicated w.p. $1 - p$ | Unspecified | All copies replicated w.p. $\beta$, or not replicated w.p. $1 - \beta$ |
| Division of gene copies between daughter cells | Each double minute randomly assigned to one of the daughters | Unspecified | Copies split equally between daughter cells w.p. $\delta$, or all copies to a randomly selected daughter w.p. $1 - \delta$ |
| Resulting stochastic process | Galton-Watson branching process | Random walk with 0 as absorbing boundary | Random walk with 0 as absorbing boundary and additional transition to 0 |
| Probabilities of amplification, deamplification and gene loss | $b$, $d$, 0 | $b$, $d$, 0 | $b$, $d$, $c$ |

Abbreviation: w.p., with probability.

to determine under what conditions a population of cells will have a stable distribution of gene copy numbers, as is observed experimentally. We then apply the models to published experimental data in order to explain the loss of the dihydrofolate reductase gene, and the myc oncogene of SEWA mouse tumor cells. We also consider more theoretical issues, among them the reduction to homozygosity observed in some tumors.

MODELS OF GENE AMPLIFICATION AND DEAMPLIFICATION

In this section we present three probabilistic models of amplification and deamplification of genes on extrachromosomal elements. Models 1 and 3 are designed to represent two extremely different possibilities, one of which is that each of the elements amplifies or deamplifies independently of the rest, and the other that they amplify or deamplify jointly. Predictions of these idealized models will be compared to the data, in order to determine which of them is more likely to be true.

Model 2 represents the simplest conceivable alternative of amplification/deamplification, with virtually no structural hypotheses.

In all three models we consider the stochastic process $\{X_n, n \ge 0\}$ of the number of gene copies in the $n$-th generation of the randomly selected cell lineage. In models 2 and 3, $X_n$ may assume only values 0, 1, 2, 4, 8, 16, .... Therefore, we will also employ the process $\{x_n, n \ge 0\}$, with reduced state space, defined formally by

$$x_n = \begin{cases} 0, & X_n = 0, \\ \log_2 X_n + 1, & X_n > 0. \end{cases} \tag{1}$$

Process $\{x_n\}$ assumes all nonnegative integer values.

In the following paragraphs we give a detailed description of our models. Essentials are collected in Table 1.

Model 1, independent amplification: It is assumed that all the extrachromosomal elements containing the genes of interest are independent of each other. During the cell's S-phase, each element is replicated once with probability (w.p.) equal to $p$, or it is not replicated with probability $1 - p$. Then, at the time of cell division, each element is assigned randomly to one of the daughter cells. In a lineage of cells a randomly selected daughter will inherit, for each extrachromosomal element present in the mother cell, two elements with probability $b = p/4$, one element with probability $1 - b - d = 1/2$ and no elements with probability $d = (2 - p)/4$. Thus, in accordance with the assumed independence, the stochastic process $\{X_n, n \ge 0\}$ is the Galton-Watson process with progeny generating function $f(s) = bs^2 + (1 - b - d)s + d$.

In a wider sense, model 1 may represent any mechanism by which extrachromosomal genes are amplified (w.p. $b$) or deamplified (w.p. $d$) independently of each other.

Model 2, geometric amplification: In this model no particular mechanism of gene amplification/deamplification is assumed. It is only postulated that from one generation to another the number of gene copies on extrachromosomal elements may double with probability $b$, stay the same with probability $1 - b - d$ or halve with probability $d$. Therefore process $\{x_n, n \ge 0\}$ is a denumerable Markov chain with the following infinite transition Matrix I:

$$\begin{pmatrix} 1 & 0 & 0 & 0 & \cdots \\ d & 1-b-d & b & 0 & \cdots \\ 0 & d & 1-b-d & b & \cdots \\ \vdots & & \ddots & \ddots & \ddots \end{pmatrix}$$

MATRIX I

It may be classified as a random walk with an absorbing boundary at 0. Assumptions of this model were first introduced by PETERSON (1983, 1984), in an informal way, in reference to chromosomal genes.

Model 3, joint amplification: It is assumed that all the extrachromosomal elements are replicated, or not replicated, jointly. The probability of replication is $\beta < 1$. At division, the elements are either divided equally between daughter cells, with probability $\delta < 1$, or assigned to one randomly chosen daughter cell, with probability $1 - \delta$. In a lineage of cells, a randomly selected daughter will inherit twice her mother's number of extrachromosomal elements with probability $b = \beta(1 - \delta)/2$, exactly this number with probability $1 - b - d - c = (1 - \beta)(1 - \delta)/2 + \beta\delta$, half of her mother's number with probability $d = (1 - \beta)\delta$, and no extrachromosomal elements with probability $c = (1 - \delta)/2$.

The process $\{x_n, n \ge 0\}$ is now a denumerable Markov chain with the infinite transition Matrix II:

$$\begin{pmatrix} 1 & 0 & 0 & 0 & 0 & \cdots \\ d+c & 1-b-d-c & b & 0 & 0 & \cdots \\ c & d & 1-b-d-c & b & 0 & \cdots \\ c & 0 & d & 1-b-d-c & b & \cdots \\ \vdots & & & \ddots & \ddots & \ddots \end{pmatrix}$$

MATRIX II

Process $\{x_n, n \ge 0\}$ can be classified as an asymmetric random walk with zero as absorbing boundary and additional transition to zero. It is usually referred to as a random walk with disasters.

Asymptotic properties of the models: The three models behave asymptotically in a related manner. First of all, they display the phenomenon of criticality. Their behavior differs depending on the relation between probabilities $b$ and $d$. We call the models subcritical, critical and supercritical, if $b < d$, $b = d$ and $b > d$, respectively.

The second common feature is the essentially identical behavior in the subcritical case $b < d$ (see Table 2). The probability $P\{X_n = 0\}$ of falling into the absorbing state (losing all gene copies) tends asymptotically to 1 with geometric rate, as $n \to \infty$. The conditional probabilities $P\{X_n = i \mid X_n \ne 0\}$ of maintaining $i$ gene copies, provided the gene is not lost, tend to limiting values. In mathematical terms:

$$1 - P\{x_n = 0\} \sim \begin{cases} K\alpha^n, & \text{model 1}, \\ K\alpha^n / n^{3/2}, & \text{models 2 and 3}, \end{cases} \tag{2}$$

$$P\{x_n = i \mid X_n \ne 0\} \to c_i, \quad i \ge 1, \tag{3}$$

(as $n \to \infty$), where rates $\alpha$, the proper probability distributions $\{c_i\}$, and constants $K$ are defined in terms of parameters $b$, $d$ and $c$ (different for each model; see Table 2). Mathematical details are deferred to the APPENDIX.

Thus, in the subcritical case, there exist limit distributions of gene copy number, conditional on nonextinction of the gene. The expected value of such a distribution is denoted by $E$,

$$E = \sum_i i c_i. \tag{4}$$

Convergence to the limit distribution, for model 2, is illustrated by simulations in Figure 1.

In the critical and supercritical cases, the asymptotic behavior is different and generally a proper asymptotic conditional distribution does not exist. The properties in the supercritical and critical cases are as follows:

$b > d$. Models 1 and 2: $P\{x_n = 0\}$ tends to a limit smaller than 1 and $c_i = 0$ for all $i$. Model 3: $P\{X_n = 0\} \to 1$ and $c_i = 0$ for all $i$.

$b = d$. $P\{X_n = 0\} \to 1$ and $c_i = 0$ for all $i$.

In summary, a proper asymptotic distribution of gene copy number, conditional on nonextinction of the gene, exists only in the subcritical case, $b < d$.

LOSS OF CELL RESISTANCE TO METHOTREXATE

Genetic mechanism of methotrexate resistance: Amplification of genes coding for the enzyme dihydrofolate reductase (DHFR) has been associated with cellular resistance to growth in the presence of the drug methotrexate (MTX) (cf. a review by SCHIMKE 1988, where a number of original sources are quoted).

FIGURE 1. Numerical example of convergence of the distribution $\{c_i^n\}$, with $n$ increasing, to their asymptotic values (model 2). Parameters: $b = 0.016$, $d = 0.088$; $n = 10$, solid squares; $n = 20$, open squares; $n = 300$, solid triangles. The last distribution is very close to the limit. [Horizontal axis: No. of gene copies (1, 2, 4, 8, 16, 32).]

A resistant population with an increased number of DHFR gene copies per cell can be obtained after a sensitive population is grown in progressively increasing concentrations of the drug. The increased number of copies may be associated with chromosomal alterations called homogeneously staining regions or with extrachromosomal elements called double minute chromosomes.

In populations of cells with double minute chromosomes, both the increased drug resistance and the increase in number of gene copies are reversible. The classical experiment confirming this includes transferring the resistant cell line into drug-free medium (cf. KAUFMAN, BROWN and SCHIMKE 1979, 1981; BROWN, BEVERLEY and SCHIMKE 1981). When these populations of cells are grown in the absence of the drug, they gradually lose resistance to the drug, by losing extra gene copies. The increase or reduction in the numbers of gene copies can be detected by molecular hybridization, cytological observation of numbers of double minutes or fluorescent staining of gene products.

The population distribution of numbers of copies per cell can be estimated by flow cytometry after staining gene products. In the experiments described, two features of these distributions are notable. First, as expected, the proportions of cells with amplified genes decrease with time. Second, and perhaps less obvious, the shapes of the distributions of the gene copy number within the cell subpopulations with amplified gene seem to remain stable during the loss of resistance. This stable distribution is evident, for instance, in Figure 1 in BROWN, BEVERLEY and SCHIMKE (1981). Thus, we do not observe the distribution of cells with amplified gene shifting to the left. It stays in place; only the area under the distribution progressively decreases (as in our Figure 2).

Our models are consistent with the kind of distribution of gene copies described above, and with the dynamics of loss and gain of gene copies [compare our Figure 2 to Figure 1 of BROWN, BEVERLEY and SCHIMKE (1981)]. In the next section, we show how experimental data on average gene copy number can be used to estimate the probabilities of gene amplification and deamplification.

Mathematical modeling: Let us assume that at the time MTX is removed from the culture media ($n = 0$) there exists in the population a stable distribution of cells with different numbers of DHFR gene copies, located on the double minute chromosomes or other extrachromosomal elements. These cells are MTX resistant. There also exists a fraction of cells with no DHFR genes on double minutes. Whether these cells are strictly MTX-sensitive or partly MTX-resistant is immaterial for our analysis. They will be referred to as sensitive cells.

It seems justified to consider a single double minute as the smallest indivisible unit in the amplification/deamplification process, although according to BROWN, BEVERLEY and SCHIMKE (1981), there are three to four copies of the DHFR gene on each double minute.

The predictions of the models, for the DHFR gene dynamics $n$ cell generations after MTX removal ($n >$

FIGURE 2. Schematic representation of the model prediction of the loss of MTX-resistant cells. Left, equilibrium distribution after cells have been cultured in selective conditions for many cell generations. Right, the distribution a number of generations after cells have been removed from selective conditions. Note that the proportion of cells with no genes increases, but the shape of the distribution of cells with the gene amplified does not change. This figure is very similar to Figure 1 of BROWN, BEVERLEY and SCHIMKE (1981). [Panels: Generation $n_1$; Generation $n_2 > n_1$. Vertical axis: Frequency. Horizontal axes: Number of Gene Copies (0, 1, 2, 4, 8).]
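The convergence shown in Figure 1 can be explored with a short numerical sketch. The code below iterates the model 2 chain (Matrix I), keeps only the distribution conditional on the gene not being lost, and compares it with the candidate limit $c_i = (1-\rho)^2\, i\, \rho^{i-1}$, $\rho = \sqrt{b/d}$. Several things here are assumptions rather than statements from the paper: the closed form for $c_i$ is the standard quasi-stationary distribution of such a random walk (it reproduces the model 2 expression for $E$ in Table 2), the lineage is assumed to start with a single gene copy, the state space is truncated at 60 states, and all function names are hypothetical.

```python
import math

def step_conditional(cond, b, d):
    """Advance the distribution of x_n given x_n != 0 by one generation (model 2).

    cond[k] is the conditional probability of state k + 1 (i.e. 2**k gene copies).
    Mass moving from state 1 to the absorbing state 0 is dropped, and the
    remaining vector is renormalized.
    """
    m = len(cond)
    new = [0.0] * m
    for k, mass in enumerate(cond):
        if k > 0:
            new[k - 1] += d * mass            # halving
        if k < m - 1:
            new[k + 1] += b * mass            # doubling
            new[k] += (1.0 - b - d) * mass    # unchanged
        else:
            new[k] += (1.0 - d) * mass        # truncation: no doubling from top state
    total = sum(new)
    return [p / total for p in new]

def conditional_distribution(b, d, n, max_state=60):
    """Conditional distribution after n generations, assuming a start at one copy."""
    cond = [0.0] * max_state
    cond[0] = 1.0
    for _ in range(n):
        cond = step_conditional(cond, b, d)
    return cond

def limit_distribution(b, d, max_state=60):
    """Assumed closed form c_i = (1 - rho)^2 * i * rho^(i - 1), rho = sqrt(b / d)."""
    rho = math.sqrt(b / d)
    return [(1.0 - rho) ** 2 * i * rho ** (i - 1) for i in range(1, max_state + 1)]

def max_error(b, d, n):
    """Largest absolute gap between the n-generation and the limit distribution."""
    cond = conditional_distribution(b, d, n)
    lim = limit_distribution(b, d)
    return max(abs(p - q) for p, q in zip(cond, lim))

if __name__ == "__main__":
    for n in (10, 300, 3000):
        print(n, round(max_error(0.016, 0.088, n), 4))
```

With the Figure 1 parameters the gap shrinks as $n$ grows, but only slowly; this is consistent with the polynomial factor $n^{-3/2}$ in Equation 2 and depends on the assumed starting state.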

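TABLE 2

Comparison of limit properties of the models of gene amplification

| Property | Independent amplification, model 1 | Geometric amplification, model 2 | Joint amplification, model 3 |
|---|---|---|---|
| Asymptotic rate of gene loss | $\alpha^n$ | $\alpha^n / n^{3/2}$ | $\alpha^n / n^{3/2}$ |
| $\alpha$ | $1 + b - d$ | $1 - (\sqrt{d} - \sqrt{b})^2$ | $1 - c - (\sqrt{d} - \sqrt{b})^2$ |
| $E$ | $V'(1)$, where $1 - V[f(s)] = \alpha[1 - V(s)]$ | $[(\sqrt{d} - \sqrt{b})/(\sqrt{d} - 2\sqrt{b})]^2$ | $[(\sqrt{d} - \sqrt{b})/(\sqrt{d} - 2\sqrt{b})]^2$ |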
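As a quick way to work with the closed-form entries of Table 2, the following sketch evaluates $\alpha$ for the three models and $E$ for models 2 and 3. The function names are hypothetical, and the guard $\sqrt{d} > 2\sqrt{b}$ is an added assumption: it is the condition under which the expression for $E$ is finite and positive. Model 1's $E$ requires solving the functional equation for $V$ and is not attempted here.

```python
import math

def alpha_model1(b, d):
    """Rate of gene loss for model 1 (Table 2): alpha = 1 + b - d."""
    return 1.0 + b - d

def alpha_model2(b, d):
    """Rate of gene loss for model 2 (Table 2)."""
    return 1.0 - (math.sqrt(d) - math.sqrt(b)) ** 2

def alpha_model3(b, d, c):
    """Rate of gene loss for model 3 (Table 2); c is the probability of total loss."""
    return 1.0 - c - (math.sqrt(d) - math.sqrt(b)) ** 2

def mean_copies_models_2_3(b, d):
    """Limit mean copy number E for models 2 and 3 (Table 2)."""
    root_b, root_d = math.sqrt(b), math.sqrt(d)
    if root_d <= 2.0 * root_b:
        # Assumption: outside this range the Table 2 expression is not meaningful.
        raise ValueError("E is finite only if sqrt(d) > 2 * sqrt(b)")
    return ((root_d - root_b) / (root_d - 2.0 * root_b)) ** 2

if __name__ == "__main__":
    # Illustrative inputs: rounded estimates of the kind reported in Table 3.
    print(math.log(alpha_model2(0.019, 0.095)))
    print(mean_copies_models_2_3(0.019, 0.095))
```

Because $E$ depends on the small difference $\sqrt{d} - 2\sqrt{b}$, it is sensitive to rounding of $b$ and $d$, so rounded inputs reproduce a reported $E$ only approximately.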

0), are as follows (cf. Figure 2): (i) The total frequency of cells with at least one double minute chromosome decreases as $\alpha^n$ (model 1), or as $\alpha^n / n^{3/2}$ (models 2 and 3). (ii) The shape of the distribution of cells with at least one double minute chromosome does not change. The process described above can take place only if the probability of deamplification of the DHFR gene exceeds its probability of amplification ($b < d$). Let us note that the loss of resistant cells after MTX removal is, in our models, a natural consequence of this assumption. In particular, it is not necessary to postulate a growth advantage for the sensitive cells. This point will be discussed later.

To characterize the state of dynamic equilibrium in more detail, we will carry out the computations for models 2 and 3. The final result is valid also for model 1. Let us consider the $n$th generation of cells without MTX. Suppose that before $n = 0$, the moment of MTX removal, the cells have been cultured with MTX for $N$ generations. Based on Theorem 2 or 3, the total expected number of resistant cells, $R(n)$, is asymptotically equal to

$$R(n) \sim K\, 2^{N+n} \alpha^{N+n} / (N+n)^{3/2}. \tag{5}$$

Since $N$ is supposed to be large compared to $n$, this can be rewritten as

$$R(n) = (2\alpha)^n R(0). \tag{6}$$

Analogously, the total expected number of sensitive cells, $S(n)$, can be expressed as

$$S(n) = [S(0) + R(0)(1 - \alpha^n)]\, 2^n. \tag{7}$$

If $r(n)$ is defined as the fraction of resistant cells in the population, at generation $n$, then Equations 6 and 7 yield

$$r(n)/r(0) = \alpha^n, \tag{8}$$

where $\alpha$ is defined differently in each model (Table 2).

One way to characterize the equilibrium distribution is to compute, based on the Theorems in the APPENDIX, the expected (mean) number of gene copies in the resistant subpopulation, in the limit. The expressions for $E$ are presented in Table 2.

Remark: The information about the precise shape of the limit distribution is difficult to recover. This is because the observable quantity is the distribution of gene products and not of the gene itself. It has to be assumed that the amount of molecules (or other discrete units) of gene product corresponding to a single gene copy is also a random variable. The resulting distribution may generally have quite a different shape than the distribution of the gene copy number. Analogous effects changing the shape of the observed equilibrium distribution may be caused by inaccuracies of the measurement techniques, lack of adequate cell separation, etc. The only invariant of such distortions is (up to a proportionality factor) the mean, or expected, value of the distribution. This is the reason why it is employed here.

Values of $\alpha$ and $E$ may be obtained from the analysis of experimental data: (i) $\ln(\alpha)$ is the slope of the straight line least squares fit of the logarithmic transformation of Equation 8 to the observed decreasing fraction of resistant cells plotted versus number of cell generations (doublings) after MTX removal. (ii) $E$ is the average fluorescence specifically coupled to the DHFR gene, computed from the flow cytometric distribution and renormed based on considerations relating this fluorescence to the number of gene copies.

Then, expressions defining $\alpha$ and $E$ are solved for parameters of the models (numerically, in the case of model 1). For uniformity, all final results are expressed in terms of probabilities $b$ and $d$ for models 1 and 2, and $b$, $d$ and $c$ for model 3.

This means that the probabilities of gene amplification ($b$), deamplification ($d$), and loss ($c$) can be calculated from (1) observed data of the average gene copy number ($E$) among cells with amplified gene grown in methotrexate and (2) the rate of decrease ($\alpha$) of the fraction of cells with amplified gene per generation of cells grown in the absence of MTX.

Probabilities of gene amplification and deamplification from MTX data: We present an analysis of some published data on the loss of cell resistance to MTX. To be of use in our computations, the data should include two kinds of information: (1) mean contents, per cell, of the DHFR gene in the resistant subpopulation, and (2) the fraction of the cell population that remains MTX-resistant, at various times after removal of MTX. We selected two sources which provide satisfactory quality and quantity of data.

In the first reference, BROWN, BEVERLEY and SCHIMKE (1981) investigated the loss of overproduction of the DHFR enzyme in murine 3T6 cells. The cells were grown in 50 µM MTX for a long time, and then transferred to the MTX-free medium. At different times they were labeled with the fluorescein derivative of MTX and analyzed by a fluorescence-activated cell sorter. The results are summarized in Figure 1 of the reference quoted. After $n$ = 0, 17, 34, and 47 cell generations the recorded fractions of resistant cells were $r(0) = 0.91$, $r(17) = 0.71$, $r(34) = 0.46$, and $r(47) = 0.22$. This gives $r(17)/r(0) = 0.780$, $r(34)/r(0) = 0.505$, and $r(47)/r(0) = 0.242$. Logarithms of these values are plotted as solid symbols in Figure 4, together with the straight line, $\ln[r(n)/r(0)] = n \ln(\alpha)$, fitted by the least squares method [$\ln(\alpha) = -0.0292$, as in Table 3].

The mean level of DHFR specific fluorescence of the subpopulation of resistant cells, in the experiment described above, had been essentially unchanged for $n$ = 0, 17, 34, and 47, as indicated in Figure 1 of BROWN, BEVERLEY and SCHIMKE (1981). From their Figure 4 and other data found in their paper, we determined that this mean level corresponded to approximately $E = 30$ double minute chromosomes per cell. Resulting values of parameters are presented in Table 3.

The second set of data is taken from KAUFMAN, BROWN and SCHIMKE (1981), who investigated the mouse sarcoma S-180 cells, cell line R1A, exhibiting unstable resistance to MTX. The cells, either studied for more than 3 years previously, or frozen for 2 years before the experiment, were cultured without MTX for different times. Our Figure 4 depicts as open symbols the values of $\ln[r(n)/r(0)]$ observed for $n$ between 0 and 100, originally in Figure 1 of KAUFMAN, BROWN and SCHIMKE (1981).

FIGURE 3. Gene amplification and deamplification in a representative small cell pedigree of cells grown under nonselective conditions (based on model 2). Squares represent cells with the number of gene copies per cell indicated across the top row. Open squares represent cells with no gene copies, and solid squares represent cells with one or more gene copies. Each cell with at least one gene copy can give rise to progeny cells that have double the number of gene copies, with probability $b$, half that number, with probability $d$, or the same number, with probability $1 - b - d$. The histogram at the bottom shows the resulting distribution of gene copies per cell in the fourth generation. [Axes: Gene Copies per Cell (0, 1, 2, 4, 8); Cell Generation (0 to 4); Distribution of Cells with Given Number of Gene Copies.]

TABLE 3

Estimates of the parameters of the models based on data on the dihydrofolate reductase gene

| Cell line / parameter | Independent amplification, model 1 | Geometric amplification, model 2 | Joint amplification, model 3 |
|---|---|---|---|
| Murine 3T6 cells (a) | | | |
| $\ln(\alpha)$ | -0.0292 | -0.0292 | -0.0292 |
| $E$ | 30 | 30 | 30 |
| $b$ | ** | 0.019 | 0.011 |
| $d$ | ** | 0.095 | 0.056 |
| $c$ | * | * | 0.012 |
| $d/b$ | ** | 5 | 5 |
| Mouse sarcoma S-180, line R1A (b) | | | |
| $\ln(\alpha)$ | -0.0242 | -0.0242 | -0.0242 |
| $E$ | 22.5 | 22.5 | 22.5 |
| $b$ | 0.471 | 0.015 | 0.009 |
| $d$ | 0.495 | 0.077 | 0.046 |
| $c$ | * | * | 0.009 |
| $d/b$ | 1.05 | 5.1 | 5.1 |

(a) Source of data: BROWN, BEVERLEY and SCHIMKE (1981).
(b) Source of data: KAUFMAN, BROWN and SCHIMKE (1981).
Notation: $\ln(\alpha)$, the slope of the straight line approximation of loss of resistance data, cf. Figure 4; $E$, the average (expected) value of number of double minutes per cell, from flow cytometric data; $b$, $d$ and $c$, probabilities of gene amplification and deamplification and of total loss of gene. * = parameter irrelevant, ** = not estimable.

FIGURE 4. Decrease in the fraction of cells resistant to methotrexate. Cells were grown for very long periods of time in medium containing MTX and then moved to MTX-free medium. At various times after the moment of removing MTX, the fraction of cells resistant to MTX (i.e., cells with the amplified gene) was determined. Plot of experimental data from BROWN, BEVERLEY and SCHIMKE (1981) (solid triangles) and KAUFMAN, BROWN and SCHIMKE (1981) (open squares), in semilogarithmic coordinates, fitted by the model equation $\ln[r(n)/r(0)] = n \ln(\alpha)$, where $r(n)/r(0)$ is the fraction of cells with amplified gene, and $n$ is the number of cell generations after MTX removal.
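The two estimation steps described above (a least squares slope for $\ln(\alpha)$, then solving the Table 2 expressions for the parameters) can be sketched for model 2 as follows. This is an illustration under stated assumptions, not the authors' procedure: the paper does not say whether the fitted line was forced through the origin, so the code uses an ordinary least squares line with an intercept over the four points including $n = 0$; the inversion formulas $\rho = (\sqrt{E} - 1)/(2\sqrt{E} - 1)$, $d = (1 - \alpha)/(1 - \rho)^2$, $b = \rho^2 d$ are derived here from the model 2 column of Table 2; and the function names are hypothetical.

```python
import math

def ols_slope(xs, ys):
    """Slope of the ordinary least squares line (with intercept) through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def model2_parameters(ln_alpha, mean_copies):
    """Solve the model 2 expressions of Table 2 for (b, d), given ln(alpha) and E.

    With rho = sqrt(b / d):  E = ((1 - rho) / (1 - 2 * rho))**2  and
    1 - alpha = d * (1 - rho)**2.
    """
    root_e = math.sqrt(mean_copies)
    rho = (root_e - 1.0) / (2.0 * root_e - 1.0)
    d = (1.0 - math.exp(ln_alpha)) / (1.0 - rho) ** 2
    b = rho ** 2 * d
    return b, d

# Fractions of resistant cells quoted in the text from BROWN, BEVERLEY and SCHIMKE (1981).
generations = [0, 17, 34, 47]
fractions = [0.91, 0.71, 0.46, 0.22]
log_ratios = [math.log(r / fractions[0]) for r in fractions]

slope = ols_slope(generations, log_ratios)
b_3t6, d_3t6 = model2_parameters(-0.0292, 30.0)
b_s180, d_s180 = model2_parameters(-0.0242, 22.5)

if __name__ == "__main__":
    print(round(slope, 4))
    print(round(b_3t6, 3), round(d_3t6, 3))
    print(round(b_s180, 3), round(d_s180, 3))
```

Under these assumptions the slope and the $(b, d)$ pairs come out close to the model 2 column of Table 3. Model 3 needs one further relation among $b$, $d$ and $c$ (through $\beta$ and $\delta$), and model 1 needs a numerical solution, so neither is attempted in this sketch.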
The fitted value of $\ln(\alpha)$ is equal to -0.0242. From Table 1 of KAUFMAN, BROWN and SCHIMKE (1981) we determined $E = 22.5$ double minute chromosomes per cell for the unstable R1A line. Results are collected in Table 3.

These two examples make possible a comparison of the three models.

Model 2 (geometric amplification) is the easiest to discuss. It yields the frequency of a gene amplification event of the order of 2 per 100 cell divisions, and the frequency of a deamplification event approximately 5 times higher.

Model 3 (joint amplification) yields similar results, but the estimates of $b$ and $d$ are lower because of the presence of the additional transition to zero (probability $c$).

Model 1 (independent amplification) requires more discussion. It does not fit the data in the first example. The reason is that the conditional limit distribution of a Galton-Watson process may have expectation as large as desired, but it requires conditions close to criticality, i.e., the difference $d - b$ small enough. Since the value of $d - b = 1 - \alpha$ from the data is equal to 0.03, the process will have the conditional expectation $E$ too low to fit the observed value of 30. In the second example, a fit is found, but $b + d$ is much greater than the value of 1/2 required by the basic formulation of model 1 (see Table 1). Even if the Galton-Watson process is interpreted in the broad manner, as an unspecified mechanism of independent amplification/deamplification of each extrachromosomal element (with $b + d$ possibly greater than 1/2), then the fit requires high probability of loss of extrachromosomal elements ($d$ high) coupled with high probability of their amplification ($b$ high), not a likely combination.

Gene deamplification vs. growth advantage: The standard explanation of the loss of MTX resistant cells transferred into MTX-free medium (cf., e.g., KAUFMAN, BROWN and SCHIMKE 1981) has been that the amplified gene causes a growth disadvantage when the selective pressure is removed. As a support of this viewpoint, KAUFMAN, BROWN and SCHIMKE (1981) provided measurements of the growth rates of three sorted subpopulations of the resistant R2 cell line: one subline contained high degrees of DHFR-related fluorescence; a second, moderate degrees; and the third (double minus) cells, a very low degree of fluorescence. In Figure 3 of that paper, the corresponding growth curves were depicted, relative to the original R2 line. The growth potential of the three sublines remained in an inverse relationship to their contents of DHFR.

We attempted to analyze quantitatively the growth disadvantage hypothesis. The most elementary way of doing it is to assume that the sensitive cells have doubling time $T$ while those resistant have doubling time $T + \tau$ and that at time $t = 0$ (the moment of removing MTX from the culture medium), the fraction of resistant cells was $r(0)$. Then, at time $t > 0$, the fraction of the resistant cells, $r(t)$, would be equal to

$$r(t) = \frac{r(0)}{r(0) + [1 - r(0)]\, 2^{t\tau/[T(T+\tau)]}}. \tag{9}$$

We fitted this equation to the unstable resistance data in Figure 1 of KAUFMAN, BROWN and SCHIMKE (1981). Comparable quality of fit is achieved with various combinations of parameters $\tau$ and $r(0)$. We present the best fit obtained by us in Figure 5. In semilogarithmic coordinates, the growth advantage curve (Equation 9) is concave, as opposed to the straight line provided by the amplification/deamplification models.

The growth disadvantage hypothesis does not provide a superior fit to the resistance loss data, compared to that provided by the amplification/deamplification models. In fact, a combination of the two approaches (i.e., cell growth disadvantage and gene amplification/deamplification) could be invoked. If interpreted from this viewpoint, the estimates of probabilities $b$, $d$ and

FIGURE 5. Comparison of the models used to fit the observed kinetics of loss of double minute chromosomes in cells removed from MTX. The data from KAUFMAN, BROWN and SCHIMKE (1981) are plotted as in our Figure 4. The straight line is the best fit for models 1, 2 and 3, in which loss is due to deamplification exceeding amplification, $\ln[r(n)/r(0)] = n \ln(\alpha)$. The concave line is the best fit for the model in which loss is due to cells without gene copies growing faster than cells with gene copies (Equation 9 in the text; with $\tau/T = 17\%$ and $r(0) = 86\%$, by best fit). [Horizontal axis: Generations without Methotrexate.]

of gene amplification/deamplification and loss, as presented in Table 1, overestimate the true rates. Unfortunately, a combined model of this kind would have too many parameters to be fitted unequivocally to existing data.

LOSS OF THE ONCOGENE OF THE SEWA TUMOR

We have analyzed the data of LEVAN et al. (1977) on the loss of double minute chromosomes from the SEWA tumor cells. When grown in vivo as an ascites, 89% of these cells have double minute chromosomes. When explanted and grown in vitro, the fraction of cells with double minute chromosomes decreases until at 106 days it reaches 4%. The average number of double minutes per cell can be estimated from the published distribution (LEVAN et al. 1977) (Figure 6) to be 27. We have assumed that the double minute chromosomes contain genes advantageous for cell proliferation in vivo, but not advantageous and lost in vitro. Consistent with this assumption are the observations that SEWA cells contain amplified c-myc oncogenes associated with double minute chromosomes (SCHWAB et al. 1985) and that the degree of c-myc amplification is correlated with tumorigenicity (MARTINSSON et al. 1988). In other words, the environment of the tumor exerts a selective pressure on cells which disappears in cell culture conditions. This is a clear analogy to the loss of MTX resistance after removing MTX.

The data of LEVAN et al. (1977) depicting the loss of double minutes in vitro are presented in Figure 6. We fitted these data both with Equation 8 of the gene amplification/deamplification models and with Equation 9 of the growth advantage model. [Marginally, this latter is similar to that employed by LEVAN et al. (1977) to analyze their own data.] The slope of the linear fit, according to the gene amplification/deamplification models, is equal to $\ln(\alpha) = -0.122$. We estimate that $E_\infty = 27$ double minutes. This yields the estimates of gene amplification and deamplification probabilities, b = 0.007, d = 0.374.

The growth advantage model (Equation 9) yields a better fit (Figure 6) than the gene amplification/deamplification models, Equation 8. With an estimated initial proportion of cells carrying double minutes, r(0) = 97%, and doubling time increment, r/T = 0.59, the agreement between data and model is very good.

The conclusion is that in the process of loss of the SEWA tumor oncogene, the growth advantage of cells without double minutes plays a dominant role.

FIGURE 6. Decrease in the fraction of SEWA mouse tumor cells with double minute chromosomes when grown in vitro. Cells were first grown in vivo as ascites, and then grown in vitro. At various times after being explanted, the fraction of cells containing one or more double minute chromosomes was determined. Plot of experimental data of LEVAN et al. (1977) fitted by equation $\ln[r(n)/r(0)] = n \ln(\alpha)$, implied by the gene deamplification models (straight line) and by Equation 9 implied by the growth advantage model (concave line). Horizontal axis: generations in vitro.

REDUCTION TO HOMOZYGOSITY

Reduction to homozygosity is a phenomenon which allows the expression of recessive alleles in the progeny of a heterozygous cell. This has been reported to occur in several tumor types in which a dominant tumor suppressor gene is lost and a recessive oncogene is expressed (PONDER 1988). Since oncogene amplification is a frequently observed event in tumor cells (LEVAN et al. 1977; ALITALO and SCHWAB 1986), we explore the possibility that reduction to homozygosity in these tumors is a consequence of chromosomal gene amplification.

To analyze this assertion, we will employ a modification of model 2 (geometric amplification). The conclusions remain valid when models 1 or 3 are employed.

Suppose that in the tumor each cell has copies of one of the alleles of the suppressor gene on each of the homologues of a chromosome (X copies on the first homologue, Y copies on the second). The numbers change from generation to generation of cells, the rules for each homologue being independent and consistent with model 2, previously employed for extrachromosomal genes.

In the framework of this model, let us consider the distribution of the sum of the number of copies of the gene on both homologues, $X_n + Y_n$, conditional on $\{X_n + Y_n \neq 0\}$. Let us denote by $W_n(s)$ the probability generating function of $X_n$ [Equation (10), a power series in $s$ with coefficients $p_i^n$, is illegible in the source], and by $R_n(s)$ the conditional p.g.f. of $X_n$ given $\{X_n \neq 0\}$.
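The conditional p.g.f. introduced here has a standard closed form, which may help in following the limits used next. The display below is only a sketch: it assumes that $R_n(s)$ denotes the p.g.f. of $X_n$ given $X_n \neq 0$ and uses nothing beyond $W_n(0) = P\{X_n = 0\}$; it is not copied from the source.

```latex
R_n(s) = E\left[s^{X_n} \mid X_n \neq 0\right]
       = \frac{E\left[s^{X_n}\right] - P\{X_n = 0\}}{1 - P\{X_n = 0\}}
       = \frac{W_n(s) - W_n(0)}{1 - W_n(0)}, \qquad s \in [0, 1].
```

In particular $R_n(0) = 0$ and $R_n(1) = 1$, as expected of a distribution concentrated on the non-extinct states.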

Under the assumptions of Theorem 2, the following holds, as $n \to \infty$:

(a) $W_n(0) \to 1$,
(b) $W_n(s) - W_n(0) \to 0$, for $s \in [0, 1]$,
(c) $R_n(s) \to R(s)$, where $R(s)$ is the limit p.g.f. of Equation (12) [a power series in $s$ summed over $i \geq 1$; the display is illegible in the source].

The p.g.f. of $X_n + Y_n$ conditional on $\{X_n + Y_n \neq 0\}$ is equal to

$$S_n(s) = \frac{W_n(s)^2 - W_n(0)^2}{1 - W_n(0)^2}.$$

Application of (a), (b) and (c) above yields

$$S_n(s) \to R(s), \qquad n \to \infty.$$

Indeed, $S_n(s) = R_n(s)\,[W_n(s) + W_n(0)]/[1 + W_n(0)]$, and the second factor tends to 1 by (a) and (b).

The interpretation is that the population of cells with nonextinct genes on at least one of the two homologous chromosomes will be eventually dominated by cells with the gene nonextinct only on either the first or the second homologous chromosome.

Concluding, it is remarkable that the chromosomal gene amplification/deamplification process may lead to loss of one or the other allele from all cells in the population.

DISCUSSION

Our mathematical models provide a quantitative framework for thinking about the unstable phenotypes associated with gene amplification. In particular, they explain how the amplified genotype disappears from the population under nonselective conditions. In their first application, the models provide an explanation for the loss of resistance to methotrexate, which seems to fit the data better than the usually invoked hypothesis of a growth disadvantage of the resistant phenotype. (A combination of the two mechanisms is quite plausible; cf. HABER and SCHIMKE 1981.) Also, the models allow estimates of the rates of gene amplification and deamplification to be calculated from the existing data.

Superiority of models 2 and 3 (geometric and joint amplification, respectively) over model 1 (independent amplification), in their ability to fit the data, suggests that it is likely that, in the case of the DHFR gene, the extrachromosomal elements are amplified and deamplified in a correlated and not independent manner.

For the second biological process considered, the elimination of the SEWA tumor oncogene, the traditional explanation of cell growth advantage seems to be superior.

In a more theoretical way, the models explain the so called reduction to homozygosity observed in some tumors.

Our models predict a cell population consisting of a subpopulation of cells with a broad distribution of multiple copies of genes and of a subpopulation of cells with no genes. When selective conditions are removed, the frequency of cells with amplified genes decreases relative to the frequency of cells with no amplified genes. The mean value of the number of genes per cell, among those cells with amplified genes, does not change. The change in the proportion of cells in the subpopulation with amplified genes is a consequence of the differential rates of deamplification and amplification; it does not require that cells with different numbers of genes grow at different rates (except for cells with no copies of the gene, which do not yield progeny under restrictive conditions).

The strength and novelty of our approach are that it attempts to understand the changes in the growth of populations of cells by describing the probabilities of the changes in numbers of gene copies in each cell. In the literature, several molecular mechanisms of gene amplification and deamplification have been proposed including extra reinitiations of DNA synthesis, DNA recombination with unequal crossing over, and extrachromosomal DNA replication and loss (SCHIMKE et al. 1986; WAHL 1989; STARK et al. 1989). Our models, with small modifications, can be made consistent with these, and other, possible molecular mechanisms. The role of the models is to suggest that some molecular mechanisms are unlikely, for example those that lead to unacceptable estimates of the probabilities of amplification and deamplification. Thus, the lack of good numerical fit makes the independent amplification of model 1 highly unlikely. Also, mechanisms that result in the probability of gene increase being equal to, or greater than, the probability of decrease would not produce stable distributions of copy number.

There exist theoretical papers addressing the issue of the evolution of the highly repetitive DNA sequences, which can be considered a mode of amplification of genes located on normal chromosomes (e.g., the so called homogeneously staining regions). One of them is the work by SMITH (1976) in which it is proposed that DNA whose sequence is not maintained by selection will develop periodicities as a result of random crossing over. JONES, LUMPKIN and SMITH (1980) present mathematical formalism and computer simulations for a relatively comprehensive model, based on a description in the form of a Markov chain. They list among possible applications the unstable resistance to methotrexate, without providing further details.

Our analysis suggests that amplification and deamplification of a gene can occur in 2-10% of cell divisions. The observed low frequency of cells with an amplified gene in the absence of selection may now be understood as a result of the rate of deamplification being greater than the rate of amplification, rather than as a consequence of the rarity of gene amplification. The implication of this conclusion is that molecular events which alter gene copy number may be occurring much more frequently than previously supposed (STARK et al. 1989).

Bacterial plasmids resemble mammalian extrachromosomal elements in several aspects of their behavior. They replicate autonomously from the chromosomal DNA and segregate independently of chromosomal DNA. They may code for dispensable genes or for genes which confer a survival advantage under some selective condition. Bacterial cells may harbor a large number of copies of plasmids, in a stable or unstable manner, or may have no plasmids. Biochemical mechanisms for the control of replication (NORDSTROM, MOLIN and LIGHT 1984) and partitioning (AUSTIN 1988) of bacterial plasmids have been proposed.

Mathematical models have been employed to help to understand the kinetics of maintenance and loss of bacterial plasmids. In the extensive literature of the subject, two classes of models can be distinguished, deterministic and probabilistic. In the first group of models differential equations are employed to describe the effects of growth advantage or disadvantage of cells with plasmids, combined with the loss of plasmids through segregation [see, e.g., COOPER et al. (1987) and LENSKI and BOUMA (1987)]. Plasmid loss is represented by a term in an equation. It should be noticed that this deterministic description predicts a stable equilibrium in some situations in which the probabilistic description would predict the eventual extinction (cf., e.g., the case considered in COOPER, BROWN and CAULCOT 1987, Figure 5). The failure of the deterministic model is caused by the fact that it describes only the mean values of random variables, which may tend to a constant while these random variables tend to zero with probability one. Such a mechanism is known, for example, from the theory of critical and subcritical branching processes (ATHREYA and NEY 1972).

Probabilistic models have been employed to study in more detail bacterial plasmid segregation, partitioning, and incompatibility [cf., e.g., SENETA and TAVARÉ (1983) and NOVICK and HOPPENSTEADT (1978) and references cited therein]. Some of these models are formally similar to the independent amplification model we employ in the present paper. However, most of them include features specific for bacterial plasmids, for example, an upper bound on the number of plasmid copies per cell. Stochastic models satisfactorily answer questions concerning the asymptotic distributions of plasmid copy number and the rate of plasmid loss, analogous to the questions we ask about double minute chromosomes in mammalian cells.

In summary, models for gene amplification have been presented and analyzed that assume fixed probabilities of gene increase and decrease at each cell division. The models have allowed us to use experimental data to estimate these probabilities in the special case of the DHFR gene and to draw qualitative conclusions on the mechanisms and consequences of gene amplification.

We thank one of the reviewers, who brought to our attention the paper by PAKES (1973). Results in that reference include, as a special case, the proof of our Theorem 2. Our original direct proof based on the Laplace method for integrals is available on request. We thank D. GEORGE who informed us about the observations of LEVAN and his co-workers and FRANK TRAGANOS who critically read the manuscript. DAVID AXELROD was supported in part by grants from the U.S. Public Health Service (CA42795), the New Jersey Commission on Cancer Research (#687-002), and the Waksman-Busch Memorial Fund. M.K. was supported in part by grants from the U.S. Public Health Service (CA23296 and R01 CA-42830).

LITERATURE CITED

ALITALO, K., and M. SCHWAB, 1986 Oncogene amplification in tumor cells. Adv. Cancer Res. 47: 235-281.
ATHREYA, K. B., and P. E. NEY, 1972 Branching Processes. Springer, New York.
AUSTIN, S. J., 1988 Plasmid partition. Plasmid 20: 1-9.
BROWN, P. C., S. M. BEVERLY and R. T. SCHIMKE, 1981 Relationship of amplified dihydrofolate reductase genes to double minute chromosomes in unstably resistant mouse fibroblast cell lines. Mol. Cell. Biol. 1: 1077-1083.
BROWN, P. C., T. D. TLSTY and R. T. SCHIMKE, 1983 Enhancement of methotrexate resistance and dihydrofolate reductase gene amplification by treatment of mouse 3T6 cells with hydroxyurea. Mol. Cell. Biol. 3: 1097-1107.
COOPER, N. S., M. E. BROWN and C. A. CAULCOT, 1987 A mathematical method for analysing plasmid stability in micro-organisms. J. Gen. Microbiol. 133: 1871-1880.
COX, D. R., and H. D. MILLER, 1965 The Theory of Stochastic Processes. John Wiley & Sons, New York.
FELLER, W., 1969 An Introduction to Probability Theory and Its Applications. Vol. 2. John Wiley & Sons, New York.
GIULOTTO, E., C. KNIGHTS and G. R. STARK, 1987 Hamster cells with increased rates of DNA amplification, a new phenotype. Cell 48: 837-845.
HABER, D. A., and R. T. SCHIMKE, 1981 Unstable amplification of an altered dihydrofolate reductase gene associated with double-minute chromosomes. Cell 26: 355-362.
JONES, R. B., C. K. LUMPKIN and J. R. SMITH, 1980 A stochastic model for cellular senescence. Part I. Theoretical considerations. J. Theor. Biol. 86: 581-592.
KAUFMAN, R. J., P. C. BROWN and R. T. SCHIMKE, 1979 Amplified dihydrofolate reductase genes in unstably methotrexate-resistant cells are associated with double minute chromosomes. Proc. Natl. Acad. Sci. USA 76: 5669-5673.
KAUFMAN, R. J., P. C. BROWN and R. T. SCHIMKE, 1981 Loss and stabilization of amplified dihydrofolate reductase genes in mouse sarcoma S-180 cell lines. Mol. Cell. Biol. 1: 1084-1093.
LENSKI, R. E., and J. E. BOUMA, 1987 Effects of segregation and selection on instability of plasmid pACYC184 in Escherichia coli B. J. Bacteriol. 169: 5314-5316.
LEVAN, G., N. MANDAHL, B. O. BENGTSSON and A. LEVAN, 1977 Experimental elimination and recovery of double minute chromosomes in malignant cell populations. Hereditas 86: 75-90.
MARTINSSON, T., F. STÅHL, P. POLLWEIN, A. WENZEL, A. LEVAN, M. SCHWAB and G. LEVAN, 1988 Tumorigenicity of SEWA murine cells correlates with degree of c-myc amplification. Oncogene 3: 437-441.

NORDSTROM, K., S. MOLIN and J. LIGHT, 1984 Control of replication of bacterial plasmids: genetics, molecular biology, and physiology of the plasmid R1 system. Plasmid 12: 71-90.
NOVICK, R. P., and F. C. HOPPENSTEADT, 1978 On plasmid incompatibility. Plasmid 1: 421-434.
PAKES, A. G., 1973 Conditional limit theorems for a left-continuous random walk. J. Appl. Prob. 10: 39-53.
PETERSON, J. A., 1983 The widespread nature of phenotypic variability in hepatomas and cell lines in the form of a geometric series. J. Theor. Biol. 102: 41-53.
PETERSON, J. A., 1984 Analysis of variability in albumin content of sister hepatoma cells and a model for geometric phenotypic variability (quantitative shift model). Somatic Cell Mol. Genet. 10: 345-357.
PONDER, B., 1988 Gene losses in human tumors. Nature 335: 400-402.
RATH, H., T. TLSTY and R. T. SCHIMKE, 1984 Rapid emergence of methotrexate resistance in cultured mouse cells. Cancer Res. 44: 3303-3306.
SENETA, E., and S. TAVARÉ, 1983 Some stochastic models for plasmid copy number. Theor. Popul. Biol. 23: 241-256.
SCHIMKE, R. T., 1984 Gene amplification, drug resistance and cancer. Cancer Res. 44: 1735-1742.
SCHIMKE, R. T., 1988 Gene amplification in cultured cells. J. Biol. Chem. 263: 5989-5992.
SCHIMKE, R. T., S. W. SHERWOOD, B. A. HILL and R. N. JOHNSTON, 1986 Overreplication and recombination of DNA in higher eukaryotes: potential consequences and biological implications. Proc. Natl. Acad. Sci. USA 83: 2157-2161.
SCHWAB, M., G. RAMSAY, K. ALITALO, H. E. VARMUS, J. M. BISHOP, T. MARTINSSON, G. LEVAN and A. LEVAN, 1985 Amplification and enhanced expression of the c-myc oncogene in mouse SEWA tumor cells. Nature 315: 345-347.
SMITH, G. P., 1976 Evolution of repeated DNA sequences by unequal crossover. Science 191: 528-535.
STARK, G. R., M. DEBATISSE, E. GIULOTTO and G. M. WAHL, 1989 Recent progress in understanding mechanisms of mammalian DNA amplification. Cell 57: 901-908.
TLSTY, T. D., P. C. BROWN and R. T. SCHIMKE, 1984 UV radiation facilitates methotrexate resistance and amplification of the dihydrofolate reductase gene in cultured 3T6 mouse cells. Mol. Cell. Biol. 4: 1050-1056.
WAHL, G. M., 1989 The importance of circular DNA in mammalian gene amplification. Cancer Res. 49: 1333-1340.

Communicating editor: R. R. HUDSON

APPENDIX

Let us denote by $p_i^n$ the probability that there are $i$ copies of the gene in the $n$th cell generation:

$$p_i^n = P\{X_n = i\}. \qquad \text{(A1)}$$

We consider the limit properties, as $n$ tends to infinity, of the gene extinction probability $p_0^n$ and of the set of conditional probabilities,

$$c_i^n = P\{X_n = i \mid X_n \neq 0\} = p_i^n/(1 - p_0^n) \qquad \text{(A2)}$$

that $i$ gene copies are present, provided the gene is not extinct. We limit ourselves to the subcritical case, $b < d$. We also formally suppose that $X_0 = 1$.

We introduce the following notations:

$$V_n(s) = \sum_{i \geq 0} c_i^n s^i, \qquad v_n(s) = \sum_{i \geq 0} P\{x_n = i \mid x_n \neq 0\}\, s^i. \qquad \text{(A3)}$$

$V_n(s)$ and $v_n(s)$ are the conditional p.g.f.-s of $X_n$ and $x_n$, provided $X_n \neq 0$, $x_n \neq 0$, respectively. Finally, let us denote

$$f(s) = bs^2 + (1 - b - d)s + d. \qquad \text{(A4)}$$

Model 1 (independent amplification):

Theorem 1: Suppose $d > b$. Then

$$1 - p_0^n \sim K\alpha^n, \qquad \text{(A5)}$$

$$V_n(s) \to V(s), \qquad \text{(A6)}$$

as $n \to \infty$, where

$$\alpha = 1 + b - d, \qquad \text{(A7)}$$

$V(s)$ is the only p.g.f. solution of the functional equation,

$$1 - V[f(s)] = \alpha[1 - V(s)], \qquad \text{(A8)}$$

and

$$K = 1/E_\infty = 1/V'(1). \qquad \text{(A9)}$$

Proof: (A6) and (A8) are the statements of Yaglom's theorem (ATHREYA and NEY 1972; Corollary 1.8.1). (A5) and (A9) follow from expressions (1) and (22) in section I.11 of ATHREYA and NEY (1972).

Model 2 (geometric amplification): In this case, process $\{x_n\}$ is a special case of random walk with an absorbing barrier at zero. The usual textbook variant assumes $b + d = 1$ (FELLER 1969). Our variant is treated in COX and MILLER (1965). Asymptotics of exactly the type required in this paper, called quasistationary distributions of the process, are discussed in PAKES (1973). The following result is a special case of Theorem 2 and Lemma 3 in that paper (it follows by rather tedious but elementary calculations).

Theorem 2: Suppose that $d > b$. Then,

[the display equations of Theorem 2 are missing from the source]

as $n \to \infty$.

Model 3 (joint amplification):

Theorem 3: Suppose that $d > b$. Then

$$(1 - p_0^n) \sim K\alpha^n/n^{3/2},$$

as $n \to \infty$, where

$$\alpha = 1 - c - (\sqrt{d} - \sqrt{b})^2, \qquad \text{(A16)}$$

and $K$ is the constant of Equation (A17) [the expression for $K$ is illegible in the source]. Furthermore, $v_n(s)$ tends to the same limit as in model 2.

Proof: Let us construct a new Markov chain $\{y_n,\ n = 0, 1, \ldots\}$, with state space $\{-1, 0, 1, 2, \ldots\}$, and transition matrix $R$,

$$R = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & \cdots \\
c & 1-c & 0 & 0 & 0 & \cdots \\
c & d & 1-b-d-c & b & 0 & \cdots \\
c & 0 & d & 1-b-d-c & b & \cdots \\
c & 0 & 0 & d & 1-b-d-c & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}.$$

Let us note that $P\{x_n = 0\} = P\{y_n \leq 0\}$ and that $P\{x_n = i\} = P\{y_n = i\}$, $i \geq 1$. Also, conditional on not entering state $\{-1\}$, process $\{y_n\}$ behaves like process $\{x_n\}$ of model 2, with parameters $b$ and $d$ replaced by $b/(1 - c)$ and $d/(1 - c)$, respectively.

Proceeding formally, we can write matrix $R$ in the form

$$R = \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{c} & (1 - c)R_0 \end{pmatrix},$$

where $\mathbf{0}^T = (0, 0, \ldots)$, $\mathbf{c}^T = (c, c, \ldots)$, and $R_0$ is the transition matrix for model 2, with $b$ replaced by $b/(1 - c)$ and $d$ replaced by $d/(1 - c)$. It follows that

$$R^n = \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{c}_n & (1 - c)^n R_0^n \end{pmatrix},$$

where every entry of $\mathbf{c}_n$ equals $1 - (1 - c)^n$, and

$$P\{y_n = 0\} = (R^n)_{10} = (1 - c)^n (R_0^n)_{10}.$$

Eventually,

$$P\{x_n \geq 1\} = P\{y_n \geq 1\} = 1 - P\{y_n = 0\} - P\{y_n = -1\} = (1 - c)^n\,[1 - (R_0^n)_{10}] \sim K\alpha^n n^{-3/2},$$

this last following from Theorem 2 [constants $K$ and $\alpha$ defined by (A16) and (A17)].

Remaining assertions follow.
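The block-matrix argument in the proof of Theorem 3 can be checked numerically. The sketch below is not the authors' code. It assumes that the model 2 process moves up one state with probability b, down one state with probability d, and stays put otherwise, with state 0 absorbing, as the appendix describes. It also assumes the decay parameter reads $\alpha = 1 - c - (\sqrt{d} - \sqrt{b})^2$, which is our reading of a garbled display. The function names, the truncation of the state space at `size`, and the value of c are hypothetical choices made for illustration.

```python
import math


def step(p, b, d, c=0.0):
    """Advance the distribution p[i] = P{x_n = i} by one cell generation.

    State 0 (gene extinct) is absorbing. With c > 0 every state also jumps,
    with probability c, to the 'all copies lost' state -1 of the appendix;
    that mass is not tracked and simply leaves the vector p.
    """
    top = len(p) - 1
    q = [0.0] * len(p)
    q[0] = (1.0 - c) * p[0]
    for i in range(1, len(p)):
        q[i] += (1.0 - b - d - c) * p[i]   # state unchanged
        q[i - 1] += d * p[i]               # deamplification
        q[min(i + 1, top)] += b * p[i]     # amplification (mass held at the top state)
    return q


def survival(b, d, c=0.0, generations=600, size=200):
    """Return [P{x_n >= 1} for n = 1..generations], starting from x_0 = 1."""
    p = [0.0] * size
    p[1] = 1.0
    out = []
    for _ in range(generations):
        p = step(p, b, d, c)
        out.append(sum(p[1:]))
    return out


# Illustrative values only: b and d near the abstract's 2% and 10%; c is hypothetical.
b, d, c = 0.02, 0.10, 0.01

s2 = survival(b, d)                                  # model 2 (c = 0)
alpha2 = 1.0 - (math.sqrt(d) - math.sqrt(b)) ** 2    # assumed decay parameter
ratio = s2[-1] / s2[-2]                              # approaches alpha2 slowly

s3 = survival(b, d, c)                               # model 3, iterated directly
s2_scaled = survival(b / (1.0 - c), d / (1.0 - c))   # model 2 with rescaled b, d
identity_holds = all(
    math.isclose(s3[n], (1.0 - c) ** (n + 1) * s2_scaled[n], rel_tol=1e-9)
    for n in range(len(s3))
)

print(f"ratio of successive survival probabilities: {ratio:.5f}")
print(f"alpha for model 2 (assumed formula):        {alpha2:.5f}")
print(f"model 3 identity holds: {identity_holds}")
```

The ratio of successive survival probabilities sits slightly below the assumed $\alpha$ because of the $n^{-3/2}$ factor, so agreement improves only slowly with the number of generations. The identity between the directly iterated model 3 and the rescaled model 2 is exact up to rounding, since both are truncated in the same way.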