Stochastic modelling, Bayesian inference, and new in vivo measurements elucidate the debated mtDNA bottleneck mechanism
Iain G. Johnston1, Joerg P. Burgstaller2,3, Vitezslav Havlicek4, Thomas Kolbe5,6, Thomas R¨ulicke7, Gottfried Brem2,3, Jo Poulton8, Nick S. Jones1 1 Department of Mathematics, Imperial College London, UK 2 Biotechnology in Animal Production, Department for Agrobiotechnology, IFA Tulln, 3430 Tulln, Austria 3 Institute of Animal Breeding and Genetics, University of Veterinary Medicine Vienna, Veterinrplatz 1, 1210 Vienna, Austria 4 Reproduction Centre Wieselburg, Department for Biomedical Sciences, University of Veterinary Medicine, Vienna, Austria 5 Biomodels Austria, University of Veterinary Medicine Vienna, Veterinaerplatz 1, 1210 Vienna, Austria 6 IFA-Tulln, University of Natural Resources and Life Sciences, Konrad Lorenz Strasse 20, 3430 Tulln, Austria 7 Institute of Laboratory Animal Science, University of Veterinary Medicine Vienna, Veterin¨arplatz1, 1210 Vienna, Austria 8 Nu eld Department of Obstetrics and Gynaecology, University of Oxford, UK
Abstract Dangerous damage to mitochondrial DNA (mtDNA) can be ameliorated during mammalian development through a highly debated mechanism called the mtDNA bottleneck. Uncertainty surrounding this process limits our ability to address inherited mtDNA diseases. We produce a new, physically motivated, generalisable theoretical model for mtDNA populations during development, allowing the first statistical comparison of proposed bottleneck mechanisms. Using approximate Bayesian com- putation and mouse data, we find most statistical support for a combination of binomial partitioning of mtDNAs at cell divisions and random mtDNA turnover, meaning that the debated exact magnitude of mtDNA copy number depletion is flexible. New experimental measurements from a wild-derived mtDNA pairing in mice confirm the theoretical predictions of this model. We analytically solve a mathematical description of this mechanism, computing probabilities of mtDNA disease onset, e cacy of clinical sampling strategies, and e↵ects of potential dynamic interventions, thus developing a quantitative and experimentally-supported stochastic theory of the bottleneck. Mitochondria are vital energy-producing organelles within dramatically to 102,reducingthee↵ective population size eukaryotic cells, possessing genomes (mitochondrial DNA or of mitochondrial⇠ genomes [11, 12]. One postulated bottleneck- mtDNA) that replicate, degrade and develop mutations [1, 2]. ing mechanism is that this low population size accelerates ge- MtDNA mutations have been implicated in numerous patholo- netic drift and so increases the cell-to-cell heteroplasmy variance gies including fatal inherited diseases and ageing [3, 4, 5, 1]. [11, 13, 14, 15], which was first observed to generally increase Combatting the buildup of mtDNA mutations is of paramount from primordial germ cells through primary oocytes to mature importance in ensuring an organism’s survival. Substantial oocytes [16]. However, independent experimental evidence [12] recent medical, experimental and media attention focused on suggests that heteroplasmy variance increases negligibly dur- methods to remove [6] or prevent the inheritance of [7, 8, 5, 9, ing this copy number reduction, though this interpretation has 10] mutated mtDNA in humans. been debated [17]. Ref. [12] shows heteroplasmy variance ris- One means by which organisms may ameliorate the mtDNA ing during folliculogenesis, after the mtDNA copy number min- damage that builds up through a lifetime is through a devel- imum has been passed. In yet another picture, supported by opmental process known as bottlenecking. Immediately after conflicting experimental results [18, 19], heteroplasmy variance fertilisation, a single oocyte (which may contain > 105 indi- increases with a less pronounced decrease in mtDNA copy num- vidual mtDNAs) may have a nonzero mtDNA mutant load or ber (a minimum copy number > 103 in mice), solely through heteroplasmy(theproportionofmutantmtDNAinthecell). As random e↵ects associated with partitioning at cell divisions. the number of cells in the developing organism increases, the Clearly a consensus on this important mechanism is yet to be intercellularpopulationwillthenacquireanassociatedhetero- reached. plasmyvariance,thatis,thevarianceinmutantloadacrossthe Important existing theoretical work on modelling the bottle- population of cells (Fig.A), allowing removal of cells with neck has assumed a particular underlying mechanism [13, 20] or high heteroplasmy and retention of cells with low derived statistics of mtDNA populations [21, 14, 22, 23] without heteroplasmy. In-tense and sustained debate exists as to the explicitly considering changing mtDNA population size, or the mechanism by which this increase of heteroplasmy variance discrete nature of the mtDNA population: e↵ects which may occurs. Several experi-mental results in mice suggest that, powerfully a↵ect mtDNA statistics. To capture these e↵ects during development, thecopy number of mtDNA per cell in it is necessary to employ a ‘bottom-up’ physical description the germ cell line drops 1 of mtDNA as populations of individual, discrete elements sub- involves partitioning of clusters of mtDNA at each cell division, ject to replication and degradation, as in, for example, Refs. thus providing strong stochastic e↵ects associated with each di- [21] and [24]. Exploring the bottleneck also requires explic- vision. We consider a general set of dynamics through which itly modelling partitioning dynamics throughout a series of cell this cluster inheritance may be manifest, including the possi- divisions, over which population size can change dramatically. bility of heteroplasmic ‘nucleoids’ of constant internal structure While previous simulation work [11, 25] has taken such a phi- [31], sets of molecules or nucleoids within an organelle, homo- losophy with specific model assumptions, we are not aware of plasmic clusters, and di↵erent possible cluster sizes (see Ap- such a study allowing for the wide variety of replication and pendix 1). The Wai mechanism involves the replication of a partitioning dynamics proposed in the literature; we further subset of mtDNAs during folliculogenesis. We note that this not that replication-degradation-partitioning mtDNA models latter mechanism can be manifest in several ways: (a) through are yet to be fully described analytically. Nor is there a general slow random replication of mtDNAs (so that, in any given time quantitative framework under which di↵erent proposed bottle- window, only a subset of mtDNAs will be actively replicating) neck mechanisms can be statistically compared given extant or (b) through the restriction of replication to a specific subset data (although statistical analyses focusing on particular mech- of mtDNAs at some point during development. We will refer anisms and individual sets of experimental results have been to these di↵erent manifestations as Wai (a) and Wai (b) re- used throughout the literature, for example, using a Bayesian spectively. The Wai (a) mechanism and the Cree model can approach under a particular bottleneck model to infer model both be addressed in the same mechanistic framework (with bottleneck size [26]). Combined developments in theory and potentially di↵erent parameterisations): if the rate of random inference are therefore required to make progress on this im- replication in the Cree model is su ciently low during folliculo- portant question. genesis, only a subset of mtDNAs will be actively replicating at We remedy this situation by constructing a general model any given time during this period, thus recapitulating the Wai (featuresandparametersdescribedinFig.)forthepopulation (a) mechanism (see Appendix 1). We will henceforth combine dynamics of the bottleneck, able to describe the range of pro- discussion of the Wai (a) and Cree mechanisms into what we posedmechanismsexistingintheliterature. Usingexperimen- term the birth-death-partition (BDP) mechanism. taldataonmtDNAstatisticsthroughdevelopment[16,11,12, Weseekaphysicallymotivatedmathematicalmodelforthe 18],weuseapproximateBayesiancomputation[27,28,29,30] bottleneckthatiscapableofreproducingeachofthesemecha- torigorouslyexplorethestatisticalsupportforeachmechanism, nisms. Ourgeneralmodelforthebottleneck(detaileddescrip- showing that random mtDNA turnover coupled with binomial tioninMethods)involvesa‘bottom-up’representationofmtD- partitioning of mtDNAs at cell divisions is highly likely, and NAs as individual intracellular elements capable of replication thatthedebatedmagnitudeofmtDNAcopynumberreduction and degradation (Fig.B) with rates and ⌫ respectively. A issomewhatflexible. Subsequently,weconfirmthepredictions parameterSdetermineswhethertheseprocessesaredeterminis- of this model by performing new experimental measurements tic(specificratesofproliferation)orstochastic(replicationand of heteroplasmy statistics in mice with an mtDNA admixture, degradationofeachmtDNAisarandomevent). Theseratesof including a wild-derived haplotype, that is genetically distinct replicationanddegradationofmtDNAarelikelystronglylinked frompreviousstudies. Wethenanalyticallysolvetheequations to mitochondrial dynamics within cells, through the action of describingmtDNApopulationdynamicsunderthismechanism mitochondrialqualitycontrol[32,33]modulatedbymitochon- and show that these results allow us to investigate potential drial fission and fusion [34, 35, 36], which can act to recycle interventions to modulate the bottleneck (suggesting that up- weakly-perfoming mitochondria [37, 38]. This quality control regulation of mtDNA degradation may increase the power of can be represented through the degradation rates assigned to the bottleneck to avoid inherited disease; we discuss potential each mtDNA species, which may di↵er (for selective quality strategies for such an intervention) and yield quantitative re- control)orbeidentical(fornon-selectiveturnover). sultsforclinicalquestionsincludingthetimescalesandproba- The proportion of mtDNAs capable of replication is con- bilitiesofdiseaseonset,andthee cacyofstrategiestosample trolledbyaparameter↵inourmodel,dictatingtheproportion heteroplasmyinclinicalplanning. of mtDNAs that may replicate after a cuto↵ time T. Thus, if ↵ = 1, all mtDNAs may replicate; if ↵ < 1, replication of a subsetproportion↵ofmtDNAsisenforcedatthiscuto↵time. Results Atcelldivisions,mtDNAsmaybepartitionedeitherdetermin- istically, binomially, or in clusters according to a parameter c A general mathematical model encompass- (Fig.C). ing proposed bottlenecking mechanisms The copy number of mtDNA per cell is observed to vary dramatically during development, with dynamic phases of copy We will consider three di↵erent classes of proposed generat- number depletion and di↵erent rates of subsequent recovery ob- ing mechanisms for the mtDNA mechanisms: namely, those served. Additionally, cell divisions occur in the germline at dif- proposed in Cree et al. [11]; Cao et al. [18] and Wai et al. ferent rates during development, with cells becoming largely [12]. We will refer to these mechanisms by their leading author quiescent after primary oocytes develop. To explicitly model name. The Cree mechanism involves random replication and these di↵erent dynamic regimes, and the behaviour of mtDNA degradation of mtDNAs throughout development, and binomial copy number during each, we include six di↵erent dynamic partitioning of mtDNAs at cell divisions. The Cao mechanism
2 phases throughout development, each with di↵erent rates of asaunitoftimethroughout),risingtointermediatevaluesbe- replicationanddegradation(labelledwithsubscriptilabelling tween7.5and21dpc,graduallyrisingfurthersubsequentlyto the dynamic phase: hence 1,⌫1,..., 6,⌫6), and allowing for become large in the mature oocytes of the next generation. In di↵erent rates of cell division or quiescence. This protocol en- Fig.A, and throughout this article, experimentally measured ablesustoexplicitlymodele↵ectsofchangingpopulationsize datawillbedepictedascircularpointsandinferredtheoretical throughout development rather than assuming dependence on behaviourwillbedepictedaslinesorshadedregions. asingle,coarse-grainede↵ectivepopulationsize;andtoinclude Fig. A shows mtDNA population dynamic trajectories re- the e↵ects of specific and varying cell doubling times. A sum- sulting from optimised parameterisations of each of the mech- maryofsymbolsusedinourmodelandthroughoutthisarticle anismsweconsider(seeMethods).InFigBweshowposterior ispresentedinFig.D. probabilities on each of these mechanisms. These posterior Our model, with suitable parameterisation, can thus mirror probabilities give the inferred statistical support for each the dynamics of the Cree and Wai (a) mechanisms (stochas- mechanism, derived from model selection performed with ap- tic dynamics and binomial partitioning, which we refer to as proximate Bayesian computation (ABC) [27, 28, 29, 30] us- the BDP mechanism); the Cao mechanism (clustered parti- ing uniform priors. ABC involves choosing a threshold value tioning); and Wai (b) mechanism (deterministic dynamics, re- dictating how close a fit to experimental data is required to stricted subset of replicating mtDNAs). The Cao mechanism, accept a particular model parameterisation as reasonable. In partitioning of clusters of mtDNA molecules, represents the ex- our case, this goodness-of-fit is computed using a comparison pected case if mtDNA is partitioned in colocalised ‘nucleoids’ of squared residuals associated with the trajectories of mean within each organelle (or in other sub-organellar groupings). mtDNA copy number and normalised heteroplasmy variance The size of mtDNA nucleoids is debated in the literature [1, 39, (seeMethodsandAppendix1). Eachoftheexperimentalmea- 40] (although recent evidence from high-resolution microscopy surements corresponds to a sample variance, derived from a suggests that nucleoid size is generally < 2 [41], consonant with finite number of samples of an underlying distribution of het- recent evidence that individual nucleoids may be homoplasmic eroplasmies, and therefore has an associated uncertainty and [42]); our model allows for inheritance of homoplasmic or het- sampling error [14]. The reasonably small sample sizes used eroplasmic nucleoids of arbitrary characteristic size c, thus al- inthesesamplevariancemeasurementsarelikelytounderesti- lowing for a range of sub-organellar mtDNA structure. We dis- mate the underlying heteroplasmy variance (the target of our cuss the impact of mixed or fixed nucleoid content in Appendix inference). Our ABC approach naturally addresses these un- 1. certainties by using summary statistics derived from sampling asetofstochasticincarnationsofagivenmodel,wherethesize A birth-death-partition model of mtDNA dy- of this set is equal to the number of measurements contribut- namics has most statistical support given ex- ing to the experimentally-determined statistic (see Methods). Fig.B clearly shows that as the ABC threshold is decreased, perimental measurements requiring closer agreement between the distributions of sim- We take data on mtDNA copy number in germ line cells in ulated and experimental data, the posterior probability of the mice from three recent experimental studies [11, 18, 12]. We BDPmodelincreases,todramaticallyexceedthoseoftheother also use data from two experimental studies on heteroplasmy models. ThisincreaseindicatesthattheBDPmodelisthemost variance in the mouse germ line during development [16, 12]. statisticallysupported,andcapableofprovidingthebestexpla- This data, by convention [17], is normalised by heteroplasmy nationofexperimentaldata(whichcanbeinutitivelyseenfrom level h, giving thetrajectoriesinFig.A).WenotethatABCmodelselection automatically takes model complexity into account, and con- V(h) cludethattheBDPmechanismisthebestsupportedproposed V0(h)= , (1) E(h)(1 E(h)) mechanism for the bottleneck. Briefly, this result arises be- cause the BDP model produces increasing variance both due where normalised variance V0(h) is a quantity that will be to early cell division stochasticity and later random turnover. oftenusedsubsequently. Thisnormalisedvariancecontrolsfor By contrast, the Cao model alone only increases variance in thee↵ectofdi↵erentorchangingmeanheteroplasmy,andthus early development when cell divisions are occurring. Qualita- allows a comparison of heteroplasmy variance among samples tively, this behaviour through time holds regardless of cluster withdi↵erentmeanheteroplasmiesandsubjecttoheteroplasmy (nucleoid)sizeandregardlessofwhetherclustersareheteroplas- change with time. We use a time of 100 dpc to correspond to micorhomoplasmic(allowingheteroplasmicclustersdecreases mature oocytes (see Methods). These heteroplasmy variance the magnitude of heteroplasmy variance but not its behaviour studiesemployintracellularcombinationsofthesamepairingof throughtime,seeAppendix1). TheWai(b)modelalonesimi- mtDNAhaplotypes(NZBandBALB/c),modellingtwodi↵er- larlyonlyincreasesvarianceatasingletimepoint(later,during entmtDNAtypeswithinacellularpopulation. Wetakedataon folliculogenesis). celldoublingtimesfromaclassicalstudy[43](seeMethods). A In Ref. [12], visualisations of cells after BrU incorporation possiblesummaryofthesedata(althoughtheyprovokeongoing show that a subset of mitochondria retain BrU labelling, which debate;seeDiscussion)isthat,asshowninFig.A,theexisting the authors suggest indicates that a subset of mtDNAs are repli- data on normalised heteroplasmy variance shows initially low cating. In Appendix 1, we show that the BDP model also re- variance until 7.5 dpc (days post conception, which we use ⇠
3 sultsintheobservationofonlyasubsetofreplicatingmtDNAs Experimental verification of the birth-death- overthetimeframecorrespondingtotheseexperimentalresults. partition model These observations thus correspond to results expected from the random turnover from the BDP model. We also note the The bottleneck mechanism identified through our analysis has mathematicalobservationthattheWai(b)mechanismrequires several characteristic features which facilitate experimental ver- the replication of < 1% of mtDNAs during folliculogenesis to ification. Key among these are the prediction that heteroplasmy yieldreasonableheteroplasmyvarianceincreases(Fig.Ashows variance acquires an intermediate (nonzero, but not maximal) theoptimalcasewith↵=0.006;optimalfitstodatagenerally value as a result of the copy number bottleneck, then contin- show 0.005<↵<0.01), and the proportions of loci visible in ues to increase due to mtDNA turnover in later development. Ref. [12]aresubstantiallyhigherthanthisrequired1%value. Our theory also produces quantitative predictions regarding the We show in Appendix 1 that the heteroplasmy statistics structure of heteroplasmy distributions at arbitrary times. corresponding to binomial partitioning also describe the case The existing data that we used to perform inference and where the elements of inheritance are heteroplasmic clusters, model selection display a degree of internal heterogeneity, com- where the mtDNA content of each cluster is randomly sampled ing from several di↵erent experimental groups. Furthermore, from the population of the cell (either once, as an initial step, or these data represent statistics resulting from a single pairing of repeatedly at each division). This similarity holds broadly, re- mtDNA types, and it is thus arguable how conclusions drawn gardless of whether the internal structure of clusters is constant from them may represent the more genetically diverse reality across cell divisions or allowed to mix between divisions. The of biology. Ref. [44] recently addressed the issue of this limited BDP model, in addition to describing the partitioning of indi- number of mtDNA pairings by producing novel mouse models vidual mtDNAs, also thus represents the statistics of mtDNA involving mixtures of standard and several new, unexplored, populations in which heteroplasmic nucleoids are inherited [31], wild-derived haplotypes which capture a range of genetic diver- or individual organelles containing a mixed set of mtDNAs or sity. To test the applicability and generality of our predictions, nucleoids are inherited, regardless of the size of these nucleoids we have perfomed new experimental measurements of germline (see Discussion). heteroplasmy variance in these model animals under a consis- tent experimental protocol (see Methods). We use the ‘HB’ mouse line from Ref. [44] pairing a wild-derived mtDNA hap- Parameterisation and interpretation of the lotype (labelled ‘HB’ after its source in Hohenberg, Germany) birth-death-partition model with C57BL/6N; we refer to this model as ‘HB’. Heteroplasmy measurements were taken in oocytes sampled HavingusedABCmodelselectiontoidentifytheBDPmodelas from mice at ages 24-61dpc (see Methods and Appendix 1; raw themoststatisticallysupported,wecanalsouseABCtoinfer data in Source data 1). The statistics of these measurements the values of the governing parameters of this model given ex- yielded (h), (h) and (h) as previously. This age range perimentaldata.Figs.AandBshowsthetrajectoriesofmean E V V0 was chosen to address the regions with most power to discrimi- copy number and mean heteroplasmy variance resulting from nate between the competing models; the existing (h) data is modelparameterisationsidentifiedthroughthisprocess.Fig.C V0 most heterogeneous around 20-30dpc and the later datapoints shows the inferred behaviour of mtDNA degradation rate ⌫ in allow us to detect developmental heteroplasmy behaviour after themodel,aproxyformtDNAturnover(asthecopynumberis constrained). Turnover is generally low during cell divisions, the copy number minimum. Fig.A shows these V0(h) mea- surements. The qualitative behaviour predicted by the BDP allowing heteroplasmy variance to increase due to stochastic mechanism is clearly visible: variance around birth (after the partitioning. Turnover then increases later in germ line devel- copy number bottleneck) is low but non-zero, subsequently in- opment,resultinginagradualincreaseofheteroplasmyvariance creasing with time. The ability of the BDP model to account afterbirthuntilthematureoocytesforminthenextgeneration. for the magnitudes and time behaviour of heteroplasmy vari- Fig.D shows posterior distributions on the copy number ance more satisfactorily than the alternative models is shown by minimumandtotalturnover(seeMethods)resultingfromthis the model fits in Fig. A. We explored these new data quan- process; posteriors on all other parameters are shown in Ap- titatively through the same model selection approach used for pendix1. Substantialflexibilityexistsinthemagnitudeofthe the existing data. As shown in Fig. B, the BDP mechanism copynumberminimum,illustratingthatobservedheteroplasmy again experiences by far the strongest statistical support in this variancecanresultfromarangeofbottlenecksizesfrom 200 3 ⇠ genetically di↵erent system. to >10 ; going some way towards reconciling the conflict be- Fig. C shows the result of our parameteric inference ap- tween Refs. [18] and [19] and Refs. [11] and [12]. The total 6 proach using these (h) measurements coupled with the (m) amount of mtDNA turnover (presented as = ⌫ ⌧ ,the V0 E i=3 i i0 measurements used previously (employing our assumption that product of turnover rate and the time for which this rate ap- modulation of copy number by heteroplasmy in this non-pathological plies, summed over quiescent dynamic phases; forP example, a 1 haplotype is small). Strikingly, the quantitative behaviour of turnover rate of 0.1hr for 30 days yields =0.1 24 30 = 72) is constrained more than the specific trajectory⇥ of mtDNA⇥ V0(h) with time inferred from the HB model (red) matches turnover rates, showing that a variety of time behaviours of the previous behaviour inferred from the NZB/BALB/c sys- tem (blue) very well, suggesting that our theory is applicable turnover are capable of producing the observed heteroplasmy across a range of genetically distinct pairings. We note th behaviour. e
4 shaded region in Figure 4C corresponds to credibility intervals Mitochondrial turnover, degradation, and se- around the mean behaviour of V0(h), and the fact that indi- lective pressures exert quantifiable influence on vidual V0(h) datapoints (subject to fluctuations and sampling heteroplasmy variance e↵ects) do not all lie within these intervals is not a signal of poor model choice. An analogous situation is the observation We can use our theory to explore the dependence of bottleneck of a scatter of datapoints outside the range of the standard er- dynamics on specific biological parameters. We first explore the ror on the mean (s.e.m.), which does not imply a mistake in e↵ects of modulating mtDNA turnover by varying and ⌫ in the s.e.m. estimate. The di↵erence between the trace in Fig- concert, corresponding to an increase in mtDNA degradation ure 4A and the mean curve in Figure 4C arises because Figure balanced by a corresponding increase in mtDNA replication. 4A shows the behaviour of the model under a single, optimised This increased mtDNA turnover increases the heteroplasmy parameterisation, whereas Figure 4C shows the distribution of variance due to bottlenecking (see Figure 5A). This result arises model behaviours over the posterior distributions on parame- due to the increased variability in mtDNA copy number from ters: the mean V0(h) trace of this distribution is comparable but the underlying random processes occurring at increased rates. not equivalent to that from the single best-fit parameterisation. Additionally, we find that increasing mtDNA degradation ⌫ To confirm more detailed predictions of our model, we also without increasing also increases heteroplasmy variance, in examined the specific distributions of heteroplasmy in our new addition to decreasing the overall mtDNA copy number (Fig- measurements. Given a mean heteroplasmy and an organismal ure 5B). Applying this unbalanced increase in mtDNA degra- age, the parameterised BDP model predicts the structure of dation without a matching change in replication has a strong the heteroplasmy distribution (see Methods and next section). e↵ect on mtDNA dynamics as it corresponds to a universal We parameterised the model using V0(h) values from a sub- change in the ‘control’ applied to the system, analogous, for set of half of the new measurements (chosen by omitting every example, to changing target copy numbers in manifestations of other sampled set when ordered by time). Figure 4D shows a relaxed replication [21]. The simple model we use does not in- comparison of measured heteroplasmy distributions with a 95 clude feedback and controls mtDNA dynamics solely through % bound from the parameterised BDP model. We then tested kinetic parameters. Perturbing the balance of these parame-ters the predictions of the parameterised model against the other thus strongly a↵ects the expected behaviour of the system. As half of new measurements. 8 of the test measurements (2.4%) we discuss later, elucidation of the specific mechanisms by which fell outside the inferred 95% bound from the training dataset, control is manifest in mtDNA populations will require further illustrating a good agreement with distributional predictions. research, but these numerical experiments attempt to represent The Anderson-Darling test was used to compare the distribu- the cases where a perturbation is naturally compen-sated for tion of heteroplasmy in sampled oocytes with distributions pre- (matched changes, Figure 5A) and where it is not (unbalanced dicted by our theory (given age and mean heteroplasmy); no change, Figure 5B). set of samples showed significant (p<0.05) departures from These results suggest that an artificial intervention increas- the hypothesis that the two distributions were identical. Some ing mitochondrial degradation may generally be expected to example distributions are presented in Figure 4D (i), (ii), (iii). increase heteroplasmy variance during development. An in- crease in mtDNA degradation is expected to either directly The birth-death-partition model is analyti- increase heteroplasmy variance (Figure 5B) if mtDNA popu- lations are weakly controlled, or to provoke a compensatory, cally tractable population-maintaining increase in mtDNA replication, thus in- Importantly, the birth-death-partitioning model yields analytic creasing mtDNA turnover, which also acts to increase variance solutions for the values of all genetic properties of interest, using (Figure 5C) if mtDNA populations are subject to feedback con- tools from stochastic processes (detail in Methods, Appendix 1). trol. The increase in variance through either of these pathways These results facilitate straightforward further study and fast will increase the power of cell-level selection to remove cells predictions of timescales and probabilities of interest. The full with high heteroplasmy and thus purify the population. For theoretical approach is detailed in Appendix 1, and equations this reason, we speculate that mitochondrial degradation may for the mean and variance of mtDNA populations and hetero- represent a potential clinical target to address the inheritance plasmy are given in the Methods. In Figure 2A we illustrate of mtDNA disease (more detail in Appendix 1). that these analytic results exactly match the numeric results of Our model also allows us to explore the e↵ect of di↵erent stochastic simulation, a result that holds across all BDP model mtDNA types experiencing di↵erent selective pressures, by set- parameterisations. It is also straightforward to calculate the ting 1 = 2 (mutant mtDNA experiences a proliferative ad- 6 fixation probability P(m = 0), which allows us to characterise vantage or disadvantage). Such a selective di↵erence causes all heteroplasmy distributions that arise from the bottlenecking changes in both mean heteroplasmy and heteroplasmy variance, process, even when highly skewed (see Methods and Appendix as shown in Figure 5C (for example, if heteroplasmy decreases 1). We have thus obtained analytic solutions for the time be- towards zero, heteroplasmy variance will also decrease, as the haviour of mtDNA copy number and heteroplasmy throughout wild type is increasingly likely to become fixed). We do not fo- the bottleneck with no assumptions of continuous population cus further on selection in this study, noting that selective pres- densities or fixed population size, under a physical model with sures are likely to be specific to a given pair or set of mtDNA the most statistical support given experimental data. types and are not generally characterised well enough to perform
5 satisfactory inference. However, we do note that our theory gives nosis [5, 47], assists in clinical planning by allowing inference a straightforward prediction for the functional form of mean of the specific heteroplasmic nature of the embryo itself rather heteroplasmy when nonzero selection is present, a sigmoid with than a population average of an a↵ected mother’s oocytes [48]. slope set by the fitness di↵erence (see Methods). However, the complicated and stochastic nature of the bottle- neck makes this inference a challenging problem. Probabilities of exceeding threshold hetero- Givenaheteroplasmymeasurementfromsamplinghm,ac- plasmy values curatepreimplantationdiagnosisiscontingentonknowledgeof thedistributionP(h hm), thatis, theprobabilitythattheem- | A key feature of mtDNA diseases is that pathological symp- bryonic heteroplasmy is h given that a measurement hm has toms usually manifest when heteroplasmy in a tissue exceeds a been made. We can use our modelling framework and Bayes’ certain threshold value, with few or no symptoms manifested theorem(seeMethods)toobtainaformulaforthisconditional below this threshold [45]. The probability and timescale as- probability, allowing a rigorous probability to be assigned to sociated with which cellular heteroplasmy may be expected to inferences from preimplantation sampling. Here, as above, we exceed a given value is thus a quantity of key interest in clinical employ the truncated Normal approximation for the hetero- planning of mtDNA disease strategies. plasmydistribution,notingthattheexacttreatmentusinghy- In our model, the probability, as a function of time, of a pergeometric functions is straightforward but more computa- cell containing m1 wildtype and m2 mutant mtDNAs can be tionally expensive. Fig.E illustrates this process by showing straightforwardly derived. The resultant analytic expression theprobabilitydistributionsonembryonicheteroplasmywhen involves a hypergeometric function, also an important math- measurements hm = 0.1 or 0.4 have been taken at di↵erent ematical element in expressions describing mtDNA statistics times during development. The increasing heteroplasmy vari- based on classical population genetics [23, 46]. The probability ancethroughdevelopmentmeansthatsubstantiallygreaterun- of obtaining a given heteroplasmy can therefore be computed certaintyisassociatedwithheteroplasmyvaluesinferredusing as a sum over all copy number states that correspond to that measurementstakenatlatertimes. Inconclusion,althoughcare heteroplasmy. However, as hypergeometric functions are com- mustbetakeninapplyingthisreasoningtocelltypesinwhich, paratively unintuitive and computationally expensive, we here for example, mitochondrial and cell turnover rates di↵er from employ an approximation to the distribution of heteroplasmy those assumed here, or di↵erentiation leads to tissue-specific based upon the above moments that are straightforwardly cal- selectivefactorsactingonthemtDNApopulation,thisformal- culable from our model. This approximation involves fixation ismprovidesageneralmeansofrigorouslyinferringembryonic probabilities for each mtDNA type and a truncated Normal heteroplasmythroughgeneticdiagnosissampling. distribution for intermediate heteroplasmies (see Methods). In Appendix 1 we show that this approximation corresponds well to the exact distributions calculated using the hypergeometric Discussion function. We underline that exact heteroplasmy distributions We have used a general stochastic model and approximate Bayesian are straightfoward to compute using our approach: we use the computation with the available experimental data on develop- truncated Normal approximation as it represents the exact dis- mental mtDNA dynamics to show that the bottleneck is most tribution well, is more intuitively interpretable, and is compu- likely manifest through stochastic mtDNA dynamics and par- tationally very inexpensive. titioning, with increased random turnover later during devel- Usingthisapproach,theprobabilitywithtimeofacellex- opment, a mechanism which we can describe exactly and ana- ceeding a threshold heteroplasmy h⇤ can be straightforwardly lytically (Fig.). We emphasise that the bottom-up construc- computedforanyinitialheteroplasmy,allowingrigorousquanti- tionofourmodelfromphysicalfirstprinciplesbothincreasesthe tativeelucidationofthisimportantclinicalquantity(seeMeth- flexibility and generality of our model, allowing di↵erent ods). Fig.D illustrates this computation by showing the an- mechanisms to be compared together, and providing informa- alytic probability with which thresholds h⇤ = 0.4, 0.5, 0.6 are tiononmtDNAdynamicsthroughoutdevelopmentratherthan exceededatatimet,giventheexampleinitialheteroplasmyh= estimating an overall e↵ect. We note that even though our 0.3.Theseresultsserveasasimpleexampleofthepowerofour model cannot represent the full microscopic truth underlying modelling approach: any other specific case can read-ily be themtDNAbottleneck,itsabilitytorecapitulatethewiderange addressed. Our theory thus allows general quantitative of extant experimental measurements suggest that its study calculation of the probability (and timescale) that any given mayyieldusefulinsightsintothee↵ectsofdi↵erenttreatments heteroplasmy threshold will be exceeded, given knowledge of and perturbations on the bottleneck. the initial (or early) heteroplasmy. A key debate in the literature has focussed on the magni- tude of the bottleneck. Some studies [11, 15] have observed Developmental sampling of embryonic het- a depletion of mtDNA copy number during the bottleneck to eroplasmy minima around several hundred; other studies [18, 19] have ob- served that mtDNA copy number remains > 103.Ourstudy We next turn to the question of estimating heteroplasmy levels shows that observed increases in heteroplasmy variance [16, 12] in a developed organism by sampling cells during development. can be achieved across this range of potential minimal mtDNA This principle, clinically termed preimplantation genetic diag- copy numbers, meaning that the much-debated magnitude of
6 mtDNA copy number reduction is not the sole critical fea- primates and humans, heteroplasmy variance may increase at ture of the bottleneck, in agreement with arguments from Refs. early developmental stages [50, 51], and that partitioning of [18, 19, 12]. We find that the role of stochastic mtDNA dy- mitochondria is binomial in HeLa cells [52]. As more studies namics can play a key role in determining heteroplasmy vari- become available on human mtDNA behaviour during develop- ance without additional mechanistic details, in keeping with ment we will test our model’s applicability and its clinical pre- approaches proposed by Ref. [11]. The mechanism with the dictions. We note that the results of a recent study of human most statistical support is thus consistent with aspects from all preimplantation sampling [48] found that earlier measurements existing proposals in the literature. provided strong predictive power of mean heteroplasmy, about We have shown that, of the models proposed in the litera- which substantial variation was recorded in the o↵spring – both ture, a birth-death-partitioning model, proposed after Ref. [11] of which results are consistent with the application of our model and compatible with an interpretation of Ref. [12], is the in- to theoretical sampling considerations. In addition, recent ob- dividually most likely mechanism, and capable of producing servations that the m.3243A>Gmutation in humans both experimentally observed heteroplasmy behaviour. We cannot, increases mtDNA copy number during development [53], and given current experimental evidence, discount hybrid mecha- displays a less pronounced increase of heteroplasmy variance nisms, where birth-death-partitioning dominates the popula- [51] than other mutations, are consistent with the link between tion dynamics but small contributions from other mechanisms heteroplasmy variance and mtDNA copy number in our theory. provide perturbations to this behaviour, and propose experi- The combination of modern stochastic and statistical treat- ments to conclusively distinguish between these cases (see Ap- ments that we have employed provides a generalisable and pow- pendix 1). As the expected statistics of mtDNA populations erful way to recapitulate experimental data and rigorously de- undergoing inheritance of heteroplasmic mtDNA clusters is very duce underlying biological mechanisms. We have used this com- similar to those undergoing binomial partitioning of mtDNAs bination to explore pertinent questions regarding the mtDNA (see Appendix 1), the inheritance of heteroplasmic nucleoids bottleneck (and others have used a similar philosophy to nu- (as opposed to individual mtDNAs) is not excluded by our find- merically explore mtDNA point mutations [25]): we hope to ings, though other recent experimental evidence suggests that convince the reader that such methodology may be appropriate this situation may be unlikely [42, 41]. We contend that the to explore other problems involving stochastic biological sys- most likely situation may involve the partitioning of individual tems. We have used new experimental measurements to confirm organelles, containing a mixture of homoplasmic nucleoids of our theoretical findings, illustrating the beneficial and power- characteristic size < 2. Notably, this case (inheritance of het- ful coupling of mathematical and experimental approaches to eroplasmic groups, likely with fluid structure due to mixing of address competing hypotheses in the literature. Our detailed organellar content and mitochondrial dynamics), gives rise to elucidation of the bottleneck allows us to propose further exper- statistics which our binomial model reproduces (see Appendix imental methodology to address the current unknowns in our 1). theory, including the specifics of mtDNA partitioning at cell As mentioned in the model description, it is likely that mi- division and the roles of selective di↵erences between mtDNA tochondrial dynamics (fission and fusion of mitochondria) [35] types; importantly, we also propose a strategy to investigate play a role in determining natural mtDNA turnover, and par- our claim that our most supported model is compatible with ticularly mtDNA turnover in the presence of pathological mu- the subset-replication picture of mtDNA dynamics. We list tations [49], through the mechanism of mitochondrial quality these experiments in full in Appendix 1. Finally, we believe control [32, 37]. Mitochondrial dynamics may also influence that the theoretical foundation for mtDNA dynamics that we the elements of partitioning, through changes in the connectiv- have produced allows increased quantitative rigour in the pre- ity of the mitochondrial network. In our current model, these dictions and strategies involved in mtDNA disease therapies, influences are coarse-grained into descriptions of the dynamic illustrated by the above application of our theory to problems rates of mtDNA replication and degradation, and the character- in mtDNA sampling strategies, disease onset timescales, and istic elements that are partitioned at divisions. These physical interventions to increase the power of the bottleneck. parameters, as opposed to the more microscopic details of mito- chondrial dynamics, are expected to be the key determinants of heteroplasmy statistics through development. Accounting for Methods how these parameters, which summarize the relevant outputs General model for mtDNA dynamics. Our ‘bottom-up’ of mitochondrial dynamics, connect to details of microscopic model represents individual mtDNAs as elements which repli- models of mitochondrial dynamics is an important future re- cate and degrade either randomly or deterministically according search direction to be addressed when more quantitative data to the model parameterisation. Consonant with experimental is available. studies showing that it is often a single mutant genotype that The experimental data used to parameterise the first part dominates the non-wildtype mtDNA population of a cell [54], of our study was taken from four studies in mice. Observation we consider two mtDNA types (wildtype and mutant), though of similar dynamics in salmon [20] points towards the bottle- our model can readily be extended to more mtDNA types. We neck being a conserved mechanism in vertebrates. We also note denote the number of ‘wild-type’ mtDNAs in a cell as m and that our results in mice are broadly consistent with findings 1 the number of ‘mutant’ mtDNAs as m . The heteroplasmy of from recent experiments in other organisms, suggesting that in 2 a cell is then h = m2 , that is, the population proportion of m1+m2
7 mutant mtDNA. on cell doubling times in the mouse germ line is taken from MtDNA dynamics within a cell cycle. Individual mtD- Ref. [43], suggesting that doubling times start with an interval NAs are capable of replication and degradation, with rates de- of every 7h, then after around 8.5 days post conception (dpc) noted and ⌫ respectively. According to a binary categorical increase to 16h, before the onset of a quiescent regime around parameter S,theseeventsmaybedeterministic(S = 0; the 13.5 dpc (roughly consistent with the estimate of 25 divisions mtDNA population replicates and degrades by a fixed amount between generations in the female mouse germ line⇠ [55]). per unit time) or Poisson processes (S = 1; each individual Simulation, model selection, and parametric infer- mtDNA randomly replicates and degrades with average rates ence. We use Gillespie algorithms, also known as stochas- and ⌫). A parameter ↵ controls the proportion of mtDNAs tic simulation algorithms [56], to explore the behaviour of our capable of replication: ↵ = 1 allows all mtDNAs to replicate model of the bottlenecking process for a given parameterisation. throughout development, ↵ < 1 enforces a subset proportion ↵ For a given model parameterisation, the Gillespie algorithm is of replicating mtDNAs a time cuto↵ T after conception. used to simulate an ensemble of 103 possible realisations of the MtDNA dynamics at cell divisions. A parameter c time evolution of mtDNA content, and the statistics of this en- (cluster size; a non-negative integer) dictates the partitioning semble are recorded. The experimental data we use is derived of mtDNAs at cell divisions. When c = 0, partitioning is de- from sets of measurements of di↵erent sizes; to compare simula- terministic, so each daughter cell receives exactly half of its tion data with an experimental datapoint i corresponding to a parent’s mtDNA. For c>0, partitioning is stochastic. When statistic derived from ni measurements, we sampled a random 3 c = 1, partitioning is binomial: each mtDNA has a 50% chance subset of ni of the 10 simulated trajectories (all datapoints but of being inherited by either daughter cell. When c>1, the one have n 103), and used this subset to derive the simulated parent cell’s mtDNAs are grouped in clusters of size c before statistic for⌧ comparison to datapoint i [29]. division. Each cluster is then partitioned binomially, with a To fit the di↵erent models to experimental data we define a 50% chance of being inherited by either daughter cell. distance measure, a sum-of-squares residual between the E(m) Di↵erent dynamic phases through development. The (in log space) and V(h) dynamics produced by our model and mtDNA population changes in di↵erent ways as development observed in the data, weighted to facilitate comparison of these progresses, first decreasing, then recovering, then slowly grow- di↵erent quantities [29]. We also constrain copy number to be ing. We include the possibility of di↵erent ‘phases’ of mtDNA < 5 105 at all points throughout development, rejecting pa- dynamics in our model to capture this behaviour. Each phase rameterisation⇥ that disobey this criterion. Metropolis MCMC j has its own associated pairs of j, ⌫j parameters and may was used to identify the best-fit parameterisation according to either be quiescent (involving no cell divisions) or cycling (en- this distance function. For statistical inference, we use approxi- compassing nj cell divisions). Thus, we may have an initial mate Bayesian computation (ABC), a statistical approach that cycling phase with low mtDNA replication rates, so that copy has successfully been applied to parametric inference and model number falls for several cell divisions, then a subsequent ‘recov- selection in dynamical systems [30] to infer posterior probabil- ery’ cycling phase with higher replication rates so that mtDNA ity distributions both for individual models and the parameters levels are amplified, then quiescent phases as cell lineages are of the models given experimental data. ABC samples poste- identified. We allow six di↵erent phases, with the first two fixed rior probability distributions on parameters that lead to be- as cycling phases with the above doubling times, and the final haviour within a certain threshold distance of the given data; phase fixed to include no mtDNA replication (representing the these posteriors are shown to converge on the true posteriors as stable, final occyte state). the threshold value decreases to zero (see Appendix 1). We em- Initial conditions. The initial conditions of our model ployed an MCMC sampler with randomly-selected initial condi- involve an initial mtDNA copy number m0 (the total number tions. For further details, including priors, thresholds and step of mtDNAs in the fertilised oocyte) and an initial heteroplasmy sizes used in ABC, see Appendix 1. Minimum copy number was h0 (the fraction of these mtDNAs that are mutated). recorded directly from the resulting trajectories; our measure 6 Data acquisition. We used three datasets for mtDNA of total turnover is defined as = i=3 ⌧i0⌫i,thesumover copy number during mouse development: Cree [11]; Cao [18]; quiescent dynamic phases of the product of degradation rate and Wai [12]. We use two datasets for heteroplasmy variance and phase length. P during development: Wai [12] and Jenuth [16]. By convention, Creation of heteroplasmic mice. Heteroplasmic mice we use the normalised versions of heteroplasmy variance (that were obtained from a heteroplasmic mouse line (HB) we cre- is, measured variance divided by a factor h(1 h)). Where ated previously by ooplasmic transfer [44]. This mouse line the measurements were not given explicitly in these publica- contains the nuclear DNA of the C57BL/6N mouse, and mtD- tions, we manually analysed the appropriate figures to extract NAs both of C57BL/6N and a wild-derived house mouse. Both the numerical data. For these values, we used data from cor- mtDNA variants belong to the same subspecies, Mus musculus respondence regarding the Wai study (reply to Ref. [17]), and domesticus. For details on sequence divergence see Ref. [44]. manually normalise the Jenuth dataset. The Jenuth dataset Isolation and lysis of oocytes. Mice were sacrificed at contains measurements taken in ‘mature oocytes’ with no time the indicated ages by cervical dislocation. Ovaries were ex- given; we assume a time of 100 dpc for these measurements, tracted and immediately placed in cryo-bu↵er containing 50% though this time is generalisable and does not qualitatively af- PBS, 25% ethylene glycol and 25% DMSO (Sigma-Aldrich, Aus- fect our results. All values are presented in Appendix 1. Data tria) and stored at -80oC. For oocyte extraction, ovaries were
8 placed into a drop of cryo-bu↵er and disrupted using scalpel and ⌧ ⇤ = ⌃ (n ⌧ +⌧ 0), so that t ⌧ ⇤ is the time since the last cell i i i i and forceps. Oocytes were collected and remaining cumulus division. E(m, t) is thus intuitively interpretable as a product cells were removed mechanically by repeated careful suction of the initial copy number with the e↵ects of halving at each through glass capillaries. Prepared oocytes were then washed cell division, and the copy number evolution through past and in PBS before they were individually placed into compartments current cell cycles and quiescent phases. of 96-well PCR plates (Life Technologies, Austria) containing 10 The expression for the variance is lengthier, taking the form µl of oocyte-lysis bu↵er [50] [50 mM Tris-HCl, (p.H 8.5), 1 mM EDTA, 0.5% tween-20 (Sigma-Aldrich, Austria) and 200 µg/mL E(m, t) ProteinaseK (Macherey-Nagel, Germany)]. Samples covered V(m, t)= 4ni (e( i ⌫i)⌧i 2)2( ⌫ )2 stages from primary oocytes of 3 day-old mice up to mature phases i i i o 2 oocytes of 40 day-old mice. Samples were lysed at 55 C for 2 +QE(m, t) E(m, t) , (3) h, and incubated at 95oC for 10 min to inactivate Proteinase K. The cellular DNA extract was finally diluted in 190 µlTris- where is a lengthy, though algebraically simple, function EDTA bu↵er, pH 8.0 (Sigma-Aldrich, Austria). 3µL were used of all physical parameters, which we derive and present in Ap- per qPCR reaction. pendix 1. Once the means and variances associated with mu- Heteroplasmy quantification by Amplification Re- tant and wild-type mtDNAs have been determined (for brevity, 2 fractory Mutation System (ARMS)-qPCR. Heteroplasmy we write these as µ1 E(m1,t), 1 V(m1,t) and µ2 2 ⌘ ⌘ ⌘ E(m2,t), V(m2,t)), the relations below can be used to quantification was performed by Amplification Refractory Mu- 2 ⌘ tation System (ARMS)-qPCR, an established method in the compute heteroplasmy statistics: field [57, 58, 59], as described in Ref. [44]. The study was conducted according to MIQE (minimum information for pub- µ2 E(h)= µh (4) lication of quantitative real-time PCR experiments) guidelines µ1 + µ2 ⌘ [44, 60]. The proportion between HB derived and C57BL/6N 2 2 2 2 2 2 2 2 1 + 2 mtDNA was determined by ARMS-qPCR assays based on a V(h)=µ + (5) h µ2 µ (µ + µ ) 2 SNP in mt-rnr2 [44]. These assays were normalised to changes 2 2 1 2 (µ1 + µ2) ! in the input mtDNA amount by consensus assays, located in Selection. The predicted mean heteroplasmy at time t as- conserved regions of mt-Co2 and mt-Co3 (see Appendix 1). For suming a constant selective pressure (though this assumption the calculation of mtDNA heteroplasmy, the assay detecting the can straightforwardly be relaxed) is given by Eqn. 4, which, minor allele (C57BL/6N or wild-derived < 50%) was always given Eqn. 2, straightforwardly reduces to used. If both specific assays gave values > 50% (which can happen around 50% heteroplasmy), the mean value of both as- 1 E(h)= , (6) says was taken. All qPCR runs contained no template controls 1 h0 t 1+ h e (NTCs) for all assays; these were negative in 100%. Further 0 experimental details available in Appendix 1. where h0 is initial heteroplasmy and is the increase (or Analytic model. In the birth-death-partitioning model, decrease, if negative) in replication rate of mutant over wild- processes within a cell cycle constitute a birth-death process type mtDNA. Eqn. 6 predicts that mean heteroplasmy in the which can be solved using generating functions [61]. For bino- presence of selection will follow a sigmoidal form (as expected mial partitioning, the generating function for the system after from population dynamics [63], by the constraint that h0 must an arbitrary number of divisions has a recursive structure [62] lie between 0 and 1, and by the intuitive fact that heteroplasmy and an analytic solution can be obtained through solving a Ric- changes slow down as these limits are approached). cati recurrence relation. This reasoning also extends to the dif- Threshold crossing. The probability of heteroplasmy ex- ferent phases of replication and degradation, allowing an exact ceeding a certain threshold h⇤ is simply given by integrating generating function to be constructed for an arbitrary point in the probability distribution of heteroplasmy between h⇤ and 1. the bottleneck. Derivatives of this generating function are then The exact distribution of heteroplasmy can be written as a sum used to obtain moments of the distributions of interest. The over hypergeometric functions; however, for computational ef- full procedure is given in Appendix 1. Recall that we assume ficiency and interpretability, we employ an approximation to that the bottlenecking process consists of a series of dynamic this distribution involving the truncated Normal distribution phases, which may either involve cycling cells (and hence cell and fixation probabilities. As shown in Appendix 1, the distri- divisions) or quiescent cells. The expression for mean mtDNA bution of heteroplasmy, taking possible fixation into account, copy number E(m, t) at time t is: can be well approximated by 2 P(h)=(1 ⇣1 ⇣2) 0(h µ, )+⇣1 (h)+⇣2 (h 1) (7) (ni⌧i+⌧i0)( i ⌫i) N | (t ⌧ ⇤) e (m, t)=m0e , (2) where is the truncated Normal distribution (truncated E ni 0 2 N 2 phasesY i at 0 and 1), µ and are found numerically given our model results for (h) and (h), and ⇣1 (h = 0) and ⇣2 (h = 1) where ni is the number of cell divisions in phase i (0 for E V P P are fixation probabilities, also straightforwardly⌘ calculable⌘ from quiescent phases), ⌧i is the length of a cell cycle in cycling phase our model. The probability of threshold crossing for 0 9 encompassing Cree and Wai (a) mechanisms. Left plots show 1 trajectories during development; right plots show behaviour P(h>h⇤)=(1 ⇣1 ⇣2) 1 1+erf (h⇤ E(h))/ 2V(h) + ⇣2. 2 in mature oocytes in the next generation. * denotes mea- ✓ ◆ ⇣ ⇣ p ⌘⌘ (8) surements in mature oocytes, modelled as 100 dpc (see Meth- Inference from heteroplasmy measurements. Given a ods). (B) Statistical support for di↵erent mechanisms from ap- sampled measurement heteroplasmy hm, the probability P(h0 hm)proximate Bayesian computation (ABC) model selection with that embryonic heteroplasmy is h is given by Bayes’ theo-| 0 thresholds ✏1,2,3,4 = 75, 60, 50, 45. As the threshold decreases, rem P(h0 hm)=P(hm h0)P(h0)/P(hm). Assuming a uniform | | forcing a stricter agreement with experiment (thinner, darker prior distribution on embryonic heteroplasmy (though this can columns), support converges on the birth-death-partitioning be straightforwardly generalised), we thus obtain P(h0 hm)= (BDP) model. 1 | P(hm h0)/ 0 P(hm h0)dh0, and using the above expression for the heteroplasmy,| | R Figure 3. Parameterisation of the BDP model and 2 (1 ⇣1 ⇣2) 0(hm µ, )+⇣1 (hm)+⇣2 (hm 1) inferred details of bottleneck mechanism. Trajectories of (h0 hm)= N | , P 1 2 | dh (1 ⇣1 ⇣2) (hm µ, )+⇣1 (hm)+⇣2 (hm 1) 0 0 N 0 | (A) mean copy number E(m) and (B) normalised heteroplasmy (9) variance 0(h) resulting from BDP model parameterisations R 2 2 V where µ, , ⇣1, ⇣2 are functions of h0: µ, may be found sampled using ABC with a threshold ✏ = 40. * denotes mea- numerically and the ⇣ values are analytically calculable (see surements in mature oocytes, modelled as 100 dpc (see Meth- Appendix 1). ods). Note: the range in (B) does not correspond to a credibility Ethics Statement. The study was discussed and approved interval on individual measurements, but rather on an expected by the institutional ethics committee in accordance with Good underlying (population) variance, from which individual vari- Scientific Practice (GSP) guidelines and national legislation. ance measurements are sampled. We thus expect to see, for FELASA recommendations for the health monitoring of SPF example, several measurements lower than this range due to mice were followed. Approved by the institutional ethics com- sampling limitations (see text). (C) Posterior distributions on mittee and the national authority according to Section 26 of the mtDNA turnover ⌫ with time. (D) Posterior distribution on Law for Animal Experiments, Tierversuchsgesetz 2012 TVG min E(m), the minimum mtDNA copy number reached during 2012. 6 development. (E) Posterior distribution on = i=3 ⌧i0⌫i,a Competing interests. The authors declare that no com- measure of the total amount of mtDNA turnover. peting interests exist. P Figure Captions & Other Titles Figure 4. Predictions and experimental verification of the BDP model. (A) New V0(h) measurements from the HB mouse system, with optimised fits for the BDP, Wai (b) and Cao models. (B) Posterior probabilities of each model given this data under decreasing ABC threshold: ✏ = 50, 40, 30, 25 . Figure 1. The mitochondrial bottleneck, and ele- { } ments of a general model for bottlenecking mechanisms. (C)AllV0(h)measurementsfromtheHBmodel(points)with (A) The mtDNA bottleneck acts to produce a population of inferred V0(h) behaviour from ABC applied to the BDP model oocytes with varying heteroplasmies from a single initial oocyte (red curves). As in Fig., this range does not correspond to a with a specific heteroplasmy value. During development, mtDNA credibility interval on individual measurements, but rather on copy number per cell decreases (by a debated amount, which we anexpectedunderlying(population)variance,fromwhichindi- address; see Main Text) then recovers, suggesting a ‘bottleneck’ vidual variance measurements are sampled. The inferred be- of cellular mtDNA populations. (B) Cellular mtDNA popula- haviour strongly overlaps with the inferred behaviour for the tions during the bottleneck are modelled as containing wild- BALB/csystem(bluecurves),suggestingthattheBDPmodel type and mutant mtDNAs. MtDNAs can replicate and degrade applies to a genetically diverse range of systems. (D) Hetero- 1 within a cell cycle, with rates and ⌫ respectively. (C) At cell plasmy distributions. The transformation h0 = ln (h 1)E(h)/(1 E(h)) [44] is used to compare distributions with di↵erent mean hetero- divisions, the mtDNA population is partitioned between two daughter cells either deterministically, binomially, or through plasmy. Red jitter points are samples from sets used to parame- the binomial partitioning of mtDNA clusters. (D) Symbols used terise the BDP model; red curvesshow the 95 % range on trans- to represent quantities and model parameters used in the main formed heteroplasmy with time inferred from these samples. text, and their biological interpretations. Blue jitter points are samples withheld independent from this parameterisation; their distributuions fall within the indepen- dently inferred range. Insets show, in untransformed space, dis- Figure 2. Di↵erent mechanisms for the mtDNA bot- tributions of the withheld heteroplasmy measurements (blue) tleneck. (A) Trajectories of mean copy number E(m) and nor- compared to parameterised predictions (red); no withheld datasets malised heteroplasmy variance V(h) arising from the models de- show significant support against the predicted distribution (Anderson- scribed in the text, optimised with respect to data from exper- Darling test, p<0.05). imental studies. BDP denotes the birth-death-partition model, 10 Figure 5. Quantitative influences and clinical results [4] R. Lightowlers, P. Chinnery, D. Turnbull, and N. How- from our bottlenecking model. (A-C) Trajectories of copy ell. Mammalian mitochondrial genetics: heredity, hetero- number E(m) and normalised heteroplasmy variance V0(h)re- plasmy and disease. Trends In Genetics, 13:450, 1997. sulting from perturbing di↵erent physical parameters. Trajec- [5] J. Poulton, S. Kennedy, P. Oakeshott, and D. Wells. Pre- tory C labels the ‘control’ trajectory resulting from a fixed pa- venting transmission of maternally inherited mitochondrial rameterisation; black dots show experimental data; * denotes DNA diseases. Bmj, 338, 2009. measurements from primary oocytes, modelled at 100 dpc. (A) + Increasing (T ) and decreasing (T ) mtDNA turnover (both [6] S. Bacman, S. Williams, M. Pinto, S. Peralta, and mtDNA replication and degradation) by 20%. (B) Increasing C. Moraes. Specific elimination of mutant mitochondrial + (M ) and decreasing (M ) mtDNA degradation throughout genomes in patient-derived cells by mitoTALENs. Nature 4 1 development by a constant value (2 10 , in units of day ), Medicine, 19:1111, 2013. while keeping replication constant.⇥ (C) Applying a positive + [7] L. Craven, H. Tuppen, G. Greggains, S. Harbottle, J. Mur- (S ) and negative (S ) selective pressure to mutant mtDNA 6 1 phy, L. Cree, A. Murdoch, P. Chinnery, R. Taylor, and by 5 10 day . (D) Probability of crossing di↵erent het- ⇥ R. Lightowlers. Pronuclear transfer in human embryos to eroplasmy thresholds h⇤ with time, starting with initial hetero- prevent transmission of mitochondrial DNA disease. Na- plasmy h0 =0.3. (E) Probability distributions over embryonic ture, 465:82, 2010. heteroplasmy h0 given a measurement hm from preimplantation sampling (** hm =0.1; *** hm =0.4) at di↵erent times. [8] J. Poulton, M. Chiaratti, F. Meirelles, S. Kennedy, D. Wells, and I. Holt. Transmission of mitochondrial DNA diseases and ways to prevent them. PLoS Genetics, Figure 6. Model for the mtDNA bottleneck. A sum- 6:e1001066, 2010. mary of our findings. (A) There is most statistical support for a bottlenecking mechanism whereby mtDNA dynamics is stochas- [9] A. Bredenoord, G. Pennings, and G. De Wert. Ooplas- tic within a cell cycle, involving random replication and degra- mic and nuclear transfer to prevent mitochondrial DNA dation of mtDNA, and mtDNAs are binomially partitioned at disorders: conceptual and normative issues. Human Re- cell divisions. (B) This mechanism results in heteroplasmy vari- production Update, 14:669, 2008. ance increasing both due to stochastic partitioning at divisions [10] Joerg Patrick Burgstaller, Iain G Johnston, and Joanna and due to random turnover. The absolute magnitude of the Poulton. Mitochondrial dna disease and developmental copy number bottleneck is not critical: a range of bottleneck implications for reproductive strategies. Molecular human sizes can give rise to observed dynamics. Random turnover of reproduction, 21(1):11–22, 2015. mtDNA increases heteroplasmy variance through folliculogene- sis and germline development. [11] L. Cree, D. Samuels, S. de Sousa Lopes, H. Rajasimha, P. Wonnapinij, J. Mann, H. Dahl, and P. Chinnery. A re- Appendix 1. Detailed derivations involved in the mathe- duction of mitochondrial DNA molecules during embryo- matical analysis and inference approach; numerical descriptions genesis explains the rapid segregation of genotypes. Nature of the source data giving rise to existing and new heteroplasmy Genetics, 40:249, 2008. variance measurements; experimental details of these new mea- [12] T. Wai, D. Teoli, and E. Shoubridge. The mitochondrial surements; and details on proposed experimental strategies for DNA genetic bottleneck results from replication of a sub- further elucidation. population of genomes. Nature Genetics, 40:1484, 2008. Figure 4 – Source data 1. Individual heteroplasmy mea- [13] C. Bergstrom and J. Pritchard. Germline bottlenecks and surements in the HB mouse model contributing to the new het- the evolutionary maintenance of mitochondrial genomes. eroplasmy variance data used to test our theory. Genetics, 149:2135, 1998. [14] P. Wonnapinij, P. Chinnery, and D. Samuels. Previous References estimates of mitochondrial DNA mutation level variance did not account for sampling error: comparing the mtDNA [1] D. Wallace and D. Chalkia. Mitochondrial DNA genetics genetic bottleneck in mice and humans. The American and the heteroplasmy conundrum in evolution and disease. Journal Of Human Genetics, 86:540, 2010. Cold Spring Harbor Perspectives In Biology, 5:a021220, [15] C. Aiken, T. Cindrova-Davies, and M. Johnson. Variations 2013. in mouse mitochondrial DNA copy number from fertiliza- [2] D. Rand. The units of selection of mitochondrial DNA. An- tion to birth are associated with oxidative stress. Repro- nual Review Of Ecology And Systematics, page 415, 2001. ductive Biomedicine Online, 17:806, 2008. [3] D. Wallace. Mitochondrial diseases in man and mouse. [16] J. Jenuth, A. Peterson, K. Fu, and E. Shoubridge. Random Science, 283:1482, 1999. genetic drift in the female germline explains the rapid seg- regation of mammalian mitochondrial DNA. Nature Ge- netics, 14:146, 1996. 11 [17] D. Samuels, P. Wonnapinij, L. Cree, and P. Chinnery. Re- [30] T. Toni, D. Welch, N. Strelkowa, A. Ipsen, and M. Stumpf. assessing evidence for a postnatal mitochondrial genetic Approximate Bayesian computation scheme for parameter bottleneck. Nature Genetics, 42, 2010. inference and model selection in dynamical systems. Jour- nal Of The Royal Society Interface, 6:187, 2009. [18] L. Cao, H. Shitara, T. Horii, Y. Nagao, H. Imai, K. Abe, T. Hara, J. Hayashi, and H. Yonekawa. The mitochondrial [31] Howard T Jacobs, Sanna K Lehtinen, and Johannes N bottleneck occurs without reduction of mtDNA content in Spelbrink. No sex please, we’re mitochondria: a hypothesis female mouse germ cells. Nature Genetics, 39:386, 2007. on the somatic unit of inheritance of mammalian mtdna. Bioessays, 22(6):564–572, 2000. [19] L. Cao, H. Shitara, M. Sugimoto, J. Hayashi, K. Abe, and H. Yonekawa. New evidence confirms that the mitochon- [32] G. Twig, B. Hyde, and O. Shirihai. Mitochondrial fu- drial bottleneck is generated without reduction of mito- sion, fission and autophagy as a quality control axis: the chondrial DNA content in early primordial germ cells of bioenergetic view. Biochimica Et Biophysica Acta (BBA)- mice. PLoS Genetics, 5:e1000756, 2009. Bioenergetics, 1777:1092, 2008. [20] J. Wol↵, D. White, M. Woodhams, H. White, and N. Gem- [33] Bradford G Hill, Gloria A Benavides, Jack R Lancaster, mell. The strength and timing of the mitochondrial bot- Scott Ballinger, Lou DellItalia, Jianhua Zhang, and Vic- tleneck in salmon suggests a conserved mechanism in ver- tor M Darley-Usmar. Integration of cellular bioenergetics tebrates. PloS One, 6:e20522, 2011. with mitochondrial quality control and autophagy. Biolog- ical chemistry, 393(12):1485–1512, 2012. [21] P. Chinnery and D. Samuels. Relaxed replication of mtDNA: a model with implications for the expression [34] Richard J Youle and Alexander M Van Der Bliek. of disease. The American Journal Of Human Genetics, Mitochondrial fission, fusion, and stress. Science, 64:1158, 1999. 337(6098):1062–1065, 2012. [22] J. Elson, D. Samuels, D. Turnbull, and P. Chinnery. Ran- [35] Scott A Detmer and David C Chan. Functions and dys- dom intracellular drift explains the clonal expansion of mi- functions of mitochondrial dynamics. Nature Reviews tochondrial DNA mutations with age. The American Jour- Molecular Cell Biology, 8(11):870–879, 2007. nal Of Human Genetics, 68:802, 2001. [36] Hanne Hoitzing, Iain G Johnston, and Nick S Jones. What [23] P. Wonnapinij, P. Chinnery, and D. Samuels. The distri- is the function of mitochondrial networks? a theoretical bution of mitochondrial DNA heteroplasmy due to random assessment of hypotheses and proposal for future research. genetic drift. The American Journal Of Human Genetics, BioEssays (early online), 2015. 83:582, 2008. [37] G. Twig and O. Shirihai. The interplay between mito- [24] G. Capps, D. Samuels, and P. Chinnery. A model of the chondrial dynamics and mitophagy. Antioxidants & Redox nuclear control of mitochondrial DNA replication. Journal Signaling, 14:1939, 2011. Of Theoretical Biology, 221:565, 2003. [38] P. Mouli, G. Twig, and O. Shirihai. Frequency and selec- [25] S. Poovathingal, J. Gruber, B. Halliwell, and R. Gunawan. tivity of mitochondrial fusion are key to its quality main- Stochastic drift in mitochondrial DNA point mutations: a tenance function. Biophysical Journal, 96:3509, 2009. novel perspective ex silico. PLoS computational biology, [39] Christian Kukat and Nils-G¨oran Larsson. mtdna makes 5:e1000572, 2009. a u-turn for the mitochondrial nucleoid. Trends in cell [26] DR Marchington, V Macaulay, GM Hartshorne, D Barlow, biology, 23(9):457–463, 2013. and J Poulton. Evidence from human oocytes for a genetic [40] Daniel F Bogenhagen. Mitochondrial dna nucleoid struc- bottleneck in an mtdna disease. The American Journal of ture. Biochimica et Biophysica Acta (BBA)-Gene Regula- Human Genetics, 63(3):769–775, 1998. tory Mechanisms, 1819(9):914–920, 2012. [27] M. Beaumont, W. Zhang, and D. Balding. Approximate [41] Stefan Jakobs and Christian A Wurm. Super-resolution Bayesian computation in population genetics. Genetics, microscopy of mitochondria. Current opinion in chemical 162:2025, 2002. biology, 20:9–15, 2014. [28] M. Sunn˚aker, A. Busetto, E. Numminen, J. Corander, [42] Bobby G Poe III, Ciar´anF Du↵y, Michael A Greminger, M. Foll, and C. Dessimoz. Approximate bayesian com- Bradley J Nelson, and Edgar A Arriaga. Detection of het- putation. PLoS Computational Biology, 9:e1002803, 2013. eroplasmy in individual mitochondrial particles. Analytical [29] I. Johnston. E cient parametric inference for stochastic and bioanalytical chemistry, 397(8):3397–3407, 2010. biological systems with measured variability. Stat. Appl. [43] K. Lawson and W. Hage. Clonal analysis of the origin of Genet. Molec. Biol., 13:379, 2014. primordial germ cells in the mouse. Germline Develop- ment, 165:68, 1994. 12 [44] Joerg Patrick Burgstaller, Iain G Johnston, Nick S Jones, [55] J. Drost and W. Lee. Biological basis of germline muta- Jana Albrechtova, Thomas Kolbe, Claus Vogl, Andreas tion: comparisons of spontaneous germline mutation rates Futschik, Corina Mayrhofer, Dieter Klein, Sonja Sabitzer, among drosophila, mouse, and human. Environmental And et al. mtdna segregation in heteroplasmic tissues is com- Molecular Mutagenesis, 25:48, 1995. mon in vivo and modulated by haplotype di↵erences and developmental stage. Cell reports, 7(6):2031–2041, 2014. [56] D. Gillespie. Exact stochastic simulation of coupled chemi- cal reactions. The Journal Of Physical Chemistry, 81:2340, [45] R. Rossignol, B. Faustin, C. Rocher, M. Malgat, J. Mazat, 1977. and T. Letellier. Mitochondrial threshold e↵ects. Biochem- ical Journal, 370:751, 2003. [57] Daniel Paull, Valentina Emmanuele, Keren A Weiss, Nathan Tre↵, Latoya Stewart, Haiqing Hua, Matthew Zim- [46] M. Kimura. Solution of a process of random genetic drift mer, David J Kahler, Robin S Goland, Scott A Noggle, with a continuous model. Proceedings Of The National et al. Nuclear genome transfer in human oocytes elimi- Academy Of Sciences Of The United States Of America, nates mitochondrial dna variants. Nature, 493(7434):632– 41:144, 1955. 637, 2013. [47] J. Ste↵ann, N. Frydman, N. Gigarel, P. Burlet, P. Ray, [58] Ralf Steinborn, Pamela Schinogl, Valeri Zakhartchenko, R. Fanchin, E. Feyereisen, V. Kerbrat, G. Tachdjian, and Roland Achmann, Wolfgang Schernthaner, Miodrag Sto- J. Bonnefont. Analysis of mtDNA variant segregation dur- jkovic, Eckhard Wolf, Mathias M¨uller, and Gottfried ing early human embryonic development: a tool for suc- Brem. Mitochondrial dna heteroplasmy in cloned cattle cessful NARP preimplantation diagnosis. Journal Of Med- produced by fetal and adult cell cloning. Nature Genet., ical Genetics, 43:244, 2006. 25:255, 2000. [48] N. Tre↵, J. Campos, X. Tao, B. Levy, K. Ferry, and [59] Masahito Tachibana, Paula Amato, Michelle Sparman, R. Scott. Blastocyst preimplantation genetic diagnosis Joy Woodward, Dario Melguizo Sanchis, Hong Ma, (PGD) of a mitochondrial DNA disorder. Fertility And Nuria Marti Gutierrez, Rebecca Tippner-Hedges, Eunju Sterility, 2012. Kang, Hyo-Sang Lee, et al. Towards germline gene therapy of inherited mitochondrial diseases. Nature, [49] Jodi Nunnari and Anu Suomalainen. Mitochondria: in 493(7434):627–631, 2013. sickness and in health. Cell, 148(6):1145–1159, 2012. [60] Stephen A Bustin, Vladimir Benes, Jeremy A Garson, [50] H. Lee, H. Ma, R. Juanes, M. Tachibana, M. Sparman, Jan Hellemans, Jim Huggett, Mikael Kubista, Reinhold J. Woodward, C. Ramsey, J. Xu, E. Kang, and P. Amato. Mueller, Tania Nolan, Michael W Pfa✏, Gregory L Ship- Rapid mitochondrial DNA segregation in primate preim- ley, et al. The miqe guidelines: minimum information plantation embryos precedes somatic and germline bottle- for publication of quantitative real-time pcr experiments. neck. Cell Reports, 1:506, 2012. Clinical chemistry, 55(4):611–622, 2009. [51] S. Monnot, N. Gigarel, D. Samuels, P. Burlet, L. Hesters, [61] C. Gardiner. Handbook of stochastic methods, volume 3. N. Frydman, R. Frydman, V. Kerbrat, B. Funalot, and Springer Berlin, 1985. J. Martinovic. Segregation of mtDNA throughout human embryofetal development: m. 3243A¿ G as a model system. [62] J. Rausenberger and M. Kollmann. Quantifying origins Human Mutation, 32:116, 2011. of cell-to-cell variations in gene expression. Biophysical Journal, 95:4523, 2008. [52] I. Johnston, B. Gaal, R. das Neves, T. Enver, F. Iborra, and N. Jones. Mitochondrial variability as a source of [63] D. Futuyma. Evolutionary Biology, 1997. extrinsic cellular noise. PLoS Computational Biology, 8:e1002416, 2012. [53] S. Monnot, D. Samuels, L. Hesters, N. Frydman, N. Gi- garel, P. Burlet, V. Kerbrat, F. Lamazou, R. Frydman, and A. Benachi. Mutation dependance of the mitochon- drial DNA copy number in the first stages of human em- bryogenesis. Human Molecular Genetics, 22:1867, 2013. [54] K. Khrapko, N. Bodyak, W. Thilly, N. Van Orsouw, X. Zhang, H. Coller, T. Perls, M. Upton, J. Vijg, and J. Wei. Cell-by-cell scanning of whole mitochondrial genomes in aged human heart reveals a significant frac- tion of myocytes with clonally expanded deletions. Nucleic Acids Research, 27:2434, 1999. 13 c = 0 Deterministic A MtDNA BCWildtype mtDNA Mutant mtDNA Cell division partitioning bottleneck (WT) (M) dynamics +