BBACAN-88143; No. of pages: 18; 4C: 2, 6, 8, 9, 10 Biochimica et Biophysica Acta xxx (2017) xxx–xxx

Contents lists available at ScienceDirect

Biochimica et Biophysica Acta

journal homepage: www.elsevier.com/locate/bbacan

A population perspective on the determinants of intra-tumor heterogeneity☆

Zheng Hu, Ruping Sun, Christina Curtis ⁎

Departments of Medicine and Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA Stanford Institute, Stanford University School of Medicine, Stanford, CA 94305, USA article info abstract

Article history: Cancer results from the acquisition of somatic alterations in a microevolutionary process that typically occurs Received 16 December 2016 over many years, much of which is occult. Understanding the evolutionary dynamics that are operative at differ- Received in revised form 1 March 2017 ent stages of progression in individual tumors might inform the earlier detection, diagnosis, and treatment of can- Accepted 2 March 2017 cer. Although these processes cannot be directly observed, the resultant spatiotemporal patterns of genetic Available online xxxx variation amongst tumor cells encode their evolutionary histories. Such intra-tumor heterogeneity is pervasive not only at the genomic level, but also at the transcriptomic, phenotypic, and cellular levels. Given the implica- tions for precision medicine, the accurate quantification of heterogeneity within and between tumors has be- come a major focus of current research. In this review, we provide a population genetics perspective on the determinants of intra-tumor heterogeneity and approaches to quantify genetic diversity. We summarize evi- dence for different modes of evolution based on recent cancer genome sequencing studies and discuss emerging evolutionary strategies to therapeutically exploit tumor heterogeneity. This article is part of a Special Issue enti- tled: Evolutionary principles - heterogeneity in cancer?, edited by Dr. Robert A. Gatenby. © 2017 Elsevier B.V. All rights reserved.

1. Introduction additional ‘hits’ accrue, eventually leading to transformation and uncon- trolled cell growth (Fig. 1). Cancer results from the acquisition of alterations during somatic cell Given that tumor cells continuously accrue and are sub- division in a microevolutionary process that typically occurs over de- ject to ever changing microenvironments, genetic and phenotypic het- cades, much of which is occult and with extended clinically latent inter- erogeneity is expected. Indeed, according to the clonal evolution vals. For example, more than half of all detectable somatic mutations in theory, heterogeneity is attributed to continued genetic and heritable colorectal occur prior to transformation [1,2]. A set of initiating epigenetic alterations and selection in diverse microenvironments (epi)genetic events (so called drivers) provide a selective fitness advan- within tumors [7–9]. Hence rates, genetic drift, population tage (resulting in higher proliferation or lower apoptosis for example) structure, microenvironment and selection impact the extent of tumor in the target cell relative to other pre-malignant cells, resulting in clonal heterogeneity. As tumors progress, expanding into an environment expansion. These initiating events in the founding tumor cell are often that is resource limited, cells that are better able to replicate, utilize en- specific to certain tissue types or cells of origin as is the case for APC in ergy, and migrate will have a competitive advantage and these ‘hall- colon and other gastrointestinal tumors [3–5] or DNMT3 in marks’ are noted in most advanced cancers [10].Atthisstage,the [6]. Hence, the fitness benefit that the mutation confers may depend initiating genetic lesions may no longer ensure cell survival. As such, it on the microenvironment or cellular context in which it occurs. Howev- is essential to develop an understanding of the phenotypic conse- er, a single mutation is seldom sufficient to evoke a fully malignant phe- quences of putative oncogenes and tumor suppressors in a context-de- notype and the pre-malignant clones likely maintain functionality until pendent fashion. Although intra-tumor heterogeneity (ITH) has been appreciated for decades [11], the advent of high-throughput technologies has enabled the characterization of (epi)genomic, transcriptomic, phenotypic, and ☆ This article is part of a Special Issue entitled: Evolutionary principles - heterogeneity in cellular heterogeneity at enhanced resolution [12–25]. The presence of cancer?, edited by Dr. Robert A. Gatenby. fi ⁎ Corresponding author at: Department of Medicine and Genetics, Stanford University pervasive ITH poses signi cant challenges for precision medicine. For School of Medicine, Stanford, CA 94305, USA. example, sampling bias due to solid tumor spatial structure within E-mail address: [email protected] (C. Curtis). lesions can obscure the interpretation of genomic profiles for patient

http://dx.doi.org/10.1016/j.bbcan.2017.03.001 0304-419X/© 2017 Elsevier B.V. All rights reserved.

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 2 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx

Fig. 1. Schematic illustration of different phases and modes of tumor evolution. Somatic alterations (including SSNVs and CNAs) arise during cell division in normal/pre-neoplastic cells, where mutations that confer a selective growth advantage relative to the rest of the population (drivers) result in clonal expansions. Multiple such events promote transformation in a process that can occur over long time periods (indicated by hatch marks). The mutations (both drivers and passengers) that accumulate in the founding tumor cell lineage during initiation will be present in all cells in the tumor (public/clonal). After transformation, the tumor may grow in the absence of stringent selection, compatible with a Big Bang model [Ref 21] and effectively neutral evolution (A) or be subject to continuous selection resulting in clonal expansions (B and C). In the Big Bang model, the tumor grows as a single ‘terminal’ expansion populated by numerous heterogeneous subclones that are not subject to strong selection (A). As such, the timing of a mutation is the primary determinant of its frequency in the final tumor and most detectable subclonal alterations (resulting in ITH) arise early during growth, whereas late-arising mutations will be largely undetectable. In contrast, under selection, the so called ‘sequential’ or ‘linear’ evolution model describes clonal successions or sweeps that occur sequentially (B) while ‘branched’ evolution corresponds to a scenario where multiple subclonal alterations co-occur and compete during tumor growth (C). stratification and therapeutic decision-making. Elevated ITH may inherited during cell division, and hence surreptitiously encode their also be associated with disease progression [26] and poor prognosis ancestries. As such, the resultant patterns of ITH can be ‘read’ via [27,28]. Given the numerous clinical implications, the accurate quantifi- multi-region sequencing (MRS) of spatially and/or temporally separat- cation of heterogeneity within and between tumors from the same pa- ed tumor regions or single-cell sequencing (SCS) and used to recon- tient is a major focus of current research. However, to date, these struct the evolutionary relationships amongst the distinct cell analyses have largely been descriptive in nature. populations (clones) that comprise a tumor, as has now been performed While human tumor evolution cannot be directly observed, spatio- in a variety of tumor types [13,16–19,21,22,25,29–34]. Intriguingly, temporal patterns of genetic variation amongst tumor cells are stably these and other cancer sequencing studies suggest that distinct modes

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx 3 of evolution are operative during tumor progression and in different may be restricted to certain tumor regions or more pervasive, but as tumor types. However, this has yet to be systematically evaluated. these events initially occur in a single cell, they only become detectable More generally, while the elements of somatic evolution have been de- after subsequent rounds of cell division or as a result of a clonal expansion. fined [7,8,35] and include mutation, genetic drift, migration, population As discussed in subsequent sections, multiple factors influence the structure, and selection, the evolutionary dynamics that are operative at ability to reliably detect subclonal variants, including the sensitivity different stages of progression in individual tumors are poorly charac- and specificity of an assay, tumor purity (or the proportion of tumor terized. A quantitative understanding of tumor dynamics has the poten- cells), sequencing coverage, as well as the nature of the sample and tial to define cancers' evolutionary playbook and might hold clues to as the mode of tumor evolution. To date, most large-scale cancer genome to how to better prevent, detect, and treat cancers. For example, the ear- sequencing efforts have profiled a single bulk sample. This requires lier the initial clonal expansion is detected, the less diverse and likely deconvolution to delineate the fraction of tumor cells harboring a less fit the cell population, and the better the prognosis. Hence, knowl- given alteration, and poses limitations due to tissue architecture edge of the initiating events and their relative temporal ordering may in solid tumors, which can result in the spatial segregation of inform strategies for earlier intervention. Likewise, computational and subclones and therefore sampling bias. Sampling of multiple spatially mathematical modeling can provide insights into mechanisms of pro- separated bulk tumor regions or microdissected subregions such as indi- gression and enable the inference of patient-specific tumor dynamics. vidual glands can mitigate sampling bias and provide varying degrees of Such information may ultimately inform the design of rational treat- spatial resolution. However, these approaches still require deconvolution. ment strategies. In contrast, enables direct measurement of clonal In this review, we provide a population genetic perspective on the genotypes, but is not yet scalable to large numbers of cells and suffers origins of tumor heterogeneity and summarize approaches to quantify from measurement errors. Each of these methods requires tissue dissoci- genetic diversity, drawing from established theory for studying genetic ation so spatial information is lost if not recorded during sampling. variation within and among natural populations in light of the forces of In situ analyses that retain tissue architecture have typically been mutation, genetic drift, selection, and demography. We summarize evi- limited to relatively low resolution techniques such as fluorescent in dence for different modes of clonal evolution based on recent cancer ge- situ hybridization (FISH). However, these can be complemented by nome sequencing studies and the potential clinical relevance of the high-resolution spatial tumor sampling studies, where both approaches resultant heterogeneity. Finally, we discuss emerging evolutionary have revealed the intermingling of heterogeneous subclones. For exam- strategies to therapeutically exploit tumor heterogeneity and steer clon- ple, analysis of EGFR copy number states in GBM via FISH exposed strik- al dynamics. ing patterns of mosaicism [43]. Copy number heterogeneity was similarly reported for both EGFR and PDGFRA and found to be associat- 2. Genetic and non-genetic tumor heterogeneity ed with differential growth factor responsiveness in GBM [44]. These findings were recapitulated by genome-wide copy number profiling of 2.1. Genomic heterogeneity within and between tumors multiple tumor sectors from a cohort of 11 GBM patients [16]. More- over, parallel transcriptional profiling revealed the presence of multiple Conventionally, the distinct stages of tumor initiation and subse- molecular subgroups within a given patient, highlighting the challenges quent growth have been assumed to proceed in a ‘linear’ sequential that ITH poses for precision medicine. In CRC, single gland neutral meth- fashion in which successively more fit ‘driver’ events arise within a ylation tag profiling was consistent with the presence of numerous clone, resulting in the replacement or clonal succession of less fit intermixed subclones [37]. More recently, multi-region, single gland clones through selective sweeps [8,36]. While the sequential and single cell analyses revealed extensive genetic variegation and model may accurately describe tumor initiation, it is not known wheth- subclone mixing at the copy number and mutational levels in both ade- er this applies to most solid tumors. Sequencing of established primary nomas and carcinomas [21], as discussed in greater detail in Section 4. tumors has revealed extensive ITH in diverse tumors as demonstrated Computational modeling of spatial tumor growth and statistical infer- via multi-region profiling of clear-cell renal cell carcinoma (ccRCC) ence on the genomic data revealed that most detectable ITH occurs [13,18], (CRC) [21,37], (GBM) [16,22], early during tumor growth, long before the lesion is clinically evident. non-small cell lung cancer (NSCLC) [30], lung adenocarcinoma [29], These findings were subsequently corroborated by reports that cervical [38] and ovarian cancer [17], where somatic alterations subclonal alterations originate early in colorectal adenomas/carcinomas are often restricted to distinct tumor regions. Several of these studies [45–48]. Hence, ITH is already present in early lesions and continues to have reported evolutionary trajectories in which distinct driver alter- accrue during progression. Subclone mixing has been reported in other ations are ubiquitous (public/clonal) and therefore located on the solid tumors, when detailed multi-region sampling was performed [49], trunk of the tree, whereas other drivers are heterogeneous (private/ whereas such patterns would be obscured by bulk profiling. Single cell subclonal) and restricted to certain branches of a . sequencing has similarly revealed extreme levels of genetic heterogene- As a result of these and other studies, ITH is now accepted as an in- ity [32,50–53]. Intriguingly, early studies in GBM reported that geneti- herent feature of . The observation that many tumors exhibit cally distinct cell subpopulations can cooperate to promote tumor genetic variegation has therapeutic implications since actionable ‘driv- growth, progression, and maintenance [54]. Several recent studies er’ mutations may be suboptimal targets if they are restricted to have similarly reported clonal cooperation in a mouse subclonal branches of the tumor's ancestral tree. Moreover, ITH itself model [55] and xenografts [56]. Collectively, these findings highlight may represent a potential prognostic [26,39]. There are nu- the potential for a minor tumor cell population to promote ITH. They merous excellent reviews on ITH [15,40–42], hence we do not attempt also suggest that interdependencies amongst subclones may represent to comprehensively cover this vast literature here, but rather to intro- potential therapeutic vulnerabilities. duce relevant concepts for quantifying, interpreting and exploiting ITH Despite the extensive diversity, convergent evolution is apparent in within an evolutionary framework. A crucial concept for understanding some tumors, suggestive of constrained evolutionary trajectories. This is cancer genomic data is that clonal (public) mutations correspond to most notable in ccRCC, where on a background of clonal (truncal) events that accumulated in the founding tumor cell lineage during initi- inactivating mutations in the von Hippel Lindau (VHL) gene and loss ation and prior to transformation such that they are present in all cells of of heterozygosity (LOH) of 3p, harboring the second the tumor (Fig. 1). These include both pathogenic alterations that con- copy of the VHL gene, spatially separated subclones were found to har- tribute to disease progression and benign passengers. Subclonal alter- bor distinct mutations in multiple members of the SWItch/Sucrose non ations occurring after transformation are by definition present in a Fermentable (SWI/SNF) chromatin remodeling complex, including subset of the tumor cell population and hence considered private.They PBRM1, SETD2, ARID1A and SMARCA4 or even distinct mutations in the

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 4 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx same gene [13,18]. Convergent evolution of EGFR mutations/amplifica- 2.2.1. Mutational heterogeneity tions was also observed in GBM via single nucleus sequencing [57], As much of tumor progression is occult, by the time the lesion is clin- and in chronic lymphocytic (CLL) [58]. These findings imply ically evident (and composed of tens of millions to billions of cells), tens that alterations in these pathways are selectively advantageous and to hundreds of thousands of mutations can have accrued. Comprehen- suggest genetic canalization towards a particular . Such pat- sive cancer genome sequencing studies have revealed that the aberra- terns have only been noted in a minority of tumors examined to date, tion landscape is vaster than previously thought with different genes although the integration of multiple ‘omic’ features may unmask this. mutated in individuals with the same histological tumor type [36,83, In the context of targeted therapies, which impose a highly specificse- 84]. As such while a handful of genes are recurrently mutated at high lective pressure, convergent evolution of on-target point mutations in frequency in the population, the vast majority of alterations occur infre- domains and the activation of compensatory pathways have quently, resulting in the so-called ‘long-tail’ distribution of putative been observed [59,60]. driver genes. Considerable (N1000-fold) variation in the frequency of Most genomic profiling studies to date have focused on primary tu- non-synonymous mutations was observed across tumor types with pe- mors, whereas relatively fewer have examined genetic divergence be- diatric cancers exhibiting the lowest mutational burden, and melanoma tween paired primaries and metastases or recurrences arising at local and lung cancer the highest. In some cases, the elevated mutation fre- or distant sites after initial therapy. This is due, in part, to the challenge quency could be explained by exposures to known carcinogens, namely in obtaining biopsies from distant metastatic sites, which are sometimes UV radiation in the case of melanoma and tobacco smoke in lung cancer. inaccessible and the time lag between initial diagnosis and disease re- Even amongst tumors of the same type, mutation frequencies varied currence, wherein procedures may be performed at different centers. multiple orders of magnitude (0.01–100/Mb), highlighting diversity in Given that therapy can alter a tumor's evolutionary trajectory, it is im- somatic mutation rates. portant that the genomic data are interpreted in this context. Several studies have comprehensively profiled paired primary and recurrent disease in glioma [19],GBM[22,61], ovarian cancer [62],AML[63], 2.2.2. Genomic instability ALL [64], neuroblastoma [65]. It has been more than 40 years since Loeb et al. [85] proposed the Genomic concordance between paired primaries and metastases has mutator phenotype hypothesis, which posits that errors during DNA been investigated in diverse solid tumors at the level of somatic single replication promote malignant progression [86]. Two years later, nucleotide variants (SSNVs) [13,66–73], structural variants (SVs) [74], Nowell's perspective piece on the clonal evolution of cancer proposed copy number aberrations (CNAs) [66,67,75,76] [73] and loss of hetero- that heterogeneity in cancer results from increased genetic instability zygosity (LOH) [77,78] with varying degrees of genomic divergence re- during disease progression [7]. Cancer genome sequencing efforts ported. The inferences that can be drawn from these efforts depend on have revealed genomic instability resulting from point mutations the study design, platform(s) used, depth of sequencing (when rele- (SSNVs), CNAs, and SVs, each of which can contribute to elevated genet- vant), and extent of clinical and treatment information. For example, ic diversity. For example, chromosomal instability (CIN) leading to CNAs studies that are targeted in nature preclude the unbiased discovery of and SVs [87,88] results in cells in which the mechanisms that ensure the global genomic similarities and differences. Sampling bias and tumor fidelity of chromosome segregation during cell division are compro- spatial structure, as well as intervening therapy can confound the inter- mised. As such, chromosomal aberrations can dramatically impact cel- pretation of patterns of heterogeneity between lesions. Hence, all of lular fitness [89,90], which further propagates aneuploidy and these factors should be reported, even if they cannot practically be con- facilitates rapid evolution [91]. trolled for. Another form of , known as microsatellite insta- The genomic characterization of multiple metastases from the same bility, results from DNA slippage events at simple tandem repeat ele- patient has been limited to a handful of studies [18,34,79,80], which ments present throughout the genome, and is common in tumors tend to suggest greater similarities amongst metastases than between with defective DNA mismatch repair (MMR). Although microsatellite the primary and . For example, in a large study of paired pri- instability (MSI) has been appreciated for over two decades beginning maries and metastases (n = 86), Brastiantos et al. [34] reported that with its discovery in familial CRC patients [92], the advent of next-gen- spatially and temporally separated brain metastases were often geneti- eration sequencing (NGS) has revealed fresh insights into its functional cally homogenous, whereas 53% of cases harbored potentially clinically implications [93]. As the mechanisms and consequences of genome in- informative alterations in brain metastases that were not detected in stability in relation to tumor heterogeneity have recently been reviewed the matched primary. The authors also reported that extra-cranial me- [40,94], here we focus on core concepts and points relevant to an under- tastases were often distinct from the brain metastases. standing of the population dynamics amongst tumor cells. Although multi-region sampling is essential for the accurate A particularly intriguing aspect of genomic instability is that it ap- reconstruction of tumor evolutionary histories [21,81], few studies pears to be associated with cellular fitness tradeoffs [89,95–97].Forex- have performed MRS of both the primary and metastasis and this ample, aneuploidy has been shown to both initiate and inhibit tumor has generally been restricted to one or a few cases [47,73,82]. Hence, progression depending on the extent and context [98]. On the one there remains a need to characterize larger patient cohorts at a ge- hand, aneuploidy appears to be associated with tumor cell fitness ad- nome-wide level in order to delineate concordance between primaries vantages. For example, in CRCs whole genome doubling events are com- and metastases, as well as the timing of metastasis and patterns of monly thought to precede the acquisition of a CIN phenotype and to seeding. endow the tumor with tolerance to additional genome instability, there- by driving further evolution [97]. Indeed, genome doubling is associated with worse relapse free survival in colorectal [97] and ovarian [99] can- 2.2. Sources of tumor heterogeneity cers. CIN has also been associated with poor prognosis [100].Onthe other hand, overexpression of MAD2, a gene contributing to chromo- Cancer is a disease of the genome, resulting from both endogenous some instability (CIN), usually leads to cell growth arrest or death, sug- and environmental damage to DNA, which can be propagated during gesting that it is disadvantageous to cells [90]. These apparent normal cell division. In this review, we focus on genetic heterogeneity contradictions may be explained in part by the findings of Laughney et in cancer and the evolutionary forces that shape these patterns. Howev- al. [89] who demonstrated that cancer cells have evolved to exist within er, tumor heterogeneity manifests not only at the genomic level, but at a narrow range of chromosome missegregation rates that optimize phe- the cellular, phenotypic and microenvironmental levels and they jointly notypic heterogeneity and clonal survival, suggesting that CIN is under influence one another. selective constraints.

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx 5

Mutational burden has also been hypothesized to achieve saturation, hypoxia and acidosis have been reported during breast tumor evolution where additional mutations are lethal, resulting in “error catastrophe”, [122] and invasiveness has been attributed to adaptive changes in the that render the tumor more sensitive to insult and therefore confer fa- microenvironment [123,124]. Additionally, cancer cells at the tumor pe- vorable outcome [95]. Indeed, higher total mutation burden does not riphery versus necrotic core can have different resulting universally associate with worse prognosis. This is perhaps best exem- from regional variation in microenvironmental selective forces, where plified in CRC, where mismatch repair (MMR) deficiency results in a local populations rapidly converge to the fittest phenotype hypermutator phenotype and excess (100-fold) SSNVs, and yet MMR- given a stable environment [124]. deficient microsatellite unstable (MSI-high) tumors tend to have favor- In summary, the genetic, non-genetic, and environmental forces at able prognosis compared to microsatellite stable (MSS) tumors [101]. play during a tumors long evolution shape the detectable patterns of Given that genome instability has been established as an important ITH in a tumor sampled at the time of diagnosis. While studies to date determinant of genetic diversity [40,102], it is critical to understand have largely focused on genomic heterogeneity, in part due to greater how tumors respond to these insults, and how they propagate diversity ease in measuring it, new techniques to delineate the genotype to phe- as this may inform strategies to limit ITH. While a common feature of notype map in cancer and to probe in situ many cancers, variable levels of mutational burden and genomic insta- may offer opportunities to characterize their relationship. bility are observed within and between tumor types [83,103].Inadult tumors, relatively genomically (copy number) stable subgroups of 3. Quantification of genetic diversity in cancer genomes breast [104] and gastric cancers [105] have been reported. Moreover, some pediatric cancers have remarkably ‘quiet’ genomes, as is the case 3.1. Practical challenges in quantifying ITH in tumor samples for synovial sarcoma, which is commonly driven by the SS18-SSX onco- genic fusion, resulting in disruption of the SWI/SNF (BAF) chromatin The quantification of ITH from current NGS data is challenging for regulatory complex [106]. Thus, while genome instability is a critical multiple reasons. From a practical perspective, the power to detect source of tumor heterogeneity in many cancers, it does not appear to subclonal somatic single nucleotide variants (SSNVs) is limited in cur- be universally essential for disease progression. rent bulk tumor WES studies, which often aim to achieve 80–100× cov- erage (with WGS often lower) [84].Thisisparticularlyproblematic 2.2.3. Cellular, phenotypic and microenvironmental heterogeneity when coupled with the low purity of many tumor samples, which con- Tumor cell heterogeneity can manifest at the level of cellular mor- tain a heterogeneous mixture of non-cancerous and cancerous cells. phology, proliferation kinetics, cell motility, and cell metabolism Ideally purity is estimated from adjacent hemotoxylin and eosin amongst others [107,108]. These differences can be attributed to both (H&E) stained tissue sections prior to nucleic acid extraction to guide genetic and non-genetic variation, where the latter includes epigenetic the target depth of coverage. When needed, flow sorting [125], single and transcriptional changes, as well as microenvironmental factors. gland isolation [21,37] or microdissection [126,127] can be performed For example, epigenetic changes can shape cellular phenotypes and to enrich for tumor cells. As pathology based purity estimates may not modulate cellular plasticity [109,110]. Phenotypic heterogeneity pro- agree with molecularly inferred estimates obtained using tools such as vides cancer cells with immense plasticity to cope with environmental PurBayes [128], ABSOLUTE [99],orTITAN [129], it can also be beneficial stress during tumor progression in the face of limited resources and to perform initial sequencing to determine how much additional cover- hence may also be important for allowing tumor cells to escape thera- age is needed to achieve a given power for detection. However, even peutic insult. Indeed, phenotypic heterogeneity is now recognized as a then, genome sequence context, and high duplication rates can hinder major source of drug resistance [111,112]. Early work demonstrated uniform and optimal target coverage, the latter of which can be partic- the presence of reversible drug tolerance associated with specificchro- ularly problematic for formalin fixed paraffin embedded (FFPE) clinical matin states and the re-establishment of heterogeneity of the putative specimens. Nonetheless, given the wealth of clinically annotated archi- (CSC) surface marker CD133 [109]. Subsequently, val tissue samples, it is important to optimize both the study design and stem-like characteristics were reported to contribute to the stabilization bioinformatics approaches for such specimens. of drug-tolerant populations for extended time periods [111]. These and Perhaps even more problematic is the issue of sampling bias, which other studies point to the fact that high-dose cytotoxic therapies can in- is inevitable in solid tumors defined by their unique tissue architecture duce pathogenic cell state transitions in the residual tumor cell popula- and spatial structure. Multi-region sequencing (MRS) of spatially sepa- tion, wherein they revert to a primordial (stem-like) developmental rated tumor regions can help mitigate the issue of sampling bias, and state [111,112]. Intriguingly, other studies have demonstrated pheno- has become more common in recent years [13,16–19,21,22,25,29,30, typic divergence between distant metastases relative to primary breast 34]. Indeed, several recent studies have highlighted the need for MRS tumors and lymph node metastases [113], consistent with the notion to accurately reconstruct tumor subclonal architecture [81,130–133] that elevated diversity in advanced stage disease contributes to treat- and infer tumor evolutionary dynamics [21,37]. From a practical per- ment resistance. spective, it is not feasible to randomly sample the entire surgically It is also evident that stochastic fluctuations can be functional, partic- resected lesion and in most cases, the location of the specimen within ularly in constrained conditions such as development [114–116], stress the tumor is not known. For biopsies, the specimen itself is limited response [117],apoptosis[118] and cancer progression [119,120]. Fur- and hence only provides a local sampling. Moreover, multiple factors ther, non-cell-autonomous factors can contribute to tumor growth can influence the extent of sampling bias, including spatial constraints, and promote clonal interactions that may, in turn, lead to new pheno- the mode of evolution, and the rates of migration, which are as of yet types [56]. These non-genetic components occur on the backdrop of ge- poorly characterized. Despite these complicating factors, it is evident netic variation and jointly shape a tumor's evolutionary trajectory. As that MRS aids the detection and quantification of ITH [21,134]. tumors progress, irrespective of their tissue of origin, similar microenvi- The distribution of variant allele frequencies (VAF) critically de- ronmental cues appear to arise as a result of spatial constraints, limited pends on the underlying mode of tumor growth [21,135]. For example, nutrients and vasculature, as well as activation of the immune system in colorectal tumors characterized by Big Bang growth dynamics, after [121]. This is in accordance with the long-held view that advanced can- transformation, the tumor expands in the absence of stringent selection, cers exhibit similar defining hallmarks [10]. compatible with effectively neutral evolution [21]. Hence, in this model, Microenvironment factors, including immune and stromal cells, the timing of a mutation is the fundamental determinant of its frequen- acidification, oxygen, and geographical constraints also contribute to cy and later arising subclonal mutations will be present in increasingly phenotypic heterogeneity within tumors, influencing signaling path- diminutive fractions of the tumor cell population (Fig. 1A). As such, ge- ways and regulatory circuitry. For example, cellular adaptations to netic diversity will be vastly underestimated [21], as has been predicted

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 6 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx theoretically [127]. In a model of effectively neutral evolution, subclonal stochastic fluctuations and errors introduced during early PCR steps variants are not functionally relevant for primary tumor growth, but that are amplified exponentially, resulting in differences in biased allele they nonetheless provide a potentially rich substrate for subsequent se- frequency estimates [141–143]. Such artifacts can be mitigated to vary- lection in the context of therapy. In the alternate scenarios of continuous ing extents by modifying library preparation techniques to incorporate strong positive selection (Fig. 1B) or branched evolution (co-occurrence unique molecular identifiers and via circle [144] or duplex sequencing of multiple selectively advantageous clones, Fig. 1C) after transforma- [145], although these are not currently scalable to whole exome/ge- tion, subclonal variants that confer a selective growth advantage may nome analyses. attain high frequency (depending on various population parameters). Once SSNVs have been called, it is common to adjust their VAF In this case, they may be more readily detected, but in the absence of values for tumor purity and copy number to obtain an estimate of multiple sampling data, it is not always feasible to determine whether their cellular prevalence or cancer cell fraction (CCF). However, this is they are high frequency subclonal (private) SSNVs or public alterations a non-trivial task since multiple parameters are unknown (purity, ploi- due to the inherent noise associated with NGS (Fig. 2). dy, number of subclones) and often need to be estimated simultaneous- Numerous methods have been developed to perform somatic vari- ly. Moreover, these estimates depend on the relative ordering of CNAs, ant calling from tumor-normal pairs, including deepSNV [136] and SSNVs, and genome doubling events during tumor evolution. MuTect [137], which aim to achieve greater sensitivity for subclonal var- The accurate identification of subclonal CNAs from NGS is a challeng- iants. However in a typical NGS study, the limits of detection corre- ing task given that the distribution of short reads along the genome is sponds to a VAF of ~10% [137]. Critically, subclonal variants alone are non-uniform, with high variability amongst adjacent loci at standard se- informative in chronicling the dynamics of growth after transformation, quencing depths. As a result, only CNAs with relatively high frequency as variants that were accrued prior to transformation will have either can be reliably detected from current NGS data, and this depends on been lost or else present in the founding tumor and hence clonal in the segment length and copy number states. Methods to predict the the final tumor. On the other hand, it is possible to over-estimate ITH subclonal CNAs from tumor sequencing data include TITAN [129], when calling variants from multiple tumor samples in the same patient TheTA [146], CloneHD [147],andCloneCNA [148]. Tools designed to that have non-uniform coverage or purity. Purity differences can be es- infer CNAs from multi-region tumor sequencing data are currently lack- pecially notable in comparisons of paired primaries and metastases [34] ing, but have the potential to improve the detection of subclonal CNAs, or paired pre versus post-treatment samples [138,139] and hence are as well as the inference of tumor phylogenies, as discussed below. important to account for. The increasing availability of MRS also moti- Various tools have been developed to estimate the number of vates methods such as MultiSNV [140] that aim to jointly call SSNVs subclonal clusters from tumor sequencing data by grouping mutations from multiple samples. By exploiting pseudo-repeated measurements, on the basis of their allele frequencies, [133,147,149,150]. The majority such an approach can potentially aid the detection of false positives of these approaches aim to define ‘clonal’ relationships based on the as- and false negatives. Deep targeted sequencing or digital PCR are com- sumption that mutations with similar frequencies descended from the monly used for orthogonal validation of candidate SSNVs discovered same ancestral cell and that a limited number of dominant subclones through unbiased surveys. However, PCR based approaches (including with distinct mutational profiles have undergone clonal expansion. NGS library preparation) can suffer from ‘jackpot effects’ due to However, a single bulk sample may fail to delineate clonal alterations

Fig. 2. Schematic illustration of different modes of tumor evolution and the impact of spatial sampling on patterns of ITH. (A) After transformation, an established tumor may propagate in the absence of stringent selection (for example if the cells are already sufficiently fit), compatible with effectively neutral growth. Alternatively, the tumor may be subject to ongoing clonal selection. To illustrate these scenarios, Muller plots (http://doi.org/10.5281/zenodo.240589) depicting tumor subclone composition under a model of effectively neutral growth (upper panel) or strong selection (bottom panel) are shown. Descendant genotypes emerge within the parental clone, where height indicates genotype frequency (corresponding to the relative abundance of the genotype in the tumor cell population or cancer cell fraction) and the horizontal axis indicates time (generations). At the time of diagnosis or surgery, a single sample or multiple samples (denoted A, B) are sampled and subject to NGS. (B) Tumors arising under effectively neutral growth are expected to exhibit a bimodal site frequency spectrum (SFS) with a public mutational cluster centered at 0.5 and a large cluster of low VAF subclonal mutations, where θ indicates the threshold below which low frequency mutations cannot be reliably detected. This can be contrasted with the expected SFS under strong selection in which private SSNVs can attain high frequency due to regional clonal expansion of a selectively advantageous clone (purple shading).

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx 7 versus subclonal alterations that attained high frequency to selection or deconvolution, including those that can take as input multiple samples drift (Fig. 2). More generally, the inference of the number of subclones, [132,166,167]. However, even with multiple samples, there is no guar- their proportions and genotypes from single sample bulk tumor se- antee that low VAF SSNVs can be accurately phased. More recently, sev- quencing is challenging with the solution underdetermined under eral computational methods have been developed to jointly model most conditions [81,147,151,152]. MRS data should facilitate these in- SSNVs and CNAs [81,131,133] towards more accurate reconstruction ferences and have recently used to reconstruct tumor phylogenies, as of clonal trees. This represents an important methodological advance discussed below. Single cell sequencing enables the direct measurement and importantly suggest that many possible trees may be consistent of clonal genotypes, and hence avoids the need for deconvolution, mak- with the data [81]. However, these methods often couple the task of ing it a powerful approach [153,154]. However, this is complicated by mutation clustering and phylogeny reconstruction and require prior noisy and incomplete measurements and remains impractical on a clustering of CCF values derived from WGS/WES to reduce the feature large scale. Ultimately, a combination of bulk and single cell sequencing space to a limited number of subclones due to the computationally in- may help to further resolve tumor subclonal architecture. tensive nature of enumerating a large possible tree-space. Hence, this too can result in a loss of information. However, this new class of 3.2. Phylogenetic approaches to interpret ITH methods is in its relative infancy and is likely to remain an area of active research. Likewise, several recent methods have been described to re- Phylogenetic trees are the canonical structure for describing the construct phylogenies from single cell data [168,169], where phase is genealogical relationship among tumor cells. They offer a potential- known, but the data are generally sparse and noisy due to allelic drop- ly powerful and theoretically grounded approach to study heteroge- out. Single cell WGS has the potential to directly resolve clonal geno- neity within and between lesions [155,156],drawingfromtherich types, however this will require improved genome coverage and field of molecular phylogenetics, the principles and methods of mutation sensitivity since errors can distort lineage reconstruction which are reviewed elsewhere [157]. Here, we briefly outline con- [170]. siderations for the analysis of cancer genome sequencing data. Given the absence of ground truth phylogenies for actual tumors, When modeling NGS data, it is important to account for systematic simulation studies provide the primary means to evaluate a errors and to increase robustness to model violations. In the case methods performance. Hence, it is important that simulations used of cancer genomes, not only is this complicated by uncertainty in for benchmarking capture relevant aspects of tumor growth, SSNV and CNA estimates (as discussed above), but by extensive ge- which can be influenced by numerous factors (as discussed in netic diversity due to mixtures of cells for which both the number of Section 4). It is also essential to consider the underlying model as- clones and their relative proportions are unknown. The issue of sumptions and robustness to violations. For example, an infinite- sampling bias in solid tumors further necessitates that multiple re- site model is commonly assumed such that each site can mutate gions of temporally separated lesions be assayed to robustly recon- only once and persists. However, in practice, convergent evolution struct their ancestral relationship, although this has seldom been could result in the acquisition of the same mutation more than done in practice. once and mutations could be lost due to loss of heterozygosity To date, various approaches have been employed to reconstruct (LOH), thereby violating this assumption. Some methods also re- sample or subclone phylogenies from tumor sequencing data. Sev- quire assumptions about the underlying models of molecular evolu- eral studies have employed standard phylogenetic approaches, in- tion, which may require refinement in cancer. cluding parsimony [158], maximum-likelihood [159] and Despite the power of phylogenetic approaches to delineate tumor distance-based methods such as the neighbor-joining algorithm architecture and interpret ITH, there are limitations. In particular, phy- [160] to delineate the relationship between tumor samples based logenies alone do not report on the underlying dynamics [130].Howev- on SSNVs (often from CNA-neutral regions) or CNA breakpoints. er, genealogy-based statistical inference methods, stemming from the For example, distance methods were used to cluster SSNVs across development of coalescent theory [171] and the advent of routine multiple bulk samples in renal cell carcinoma [13] and CNAs in sin- large scale sequencing efforts have transformed modern computational gle breast cancer cells [32]. As distance-based methods take as input population genetics approaches [172]. Techniques such as Approximate summary statistics, they cannot account for complex character Bayesian Computing (ABC) have been used extensively in population (SSNV, CNA) states. Likewise, the exclusion of SSNVs in regions of genetics to infer posterior parameter distributions from molecular CNA to avoid the complex issue of phasing ignores potentially valu- data when using stochastic models for which likelihoods cannot be cal- able information, although for tumors that are largely diploid or culated [172,173]. Related approaches based on coalescent modeling have sufficient events in non-diploid regions, this simplifying ap- and Bayesian inference have been used to characterize stem cell dynam- proach may be reasonable. CNAs also represent a powerful feature ics in human colon crypts [174]. More recently, ABC has been integrated for phylogenetic inference, but require custom methods due to the within 3-dimensional agent-based computational models of tumor horizontal dependencies between adjacent loci and overlapping al- growth that represent cancer as an evolutionary process to enable the terations. An early method, termed TuMult [161] sought to over- inference of patient-specific tumor dynamics such as the mutation come the issue of overlapping aberrations by employing copy rate and CSC fraction [37], as well as subclone fitness differences and number breakpoints (which should be persistent). More recently, the mutational timeline [21]. MEDICC was developed to jointly phase parental alleles and per- form tree reconstruction [162]. This approach takes as input integer 3.3. Ecological and population genetic measures of ITH copy number profiles and B-allele frequencies and has been used to reconstruct tumor phylogenies from multi-region copy number The relationship between genetic diversity in cancer and population profiles [21,163]. evolvability may provide prognostic and therapeutic information [26]. Other studies have sought to perform the challenging task of Importantly, it is the total diversity (including both functional and selec- subclonal deconvolution so that subclones can be modeled as species tively neutral diversity) that likely determines the evolvability of cancer or taxa. For example, by performing in silico phasing of subclonal cells since in the absence of diversity is not operative. SSNVs and exploiting the pigeon-hole principle, a probable phylogeny This is in-line with the Dykhuizen-Hartl effect proposed by Kimura, in was inferred from high coverage bulk WGS data of a breast tumor which neutral mutations may later become adaptive in an altered envi- [164]. However, phylogeny reconstruction from a single bulk tumor ronment [175]. Hence, ITH may be a proxy for the underlying rate of sample will generally lead to an incomplete and potentially erroneous evolutionary processes operative in a tumor. While it is not clear how tree [37,165]. Numerous methods have been described to automate to best sample a tumor or to quantify ITH, several measures have been

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 8 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx borrowed from ecology and population genetics for this task and others associated with pathologic complete response, whereas levels did not have been defined based on recent cancer genomic data. change in tumors that failed to respond to treatment [20]. Importantly, One widely used measure of genetic diversity within a population is the extent of ITH was comparable irrespective of the chromosomal re- heterozygosity [176], or the probability that two sequences chosen at gion analyzed. A survey of various measures of clonal diversity was re- random from the population have different allelic types, which is equiv- ported by Merlo et al. [180]. alent to Simpson's Diversity Index in ecology [177]: More recently, the MATH index which corresponds to the ratio of the median absolution deviation (MAD) and the median of the VAF values ¼ −∑ 2 ð Þ D 1 i f i 1 was described [181] and used to quantify ITH in head and neck squa- mous cell carcinoma (HNSCC), where it was shown to correlate with where fi is the frequency of the allelic type or cell type, i. For asexually worse outcome [27]. reproducing populations such as cancer, the heterozygosity is the prob- ability that two cells chosen at random have different sequences. For ex- MADðÞ VAF MATH ¼ 100 ð4Þ ample, Ling et al. [127] derived that the expected heterozygosity at time MedianðÞ VAF t (denoted by Ht) during tumor growth can be expressed by:  −2u −2uj ¼ − e −∑t e ∏ j−2 − 1 ð Þ Other approaches to quantify ITH include determination of the num- Ht 1 j¼2 i¼0 1 2 Nt −1 Ntþ1− j−1 Nt−i−1 ber of subclones and their frequency based on shared CCF values [149, 150], although this requires additional estimates and assumptions, as where Nt is tumor size at time t starting with a single transformed cell at outlined in the preceding section. Zhang et al. found that lung adenocar- time 0 and assuming a well-mixed neutrally growing population of cinomas with higher subclonal mutational burden (n =3/11cases) synchronous cell generations. Here, u is the mutation rate per cell were more likely to relapse [29]. However, this was again based on a division for the 30 Mb exonic coding regions, namely u = μ ×3×107 limited number of cases. Additionally, in post- treated (e.g. corresponding to u = 0.03 when the per-base pair mutation rate pediatric kidney cancer, tumors with microdiversity exhibited worse is μ =10−9). A cancer genome can diversify rapidly during tumor disease-specific survival [182]. Other studies have reported an associa- growth as illustrated in Fig. 3 for a logistic growth model. In particular, tion between survival and various measures of ITH independent of H N 0.8 when the tumor is barely detectable at 1 million cells. Thus, other clinical and molecular variables in diverse tumor types [183, tumor growth may accompany rapid genetic diversification and a 184]. However, these data suggest a non-linear relationship between small tumor can already harbor extensive diversity. This prediction is the number of subclones and outcome, thereby complicating the consistent with recent findings that early tumors display high-levels interpretation. of genomic diversity [33,46,178]. In summary, several studies have suggested that ITH is associated Genetic heterogeneity within and between regional samples has with disease progression [26] and poor prognosis [27,28], raising the also been quantified using the Shannon index [179]: possibility that ITH may represent a general biomarker. However, lack of association is seldom reported and further studies are needed. More- ¼ −∑ ðÞ ðÞ H i f i ln f i 3 over, as ITH is already present in early lesions, it is of interest to deter- mine whether its predictive utility depends on disease stage. Finally, it where fi is the frequency of the allelic type or cell type, i. For example, will be important to understand which genomic features and measures the Shannon Index was used to measure microsatellite and copy num- of ITH are most informative, and this may depend on the underlying ber heterogeneity based on FISH in Barrett's esophagus where high di- modes of tumor evolution, which are as of yet poorly understood. versity was predictive of progression to [26]. It has similarly been used to and to quantify neutral methylation tag and 4. Evolutionary forces determining the extent of genetic copy number diversity in colorectal tumors [21,37], where high ITH heterogeneity was observed at different genomic levels within and between single glands. The Shannon index was also used to quantify genetic diversity Mutation, genetic drift, migration and selection are the fundamental based on immunoFISH of paired primary, lymph node and distant met- forces that collectively shape genetic diversity within a population astatic breast tissue [113], where diversity was found to be highest in [185]. Clonal evolution, the process of mutation accumulation and adap- distant metastases. Comparisons of pre versus post neoadjuvant treated tation in somatic cells, likely involves each of these forces, although breast tumors suggests that lower pre-treatment diversity was their collective impact on ITH is seldom considered. Here we discuss

rt rt 9 Fig. 3. Genetic diversification resulting from random neutral mutations. (A) Tumor growth via a logistic model, Nt =KN0e /(K + N0(e −1)), where K = 10 , r = 0.05 and N0 =1 correspond to the carrying capacity, growth rate and initial tumor size, respectively. (B) The expected probability of heterozygosity, H (equivalent to Simpson's Diversity Index), during tumor growth (shown in A) where the neutral mutation rate is u = 0.03 per cell division in exonic coding regions. (C) The relationship between heterozygosity and tumor size.

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx 9 these factors in the context of classical population genetic theory and re- For a growing population, the probability of fixation of a new mutation cent cancer genomic data. depends on both the time of occurrence and the birth/death rate. Assum- ing a branching process, Bozic et al. [201] derived that the probability of 4.1. Mutation fixation of k-th surviving lineage is given by:

ρ ¼ ðÞ= − ðÞδ k ð Þ Mutation is the primary source of genetic diversity, the rate of which k u u log 5 is a key parameter in population genetics studies [185,186]. Although cancer genome sequencing efforts have provided a catalogue of somatic where u is the neutral exonic mutation rate per cell division and δ the alterations across different tumor types [83,84], the per-cell division ratio of cell death and birth rates. Although, there is limited experimental mutation rate is not known and likely varies during the course of disease data, the ratio of death and birth rates in cancer has been estimated to progression. Mutational burden can also vary by orders of magnitude range from δ = 0.72 in aggressive CRC metastases [202] to δ =0.99in within and between tumor types [83,84]. Although the somatic muta- early tumors [203]. It can be seen from this formula that early tion rate is generally assumed to be elevated relative to the germline mutations can drift to fixation (clonality) with high probability during [186,187], even germline mutation rates can vary considerably within the early stages of clonal expansion when the cell death rate is high −9 families [188]. Additionally, the number of germ-cell divisions differs such that: ρ1 =0.22,ρ2 =0.05andρ3 =0.01foru =0.03(μ =10 ) in males (~102 divisions) and females (20–30 divisions) since ovum un- and δ = 0.9. Neutral mutations may therefore be present at high frequen- dergo two meiotic and 22 mitotic cell divisions and reach maturity prior cy in a tumor and detectable by NGS. Thus, while genetic drift is a negative to birth, whereas in males the number of germ cell divisions increases force that decreases whole-population diversity, it simultaneously in- with age [189]. Nonetheless, the mutation rate for single nucleotide var- creases the diversity of high frequency mutations due to random fluctua- iants (SNVs) in the human germline has been estimated to be ~10−8 per tions (Fig. 4) and hence is an important, but often overlooked, feature of base pair per generation [188] or ~10−10 per base pair per cell division, tumor sequencing data. whereas human somatic cells have been reported to accumulate 4 to 25 times more mutations than germline cells, corresponding to an estimat- 4.3. Spatial structure and migration ed mutation rate of ~10−9 per cell division per base pair [186,187]. However, there is no consensus regarding the mutation rate in human Spatial structure and migration can dramatically alter the genetic di- tumors. The finding that some tumors exhibit high genomic instability versity of a population [204,205]. In a spatially structured population, ge- led to the mutator hypothesis, which posits that an elevated mutation netic drift and selection are restricted in localized regions thus giving rise rate is a prerequisite of tumor development [96,190,191].While to decreased genetic diversity within a local subpopulation while in- mutator phenotypes likely accelerate tumor growth and are most effec- creasing genetic divergence between spatially segregated subpopula- tive if acquired early, they do not appear to be universally required for tions. As shown in Fig. 5A, spatial partitioning within a tumor naturally tumorigenesis [192,193]. gives rise to genetic divergence between samples from distant regions. More generally, the extreme variation in cancer risk across tissues This genetic segregation can be enhanced by local stochastic extinctions remains poorly understood. While it has been proposed that this is due to large demographic fluctuations at the expansion front and, hence due to differences in the lifetime number of adult stem cell divisions, during which DNA replication errors occur [194], this simple explana- tion has been debated [195–197]. Recent studies have sought to charac- terize mutation rates in diverse mouse tissues [170] and human adult stem cells [198] by sequencing of normal clonal organoid cultures de- rived from primary multipotent cells. Intriguingly, the initial mouse study reported differences in the mutation rate across tissue, as well as distinct mutational processes, whereas the subsequent study similar- ly suggest tissue-specific activity of mutational processes but a steady and consistent accumulation of mutations (approximately 40 de novo mutations per year) in the human small intestine, colon and liver. While further investigation is needed, this approach suggests new ways to quantify human somatic mutation rates.

4.2. Genetic drift

Genetic drift describes random fluctuations in the frequency of genetic variants in a population and is a major driving force of genetic diversity especially for small populations [176,185]. The neutral evolution theory posits that most fixed nucleotide substitutions in a population are driven by random genetic drift rather natural selection [199]. The neutrality hy- pothesis has been foundational for statistical tests of neutrality and com- putational analyses of natural selection from population sequencing data. However, the role of genetic drift in cancer is often overlooked as selection is generally evoked to explain most aspects of tumor evolution. For in- stance, it is usually assumed that detectable subclones are under strong Fig. 4. Genetic drift decreases genetic diversity in the whole tumor, while increasing selection [13,49,200]. However, theory predicts that early neutral muta- diversity amongst detectable mutations in NGS data. (A) The process of tumor tions during tumor growth may also be present at high frequency at the diversification in the absence of genetic drift (and selection), in which the frequency of a time of diagnosis since random drift predominates when tumor size is mutation (or cancer cell fraction, CCF) remains constant over time, corresponding to its small. Much of the theory concerning genetic drift derives from the frequency at the time of occurrence (1/n(t), where n(t) is the tumor size at time t). (B) The process of tumor diversification in the presence of genetic drift in which the Wright-Fisher or related models, which assume constant population frequency of a mutation fluctuates stochastically during tumor growth. Early mutations sizes (N genome copies) [185]. A well-known prediction is that the prob- whose initial frequencies are high can reach detectable frequencies, while late ability of fixation in the population for a new neutral mutation is ρ =1/N. mutations remain non-detectable under a model of effectively neutral evolution.

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 10 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx

Fig. 5. Spatial structure and migration contribute to intra-tumor heterogeneity. (A) Schematic illustration of tumor biopsy or resection specimens derived from multiple local regions of a tumor expanding under stringent spatial constraints. Due to the spatial constraints, clones are partitioned and the genetic diversity within regions (R1, R2 or R3) is smaller than the diversity between regions (e.g. R1 vs R3). (B) While the three regions exhibit equivalent clonal composition, one cannot distinguish whether subclone mixing occurred early during tumor growth. (C) Clone map derived from a virtual tumor simulated within an agent-based tumor growth model with spatial constraints and subclone mixing. Each colored region represents an individual clone composed of cells that share the same mutations. Most clones are locally restricted, whereas the green clone spans distant regions of the tumor. (D) Migration and metastasis can significantly increase the divergence between tumors due to augmented genetic drift and clonal selection.

explains spatial heterogeneity in tumors [206]. These patterns are espe- in clonal expansions within the niche, regionally in the tumor, unless cially relevant for interpreting NGS data from solid tumors characterized the mutations promote dedifferentiation of non-stem cells to stem by significant spatial constraints since biopsies and even resections are cells [213]. CSC-driven tumor growth likely impedes mutational adapta- often obtained from a localized region of the tumor. Such local sampling tion due to the decreased effective population size. Microenvironmental is inherently limited in its ability to capture much of the (detectable) ge- heterogeneity can also promote the emergence of spatial structure in netic diversity in a tumor (Fig. 5). Recent multi-region sequencing (MRS) tumors [214] and the plasticity of cellular phenotypes that are often reg- studies have revealed extensive genetic diversity between different sam- ulated by epigenetic factors. Importantly, the efficiency of selection ples from the same tumor suggesting that spatial constraints are signifi- resulting from genetic mutations would be dampened under high levels cant in many tumor types [13,16,18,21,23,29,30,127,207].Hence,tumor of phenotypic plasticity [215]. evolution may be driven by adaptations to distinct microenvironmental Given the spatial constraints within solid tumors, spatial information niches amongst spatially partitioned populations. is important for resolving the evolutionary history of a tumor and the Spatial structure thwarts selection and promotes the co-existence of timing of mutation occurrence. For instance, without knowledge of multiple selective clones when selection is pervasive. Martens et al. sampling locations, subclonal composition can be explained by differ- [208] modeled crypt-structured tissues to study the effect of spatial ences in tumor growth dynamics, stringent spatial constraint (Fig. 5A) structure on colorectal tumor initiation where selection is expected to or early subclone mixing (Fig. 5B). However, knowledge of spatial loca- be stringent. They report that spatial structure significantly increases tion can help to constrain these scenarios. For example, if samples R1 the waiting time to transformation due to elevated clonal interference and R2 are spatially separated, early clonal mixing is more likely than between selectively advantageous clones compared with that in well- a stringent spatial model. Therefore, the integration of spatial sampling mixed populations such as leukemia. More generally, they proposed information within a computational modeling framework (Fig. 5C) can that in somatic cells, the damaging effect of deleterious mutations can aid the inference of tumor evolutionary dynamics. be mitigated by differentiation and apoptosis and that tissue structure Cell migration has a dual role in the generation of tumor genetic diver- minimizes the risk of cancer [208]. However, the contributions of spatial sity. On the one hand, within-tumor migration relaxes spatial constraints, structure to these dynamics are complex [209] and warrant further in- thereby augmenting the efficiency of selection, giving rise to lower genet- vestigation. For example, spatial structure does not always impede se- ic diversity within a tumor. On the other hand, out-of-tumor migration lection. In contrast, it can accelerate complex adaption, a scenario and metastasis will increase the genetic divergence between tumors where adaptation requires the accumulation of multiple mutations (Fig. 5D). By simulating spatial tumor growth with cell migration, Waclaw (e.g. a double hit to inactivate a tumor suppressor), depending on the et al. [216] showed that out-of-tumor migration can dramatically acceler- epistatic fitness landscape [210]. ate tumor growth and increase tumor diversity. Recently, comparative ge- CSC organization within tumors promotes spatial structure and in- nomic sequencing studies revealed genomic divergence between paired creases phenotypic heterogeneity [211,212]. Due to the limited prolifer- primaries and metastasis [34,80,217] and multi-focal tumors [23,218], ative potential of non-stem cells, only mutations in stem cells can result suggesting that migration drives tumor diversification.

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx 11

4.4. Selection alterations might be deleterious [229]. Indeed, as discussed in Section 2.2, chromosomal aberrations can impact cellular fitness [89,90,97]. 4.4.1. Positive selection Given the complete genomic linkage of asexually reproducing Cancer is traditionally viewed to result from the stepwise accumula- populations (due to the lack of recombination) such as tumor cells, in- tion of mutations, where cells transition from a normal health state to terference between positive and negative selection (known as the pre-malignant, malignant and migratory phenotypes. The process be- Hill-Robertson effect) [230] may be particularly strong when accompa- gins when a single somatic cell acquires a heritable mutation that con- nied by high mutation rates [231]. Intriguingly, McFarland et al. [232, fers a fitness advantage, resulting in an increased proliferation or 233] found that deleterious mutations significantly alter tumor progres- survival phenotype that enables it to outcompete less fit cells within sion by impeding the efficiency of positive selection. They further the population. Natural selection operates on this phenotypic diversity, showed via simulation studies that some pre-malignant lesions die leading to sequential waves of clonal expansions and heterogeneity out due to negative selection, suggesting the possibility of novel thera- amongst subclones [7]. Within the current view, ongoing strong selec- peutic strategies through dramatic elevation of the mutation rate. It is tion is assumed to be the primary force and to govern all stages of tempting to speculate that the apparently paradoxical relationship be- tumor progression from initiation to subsequent primary tumor growth, tween variable levels of genomic instability and cellular fitness tradeoffs as well as metastasis where the acquisition of additional ‘drivers’ results is compatible with mutational meltdown and Muller's Ratchet [234, in continual selective sweeps [8,36,84]. In contrast, mutation, genetic 235], wherein deleterious alterations drive population extinction. drift, population structure, and migration are often overlooked, al- While this may be particularly acute for small asexually reproducing though they represent key evolutionary forces that influence the expan- populations, it is not clear whether this applies in highly adaptive cancer sion, fixation and extinction of subclones, as outlined in the preceding cell populations or how it is influenced by the interplay between delete- sections. The population dynamics and distribution of passenger, ad- rious and advantageous mutations [233]. vantageous and deleterious mutations can be understood through pop- ulation genetics models, where the relative importance depends on key 4.5. Evidence for different modes of tumor evolution parameters such as population size. For example, genetic drift within a small population can render selection less efficient, by allowing for the 4.5.1. Pre-tumor progression: multiple clonal expansions to transformation fixation of deleterious mutations and loss of advantageous mutations. It has been appreciated for nearly two decades that more than half of Various computational methods have been developed to identify all somatic mutations in colorectal cancers occur prior to transformation ‘driver’ mutations that confer a selective growth advantage (reviewed [1,2]. Critically, the initiating (epi)genetic events provide a selective fit- in [219]). The majority of approaches evaluate whether a gene is altered ness advantage in the target cell relative to other pre-malignant cells, more frequently than expected by chance based on patterns of recur- resulting in clonal expansion. Initiating events are often tissue type spe- rence across tumors, where it is critical to account for replication timing cific as is the case for APC in gastrointestinal tumors [3–5] and DNMT3 in and variable mutation rates throughout the genome (or in leukemias [6], suggesting that they are beneficial only in certain cellular hypermutated tumors) that can otherwise bias these patterns [83, contexts. It has long been assumed that the cell of origin may be a self- 220]. In contrast to these cohort-based approaches to define putative renewing stem cell and that this is the “unit” on which selection driver genes, formal methods to detect selection in an individual operates in cancer [8,236], where cancer reflects a state in which the tumor have been described in only a few instances [21,221]. For exam- normal balance of self-renewal and differentiation is lost. This is sup- ple, Sottoriva et al. [21] employed a spatial computational model of ported by observations that higher stem cell activity is associated with tumor growth and statistical inference framework to infer subclone fit- poor prognosis in multiple tumor types [9]. The subsequent expansion ness differences. Martincorena et al. [221] evaluated average mutant of a subclone depends on the rates of advantageous and deleterious mu- clone sizes and the dN/dS ratio, a measure of the relative abundance tations, the fitness benefit they confer in a given background population of non-synonymous to synonymous mutations, and report evidence and the effective population size. for strong positive selection for mutations in normal skin, although Several studies have characterized genetic diversity in precursor or the interpretation has been debated [222,223]. More generally, it has pre-invasive lesions, including the seminal study by Maley et al. [26], been assumed that the presence of subclonal known or putative drivers demonstrating that clonal diversity predicts progression from Barretts implies selection [13,29,30,164]. For example, in a breast cancer subject esophageal to esophageal adenocarcinoma (ESCA). More recently, to high-depth WGS high frequency subclonal mutational clusters were WGS and targeted sequencing was conducted to delineate mutational observed [164], compatible with the clonal expansion of a driver muta- ordering in pre-invasive esophageal lesions [237], and reported that tion accompanied by numerous hitchhiking passengers. Although this is mutations in ESCA occur very early during disease development, a reasonable assumption, the presence of subclonal ‘driver’ mutations where several additional drivers occur during later stages of progression alone does not necessarily guarantee that the mutation conferred a fit- with both diagnostic and therapeutic implications. Pre-tumor progres- ness advantage. Fitness estimates for specific mutations are unknown sion in the colon has also been studied by examining clonal evolution in human tumors, but mutations can be context dependent and epistatic in stem cell populations [238,239]. In particular, Baker et al. [239] re- [224,225] with differential effects during tumor initiation versus subse- ported that the number of functional stem cells is 5–6 in both normal quent expansion, potentially confounding the interpretation of such patients and individuals with familial adenomatous polyposis (FAP) as- patterns. Moreover, many putative ‘drivers’ have yet to be functionally sociated with germline APC mutation. In contrast, in adenomatous and phenotypically characterized. Another strategy is to evaluate evi- crypts (APC(−/−)), an increase in both functional stem cell number dence for convergent evolution, as has been done in ccRCC [13,18, and the loss/replacement rate was noted as was the rate of division 226],GBM[57] and CLL [58], providing evidence for selection. for colonic crypts. Other studies have sought to experimentally derive fitness measurements of common drivers (such as APC and KRAS)dur- 4.4.2. Negative selection ing tumor initiation in the mouse small intestine and report that while Negative selection by deleterious mutations is pervasive in the evo- they each provide strong selective advantages, mutation fixation is lution of natural species and can significantly decrease population di- highly inefficient due to hierarchical tissue architecture, strong genetic versity through purifying selection [227,228]. Classic molecular drift and stochastic stem cell replacement [240]. Multiple studies have evolution theory posits that most newly arising mutations are slightly revealed the early acquisition of tumor heterogeneity in non-invasive deleterious and purged from the population quickly [185,199].Howev- adenomas [21,45,47,241]. Breast tumors also have well defined precur- er, the extent to which this applies to somatic cell evolution is unclear sor lesions, facilitating studies of tumor initiation and early growth, since tumors can harbor high levels of genomic instability in which where genetic diversity was similarly reported to occur early [178]

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 12 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx and reconstruction of phylogenetic trees amongst pre-invasive lesions that would have dramatically altered subclonal architecture (by altering revealed lineage heterogeneity both within and between lesions [33]. clone frequencies), compatible with effectively neutral evolution. Clonal labeling techniques have also been employed to delineate Subsequent studies have corroborated aspects of ‘Big Bang’ dynam- mechanisms of tumor initiation and growth by genetically tracking ics [46,47,247], suggesting that they may be relatively common in colo- the fate of clones [242–244]. In particular, Dreissens et al. [242] reported rectal tumors. Others have considered more strict models of neutrality. the identification of two populations in benign papilloma cells, corre- For example, Ling et al. reported neutral evolution in hepatocellular car- sponding to a small, but persistent stem-cell-like population that rapid- cinoma via in depth multi-region profiling [127]. Borrowing established ly divides and a slower cycling transient population that gives rise to approaches from population genetics, they analyzed the site frequency terminally differentiated tumor cells. In parallel, mathematical model- spectrum (SFS) [248] in a single liver tumor where 24 samples were sub- ing of tumor initiation has sought to determine the number of drivers ject to WES and 300 samples were deeply genotyped and found that the required to initiate a tumor [245], suggesting that only three events observed data could not reject the expected distribution arising under a are required for lung and colorectal cancer. As the timing of genomic neutral exponential growth model, assuming a well-mixed population. events that occur prior to transformation is obscured when studying They also reported very high levels of genetic diversity, where the hetero- established tumors, it will be of continued importance to characterize zygosity between samples was estimated H = 0.941. Williams et al. [135] precursor lesions towards identification of the cell of origin, drivers similarly employed a neutral exponential growth model and attempted and temporal ordering of tumor initiating events as may inform treat- to fit somatic mutation data from bulk tumor samples in TCGA, resulting ment and prognostication. in the claim that 1/3 of tumors are compatible with neutrality.

4.5.2. Linear versus branched (selective) evolution 4.5.4. Punctuated evolution As described above, cancer genome sequencing efforts have revealed Several recent studies have reported punctuated clonal evolution extensive ITH in diverse tumors. Although relatively few primary tu- [31,32,83], karyotypic chaos [249], and other catastrophic events in var- mors of a given tissue type have been analyzed in detail via multi-region ious tumor types, providing further evidence that the linear, sequential or single cell sequencing, there have been multiple reports of ‘branched’ clonal evolution model does not accurately describe the patterns of ITH evolution [13,17,18,22,25,29,30,33] (Fig. 1C). Hence, these data contrast found in many human cancers. For example, chromothrypsis, a cataclys- with the early view that after transformation, subsequent tumor growth mic event involving surges of chromosomal rearrangements that occur typically follows a linear clonal succession model [9] (Fig. 1B), wherein a in a single or few cell cycles, has been reported in medulloblastoma, os- selectively advantageous mutation has time to sweep through the pop- teosarcoma, pancreatic cancers, and other tumors [125,250,251].By ulation prior to the acquisition of addition of additional advantageous generating multiple driver alterations in a single cell division, a greater mutations. Rather, branched clonal evolution occurs when multiple se- fitness benefit may be achieved than would be feasible through sequen- lectively advantageous subclones are present and complete via clonal tial alterations [252]. Chromoplexy is another type of catastrophic interference before fixation (Fig. 1C). It has been postulated that this event, wherein chains of translocations are formed across multiple may be indicative of adaptation to different microenvironments or , potentially due to double-stranded DNA breaks [253]. that there are several routes from a normal cell to an advanced cancer In contrast, Kataegis results in localized hypermutation, preferentially [246]. Of note, phylogenies are branching diagrams and so called at cytosine in TpC dinucleotides [254,255]. Collectively, these studies ‘branched’ phylogenies do not imply branched clonal evolution. The to- suggest that mutation rates are not constant during the course of pography of phylogenetic trees may differ depending on whether both tumor progression and highlight the importance of understanding a neutral (passenger) mutations and putative selectively advantageous tumor's evolutionary trajectory (including the distinct stages of initia- mutations are considered or just the latter. Moreover, in spatially struc- tion and subsequent growth) in light of multiple sources of genomic al- tured populations, such as solid tumors, there is a high chance of detect- terations, including SSNVs, CNAs and SVs. ing genetic divergence between regionally segregated samples due to enhanced genetic drift in localized regions. 5. Evolutionary strategies to exploit intra-tumor heterogeneity therapeutically 4.5.3. Evidence for effectively neutral tumor evolution after transformation Selection is necessary for tumor initiation as the stepwise process of 5.1. Understanding, predicting and preventing tumor progression mutation, selection and clonal expansion is what gives rise to a cell that harbors the constellation of events required for unchecked growth and Although tumor growth cannot be directly observed, patterns of ge- which becomes the founding tumor cell (Fig. 1). While it is generally as- netic diversity present in a lesion reflect the evolutionary forces that sumed that strong selection is operative and detectable throughout all gave rise to it. It is therefore not surprising that ITH since a tumor's fu- stages of tumor progression, this has not been systematically examined. ture behavior depends on its evolutionary course. In theory, the earlier Rather, recent findings have challenged the notion that strong positive the initial clonal expansion is detected, the less diverse and less fitthe selection is continuously operative during the growth of a primary cell population (since natural selection operates on diversity), and the tumor after transformation [21,127,135]. For example, Sottoriva et al. better the prognosis. Even relatively early tumors exhibit high genomic [21] described a Big Bang model of colorectal tumor growth, wherein diversity [33,46], which may reflect processes that were already opera- after transformation, the tumor grows as a single terminal expansion tive for some time and likely will continue in the absence of interven- populated by a large number of heterogeneous subclones that are not tion. The term “early” is indeed relative as the first clonal expansions subject to stringent selection. As such, in the Big Bang model, the timing will be invisible and the lesion must be large enough to be detected of a mutation is the primary determinant of its frequency in the final and sampled. In line with this view, we have previously proposed that tumor and most detectable subclonal alterations (resulting in ITH) some tumors may be ‘born to be bad’ [21], wherein invasive potential arise early during growth, whereas late-arising mutations will be largely may be specified early. We have also observed that many private muta- undetectable. The predictions of the Big Bang model were tested tions originate from the first few cell divisions in human colorectal through single gland and bulk tumor profiling as well as spatial compu- adenomas [45].Thesefindings suggest that targeting early clonal alter- tational modeling and statistical inference on the genomic data. Intrigu- ations in the relevant target cells and/or perturbing the microenviron- ingly, inference on the genomic data in a patient-specific fashion ment may alter the evolutionary trajectory of some tumors. As revealed signals of selection (corresponding to subclone fitness differ- discussed above, numerous forces shape tumor diversity and it is impor- ences) in the majority of carcinomas examined. However, selection tant to appreciate these contributions even when they cannot be was generally not sufficiently strong to result in hard selective sweeps disentangled. More generally, understanding the mode of tumor

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx 13 evolution may help to pinpoint the drivers of this process. This will re- EGFR-mutant NSCLC [263]. Other studies sought to explore the evolu- quire a combination of detailed ‘omic’ analyses of clinical samples and tionary dynamics of treatment response. For example, Bozic et al. dem- relevant model systems, as well as computational modeling to delineate onstrated that combination targeted therapy is superior to sequential the dynamics of this process. therapy and can lead to long-term disease control, but even one poten- Recently, several studies have described the use of computational tial cross-resistance mutation will dampen its benefit [264].Othershave and mathematical modeling to gain insight into complex phenomena focused on understanding the impact of ITH on combinatorial treatment during tumor evolution [21,256], as well as strategies to prevent un- strategies. For example, Zhao et al. demonstrated that drugs selected as wanted clonal evolutionary processes [257–260]. For example, patterns being the most beneficial single agent on the basis of clonal dominance of ITH can be analyzed using genealogy-based statistical inference are often suboptimal for treating heterogeneous tumors via combina- methods within spatial computational models of tumor growth to tion therapy [265]. Rather, ITH reduces the advantage that a targeted infer patient-specific tumor parameters such as the mutation rate, therapy has on a specific cell population, thereby altering the efficacy subclone fitness differences, and mutational timeline [21].Thisap- of intuitive combination therapy. Zhao et al. subsequently sought to ex- proach was employed to test the predictions of a Big Bang model of co- ploit temporal collateral sensitivity to circumvent resistance [259].They lorectal tumor growth. As it takes as input genomic data and requires report the existence of a predictable and opportunistic therapeutic win- minimal assumptions about the underlying process, it can also reveal dow during a tumors' evolutionary trajectory and determine mecha- unexpected features in the data. nisms of drug sensitization through a combination of genomic, Other efforts have sought to outline principles for preventing cancer phenotypic, signaling, and binding assays in conjunction with computa- by exploiting host defenses against clonal evolution such as the immune tional modeling to identify novel vulnerabilities. This integrated evolu- response, inhibition of angiogenesis and replicative senescence [258]. tionary and pharmacological approach highlights the potential to For example, analysis of telomerase length regulation [261] illustrates develop evolutionary informed treatment strategies with clinical the need for mathematical modeling in order to understand the dynam- benefit. ics of clones that could be restricted by replicative limits, as well as the Collectively, these studies highlight key considerations for therapeu- efficacy of anti-telomerase therapies. Here, the authors demonstrate tic strategies in light of ITH and suggest rational approaches for the de- that knowing the replicative capacity of the founding tumor cell alone sign of drug combinations and dosing. Ultimately, strategies such as is insufficient as it is also essential to understand the distributions of these that harness both evolutionary and ecological principles to target the number of cells, the balance between cell division and death, as tumors may have the greatest benefit [206]. While most aspects of well as the probability of escaping replication senescence through telo- tumor ecology are hard to probe in vivo and therefore poorly under- merase activation. These phenomena are unlikely to be understood stood, new technologies promise to enable fresh insights into tumor mi- from experimental data alone; hence this highlights the importance of croenvironment and spatial structure, which can in turn, inform these integrated experimentation and quantitative modeling. models. It is also clear that the full armament of strategies to combat Once tumor cells escape from natural defense mechanisms and tumor progression have yet to be explored. As these efforts mature, it achieve uncontrolled growth, disease progression inevitably ensues al- will be important to continue to examine model assumptions though the latency can vary considerably. Tumor reduction is often concerning tumor dynamics, as well as to generate testable predictions attempted via surgery, cytotoxic and/or cytostatic chemotherapy, and and to experimentally evaluate them. A combination of iterative exper- in some cases, molecular targeted agents. However, therapeutic resis- imentation and computational modeling should help to solidify this tance is virtually universal in solid tumors. Therapy itself can impose framework. new selective pressures on the tumor cell population and it will be im- portant to optimize treatment strategies by exploiting evolutionary 5.3. Implications of ITH for clinical trial design and principles and context-dependencies, as discussed below. Molecularly targeted therapies have revolutionized clinical 5.2. Evolutionary and ecological strategies to circumvent resistance and the discovery of new targets and for patient stratifica- tion remains an active area of research. In the era of targeted therapies, The fact that cancer is a clonal evolutionary process poses challenges the extensive evidence for ITH in diverse tumors accompanied by evi- for its effective control. Resistance can arise via several mechanisms, in- dence for pre-existing resistance conferring mutations raises significant cluding the selection of pre-existing resistant subclones, de novo or ac- concerns about the implications for achieving a durable response [60]. quired resistance, and competitive release, wherein elimination of In the metastatic setting, the acquisition of drug resistance can be partic- treatment sensitive cells results in a substantial reduction of tumor bur- ularly rapid. To date, multiple examples of such pre-existing subclonal den and a small population where resistant clones can readily grow out resistant mutations have been described, including KRAS mutations in without competition [262]. Hence, the interplay between cancer cell the context of anti-EGFR targeted therapy in colorectal cancer [202, population size and their ecological niche impacts treatment response. 266,267], rare MEK1 mutations in BRAF mutant melanomas receiving Competitive release may be a common phenomenon given that conven- BRAF targeted therapy [268], chemo-resistant TP53 mutations in acute tional treatment strategies aim to achieve maximal tumor cell kill under myeloid leukemia [269]. These clinical examples are corroborated by the assumption that this is optimal for patient outcomes, whereas it in vitro studies using multiplexed barcode libraries to enable the detec- may actually promote resistance [257]. To circumvent this, Basanta et tion of extremely low frequency variants that confer resistance [270],as al. [257] proposed a game theoretical framework based on an evolution- well as by mathematical modeling, which predicts the presence of pre- ary double bind where two therapies are used sequentially such that re- existing resistant subclones [202,264]. Complicating matters further, ac- sistance to one drug comes at a fitness cost, leaving the cells more quired resistance can occur in the context of treatment selective pres- sensitive to the other drug. They then investigated the impact that this sure, as exemplified by convergent loss of PTEN during PI(3)Kα strategy has on cancer control. A related approach, termed adaptive inhibition in breast cancer [271] and secondary EGFR mutations in therapy, administers treatment in pulses wherein removal of the drug NSCLC during anti-EGFR targeted therapy [272]. Non-genetic mecha- results in a fitness penalty to the resistant cell population, thereby keep- nism can also drive resistance, and have recently begun to be explored ing the population size in check. Adaptive therapy was found to yield [109,111]. extended tumor control in preclinical breast cancer models [260] with Given the diversity of mechanisms a cell can employ to evade ther- efforts underway to evaluate this in clinical trials. apy, the emergence of resistance should be anticipated and actively Some of the early efforts to understand resistance employed mathe- managed. Rare subclonal events pose a particular challenge as they fall matical modeling to optimize drug dosing, for example in the context of below the limits of most diagnostic tools, and hence are not generally

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 14 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx considered for treatment stratification. The earlier detection of these 6. Conclusions variants could inform patient-tailored adaptive treatment strategies, particularly if this can be achieved in a minimally invasive manor via liq- In this review, we have outlined some of the recent advances in uid biopsy. Increasing data suggest that circulating tumor DNA (ctDNA) characterizing ITH, as well as technical and methodological challenges can be used to detect the emergence of resistance [273,274], monitor in quantifying and interpreting these patterns. Although much metastatic progression [275], predict recurrence and track minimal re- focus to date has been on genotypic heterogeneity, emergent sidual disease [276]. Such liquid biopsies may also help to overcome techniques may facilitate the characterization of heterogeneity in situ some of the challenges associated with spatial tumor heterogeneity. [281]. In parallel, functional heterogeneity amongst tumor cells However, this will require detailed comparison of detailed MRS and can be assessed in patient-derived xenograft [282,283] or organoid [284, ctDNA. In the metastatic setting it will also be important to understand 285] models. Moreover, clinical imaging techniques may enable non-in- whether distant metastases shed ctDNA at the same rates. For example, vasive, longitudinal monitoring of ITH at a whole organ level [286]. metastases to the brain (and primary brain tumors) are unlikely to shed Collectively, these approaches will help to resolve the genotype to significant cell free DNA into the blood due to the blood-brain barrier. phenotype map in cancer. We have also highlighted the power of Here, cerebrospinal fluid may provide an alternate reservoir of ctDNA interpreting tumor cell diversity within a population genetics [277]. framework and the need for mathematical and computational models In practice, the clinical community is grappling with how to detect to infer tumor dynamics. The holy grail of such an approach would be subclonal somatic alterations, whether the presence of a subclonal the development of predictive models to forecast and steer tumor resistance mutation detected at low frequency should necessitate the evolution in order to achieve durable patient responses, thereby render- use of an alternate therapy (should one exist) [267] and whether ing cancer a chronic disease. Before this can be achieved a systematic actionable clonal mutations are preferred targets. As discussed above, understanding of the different modes of evolution and fitness land- quantitative modeling can make predictions concerning both of scapes is needed. It is also evident that the failure to accurately detect these scenarios [259,265] and it will be important to pursue these ap- subclonal alterations will impede precision medicine. Hence, it remains proaches further. It is also of interest to evaluate clinical actionability of central importance to quantify both spatial and temporal genetic het- and utility in light of VAF values and quantitative measures of ITH. The erogeneity towards the goal of improved predictive and prognostic TRACERx (TRAcking non-small cell lung Cancer Evolution through ther- biomarkers. apy [Rx]) study, which aims to prospectively study cancer evolution via longitudinal and spatial sampling, coupled with analysis of ctDNA and Transparency document circulating tumor cells (CTCs) should help to address some of these questions [278]. A chief aim of this effort is to determine whether the The Transparency document associated with this article can be clonal dominance of molecularly targetable alterations portends im- found, in the online version. proved progression-free survival in advanced disease. In parallel, this study seeks to identify high-risk subclonal ‘drivers’ implicated in meta- static progression and to track clonal dynamics in the context of Acknowledgements therapy. A systematic understanding of ITH will be particularly important for This work was funded by awards from the NIH (R01CA182514), the improved management of metastatic disease, which is the ultimate Susan G. Komen Foundation (IIR13260750), and the Breast Cancer cause of death for most patients. To date, therapeutic advances have yet Research Foundation (BCRF-16-032) to C.C. Z.H. is supported by an to be fully translated to the metastatic setting. Here, the analysis of ge- Innovative Genomics Initiative (IGI) Postdoctoral Fellowship. netic diversity is particularly complicated due to the challenge in obtaining clinically annotated paired primaries and metastases, the References common administration of intervening therapy, and potential for differ- [1] J.L. Tsao, et al., Genetic reconstruction of individual colorectal tumor histories, Proc. ences in purity. Moreover, the extent of genetic divergence between pri- Natl. Acad. Sci. U. S. A. 97 (3) (2000) 1236–1241. maries and metastases may vary according to tissue of origin with [2] C. Tomasetti, B. Vogelstein, G. Parmigiani, Half or more of the somatic mutations in implications for the relevant targets of systemic therapy. For example, cancers of self-renewing tissues originate prior to tumor initiation, Proc. Natl. Acad. Sci. U. S. A. 110 (6) (2013) 1999–2004. colorectal cancers routinely share canonical drivers in the primary and [3] E.R. Fearon, B. Vogelstein, A genetic model for colorectal tumorigenesis, Cell 61 (5) metastasis [66,67,69–71,75–77,279], whereas greater divergence and (1990) 759–767. de novo putative drivers have been identified in other solid tumors [4] S. Nakatsuru, et al., Somatic mutation of the APC gene in gastric cancer: frequent [34]. To date, few studies have performed comprehensive genomic pro- mutations in very well differentiated adenocarcinoma and signet-ring cell carcino- ma, Hum. Mol. Genet. 1 (8) (1992) 559–563. filing of paired primaries and metastases with detailed treatment infor- [5] A. Horii, et al., Frequent somatic mutations of the APC gene in human pancreatic mation (as therapy can confound the interpretation of the resultant cancer, Cancer Res. 52 (23) (1992) 6696–6698. [6] L.I. Shlush, et al., Identification of pre-leukaemic haematopoietic stem cells in acute genomic patterns) and the limited available data derives from small – fi leukaemia, Nature 506 (7488) (2014) 328 333. numbers of cases. Thus, additional efforts to pro le metastatic disease [7] P.C. Nowell, The clonal evolution of tumor cell populations, Science 194 (4260) are needed and should similarly account for the spatial sampling biases (1976) 23–28. reported in primary tumors. Only then can rigorous comparisons of [8] M. Greaves, C.C. Maley, Clonal evolution in cancer, Nature 481 (7381) (2012) 306–313. paired primaries and metastases be made with the potential to reveal [9] M. Greaves, Evolutionary determinants of cancer, Cancer Discov. 5 (8) (2015) new therapeutic vulnerabilities and to inform the dynamics of metasta- 806–820. tic colonization [280]. [10] D. Hanahan, R.A. Weinberg, Hallmarks of cancer: the next generation, Cell 144 (5) (2011) 646–674. It is clear that advances in precision oncology will require tight coor- [11] G.H. Heppner, Tumor heterogeneity, Cancer Res. 44 (6) (1984) 2259–2265. dination between laboratory scientists, computational biologists, oncol- [12] K. Anderson, et al., Genetic variegation of clonal architecture and propagating cells ogists and clinical trial communities, as well as patient engagement. in leukaemia, Nature 469 (7330) (2011) 356–361. [13] M. Gerlinger, et al., Intratumor heterogeneity and branched evolution revealed by Despite the challenges that pervasive ITH and clonal dynamics pose, multiregion sequencing, N. Engl. J. Med. 366 (10) (2012) 883–892. they beckon for evolutionary and ecological therapeutic strategies, [14] S.P. Shah, et al., The clonal and mutational evolution spectrum of primary triple- which have only begun to be explored. Hence, there is an armament negative breast cancers, Nature 486 (7403) (2012) 395–399. of tools that were developed to understand population dynamics for [15] A. Marusyk, V. Almendro, K. Polyak, Intra-tumour heterogeneity: a looking glass for cancer? Nat. Rev. Cancer 12 (5) (2012) 323–334. which further investigation in the context of cancer is warranted to- [16] A. Sottoriva, et al., Intratumor heterogeneity in human glioblastoma reflects cancer wards the goal of improving patient outcomes. evolutionary dynamics, Proc. Natl. Acad. Sci. U. S. A. 110 (10) (2013) 4009–4014.

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx 15

[17] A. Bashashati, et al., Distinct evolutionary trajectories of primary high-grade serous [54] M.M. Inda, et al., Tumor heterogeneity is an active process maintained by a mutant ovarian cancers revealed through spatial mutational profiling, J. Pathol. 231 (1) EGFR-induced cytokine circuit in glioblastoma, Genes Dev. 24 (16) (2010) (2013) 21–34. 1731–1745. [18] M. Gerlinger, et al., Genomic architecture and evolution of clear cell renal cell car- [55] A.S. Cleary, et al., Tumour cell heterogeneity maintained by cooperating subclones cinomas defined by multiregion sequencing, Nat. Genet. 46 (3) (2014) 225–233. in Wnt-driven mammary cancers, Nature 508 (7494) (2014) 113–117. [19] B.E. Johnson, et al., Mutational analysis reveals the origin and therapy-driven evo- [56] A. Marusyk, et al., Non-cell-autonomous driving of tumour growth supports sub- lution of recurrent glioma, Science 343 (6167) (2014) 189–193. clonal heterogeneity, Nature 514 (7520) (2014) 54–58. [20] V. Almendro, et al., Inference of tumor evolution during chemotherapy by compu- [57] J.M. Francis, et al., EGFR variant heterogeneity in glioblastoma resolved through tational modeling and in situ analysis of genetic and phenotypic cellular diversity, single-nucleus sequencing, Cancer Discov. 4 (8) (2014) 956–971. Cell Rep. 6 (3) (2014) 514–527. [58] J. Ojha, et al., Deep sequencing identifies genetic heterogeneity and recurrent con- [21] A. Sottoriva, et al., A Big Bang model of human colorectal tumor growth, Nat. Genet. vergent evolution in chronic lymphocytic leukemia, Blood 125 (3) (2015) 47 (3) (2015) 209–216. 492–498. [22] H. Kim, et al., Whole-genome and multisector exome sequencing of primary and [59] C. Holohan, et al., Cancer drug resistance: an evolving paradigm, Nat. Rev. Cancer post-treatment glioblastoma reveals patterns of tumor evolution, Genome Res. 13 (10) (2013) 714–726. 25 (3) (2015) 316–327. [60] M.W. Schmitt, L.A. Loeb, J.J. Salk, The influence of subclonal resistance mutations on [23] P.C. Boutros, et al., Spatial genomic heterogeneity within localized, multifocal pros- targeted cancer therapy, Nat. Rev. Clin. Oncol. 13 (6) (2016) 335–347. tate cancer, Nat. Genet. 47 (7) (2015) 736–745. [61] J. Wang, et al., Clonal evolution of glioblastoma under therapy, Nat. Genet. (2016). [24] M.R. Corces, et al., Lineage-specific and single-cell chromatin accessibility charts [62] M. Castellarin, et al., Clonal evolution of high-grade serous ovarian carcinoma from human hematopoiesis and leukemia evolution, Nat. Genet. 48 (10) (2016) primary to recurrent disease, J. Pathol. 229 (4) (2013) 515–524. 1193–1203. [63] L. Ding, et al., Clonal evolution in relapsed acute myeloid leukaemia revealed by [25] A. McPherson, et al., Divergent modes of clonal spread and intraperitoneal mixing whole-genome sequencing, Nature 481 (2012) 506–510. in high-grade serous ovarian cancer, Nat. Genet. (2016). [64] D.A. Landau, et al., Mutations driving CLL and their evolution in progression and re- [26] C.C. Maley, et al., Genetic clonal diversity predicts progression to esophageal ade- lapse, Nature 526 (2015) 525–530. nocarcinoma, Nat. Genet. 38 (4) (2006) 468–473. [65] A. Schramm, et al., Mutational dynamics between primary and relapse neuroblas- [27] E.A. Mroz, et al., Intra-tumor genetic heterogeneity and mortality in head and neck tomas, Nat. Genet. 47 (2015) 872–877. cancer: analysis of data from the Cancer Genome Atlas, PLoS Med. 12 (2) (2015), [66] F. Al-Mulla, et al., Comparative genomic hybridization analysis of primary colorec- e1001786. tal carcinomas and their synchronous metastases, Genes Chromosom. Cancer 24 [28] D. Wangsa, et al., Phylogenetic analysis of multiple FISH markers in oral (4)(1999)306–314. tongue squamous cell carcinoma suggests that a diverse distribution of [67] H.E. Alcock, et al., Analysis of colorectal tumor progression by microdissection and copy number changes is associated with poor prognosis, Int. J. Cancer 138 comparative genomic hybridization, Genes Chromosom. Cancer 37 (4) (2003) (1) (2016) 98–109. 369–380. [29] J. Zhang, et al., Intratumor heterogeneity in localized lung adenocarcinomas delin- [68] S.P. Shah, et al., Mutational evolution in a lobular breast tumour profiled at single eated by multiregion sequencing, Science 346 (6206) (2014) 256–259. nucleotide resolution, Nature 461 (7265) (2009) 809–813. [30] E.C. de Bruin, et al., Spatial and temporal diversity in genomic instability processes [69] E. Miranda, et al., Genetic and epigenetic alterations in primary colorectal cancers defines lung cancer evolution, Science 346 (6206) (2014) 251–256. and related lymph node and liver metastases, Cancer 119 (2) (2013) 266–276. [31] R. Gao, et al., Punctuated copy number evolution and clonal stasis in triple-negative [70] J.S. Vermaat, et al., Primary colorectal cancers and their subsequent hepatic metas- breast cancer, Nat. Genet. 48 (10) (2016) 1119–1130. tases are genetically different: implications for selection of patients for targeted [32] Y. Wang, et al., Clonal evolution in breast cancer revealed by single nucleus ge- treatment, Clin. Cancer Res. 18 (3) (2012) 688–699. nome sequencing, Nature 512 (7513) (2014) 155–160. [71] A. Brannon, et al., Comparative sequencing analysis reveals high genomic concor- [33] Z. Weng, et al., Cell-lineage heterogeneity and driver mutation recurrence in pre- dance between matched primary and metastatic colorectal cancer lesions, Genome invasive breast neoplasia, Genome Med. 7 (1) (2015) 28. Biol. 15 (8) (2014) 454. [34] P.K. Brastianos, et al., Genomic characterization of brain metastases reveals [72] L. Ding, et al., Genome remodelling in a basal-like breast cancer metastasis and xe- branched evolution and potential therapeutic targets, Cancer Discov. 5 (11) nograft, Nature 464 (7291) (2010) 999–1005. (2015) 1164–1177. [73] P. Savas, et al., The subclonal architecture of metastatic breast cancer: results from [35] J. Cairns, Mutation selection and the natural history of cancer, Nature 255 (5505) a prospective community-based rapid autopsy program “CASCADE”, PLoS Med. 13 (1975) 197–200. (12) (2016), e1002204. [36] L.R. Yates, P.J. Campbell, Evolution of the cancer genome, Nat. Rev. Genet. 13 (11) [74] P.J. Campbell, et al., The patterns and dynamics of genomic instability in metastatic (2012) 795–806. pancreatic cancer, Nature 467 (7319) (2010) 1109–1113. [37] A. Sottoriva, et al., Single-molecule genomic data delineate patient-specifictumor [75] T. Knosel, et al., Chromosomal alterations during lymphatic and liver metastasis profiles and cancer stem cell organization, Cancer Res. 73 (1) (2013) 41–49. formation of colorectal cancer, Neoplasia 6 (1) (2004) 23–28. [38] S.L. Cooke, et al., Intra-tumour genetic heterogeneity and poor chemoradiotherapy [76] J.K. Jiang, et al., Genetic changes and clonality relationship between primary colo- response in cervical cancer, Br. J. Cancer 104 (2) (2011) 361–368. rectal cancers and their pulmonary metastases—an analysis by comparative geno- [39] E.A. Mroz, et al., High intratumor genetic heterogeneity is related to worse out- mic hybridization, Genes Chromosom. Cancer 43 (1) (2005) 25–36. come in patients with head and neck squamous cell carcinoma, Cancer 119 (16) [77] J.C. Weber, et al., Allelotyping analyses of synchronous primary and metastasis CIN (2013) 3034–3042. colon cancers identified different subtypes, Int. J. Cancer 120 (3) (2007) 524–532. [40] R.A. Burrell, et al., The causes and consequences of genetic heterogeneity in cancer [78] T.E. Becker, et al., The genomic heritage of lymph node metastases: implications for evolution, Nature 501 (7467) (2013) 338–345. clinical management of patients with breast cancer, Ann. Surg. Oncol. 15 (4) [41] C. Swanton, Intratumor Heterogeneity: evolution through space and time, Cancer (2008) 1056–1063. Res. 72 (19) (2012) 4875–4882. [79] M.C. Haffner, et al., Tracking the clonal origin of lethal , J. Clin. In- [42] L.J. Barber, M.N. Davies, M. Gerlinger, Dissecting cancer evolution at the macro-het- vest. 123 (11) (2013) 4918–4922. erogeneity and micro-heterogeneity scale, Curr. Opin. Genet. Dev. 30 (2015) 1–6. [80] G. Gundem, et al., The evolutionary history of lethal metastatic prostate cancer, Na- [43] M. Snuderl, et al., Mosaic amplification of multiple receptor tyrosine kinase genes ture 520 (7547) (2015) 353–357. in glioblastoma, Cancer Cell 20 (6) (2011) 810–817. [81] M. El-Kebir, et al., Inferring the mutational history of a tumor using multi-state per- [44] N.J. Szerlip, et al., Intratumoral heterogeneity of receptor tyrosine EGFR and fect phylogeny mixtures, Cell Syst. 3 (1) (2016) 43–53. PDGFRA amplification in glioblastoma defines subpopulations with distinct growth [82] T.M. Kim, et al., Subclonal genomic architectures of primary and metastatic colo- factor response, Proc. Natl. Acad. Sci. U. S. A. 109 (8) (2012) 3041–3046. rectal cancer based on intratumoral genetic heterogeneity, Clin. Cancer Res. 21 [45] H. Kang, et al., Many private mutations originate from the first few divisions of a (19) (2015) 4461–4472. human colorectal adenoma, J. Pathol. 237 (3) (2015) 355–362. [83] M.S. Lawrence, et al., Mutational heterogeneity in cancer and the search for new [46] C.K. Sievers, et al., Subclonal diversity arises early even in small colorectal tumours cancer-associated genes, Nature 499 (7457) (2013) 214–218. and contributes to differential growth fates, Gut (2016). [84] B. Vogelstein, et al., Cancer genome landscapes, Science 339 (6127) (2013) [47] R. Uchi, et al., Integrated multiregional analysis proposing a new model of colorec- 1546–1558. tal cancer evolution, PLoS Genet. (2016) 12(2). [85] L.A. Loeb, C.F. Springgate, N. Battula, Errors in DNA replication as a basis of malig- [48] Y. Suzuki, et al., Multiregion ultra-deep sequencing reveals early intermixing and nant changes, Cancer Res. 34 (9) (1974) 2311–2321. variable levels of intratumoral heterogeneity in colorectal cancer, Mol. Oncol. 11 [86] L.A. Loeb, Human cancers express a mutator phenotype: hypothesis, origin, and (2) (2017) 124–139. consequences, Cancer Res. 76 (8) (2016) 2057–2059. [49] L.R. Yates, et al., Subclonal diversification of primary breast cancer revealed by [87] C. Lengauer, K.W. Kinzler, B. Vogelstein, Genetic instability in colorectal cancers, multiregion sequencing, Nat. Med. 21 (7) (2015) 751–759. Nature 386 (6625) (1997) 623–627. [50] N.E. Potter, et al., Single-cell mutational profiling and clonal phylogeny in cancer, [88] C. Lengauer, K.W. Kinzler, B. Vogelstein, Genetic instabilities in human cancers, Na- Genome Res. 23 (12) (2013) 2115–2125. ture 396 (6712) (1998) 643–649. [51] A.P. Patel, et al., Single-cell RNA-seq highlights intratumoral heterogeneity in pri- [89] A.M. Laughney, et al., Dynamics of tumor heterogeneity derived from clonal karyo- mary glioblastoma, Science 344 (6190) (2014) 1396–1401. typic evolution, Cell Rep. 12 (5) (2015) 809–820. [52] M. Meyer, et al., Single cell-derived clonal analysis of human glioblastoma links [90] K. Rowald, et al., Negative selection and chromosome instability induced by Mad2 functional and genomic heterogeneity, Proc. Natl. Acad. Sci. U. S. A. 112 (3) overexpression delay breast cancer but facilitate oncogene-independent out- (2015) 851–856. growth, Cell Rep. 15 (12) (2016) 2679–2691. [53] X. Zhang, et al., Single-cell sequencing for precise cancer research: progress and [91] R.A. Burrell, et al., Replication stress links structural and numerical cancer chromo- prospects, Cancer Res. 76 (6) (2016) 1305–1312. somal instability, Nature 494 (7438) (2013) 492–496.

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 16 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx

[92] L.A. Aaltonen, et al., Clues to the pathogenesis of familial colorectal cancer, Science [131] Y. Jiang, et al., Assessing intratumor heterogeneity and tracking longitudinal and 260 (5109) (1993) 812–816. spatial clonal evolutionary history by next-generation sequencing, Proc. Natl. [93] T.M. Kim, P.J. Park, A genome-wide view of microsatellite instability: old stories of Acad. Sci. U. S. A. 113 (37) (2016) E5528–E5537. cancer mutations revisited with new sequencing technologies, Cancer Res. 74 (22) [132] N. Niknafs, et al., Subclonal hierarchy inference from somatic mutations: automatic (2014) 6377–6382. reconstruction of cancer evolutionary trees from multi-region next generation se- [94] J.K. Lee, et al., Mechanisms and consequences of cancer genome instability: quencing, PLoS Comput. Biol. 11 (10) (2015), e1004416. lessons from genome sequencing studies, Annu. Rev. Pathol. 11 (2016) [133] A.G. Deshwar, et al., PhyloWGS: reconstructing subclonal composition and evolu- 283–312. tion from whole-genome sequencing of tumors, Genome Biol. 16 (2015) 35. [95] E.J. Fox, L.A. Loeb, Lethal mutagenesis: targeting the mutator phenotype in cancer, [134] R. Kostadinov, C.C. Maley, M.K. Kuhner, Bulk genotyping of biopsies can create spu- Semin. Cancer Biol. 20 (5) (2010) 353–359. rious evidence for hetereogeneity in mutation content, PLoS Comput. Biol. 12 (4) [96] J.H. Bielas, et al., Human cancers express a mutator phenotype, Proc. Natl. Acad. Sci. (2016), e1004413. U. S. A. 103 (48) (2006) 18238–18242. [135] M.J. Williams, et al., Identification of neutral tumor evolution across cancer types, [97] S.M. Dewhurst, et al., Tolerance of whole-genome doubling propagates chromo- Nat. Genet. (2016). somal instability and accelerates cancer genome evolution, Cancer Discov. 4 (2) [136] M. Gerstung, et al., Reliable detection of subclonal single-nucleotide variants in tu- (2014) 175–185. mour cell populations, Nat. Commun. 3 (2012) 811. [98] B.A. Weaver, et al., Aneuploidy acts both oncogenically and as a tumor suppressor, [137] K. Cibulskis, et al., Sensitive detection of somatic point mutations in impure and Cancer Cell 11 (1) (2007) 25–36. heterogeneous cancer samples, Nat. Biotechnol. 31 (3) (2013) 213–219. [99] S.L. Carter, et al., Absolute quantification of somatic DNA alterations in human can- [138] J.M. Findlay, et al., Differential clonal evolution in oesophageal cancers in response cer, Nat. Biotechnol. 30 (5) (2012) 413–421. to neo-adjuvant chemotherapy, Nat. Commun. 7 (2016) 11111. [100] N. McGranahan, et al., Cancer chromosomal instability: therapeutic and diagnostic [139] C. Swanton, et al., Consensus on precision medicine for metastatic cancers: a report challenges, EMBO Rep. 13 (6) (2012) 528–538. from the MAP conference, Ann. Oncol. (2016). [101] F.A. Sinicrope, Z.J. Yang, Prognostic and predictive impact of DNA mismatch repair [140] M. Josephidou, A.G. Lynch, S. Tavaré, multiSNV: a probabilistic approach for im- in the management of colorectal cancer, Future Oncol. 7 (3) (2011) 467–474. proving detection of somatic point mutations from multiple related tumour sam- [102] R.A. Burrell, C. Swanton, The evolution of the unstable cancer genome, Curr. Opin. ples, Nucleic Acids Res. 43 (9) (2015) e61. Genet. Dev. 24 (2014) 61–67. [141] E.N. Smith, et al., Biased estimates of clonal evolution and subclonal heterogeneity [103] T.I. Zack, et al., Pan-cancer patterns of somatic copy number alteration, Nat. Genet. can arise from PCR duplicates in deep sequencing experiments, Genome Biol. 15 45 (10) (2013) 1134–1140. (8)(2014)420. [104] C. Curtis, et al., The genomic and transcriptomic architecture of 2,000 breast tu- [142] I. Kinde, et al., Detection and quantification of rare mutations with massively par- mours reveals novel subgroups, Nature 486 (7403) (2012) 346–352. allel sequencing, Proc. Natl. Acad. Sci. U. S. A. 108 (23) (2011) 9530–9535. [105] Cancer Genome Atlas Research Network, Comprehensive molecular characteriza- [143] M. Costello, et al., Discovery and characterization of artifactual mutations in deep tion of gastric adenocarcinoma, Nature 513 (7517) (2014) 202–209. coverage targeted capture sequencing data due to oxidative DNA damage during [106] C. Kadoch, G.R. Crabtree, Reversible disruption of mSWI/SNF (BAF) complexes by sample preparation, Nucleic Acids Res. 41 (6) (2013), e67. the SS18-SSX oncogenic fusion in synovial sarcoma, Cell 153 (1) (2013) 71–85. [144] D.I. Lou, et al., High-throughput DNA sequencing errors are reduced by orders of [107] V. Almendro, A. Marusyk, K. Polyak, Cellular heterogeneity and molecular evolu- magnitude using circle sequencing, Proc. Natl. Acad. Sci. U. S. A. 110 (49) (2013) tion in cancer, Annu. Rev. Pathol. 8 (2013) 277–302. 19872–19877. [108] C.E. Meacham, S.J. Morrison, Tumour heterogeneity and cancer cell plasticity, Na- [145] M.W. Schmitt, et al., Detection of ultra-rare mutations by next-generation sequenc- ture 501 (7467) (2013) 328–337. ing, Proc. Natl. Acad. Sci. U. S. A. 109 (36) (2012) 14508–14513. [109] S.V. Sharma, et al., A chromatin-mediated reversible drug-tolerant state in cancer [146] L. Oesper, A. Mahmoody, B.J. Raphael, THetA: inferring intra-tumor heterogeneity cell subpopulations, Cell 141 (1) (2010) 69–80. from high-throughput DNA sequencing data, Genome Biol. 14 (7) (2013) R80. [110] T. Mazor, et al., Intratumoral heterogeneity of the epigenome, Cancer Cell 29 (4) [147] A. Fischer, et al., High-definition reconstruction of clonal composition in cancer, (2016) 440–451. Cell Rep. 7 (5) (2014) 1740–1752. [111] R.H. Chisholm, et al., Emergence of drug tolerance in cancer cell populations: an [148] Z. Yu, A. Li, M. Wang, CloneCNA: detecting subclonal somatic copy number alter- evolutionary outcome of selection, nongenetic instability, and stress-induced ad- ations in heterogeneous tumor samples from whole-exome sequencing data, aptation, Cancer Res. 75 (6) (2015) 930–939. BMC Bioinformatics 17 (2016) 310. [112] A.O. Pisco, S. Huang, Non-genetic cancer cell plasticity and therapy-induced [149] A. Roth, et al., PyClone: statistical inference of clonal population structure in cancer, stemness in tumour relapse: ‘what does not kill me strengthens me’, Br. J. Cancer Nat. Methods 11 (4) (2014) 396–398. 112 (11) (2015) 1725–1732. [150] C.A. Miller, et al., SciClone: inferring clonal architecture and tracking the spatial and [113] V. Almendro, et al., Genetic and phenotypic diversity in breast tumor metastases, temporal patterns of tumor evolution, PLoS Comput. Biol. 10 (8) (2014), e1003665. Cancer Res. 74 (5) (2014) 1338–1348. [151] I.M. Lonnstedt, et al., Deciphering clonality in aneuploid breast tumors using SNP [114] J.E. Dietrich, T. Hiiragi, Stochastic patterning in the mouse pre-implantation em- array and sequencing data, Genome Biol. 15 (9) (2014) 470. bryo, Development 134 (23) (2007) 4219–4231. [152] M. El-Kebir, et al., Multi-state perfect phylogeny mixtures for cancer sequencing, [115] H.H. Chang, et al., Transcriptome-wide noise controls lineage choice in mammalian Cell Syst. (2016). progenitor cells, Nature 453 (7194) (2008) 544–547. [153] Y. Wang, N.E. Navin, Advances and applications of single-cell sequencing technol- [116] R. Losick, C. Desplan, Stochasticity and cell fate, Science 320 (5872) (2008) 65–68. ogies, Mol. Cell 58 (4) (2015) 598–609. [117] M. Acar, J.T. Mettetal, A. van Oudenaarden, Stochastic switching as a survival strat- [154] A. Roth, et al., Clonal genotype and population structure inference from single-cell egy in fluctuating environments, Nat. Genet. 40 (4) (2008) 471–475. tumor sequencing, Nat. Methods (2016). [118] S.L. Spencer, et al., Non-genetic origins of cell-to-cell variability in TRAIL-induced [155] K. Naxerova, R.K. Jain, Using tumour phylogenetics to identify the roots of metas- apoptosis, Nature 459 (7245) (2009) 428–432. tasis in humans, Nat. Rev. Clin. Oncol. 12 (5) (2015) 258–272. [119] S.A. Frank, M.R. Rosner, Nonheritable cellular variability accelerates the evolution- [156] Z.M. Zhao, et al., Early and multiple origins of metastatic lineages within primary ary processes of cancer, PLoS Biol. 10 (4) (2012), e1001296. tumors, Proc. Natl. Acad. Sci. U. S. A. 113 (8) (2016) 2140–2145. [120] P.B. Gupta, et al., Stochastic state transitions give rise to phenotypic equilibrium in [157] Z. Yang, B. Rannala, Molecular phylogenetics: principles and practice, Nat. Rev. populations of cancer cells, Cell 146 (4) (2011) 633–644. Genet. 13 (5) (2012) 303–314. [121] L.I. Shlush, D. Hershkovitz, Clonal evolution models of tumor heterogeneity, Am. [158] Z.H. Yang, Phylogenetic analysis using parsimony and likelihood methods, J. Mol. Soc. Clin. Oncol. Educ. Book (2015) e662–e665. Evol. 42 (2) (1996) 294–307. [122] R.A. Gatenby, et al., Cellular adaptations to hypoxia and acidosis during somatic [159] J. Felsenstein, Evolutionary trees from DNA sequences: a maximum likelihood ap- evolution of breast cancer, Br. J. Cancer 97 (5) (2007) 646–653. proach, J. Mol. Evol. 17 (6) (1981) 368–376. [123] H.O. Lee, et al., Evolution of tumor invasiveness: the adaptive tumor microenviron- [160] N. Saitou, M. Nei, The neighbor-joining method: a new method for reconstructing ment landscape model, Cancer Res. 71 (20) (2011) 6327–6337. phylogenetic trees, Mol. Biol. Evol. 4 (4) (1987) 406–425. [124] M.C. Lloyd, et al., Darwinian dynamics of intratumoral heterogeneity: not solely [161] E. Letouze, et al., Analysis of the copy number profiles of several tumor samples random mutations but also variable environmental selection forces, Cancer Res. from the same patient reveals the successive steps in tumorigenesis, Genome 76 (11) (2016) 3136–3144. Biol. 11 (7) (2010) R76. [125] F. Notta, et al., A renewed model of pancreatic cancer evolution based on genomic [162] R.F. Schwarz, et al., Phylogenetic quantification of intra-tumour heterogeneity, rearrangement patterns, Nature 538 (7625) (2016) 378–382. PLoS Comput. Biol. 10 (4) (2014), e1003535. [126] N.R. Bertos, M. Park, Laser capture microdissection as a tool to study tumor stroma, [163] R.F. Schwarz, et al., Spatial and temporal heterogeneity in high-grade serous ovar- Methods Mol. Biol. 1458 (2016) 13–25. ian cancer: a phylogenetic analysis, PLoS Med. 12 (2) (2015), e1001789. [127] S. Ling, et al., Extremely high genetic diversity in a single tumor points to preva- [164] S. Nik-Zainal, et al., The life history of 21 breast cancers, Cell 149 (5) (2012) lence of non-Darwinian cell evolution, Proc. Natl. Acad. Sci. U. S. A. 112 (47) 994–1007. (2015) E6496–E6505. [165] K. Sprouffske, J.W. Pepper, C.C. Maley, Accurate reconstruction of the temporal [128] N.B. Larson, B.L. Fridley, PurBayes: estimating tumor cellularity and order of mutations in neoplastic progression, Cancer Prev. Res. (Phila.) 4 (7) subclonality in next-generation sequencing data, Bioinformatics 29 (15) (2011) 1135–1144. (2013) 1888–1889. [166] S. Malikic, et al., Clonality inference in multiple tumor samples using phylogeny, [129] G. Ha, et al., TITAN: inference of copy number architectures in clonal cell popula- Bioinformatics 31 (9) (2015) 1349–1356. tions from tumor whole-genome sequence data, Genome Res. 24 (11) (2014) [167] V. Popic, et al., Fast and scalable inference of multi-sample cancer lineages, Ge- 1881–1893. nome Biol. 16 (1) (2015) 91. [130] Z. Hu, C. Curtis, Inferring tumor phylogenies from mulit-region sequencing, Cell [168] E.M. Ross, F. Markowetz, OncoNEM: inferring tumor evolution from single-cell se- Syst. 3 (1) (2016) 12–14. quencing data, Genome Biol. 17 (2016) 69.

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx 17

[169] K. Jahn, J. Kuipers, N. Beerenwinkel, Tree inference for single-cell data, Genome [212] J. Poleszczuk, P. Hahnfeldt, H. Enderling, Evolution and phenotypic selection of can- Biol. 17 (2016) 86. cer stem cells, PLoS Comput. Biol. 11 (3) (2015), e1004025. [170] S. Behjati, et al., Genome sequencing of normal cells reveals developmental line- [213] K. Sprouffske, et al., An evolutionary explanation for the presence of cancer ages and mutational processes, Nature 513 (7518) (2014) 422–425. nonstem cells in , Evol. Appl. 6 (1) (2013) 92–101. [171] J.F. Kingman, The coalescent, Stoch. Process. Appl. 13 (1982) 235–248. [214] C. Carmona-Fontaine, et al., Emergence of spatial structure in the tumor microen- [172] P. Marjoram, S. Tavaré, Modern computational approaches for analysing molecular vironment due to the Warburg effect, Proc. Natl. Acad. Sci. U. S. A. 110 (48) (2013) genetic variation data, Nat. Rev. Genet. 7 (10) (2006) 759–770. 19402–19407. [173] M.A. Beaumont, W. Zhang, D.J. Balding, Approximate Bayesian computation in pop- [215] J. Gallaher, A.R. Anderson, Evolution of intratumoral phenotypic heterogeneity: the ulation genetics, Genetics 162 (4) (2002) 2025–2035. role of trait inheritance, Interface Focus 3 (4) (2013) 20130016. [174] P. Nicolas, et al., The stem cell population of the human colon crypt: analysis via [216] B. Waclaw, et al., A spatial model predicts that dispersal and cell turnover limit methylation patterns, PLoS Comput. Biol. 3 (3) (2007), e28. intratumour heterogeneity, Nature (2015) http://dx.doi.org/10.1038/nature14971. [175] D. Dykhuizen, D.L. Hartl, Selective neutrality of 6PGD allozymes in E. coli and the [217] S. Yachida, et al., Distant metastasis occurs late during the genetic evolution of pan- effects of genetic background, Genetics 96 (4) (1980) 801–817. creatic cancer, Nature 467 (7319) (2010) 1114–1117. [176] J.H. Gillespie, Population genetics : a concise guide, second ed. Johns Hopkins Uni- [218] C.S. Cooper, et al., Analysis of the genetic phylogeny of multifocal prostate cancer versity Press, Baltimore, Md., 2004 (xiv, 214 p). identifies multiple independent clonal expansions in neoplastic and morphologi- [177] E.H. Simpson, Measurement of diversity, Nature (1949). cally normal prostate tissue, Nat. Genet. 47 (4) (2015) 367–372. [178] D.C. Allred, et al., Ductal carcinoma in situ and the emergence of diversity during [219] B.J. Raphael, et al., Identifying driver mutations in sequenced cancer genomes: breast cancer evolution, Clin. Cancer Res. 14 (2) (2008) 370–378. computational approaches to enable precision medicine, Genome Med. 6 (1) [179] C.E. Shannon, A mathematical theory of communication, Bell Syst. Tech. J. 27 (2014) 5. (1948) 379–432. [220] E. Hodis, et al., A landscape of driver mutations in melanoma, Cell 150 (2) (2012) [180] L.M. Merlo, et al., A comprehensive survey of clonal diversity measures in Barrett's 251–263. esophagus as biomarkers of progression to esophageal adenocarcinoma, Cancer [221] I. Martincorena, et al., Tumor evolution. High burden and pervasive positive selec- Prev. Res. (Phila.) 3 (11) (2010) 1388–1397. tion of somatic mutations in normal human skin, Science 348 (6237) (2015) [181] E.A. Mroz, J.W. Rocco, MATH, a novel measure of intratumor genetic heterogeneity, 880–886. is high in poor-outcome classes of head and neck squamous cell carcinoma, Oral [222] B.D. Simons, Deep sequencing as a probe of normal stem cell fate and preneoplasia Oncol. 49 (3) (2013) 211–215. in human epidermis, Proc. Natl. Acad. Sci. U. S. A. 113 (1) (2016) 128–133. [182] L.H. Mengelbier, et al., Intratumoral genome diversity parallels progression and [223] B.D. Simons, REPLY TO MARTINCORENA, et al., Evidence for constrained positive predicts outcome in pediatric cancer, Nat. Commun. 6 (2015) 6125. selection of cancer mutations in normal skin is lacking, Proc. Natl. Acad. Sci. U. S. [183] L.G. Morris, et al., Pan-cancer analysis of intratumor heterogeneity as a prognostic A. 113 (9) (2016) E1130–E1131. determinant of survival, Oncotarget 7 (9) (2016) 10051–10063. [224] B. Bauer, R. Siebert, A. Traulsen, Cancer initiation with epistatic interactions be- [184] N. Andor, et al., Pan-cancer analysis of the extent and consequences of intratumor tween driver and passenger mutations, J. Theor. Biol. 358 (2014) 52–60. heterogeneity, Nat. Med. 22 (1) (2016) 105–113. [225] C.A. Ortmann, et al., Effect of mutation order on myeloproliferative neoplasms, N. [185] D.L. Hartl, A.G. Clark, Principles of Population Genetics, fourth ed. Sinauer Associ- Engl. J. Med. 372 (7) (2015) 601–612. ates, 2007. [226] R. Fisher, et al., Development of synchronous VHL syndrome tumors reveals con- [186] M. Lynch, Evolution of the mutation rate, Trends Genet. 26 (8) (2010) 345–352. tingencies and constraints to tumor evolution, Genome Biol. 15 (8) (2014) 433. [187] M. Lynch, Rate, molecular spectrum, and consequences of human mutation, Proc. [227] D. Charlesworth, B. Charlesworth, M.T. Morgan, The pattern of neutral molecular Natl. Acad. Sci. U. S. A. 107 (3) (2010) 961–968. variation under the background selection model, Genetics 141 (4) (1995) [188] D.F. Conrad, et al., Variation in genome-wide mutation rates within and between 1619–1632. human families, Nat. Genet. 43 (7) (2011) 712–714. [228] B. Charlesworth, M.T. Morgan, D. Charlesworth, The effect of deleterious mutations [189] J.F. Crow, The origins, patterns and implications of human spontaneous mutation, on neutral molecular variation, Genetics 134 (4) (1993) 1289–1303. Nat. Rev. Genet. 1 (1) (2000) 40–47. [229] R.A. Beckman, L.A. Loeb, Negative clonal selection in tumor evolution, Genetics 171 [190] L.A. Loeb, A mutator phenotype in cancer, Cancer Res. 61 (8) (2001) 3230–3239. (4)(2005)2123–2131. [191] L.A. Loeb, Mutator phenotype in cancer: origin and consequences, Semin. Cancer [230] W.G. Hill, A. Robertson, The effect of linkage on limits to artificial selection, Genet. Biol. 20 (5) (2010) 279–280. Res. 8 (3) (1966) 269–294. [192] I.P. Tomlinson, M.R. Novelli, W.F. Bodmer, The mutation rate and cancer, Proc. Natl. [231] C.I. Wu, et al., The ecology and evolution of cancer: the ultra-microevolutionary Acad. Sci. U. S. A. 93 (25) (1996) 14800–14803. process, Annu. Rev. Genet. 50 (2016) 347–369. [193] I. Tomlinson, W. Bodmer, Selection, the mutation rate and cancer: ensuring that [232] C.D. McFarland, et al., Impact of deleterious passenger mutations on cancer pro- the tail does not wag the dog, Nat. Med. 5 (1) (1999) 11–12. gression, Proc. Natl. Acad. Sci. U. S. A. 110 (8) (2013) 2910–2915. [194] C. Tomasetti, B. Vogelstein, Cancer etiology. Variation in cancer risk among tissues can [233] C.D. McFarland, L.A. Mirny, K.S. Korolev, Tug-of-war between driver and passenger be explained by the number of stem cell divisions, Science 347 (6217) (2015) 78–81. mutations in cancer and other adaptive processes, Proc. Natl. Acad. Sci. U. S. A. 111 [195] A.I. Rozhok, G.M. Wahl, J. DeGregori, A critical examination of the “Bad Luck” expla- (42) (2014) 15138–15143. nation of cancer risk, Cancer Prev. Res. (Phila.) 8 (9) (2015) 762–764. [234] M. Lynch, et al., The mutational meltdown in asexual populations, J. Hered. 84 (5) [196] R. Noble, O. Kaltz, M.E. Hochberg, Peto's paradox and human cancers, Philos. Trans. (1993) 339–344. R. Soc. Lond. Ser. B Biol. Sci. (2015) 370(1673). [235] W. Gabriel, M. Lynch, R. Burger, Muller's ratchet and mutational meltdowns, Evo- [197] S. Wu, et al., Substantial contribution of extrinsic risk factors to cancer develop- lution 47 (6) (1993) 1744–1757. ment, Nature 529 (7584) (2016) 43–47. [236] A. Kreso, J.E. Dick, Evolution of the cancer stem cell model, Cell Stem Cell 14 (3) [198] F. Blokzijl, et al., Tissue-specific mutation accumulation in human adult stem cells (2014) 275–291. during life, Nature 538 (7624) (2016) 260–264. [237] J.M. Weaver, et al., Ordering of mutations in preinvasive disease stages of esopha- [199] M. Kimura, The Neutral Theory of Molecular Evolution, Cambridge University geal , Nat. Genet. 46 (8) (2014) 837–843. Press, Cambridge Cambridgeshire; New York, 1983 (xv, 367 p.). [238] P. Calabrese, S. Tavaré, D. Shibata, Pretumor progression: clonal evolution of [200] D.A. Landau, et al., Evolution and impact of subclonal mutations in chronic lympho- human stem cell populations, Am. J. Pathol. 164 (4) (2004) 1337–1346. cytic leukemia, Cell 152 (4) (2013) 714–726. [239] A.M. Baker, et al., Quantification of crypt and stem cell evolution in the normal and [201] I. Bozic, J.M. Gerold, M.A. Nowak, Quantifying clonal and subclonal passenger mu- neoplastic human colon, Cell Rep. 8 (4) (2014) 940–947. tations in cancer evolution, PLoS Comput. Biol. 12 (2) (2016), e1004731. [240] L. Vermeulen, et al., Defining stem cell dynamics in models of intestinal tumor ini- [202] L.A. Diaz Jr., et al., The molecular evolution of acquired resistance to targeted EGFR tiation, Science 342 (6161) (2013) 995–998. blockade in colorectal cancers, Nature 486 (7404) (2012) 537–540. [241] C.K. Sievers, et al., Understanding intratumoral heterogeneity: lessons from the [203] I. Bozic, et al., Accumulation of driver and passenger mutations during tumor pro- analysis of at-risk tissue and premalignant lesions in the colon, Cancer Prev. Res. gression, Proc. Natl. Acad. Sci. U. S. A. 107 (43) (2010) 18545–18550. 9 (8) (2016) 638–641. [204] M. Slatkin, Gene flow and the geographic structure of natural populations, Science [242] G. Driessens, et al., Defining the mode of tumour growth by clonal analysis, Nature 236 (4803) (1987) 787–792. 488 (7412) (2012) 527–530. [205] L. Excoffier, Patterns of DNA sequence diversity and genetic structure after a range [243] J. Chen, et al., A restricted cell population propagates glioblastoma growth after expansion: lessons from the infinite-island model, Mol. Ecol. 13 (4) (2004) chemotherapy, Nature 488 (7412) (2012) 522–526. 853–864. [244] A.G. Schepers, et al., Lineage tracing reveals Lgr5+ stem cell activity in mouse in- [206] K.S. Korolev, J.B. Xavier, J. Gore, Turning ecology and evolution against cancer, Nat. testinal adenomas, Science 337 (6095) (2012) 730–735. Rev. Cancer 14 (5) (2014) 371–380. [245] C. Tomasetti, et al., Only three driver gene mutations are required for the develop- [207] T.M. Kim, et al., Regional biases in mutation screening due to intratumoural hetero- ment of lung and colorectal cancers, Proc. Natl. Acad. Sci. U. S. A. 112 (1) (2015) geneity of prostate cancer, J. Pathol. 233 (4) (2014) 425–435. 118–123. [208] E.A. Martens, et al., Spatial structure increases the waiting time for cancer, New J. [246] M. Gerlinger, et al., Cancer: evolution within a lifetime, Annu. Rev. Genet. 48 Phys. 13 (2011). (2014) 215–236. [209] L. Hindersin, et al., Should tissue structure suppress or amplify selection to mini- [247] Y. Suzuki, et al., Multiregion ultra-deep sequencing reveals early intermixing and mize cancer risk? Biol. Direct 11 (2016). variable levels of intratumoral heterogeneity in colorectal cancer, Mol. Oncol. [210] N.L. Komarova, Spatial interactions and cooperation can change the speed of evo- (2016). lution of complex phenotypes, Proc. Natl. Acad. Sci. U. S. A. 111 (Suppl. 3) (2014) [248] R. Nielsen, Molecular signatures of natural selection, Annu. Rev. Genet. 39 (2005) 10789–10795. 197–218. [211] A. Sottoriva, et al., Cancer stem cell tumor model reveals invasive morphology and [249] H.H. Heng, et al., Stochastic cancer progression driven by non-clonal chromosome increased phenotypical heterogeneity, Cancer Res. 70 (1) (2010) 46–56. aberrations, J. Cell. Physiol. 208 (2) (2006) 461–472.

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001 18 Z. Hu et al. / Biochimica et Biophysica Acta xxx (2017) xxx–xxx

[250] P.J. Stephens, et al., Massive genomic rearrangement acquired in a single cata- [268] K. Kemper, et al., Intra- and inter-tumor heterogeneity in a vemurafenib-resistant strophic event during cancer development, Cell 144 (1) (2011) 27–40. melanoma patient and derived xenografts, EMBO Mol. Med. 7 (9) (2015) [251] T. Rausch, et al., Genome sequencing of pediatric medulloblastoma links cat- 1104–1118. astrophic DNA rearrangements with TP53 mutations, Cell 148 (1–2) (2012) [269] T.N. Wong, et al., Role of TP53 mutations in the origin and evolution of therapy-re- 59–71. lated acute myeloid leukaemia, Nature 518 (7540) (2015) 552–555. [252] J.O. Korbel, P.J. Campbell, Criteria for inference of chromothripsis in cancer ge- [270] H.E. Bhang, et al., Studying clonal dynamics in response to cancer therapy using nomes, Cell 152 (6) (2013) 1226–1236. high-complexity barcoding, Nat. Med. 21 (5) (2015) 440–448. [253] S.C. Baca, et al., Punctuated evolution of prostate cancer genomes, Cell 153 (3) [271] D. Juric, et al., Convergent loss of PTEN leads to clinical resistance to a PI(3)Kalpha (2013) 666–677. inhibitor, Nature 518 (7538) (2015) 240–244. [254] C. Cheng, et al., Whole-genome sequencing reveals diverse models of structural [272] M.N. Balak, et al., Novel D761Y and common secondary T790M mutations in epi- variations in esophageal squamous cell carcinoma, Am. J. Hum. Genet. 98 (2) dermal growth factor receptor-mutant lung adenocarcinomas with acquired resis- (2016) 256–274. tance to kinase inhibitors, Clin. Cancer Res. 12 (21) (2006) 6494–6501. [255] L.B. Alexandrov, et al., Signatures of mutational processes in human cancer, Nature [273] M. Murtaza, et al., Non-invasive analysis of acquired resistance to cancer therapy 500 (7463) (2013) 415–421. by sequencing of plasma DNA, Nature 497 (7447) (2013) 108–112. [256] B. Waclaw, et al., A spatial model predicts that dispersal and cell turnover limit [274] A. Romanel, et al., Plasma AR and abiraterone-resistant prostate cancer, Sci. Transl. intratumour heterogeneity, Nature 525 (7568) (2015) 261–264. Med. 7 (312) (2015) 312re10. [257] D. Basanta, R.A. Gatenby, A.R. Anderson, Exploiting evolution to treat drug resis- [275] S.J. Dawson, et al., Analysis of circulating tumor DNA to monitor metastatic breast tance: combination therapy and the double bind, Mol. Pharm. 9 (4) (2012) cancer, N. Engl. J. Med. 368 (13) (2013) 1199–1209. 914–921. [276] J. Tie, et al., Circulating tumor DNA analysis detects minimal residual disease and [258] I.A. Rodriguez-Brenes, D. Wodarz, Preventing clonal evolutionary processes in can- predicts recurrence in patients with stage II colon cancer, Sci. Transl. Med. 8 cer: insights from mathematical models, Proc. Natl. Acad. Sci. U. S. A. 112 (29) (346) (2016) 346ra92. (2015) 8843–8850. [277] Y. Wang, et al., Detection of tumor-derived DNA in cerebrospinal fluid of patients [259] B. Zhao, et al., Exploiting temporal collateral sensitivity in tumor clonal evolution, with primary tumors of the brain and spinal cord, Proc. Natl. Acad. Sci. U. S. A. Cell 165 (1) (2016) 234–246. 112 (31) (2015) 9704–9709. [260] P.M. Enriquez-Navas, et al., Exploiting evolutionary principles to prolong tumor [278] M. Jamal-Hanjani, et al., Tracking genomic cancer evolution for precision medicine: control in preclinical models of breast cancer, Sci. Transl. Med. 8 (327) (2016) the lung TRACERx study, PLoS Biol. 12 (7) (2014), e1001906. 327ra24. [279] N. Knijn, et al., KRAS mutation analysis: a comparison between primary tumours [261] I.A. Rodriguez-Brenes, C.S. Peskin, Quantitative theory of telomere length regula- and matched liver metastases in 305 colorectal cancer patients, Br. J. Cancer 104 tion and cellular senescence, Proc. Natl. Acad. Sci. U. S. A. 107 (12) (2010) (6) (2011) 1020–1026. 5387–5392. [280] S. Turajlic, C. Swanton, Metastasis as an evolutionary process, Science 352 (6282) [262] B. Zhao, M.T. Hemann, D.A. Lauffenburger, Modeling tumorclonal evolution for (2016) 169–175. drug combinations design, Trends Cancer 2 (3) (2016) 144–158. [281] M. Angelo, et al., Multiplexed ion beam imaging of human breast tumors, Nat. Med. [263] J. Chmielecki, et al., Optimization of dosing for EGFR-mutant non-small cell lung 20 (4) (2014) 436–442. cancer with evolutionary cancer modeling, Sci. Transl. Med. 3 (90) (2011) 90ra59. [282] A. Bruna, et al., A biobank of breast cancer explants with preserved intra-tumor [264] I. Bozic, et al., Evolutionary dynamics of cancer in response to targeted combination heterogeneity to screen anticancer compounds, Cell 167 (1) (2016) 260–274 therapy, elife 2 (2013), e00747. (e22). [265] B. Zhao, M.T. Hemann, D.A. Lauffenburger, Intratumor heterogeneity alters most ef- [283] P. Eirew, et al., Dynamics of genomic clones in breast cancer patient xenografts at fective drugs in designed combinations, Proc. Natl. Acad. Sci. U. S. A. 111 (29) single-cell resolution, Nature 518 (7539) (2015) 422–426. (2014) 10773–10778. [284] M. van de Wetering, et al., Prospective derivation of a living organoid biobank of [266] S. Misale, et al., Emergence of KRAS mutations and acquired resistance to anti- colorectal cancer patients, Cell 161 (4) (2015) 933–945. EGFR therapy in colorectal cancer, Nature 486 (7404) (2012) 532–536. [285] M.A. Cantrell, C.J. Kuo, Organoid modeling for cancer precision medicine, Genome [267] P. Laurent-Puig, et al., Clinical relevance of KRAS-mutated subclones detected with Med. 7 (1) (2015) 32. picodroplet digital PCR in advanced colorectal cancer treated with anti-EGFR ther- [286] J.Y. Kim, R.A. Gatenby, Quantitative clinical imaging methods for monitoring apy, Clin. Cancer Res. 21 (5) (2015) 1087–1097. intratumoral evolution, Methods Mol. Biol. 1513 (2017) 61–81.

Please cite this article as: Z. Hu, et al., A population genetics perspective on the determinants of intra-tumor heterogeneity, Biochim. Biophys. Acta (2017), http://dx.doi.org/10.1016/j.bbcan.2017.03.001