Phylomedicine of Mutational Processes in Somatic Cancer Cell Populations
bioRxiv preprint doi: https://doi.org/10.1101/2021.04.02.438268; this version posted April 4, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

eLife Evolutionary Medicine

Sayaka Miura1,2¶*, Tracy Vu1,2¶, Jiyeong Choi1,2, and Sudhir Kumar1,2,3

1Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, Pennsylvania, USA
2Department of Biology, Temple University, Philadelphia, Pennsylvania, USA
3Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia

*Corresponding author
E-mail: [email protected]
¶These authors contributed equally to this work.

ABSTRACT

Mutational processes in somatic cancer cell populations are constantly changing, leaving their signatures in the accumulated genomic variation in tumors. The inference of mutational signatures from the observed genetic variation enables spatiotemporal tracking of tumor mutational processes that evolve due to cellular environmental changes, mutations, and treatment regimes. Ultimately, mutational patterns illuminate the mechanistic understanding of their evolution in cancer progression. We show that the integration of cancer cell phylogeny with mutational signature deconvolution enables higher-resolution detection of gain and loss of mutational processes within the phylogeny.
Applying this approach to somatic genomic variation in 61 lung cancer patients revealed a high turnover of mutational processes over time and between closely related clonal lineages. Some mutational signatures (e.g., smoking-related) showed a higher propensity to be lost, whereas others (e.g., AID/APOBEC) were gained during lung tumor evolution. These observations shed light on the evolution of mutational processes in somatic cell evolution.

INTRODUCTION

Tumor cells accumulate somatic mutations during cancer progression, in which cells exhibit dynamic demography, including emergence, expansion, and extinction (Bailey et al., 2018; Martincorena & Campbell, 2015). Through the analysis of genomic variation, researchers now routinely reconstruct mutational histories and phylogenies of clones (Brown et al., 2017; El-Kebir et al., 2018; Miura et al., 2020; Turajlic et al., 2018; Zhao et al., 2016). In a clone phylogeny, variants can be localized to individual branches, and the relative frequencies of different variant types can be compared across lineages to detect shifts in cellular mutational processes over time. For example, the trunk of the clone phylogeny in Figure 1 shows many more C→A transversions than its descendants, suggesting that the process of mutagenesis is not the same over time in this lung cancer patient. Such temporal comparisons of genome variation patterns are a new frontier for understanding the intricacies of evolution in individual tumors and patients.
These comparisons reveal how pre-existing genetic alterations and treatment regimens fundamentally alter the landscape of mutational processes, often producing resistant cells that promote cancer progression (Ashley et al., 2019; Barry et al., 2018; de Bruin et al., 2014; Dentro et al., 2020; Gerstung et al., 2020; Leong et al., 2019). Many mutational processes leave distinct signatures in the form of types of variants and their relative counts. For example, a large C→A variant frequency is a tell-tale sign of smoking-related mutational processes that arise early (COSMIC signature S4; Fig. 1b and 1d). Their activity begins to decline after smoking cessation (Alexandrov et al., 2016; Le Calvez et al., 2005). In contrast, age-related mutagenic processes create C→T transitions that arise throughout a person’s lifetime (COSMIC signature S1) and result in the decay of methylated CpG sites (Alexandrov et al., 2013; Alexandrov et al., 2018; Alexandrov & Stratton, 2014; Van Hoeck et al., 2019). Many distinct mutational signatures have been inferred from the genetic variation found in tumors of various cancers, and these have been assembled in online catalogs (Alexandrov et al., 2020; Goncearenco et al., 2017). For example, 30 signatures have been recognized in COSMIC v2, each of which is a vector of 96 different mutational contexts consisting of the mutated base and adjacent 5’ and 3’ bases (e.g., Fig. 1d) (Alexandrov et al., 2015; Alexandrov et al., 2020; Tate et al., 2019). Computational methods are available to estimate the relative activities of mutational signatures from observed tumor genetic variants (Blokzijl et al., 2018; Huang et al., 2018; Rosenthal et al., 2016). Mutational processes operating in early and late stages of cancer progression have been contrasted using predicted mutational signatures (Ashley et al., 2019; Barry et al., 2018; de Bruin et al., 2014; Dentro et al., 2020; Gerstung et al., 2020; Leong et al., 2019).
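To make the 96-context representation concrete, the following minimal Python sketch enumerates the contexts and tallies a handful of variants into a 96-element count vector. It is an illustration only: the names `CONTEXTS`, `channel`, and `spectrum`, the label format `T[C>A]G`, and the toy variants are assumptions introduced here, not part of the study. It follows the common convention of describing each substitution from the pyrimidine (C or T) of the mutated base pair.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
# Six substitution classes, written from the pyrimidine (C or T) of the base pair.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# 6 substitution classes x 4 possible 5' bases x 4 possible 3' bases = 96 channels.
CONTEXTS = [
    f"{five}[{sub}]{three}"
    for sub, five, three in product(SUBSTITUTIONS, BASES, BASES)
]


def channel(five, ref, alt, three):
    """Return the 96-channel label for one single-base substitution."""
    if ref in "AG":
        # Purine reference: describe the variant on the opposite strand,
        # which complements every base and swaps the 5' and 3' neighbors.
        five, ref, alt, three = (
            three.translate(COMPLEMENT),
            ref.translate(COMPLEMENT),
            alt.translate(COMPLEMENT),
            five.translate(COMPLEMENT),
        )
    return f"{five}[{ref}>{alt}]{three}"


def spectrum(variants):
    """Count variants per channel, in the fixed order of CONTEXTS."""
    counts = Counter(channel(*v) for v in variants)
    return [counts[c] for c in CONTEXTS]


# Hypothetical variants: (5' base, reference base, alternate base, 3' base).
toy_variants = [("T", "C", "A", "G"), ("C", "G", "T", "A"), ("A", "C", "T", "G")]
profile = spectrum(toy_variants)
```

A count vector of this shape, built from the variants mapped to a sample or a branch, is the kind of input that signature-fitting methods compare against catalog signatures.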
Researchers have also begun to analyze branch-specific mutational signatures in clone phylogenies to discover mutagens linked with the origin of new clones in cancer patients (Barry et al., 2018; Hao et al., 2016; Roper et al., 2019; Wang et al., 2019). Successful mutational signature detection using existing methods currently requires hundreds of somatic variants (Li et al., 2018). This requirement makes the inference of evolutionary dynamics of mutational signatures at a finer phylogenetic resolution (e.g., branch-by-branch) challenging because the collection of variants in individual branches in clone phylogenies is often small (Fig. 2 and 3). For example, fewer than 100 variants were mapped to most branches in carcinoma cell phylogenies (Jamal-Hanjani et al., 2017) (Fig. 2 and 3). It is not yet possible to detect branch-specific mutational signatures reliably from such small collections of variants (Li et al., 2018). This problem is illustrated in an analysis of a simulated clone phylogeny modeled after an adenocarcinoma clone phylogeny (Fig. 4a; phylogeny CRU0079 in Fig. 2). The available state-of-the-art methods, when applied to branch-specific variants, frequently produced too many signatures, while some correct signatures remained undetected (Fig. 4b-d). This means that we cannot yet reliably detect the evolution of branch-specific signatures over time in a patient, limiting us to gross comparisons that pool variants to build large-enough collections (de Bruin et al., 2014; Dong et al., 2018; Hao et al., 2016; Nahar et al., 2018).
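The effect of small variant collections can be illustrated with a short simulation. The sketch below is not the simulation reported in this study. It assumes a hypothetical 96-channel signature made of random weights, draws variants from it, and measures how closely the observed count profile resembles the signature that generated it at two collection sizes. All names and parameter values are illustrative assumptions.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def sampled_profile(probabilities, n_variants, rng):
    """Draw n_variants channels from the distribution; return per-channel counts."""
    counts = [0] * len(probabilities)
    channels = range(len(probabilities))
    for idx in rng.choices(channels, weights=probabilities, k=n_variants):
        counts[idx] += 1
    return counts


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical signature: random positive weights normalized to sum to 1.
# This is NOT a COSMIC signature; it only stands in for one.
raw = [rng.random() for _ in range(96)]
total = sum(raw)
signature = [w / total for w in raw]


def mean_similarity(n_variants, replicates=200):
    """Average similarity between sampled profiles and the true signature."""
    sims = [
        cosine(sampled_profile(signature, n_variants, rng), signature)
        for _ in range(replicates)
    ]
    return sum(sims) / replicates


small = mean_similarity(50)    # a branch with few mapped variants
large = mean_similarity(2000)  # a pooled collection of variants
```

Under these assumptions, profiles built from 50 variants resemble their generating signature less closely on average than profiles built from 2000 variants. This is consistent with sampling noise alone making small branch-level collections hard to fit, even before several signatures are mixed.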
We hypothesized that the detection of mutational signatures would be more accurate if the clone phylogeny were used alongside mutational signature detection approaches. This idea is based on the expectation that neighboring branches in the clone phylogeny will share some mutational signatures due to their shared environment and evolutionary history (e.g., Dentro et al., 2018). We leveraged this property to infer branch-specific mutational signatures through a joint analysis of the collection of mutations mapped on phylogenetically proximal branches of the clone phylogeny, an approach we call PhyloSignare and present below. We then assess PhyloSignare’s usefulness by analyzing computer-simulated datasets, which establish that PhyloSignare can significantly improve the accuracy of current methods for smaller collections of variants (Blokzijl et al., 2018; Huang et al., 2018; Rosenthal et al., 2016). Finally, we apply PhyloSignare to infer mutational signature evolution in non-small cell lung cancer patients, revealing branch-specific mutational signatures at a finer phylogenetic resolution.

RESULTS

The PhyloSignare (PS) approach

As noted above, current methods produce many spurious signatures when the number of variants analyzed is not large enough. To detect spurious signatures, we estimate an importance score (iS) for each signature predicted using an existing method, e.g., a quadratic programming approach (QP) (Huang et al., 2018). This score contrasts the statistical fit of the predicted signatures to explain the frequencies of branch-specific variants with and without the given signature (see Methods section