POLD Replicates Both Strands of Small Kilobase-Long Replication Bubbles Initiated at a Majority of Human Replication Origins

bioRxiv preprint doi: https://doi.org/10.1101/174730; this version posted August 10, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Artem V. Artemov1,2,a*, Maria A. Andrianova2,3, Georgii A. Bazykin2,3 and Vladimir B. Seplyarskiy2,4,b*

1 Faculty of Bioengineering and Bioinformatics, Moscow State University, Moscow, Russia
2 Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute), Moscow, Russia
3 Skolkovo Institute of Science and Technology, Russia
4 Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
* These authors contributed equally
a [email protected]
b [email protected]

Abstract

Error-prone mutants of polymerase epsilon (POLE*) or polymerase delta (POLD1*) induce a mutator phenotype in human cancers. Here we show that, within one kilobase of replication origins, the rate of mutations introduced by POLD1* is elevated by 50%, while the rate of POLE*-induced mutations is decreased twofold. These results support a model in which POLD1 replicates both the leading and the lagging strands within a kilobase of an origin. The magnitude of the mutational bias suggests that the probability that an individual origin initiates replication exceeds 50%, which is much higher than previous estimates. Using additional data from nascent DNA sequencing and Okazaki fragment sequencing (OK-seq) experiments, we show that a majority of origins fire in each replication round, but the initiated replication fork does not propagate further than 1 kb in either direction. Analyses based on mutational data and on OK-seq data concordantly suggest that only approximately a quarter of fired origins result in a processive replication fork.
Taken together, our results provide a new model of replication initiation.

Keywords: POLE; POLD; replication; origins; MSI; cancer

Accumulation of sequencing data is improving the understanding of processes involved in mutagenesis. Recently, analysis of preferential fork direction helped uncover a major mode of APOBEC-induced mutagenesis in cancer (Seplyarskiy et al. 2016a; Morganella et al. 2016; Haradhvala et al. 2016) and revealed the dominant role of mismatches introduced by POLD1 in mutagenesis of cancers with mismatch repair deficiency (Andrianova et al. 2017). Understanding of how the major DNA polymerases, POLE and POLD1, divide the labor during DNA replication has largely stemmed from patterns of mutation or ribonucleotide incorporation in cell systems with deficiencies in different proofreading mechanisms (Nick McElhinny et al. 2008; Larrea et al. 2010a; Lujan et al. 2012; Johnson et al. 2015a; Reijns et al. 2015; Clausen et al. 2015; Andrianova et al. 2017). By contrast, a qualitative model of replication has been based on reconstruction of replication in vitro (Yeeles et al. 2015; Georgescu et al. 2014; Yeeles et al. 2017; Kurat et al. 2017; Devbhandari et al. 2017), and the efficiency of replication origins in mammalian cells has been estimated mainly from sequencing of 500-2500 nucleotide long stretches of nascent DNA (Cayrou et al. 2011; Besnard et al. 2012, 2014; Cayrou et al. 2015). Analyses of mutagenesis in systems with deficiencies in different components of the proofreading machinery have not yet been applied to study the firing of replication origins. Here, we use somatic mutations in cancers as well as OK-seq and repli-seq data to qualitatively and quantitatively revisit the model of origin firing.
Mutational traces of low-fidelity POLE (POLE*) or POLD1 (POLD1*) mutants may be used to trace the genomic regions replicated by each polymerase. However, mismatches introduced by polymerases are co-replicatively repaired by the mismatch repair (MMR) system, which may bias the mutational traces left by POLE* or POLD1*. This bias would not exist in tumours with biallelic mismatch-repair deficiency (bMMRD) or MMR deficiency manifested as microsatellite instability (MSI). In our analyses we used somatic mutations collected by whole-genome sequencing of POLE* MSI endometrial tumours and by whole-exome sequencing of POLD* bMMRD glioblastoma samples. ORC binding sites determined by ChIP-seq in HeLa cells (Dellino et al. 2013a) were taken as markers of potential origins.

To obtain a mutation-based model of replication, we estimated the context-corrected mutation rate for the 5 kb around each ORC site (see Methods). We discovered a twofold drop of mutation rate within a kilobase of ORC sites in POLE* MMR-deficient tumours (P = 1.2 × 10^-157, Figure 1A). The drop had a similar magnitude for different substitution types (Figure S1A) and for genomic contexts excluding CpG dinucleotides (Figure S1B). In contrast to POLE* MMR-deficient cancers, POLD* MMR-deficient tumours possessed a 1.5-fold higher mutation rate near ORC sites compared with the flanks 5 kb away (P = 2.3 × 10^-4, Figure 1A). To verify that the mutation rate peak at replication origins in POLD* tumours could not be explained by biases in genomic distribution specific to exome data, we studied POLE* tumours for which only whole-exome data were available. No peak was observed around replication origins (Figure S2).

Replication origins are characterized by a very specific epigenetic profile. To account for this, we plotted average mutation rates around DNA features that have been shown to affect the local mutation rate: gene transcription start sites (Sabarinathan et al. 2016; Perera et al. 2016); the peaks of H3K4me3, H3K9ac and H3K27me3 histone marks and binding sites of CTCF and cohesin (SMC3) (Schuster-Böckler and Lehner 2012); and SUZ12 binding sites, which were believed to be associated with the replication process (Cayrou et al. 2015). ORC sites and SUZ12 had the strongest effect on local mutation rate (Figure 1B). We performed multiple regression predicting mutation rate in 1 kb genomic windows based on epigenetic factors and showed that the presence of an ORC site was the best predictor of the local mutation rate in POLE* tumours (P_ANOVA < 2 × 10^-16). Epigenetic background, such as DNase sites, which frequently overlapped with ORC sites, could not explain the effect of ORC sites on mutation rates in POLE* and POLD1* tumours (Figure S3A,B). We also controlled for replication timing and preferential fork direction (Figure S4A), but did not find any decrease or increase in mutation rate for a control set of genomic regions (Figure S4B). Therefore, the effect was associated with origins themselves and could not be explained by clustering of origins in the domains of early replication timing. We also explored APOBEC-induced mutational patterns, which had been recently linked to replication (Haradhvala et al. 2016; Seplyarskiy et al. 2016b), but did not see any alterations of mutation rate near ORC sites (Figure S5).

The observed accumulation of the mutational trace left by POLD and the depletion of POLE-introduced mutations at ORC sites suggest that POLD replicates both the leading and the lagging strands within 1 kb of the origin site, while the two polymerases canonically divide the labor further from the origin: POLE replicates the leading strand whereas POLD replicates the lagging strand (Figure 1C).
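The division-of-labor argument above can be illustrated with back-of-envelope arithmetic. The sketch below rests on assumptions that are not stated in the text: a region within 1 kb of an origin is either replicated by POLD on both strands (the origin fired, with probability p) or replicated passively by a canonical fork, with POLE synthesizing one strand and POLD the other; and the mutation rate in POLE* or POLD1* tumours is taken to be proportional to the share of synthesis performed by the mutant polymerase. The function names are hypothetical.

```python
def expected_fold_changes(p):
    """Expected mutation-rate fold change near an origin relative to distant flanks.

    p is the assumed probability that the origin fires in a replication round.
    Assumed polymerase shares of DNA synthesis (an illustration, not the paper's method):
      passive replication (canonical fork): POLE 1/2, POLD 1/2
      fired origin (within ~1 kb):          POLE 0,   POLD 1
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    baseline = 0.5                            # share of either polymerase far from origins
    pole_share = (1.0 - p) * 0.5              # POLE contributes only when the origin is passive
    pold_share = (1.0 - p) * 0.5 + p * 1.0    # POLD takes both strands when the origin fires
    return pole_share / baseline, pold_share / baseline


def firing_probability(pole_fold=None, pold_fold=None):
    """Invert the toy model: fold_POLE = 1 - p and fold_POLD = 1 + p."""
    estimates = {}
    if pole_fold is not None:
        estimates["from_POLE*"] = 1.0 - pole_fold
    if pold_fold is not None:
        estimates["from_POLD1*"] = pold_fold - 1.0
    return estimates


# Fold changes reported in the text: twofold drop (POLE*), 1.5-fold increase (POLD*).
print(firing_probability(pole_fold=0.5, pold_fold=1.5))
```

Under these assumptions both observed biases invert to the same value of p, which shows why a depletion in POLE* tumours and an enrichment in POLD* tumours are mutually consistent. The calculation ignores effects such as averaging over the analysis window, so it should not be read as the authors' estimate.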
POLE* and POLD1* were known to introduce mutations in different trinucleotide contexts (Andrianova et al. 2017; Shlien et al. 2015). If context preferences may be extrapolated to non-mutated polymerases in systems with dominant contribution of polymerase errors to mutagenesis, we may use somatic mutations in specific contexts to

[Figure 1 image omitted; four panels, A to D. Recoverable labels: "Context-corrected mutation rate, relative to genome-wide, log scale" against "distance to ORC site" (-5000 to 5000); "Fold change of mutation rate at epigenetic features, log scale" across TSS, CTCF, SMC3, ORC1, SUZ12, DNAse, H3K9ac, H3K4me3, H3K27me3; legend "Mutants": POLE*, POLD*; a schematic of polδ and polε on the leading and lagging strands from -1 kb to +1 kb around the origin; "eSPAN coverage of both strands" (POLD, POLE) against "coordinate relative to origin" (-4000 to 4000).]

Figure 1. Local somatic mutation rates on a 5-kilobase scale centered around various epigenetic features in POLE* tumours and in POLD* tumours. Note that only exome data were available for POLD* tumours. (A) For each epigenetic feature (x-axis), we plotted the context-corrected mutation rate adjacent to this feature (-500..+500 bp window) normalized by the mutation rate further from the regions of interest (-5000..-4500 bp and 4500..5000 bp regions, see Methods section). ORC1 sites were associated with the strongest drop in local mutation rate in POLE* tumours and, at the same time, with a strong and significant increase of mutation rate in POLD* tumours. (B) Context-corrected somatic mutation rates (see Methods) on a 5-kilobase scale centered around potential human replication origins, defined as ORC binding sites, in POLE* tumours and in POLD* tumours.
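The normalization described in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that "context-corrected" means observed mutations divided by the number expected from context-specific genome-wide rates (the Methods are not part of this excerpt), and that windows are half-open intervals of offsets relative to the feature. The record layout and the names `pooled_rate` and `fold_change` are hypothetical.

```python
def pooled_rate(records, windows):
    """Observed / expected mutation count pooled over half-open offset windows.

    Each record is a dict with:
      "offset"   - position relative to the feature (bp)
      "observed" - number of somatic mutations seen at that position
      "expected" - number expected from sequence context alone (assumed precomputed)
    """
    observed = 0.0
    expected = 0.0
    for rec in records:
        if any(lo <= rec["offset"] < hi for lo, hi in windows):
            observed += rec["observed"]
            expected += rec["expected"]
    if expected == 0:
        raise ValueError("no expected mutations in the requested windows")
    return observed / expected


def fold_change(records,
                center=((-500, 500),),
                flanks=((-5000, -4500), (4500, 5000))):
    """Rate adjacent to the feature, normalized by the rate in the distant flanks."""
    flank_rate = pooled_rate(records, flanks)
    if flank_rate == 0:
        raise ValueError("flank rate is zero; fold change is undefined")
    return pooled_rate(records, center) / flank_rate


# Toy data: the central position is mutated half as often as the flanks.
toy = [
    {"offset": -4800, "observed": 2, "expected": 1.0},
    {"offset": 0, "observed": 1, "expected": 1.0},
    {"offset": 4700, "observed": 2, "expected": 1.0},
]
print(fold_change(toy))
```

Pooling observed and expected counts before dividing, rather than averaging per-position ratios, keeps sparse positions with very small expected counts from dominating the estimate; this is a design choice of the sketch.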
