Modelling and Measurement of Simple and Complex DNA Damage Induction by Ion Irradiation

A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Biology, Medicine and Health

2018

Nicholas T. Henthorn

School of Medical Sciences

Division of Cancer Sciences

Contents

1. Introduction
1.1 Cancer
1.1.1 Incidence and Prevalence
1.1.2 Biology
1.2 External Beam Radiotherapy
1.2.1 In Practice
1.2.2 Photons
1.2.3 Particles
1.3 Proton Beam Therapy
1.3.1 Delivery Techniques
1.3.2 Benefits
1.3.3 Linear Energy Transfer
1.3.4 Uncertainties and Areas of Research
1.4 Relative Biological Effectiveness
1.4.1 Definition
1.4.2 Experimental Data
1.4.3 RBE = 1.1
1.4.4 Clinical Implications
1.4.5 Phenomenological Models
1.5 DNA Damage, Nanodosimetry, and DNA Repair
1.6 Modelling
1.6.1 Current Models
1.6.2 Mechanistic Models
1.6.3 Track Structure Codes
1.7 Geant4-DNA
1.7.1 Physics
1.7.2 Chemistry
1.8 Summary and Aims
2. Nanodosimetric Simulation of Direct Ion-Induced DNA Damage Using Different Chromatin Geometry Models
2.1 Introduction
2.2 Methods
2.2.1 Model of DNA and Chromatin Geometries
2.2.2 Track Structure Simulation and Irradiation Details
2.2.3 Scoring of Clusters
2.2.4 Nanodosimetric Parameters
2.3 Results
2.3.1 Damaged Backbone Cluster Size Distribution
2.3.2 Conditional Average Cluster Size (m1)
2.3.3 Relative Cumulative Distributions (F2 and F3)
2.3.4 SSB to DSB Ratio
2.3.5 The Effect of Backbone Size
2.4 Discussion
2.4.1 Chromatin Model Comparison
2.4.2 Damage Complexity
2.5 Summary and Conclusions
2.6 Acknowledgements
3. Nanodosimetric Simulation of DNA Damage from Protons and Across a Clinically Relevant Proton Spread Out Bragg Peak with Direct and Indirect Effects
3.1 Introduction
3.2 Methods
3.2.1 Models of DNA Geometry
3.2.2 Track Structure Simulation
3.2.3 Direct and Indirect DNA Damage
3.2.4 Damage Classification
3.2.5 Simulation of Plasmid Irradiation
3.2.6 Simulation of Chromatin Fibre Irradiation
3.2.7 Total Damage Yields in a Cell Model
3.3 Results
3.3.1 Direct DNA Damage - Plasmid Irradiation
3.3.2 Indirect DNA Damage – Chromatin Irradiation
3.3.3 Damage Complexity
3.3.4 Damage Complexity Distribution
3.3.5 Clinically Relevant Considerations
3.4 Discussion
3.5 Summary and Conclusions
3.6 Acknowledgements
4. In Silico Non-Homologous End Joining Following Ion Induced DNA Double Strand Breaks Predicts That Repair Fidelity Depends on Break Density
4.1 Introduction
4.2 Methods
4.2.1 Simulation of DSBs
4.2.2 Non-Homologous End Joining Repair Model
4.2.3 Data Availability
4.3 Results
4.3.1 Misrepair and LET
4.3.2 Cluster Density, Misrepair, and Residual DSBs
4.3.3 Residual and Misrepaired DSB Yields
4.4 Discussion and Conclusions
4.5 Acknowledgements
5. SDD: A Standardised Data Format to Record DNA Damage
5.1 Introduction
5.2 The New Standard to Record DNA Damage
5.2.1 Website and Updates
5.2.2 Header
5.2.3 The Data Block
5.2.4 Dissemination and Repository
5.3 Discussion and Conclusions
6. Comparing DNA damage and repair models with respect to experimentally supported mechanisms
6.1 Introduction
6.2 Methods
6.2.1 Inter-Comparison Overview
6.2.2 Use of a Standard Format for DNA Damage (SDD)
6.2.3 The Henthorn DNA Damage Model
6.2.4 The McMahon DNA Damage Model
6.2.5 The McMahon DNA Repair Model
6.2.6 The Warmenhoven DNA Repair Model
6.3 Results
6.4 Discussion
6.5 Conclusions
6.6 Acknowledgements
7. Final Discussions
7.1 Chapter 2
7.2 Chapter 3
7.3 Chapter 4
7.4 Chapter 5
7.5 Chapter 6
7.6 Model Assumptions
8. Conclusions and Future Work
9. References
A. Appendix 1
A. Appendix 2

Word count:  65,000 words

List of Figures

Figure Description
1.1 UK cancer incidence by age
1.2 Photon interactions and dose
1.3 Photon attenuation coefficient
1.4 Photon dose buildup
1.5 Photon dose depth as a function of photon energy
1.6 Dose depth profile of photons, electrons, protons, and carbon ion
1.7 Number of operational proton therapy centres across time
1.8 The proton spread-out Bragg peak
1.9 Dose map comparison between protons and photons
1.10 Proton dose depth with track- and dose-averaged LET
1.11 Range uncertainty due to dose inhomogeneity
1.12 Cell survival curves from proton and photon irradiation
1.13 Relative biological effectiveness as a function of survival fraction
1.14 Relative biological effectiveness across an ion and LET range
1.15 Relative biological effectiveness across a proton LET range
1.16 Relative biological effectiveness across a proton LET range, grouped by radiosensitivity
1.17 Phenomenological relative biological effectiveness models applied to a proton spread-out Bragg peak
1.18 Fitted parameters of phenomenological relative biological effectiveness models
1.19 Relationship between proton and photon radiosensitivity parameters
1.20 Relationship between proton and photon radiosensitivity parameters, incorporating proton LET
1.21 Simplified process from DSB induction to biological outcome
1.22 Mechanistic process from radiation to quantification of radiation quality
1.23 Linear quadratic model and lethal potentially lethal model fits to cell survival
2.1 Three chromatin models implemented in Geant4-DNA
2.2 Probability distributions of DSB cluster size in different chromatin geometries
2.3 Average DSB cluster size in different chromatin geometries as a function of LET
2.4 Cumulative probability of forming a DSB, containing 2 or more DNA backbones, in different chromatin
2.5 Cumulative probability of forming a DSB, containing 3 or more DNA backbones, in different chromatin
2.6 SSB to DSB ratio in different chromatin geometries across an LET range
2.7 Nanodosimetric parameters as a function of DNA backbone volume size
2.8 SSB to DSB ratio in different chromatin geometries across a proton energy range, compared to literature predictions
3.1 Different DNA geometry models implemented in Geant4-DNA
3.2 Secondary electron energy spectrum from a Co-60 source
3.3 Schematic representation of DNA damage complexities
3.4 Plasmid model implemented in Geant4-DNA
3.5 Solenoid chromatin fibre model implemented in Geant4-DNA
3.6 SSB and DSB yield from plasmid simulations as a function of proton LET. DNA geometries and direct damage mechanisms compared to literature results
3.7 Simulation predicted contribution to DNA damage from indirect effects
3.8 Predicted yields of isolated and clustered DNA damage across a proton LET range, per unit dose
3.9 Predictions of DSB complexity across a proton LET range, considering direct DNA damage only or direct and indirect damage
3.10 Cumulative probability of forming a damage cluster with a given number of DNA backbones and bases. Shown for a photon irradiation and a selection of proton LETs
3.11 Fitted parameters of the cumulative probability to determine damage cluster size as a function of proton LET
3.12 Predictions of DSB complexity across a 1D proton spread-out Bragg peak, and relative biological effectiveness for damage complexity compared to photon irradiation
4.1 Predicted probability of misrepaired DSBs across an LET range for protons, alpha particles, and carbon ions. Probability distribution of DSB separation for iso-LET ions. DSB end displacement distribution for 24 hours of repair
4.2 Probability of DSB misrepair as a function of DSB clustering. Probability of residual DSBs across an LET range. Initial yield of DSBs across an LET range, per unit dose
4.3 Model predictions for the yield of residual and misrepaired DSBs across a proton LET range and the predictions of fitted correlations. Shown for 1, 2, and 5 Gy
4.4 Predictions of residual and misrepaired DSB yields for the case of a 1D proton spread-out Bragg peak
5.1 Illustration of the data structure of the standard format to record DNA damage
5.2 Example of scoring base lesions in the standard format
5.3 Example of the DNA double helix format to record details of the damage structure
5.4 Example of the types of DNA damage clusters and the data structure in the standard damage format
6.1 Comparison of the predicted DSB yield and complexity for photons and across a proton LET range for the Henthorn and McMahon damage models
6.2 Comparison of the predicted DSB linear and radial density across a proton LET range between the Henthorn and McMahon damage models
6.3 Comparison of predictions for residual and misrepaired DSB yields for different damage and repair model combinations for photons and across a proton LET range
6.4 Comparison of DSB interaction as a function of separation between the Warmenhoven and McMahon repair models. Comparison of DSB density across a proton LET range between the Henthorn and McMahon damage models. Comparison of DSB misrepair probability as a function of DSB density between the Henthorn-Warmenhoven and McMahon-McMahon damage and repair model combinations
6.5 Comparison of the probability of residual and misrepaired DSBs for photons and across a proton LET range between Henthorn-Warmenhoven and McMahon-McMahon damage and repair model combinations
A1.1 Probability distributions of LET for mono-energetic protons, alphas, and carbon ions across the cell volume
A1.2 Average LET as a function of ion energy; for protons, alphas, and carbon ions
A1.3 Dose per primary as a function of ion LET
A1.4 Schematic representation of irradiation methodology and dosimetry
A1.5 Ion range for low energy protons and alphas
A1.6 Predicted yield of proton induced DSBs as a function of LET per unit dose, compared to literature values
A1.7 The effect of DSB complexity on the probability of DSB misrepair
A1.8 DSB density for different definitions of local DSBs
A1.9 Goodness of fit between misrepaired DSBs for different definitions of local DSB
A1.10 DSB density as a function of LET for protons, alphas, and carbon ions
A1.11 Carbon LET as a function of primary particle energy, showing a switch between classical and relativistic behaviour
A1.12 Predicted DSB yields following 24 hours of repair compared to experimental data in the literature
A1.13 The time of DSB misrepair for different proton energy irradiated cell models

List of Tables

Table Description
1.1 Fitting parameters of phenomenological relative biological effectiveness models
1.2 Common free radicals, production yield, and diffusion coefficient
1.3 Chemical reactions in the default Geant4-DNA chemistry modules
2.1 Fitting parameters for average DNA backbone cluster size
2.2 Fitting parameters for cumulative probability of DNA backbone cluster size
3.1 Fitting parameters for DSB complexity
4.1 Fitting parameters for initial DSB yield, residual DSBs, and misrepaired DSBs
5.1 Data for the header file of the standard format for reporting DNA damage
5.2 Data for the block of the standard format for reporting DNA damage
A1.1 Definitions of field specific terms
A2.1 Parameters used in damage and repair models
A2.2 Radiation energy and LET
A2.3 Fitting equations and parameters used in Chapter 6

List of Abbreviations

1D One Dimensional
3D Three Dimensional
4D Four Dimensional
ATS Amorphous Track Structure
BER Base Excision Repair
bp Base Pairs
c-NHEJ Canonical Non-Homologous End Joining
CDF Cumulative Distribution Function
CDW-EIS Continuum Distorted Wave-Eikonal Initial State
COB Classical Over-Barrier
CTMC Classical Trajectory Monte Carlo
CTRad Clinical and Radiotherapy Translational Group
CTV Clinical Target Volume
DBCSD Damaged Backbone Cluster Size Distribution
DBSCAN Density-Based Spatial Clustering of Applications with Noise
DDR DNA Damage Response
DNA Deoxyribonucleic Acid
DSB Double Strand Break
FBA First Born Approximation
FLm Fractional Langevin Motion
FRAP Fluorescent Recovery After Photobleaching
GCR Galactic Cosmic Ray
GTV Gross Tumour Volume
HalfCyl Half Cylinder
Hen Henthorn
HR Homologous Recombination
HU Hounsfield Unit
HWHM Half Width Half Maximum
ICRU International Commission on Radiation Units
ICSD Ionisation Cluster Size Distribution
IMPT Intensity Modulated Proton Therapy
IMRT Intensity Modulated Radiotherapy
KERMA Kinetic Energy Released per Mass
KS Kolmogorov-Smirnov
LEM Local Effects Model
LET Linear Energy Transfer (keV/μm)
LETd Dose Averaged Linear Energy Transfer
LETt Track Average Linear Energy Transfer
LINAC Linear Accelerator
LPL Lethal Potentially Lethal
LQ Linear Quadratic
McM McMahon
MCTS Monte Carlo Track Structure
MMR Mismatch Repair
NER Nucleotide Excision Repair
NHEJ Non-Homologous End Joining
NHS National Health Service
NTCP Normal Tissue Complication Probability
OAR Organ At Risk
OER Oxygen Enhancement Ratio
PBS Pencil Beam Scanning
PBT Proton Beam Therapy
PDF Probability Distribution/Density Function
PDG Particle Data Group
PET Positron Emission Tomography
PG Prompt Gamma
PIDE Particle Irradiation Data Ensemble
PTV Planning Target Volume
QALY Quality-Adjusted Life Years
QUANTEC Quantitative Analysis of Normal Tissue Effects in the Clinic
QuartCyl Quarter Cylinder
RBE Relative Biological Effectiveness
RCT Randomised Clinical Trial
RIF Radiation Induced Foci
RMR Repair Misrepair
RNA Ribonucleic Acid
ROS Reactive Oxygen Species
SDD Standard to Record DNA Damage
SOBP Spread-Out Bragg Peak
SSB Single Strand Break
TCP Tumour Control Probability
TPS Treatment Planning Software
VMAT Volumetric Modulated Arc Therapy
Warm Warmenhoven

Abstract

Photons have been used in radiotherapy for many years, and considerable clinical experience has been gained; comparable experience does not yet exist for protons. To apply this experience, and to optimise proton therapy, a dose conversion known as the Relative Biological Effectiveness (RBE) is applied. A constant RBE of 1.1 is in clinical use. However, a number of experimental studies have shown that RBE is not constant, but depends on factors such as Linear Energy Transfer (LET), cell type, and dose. The RBE of 1.1 is based on a number of in vitro studies, but there is significant variance within these data.

It has been estimated that proton RBE ranges from around 1, at the entrance, to around 2.5, at the distal edge. The value of 1.1 has been clinically accepted as a “safe” value, with no signs of significant under- or over-dosing. However, the open question of RBE, and the biologically extended range, can lead to potential degradation in treatment plan quality. For example, proton distal edges are not placed near organs at risk, where RBE is highest. A number of phenomenological models have been developed to encapsulate variable RBE. These models link cell survival parameters between photons and protons, with a scaling from LET. However, the models are fit with the same in vitro data used to derive RBE. The models also, by definition, give no information on underlying mechanisms of variable RBE, aside from implicitly stating that there is increased cell kill at increased LET. Noise in the data used to fit the models could explain the lack of clinical implementation.

Mechanistically, it is believed that cell kill is a result of DNA damage and the efficacy of repair. In particular, the induction of DNA Double Strand Breaks (DSBs) has been identified as the toxic mechanism. By simulating the process of DSB induction and repair, mechanisms can be uncovered. This work presents results of such a methodology. The mechanisms that lead to direct and indirect DNA damage are simulated, with parameters of the mechanisms fitted to experiments on DNA extracts or taken from the literature. The mechanisms are applied to larger biological systems, making predictions of DNA damage at the cellular level. Predictions of DNA damage are correlated with conventional quantities that can be scored in proton therapy, namely dose and LET. This allows the model predictions to be applied to clinically relevant cases, such as in treatment planning software. In all cases, the simulations predict an increase in the yield, complexity, and density of DSBs with LET. This translates to an increase in misrepaired and residual DSBs, i.e. biological effect. The work provides mechanisms for the experimentally observed increase in cell kill with proton depth.

The University of Manchester Nicholas T. Henthorn Doctor of Philosophy “Modelling and Measurement of Simple and Complex DNA Damage Induction by Ion Irradiation” 31st March 2018

Declaration

A portion of the work referred to in the thesis has been submitted in support of an application for another degree or qualification at this University. The work presented in chapters 4, 5, and 6 was jointly authored by myself and J. Warmenhoven, where both authors contributed considerable independent work, and as such the chapters are also included in J. Warmenhoven’s thesis.

Copyright

1. The author of this thesis (including any appendices and/or schedules to this thesis) owns certain copyright or related rights in it (the “Copyright”) and he has given The University of Manchester certain rights to use such Copyright, including for administrative purposes.

2. Copies of this thesis, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has from time to time. This page must form part of any such copies made.

3. The ownership of certain Copyright, patents, designs, trademarks and other intellectual property (the “Intellectual Property”) and any reproductions of copyright works in the thesis, for example graphs and tables (“Reproductions”), which may be described in this thesis, may not be owned by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions.

4. Further information on the conditions under which disclosure, publication and commercialisation of this thesis, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy (see http://documents.manchester.ac.uk/DocuInfo.aspx?DocID=24420), in any relevant Thesis restriction declarations deposited in the University Library, The University Library’s regulations (see http://www.library.manchester.ac.uk/about/regulations/) and in The University’s policy on Presentation of Theses

Acknowledgements

A number of my friends and family are deserving of my thanks and have all contributed to this work in their own way. There are too many to name explicitly, but thanks are due to all the people who were willing to listen to my complaints no matter how insignificant, be it a piece of code not compiling or an abstract deadline approaching a little faster than I’d like. There are of course others who have offered more practical help, most of whom are included as co-authors on my publications. This work would not have been possible if it were not for the friends and colleagues I have made at Manchester and during my time in Surrey. There are a few people I will thank explicitly. Firstly, my supervisors, Ran Mackay, Karen Kirkby, and Michael Merchant. Their guidance throughout this PhD has helped an immeasurable amount. They have always been supportive of my research, have always seen the value, and have opened the door to some fantastic opportunities during my studies. Michael Merchant in particular has dedicated a lot of his time to this, always willing to check code, read drafts, listen to practice presentations, or discuss potential research topics. The lessons I have learnt and experience I have gained are all owed to these people. Secondly, I would like to thank my colleague and friend John Warmenhoven. He, more than anyone else, has had to listen to my complaints, and for that I am eternally grateful.

Preface

This thesis is presented in the alternative format and reproduces, with permission, papers published in academic journals. Whilst related by a common narrative, each chapter in this thesis is presented as a stand-alone investigation and can be read as such. Chapters 2 and 4 have been published as:

Nanodosimetric Simulation of Direct Ion-Induced DNA Damage Using Different Chromatin Geometry Models Published in Radiation Research 2017 Vol. 188 (6) pages 690-703

In Silico Non-Homologous End Joining Following Ion Induced DNA Double Strand Breaks Predicts That Repair Fidelity Depends on Break Density Published in Scientific Reports 2018 Vol. x (x) pages x-x

The Author

B.Sc. Physics, University of Surrey, 2012
M.Sc. Medical Physics, University of Surrey, 2013

I began my PhD studies at the University of Surrey in 2013. In 2015 I transferred my PhD to the University of Manchester, where I joined the PRECISE research group led by Professor Karen Kirkby.

1. Introduction

1.1 Cancer

Cancer is an ever-growing concern for healthcare institutions around the world. In the UK around 2.5 million people were living with cancer in 2015, and it has been predicted that this number will rise to 4 million by 2030 (1).

1.1.1 Incidence and Prevalence

Cancer is largely described as a disease of the elderly, with the rise in incidence attributed to an ageing population. This is illustrated by the data collated by Cancer Research UK (2), shown in Figure 1.1. Although the data are skewed towards the elderly, there is still a significant number of cases within the younger population. Treatment within the younger population is particularly important, where Quality-Adjusted Life Years (QALY) play a major role in treatment efficacy.

Figure 1.1 Cancer incidence in the UK between 2013-2014 for all cancers, male and female data combined. Data replotted from Cancer Research UK (2).

1.1.2 Biology

Malignant cells differ biologically from normal cells: they have defects in the regulatory circuits that govern cell proliferation and homeostasis. The vast number of cancer types makes a single definition of cancer difficult. However, the review paper of Hanahan and Weinberg suggested that there are six generic “hallmarks of cancer” (3), which are outlined and briefly described below. “Self sufficiency in growth signals”: cancer cells require no external stimulation to move from a quiescent state to an actively proliferating state. “Insensitivity to anti-growth signals”: cancer cells can ignore signals to halt the cell cycle. “Evading apoptosis”: cancer cells can bypass programmed cell death. This hallmark, along with the previous two, results in an increasing population of cancer cells through fast proliferation and lack of attrition. “Limitless replicative potential”: whilst most cells can divide only a limited number of times, tumour cells appear to be immortalised. “Sustained angiogenesis”: cancer cells can promote the formation of new vasculature to supply oxygen and nutrients, whilst removing waste. “Tissue invasion and metastasis”: primary tumour masses invade adjacent tissues or travel to distant sites.

The list of hallmarks was updated by Hanahan and Weinberg in 2011 (4), adding new definitions and further evidence for the existing hallmarks. Ultimately these hallmarks describe a group of cells that proliferate at an abnormally fast rate by ignoring cell cycle arrest signals and cell death signals, leading to the accumulation of a mass of abnormal cells and resulting in a solid tumour. The tumour has the ability to invade nearby tissues, or parts of the mass can break away and travel to distant sites. Due to the rapid proliferation of cells the tumour will often outpace the rate of angiogenesis, leading to regions of lower oxygen supply (hypoxia). Treatment options for cancer include surgery, chemotherapy, radiotherapy, or a combination of these.

1.2 External Beam Radiotherapy

Radiotherapy is a common treatment modality in cancer; it has been estimated that around 40% of UK patients should receive radiotherapy at some stage of their illness (5). The treatment is also cost effective, accounting for around 10% of the cancer budget (6). The aim of radiotherapy is to deliver a large amount of energy, i.e. dose, to the tumour whilst minimising the dose delivered to the surrounding healthy tissue. The ideal scenario delivers dose to the tumour with no dose given to healthy tissue. However, physical processes make this scenario impossible for external beam radiotherapy.

1.2.1 In Practice

There are some commonalities in external beam radiotherapy, regardless of modality. The process typically begins with a diagnostic image, from which the tumour volume is identified. Before commencing radiotherapy, the clinical oncologist will identify and delineate the visible part of the tumour, known as the Gross Tumour Volume (GTV). Two margins are added to this volume to form the Clinical Target Volume (CTV) and the Planning Target Volume (PTV) (7). The CTV accounts for microscopic spread whilst the PTV accounts for uncertainties in planning or in the delivery of treatment. The clinician will also delineate any nearby Organs at Risk (OARs). The dose delivered to OARs is strictly controlled, adding constraints to the later stage of dose planning.

A dose is prescribed to the tumour volume. The required dose is informed by evidence and experience, though recommendations exist (8). The dose can deviate from the recommendations at the discretion of the clinical team, for example because of nearby OARs or estimates of the Tumour Control Probability (TCP) and Normal Tissue Complication Probability (NTCP). The total dose is divided into a number of daily fractions, typically of around 2 Gy each (8). The concept of fractionation is based on what has come to be known as the “5 R’s of radiobiology” (9): repair, redistribution, reoxygenation, repopulation, and radiosensitivity. These are factors related to the tumour biology, which together predict a benefit if treatment is fractionated.

Radiation induces both lethal and sub-lethal damage to cells. Time in between doses allows the cells to repair sub-lethal damage (Repair). One characteristic of malignancy is disruption to normal cellular functions, including damage repair pathways. As such, the time given for repair is more beneficial to normal cells than it is to cancerous cells. There is a measured difference in radiosensitivity across the cell cycle, with higher sensitivity in the G2 and M phases (10). Time in between doses increases the chance that cells that were in a radioresistant phase will move into a radiosensitive phase (Redistribution). The tumour oxygen concentration has been linked to treatment efficacy, with hypoxic tumours responding more poorly than normoxic tumours (11). Allowing time between dose deliveries increases the chance that the tumour regains oxygen (Reoxygenation). Radiotherapy causes cell death in both normal and malignant tissues. In response the tissues begin to replace the dead cells (Repopulation). By spreading the treatment course over the time period of repopulation, side effects can be reduced. However, this can result in accelerated growth of the tumour if the malignant cells repopulate at a faster rate than the normal cells. The final “R” refers to Radiosensitivity. This does not in itself argue for fractionation, but instead serves as a reminder that there is an intrinsic difference in radiosensitivity between different tissue types.

The fractions are planned and optimised by the medical physicist using Treatment Planning Software (TPS). The TPS takes the patient image, clinical delineations, OAR tolerances, beam angles, and prescribed dose as inputs. The TPS models the beam interactions analytically, providing an optimised dose map on the planning image. The dose is delivered with regular imaging to assess efficacy and anatomical changes in the patient.

1.2.2 Photons

Conventionally, radiotherapy has been delivered with high energy photons, produced with a linear accelerator (LINAC). Electrons are accelerated through the LINAC by a megavoltage potential towards an X-ray target. The sudden electron deceleration produces photons of comparable energy, which are shaped to match the tumour geometry. In matter, three physical processes determine energy deposition for photons: the photoelectric effect, Compton scattering, and pair production. The photoelectric effect describes the process of photon absorption within the target atom, resulting in the ejection of an orbital electron. Compton scattering is the process by which the photon scatters from orbital electrons in the target atom, transferring some energy in the process. Pair production is the process of photon conversion into an electron and positron pair. Figure 1.2 shows the relative contribution of these processes to dose across a range of photon energies, simulated in liquid water.

Figure 1.2 The relative contribution to dose of photon interactions as a function of primary photon energy simulated in liquid water.
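The energy transferred in a single Compton scatter can be estimated from the standard Compton relation, E' = E / (1 + (E / m_e c^2)(1 - cos θ)), a textbook result that is not derived in this thesis. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes the electron is free and at rest, so electron binding effects are ignored.

```python
import math

# Electron rest energy in MeV (approximate value).
ELECTRON_REST_ENERGY_MEV = 0.511


def compton_scattered_energy(photon_energy_mev, scattering_angle_rad):
    """Scattered photon energy (MeV) from the standard Compton relation.

    Assumes a free electron at rest; binding effects are ignored.
    """
    alpha = photon_energy_mev / ELECTRON_REST_ENERGY_MEV
    return photon_energy_mev / (1.0 + alpha * (1.0 - math.cos(scattering_angle_rad)))


def compton_electron_energy(photon_energy_mev, scattering_angle_rad):
    """Kinetic energy (MeV) transferred to the recoil electron."""
    return photon_energy_mev - compton_scattered_energy(
        photon_energy_mev, scattering_angle_rad
    )


if __name__ == "__main__":
    # Illustrative case: a 1 MeV photon scattered through several angles.
    for angle_deg in (0, 45, 90, 180):
        angle = math.radians(angle_deg)
        scattered = compton_scattered_energy(1.0, angle)
        transferred = compton_electron_energy(1.0, angle)
        print(f"{angle_deg:3d} deg: photon {scattered:.3f} MeV, electron {transferred:.3f} MeV")
```

For a 1 MeV photon scattered through 90 degrees, the relation gives a scattered photon energy of roughly 0.34 MeV, with the remainder carried by the recoil electron. Forward scattering (0 degrees) transfers no energy, and backscattering (180 degrees) transfers the most.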

The photoelectric effect and pair production both result in the complete absorption of the photon, whilst Compton scattering redirects the photon with a reduction in energy. For clinically relevant photon energies (1-10 MeV) the dominant process of dose deposition is Compton scattering. A number of these scattering events may occur until the photon loses enough energy and is eventually absorbed or leaves the patient. Macroscopically photon absorption can be described by the Beer-Lambert law, shown in Equation 1.1.

$$N = N_0 \exp(-\mu t) \qquad (1.1)$$

Where N is the number of photons left unabsorbed after travelling through an absorber of thickness t, N0 is the initial number of photons, and μ is the attenuation coefficient of the absorber. The attenuation coefficient is sometimes quoted as the mass attenuation coefficient, where the value is divided by the density of the absorber. The attenuation coefficient describes the probability that the photon is stopped within the target, with a higher attenuation coefficient corresponding to a greater stopping potential. As photon absorption deposits dose, the Beer-Lambert law predicts an exponential fall-off in dose with depth. It also predicts that, for any finite thickness of absorber, there is a small chance that a photon passes through unabsorbed. The rate of this dose fall-off is dependent on the attenuation coefficient of the material, which is determined by the atomic number, Z, of the target and the photon energy. As a comparison, the mass attenuation coefficient is shown for soft tissue and bone in Figure 1.3, as reported in the NIST database (12).
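Equation 1.1 can be evaluated directly. The following Python sketch is a minimal illustration, not the simulation used for the figures in this thesis. The function names are hypothetical, and the coefficient and density values are assumed, illustrative numbers rather than values taken from the NIST database.

```python
import math


def linear_attenuation(mass_attenuation_cm2_per_g, density_g_per_cm3):
    """Linear attenuation coefficient (1/cm) from the mass attenuation coefficient."""
    return mass_attenuation_cm2_per_g * density_g_per_cm3


def unabsorbed_fraction(mu_per_cm, thickness_cm):
    """Fraction N/N0 of photons left unabsorbed after a given thickness (Equation 1.1)."""
    return math.exp(-mu_per_cm * thickness_cm)


def half_value_thickness(mu_per_cm):
    """Thickness (cm) at which half of the photons remain unabsorbed."""
    return math.log(2.0) / mu_per_cm


# Assumed, illustrative inputs (NOT taken from the NIST database):
# a mass attenuation coefficient of 0.07 cm^2/g in a material of density 1.0 g/cm^3.
MU_PER_CM = linear_attenuation(0.07, 1.0)

if __name__ == "__main__":
    for depth_cm in (0.0, 5.0, 10.0, 20.0):
        fraction = unabsorbed_fraction(MU_PER_CM, depth_cm)
        print(f"depth {depth_cm:5.1f} cm: N/N0 = {fraction:.3f}")
    print(f"half-value thickness: {half_value_thickness(MU_PER_CM):.2f} cm")
```

Because the exponential never reaches zero, the unabsorbed fraction stays positive at any finite depth, which is the small chance of transmission noted above.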

Figure 1.3 The photon attenuation coefficient for soft tissue and bone across a range of primary photon energy. Reported in the NIST material database (12).

Figure 1.3 shows that the greatest photon attenuation occurs at low energies, where the process is dominated by photoelectric interactions (see Figure 1.2). For bone, the various K-edges can be seen, where the photon energy is equal to the binding energy of K-shell electrons. The difference in attenuation between materials is an important consideration for dose planning, where the beam may traverse a series of tissues with different elemental composition. Each voxel in the patient’s diagnostic CT contains information on the photon attenuation, providing a 3D map of photon stopping potential, given as Hounsfield Units (HU). This information is used by the TPS to accurately model the patient geometry and composition, making predictions of photon absorption, and thus the dose.

The measured photon dose profile differs from that predicted by the Beer-Lambert law. As the photon initially impinges upon the target it creates secondary electrons. These electrons are liberated with a given amount of energy, corresponding to a range in the forward direction. This leads to a buildup region of

dose before charged particle equilibrium is established, where the number of particles entering a volume is equal to the number exiting (13). In this buildup region the Kinetic Energy Released per unit Mass (KERMA) is not equal to the dose deposited. The dose buildup phenomenon is a beneficial property of megavoltage beams since it provides a skin sparing effect (14). The difference between KERMA and dose is demonstrated in Figure 1.4 for a simulation of a 10 MeV photon beam in liquid water.

Figure 1.4 The photon dose depth without buildup, i.e. KERMA, calculated with the Beer-Lambert law (green) and the simulated photon dose depth (purple).

An absorber, known as a bolus, can be placed on the surface of the patient to minimise this effect, which may be useful when treating shallower tumours (13). The size of the buildup region is dependent on the energy of the primary photon. Higher energy photons are able to transfer more energy to secondary electrons, which then have a longer range in the target. The higher energy photons also have an increased range since absorption is less likely, though this difference in range is not clinically relevant. Figure 1.5 shows this effect in a water phantom simulated for a range of megavoltage photons.


Figure 1.5 The photon dose depth profiles for 1, 4, 8, and 10 MeV mono-energetic photons. Increased primary photon energy corresponds to a larger buildup region before charged particle equilibrium is established.

The physical processes that determine photon dose deposition are unavoidable. As such, a number of treatment techniques have been developed in order to optimise photon therapy, or to overcome some of the physical constraints (15). Firstly, a number of photon beam angles are used, overlapping at the tumour site, in order to mitigate a large dose buildup near the surface of the patient. This does, however, increase the total volume of healthy tissue receiving low doses. Secondly, the patient is regularly imaged between fractions to take account of anatomical changes, which can affect the dose map, particularly if regions of high HU have moved. Technological innovations have also improved the efficacy of photon therapy by delivering a well-controlled conformal dose. In Intensity Modulated Radiation Therapy (IMRT) the photon beams are modulated to better shape the beam to the tumour geometry. In Volumetric Modulated Arc Therapy (VMAT) the beam is shaped online so that the dose is delivered rapidly in a single arc of the gantry, minimising the effect of motion during treatment and increasing patient throughput.

A large contributing factor to the success of photon radiotherapy, aside from advanced dose delivery techniques, is the years of experience that have been gained over the lifetime of the modality. This experience is regularly collated and presented in publications known as “Quantitative Analysis of Normal Tissue Effects in the Clinic” (QUANTEC) (16). The aim of QUANTEC is to summarise the links between 3D dose volumes and outcome data, with a particular interest in the risk associated with normal tissue irradiation. This has given the dose planner the evidence required to make decisions on dose tolerances, improving overall plan quality.


1.2.3 Particles

In some radiotherapeutic cases the physical dose profile of photons has a significant negative effect, particularly for superficial treatment, volumes very close to OARs, or cases where normal tissue dose needs to be strictly controlled. A number of different radiation modalities exist for cases such as these, using particles rather than photons to deliver the dose. For superficial treatments it is common to use electron beams (13). The electron beam is also generated by the LINAC, with the X-ray target removed, and as such produces beams of megavoltage energy. Since electrons are charged particles the processes determining dose deposition are different to photons. Here, the electrons continuously deposit energy along their path, mainly through scattering events and ionisations. This leads to a larger dose at the surface and a defined range for the electrons, making the modality ideal for skin or surface treatments since little dose is delivered past the range.

Alternatively, ion therapy has been gaining prevalence around the world. Common particles for external beam radiotherapy are carbon ions and protons (17). The ion mainly deposits dose through ionisation events, where the probability of ionisation increases as the ion energy decreases. The ion slows down as it travels deeper into the patient by depositing small amounts of energy, until the energy is low enough that a large amount of dose is deposited, and the ion stops. This leads to a characteristic dose depth profile referred to as the “Bragg curve”, ending with the “Bragg peak”, named for William H. Bragg’s discovery published in 1904 (18). Very little dose is deposited beyond the Bragg peak, making this modality ideal for healthy tissue sparing. However, heavier ions, such as carbon, are more likely to undergo nuclear interactions along their path. This results in the breakup of the ion into lighter ions which have an increased range over the primary particle, resulting in a dose deposition beyond the Bragg peak (19). The effect is smaller for lighter ions such as protons; however, lighter ions tend to produce a slightly broader Bragg peak. The process of dose deposition for ions is similar to electrons; however, the ion mass makes it much less likely to scatter, resulting in relatively linear particle tracks. This has the effect of creating a sharp lateral dose profile. Figure 1.6 shows a comparison of the simulated 1D dose depth profiles for 4 MeV photons, 4 MeV electrons, 150 MeV protons, and 282 MeV/u carbon ions in a liquid water phantom. Each radiotherapy modality’s characteristic dose depth profile has its own advantages, making each useful for the treatment of different clinical cases.


Figure 1.6 The simulated dose depth profiles for mono-energetic photons, electrons, protons, and carbon ions. Dose depth profiles of each modality are normalised to themselves. The dose depth profiles are characteristic for the radiation source, giving each modality advantages for specific clinical cases.

1.3 Proton Beam Therapy

Due to the well-defined range of protons, and the lack of dose beyond the Bragg peak, the modality has been identified as a potentially beneficial treatment option, particularly for paediatric cases, where complications arising from the irradiation of normal tissue are of particular importance. The benefits of proton beam therapy (PBT) were first recognised by Wilson in 1946 (20), with the first patient treated less than a decade later at the Lawrence Berkeley Laboratory, California (21). For around 30 years PBT was only offered in research facilities, until the early 1990s when the first hospital based high energy PBT centre was opened at Loma Linda University Medical Centre, California (22). Since then the number of PBT centres around the world has grown rapidly, with 67 centres operational as of January 2018 (23). Figure 1.7 shows this rapid increase in PBT availability.


Figure 1.7 The number of worldwide proton beam therapy centres in operation over the years, data collated by the Particle Therapy Co-Operative Group (23).

The rapid increase in PBT centres has been attributed to a number of developments in areas including medical physics, clinical results, reimbursement, technology, and changes in PBT equipment (24). Reimbursement in particular may be a major driving factor, since the majority of PBT centres are located in the US. This has incentivised private investment. Due to the wider availability of PBT the number of patients treated with this modality has increased, with the total number of patients treated increasing from around 60,000 by the end of 2007 to around 150,000 by the end of 2016 (25). Furthermore, some increase in the uptake of PBT can be attributed to improvements in medical imaging. State of the art imaging has made identification of malignant and healthy tissues far better, providing more accurate dose planning (26). This is crucial for PBT, where accurate imaging is required since the protons have a definite stopping position.

Currently the UK has one operational PBT centre located at the Clatterbridge Cancer Centre, capable of producing beams with energy up to 60 MeV. UK patients requiring PBT who cannot be treated at Clatterbridge are currently sent abroad. However, two new National Health Service (NHS) high energy centres are under construction, one located at UCLH in London and the other at the Christie NHS Foundation Trust in Manchester, which is due to open in 2018. The list of indications for access to PBT in the UK is largely for paediatric cases and adult head and neck cases (27), though mathematical modelling efforts are underway to offer better patient selection (28). Due to the imminent arrival of high energy PBT in the UK it has become crucial to carry out further research aiming to answer some of the open questions and to better optimise treatment.

1.3.1 Delivery Techniques

Protons are accelerated with a cyclotron or synchrotron and are transported to the treatment room. During this transport the protons are relatively mono-energetic and have a lateral spread of a few millimetres (29). Figure 1.6 shows the dose depth profile of a mono-energetic proton beam, giving maximum dose at a depth of around 15.5 cm, for 150 MeV protons. Clinically this dose depth profile is not particularly useful, since the tumour will have some extent in the depth direction. Changing the proton energy changes the depth of maximum dose, with increasing energy corresponding to an increased depth and vice versa. This follows from the Bethe-Bloch formula, Equation 1.2, which describes the relativistic rate of energy loss per unit length traversed.

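$-\frac{dE}{dx} = \frac{4\pi}{m_e c^2} \times \frac{n Z^2}{\beta^2} \times \left(\frac{e^2}{4\pi\varepsilon_0}\right)^2 \times \left[\ln\left(\frac{2 m_e c^2 \beta^2}{I(1 - \beta^2)}\right) - \beta^2\right]$ (1.2)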

Where $c$ is the speed of light, $\varepsilon_0$ is the vacuum permittivity, $\beta$ is the proton velocity relative to $c$, $m_e$ is the electron mass, $e$ is the electron charge, $Z$ is the atomic number of the projectile ($Z = 1$ for protons), $I$ is the mean excitation potential of the target, and $n$ is the electron density of the target. Higher energy protons travel with greater velocity and as such have a lower rate of energy loss, corresponding to a greater range.

In order to cover a range of depths with a uniform dose a number of pristine Bragg peaks are used, each with a different energy. The combined dose depth profile is referred to as the Spread-Out Bragg Peak (SOBP). For cyclotron accelerators the fixed proton energy can be decreased with use of a degrader, known as a range modulator, or by an energy selection system in the beamline. For synchrotron accelerators the protons can be accelerated to different energies. An analytical determination of the energies and weightings of the pristine Bragg peaks needed to create a uniform SOBP was reported by Bortfeld (30) and added to by Jette and Chen (31), shown in Equations 1.3 and 1.4 respectively.

$R = 0.0022 \times E^{1.77}$ (1.3)

Where R is the proton range in cm and E is the proton energy in MeV.
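Equations 1.2 and 1.3 can be evaluated numerically to illustrate how the rate of energy loss falls, and the range grows, with proton energy. The sketch below is not the simulation used in this work. It assumes liquid water with an electron density of 3.343e29 m^-3 and a mean excitation potential of 75 eV (both assumed values), applies Equation 1.2 without shell or density corrections, and uses hypothetical function names.

```python
import math

# Physical constants.
ELECTRON_REST_ENERGY_MEV = 0.510999          # m_e c^2
PROTON_REST_ENERGY_MEV = 938.272             # m_p c^2
CLASSICAL_ELECTRON_RADIUS_M = 2.8179403e-15  # e^2 / (4 pi eps0 m_e c^2)

# Assumed target properties for liquid water (not taken from this work).
ELECTRON_DENSITY_PER_M3 = 3.343e29  # n
MEAN_EXCITATION_EV = 75.0           # I


def beta_squared(kinetic_energy_mev: float) -> float:
    """Square of the proton velocity relative to c."""
    gamma = 1.0 + kinetic_energy_mev / PROTON_REST_ENERGY_MEV
    return 1.0 - 1.0 / gamma**2


def stopping_power_mev_per_cm(kinetic_energy_mev: float, z: int = 1) -> float:
    """Equation 1.2 for a projectile of charge z in water, in MeV/cm."""
    b2 = beta_squared(kinetic_energy_mev)
    # (e^2 / 4 pi eps0)^2 / (m_e c^2) is rewritten as r_e^2 * m_e c^2.
    prefactor_mev_per_m = (4.0 * math.pi * ELECTRON_DENSITY_PER_M3 * z**2
                           * CLASSICAL_ELECTRON_RADIUS_M**2
                           * ELECTRON_REST_ENERGY_MEV / b2)
    log_argument = (2.0 * ELECTRON_REST_ENERGY_MEV * 1.0e6 * b2
                    / (MEAN_EXCITATION_EV * (1.0 - b2)))
    return prefactor_mev_per_m * (math.log(log_argument) - b2) / 100.0


def bortfeld_range_cm(kinetic_energy_mev: float) -> float:
    """Equation 1.3: proton range in water (cm) for an energy in MeV."""
    return 0.0022 * kinetic_energy_mev**1.77


def integrated_range_cm(kinetic_energy_mev: float,
                        e_min_mev: float = 1.0, steps: int = 5000) -> float:
    """Range from integrating dE / (dE/dx) with the trapezoidal rule.

    The integral starts at e_min_mev because the uncorrected formula is
    unreliable at very low energies; the neglected path length is small.
    """
    h = (kinetic_energy_mev - e_min_mev) / steps
    inverse = [1.0 / stopping_power_mev_per_cm(e_min_mev + i * h)
               for i in range(steps + 1)]
    return h * (sum(inverse) - 0.5 * (inverse[0] + inverse[-1]))


for energy in (100.0, 150.0, 200.0, 237.5):
    print(f"E = {energy:6.1f} MeV   dE/dx = {stopping_power_mev_per_cm(energy):.2f} MeV/cm"
          f"   R(Eq. 1.3) = {bortfeld_range_cm(energy):.1f} cm"
          f"   R(integrated) = {integrated_range_cm(energy):.1f} cm")
```

For 150 MeV protons Equation 1.3 gives a range of about 15.6 cm, in line with the depth of maximum dose quoted for Figure 1.6.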

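$\omega_k = \begin{cases} 1 - \left(1 - \frac{1}{2n}\right)^{1 - 1/p} & k = 0 \\ \left[1 - \frac{1}{n}\left(k - \frac{1}{2}\right)\right]^{1 - 1/p} - \left[1 - \frac{1}{n}\left(k + \frac{1}{2}\right)\right]^{1 - 1/p} & k = 1, \ldots, n - 1 \\ \left(\frac{1}{2n}\right)^{1 - 1/p} & k = n \end{cases}$ (1.4)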

Where $\omega_k$ is the weighting of the $k$th pristine Bragg peak and $n$ is the total number of pristine Bragg peaks used, minus 1. $k = n$ corresponds to the highest energy beam, giving the deepest range. $p$ is a fixed parameter; in the Bortfeld work $p = 1.77$, but it varies in the Jette and Chen work depending on the maximum proton energy and the width of the SOBP. Using Equations 1.3 and 1.4, 13 pristine Bragg peaks are simulated with energies ranging from 200 MeV (R = 26 cm) up to 237.5 MeV (R = 35 cm), shown in Figure 1.8. Increasing the number of pristine Bragg peaks offers the opportunity to increase the width of the SOBP whilst maintaining uniformity; however, this results in an increased dose to the healthy tissue preceding the SOBP.
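The weighting scheme of Equation 1.4 can be sketched in a few lines. The example below computes the weights for 13 pristine Bragg peaks with p = 1.77 and inverts Equation 1.3 to find the energy for each peak. Spacing the peak ranges equally between 26 and 35 cm is an assumption made for illustration, and the function names are hypothetical.

```python
def sobp_weights(n_peaks: int, p: float = 1.77) -> list[float]:
    """Weights of Equation 1.4; index 0 is the shallowest peak, index n the deepest."""
    if n_peaks < 2:
        raise ValueError("at least two pristine Bragg peaks are required")
    n = n_peaks - 1       # n is the number of peaks minus 1
    a = 1.0 - 1.0 / p     # exponent shared by all three cases
    weights = [1.0 - (1.0 - 1.0 / (2 * n)) ** a]            # k = 0
    for k in range(1, n):                                   # k = 1, ..., n - 1
        weights.append((1.0 - (k - 0.5) / n) ** a - (1.0 - (k + 0.5) / n) ** a)
    weights.append((1.0 / (2 * n)) ** a)                    # k = n
    return weights


def energy_for_range_mev(range_cm: float) -> float:
    """Equation 1.3 inverted: proton energy (MeV) giving a range in cm."""
    return (range_cm / 0.0022) ** (1.0 / 1.77)


def peak_ranges_cm(r_min_cm: float, r_max_cm: float, n_peaks: int) -> list[float]:
    """Assumed for illustration: peak ranges equally spaced in depth."""
    step = (r_max_cm - r_min_cm) / (n_peaks - 1)
    return [r_min_cm + k * step for k in range(n_peaks)]


weights = sobp_weights(13)
ranges = peak_ranges_cm(26.0, 35.0, 13)
for k, (w, r) in enumerate(zip(weights, ranges)):
    print(f"k = {k:2d}   R = {r:5.2f} cm   E = {energy_for_range_mev(r):6.1f} MeV   w = {w:.4f}")
```

The weights sum to one because successive terms in Equation 1.4 cancel, and the deepest peak (k = n) carries the largest weight, since the shallower part of the SOBP already receives the entrance dose of the deeper peaks.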

Figure 1.8 The combination of 13 weighted pristine Bragg peaks ranging in energy from 200 MeV (26 cm) to 237.5 MeV (35 cm), providing a flat dose across the SOBP region.

Varying the proton energy provides a method for delivering uniform dose across a given depth. However, to deliver uniformity in the lateral direction mechanical techniques must be used. This is achieved through either passive scattering or spot scanning, where passive scattering is a common form of PBT delivery.

For passive scattering the treatment head, or nozzle, is used to shape the mono-energetic proton beam, arriving from the accelerator, both laterally and distally, providing a 3D shaped beam. There are four techniques to spread the beam laterally in passive scattering: flat scattering, contoured scattering, double scattering with dual ring, and double scattering with occluding ring (29). Flat scattering is the simplest technique, requiring a single flat scatterer to spread the beam into a Gaussian-like profile followed by collimation. To maintain dose uniformity over the lateral spread most of the beam must be collimated, requiring large beam currents and as such resulting in a high production of secondary neutrons. Larger proton fields require a second scatterer. Following the scatterer(s) the beam is collimated and shaped with a custom compensator, which is unique for every clinical case. For passive scattering PBT the range of the protons is controlled by passing the beam through a range modulator or a ridge filter.

For spot scanning techniques, as will be implemented at the Christie, the beam is steered in the lateral direction with magnetic fields. The beam, arriving at the treatment room, has a Gaussian-like lateral profile with a spread of a few millimetres. The beams are often referred to as pencil beams, due to the high degree of focussing. A series of pencil beams are arranged laterally to sum to a flat lateral dose profile, with the highest weight given to the pencil beams at the edges so that the lateral dose fall off is Gaussian. For shallow tumours the lateral dose fall off, or penumbra, is larger with spot scanning than with passive beam scattering (32). However, the penumbra can be minimised with further optimisation of the spot beam weights and collimation (33,34). The lateral dose fall off is a particularly important metric in PBT, discussed more in Section 1.3.3.
For spot scanning PBT the range of protons is controlled by reducing the energy of the accelerator and, for the treatment of shallow tumours, with a beam degrader.

Similarly to photons, PBT is delivered with a rotating gantry, allowing for multiple field angles, with beams overlapping at the tumour site. This is particularly useful for spot scanning delivery techniques in order to create Intensity Modulated Proton Therapy (IMPT). Here, each pencil beam can be weighted to create an inhomogeneous dose map for the field, which, when combined with other fields, creates a uniform dose map with lower integral dose (35).

The technological advances in dose delivery for photon radiotherapy are in some ways far more advanced than in PBT, making photons a better treatment option for a number of cases. For example, to date there is no clinical implementation of VMAT in PBT. Changing the proton energy, in order to deliver uniform dose in depth, currently requires too much time to make VMAT a viable option, with cyclotron energy switching taking between 0.5 and 1 s (36). In order to achieve faster energy switching new accelerator technology must be developed, such as the fixed field alternating gradient accelerator (37). Alternatively, passive beam scattering with

multi-leaf collimation can be used for proton VMAT; however, this leads to a large number of secondary neutrons and a long delivery time due to range modulation (38).

1.3.2 Benefits

From comparison of the dose depth profiles in Figure 1.6, the advantages of protons when compared to photons are immediately apparent: reduced dose preceding the tumour and no dose after the tumour. However, comparison of the 1D profiles is unfair and misleading. No clinical plan would use a single mono-energetic photon or proton beam for dose delivery. Demonstrating differences in the physical dose map between modalities requires a 3D representation. Figure 1.9 shows a comparison of dose plans for pencil beam scanning PBT and photon VMAT, presented in a recent review by Baumann et al. (15) (reproduced with permission from Nature Reviews Cancer).

Figure 1.9 Dose maps for the treatment of Glioma with proton Pencil Beam Scanning (PBS) and photon Volumetric Modulated Arc Therapy (VMAT) presented by Baumann et al. (15).

Figure 1.9 shows how, for this case of Glioma treatment, PBT can offer an improved normal tissue sparing, particularly to the right hemisphere of the brain. However, it is possible that the precision of proton depth dose may have a negative effect. The sharp distal fall off in dose, resulting from the Bragg peak, means that there is no dose given just past the PTV. If the margins added by the PTV are of insufficient size to completely cover areas of microscopic spread, then it is possible that local recurrence will occur. This issue is minimised in photon therapy, since the physics governing dose depth leads to a dose wash covering surrounding areas with low dose. However, there is no clinical evidence to support this hypothesis. The biggest advantage of PBT is the favourable dose depth profile. However, on top of this advantage is the phenomenon of increased cell kill. When matching the physical dose of photons, protons produce an increased cell kill. This can be

advantageous and disadvantageous, particularly since the phenomenon presents a challenge when we try to translate our current experience learned through photon therapy. The effect is discussed in more detail in Section 1.4. The effect of increased cell kill carries over to hypoxic cells, quantified through the Oxygen Enhancement Ratio (OER). OER is defined as the ratio of the radiation doses required under hypoxia and under normoxia to produce the same biological effect. For photons the OER is around 2.7 (39). In order to properly treat the cancer, it is necessary to use steep dose gradients between hypoxic and normoxic subvolumes, which, due to the physics of photon dose deposition, would result in large doses to surrounding normal tissue. These steep dose gradients can perhaps be achieved more readily in PBT, and with other ions, due to the Bragg peak. Furthermore, the OER is decreased for higher Linear Energy Transfer (LET) particles (40), discussed more in Section 1.3.3.

Despite the expected benefits of PBT, and the number of years that the modality has been used, there remains little clinical evidence to suggest benefit over conventional radiotherapy techniques (41–47). For new medical treatments Randomised Clinical Trials (RCT) are seen as the gold standard for determining clinical benefit. However, some researchers suggest that random selection of cohorts may only show minimal, undetectable differences, and that modelling approaches should instead be used to identify the patients for whom the maximum benefit would be seen (28,41). Here, it is suggested that selection of the PBT cohort should be restricted to cases where reduction in normal tissue toxicity forms the primary indication. The UK Clinical and Radiotherapy Translational group (CTRad) has recently laid out an eight point framework to address the challenge of generating high quality evidence for PBT (48).
Others have argued that the benefits of PBT are clear from the physical dose depth profile alone, and that denying the modality to patients is unethical. For this argument analogies are often drawn with other medical interventions that show an obvious benefit. Commonly cited for this argument is the work by Smith and Pell (49) investigating the RCT of parachute effectiveness, where the authors state that “Individuals who insist that all interventions need to be validated by a randomised controlled trial need to come down to earth with a bump”.

1.3.3 Linear Energy Transfer

Macroscopically the effect of radiation has been described in terms of dose, the amount of energy deposited per unit mass. However, unlike photons, particles deposit energy continuously along their path, described by the Bethe-Bloch formula (Equation 1.2). Differences in biological effect between photons and protons are thought to be related to this difference in energy deposition. Quantifying the spatial distribution of these energy deposition events and their biological effect became the focus of many early radiobiologists, including Lea (50) and Zirkle (51) who coined

the term Linear Energy Transfer (LET). The field of study became known as microdosimetry, and was furthered largely by the extensive work of Rossi (52,53). Microdosimetry starts from the basis that delivering dose to cells leads to cell death, and that this dose deposition is a stochastic process at the micron scale. Therefore, to understand radiotherapy efficacy, the ability to kill cells, it is important to understand the process of dose deposition at the cellular scale. Microdosimetry, and the parameters of note, were collated into a report by the International Commission on Radiation Units (ICRU) (54).

At its simplest definition, LET is the amount of energy deposited per unit length, conventionally given in units of keV/μm. The difficulty comes when trying to average the LET from many particles crossing a volume. There are two main techniques for averaging LET, track averaged (LETt) and dose averaged (LETd). For LETt the energy deposition per unit length of each primary particle is measured and finally averaged, giving the arithmetic mean. For LETd the energy deposition per unit length of each primary particle is weighted according to the amount of local dose deposited in the volume before averaging. If the energy of secondary particles is sufficient for them to escape the locality then their energy is subtracted, giving the restricted LET. If the energy of all secondary particles is included in the calculation, regardless of their range, the result is the unrestricted LET, which is equal to the electronic stopping power. For ions, the chance of energy deposition increases with decreasing primary energy. Since the ion is continuously losing energy along its path the result is an increase in ionisations near the end of range, giving an increasing LET with depth.
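The difference between the two averaging schemes can be illustrated with a small sketch. The LET values below are hypothetical, chosen only to show the behaviour, and the function names are not taken from any simulation toolkit. Each particle is assumed to cross the same path length in the scoring volume, so that the dose it deposits is proportional to its LET.

```python
def track_averaged_let(let_values: list[float]) -> float:
    """LETt: arithmetic mean of the LET of each primary particle."""
    return sum(let_values) / len(let_values)


def dose_averaged_let(let_values: list[float], doses: list[float]) -> float:
    """LETd: LET of each primary particle weighted by the dose it deposits."""
    return sum(l * d for l, d in zip(let_values, doses)) / sum(doses)


# Hypothetical LET values (keV/um): three fast protons and one near end of range.
let_kev_per_um = [0.5, 0.5, 0.5, 5.0]

# Assumption: equal path lengths in the volume, so dose is proportional to LET.
dose_weights = list(let_kev_per_um)

let_t = track_averaged_let(let_kev_per_um)
let_d = dose_averaged_let(let_kev_per_um, dose_weights)
print(f"LETt = {let_t:.2f} keV/um, LETd = {let_d:.2f} keV/um")
```

Because the single high LET particle deposits most of the dose, it dominates LETd, whereas it counts only once in LETt. Under these assumptions LETd is never smaller than LETt.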

An example simulation scoring both LETt and LETd is shown for a 150 MeV proton beam crossing a water phantom, Figure 1.10, averaged over a number of particles.


Figure 1.10 A 150 MeV proton dose depth profile showing both track averaged Linear Energy Transfer (LETt) and dose averaged Linear Energy Transfer (LETd) simulated in a water phantom.

The LET values shown in Figure 1.10 become particularly noisy at the end of the proton range due to low proton fluence. The low LET as the proton enters the target corresponds to a diffuse pattern of energy deposition, whilst the high LET at the end of range corresponds to denser energy depositions. It is believed, and has been shown experimentally, that the LET has a considerable effect on the biological response of cells, discussed in Section 1.4. This effect has been shown clinically by Peeler et al. (55), where follow up MRI images showed a correlation between regions of elevated LET and tissue toxicity. Due to these differences in biological effect with LET it has become of interest to quantify this value in treatment planning (56), where the LET is determined through simulation or calculated analytically (57). LET and the increased biological response is particularly important when considering hypoxia. As the LET increases the OER decreases, requiring less dose for the same cell kill between hypoxia and normoxia (58). The mechanism behind this effect is thought to be the relative contribution to cell kill between direct and indirect damage, with high LET radiation causing mostly direct damage, discussed further in Section 1.5. There is a slight decrease in the OER for protons relative to photons, however the effect is significantly more for higher LET particles, such as carbon or oxygen (59). It is thought that further optimising PBT dose plans with LET will lead to greater treatment efficacy. For example, it is possible to move regions of elevated LET from OARs without significantly affecting the dose map (60).

1.3.4 Uncertainties and Areas of Research

Since it is a relatively young treatment modality there are a number of uncertainties, and areas of active research, in PBT. These areas include, for example, range, setup, motion, biology, and technology. Since the proton deposits the majority of its energy at the end of range, the Bragg peak, any uncertainty in the range can result in the high dose being delivered outside of the tumour volume or delivering a non-uniform dose across the tumour. If this is a systematic error then it is likely that the tumour will be underdosed, instead delivering high doses to normal tissue preceding or posterior to the tumour. Uncertainty in proton range can emerge from a number of factors, including calculation of proton stopping powers and anatomical changes. The diagnostic or planning image of the patient is often a CT scan, giving voxel by voxel data on the photon stopping potential relative to water, the Hounsfield Unit (HU). However, the process of energy deposition for protons is different to photons and as such the HU of the CT scan must be converted into proton stopping powers. The protons deposit energy through Coulombic and nuclear interactions, making it necessary to know both the electron density and elemental composition in order to calculate stopping powers. However, tissues of similar HU can have different elemental makeup (61). For this reason stoichiometric conversion of the HU to proton stopping powers is used (62), estimating the response of biological tissues from the response of reference samples. This calibration method leads to uncertainties in the range of around 1% for soft tissue and 2% for bone (63). Similarly, range uncertainties can arise from changes in patient weight (64) and variable organ filling (65). Due to these range uncertainties it is uncommon to use the proton Bragg peak close to OARs, instead using the lateral edge. This results in the most advantageous feature of protons going unused.
Furthermore, to account for the range uncertainty extra margins are applied to the tumour delineation, adding millimetres of potential normal tissue irradiation (66). As such it is of interest to determine the in vivo range during treatment. Proton CT has been proposed as a technique for minimising range uncertainty (67); using protons instead of photons to image the patient provides a map of proton stopping power, removing uncertainties in the conversion of HUs. There are two other main techniques proposed for minimising range uncertainty, Positron Emission Tomography (PET) (68) and Prompt Gamma (PG) (69). Both of these techniques involve measurement of the products of proton nuclear reactions, the intensities of which are correlated to proton depth.

Similar to the uncertainties arising from range, and adding to these uncertainties, are patient setup and motion during treatment. Setup errors arise from the requirement that patient positioning must be the same for each dose fraction. Misalignment can lead to random or systematic errors. For random setup errors the chance increases with the number of fractions, though the effect from one fraction

can be minimal relative to the total dose delivered. Systematic errors pose a bigger risk and, if not resolved, can result in a significant underdosing of the tumour and overdosing of normal tissue. Similarly, patient motion or organ motion can contribute to this effect, where longer treatment times increase the chance of the effect, implying a benefit for faster treatment, such as proton VMAT (discussed in Section 1.3.1). Both setup and motion uncertainties can result in changes to the expected proton range, as the beam passes through tissues of density or elemental makeup that may not have been expected by the TPS. This effect is more significant in PBT than with photons due to the proton Bragg peak, shown in Figure 1.11, where photon and proton dose depths are simulated in a water phantom (density = 1.0 g/cm³) with a less dense region between 13 and 17 cm (density = 0.8 g/cm³). The density inhomogeneity shown in Figure 1.11 is exaggerated to show the effect more clearly. Range uncertainty due to HU conversion, patient setup, and anatomical changes has previously been reviewed by Paganetti (70) and effects on treatment planning are discussed by McGowan et al. (71).

Figure 1.11 Simulated dose depth profiles for photons and protons in water (density = 1.0 g/cm³) with a less dense region (shaded area, density = 0.8 g/cm³). Dashed lines show expected profiles for homogeneous density. Solid lines show profiles simulated including the density heterogeneity.
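The size of the range shift caused by a density heterogeneity can be estimated with a simple water-equivalent thickness argument. The sketch below is not the simulation behind Figure 1.11. It assumes that stopping power scales linearly with density (water of reduced density, identical composition), and it assumes a range in water of 15.6 cm, corresponding to the 150 MeV beam discussed for Figure 1.6, since the beam energy for Figure 1.11 is not stated here. The function name is hypothetical.

```python
from typing import Optional


def stopping_depth_cm(range_in_water_cm: float,
                      slabs: list[tuple[float, float]]) -> Optional[float]:
    """Physical depth at which a proton stops in a stack of slabs.

    slabs lists (thickness_cm, density_relative_to_water) from the surface.
    Assumes stopping power scales linearly with density. Returns None if
    the proton exits the stack.
    """
    remaining_water_equivalent = range_in_water_cm
    depth = 0.0
    for thickness, relative_density in slabs:
        slab_water_equivalent = thickness * relative_density
        if remaining_water_equivalent <= slab_water_equivalent:
            return depth + remaining_water_equivalent / relative_density
        remaining_water_equivalent -= slab_water_equivalent
        depth += thickness
    return None


ASSUMED_RANGE_CM = 15.6  # assumed: roughly a 150 MeV proton in water

homogeneous = stopping_depth_cm(ASSUMED_RANGE_CM, [(30.0, 1.0)])
heterogeneous = stopping_depth_cm(
    ASSUMED_RANGE_CM, [(13.0, 1.0), (4.0, 0.8), (13.0, 1.0)])
print(f"Homogeneous water: {homogeneous:.2f} cm")
print(f"With the 0.8 g/cm3 region between 13 and 17 cm: {heterogeneous:.2f} cm")
```

Under these assumptions the Bragg peak moves several millimetres deeper, whereas the same change in density only slightly alters the exponential photon attenuation.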

Further to the proton range uncertainty is biological uncertainty, discussed in more detail in Section 1.4. The aim of radiotherapy is to kill cancer cells. However, there is currently no quantity to directly score biological effect, instead dose is used as a surrogate. The link between dose and biological effect is not straightforward, evidenced by the fact that different radiation modalities produce different biological effect at the same physical dose. It is of interest to determine the required dose to produce the same biological effect for a given modality when compared to another

modality. For example, from over 100 years of delivering radiotherapy with photons there exists a wealth of knowledge and experience on optimal dose prescriptions. To utilise this experience the photon dose must be converted into a proton equivalent dose. This is achieved through the Relative Biological Effectiveness (RBE), which is clinically set to a static value of 1.1 (72). This states that protons are 10% more effective at cell kill than photons at the same dose, meaning that the prescribed proton dose can be reduced by a factor of 1.1 (around 9%) relative to photons. However, it is well known that a constant RBE of 1.1 is a simplification and that RBE varies with a number of factors (73). There is no clinical evidence to suggest that a constant RBE is unreasonable, nor is there evidence to suggest that a constant RBE is reasonable (29). As such RBE = 1.1 is deemed a safe value, which does not lead to a significant under- or over-dosing of the tumour. Neither has there been evidence to suggest that the static RBE leads to a significant over-dosing of healthy tissue. However, further understanding of RBE has the potential to maximise the benefits of PBT.

1.4 Relative Biological Effectiveness

1.4.1 Definition

RBE is defined as the ratio of doses between a reference radiation and a test radiation in order to produce the same biological effect or endpoint, shown by Equation 1.5.

$$ \mathrm{RBE} = \left. \frac{\mathrm{Dose}_{\mathrm{Reference}}}{\mathrm{Dose}_{\mathrm{Test}}} \right|_{\text{Iso-Effect}} \tag{1.5} $$

For PBT it is of interest to determine the RBE with respect to photons, since this is the modality most commonly used in radiotherapy. Ideally, the reference radiation and the proton beam are compared for a clinical region of interest, i.e. the proton beam in a region of the patient and the photon beam at the same region. However, for practical reasons it is common for studies to compare positions in a SOBP to either a ⁶⁰Co, ¹³⁷Cs, or 250 kVp photon source. The assumption here is that, for a given dose, all photons produce the same biological effect. However, studies have shown differences in RBE depending on photon energy (74,75). Since there is no standard for the definition of the reference radiation it becomes important for researchers to specify their methodology in detail.

The biological endpoint of interest varies between publications, though for radiotherapeutic relevance it is usually defined as cell survival. Here, a group of cells is irradiated in vitro at a range of doses with a given radiation type. The surviving fraction is measured at each dose, through the clonogenic cell survival assay (76), to produce a plot of surviving fraction against dose. The RBE is quantified by comparing the doses required by each radiation modality to achieve the same cell kill. Figure 1.12 shows an example of this, with data replotted from Belli et al. (77), for the irradiation of V79 cells by protons and photons, with 10% survival marked.

Figure 1.12 V79 cell survival fraction as a function of dose for protons and photons, showing the doses for 10% survival. Data reproduced from Belli et al. (77).

For the data shown in Figure 1.12, the doses required for 10% survival are 4.4 and 5.8 Gy for protons and photons respectively. This gives an RBE value of 1.3, higher than the clinically implemented value of 1.1. It is worth remembering, though, that this is only one comparison, whereas 1.1 is derived from a number of studies (discussed in Section 1.4.3). Figure 1.12 also highlights an important factor when determining RBE, namely the survival fraction, or dose, at which RBE is calculated. The data in Figure 1.12 only show the cell kill from X-rays and 3 MeV protons, yet even with that limited data set the RBE can vary between around 1.9 and 2.8, depending on the survival fraction used for calculation. Alternatively, RBE can vary between 1.0 and 4.0 depending on the dose at which RBE is calculated. Clinically, radiotherapy is delivered in 2 Gy fractions, making RBE calculated as the ratio between survival fractions at 2 Gy a clinically relevant metric, rather than the ratio of doses for a given survival fraction. The change in RBE with biological endpoint, for the data in Figure 1.12, is shown in Figure 1.13.
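As a worked instance of Equation 1.5 using the 10% survival doses quoted above (the intermediate rounding is shown only for illustration):

```latex
\mathrm{RBE}_{10\%}
  = \frac{D_{\text{photon}}}{D_{\text{proton}}}
  = \frac{5.8\ \mathrm{Gy}}{4.4\ \mathrm{Gy}}
  \approx 1.32 \approx 1.3
```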


Figure 1.13 a) Relative Biological Effectiveness (RBE) values, between X-ray and 3 MeV protons, calculated at different survival fractions (x axis) of irradiated V79 cells. b) Relative Biological Effectiveness (RBE) values, between X-ray and 3 MeV protons, calculated at different matching doses (x axis) of irradiated V79 cells. Calculated from the data reported by Belli et al. (77).

Studies measuring RBE will often quote terms of the fit applied to the survival data, making it easy to determine the RBE at any survival fraction or dose. The most ubiquitous fit is the Linear Quadratic (LQ) model, discussed further in Section 1.6.1. The LQ model fit is shown in Equation 1.6.

$$ S = \exp\left(-\alpha d - \beta d^{2}\right) \tag{1.6} $$

Where S is the survival fraction, d is the dose, and α and β are parameters of the fit. The dose required for a given survival fraction is then given by Equation 1.7.

$$ d = \frac{\alpha - \sqrt{\alpha^{2} - 4\beta \ln(S)}}{-2\beta} \tag{1.7} $$

And the RBE between photons, x, and protons, p, for a given survival fraction, S, at a single dose delivery is given by Equation 1.8.

$$ \mathrm{RBE} = \frac{d_x}{d_p} = \frac{\alpha_x - \sqrt{\alpha_x^{2} - 4\beta_x \ln(S)}}{-2\beta_x} \times \frac{-2\beta_p}{\alpha_p - \sqrt{\alpha_p^{2} - 4\beta_p \ln(S)}} \tag{1.8} $$

For the following sections all RBE are calculated as the ratio of doses required for 10% survival, unless stated otherwise.
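Equations 1.6 to 1.8 can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code used in this work; the function names and the LQ parameter values are hypothetical and chosen only to show the calculation.

```python
import math


def lq_survival(dose, alpha, beta):
    """Surviving fraction from the LQ model, S = exp(-alpha*d - beta*d^2) (Equation 1.6)."""
    return math.exp(-alpha * dose - beta * dose ** 2)


def lq_dose(survival, alpha, beta):
    """Dose (Gy) giving a chosen surviving fraction (Equation 1.7, positive root)."""
    if not 0.0 < survival <= 1.0:
        raise ValueError("survival must be in (0, 1]")
    if beta == 0.0:
        # Purely linear response, S = exp(-alpha*d); Equation 1.7 is undefined here
        return -math.log(survival) / alpha
    discriminant = alpha ** 2 - 4.0 * beta * math.log(survival)
    return (alpha - math.sqrt(discriminant)) / (-2.0 * beta)


def rbe_iso_effect(survival, alpha_x, beta_x, alpha_p, beta_p):
    """RBE as the photon-to-proton dose ratio at equal survival (Equation 1.8)."""
    return lq_dose(survival, alpha_x, beta_x) / lq_dose(survival, alpha_p, beta_p)


# Hypothetical LQ parameters (Gy^-1, Gy^-2), for illustration only
ALPHA_X, BETA_X = 0.15, 0.04   # photons
ALPHA_P, BETA_P = 0.30, 0.04   # protons

rbe_10 = rbe_iso_effect(0.10, ALPHA_X, BETA_X, ALPHA_P, BETA_P)
print(f"RBE at 10% survival: {rbe_10:.2f}")
```

Evaluating `rbe_iso_effect` over a range of survival fractions reproduces the kind of endpoint dependence shown in Figure 1.13 a), for whichever fit parameters are supplied.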

1.4.2 Experimental Data

A large number of experimental studies have been carried out to determine proton RBE. These studies were reviewed by Paganetti in 2002 and again in 2014 (73,78), with the 2014 review analysing the data from 76 published reports. Further to this is the Particle Irradiation Data Ensemble (PIDE), an open database collating a number of published datasets on ion RBE (79). The PIDE database contains 56 paired data points for cell survival between protons and photons, at various energies and in various cell lines. Trends have been observed by analysing the data en masse; for example, a dependence of RBE on LET has been observed. Figure 1.14 shows the RBE for every ion in the PIDE database as a function of LET.


Figure 1.14 Relative Biological Effectiveness (RBE) values across an LET range for different ion species in a variety of cell lines. Data replotted from the PIDE database (79).

Figure 1.14 shows a peak in the RBE at around 100 keV/μm, after which the RBE begins to decrease. This is often referred to as the “overkill effect”, where a single particle deposits more energy than is required to kill a cell, with the excess energy deposition causing a decrease in effectiveness. Specifically for protons, there seems to be an almost linear increase of RBE with increasing LET. However, there is a large variability in the experimental data reported, particularly at higher LET, making quantitative correlations such as this tenuous. Figure 1.15 shows the experimental data, summarised in the Paganetti 2014 review (73), with respect to proton LETd. A linear fit between RBE and LET is shown, along with a constant RBE = 1.1.


Figure 1.15 a) Relative Biological Effectiveness (RBE) measured in a number of cell lines, grouped by species, across an LET range for protons. Data replotted from the Paganetti 2014 review (73). Solid line shows linear best fit. Shaded blue area shows ±1σ on this fit. Dashed red line shows RBE = 1.1. b) Average RBE in 2 keV/μm LET bins; error bars show the standard error in the mean, and the same fit from a) is shown. Note, some data in a) are not shown with the current RBE axis range.

There are three possible reasons for the noise in the reported data: there is a biologically relevant factor that is not captured by LET alone, there are uncertainties introduced by the experimental methodology, or there is no simple link between RBE and LET. The last point can potentially be generalised to say that there may be no conversion between photon dose and proton dose. In the review by Paganetti (73), fits such as those presented in Figure 1.15 are produced with two methods. Firstly, fits are generated with equal weighting applied to each data point (as in Figure 1.15). Secondly, fits are produced weighted according to the reported uncertainty on α, β, and LET. These methods give significantly different correlations, highlighting the significance of experimental uncertainty. However, Paganetti points out that many of the quoted uncertainties are under- or over-estimated, or not given. The stochastic nature of radiation induced cell death and the experimental specificity required to measure this effect are likely the dominating factors leading to the variance seen between studies.

Some of the spread in experimental data may be explained by differences in radiosensitivity between the cell lines. The data shown in Figure 1.15 are for cell lines grouped by species. These cells have different radiosensitivity, quantified by the ratio of LQ parameters α/β. The α/β ratio gives the dose at which the linear and quadratic components of cell killing are equivalent. Below this dose the cell kill to dose relationship is dominated by the linear term; above it the cell kill is dominated by the quadratic term. The α/β ratio is also used to group cells by radiosensitivity and determine optimal fractionation schedules, with low α/β referring to late responding tissues and high α/β referring to early responding tissues. The data from Figure 1.15 are replotted, but sorted into α/β ratio bins, in Figure 1.16.
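The interpretation of α/β as the dose at which the linear and quadratic contributions are equal follows directly from the exponent of Equation 1.6. As a short worked step (the symbol d* is introduced here only for illustration):

```latex
\alpha d^{*} = \beta \left(d^{*}\right)^{2}
\quad\Longrightarrow\quad
d^{*} = \frac{\alpha}{\beta}
```

For example, a hypothetical cell line with α = 0.3 Gy⁻¹ and β = 0.03 Gy⁻² would have α/β = 10 Gy.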

Figure 1.16 Relative Biological Effectiveness (RBE) reported in the literature, measured in a number of cell lines. Top row shows cells grouped by proton α/β, bottom row shows cells grouped by photon α/β. Data replotted from the Paganetti 2014 review (73). Solid lines show fit with LET. Dashed line shows RBE = 1.1. Symbol type and colour show species, the same as for Figure 1.15.

Arguably, isolating the RBE measurements according to radiosensitivity (Figure 1.16) shows a slightly improved link between RBE and LET, although the data still show considerable noise. This is seen more when cells are isolated according to the photon α/β ratio (Figure 1.16, bottom row). The α and β parameters describe the cellular radiosensitivity as a function of dose, and do not necessarily have an LET component. For example, two cell lines with the same α/β ratio may respond to dose in the same manner but may exhibit different responses with respect to LET. The link between LET and RBE is best observed from the PIDE data, Figure 1.14, when looking across all the ion species. Here, an increase of RBE can clearly be seen. However, at the maximum RBE, around 100 keV/μm, the reported RBE has a maximum of around 7.5 and a minimum of around 1.5, showing the magnitude of the experimental variance.

For an idealised dose of photons there is an expected cell kill, with the cell kill depending on the sensitivity of the cell and the stochastic nature of microscopic dose deposition. The same can be said of protons, except here the microscopic dose deposition depends on the proton LET. Therefore, it is to be expected that there is an average RBE, for a given photon and proton beam, with the variance depending on the fluctuations of microscopic dose deposition. This is an unavoidable variance that results from physics. On top of this is the uncertainty added through the experimental technique: cell culture, dose delivery, and accuracy of measurement. Clearly the variance introduced by the experimental technique dominates the final result, with different publications reporting considerable differences in RBE for the same experimental setup. The data collated by Paganetti (73) contain a number of these cases. Uncertainty due to physics can be minimised through a large number of repeats, usually accounted for by irradiating a large sample of cells. It is difficult to minimise experimental uncertainties in the same manner due to the impracticality of performing a large number of repeats. These uncertainties may also be systematic, due to methodology.

1.4.3 RBE = 1.1

Clearly a static RBE does not capture the trends in the experimental data. Considering the different nature of microscopic dose deposition between photons and protons, and in particular how the microscopic dose deposition changes with proton LET, it would be surprising if there were a simple conversion of biological effect between the two. This fact alone implies a variable RBE. Paganetti calculated the average RBE at various positions in the SOBP, suggesting RBE ~ 1.1 in the entrance region, RBE ~ 1.15 at the midpoint, RBE ~ 1.35 at the distal edge, and RBE ~ 1.7 at the distal fall off (73). However, there are a few rationales and even benefits for a generic RBE, as well as some clinical implications, which are discussed in Section 1.4.4.

Primarily, the scale of uncertainties in RBE measurements is too large to propose tissue specific values (80). Paganetti et al. (78) calculated that determining RBE with 80% power would require hundreds of animal experiments, for one tissue type, one endpoint, one dose, and one LET. For clinical implementation this would then require independent verification, investigation of different species, and supporting clinical evidence. On top of the required further investigation and verification are the practicalities of clinical implementation. If RBE values were well known, then delineation of tissues according to radiosensitivity would be required at the radiotherapy planning stage.

It would seem that a generic RBE = 1.1, taken as the average from early in vivo studies (73), has not led to any clinically relevant side effects. The value has been deemed safe, with no evidence that there has been over- or under-dosing of the tumour or significant biological effect in nearby healthy tissue (81). This generic RBE has benefits when considering the comparison of treatments between centres and in clinical trials. With each centre adopting RBE = 1.1, comparisons are straightforward and treatment becomes somewhat standardised, with variations depending only on differences in planning technique, e.g. the number of beams and angles.

1.4.4 Clinical Implications

As mentioned, there has been no clinical evidence to suggest that RBE = 1.1 is incorrect, nor that it is correct. The scale of RBE variation may be small, yet still significant, across the treatment area, with average values ranging from around 1.1 at the entrance region to around 1.35 at the distal edge (73). There is, however, a much more significant increase at the distal fall off. When considered alongside proton range uncertainties there is a clinical concern over placing Bragg peaks abutting OARs, since the biological effect of the dose is unknown, yet assumed to be significant. As with range uncertainty, this concern leads to the most beneficial aspect of PBT, the sharp dose fall off, going unused. The increased biological effect at the distal fall off leads to greater cell kill than would be expected from the low dose; this has been termed biological range extension. Even if the range of the proton beam were known precisely there is an extension due to the biological effect, unless the distal beam is appropriately weighted.

It is apparent that clinical practice has been affected by the uncertainty over RBE, particularly regarding the proton distal edge. Possibly the most controversial clinical example of handling this uncertainty comes from the treatment of tumours adjacent to the brainstem. Here, a suggested plan proposes stopping protons after the brainstem, delivering a high, yet biologically better understood, dose to an OAR and depositing uncertain biological dose to less sensitive areas (82). This, of course, is an extreme case arising from RBE uncertainty. Normal clinical procedure handles the RBE uncertainty by ensuring that Bragg peaks do not abut OARs, resulting in the most advantageous dosimetric characteristic of protons going unused.

A recent commentary by Underwood and Paganetti (83) acknowledges the difficulties associated with clinical implementation of RBE, and discusses future directions to consider. In the meantime, Underwood and Paganetti suggest following the course of “no cost” LET optimisation within treatment planning, where macroscopic dose profiles are unaltered, but areas of high LET are moved away from OARs. This is also the recommendation from many of the phenomenological models of RBE and recent work by McMahon et al. (84). Others, such as Jones et al. (85), suggest clinical trials between RBE = 1.1 and tissue specific RBEs, following much more preclinical radiobiological work. A review of the more recent experimental data, by Ilicic et al. (86), has reached similar conclusions on variable RBE, suggesting the continuation of experiments irradiating normal and tumour cells at different points in the SOBP.

If the RBE were better understood then not only could the proton Bragg peak be used, but there would also be the potential for lower integral dose to healthy tissue. By scaling the peak RBE weighted dose, which is usually higher than for a constant RBE of 1.1, there is potential for dose reduction in the healthy tissue in the Bragg curve buildup region.

1.4.5 Phenomenological Models

Due to the potential benefits of solving the RBE question, and aided by the large amount of experimental data, a number of phenomenological models have been developed. Examples include the models by Carabe et al. (87), Wedenberg et al. (88), and McNamara et al. (89). Generally, these models begin with the LQ model of cell survival and scale the photon radiosensitivity, (α/β)x, with the proton LET. The interest in scaling photon radiosensitivity to proton radiosensitivity comes from the abundance of experimental data of cell survival for photon irradiation. Equation 1.8 can be rewritten by substituting the survival fraction resulting from the same physical dose, d, of protons and photons, to give Equation 1.9.

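$$ \mathrm{RBE} = \frac{\alpha_x - \sqrt{\alpha_x^{2} - 4\beta_x\left(-\alpha_x d - \beta_x d^{2}\right)}}{-2\beta_x} \times \frac{-2\beta_p}{\alpha_p - \sqrt{\alpha_p^{2} - 4\beta_p\left(-\alpha_p d - \beta_p d^{2}\right)}} \tag{1.9} $$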

By expressing the cellular radiosensitivity to proton radiation, (α/β)p, as a function of photon radiosensitivity and proton LET, and simplifying the equation, Carabe, Wedenberg, and McNamara develop models of RBE that depend only on proton dose, dp, proton LET, L, and photon radiosensitivity, (α/β)x. These equations are shown for the models of Carabe and McNamara, Equation 1.10, and Wedenberg, Equation 1.11.

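$$ \mathrm{RBE} = \frac{\sqrt{(\alpha/\beta)_x^{2} + 4(\alpha/\beta)_x\,\mathrm{RBE}_{max}\,d_p + 4\left(\mathrm{RBE}_{min}\right)^{2} d_p^{2}} - (\alpha/\beta)_x}{2 d_p} \tag{1.10} $$

$$ \mathrm{RBE} = -\frac{(\alpha/\beta)_x}{2 d_p} + \frac{1}{d_p}\sqrt{\frac{1}{4}(\alpha/\beta)_x^{2} + (\alpha/\beta)_x\,\mathrm{RBE}_{max}\,d_p + \left(\mathrm{RBE}_{min}\right)^{2} d_p^{2}} \tag{1.11} $$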

For each model there are two terms, RBEmax and RBEmin, that are fit to experimental data. It is largely these parameters and the availability of data that give rise to differences between the models. RBEmax and RBEmin are the asymptotic values of RBE at doses of 0 Gy and ∞ Gy respectively, defined as ratios of LQ parameters between protons and photons, where RBEmax = αp/αx and RBEmin = √(βp/βx). Table 1.1 shows the dependencies of these asymptotic RBEs on proton LET, L, used in each model.
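The asymptotic behaviour can be checked directly from Equation 1.10. The following is a brief sketch of the two limits, using a first-order expansion of the square root at low dose and keeping only the dominant term at high dose:

```latex
\begin{aligned}
d_p \to 0:\quad
\sqrt{(\alpha/\beta)_x^{2} + 4(\alpha/\beta)_x\,\mathrm{RBE}_{max}\,d_p + \dots}
  &\approx (\alpha/\beta)_x + 2\,\mathrm{RBE}_{max}\,d_p
  &&\Rightarrow\quad \mathrm{RBE} \to \mathrm{RBE}_{max} \\
d_p \to \infty:\quad
\sqrt{\dots + 4\left(\mathrm{RBE}_{min}\right)^{2} d_p^{2}}
  &\approx 2\,\mathrm{RBE}_{min}\,d_p
  &&\Rightarrow\quad \mathrm{RBE} \to \mathrm{RBE}_{min}
\end{aligned}
```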

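| RBE Model | $\mathrm{RBE}_{max} = \alpha_p/\alpha_x$ | $\mathrm{RBE}_{min} = \sqrt{\beta_p/\beta_x}$ |
|---|---|---|
| Carabe et al. (87) | $0.843 + \dfrac{0.414\,L}{(\alpha/\beta)_x}$ | $1.090 + \dfrac{0.016\,L}{(\alpha/\beta)_x}$ |
| Wedenberg et al. (88) | $1 + \dfrac{0.434\,L}{(\alpha/\beta)_x}$ | $1$ |
| McNamara et al. (89) | $0.999 + \dfrac{0.356\,L}{(\alpha/\beta)_x}$ | $1.101 - 0.004\sqrt{(\alpha/\beta)_x}\,L$ |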

Table 1.1 Asymptotic RBE values used in the phenomenological models of Carabe et al. (87), Wedenberg et al. (88), and McNamara et al. (89).
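The three models can be compared numerically by combining Equation 1.10 with the asymptotic RBEs of Table 1.1. The sketch below is illustrative only: the function names are hypothetical, LET is assumed to be in keV/μm and (α/β)x in Gy, and Equation 1.10 is used for all three models since it is algebraically equivalent to Equation 1.11.

```python
import math


def rbe_from_asymptotes(dose_p, ab_x, rbe_max, rbe_min):
    """Equation 1.10: RBE from proton dose (Gy), photon alpha/beta (Gy) and asymptotic RBEs."""
    root = math.sqrt(ab_x ** 2
                     + 4.0 * ab_x * rbe_max * dose_p
                     + 4.0 * (rbe_min * dose_p) ** 2)
    return (root - ab_x) / (2.0 * dose_p)


def carabe(let, ab_x):
    """(RBEmax, RBEmin) for Carabe et al., coefficients from Table 1.1."""
    return 0.843 + 0.414 * let / ab_x, 1.090 + 0.016 * let / ab_x


def wedenberg(let, ab_x):
    """(RBEmax, RBEmin) for Wedenberg et al., coefficients from Table 1.1."""
    return 1.0 + 0.434 * let / ab_x, 1.0


def mcnamara(let, ab_x):
    """(RBEmax, RBEmin) for McNamara et al., coefficients from Table 1.1."""
    return 0.999 + 0.356 * let / ab_x, 1.101 - 0.004 * math.sqrt(ab_x) * let


MODELS = {"carabe": carabe, "wedenberg": wedenberg, "mcnamara": mcnamara}


def rbe(model, dose_p, let, ab_x):
    """Variable RBE for a named model at a given proton dose, LET and photon alpha/beta."""
    rbe_max, rbe_min = MODELS[model](let, ab_x)
    return rbe_from_asymptotes(dose_p, ab_x, rbe_max, rbe_min)


# Illustrative inputs: 2 Gy proton dose, LET of 2.5 keV/um, (alpha/beta)x = 3 Gy
example = {name: rbe(name, 2.0, 2.5, 3.0) for name in MODELS}
for name, value in example.items():
    print(f"{name:10s} RBE = {value:.3f}")
```

Multiplying the physical dose at each depth by the corresponding RBE gives an RBE adjusted dose of the kind compared in Figure 1.17 b).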

Overall the models produce similar values for RBE, though the differences in asymptotic RBEs lead to small, yet non-negligible, changes in predictions. Figure 1.17 shows the three phenomenological models' predictions of RBE adjusted dose for the case of a SOBP composed of 9 pristine Bragg peaks, with a maximum energy of 150 MeV.

Figure 1.17 a) Physical proton dose and dose average LET (LETd). b) RBE adjusted dose for RBE = 1.1, McNamara et al. (89), Wedenberg et al. (88), and Carabe et al. (87) calculated in tissue with a constant (α/β)x = 3 Gy.

Figure 1.17 shows clearly that there is disagreement between the constant RBE of 1.1 and the RBE models fitted to biological data. It also demonstrates the biologically extended range, as discussed in Section 1.4.4. There appears to be little disagreement between the models for the case demonstrated in Figure 1.17, particularly when comparing the McNamara and Wedenberg models. So the question becomes: if the existing RBE models all produce similar predictions, then why are they not clinically implemented? There are at least two answers to this question. One concerns the practicalities of implementing the models in the clinic, particularly when it comes to delineating tissues according to (α/β)x. The second answer concerns the validity of the models themselves. The models are fit with experimental data, such as those shown in Figure 1.15, and there is a large variance in these data. Since the constant RBE of 1.1 has not shown any clinically relevant complications, it would appear that the need to move towards a variable RBE has been outweighed by a lack of trust in the ability to accurately predict variable RBE. The concern is that implementing variable RBE may introduce more uncertainty and degrade the quality of PBT relative to the current static RBE.

The core assumption of these phenomenological RBE models is that there is a relationship between the terms of the LQ model between protons and photons, and that this relationship varies with LET. Within the models this relationship is described by RBEmin and RBEmax. Figure 1.18 shows the experimental data summarised by Paganetti (73) and the fits produced by the phenomenological models.

Figure 1.18 Fitted parameters of the phenomenological RBE models (87–89) for a) RBEmax and b) RBEmin with comparison to experimental data included in the Paganetti review (73). Note, since the models define RBEmin differently only the Carabe fit is shown.

Figure 1.18 shows that, again, the noise contained within the experimental data limits our ability to accurately model the RBE dependencies on LET and parameters of the LQ model. It may also highlight a more fundamental problem with the approach of the phenomenological models, namely attempting to link parameters of the LQ model between protons and photons. Figure 1.19 shows the α and β terms for photons, x, and protons, p, from the data collected in the Paganetti review (73).

Figure 1.19 Relationship between a) proton and photon α parameter, and b) proton and photon β parameter. From the experimental data included in the Paganetti review (73).

From Figure 1.19 it would appear that an increase in αx also corresponds to an increase in αp, and similarly for the β parameter. The correlation between the proton and photon specific parameters seems almost linear, though it is difficult to draw any quantitative correlations due to the spread in the data. Introducing an LET component does not lead to a reduction in variance or a clear link between radioresponse parameters, as shown in Figures 1.18 and 1.20.

Figure 1.20 LET scaling between a) proton and photon α parameter, and b) proton and photon β parameter. From the experimental data included in the Paganetti review (73).

Correctly determining the variable RBE will improve the efficacy of PBT. There is also the opportunity to deliver a uniform biological dose, across the treatment volume, by appropriately weighting the individual pristine Bragg peaks. A reduction in the weighting of the most distal pristine Bragg peaks has the potential to minimise the biologically extended range.

1.5 DNA Damage, Nanodosimetry, and DNA Repair

Implicit within the phenomenological models, and explicit from the experimental data, is an increasing RBE with LET. Logically this makes sense, especially when viewed within the context of microdosimetry. Here, the critical target of radiotherapy is the cell. Therefore, scoring the energy deposition at the same micron scale as the cell gives a physical measure of biological effect. However, it has long been known that DNA is the critical target, with early studies showing that damage to the nuclear DNA affects cell replication and viability (90,91). In particular, the DNA Double Strand Break (DSB) has been shown to be a particularly toxic lesion to the cell (92). As with the change in focus from macroscopic dose to the cellular micro-scale (microdosimetry), there is now a growing interest in further refining the scale to the nano-scale of DNA (nanodosimetry). As with microdosimetry, nanodosimetry is an attempt to quantify the physical interactions of radiation in terms of biological effect. This has often been referred to as “biologically relevant dosimetry” (93) or as a method to bridge the gap between physics and biology (94), with some studies suggesting a move towards nanodosimetrically optimised treatment plans (95).

Nanodosimetry is a broad subject that encompasses all works, simulation or experimental, linking physical energy depositions at the nano-scale to biological effect. The workings of nanodosimetry are detailed in Chapter 2 of this work, but in short, the methodology scores clusters of ionisations and relates these to clusters of damaged DNA volumes. Experimentally this is achieved through gas detectors, where the density of gas is scaled to match a nanometric volume of liquid water (96–99). In simulation, either ionisations in liquid water that form spatial clusters are scored, or damage to explicit models of the DNA geometry is scored.
Nanodosimetric simulation is discussed in more detail in Section 1.6.3.

Ionising radiation has the ability to damage DNA by two methods, directly or indirectly. Understanding these mechanisms is central to our understanding of how to perform biologically relevant dosimetry. As well as the recent focus of physics on nanodosimetry there has been considerable interest in mechanistic understanding in the field of radiobiology, both historically and contemporarily. Traditionally, the yield of radiation induced DSBs has been determined experimentally through neutral sedimentation gradients, filter elution, and pulsed field gel electrophoresis (100). More recently, fluorescence microscopy techniques have been used to determine yields and spatial position of DSBs (101). Here, fluorescent tags are attached to repair proteins and the accumulation of proteins at damage sites leads to Radiation Induced Foci (RIF). However, microscopic techniques are limited by specificity, from both the resolution of the microscope and the labelled protein. This means that a single DSB may not lead to a RIF; instead, a RIF may represent a site of many DSBs.

Direct damage refers to the physical interactions of the radiation that result in damage to the DNA. These are energy deposition events that change the chemistry of the DNA molecule and result in either the removal of the DNA volume from the double helix (strand break) or loss of function within the gene. Generally, it has been considered that there is an energy deposition threshold to damage DNA, thought to be related to the ionisation level of the DNA material (10s of eV) (102). The concept of a threshold for strand break induction comes from experiments investigating the incorporation of I-125 deoxyuridine into a DNA fragment (103,104). The experimental setup was simulated to show that setting an energy threshold of 17.5 eV for the induction of a Single Strand Break (SSB) reproduces experimental results (105). However, later experimental data, using low energy electrons and photons, have shown that strand breaks can be induced at energies considerably lower than 17.5 eV, with breaks reported for electrons of energy as low as 3 eV (106,107). Energy depositions below the ionisation level are largely from excitation events. Prise et al. (107) discuss the experimentally reported evidence of strand break induction below the ionisation energy of DNA, around 9 eV (108), summarising that the efficiency of break induction is decreased. This means that strand breaks are predominantly, but not exclusively, induced through ionisation events. As an aside, this evidence supports the validity of the experimental nanodosimetry technique. To incorporate damage from the sub-ionisation energy depositions a probability is applied based on the energy deposited. In the PARTRAC modelling work a strand break is induced with a probability that varies linearly for energy depositions between 5 and 37.5 eV (109).
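A linear-ramp break probability of this kind can be sketched as follows. This is an illustration, not the PARTRAC implementation: the function names are hypothetical, and the probability is assumed to be 0 at or below 5 eV and 1 at or above 37.5 eV.

```python
import random


def strand_break_probability(energy_ev, e_low=5.0, e_high=37.5):
    """Probability of a strand break for an energy deposition (eV).

    Assumption: 0 at or below e_low, 1 at or above e_high, linear in between.
    """
    if energy_ev <= e_low:
        return 0.0
    if energy_ev >= e_high:
        return 1.0
    return (energy_ev - e_low) / (e_high - e_low)


def is_strand_break(energy_ev, rng):
    """Sample whether a single energy deposition produces a strand break."""
    return rng.random() < strand_break_probability(energy_ev)


rng = random.Random(1)  # fixed seed so the illustration is repeatable
deposits_ev = [3.0, 8.0, 12.0, 17.5, 25.0, 40.0]  # hypothetical energy depositions
for energy in deposits_ev:
    print(f"{energy:5.1f} eV  p = {strand_break_probability(energy):.3f}  "
          f"break = {is_strand_break(energy, rng)}")
```

A fixed 17.5 eV threshold corresponds to replacing the ramp with a step function at that energy.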
With this linearity there is no step change in probability at the ionisation energy; the probability rises at the same rate below and above it, so that, as an example, the increase in break probability between 6 eV and 12 eV is the same as that between 12 eV and 18 eV.

Indirect damage refers to the DNA damage induced by free radical attacks. Ionising radiation leads to the cleavage of chemical bonds in the bulk material of the nucleus, producing highly reactive radicals. Mechanistically, indirect damage is considered through water radiolysis, since water accounts for around 70% of the cell mass (110). The most common process is the ionisation of the water molecule (111), resulting in the production of a hydrated electron (e⁻aq) and a charged water molecule. The charged water molecule rapidly loses a proton, creating the hydroxyl radical (•OH). These oxygen based free radicals are referred to as Reactive Oxygen Species (ROS). The yields of ROS are quantified by the “G value”, with SI units defining the number of moles formed per Joule of radiation (mol J⁻¹). The G value is time dependent since the chemical processes leading to ROS are not instantaneous. The ROS diffuse through the bulk media following Brownian diffusion before reacting with other ROS, water, or DNA. Table 1.2 gives the G values, reported by Breen and Murphy (111), and the diffusion coefficient, D, reported by Kreipl et al. (112), of common radiolysis products.

| Free Radical | G Value (μmol J⁻¹) | D (×10⁻⁹ m² s⁻¹) |
|---|---|---|
| •OH | 0.280 | 2.8 |
| e⁻aq | 0.275 | 4.9 |
| H• | 0.057 | 7.0 |
| H2O2 | 0.070 | 2.3 |
| H2 | 0.047 | 4.8 |

Table 1.2 Common free radicals, their production yield (G value) reported by Breen and Murphy (111), and their diffusion coefficient (D) reported by Kreipl et al. (112).
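To give a sense of scale for the values in Table 1.2, the sketch below converts a G value to the older unit of species per 100 eV and estimates how far each species diffuses in a given time. Two assumptions are made here and are not taken from the cited sources: the tabulated G values are treated as μmol J⁻¹, and free three-dimensional Brownian diffusion is assumed, for which the root-mean-square displacement is √(6Dt). The function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23        # mol^-1
EV_TO_JOULE = 1.602176634e-19   # J per eV

# Table 1.2 values: (G value, diffusion coefficient in m^2 s^-1)
# Assumption: G values are in micromol per Joule
SPECIES = {
    "OH":   (0.280, 2.8e-9),
    "e_aq": (0.275, 4.9e-9),
    "H":    (0.057, 7.0e-9),
    "H2O2": (0.070, 2.3e-9),
    "H2":   (0.047, 4.8e-9),
}


def g_per_100ev(g_umol_per_joule):
    """Convert a G value from micromol/J to species per 100 eV deposited."""
    return g_umol_per_joule * 1e-6 * AVOGADRO * 100.0 * EV_TO_JOULE


def rms_displacement(diff_coeff, time_s):
    """RMS displacement (m) after time_s for free 3D Brownian diffusion, sqrt(6*D*t)."""
    return math.sqrt(6.0 * diff_coeff * time_s)


for name, (g_value, diff_coeff) in SPECIES.items():
    distance_nm = rms_displacement(diff_coeff, 1e-9) * 1e9  # after 1 ns
    print(f"{name:5s} G = {g_per_100ev(g_value):.2f} per 100 eV, "
          f"RMS displacement in 1 ns = {distance_nm:.1f} nm")
```

Under these assumptions the •OH radical travels only a few nanometres in a nanosecond, which is comparable to the dimensions of the DNA double helix.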

It is believed that the •OH radical is the largest contributor to indirect damage, leading to base lesions, sugar lesions, SSBs, DSBs, abasic sites, and DNA protein cross links (113). Radiation generated e⁻aq and H• can also lead to modified DNA bases (113). A full picture of the types of DNA base modification by ROS can be found in the work by Dizdaroglu et al. (114). •OH induced strand breaks are a result of hydrogen abstraction from the deoxyribose sugar unit of the DNA (backbone), which if not repaired leads to a cleavage of the backbone (111). A DSB is formed if two or more SSBs are created on opposite strands of the DNA double helix within one helical turn (109). If the •OH attack occurs on a DNA backbone near to a directly induced SSB on the opposite strand, then a DSB is formed. Given that direct SSBs are predominantly formed by ionisation, and given the small diffusion coefficient of the •OH radical, this route to DSB formation is highly likely. ROS can also readily react with DNA bases, again through hydrogen abstraction or, more commonly, through addition to the pi bonds (115). These base modifications can lead to a “fixation” of the damage, making repair unlikely or impossible. This is the basis of the “oxygen fixation hypothesis” (116–118) and has been used as a possible explanation for the oxygen enhancement ratio.

Based on a number of works within his lab, Ward predicted that, for photons, around 65% of the strand breaks are a result of indirect damage (119). As the LET of the radiation increases the chance of ionisation increases, increasing the proportion of directly induced strand breaks. For high LET radiation there is also a recombination effect for the ROS, where radicals are produced in such close proximity that they can quickly recombine. Mechanistic understanding of indirect damage is particularly important, since it accounts for the majority of damage.
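The DSB criterion described above, strand breaks on opposite strands within one helical turn, can be sketched as a simple classification routine. This is an illustration under stated assumptions rather than the scoring used later in this work: one helical turn is taken as 10 base pairs, each break is used in at most one DSB, breaks are paired greedily in order of position, and the names are hypothetical.

```python
from typing import List, Tuple

Break = Tuple[int, int]  # (strand, base-pair index); strand is 1 or 2


def classify_breaks(breaks: List[Break], max_sep_bp: int = 10) -> Tuple[int, int]:
    """Return (number of SSBs, number of DSBs) for a list of strand breaks.

    Assumption: two breaks form a DSB when they lie on opposite strands and
    are separated by no more than max_sep_bp base pairs. Pairing is greedy.
    """
    ordered = sorted(breaks, key=lambda item: item[1])
    used = [False] * len(ordered)
    n_dsb = 0
    for i, (strand_i, pos_i) in enumerate(ordered):
        if used[i]:
            continue
        for j in range(i + 1, len(ordered)):
            strand_j, pos_j = ordered[j]
            if pos_j - pos_i > max_sep_bp:
                break  # remaining breaks are even further away
            if not used[j] and strand_j != strand_i:
                used[i] = used[j] = True
                n_dsb += 1
                break
    n_ssb = used.count(False)  # breaks not paired into a DSB
    return n_ssb, n_dsb


# Hypothetical break list: one close opposite-strand pair and one isolated break
example_breaks = [(1, 100), (2, 105), (1, 250)]
print(classify_breaks(example_breaks))
```

More detailed classification schemes also distinguish clusters containing additional nearby breaks; the greedy pairing here ignores that distinction.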
The induction of DNA damage starts the process that can lead to cell kill, with the DNA Damage Response (DDR) determining cell fate. The cell has evolved many checkpoints and pathways to cope with endogenous and exogenous insults to the DNA; for example MisMatch Repair (MMR), Base Excision Repair (BER), Nucleotide Excision Repair (NER), Homologous Recombination (HR), and Non-Homologous End Joining (NHEJ) (120). The MMR pathway corrects for small insertions or deletions occurring during DNA replication (121). The BER pathway corrects for base damages through either “short-patch”, removing a single nucleotide, or “long-patch”, removing 2-13 nucleotides around the damage site (122). NER is a versatile repair pathway that acts on bulky DNA damages, by excising the damaged DNA single strand sequence and refilling with the undamaged strand as a template (123).

Repair pathways for DSBs are broadly grouped as HR or NHEJ, with each of these having variant sub pathways. HR involves replacing the missing section of the genome with a copy transcribed from the sister chromatid (124). As such, HR is largely limited to the S- and G2-phase of the cell cycle, where a copy of the DNA is available (125). Due to the copying of homologous DNA, HR is often referred to as an “error free” repair pathway. However, the success of HR depends on the DSB end's ability to correctly find its sister chromatid. The HR process begins with resection of the DSB end, leaving an overhang in the 3’ direction of the double helix (125). The sequence of this overhang is then used to search for a homologous sequence, in either a random or directed way. If the overhang sequence is commonly repeated then misidentification of the homolog can occur, resulting in the insertion of an incorrect gene sequence (126). HR is a slow process (125), due to the search for a sister homolog and the synthesis of the repeating sequence, resulting in a chance of the repair persisting to the cell cycle checkpoints. This persistence can lead to a stall in progression through the cell cycle (127); this stall is either permanent (senescence) or semi-permanent (quiescence) (128). Failure to identify the unrepaired damage by the cell cycle checkpoint can result in cell death through mitotic catastrophe, an attempt to divide the cell with a broken genome (129).

The most prominent repair pathway of DSB damage, throughout all stages of the cell cycle, is NHEJ (130). The NHEJ process involves the direct ligation of two free DSB ends (131). There is no preference of the DSB end for its correct partner end, which can easily result in the joining of incorrect partners, leading to a mutation. For this reason, the NHEJ pathway is considered more “error prone” than the HR pathway. There has been evidence to suggest that the search for a DSB end is directed, with the observation of “repair centres” (132), though generally the motion of a DSB end is considered undirected (133). As such, the success of NHEJ for the correct pairing of DSB ends is strongly dominated by proximity effects and the extent of local clustering of DSBs. This local clustering is the focus of Chapter 3 of this work (134). Undirected DSB end motion can also lead to the DSB end becoming spatially isolated from potential partners, resulting in residual DSBs. These residual DSBs can lead to cell cycle arrest in the same manner as for HR.

A full mechanistic understanding of the complex process of DSB repair is still missing from the literature; open questions include how a DSB is directed to HR or NHEJ and the individual steps involved in the repair. However, Figure 1.21 shows a simplified outline of the process, from DSB formation to outcome.

[Figure 1.21: flowchart. Node labels: Radiation; DSB; Retry; Homologous Recombination; Non-Homologous End Joining; Success; Misrepair; Residual; Cell Cycle Complete; Aberration; Stall; Failure; Normal; Mutation; Cell Death; Senescence; non-Toxic; Toxic.]

Figure 1.21 Simplified representation of DSB induction, repair choice, and repair outcomes.

1.6 Modelling

It is apparent that quantifying RBE experimentally is difficult, due to the range of values obtained between different studies. This is largely attributed to the unavoidable noise added by experimental protocols. As a way to understand trends in the data, or to make ab initio predictions, many models have been developed. These models have varying levels of complexity, employing different levels of mechanism in their formalism. The level of mechanistic description within the model gives confidence in its applicability and predictions. For example, a model that relies entirely upon physics and chemistry to describe a biological process can be used to probe the biological process and determine dependencies, provided the physics and chemistry are well-known and have been modelled correctly. The resulting model predictions should correctly recreate measurements from a biological experiment. If this is not the case then a mechanism has been improperly modelled, or some mechanisms may have been omitted. However, the level of complexity in such a model is far beyond anything that can be implemented with today’s technology. As such, mechanisms must be condensed or described more simplistically. This gives rise to a grouping between models, both generally and for the prediction of cell fate, requiring different experimental data to inform or fit the model. Figure 1.22, as with Figure 1.21, shows the steps involved to go from radiation to quantification of biological effect, often referred to as the radiation quality. A higher quality corresponds to a higher biological effect. The concept is often used in radiation protection, where the physical dose is converted into an equivalent dose through a radiation weighting, or quality, factor (135). The path taken through Figure 1.22, or the approximations along the way, shows the differences between differently grouped models, and the kind of experimental data they require for fitting. The highest-level model, discussed in Section 1.6.1, takes the direct path from radiation exposure to radiation quality. This kind of path requires experimental data for every combination of radiation type and cell type, where the model essentially becomes a fit to experimental data. Mechanistic models take the path through the dashed box. These models attempt to describe the mechanism of direct DNA damage, labelled “Physics”, indirect DNA damage, labelled “Chemistry”, and the biological response, labelled “Biology”, in order to predict cell fate. Various assumptions or simplifications must be taken in these mechanisms, leading to mid-level (Section 1.6.2) and low-level models (Section 1.6.3). A similar figure to Figure 1.22 was presented by Lett in 1992 (136).

[Figure 1.22: flowchart. Node labels: Radiation to Cell; Physics (Direct); Chemistry (Indirect); Biology; Radiation Quality.]

Figure 1.22 The steps involved from radiation to determining radiation quality. High-level models take the direct path from radiation to radiation quality, requiring experimental data for every combination of radiation type and cell type. Mechanistic models take the path through the dashed box, modelling the action of direct DNA damage (Physics), indirect DNA damage (Chemistry), and biological response (Biology) in order to predict cell fate. Assumptions or simplifications can be made along the mechanistic path, distinguishing mid-level and low-level models.

1.6.1 Current Models

The most prominent model for cell survival is the Linear Quadratic (LQ) model, first introduced by Chadwick and Leenhouts in 1973 (137) and shown earlier in Equation 1.6. The model came from an improvement of the target theory, detailed by Lea (50). Target theory is an essential concept in radiation biology, and states that inactivation of the target(s) within an organism by radiation results in the organism’s death (138). The single hit single target theory predicts an exponentially decreasing rate of survival with dose, Equation 1.12, which is derived from Poisson statistics.

$$S = \exp(-xD) \tag{1.12}$$

Where D is the radiation dose, and x is the probability of inducing a lethal lesion per unit dose. However, clonogenic survival data, such as that shown earlier in Figure 1.12, has a shoulder in the dose response curve which is not accounted for by Equation 1.12. The situation can be improved by changing the target theory to a multi-target multi-hit model, such as the LQ model. In the paper by Chadwick and Leenhouts, the DNA backbones are considered the target and a mechanistic derivation of the LQ parameters is presented (137). Their theory is based on five philosophies.

• Cells contain critical molecules, the integrity of which determines cell viability.
• The critical molecules are assumed to be DNA, and the critical damage is assumed to be a DSB.
• The primary action of radiation on the cell is to cause molecular bond breaks in the DNA.
• Differences in radiobiological effect in a given cell type under different irradiation conditions are due to varying degrees of repair.
• Repair processes represent the physical recombination, chemical restitution, and biochemical enzymatic repair of the DNA.

The model concerns two modes of action for DSB induction. The first mode of action is the combination of two SSBs, spatially and temporally proximal, leading to a DSB; Equation 1.13 shows the number of DSBs formed by this mode of action (N1). This mode of action represents strand breakage from different radiation events, or multiple hits. The second mode of action is the direct induction of a DSB; Equation 1.14 shows the number of DSBs formed by this mode of action (N2). This mode of action represents both strands being broken in a single radiation event.

$$N_1 = \varepsilon n_1 n_2 f_1 f_2 \{1 - \exp[-kD(1 - \Delta)]\}^2 \tag{1.13}$$

$$N_2 = n_0 [1 - \exp(-k_0 \Delta D)] \tag{1.14}$$

Where ε is the proportion of broken DNA backbones that can combine to form a DSB, equal to 1 for nearby SSBs and 0 for distant SSBs. n1 and n2 are the number of critical targets on strand 1 and strand 2 of the double helix respectively, where n1 = n2. f1 and f2 are the proportion of SSBs on each strand that are not restituted or repaired. k is the probability per target per unit dose that the target is broken. D is the delivered dose. Δ is the proportion of dose that causes breaks by the action shown in Equation 1.14; 1 – Δ is the proportion of dose left to cause DSBs by the other action, shown in Equation 1.13. n0 and k0 are the number of targets for the second mode of action and the probability of forming a DSB per target respectively. Equations 1.13 and 1.14 give the total number of DSBs induced per unit dose by the combined action, or dual action, of the radiation. By applying Poisson statistics and introducing two new terms, f0 and p, Chadwick and Leenhouts link induction of DSB to cell death, where f0 is the proportion of DSBs that are not restituted or repaired, and p is a proportionality factor connecting DSBs to cell death. With this model, Chadwick and Leenhouts have described the mechanisms that they believe are responsible for cell death, namely the induction and failed repair of DSBs. Some of the parameters are loosely defined, for example the proportionality factor connecting DSBs to cell death. However, the mechanisms seem plausible based on our understanding of cell death and are combined to give the mechanistic LQ survival equation, shown in Equation 1.15.

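$$S = \exp\{-p f_0 n_0 [1 - \exp(-k_0 \Delta D)]\} \times \exp\left(-p f_0 \varepsilon n_1 n_2 f_1 f_2 \{1 - \exp[-kD(1 - \Delta)]\}^2\right)$$

assuming k0 and k are small

$$S = \exp\{-p f_0 n_0 k_0 \Delta D\} \times \exp[-p f_0 \varepsilon n_1^2 f_1 f_2 k^2 (1 - \Delta)^2 D^2] \tag{1.15}$$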

Chadwick and Leenhouts then go on to say that an equation of the form shown in Equation 1.16 will fit well to experimental data for cell survival, where α and β are fitted parameters representing the mechanisms described in Equation 1.15.

$$S = \exp\{-\alpha D\} \times \exp\{-\beta D^2\} \tag{1.16}$$
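The link between Equations 1.13 to 1.16 can be checked numerically. The following is a minimal sketch, assuming arbitrary illustrative parameter values (the numbers in `PARAMS` and all function names are hypothetical and are not taken from Chadwick and Leenhouts). It evaluates the full form of Equation 1.15 and compares it with the LQ form, using the α and β that the small-k approximation implies.

```python
import math

# Hypothetical parameter values, chosen only so that alpha and beta come out
# in a plausible range; they are not taken from Chadwick and Leenhouts.
PARAMS = {
    "p": 1.0, "f0": 0.5,        # DSB-to-death factor and unrepaired DSB fraction
    "eps": 1.0,                 # proportion of SSB pairs able to form a DSB
    "n1": 1.0e4, "n2": 1.0e4,   # critical targets per strand (n1 = n2)
    "f1": 0.5, "f2": 0.5,       # unrepaired SSB fractions on each strand
    "k": 1.0e-4,                # break probability per target per unit dose
    "n0": 1.0e4, "k0": 1.0e-4,  # single-event DSB targets and probability
    "delta": 0.3,               # dose proportion acting through Equation 1.14
}

def dsb_two_hit(D, q):
    """Equation 1.13: DSBs from pairs of SSBs made by separate events."""
    hit = 1.0 - math.exp(-q["k"] * D * (1.0 - q["delta"]))
    return q["eps"] * q["n1"] * q["n2"] * q["f1"] * q["f2"] * hit ** 2

def dsb_one_hit(D, q):
    """Equation 1.14: DSBs from a single radiation event."""
    return q["n0"] * (1.0 - math.exp(-q["k0"] * q["delta"] * D))

def survival_full(D, q):
    """Equation 1.15 before the small-k approximation."""
    return math.exp(-q["p"] * q["f0"] * (dsb_one_hit(D, q) + dsb_two_hit(D, q)))

def lq_coefficients(q):
    """alpha and beta implied by Equation 1.15 when k0 and k are small."""
    alpha = q["p"] * q["f0"] * q["n0"] * q["k0"] * q["delta"]
    beta = (q["p"] * q["f0"] * q["eps"] * q["n1"] * q["n2"] * q["f1"] * q["f2"]
            * q["k"] ** 2 * (1.0 - q["delta"]) ** 2)
    return alpha, beta

def survival_lq(D, alpha, beta):
    """Equation 1.16."""
    return math.exp(-alpha * D - beta * D ** 2)

alpha, beta = lq_coefficients(PARAMS)
for dose in (1.0, 2.0, 4.0, 8.0):
    print(dose, survival_full(dose, PARAMS), survival_lq(dose, alpha, beta))
```

With these assumed values the two survival columns agree closely, which is the sense in which α and β stand in for the grouped mechanistic parameters.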

Over the years the mechanistic content of the LQ model has been questioned, for example see the letter to the editor by Zaider (139). The mechanisms have also been defended, for example see the letter to the editor by Sachs and Brenner (140). Regardless of the mechanistic content, the LQ model has proven to be the most successful cell survival model to date, in terms of clinical and biological adoption. However, it is commonplace to see the model applied purely as a “fit” rather than a predictive model. There is nothing intrinsically wrong with this approach, although it does limit the utility of the model. Here, the model can only be used to predict the survival fraction of cells with a similar α and β or to interpolate between measured dose points. Experimentally determining α and β is difficult and is prone to large variance between studies. A number of other models have been proposed, along similar philosophies as the LQ model, for example the Repair MisRepair (RMR) model (141) and the Lethal Potentially Lethal (LPL) model (142). These models attempt to include more mechanistic understanding of the biological response to the radiation induced damage. For example, the LPL model, drawing on mechanisms from previous models, differentiates critical damages based on their reparability. In the LPL model lesions are categorised as either “lethal”, meaning they cannot be repaired, or “potentially lethal”, meaning they are correctly repaired at a given rate or incorrectly repaired at a different given rate, resulting in lethal lesions. The LPL model also considers the fixation of potentially lethal lesions, although not necessarily through the mechanism of oxygen fixation. The survival fraction predicted by the LPL model is shown in Equation 1.17.

$$S = \exp\{-N_{Tot}\} \times \left[1 + \frac{N_{PL}}{\epsilon\,(1 - \exp(-\epsilon_{PL} t_r))}\right]^{\epsilon} \tag{1.17}$$

Where NTot is the sum of potentially lethal lesions and lethal lesions at the end of the radiation exposure, ε is the sum of the rate at which potentially lethal lesions are correctly repaired and the rate at which they are incorrectly repaired, NPL is the number of potentially lethal lesions at the end of radiation exposure, εPL is the rate at which potentially lethal lesions are correctly repaired, and tr is the repair time available after the radiation exposure. The LPL model predicts that an equation of the form shown in Equation 1.18 will reproduce the clonogenic survival curve, where a, b, c, and d are fitted parameters. Assuming the number of lethal and potentially lethal lesions depends on dose, D, introduces a dose term.

$$S = \exp\{-aD\} \times \left[1 + \frac{bD}{c\,(1 - \exp(-d))}\right]^{c} \tag{1.18}$$
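A short sketch can show how the form of Equation 1.18 relates to the LQ form of Equation 1.16. Expanding the logarithm of the bracketed term to second order in bD/[c(1 − exp(−d))] gives an LQ-like expression with α = a − b/g and β = b²/(2cg²), where g = 1 − exp(−d). This expansion is a mathematical observation added here for illustration, not a result quoted from the LPL paper, and the parameter values below are hypothetical.

```python
import math

def lpl_survival(D, a, b, c, d):
    """Survival of the form shown in Equation 1.18."""
    g = 1.0 - math.exp(-d)
    return math.exp(-a * D) * (1.0 + b * D / (c * g)) ** c

def lq_equivalent(a, b, c, d):
    """Second-order (low dose) expansion of Equation 1.18 in LQ form."""
    g = 1.0 - math.exp(-d)
    alpha = a - b / g
    beta = b ** 2 / (2.0 * c * g ** 2)
    return alpha, beta

# Hypothetical fitted parameters, for illustration only.
a, b, c, d = 0.6, 0.4, 10.0, 3.0
alpha, beta = lq_equivalent(a, b, c, d)

for dose in (0.5, 2.0, 5.0, 10.0):
    s_lpl = lpl_survival(dose, a, b, c, d)
    s_lq = math.exp(-alpha * dose - beta * dose ** 2)
    print(dose, s_lpl, s_lq)
```

Under these assumptions the two curves nearly coincide at low dose and separate as the dose increases, with the Equation 1.18 form never falling below its LQ expansion.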

Figure 1.23 shows the fitted LQ, Equation 1.16, and the fitted LPL, Equation 1.18, for data on cell survival extracted from the work of Belli et al. (143). In their work Belli et al. irradiated V79 cells with photons and protons at a range of LET; Figure 1.23 shows the case for photons and 11 keV/µm protons.

Figure 1.23 Survival data for V79 cells irradiated by photons (circles) and 11 keV/µm protons (triangles). Data extracted from Belli et al. (143). The solid lines show fits of the LQ model, Equation 1.16. The dashed lines show fits of the LPL model, Equation 1.18.

The fits of the LQ and LPL models are very similar, but begin to diverge at low survival fractions, demonstrated in particular by the photon fits in Figure 1.23. The range of doses for which the LQ model is valid has been questioned, showing poor goodness of fit when α and β are determined with the low dose and high dose data (144). This dose range is particularly important for hypo- and hyper-fractionated treatment. Most in vitro studies investigate a wide dose range to determine the α and β parameters, where the low and high dose regions may introduce noise in the parameters. In order for the LQ model to fit reasonably well across the dose range, the quality of fit in the clinically relevant dose range may suffer. High level models, such as the LQ and LPL, have become a fitting tool to help understand the experimentally determined dose response of a given cell type, cell cycle phase, and radiation type. In order to use these types of models to predict RBE, every combination of cell and radiation must be investigated, and agreement in the results must be attained. As of now the lack of repeatability in results is the main reason that phenomenological RBE models, which rely on the LQ model, have not been clinically implemented. Even if repeatability were attainable, the practicalities of investigating every combination of cell and radiation quality limit the approach of these types of models.
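The sensitivity of α and β to the fitted dose range can be illustrated with a small sketch. It fits the LQ form by unweighted linear least squares on −ln S, which is one simple choice and not necessarily the procedure used in the studies cited above. The “data” are synthetic, generated from the form of Equation 1.18 with hypothetical parameters, so the numbers carry no experimental meaning.

```python
import math

def fit_lq(doses, survival):
    """Least-squares fit of -ln(S) = alpha*D + beta*D^2 (no intercept)."""
    y = [-math.log(s) for s in survival]
    s2 = sum(d ** 2 for d in doses)
    s3 = sum(d ** 3 for d in doses)
    s4 = sum(d ** 4 for d in doses)
    t1 = sum(d * yi for d, yi in zip(doses, y))
    t2 = sum(d ** 2 * yi for d, yi in zip(doses, y))
    det = s2 * s4 - s3 ** 2          # determinant of the 2x2 normal equations
    alpha = (t1 * s4 - t2 * s3) / det
    beta = (t2 * s2 - t1 * s3) / det
    return alpha, beta

def synthetic_survival(D, a=0.6, b=0.4, c=10.0, d=3.0):
    """Synthetic survival with the form of Equation 1.18 (hypothetical values)."""
    return math.exp(-a * D) * (1.0 + b * D / (c * (1.0 - math.exp(-d)))) ** c

low_doses = [0.5, 1.0, 1.5, 2.0]
all_doses = low_doses + [4.0, 6.0, 8.0, 10.0, 12.0]

alpha_low, beta_low = fit_lq(low_doses, [synthetic_survival(D) for D in low_doses])
alpha_all, beta_all = fit_lq(all_doses, [synthetic_survival(D) for D in all_doses])

print("low dose range :", alpha_low, beta_low)
print("full dose range:", alpha_all, beta_all)
```

Because the synthetic curve is not exactly linear quadratic, the fitted parameters change with the dose range included, which mirrors the concern raised above about fits made across a wide dose range.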

1.6.2 Mechanistic Models

In order to improve the predictive power of models such as the LQ, more mechanistic models have been developed. These types of models are based on similar principles to the previously discussed models, such as the common assumption that induction of critical lesions to the DNA determines cell fate. However, these models tend to use computer simulation and the Monte Carlo method to determine damage, rather than experimental measurement. An example of this type of model is the Local Effects Model (LEM). The LEM has gone through a number of iterations, from LEM I (145) through to LEM IV (146), with each iteration retaining some aspects of the previous version and improving others. The LEM was developed to investigate, in silico, the RBE of charged particles, taking as an input data on the cell survival from photon irradiation. Currently, the model is established in clinically applied TPSs for carbon ion therapy (147). The core principle, or assumption, of the LEM is that the spatial pattern of local dose is the critical factor in determining cell fate. Furthermore, if the spatial pattern of the local dose is the same between radiations then the LEM predicts the same outcome, regardless of radiation source (148). The LEM uses an Amorphous Track Structure (ATS) method to calculate local dose at the sub-cellular level. ATS is a simulation method that samples radial dose from a primary particle, generating a 3D dose map. This technique is faster to implement than the full Monte Carlo Track Structure (MCTS), which is discussed later in Section 1.6.3. More detail on the ATS method, and applications to radiobiological modelling, can be found in a number of publications, for example Cucinotta et al. (149). Within the LEM the critical quantity, determined from the local dose, is the mean number of lethal events per cell, $\bar{N}_{l,Ion}$, shown in Equation 1.19 (145).

$$\bar{N}_{l,Ion} = \int \frac{-\ln[S_X(d(x, y, z))]}{V_{Nucleus}} \, dV_{Nucleus} \tag{1.19}$$

Where, SX(D) represents the effect after photon radiation with a dose D, d(x, y, z) is the local distribution of dose within the nucleus, and VNucleus is the volume of the nucleus. The mean number of lethal events is then used to calculate the survival fraction according to Equation 1.20.

$$S_{Ion} = \exp(-\bar{N}_{l,Ion}) \tag{1.20}$$
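Equations 1.19 and 1.20 can be sketched on a grid of equal-volume voxels, where the integral reduces to the average of −ln S_X over the voxels. The example below is a simplified illustration and not the LEM implementation: it assumes a plain LQ photon response with hypothetical α and β, and it uses hand-made dose lists in place of an amorphous track structure calculation.

```python
import math

# Hypothetical photon LQ parameters (per Gy and per Gy^2), for illustration only.
ALPHA_X, BETA_X = 0.2, 0.03

def photon_survival(dose):
    """Photon survival S_X(D); a plain LQ curve is assumed here."""
    return math.exp(-ALPHA_X * dose - BETA_X * dose ** 2)

def mean_lethal_events(local_doses):
    """Equation 1.19 for equal-volume voxels: the mean of -ln S_X(d) over voxels."""
    return sum(-math.log(photon_survival(d)) for d in local_doses) / len(local_doses)

def ion_survival(local_doses):
    """Equation 1.20."""
    return math.exp(-mean_lethal_events(local_doses))

# Two dose patterns with the same mean dose of 2 Gy.
uniform = [2.0] * 1000                 # photon-like, homogeneous
patchy = [0.0] * 500 + [4.0] * 500     # crude stand-in for a track-like pattern

print("uniform:", mean_lethal_events(uniform), ion_survival(uniform))
print("patchy :", mean_lethal_events(patchy), ion_survival(patchy))
```

A homogeneous dose returns the photon survival at that dose, whereas the heterogeneous pattern gives more lethal events for the same mean dose, because −ln S_X is a convex function of local dose when β is positive.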

For photons, considered low LET, the LEM uses a modified version of the LQ model, where a transition dose, Dt, is introduced. Dt represents the dose below which the survival curve is more shouldered and above which the survival curve is more exponential in shape. Equation 1.21 shows the LEM’s treatment of photons, predicting a survival fraction SX.

$$S_X(D) = \begin{cases} \exp(-\alpha_X D - \beta_X D^2) & : D \le D_t \\ \exp(-\alpha_X D_t - \beta_X D_t^2 - s_m (D - D_t)) & : D > D_t \end{cases} \tag{1.21}$$

Where sm is the slope of the survival curve at the transition dose, given as sm = αX + 2βXDt. The LEM was extended to determine the local concentration of SSBs, and the clustering of SSBs to form DSBs (150). It is assumed, within the LEM, that there is a constant yield of 1250 SSBs/cell/Gy and 30 DSBs/cell/Gy, with the local dose to a nucleus sub-volume converted into strand breaks to reproduce the average yields. SSBs on opposite strands within a critical separation of 25 bp are converted into DSBs. The critical separation was chosen as the mean of a range of experimental values reported in the literature (151–154), which span 3 to 60 bp. The radial dose calculated by the LEM is “smeared” to account for free radical diffusion. The description of the radial dose profile was further refined to account for an energy dependent extension of the track core (155). The latest iteration of the LEM (IV) develops previous concepts of the model. The change is that LEM IV specifies that the spatial distribution of DSBs, and their local density, determines cell fate (146). There is a strong correlation between the microscopic spatial pattern of energy deposition (local dose) and the spatial DSB pattern, although a subtle distinction can be shown, especially for low LET ions (148). As with previous versions of the LEM it is assumed that similar spatial DSB patterns will lead to similar biological effect, regardless of radiation quality. A new metric of interest was introduced with the LEM IV, namely a measure of the spatial clustering of DSBs. The nucleus model in LEM is split into voxels, of side length 540 nm. Voxels in which a single DSB is induced are termed “isolated”, whilst voxels containing two or more DSBs are termed “clustered”. The number of voxels containing isolated DSBs or clustered DSBs is summed across the nucleus, giving Ni and Nc respectively. A measure of the overall clustering, or complexity, C, of the damage pattern is taken according to Equation 1.22.

$$C = \frac{N_c}{N_c + N_i} \tag{1.22}$$
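The voxel counting behind Equation 1.22 can be sketched as follows. The function name, the coordinate convention (DSB positions in nm on a cubic grid starting at the origin), and the choice to return C = 0 when no DSBs are present are assumptions made for this illustration; only the 540 nm voxel side and the isolated and clustered definitions come from the description above.

```python
import math
from collections import Counter

VOXEL_SIDE_NM = 540.0  # voxel side length quoted for the LEM nucleus model

def clustering_index(dsb_positions_nm, side_nm=VOXEL_SIDE_NM):
    """Return (N_i, N_c, C) for a list of (x, y, z) DSB positions in nm."""
    # Count how many DSBs fall in each voxel, keyed by integer voxel indices.
    counts = Counter(
        tuple(math.floor(coord / side_nm) for coord in position)
        for position in dsb_positions_nm
    )
    n_isolated = sum(1 for n in counts.values() if n == 1)
    n_clustered = sum(1 for n in counts.values() if n >= 2)
    occupied = n_isolated + n_clustered
    # Assumption: C is reported as 0 when there are no DSBs at all.
    complexity = n_clustered / occupied if occupied else 0.0
    return n_isolated, n_clustered, complexity

# Hypothetical DSB positions: two share a voxel, two sit alone.
example = [(10.0, 10.0, 10.0), (20.0, 20.0, 20.0),
           (600.0, 10.0, 10.0), (10.0, 600.0, 10.0)]
print(clustering_index(example))
```

Note that Ni and Nc count voxels and not individual DSBs, so a voxel holding five DSBs contributes the same to Nc as a voxel holding two.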

The dose of photons required to produce the same clustering can then be determined. The biological mechanism behind the importance of the LEM clustering concept is the induction of two or more DSBs within a chromatin giant loop (156), where DSBs in the giant loop can lead to the detachment and loss of a large fragment of the genome. The biological sub-volume occupied by a giant loop is of a similar size to the sub-nucleus voxels used in the LEM. The LEM assumes that the biological response is entirely dependent on the induced spatial damage pattern. Implicit in this is that the repair of dense DSBs is less effective, or that dense DSBs lead to the loss of large fragments of the genome. The performance of the LEM for predicting RBE has recently been compared to some of the phenomenological models discussed in Section 1.4.5 (157). Other models have more explicitly modelled the biological response, for example the BIANCA model (158,159). The BIANCA model assumes that DSBs mis-rejoin at a distance dependent rate, and that the chromosome aberrations created from this mis-rejoining lead to cell death (160). The BIANCA model has recently been extended to make predictions for protons, carbon ions, and alpha particles with a similar ATS method as is used in the LEM (161). Other models exist that follow a similar approach to the LEM for predicting damage induction and the BIANCA model for proximity based repair, for example the work of McMahon et al. (162,163), discussed further in Chapter 6 of this work.

1.6.3 Track Structure Codes

The identification of DNA as the critical target of radiation, and the modelling efforts, have determined that the scale of interest for predicting cell fate is on the order of nanometres. As such it is of interest to investigate the radiation interactions at this scale, in particular interactions with the DNA. The models discussed so far have assumed a homogeneous distribution of the DNA within the nucleus, representative of interphase cells. None of the previously discussed models have explicitly modelled the effect of indirect DNA damage, which is the largest contributor to damage for low LET particles (119). Furthermore, the models discussed so far require some input from experimental survival data, for example the LEM takes as an input the cell survival curve from photon irradiation. Monte Carlo Track Structure (MCTS) codes model the physical interactions of radiation within a medium. One example of MCTS modelling is the open source toolkit Geant4-DNA (164), discussed further in Section 1.7. MCTS codes use physical experimental data on the cross section of interaction within a medium, as a function of ion energy. A random number generator is used to determine whether the particle undergoes an interaction, with the chance of interaction determined from the cross section. The particle is stepped through the medium with each step representing a physical interaction. With this method a 3D pattern of particle interactions is determined. Some MCTS codes use this data to also model water radiolysis and the spatial and temporal diffusion of free radicals, for example see Section 1.7.2. More details on the MCTS method and the relevance to radiobiology can be found in a number of publications, for example (165).
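The stepping procedure described above can be illustrated with a toy example. This is a deliberately simplified sketch and not how any particular MCTS code is written: the cross sections and energy losses are hypothetical constants (real codes use energy dependent cross section tables), the particle travels in a straight line with no angular deflection, and no secondary particles are produced.

```python
import math
import random

# Hypothetical, energy-independent interaction probabilities per nm of path.
SIGMA_PER_NM = {"ionisation": 0.05, "excitation": 0.02, "elastic": 0.13}
# Hypothetical energy lost (eV) in each type of interaction.
ENERGY_LOSS_EV = {"ionisation": 15.0, "excitation": 8.0, "elastic": 0.0}

def simulate_track(energy_ev, cutoff_ev=10.0, rng=None):
    """Return a list of (position_nm, process, energy_deposited_eV) events."""
    rng = rng or random.Random()
    sigma_total = sum(SIGMA_PER_NM.values())
    position_nm = 0.0
    events = []
    while energy_ev > cutoff_ev:
        # Distance to the next interaction: exponential, mean 1/sigma_total.
        position_nm += -math.log(1.0 - rng.random()) / sigma_total
        # Choose the process with probability proportional to its cross section.
        pick = rng.random() * sigma_total
        for process, sigma in SIGMA_PER_NM.items():
            pick -= sigma
            if pick <= 0.0:
                break
        loss = min(ENERGY_LOSS_EV[process], energy_ev)
        energy_ev -= loss
        events.append((position_nm, process, loss))
    # Below the tracking cut-off the remaining energy is deposited locally.
    events.append((position_nm, "cutoff", energy_ev))
    return events

track = simulate_track(100.0, rng=random.Random(1))
for event in track:
    print(event)
```

Each entry in the returned list is one interaction site, so the list plays the role of the interaction pattern described above, reduced here to one dimension.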

A number of MCTS codes have been developed to investigate radiation induced direct and indirect damage, with geometric models of the genome, including KURBUC (166), PARTRAC (109), TOPAS-nBio (167), and implementations in Geant4-DNA (168). Some of the 4D track structure models have previously been reviewed by Nikjoo et al. (169), though much progress has been made since this review. The aim of these models is to make predictions of cell survival ab initio, where the model is capable of determining factors that are relevant. This approach is attractive since it removes the dependence on experimental survival data. For clinical implementation, however, it is necessary to show that the model can sufficiently reproduce experimental results. Arguably the most advanced model of this kind is the multi-scale code PARTRAC. PARTRAC is described as a suite of Monte Carlo codes that simulates track structure, DNA damage, and cellular repair following irradiation with photons, electrons, and light ions (170). For protons and alpha particles PARTRAC uses charge state dependent cross sections to model interactions, with validity down to 100 eV (171,172). For heavier ions (Z>2) cross sections are calculated by scaling proton cross sections at the same velocity (MeV/u). Secondary electrons are tracked to an energy of 10 eV. This methodology provides a 3D map of energy deposition sites, with physical processes recorded, in a homogeneous water volume. These energy deposition sites are then overlaid onto a geometrical representation of the nuclear DNA. PARTRAC represents the DNA at an atomistic level, with coiling of the double helix around histones to produce the nucleosome. The nucleosomes are arranged to form the 30 nm chromatin fibre. Five basic elements of the chromatin fibre are defined, within boxes of 50 nm × 50 nm × 50 nm, which can be stacked to construct the DNA superstructure, such as chromatin loops (173).
The nucleus is subdivided into 46 territories, with DNA constructed in each, to represent the interphase organisation of chromosomes. PARTRAC determines the resulting DNA damage for direct and indirect effects. For direct damage, energy depositions within a sugar-phosphate group are scored. The probability of inducing a strand break varies linearly with the energy deposited, from zero at 5 eV to one at 37.5 eV (109). Indirect damage is scored in a separate module of PARTRAC. All energy depositions to the bulk media are converted into reactive species. The radical species are tracked for up to 10⁻⁷ seconds, modelling transport through diffusion and accounting for reactions. Reactions of the OH radical with the DNA backbone are assumed to lead to a strand break. The probability of reaction leading to a strand break is chosen so that 65% of strand breaks arise from indirect damage for the case of Co-60 irradiation (109). Geant4-DNA has followed a similar methodology to PARTRAC for the simulation of water radiolysis and radical diffusion, and more specific details of the modelling can be found in Section 1.7.2. Strand breaks occurring on opposite strands of the double helix, separated by 10 bp or less, are clustered into a DSB. Additionally, 1% of isolated strand breaks are converted to DSBs. Pairs of DSBs separated by 25 bp or less are grouped to form a clustered DSB. Following damage induction PARTRAC models the biological response, currently limited to repair through the Non-Homologous End Joining (NHEJ) pathway (174). A DSB is modelled as two particles that undergo tethered diffusion. The diffusion is limited initially by a radius of 70 nm, increasing by 0.01 nm/s. The tethered diffusion models the attachment of the helix to the histone, and the unwinding process. Each DSB end accumulates repair proteins based on a time constant. Two ends, with the required proteins, can undergo synapsis if they are separated by 20 nm or less. At this stage the repair can fail, reverting the synaptic complex back to two free ends. Alternatively, the synaptic complex can progress to end processing and ligation. The end processing step cleans any additional backbone or base damages that are associated with the break. Since PARTRAC models the entire genome, with explicit chromosome territories, the final repair product can be characterised in several ways. Either the break was correctly partnered, was left unrepaired, or was mis-rejoined. The mis-rejoining can lead to ring formation, if two ends of a fragment are joined, a chromosome aberration, or a small deletion. PARTRAC has been fitted with mechanistic processes, for example the conditions to induce direct or indirect DNA damage, the rate of repair protein recruitment, and the motion of DSB ends. These mechanistic processes occur much earlier in the chain of biological response than cell death and can often be investigated experimentally with DNA fragments. This removes some of the confounding factors that may occur in the complex biology of the cell and cell death.
With this approach PARTRAC has been able to reproduce a large amount of experimental data for later biological outcomes, including chromosome aberrations, repair kinetics, and DNA fragment size (172,174). However, PARTRAC has not made explicit predictions of cell viability. It is clear that the kind of modelling approach taken by models such as PARTRAC is desirable, with predictions of biological outcomes made from the applied knowledge of physics and chemistry. With this approach it may be possible to give clinical confidence in variable RBE, as has already been achieved by the LEM for carbon ion therapy. Furthermore, it is possible to investigate elements of the physical energy deposition that influence biological outcomes. Uncovering links such as these is desirable for metrology, such as experimental nanodosimetry. The drawback of these models comes in the form of clinical applicability, with the increased detail in the model resulting in unacceptable times for case specific clinically relevant predictions.
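The direct damage scoring rules quoted for PARTRAC can be sketched in a few lines. This is an illustration of the stated rules only and not PARTRAC code: the function names, the representation of a break as a (base pair position, strand) pair, and the greedy left-to-right pairing are assumptions. The conversion of 1% of isolated strand breaks to DSBs and the grouping of DSBs within 25 bp are omitted for brevity.

```python
def strand_break_probability(energy_ev, low_ev=5.0, high_ev=37.5):
    """Linear ramp from 0 at 5 eV to 1 at 37.5 eV deposited in a sugar-phosphate group."""
    if energy_ev <= low_ev:
        return 0.0
    if energy_ev >= high_ev:
        return 1.0
    return (energy_ev - low_ev) / (high_ev - low_ev)

def group_breaks(breaks, max_separation_bp=10):
    """Split strand breaks into SSBs and DSBs.

    `breaks` is an iterable of (bp_position, strand) with strand 1 or 2.
    Two breaks on opposite strands within `max_separation_bp` form a DSB.
    Pairing is greedy from the lowest position, which is an assumption.
    """
    ordered = sorted(breaks)
    used = [False] * len(ordered)
    ssbs, dsbs = [], []
    for i, (pos_i, strand_i) in enumerate(ordered):
        if used[i]:
            continue
        used[i] = True
        partner = None
        for j in range(i + 1, len(ordered)):
            pos_j, strand_j = ordered[j]
            if pos_j - pos_i > max_separation_bp:
                break
            if not used[j] and strand_j != strand_i:
                partner = j
                break
        if partner is None:
            ssbs.append(pos_i)
        else:
            used[partner] = True
            dsbs.append((pos_i, ordered[partner][0]))
    return ssbs, dsbs

# Hypothetical break list: one opposite-strand pair within 10 bp, the rest isolated.
example_breaks = [(100, 1), (105, 2), (300, 1), (320, 2), (500, 1), (503, 1)]
print(strand_break_probability(21.25))
print(group_breaks(example_breaks))
```

In a full simulation the probability function would be sampled against a random number for each energy deposition, and the grouping step would then run over the accepted breaks.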

1.7 Geant4-DNA

A large portion of the work in this thesis uses the open source Monte Carlo toolkit Geant4-DNA (164) to simulate particle interactions with biologically relevant targets, tracking energy depositions and chemical interactions at the nanometre scale. The following is a short literature review on the toolkit and how the interaction models have been implemented, including the range of their validity.

1.7.1 Physics

The main challenge when designing biologically relevant particle transport simulation software is the ability to track particle interactions, particularly electrons, down to very low energies in order to account for DNA ionisation, with DNA base ionisation occurring at estimated energies of around 4-5 eV (175,176). Tracking to even lower energies is also desirable, with studies showing that DNA damage can occur below the ionisation energy, through electron attachment to the DNA backbone and electron capture by the nucleobase (177,178). The Geant4 simulation toolkit (179) is an open source Monte Carlo software, whereby particle transport is broken down into process limited steps with random numbers determining whether or not an interaction occurs within the step. The probability of interaction is weighted according to energy dependent cross sections, which are experimentally measured or theoretically derived. Initial steps were made to develop Geant4’s tracking ability down to the eV scale by Chauvie et al. in 2007 (180). Originally part of the Low Energy Electromagnetic package, this new development allowed the simulation of physical processes such as excitation and ionisation for electrons (down to 7.4 eV), protons (down to 100 eV), and H, He++, He+, and He (down to 1 keV) in liquid water, as well as the elastic scattering of electrons. Following further refinement, the performance of the new low energy electron elastic scattering cross sections was compared to standard Geant4 by Champion et al. (181). Champion et al. note a better agreement between the new simulations and experimentally measured electron total (elastic and inelastic) scattering cross sections than standard Geant4; total cross sections are used for comparison due to the lack of experimental data. Later work by Villagrasa et al. (182) summarises the physical models used within Geant4-DNA for protons, alphas, and electrons. Also stated are the energy ranges over which these different models are initiated.
For protons with energy exceeding 500 keV the First Born Approximation (FBA) is used to consider inelastic interactions (ionisation and excitation). The FBA approach is fully explained by Dingfelder (183). As the incident proton speed becomes comparable to the orbital speed of the target electron the FBA approach is no longer used; instead, ionisation cross sections are calculated according to work by Rudd et al. (184,185). Proton excitation cross sections are calculated according to the method of Miller and Green (186). The ‘Rudd’ model and ‘Miller and Green’ model are applied for proton energies below 500 keV. Charge transfer, through electron capture and the stripping of hydrogen, is modelled according to an analytical expression, correctly reproducing experimental results. Alpha particles are modelled in a similar way to protons, with a correcting effective charge term. Electron inelastic collisions are modelled according to the FBA for 10 MeV to 1 keV, similar to protons but following the dielectric formalism. Below 1 keV the FBA approach is adapted with a classic Coulomb-field correction, accounting for potential energy gain of the incident electron in the field of the target molecule. Electrons exceeding 10 keV also have a relativistic transverse interaction term; below 10 keV transverse interactions are considered negligible. Elastic interactions for ions are neglected in Geant4-DNA, but are considered for electrons since it is the dominant interaction at low energies. Here the electron does not lose energy but does scatter, which is necessary to model the full track structure. Two models for electron elastic scattering are implemented and are available for selection by the user: the model by Champion et al. (181), or the Rutherford model (>0.2 keV) (187) with the formula by Brenner and Zaider (<0.2 keV) (188). Work by Francis et al. in 2011 (189) details the application of nuclear interactions and elastic collisions of protons traversing a water target for the Geant4-DNA software. In the case of proton elastic scattering the standard Geant4 multiple scattering model is employed, detailed by Urban (190). This has a lower energy limit of 1 keV and the energy restriction is applied in Geant4-DNA. Francis et al. note that this limitation is acceptable when considering a small target, such as a cell or thin tissue. In this case it is unlikely that the incident proton will reach the lower energy limit.
Only when considering stopping particles would the 1 keV limit become an issue, and this is more suited to radiation protection applications (outside of the scope of Geant4-DNA). Francis et al. (189) also detail the fate of electrons with energy below the last excitation state of the water molecule (8.22 eV), stating that the only possible remaining processes are vibrational excitations (down to 0.025 eV), elastic scattering (down to 0.025 eV) and electron attachment (between 6 and 13 eV). Cross section data for sub-excitation electrons are not available for liquid water, although data are available for amorphous ice films, described by Michaud et al. (191). Francis et al. (189) explain that Monte Carlo users commonly assume cross sections for ice films are twice the value for the condensed liquid phase, although Francis et al. (189) empirically deduced the conversion factor by comparing solid phase values (191) and gas phase values (192). This gives Geant4-DNA vibrational excitation cross sections up to 100 eV and down to 1.7 eV (the lowest experimental point). Below 1.7 eV vibrational cross sections are extrapolated. Francis et al. (189) extended the Geant4-DNA software to simulate electron tracks with energies between 0.025 eV and 1 MeV, protons and hydrogen atoms between 1 keV and 100 MeV, and alpha particles between 10 keV and 40 MeV.

In 2012 Ivanchenko et al. (193) described how to combine the Geant4-DNA physics models for electrons, protons, hydrogen, and alpha particles with the standard Geant4 models for photon transport. This offers the user the ability to simulate a photon incident on a target whilst tracking all secondary electrons with the more detailed tracking capabilities of the Geant4-DNA physics processes. The ability to combine the two sets of physics lists is a result of the modular nature of C++, the programming language in which Geant4 is written. For example, a user can invoke the standard electromagnetic processes for photons and the more specialised Geant4-DNA physics list for particle tracking.

In 2013 Champion et al. (194) described the use of a Classical Trajectory Monte Carlo (CTMC) simulation for the irradiation of water and nucleobases (Adenine, Thymine, Guanine, and Cytosine). With this classical physics treatment, the authors are able to simulate the effect of the already established incident particle types as well as Carbon, Nitrogen, Oxygen, and Iron on water and larger molecules. In the classical physics treatment standard Newtonian models are used for particle movements and the occurrence of ionisation is determined by Classical Over-Barrier (COB) treatment, whereby ionisation occurs if the projectile can overcome the impacted target electron’s Coulomb potential barrier. The CTMC-COB approach allows for the simulation of larger target molecules, which would be difficult or technologically impossible to model with a quantum mechanical approach. The classical cross sections are tabulated and interpolated by the Geant4 software. The minimum energy threshold for the CTMC-COB approach is 10 keV, at which point particle tracking is stopped and any remaining energy is assumed to be transferred to the surrounding medium. Champion et al.
(194) have also assumed identical cross sections for the four nucleobases. Through the CTMC-COB approach Champion et al. (194) found the cross sections for a nucleobase to be about one order of magnitude larger than for liquid water, accounted for by the size of the DNA nucleobase compared to the water molecule. Their work showed a non-negligible discrepancy between a liquid water target and a nucleobase target, albeit in a purely classical approach.

Champion et al. quickly followed up this work (194) with a theoretically described quantum mechanical model of protons incident on RNA-Uracil (195). In the more recent work (195) Champion et al. developed theoretical models to estimate ionisation and electron capture cross sections for 1 MeV protons impacting DNA or RNA components. The models are based on the FBA with correct boundary conditions (CB1) or the Continuum Distorted Wave-Eikonal Initial State (CDW-EIS) approach. Champion et al. (195) found that both the CB1 and CDW-EIS models had limitations when compared to experimental data, with CDW-EIS showing limitations on predicting angular distributions for secondary electrons ejected at angles greater than 120° and CB1 showing limitations for ejection energies below 10 eV, although “reasonable” agreement between theoretical and experimental total ionisation cross sections was seen.

Further work was carried out in 2015 by Tran et al. (196) on classical elastic scattering of proton and alpha projectiles in water, along with work by Kyriakou et al. (197) detailing improvements in the energy-loss model. The improvements by Kyriakou et al. are particularly important for simulating the nanometric pattern of ionisations. The current status of Geant4-DNA and the particle types available, in liquid water, to the user are summarised by Incerti et al. (164). Further improvements and additions to the models since this publication can be found on the Geant4-DNA website (198).
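The energy-dependent choice of proton inelastic models described in this section can be summarised in a short lookup. This is only an illustrative sketch: the function names and returned labels are hypothetical and are not Geant4-DNA identifiers, and the handling of a proton at exactly 500 keV is an assumption, since the text only distinguishes "exceeding" and "below" that energy.

```python
# Illustrative lookup of the model ranges quoted in the text.
# Function names and labels are hypothetical, not Geant4-DNA class names.

def proton_inelastic_models(energy_kev):
    """Return (ionisation model, excitation model) labels for a proton energy."""
    if energy_kev <= 0.0:
        raise ValueError("energy must be positive")
    if energy_kev > 500.0:
        # First Born Approximation is used for both ionisation and excitation
        return ("FBA", "FBA")
    # At or below 500 keV (boundary handling is an assumption of this sketch)
    return ("Rudd", "Miller and Green")


def electron_has_transverse_term(energy_kev):
    """The relativistic transverse term is only included above 10 keV."""
    return energy_kev > 10.0


for energy in (100.0, 1000.0):
    print(energy, "keV ->", proton_inelastic_models(energy))
```

A table-driven form like this makes the hand-over energies explicit, which is useful when checking which cross section model is active for a given primary energy.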

1.7.2 Chemistry

In 2011 the Geant4-DNA toolkit was extended to provide a model of water radiolysis and free radical transport through Brownian diffusion (199), providing an open source framework for simulating direct and indirect DNA damage. The chemistry module was later improved, offering faster simulation speed by implementing dynamic time steps, the Smoluchowski reaction model, and a Brownian bridge technique (200). Geant4-DNA simulates the water radiolysis in three temporal stages: the “physical stage”, the “physico-chemical stage”, and the “chemical stage” (199). The yields and species of free radical production are determined according to the interactions occurring in the physical stage. The physico-chemical stage starts after the physical stage, at 1 fs, and ends at 1 ps. This stage encompasses the very fast events, such as the decay of excited water molecules and the removal of a proton from an ionised water molecule. The chemical stage, starting at 1 ps and ending at 1 μs, models the diffusion and interaction of molecules. The diffusion of free radicals is determined according to the Smoluchowski equation, defining a 3D probability density of molecule position depending on the diffusion coefficient. Geant4-DNA determines a dynamic time step for the system based on the minimum time for a possible reaction to occur. The distance between each molecule and its nearby neighbours is checked. Based on the reaction rate of the molecule and its nearest neighbour an interaction range is determined, given by Equation 1.23.

$$R_0 = \frac{k}{4 \pi N_A D} \qquad (1.23)$$

Where R_0 is the interaction range, k is the reaction rate, N_A is Avogadro’s constant, and D is the sum of diffusion coefficients of the two molecules. If the pair of molecules come to within the interaction range, then they chemically react. The time step is limited to ensure a 95% confidence that the two molecules will not step past each other, i.e. ensuring that the reaction is not missed. As molecules become more and more proximal the time step becomes increasingly smaller. A limiting time step is applied to avoid these unnecessary multiple small steps. For this case a Brownian bridge technique is applied to calculate the probability that the pair of molecules could have reacted along their step. The list of chemical reactions, reaction rates, and interaction ranges included in the default Geant4-DNA version 10.4 are shown in Table 1.3.

Reaction                        Reaction Rate (dm3/mol/s)   Interaction Range (nm)
H3O+ + eaq → H                  2.11 E+10                   0.200589
H3O+ + OH- → No product         1.43 E+11                   1.349730
OH + eaq → OH-                  2.95 E+10                   0.506256
OH + OH → H2O2                  4.40 E+09                   0.207651
OH + H → No product             1.44 E+10                   0.194167
eaq + eaq → OH- + OH- + H2      5.00 E+09                   0.134838
eaq + H → OH- + H2              2.65 E+10                   0.294265
eaq + H2O2 → OH- + OH           1.41 E+10                   0.295745
H + H → H2                      1.20 E+10                   0.226528

Table 1.3 Chemical reactions modelled by Geant4-DNA, reaction rates, and interaction range.
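As a worked check of Equation 1.23, the sketch below evaluates the interaction range for two of the reactions in Table 1.3. The diffusion coefficients are assumed values supplied for illustration and are not given in the text. The rate is taken in dm^3/mol/s (equivalently M^-1 s^-1), which is the unit for which the tabulated ranges are recovered. With these assumptions the computed ranges agree with the table to the precision quoted. Reactions between identical species (for example OH + OH) are not covered by this simple sketch.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol


def interaction_range_nm(k_dm3_per_mol_s, d_sum_m2_per_s):
    """Equation 1.23, R0 = k / (4 pi N_A D), returned in nanometres."""
    k_si = k_dm3_per_mol_s * 1.0e-3  # convert dm^3/mol/s to m^3/mol/s
    r0_m = k_si / (4.0 * math.pi * AVOGADRO * d_sum_m2_per_s)
    return r0_m * 1.0e9


# Assumed diffusion coefficients in m^2/s (illustrative, not from the text)
D_H3O = 9.0e-9
D_EAQ = 4.9e-9
D_OH = 2.8e-9

r_h3o_eaq = interaction_range_nm(2.11e10, D_H3O + D_EAQ)
r_oh_eaq = interaction_range_nm(2.95e10, D_OH + D_EAQ)
print(f"H3O+ + eaq: {r_h3o_eaq:.4f} nm")
print(f"OH + eaq:   {r_oh_eaq:.4f} nm")
```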

The time evolution of free radical yields predicted by Geant4-DNA was compared to reference data and showed satisfactory agreement (200). Future developments of the chemistry modules include the addition of reactions between the free radicals and the DNA material; for example, see Meylan et al. (168).
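The free Brownian diffusion used in the chemical stage can be illustrated with a minimal sketch. For free diffusion, the Smoluchowski equation gives a Gaussian displacement along each axis with variance 2Dt, so the root-mean-square displacement in three dimensions is sqrt(6Dt). The diffusion coefficient, time step, and function names below are assumptions chosen for illustration; this is not the Geant4-DNA implementation and it omits reactions, dynamic time steps, and the Brownian bridge.

```python
import math
import random


def diffuse(position_nm, d_m2_per_s, dt_s, rng):
    """Take one free Brownian step; each axis is Gaussian with variance 2*D*dt."""
    sigma_nm = math.sqrt(2.0 * d_m2_per_s * dt_s) * 1.0e9
    return tuple(x + rng.gauss(0.0, sigma_nm) for x in position_nm)


def rms_displacement_nm(d_m2_per_s, dt_s, n_samples=20000, seed=1):
    """Estimate the RMS displacement after one step by sampling."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x, y, z = diffuse((0.0, 0.0, 0.0), d_m2_per_s, dt_s, rng)
        total += x * x + y * y + z * z
    return math.sqrt(total / n_samples)


D_ASSUMED = 2.8e-9  # m^2/s, assumed diffusion coefficient (illustrative)
DT = 1.0e-9         # s, an arbitrary 1 ns step within the chemical stage

sampled = rms_displacement_nm(D_ASSUMED, DT)
expected = math.sqrt(6.0 * D_ASSUMED * DT) * 1.0e9
print(f"sampled {sampled:.2f} nm, analytic {expected:.2f} nm")
```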

1.8 Summary and Aims

The favourable dose depth profile of PBT makes it a desirable treatment modality, especially for cases where low dose to healthy tissues should be avoided, such as in paediatric treatments. However, the Bragg peak often goes unused for regions abutting OARs, due to uncertainty in the range and RBE at the distal edge. Phenomenological models are capable of predicting variable RBE with proton depth, though the experimental data used to fit these models contain a lot of noise. This noise limits the clinical confidence in the application of such models. Since there is no clinical evidence to suggest that a static RBE leads to negative effects in treatment there has been no push to implement these variable RBE models, particularly since the noise from fitting phenomenological models may lead to a bigger variance in true biologically equivalent dose compared to a static RBE. However, it has been proposed that no clinically relevant complications have been observed due to the caution in the placement of the distal edge.

Models exist to either describe experimental data or to predict experimental data. All of these models contain some description of the underlying mechanisms that result in cell death. The level of mechanistic description varies, but generally more mechanistic description improves the predictive power of the model. A balance can be attained between mechanistic description and predictive power. For example, the LEM scores local dose to sub-volumes of the cell nucleus, making predictions of the spatial distribution of DSBs. With this method it is possible to make fast predictions on cell death. Other models, such as PARTRAC, fully describe the action of physical interactions, radical interactions, and biological repair of damage. This approach is slow and computationally expensive. However, models such as this can make more in-depth predictions of biological outcomes, such as chromosome aberrations. The benefit of this may be in the prediction of normal tissue complications, which will be relevant to cells receiving low dose.

It is clear that predicting RBE ab initio is desirable, without a dependence on the experimental survival data. Ideally this would be through an in silico model, providing confidence through repeatability. The model should be capable of reproducing experimental data, providing sufficient agreement, so that it can be extrapolated to any treatment condition. This has been achieved by the LEM for carbon ion therapy.
This thesis aims to produce such a model, or at least the first part of a multi-scale model, where data can be used for further simulation of biological response. Developing such a model, and validating it, allows for further investigation of repair processes in isolation. The thesis aims are summarised as:

• Develop a track structure-based model to predict radiation induced DNA damage.
• Validate or verify the model through comparison to experimental data or predictions from other models.
• Combine the model predictions with a model of DNA repair.
• Identify elements of the DNA damage that influence the resulting repair.
• Link these elements to metrics that are readily scored in PBT.

This thesis is presented in the alternative format, where chapters are presented in publication format. Although separate, these chapters can be read as a common narrative describing the process of developing a model of radiation induced DNA damage and the clinically relevant predictions that are made. The aims outlined above are addressed in the following chapters. Chapters 2 and 3 present details on the development of the model, including considerations on the mechanisms of direct and indirect DNA damage. Chapter 4 presents the results obtained when the damage model is combined with a repair model. Chapter 5 presents details on a standard format for recording DNA damage from simulations such as the one presented in this thesis. Using the standard format, it was possible to easily compare predictions of the damage and repair to other models, presented in Chapter 6. Chapters 2 and 3 score DSB complexity and proximity respectively. These are scored as a function of proton LET and dose, where correlations are drawn from the detailed simulations. The correlations can be used to predict outcomes during treatment planning.

Beyond the scope of this thesis the project aims are to identify, and understand, aspects of physical and chemical processes that result in biological effect, and ultimately in cell death. With this understanding, and through a link to conventionally scored parameters in PBT, the model predictions can be incorporated into TPS. Here, it will be possible to optimise plans on desired biological effect rather than using dose as a surrogate.

2. Nanodosimetric Simulation of Direct Ion-Induced DNA Damage Using Different Chromatin Geometry Models¹

This work was published in Radiation Research in 2017 Vol. 188 (6) pages 690-703. The paper presented in this chapter has been modified to use British English and section numbering. This is the first publication from the thesis on the simulation of DNA damage. The main focus of the work is on assessing the impact of biological target geometry on damage predictions. The manuscript also details how the damage models work.

Conventionally, research on the track structure simulation of radiation induced DNA damage will begin with a model of the DNA. There are a number of assumptions, or simplifications, that can be made to limit the complexity of this model. For example, it is possible to assume a homogeneous distribution of DNA and sample energy deposition events. However, for complete damage reporting the model must include a model of the DNA backbone and DNA base. The chromatin fibre is a biologically relevant organisational level of DNA and is common to all eukaryotic cells. A number of simulation studies in the literature will start with a model of the chromatin fibre and then replicate the volume throughout a nucleus. This approach is usually taken since it is computationally efficient, though perhaps it lacks the detail that can be obtained from a fully organised model of nuclear DNA. For example, chromosome organisation may play an important role in biological outcomes such as aberration formation.

Within the literature it was noticed that simulation studies will often construct a chromatin fibre with a given geometry, commonly the solenoid model. However, the exact geometric structure of the chromatin fibre is still unknown. Simulation studies often overlooked this and had not considered the impact that a different chromatin geometry would have on their results. To address this, we simulated the direct DNA damage from protons and alpha particles with three models of the chromatin fibre and compared the outcomes.
This work shows that the yield and complexity of the directly induced DSBs is not strongly influenced by the chromatin model chosen. This work provides the justification that allows DNA damage simulators to construct their model with any of the chromatin fibres tested here.

1 © Radiation Research. Reproduced with permission of Radiation Research. All rights reserved.

Author Contributions

I developed the code to model chromatin fibre geometries in Geant4, and the scoring of DNA damage. I generated, analysed, and evaluated the data. I wrote the manuscript, which was reviewed by all authors.

Nanodosimetric Simulation of Direct Ion-Induced DNA Damage Using Different Chromatin Geometry Models

N T Henthorn1, ‡, J W Warmenhoven1, M Sotiropoulos1, R I Mackay2, K J Kirkby1, 3 and M J Merchant1, 3

1 Division of Molecular and Clinical Cancer Sciences, Faculty of Biology, Medicine and Health, University of Manchester, UK
2 Christie Medical Physics and Engineering, The Christie NHS Foundation Trust, Manchester, UK
3 The Christie NHS Foundation Trust, Manchester, UK
‡ Correspondence to: [email protected]

Monte Carlo based simulation has proven useful in investigating the effect of proton induced DNA damage, and the processes through which this damage occurs. Clustering of ionisations within a small volume can be related to DNA damage through the principles of nanodosimetry. For simulation, it is usual to construct a small volume of water and determine spatial clusters. More recently realistic DNA geometries are used, tracking energy depositions within DNA backbone volumes. Traditionally a chromatin fibre is built within the simulation and identically replicated throughout a cell nucleus, representing the cell in interphase. However, the in vivo geometry of the chromatin fibre is still unknown within the literature, with many proposed models. This work uses the Geant4-DNA toolkit to build three chromatin models: the solenoid, zig-zag, and cross-linked geometries. All fibres are built to the same chromatin density of 4.2 nucleosomes/11 nm. The fibres are then irradiated with protons, in the LET range 5 – 80 keV/μm, or alpha particles, in the LET range 63 – 226 keV/μm. Nanodosimetric parameters are scored for each fibre at every LET and used as a comparator between the models. Statistically significant differences are seen in the double strand break backbone size distributions between the models, though non-significant differences are seen between the nanodosimetric parameters. From the data presented in this work we conclude that selection of the solenoid, zig-zag, or cross-linked chromatin model does not significantly affect the calculated nanodosimetric parameters. This allows for a simulation-based cell model to make use of any of these chromatin models for the scoring of direct ion induced DNA damage.

2.1 Introduction

Damage to DNA through ionising radiation is accepted to be the main process by which radiotherapy kills cancer cells. The most toxic form of DNA damage is the Double Strand Break (DSB), conventionally defined as a break of the sugar-phosphate backbone on opposite strands of the helix, within a maximum separation of 1 helical turn, ≈10 base pairs (bp) (201,202). The lethality of a DSB is determined by its complexity, where the complexity is defined as the number of DNA lesions in close proximity, forming a cluster of damage (203,204). The higher the complexity of the DNA damage the more difficult it is for the cell to repair (205), and this has been shown experimentally through DNA repair kinetics (206). A single unrepaired DSB can be enough to cause cell cycle arrest (207), and depending on the genes affected could lead to cell death (208). The induction of complex DSBs is linked to the radiation type and depends strongly on the linear energy transfer (LET), which is a common descriptor of radiation quality for protons and heavier ions (209). Generally the complexity of induced DSBs increases with LET (210,211), due to the denser ionisation pattern, shown in experiments and modelling studies (212).

Consideration of ionisation pattern and attempts to link radiation track structures to biological outcomes has called for a new field of study, namely nanodosimetry. Nanodosimetry attempts to bridge the gap between the structure of the radiation track and dosimetric parameters or cellular outcomes (94), with the aim of optimising the biological effectiveness of ions for treatment. Nanodosimetry has a key role to play in gaining a better understanding of the physical processes that lead to observed values of proton relative biological effectiveness (RBE). The focus of nanodosimetry is primarily in scoring ionisation cluster sizes within a sensitive volume, traditionally filled with liquid water (93).
A link between ionisation cluster size and DNA damage is thought to exist, either through a simple process suggested by Grosswendt (213) or a combinatorial process suggested by Garty et al. (214). Grosswendt’s approach assumes that there is a direct link between ionisation cluster size and DNA damage; that is, a cluster size of 1 is equivalent to a Single Strand Break (SSB), while a cluster size of 2 or more is equivalent to a DSB. This approach fits with some of the earlier data collected by Brenner & Ward (215), who, by comparison of simulation to experiment, first showed that the ionisation cluster size correlates with DSB induction. The combinatorics approach (214) begins from a similar assumption to Grosswendt’s simple approach, assuming ionisation clusters can be related to DNA damage, but contains another layer of complexity. The combinatorial process states that the ionisations have a probability of causing a strand break and are randomly spread across either strand of the DNA double helix. The probability of the ionisation damaging the DNA is fitted to experimental data, ensuring an agreement between simulation and experimentally measurable outcomes. Only when the ionisation points are on opposite strands, and within a given separation, can a DSB be scored.

The basis of nanodosimetry relies on the measurement and scoring of ionisation cluster sizes and the subsequent conversion into biologically relevant parameters, such as SSBs and DSBs. Data for physical cluster size is readily available with the use of nanodosimeters such as ‘Jet Counter’ (99), ‘StarTrack’ (96), and ‘Ion Counter’ (97). With the use of nanodosimeters, ionisation cluster sizes can be measured within small volumes of gas, and by rescaling to the density of liquid water the ionisation pattern can subsequently be converted into equivalent DNA damage. Recently there has been a move towards direct conversion of ionisation cluster sizes to DNA damage through the use of Monte Carlo based track structure simulations. A review of these codes, with regards to predicting DNA damage, and the interaction models used was presented by Nikjoo et al. (169), though some of these codes have significantly progressed since then. For example, the recent extension to the ‘Geant4’ toolkit (179), under the ‘Geant4-DNA’ project (164,216), now has the capability to track electrons, in an event-by-event fashion, down to thermalisation.

The success of nanodosimetric simulations, for directly predicting DSBs, relies on the in silico model of the DNA. Recently Bueno et al. (217) compared a detailed continuous DNA geometry to a discontinuous simple geometry for the scoring of ionisation clusters, showing differences in absolute cluster frequency and cluster size. The work highlights the need for a model of DNA geometry.
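The two cluster-to-damage mappings and the conventional DSB criterion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of either cited approach: each ionisation is assumed to cause a break independently with probability `P_BREAK` (a hypothetical value, since the fitted probability is not given here), to fall on either strand with equal probability, and to lie within the allowed separation. The greedy pairing in `find_dsbs` is likewise an illustrative choice.

```python
def classify_simple(cluster_size):
    """Simple mapping: cluster size 1 -> SSB, 2 or more -> DSB."""
    if cluster_size < 1:
        return "none"
    return "SSB" if cluster_size == 1 else "DSB"


def p_dsb_combinatorial(cluster_size, p_break):
    """Probability of at least one strand break on each of the two strands.

    By inclusion-exclusion: 1 - P(no break on strand 1) - P(no break on
    strand 2) + P(no break on either strand).
    """
    n = cluster_size
    return 1.0 - 2.0 * (1.0 - p_break / 2.0) ** n + (1.0 - p_break) ** n


def find_dsbs(breaks, max_separation_bp=10):
    """Pair breaks on opposite strands lying within max_separation_bp.

    breaks is a list of (strand, bp_index) tuples with strand equal to 1 or 2.
    Each strand-2 break is used at most once (greedy, illustrative pairing).
    """
    strand1 = sorted(bp for strand, bp in breaks if strand == 1)
    strand2 = sorted(bp for strand, bp in breaks if strand == 2)
    used = set()
    dsbs = []
    for a in strand1:
        for j, b in enumerate(strand2):
            if j not in used and abs(a - b) <= max_separation_bp:
                used.add(j)
                dsbs.append((a, b))
                break
    return dsbs


P_BREAK = 0.2  # hypothetical per-ionisation break probability
for n in (1, 2, 5, 10):
    print(n, classify_simple(n), round(p_dsb_combinatorial(n, P_BREAK), 4))
print(find_dsbs([(1, 100), (2, 105), (1, 300), (2, 350)]))
```

Under the simple mapping every cluster of two ionisations is a DSB, whereas in the combinatorial sketch two ionisations that both cause breaks form a DSB only half of the time, because they must fall on opposite strands.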
A number of simulation codes have incorporated DNA models in order to predict damage; notably the well-established work of “PARTRAC” capable of building an atomistic model of the entire genome (109), an atomistic description of a short DNA segment in the “PDB4DNA” application with the use of Geant4-DNA particle tracking (218), a model of the entire genome in Geant4-DNA “wholeNuclearDNA” (219,220), and a combination of the “PDB4DNA” with “wholeNuclearDNA” giving an atomistic representation of a chromatin fibre in the Geant4-DNA environment (221). Most recently software under the name of “DNAFabric” (222) has been made available to assist in the building of complex DNA geometries, allowing for the direct import of an atomistic DNA model into Geant4. Similarly, the extensive work of the TOPAS-nBIO project, based around the Geant4 toolkit, offers users a simple interface to simulate radiation damage to complex cellular geometries (167).

Typically, the DNA model used in simulation software is built with a repeating sequence of nucleosomes, which are then geometrically arranged to form fibres of chromatin. For eukaryotic cells it is believed that a hierarchical compaction of DNA exists (223); from the wrapping of the DNA double helix around histones, forming nucleosomes (224), to the arrangement of nucleosomes into chromatin fibres (225). During cell division the fibres are looped, adding more compaction, to form chromosomes (224). However, the exact structure of the chromatin fibre in vivo is still unknown (226). Many models of chromatin structure have been suggested, based on ex vivo observations, and can be grouped into two main categories: “single-start helices” and “two-start helices” (226). An example of the “single-start helices” group is the solenoid model originally proposed by Finch and Klug (227), where nucleosomes are arranged in a simple helix parallel to the chromatin fibre axis. The “two-start helices” group can be represented by the model proposed by Woodcock et al. (228), where nucleosomes are arranged in a zig-zag parallel to the fibre axis. As well as the two main groups there is also the added complication of “cross-linked” models, where nucleosomes are connected perpendicular to the chromatin fibre axis (229). In vivo the situation may be more complex (230), with the suggestion that no orderly structure of chromatin exists (231).

With the vast number of proposed chromatin structures, it is worth considering the effect this may have on results from nanodosimetry simulations. For proton irradiation it has been shown, through simulation, that the DNA density has a noticeable effect on the number of predicted SSBs and DSBs (219), though this work did not consider chromatin structure. Work by Friedland et al. (232) has compared the effect of solenoid (single-start), zig-zag (two-start), cross-linked, and stochastic arrangements of nucleosomes when irradiated with photons and electrons within the PARTRAC code. Friedland et al. showed that SSB and DSB yields are similar between the fibre models; however, the distribution of DNA fragment sizes is characteristic of the fibre model used. The characteristic fragment sizes were also previously reported by Holley & Chatterjee (233); this work was later followed up by Rydberg et al. with an investigation into experimental DNA fragment sizes, showing results consistent with the zig-zag chromatin model (234).
This work investigates the effect of three chromatin geometries on ionisation cluster sizes, through nanodosimetric simulations, within the Geant4-DNA toolkit. The chosen chromatin models represent either ‘single-start’ or ‘two-start’ helices (226). Within the simulation a solenoid (227), zig-zag (228), or cross-linked solenoid (229) chromatin fibre is built. The fibre is then irradiated with proton or alpha particles, of various energies, and nanodosimetric parameters are scored. The parameters and cluster size distributions are used to compare differences between the chromatin geometries. We show the implications for cell models constructed with any of these chromatin fibre models, and assess expected differences in calculated DSBs through simulation.

2.2 Methods

2.2.1 Model of DNA and Chromatin Geometries

The basic repeating unit of the chromatin fibre is the nucleosome. Here we build, within the simulation, the nucleosome from three volumes: a cylinder representing the histone, a sphere representing a DNA base, and a sphere representing a unit of the sugar phosphate backbone. The DNA base sphere is representative of a generic base; no structural changes are modelled dependent on whether the base is Adenine, Thymine, Guanine, or Cytosine. The base sphere has a radius of 0.208 nm. The backbone sphere is designed as a combination of the sugar and phosphate group of the DNA backbone; it is built with a sphere of radius 0.240 nm. The radii of the base and backbone are chosen so as to follow the ratio of volumes reported by Nadassy et al. (235). The backbone and base volumes are arranged so as to create a double helix with a radius of 1 nm (110,236), in the conformation of B-DNA. The double helix has been set to complete a twist every 10.52 base pairs, with a rise per pair of 0.332 nm (236,237). There are three main conformations of the DNA double helix (A-, B-, and Z-DNA), with B-DNA the most prevalent under physiological conditions (238). The influence of DNA conformation on ion induced strand breaks has previously been investigated by Bernal et al. (239), showing the greatest DSB yield for A-DNA and lowest DSB yield for Z-DNA.

The histone is modelled as a cylinder with radius 3.3 nm, and height 5.7 nm. Within the simulation the double helix is wrapped around the histone with 1.65 left-handed turns (240). This forms the nucleosome, which has a combined (histone and DNA) diameter of 11 nm, and a height of 5.7 nm (241). In our model one nucleosome has 138 bp, slightly lower than the reported 145-147 bp (242). The nucleosome is then repeated with a given geometry to build the chromatin fibre, with nucleosomes connected by linker DNA (240,243).
No linker histone (H1) (244) is built; instead the DNA from one nucleosome connects directly to the next, following a Bézier curve to ensure smooth connection of linker ends to histone DNA. The chosen resolution at which we build the DNA double helix allows us to construct a length of DNA with discrete volumes; this is important when tracking in which backbone volume an ionisation has occurred (see Section 2.2.3). The diameter of the ex vivo chromatin fibre is said to be around 30 nm, though variation between around 20-45 nm has been seen depending on experimental conditions (230). The fibres built in this work slightly exceed the 30 nm diameter.

The solenoid geometry is based on the original proposition by Finch and Klug (227) (Figure 2.1a); here a fibre is built as a simple left-handed helix of nucleosomes. The fibre has a diameter of 37 nm, and is set to a length of 161 nm (217,219). The helix of nucleosomes is built around the central axis of the fibre and set to repeat every 6 nucleosomes, i.e. 6 nucleosomes per turn. The fibre is set to build 61 nucleosomes, with 10.8 kbp of DNA. The fibre has a density of 4.2 nucleosomes/11 nm. The number of nucleosomes/11 nm is a measure of chromatin density, with values between 6-7 generally accepted (226), though a range has been seen depending on salt concentration (228), including values up to 8 nucleosomes/11 nm at physiological concentrations of salt (230). A value of 4.2 nucleosomes/11 nm corresponds to a relatively loose fibre.

The geometry of the zig-zag model is taken from work by Woodcock et al. (228), shown in Figure 2.1b. This is a representation of a ‘two-start’ helix (226). Again, the nucleosomes are built around the central axis of the fibre, but unlike the solenoid model a nucleosome is not connected to the next nucleosome in the helical path. Instead two helices of nucleosomes are created around the fibre’s central axis. These helices are connected at every placement of a nucleosome, i.e. when a nucleosome is placed a ‘sister’ nucleosome is placed with an offset in the positive direction of the fibre’s central axis, with a slight progression around the fibre’s circumference. This ‘sister’ nucleosome then connects back into the original helix with a negative offset in the fibre direction. The repeating geometry gives rise to a zig-zag pattern of the nucleosomes along the central axis. With this geometry the fibre has a slightly larger diameter, of 40 nm, but is still set to a length of 161 nm. This geometry is also set for the construction of 61 nucleosomes, with 10.5 kbp of DNA. The zig-zag fibre has a density of 4.2 nucleosomes/11 nm, maintaining consistency in chromatin density between the models.

The cross-linked model follows the same geometry as the solenoid, previously described, and is shown in Figure 2.1c. In this case, however, the nucleosomes are connected across the central axis of the chromatin fibre, instead of around it. The distance between nucleosomes across the fibre is considerably larger than the distance between nucleosomes in the solenoid model, requiring more linker DNA to be built.
In this cross-linked model the nucleosomes do not follow a continuous progression along the fibre central axis as they progress around the circumference of the fibre. Instead a nucleosome is placed and then the opposite nucleosome is placed with a slight offset along the central axis (positive z direction). For a fibre of 37 nm diameter and 161 nm length, 61 nucleosomes are built, with a density of 4.2 nucleosomes/11 nm. In the cross-linked model there is a considerably larger amount of DNA packed into the same volume, with 12.4 kbp built, due to the presence of more linker DNA.
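The chromatin density quoted for each model can be checked directly from the nucleosome count and the fibre length. The short Python sketch below does this for the values given in the text; the function name is illustrative only and is not part of the simulation code.

```python
def nucleosomes_per_11nm(n_nucleosomes, fibre_length_nm):
    """Chromatin density expressed as nucleosomes per 11 nm of fibre length."""
    return n_nucleosomes / fibre_length_nm * 11.0


# Values quoted in the text: 61 nucleosomes built along a 161 nm fibre.
density = nucleosomes_per_11nm(61, 161.0)
print(round(density, 1))  # 4.2
```

Because all three models use the same nucleosome count and fibre length, they share the same density by construction; only the amount of linker DNA differs.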


Figure 2.1 The three chromatin models implemented within the simulation. Histone colour alternates every other histone. a) The solenoid model, with 61 nucleosomes and 10.8 kbp of DNA. b) The zig-zag model, with 61 nucleosomes and 10.5 kbp of DNA. c) The cross-linked solenoid model, with 61 nucleosomes and 12.4 kbp of DNA.

2.2.2 Track Structure Simulation and Irradiation Details

The Geant4-10-02 toolkit, with the Geant4-DNA extension (164), is used to simulate the transport and interactions of protons and alpha particles with the chromatin geometry described, using the option 4 DNA physics list (G4EmDNAPhysics_option4). The option 4 physics list, described by Kyriakou et al. (197,245), contains an improved implementation of the Emfietzoglou model for the dielectric response function of liquid water. The improvements result in much better agreement of the simulated water W-value to other track structure codes, as well as a reduction in very low energy (<1 keV) electron range, shown through simulation of dose point kernels (197). The LET of the incident particle is calculated separately, using the Geant4-DNA default physics list (G4EmDNAPhysics), by tracking the energy loss of the primary particle across a cubic water volume. Here the sides of the volume are set to the same length as the chromatin diameter (≈30 nm). Protons in the energy range 0.1-10 MeV (80.3 – 5.3 keV/μm) and alpha particles in the energy range 1-7.5 MeV (226.0 – 62.9 keV/μm) are used in this work.

Initially the proton or alpha particle is randomly placed on the surface of the chromatin fibre; the direction of the particle is then randomly selected. This mimics the experimental procedure of cellular irradiation, where the orientation of the chromatin fibre, within the cell nucleus, relative to the incident particle is random. This is the same irradiation methodology employed by Bueno et al. (217). A total of 10⁶ primary particles are simulated for each LET across the range used, ensuring good statistics for nanodosimetric characterisation. Analysis is carried out during a ‘run’ of 10⁶ primary particles of a given energy, treating events independently (see Section 2.2.3). Here we refer to an event as a single primary particle and any secondary particles produced along its track.

Only direct effects are considered in this work; we currently do not consider the effect of free radical production and its relation to DNA damage. This is a limitation of the current work as it does not fully describe the radiation induced DNA damage. For the proton and alpha particle energies used in this work Nikjoo et al. (246) predict that over 60% of the induced DSBs are attributable to direct effects. 
Therefore, it is not expected that the reported impact of chromatin fibre geometry on DNA damage would be significantly different after the inclusion of DNA damage from indirect effects. However, as 4D track structure codes become more widely available, and in particular DNA free radical interaction models develop, this work can be extended to fully encompass indirect effects.
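The irradiation set-up described above (a random start position on the fibre surface followed by a random direction) can be sketched as follows. This is an illustration under stated assumptions, not the Geant4 source definition used in this work: the fibre is approximated as a closed cylinder centred on the origin, the start point is drawn uniformly in area over the side and end caps, and the direction is drawn isotropically. The function names are hypothetical.

```python
import math
import random


def sample_point_on_cylinder(radius, length, rng=random):
    """Uniformly sample a point on the closed surface of a cylinder.

    The axis lies along z and the cylinder is centred on the origin. The side
    and the two end caps are chosen with probability proportional to area.
    """
    side_area = 2.0 * math.pi * radius * length
    cap_area = math.pi * radius ** 2
    u = rng.random() * (side_area + 2.0 * cap_area)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    if u < side_area:
        z = rng.uniform(-length / 2.0, length / 2.0)
        return (radius * math.cos(phi), radius * math.sin(phi), z)
    # End cap: the square root gives a uniform density over the disc area.
    r = radius * math.sqrt(rng.random())
    z = length / 2.0 if u < side_area + cap_area else -length / 2.0
    return (r * math.cos(phi), r * math.sin(phi), z)


def sample_isotropic_direction(rng=random):
    """Sample a unit vector uniformly over the full sphere (assumed here)."""
    cos_theta = rng.uniform(-1.0, 1.0)
    sin_theta = math.sqrt(1.0 - cos_theta ** 2)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    return (sin_theta * math.cos(phi), sin_theta * math.sin(phi), cos_theta)


# Solenoid fibre dimensions from Section 2.2.1: 37 nm diameter, 161 nm length.
rng = random.Random(1)
start = sample_point_on_cylinder(18.5, 161.0, rng)
direction = sample_isotropic_direction(rng)
```

With a fully isotropic direction roughly half of the primaries point away from the fibre; whether the simulation restricts directions to those entering the fibre is not stated in the text, so that choice is left open here.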

2.2.3 Scoring of Clusters

An ionisation occurring within a ‘backbone’ volume is recorded and stored by the simulation, where each backbone volume has a unique ID number. The unique ID is used as a surrogate for bp number, with a pair of backbones sharing the same ID differentiated in this case by the strand number. Following an event, a single primary with associated secondaries, the collection of ionisations is processed. Ionisations occurring in the same volume are combined by summing the energy and averaging the x, y, z position of the ionisations. This work follows the assumption that one ionisation is enough to break the backbone volume. The processed ionisations are then analysed by a clustering algorithm, where the raw data includes details of the volume in which the ionisation occurred, such as the bp number, the strand, the energy deposited, and the x, y, z position of the ionisation.

The clustering algorithm analyses the set of data collected from an event during the track structure simulation. The algorithm checks for potential clusters, given the conditions that the ionisations must be within backbone volumes, on opposite strands, and within 10 bp. The 10 bp separation constraint acts on the ID number of the backbone, rather than a spatial separation. The algorithm determines the number of clusters created by an event, and the number of backbone volumes included in each cluster. A cluster size frequency distribution, for the single event, is returned and stored. The single event distribution is added to a central store, containing the cluster size frequency distributions of all events processed up until that point. Following the complete simulation of all events the probability distribution function (PDF) is calculated from the frequency distributions. Two PDFs are created: one for the number of clusters created by an event, and one for the size of the cluster, calculated from the entire population of events simulated. Primary events that did not create a record in the data are said to have caused no clusters, and therefore a cluster size of 0. We assume a direct link between cluster size and DNA damage, with a cluster size of 1 equivalent to a SSB, and a cluster size of 2 or more equivalent to a DSB. Cluster sizes larger than 2 can be further classified as complex DSBs, or simply larger DSBs.
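The scoring conditions above can be illustrated with a short Python sketch. It is not the thesis implementation: it assumes that damaged backbones are chained together whenever consecutive bp IDs differ by no more than 10, that a chain with damage on both strands counts as one cluster whose size is the number of backbones in the chain, and that same-strand chains are scored as isolated breaks of size 1. The function names and the example event are hypothetical.

```python
from collections import Counter

MAX_SEPARATION_BP = 10  # separation limit quoted in the text


def _score(group, sizes):
    """Add one chain of nearby damaged backbones to the size counter."""
    strands = {strand for _, strand in group}
    if len(strands) == 2:
        sizes[len(group)] += 1  # damage on both strands: one cluster
    else:
        sizes[1] += len(group)  # single strand only: isolated breaks


def cluster_sizes(damaged_backbones, max_sep=MAX_SEPARATION_BP):
    """Return a Counter {cluster size: frequency} for a single event.

    damaged_backbones: iterable of (bp_id, strand) pairs, with strand 1 or 2.
    Separation is measured on the bp ID, not on spatial distance.
    """
    sites = sorted(set(damaged_backbones))
    sizes = Counter()
    group = []
    for site in sites:
        if group and site[0] - group[-1][0] > max_sep:
            _score(group, sizes)
            group = []
        group.append(site)
    if group:
        _score(group, sizes)
    return sizes


# Hypothetical event: three well-separated damage sites along the fibre.
event = [(100, 1), (104, 2), (250, 1), (400, 2), (405, 2), (407, 1)]
print(cluster_sizes(event))
```

An event that produces no damaged backbones returns an empty counter; under the convention in the text the caller would record this as a cluster size of 0.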

2.2.4 Nanodosimetric Parameters

The nanodosimetric parameters considered in this work are calculated from the cluster size PDF, which we refer to here as the “damaged backbone cluster size distribution” (DBCSD). Here we define the DBCSD as P(Q, ν), where Q refers to the radiation quality and ν to the cluster size. P(Q, ν) fulfils the normal probability conditions given by Equation 2.1:

$$\sum_{\nu=0}^{\nu_{max}} P(Q, \nu) = 1 \tag{2.1}$$

From P(Q, ν) the average (mean) cluster size created by a primary, M1, is calculated. For DSB induction we consider a minimum cluster size ν=2; as such we calculate M1 with a lower limit of 2. The distribution restricted to ν≥2 is otherwise referred to as the conditional DBCSD, and the conditional average is denoted by the lower case m1. This gives an indication of the average DSB complexity for a given Q. m1 is defined here by Equation 2.2, where P(Q, ν) has been renormalized:

$$m_1 = \frac{1}{\sum_{\nu=2}^{\nu_{max}} P(Q, \nu)} \sum_{\nu=2}^{\nu_{max}} \nu \, P(Q, \nu) \tag{2.2}$$

Also of interest is the relative cumulative distribution of cluster sizes with ν≥2, giving the probability of a primary causing a DSB of any complexity, F2. More complex DSBs, with ν≥3, can also be calculated from the DBCSD, F3. F2 and F3 are defined by Equations 2.3 and 2.4:

$$F_2 = \sum_{\nu=2}^{\nu_{max}} P(Q, \nu) \tag{2.3}$$

$$F_3 = \sum_{\nu=3}^{\nu_{max}} P(Q, \nu) = F_2 - P(Q, 2) \tag{2.4}$$

The final parameter of interest is the ratio of SSBs to DSBs. Here the probability of an event causing a SSB is compared to the F2 parameter, showing the LET dependency. The ratio is defined by Equation 2.5:

$$\frac{SSB}{DSB} = \frac{P(Q, 1)}{F_2} \tag{2.5}$$
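The definitions of m1, F2, F3, and the SSB to DSB ratio translate directly into a few lines of Python. The sketch below assumes the DBCSD is held as a dictionary mapping cluster size ν to probability; the function name and the example distribution are hypothetical and are not simulation output.

```python
def nanodosimetric_parameters(pdf):
    """Compute m1, F2, F3 and SSB/DSB from a DBCSD.

    pdf: dict mapping cluster size nu -> P(Q, nu), with values summing to 1.
    """
    f2 = sum(p for nu, p in pdf.items() if nu >= 2)
    f3 = sum(p for nu, p in pdf.items() if nu >= 3)
    if f2 > 0:
        # Conditional mean: the nu >= 2 part of the PDF renormalised by F2.
        m1 = sum(nu * p for nu, p in pdf.items() if nu >= 2) / f2
        ssb_dsb = pdf.get(1, 0.0) / f2
    else:
        m1 = float("nan")
        ssb_dsb = float("inf")
    return {"m1": m1, "F2": f2, "F3": f3, "SSB/DSB": ssb_dsb}


# Hypothetical distribution, for illustration only.
example = {0: 0.90, 1: 0.08, 2: 0.015, 3: 0.004, 4: 0.001}
print(nanodosimetric_parameters(example))
```

Note that the renormalisation factor in the conditional mean is simply F2, so m1 is the mean cluster size among those events that produced a DSB.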

Definitions of nanodosimetric parameters can be found in many published works, for example see Lazarakis et al. (247) and Alexander et al. (248). The nanodosimetric parameters are calculated from the DBCSD, this can be following the simulation of every event or in an online fashion. For online calculation a temporary parameter is calculated from the single event DBCSD, with the final parameter calculated from every temporary parameter, following the Welford algorithm (249). Online calculation offers a simple method for determining the variance of a parameter without storing every temporary parameter.
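For reference, the Welford update can be written in a few lines. The sketch below is a generic implementation of the algorithm, not the code used in this work; the class name and the example values are hypothetical.

```python
class RunningStats:
    """Welford's online algorithm for the running mean and variance."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new per-event value into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance (n - 1 in the denominator)."""
        return self._m2 / (self.n - 1) if self.n > 1 else float("nan")

    @property
    def standard_error(self):
        """Standard error in the mean."""
        return (self.variance / self.n) ** 0.5 if self.n > 1 else float("nan")


stats = RunningStats()
for value in [2, 2, 3, 2, 4]:  # hypothetical per-event parameter values
    stats.update(value)
print(stats.mean, stats.variance, stats.standard_error)
```

Only three numbers are stored regardless of how many events are processed, which is the practical advantage for runs of 10⁶ primaries.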

2.3 Results

2.3.1 Damaged Backbone Cluster Size Distribution

The clustering algorithm, as described in Section 2.2.3, determines the DBCSD. This distribution gives the probability of a particle, with a given initial energy, creating a cluster of a specified size. Distributions for the three chromatin models are presented in Figure 2.2. Here the distributions for the maximum and minimum particle energy used in the simulations are shown: protons at 0.1 MeV (80.3 keV/μm), protons at 10 MeV (5.3 keV/μm), alphas at 1 MeV (226.0 keV/μm), and alphas at 7.5 MeV (62.9 keV/μm). The cluster sizes, ν, are shown from a minimum of 0 up to a maximum of 10. The algorithm determines the maximum cluster size, which in high LET cases can exceed ν=10. The data is presented on a log-linear scale to show the relatively low probability of larger cluster sizes compared to small cluster sizes.

Figure 2.2 The probability distribution functions for the three chromatin models used, where Q refers to radiation quality and ν refers to cluster size. Distributions shown for the maximum and minimum particle energy used within the simulation.

For the range of LET investigated a large probability is observed for cluster size of 0 (ν=0) and a much smaller probability for bigger clusters. The probability for bigger clusters increases for the higher LET particles; for example, an inspection of DBCSDs of 10 MeV protons (5.3 keV/μm) and 0.1 MeV protons (80.3 keV/μm) shows maximum cluster size increases from 4 to 7. Little difference can be seen in the DBCSD between the three chromatin models. Statistical comparison of the frequency DBCSDs was performed using the two-sample Kolmogorov-Smirnov (KS) test. Here the DBCSD of each chromatin model was compared to the DBCSD of the solenoid model, at each particle type and LET. Sample sizes, for the KS test, were on the order of 10⁶. Statistically significant differences are observed for the majority of LETs investigated, i.e. the DBCSD for the solenoid model (at a given LET) differs, with p<0.05, from that of the zig-zag or cross-linked model at the same LET. The difference between the 10 MeV proton DBCSD for the cross-linked model and solenoid model was not statistically significant. Nor were the 3, 4, and 10 MeV proton DBCSDs for the zig-zag model when compared to the solenoid model. Although the differences between DBCSDs are statistically significant in most cases, the effect on nanodosimetry is small, as seen in the following sections.
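The sensitivity of the KS test to sample size can be seen in a small sketch. The Python code below computes the two-sample KS statistic from two cluster size frequency tables and an approximate p-value from the asymptotic Kolmogorov distribution. It is an illustration, not the statistical software used in this work: the asymptotic formula assumes continuous data and tends to be conservative for heavily tied, discrete data such as cluster sizes. The function name is hypothetical.

```python
import math


def ks_two_sample(freq_a, freq_b):
    """Two-sample KS statistic and asymptotic p-value from frequency tables.

    freq_a, freq_b: dicts mapping cluster size -> number of events.
    """
    n_a, n_b = sum(freq_a.values()), sum(freq_b.values())
    cdf_a = cdf_b = d_stat = 0.0
    for nu in sorted(set(freq_a) | set(freq_b)):
        cdf_a += freq_a.get(nu, 0) / n_a
        cdf_b += freq_b.get(nu, 0) / n_b
        d_stat = max(d_stat, abs(cdf_a - cdf_b))
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d_stat
    if lam < 0.2:
        # The series below converges slowly here; the p-value is ~1.
        return d_stat, 1.0
    # Kolmogorov series: Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 lam^2)
    series = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
        for k in range(1, 101)
    )
    return d_stat, min(max(2.0 * series, 0.0), 1.0)
```

For two distributions whose cumulative probabilities differ by at most 0.005, this sketch gives a p-value close to 1 with 10³ events per sample but a p-value far below 0.05 with 10⁶ events per sample, which is consistent with small differences between chromatin models becoming statistically significant at the sample sizes used here.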

2.3.2 Conditional Average Cluster Size (m1)

The average cluster size, m1, is calculated from the conditional DBCSDs. Here the minimum cluster size is set to ν=2. This gives an indication of the change in complexity of induced DSBs. The m1 value for the three chromatin models across the range of LET is shown in Figure 2.3:

Figure 2.3 The conditional average cluster size for the three chromatin models across a range of LET. Protons shown as closed symbols and alpha particles shown as open symbols. Error bars represent the standard error in the mean. Lines show exponential fits.

An increase in m1 with LET is observed for each of the three chromatin models, following an exponential fit of the form m1 = A exp(B LET). This relationship between m1 and LET is expected when considering the track structures, i.e. as LET increases the ionisation events become spatially closer giving rise to larger clusters. Note there is a discontinuity between protons and alpha particles seen here. For a given LET an alpha particle produces smaller clusters than a proton. Similar results have been seen elsewhere (217), including the exponential relation between LET and m1, and discontinuity between proton and alpha particles. The fitting parameters used, and associated goodness of fit (reduced χ²), are shown in Table 2.1.


| Model | Protons: A ± % | Protons: (B × 10⁻³) ± % | Protons: χ² × 10⁻⁴ | Alphas: A ± % | Alphas: (B × 10⁻⁴) ± % | Alphas: χ² × 10⁻⁴ |
|---|---|---|---|---|---|---|
| Solenoid | 2.050 ± 0.2 | 1.02 ± 5.9 | 5.6 | 2.035 ± 0.6 | 7.97 ± 5.1 | 4.6 |
| Zig-Zag | 2.034 ± 0.2 | 1.18 ± 5.9 | 7.5 | 2.035 ± 0.5 | 8.04 ± 4.3 | 3.3 |
| Cross-Linked | 2.052 ± 0.3 | 0.98 ± 10 | 0.2 | 2.029 ± 0.5 | 8.02 ± 4.3 | 3.4 |

Table 2.1 Fitting parameters used for conditional average cluster size (m1), where m1 = A exp(B LET). Uncertainties in the fitting parameters are calculated as the asymptotic standard error, and are expressed as a percentage error of the parameter.

The data presented in Figure 2.3 shows little difference between the chromatin models used, which can also be seen by the similar fitting constants (Table 2.1). A comparison of the distribution of m1 is presented in Section 2.4.1.

2.3.3 Relative Cumulative Distributions (F2 and F3)

To determine the probability of a primary particle creating a DSB the relative cumulative probability is calculated, defined by Equations 2.3 and 2.4. The probability of a particle creating a cluster size of 2 or more is summed, giving the probability of inducing a DSB with any complexity, F2. This is shown for the three models in Figure 2.4:

Figure 2.4 The relative cumulative probability for cluster sizes of 2 or more for the three chromatin models at various LETs. Protons shown as closed symbols and alpha particles shown as open symbols. Error bars represent the standard error in the mean. Lines show power law fits.

The probability of creating a DSB increases with LET following a power law of the form F2 = A LET^B. All three models produce similar F2 parameters across the LET range investigated, though the cross-linked chromatin model leads to slightly higher F2, with the solenoid and zig-zag models producing similar F2 values.

The F2 parameter has a limiting value of 1, where every primary leads to a cluster of ν≥2. For the case described in this work, however, it is unlikely that F2 would reach this saturation point. This has been neglected in the fitted power laws, which do exceed 1 at LET values around 640 keV/μm. The complexity of DSBs can be further categorized by setting a higher value on the lower limit of the cumulative frequency. For example, we investigate complex DSBs by considering clusters of ν≥3, F3, presented in Figure 2.5:

Figure 2.5 The relative cumulative probability for cluster sizes of 3 or more for the three chromatin models at various LETs. Protons shown as closed symbols and alpha particles shown as open symbols. Error bars represent the standard error in the mean. Lines show power law fits.

A similar pattern is seen as for F2, an increase in F3 with LET and a slightly higher probability for ν≥3 in the cross-linked chromatin model, relative to solenoid or zig-zag. A power law has been fitted to the observed values, of the same form as for F2. Table 2.2 shows the fitting parameters used for F2 and F3, and associated goodness of fit (reduced χ2).

| Model | Protons: (A × 10⁻⁷) ± % | Protons: B ± % | Protons: χ² × 10⁻⁵ | Alphas: (A × 10⁻⁷) ± % | Alphas: B ± % | Alphas: χ² × 10⁻⁵ |
|---|---|---|---|---|---|---|
| Solenoid (F2) | 0.43 ± 14 | 1.90 ± 1.9 | 1.5 | 0.87 ± 17 | 1.63 ± 2.0 | 2.7 |
| Zig-Zag (F2) | 0.50 ± 15 | 1.87 ± 1.9 | 1.7 | 0.91 ± 11 | 1.62 ± 1.2 | 1.1 |
| Cross-Linked (F2) | 0.49 ± 18 | 1.89 ± 2.4 | 2.9 | 0.12 ± 11 | 1.58 ± 1.4 | 1.5 |
| Solenoid (F3) | 0.42 ± 22 | 2.58 ± 2.0 | 0.3 | 0.33 ± 51 | 2.45 ± 3.9 | 0.4 |
| Zig-Zag (F3) | 0.38 ± 17 | 2.62 ± 1.6 | 0.2 | 0.36 ± 48 | 2.44 ± 3.7 | 0.4 |
| Cross-Linked (F3) | 0.36 ± 34 | 2.65 ± 3.0 | 0.9 | 0.48 ± 44 | 2.39 ± 3.5 | 0.4 |

Table 2.2 Fitting parameters used for F2 and F3 parameters, where F2 = A LET^B and F3 = A LET^B. Uncertainties in the fitting parameters are calculated as the asymptotic standard error, and are expressed as a percentage error of the parameter.

2.3.4 SSB to DSB Ratio

The ratio of SSBs to DSBs is of biological interest, defined by Equation 2.5. For each LET investigated the probability of producing a cluster with one backbone (ν=1) is divided by the F2 parameter. Figure 2.6 shows how this ratio changes with LET:

Figure 2.6 The SSB to DSB ratio of the three chromatin models for various LETs. Protons shown as closed symbols and alpha particles shown as open symbols.

A similar trend is seen across all three models. Higher LET results in more ionised backbones within a closer proximity, causing higher yields of DSBs, and subsequently lower yields of SSBs. This effect results in not only the SSB to DSB ratio decreasing with LET, but decreasing at an exponential rate. This equates to the induction of more lethal damage at higher LET.


2.3.5 The Effect of Backbone Size

Currently we have modelled the chromatin geometries with spherical approximations of DNA backbone and base volumes, with radii of 0.240 nm and 0.208 nm respectively. The choice of backbone size here is somewhat arbitrary, being set to a maximum before overlaps between successive backbones occur. It does, however, allow for a model to be built with discrete volumes and assigned bp numbers. This is important for the clustering algorithm, as exactly 10 bp separations between ionisations can be measured, as opposed to approximation through spatial separation used in existing models. We investigate the sensitivity of the model by changing the size of DNA backbone and base volumes within the simulation. This is done for a single test case of the solenoid chromatin geometry with mono-energetic protons at 1 MeV (28.7 keV/μm), though similar trends are observed for the other chromatin models. Restrictions are applied to the size variation; namely the backbone volume must be larger than the base volume (whilst maintaining the same ratio), and the backbone radius cannot exceed 0.24 nm. Selected backbone radii for testing are 0.24, 0.19, 0.14, and 0.09 nm. At each radius the fibre is irradiated with 10⁶ 1 MeV protons. Nanodosimetric parameters m1, F2, F3, and the SSB to DSB ratio are calculated, shown in Figure 2.7.


Figure 2.7 Values of m1, F2, F3, and the SSB to DSB ratio as a function of backbone volume size, for the test case of the solenoid chromatin fibre irradiated with 10⁶ 1 MeV protons. Error bars represent the standard error in the mean.

Figure 2.7 shows a non-linear relationship between backbone volume and parameters F2, F3, and the SSB to DSB ratio, with a linear relationship observed between backbone volume and the m1 parameter. Currently within the simulation the backbone volume is limited to 0.058 nm³, with a spherical radius of 0.24 nm, in order to avoid geometry overlaps. The non-linear relationship between backbone volume and nanodosimetric parameters highlights the sensitivity of the model, where increasing volume has a dramatic effect on all nanodosimetric parameters measured.
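The backbone volumes corresponding to the tested radii follow from the volume of a sphere. The short Python sketch below reproduces the 0.058 nm³ limit quoted above; the function name is illustrative only.

```python
import math


def sphere_volume_nm3(radius_nm):
    """Volume of a spherical backbone approximation, in cubic nanometres."""
    return 4.0 / 3.0 * math.pi * radius_nm ** 3


for r in (0.24, 0.19, 0.14, 0.09):  # backbone radii tested in Section 2.3.5
    print(f"r = {r:.2f} nm -> V = {sphere_volume_nm3(r):.4f} nm^3")
```

Because volume scales with the cube of the radius, the modest range of radii tested spans more than an order of magnitude in backbone volume.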

2.4 Discussion

2.4.1 Chromatin Model Comparison

Comparison of the DBCSD between the three chromatin models showed a statistically significant difference. This was in most part due to the large number of events used in generating the DBCSD, with 10⁶ particles simulated at each energy. In all cases, however, similar maximum cluster sizes are observed between the three chromatin models, and a similar shape is seen in the DBCSDs. To quantify any meaningful differences between the models the nanodosimetric parameters have to be compared. The parameters are presented earlier as Figures 2.3, 2.4, 2.5, and 2.6. Here we see little difference in m1, F2, F3, and the SSB to DSB ratio between the models, especially when considering the solenoid and zig-zag chromatin geometries. There is a tendency to see slightly higher m1, F2, and F3 parameters in the cross-linked geometry relative to the other geometries. This is likely explained by the higher concentration of DNA in the cross-linked model, with 12.4 kbp compared to 10.8 kbp and 10.5 kbp in the solenoid and zig-zag models respectively.

Relative differences in nanodosimetric parameters can be calculated between the models for all LETs investigated. Here the difference between a parameter for the solenoid model and either the zig-zag or cross-linked model, at a given LET, is determined as a percentage of the value for the solenoid model. The difference is then averaged across the LET range, but kept separate for the different models. For all parameters small variations are seen when compared to the solenoid model, especially at higher LET. For the m1 parameter an average difference of 0.39% is observed between the solenoid and zig-zag model, and 0.35% between the solenoid and cross-linked model. Larger differences are observed for the F2 and F3 parameters, with a difference of 6.4% and 11.7% in F2 for the zig-zag and cross-linked models respectively. Differences in F3 of 8.7% and 17.6% are seen for the zig-zag and cross-linked models respectively. Small differences are seen in the SSB to DSB ratio, with percentage differences of 3.3% and 3.4% for the zig-zag and cross-linked model respectively.

To further compare the models, it is possible to compare the distribution of nanodosimetric parameters. Here we consider the differences in the conditional average cluster size, m1, between the three models. As mentioned previously (Section 2.2.4), m1 can be calculated with an event-by-event method. Here an m1 is calculated from the DBCSD of each individual primary and the final m1 parameter is simply the average of every individual parameter. The distribution of m1 has a similar shape to the DBCSD, with a considerably higher frequency of m1=2 relative to larger values of m1. Comparison of the distribution of the m1 parameter revealed no statistically significant differences between the chromatin models, with p>0.05 in all cases. Furthermore, inspection of Figure 2.3 shows only very small differences in the final m1 parameters of each model across the LET range investigated. This leads us to believe that there is no significant difference in m1 between the solenoid, zig-zag, or cross-linked chromatin models.

2.4.2 Damage Complexity

Currently, in this work, complexity is only considered for two cases, ν≥2 and ν≥3, though biologically relevant complexity classification requires many more factors. A myriad of combinations exist in the formation of a DSB; for example, simple breaks involving only 2 backbones, base damage in conjunction with backbone damage, and multiple backbone damage to name but a few. It has been shown, in a number of studies, that the complexity of a DSB has an effect on the DNA damage response (250–253). Scoring of DNA damage is an active field of research, both historically (254,255) and more recently (256), though little has been done in fully classifying DSBs with a focus on the relationship to lethality or biological outcomes. Watanabe et al. (257) have made recent efforts in categorising the spectrum of DNA damage induced by ionising radiation, with consideration to the effect on the DNA damage response. Following the classification convention of Watanabe et al., F2 represents the probability of a particle creating a DSB with a minimum complexity and above, whilst F3 represents the probability of creating a DSB+ and above. The work presented here does not report the effect of base damage. Although incorporated into the simulation, base damage does not contribute to the strand breaks and as such is omitted from calculation of cluster sizes. It would however be necessary for any biologically relevant classification of damage.

The SSB to DSB ratio, shown in Figure 2.6, has been reported widely in the literature for proton irradiation. The ratio, with respect to LET or initial proton energy, has been measured experimentally through plasmid irradiation (258–260) and calculated through simulation (109,246,257,261–264). In general, it is observed that the SSB to DSB ratio decreases with increasing LET, due to closer proximity of ionisation events. Where possible, values of initial proton energy and ratio of SSBs to DSBs were taken from the referenced literature and presented alongside data from this work (Figure 2.8). The values of the SSB to DSB ratio from plasmid studies are omitted from Figure 2.8 due to differences in geometry between DNA plasmids and the chromatin fibre. Values of LET quoted in the literature varied depending on the method used for calculation, and occasionally were not given; due to these inconsistencies the initial proton energy was used instead.


Figure 2.8 The SSB to DSB ratio for the three chromatin models investigated in this work with a fitted power law, 1σ uncertainty between fit to this work and values shown. Data from other studies are shown with a fitted power law equation, and 1σ uncertainty calculated between literature values and fitted power law. Showing an overestimation of the SSB to DSB ratio we predict compared to literature values.

In Figure 2.8 a power law is fitted of the form SSB/DSB = A E^B, where E is the proton energy (MeV). For the literature data, values of the constants A and B have been derived by the weighted average of the individual A and B from each data set, weighted according to the relative number of data points in the set. In the same way a standard deviation is calculated based on the difference between the literature data and the fit with the average A and B constants. For data from the literature A=10.5 and B=0.32; for this work A=27.1 and B=0.40. From Figure 2.8 it can be seen that the nanodosimetric model presented here overestimates the SSB to DSB ratio compared to other published values. There are a few possible explanations for this. This work does not consider free radical production and interaction with the DNA; most of the quoted literature studies do this in some way, either through particle tracking or analytical equations. Incorporation of indirect damage would lead to an increase in backbone damage. This increase has the potential to link some of the direct SSBs, converting them into DSBs. This would result in a reduction of the SSB to DSB ratio. The DNA model used here is rather simplistic, due to the need to retain discrete volumes. Here the backbone is constructed as a sphere with volume 0.058 nm³; other work has calculated the sugar-phosphate volume to be 0.175 nm³ (235). The smaller volume used in this work leads to a lower probability of backbone hits, and therefore smaller cluster sizes in the DBCSD than may be seen in other work, ultimately leading to an overestimation of the SSB to DSB ratio.

Chromatin density has no effect on the SSB to DSB ratio. A test case of the solenoid chromatin fibre was built with various chromatin densities, ranging between 1.2 - 5.6 nucleosomes/11 nm. Here the fibres were irradiated with 10⁶ 1 MeV protons, in the same irradiation pattern described in Section 2.2.2. It was observed that the SSB to DSB ratio was maintained across all chromatin densities, despite fewer backbones in the lower density fibres.

The data presented in Figure 2.7 shows the effect that backbone volume has on the SSB to DSB ratio. Using the fit derived for a 1 MeV proton with the solenoid chromatin model an ideal backbone volume can be calculated. Using the backbone volume suggested by Nadassy et al. (235) we predict the SSB to DSB ratio to be 9.3, whereas the fit to literature values would seem to suggest a ratio equal to 10.13 (for 1 MeV protons). In order to match the literature value a backbone volume of 0.16 nm³ would be required, just over 2.8 times larger than the volume used within this work. A similar value, 0.15 nm³, for backbone volume was derived by Liang et al. (265) by fitting their data to experimental DSB measurements. A volume correction can be applied to the data collected in this work (averaged for the three chromatin models), with the fit derived from Figure 2.7; this is shown in Figure 2.8 as “VolCorrected”. Regardless of the discrepancy between the SSB to DSB ratio seen here and in other work, we still see similar values between the three chromatin models, leading us to believe the choice of chromatin model does not significantly affect this parameter.
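The weighting scheme used to combine the literature fits can be sketched as follows. The Python code assumes each data set has already been fitted with its own A and B, and averages them weighted by the number of data points. The function names and the example values are hypothetical and are not taken from the cited studies.

```python
def weighted_fit_constants(fits):
    """Average power law constants weighted by data set size.

    fits: list of (A, B, n_points) tuples, one per literature data set.
    """
    total = sum(n for _, _, n in fits)
    a_avg = sum(a * n for a, _, n in fits) / total
    b_avg = sum(b * n for _, b, n in fits) / total
    return a_avg, b_avg


def ssb_dsb_ratio(energy_mev, a, b):
    """Evaluate the power law SSB/DSB = A * E^B for proton energy E in MeV."""
    return a * energy_mev ** b


# Hypothetical per-data-set fits (A, B, number of points), illustration only.
example_fits = [(9.0, 0.30, 4), (12.0, 0.35, 8), (10.0, 0.25, 4)]
a_avg, b_avg = weighted_fit_constants(example_fits)
print(a_avg, b_avg, ssb_dsb_ratio(1.0, a_avg, b_avg))
```

At E = 1 MeV the power law reduces to the constant A, so A can be read as the predicted ratio for 1 MeV protons.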

2.5 Summary and Conclusions

Three models of chromatin geometry have been built within the Geant4-DNA toolkit: the solenoid, zig-zag, and cross-linked fibre. All models were irradiated with 10⁶ primary protons, in the LET range 5.3 – 80.3 keV/μm, and 10⁶ primary alpha particles, in the LET range 62.9 – 226.0 keV/μm. Ionisations occurring within DNA backbone volumes were recorded and scored with a clustering algorithm. Here ionised backbones on opposite strands within a separation of 10 bp are grouped together in a cluster. Separation is measured according to backbone ID number, rather than spatial separation. This is perhaps an important factor for tightly packed chromatin, where DNA on nucleosomes separated by a complete turn around the fibre can be within the same spatial separation as 10 bp. The clusters are scored following the principles of nanodosimetry, where conditional average cluster size, m1, relative cumulative probabilities, F2 and F3, and the SSB to DSB ratio are calculated. Statistically significant differences are seen when comparing the damaged backbone cluster size distributions between the three chromatin models, though little difference is seen in the nanodosimetric parameters between the models. The distribution of m1 is compared between the three models, where non-significant differences are observed. The data collected and presented in this work lead us to believe that there is no significant difference in the nanodosimetric parameters if scored in either the solenoid, zig-zag, or cross-linked chromatin geometries, at least for the particle type and LET used in this study, though this conclusion is likely to hold for other radiation qualities. It is believed that nanodosimetric parameters can be used to understand how ions lead to cellular outcomes. It has been shown that the parameters depend on LET, with a higher probability for complex damage (F3) at greater LET. This could point towards evidence for, and help to understand, a variable RBE along the proton track.

2.6 Acknowledgements

N T Henthorn would like to acknowledge financial support from EPSRC (grant No.: EP/J500094/1). We would like to acknowledge Lingjian Yang for his useful advice on nonparametric statistics.

3. Nanodosimetric Simulation of DNA Damage from Protons and Across a Clinically Relevant Proton Spread Out Bragg Peak with Direct and Indirect Effects

This paper presents the extension of the model from the previous chapter. Here, the detailed chromatin fibre is combined with a model of the cell. This makes the model capable of predicting the yields, positions, and complexities of DNA damage. This work investigates and details the mechanisms that are used in order to predict both direct and indirect DNA damage. This kind of information is crucial for mechanistic modelling of the biological response, and ultimately in predicting cell fate. A key method of nanodosimetry is the scoring of ionisation clusters, with the assumption that these clusters are related to the kinds of DNA damage clusters that would be formed in a nucleus under the same irradiation conditions. The reason for this is the evidence in the literature hypothesising that complex damage is more difficult to repair. By scoring this damage, in silico, with an explicit model of the DNA it is therefore possible to make biologically relevant predictions, even without a model of repair. This work shows predictions for DSB complexity across a range of proton LET. A simplified approach is taken to make similar predictions for photon induced damage. The complexity of DSBs is investigated and a method for generating cluster sizes as a function of LET is presented, without the need for simulation. Throughout the work a series of correlations are made relating complexity to dose and LET. These correlations are applied to a 1D SOBP to make clinically relevant predictions. By using similar correlations for photon induced damage an RBE of complexity is shown. This RBE is not dissimilar to the predictions of RBE for cell death made by the phenomenological models in the literature.

Author Contributions

I developed the biological target geometries and DNA damage models. I generated, analysed, and evaluated the data. I wrote the manuscript which was reviewed by all authors.

Nanodosimetric Simulation of DNA Damage from Protons and Across a Clinically Relevant Proton Spread Out Bragg Peak with Direct and Indirect Effects

N T Henthorn1, ‡, J W Warmenhoven1, M Sotiropoulos1, E A K Smith1, S Ingram1, R I Mackay2, K J Kirkby1, 3 and M J Merchant1, 3
1 Division of Molecular and Clinical Cancer Sciences, Faculty of Biology, Medicine and Health, University of Manchester, UK
2 Christie Medical Physics and Engineering, The Christie NHS Foundation Trust, Manchester, UK
3 The Christie NHS Foundation Trust, Manchester, UK
‡ Correspondence to: [email protected]

Monte Carlo based simulations of ion track structure are a useful tool to investigate radiation induced DNA damage. These simulations have a role when investigating the link between induced damage patterns and the resulting biological effects. This has relevance for revealing mechanisms that are responsible for variable Relative Biological Effectiveness (RBE) in proton therapy. This work uses Geant4-DNA simulations to predict DNA damage complexity across a proton Spread Out Bragg Peak (SOBP) and for comparison a given dose of Co-60 photons. Within this work three simplistic geometries of the DNA volumes are trialled with models for direct and indirect damage. The mechanism of inducing direct DNA damage is chosen as a probability based upon the amount of energy deposited by the radiation track in the DNA base or backbone volume, determined by comparing the simulated yields of Single Strand Breaks (SSBs) and Double Strand Breaks (DSBs) predicted by a plasmid model to experimental results in the literature. A model of the chromatin fibre is used to investigate indirect damage. Indirect damage is included by tracking the production and motion of free radicals, assigning a probability of causing a strand break to hydroxyl radicals that interact with the DNA volumes. Clusters of damaged DNA volumes are identified and are categorised according to the number of associated base or backbone damages. We show how this can be predicted analytically as a function of Linear Energy Transfer (LET). The damages predicted by the chromatin fibre model are combined with a model of the nucleus to predict total yields of each break type for a given dose of protons or photons. Correlations are drawn to predict break complexity as a function of LET and dose. These correlations are used to predict complexity across a SOBP, showing a slight increase with proton depth. 
RBE of damage is predicted by comparing these predicted yields of complexity to those predicted for the same physical dose of photons. This RBE of damage shows similar trends to the RBE of cell kill predicted by phenomenological models in the literature, implying a role for break complexity in cell kill.

3.1 Introduction

Proton therapy offers many benefits as a cancer treatment modality, mostly due to the favourable dose deposition profile, characterised by the Bragg peak, which provides superior healthy tissue sparing compared to photons. Conventionally, radiotherapy has been delivered with high energy photons. As such, a wealth of knowledge exists in photon radiotherapy, with decades of data informing the optimal dose prescription in order to control the tumour whilst minimising normal tissue complications (16,266). In order to utilise this photon experience to optimise proton therapy a dose conversion is applied through use of the Relative Biological Effectiveness (RBE). RBE is defined as the dose of a reference radiation divided by the dose of a test radiation necessary to achieve the same biological effect. For RBE it is usual to define the biological endpoint as 10% survival measured in clonogenic assays, with 2 Gy delivered per fraction (29). For protons a constant value of RBE = 1.1 is in clinical use (72), stating that at the same dose protons are 10% better at cell killing than photons. However, there is considerable variance in the experimental evidence for this value (73). Instead it has been proposed that proton RBE is not a constant and depends on a number of factors, including Linear Energy Transfer (LET), dose, and tissue type (73,267,268). To encompass this variable RBE a number of phenomenological models have been proposed (87–89). These models link the photon radiosensitivity parameters of the linear-quadratic survival model to dosimetric parameters of the proton, such as dose and LET. The models show merit since they can reproduce the heightened cell kill at increasing LET; however, it is not possible for this type of model to directly suggest a mechanism for the effect. The lack of mechanistic understanding limits the applicability of these models when they are used beyond the range that they have been fitted to.
Furthermore, it has been argued that LET alone is not an adequate parameter for describing RBE, particularly when considering different ion species (269). Alternative to the phenomenological approach are mechanistic computational models. A branch of these mechanistic models relies on the Monte Carlo method to simulate radiation effects in the cell. A number of simulation frameworks exist to investigate DNA damage and/or repair at the cellular level (167,169,270–272). The accepted theory in the field is that radiation induced cell death is a function of DNA damage and the resulting biological response, i.e. the efficacy of the repair pathways (273). Isolating the damage and repair aspects in silico allows for the determination of dependencies between the two. This can be used to determine aspects of the damage pattern that affect the repair process. To investigate mechanisms such as these, the model must begin from no prior assumptions, such as an LET dependence. Instead the relations between radiation quality and biological outcome should be emergent from the results. This presence of emergent behaviour gives confidence that mechanisms have been correctly encapsulated, and that the model can be extrapolated to investigate beyond experimental data.

DNA damage starts the process of cell kill, and as such it is common for simulations to score the yield of the most toxic damage, the Double Strand Break (DSB), as well as aspects of the induced DSBs such as complexity. Clinically relevant conclusions have been drawn with this methodology, for example proximity effects (274) and break complexity (257). It has been shown in silico that DSBs are produced more proximally with increasing LET (134). Closer proximity between DSBs promotes misrepair, which can manifest later as chromosome aberrations (159,162). This is important when considering biological effect at the distal edge of the proton Bragg peak, where LET is comparatively high. Along with spatial elements of the damage it is also of interest to score the complexity of DNA damage (275). It has been shown experimentally that complex damage is more difficult to repair (276). If the DNA damage persists at the end of the cell cycle then the unrepaired insults can cause cell cycle arrest, leading to cell death or senescence (277), or can directly cause cell death. As such, it is hypothesised that this initial DNA damage pattern is strongly correlated to the early biological outcomes. Efforts are underway to measure and quantify this effect through both simulation and experiment. These efforts have been broadly grouped under the field of nanodosimetry (94). Experimentally, a number of nanodosimeters are in operation (96–99). Here, clusters of ionisations at the nanoscale are detected in gas, scaled to represent water.
The metric of note is the Ionisation Cluster Size Distribution (ICSD), describing the number of ionisations in the cluster (247,248), which is characteristic for a given radiation quality (278). It is hypothesised that the ICSD is related, or equivalent, to the complexity of DSB that would be created by the radiation in a cell (213,214). This nanodosimetric scoring has begun to show clinical relevance, with many works now proposing an implementation into treatment planning systems (95,279). However, biologically the case may be more complex. For example, the nanodosimetric method only accounts for direct physical damage and neglects indirect damage resulting from free radical production. It does, however, show that ions produce clusters of energy depositions. The availability of 4D track structure simulation now offers the opportunity to model the complete DNA damage mechanisms (169,170,199). Modelling of indirect damage is particularly important when considering hypoxia, which is common to many tumour types (280). This decreased oxygen concentration has an impact on the DNA damage, the biological response, and treatment outcomes (11,116,281).

This work presents the results of track structure simulations using Geant4-DNA to score DNA damage, and their application to predict DNA damage complexity for a clinically relevant case, demonstrated with a 1D proton Spread Out Bragg Peak (SOBP). By comparing simulation to experimental results of plasmid irradiation from the literature we determine the combination of DNA geometry model and method for scoring direct DNA damage that best reproduces the experimental values (Sections 3.2.1, 3.2.3, 3.2.5, 3.3.1). We include indirect damage in the simulation through the Geant4-DNA chemistry module (200) (Sections 3.2.3, 3.3.2). These models of direct and indirect damage are used to predict DNA damage complexity by simulating the irradiation of the chromatin fibre and a model of the cell (Sections 3.2.4, 3.2.6, 3.3.3, 3.3.4). Throughout the work correlations are drawn that relate damage complexity and yields to LET and dose. The clinical relevance of these correlations is demonstrated by scoring the dose and LET of a 1D SOBP (Section 3.3.5). Here, yields for given categories of DSB complexity are predicted, and by comparing to similar predictions for photons an RBE for damage induction is shown. These predictions for RBE of damage are similar in trend to the RBE of cell kill predicted by the phenomenological models, implying a role for damage complexity in cell kill.

3.2 Methods

3.2.1 Models of DNA Geometry

Three simple geometric models of the DNA double helix have been implemented in this work, shown in Figure 3.1. All of these models have been used in published simulation work, by other groups, with the aim of predicting DNA damage (282–284). For each model the DNA sugar-phosphate backbone and the DNA base volume are formed by single discrete volumes. The DNA volumes are assigned an identification number equivalent to their position along the genome, allowing for exact determination of base pair (bp) separation between damaged volumes.

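[Figure 3.1: panels a), b), and c), each shown as a cross section and a side view with axes labelled x, y, and z. Labelled dimensions: a) 0.480 nm and 0.416 nm; b) 1.15 nm, 0.50 nm, 0.34 nm, and a 90° angle; c) 1.15 nm, 0.50 nm, and 0.34 nm.]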

Figure 3.1 Linear segments of the simple DNA double helix geometries implemented in this work. Each sugar-phosphate backbone and base is built as a discrete volume and assigned an ID number equivalent to the base pair. The first column of images shows a cross section through the double helix, the second column shows a 30 bp segment of the double helix. a) Sphere DNA, b) Quarter cylinder DNA, c) Half cylinder DNA.

The first geometrical model of the DNA double helix (Spheres) is based on our previous work (282), where the sugar-phosphate backbones and bases are constructed as spheres (Figure 3.1a). In this model, the DNA backbone has a radius of 0.24 nm and the DNA base has a radius of 0.208 nm, giving volumes of 0.06 nm³ and 0.04 nm³ for the backbone and base respectively. The backbone radius was set to the maximum possible before overlaps between successive backbones occur. However, this backbone radius leads to a considerably smaller volume than may be expected in reality. Using the Nucleic Acid Database (285), Nadassy et al. (235) have calculated the backbone volume to be 0.175 nm³, calculated as a combination of the individual atoms. The base radius was chosen to conserve the volume ratio between the backbone and the Cytosine base, reported by Nadassy et al. (235).

The second DNA geometry (QuartCyl) was proposed by Bernal and Liendo (283) (Figure 3.1b). Here, the DNA base and backbone are formed with a half cylinder and quarter cylinder respectively. The base has a radius of 0.5 nm and thickness of 0.34 nm. The backbone has a full radius of 1.15 nm, with a cut away section for the base. The backbone and base have volumes of 0.28 nm³ and 0.13 nm³ respectively. The model used here differs slightly from the original publication from Bernal and Liendo. We do not build the double helix with a major and minor groove, instead placing the nucleotides directly opposite each other. The backbone volume used by Bernal and Liendo is 0.24 nm³, slightly smaller than the volume we use (0.28 nm³). However, we do not expect these changes to significantly affect the DNA damage predictions made.

The third model (HalfCyl) is based on one of the earlier DNA models for simulation, originally proposed by Charlton, Nikjoo, and Humm (284) (Figure 3.1c). Here, each base is formed from a half cylinder, with a radius of 0.5 nm and a thickness of 0.34 nm. The bases from each strand are built opposite each other and are surrounded by half cylinders representing the sugar-phosphate backbone, with a cut away section for the base. The sugar-phosphate backbone has a full radius of 1.15 nm and a thickness of 0.34 nm. This gives a backbone volume of 0.56 nm³ and a base volume of 0.13 nm³. Most recently the model has been used as part of a validation study for the TOPAS-nBio code (167).

For each geometric model a rotation of 36 degrees around the central axis is applied between successive base pairs, achieving a full turn of the double helix every 10 bp. All DNA volumes in the simulation are made from liquid water, with a density of 1.407 g/cm³ (260).
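The quoted volumes can be checked from the stated radii and thickness. The following is a minimal sketch, not the geometry code used in this work. It assumes that the backbone cut-away is exactly the inner sector of radius 0.5 nm, which the text does not specify. Under that assumption the backbone volumes come out near 0.29 nm³ and 0.57 nm³, close to but not identical to the quoted 0.28 nm³ and 0.56 nm³, so the true cut-away presumably differs slightly. All function names are illustrative.

```python
import math

BP_RISE_NM = 0.34   # thickness of one base pair slice (from the text)
TWIST_DEG = 36.0    # rotation between successive base pairs (from the text)


def sphere_volume(radius_nm):
    """Volume of a sphere in nm^3."""
    return 4.0 / 3.0 * math.pi * radius_nm ** 3


def sector_volume(radius_nm, thickness_nm, fraction):
    """Volume of a cylinder sector; fraction=0.5 is a half cylinder."""
    return fraction * math.pi * radius_nm ** 2 * thickness_nm


# Sphere model: radii quoted in the text.
sphere_backbone = sphere_volume(0.24)
sphere_base = sphere_volume(0.208)

# Half cylinder base shared by the QuartCyl and HalfCyl models.
half_cyl_base = sector_volume(0.5, BP_RISE_NM, 0.5)

# ASSUMPTION: the cut-away removes the inner r = 0.5 nm part of the sector.
quart_backbone = (sector_volume(1.15, BP_RISE_NM, 0.25)
                  - sector_volume(0.5, BP_RISE_NM, 0.25))
half_backbone = (sector_volume(1.15, BP_RISE_NM, 0.5)
                 - sector_volume(0.5, BP_RISE_NM, 0.5))

# A 36 degree twist per base pair gives one full turn every 10 bp.
bp_per_turn = 360.0 / TWIST_DEG

print(round(sphere_backbone, 3), round(sphere_base, 3), round(half_cyl_base, 3))
print(round(quart_backbone, 3), round(half_backbone, 3), bp_per_turn)
```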

3.2.2 Track Structure Simulation

The Monte Carlo toolkit Geant4 (version 10-02-patch01) (179), with the Geant4-DNA extension (164,216), was used to simulate the transport and interaction of mono-energetic protons with the volumes described in this work. Within Geant4-DNA particles are tracked using an event-by-event method, where each track is composed of a series of particle steps. The particle steps contain user accessible information, such as position, energy deposited, and the physical process that led to the energy deposition. Currently the Geant4-DNA physics list (G4EmDNAPhysics) is limited to simulation in liquid water targets but does allow for changes in water density. Representing biological materials with water is a standard assumption in radiobiological Monte Carlo studies (173). However, this common assumption has recently been challenged by Francis et al. (286), where ionisation cross sections for realistic DNA base materials have been incorporated into Geant4-DNA. Francis et al. (286) show significant differences when calculating the proton lineal energy in water compared to a realistic DNA material. This method is not applied in this work.

For comparison, photon induced DNA damage is investigated by exposing the simulated biological targets to the secondary electrons produced by a Co-60 source. To determine the secondary electron energy spectrum, Geant4 (G4EmStandardPhysics) is used to simulate the transport of 1.17 MeV and 1.33 MeV photons through a 10x10x10 mm³ water box, with photon intensity equal to the decay scheme of Co-60. The energy of any secondary electrons created by the primary photon is recorded and binned to create a probability distribution, Figure 3.2. The distribution is used to randomly select an electron energy for simulation with the biological targets.


Figure 3.2 The probability distribution for secondary electron energy, produced by a Co-60 beam in a 10x10x10 mm³ water box. The distribution is averaged across 10^9 primary photons and normalised to sum to one. Data is sorted into 10 keV bins.
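The binning and sampling of the secondary electron spectrum can be sketched as follows. This is an illustration only: the energies listed are placeholder values rather than simulation output, the function names are hypothetical, and drawing a uniform energy within the selected 10 keV bin is an assumption, since the text does not state how an energy is chosen inside a bin.

```python
import random

BIN_WIDTH_KEV = 10.0  # bin width quoted in the Figure 3.2 caption


def build_spectrum(energies_kev, bin_width=BIN_WIDTH_KEV):
    """Histogram electron energies and normalise the bins to sum to one."""
    counts = {}
    for energy in energies_kev:
        index = int(energy // bin_width)
        counts[index] = counts.get(index, 0) + 1
    total = sum(counts.values())
    bins = sorted(counts)
    probs = [counts[i] / total for i in bins]
    return bins, probs


def sample_energy(bins, probs, rng, bin_width=BIN_WIDTH_KEV):
    """Pick a bin by its probability, then an energy inside that bin.

    ASSUMPTION: the energy is uniform within the chosen bin.
    """
    index = rng.choices(bins, weights=probs, k=1)[0]
    return (index + rng.random()) * bin_width


# Placeholder secondary electron energies in keV (not simulation data).
recorded = [12.0, 15.0, 48.0, 230.0, 231.0, 870.0, 1100.0]
bins, probs = build_spectrum(recorded)
rng = random.Random(1)
energy = sample_energy(bins, probs, rng)
print(bins, probs, energy)
```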

3.2.3 Direct and Indirect DNA Damage

Of particular interest for determining DNA damage is the scoring of energy deposition events. A number of methods have been proposed to convert energy depositions into DNA damage. We consider three methods in this work: an energy range, a threshold energy, and ionisation events. In the first method, it is assumed that the total energy deposited within a DNA volume has a probability of causing damage (Energy Range). The probability of damage increases linearly with the energy deposited, between 5 eV and 37.5 eV. This range has been evidenced by experimental results of DNA strand breaks from low energy electrons and photons (106,107,109). Volumes with a total energy deposition less than 5 eV are not damaged, whilst volumes with over 37.5 eV are guaranteed to be damaged. The energy depositions in a DNA volume are summed for a primary particle and its associated secondaries, where multiple particle tracks interact with the volume. For simulations where a given dose is delivered the energy deposited in a DNA volume is summed for all particle tracks that interact with the volume. This assumes that damage is additive, and that the DNA cannot recover between interactions. For a single track this assumption is fair, since the time scale for primary and secondary particle interactions is short. For a dose of particles there may be time in between primary tracks for the DNA to recover; however, the primary tracks are spatially distant for the doses considered in this work, so interactions in the same volume from different tracks are unlikely.
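The Energy Range method described above amounts to a clipped linear ramp applied to the summed energy in a DNA volume. A minimal sketch follows, with illustrative function names. It assumes the ramp runs from a probability of 0 at 5 eV to 1 at 37.5 eV, which is consistent with the limits stated in the text, and it is not the scoring code used in this work.

```python
import random

E_MIN_EV = 5.0    # no damage is scored below this summed energy
E_MAX_EV = 37.5   # damage is certain above this summed energy


def damage_probability(total_energy_ev):
    """Linear probability of damage between E_MIN_EV and E_MAX_EV."""
    if total_energy_ev <= E_MIN_EV:
        return 0.0
    if total_energy_ev >= E_MAX_EV:
        return 1.0
    return (total_energy_ev - E_MIN_EV) / (E_MAX_EV - E_MIN_EV)


def is_damaged(deposits_ev, rng):
    """Sum all deposits in one DNA volume, then sample the outcome."""
    return rng.random() < damage_probability(sum(deposits_ev))


rng = random.Random(0)
print(damage_probability(21.25))        # midpoint of the ramp
print(is_damaged([20.0, 20.0], rng))    # 40 eV in total: always damaged
print(is_damaged([1.0, 2.0], rng))      # 3 eV in total: never damaged
```

Summing the deposits before sampling reflects the stated assumption that damage is additive within a volume.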

The second method uses a threshold energy for damage (Energy Threshold) (284). In this method DNA volumes that receive energy depositions of more than 17.5 eV are considered damaged. Again, the energy depositions are summed for a primary and its associated secondaries, or for a delivered dose of particles. The third method scores ionisation events within the DNA volumes (Ionisation). This has a relevance for the emerging field of nanodosimetry, where it is hypothesised that clusters of ionisations at the nanoscale can be linked to early biological outcomes (94). The approach also has credence since ionisations represent the most likely process for large energy depositions. For this method the DNA volumes in which an ionisation event occurs are considered damaged.

To study the effects of indirect damage the Geant4-DNA chemistry manager is used (200,287). Following the simulation of the physical interactions the Geant4 chemistry modules are invoked. The yield, species, and position of the free radicals are determined by Geant4-DNA based on molecular interactions of the physical beam with water. All of the free radicals are tracked for 1 ns with Brownian diffusion, including chemical reactions with the bulk water material or other free radicals. Hydroxyl radicals (OH) are assigned a probability of causing damage to the DNA backbone or base for a step taken in the DNA volume. The probability of hydroxyl radicals damaging a base is set to 0.8 (246). The probability of OH induced DNA backbone damage is fitted to match the estimated ratio of 35:65 between direct and indirect damage, for the case of Co-60 irradiation. The estimated ratio between direct and indirect DNA damage was first suggested by Ward based on a number of experiments (119); later Michael and O’Neill suggested that around two thirds of the DNA damage is attributable to indirect effects (288). The assumed ratio has become the standard for the PARTRAC code (109,170).
If the OH radical meets the probability conditions, then the damage is recorded and the OH track is terminated in the simulation. This is equivalent to chemical reaction with the DNA leading to a strand break. If the probability condition is not met the OH track is terminated without recording any damage. This is equivalent to a chemical reaction with no damage. Within this method is the assumption that OH radicals entering a DNA volume always react.
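The handling of an OH radical that steps into a DNA volume can be sketched as below. This is an illustration under stated assumptions rather than the Geant4-DNA user code: the volume labels and function names are hypothetical, and the backbone probability is left as an argument because the text fits it to the 35:65 ratio. The helper for the indirect fraction shows the quantity that the fit targets.

```python
import random

P_BASE = 0.8  # probability of an OH radical damaging a base (from the text)


def oh_step_in_dna(volume_type, p_backbone, rng):
    """Return (damage_recorded, track_terminated) for one OH radical.

    volume_type is "base" or "backbone" (hypothetical labels).
    The track is always terminated, since the text assumes that an
    OH radical entering a DNA volume always reacts.
    """
    probability = P_BASE if volume_type == "base" else p_backbone
    damage_recorded = rng.random() < probability
    return damage_recorded, True


def indirect_fraction(n_direct, n_indirect):
    """Fraction of backbone damage attributable to indirect effects."""
    return n_indirect / (n_direct + n_indirect)


rng = random.Random(2)
print(oh_step_in_dna("backbone", 1.0, rng))   # reaction with damage
print(oh_step_in_dna("backbone", 0.0, rng))   # reaction without damage
print(indirect_fraction(35, 65))              # target of the fit: 0.65
```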

3.2.4 Damage Classification

Following the simulation of the primary and secondary particles, and any associated free radicals, the list of damaged DNA volumes is analysed by a clustering algorithm. For determination of DSBs the clustering algorithm searches for damaged DNA backbones that are on opposite strands and separated by 10 bp or less (109). This can lead to the formation of a cluster containing multiple damaged backbones. In this work, we assume that this type of damage leads to one DSB, though in reality the situation may be more complex with the potential for multiple breaks and small deletions. Damaged bases are included in the cluster if they are within 3 bp of the extreme ends of damaged backbones that form the DSB (257,289,290). Any damaged bases that are directly attached to a damaged backbone are neglected from the clustering, since it is assumed that these damages will be removed along with the backbone during repair.

The damage classifications are focused on DSB type, since it is hypothesised that these are the lesion type most closely related to cell death (291). As such, the only non-DSB damage included is for isolated bases and isolated backbones (SSB). The classifications are descriptive of the physico-chemical damage rather than later damages arising from the biological response. For example, repair of isolated base damage through the base excision repair pathway can lead to the creation of a short-lived nucleotide gap (292). If this process occurs on the opposite strand to a damaged backbone, then a DSB may be induced. This is considered in this work by identifying isolated backbone and base damages that are on opposite strands separated by 10 bp or less, referred to as potential DSBs. However, a full classification of this type of time-dependent biologically induced DSB would be better accounted for in a model of the biological response.

The damage is classified as one of seven biologically relevant groups, shown schematically in Figure 3.3. These classifications include an isolated base damage (Figure 3.3a), an isolated backbone damage (Figure 3.3b), a potential DSB (Figure 3.3c), a simple DSB with no associated base damage (Figure 3.3d), a simple DSB with at least one associated base damage (Figure 3.3e), a complex DSB with multiple associated backbones but no associated base damage (Figure 3.3f), and a complex DSB with multiple associated backbones and at least one associated base damage (Figure 3.3g). The damages shown in Figure 3.3e-3.3g are considered as complex DSBs, whilst the damage shown in Figure 3.3d is considered as a simple DSB.
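The DSB part of the classification can be sketched as below. This is a simplified illustration and not the clustering algorithm used in this work. It assumes that damaged backbones are chained together whenever neighbouring damages lie within 10 bp, that a chain spanning both strands is one DSB, and that "within 3 bp of the extreme ends" means within 3 bp outside the first and last damaged backbone. Potential DSBs and isolated bases are omitted for brevity, and all names are hypothetical.

```python
MAX_DSB_SEPARATION_BP = 10   # backbone separation limit from the text
MAX_BASE_DISTANCE_BP = 3     # base inclusion distance from the text


def cluster_backbones(backbones):
    """Chain damaged backbones, given as (bp, strand), by bp position.

    ASSUMPTION: neighbours within 10 bp join the same cluster.
    """
    clusters, current = [], []
    for bp, strand in sorted(backbones):
        if current and bp - current[-1][0] > MAX_DSB_SEPARATION_BP:
            clusters.append(current)
            current = []
        current.append((bp, strand))
    if current:
        clusters.append(current)
    return clusters


def classify_dsb(cluster, damaged_bases):
    """Return a DSB label, or None if the cluster sits on one strand."""
    if len({strand for _, strand in cluster}) < 2:
        return None  # scored as isolated backbone damage (SSB) instead
    low = min(bp for bp, _ in cluster) - MAX_BASE_DISTANCE_BP
    high = max(bp for bp, _ in cluster) + MAX_BASE_DISTANCE_BP
    attached = set(cluster)  # bases attached to a damaged backbone are ignored
    n_bases = sum(1 for base in damaged_bases
                  if low <= base[0] <= high and base not in attached)
    kind = "simple" if len(cluster) == 2 else "complex"
    return f"{kind} base DSB" if n_bases else f"{kind} DSB"


# Hypothetical damage list: (base pair index, strand number).
backbones = [(100, 1), (104, 2), (300, 1), (500, 1), (505, 2), (509, 2)]
bases = [(106, 1), (104, 2), (900, 2)]
clusters = cluster_backbones(backbones)
labels = [classify_dsb(cluster, bases) for cluster in clusters]
print(labels)
```

In this example the base at (104, 2) is ignored because it is attached to a damaged backbone, while the base at (106, 1) is counted towards the first cluster.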

[Figure 3.3: panels a) Isolated Base, b) Isolated Backbone, c) Potential DSB, d) Simple DSB, e) Simple Base DSB, f) Complex DSB, g) Complex Base DSB.]

Figure 3.3 Schematic representation of the DNA damage classifications used in this work. DNA backbones are shown as rectangles, DNA bases are shown as circles. Damaged volumes are filled with dashed lines. a) isolated base damage, b) isolated backbone damage (SSB), c) potential DSB, d) simple DSB with damaged backbones on opposite strands separated by 10 bp or less, e) simple DSB with at least one associated base damage within 3 bp of the DSB ends, f) complex DSB with multiple associated damaged backbones, g) complex DSB with at least one associated base damage.

These damage categories give an overview of the type of damage induced. However, more relevant is the number of damaged backbones and bases associated to a break. This is similar in concept to the Ionisation Cluster Size Distribution (ICSD) scored in nanodosimetry (214), though not equivalent. We have previously used the term Damaged Backbone Cluster Size Distribution (DBCSD) in reference to the number of backbones associated to a DSB (282). The DBCSD is the probability distribution describing the chance of forming a cluster, equivalent to a DSB, with a given number of backbones. Similarly, a distribution is produced to describe the number of bases associated to a DSB. A combination of these two distributions fully describes the break complexity for a given radiation quality. The break generated by drawing from both distributions is similar in concept to locally multiply damaged sites that were proposed by Ward (119). In this work, we determine these distributions as a function of the primary proton track averaged Linear Energy Transfer (LETt).
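As an illustration of how a DBCSD can be formed from scored clusters, the sketch below normalises a list of backbone counts per DSB into a probability distribution. The input values are placeholders rather than results from this work, and the function name is hypothetical. The same function could be applied to the number of bases per DSB.

```python
from collections import Counter


def cluster_size_distribution(sizes):
    """Probability of each cluster size, normalised to sum to one."""
    counts = Counter(sizes)
    total = sum(counts.values())
    return {size: counts[size] / total for size in sorted(counts)}


# Placeholder numbers of damaged backbones per DSB (not simulation data).
backbones_per_dsb = [2, 2, 2, 3, 2, 4, 2, 3]
dbcsd = cluster_size_distribution(backbones_per_dsb)
print(dbcsd)
```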

3.2.5 Simulation of Plasmid Irradiation

A model of the pBR322 plasmid (293) is implemented in Geant4-DNA. Here, 4361 bp of DNA are organised to form a closed circular loop, with the DNA constructed from the three different volumes discussed in Section 3.2.1. The plasmid is placed on top of a slab of water, representative of a glass coverslip that is often used to mount plasmids during experiments (Figure 3.4). Water was chosen for the coverslip material to maintain the Geant4-DNA physics tracking throughout the simulation. The coverslip is simulated to account for any backscattered particles. The plasmid geometry is surrounded by a water torus with a diameter of 2.4 nm. All volumes within the simulation, aside from the coverslip, torus, and DNA, are constructed with air.

[Figure 3.4: plasmid geometry with labelled dimensions 236 nm, 2.3 nm, 100 nm, and 492 nm.]

Figure 3.4 Geometry of the pBR322 plasmid implemented in Geant4-DNA. The DNA double helix is organized to form a closed loop of 4361 bp. A water torus surrounds the double helix. The plasmid is mounted on a coverslip constructed with water.

The proton beam is simulated perpendicular to the plasmid, passing through the plasmid and then the coverslip. Initially the particle is uniformly placed on a disc 1 µm from the plasmid. A dose of 2000 Gy is used to irradiate the plasmid for each mono-energetic proton LETt. The dose is achieved by fixing the number of primary particles and varying the radius of the irradiation disc according to Equation 3.1.

$$ r\,(\mathrm{nm}) = 10^{9} \sqrt{\frac{N \times L \times 1.6 \times 10^{-10}}{\pi \times D \times \rho}} \qquad (3.1) $$

Where N is the number of primaries (5000), L is the primary particle LETt in units of keV/µm, D is the required dose (2000 Gy), and ρ is the DNA density (1.407 g/cm³). The induction of DNA damage is determined according to the methods described in Section 3.2.3, with only direct damage considered. Yields of SSBs and DSBs are calculated by clustering damaged DNA backbones (Section 3.2.4) and reported per Mbp per Gy. The yields are compared to experimental results reported in the literature for dry plasmid irradiation. The DNA geometry model (sphere, quarter cylinder, or half cylinder) and damage induction model (energy range, energy threshold, or ionisation) that most closely reproduces experimental results is selected for further simulation.
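Equation 3.1 can be evaluated directly, as in the sketch below. It assumes the density is entered in kg/m³ (1.407 g/cm³ = 1407 kg/m³), which is our reading of the units: the factor 1.6 × 10⁻¹⁰ converts keV/µm to J/m, a Gy is a J/kg, and the factor 10⁹ converts metres to nanometres. The LETt of 1 keV/µm is a placeholder value, and the function names are illustrative.

```python
import math

KEV_PER_UM_TO_J_PER_M = 1.6e-10  # conversion factor appearing in Equation 3.1


def disc_radius_nm(n_primaries, let_kev_um, dose_gy, density_kg_m3):
    """Irradiation disc radius in nm for a fixed number of primaries.

    ASSUMPTION: density is in kg/m^3 so that the units are consistent.
    """
    energy_per_metre = n_primaries * let_kev_um * KEV_PER_UM_TO_J_PER_M
    radius_m = math.sqrt(energy_per_metre / (math.pi * dose_gy * density_kg_m3))
    return 1e9 * radius_m


def dose_gy(n_primaries, let_kev_um, radius_nm, density_kg_m3):
    """Inverse of Equation 3.1, used here only as a consistency check."""
    radius_m = radius_nm * 1e-9
    energy_per_metre = n_primaries * let_kev_um * KEV_PER_UM_TO_J_PER_M
    return energy_per_metre / (math.pi * radius_m ** 2 * density_kg_m3)


# N = 5000 and D = 2000 Gy from the text; LETt = 1 keV/um is a placeholder.
radius = disc_radius_nm(5000, 1.0, 2000.0, 1407.0)
print(round(radius, 1))
print(round(dose_gy(5000, 1.0, radius, 1407.0), 6))
```

Because the radius scales with the square root of LETt, a fourfold increase in LETt doubles the disc radius for the same dose and number of primaries.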

3.2.6 Simulation of Chromatin Fibre Irradiation

In order to make predictions of the DNA damage complexity, with a relevance to in vitro cell irradiations, a model of the chromatin fibre was implemented; full details of the model have been reported in our previous work (282). The chromatin fibre is a level of DNA organisation common to all eukaryotic cells. In short, the chromatin fibre is composed of 102 histones, each wrapped by 1.65 turns of the DNA double helix, arranged in the solenoid conformation (227). The organisation of the in vivo chromatin fibre is still unknown within the literature (226). However, we have previously investigated the sensitivity of the chromatin fibre model, comparing three proposed geometries to show that neither the yield of damage nor the complexity is significantly affected (282). The fibre is 198 nm long with a diameter of 37 nm and has a density of 5.7 nucleosomes per 11 nm. A water cylinder is constructed around the fibre, with a length of 203 nm and diameter of 42 nm. In total 18.3 kbp of DNA is built in the chromatin fibre. The fibre is shown in Figure 3.5.

[Figure 3.5: chromatin fibre shown in two views with axes labelled x, y, and z; labelled dimensions 37 nm and 198 nm.]

Figure 3.5 Implemented geometry of the solenoid chromatin fibre shown with quarter cylinder DNA. The fibre is comprised of 102 nucleosomes and 18.3 kbp of DNA.

Within a cell nucleus the orientation of the fibre is random relative to the proton beam. To account for this in the simulation the mono-energetic proton is initially placed on the surface of the fibre with the direction randomised. Following the simulation of the primary proton, and all associated secondaries, the damaged DNA volumes are assessed by a clustering algorithm (see Section 3.2.4). This determines the DNA damage per primary, following the assumption that a segment of the chromatin fibre will not be traversed by more than one primary track. Indirect DNA damage is included by tracking OH interactions with the DNA volumes, as described in Section 3.2.3. An OH radical that enters a histone volume is “killed” within the simulation, recreating the histone free radical scavenging effect (109).
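One way to randomise the proton direction is to draw it isotropically, as sketched below. This is an assumption for illustration: the text states only that the direction is randomised, not which distribution is used or whether outward-pointing directions are rejected. The function name is hypothetical.

```python
import math
import random


def random_direction(rng):
    """Draw an isotropic unit vector (ASSUMED distribution).

    Sampling cos(theta) uniformly, rather than theta itself, avoids
    clustering of directions around the poles.
    """
    cos_theta = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    sin_theta = math.sqrt(1.0 - cos_theta ** 2)
    return (sin_theta * math.cos(phi), sin_theta * math.sin(phi), cos_theta)


rng = random.Random(3)
direction = random_direction(rng)
norm = math.sqrt(sum(component ** 2 for component in direction))
print(direction, norm)
```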

3.2.7 Total Damage Yields in a Cell Model

The damages predicted by the fibre model are used to populate damages predicted by simulation of cellular irradiation. This allows for the prediction of the yields of each break type for a given dose of protons. Details of the method have been reported in our previous work (134). In short, a spherical nucleus (radius of 2.5 µm) is centred in a section of cytoplasm (box with half-length of 5 µm) and a uniform dose is delivered to the nucleus through Geant4-DNA track structure simulations. 15% of energy depositions in the nucleus are recorded, with the percentage chosen to reproduce DSB yields from the literature (109,168,246,255,263). Accepted energy depositions are converted to strand breaks based on the amount of energy deposited, with the probability scaling linearly between 5.0 and 37.5 eV. Strand breaks are randomly assigned to strand one or two of the double helix. A modified DBSCAN algorithm (294) searches for clusters amongst the strand breaks, given the condition that strand breaks must be on the opposite strand and separated by 3.4 nm or less, equivalent to the separation of 10 bp. The conversion of energy depositions into DSBs follows a methodology similar to that proposed by Francis et al. (262). The specific structure of the DSB is then populated with data from the chromatin fibre simulations, matching primary particle energy. The cell model provides details on the spatial coordinates of the damage site, whilst the fibre model provides details on the number of bases and backbones involved in the damage site. Isolated strand breaks determined by the cell model are populated with either isolated backbones, isolated bases, or potential DSBs from the fibre model. The damages from the fibre model are randomly sampled, giving higher chance of selection for common break types.
The cell and chromatin fibre models are capable of predicting photon induced DNA damage, though for the cell model photon yields are not simulated with the full track structure. A large number of photons are required in order to deliver a given dose. Due to the large number of photons it is assumed that a homogeneous dose is delivered across the nucleus, and as such DSBs can be randomly placed. This assumption dramatically reduces simulation time. The DSB yield is assumed to follow a Poisson distribution with an average of 25 DSBs/Gy/Cell, where the average is determined to fit the DSB yield LET relation of protons (assuming Co-60 LETt = 0.2 keV/µm). The SSB yield is predicted from the estimated SSB to DSB ratio, between 25 and 40 for sparsely ionizing radiation (295), and is randomly selected per repeat. These damages are populated with the specific complexity data from the Co-60 irradiated chromatin fibre, which includes the number of associated backbones and bases.
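The sampling of photon induced break yields can be sketched as follows. This is an illustration only, with hypothetical function names. It assumes that the SSB to DSB ratio is drawn uniformly between 25 and 40 and applied to the sampled DSB count, neither of which is specified in the text. The Poisson sampler uses Knuth's multiplication method, which is adequate for the modest means considered here.

```python
import math
import random

DSB_PER_GY_PER_CELL = 25.0        # mean DSB yield from the text
SSB_TO_DSB_RANGE = (25.0, 40.0)   # ratio range from the text


def sample_poisson(mean, rng):
    """Poisson sample by Knuth's multiplication method (modest means only)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def photon_break_yields(dose_gy, rng):
    """Return (n_dsb, n_ssb) for one simulated cell and one repeat.

    ASSUMPTION: the SSB to DSB ratio is uniform within the quoted range.
    """
    n_dsb = sample_poisson(DSB_PER_GY_PER_CELL * dose_gy, rng)
    ratio = rng.uniform(*SSB_TO_DSB_RANGE)
    n_ssb = round(ratio * n_dsb)
    return n_dsb, n_ssb


rng = random.Random(4)
n_dsb, n_ssb = photon_break_yields(2.0, rng)
print(n_dsb, n_ssb)
```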

3.3 Results

3.3.1 Direct DNA Damage - Plasmid Irradiation

The plasmid model was constructed in Geant4 (Section 3.2.5). The volumes used to form the DNA were selected as either spheres (Spheres, Figure 3.1a), quarter cylinders (QuartCyl, Figure 3.1b), or half cylinders (HalfCyl, Figure 3.1c). The induction of DNA damage was scored according to the three methods discussed in Section 3.2.3: an energy range, an energy threshold, or ionisations. Only direct DNA damage was considered for these plasmid simulations, with results compared to experimental data of dry plasmids in the literature. Simulation and literature results are reported per Mbp per unit dose (Figure 3.6); a conversion of 650 Da per bp was assumed for studies reporting data per unit mass. The simulation was designed to match the experimental conditions of Vysin et al. (260), where dry pBR322 plasmids were irradiated with 10, 20, or 30 MeV protons. To assess the simulation's predictive accuracy for DNA damage over a greater range of LETt, three further studies were included. Souici et al. (296) similarly irradiated dry pBR322, with proton LETt relevant to the Bragg peak region. Urushibara et al. (297) and Ushigome et al. (298) studied the direct DNA damage yields by irradiating hydrated pUC18 plasmids with alpha particles.
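The unit conversion applied to the literature data can be written out explicitly. The sketch below assumes a study reporting breaks per Gy per dalton, which is one possible form of "per unit mass"; the example value is hypothetical and is not taken from any of the cited studies.

```python
DALTONS_PER_BP = 650.0  # from the text: assumed mass of one base pair
BP_PER_MBP = 1.0e6


def per_dalton_to_per_mbp(yield_per_gy_per_da):
    """Convert a break yield from per Gy per Da to per Gy per Mbp."""
    return yield_per_gy_per_da * DALTONS_PER_BP * BP_PER_MBP


# Hypothetical literature value, for illustration only.
example_yield = per_dalton_to_per_mbp(1.0e-10)
```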

Figure 3.6 Yield of DSBs (dashed lines) and SSBs (solid lines) predicted by the simulation as a function of proton LETt. Error bars show the standard error in the mean between 10 repeats of 2000 Gy per LETt. Experimental data are replotted from (260,296–298); DSBs are shown as open symbols and SSBs are shown as closed symbols.

The models investigated here reproduce the relative magnitude of SSB and DSB yields that are seen experimentally, aside from the spherical DNA model (Spheres), which significantly underestimates yields across the LETt range. Due to the spread in experimental data it becomes difficult to determine which geometry and DNA damage model combination best describes direct damage yields. However, for the model combinations tested, the quarter cylinder DNA model with backbone damage determined by an energy range reproduces the most experimental data.

3.3.2 Indirect DNA Damage – Chromatin Irradiation

To determine the impact of indirect effects on DNA damage, the chromatin fibre model (Section 3.2.6) is irradiated with protons at a range of LETt, or with electrons with an energy spectrum from a Co-60 source (Section 3.2.2). The Geant4-DNA chemistry modules were implemented to track the generation and motion of free radicals within the geometry. OH radicals crossing a DNA backbone are assigned a probability of inducing damage, PInd. The numbers of backbones damaged by direct effects and by indirect effects are summed independently, and the fraction of all strand breaks induced by indirect effects is calculated. PInd. is chosen to produce 65% of total backbone damage from indirect effects (119) for the case of irradiation with a Co-60 source. PInd. was determined to be 0.5 when scored with the quarter cylinder energy range model combination. The fraction of indirect backbone damage for Co-60 and a range of proton LETt is shown in Figure 3.7.
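The calibration of PInd. to a target indirect fraction can be illustrated with an expected-value estimate. This is a simplified sketch and not the procedure used in the simulations: it assumes every OH crossing is an independent trial that can add one break, ignores repeated hits on the same backbone, and uses hypothetical counts and function names.

```python
def indirect_fraction(n_direct, n_oh_crossings, p_ind):
    """Expected fraction of strand breaks caused by indirect effects."""
    expected_indirect = p_ind * n_oh_crossings
    return expected_indirect / (n_direct + expected_indirect)


def calibrate_p_ind(n_direct, n_oh_crossings, target_fraction=0.65):
    """Solve indirect_fraction(...) == target_fraction for p_ind."""
    return target_fraction * n_direct / ((1.0 - target_fraction) * n_oh_crossings)


# Hypothetical counts of direct breaks and OH backbone crossings.
p_ind = calibrate_p_ind(n_direct=1000, n_oh_crossings=4000)
```

A result above 1 would indicate that the target fraction cannot be reached with the available radical crossings under these assumptions.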

Figure 3.7 The average fraction of backbone damage due to indirect effects, showing that across the LETt range the majority of damage is from indirect effects. Data is shown for Co-60 (0.2 keV/µm, open symbol) and a range of proton LETt (closed symbols). Error bars show the standard error in the mean between 1.5 × 10^6 independent primaries.

A value of PInd. = 0.5 leads to indirect effects causing an average fraction of 0.648 ± 0.003 of the total strand breaks for Co-60. For every value of PInd. investigated, down to PInd. = 0.05, the simulation predicts a higher fraction of strand breaks from indirect effects than from direct effects. For this model, an average fraction of 0.61 of the strand breaks is due to indirect effects across the entire proton LETt range investigated, with the highest fraction occurring at the lowest LETt. The probability of OH damage to DNA bases is set to 0.8 (246) to account for higher reaction rates (168).

3.3.3 Damage Complexity

The damage yields determined by the cell model (Section 3.2.7) are populated with detail from the chromatin fibre simulations (Section 3.2.6). The yields of isolated damages and clustered damages are presented in Figure 3.8.

Figure 3.8 The average yield of each DNA damage type per unit dose per Gbp, assuming a genome of 6 Gbp, across the proton LETt range. a) Isolated damages. b) Clustered damages. The yields of DSBs and potential DSBs increase with LETt. There is a corresponding decrease in the yield of isolated backbones (SSB) with LETt, as more of these damages are converted into clustered damages. Error bars are the standard error in the mean for 2500 repeats of 1 Gy. Photons are shown as open symbols and protons are shown as closed symbols.

The DSBs predicted are further categorised into the types of DSBs defined in Section 3.2.4. This is presented as a probability per DSB, removing the dependency on the initial DSB yield. The data is generated by the chromatin fibre simulations and as such can determine the DSB types for direct damage only or direct with indirect damage. Figure 3.9 shows that including indirect effects leads to an increase in complex DSBs, with a corresponding decrease in simple DSBs. This is highlighted between 20 and 25 keV/µm, where inclusion of indirect effects predicts that the probability of multi-base multi-backbone breaks (CompBase) is greater than that of multi-base two backbone breaks (SimpBase). This increase in complexity, when indirect effects are included, highlights the importance of full damage simulation since the effect is not seen when only direct damage is simulated.

Figure 3.9 The probability of forming a given type of DSB across the proton LETt range. a) Considering direct DNA damage, b) Considering direct and indirect DNA damage. Including indirect effects leads to an increase in more complex breaks, with a corresponding decrease in the simpler form of DSB (Simp DSB).

A polynomial fit is applied to the data shown in Figure 3.9b and the yields shown in Figure 3.8b, giving the total yield of each DSB category as a function of dose and LETt. This correlation (Equation 3.2) can then be used to predict DSB complexity without the need for simulation.

$$\mathrm{Yield}(D, L) = D \times \left(a \times L^{2} + b \times L + c\right) \qquad (3.2)$$

Where D is the proton dose in units of Gy and L is the track averaged LET in units of keV/µm. The parameters of Equation 3.2, as well as the asymptotic standard error, are presented in Table 3.1.

Parameter   Simp DSB             SimpBase DSB         Comp DSB             CompBase DSB
a           (-2.33 ± 0.36) E-3   (-6.77 ± 1.46) E-4   (1.29 ± 0.28) E-4    (3.47 ± 0.21) E-3
b           (3.98 ± 0.12) E-2    (2.09 ± 0.1) E-1     (3.16 ± 0.01) E-1    (1.41 ± 0.01) E-1
c           (1.64 ± 0.00) E+1    (2.38 ± 0.03) E+0    (4.86 ± 0.05) E+0    (1.56 ± 0.04) E+0

Table 3.1 The fitted parameters of Equation 3.2, including the asymptotic standard error, for predicting the yield of each DSB category as a function of proton dose and LETt.
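As a worked illustration, Equation 3.2 can be evaluated directly with the central values from Table 3.1. The dictionary keys and function name below are chosen for this example only, and the fitted uncertainties are not propagated. The units of the yield are not restated with the table; the constant terms sum to roughly 25, in line with the 25 DSBs/Gy/Cell quoted for the cell model, so the result is assumed here to be DSBs per cell.

```python
# Central values of (a, b, c) from Table 3.1, one entry per DSB category.
DSB_FIT = {
    "Simp":     (-2.33e-3, 3.98e-2, 1.64e+1),
    "SimpBase": (-6.77e-4, 2.09e-1, 2.38e+0),
    "Comp":     (1.29e-4, 3.16e-1, 4.86e+0),
    "CompBase": (3.47e-3, 1.41e-1, 1.56e+0),
}


def dsb_yield(category, dose_gy, let_kev_um):
    """Equation 3.2: yield of one DSB category for a proton dose and LETt."""
    a, b, c = DSB_FIT[category]
    return dose_gy * (a * let_kev_um**2 + b * let_kev_um + c)


def all_dsb_yields(dose_gy, let_kev_um):
    """Yield of every DSB category at the given dose and LETt."""
    return {name: dsb_yield(name, dose_gy, let_kev_um) for name in DSB_FIT}


yields = all_dsb_yields(dose_gy=1.0, let_kev_um=5.0)
```

Because the fit is a polynomial, it should not be extrapolated beyond the LETt range that was simulated.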


3.3.4 Damage Complexity Distribution

The data presented in Section 3.3.3 describes the types of DNA damage produced across a proton LETt range. However, it may be of interest to determine the exact number of backbones and bases associated with a break. There is likely to be some variation in the biological response, even for breaks in the same category, i.e. a complex break with 5 damaged backbones may produce a different response than a complex break with 3 damaged backbones. The distribution of backbones and bases involved in a cluster is shown for a range of LETt (Figure 3.10), for the case of direct with indirect damage. These distributions are presented as cumulative distribution functions (CDF). Figure 3.10 presents some of these CDFs, with data selected to represent a range across the clinically relevant LETt. Figure 3.10 shows that the most likely cluster, for any proton LETt, involves two backbones and no bases, consistent with the data shown in Figure 3.9b. The CDFs can be described by Equation 3.3, with 3 fitted parameters. This correlative equation is shown for the CDFs in Figure 3.10 with dashed lines. The parameters of Equation 3.3 are fit across the LETt range, shown in Figure 3.11.

$$\mathrm{CDF}(\nu) = 1 - \exp\left[\sqrt{\nu + a} - b\,(\nu + a)^{n}\right] \qquad (3.3)$$

Where n and b are fitted parameters. ν is the cluster size, the number of bases or backbones in the cluster. a takes the value of 0 for the backbone distribution and 2 for the base distributions. Equation 3.3 has validity for cluster sizes ν ≥ 2 when describing the backbone distributions, and ν ≥ 0 when describing the base distributions.
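A sketch of how Equation 3.3 could be evaluated and sampled is given below. Two points are assumptions: the square root is read as covering ν + a, which is an interpretation of the typeset equation, and the values of b and n are hypothetical placeholders, since the fitted values are only shown graphically in Figure 3.11. Sampling uses a standard inverse transform over integer cluster sizes.

```python
import math
import random


def cluster_cdf(nu, b, n, a):
    """Equation 3.3, reading the radical as sqrt(nu + a) (an assumption)."""
    x = nu + a
    return 1.0 - math.exp(math.sqrt(x) - b * x**n)


def sample_cluster_size(b, n, a, nu_min, rng, nu_max=100):
    """Inverse-transform sample: smallest cluster size whose CDF reaches u."""
    u = rng.random()
    for nu in range(nu_min, nu_max + 1):
        if cluster_cdf(nu, b, n, a) >= u:
            return nu
    return nu_max


# Hypothetical fit values, for illustration only.
B_EXAMPLE, N_EXAMPLE = 1.23, 1.0
rng = random.Random(0)
backbone_sizes = [sample_cluster_size(B_EXAMPLE, N_EXAMPLE, 0, 2, rng) for _ in range(5)]
base_sizes = [sample_cluster_size(B_EXAMPLE, N_EXAMPLE, 2, 0, rng) for _ in range(5)]
```

Backbone sampling starts at ν = 2 with a = 0 and base sampling at ν = 0 with a = 2, following the validity ranges stated for the equation.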


Figure 3.10 The cumulative probability distribution of forming a cluster with a given number of a) backbones, b) bases. Dashed lines show Equation 3.3, with fitted parameters. Higher LETt protons have a greater probability of inducing DSBs containing multiple damaged backbones and bases, leading to more complex breaks.

Figure 3.11 The Cumulative Distribution Functions that describe the number of backbones and bases involved in a DSB can be determined by a 3-parameter fit (b, n, a). This figure shows the dependence of the fitted parameters on proton LETt for a) backbones and b) bases. Parameters for the photon distribution fits are shown as open symbols at LETt = 0.2 keV/µm. Error bars show the asymptotic standard error in the mean from the fit.

3.3.5 Clinically Relevant Considerations

The optimal DNA geometry and direct damage model combination has been determined as the quarter cylinders and energy range, shown in Figure 3.6. The probability of OH radicals causing backbone damage was determined as 0.5, shown in Figure 3.7. These models of direct and indirect DNA damage are applied to the chromatin fibre. When combined with the cell model, yields of each damage type are predicted and correlations are determined (Figures 3.8 and 3.9). These correlations are used to predict damages across a proton SOBP, Figure 3.12, with 1 Gy across the dose plateau region. The SOBP is composed of 9 pristine Bragg peaks, with a maximum energy of 150 MeV, simulated with the Geant4 “QGSP_BIC” physics list. Here, it can be seen that across the entire simulated proton dose depth profile the predominant DSB type is the simple form, involving two backbones only. By comparing with the types of DSB produced by Co-60 simulation, an RBE for damage is predicted, shown in Figure 3.12c. Here, the total yields of damage types predicted by proton irradiation are divided by the yields that are predicted for the same physical dose of photons. RBETotal is the ratio of complete yields of DSB, RBESimple is the ratio of yields for DSBs that contain two backbones only, and RBEComplex is the ratio of yields for DSBs that contain more than two backbones and/or base damages.
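The three damage RBE definitions reduce to simple ratios of yields at the same physical dose. The sketch below assumes the yields are supplied per DSB category using the category names of Table 3.1, with "Simp" taken as the two-backbone-only class; the numerical yields are hypothetical and serve only to show the calculation.

```python
COMPLEX_CATEGORIES = ("SimpBase", "Comp", "CompBase")  # anything beyond two backbones only


def damage_rbe(proton_yields, photon_yields):
    """Return RBE_Total, RBE_Simple and RBE_Complex for matched physical doses."""
    total = sum(proton_yields.values()) / sum(photon_yields.values())
    simple = proton_yields["Simp"] / photon_yields["Simp"]
    complex_ratio = (
        sum(proton_yields[k] for k in COMPLEX_CATEGORIES)
        / sum(photon_yields[k] for k in COMPLEX_CATEGORIES)
    )
    return {"Total": total, "Simple": simple, "Complex": complex_ratio}


# Hypothetical yields (DSBs per cell) at the same physical dose.
proton = {"Simp": 16.0, "SimpBase": 4.0, "Comp": 6.0, "CompBase": 2.0}
photon = {"Simp": 16.0, "SimpBase": 3.0, "Comp": 5.0, "CompBase": 2.0}
rbe = damage_rbe(proton, photon)
```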


Figure 3.12 Using the correlations from this work the a) proton physical dose and LETt are used to predict the b) types of induced DSBs in an idealised cell across a SOBP. Simple DSBs, involving only two backbones, are the dominant damage type across the entire depth profile. The shaded area shows uncertainties from ±1σ on the fitted parameters. c) RBE of damage for the yield of total, simple, or complex breaks predicted from protons divided by the same yields predicted for photons of the same physical dose.

3.4 Discussion

This work presents the results of proton and photon induced DNA damage simulations, with the aim of predicting the types of DNA damage produced by clinically relevant LETt. The mechanisms of direct and indirect DNA damage are investigated through simulation of dry plasmid experiments and irradiation of a chromatin fibre model. When combined with a model of the cell nucleus, predictions of the absolute yields for the DSB types are made. Through simulation at a number of proton LETt a series of correlations are produced. These correlations are used to make clinically relevant predictions of DSB types across a proton SOBP.

The results of the plasmid simulation, Figure 3.6, show that the quarter cylinder DNA model with strand breaks determined by an energy range probability most closely reproduces experimental data. However, there is large variability in the experimental data. For example, the data from Vysin (260) and Souici (296) are from experiments with the same plasmid (pBR322) and under similar conditions. Although the studies do not investigate the same LETt range, extrapolation of the results would suggest disagreement in measured strand break yields. The purpose of this comparison between simulation and experiment of plasmids is to determine the mechanism of direct damage, which can then be used to investigate more complex biological systems. However, without specificity and agreement in the experimental data it is difficult to conclude whether the mechanism has truly been understood. Instead, the assumptions that have been made, a 10 bp damage separation and damage determined by an energy range probability, can only be described as conditions that produce reasonable results when compared to experiment. Alternative conditions can also agree with experiment; for example, recent work from Villagrasa et al. (299) showed that using an energy threshold as the mechanism for direct strand breaks reproduces the experimental yield of 53BP1 foci. Since the mechanism may not have been identified, applying these conditions to a larger, more complex system requires caution. However, this does not negate the results presented in this work, since similar damage yields are predicted when compared to the literature data. Instead it highlights the need for more data in order to give confidence in the hypothesised mechanisms of direct DNA damage.

Similar statements can be made about the fits to indirect damage. The simulations presented in this work can reproduce the estimated 65% strand breakage from indirect damage induced by Co-60, Figure 3.7. However, this methodology does not fully describe the mechanisms of indirect damage. Here, the mechanisms of chemical reaction and DNA damage are combined into a single probability. A better approach would involve simulations based on molecular dynamics (300), where the interaction of free radicals with DNA can be explicitly modelled with no prior fitting. However, with current computational power this would severely limit the scale of the simulation. It is here that the chosen approach shows merit, allowing simulation of chemical diffusion in a system of relevant size. The chosen approach is similar to recent work by Meylan et al. (222), except that in their work the authors split the action of chemical reaction and DNA damage. In our work we neglect the spectrum of chemical attacks that do not lead to DNA strand breaks but could hinder repair. For example, our simulations do not account for DNA adducts. This type of scoring would be necessary to fully quantify DNA repair efficacy, and to capture effects such as those described by the oxygen fixation hypothesis (116). This is particularly important when considering hypoxia and other non-ambient biological conditions.

Figure 3.7 shows that across the proton LETt range investigated the majority of strand breaks are formed from indirect action. This suggests that simulations such as these are likely to underestimate the yield of complex damage if indirect damage is not included. Since the diffusion of the OH radical is on the order of nanometres, for the time scales investigated, the most likely effect is to add damage to a directly induced cluster. This is shown more explicitly in Figure 3.9. Here, it can be seen that by only simulating direct effects there is a much higher probability of simple DSBs. Analytically going from these direct DSBs to the types of DSBs produced when indirect effects are included is difficult, since there is no constant scaling factor. This would imply that scoring direct DSBs does not give a clear indication of the complexity when indirect effects are included. However, direct damage alone does reproduce the relative yields of DSB types below around 20 keV/µm, with simple DSBs most likely and complex base DSBs the least likely.

Figure 3.10 shows how the specific break complexity varies with proton LETt. Here, the number of backbones and bases involved in a DSB is shown as the cumulative probability. This CDF can be sampled to reproduce the DSBs predicted by the complex geometry and track structure simulations. The CDFs can be summarised by a three-parameter correlation, Equation 3.3. Equation 3.3 can reproduce the CDF across the LETt range investigated, with fitted parameters shown in Figure 3.11. The parameters of the CDF fit can in turn be predicted as a function of LETt, allowing for reproduction of the specific break structure as a function of LETt alone. However, this fit becomes particularly noisy at low LETt, and we find that photons do not fit the LETt correlation for backbones. The CDF fits presented in Figure 3.11 can be coupled to more simple damage simulations that do not explicitly model the DNA geometries, such as the cell model presented in this work. This allows for much faster simulation of DNA damage whilst retaining the same level of detail. For those interested only in break complexity Equation 3.3 can be used to determine the spectrum of DSB type with no simulation, provided LETt is known or assumed.

The LETt and dose dependencies shown in Figures 3.8 & 3.9 are used to give total DSB yield as well as the relative types of DSB (Equation 3.2). Due to investigation across a large range of proton LETt values correlations can be determined, by applying polynomial fits. These fits give no information on the underlying mechanism, but instead allow for prediction of DSB types as a function of LETt. The detail of these predictions can be further increased with use of Equation 3.3. All the fits developed in this work are used to make clinically relevant predictions across a proton SOBP, Figure 3.12. Here, proton dose and LETt are scored in silico in a bulk water phantom. These conventional dosimetric units are then converted into biologically relevant values, giving the yields of DSB type across the proton range. The predominant DSB type is the simple form, involving only two backbones.

The more complex DSBs gain significance across the SOBP plateau region, where LETt and dose are both high. A slight increase in complexity can be seen at the distal edge; however, the effect is minimal, with the total yield of complex breaks increasing from around 8 DSBs/Cell at the start of the plateau to 11 DSBs/Cell at the end of the plateau. The yield of complex DSBs remains relatively constant across the plateau region, with a change of around ±10%. With such a small increase in complexity it is difficult to foresee this as being the cause for increased biological effectiveness, unless the small increase in complexity results in a significant increase in cell kill. This would then imply that the complex DSBs are very toxic to the cell. In fact, when comparing the predicted induction of complex breaks between protons and photons it can be seen that the RBEComplex increases from 0.95, at the entrance, to 1.47 at the distal edge. These trends are largely consistent with some of the predictions made by phenomenological models of RBE for cell kill (89). This change in DSB complexity is accompanied by another mechanism, namely spatial damage distribution factors such as DSB proximity; considered together, the two can likely explain the experimentally observed increase in cell kill with LET. Our previous work (134) has shown that DSBs are induced more proximally with increasing LETt. The closer proximity leads to a higher yield of misrepaired DSBs, some of which will lead to chromosome aberrations and a heightened toxicity. There are biologically relevant damages that are not accounted for in this work, aside from the described break complexity and proximity. For example, as previously mentioned, chemical attacks that lead to adducts are neglected here. These types of damage are likely to slow the repair process, or create irreparable damage (116), which would increase the biological effectiveness.

3.5 Summary and Conclusions

In this work simplistic DNA geometries were assessed for their applicability in predicting DNA damage through track structure simulation. The mechanisms of direct and indirect damage were investigated through the simulated irradiation of plasmids and the chromatin fibre. A combination of DNA geometry, direct damage model, and indirect damage model was determined in order to reproduce yields of damage that have been published in the literature: namely, a quarter cylinder model of DNA, an energy range probability for direct damage, and a probability for OH interaction with DNA leading to a strand break. With these models, predictions of proton and photon induced clustered and non-clustered DNA damage were made. These predictions show that there is a slight increase in damage complexity with increasing LETt. Specifics of the DSB can be predicted as a function of LETt alone, where the cumulative distribution function that describes the number of backbones and bases included in a DSB can be reproduced with a 3-parameter correlation (Equation 3.3). This allows for the quick determination of DSB complexity without the need for track structure simulation. By including a model of the nucleus, predictions are made for the absolute yields of DSB type as a function of dose and LETt. Here, correlations are drawn and are applied to the clinically relevant case of a one-dimensional proton SOBP; similarly, correlations are drawn for cells irradiated by photons. By comparing the yields of DSB type for protons and photons at the same physical dose, an RBE for damage is calculated. This RBE increases with proton depth, since it is largely a function of LETt. The trends predicted for RBE of damage are similar to those predicted for RBE of cell death by the phenomenological models, lending weight to the idea that cell kill is related to break complexity. The work presented here does not include a model of DSB repair, and as such we do not assign toxicity to the types of breaks predicted. Instead, we assume that break complexity is related to cell kill and show the RBE of induction for different complexities. The data produced by this work can be further used as an input to biological models, which may help to elucidate dependencies between the damage pattern and outcomes.

3.6 Acknowledgements N T Henthorn would like to acknowledge financial support from EPSRC (grant No.: EP/J500094/1). The computational element of this research was achieved using the Condor High Throughput Computing facility at the University of Manchester.

4. In Silico Non-Homologous End Joining Following Ion Induced DNA Double Strand Breaks Predicts That Repair Fidelity Depends on Break Density

This work was published in Scientific Reports in 2018. The paper presented in this chapter has been modified to use section numbering. The main focus of this work was to investigate the aspects of the ion induced DNA damage that impact upon the resulting biological response. The work combines the DNA damage simulations that I have developed during this PhD with a simulation of DNA repair developed by my colleague J. Warmenhoven. We focussed on two biological outcomes, the misrepair of DSBs and the incomplete repair of DSBs (residuals). Misrepaired DSBs lead to genomic instability and are experimentally observable, to some extent, as chromosome aberrations. Residual DSBs, in contrast, are likely to lead to a pause in the cell cycle, and if unresolved will cause cell death. We score these parameters at 24 hours post irradiation for a number of reasons: firstly, the typical radiotherapy schedule includes a 24-hour gap between fractions; secondly, 24 hours is on the order of time for a typical cell cycle phase; and thirdly, experimental studies often report biological outcomes at 24 hours (providing data to compare against). The combination of damage and repair models highlighted a number of emergent properties, though most interesting was the concept of DSB density, which we term the “cluster density”. We found that by counting the average number of neighbouring DSBs for any given DSB we were able to linearly predict the probability of misrepair. This demonstrates an example of where the damage pattern influences biological response. The mechanism can be explained briefly. A DSB end has a certain amount of mobility, exploring a small volume around its site of formation. The number of other DSB ends within this volume, the cluster density, gives the number of potential partners for an end to repair with. As this number increases so does misrepair, following a linear relation. The cluster density depends on ion species and LET, with higher LET producing denser DSBs. 
Through a series of correlative equations, we are able to fit parameters to reproduce the model combination outcome predictions. This allows for the average behaviour of the complex models to be summarised quickly through dose and LET. This is an important step when considering clinical implementation, or at least prediction of clinically relevant information, since speed is paramount. Within the work we use these correlations to predict the yield of residual and misrepaired DSBs with proton depth. This was applied to a 1D SOBP, though in practice this can be applied to a complete 3D treatment plan, provided dose and LET are scored.

Author Contributions

I developed the DNA damage model. J. Warmenhoven developed the DNA repair model. J. Warmenhoven and I gathered and analysed the data. M. Sotiropoulos developed the dose and LET depth code used in Figure 4.4. J. Warmenhoven and I drafted the manuscript based on discussions with the other authors. All authors reviewed and approved the manuscript.

In Silico Non-Homologous End Joining Following Ion Induced DNA Double Strand Breaks Predicts That Repair Fidelity Depends on Break Density

N T Henthorn1,*, J W Warmenhoven1,*,‡, M Sotiropoulos1, R I Mackay2, N F Kirkby1,3, K J Kirkby1,3 and M J Merchant1,3
1 Division of Molecular and Clinical Cancer Sciences, Faculty of Biology, Medicine and Health, University of Manchester, UK
2 Christie Medical Physics and Engineering, The Christie NHS Foundation Trust, Manchester, UK
3 The Christie NHS Foundation Trust, Manchester, UK
* Both authors contributed equally to this work
‡ Correspondence to: [email protected]

This work uses Monte Carlo simulations to investigate the dependence of residual and misrepaired double strand breaks (DSBs) at 24 hours on the initial damage pattern created during ion therapy. We present results from a nanometric DNA damage simulation coupled to a mechanistic model of Non-Homologous End Joining, capable of predicting the position, complexity, and repair of DSBs. The initial damage pattern is scored by calculating the average number of DSBs within 70 nm from every DSB. We show that this local DSB density, referred to as the cluster density, can linearly predict misrepair regardless of ion species. The models predict that the fraction of residual DSBs is constant, with 7.3% of DSBs left unrepaired following 24 hours of repair. Through simulation over a range of doses and linear energy transfer (LET) we derive simple correlations capable of predicting residual and misrepaired DSBs. These equations are applicable to ion therapy treatment planning where both dose and LET are scored. This is demonstrated by applying the correlations to an example of a clinical proton spread out Bragg peak. Here we see a considerable biological effect past the distal edge, dominated by residual DSBs.

4.1 Introduction

The induction of DNA double strand breaks (DSBs) is the mechanism by which radiation kills cells, with DSBs either repaired, misrepaired, or left unrepaired (residual). Misrepaired DSBs, considered potentially lethal lesions (142), can lead to chromosome-type or chromatid-type aberrations, which are experimentally measurable (301). These aberrations cause genomic instability, contributing to the chance of cell death (302) and development of secondary cancers. Residual DSBs are likely to cause cell cycle arrest if detected by the various cell cycle checkpoints. The unresolved residual DSBs will cause cell death, and residuals that are missed by the G2/M checkpoint will cause the cell to undergo mitotic catastrophe. For this reason residual DSBs are conventionally considered to contribute substantially to the likelihood of cell death (302). A difference in the cell killing effectiveness between radiation qualities has previously been noted in the literature (39,303), where a greater effect is seen for higher linear energy transfer (LET) radiations. Clinically it is important to understand this difference as it impacts the required dose for tumour control between radiation qualities, such as photons and protons. In practice, this is accounted for by use of the proton relative biological effectiveness (RBE), which is the ratio of the photon dose to the proton dose required to achieve iso-effectiveness. An RBE value of 1.1 is in clinical use to give the physical proton dose that is biologically equivalent to the required photon dose (29,72). However, it has been shown that the proton RBE is variable depending on a number of factors including dose, fraction, α/β ratio, and LET (73). Marshall et al. (268) have recently shown that consideration of RBE variation along a proton Spread Out Bragg Peak (SOBP) results in increases in both the delivered dose and range when compared to a constant RBE. To fully exploit the beneficial dose deposition characteristics of protons over photons, the variable RBE must be understood and incorporated into Treatment Planning Software (TPS). RBE has been linked to LET, where the LET of a particle is defined as the energy deposited per unit length. An increase in LET is equivalent to an increase in the number of ionisations in the same distance. In the context of particle therapy, a higher LET causes a greater number of DSBs (304) as well as a higher complexity of DSB (257). 
The proximity between DSBs is also dependent on the LET, with high LET leading to more proximal breaks due to the closer spacing between energy depositions (305). Assuming irradiation in the same cell line and cell cycle phase, i.e. same α/β or simplistically “the same biology”, the success or failure of the cell’s DNA damage response (DDR) depends only on these three DSB factors: the number, complexity, and proximity of DSBs, regardless of the radiation quality used. Similar statements have been expressed in other work, for example the Local Effects Model (LEM) (146,306). Break complexity and proximity have both been suggested to influence misrepair (159). It has been proposed that either a class of DSB exists that has a propensity for causing aberrations (202,307), or that misrepair occurs through pairwise interaction of DSBs dependent on their proximity (308). The influence of DSB proximity on chromosome aberrations has been reviewed previously (274) and has been investigated experimentally (309). Determining the distance over which these DSB interactions occur is of importance for mechanistic models of DNA repair (162,163). However, predictions for the value of this DSB separation vary within the literature, with values ranging from 0.25 µm to 1.30 µm (174,305,310–313). The motion of DSB ends has a large impact on this value and is a debated topic in the literature. Some experimental work has shown that DSB ends in metazoans have limited motion in the nucleus and are unable to explore the entire volume (314,315), travelling less than 0.5 µm in a 10 µm diameter nucleus (316). On the other hand, evidence has also been published to suggest greater mobility of DSB ends (317), particularly along the tracks of damage caused by heavy ions (317,318). More directed motion has also been suggested by Neumaier et al., reporting the observation of repair centres in human cells (132). To fully understand how DSB end mobility leads to chromosome aberrations, a more detailed repair model must be used. Averbeck et al. (319) have shown that DSBs induced across the carbon SOBP are efficiently repaired, although there is an increase in cell death. The authors attribute this increase to a greater number of misrepaired DSBs, highlighting the role of misrepair in RBE. It is known that, for the same cell line, the RBE with different ion species at the same LET (iso-LET) is substantially different (143). This implies that the change in RBE of ions arises from differences in track structure (320), the energy deposition pattern at the nanoscale. This can explain both DSB complexity and proximity, information that is not fully encompassed by the LET alone. The concept of track structure leading to biological outcomes is a core principle of the emerging field of nanodosimetry (94). To our knowledge, no study has assessed the difference in misrepair induction between ion species and across a range of LET relevant to the clinical setting. In this work, Monte Carlo simulations are used to predict the damage and repair of cells irradiated by protons, alpha particles, and carbon ions at a range of iso-LET. 
By scoring the average number of DSBs within close proximity of each other we are able to show a linear relation to misrepair across all the radiation qualities. We believe that this local DSB density, which we refer to as the cluster density, can be used as a predictor of chromosome aberrations. The residual DSBs at 24 hours of repair are also investigated. Our model predicts a constant fraction of DSBs left unrepaired, regardless of ion type, dose, or LET. From analysis of the results, we have developed correlations that are capable of predicting the yield of residual and misrepaired DSBs for our model. These correlations express the impact that the physics of the beam, dose and LET, has on the early biological effect, misrepaired and residual DSBs at 24 hours. The trends displayed by these correlations for misrepair and residual DSBs have a relevance to proton therapy planning. It is recognised that the work contained here covers a wide range of fields which not all readers may be familiar with. We include a table summarising the field specific terms used (Appendix A1.1).

4.2 Methods

4.2.1 Simulation of DSBs

This work uses the Monte Carlo based toolkit Geant4 (10.02 patch 01) (179), with the default G4EmDNAPhysics list. Incerti et al. have presented and validated the parameters of this physics list (164), which has been used in a number of published DNA damage simulations (168,217,263). The Geant4-DNA extension accurately models event-by-event particle tracking down to low energies, simulating each track as a series of steps determined by physical interactions. We use this toolkit to simulate the transport of different ion species across a water medium representing a simplified cell model. Using water as a surrogate for biological material has become a standard assumption in Monte Carlo studies of DNA damage (173). The cell geometry consists of a 5 µm diameter spherical nucleus, as might be typical of a lymphocyte (321), in the centre of a 10 µm box, used as a surrogate for the cellular cytoplasm. A uniform average dose is delivered to the cell through methods described in Appendix A1.2. DSBs are calculated according to the principles of nanodosimetry by assessing the clustering of energy depositions within the nucleus. This work follows the methodology of Francis et al. (262), where assumptions are made in order to convert energy depositions into strand breaks. The methodology assumes that a fraction of the nucleus is occupied by the sensitive material (DNA backbones). We determined the sensitive fraction of the nucleus as 15% by fitting the predicted initial yield of DSBs to literature data (109,168,246,255,263) (Appendix A1.3). The cell model implicitly considers indirect damage by fitting the sensitive fraction to literature data that includes indirect effects. In our model energy depositions in the sensitive volume above 37.5 eV are guaranteed to cause strand damage, whilst energy depositions below 5 eV cannot cause strand damage, with the probability increasing linearly between these two thresholds (109).
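The energy-dependent acceptance rule described above can be illustrated with a minimal Python sketch. It assumes only the two thresholds quoted in the text (5 eV and 37.5 eV) and a linear ramp between them; the function names are hypothetical and this is not the code used in this work.

```python
import random

E_MIN_EV = 5.0    # at or below this energy no strand damage is induced (from the text)
E_MAX_EV = 37.5   # at or above this energy strand damage is certain (from the text)


def damage_probability(energy_ev: float) -> float:
    """Probability of strand damage: linear ramp from 0 at 5 eV to 1 at 37.5 eV."""
    if energy_ev <= E_MIN_EV:
        return 0.0
    if energy_ev >= E_MAX_EV:
        return 1.0
    return (energy_ev - E_MIN_EV) / (E_MAX_EV - E_MIN_EV)


def causes_damage(energy_ev: float, rng: random.Random) -> bool:
    """Sample whether a single energy deposition is accepted as a damage site."""
    return rng.random() < damage_probability(energy_ev)
```

Under this rule a deposition of 21.25 eV, the midpoint of the ramp, is accepted half of the time.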
An accepted damage site is randomly assigned to strand 1 or 2 of the double helix and the position is recorded. Following the simulation of all primary particles, to achieve the required dose, the list of damage sites is analysed by a clustering algorithm based on a modified implementation of the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm (294). The algorithm determines DSBs by clustering damage sites that are on opposite strands and separated by a maximum distance of 3.32 nm, equivalent to the separation of 1 helical turn (10 bp) (237).

A separate simulation of the chromatin fibre is used to determine the structure of individual DSBs, most of the details of which were reported in our previous work (282). We build a model of the double helix, with backbones and bases modelled as quarter cylinders and half cylinders (283) with volumes of 0.28 nm³ and 0.13 nm³ respectively. The double helix is wound around cylindrical histones in 1.65 left-handed turns to form nucleosomes, which are then arranged in the solenoid chromatin conformation (227). The chromatin is irradiated with primary ions matching the energy of those used in the cell model. For each incident ion, energy depositions are cumulatively scored in the DNA volumes and the same energy dependent probability of damage induction is applied (5-37.5 eV). Indirect damage is included with use of the Geant4-DNA chemistry modules (200). Hydroxyl radicals crossing a DNA backbone or base are assigned a probability of inducing damage, with the probability fitted to produce 65% strand damage due to indirect effects when the fibre is irradiated by a 60-Co source (109). DNA volumes damaged by direct and/or indirect effects are recorded per primary particle. The damaged volumes are analysed by an improved clustering algorithm, since both the genomic separation between damage sites and their strand is known. This removes any discrepancy that may arise due to damaged backbones within 3.32 nm but separated by more than 10 bp. A damaged base is included in the cluster if it is within 3 bp of the extreme ends of the backbones involved in the break (257). Damaged bases directly attached to a damaged backbone are not included in the cluster since it is assumed that these will be removed along with the backbone during repair. The process is repeated for at least 10⁶ primary ions, recording details of the induced DSBs per primary, to develop a break complexity library. The results of the cell and fibre model are combined to give the positions of DSBs, calculated as the central coordinate of the damage sites in the cluster, and the break complexity, defined as the number of extra SSBs and base lesions included in the break. This damage data is then passed to our repair model.
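As an illustration of the cell-model clustering step described earlier in this section, the sketch below groups damage sites that lie within 3.32 nm of each other and reports a DSB wherever a group contains damage on both strands, with the DSB position taken as the centroid of the group. It is a simplified single-linkage stand-in written for this explanation, not the modified DBSCAN implementation used in this work; the function name and the `(x, y, z, strand)` tuple layout are assumptions.

```python
import math
from itertools import combinations

MAX_SEP_NM = 3.32  # maximum separation of damage sites within one DSB (from the text)


def find_dsbs(sites, max_sep=MAX_SEP_NM):
    """Return DSB centroids from damage sites given as (x, y, z, strand) in nm.

    Sites closer than max_sep are linked; a linked group containing
    damage on both strands (1 and 2) is counted as one DSB.
    """
    parent = list(range(len(sites)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(sites)), 2):
        if math.dist(sites[i][:3], sites[j][:3]) <= max_sep:
            parent[find(i)] = find(j)

    groups = {}
    for i, site in enumerate(sites):
        groups.setdefault(find(i), []).append(site)

    dsbs = []
    for members in groups.values():
        if {s[3] for s in members} == {1, 2}:  # damage on opposite strands
            n = len(members)
            dsbs.append(tuple(sum(s[k] for s in members) / n for k in range(3)))
    return dsbs
```

For example, two sites 2 nm apart on opposite strands form one DSB, whereas two nearby sites on the same strand do not.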

4.2.2 Non-Homologous End Joining Repair Model

The geometric distribution and complexity of DSBs are used to convert each DSB into two pseudo-molecules representing exposed DNA strand ends. Ends are allowed to move within the cell nucleus by sub-diffusive motion, where mean squared displacement scales with time as t^α, with α < 1. This has been shown experimentally to better model DSB end mobility compared to Brownian motion (315). We have assumed the continuous time random walk implementation of sub-diffusion (322). In this method DSB ends are assigned a random waiting time drawn from an exponential distribution, during which they do not move. After this waiting time, the end is displaced in a random direction with length drawn from a Gaussian distribution, and a new waiting time is assigned. Any proposed step leaving the nuclear envelope is rejected and resampled. Simultaneously with their motion, DSB ends undergo a series of stochastic, time constant based state changes. Each change of state represents the recruitment and action of a particular protein or complex of proteins, implicitly assumed to be randomly distributed throughout the nucleus. Time constants are fitted to reproduce the known recruitment kinetics of these proteins or complexes of proteins. We have chosen to model only the canonical non-homologous end joining (c-NHEJ) process as it is computationally less intensive; the impact of this is discussed later.

The c-NHEJ process has an agreed core group of proteins which bind sequentially during repair (323). DSB ends are initially placed with no associated proteins. The ends then either change to an inhibited state or to a state with Ku70/80 attached. The inhibited state represents attachment of proteins from competing processes, which occupy the DSB end and prevent Ku70/80 from binding. Once a DSB end has recruited Ku70/80 it cannot freely dissociate back to a naked DSB end (324), but must progress to a state with DNA-PKcs attached. Once a DSB end has recruited Ku70/80 and DNA-PKcs (DNA-PK complex) it cannot dissociate to a previous state (325). Two ends in the DNA-PK complex state that are within 25 nm of each other can react to form a long range synaptic complex (326). In this complex DNA-PKcs can cross/auto-phosphorylate, and in turn phosphorylate Ku70/80 (324). The synaptic complex can either dissociate to two DSB ends with no attached proteins, or move to a short range synaptic complex state (326). Dissociation occurs with a short time constant, resulting in only transient formation of the long range complex (326). We have used this model to reproduce results of fluorescence recovery after photobleaching (FRAP) experiments reported in the literature (325). Once in a short range synaptic complex state the repair pathway cleans all extra lesions associated with the break before final ligation can occur, with time constants taken from literature (174). The final ligation step represents the actions of XLF, XRCC4, DNA-Ligase IV, and polymerases. The c-NHEJ repair model allows us to investigate correctly and incorrectly repaired DSBs, as well as the number of residual DSBs for a given repair time. The full details of this repair model will be published separately at a later date.
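The step rules of the random walk described above (exponential waiting times, Gaussian step lengths in a random direction, and rejection of steps that leave the nucleus) can be sketched as follows. The mean waiting time and step width are hypothetical inputs, since their fitted values are not given in this section, and taking the absolute value of the Gaussian draw as the step length is an assumption. The sketch follows the description literally and makes no claim about reproducing the fitted mobility of the full model.

```python
import math
import random


def ctrw_displacement(start, total_time_s, mean_wait_s, step_sigma_nm,
                      nucleus_radius_nm=2500.0, rng=None):
    """Move one DSB end from `start` (nm, nucleus-centred) for total_time_s seconds.

    mean_wait_s and step_sigma_nm are hypothetical parameters; the nucleus
    radius defaults to 2500 nm (the 5 um diameter nucleus in the text).
    """
    if math.sqrt(sum(c * c for c in start)) > nucleus_radius_nm:
        raise ValueError("start position must lie inside the nucleus")
    rng = rng or random.Random()
    pos = list(start)
    t = rng.expovariate(1.0 / mean_wait_s)  # first waiting time
    while t < total_time_s:
        while True:  # resample until the proposed step stays inside the nucleus
            direction = [rng.gauss(0.0, 1.0) for _ in range(3)]  # isotropic
            norm = math.sqrt(sum(c * c for c in direction))
            if norm == 0.0:
                continue
            length = abs(rng.gauss(0.0, step_sigma_nm))  # assumed half-normal length
            trial = [p + length * c / norm for p, c in zip(pos, direction)]
            if math.sqrt(sum(c * c for c in trial)) <= nucleus_radius_nm:
                pos = trial
                break
        t += rng.expovariate(1.0 / mean_wait_s)  # next waiting time
    return tuple(pos)
```

Repeating the walk for many ends and histogramming the distance from `start` would give a displacement distribution of the kind summarised for the full model in the Results.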

4.2.3 Data Availability

The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.

4.3 Results

4.3.1 Misrepair and LET

The cell model was irradiated with 1, 2, or 5 Gy of protons, alphas, and carbon-12⁶⁺ across a range of track averaged linear energy transfer (LETt). The NHEJ model was run for a repair time of 24 hours, tracking motion, protein recruitment, and the repair of DSB ends. The fraction of DSBs that misrepaired was recorded, calculated as a fraction of the DSBs that are joined at 24 hours. Figure 4.1a shows a difference in misrepair between the three ions investigated, with the difference becoming more pronounced at higher values of LETt. Similar values for the fraction misrepaired are seen between different doses, though there is a difference in the absolute number misrepaired. Carbon-12⁶⁺ and alpha particles show relatively similar behaviour; however, protons have a higher fractional misrepair. We investigate this difference in terms of the repair pathway and the ion induced damage patterns.

Figure 4.1 a) The average misrepair of DSBs as a fraction of total DSBs joined at 24 hours at a range of LETt for protons (p), alphas (α), and carbon-12⁶⁺ (C). A dose independent difference in misrepair between iso-LETt ions is shown. Carbon-12⁶⁺ shows anomalous behaviour between LETt of 20-30 keV/µm, discussed later. Error bars show the standard error in the mean between 200 repeat simulations. Lines show second order polynomial fits to guide the eye. b) The DSB separation probability density function for iso-LETt protons (p), alphas (α), and carbon-12⁶⁺ (C) in a 5 µm diameter spherical nucleus. The distributions show the probability that another DSB is located at a given separation from any DSB. Differences are only seen at small separations (inset), due to separations between intra-track DSBs. c) The DSB end displacement at 24 hours, showing that 95% of DSB ends move 168 nm or less in 24 hours. This limits DSB interactions to nearby neighbours where the DSB separation distribution is different between the ions.

Within the c-NHEJ model only two DNA-PK complexes can undergo synapsis, causing possible misrepair. Ku70/80 has a high affinity for naked DSB ends, and once loaded rapidly recruits DNA-PKcs (327–329). The recruitment of Ku70/80 is largely unaffected by break complexity (329), and both Ku70/80 and DNA-PKcs are abundant in the cell making depletion of local concentrations near composite break sites unlikely (330). Therefore, in the model there is a similar and rapid formation of DNA-PK complexes at break sites, regardless of the radiation quality used. As such this cannot explain the difference in misrepair observed.

Different particles at iso-LETt produce different track structures as they pass through the nucleus. The resultant difference in the energy deposition pattern leads to differences in DSB complexity. To determine the influence of break complexity on misrepair for a single ion species, the repair model was rerun for two cases: one populated with only simple DSBs and one populated with only complex DSBs. The predicted misrepair fraction for either case is not different from the data in Figure 4.1a (Appendix A1.4). Furthermore, we investigate the effect that break complexity has on our ability to differentiate misrepair between ion species, using the same cases as above. For the data in Figure 4.1a, the differences in misrepair are statistically indistinguishable below an LETt of 5 keV/µm. Neither case alters this value, showing that within our model the simulated break complexity does not impact misrepair.

A further implication of the different energy deposition patterns is a difference in the spatial distribution of the DSBs themselves. We assess proximity between DSBs by determining the probability that a DSB is within a radial distance from another DSB. Figure 4.1b shows the probability density functions (PDF) of DSB separation for the three ion species at an iso-LETt equivalent to the distal edge of the proton SOBP (27 keV/µm). A non-zero probability for separations of 0 µm is shown since this bin includes DSBs that are separated by between 0 and 10 nm. Similarities at larger DSB separations are seen, indicating that it is equally probable to have DSBs separated by a value greater than around 0.5 µm. Below 0.5 µm a difference can be seen between the ions, due to separations of intra-track DSBs. Here, we see a greater probability of proximal DSBs in the proton-irradiated cell compared to the alpha and carbon-12⁶⁺ cases. The importance of this scale is further established in Figure 4.1c. This shows that the end displacement in 24 hours, with 95% of ends moving less than 168 nm, limits interactions to nearby breaks only and coincides with the DSB separations for which the iso-LETt ions are different. As such, break proximity can explain differences in misrepair.

4.3.2 Cluster Density, Misrepair, and Residual DSBs

To quantify the initial damage patterns, the average number of neighbouring DSBs within a given radius of each DSB is determined, which we refer to as the “cluster density”. If motion of DSB ends is predominantly responsible for misrepair, then this value scales with the number of potential incorrect partners a DSB end has available for interaction. The cell model was irradiated with the same radiation qualities as shown in Figure 4.1a, and the average cluster density was calculated for 2500 repeat simulations. We find that the cluster density has the strongest correlation to misrepair when calculated with a radial distance of 70 nm, giving a minimum in χ². However, the strength of this correlation is not greatly decreased by deviations of ± 30 nm from this value (Appendix A1.5). Figure 4.2a shows a strong linear relation between cluster density and misrepair. This shows that the local DSB concentration (cluster density) in the initial damage pattern is predictive of the misrepair at 24 hours in our model. Based on our results, we suggest this physico-marker can be used to indicate the impact that beam parameters have on biological response. Cluster density can be estimated from the ion LETt through a fitted second order polynomial (Appendix A1.6), with parameters given in Table 4.1. Generally, higher LETt results in a higher cluster density, with protons producing a higher cluster density relative to iso-LETt alphas (a similar trend as seen in Figure 4.1a).
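The cluster density defined above can be computed directly from a list of DSB positions. The following is a minimal brute-force sketch, assuming positions are given in nanometres and using the 70 nm radius from the text; the function name is hypothetical and the pairwise loop is chosen for clarity rather than speed.

```python
import math


def cluster_density(dsb_positions_nm, radius_nm=70.0):
    """Average number of neighbouring DSBs within radius_nm of each DSB."""
    n = len(dsb_positions_nm)
    if n == 0:
        return 0.0
    neighbour_total = 0
    for i, p in enumerate(dsb_positions_nm):
        for j, q in enumerate(dsb_positions_nm):
            # Count every other DSB separated by less than the radius.
            if i != j and math.dist(p, q) < radius_nm:
                neighbour_total += 1
    return neighbour_total / n
```

For three DSBs where two are 50 nm apart and the third is isolated, two of the three breaks have one neighbour each, giving a cluster density of 2/3.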


Figure 4.2 a) The average number of DSBs separated by less than 70 nm from any given DSB (cluster density) and the corresponding misrepair at 24 hours for protons (p), alphas (α), and carbon-12⁶⁺ (C), showing a linear relation (R² > 0.9) with fitting parameters presented in Table 4.1. Error bars represent the standard error in the mean between multiple repeat simulations (2500 for cluster density, 200 for misrepair). b) Fraction of DSBs left unrepaired following 24 hours of repair across the LETt range investigated. The solid line shows a constant residual fraction of 0.073. Dashed lines show the standard deviation in this value. Error bars are the standard error in the mean for 200 repeats. c) The predicted number of DSBs per unit dose for protons, alphas and carbon-12⁶⁺ as a function of LETt, with lines showing linear fits for protons and alphas. There is a clear discontinuity in the carbon-12⁶⁺ data. Error bars are the standard error in the mean for 2500 repeats. Due to the anomalous behaviour of carbon-12⁶⁺, its data is not used to determine any correlations but is included in the plots for interest only.

In the context of radiotherapy, residual DSBs are another biological endpoint of interest. Our model predicts that the fraction of residual DSBs following 24 hours of repair is constant across the radiation qualities investigated, with an average of 7.3% of breaks left unrepaired. Figure 4.2b shows that this constancy holds across the ion species, dose, and LETt range investigated. To apply the misrepair and residual correlations to a clinically relevant case, and to make predictions of absolute yields, the number of DSBs is determined as a function of LETt and dose. For the clinically relevant LETt range of 0-20 keV/µm we observe a linear correlation between LETt and initial DSB yield for protons and alphas (Figure 4.2c). This linear trend has been reported in the literature for both ions (109,255,331). The DSB yield for alpha particles begins to deviate from linearity above ≈30 keV/µm. However, in the rest of this work we only use the correlation within the bounds of its linearity. The constant fraction of residual DSBs at 24 hours combined with the linear dependence of initial DSB yield on LETt leads to a linear relationship between the yield of residuals and LETt (Appendix A1.8); this has been experimentally observed for 53BP1 foci under similar conditions by Chaudhary et al. (332), although at a systematically lower yield. Potential causes for this are discussed later. The parameters for all correlations are given in Table 4.1.

The DSB yield for carbon ions is linear up to ≈20 keV/µm, after which there appears to be a discontinuity, also noticed by other users of Geant4-DNA (personal communication S. McMahon). The behaviour originates from a switching between relativistic and classical calculations in the Geant4-DNA models. This may have implications for studies simulating clinical carbon ions (Appendix A1.7). It is possible to use only relativistic calculations, though this has not been validated by the Geant4-DNA collaboration, and further implications are unknown. As such we have chosen to use the default Geant4-DNA models and not include carbon-12⁶⁺ data in any of the correlations.

4.3.3 Residual and Misrepaired DSB Yields

Here we present the correlations from Figure 4.2a, Figure 4.2b, and Figure 4.2c. We convert the correlations of fractional misrepair and residual DSBs into absolute yields, shown in equations 4.4 and 4.6. In order to apply the correlations to an example of a clinical proton SOBP, we determine the dependency on dose and LETt, shown in equations 4.5 and 4.7.

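$$F_{Mis} = a \cdot \text{Cluster Density} + b \tag{4.1}$$

$$F_{Res} = c \tag{4.2}$$

$$N_{DSB} = D \cdot (d \cdot L + e) \tag{4.3}$$

$$\text{Misrepair Yield} = N_{DSB} \times \{a \cdot \text{Cluster Density} + b\} \times \{1 - c\} \tag{4.4}$$

$$\phantom{\text{Misrepair Yield}} = D \cdot (d \cdot L + e) \times \{a \cdot [f \cdot L^{2} + g \cdot L + h] + b\} \times \{1 - c\} \tag{4.5}$$

$$\text{Residual Yield} = N_{DSB} \times c \tag{4.6}$$

$$\phantom{\text{Residual Yield}} = D \cdot (d \cdot L + e) \times c \tag{4.7}$$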

Where $F_{Res}$ is the fraction of DSBs that are unresolved at 24 hours, $F_{Mis}$ is the fraction of repaired DSBs that are incorrectly joined at 24 hours, $N_{DSB}$ is the initial number of DSBs induced by the exposure, Cluster Density is the average number of neighbouring DSBs within a 70 nm radius, $L$ is the track averaged linear energy transfer in units of keV/µm, $D$ is the dose in units of Gy, and $a$-$h$ are parameters determined through best fitting, the values of which are reported in Table 4.1 for protons and alphas.

Parameter    Proton ± %          Alpha ± %
a            0.1966 ± 0.4        0.1966 ± 0.6
b            0.008 ± 3.4         0.008 ± 3.4
c            0.0736 ± 0.2        0.0736 ± 0.2
d            1.149 ± 1.0         1.02 ± 2.5
e            24.10 ± 0.6         20.2 ± 1.3
f            4.879E-4 ± 0.8      3.25E-3 ± 5.0
g            2.84E-3 ± 4.7       1.61E-3 ± 33
h            5.13E-2 ± 1.6       5.51E-2 ± 5.7

Table 4.1 The correlation fitting parameters developed in this work for protons and alphas with their asymptotic standard error shown as a percentage error.
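As a worked illustration of equations 4.3, 4.5 and 4.7, the sketch below evaluates the correlations with the proton parameters of Table 4.1. The function and variable names are hypothetical and chosen for this example only; the numerical values are copied from the table and the central fitted values are used without their uncertainties.

```python
# Proton fitting parameters copied from Table 4.1 (central values only).
PROTON = {"a": 0.1966, "b": 0.008, "c": 0.0736, "d": 1.149,
          "e": 24.10, "f": 4.879e-4, "g": 2.84e-3, "h": 5.13e-2}


def initial_dsbs(dose_gy, let_kev_um, p=PROTON):
    """Equation 4.3: initial number of DSBs for dose D (Gy) and LETt L (keV/um)."""
    return dose_gy * (p["d"] * let_kev_um + p["e"])


def cluster_density_from_let(let_kev_um, p=PROTON):
    """Second order polynomial estimate of cluster density from LETt."""
    return p["f"] * let_kev_um ** 2 + p["g"] * let_kev_um + p["h"]


def misrepair_yield(dose_gy, let_kev_um, p=PROTON):
    """Equation 4.5: absolute yield of misrepaired DSBs at 24 hours."""
    f_mis = p["a"] * cluster_density_from_let(let_kev_um, p) + p["b"]
    return initial_dsbs(dose_gy, let_kev_um, p) * f_mis * (1.0 - p["c"])


def residual_yield(dose_gy, let_kev_um, p=PROTON):
    """Equation 4.7: absolute yield of residual DSBs at 24 hours."""
    return initial_dsbs(dose_gy, let_kev_um, p) * p["c"]
```

For a 2 Gy proton exposure these functions give roughly 5.9 misrepaired and 6.9 residual DSBs at 20 keV/µm, and roughly 2.2 and 5.2 at 10 keV/µm.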

Figure 4.3 shows how the yield of residual and misrepaired DSBs predicted by equations 4.5 and 4.7 increases across a clinically relevant LETt range (332) whilst keeping dose constant. The plotted data points show the results of our model and show good agreement with the correlations derived earlier. There is a difference in behaviour between the predicted yield of residual and misrepaired DSBs, with misrepaired DSBs becoming more important at high LETt.

Figure 4.3 The correlation predicted yield of misrepaired (equation 4.5) and residual (equation 4.7) DSBs across the clinically relevant proton LETt range, with a constant dose of a) 1 Gy, b) 2 Gy, and c) 5 Gy. Points show values measured within the simulation. Error bars in the yield are the standard error in the mean of 200 repeats. Error bars in the LETt are the standard error in the mean for 50,000 repeats. Both yield and LETt error bars are too small to be seen on this scale.

The proton SOBP depth dose and LETt profile is simulated using the Geant4 “QGSP_BIC” physics list (Figure 4.4a). Here, the SOBP is simulated from the combination of nine pristine Bragg peaks, where the maximum proton energy is 150 MeV. Using equations 4.5 and 4.7, the expected yields of misrepaired and residual DSBs per nucleus are calculated as a function of proton depth (Figure 4.4b). Due to low proton fluence there is increasing noise in LETt at the distal end of Figure 4.4a. However, since the dose in this region is negligible the noise in LETt does not impact our predictions. The undulation of the plateau is due to the combination of the nine pristine Bragg peaks. Figure 4.4b shows a gradual increase in residual and misrepaired DSBs across the SOBP, with a more pronounced peak at the distal end. After this, the yields of residual and misrepaired DSBs fall off at a slower rate than physical dose, resulting in biological effect at low dose regions past the SOBP. Interestingly, the peak of residual DSBs coincides with the distal edge of the dose plateau whilst the peak of misrepaired DSBs is situated at a slightly increased depth. Combined with Figure 4.3 this demonstrates the higher sensitivity that misrepair events have to LETt. For this 2 Gy SOBP misrepaired DSBs are always predicted at a lower yield than residual DSBs; however, the plotted ratio shows that the relative importance of misrepair is highest at the distal edge.


Figure 4.4 a) The depth dose (solid) and LETt (dashed) profile of an example proton SOBP with 2 Gy across the plateau comprised of nine pristine Bragg peaks. b) The predictions of residual and misrepaired DSB yields for the same proton SOBP, calculated with equations 4.7 and 4.5 respectively. The ratio of residual DSBs to misrepaired DSBs is shown as a dashed line. The solid black vertical line on both graphs denotes the depth where the physical dose begins to fall off (15.58 cm).

4.4 Discussion and Conclusions

In this work, we demonstrate plausible mechanisms which, combined, lead to the conclusion that the early biological effect of radiation depends on the extent to which DSBs are clustered, quantified here by the cluster density. The fraction of misrepaired DSBs can be predicted by cluster density, independent of ion type, dose, LETt, and break complexity. The cluster density produced by different ion species is not the same at the same LET (Figure 4.1b), and the ions therefore produce different early biological effects (Figure 4.1a). We suggest that this can explain the experimentally observed differences in cell kill and mutation caused by different ion species at iso-LET (143). This cluster density dependent response is the result of three factors: the time required for an exposed DNA end to be prepared for re-joining, the motion of these broken ends during this time, and the fact that formation of synaptic complexes depends on co-localisation of two DSB ends. The diffusion of breaks will likely result in separation of initially co-localised correct partners and, in regions of high DSB density, increases the possibility of co-localising with an incorrect partner. We quantify this in terms of cluster density. The cluster density can be estimated from LETt for a given ion type (Appendix A1.6) and is dose independent. Only intra-track DSBs are proximal enough to contribute to the cluster density. We do not see any considerable numbers of DSBs formed as a result of inter-track energy depositions, which we have investigated up to 5 Gy (data not shown).

Our model predicts that the fraction of DSBs left unrepaired is constant, independent of ion type, dose, and LETt (Figure 4.2b). This arises due to the tight confinement of DSB ends around their initial break site (Figure 4.1c). Only a small fraction of DSB ends are able to escape this confinement, and are then unlikely to meet another end. The escaping fraction is solely a product of the sub-diffusive motion and is therefore not dependent on the beam parameters. The number of initial breaks has a linear relationship with dose and with LETt; we therefore predict that the absolute yield of residual DSBs scales linearly with these factors. This is in agreement with current clinical practice, where treatment plans use dose as a surrogate for cell kill. Furthermore, it lends weight to the concept of LET optimisation.

Individual NHEJ pathways are thought to respond to different break complexities (333). Our repair model only contains the DNA-PKcs dependent c-NHEJ pathway, capable of resolving simple and complex breaks. For this reason the kinetics of our model were fitted to experimental data from laser generated complex DSBs (325). The model omits DNA-PKcs independent pathways, which resolve simple breaks and have been suggested to complete faster (333).
However, the above mechanism of misrepair is still applicable for these pathways, since they also require time dependent recruitment of proteins to mobile ends before synapsis between co-localised partners. Differences would arise in the time required to prepare ends for re-joining, leading to a difference in the radius at which the cluster density should be determined. This means that our cluster density radius is likely a misestimation for fully NHEJ capable cells. The effect of this is somewhat dampened by the fact that some proportion of these simple DSBs can be resolved by c-NHEJ. We therefore do not expect the addition of DNA-PKcs independent repair pathways to substantially impact the results presented.

The recruitment kinetics of Ku70/80 and DNA-PKcs are fitted to laser irradiated in vitro Xrs6 and V3 cells with c-NHEJ pathways (325). Repair kinetics are fitted to proton irradiated in vitro AG01522 cells (332). As such the response of our model is not representative of a specific cell type. However, it can be shown that the mechanism of cluster density dependent misrepair holds for any NHEJ capable cell type, following the same justification as above. The induction of DNA damage in this model is fitted to the DSB yields of other in silico models for a range of cell types (109,168,246,255,263), with a constant nucleus size of 5 µm diameter. Therefore, any deviation from this size will change the DSB density, as the same number of DSBs will be distributed across a different volume. The initial damage is therefore also not representative of any specific cell type. A misestimation in the initial DSB yield would lead to misestimation in the yields of residual and misrepaired DSBs. A misestimation in the DSB density would lead to changes in the cluster density, and consequently in the fraction of misrepaired DSBs. However, potential misestimation only changes the initial yield, not the mechanism of cluster density dependent misrepair. Given all of the above, we would expect the correlations detailed in this work to hold for both end points in any NHEJ capable cell, but with cell specific changes in the predicted yields. This change in yield, but not in trends, could explain the systematically lower yield of residual DSBs compared to the experimental results of Chaudhary et al. (332).

The model does not include the Homologous Recombination (HR) pathway, which is an “error free” pathway available to cells occupying the relevant cell cycle stages. It has been shown that NHEJ is a faster repair pathway than HR and is dominant throughout the cell cycle (130). The core c-NHEJ proteins, DNA-PKcs and Ku70/80, have similar, rapid recruitment kinetics throughout the cell cycle (325). Phosphorylation of these recruited proteins causes dissociation from DSB ends (in our model this has a time constant of 95 seconds), providing an opportunity for processing by HR (325). However, our model shows that the timescale of interest for predicting misrepair is much less than this (Appendix A1.9) and therefore we do not expect addition of HR to cause a notable deviation in cellular behaviour from current predictions.

We have implemented sub-diffusion in our model by a continuous time random walk method with no form of tethering.
This results in limited motion of DSB ends on the order of 100 nm at 24 hours, similar to that which has been reported for live cell experiments (314). This limited motion means that the size of the nucleus, on the order of microns, does not affect the availability of potential partners. In V(D)J recombination the sub-diffusive motion is better described by fractional Langevin motion (fLm) (334). If fLm were to apply to DSB repair in general, broken ends would be more strictly confined around their formation site. However, the confinement scale proposed would allow for an exploration volume that is larger than the volume of interest for misrepair, which we show to have a radius of around 70 nm (Figure 4.2a). Therefore, we do not expect it to influence misrepair substantially. The scale of motion associated with DSB ends in vivo is a debated topic within the literature. The only assumption made regarding this in our model is that the motion is sub-diffusive. The magnitude of the mobility of these ends was then scaled in order to reproduce the endpoint of residual breaks reported in literature (332). The final scale of motion, and the fact that it agrees with the more conservative experimental results, is therefore an emergent property. To achieve greater mobility whilst retaining agreement with the literature reported residual yields would require a different mechanism. One possible mechanism that has previously been proposed, and observed experimentally by some (132), is the formation of repair centres. This directed motion could result in large displacements of DSB ends whilst still maintaining their proximity, a necessary component for resolution of breaks.

This work uses a detailed short segment of the chromatin fibre to determine break complexity. The model includes damage from both direct and indirect effects in order to predict the DSB complexity. However, we do not consider the impact of some chemical effects such as the oxygen fixation hypothesis (116,118). Inclusion of these processes would likely yield higher residuals at 24 hours due to a decrease in repair efficacy. Furthermore, our process of converting damage sites into DSBs is a simplification that does not account for deletions of very short DNA segments from complex breaks. If these segments are of insufficient length to recruit repair proteins (335) they are both irreparable and experimentally unmeasurable in fluorescent repair kinetics experiments (271). We therefore would expect to underestimate the true yield of residuals.

We have discussed reasons that lead us to believe the mechanisms behind residual and misrepaired DSBs are applicable to all NHEJ capable cells. This would result in conserved behaviour between cell types, but with cell specific yields of residuals and misrepaired DSBs. We therefore propose that the general behaviour predicted by our model from the interaction of these mechanisms is representative of reality. The components of the biological response, residual and misrepaired DSBs, are correlated to our results and combined to give equations 4.4 and 4.6. These correlations show how the physical descriptors of the incident beam that are relevant to biological response can be summed up using cluster density.
The mechanisms we have proposed for misrepair and residuals respond differently to LETt (Figure 4.3); however, both scale equally with dose. This means that our model predicts no change in the ratio of cells undergoing misrepair for either hypo- or hyper-fractionation.

However, it also means that, regardless of dose, misrepair will be dominant at LETt corresponding to the dose fall-off region of the proton SOBP, >22 keV/µm (Figure 4.3b). Assuming that misrepair is a marker for chromosome aberration, this potentially means that areas of higher genomic instability are created at the distal edge of the treatment volume. A second clinically relevant trend is predicted: using the 2 Gy case as an example (Figure 4.3b), a change in LETt from 20 keV/µm to 10 keV/µm reduces the average yield of misrepaired DSBs from 6.1 to 2.1, whilst only decreasing the average yield of residual DSBs from 6.8 to 5.1. Treatment plans extend the prescribed dose region beyond the gross tumour volume (GTV) partly in order to treat microscopic spread. In these regions, and bordering areas, it would seem that induction of genomic instability could be limited by moving high LET components

into the tumour, whilst a similar cell kill potential could be achieved by maintaining dose. Predictions of our model can be derived from more conventionally scored parameters in proton therapy, equations 4.5 and 4.7. The predicted biological response, shown in Figure 4.4, follows similar trends to the phenomenological predictions of RBE across a proton SOBP (89,268). In both cases there is a gradual increase in response across the dose plateau with a pronounced peak at the distal edge. As such it lends weight to the concept that RBE is driven by a combination of residual and misrepaired DSBs. Because both residual and misrepaired DSB yields fall off at a slower rate than physical dose, the model predicts considerable biological effect past the distal edge of the SOBP. Misrepair falls off at a slower rate than residual DSBs; however, the latter still dominates the yield. Therefore, we suggest that the biologically extended range reported by Marshall et al. (268), and others, is most likely due to the presence of residual DSBs. For the proton SOBP presented here, misrepaired DSBs are predicted to peak at a slightly increased depth compared to residual DSBs, although the yields always remain lower. During dose escalation, this peak will be the first region where misrepair starts to dominate the biological response. Since the misrepair peak is deeper than the physical dose peak, this could have implications clinically. Methods of LET optimisation have been suggested for clinical use, with the justification that RBE is dependent on LET (56,83,89). The difference in response of residual and misrepaired DSBs to LETt further emphasises the benefits of this type of planning, but suggests that prediction of biological outcome is not a straightforward combination of dose and LET. Therefore, if the benefits of proton therapy are to be fully exploited, the mechanisms we have proposed should be considered at the treatment planning stage.
This could allow the manipulation of not only RBE in general but also the dominant pathway taken to cell death; a difference which could possibly be exploited clinically.

4.5 Acknowledgements

N.T.H. and J.W.W. would like to acknowledge financial support from EPSRC (grant No.: EP/J500094/1). The computational element of this research was achieved using the Condor High Throughput Computing facility at the University of Manchester.

5. SDD: A Standardised Data Format to Record DNA Damage

Whilst developing the models detailed in the preceding chapters, an important step was comparing their outcomes to experimental data or to predictions from other models. Such verification and validation often requires extracting data from publications. Some metrics, such as DSB yield, are commonly reported. However, investigating the performance of the mechanisms on which the models are based requires more fundamental data. For example, two models may apply different mechanisms to induce direct DNA damage and yet predict similar yields of DSBs. In order to benchmark new simulations and investigate the mechanisms of existing models it would be immensely useful to record the model outputs in a standard format. Not only would this allow for the development of common analysis tools and the derivation of common metrics, it would also allow the output of any damage model to be used in follow-up models of biological response. This in turn allows investigation of how small changes in damage predictions affect predictions of biological response. With this kind of follow-up test it is possible to determine the characteristics of the damage that influence the repair.

Funding was awarded as part of a sandpit project proposal to develop a new standard for reporting results from DNA damage simulations (SDD), funded by STFC Global Challenge Network+ in Advanced Radiotherapy and EPSRC Grand Challenge Network+ in Proton Therapy. The SDD, and the following manuscript, have been disseminated to many researchers in the field. This standard is still under development, with the many authors contributing new ideas. The following chapter presents details of the SDD at the time of writing.

Author Contributions

Myself, J. Warmenhoven, M. Merchant, J. Schuemann, and S. McMahon were awarded the funding to develop the new standard format. All other authors, as researchers with a model of DNA damage or users of such data, have read the draft manuscript and offered useful details on the kind of information they would like to see recorded in a standard. J. Schuemann drafted the manuscript with feedback from A. McNamara, J. Warmenhoven, N. Henthorn, M. Merchant, and S. McMahon.

SDD: A Standardised Data Format to Record DNA Damage

J. Schuemann1, A. McNamara1, J. W. Warmenhoven2, N. T. Henthorn2, M. J. Merchant2, H. Paganetti1, KD. Held1, J. Ramos, B. Faddegon, J. Perl, D. Goodhead, I. Plante, H. Rabus, W. Friedland, P. Kundrat, A. Ottolenghi, G. Baiocco, M. Dingfelder, S. Incerti, C. Villagrasa, M. Bueno, M. Bernal, S. Guatelli, J. Brown, Z. Francis, I. Kyriakou, N. Lampe, D. Sakata, F. Ballarini, F. Cucinotta, R. Schulte, R. Stewart, D. Carlson, S. Galer, Z. Kuncic, S. LaCombe, J. Milligan, F. Salvat, T. Sato, and S. J. McMahonN

1 Massachusetts General Hospital & Harvard Medical School, Boston, MA
2 Division of Cancer Sciences, The University of Manchester, Manchester, UK
…
N Centre for Cancer Research & Cell Biology, Queen's University Belfast, Belfast, UK

Our understanding of radiation induced cellular damage has greatly improved over the past decades. Despite this progress, there are still many obstacles to fully understanding how radiation interacts with biologically relevant components to form observable endpoints. One of these hurdles is the fact that it is difficult for researchers of different groups to directly compare their results. Multiple Monte Carlo codes have been developed to simulate damage induction at the DNA-scale, while at the same time various groups have developed models that describe DNA repair processes with varying detail. However, these repair models are intrinsically linked to the damage model employed in their development, making it difficult to disentangle systematic effects in either part of the modelling chain. Here we propose a new standard data format to record DNA damage (SDD) to unify the interface between damage induction simulations and biological modelling. Such a standard greatly facilitates inter-model comparisons, providing an ideal environment to tease out model assumptions and identify persistent, underlying mechanisms. Through such inter-model comparisons, this unified standard has the potential to greatly advance our understanding of the underlying mechanisms of radiation induced damages and the resulting biological effects.

5.1 Introduction

Cellular responses to radiation damage have been studied for many decades, showing the dependency of DNA damage on the delivered dose as well as particle type and energy. Numerous models have been developed to explain these responses across a range of endpoints, including DNA damage, mutations, micronuclei, chromosome aberrations and cell survival. Many of these are

phenomenological macroscopic models, and simply relate cellular endpoints to the delivered dose and empirical parameters expressing cell sensitivity, which can depend on the cell line, irradiation conditions, and irradiation quality. These include the linear quadratic (LQ) survival model, which is widely used both experimentally and clinically. To more systematically include the observed dependence of cell survival on the ionisation pattern of the radiation modality, i.e. the particle type and energy, various models have been proposed that explicitly include additional physics properties such as the linear energy transfer (LET) in the cell survival calculation. However, such models are also primarily phenomenological and their parameters are dependent on fitting to a selected data set, rather than more fundamental radiobiology. A phenomenological approach can be well suited to capture the overall population-based trends in cell survival necessary to describe the effects of radiation therapy, or to estimate effects of radiation exposure from environmental or space radiation. However, to advance the field towards more individualised therapies we must study the underlying biological mechanisms of cellular response to radiation exposure.

Nuclear DNA has long been established as the primary radiation target determining cell viability. The response of cells to radiation has been shown to correlate with the pattern of energy depositions within the nucleus, which is attributed to the resulting differences in patterns and types of DNA damage. Several decades ago, the first studies using Monte Carlo simulations were performed to correlate the track structure of different radiation modalities with DNA geometries and the probability of damage induction. These studies represent the first attempts to mechanistically understand how radiation energy depositions lead to DNA damage.
In recent years, several major developments have led to a surge in attempts to describe DNA damage and repair kinetics mechanistically. In particular, an increase in the computational power of standard computers has enabled the simulation of particle tracks in DNA fragments and even whole nuclei. This has been accompanied by improvements in imaging techniques for studying the responses of cells to ionising radiation, providing sufficient data to determine the importance of repair pathways and their effect on cell viability. Currently, several Monte Carlo simulation codes exist that can provide the full track structure of particles passing through a medium (typically water, but more recently also including DNA base pair material) on the nanoscale. Most of these codes also include the first chemical reactions, i.e. the physicochemical production of reactive oxygen species (ROS) and their subsequent diffusion and interaction. Thus, these Monte Carlo codes can provide estimates of both the directly (from the initial particle track) and indirectly (from chemical reactions) induced DNA damages.

The patterns of damage along the DNA strands, as well as their complexity, can then be used in models to describe the mechanisms of DNA repair. At the same time, multiple groups have developed mechanistic models of DNA damage repair. These modelling approaches typically either use an assumed (often random) distribution of damage or are designed specifically to interface with one of the available track structure codes in an ad hoc fashion. While typically based on similar principles, these models often employ different approaches and make different assumptions about the underlying repair processes. In addition, inter-comparison between repair models is often complicated by their close links with underlying damage models, which introduce implicit assumptions and dependencies that may not be apparent on simple inspection. In order to fully describe the impact of DNA damage induction and repair on cell survival, chromosome aberrations, mutations or other endpoints of interest, these two efforts need to be combined, providing an accurate description of the initial physics and chemistry processes, followed by mechanistic models of damage repair. To better understand the dependencies of predicted endpoints on uncertainties and assumptions made in each part of this modelling chain, a direct comparison between models and simulation results from different groups would be immensely useful. However, because of the differences in damage model outputs and dependencies between different damage and repair models, these comparisons are significantly complicated. In this paper, we propose a new Standard to record DNA Damage (SDD) to facilitate cross-comparisons between the various track-structure Monte Carlo codes and their implementations of first chemical reactions within the cell nucleus, and to link these to mechanistic models of cell repair and the kinetics of DNA damage repair.
This data format provides a new method for cross-code comparisons and promotes collaborations between groups by allowing easy sharing of DNA damage patterns at selected time points, i.e. after the initial energy depositions (direct damages) or after the chemical stage (including indirect damages), as input to calculate the biological endpoint of interest. While non-nuclear damages can also result in cells becoming non-viable, the proposed standard focuses on the main pathway of cell damage, i.e. damage to the DNA, to provide a compact and easily transferable format. This standard is primarily designed to collect nuclear damage information for mammalian cells; however, it can also be applied to other DNA-containing targets, such as bacteria or viruses. In that case, some of the cell-specific information listed in the standard may be omitted. We indicate in some fields where bacterial/viral information can be used instead.

Figure 5.1 Illustration of the header and data structure of the proposed Standard DNA Damage (SDD) format. The information common to all events is listed in the header and the information relevant to each damage is recorded in the data section.

5.2 The New Standard to Record DNA Damage

The data format for the proposed Standard to record DNA Damages (SDD) is based on the format of a typical tuple and a phase space file. The file format for each damage specification consists of two parts:

1. A header consisting of a series of factors common to all damage sites in the file
2. A series of lines defining individual damage sites within the modelled volume

Our method intends to offer a well-defined standard, suitable to accommodate a wide range of underlying simulations and model designs. To achieve this, the header requires some basic information about the recorded damage patterns for automated read-in of standardised data, while at the same time providing free text sections to extend on the details of the simulation tools. For wide-spread readability, files employ a comma-separated value format in the header and commas to indicate a new field in the data block, with the data saved entirely as plain text. Additional comment lines are allowed to further comment on the irradiation, simulation, or modelling details, which can be included by starting a line with ‘#’, indicating it is to be ignored.
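The read-in conventions described above (plain text, comma-separated fields, and '#' comment lines) can be illustrated with a minimal sketch. The function names are hypothetical and not part of the standard, and the sketch assumes that free text values contain no commas of their own; a real reader would need a rule for that case. It also tolerates whitespace before the '#', which is a leniency of the sketch, not a rule of the standard.

```python
def strip_comments(lines):
    """Drop blank lines and comment lines (lines starting with '#').

    Leading whitespace before '#' is tolerated here; that leniency is an
    assumption of this sketch, not part of the standard.
    """
    return [ln for ln in lines if ln.strip() and not ln.lstrip().startswith("#")]


def split_header_line(line):
    """Split 'Name, v1, v2, ...' into (name, [v1, v2, ...]).

    Assumes free text values do not themselves contain commas.
    """
    name, _, rest = line.partition(",")
    values = [v.strip() for v in rest.split(",")] if rest.strip() else []
    return name.strip(), values
```

For example, a header line such as `Incident particles, 2212, 11` would be split into the name `Incident particles` and the two values `2212` and `11`, which remain strings until interpreted according to the field type.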

5.2.1 Website and Updates

This data format anticipates that with increasing use cases, numbering schemes will need to be expanded to define additional details in some fields. In order to keep the numbering scheme unique and continue to allow users to share their SDD files without ambiguity, we recommend that new numbering schemes are submitted to the SDD collaboration following the steps detailed on the website www.sdd.readthedocs.com. Each new specification for fields in the header or data block will be assigned a specified number and documentation about all fields will be provided and updated.

5.2.2 Header

The header provides information defining the conditions common to all entries of the following data block. The structure of the header is presented in Table 5.1. In order to ensure reproducibility and easy sharing of SDD files, the header contains information on the modelled geometry as well as the irradiation that caused the DNA damage. While the design of the SDD format and the description below are focussed on radiation induced damage, the data structure is flexible enough to allow scoring of other sources of DNA damage, for example from chemotherapeutic drugs. In that case, some of the lines in the header will be obsolete and the damage induction may be described in the free text lines. The SDD format can thus also be used to study the effects of combined treatments such as chemo-radiation therapy.

Field | Value | Notes | Type
1 | SDD version | Version number of SDD definition | SDDv1.0
2 | Source program | Program name & version | Free text
3 | Author | Corresponding author, date, references | Free text
4 | Simulation details | Description of details of simulation settings | Free text
5 | Source | Field to describe source properties | Free text
6 | Source Type | Using mono-energetic, phase space, GCR, … | Int
7 | Incident particles | Definition of primary incident irradiation particle(s) | PDG Code(s)
8 | Mean Particle Energy | Mean incident energy for each particle in MeV | Float(s)
9 | Energy Distribution | Full energy distribution specification | String(s) + Floats
10 | Particle distribution | Fraction of each particle in field | Float(s)
11 | Dose or fluence | Define dose or fluence in each exposure, or note that the simulation is for a single track | Int + Float (+Float)
12 | Irradiation Target | Description of simulated cell or target (DNA) region and microenvironment | Free text
13 | Bounding Volume | Shape parameter plus X,Y,Z extents (µm) | Int + 6 Float
14 | Chromosome sizes | Number and Base Pair size of chromosomes | Int + (Int) Floats
15 | DNA Density | Density of Base Pairs in volume (MBP/µm³) | Float
16 | Cell Cycle Phase | Cell Cycle Phase index and progression | Float
17 | DNA structure | Additional field to define DNA structure | 2 Ints
18 | In vitro / in vivo | Experimental condition | Int
19 | Oxygen content | As molar O2 concentration | Float
20 | Damage definition | Define how types of damage are determined | 1 Float + 1 Int + 1 Bool + 2 Float
21 | Time | Time point at which damages are recorded | Float
22 | File row count | Number of distinct damage lesions scored & events simulated | 2 Int
23 | Data Entries | Number of data fields included in the data set | 14 Bool
24 | Additional Information | Field for additional information that may be relevant | Free text

Table 5.1 Line by line summary of the header fields and their type.

The header data consists of 24 fields and is stored one element to a line. Table 5.1 summarises the proposed data fields and their format, and additional details for each line entry are provided below. Each line starts with the “value” string in the table followed by a comma followed by the types defined in Table 5.1.

Field 1, SDD version: SDD version number to allow tracking future modifications of the file structure and to enable automatic transformation of the data files after such modifications. The version detailed here is SDDv1.0.

Field 2, Source program: Here the program and version number that were used to obtain the DNA damages are described. This can be anything from a simple random sampling function to a combination of dedicated Monte Carlo codes. Due to the free text format, any additional information required to fully describe the damage induction model can be added here.

Field 3, Author: Lists the corresponding author of the simulations (name and email address) to allow for communication about the data provided and the date of the file creation. Additional references to publications relevant to the simulations can be listed here, recommended format: First author (et al), Title, Journal, edition, page, DOI.

Field 4, Simulation details: Free text line to describe further simulation details, ideally providing sufficient information to potentially produce a similar simulation setup. For example, this field should include information about the physics settings, e.g. which secondary particles are included in the simulations with their respective energy cut-off, where relevant, and the size of the world volume used in the simulation.

Field 5, Source: Provides a free text field to describe the particle source used for the simulation, in particular for scenarios that include multiple particle irradiations, use phase spaces or other functional forms such as the galactic cosmic rays (GCR). Additional information about the direction can be added here, too, if it is not included in the data block.

Field 6, Source Type: Integer giving a first overview of the particle source, separating the source into mono-energetic particles (1), a phase space used as source (2), multiple particles with energy distributions (3), or, as a special case of the latter, GCRs (4). Additional options can be submitted to the website.

Field 7, Incident Particles: Defines the radiation type(s) of the incident particle(s), using the particle specification by the Particle Data Group (PDG) to provide flexibility and a comprehensive handling of all known particle types, including (charged) ions and excited states of ions. Each incident particle type can be fully described by a single PDG code (integer). This field lists all incident particle types in the same order that further source definitions are provided in subsequent fields. Each particle type is comma separated.

Field 8, Mean particle energy: Lists the mean incident particle energy in units of MeV as a single float for each particle type listed in field 7 following the same order.

Field 9, Energy distribution: Is used to further specify the distribution of the incident radiation field. For mono-energetic fields (indicated in field 6), this field should be reduced to a single value of 0. For other fields, the expected format is a letter specifying the distribution (“G” for Gaussian, “B” for bifurcated Gaussian) and a series of space-separated distribution parameters. The mean, µ, is given by field 8. Values required are the variance (G), and left and right variance (B). This field should define one set of parameters for each particle type, using comma separation.

Field 10, Particle Distribution: Requires one number per particle type (defined in field 7, same order) to define the fraction of the fluence represented by each particle type.

Field 11, Dose or Fluence: The field contains first an integer to define the simulated radiation field as single track irradiation (0), a delivered dose (1), or a fluence (2). The second entry is a float given in units of Gray for dose, or particles per µm² for fluence. If only a single track was simulated, this value should be set to 0. A third, optional, value can be added to provide the standard deviation of exposures when averaging the dose or fluence.
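As an illustration of how a typed header field might be interpreted, the following minimal sketch decodes the values of field 11 (Dose or Fluence) as described above. The names `Exposure` and `parse_dose_or_fluence` are hypothetical, and the input is assumed to be the list of comma-separated value strings that follow the field name on the header line.

```python
from typing import List, NamedTuple, Optional

# Integer codes for header field 11, as listed in the field description.
EXPOSURE_MODES = {0: "single track", 1: "dose (Gy)", 2: "fluence (particles/um^2)"}


class Exposure(NamedTuple):
    mode: str
    value: float
    std_dev: Optional[float]  # optional third entry; None when absent


def parse_dose_or_fluence(values: List[str]) -> Exposure:
    """Decode field 11 from its comma-separated value strings."""
    if len(values) not in (2, 3):
        raise ValueError("field 11 expects 2 or 3 values")
    code = int(values[0])
    if code not in EXPOSURE_MODES:
        raise ValueError(f"unknown exposure code: {code}")
    std_dev = float(values[2]) if len(values) == 3 else None
    return Exposure(EXPOSURE_MODES[code], float(values[1]), std_dev)
```

Under these assumptions, the values `1, 2.0` would be read as a delivered dose of 2 Gy with no standard deviation recorded, and `0, 0` as a single track simulation.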

Field 12, Irradiation Target: A free text format line that allows for a detailed description of the irradiation target: the cell type, size, cell cycle stage and other properties relevant to the damage induction; nuclear size or size of sub-nuclear region simulated; and the potential presence of additional factors, such as nanoparticles for radio-enhancement or chemotherapeutic drugs. In case of bacterial/viral irradiations, the bacterial/viral DNA content can be defined here.

Field 13, Bounding Volume: Defines the extent of the simulation volume (relevant scoring volume) using a comma-separated list of an integer and 6 floats. The integer defines the shape of the bounding volume, typically the nucleus, as either a box (0), an ellipsoid (1), or a cylinder (2); other volume shapes can be added by submitting a request to the website to assign higher number integers. The shape definition is followed by three floats specifying the bounds of the volume in µm. For a box, the values are given in half lengths, i.e. from (+X,+Y,+Z) to (-X,-Y,-Z); for an ellipsoid, the floats define the half axes of the ellipsoid along each of these three axes, i.e. for the special case of a spherical bounding volume, X, Y, Z are identical; for a cylinder, X and Y define the half axes of the ellipsoid along these axes, and Z defines the half length of the cylinder extent (from +Z to -Z). The bounding box thereby also defines the origin of the coordinate system as the centre of the bounding box (i.e. the centre of the nucleus or cell). The last 3 floats define the three Euler rotation angles, to allow orienting the target in space according to the simulation setup.

Field 14, Chromosome sizes: Lists the number N of chromosomes in the nucleus (or in bacteria/viruses), followed by N floats for the size of each chromosome in mega base pairs (MBPs). The order of chromosomes listed here should be consistent with the chromosome ID used in field 5 of the data block. Each chromosome should be listed, i.e. a total of 46 for a normal human cell. This allows for the easy inclusion of cells with missing or multiploid chromosomes. Optionally, if only N is provided, the chromosomes are assumed to be uniform in size based on the density stored in field 15.

Field 15, DNA density: Describes the density of the DNA base pairs in the scoring volume in units of MBP per µm³ as a single float value.

Field 16, Cell Cycle Phase: Defines the cell cycle numerically as G0 (1), G1 (2), S (3), G2 (4), and M (5) using a float. Progression through a phase can be denoted by the fractional part of the field; for example, 3.7 indicates a cell 70% of the way through S phase. This option is included to allow more granular inclusion of asynchronous cell populations. For simulations without a specific cell phase, the value can be set to 0. The cell cycle phase is important to determine the presence of sister chromatids. It further influences the number of chromosome base pairs (BPs) listed in field 14: for cells in (late) S or G2, the number of BPs in a chromosome should only be half the total number of BPs, as they are repeated and identified by their chromatid number (CR) in field 5 of the data block.

Field 17, DNA structure: Field to define the DNA structure by two comma separated integers. The first integer defines the arrangement of DNA as whole nucleus (0), a heterochromatid section (1), euchromatid section (2), single DNA fibre (3), DNA wrapped around a single histone (4), DNA plasmid (5), or a simple circular (6) or straight (7) DNA section. The second integer indicates “naked” (0) or wet (1) DNA. Additional settings can be added and described by submitting them to the website.

Field 18, In vitro / in vivo: Describes the experimental conditions, defined by two comma separated integers. The first integer defines whether the simulations refer to in vitro (0) or in vivo (1) conditions. The second integer explains the in vitro conditions: it should be 0 for in vivo experiments, 1 for monolayers of cells, 2 for cell suspensions, and 3 for 3D grown tissue models. Additional conditions can be added by submitting requests to the website.
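The encoding of field 16 (Cell Cycle Phase), where the integer part selects the phase and the fractional part gives progression through it, can be sketched as follows. The function name `decode_cell_cycle` is hypothetical; the phase indices are those listed in the field description, and a value of 0 is treated as "no specific phase".

```python
import math

# Phase indices for header field 16, as listed in the field description.
PHASES = {1: "G0", 2: "G1", 3: "S", 4: "G2", 5: "M"}


def decode_cell_cycle(value: float):
    """Return (phase name, fractional progression) for a field 16 value.

    A value of 0 means no specific phase and is returned as (None, 0.0).
    """
    if value == 0:
        return None, 0.0
    fraction, index = math.modf(value)  # modf returns (fractional, integer) parts
    if int(index) not in PHASES:
        raise ValueError(f"unknown cell cycle index: {int(index)}")
    return PHASES[int(index)], fraction
```

With this sketch, the example value 3.7 from the text decodes to S phase with a progression of approximately 0.7 (subject to floating point rounding).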

Field 19, Oxygen content: Contains the molar oxygen (O2) concentration in the volume as a float in molars (M). If no value is provided, a normoxic cell is assumed. Other relevant concentrations, such as the concentration of various scavengers, should be defined in the free text format of field 24; they are not included here due to the wide range of potential scavenging agents. Potential additional fractions can be added by expanding the number of floats in this field through the website.

Field 20, Damage definition: This field defines how damage is scored and accumulated into distinct damage sites in the data block. It consists of a vector of the following values using comma separation:

1) Energy threshold to induce a break in eV (float).
2) Integer to define if damages are recorded as direct damages only (0) or including chemistry (1). Other types can be defined through the website but are not currently included.
3) A set of 3 values:
3.1) A Boolean to define if the following numbers are listed in number of BPs (0) or in nm (1).
3.2) This value sets the distance in base pairs or nm between backbone lesions that are considered double strand breaks (float).
3.3) If this value is set to -1, it indicates that base lesions are not scored. Non-negative values mean that damages to the bases add to the damage complexity and are stored in the data block. In that case, all base damages between backbone damages that form a DSB are stored. This value then determines the distance (in BP or nm) beyond the outer backbone damages where base damages are also stored in the same site (float).

Note: This field will influence the full break specification in the data part of the standard. Fields 20.1 and 20.2 influence which interactions are scored as damages; field 20.3 determines the distances between and around damages that are clustered in a single break record. However, together with the chromosome position (field 6 in the data block), the data block can be post-processed to yield new break clustering using different distances as desired. An example field 20 would look like: “Damage Definition, 17.5, 0, 0, 10, 3”. Here only interactions depositing at least 17.5 eV are counted as lesions, only lesions from direct track interactions are counted, distances are defined in number of BPs, two opposite-strand SSBs within a distance of 10 BP are called a DSB, and base damages are considered, with base damages up to 3 BPs on either side of backbone damages grouped in a single site.

Field 21, Time: This specifies the total simulation time over which damage is recorded for each primary particle, that is, the time in nanoseconds up to which the chemistry is simulated. For only direct (physics) events, this value should be set to 0.

Field 22, File row count: The first integer records the number of distinct damage lesions scored and should be identical to the row count of the data block. The second integer is a counter of how many primary particle ‘events’ were simulated. This counter is important to count events that did not cause any damage to the DNA, to accurately represent the probability of interactions and avoid overestimation of damage induction.

Field 23, Data entries: An array of 14 comma-separated Booleans to indicate which fields of the data block are filled.

Field 24, Additional Information: Allows for additional comments about the simulation that may be relevant for the scored damages. This can for example include further details on the physics settings, simulated geometries, material compositions, additional information about the source, potential scavenger concentrations in the cell, or other descriptions of the simulation or irradiated target that may be helpful to better understand the simulations or improve the biological modelling. This field can also be used to define new user-specified values for any field in the header or data block; however, we strongly recommend submitting such new settings to the standard collaboration so that they can be officially included on the website (www.sdd.readthedocs.com), ensuring that all users use the same uniquely defined values.
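To make the structure of field 20 (Damage definition) concrete, the sketch below parses the example line given in its description, “Damage Definition, 17.5, 0, 0, 10, 3”, into named values. The class and attribute names are hypothetical and chosen only for readability; the order and meaning of the five values follow the field description.

```python
from dataclasses import dataclass


@dataclass
class DamageDefinition:
    energy_threshold_ev: float  # minimum energy deposit counted as a lesion
    includes_chemistry: bool    # False: direct damage only (0); True: with chemistry (1)
    distances_in_nm: bool       # False: distances in BPs (0); True: in nm (1)
    dsb_separation: float       # max distance between backbone lesions forming a DSB
    base_lesion_margin: float   # -1 means base lesions are not scored

    @property
    def scores_base_lesions(self) -> bool:
        # Non-negative margins mean base damages are stored in the data block.
        return self.base_lesion_margin >= 0


def parse_damage_definition(line: str) -> DamageDefinition:
    """Parse a header line of the form 'Damage Definition, a, b, c, d, e'."""
    parts = [p.strip() for p in line.split(",")]
    values = parts[1:]  # parts[0] is the field name
    if len(values) != 5:
        raise ValueError("field 20 expects exactly 5 values")
    return DamageDefinition(
        energy_threshold_ev=float(values[0]),
        includes_chemistry=int(values[1]) == 1,
        distances_in_nm=int(values[2]) == 1,
        dsb_separation=float(values[3]),
        base_lesion_margin=float(values[4]),
    )
```

Applied to the example line, this gives a 17.5 eV threshold, direct damage only, distances in BPs, a DSB separation of 10 BP, and base lesions scored within 3 BP of the outer backbone damages.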

5.2.3 The Data Block

The data block is recorded in text (ASCII) format. Each damage site is stored on a single line with a series of space- and/or forward-slash-separated fields to define the structure of the damage. Each field will end with a comma to indicate the start of the next field in the data block. Some fields are required to identify the parent event, position and type of the damage; other fields are optional and provide additional information, and these can be left blank. However, the commas indicating the end of each field should still be included to facilitate read-in script design. In general, if the information is available, it should be added, and all optional fields can be filled to increase the value of the data. The data structure is summarised in Table 5.2 and detailed below.

Field | Value | Notes | Type | Req?
1 | New Event | Is damage associated with new event or exposure? | Int | Y
2 | X,Y,Z | Spatial X, Y, Z coordinates and extent (µm) | 3x3 Floats | *
3 | Damage Types | Types of damage at site (Base damage, SSB, DSB) | 3 Int | **
4 | Cause | Cause of damage - direct or indirect and number | 3 Int | N
5 | Chromosome related IDs | ID of chromosome/chromatid where damage occurred and on which arm (long/short) or specification of non-nuclear DNA type. | 4 Int | *
6 | Chromosome Pos. | Location of damage within chromosome | Float | *
7 | Full Break Spec | Full description of strand break structure | Special | **
8 | DNA Sequence | DNA Base Sequence around break site | Special | N
9 | Lesion Time | Time of each damage induction | Special | N
10 | Particle Types | PDG list of particles | Int(s) | N
11 | Energies | List of energies for each particle | Float(s) | N
12 | Translation | Starting position of each particle | Floats | N
13 | Rotation | Starting direction of each particle | Floats | N
14 | Particle Time | Starting time of each particle | Floats | N

Table 5.2 Value-by-value definition of the data fields used to score DNA damages. Of the fields marked “*”, either field 2 or fields 5 and 6 are required; similarly, one of the two fields marked “**” (field 3 or field 7) is required.

Field 1, New Event: Identifies if the damage site is associated with a new exposure or a new event. ‘Exposure’ here means the radiation dose or fluence defined in field 11 of the header (e.g. 1 Gy), while a new ‘event’ refers to damages created by a new (single) primary particle within the same exposure. Since a new exposure refers to a new simulation run for the same radiation field, this allows multiple instances of the same irradiation conditions to be recorded in the same file. This field is specified by an integer, defined as: 0 for a damage caused by the same primary event as the previous row, 1 for a damage caused by a new primary event within the same (user defined) exposure, and 2 for a damage which represents the start of a new exposure (which is also necessarily a new primary event).
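As an illustration of how the New Event flag might be used when reading a file, the following Python sketch counts exposures and damage-inducing primary events from the sequence of field 1 values. The function name and the example flag sequence are illustrative assumptions and are not part of the standard.

```python
from typing import Iterable, Tuple


def count_exposures_and_events(flags: Iterable[int]) -> Tuple[int, int]:
    """Count exposures and primary events from field 1 values.

    Flag meaning, as described for field 1:
      0 = same primary event as the previous row
      1 = new primary event within the same exposure
      2 = start of a new exposure (which is also a new primary event)
    """
    exposures = 0
    events = 0
    for flag in flags:
        if flag not in (0, 1, 2):
            raise ValueError(f"unexpected New Event flag: {flag}")
        if flag == 2:
            exposures += 1
        if flag in (1, 2):
            events += 1
    return exposures, events


# Hypothetical sequence of field 1 values, one per row of a data block.
flags = [2, 0, 0, 1, 0, 2, 1]
exposures, events = count_exposures_and_events(flags)
print(exposures, events)  # 2 4
```

Primaries that caused no damage do not appear in the data block, so the total number of simulated primaries has to be taken from header field 22 rather than from this count.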

Field 2, X, Y, Z: Defines the spatial position X, Y, Z of the centre and extent of each recorded damage, using coordinates within the bounding box specified by field 13 in the header. The first 3 values define the position of the centre, specified as 3 space-separated values in units of µm. The remaining values in this field are optional but should be included if available. The second set of 3 space-separated values defines the maximal position in X, Y, Z and the last 3 space-separated values list the minimal position in X, Y, Z, respectively. Each 3-tuple of values is separated by a forward slash; for example, field 2 could read “0.002 0 1.2 / 0.004 0.002 1.122 / 0.001 -0.001 1.117,”. Either the fields of X, Y and Z, or Chromosome ID and Position have to be defined. If possible, both should be listed; the option to define either acknowledges that, depending on the code, not all information may be available.

Field 3, Damage types: Provides a high-level specification of the type of damage present at a given site defined by the chromosome position (or X, Y, Z), in terms of base damages, backbone (single strand) breaks and double strand breaks (defined as exactly two single strand damages within the parameters defined in the header). This classification can be seen as a numerical description of many other damage classification metrics, which effectively groups these damage events into broader categories according to the expected biological severity of the damage. Breaks separated by less than the minimum BP distance between separate backbone lesion sites, as given by the damage definition in the header, are scored in a single data block entry, i.e. are considered to be a single lesion. Base damages are added depending on the distances defined in header field 20. An example of how lesions are grouped based on the information provided by field 20 in the header is shown in Figure 5.2.
The damages are stored as 3 space-separated integers, where the first integer lists the number of base damages, the second the total number of single backbone breaks including those contributing to the formation of a DSB, and the last number is a binary flag (0 or 1) indicating the presence of a double strand break, i.e. whether lesions occurred on both backbones within the BP distance defined in the header. For example, ‘3 2 1’ would represent a damaged DNA site consisting of 3 base damages and 2 backbone damages that are on opposing strands within the BP limit and are thus counted as a double strand break. Additional examples are listed in Figure 5.4. Either field 3 or field 7 is mandatory, but if available, the full damage structure should always be included as detailed below to provide more details of the break structure. This field is intended to provide a high-level summary and to support models that do not calculate the full structure of individual breaks and rely instead on simple numbers of SSBs and DSBs and their distribution.
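The three-integer layout of field 3 can be read with a few lines of code. The following is a minimal Python sketch under the assumptions stated in its comments; the names `DamageTypes` and `parse_damage_types` are hypothetical, and the consistency check is inferred from the description above rather than mandated by the standard.

```python
from typing import NamedTuple


class DamageTypes(NamedTuple):
    base_damages: int      # number of base damages at the site
    backbone_breaks: int   # all backbone breaks, including those forming a DSB
    has_dsb: bool          # True if the binary DSB flag is 1


def parse_damage_types(field: str) -> DamageTypes:
    """Parse field 3, e.g. '3 2 1,' into its three components."""
    parts = field.strip().rstrip(",").split()
    if len(parts) != 3:
        raise ValueError(f"expected 3 integers, got: {field!r}")
    base, backbone, dsb = (int(p) for p in parts)
    if dsb not in (0, 1):
        raise ValueError("DSB flag must be 0 or 1")
    # Assumption: a DSB needs at least two backbone breaks, since the
    # backbone count includes the breaks that contribute to the DSB.
    if dsb == 1 and backbone < 2:
        raise ValueError("DSB flag set with fewer than 2 backbone breaks")
    return DamageTypes(base, backbone, bool(dsb))


site = parse_damage_types("3 2 1,")
print(site.base_damages, site.backbone_breaks, site.has_dsb)  # 3 2 True
```

A reader that only needs SSB and DSB yields can stop at this field and ignore the full break specification in field 7.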

Figure 5.2 Example of a single DSB counted with a 10 BP maximum backbone separation. The top half shows the scoring when base lesions are counted with a separation of up to 3 BPs, as defined in header field 20 (3.3), resulting in three entries in the data block as indicated by the solid arrows. The entries in field 3 would be a DSB “1 2 1,” a BD “1 0 0,” and an SSB “2 1 0,”. If base damages are neglected (lower half), the same damage pattern will be scored as two separate damages: a DSB “0 2 1,” and an SSB “0 1 0,”. The dashed lines show the separations considered for grouping; red indicates distances larger than the cut-off.

Field 4, Cause: Offers a flag to identify the cause of the induced damage and a counter for how many damages were caused by direct or indirect events. The first integer classifies the damage type. Currently included are options to identify whether the damage in the record is a result of direct physics interactions (0), indirect interactions, i.e. the result of the propagation of any chemical species and subsequent reactions with the DNA (1), or a combination of direct and indirect interactions (2). Additional options can be included according to the needs of other codes; for example, (3) could be assigned to damages induced by concomitant drug-based therapies. New definitions of this value should be submitted following the steps on the website. The second and third integers provide counters for the number of direct and indirect damages, respectively. Additional information about the damages, for example which damage was induced by which process, and further specification of the indirect damages (e.g. fixation by OH, stabilisation of R° by O2 or O2⁻) can be recorded in field 7.

Field 5, Chromosome related IDs: Stores the identity of the chromatid where the damage occurs. The entry consists of 4 integers. The first defines the ‘DNA structure’ as generic (0), hetero- (1) or eu-chromatin (2) regions of nuclear DNA, DNA fragment (3) or bacterial/viral DNA (4). In the case of nuclear DNA, the next 3 integers are the chromosome number, the chromatid number and an indication of long/short arm. The values are stored space-separated as CH CR CA, where CH is the chromosome number, CR is the chromatid number and CA is the arm of the chromosome (short (0) or long (1)). CR is specified as 1 for unduplicated chromosomes, and 1 or 2 to identify the two chromatids in the duplicated chromosome in later S and G2 (and early M) phases. For example, ‘12 1 1’ corresponds to the long arm of chromatid 1 on chromosome 12. Chromosome numbering is assumed to follow the order listed in header field 14. For cells without a specified cell phase or cells in G0 or G1 phase the chromatid number CR is always 1. If the CA information of short vs. long arm is not available, the final number should be set to -1.

Field 6, Chromosome Position: Defines the damage position along the chromosome’s genetic length. This value is defined as the distance along the chromosome from the start of the short (p) arm towards the end of the long (q) arm. It can be stored either as a value between 0 and 1 (excluding 1) giving the fractional distance along the chromosome at which the break occurs, or, if the value is greater than or equal to 1, as the distance in base pairs from the beginning of the short arm (p) to the damage site. In the case of non-nuclear DNA, such as DNA fragments or bacterial/viral DNA, the fraction simply refers to the size of the DNA segment provided in the header or, if the value is greater than 1, the BP number along the defined DNA. Either this value or the X, Y, Z information has to be included.

Field 7, Full break spec: Either the full break spec or field 3 (damage types) needs to be included, and ideally both will be provided. This field, in combination with field 6 (chromosome position), can be used to identify exactly where along the strand the damage occurred, for example to obtain the number of non-hit BPs on either side of the lesion. This field allows for a full specification of the structure of the damage.
We apply a four-strand structure, using a 4 x N array, with the rows consisting of the backbone (row 1) and bases (row 2) of the 5’ to 3’ strand, and the bases (row 3) and backbone (row 4) of the 3’ to 5’ strand. The base positions (columns) are aligned reading from the short arm (p) towards the long arm (q), beginning from base 1, which is defined as the first involved alteration at the position corresponding to field 6 (if provided). Thus, increasing column number corresponds to the 5’ to 3’ direction on strand 1 and to the 3’ to 5’ direction on strand 2. The design of the data structure is illustrated in Figure 5.3, with blue fields corresponding to the backbones and orange to the base pair fields.

Figure 5.3 Structural design of the detailed damage scoring in field 7 of the data structure.

All unmarked sites are assumed to be unaffected, while damaged sites are marked numerically:

Strand Damages are denoted as 1 for point breaks on a strand from direct effects, 2 for lesions (deletions or attachments) from indirect effects, and 3 for multiple damages to the same base pair strand, either combining direct and indirect effects or two or more interactions of the same type (direct or indirect). In addition, 0 can be used to denote non-damage inducing interactions, i.e. events that are below the damage induction threshold. 4+ can be used for adducts or other modifications, according to the source program, or to define additional details, such as what type of indirect reaction occurred, to account for differences in repair likelihood for different types of ROS reactions such as dehydrogenation or OH addition. All additional numbering schemes should be submitted to and detailed on the webpage to ensure a unique numbering scheme.

Base Damages are denoted in the same manner as for strand breaks, with 1 for point deletions from direct effects, 2 for lesions (deletions or attachments) from indirect effects, 3 for multiple damages, and 0 for non-damage inducing interactions. Again, all additional suggested numbering schemes should be submitted to and detailed on the webpage to ensure a unique numbering scheme.

For file storage use a space-separated list of: Strand (row), Base Pair (column), Damage type, with each damage separated by the delimiter ‘/’. Damage events are recorded by row, beginning with the 5’ to 3’ backbone, then its accompanying bases, then the 3’ to 5’ bases, then finally their backbone. Within each row, damages are then recorded in the 5’ to 3’ direction. To illustrate the syntax of this field, Figure 5.4 showcases example break types, ranging from a single base deletion to a complex damage with multiple deletions in both bases and strands.
These are presented both as schematic DNA sections as in Figure 5.3, and an accompanying structure definition above the image. For example, for the case of a multiple base double strand break with overhang, with both direct and indirect damages, illustrated in Figure 5.4d, the definition reads: ‘1 2 1 / 1 3 1 / 1 4 1 / 1 5 1 / 1 9 0 / 2 1 1 / 2 2 1 / 2 3 2 / 2 4 2 / 2 5 3 / 3 3 1 / 3 4 2 / 3 5 1 / 3 6 2 / 3 7 0 / 4 3 2 / 4 4 2 / 4 5 2 / 4 6 1’. The base pair count of each damage site starts with the first occurring damage, which in this case is on the 5’ to 3’ strand base (2 1 1). However, the damages are stored starting with the 5’ to 3’ strand backbone, and the first damage for this strand occurs on the second BP, so the damage definition starts with (1 2 1). The first 1 indicates the strand, the middle 2 indicates that a damage occurred on the second BP, and the 1 for the last value shows that this was a direct damage. The second, third and fourth triplets (1 3 1 / 1 4 1 / 1 5 1) similarly define 3 more direct damages on the following three BPs. The fifth triplet (1 9 0) shows that there was another interaction with the strand that did not result in a lesion. If additional damages on this strand had occurred within this break, they would be listed next; for example, an additional single direct backbone damage at BP 10 would add (1 10 1). The group ‘2 1 1 / 2 2 1’ likewise defines a block of damages on the 5’ to 3’ bases, from BP 1 to BP 2, caused by direct damages, and the next group ‘2 3 2 / 2 4 2’ indicates that the next two bases were damaged by indirect processes. ‘2 5 3’ shows that the fifth BP was hit by multiple events in any combination of direct and indirect lesions. The opposite bases start with 3, and the block ‘3 3 1 / 3 4 2 / 3 5 1 / 3 6 2’ defines damages on BP 3-6 alternating between direct and indirect damages, followed by another interaction that did not result in a lesion, ‘3 7 0’. The 3’ to 5’ backbone has the same positions damaged, but with the first 3 damages from indirect processes recorded as ‘4 3 2 / 4 4 2 / 4 5 2’, and an additional direct damage ‘4 6 1’.
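The triplet syntax walked through above lends itself to a short parser. The following Python sketch, with hypothetical function names, splits a field 7 string into (row, base pair, damage type) triplets and tallies lesions on backbones and bases. It only interprets the codes 0 to 3 described in the text and treats any value above 0 as a lesion, which is an assumption for codes of 4 and higher.

```python
from typing import Dict, List, Tuple

Triplet = Tuple[int, int, int]  # (row 1-4, base pair index, damage type)


def parse_full_break_spec(field: str) -> List[Triplet]:
    """Split a field 7 string such as '1 2 1 / 2 1 1' into integer triplets."""
    triplets: List[Triplet] = []
    for chunk in field.strip().rstrip(",").split("/"):
        chunk = chunk.strip()
        if not chunk:
            continue
        row, bp, damage = (int(v) for v in chunk.split())
        if row not in (1, 2, 3, 4):
            raise ValueError(f"row must be 1-4, got {row}")
        triplets.append((row, bp, damage))
    return triplets


def summarise(triplets: List[Triplet]) -> Dict[str, int]:
    """Count lesions on backbones (rows 1, 4) and bases (rows 2, 3)."""
    return {
        "backbone_lesions": sum(1 for r, _, d in triplets if r in (1, 4) and d > 0),
        "base_lesions": sum(1 for r, _, d in triplets if r in (2, 3) and d > 0),
        "sub_threshold": sum(1 for _, _, d in triplets if d == 0),
    }


# The Figure 5.4d definition quoted in the text.
spec = ("1 2 1 / 1 3 1 / 1 4 1 / 1 5 1 / 1 9 0 / 2 1 1 / 2 2 1 / 2 3 2 / "
        "2 4 2 / 2 5 3 / 3 3 1 / 3 4 2 / 3 5 1 / 3 6 2 / 3 7 0 / 4 3 2 / "
        "4 4 2 / 4 5 2 / 4 6 1")
entries = parse_full_break_spec(spec)
print(len(entries), summarise(entries))
```

For the Figure 5.4d string this yields 19 triplets: 8 backbone lesions, 9 base lesions and 2 sub-threshold interactions.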

Figure 5.4 Examples for the damage definition structure used in field 7 of the data structure. Here “*” lists interactions that were not sufficient to cause a damage, i.e. below the cut-off defined in the header, “D” denotes direct damages, “I” stands for indirect damages, “M” for multiple damages from any combination of D and I events. These damages are defined by values of 0, 1, 2, 3 in the definition, respectively.


Field 8, DNA Sequence: Provides an optional field to further specify the structural geometry of the DNA that was damaged and the surrounding DNA sequence. The field consists of an integer to identify the structure, and a 1 x N array to record the DNA sequence. The integer defines the DNA structure: free DNA (1), DNA linker (2), DNA around a histone (3), euchromatin (4) or heterochromatin (5) DNA section. Additional structures can be registered on the website. For models which incorporate the actual physical structure of individual bases, the DNA sequence along the strand can be included in this field. The design uses the same layout as the break structures above, beginning from the 5’ end at the position of the first damage. Bases are written in sequence for the 5’ to 3’ strand, stored as strings of integer values, with bases denoted as: Missing=0, A=1, C=2, T=3, G=4. Backbones are not specified in this data. This field is kept optional as most codes do not yet consider the DNA sequence. However, evidence exists that specific types of individual lesions formed by ionising radiation differ for A, G, T and C. There is also evidence in the literature that the larger-scale base sequence can have an impact on the types and quantities of individual lesions created by irradiation. An example of this latter effect would be hole migration along a DNA molecule. These effects are generally not considered in any of the current Monte Carlo DNA damage models. However, they have potentially important implications for modelling of DNA repair.

Field 9, Lesion Time: This optional field records a time for each induced damage, in nanoseconds, measured from the first induced damage and using the same order as in field 7. If only a single value is given here, that is assumed to be the time at which the whole damage site enters the simulation.
The values are recorded separated by ‘/’. For example, for the case shown in Figure 5.4b, denoted as (1 3 2 / 2 1 1 / 3 6 0 / 4 8 0), the time structure could be “2.1 / 0 / 0.0000008 / 0.000001”. This translates to an event with an initial direct damage (on 2 1 1), a direct base and a direct backbone interaction below the break threshold 0.8 and 1 fs later, (3 6 0) and (4 8 0) respectively, and an indirect damage 2.1 ns after the first break (1 3 2).

Field 10, Particle Types: This optional field defines the irradiation source that impacted the target, i.e. the primary particles used in the simulation that caused the recorded damage, either directly, through secondaries or through chemical reactions. This field describes the particle type for each source particle using space-separated integers (PDG value).

Field 11, Energies: The energies corresponding to the particles defined above, in MeV, with one space-separated entry per particle defined in field 10 (optional).

Field 12, Translation: The 3-vectors (X, Y, Z) of the starting points of the particles, in μm relative to the centre of the volume, one space-separated 3-tuple for each particle, with each 3-tuple separated using ‘/’ (optional).

Field 13, Rotation: The three Euler rotation angles for the above particles, following the same style as for the translations (optional).

Field 14, Particle Time: A single space-separated float per particle defined in field 10 to define the start time of each particle in ns (optional). This may be particularly important for very low dose rate exposures such as those from galactic cosmic rays (GCRs).
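Putting the field descriptions together, a data block line can be split on its field-terminating commas, since values within a field use only spaces and forward slashes. The Python sketch below is illustrative only: the dictionary keys, function names and the example line are assumptions made for this example (the line is hypothetical, not taken from a real SDD file), and because it is not clear from the text whether the last field carries a trailing comma, the splitter accepts both.

```python
from typing import Dict, List, Tuple

# Hypothetical short names for the 14 data block fields of Table 5.2.
FIELD_NAMES = [
    "new_event", "xyz", "damage_types", "cause", "chromosome_ids",
    "chromosome_position", "full_break_spec", "dna_sequence", "lesion_time",
    "particle_types", "energies", "translation", "rotation", "particle_time",
]


def split_data_line(line: str) -> Dict[str, str]:
    """Split one data block line into its 14 comma-terminated fields."""
    text = line.strip()
    if text.endswith(","):
        text = text[:-1]  # drop the terminator of the final field
    parts = [p.strip() for p in text.split(",")]
    if len(parts) > len(FIELD_NAMES):
        raise ValueError(f"too many fields: {len(parts)}")
    parts += [""] * (len(FIELD_NAMES) - len(parts))  # blank optional fields
    return dict(zip(FIELD_NAMES, parts))


def parse_position(field: str) -> List[Tuple[float, ...]]:
    """Parse field 2 into up to three (x, y, z) tuples: centre, max, min."""
    return [tuple(float(v) for v in group.split())
            for group in field.split("/") if group.strip()]


# Hypothetical line: new exposure, position with extent, '3 2 1' damage,
# direct cause with 5 direct lesions, heterochromatin / chromosome 12 /
# chromatid 1 / long arm, fractional position 0.35, one 10 MeV proton
# (PDG code 2212). Fields 7-9 and 12-14 are left blank.
line = ("2, 0.002 0 1.2 / 0.004 0.002 1.122 / 0.001 -0.001 1.117, 3 2 1, "
        "0 5 0, 1 12 1 1, 0.35, , , , 2212, 10.0, , , ,")
row = split_data_line(line)
print(row["damage_types"], row["particle_types"], parse_position(row["xyz"])[0])
```

Keeping every field as a string at this stage leaves the interpretation of blanks and of the optional fields to later, field-specific parsers.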

5.2.4 Dissemination and Repository

We have set up a website (https://sdd.readthedocs.org) where the standard is explained and new requests for enumeration schemes can be submitted. In addition, we provide a link to our GitHub repository, which offers selected example codes and a place to upload SDD files for sharing. We expect that such a repository of damage data will be useful for modellers who do not have the resources to perform their own full damage simulations, and for testing (new) damage models against other (published) models that have provided data here. This repository also includes a code for random damage generation.

5.3 Discussion and Conclusions

The outlined standardised data format for DNA damages (SDD) is intended to provide the basis for cross-disciplinary investigations of DNA damage induction and the ensuing kinetics of DNA repair mechanisms. By standardising the recording format of the distribution of damages and their structural pattern for single cells/nuclei, we anticipate creating synergies between various developments in modelling cellular response to DNA damage. For example, there are currently various codes that can be used to simulate the full track structure of different types of radiation and then score the resulting initial damages to a cell nucleus, including PARTRAC, RITRACK, Penelope, Geant4-DNA, and TOPAS-nBio. Each of these groups has developed models of DNA structures within the nucleus or cell that are used to obtain the initial patterns of DNA damage. However, each of these codes uses its own specific data structure and damage pattern definition, complicating inter-code comparisons of damage induction. At the same time, various groups are working on models to describe DNA repair kinetics. These codes are often either specifically designed to interface with one of the track-structure codes, use their own damage simulations, or simply sample damage distribution patterns randomly based on assumptions about the damage-inducing agent. Providing a common interface would offer much more flexibility and scope for testing different combinations of models, and for comparing implicit assumptions and uncertainties.

The proposed standard concentrates on interfacing the damage simulations to the biological repair processes defining the cell response. Our goal is to provide the means not only to compare the results of different codes and models, but also to investigate the influence of each model assumption and to cross-validate between models. Testing the dependencies of various observable outcomes on model parameters and their implementation in different models can help us to understand which parts of the models are most sensitive and which parts have only a minor effect on the outcome. In combination with new experimental data on repair processes, in particular with higher time or spatial resolution from new microscopy technologies, this can further help to test the models at various stages along the repair process and identify key experiments to advance the field of research. Overall, we anticipate that the standardised data format for DNA damage (SDD) will greatly reduce the burden of sharing analysis tools and thus facilitate the formation of new collaborations. Using standardised data will allow researchers to test the predictions from different models simply by feeding the SDD data to another code. The standard already is (or will soon be) supported by the following codes: RITRACK, MCDS, TOPAS-nBio, DAMARIS, PHITS, … as well as by users of Geant4-DNA. By providing a clearly defined standard and example codes of scorers for some of the models, we hope to provide an incentive for other existing and newly developed codes to offer interfaces to the SDD data format, both for using it as a scorer and as damage distribution input for repair models.

6. Comparing DNA damage and repair models with respect to experimentally supported mechanisms

Using the standard format for reporting DNA damage (SDD), detailed in Chapter 5, we were able to compare the damage predictions made by the model discussed in previous chapters and a model developed by S. McMahon. The manuscript in this chapter demonstrates the use of the SDD, and its ability to make inter-model comparisons easy and insightful. Furthermore, it was possible to use the data produced by each damage model as an input to each of the repair models developed by J. Warmenhoven and S. McMahon. A number of differences between the models were highlighted by this work, particularly in the implemented mechanisms. Ultimately this led to differences in the spatial distribution of damage patterns predicted by the two damage models. In particular, the McMahon damage model predicts fewer proximal DSBs. Since the McMahon repair model is fitted to experimental data on the yield of misrepaired DSBs, the motion of DSB ends had to be significantly higher than in the Warmenhoven repair model. This dependency between damage and repair model was a previously unconsidered effect. The repair model depends on the damage model and, if a particular result is desired, can compensate by changing mechanistic parameters. This does, however, imply that if the mechanisms of the repair process and the biological outcomes are well known then the damage pattern can be inferred. The same can be said if the damage pattern and biological outcomes are well known, i.e. repair mechanisms can be inferred. This inference between damage mechanism and outcome, or repair mechanism and outcome, gives value to independently developed damage and repair models. Despite differences in mechanism, the yields of residual and misrepaired DSBs for the model combinations McMahon-McMahon and Henthorn-Warmenhoven are similar. This is particularly interesting in the context of the misrepair mechanism.
Since the Henthorn-Warmenhoven models were fitted separately, and were not explicitly fitted to reproduce experimental results on misrepair, the similarity of their results to those of McMahon-McMahon implies that the mechanisms may be correct. The damage models used in this comparison are of varying complexity. The Henthorn model uses Geant4-DNA track structure simulation, with detailed geometry in the chromatin fibre combined with a simplified cell model, whilst the McMahon damage model uses an amorphous track structure technique and no explicit model of DNA. This allows for a comparison of damage predictions and the effect of detail in the model; the differences are shown in Figures 6.1 and 6.2. A further damage model will be included in the final version of the manuscript. The damage model of McNamara also uses Geant4-DNA track structure, but has a more complete model of the whole nuclear DNA. It will be interesting to see the effect of this detail on the predictions, and to see whether a full model of the DNA provides useful information.

Author Contributions NTH, JWW, SJM, and AM developed their respective DNA damage or repair models. NTH, JWW, AM, and JS implemented separately developed models into TOPAS-nBio. The SDD used in this work was developed by JWW, NTH, MJM, JS, AM, HP, and SJM. JWW and NTH drafted the manuscript based on discussions with MJM, KJK, JS, AM, HP, KMP, and SJM. All authors reviewed and approved the manuscript.

Comparing DNA damage and repair models with respect to experimentally supported mechanisms

J W Warmenhoven1, *, N T Henthorn1, *, M J Merchant1, K J Kirkby1, J Schuemann2, A McNamara2, H Paganetti2, K M Prise3 and S J McMahon3 1 Division of Cancer Sciences, University of Manchester, Manchester, UK; 2 Department of Radiation Oncology, Harvard Medical School and Massachusetts General Hospital, MA, USA; 3 Centre for Cancer Research and Cell Biology, Queen’s University Belfast, Belfast, UK * Both authors contributed equally to this work

The induction and repair of DNA double strand breaks (DSBs) are critical factors in the treatment of cancers by radiotherapy. To investigate the relationship between incident radiation and cell death through DSB induction many in silico models have been developed. These models produce and use custom formats of data, specific to the investigative aims of the researchers, and often focus on particular pairings of damage and repair models. In this work we show how use of a standard format for reporting DNA damage can facilitate collaboration between institutes by evaluating combinations of different, independently-developed models. We demonstrate the capacity of such inter-comparison to determine the sensitivity of models to both known and unknown assumptions. Specifically, we report on the impact of differences in assumptions regarding DNA damage induction on predicted initial DSB yield, and the subsequent effects this has on derived DNA repair models. This highlighted the importance of considering initial DNA damage on the scale of nanometres rather than micrometres. We show that the differences in DNA damage models result in subsequent repair models assuming significantly different rates of random DSB end diffusion. This in turn leads to disagreement on the mechanisms responsible for different biological endpoints, particularly when different damage and repair models are combined, demonstrating the importance of inter-model comparisons to explore underlying model assumptions.

6.1 Introduction

DNA strand breaks pose a significant threat to a cell’s survival and as such many complex mechanisms have evolved to restore the double helix structure (336,337). Double strand breaks (DSBs) are the most challenging damages to repair and, if handled inappropriately, can lead to chromosomal aberrations and persistent, or residual, breaks. Both chromosomal aberrations and residual DSBs contribute to cell death (338–340). Thus, it is the induction of DSBs by ionising radiation which is exploited in radiotherapy to deliver a lethal dose to a tumour volume. The number of DSBs induced by radiation has long been thought to be the dominant influence that incident particles have on cell death (341). In the context of hadron-therapy, the Relative Biological Effectiveness (RBE) of different radiations is of critical importance in determining the dose prescribed clinically. RBE is simply the ratio of doses required to produce the same cell kill, usually referenced against conventional X-ray therapy. This ratio is expected to vary not only with the number of DSBs induced, but also with parameters such as the Linear Energy Transfer (LET) of the radiation (73). However, with an increasing variety of particles investigated it has also become clear that different particles of similar LET produce different biological outcomes (269,342). The mechanisms that would lead to this observed difference in RBE are not well understood. Aspects of both the physical creation of DSBs and the subsequent repair have been implicated (343,344). This necessitates a deeper understanding of the mechanisms involved to better exploit the different RBE of the radiotherapy modalities available clinically. Past advances in understanding have led to novel targeted therapies combining drugs with radiation (345,346), as well as identification of genetic phenotypes predisposed to cancer (347,348). In silico modelling of complex biological systems provides an effective method of forming and refining hypotheses to be tested experimentally. The tight control of experimental conditions in such models makes it possible to investigate the sensitivity of specific parameters in the absence of confounding factors that would otherwise obscure causal relationships.
Furthermore, in silico models allow us to link mechanisms operating at scales challenging to resolve in state-of-the-art in vitro or in vivo experiments to more readily observable effects that can validate underlying hypotheses. Such models are widely used in radiotherapy research, with many institutions separately developing their own custom codes. A wide range of modelling approaches and data formats are used, relevant to each institute’s research interests. Linked systems of rate equations, such as those used by Belov et al. (349) and Dolan et al. (350,351), can describe the interaction of multiple repair pathways by simulating the rates of specific mechanistic steps. However, this approach lacks any description of the induced damage pattern, which is known to impact biological response (352–355). To incorporate this spatial dimension, Poisson statistics can be used to randomly place DSBs in a simulated cell nucleus. This methodology was employed by Sachs et al. (356) to investigate chromosome aberration induction after photon irradiation. However, ion irradiation has been shown to produce damages primarily along a limited number of particle tracks (357), making this description unsuitable for investigating proton- or carbon-therapy. Modelling of these modalities can instead consider DSBs to be placed stochastically along particle tracks through the nucleus. This approach allows comparison of ion and photon biological effects, such as in work by Ballarini et al. (358) and McMahon et al. (162,163). The LEM model developed by Scholz et al. (359) and built upon by Elsasser et al. (146) shows how consideration of the spatial proximity of DSBs could be related to biological response, proposing that it could be the defining feature of LET-dependent biological effect. However, these models include neither descriptions of specific mechanistic steps in the repair processes nor explicit descriptions of DSB end mobility. A more detailed approach to the inclusion of the spatial distribution of DSBs is to use Monte Carlo track structure simulations, such as the Geant4-DNA (200,216,218) extension used in work by Henthorn et al. (282) or the PARTRAC software developed by Friedland et al. (360). Further work by Friedland et al. (174,271) and Henthorn and Warmenhoven et al. (134) considers the implications of this highest level of detail by linking the detailed, track structure derived DNA damage to mechanistic models of DNA repair. However, these mechanistic models suffer from their own specificity in that their accuracy is limited by our sparse knowledge of the DNA repair pathways. The models discussed above all contain assumptions about the underlying physical and biological processes. As can be seen, different authors have chosen different starting assumptions on which to build their models so as to reduce the complexity of the problem being studied. These assumptions can have both direct consequences and implicit impacts on subsequent processes which are not immediately obvious when investigating individual models, particularly when, for example, a DNA repair model is optimised on a particular DNA damage model.
Inter-comparison of models can highlight the sensitivity of different descriptions to these assumptions, and potentially reveal the conserved mechanisms required to reproduce experimentally observed trends, highlighting important areas for experimental investigation. In this work we compare models developed by Henthorn (282), McMahon (162,163), McNamara, and Warmenhoven (134). We demonstrate that seemingly slight differences in initial assumptions regarding DNA damage induction can lead to significant differences in predicted DSBs. These differences then necessitate different responses of the DNA repair models in order to fit biological endpoints. We demonstrate that, in spite of significant differences in the magnitude of assumed random DSB end diffusion, the density dependent misrepair of DSBs emerges as a common property. However, the significant quantitative differences between rates and ranges of misrepair result in a divergence of the mechanisms predicted to be responsible for DSBs persisting to 24 hours.

6.2 Methods

6.2.1 Inter-Comparison Overview

In this work we compare the behaviour of two DNA damage models and two DNA repair models applied to proton irradiation. These are specifically the McMahon damage and repair models (162,163) (McM), the Henthorn damage model (282) (Hen), and the Warmenhoven repair model (134) (Warm). The McMahon damage and repair models were developed together, as were the Henthorn and Warmenhoven models. In this work, a combination of damage and repair models is referred to in the form damage-repair, e.g. Henthorn-Warmenhoven. A table detailing the parameters used in these models and how they were determined is provided in Appendix A2. To ensure an even spread across the range of proton LET investigated, ten mono-energetic proton energies were selected between 0.975 and 34 MeV, presented in Appendix 2 Table A2.2. The irradiation of cell nuclei was simulated at each energy, with damages scored and stored in a standard format which both the McMahon and Warmenhoven repair models can parse. The damages from the damage models were investigated in terms of the total yield, average complexity, and proximity of the DSBs. The repair of each damage distribution was then simulated using both repair models, and the final results of the repair models were investigated in terms of residual and misrepaired DSBs at 24 hours.

6.2.2 Use of a Standard Format for DNA Damage (SDD)

The standard damage format used in this work consists of a header containing the information needed to recreate the simulation and a list containing the details of each DSB recorded. These details include the geometric position of the break as well as its complexity in terms of extra backbone and base damages. The complexity can either be reported as a simplified total number of each insult or be stored as a detailed structure to be processed by the repair models.

6.2.3 The Henthorn DNA Damage Model

The DNA damage model published by Henthorn et al. (134,282) is split into two stages, both using Geant4-DNA track structure simulations (164,216). DSB positions are determined in a spherical nucleus (radius 2.5 μm) centred in a box (half-length 5 μm). The nucleus is uniformly irradiated with mono-energetic protons. 15% of energy depositions are randomly selected to be converted into strand breaks via a probability that varies linearly from 0 to 1 between 5 and 37.5 eV (361), and are randomly assigned to strand 1 or 2 of the double helix. Strand breaks are then clustered into DSBs by a modified DBSCAN algorithm (262). For photons, the number of DSBs per Gray is sampled from a Poisson distribution with λ=25, and the DSBs are spread uniformly at random through the nucleus. DSB complexity is determined by a chromatin fibre model, with DNA backbone and base volumes, irradiated by protons of a matching energy or the secondary electron spectrum from a Co-60 source. Energy depositions in DNA are summed per primary, and the same probability is used to determine DNA damage. Geant4-DNA chemistry (200) is implemented to simulate indirect damage. Hydroxyl radicals are assigned a probability, P, of creating damage when crossing the DNA volumes (P=0.5 for backbones, P=0.8 for bases). Damaged backbones on opposite strands separated by 10 bp or less are clustered into DSBs. Damaged bases are included if they are no more than 3 bp outside the ends of the DSB. These detailed breaks are used to populate the DSB positions determined by the cell simulation.
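The energy-dependent break probability described above can be illustrated with a short Python sketch. It applies the 15% selection and the linear 5 to 37.5 eV ramp to a list of energy depositions. The function names, the order of the two random tests, and the use of Python's `random` module are assumptions made for illustration only; this is not the Geant4-DNA based implementation used in this work.

```python
import random

# Values quoted in the text.
E_MIN_EV = 5.0             # below this energy the break probability is 0
E_MAX_EV = 37.5            # above this energy the break probability is 1
SELECTION_FRACTION = 0.15  # fraction of depositions considered for conversion


def break_probability(energy_ev):
    """Linear ramp from 0 at E_MIN_EV to 1 at E_MAX_EV."""
    if energy_ev <= E_MIN_EV:
        return 0.0
    if energy_ev >= E_MAX_EV:
        return 1.0
    return (energy_ev - E_MIN_EV) / (E_MAX_EV - E_MIN_EV)


def sample_strand_breaks(depositions_ev, rng):
    """Return (deposition index, strand) pairs for depositions that become breaks.

    Assumption: a deposition is first selected with a 15% chance and then
    tested against the energy-dependent probability.
    """
    breaks = []
    for index, energy_ev in enumerate(depositions_ev):
        if rng.random() >= SELECTION_FRACTION:
            continue  # deposition not selected
        if rng.random() < break_probability(energy_ev):
            strand = rng.choice((1, 2))  # random strand of the double helix
            breaks.append((index, strand))
    return breaks


if __name__ == "__main__":
    rng = random.Random(1)
    print(break_probability(21.25))  # midpoint of the ramp: 0.5
    print(sample_strand_breaks([3.0, 12.0, 25.0, 40.0] * 50, rng))
```

Passing an explicit `random.Random` instance keeps the sketch reproducible, which is convenient when comparing repeats.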

6.2.4 The McMahon DNA Damage Model

In the DNA response model published by McMahon et al. (162,163), the probability of a DNA damage event occurring within a volume depends linearly on the density of DNA and the total energy deposited within it. From fitting to experimental RBE data, $E_{DSB}$, the energy required to create an average of 1 DSB, was determined to be 60.7 keV, equivalent to a yield of 35 DSB/Gy for normal human cells with a nuclear radius of 4.32 μm. Importantly, this assumption means that the absolute yield of DSBs is independent of LET. For X-rays, DSBs are assumed to be uniformly randomly distributed within the nucleus, with the breaks in an individual cell determined by sampling a Poisson distribution with a mean of $35D$, where $D$ is the dose delivered to the cell. Damage complexity is assigned by random sampling, with a complex break probability determined by the ratio of slow and fast DSB repair. For protons, individual tracks were directed along the z-axis at random radial positions within the nucleus. DSBs per track were again sampled from a Poisson distribution with a mean determined by the track's LET, the distance travelled through the cell, and $E_{DSB}$. The radial distance between the track core and each DSB was obtained by sampling from cumulative radial energy distributions pre-calculated for different proton energies using Geant4 (163). The total number of tracks was also Poisson-distributed, based on the fluence required to give a particular dose.
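The correspondence between $E_{DSB}$ and the quoted yield, and the LET independence of that yield, can be checked with a few lines of Python. The sketch assumes a nucleus of water density, straight parallel tracks with constant LET across the nucleus, and a mean chord length of $4R/3$ for such tracks through a sphere. The function names are hypothetical and only the mean values are computed, not the Poisson sampling itself.

```python
import math

KEV_TO_J = 1.602176634e-16          # joules per keV
WATER_DENSITY_KG_PER_UM3 = 1.0e-15  # assumption: nucleus has the density of water

E_DSB_KEV = 60.7          # energy per DSB quoted in the text
NUCLEAR_RADIUS_UM = 4.32  # nuclear radius quoted in the text


def dsb_per_gray(e_dsb_kev, radius_um):
    """Mean DSB yield per Gy from the energy deposited in the whole nucleus."""
    volume_um3 = 4.0 / 3.0 * math.pi * radius_um ** 3
    mass_kg = volume_um3 * WATER_DENSITY_KG_PER_UM3
    energy_kev_per_gray = mass_kg / KEV_TO_J  # 1 Gy = 1 J/kg
    return energy_kev_per_gray / e_dsb_kev


def mean_dsb_from_tracks(dose_gy, let_kev_per_um, e_dsb_kev, radius_um):
    """Mean DSB yield built up track by track for a given dose and LET."""
    # Fluence (tracks per um^2) needed to deliver the dose at this LET.
    fluence_per_um2 = dose_gy * WATER_DENSITY_KG_PER_UM3 / (let_kev_per_um * KEV_TO_J)
    mean_tracks = fluence_per_um2 * math.pi * radius_um ** 2
    mean_chord_um = 4.0 * radius_um / 3.0
    dsb_per_track = let_kev_per_um * mean_chord_um / e_dsb_kev
    return mean_tracks * dsb_per_track


if __name__ == "__main__":
    print(round(dsb_per_gray(E_DSB_KEV, NUCLEAR_RADIUS_UM), 1))
    for let in (2.0, 10.0, 20.0):
        print(let, round(mean_dsb_from_tracks(1.0, let, E_DSB_KEV, NUCLEAR_RADIUS_UM), 1))
```

The LET cancels between the number of tracks (proportional to 1/LET) and the DSBs per track (proportional to LET), which is why the total yield per Gy is the same at every LET under these assumptions.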

6.2.5 The McMahon DNA Repair Model

The McMahon repair model makes use of a simple probabilistic Monte Carlo approach to describe misrepair. Starting from a distribution of DSBs within the nucleus, each break is modelled as a pair of free DNA ends in close spatial proximity. As the cell repairs DSBs, each free end has a probability of interacting with every other free end in the nucleus with a rate given by $\lambda e^{-r^2/(2\sigma^2)}$, where $\lambda$ is the rate of correct repair, $r$ is the separation between the two free ends, and $\sigma$ is a parameter governing the interaction range of repair. The distance-dependent factor ranges from approximately 1 for correct ends which are nearby, to 0 for distant DSBs. These values have been obtained by fitting to experimental data (162). In particular, $\sigma$ has a value of $0.0428R$, where $R$ is the nuclear radius, giving $\sigma = 185$ nm for normal human cells with $R = 4.32$ μm as described above. Repair events are then identified as either correct repair (when two matching free ends from a single DSB rejoin) or misrepair (when two ends from different DSBs rejoin). As the rate of correct repair remains constant, it can be seen that an increase in dose (creating more DSBs) or LET (creating more densely spaced DSBs) will lead to increases in the probability of misrepair.
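The distance-dependent rate above can be written out directly. The following Python sketch evaluates the interaction range $\sigma = 0.0428R$ and the rejoining rate relative to the correct-repair rate; the function names and the choice of $\lambda = 1$ as a default are assumptions made for illustration.

```python
import math

SIGMA_FRACTION = 0.0428  # sigma as a fraction of the nuclear radius (from the text)


def interaction_range_nm(nuclear_radius_um):
    """Interaction range sigma in nm for a nucleus of the given radius in um."""
    return SIGMA_FRACTION * nuclear_radius_um * 1000.0


def rejoining_rate(separation_nm, sigma_nm, correct_rate=1.0):
    """Rate lambda * exp(-r^2 / (2 sigma^2)) for two free ends a distance r apart."""
    return correct_rate * math.exp(-separation_nm ** 2 / (2.0 * sigma_nm ** 2))


if __name__ == "__main__":
    sigma = interaction_range_nm(4.32)
    print(round(sigma, 1))  # approximately 185 nm, as quoted in the text
    for r in (0.0, 100.0, 300.0, 1000.0):
        print(r, rejoining_rate(r, sigma))
```

Ends at zero separation rejoin at the full rate, while ends separated by several multiples of $\sigma$ have a rate that is effectively zero.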

6.2.6 The Warmenhoven DNA Repair Model

The Warmenhoven model is a mechanistic Monte Carlo repair simulation. DSBs are represented as two independent ends in a simple spherical geometry that move by subdiffusion (315,362). DSBs progress along the repair process by acquiring repair proteins through stochastic, time-constant-based state changes. DSB ends in the appropriate state and within a defined interaction range can form synaptic complexes. Synaptic complexes can either fail, reverting both ends to their initial state (324,325), or progress to final ligation steps including the processing of additional complexities. Once finally ligated, DSBs are classified as fixed. The state of the system is scored at 24 hours, as this is a commonly reported time point in experimental studies. Residual breaks are scored as the initial breaks less the breaks classified as fixed. This represents DSB ends and synaptic complexes with active repair proteins capable of producing foci. Synaptic complexes and fixed DSBs that contain ends not originally from the same initial DSB are scored as misrepaired. Deletions within the same DSB are not taken into account with this metric. Synaptic complexes at 24 hours are counted as contributing to stable misrepairs as they are unlikely to fail. Pre-synaptic recruitment kinetics (325,363–365), rate of formation, failure, and stabilisation of synaptic complexes (325,326,366–368), variance of residual DSBs with LET (332), and repair kinetics (332) are fitted to literature data.

6.3 Results

Figure 6.1 B) Average yield of initial DSBs per cell per Gy predicted by both damage models for a range of proton LET and Co-60 photons (see Appendix 2 Table A2.1). Yields are for a spherical nucleus irradiated with 1 Gy of mono-energetic radiation, plotted against LET calculated in a 1 μm thick water slab. Henthorn data is from 2500 repeats and McMahon data from 500 repeats, making the standard error in the mean for all points too small to be visible. A) The coefficient of variation for the predicted yield of DSBs per cell per Gy for both models. C) The average complexity is plotted as the ratio of complex breaks to simple breaks at each LET in each model. Simple breaks are defined as DSBs involving only two backbone breaks, and complex breaks as involving two backbone breaks with any additional number of damages. Error bars represent the standard error in the mean and are too small to be visible for most datapoints.

Figure 6.1 presents the DSB yields and average complexities obtained from the two damage models. The McMahon damage model shows a constant yield of ≈35 DSBs and a constant complexity of 0.7 complex DSBs for each simple DSB, independent of radiation type or LET. In contrast, the Henthorn model predicts a linear trend of increasing DSB yield and complexity with LET per unit dose. The Henthorn model predicts a lower DSB yield than the McMahon model for LET below ≈7.4 keV/µm. Similar trends in the coefficients of variation can be seen for both models, showing increasing variation in the number of DSBs per cell with increasing LET due to the reducing number of particle tracks per cell, with the Henthorn model predicting a slightly wider spread of yields. As with DSB yields, the Henthorn model predicts a lower DSB complexity than the McMahon model for protons with LET below ≈5.3 keV/µm. Furthermore, the Henthorn model predicts that complex breaks will become more frequent than simple breaks at LETs above ≈20 keV/µm.

Figure 6.2 A) Average yield of DSBs per track per µm predicted by the damage models for protons, simulated by single track irradiations. B) The average number of these DSBs generated outside the track core, arbitrarily defined to be of 50 nm radius. Henthorn data is from 2000 repeats and McMahon data from 500 repeats, making the standard error in the mean for all points too small to be visible. C) The radial probability of creating a DSB at a given perpendicular distance from the proton track, shown for two selected proton LETs for both models. Lines are to guide the eye.

Figure 6.2 compares the initial spatial distribution of breaks between the two damage models for single proton tracks. Although the DSB density of individual tracks increases with LET for both models, different trends are seen. While McMahon predicts a linear relationship with LET due to the assumption that the DSB yield depends only on the energy deposited, Henthorn predicts a quadratic relationship. At low LET both models predict similar yields, with tracks only producing a DSB every few microns. It is not until an LET of ≈10.6 keV/µm that the expected number of DSBs for both models reaches 1 per micron. Above this, the Henthorn damage model predicts a more rapid increase, reaching 2 DSBs per micron at ≈17.8 keV/µm, which is not reached in the McMahon model until ≈25.3 keV/µm. The increase in yield of DSBs with LET for the Henthorn model in Figure 6.1B is much less than the increase in DSBs per track due to the reduced number of tracks needed to deliver a given dose. The radial distribution of these DSBs is such that over the range of LET investigated the average number of DSBs outside a track core of an arbitrarily selected 50 nm radius remains less than 0.17. Both models show similar numbers across the range of LET, with a maximum predicted difference of ≈0.02 DSBs at 15.2 keV/µm. Both the McMahon and Henthorn models appear to show a possible biphasic behaviour. From Figure 6.2C it can be seen that the probability of creating a DSB shows similar trends for different LETs in both models as a function of distance to the track, with a very rapid fall-off within the first few tens of nm. Both models therefore agree that the damage caused by protons is contained mainly within a small radius around the track core, and that the density of these damage events along the track increases with LET. As a result, for the same dose delivered, the separation between DSBs originating from the same track must decrease with increasing LET in both models.

Figure 6.3 presents the biological endpoints of residual breaks and misrepaired breaks at 24 hours after irradiation, for different combinations of damage and repair models. The McMahon-McMahon and Henthorn-Warmenhoven models show similar linear trends in residual DSBs with LET. Although there is a systematic offset between the two models, these trends are comparable to literature results of 53BP1 foci (332). Figure 6.1 shows that this cannot be explained by either differences in the initial number of DSBs or break complexity alone, as both these values are predicted to cross over between the two models.


Figure 6.3 Yield of A) residual and B) misrepaired breaks at 24 hours after 1 Gy of radiation in a spherical nucleus for a range of proton LET and 0.2 keV/µm photons. Results from previously discussed damage models were used as input to the Warmenhoven and McMahon repair models. Filled symbols represent results of ‘matched’ model combinations whose parameters were fit together (134,162,163). Empty symbols represent combinations not fitted to literature data. Photon data are shown as triangles for repair by the Warmenhoven model and inverted triangles for repair by the McMahon model. The Henthorn-McMahon model combination predicts a much higher yield for both endpoints, therefore the entire range of results is not shown. Errors are the standard error in the mean for 50 repeats with the Warmenhoven model, and 500 repeats with the McMahon repair model.

Similar behaviour can be seen for the biological endpoint of misrepaired DSBs in that the McMahon-McMahon and Henthorn-Warmenhoven models show comparable yields of misrepaired DSBs. Their behaviour differs in that the McMahon-McMahon model displays a linearly increasing yield of misrepair with LET, and the Henthorn-Warmenhoven model shows a quadratic trend. Other model combinations make significantly different predictions, with the Henthorn-McMahon combination showing an extremely steep dependence on LET, whilst the McMahon-Warmenhoven combination shows only a very small linear increase of misrepair yield across the LET range. Again we can eliminate both initial DSB yield and complexity as defining factors, since crossovers of misrepair occur at different LET than those in Figure 6.1. Comparing the yields for these two biological endpoints reveals that the McMahon-McMahon model predicts a higher yield of misrepaired DSBs than residuals across the range of LET investigated. On the other hand, the Henthorn-Warmenhoven model predicts that yields of residual DSBs will be greater than misrepaired DSBs up to an LET of ≈23 keV/µm, which encompasses the clinically relevant LET range. In order to explain the behaviour of the models with respect to misrepaired DSB yields, we investigated the interaction of DSBs with incorrect partners in Figure 6.4. For the Warmenhoven model two DSBs were placed at specified separations and simulations run to determine the probability of misrepair. For the McMahon repair model the equation governing the interaction probability for two DSBs was solved. Both the Warmenhoven and McMahon repair models exhibit sigmoidal interaction probability curves, favouring interaction of nearby breaks over distant DSBs. At 0 nm separation the misrepair probability for both models is 0.67. This is because in a system of 2 DSBs each DSB end has two possible incorrect partners and one possible correct partner, with the first interaction determining the possible correct/incorrect state of the rest. The models differ in interaction range, with the Warmenhoven model having an interaction range with half-width half-maximum (HWHM) of 26.3 nm, and a negligible interaction probability at separations >100 nm. By contrast, the McMahon repair model has a similarly shaped function but an interaction range an order of magnitude larger, with a HWHM of 307.9 nm. When applied to similar damage patterns the narrower interaction range shown for the Warmenhoven model should result in a higher rate of correct DSB end pairing than the wider interaction curve of the McMahon repair model, possibly explaining the behaviour observed in Figure 6.3B.
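The two-DSB misrepair probability and HWHM quoted for the McMahon repair model can be reproduced with a short calculation. The Python sketch below assumes that both ends of a DSB share one position and that the first rejoining event is chosen in proportion to the pairwise rates $\lambda e^{-r^2/(2\sigma^2)}$, with $\sigma = 0.0428 \times 4.32$ μm. This reading is an inference from the description in the text rather than the published implementation, and the function names are hypothetical.

```python
import math

SIGMA_NM = 0.0428 * 4320.0  # interaction range for R = 4.32 um, in nm


def two_dsb_misrepair_probability(separation_nm, sigma_nm=SIGMA_NM):
    """Probability that the first rejoining event between two DSBs is incorrect.

    Four free ends give two correct pairings (relative rate 1 each) and four
    incorrect pairings (relative rate g each, with g the distance factor).
    """
    g = math.exp(-separation_nm ** 2 / (2.0 * sigma_nm ** 2))
    correct = 2.0
    incorrect = 4.0 * g
    return incorrect / (correct + incorrect)


def half_width_half_maximum_nm(sigma_nm=SIGMA_NM):
    """Separation at which the misrepair probability falls to half its maximum.

    Setting 4g / (2 + 4g) = 1/3 gives g = 1/4, so r = sigma * sqrt(2 ln 4).
    """
    return sigma_nm * math.sqrt(2.0 * math.log(4.0))


if __name__ == "__main__":
    print(round(two_dsb_misrepair_probability(0.0), 2))  # 0.67
    print(round(half_width_half_maximum_nm(), 1))         # 307.9
```

Under these assumptions the sketch recovers both the 0.67 probability at zero separation and the 307.9 nm HWHM stated above.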


Figure 6.4 A) The probability of an interaction leading to misrepair for a pair of DSBs as a function of initial separation. The solid line is an analytical plot of the function used by the McMahon repair model to determine interaction of two separate DSBs. Filled circles are average results from 1000 repeats of the Warmenhoven model with associated standard error in the mean. B) Taking into account the HWHM of this characteristic interaction range, the average number of neighbours within 26.3 nm for the Henthorn damage model and within 307.9 nm for the McMahon damage model is plotted against proton LET. Dashed lines show a linear correlation for the McMahon damage model and an exponential correlation for the Henthorn damage model, equations and parameters for which are shown in Appendix 2 Table A2.3. Standard error in the mean is too small to be visible. C) The misrepair probability can be correlated with the density of neighbours for both the McMahon and Warmenhoven repair models. Dashed lines show the fitted correlations with parameters given in Appendix 2 Table A2.3. Errors are the standard error in the mean for 50 repeats with the Warmenhoven model, and 500 repeats with the McMahon repair model.

DSB clustering was quantified by calculating the average number of DSBs within a given separation from each DSB in a simulation for the two repair models. The radii used to analyse the two damage models were set to the HWHM of the interaction ranges from their partner repair model. The McMahon damage model shows a linear increase in neighbouring DSBs with LET, while the Henthorn model shows an exponential increase, in both cases corresponding to similar trends in DSB yield per track per µm from Figure 6.2A. Notably, although the Warmenhoven HWHM encompasses a volume a thousandth the size of that of the McMahon model, the number of DSBs found within this range is similar. This suggests that the Henthorn model has a significantly greater chance of producing DSBs in extremely close proximity than the McMahon model, even though the models predict similar yields of DSB/μm. Comparing the average number of neighbouring DSBs to the modelled misrepair probabilities produces initially linear trends for both models. Misrepair probability in both models can therefore be said to be dominated by the number of potential incorrect partners available to each DSB end. A similar concept has been explicitly used in analytic formulations of the McMahon model, where the weighted average of DSB interaction rates is used to calculate misrepair rates (162). The simple sum of breaks within the HWHM provides an approximation to this more complex calculation. The McMahon-McMahon model has an intercept of 0.008 ± 0.001 DSB⁻¹ because the sharp cut-off of what is counted as a nearby neighbour excludes the non-zero contribution of DSBs beyond 307.9 nm to the misrepair probability, evident in Figure 6.4A. Conversely, the Henthorn-Warmenhoven model has an intercept which includes zero within errors because the sharp fall-off in interaction probability at increasing separations means less total probability is discounted by this method. Both models have the same gradient of 0.573 ± 0.006 DSB⁻¹, giving strong support that the same density dependent misrepair mechanism is driving the response.
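The neighbour-counting metric described above can be sketched in a few lines of Python. The sketch assumes DSB positions are available as (x, y, z) tuples in nm and uses a simple all-pairs loop; the function name and input layout are hypothetical, and the analysis code used in this work may differ.

```python
import math


def average_neighbours(positions_nm, radius_nm):
    """Average number of other DSBs within radius_nm of each DSB.

    positions_nm is a list of (x, y, z) tuples in nm. The loop is O(n^2),
    which is adequate for the tens to hundreds of DSBs in a single nucleus.
    """
    if not positions_nm:
        return 0.0
    total = 0
    for i, a in enumerate(positions_nm):
        for j, b in enumerate(positions_nm):
            if i != j and math.dist(a, b) <= radius_nm:
                total += 1
    return total / len(positions_nm)


if __name__ == "__main__":
    # Two breaks 10 nm apart on one track and a third break 1 um away.
    dsbs = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (1000.0, 0.0, 0.0)]
    print(average_neighbours(dsbs, 26.3))   # Warmenhoven HWHM
    print(average_neighbours(dsbs, 307.9))  # McMahon HWHM
```

In this toy example the two radii give the same count because the third break lies beyond both ranges; in a dense damage pattern the larger radius would capture more neighbours.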

6.4 Discussion

This work compares models of DNA damage and repair with different levels of complexity for proton and photon radiation. From this comparison we investigate the effect of the added complexity of full Monte Carlo track structure and detailed biological repair pathway simulations. Furthermore, we explore the different biological assumptions these models take to arrive at similar experimentally observable outcomes. Because the behaviour of photon data does not deviate from what is expected given the trends of the proton data, the discussion is focussed on results from proton irradiation. Similarly, we would expect the general mechanisms discussed in this work to hold for irradiations using other particles. To understand the predictions of the damage models we first consider the processes that lead to double strand break formation in the nucleus. As radiation traverses the nucleus it deposits energy in ionisation events. These depositions cause damage to individual parts of the DNA double helix, either the backbones or the bases, directly or through free-radical mediated indirect interactions. Single strand breaks (SSBs) are created when damage occurs in a backbone genomically isolated from other damaged backbones, while DSBs are typically assumed to be created when damage occurs in two opposite backbones within one helical turn. The path of a proton through a cell nucleus can be approximated as straight for the energies considered in this work due to the small distances crossed. Figures 6.2B and 6.2C show that in the models investigated, DSBs are predominantly created along this straight track, in agreement with what is seen experimentally (369). For this reason we can consider that the mechanism which creates double strand breaks in these models can be simplified to one which randomly places events along lines through the nucleus, the total length of which depends on dose and LET. The selection of which events to place defines the difference observed between the Henthorn and McMahon damage models. The McMahon damage model has been designed around randomly placing an LET dependent number of DSBs along the track. Therefore, the linear increase in DSB/µm, seen in Figure 6.2A, is counteracted by the linear decrease in total track length with LET at constant dose. This results in the model's direct proportionality between DSB yield and total energy deposited, seen in Figure 6.1B. The LET independent behaviour in Figures 6.1B and 6.1C can be explained if there is an LET independent spectrum of events, with specific events leading to specific types of DNA damage. These may depend, for example, on the production of secondary electrons of a particular energy or the location of energy deposition relative to the DNA. By contrast, the Henthorn damage model can be considered to randomly position an LET dependent number of energy deposition events along the track. 
Applied to a single track of fixed length, this must result in an increasing number of DSBs with LET. With every additional energy deposition event there is less track length available where a new energy deposition is genomically isolated enough from existing SSBs to avoid forming a new DSB, explaining the behaviour observed in Figure 6.2A. This in turn explains the observed behaviour of the Henthorn model in Figure 6.1B, where the quadratic increase in DSB/µm outpaces the linear decrease in total track length with LET at constant dose. It also explains the increase in break complexity with LET of the Henthorn model shown in Figure 6.1C, where an increasing number of simple DSBs are converted into complex DSBs by the creation of additional nearby backbone or base damages. However, as can be seen in Figure 6.4B, the number of breaks at very small separations in the Henthorn model is much greater than would have been expected based on density alone when compared to the McMahon model. This is likely because the model is based on full Monte Carlo simulations, enabling it to reflect the potential for damage to be correlated due to, for example, creation by the same secondary electron. Overall, the assumptions discussed result in both models predicting an increase in the proximity of breaks with LET. This increase in proximity, shown in Figure 6.4C, drives the LET dependent misrepair yield of the two repair models shown in Figure 6.3B. The similar yields of misrepaired DSBs for both McMahon-McMahon and Henthorn-Warmenhoven models are remarkable for two reasons. Firstly, whilst the McMahon-McMahon model was fitted to experimental data on misrepair, the Henthorn-Warmenhoven model was not, and the LET dependence of misrepair is instead an emergent property. Secondly, there is a large difference between the two models in the complexity of the process which leads to misrepaired DSBs. Both models assume that misrepair occurs through pairwise interactions of incorrect partner DSB ends. The McMahon repair model describes both the interaction and repair of DSB ends in a single step with a single probability depending only on their initial position. The Warmenhoven model contains a detailed mechanistic description of the processes which convert DSB ends to fixed breaks, as well as a mechanistic description of the random sub-diffusive motion of individual DSB ends. This large difference in complexity, and therefore simulation time, results in a difference in misrepair yield of less than 2 DSBs per Gy across the LET range investigated. The fact that these two independently developed, structurally different models arrive at similar predictions for misrepair yield, and that this yield is experimentally supported, points to the applicability of the conserved mechanism of random, undirected rejoining of DSB ends, and to the density of DSBs being a driving factor in elevated misrepair at high LETs. 
Whilst Figure 6.2 suggests comparable track DSB densities for both models, Figure 6.4B shows that the nanoscale distribution of these DSBs in the two models is very different, with the Henthorn model producing comparable numbers of breaks within a radius of 26.3 nm of each other as the McMahon model does within a radius of 307.9 nm. The empty symbols in Figure 6.3 show how repair models fitted to one damage model produce dramatically different results when applied to damage patterns from other models with the same microscopic descriptors, but different nanoscopic properties. This inter-comparative work therefore highlights the importance of considering DNA damage densities on the scale of the DNA helix, nm⁻¹, rather than on the scale of the whole nucleus, µm⁻¹, in both the effect on overall damage induced (Figure 6.1) and its subsequent impact on repair fidelity (Figure 6.3). By normalising the results in Figure 6.3 to the initial number of breaks we can deduce the mechanisms driving residual break probabilities in the Warmenhoven and McMahon models. Figure 6.5 shows that the Warmenhoven model predicts a constant probability of 0.0729 ± 0.0003 for each DSB being residual at 24 hours. Conversely, the McMahon repair model has a linearly increasing probability with LET of an initial DSB end persisting for at least 24 hours. As the initial yield of DSBs and DSB complexity are independent of LET in the McMahon model (Figures 6.1B and 6.1C), this effect depends on the increased density of DSBs and associated misrepair. In particular, the McMahon model explicitly simulates repair rates which are greatest for adjacent breaks, reducing as break separation increases. When a misrepair event occurs between ends from two DSBs, the other end of each DSB becomes relatively isolated and its rate of repair decreases dramatically. As a result, the yield of residual break ends is closely correlated with the yield of misrepaired breaks. Therefore, the McMahon repair model again suggests that variation in residual DSBs with LET is due to a change in the spatial clustering of damage caused by the ionising radiation exposures.

Figure 6.5 Residual and misrepaired breaks at 24 hours as a fraction of initial DSB yield after 1 Gy of radiation in a spherical nucleus for a range of proton LET and Co-60 photons (0.2 keV/μm). Filled symbols represent the predicted residual probabilities and empty symbols the misrepair probabilities. Photon damages are shown as triangles for the Henthorn-Warmenhoven model combination and inverted triangles for the McMahon-McMahon model combination. Errors are the standard error in the mean for 50 repeats with the Warmenhoven model, and 500 repeats with the McMahon repair model.

Figure 6.5 shows that the Warmenhoven model has a constant probability for an initial break being unresolved at 24 hours for all LETs, ruling out a similar dependence on misrepair. This can be explained by taking into account the reduced interaction range of the Warmenhoven model compared to the McMahon model (Figure 6.4A), which results in only very proximal DSBs leading to misrepairs. In contrast to the McMahon model, where misrepair interactions can directly result in spatially isolated DSB ends, the remaining DSB ends from misrepair events in the Warmenhoven model are therefore likely to be, and remain, very proximal. These remaining ends are likely to be resolved with each other unless they can escape proximity, and so the probability of being unresolved is predominantly a function of the mobility of the DSB ends. In the Warmenhoven model the mobility of a DSB end is invariant; neither the number of nearby breaks nor the inherent complexity of the DSB that produced it affects its mobility. As such, the yield of residual DSBs is determined only by the linearly increasing initial number of DSBs, resulting in a similar trend to the McMahon model via a different mechanism. This inter-comparative work therefore highlights the importance of considering DNA mobility, particularly on the scale of DSB ends. The distinct implementations of DSB mobility by the repair models are enough to cause a significant change in the proposed mechanism responsible for residual DSBs. However, these results seem to suggest that the fully mechanistic description of the repair pathway itself is not necessary to determine the likelihood of misrepair. Furthermore, both models can produce reasonable predictions of various biological endpoints without involving DNA damage complexity. The two likeliest explanations are: Firstly, that individual break complexity is not a major influence on DNA repair or is highly correlated with another parameter, such as break density. 
This means it is not feasible to quantify independently simply by varying LET. Secondly, and perhaps most likely, the models and endpoints investigated here may not include sufficient detail to accurately determine the influence of complexity, highlighting a key area for improvement. However, to include this added detail much more experimental biology needs to be done to provide the necessary fitting data in order to rationally expand these models. The requirement for careful, step by step, experimentally driven expansion of these models is made even clearer when the results thus far are summarised and considered together. Both damage models in this work assume the same method for inducing DNA DSBs but apply it at slightly different scales, resulting in significantly different predictions. In spite of this, both models show the same DSB density dependent misrepair which arises due to assumed random migration of DSB ends. The exact implementation of this mobility has a determining and divergent impact on the suggested mechanism, but importantly not the yield, of residual breaks. This highlights how models which produce similar final results can differ dramatically in intermediate predictions such as initial DSB yield or the ranges over which misrepair can occur, and underlines the necessity of combining such in silico work with experimental data at all stages of the radiation response. From a clinical perspective, the probabilities in Figure 6.5 can be applied to a dose map where LET has been scored. This would result in a probability map for the two biologically relevant endpoints, misrepair and residual breaks, considered in this work. Comparison between the resultant misrepair and residual maps can then alert us if there is a significant difference predicted by assuming one or the other mechanism as proposed by the different models. From Figure 6.5 it can already be seen that the McMahon-McMahon model always predicts greater misrepair yield than residual break yield, whereas the Henthorn-Warmenhoven model predicts more residual breaks than misrepaired DSBs for clinically relevant LET (<20 keV/μm).

6.5 Conclusions

We have shown how comparative work, facilitated by adoption of a standard format for reporting damage, has allowed us to explore the differences in our models and revealed which parameters are most sensitive. We have highlighted, through analysis of both damage and misrepair, how important consideration of DNA damage on the nanoscale is. We have also shown, through investigation of residual DSBs, how important consideration of DSB end motion is. Both these factors are difficult to investigate with current experimental technologies, instead requiring a combination of in silico modelling supported by in vitro/vivo methods. Furthermore, this work shows how inter-comparison can highlight areas where models are lacking. Whilst clinical implementation of biological outcomes modelling is desirable it necessitates quick simulation times. To achieve this with current technological limitations requires simplified models that predict the overall behaviour of DNA damage and repair. However, through this work we have shown that such simplification should be approached with caution as models can produce similar final results despite dramatically different intermediate predictions or mechanisms. When applied to scenarios which the model was not explicitly fit to, such differences could adversely interact with unaccounted for parameters to produce spurious predictions. This suggests that before simplification, models must first be comprehensively validated through experimental investigation of mechanisms at all stages of the radiation response. By further development of the Standard format for DNA Damage (SDD), we hope to increase collaboration in the field allowing for more work like that reported here.

6.6 Acknowledgements

NTH and JWW would like to acknowledge financial support from EPSRC (grant No.: EP/J500094/1). SJM would like to acknowledge financial support from the European Commission (EC FP7 grant MC-IOF-623630). This research was funded by the STFC Global Challenge Network+ in Advanced Radiotherapy and EPSRC Grand Challenge Network+ in Proton Therapy and supported by the NIHR Manchester Biomedical Research Centre. This work was supported by National Institute of Health / National Cancer Institute Grants R01CA187003 and U19 CA-21239.

7. Final Discussions

This thesis describes the implementation and development of a Monte Carlo simulation code that is capable of scoring DNA damage, where the output can be used to further investigate biological effect. This model is to be used to investigate mechanisms of RBE. Many publications have reported on the simulation of DNA damage, though the investigations with this model and the resulting predictions are novel. In order to investigate which parts of the damage pattern influence the resulting biological response, it was necessary to develop such a model. Once developed, and combined with the repair model, identification and investigation of key parameters was much more straightforward, since the model behaviour was well-known. Development of an in-house DNA damage simulator also made investigation of damage mechanisms straightforward, and allows for testing of some of the assumed mechanisms in other models.

RBE is one of the biggest unanswered questions in PBT. Reliance of modelling efforts on experimental data has introduced uncertainty in predictions, and as such a static RBE has been seen as preferable in clinical implementation. The model presented in this thesis is independent of the type of experimental data used in phenomenological modelling. With this approach, ab initio predictions are made of proton induced DNA damage, DSB complexity, and the spatial clustering of DSBs. These damage factors are used to predict biological outcomes of relevance to RBE. Furthermore, separate development and validation of the damage model allows for further investigation of repair mechanisms in isolation. Separation between damage and repair mechanisms is of interest to identify cell specific components, i.e. identifying mechanisms of radiosensitivity. This kind of damage and repair separation is beyond the possibility of phenomenological modelling. This thesis highlights elements of the physical beam that influence biological response.
The work presents an extension to the emerging field of nanodosimetry, which has the aim of determining biologically relevant dosimetry. The work has been presented as a series of publications, in the alternative thesis format, and whilst each publication has its own discussion these are often in the context of the publication. This chapter highlights some of the salient points from each publication, and discusses them in the wider context of the thesis aims.

7.1 Chapter 2

Chapter 2 details the first implementation of the DNA damage model. When constructing a simulation of the cellular geometry, and DNA, many studies require a model of the DNA superstructure. Within the literature it was apparent that the solenoid chromatin geometry was favoured for modelling purposes. However, evidence for any given chromatin structure is far from conclusive. In order to progress to higher DNA organisation structures, it was necessary to investigate the impact of different lower DNA organisational models. It was concluded, following rigorous investigation, that the chromatin structure does not significantly affect the predictions of DNA damage or complexity. This investigation gives credibility to studies that have modelled the solenoid chromatin fibre, and allows a simulation of the full cell geometry to use any of the chromatin models tested.

The context of the comparison presented in Chapter 2 was nanodosimetry. As such, ionisations were chosen as the mechanism for direct DNA damage, since experimental nanodosimetry assesses clusters of ionisations. With this method and the modelled DNA geometry it was possible to compare damage predictions to other models in the literature. It was found that the SSB to DSB ratio was systematically higher than for studies in the literature. Initially this disagreement was ascribed to the DNA geometry, though other explanations could involve the mechanism of direct damage (investigated in Chapter 3). DNA volumes, bases and backbones, were constructed as spheres. With this geometry it was not possible to extend the volume past a certain point, where overlaps with sequential base pairs begin to occur. By making the volumes smaller and resimulating it was possible to see the effect of DNA volume size on model predictions, and to determine the optimal volume size to recreate the predictions of other models in the literature. However, this work did not investigate the impact of other direct DNA damage mechanisms, such as sub-ionisation energy depositions. The work presented in Chapter 2 also did not consider the impact of indirect DNA damage. So, whilst informative, the damage models used were incomplete.

7.2 Chapter 3

Chapter 3 of this thesis focussed on investigating the mechanisms of DNA damage, and on extending the model presented in Chapter 2 to make further predictions including DSB complexity. The DNA geometry modelled in Chapter 2 was limited by the maximum achievable volume, with the volumes of bases and backbones significantly smaller than has been reported in the literature. It is desirable to keep the DNA elements as discrete volumes, so that it is possible to identify clusters by base pair separation rather than estimating from spatial separation. There have been a number of DNA models reported in the literature that meet this criterion. Three of the models were selected for testing, where DNA backbones are constructed from either half cylinders, quarter cylinders, or the spheres from Chapter 2. It is also worth noting that atomistic descriptions of the DNA volumes are available, though this sort of description can lead to a significant increase in simulation complexity with a cost in performance.

Using the sets of DNA geometry models, the mechanisms of direct DNA damage were investigated. Experimentally determining the direct DNA damage yield within a cell is difficult, if not impossible, since the process cannot be decoupled from indirect damage. However, a number of experiments have been carried out on the irradiation of dried plasmids, where strand breaks can be determined through a change in plasmid geometry. Experimenting with dry plasmids removes the contribution from indirect effects. Three methods of direct DNA damage were investigated: an energy threshold to the DNA, an energy-based probability of strand break induction, and ionisations to the DNA volumes. Each of these methods was combined with each of the DNA geometries and a plasmid model was irradiated in silico. The model predictions were compared to experimental data to show that the combination of the quarter cylinder DNA model with strand breaks determined by an energy-based probability best reproduced experimental data. However, it is worth noting that there is quite a large spread in the experimental data. This experimental spread does not imply that the mechanism of direct damage is incorrect, since the mechanism recreates experimental data; rather, it means that more experimental data are required.

Using the Geant4-DNA chemistry modules, water radiolysis and indirect DNA damage were added to the work presented in Chapter 2. The DNA model and direct damage mechanism were selected based on the results of the plasmid simulations. The chromatin fibre was simulated, and a probability of damage induction was applied to the OH radicals that crossed a DNA volume. The probability was selected so that 65% of DNA backbone damage is a result of indirect effects when irradiated with photons. The value of 65% was chosen based on experiments by Ward et al. (119). This mechanism of indirect damage is an oversimplification of the real chemical processes. Better approaches would involve no a priori fitting, instead simulating the effects of chemical reaction and the biological processing of indirectly damaged DNA volumes. However, such an approach would require significant computing power, or a significant scaling down of the geometry.
The implemented mechanism of indirect DNA damage also suffers since it is fit following direct DNA damage. Ideally the two mechanisms would be fit separately to remove any dependencies, or compensations where one mechanism can correct for an under- or over-prediction of the other.

At this stage the model includes a description of the chromatin fibre with mechanisms to describe direct and indirect DNA damage. To extend this model for cell level predictions it is necessary to describe the full genome geometry, where chromosomes are constructed from an organisation of chromatin fibres. Such an approach is desirable, and descriptions have been implemented in other works (170). However, chromosome structure is dynamic across the cell cycle and is not yet fully understood in the literature. Investigation of the higher genome organisation is an active area of research. Instead, by assuming that the DNA is homogeneously spread throughout the nucleus it is possible to predict DNA damage at the nucleus level based on the clustering of energy depositions (262). Such a method lacks detailed descriptions of the types of DNA damage induced. By using the detailed mechanism-based predictions of the chromatin fibre to populate the spatial predictions of the nucleus model it is possible to predict both DNA damage position and complexity. This comes at the cost of specificity; for example, with such a combination it is not possible to directly determine the chromosome in which the damage occurred. It may be possible to make approximations to add specificity, such as partitioning the nucleus into chromosome territories.

Using the cell and fibre model combinations, types and yields of DNA damage are predicted as a function of proton dose and LET. With a series of correlations, fitted to the detailed model predictions, complexity yields can be predicted mathematically. This includes not only the categories of DSB complexity, but also the specific number of damaged backbones and bases involved in a cluster. These correlations allow the predictions to be applied to a clinically relevant case, predicting the types of DNA damage that would be expected for cells located at a given position within the irradiation field. A simplified method is applied to make similar predictions for photon irradiation. An RBE of damage complexity can then be calculated, showing a sharp increase in RBE at the proton distal edge. This has clinical relevance if, for example, a certain type of DSB is shown to be particularly lethal to the cell. Summarising the detailed model predictions with correlative equations has the advantage of speed, which is particularly useful when translating nanodosimetry to the TPS. Here, the model predictions can easily be incorporated into the TPS provided that dose and LET are scored.
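
As a rough illustration of the energy-based probability of strand break induction discussed in this section, the sketch below maps the energy deposited in a backbone volume to a break probability. The 5 eV and 37.5 eV bounds are the range quoted in Section 7.6. The linear ramp between them, and the function and constant names, are assumptions made for this sketch and are not taken from the thesis code.

```python
import random

# Energy range quoted in Section 7.6 for direct strand break induction.
E_MIN_EV = 5.0
E_MAX_EV = 37.5


def strand_break_probability(energy_ev):
    """Probability that an energy deposit in a backbone volume causes a break.

    Assumption for this sketch: the probability rises linearly from 0 at
    E_MIN_EV to 1 at E_MAX_EV. The thesis only states that damage occurs
    probabilistically over this energy range.
    """
    if energy_ev <= E_MIN_EV:
        return 0.0
    if energy_ev >= E_MAX_EV:
        return 1.0
    return (energy_ev - E_MIN_EV) / (E_MAX_EV - E_MIN_EV)


def is_strand_break(energy_ev, rng=random):
    """Sample whether a single energy deposit produces a strand break."""
    return rng.random() < strand_break_probability(energy_ev)
```

Under these assumptions a deposit of 21.25 eV, the midpoint of the range, would break the backbone half of the time. A non-linear probability could be tested by replacing only the final return statement of `strand_break_probability`.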

7.3 Chapter 4

Chapter 4 details the results of combining the DNA damage model, presented in Chapter 3, with a NHEJ repair model. The DNA damage model has been developed separately and compared to experimental data to show valid predictions. It is of interest to follow the damage predictions through to biological effect, and to identify dependencies between damage and repair. The implemented repair model is somewhat insensitive to DSB complexity, where two DSB ends form an inseparable synaptic complex before associated DNA damages are processed. Within the implemented repair model the effect of extra DNA damages, associated with the DSB, is to delay the progression of the DSB to a fully fixed state. This may have relevance to cell fate when cell cycle checkpoints are considered.

Our primary investigation, within Chapter 4, was to assess LET as a parameter for the prediction of biological effect. We did this by simulating irradiation of the cell model, from Chapter 3, with iso-LET protons, alpha particles, and carbon ions. The damage patterns were then passed to the NHEJ repair model to investigate any differences in repair efficacy and fidelity. In terms of the damage yields, the model predicted a difference in the initial DSB yield between the iso-LET ions. In terms of the repair, the combined models predicted a difference in the probability of misrepair but a constant probability of residual DSBs between the iso-LET ions. Based on knowledge of the mechanism behind the NHEJ repair model we scored the initial damage pattern in terms of DSB density, by calculating the average number of DSBs within 70 nm of any given DSB. We refer to this value as the cluster density, giving an overall measure of the clustering between DSBs for a given radiation quality. There is a strong linear correlation between the cluster density and the probability of DSB misrepair, regardless of ion type or dose. The cluster density depends on the ion species and LET, but can be predicted as a function of LET with a second order polynomial. We find that the initial DSB yield depends linearly on dose and LET. Together it is then possible to estimate the total yield of DSB misrepairs from LET and dose, which we apply to a proton SOBP showing an increase in biological effect at the distal edge. As with Chapter 3, the correlative equations applied can readily be implemented into the TPS.

We do not go so far as to predict cell fate based on our predictions of residual and misrepaired DSBs. There are a number of reasons to limit our predictions as we have. Firstly, we do not apply any effect of cell cycle to the model and as such cannot make predictions about cell cycle checkpoints to identify the residual damage. Secondly, the cell model, presented in Chapter 3, does not explicitly model the chromosomes within the nucleus. Therefore, at this stage we can only say that a misrepair has occurred, not the toxicity of such an event.
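
The cluster density described above can be sketched as follows. This is a minimal illustration, assuming DSB positions are available as 3D coordinates in nanometres; the function name, the inclusive distance test, and the exclusion of a DSB from its own neighbour count are assumptions of this sketch rather than details confirmed by the thesis.

```python
import math


def cluster_density(dsb_positions_nm, radius_nm=70.0):
    """Average number of other DSBs within radius_nm of each DSB.

    dsb_positions_nm: sequence of (x, y, z) tuples in nanometres.
    Assumption for this sketch: a DSB is not counted as its own neighbour
    and a separation exactly equal to radius_nm counts as "within".
    """
    n = len(dsb_positions_nm)
    if n == 0:
        return 0.0
    neighbours = 0
    for i, p in enumerate(dsb_positions_nm):
        for j, q in enumerate(dsb_positions_nm):
            if i != j and math.dist(p, q) <= radius_nm:
                neighbours += 1
    return neighbours / n


# Two DSBs 50 nm apart and one isolated DSB: (1 + 1 + 0) / 3 neighbours.
example = [(0.0, 0.0, 0.0), (50.0, 0.0, 0.0), (500.0, 0.0, 0.0)]
density = cluster_density(example)
```

The pairwise loop scales quadratically with the number of DSBs, which is acceptable for a sketch. A spatial index would be a natural replacement for large damage patterns.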

7.4 Chapter 5

Throughout the development of the DNA damage model it was necessary to check predictions, validating with experimental data where possible or verifying with other models. Caution must be taken when validating predictions against other models. In this case circular validation can occur, and errors can propagate through models. For this reason, experimental validation is always preferable and comparison to other models should be used as verification rather than validation, though verification between a number of independently developed models may suffice. Inter-model verification can be difficult, as it is hard to extract the relevant information for comparison purposes. The yields of SSBs and DSBs, for example, are often reported, but for in-depth comparison more information is necessary. For example, to compare the extent of DSB clustering, as presented in Chapter 4, it would be necessary to have access to the spatial damage pattern predicted from a model for a given irradiation case.

Chapter 5 details a standard format for recording DNA damage from models such as the one presented in this thesis. There are a number of models in the literature capable of predicting various levels of information and detail, not all of which will be necessary for our understanding of DNA repair processes or radiation quality. The difficulty with proposing a standard format, such as in Chapter 5, is to capture as much relevant information as possible whilst ensuring the data is manageable, usable, and relevant. The standard format has received a lot of positive feedback from researchers in the field, with useful additions being proposed. It is hoped that a standard such as this will facilitate collaboration and deepen our understanding as a modelling community. Recording data in this format allows for easy comparison between models and for the use of such data by further models. A shared repository is envisaged to allow for the sharing of such data files between researchers. Researchers can then use the damage predictions from a model, such as the one detailed in this thesis, as an input to their repair model or for comparison to their damage model.

7.5 Chapter 6

Using the standard format reported in Chapter 5, the damage and repair models reported in Chapter 4 were compared to a similar damage and repair model created by S. McMahon (162,163). For the damage model McMahon takes a similar approach to the LEM, using an amorphous track structure that samples radial dose. For the repair model McMahon takes a similar approach to the BIANCA model, sampling pairs of DSBs to check for interactions, with the interaction probability depending on DSB separation.

In terms of the DNA damage, the model presented in this thesis predicts significant differences to the McMahon model. For example, McMahon predicts an average of 30 DSBs per Gy of radiation, regardless of proton LET. McMahon also predicts a constant DSB complexity across the proton LET range. The constant DSB yield in the McMahon model leads to a more diffuse DSB pattern than the model presented in this thesis. In this work the DSB yield depends on the clustering of energy depositions, which increases with increasing LET. Importantly, the McMahon damage and repair models were fit together, with the desired outcome of matching literature reported yields of chromosome aberration through DSB misrepair. As such, the range of DSB interaction used in the repair model is scaled for the diffuse spatial DSB pattern.

The repair model presented in Chapter 4 takes as an input the damage pattern predicted by the model presented in Chapter 3. The DSB end mobility was tuned in order to reproduce the yield of residual DSBs, following 24 hours of repair, reported in the literature. Other aspects of the repair model, such as the recruitment rate of repair proteins, were fit independently to mechanisms. There was no fitting applied in order to match yields of chromosome aberration. Despite the different fitting methodologies in each of the models, similar yields of biological outcomes are predicted. Since the Henthorn-Warmenhoven damage and repair models are fit mechanistically, where possible, the similar predictions to McMahon lend weight to a mechanistic understanding of the involved processes. Of course, both the damage and repair models require further refinement to be fully mechanistic. As well as the lessons learned through model comparison, Chapter 6 also demonstrates the importance and benefit of the standard format for recording DNA damage presented in Chapter 5.

7.6 Model Assumptions

It is desirable to develop models, such as the one presented in this thesis, through mechanisms rather than through fitting to biological outcomes, removing dependencies on noisy data and improving the predictive power of the model. If the model can sufficiently reproduce experimental results, for a number of given cases, then the model can be extrapolated to any given case. The difficulty of such an approach is to identify experimental results to be used for mechanistic understanding, particularly when the mechanisms involved are poorly understood in the literature or when there is disagreement between experimental data. As such, a number of mechanisms are applied that reproduce experimental results, but may not definitively describe the mechanism. Although previously discussed throughout the thesis, this section brings together and outlines the mechanisms used in the model and identifies areas of further investigation.

Chapter 3, and Section 7.2, discuss the mechanisms applied for direct and indirect DNA damage. In the model, direct DNA damage occurs probabilistically based on a deposited energy range (between 5 and 37.5 eV). This accounts for experimental evidence showing strand break induction below the DNA ionisation energy, though it does not account for an increased probability of strand breakage above the ionisation level, as discussed by Prise et al. (107). Comparison of model predictions to dry plasmid irradiation shows agreement, though there is disagreement between the individual experimental results. Indirect damage is assumed to cause 65% of the strand breaks for Co-60 irradiation, based on values suggested by Ward (119). This is for normal aerobic conditions and further investigation is required to model hypoxia. Furthermore, the spectrum of indirect DNA damage is not completely modelled in this work, where only strand breakage or abasic sites are modelled.
SSBs are converted into DSBs if they occur on opposite strands of the double helix, and are separated by 10 bp or less. This critical separation is common in a number of similar DNA damage models. However, the LEM uses a separation of 25 bp. PARTRAC also uses a 10 bp separation, although DSBs are merged if they are separated by 25 bp or less. It is likely that the separation between two SSBs relates to the probability of forming a DSB, with close SSBs more likely to form DSBs and distant SSBs less likely to form a DSB. Uncovering such an effect is difficult and involves careful investigation on isolated fragments of DNA, provided the mechanism is mechanical rather than a result of biological processes.

The critical separation of 10 bp can lead to clusters containing more than two damaged DNA backbones. For example, two DSBs that are separated by less than 10 bp will be identified as a single DSB. In reality this may result in two separate DSBs with a loss of the genome sequence between the two, i.e. the loss of a small fragment. Within the model it is assumed that this type of cluster leads to one DSB, with the other lesions associated with the DSB ends. Isolated backbone damages that have an isolated base damage on the opposite strand, within 10 bp, are grouped and identified as “potential DSBs”. It is assumed that some of these potential DSBs will be converted to DSBs through the repair process, though the process is not quantified in this work. PARTRAC converts 1% of SSBs to DSBs for similar reasons, though there is no distinction between isolated backbone damages and backbone damages with nearby base damages.

Base damages are incorporated into a DSB cluster if they are within 3 bp of the DSB end, informed by literature evidence for base separation involvement in repair processes (257). Perhaps a value of 10 bp would make more sense here, to account for repair induced strand breaks, as discussed in the previous paragraph. The involvement of base damages and the methodology of forming clusters influence the structure of the DSB ends. The DSB end structure, the overhang, impacts the choice of repair pathway and efficacy (370), and as such has relevance to biological outcome. The chromatin fibre simulation, based on direct and indirect DNA damage mechanisms, gives the detailed structure of the induced breaks.
However, the full nucleus damage is determined by the clustering of energy deposition sites. This simplification has the advantage of speed and simplicity, although mechanistic detail is lost in the process. For example, indirect damage is implicitly included in the nucleus model by accepting a greater number of energy depositions. This approach is valid only if the indirect damage occurs near to the physical track of primaries and secondaries. If radicals diffuse freely for a significantly long time, then this assumption begins to lose validity. The nucleus model is also only valid for a homogeneously distributed genome, limiting applicability for cell cycle phases where DNA is heterogeneous, such as metaphase. It would be desirable to extend the detailed chromatin fibre to the full nucleus model, including mechanisms of DNA damage and identifying specific chromosomes in which the damage occurred.
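
The 10 bp clustering rule described earlier in this section can be sketched as follows. This is a simplified illustration rather than the thesis implementation: backbone breaks are represented as hypothetical `(bp_index, strand)` tuples, breaks are chained into a cluster whenever consecutive breaks are 10 bp or less apart, and base damages and "potential DSBs" are not handled.

```python
MAX_SEPARATION_BP = 10  # critical separation quoted in the text


def classify_breaks(breaks, max_sep=MAX_SEPARATION_BP):
    """Group backbone breaks into clusters and label each 'DSB' or 'SSB'.

    breaks: iterable of (bp_index, strand) tuples, with strand in {1, 2}.
    Assumption for this sketch: consecutive breaks no more than max_sep
    apart are chained into one cluster, and a cluster is a DSB if it holds
    at least one opposite-strand pair within max_sep. A same-strand cluster
    is reported once as 'SSB', however many breaks it contains.
    """
    clusters = []
    for brk in sorted(breaks):
        # Extend the current cluster if the gap to its last break is small.
        if clusters and brk[0] - clusters[-1][-1][0] <= max_sep:
            clusters[-1].append(brk)
        else:
            clusters.append([brk])

    results = []
    for cluster in clusters:
        is_dsb = any(
            a[1] != b[1] and abs(a[0] - b[0]) <= max_sep
            for i, a in enumerate(cluster)
            for b in cluster[i + 1:]
        )
        results.append(("DSB" if is_dsb else "SSB", cluster))
    return results


# One opposite-strand pair 5 bp apart, and three isolated breaks.
labels = [label for label, _ in classify_breaks(
    [(100, 1), (105, 2), (300, 1), (500, 1), (520, 2)])]
```

With this chaining rule, two nearby DSBs collapse into a single cluster containing more than two damaged backbones, which mirrors the single-DSB assumption stated above. Changing `max_sep` allows other critical separations, such as 25 bp, to be explored.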

8. Conclusions and Future Work

The work presented in this thesis describes the development and refinement of a Monte Carlo track structure code, capable of simulating DNA damage from photons and ions. Mechanisms of direct and indirect DNA damage have been investigated and applied to the model. The model has been compared to experimental data where possible, and to the predictions of other models in the literature. The model has been combined with a NHEJ repair model, and key parameters of the damage that influence the repair efficacy have been identified. All of the detailed model predictions are reported as a function of proton dose and LET. Correlations are drawn to produce these model predictions as a function of dose and LET, making them easily applicable to PBT. This kind of approach allows for model implementation within TPS, where it may be desirable to maximise or minimise DSB complexity or misrepairs to cells in certain voxels. At no point are the model predictions extrapolated to cell fate. Whilst desirable this kind of prediction may require more detail in the models. For example, it is possible to predict misrepaired DSBs, since the correct pairing of DSB ends is known within the simulation. However, the nucleus model presented in this work does not explicitly build the full genome geometry nor chromosome territories. Therefore, we do not know the type of aberration that is formed through the misrepair, and do not assign a toxicity. It is possible to assume that of the misrepairs predicted some fraction will lead to a given aberration type, and then a toxicity can be assigned. DSB complexity and the impact on repair kinetics is an open hypothesis within the literature. The ability of the model presented in this thesis to predict DSB complexity for a given radiation quality will be an invaluable tool to investigate this hypothesis. 
Understanding the impedance of repair, due to complex damage, has relevance to optimising fractionation schedules in radiotherapy, summarised by the “repair” component in the “5 R’s of fractionation”.

There is opportunity to use experimental data to assign toxicity to the model predictions, without further development of the model. For example, a toxicity weighting factor can be applied to misrepaired DSBs or to a given DSB complexity. The weighting factors can then be fit to cell survival data. However, this introduces a dependency of the model predictions on experimental survival data. It is worth noting that this kind of approach can also be applied to experimental nanodosimetry. For example, a given radiation quality will produce a characteristic ICSD which can be used in conjunction with cell survival data to assign toxicities.

Currently, the models presented in this thesis are not cell type specific. Assuming that the full genome geometry is common to all cell types, as are the mechanisms of direct and indirect DNA damage, there is no further refinement of the damage model to account for cell type, although there may be differences between normal and malignant cells, such as aneuploidy. This assumption implies that differences in radiosensitivity are entirely a function of the cell's repair capability. The repair processes can become complicated quickly, with process chains involving many proteins and interactions that are poorly understood or not yet discovered. However, further investigation in the field may eventually reveal mechanisms that can explain radiosensitivity, without the reliance on fitting parameters from models such as the LQ.

A dependence on cell cycle phase is not directly assessed in the work of this thesis. In principle the simplified nucleus model is capable of simulating DNA damage patterns across interphase, so long as the DNA is homogeneously spread throughout the nucleus. With the model, the sensitive fraction of the nucleus can be changed to account for the total genome length. Ideally, the chromatin fibre model would be extended to represent the entire genome, making the model predictions more specific and applying the investigated damage mechanisms.

A series of comprehensive experimental results to further investigate mechanisms of direct and indirect DNA damage would be desirable. For example, dry plasmids could be irradiated across a finer range of proton LET, and experiments could be performed on DNA fragments where indirect damage can be isolated from direct damage. This will allow for more in-depth investigations of mechanisms, such as a non-linear energy probability for direct damage, the critical separation between SSBs to form a DSB, and mechanism-based probabilities for indirect damage including the contribution of oxygen concentration. A similar series of experimental results for chromosome aberrations would also be useful. Although the cell model does not currently build the full genome, assumptions can be made to determine the chromosome territories. When combined with repair simulations it is then possible to determine inter- and intra-chromosomal aberrations and to assign toxicities.
The experimental results will give validation to the models and confidence for implementation in TPS. RBE is confounded by many factors, both biological and physical. Some of the biological factors are intrinsic to the cell type, such as repair capabilities, and some are determined by physical properties of the radiation, such as the induction of proximal DSBs leading to greater misrepair or complex damage impeding repair kinetics. Implementation of a nanodosimetric model of the physical dose, such as the one detailed in this thesis, can remove physical aspects of RBE uncertainty.


208 Phys Med Biol. 1993;38(12):1841–58. 167. McNamara AL, Geng C, Turner R, Ramos-Méndez J, Perl J, Held K, et al. Validation of the radiobiology toolkit TOPAS-nBio in simple DNA geometries. Phys Medica [Internet]. 2016;33:207–15. Available from: http://dx.doi.org/10.1016/j.ejmp.2016.12.010 168. Meylan S, Incerti S, Karamitros M, Tang N, Bueno M, Clairand I, et al. Simulation of early DNA damage after the irradiation of a fibroblast cell nucleus using Geant4-DNA. Sci Rep [Internet]. 2017;7(1):11923. Available from: http://www.nature.com/articles/s41598-017-11851-4 169. Nikjoo H, Uehara S, Emfietzoglou D, Cucinotta FA. Track-structure codes in radiation research. Radiat Meas. 2006;41(9–10):1052–74. 170. Friedland W, Schmitt E, Kundrát P, Dingfelder M, Baiocco G, Barbieri S, et al. Comprehensive track-structure based evaluation of DNA damage by light ions from radiotherapy-relevant energies down to stopping. Sci Rep [Internet]. 2017 Mar 27;7(September 2016):45161. Available from: http://dx.doi.org/10.1038/srep45161 171. Dingfelder M, Inokuti M, Paretzke HG. Inelastic-collision cross sections of liquid water for interactions of energetic protons. Radiat Phys Chem. 2000;59(3):255–75. 172. Friedland W, Dingfelder M, Jacob P, Paretzke HG. Calculated DNA double- strand break and fragmentation yields after irradiation with He ions. Radiat Phys Chem. 2005;72(2–3):279–86. 173. Friedland W, Dingfelder M, Kundrát P, Jacob P. Track structures, DNA targets and radiation effects in the biophysical Monte Carlo simulation code PARTRAC. Mutat Res - Fundam Mol Mech Mutagen [Internet]. 2011;711(1–2):28–40. Available from: http://dx.doi.org/10.1016/j.mrfmmm.2011.01.003 174. Friedland W, Jacob P, Kundrát P. Stochastic simulation of DNA double- strand break repair by non-homologous end joining based on track structure calculations. Radiat Res [Internet]. 2010;173(5):677–88. 
Available from: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&id=20 426668&retmode=ref&cmd=prlinks%5Cnfile:///Users/Piotr/Documents/Pape rs2/Articles/2010/Friedland/Radiation Research 2010 Friedland.pdf 175. Crespo-Hernández CE, Arce R, Ishikawa Y, Gorb L, Leszczynski J, Close DM. Ab initio ionization energy thresholds of DNA and RNA bases in gas phase and in aqueous solution. J Phys Chem A. 2004;108(30):6373–7. 176. Fernando H, Papadantonakis G a, Kim NS, LeBreton PR. Conduction- band-edge ionization thresholds of DNA components in aqueous solution. Proc Natl Acad Sci U S A. 1998;95(10):5550–5. 177. Bald I, Illenberger E, Kopyra J. Damage of DNA by Low Energy Electrons

209 (< 3 eV). J Phys Conf Ser. 2012;373:12008. 178. Kopyra J. Low energy electron attachment to the nucleotide deoxycytidine monophosphate: direct evidence for the molecular mechanisms of electron- induced DNA strand breaks. Phys Chem Chem Phys. 2012;14(23):8287. 179. Agostinelli S, Allison J, Amako K, Apostolakis J, Araujo H, Arce P, et al. GEANT4 - A simulation toolkit. Nucl Instruments Methods Phys Res Sect A Accel Spectrometers, Detect Assoc Equip. 2003;506(3):250–303. 180. Chauvie S, Francis Z, Guatelli S, Incerti S, Mascialino B, Moretto P, et al. Geant4 physics processes for microdosimetrysimulation: design foundation and implementationof the first set of models. 2007;54(6):2619–28. 181. Champion C, Incerti S, Aouchiche H, Oubaziz D. A free-parameter theoretical model for describing the electron elastic scattering in water in the Geant4 toolkit. Radiat Phys Chem [Internet]. 2009;78(9):745–50. Available from: http://dx.doi.org/10.1016/j.radphyschem.2009.03.079 182. Villagrasa C, Francis Z, Incerti S, Solarium C. Physical models implemented in the Geant4-DNA extension of the Geant-4 toolkit for calculating initial radiation damage at the molecular level. Radiat Prot Dosimetry. 2011;143(2–4):214–8. 183. Dingfelder M. Track-structure Simulations for Charged Particles. Health Phys. 2012;103(5):590–5. 184. Rudd ME, Kim YK, Madison DH, Gallagher JW. Electron production in proton collisions: Total cross sections. Rev Mod Phys. 1985;57(4):965–94. 185. Rudd M, Kim Y, Madison D, Gay T. Electron production in proton collisions with atoms and molecules: energy distributions. Rev Mod Phys. 1992;64(2):441–90. 186. Miller JH, Green a E. Proton energy degradation in water vapor. Radiat Res. 1973;54(3):343–63. 187. Emfietzoglou D, Papamichael G, Kostarelos K, Moscovitch M. A Monte Carlo track structure code for electrons (approximately 10 eV-10 keV) and protons (approximately 0.3-10 MeV) in water: partitioning of energy and collision events. Phys Med Biol. 
2000;45(11):3171–94. 188. Brenner DJ, Zaider M. A computationally convenient parameterisation of experimental angular distributions of low energy electrons elastically scattered off water vapour. Phys Med Biol. 1983;29(4):443–7. 189. Francis Z, Incerti S, Capra R, Mascialino B, Montarou G, Stepan V, et al. Molecular scale track structure simulations in liquid water using the Geant4- DNA Monte-Carlo processes. Appl Radiat Isot [Internet]. 2011;69(1):220–6. Available from: http://dx.doi.org/10.1016/j.apradiso.2010.08.011 190. Urban L. A model for multiple scattering in Geant4. Prepr Cern. 2006; 191. Michaud M, Wen A, Sanche L. Cross sections for low-energy (1-100 eV)

210 electron elastic and inelastic scattering in amorphous ice. Radiat Res. 2003;159(1):3–22. 192. Banna MS, McQuaide BH, Malutzki R, Schmidt V. The photoelectron spectrum of water in the 30 to 140 eV photon energy range. J Chem Phys [Internet]. 1986;84(9):4739. Available from: http://scitation.aip.org/content/aip/journal/jcp/84/9/10.1063/1.450008 193. Ivanchenko VN, Incerti S, Francis Z, Tran HN, Karamitros M, Bernal M a., et al. Combination of electromagnetic physics processes for microdosimetry in liquid water with the Geant4 Monte Carlo simulation toolkit. Nucl Instruments Methods Phys Res Sect B Beam Interact with Mater Atoms [Internet]. 2012;273:95–7. Available from: http://dx.doi.org/10.1016/j.nimb.2011.07.048 194. Champion C, Incerti S, Tran HN, Karamitros M, Shin JI, Lee SB, et al. Proton transport in water and DNA components: A Geant4 Monte Carlo simulation. Nucl Instruments Methods Phys Res Sect B Beam Interact with Mater Atoms [Internet]. 2013;306:165–8. Available from: http://dx.doi.org/10.1016/j.nimb.2012.12.059 195. Champion C, Galassi ME, Weck PF, Incerti S, Rivarola RD, Fojón O, et al. Proton-induced ionization of isolated uracil molecules: A theory/experiment confrontation. Nucl Instruments Methods Phys Res Sect B Beam Interact with Mater Atoms [Internet]. 2013;314:66–70. Available from: http://dx.doi.org/10.1016/j.nimb.2013.04.063 196. Tran HN, El Bitar Z, Champion C, Karamitros M, Bernal M a., Francis Z, et al. Modeling proton and alpha elastic scattering in liquid water in Geant4- DNA. Nucl Instruments Methods Phys Res Sect B Beam Interact with Mater Atoms [Internet]. 2015;343:132–7. Available from: http://linkinghub.elsevier.com/retrieve/pii/S0168583X14008544 197. Kyriakou I, Incerti S, Francis Z. Technical Note: Improvements in geant4 energy-loss model and the effect on low-energy electron transport in liquid water. Med Phys [Internet]. 2015 Jun 9;42(7):3870–6. 
Available from: http://scitation.aip.org/content/aapm/journal/medphys/42/7/10.1118/1.49216 13 198. Geant4-DNA Collaboration [Internet]. [cited 2018 Mar 7]. Available from: http://geant4-dna.org 199. Karamitros M, Mantero A, Incerti S, Friedland W, Baldacchino G, Barberet P, et al. Modeling Radiation Chemistry in the Geant4 Toolkit. Prog Nucl Sci Technol [Internet]. 2011 Oct 1;2:503–8. Available from: http://linkinghub.elsevier.com/retrieve/pii/S0021999114004185 200. Karamitros M, Luan S, Bernal M a., Allison J, Baldacchino G, Davidkova M, et al. Diffusion-controlled reactions modeling in Geant4-DNA. J Comput

211 Phys [Internet]. 2014;274(July):841–82. Available from: http://dx.doi.org/10.1016/j.jcp.2014.06.011 201. Hanai R, Yazu M, Hieda K. On the experimental distinction between ssbs and dsbs in circular DNA. Int J Radiat Biol. 1998;73(5):475–9. 202. Schipler A, Iliakis G. DNA double-strand-break complexity levels and their possible contributions to the probability for error-prone processing and repair pathway choice. Nucleic Acids Res. 2013;41(16):7589–605. 203. Hada M, Georgakilas AG. Formation of clustered DNA damage after high- LET irradiation: a review. J Radiat Res. 2008;49(3):203–10. 204. Okayasu R. Repair of DNA damage induced by accelerated heavy ions-A mini review. Int J Cancer. 2012;130(5):991–1000. 205. Ward JF. The complexity of DNA damage: relevance to biological consequences. Int J Radiat Biol. 1994;66(5):427–32. 206. Antonelli F, Campa A, Esposito G, Giardullo P, Belli M, Dini V, et al. Induction and Repair of DNA DSB as Revealed by H2AX Phosphorylation Foci in Human Fibroblasts Exposed to Low- and High-LET Radiation: Relationship with Early and Delayed Reproductive Cell Death. Radiat Res [Internet]. 2015;183(4):417–31. Available from: http://www.rrjournal.org/doi/10.1667/RR13855.1%5Cnhttp://www.ncbi.nlm.n ih.gov/pubmed/25844944 207. Rich T, Allen RL, Wyllie a H. Defying death after DNA damage. Nature. 2000;407(6805):777–83. 208. Jackson SP. Sensing and repairing DNA double-strand breaks. Carcinogenesis. 2002;23(5):687–96. 209. Hada M, Sutherland BM. Spectrum of complex DNA damages depends on the incident radiation. Radiat Res. 2006;165(2):223–30. 210. Held KD, Kawamura H, Kaminuma T, Paz AES, Yoshida Y, Liu Q, et al. Effects of Charged Particles on Human Tumor Cells. Front Oncol [Internet]. 2016;6(February):1–19. Available from: http://journal.frontiersin.org/article/10.3389/fonc.2016.00023 211. Yokota Y, Yamada S, Hase Y, Shikazono N, Narumi I, Tanaka A, et al. 
Initial Yields of DNA Double-Strand Breaks and DNA Fragmentation Patterns Depend on Linear Energy Transfer in Tobacco BY-2 Protoplasts Irradiated with Helium, Carbon and Neon Ions. Radiat Res [Internet]. 2007;167(1):94–101. Available from: http://www.jstor.org/stable/4127469 212. Tommasino F, Durante M. Proton radiobiology. Cancers (Basel). 2015;7(1):353–81. 213. Grosswendt B. Nanodosimetry, from radiation physics to radiation biology. Radiat Prot Dosimetry [Internet]. 2005 Dec 20;115(1–4):1–9. Available from: http://rpd.oxfordjournals.org/content/115/1-4/1.abstract

212 214. Garty G, Schulte R, Shchemelinin S, Grosswendt B, Leloup C, Assaf G, et al. First attempts at prediction of DNA strand-break yields using nanodosimetric data. Radiat Prot Dosimetry. 2006;122(1–4):451–4. 215. Brenner DJ, Ward JF. Constraints on Energy Deposition and Target Size of Multiply Damaged Sites Associated with DNA Double-strand Breaks. Int J Radiat Biol [Internet]. 1992;61(6):737–48. Available from: http://informahealthcare.com/doi/abs/10.1080/09553009214551591 216. Bernal MA, Bordage MC, Brown JMC, Davídková M, Delage E, El Bitar Z, et al. Track structure modeling in liquid water: A review of the Geant4-DNA very low energy extension of the Geant4 Monte Carlo simulation toolkit. Phys Medica [Internet]. 2015;1–14. Available from: http://linkinghub.elsevier.com/retrieve/pii/S1120179715010042 217. Bueno M, Schulte R, Meylan S, Villagrasa C. Influence of the geometrical detail in the description of DNA and the scoring method of ionization clustering on nanodosimetric parameters of track structure: a Monte Carlo study using Geant4-DNA. Phys Med Biol [Internet]. 2015;60(21):8583. Available from: http://stacks.iop.org/0031-9155/60/i=21/a=8583 218. Delage E, Pham QT, Karamitros M, Payno H, Stepan V, Incerti S, et al. PDB4DNA: Implementation of DNA geometry from the Protein Data Bank (PDB) description for Geant4-DNA Monte-Carlo simulations. Comput Phys Commun [Internet]. 2015; Available from: http://linkinghub.elsevier.com/retrieve/pii/S0010465515000843 219. Dos Santos M, Villagrasa C, Clairand I, Incerti S. Influence of the DNA density on the number of clustered damages created by protons of different energies. Nucl Instruments Methods Phys Res Sect B Beam Interact with Mater Atoms [Internet]. 2013;298:47–54. Available from: http://dx.doi.org/10.1016/j.nimb.2013.01.009 220. Dos Santos M, Villagrasa C, Clairand I, Incerti S. 
Influence of the chromatin density on the number of direct clustered damages calculated for proton and alpha irradiations using a Monte Carlo code. Prog Nucl Sci Technol [Internet]. 2014;4:449–53. Available from: http://www.aesj.or.jp/publication/pnst004/data/449_453.pdf 221. Bernal M a., Sikansi D, Cavalcante F, Incerti S, Champion C, Ivanchenko V, et al. An atomistic geometrical model of the B-DNA configuration for DNA- radiation interaction simulations. Comput Phys Commun [Internet]. 2013;184(12):2840–7. Available from: http://dx.doi.org/10.1016/j.cpc.2013.07.015 222. Meylan S, Vimont U, Incerti S, Clairand I, Villagrasa C. Geant4-DNA simulations using complex DNA geometries generated by the DnaFabric tool. Comput Phys Commun [Internet]. 2016; Available from:

213 http://linkinghub.elsevier.com/retrieve/pii/S0010465516300340 223. Teif VB, Bohinc K. Condensed DNA: Condensing the concepts. Prog Biophys Mol Biol [Internet]. 2011;105(3):208–22. Available from: http://dx.doi.org/10.1016/j.pbiomolbio.2010.07.002 224. van Holde KE. Chromatin [Internet]. New York, NY: Springer New York; 1989. (Springer Series in Molecular Biology). Available from: http://link.springer.com/10.1007/978-1-4612-3490-6 225. Kornberg RD. Chromatin Structure: A Repeating Unit of Histones and DNA. Sci [Internet]. 1974 May 24;184(4139):868–71. Available from: http://www.sciencemag.org/content/184/4139/868.short 226. van Holde K, Zlatanova J. Chromatin fiber structure: Where is the problem now? Semin Cell Dev Biol [Internet]. 2007;18(5):651–8. Available from: http://linkinghub.elsevier.com/retrieve/pii/S1084952107001206 227. Finch JT, Klug a. Solenoidal model for superstructure in chromatin. Proc Natl Acad Sci U S A. 1976;73(6):1897–901. 228. Woodcock CLF, Frado LLY, Rattner JB. The higher-order structure of chromatin: Evidence for a helical ribbon arrangement. J Cell Biol. 1984;99(1 I):42–52. 229. Staynov DZ. Possible nucleosome arrangements in the higher-order structure of chromatin. Int J Biol Macromol. 1983;5(1):3–9. 230. Grigoryev SA, Woodcock CL. Chromatin organization - The 30nm fiber. Exp Cell Res [Internet]. 2012;318(12):1448–55. Available from: http://dx.doi.org/10.1016/j.yexcr.2012.02.014 231. Maeshima K, Hihara S, Eltsov M. Chromatin structure: Does the 30-nm fibre exist in vivo? Curr Opin Cell Biol [Internet]. 2010;22(3):291–7. Available from: http://dx.doi.org/10.1016/j.ceb.2010.03.001 232. Friedland W, Jacob P, Paretzke HG, Stork T. Monte Carlo Simulation of the Production of Short DNA Fragments by Low-Linear Energy Transfer Radiation Using Higher-Order DNA Models. Radiat Res [Internet]. 1998;150(2):170. Available from: http://www.jstor.org/stable/3579852?origin=crossref 233. Holley WR, Chatterjee A, Rydberg B. 
Clusters of DNA damage induced by ionizing radiation: Formation of short DNA fragments .2. Experimental detection. Radiat Res [Internet]. 1996 Feb 1;145(2):200–9. Available from: http://www.rrjournal.org/doi/abs/10.2307/3579174 234. Rydberg B, Holley WR, Mian IS, Chatterjee a. Chromatin conformation in living cells: support for a zig-zag model of the 30 nm chromatin fiber. J Mol Biol. 1998;284(1):71–84. 235. Nadassy K, Tomás-Oliveira I, Alberts I, Janin J, Wodak SJ. Standard atomic volumes in double-stranded DNA and packing in protein--DNA

214 interfaces. Nucleic Acids Res [Internet]. 2001;29(16):3362–76. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=55857&tool=pmc entrez&rendertype=abstract 236. Wang AH, Quigley GJ, Kolpak FJ, Crawford JL, van Boom JH, van der Marel G, et al. Molecular structure of a left-handed double helical DNA fragment at atomic resolution. Nature [Internet]. 1979;282(5740):680–6. Available from: http://europepmc.org/abstract/MED/514347%5Cnhttp://www.ncbi.nlm.nih.go v/pubmed/514347 237. Drew HR, Wing RM, Takano T, Broka C, Tanaka S, Itakura K, et al. Structure of a B-DNA dodecamer: conformation and dynamics. Proc Natl Acad Sci U S A [Internet]. 1981;78(4):2179–83. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=319307&tool=pm centrez&rendertype=abstract 238. Baker ES, Bowers MT. B-DNA Helix Stability in a Solvent-Free Environment. J Am Soc Mass Spectrom [Internet]. 2007 Jul;18(7):1188–95. Available from: http://link.springer.com/10.1016/j.jasms.2007.03.001 239. Bernal MA, E C, Incerti S, Champion C, Ivanchenko V, Francis Z. The influence of the DNA structure on the total direct strand break yield . 2015;1:3. 240. Dias RS, Lindman B. DNA Interactions with Polymers and Surfactants. DNA Interactions with Polymers and Surfactants. Wiley-Interscience; 2007. 1-407 p. 241. Daban J-R. Electron microscopy and atomic force microscopy studies of chromatin and metaphase chromosome structure. Micron [Internet]. 2011;42(8):733–50. Available from: http://linkinghub.elsevier.com/retrieve/pii/S0968432811000771 242. Luger K, Mäder a W, Richmond RK, Sargent DF, Richmond TJ. Crystal structure of the nucleosome core particle at 2.8 A resolution. Nature. 1997;389(6648):251–60. 243. Koslover EF, Fuller CJ, Straight AF, Spakowitz AJ. Local geometry and elasticity in compact chromatin structure. Biophys J [Internet]. 2010;99(12):3941–50. Available from: http://dx.doi.org/10.1016/j.bpj.2010.10.024 244. Thomas JO. 
The higher order structure of chromatin and histone H1. [Internet]. Vol. 1, Journal of cell science. Supplement. 1984. p. 1–20. Available from: http://www.ncbi.nlm.nih.gov/pubmed/6397467%5Cnhttp://jcs.biologists.org/ content/1984/Supplement_1/1.short

215 245. Kyriakou I, Šefl M, Nourry V, Incerti S. The impact of new Geant4-DNA cross section models on electron track structure simulations in liquid water. J Appl Phys [Internet]. 2016 May 17;119(19):194902. Available from: http://dx.doi.org/10.1063/1.4950808 246. Nikjoo H, O’Neill P, Terrissol M, Goodhead DT. Quantitative modelling of DNA damage using Monte Carlo track structure method. Radiat Environ Biophys. 1999;38(1):31–8. 247. Lazarakis P, Bug MU, Gargioni E, Guatelli S, Rabus H, Rosenfeld a B. Comparison of nanodosimetric parameters of track structure calculated by the Monte Carlo codes Geant4-DNA and PTra. Phys Med Biol [Internet]. 2012;57(5):1231–50. Available from: http://www.ncbi.nlm.nih.gov/pubmed/22330641 248. Alexander F, Villagrasa C, Rabus H, Wilkens JJ. Local weighting of nanometric track structure properties in macroscopic voxel geometries for particle beam treatment planning. Phys Med Biol [Internet]. 2015;60(23):9145–56. Available from: http://stacks.iop.org/0031- 9155/60/i=23/a=9145?key=crossref.7e7eb63f671a13cf652b11ceb87710e7 249. Welford B. Note on a Method for Calculating Corrected Sums of Squares and Products. Technometrics [Internet]. 1962;4(3):419–20. Available from: http://www.jstor.org/stable/10.2307/1266219%5Cnhttp://www.jstor.org/stabl e/1266577 250. Chaudhry MA, Weinfeld M. Reactivity of human apurinic/apyrimidinic endonuclease and Escherichia coli exonuclease III with bistranded abasic sites in DNA. J Biol Chem. 1997;272(25):15650–5. 251. Harrison L, Hatahet Z, Wallace SS. In vitro repair of synthetic ionizing radiation-induced multiply damaged DNA sites. J Mol Biol. 1999;290(3):667–84. 252. David-Cordonnier MH, Cunniffe SMT, Hickson ID, O’Neill P. Efficiency of incision of an AP site within clustered DNA damage by the major human AP endonuclease. Biochemistry. 2002;41(2):634–42. 253. Georgakilas AG, Bennett P V., Wilson DM, Sutherland BM. Processing of bistranded abasic DNA clusters in Gamma-irradiated human hematopoietic cells. 
Nucleic Acids Res. 2004;32(18):5609–20. 254. Ottolenghi A, Merzagora M, Tallone L, Durante M, Paretzke HG, Wilson WE. The quality of DNA double-strand breaks: A Monte Carlo simulation of the end-structure of strand breaks produced by protons and alpha particles. Radiat Environ Biophys [Internet]. 1995;34(4):239–44. Available from: http://dx.doi.org/10.1007/BF01209749 255. Nikjoo H, Neill PO, Wilson WE, Goodhead DT. Computational Approach for Determining the Spectrum of DNA Damage Induced by Ionizing Radiation.

216 2001;583(5):577–83. 256. Zhang L, Tan Z. A new calculation on spectrum of direct DNA damage induced by low-energy electrons. Radiat Environ Biophys. 2010;49(1):15– 26. 257. Watanabe R, Rahmanian S, Nikjoo H. Spectrum of Radiation-Induced Clustered Non-DSB Damage - A Monte Carlo Track Structure Modeling and Calculations. Radiat Res [Internet]. 2015;183(5):525–40. Available from: http://www.bioone.org/doi/10.1667/RR13902.1%5Cnhttp://www.ncbi.nlm.nih .gov/pubmed/25909147 258. Scholz V, Weidner J, Köhnlein W, Frekers D, Wörtche HJ. Induction of single- and double-strand breaks in plasmid DNA by monoenergetic alpha- particles with energies below the Bragg-Maximum. Zeitschrift fur Naturforsch Sect C - J Biosci. 1997;52(5–6):364–72. 259. Leloup C, Garty G, Assaf G, Cristovão A, Breskin A, Chechik R, et al. Evaluation of lesion clustering in irradiated plasmid DNA. Int J Radiat Biol [Internet]. 2005;81(1):41–54. Available from: http://informahealthcare.com/doi/abs/10.1080/09553000400017895 260. Vyšín L, Pachnerová Brabcová K, Štěpán V, Moretto-Capelle P, Bugler B, Legube G, et al. Proton-induced direct and indirect damage of plasmid DNA. Radiat Environ Biophys [Internet]. 2015;54(3):343–52. Available from: http://www.ncbi.nlm.nih.gov/pubmed/26007308 261. Garty G, Schulte R, Shchemelinin S, Leloup C, Assaf G, Breskin A, et al. A nanodosimetric model of radiation-induced clustered DNA damage yields. Phys Med Biol [Internet]. 2010;55(3):761–81. Available from: http://www.ncbi.nlm.nih.gov/pubmed/20071772 262. Francis Z, Villagrasa C, Clairand I. Simulation of DNA damage clustering after proton irradiation using an adapted DBSCAN algorithm. Comput Methods Programs Biomed [Internet]. 2011;101(3):265–70. Available from: http://dx.doi.org/10.1016/j.cmpb.2010.12.012 263. Pater P, Bäckstöm G, Villegas F, Ahnesjö A, Enger SA, Seuntjens J, et al. Proton and light ion RBE for the induction of direct DNA double strand breaks. Med Phys [Internet]. 2016;43(5):2131–40. 
Available from: http://scitation.aip.org/content/aapm/journal/medphys/43/5/10.1118/1.49448 70 264. Semenenko V a, Stewart RD. A fast Monte Carlo algorithm to simulate the spectrum of DNA damages formed by ionizing radiation. Radiat Res. 2004;161(4):451–7. 265. Liang Y, Yang G, Liu F, Wang Y. Monte Carlo simulation of ionizing radiation induced DNA strand breaks utilizing coarse grained high-order chromatin structures. Phys Med Biol [Internet]. 2016;61(1):445–60.

217 Available from: http://stacks.iop.org/0031- 9155/61/i=1/a=445?key=crossref.2aa8ed3c6281182b9effb3f9c38c6aac 266. Emami B, Lyman J, Brown A, Cola L, Goitein M, Munzenrider JE, et al. Tolerance of normal tissue to therapeutic irradiation. Int J Radiat Oncol [Internet]. 1991;21(1):109–22. Available from: http://www.sciencedirect.com/science/article/pii/036030169190171Y 267. Guan F, Bronk L, Titt U, Lin SH, Mirkovic D, Kerr MD, et al. Spatial mapping of the biologic effectiveness of scanned particle beams: Towards biologically optimized particle therapy. Sci Rep. 2015;5:1–10. 268. Marshall T. I, Chaudhary P, Michaelidesová A, Vachelová J, Davídková M, Vondráček V, et al. Investigating the implications of a variable RBE on proton dose fractionation across a clinical pencil beam scanned spread-out Bragg peak. Int J Radiat Oncol [Internet]. 2016;95(1):70–7. Available from: http://www.sciencedirect.com/science/article/pii/S0360301616001498 269. Sørensen BS, Overgaard J, Bassler N. In vitro RBE-LET dependence for multiple particle types. Acta Oncol (Madr). 2011;50(6):757–62. 270. Incerti S, Douglass M, Penfold S, Guatelli S, Bezak E. Review of Geant4- DNA applications for micro and nanoscale simulations. Phys Medica [Internet]. 2016 Oct;32(10):1187–200. Available from: http://dx.doi.org/10.1016/j.ejmp.2016.09.007 271. Friedland W, Jacob P, Kundrat P. Mechanistic Simulation of Radiation Damage To Dna and Its Repair : on the Track Tow Ards Systems. Radiat Prot Dosimetry. 2011;143(2):542–8. 272. Taleei R, Nikjoo H. Biochemical DSB-repair model for mammalian cells in G1 and early S phases of the cell cycle. Mutat Res - Genet Toxicol Environ Mutagen [Internet]. 2013;756(1–2):206–12. Available from: http://dx.doi.org/10.1016/j.mrgentox.2013.06.004 273. Eriksson D, Stigbrand T. Radiation-induced cell death mechanisms. Tumor Biol [Internet]. 2010;31(4):363–72. Available from: https://doi.org/10.1007/s13277-010-0042-8 274. Sachs RK, Chen D.J. AMB, Chen a M, Brenner DJ. 
Review: proximity effects in the production of chromosome aberrations by ionizing radiation. Int J Radiat Biol. 1997;71(1):1–19. 275. Georgakilas AG, O’Neill P, Stewart RD. Induction and Repair of Clustered DNA Lesions: What Do We Know So Far? Radiat Res [Internet]. 2013;180(1):100–9. Available from: http://www.bioone.org/doi/10.1667/RR3041.1 276. Carter RJ, Nickson CM, Thompson JM, Kacperek A, Hill MA, Parsons JL. Complex DNA damage induced by high-LET α-particles and protons triggers a specific cellular DNA damage response. Int J Radiat Oncol

218 [Internet]. 2017; Available from: http://linkinghub.elsevier.com/retrieve/pii/S0360301617341007 277. Mavragani I, Nikitaki Z, Souli M, Aziz A, Nowsheen S, Aziz K, et al. Complex DNA Damage: A Route to Radiation-Induced Genomic Instability and Carcinogenesis. Cancers (Basel) [Internet]. 2017;9(7):91. Available from: http://www.mdpi.com/2072-6694/9/7/91 278. Hilgers G, Bug MU, Rabus H. Measurement of track structure parameters of low and medium energy helium and carbon ions in Measurement of track structure parameters of low and medium energy helium and carbon ions in nanometric volumes. Phys Med Biol [Internet]. 2017;aa86e8. Available from: https://doi.org/10.1088/1361-6560/aa86e8 279. Villegas F, Bäckström G, Tilly N, Ahnesjö A. Energy deposition clustering as a functional radiation quality descriptor for modeling relative biological effectiveness. Med Phys. 2016;43(12):6322–35. 280. Thomlinson RH, Gray LH. The Histological Structure of Some Human Lung Cancers and the Possible Implications for Radiotherapy. Br J Cancer [Internet]. 1955 Dec 1;9:539. Available from: http://dx.doi.org/10.1038/bjc.1955.55 281. Lomax ME, Folkes LK, O’Neill P. Biological consequences of radiation- induced DNA damage: Relevance to radiotherapy. Clin Oncol [Internet]. 2013;25(10):578–85. Available from: http://dx.doi.org/10.1016/j.clon.2013.06.007 282. Henthorn NT, Warmenhoven JW, Sotiropoulos M, Mackay RI, Kirkby KJ, Merchant MJ. Nanodosimetric Simulation of Direct Ion-Induced DNA Damage Using Different Chromatin Geometry Models. Radiat Res [Internet]. 2017;188(6):690–703. Available from: http://www.rrjournal.org/doi/10.1667/RR14755.1 283. Bernal M a, Liendo J a. An investigation on the capabilities of the PENELOPE MC code in nanodosimetry. Med Phys. 2009;36(2):620–5. 284. Charlton DE, Nikjoo H, Humm JL. Calculation of Initial Yields of Single- and Double-strand Breaks in Cell Nuclei from Electrons, Protons and Alpha Particles. Int J Radiat Biol [Internet]. 
1989 Jan 3 [cited 2017 Nov 28];56(1):1–19. Available from: http://www.tandfonline.com/doi/full/10.1080/09553008914551141 285. Berman HM, Olson WK, Beveridge DL, Westbrook J, Gelbin A, Demeny T, et al. The nucleic acid database. A comprehensive relational database of three-dimensional structures of nucleic acids. Biophys J [Internet]. 1992;63(3):751–9. Available from: http://dx.doi.org/10.1016/S0006- 3495(92)81649-1 286. Francis Z, El Bitar Z, Incerti S, Bernal MA, Karamitros M, Tran HN.

219 Calculation of lineal energies for water and DNA bases using the Rudd model cross sections integrated within the Geant4-DNA processes. J Appl Phys [Internet]. 2017;122(1):14701. Available from: http://aip.scitation.org/doi/10.1063/1.4990293 287. Bordage MC, Bordes J, Edel S, Terrissol M, Franceries X, Bardiès M, et al. Implementation of new physics models for low energy electrons in liquid water in Geant4-DNA. Phys Medica [Internet]. 2016 Dec 1 [cited 2017 Nov 28];32(12):1833–40. Available from: http://www.sciencedirect.com/science/article/pii/S1120179716309516 288. Michael BD, O'Neill P. A Sting in the Tail of Electron Tracks. Science (80- ) [Internet]. 2000 Mar 3;287(5458):1603 LP-1604. Available from: http://science.sciencemag.org/content/287/5458/1603.abstract 289. Datta K, Purkayastha S, Neumann RD, Pastwa E, Winters T a. Base damage immediately upstream from double-strand break ends is a more severe impediment to nonhomologous end joining than blocked 3’-termini. Radiat Res [Internet]. 2011;175(1):97–112. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3518376&tool=p mcentrez&rendertype=abstract 290. Dobbs TA, Palmer P, Maniou Z, Lomax ME, O’Neill P. Interplay of two major repair pathways in the processing of complex double-strand DNA breaks. DNA Repair (Amst). 2008;7(8):1372–83. 291. Obe G, Johannes C, Schulte-Frohlinde D. DNA double-strand breaks induced by sparsely ionizing radiation and endonucleases as critical lesions for cell death, chromosomal aberrations, mutations and oncogenic transformation. Mutagenesis. 1992;7(1):3–12. 292. Krokan HE, Bjoras M. Base excision repair. Cold Spring Harb Perspect Biol. 2013;5(4):a012583. 293. Bolivar F, Rodriguez RL, Greene PJ, Betlach MC, Heyneker HL, Boyer HW, et al. Construction and characterization of new cloning vehicles. II. A multipurpose cloning system. Gene. 1977;2(2):95–113. 294. Ester M, Kriegel H-PP, Sander JJ, Xu X. 
A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. Proc 2nd Int Conf Knowl Discov Data Min [Internet]. 1996;226–31. Available from: http://dl.acm.org/citation.cfm?id=3001460.3001507 295. Barnard S, Bouffler S, Rothkamm K. The shape of the radiation dose response for DNA double-strand break induction and repair. Genome Integr [Internet]. 2013;4(1):1. Available from: http://genomeintegrity.biomedcentral.com/articles/10.1186/2041-9414-4-1 296. Souici M, Khalil TT, Muller D, Raffy Q, Barillon R, Belafrites A, et al. Single- and double-strand breaks of dry DNA exposed to protons at bragg-peak

220 energies. J Phys Chem B. 2017;121(3):497–507. 297. Urushibara A, Shikazono N, O’Neill P, Fujii K, Wada S, Yokoya A. LET dependence of the yield of single-, double-strand breaks and base lesions in fully hydrated plasmid DNA films by 4He(2+) ion irradiation. Int J Radiat Biol [Internet]. 2008 Jan 3;84(1):23–33. Available from: http://www.tandfonline.com/doi/full/10.1080/09553000701616072 298. Ushigome T, Shikazono N, Fujii K, Watanabe R, Suzuki M, Tsuruoka C, et al. Yield of Single- and Double-Strand Breaks and Nucleobase Lesions in Fully Hydrated Plasmid DNA Films Irradiated with High-LET Charged Particles. Radiat Res [Internet]. 2012 May;177(5):614–27. Available from: http://www.bioone.org/doi/10.1667/RR2701.1 299. Villagrasa C, Meylan S, Gonon G, Gruel G, Giesen U, Bueno M, et al. Geant4-DNA simulation of DNA damage caused by direct and indirect radiation effects and comparison with biological data. EPJ Web Conf. 2017;153:1–6. 300. Abolfath RM, Carlson DJ, Chen ZJ, Nath R. A molecular dynamics simulation of DNA damage induction by ionizing radiation. Phys Med Biol. 2013;58(20):7143–57. 301. Cornforth MN, Bedford JS. Ionizing radiation damage and its early development in chromosomes. In: Lett JT, Sinclair WK, editors. Advances in radiation biology [Internet]. San Diego: Academic; 1993. p. 423–96. Available from: http://cat.inist.fr/?aModele=afficheN&cpsidt=4171909 302. Kundrát P, Stewart RD. On the biophysical interpretation of lethal DNA lesions induced by ionising radiation. Radiat Prot Dosimetry. 2006;122(1– 4):169–72. 303. Hall EJ, Giaccia AJ. Radiobiology for the Radiologist. Lippincott Williams & Wilkins; 2006. 304. Baiocco G, Barbieri S, Babini G, Morini J, Alloni D, Friedland W, et al. The origin of neutron biological effectiveness as a function of energy. Sci Rep [Internet]. 2016;6:34033. Available from: http://www.nature.com/articles/srep34033 305. Rydberg B, Cooper B, Cooper PK, Holley WR, Chatterjee A. 
Dose- dependent misrejoining of radiation-induced DNA double-strand breaks in human fibroblasts: experimental and theoretical study for high- and low-LET radiation. Radiat Res. 2005;163(5):526–34. 306. Friedrich T, Scholz U, Elsässer T, Durante M, Scholz M. Calculation of the biological effects of ion beams based on the microscopic spatial damage distribution pattern. Int J Radiat Biol. 2012;88(1–2):103–7. 307. Sachs RKK, Brenner DJJ. Effect of LET on Chromosomal Aberration Yields. I. Do Long-lived, Exchange-prone Double Strand Breaks Play a

221 Role? Int J Radiat Biol [Internet]. 1993 Jan 1;64(6):677–88. Available from: http://dx.doi.org/10.1080/09553009314551921 308. Cornforth MN. Testing the Notion of the One-Hit Exchange. Radiat Res [Internet]. 1990 Jan;121(1):21. Available from: http://www.jstor.org/stable/3577559?origin=crossref 309. Roukos V, Voss TC, Schmidt CK, Lee S, Wangsa D, Misteli T. Spatial dynamics of chromosome translocations in living cells. Science (80- ) [Internet]. 2013;341(6146):660–4. Available from: http://www.ncbi.nlm.nih.gov/pubmed/23929981 310. Holley WR, Mian IS, Park SJ, Rydberg B, Chatterjee A. A Model for Interphase Chromosomes and Evaluation of Radiation-Induced Aberrations. Radiat Res [Internet]. 2002;158(5):568–80. Available from: http://www.ncbi.nlm.nih.gov/pubmed/12385634%5Cnhttp://www.bioone.org/ doi/abs/10.1667/0033-7587(2002)158[0568:AMFICA]2.0.CO;2 311. Edwards AA, Moiseenko V V., Nikjoo H. On the mechanism of the formation of chromosomal aberrations by ionising radiation. Radiat Environ Biophys. 1996;35(1):25–30. 312. Chen AM, Lucas JN, Hill FS, Brenner DJ, Sachs RK. Proximity effects for chromosome aberrations measured by FISH. Int J Radiat Biol [Internet]. 1996 Jan 1;69(4):411–20. Available from: http://dx.doi.org/10.1080/095530096145706 313. Friedland W, Kundrát P. Track structure based modelling of chromosome aberrations after photon and alpha-particle irradiation. Mutat Res - Genet Toxicol Environ Mutagen [Internet]. 2013;756(1–2):213–23. Available from: http://dx.doi.org/10.1016/j.mrgentox.2013.06.013 314. Soutoglou E, Dorn JF, Sengupta K, Jasin M, Nussenzweig A, Ried T, et al. Positional stability of single double-strand breaks in mammalian cells. Nat Cell Biol [Internet]. 2007;9(6):675–82. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2442898&tool=p mcentrez&rendertype=abstract 315. Girst S, Hable V, Drexler GA, Greubel C, Siebenwirth C, Haum M, et al. 
Subdiffusion Supports Joining Of Correct Ends During Repair Of DNA Double-Strand Breaks. Sci Rep [Internet]. 2013;3:1–6. Available from: http://www.nature.com/articles/srep02511 316. Iarovaia O V, Rubtsov M, Ioudinkova E, Tsfasman T, Razin S V, Vassetzky YS. Dynamics of double strand breaks and chromosomal translocations. Mol Cancer [Internet]. 2014;13:249. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=4289179&tool=p mcentrez&rendertype=abstract 317. Aten JA, Stap J, Krawczyk PM, van Oven CH, Hoebe RA, Essers J, et al.

222 Dynamics of DNA Double-Strand Breaks Revealed by Clustering of Damaged Chromosome Domains. Science (80- ) [Internet]. 2004 Jan 2;303(5654):92 LP-95. Available from: http://science.sciencemag.org/content/303/5654/92.abstract 318. Jakob B, Splinter J, Durante M, Taucher-Scholz G. Live cell microscopy analysis of radiation-induced DNA double-strand break motion. Proc Natl Acad Sci [Internet]. 2009 Mar 3;106(9):3172–7. Available from: http://www.pnas.org/content/106/9/3172.abstract 319. Averbeck NB, Topsch J, Scholz M, Kraft-Weyrather W, Durante M, Taucher-Scholz G. Efficient Rejoining of DNA Double-Strand Breaks despite Increased Cell-Killing Effectiveness following Spread-Out Bragg Peak Carbon-Ion Irradiation. Front Oncol [Internet]. 2016;6(February):1–8. Available from: http://journal.frontiersin.org/Article/10.3389/fonc.2016.00028/abstract 320. Hill MA. Fishing for radiation quality: Chromosome aberrations and the role of radiation track structure. Radiat Prot Dosimetry. 2015;166(1–4):295–301. 321. Rosenbluth MJ, Lam WA, Fletcher DA. Force microscopy of nonadherent cells: a comparison of leukemia cell deformability. Biophys J [Internet]. 2006;90(8):2994–3003. Available from: http://www.cell.com/article/S0006349506724808/fulltext 322. Metzler R, Klafter J. The random walk’s guide to anomalous diffusion: a fractional dynamics approach. Phys Rep [Internet]. 2000 Dec [cited 2017 Jun 12];339(1):1–77. Available from: http://www.sciencedirect.com/science/article/pii/S0370157300000703 323. Yang K, Guo R, Xu D. Non-homologous end joining: advances and frontiers. Acta Biochim Biophys Sin (Shanghai) [Internet]. 2016 Jul 1;48(7):632–40. Available from: http://dx.doi.org/10.1093/abbs/gmw046 324. Lee K-J, Saha J, Sun J, Fattah KR, Wang S-C, Jakob B, et al. Phosphorylation of Ku dictates DNA double-strand break (DSB) repair pathway choice in S phase. Nucleic Acids Res [Internet]. 2016 Feb 29;44(4):1732–45. Available from: http://dx.doi.org/10.1093/nar/gkv1499 325. 
Uematsu N, Weterings E, Yano KI, Morotomi-Yano K, Jakob B, Taucher- Scholz G, et al. Autophosphorylation of DNA-PKCS regulates its dynamics at DNA double-strand breaks. J Cell Biol. 2007;177(2):219–29. 326. Graham TGWGW, Walter JCC, Loparo JJJ. Two-Stage Synapsis of DNA Ends during Non-homologous End Joining. Mol Cell [Internet]. 2017 Jun 12;61(6):850–8. Available from: http://dx.doi.org/10.1016/j.molcel.2016.02.010 327. Sun J, Lee KJ, Davis AJ, Chen DJ. Human Ku70/80 protein blocks exonuclease 1-mediated DNA resection in the presence of human Mre11 or

223 Mre11/Rad50 protein complex. J Biol Chem. 2012;287(7):4936–45. 328. Walker JR, Corpina RA, Goldberg J. Structure of the Ku heterodimer bound to DNA and its implications for double-strand break repair. Nature [Internet]. 2001;412(6847):607–14. Available from: http://www.ncbi.nlm.nih.gov/pubmed/11493912 329. Dynan WS, Yoo S. Interaction of Ku protein and DNA-dependent protein kinase catalytic subunit with nucleic acids. Nucleic Acids Res. 1998;26(7):1551–9. 330. Mimori T, Hardin JA, Steitz JA. Characterization of the DNA-binding protein antigen Ku recognized by autoantibodies from patients with rheumatic disorders. J Biol Chem [Internet]. 1986 Feb 15;261(5):2274–8. Available from: http://www.jbc.org/content/261/5/2274.abstract 331. Bernal M a, deAlmeida CE, Sampaio C, Incerti S, Champion C, Nieminen P. The invariance of the total direct DNA strand break yield. Med Phys. 2011;38(7):4147–53. 332. Chaudhary P, Marshall TI, Currell FJ, Kacperek A, Schettino G, Prise KM. Variations in the Processing of DNA Double-Strand Breaks Along 60-MeV Therapeutic Proton Beams. Int J Radiat Oncol Biol Phys [Internet]. 2016;95(1):86–94. Available from: http://dx.doi.org/10.1016/j.ijrobp.2015.07.2279 333. Li Y, Reynolds P, O’Neill P, Cucinotta FA. Modeling damage complexity- dependent non-homologous end-joining repair pathway. PLoS One [Internet]. 2014 Feb 10;9(2):e85816. Available from: https://doi.org/10.1371/journal.pone.0085816 334. Lucas JS, Zhang Y, Dudko OK, Murre C. 3D trajectories adopted by coding and regulatory DNA elements: First-passage times for genomic interactions. Cell [Internet]. 2014;158(2):339–52. Available from: http://dx.doi.org/10.1016/j.cell.2014.05.036 335. Lieber MR. The mechanism of double-strand DNA break repair by the nonhomologous DNA end-joining pathway. Annu Rev Biochem [Internet]. 2010;79(D):181–211. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3079308&tool=p mcentrez&rendertype=abstract 336. Hoeijmakers JH. 
Genome maintenance mechanisms for preventing cancer. Nature [Internet]. 2001 May 17;411(6835):366–74. Available from: http://www.nature.com/doifinder/10.1038/35077232 337. Ciccia A, Elledge SJ. The DNA Damage Response: Making It Safe to Play with Knives. Mol Cell [Internet]. 2010 Oct;40(2):179–204. Available from: http://linkinghub.elsevier.com/retrieve/pii/S1097276510007471 338. Anglada T, Terradas M, Hernández L, Genescà A, Martín M. Analysis of

224 Residual DSBs in Ataxia-Telangiectasia Lymphoblast Cells Initiating Apoptosis. Biomed Res Int [Internet]. 2016;2016:1–12. Available from: http://www.hindawi.com/journals/bmri/2016/8279560/ 339. Carrano AV. Chromosome aberrations and radiation-induced cell death. Mutat Res Mol Mech Mutagen [Internet]. 1973 Mar;17(3):341–53. Available from: http://linkinghub.elsevier.com/retrieve/pii/0027510773900067 340. Braselmann H, Bauchinger M, Schmid E. Cell survival and radiation induced chromosome aberrations. Radiat Environ Biophys [Internet]. 1986;25(4):243–51. Available from: http://www.springerlink.com/index/10.1007/BF01214637 341. van der Kogel A. Basic Clinical Radiobiology Fourth Edition [Internet]. Basic Clinical Radiobiology. CRC Press; 2009. 169-190 p. Available from: http://www.lavoisier.fr/livre/notice.asp?id=O2KWO6A36XXOWN 342. Belli M, Cherubini R, Finotto S, Moschini G, Sapora O, Simone G, et al. RBE-LET Relationship for the Survival of V79 Cells Irradiated with Low Energy Protons. Int J Radiat Biol [Internet]. 1989 Jan 3;55(1):93–104. Available from: http://www.tandfonline.com/doi/full/10.1080/09553008914550101 343. Gerelchuluun A, Zhu J, Su F, Asaithamby A, Chen DJ, Tsuboi K. Homologous recombination pathway may play a major role in high-LET radiation-induced DNA double-strand break repair. J Radiat Res [Internet]. 2014 Mar 1;55(suppl 1):i83–4. Available from: https://academic.oup.com/jrr/article-lookup/doi/10.1093/jrr/rrt181 344. Lorat Y, Brunner CU, Schanz S, Jakob B, Taucher-Scholz G, Rübe CE. Nanoscale analysis of clustered DNA damage after high-LET irradiation by quantitative electron microscopy – The heavy burden to repair. DNA Repair (Amst) [Internet]. 2015 Apr;28:93–106. Available from: http://linkinghub.elsevier.com/retrieve/pii/S1568786415000191 345. Bryant HE, Schultz N, Thomas HD, Parker KM, Flower D, Lopez E, et al. Specific killing of BRCA2-deficient tumours with inhibitors of poly(ADP- ribose) polymerase. Nature [Internet]. 
2005 Apr 14;434(7035):913–7. Available from: http://www.nature.com/articles/nature03443 346. Farmer H, McCabe N, Lord CJ, Tutt ANJ, Johnson DA, Richardson TB, et al. Targeting the DNA repair defect in BRCA mutant cells as a therapeutic strategy. Nature [Internet]. 2005 Apr 14;434(7035):917–21. Available from: http://www.nature.com/doifinder/10.1038/nature03445 347. Gudmundsdottir K, Ashworth A. The roles of BRCA1 and BRCA2 and associated proteins in the maintenance of genomic stability. Oncogene [Internet]. 2006 Sep 25;25(43):5864–74. Available from: http://www.nature.com/articles/1209874

225 348. Pierce AJ, Jasin M. NHEJ Deficiency and Disease. Mol Cell [Internet]. 2001 Dec;8(6):1160–1. Available from: http://linkinghub.elsevier.com/retrieve/pii/S1097276501004245 349. Belov O V., Krasavin EA, Lyashko MS, Batmunkh M, Sweilam NH. A quantitative model of the major pathways for radiation-induced DNA double- strand break repair. J Theor Biol [Internet]. 2014;366:115–30. Available from: http://dx.doi.org/10.1016/j.jtbi.2014.09.024 350. Dolan D, Nelson G, Zupanic A, Smith G, Shanley D. Systems Modelling of NHEJ Reveals the Importance of Redox Regulation of Ku70/80 in the Dynamics of DNA Damage Foci. Samuels DC, editor. PLoS One [Internet]. 2013 Feb 6;8(2):e55190. Available from: http://dx.plos.org/10.1371/journal.pone.0055190 351. Dolan DWP, Zupanic A, Nelson G, Hall P, Miwa S, Kirkwood TBL, et al. Integrated Stochastic Model of DNA Damage Repair by Non-homologous End Joining and p53/p21- Mediated Early Senescence Signalling. Xia Y, editor. PLOS Comput Biol [Internet]. 2015 May 28;11(5):e1004246. Available from: http://dx.plos.org/10.1371/journal.pcbi.1004246 352. Butts JJ, Katz R. Theory of RBE for Heavy Ion Bombardment of Dry Enzymes and Viruses. Radiat Res [Internet]. 1967 Apr;30(4):855. Available from: http://www.jstor.org/stable/3572151?origin=crossref 353. Katz R, Ackerson B, Homayoonfar M, Sharma SC. Inactivation of Cells by Heavy Ion Bombardment. Radiat Res [Internet]. 1971 Aug;47(2):402. Available from: http://www.jstor.org/stable/3573247?origin=crossref 354. Katz R, Sharma SC. Response of cells to fast neutrons, stopped pions, and heavy ion beams. Nucl Instruments Methods [Internet]. 1973 Aug;111(1):93–116. Available from: http://linkinghub.elsevier.com/retrieve/pii/0029554X73901018 355. Katz R, Sharma SC. Heavy particles in therapy: an application of track theory. Phys Med Biol [Internet]. 1974 Jul 1;19(4):1. Available from: http://stacks.iop.org/0031- 9155/19/i=4/a=001?key=crossref.5d2e7487df828cadeed3586d88d35fb2 356. K. Sachs, D. Levy, A. M. 
Chen, E. A R. Random breakage and reunion chromosome aberration formation model; an interaction–distance version based on chromatin geometry. Int J Radiat Biol [Internet]. 2000 Jan 3;76(12):1579–88. Available from: http://www.tandfonline.com/doi/full/10.1080/09553000050201064 357. Takahashi A, Yamakawa N, Kirita T, Omori K, Ishioka N, Furusawa Y, et al. DNA damage recognition proteins localize along heavy ion induced tracks in the cell nucleus. J Radiat Res [Internet]. 2008 Nov;49(6):645–52. Available from: https://academic.oup.com/jrr/article-

226 lookup/doi/10.1269/jrr.08007 358. Ballarini F, Ottolenghi A. Models of chromosome aberration induction: An example based on radiation track structure. Cytogenet Genome Res. 2004;104(1–4):149–56. 359. Scholz M, Kraft G. Calculation of heavy ion inactivation probabilities based on track structure, x-ray sensitivity and target size. 1993; 360. Friedland W, Paretzke HG, Ballarini F, Ottolenghi A, Kreth G, Cremer C. First steps towards systems radiation biology studies concerned with DNA and chromosome structure within living cells. Radiat Environ Biophys. 2008;47(1):49–61. 361. Friedland W, Bernhardt P, Jacob P, Paretzke HG, Dingfelder M. Simulation of DNA damage after proton and low LET irradiation. Radiat Prot Dosimetry [Internet]. 2002;99(1–4):99–102. Available from: http://www.ncbi.nlm.nih.gov/pubmed/12194370 362. Mine-Hattab J, Recamier V, Izeddin I, Rothstein R, Darzacq X. Fast imaging of DNA motion reveals distinct sub-diffusion regimes at the site of DNA damage [Internet]. bioRxiv. Cold Spring Harbor Labs Journals; 2016. Available from: http://www.biorxiv.org/content/early/2016/03/01/042051.abstract 363. Andrin C, McDonald D, Attwood KM, Rodrigue A, Ghosh S, Mirzayans R, et al. A requirement for polymerized actin in DNA double-strand break repair. Nucleus [Internet]. 2012 Jul 28;3(4):384–95. Available from: http://www.tandfonline.com/doi/abs/10.4161/nucl.21055 364. Hartlerode AJ, Morgan MJ, Wu Y, Buis J, Ferguson DO. Recruitment and activation of the ATM kinase in the absence of DNA-damage sensors. Nat Struct Mol Biol. 2015;22(9):736–43. 365. Li AYJ, Boo LM, Wang SY, Lin HH, Wang CCC, Yen Y, et al. Suppression of nonhomologous end joining repair by overexpression of HMGA2. Cancer Res. 2009;69(14):5699–706. 366. Davis AJ, So S, Chen DJ. Dynamics of the PI3K-like protein kinase members ATM and DNA-PKcs at DNA double strand breaks. Cell Cycle [Internet]. 2010 Jul 28;9(13):2529–36. Available from: http://www.tandfonline.com/doi/abs/10.4161/cc.9.13.12148 367. 
Weterings E, Verkaik NS, Keijzers G, Florea BI, Wang S-Y, Ortega LG, et al. The Ku80 Carboxy Terminus Stimulates Joining and Artemis-Mediated Processing of DNA Ends. Mol Cell Biol [Internet]. 2009 Mar 1;29(5):1134– 42. Available from: http://mcb.asm.org/cgi/doi/10.1128/MCB.00971-08 368. Hammel M, Yu Y, Mahaney BL, Cai B, Ye R, Phipps BM, et al. Ku and DNA-dependent protein kinase dynamic conformations and assembly regulate DNA binding and the initial non-homologous end joining complex. J

227 Biol Chem. 2010;285(2):1414–23. 369. Gruel G, Villagrasa C, Voisin P, Clairand I, Benderitter M, Bottollier-Depois J-F, et al. Cell to Cell Variability of Radiation-Induced Foci: Relation between Observed Damage and Energy Deposition. Amendola R, editor. PLoS One [Internet]. 2016 Jan 4;11(1):e0145786. Available from: http://dx.plos.org/10.1371/journal.pone.0145786 370. Chang HHY, Watanabe G, Gerodimos CA, Ochi T, Blundell TL, Jackson SP, et al. Different DNA end configurations dictate which NHEJ components are most important for joining efficiency. J Biol Chem. 2016;291(47):24377–89.

A. Appendix 1

A1.1 Field specific nomenclature

This multi-disciplinary work covers a broad range of topics. To aid the reader, we have included a table defining some terms that may be unfamiliar to some readers.

Term | Definition
Primary particle | The initial incident ion that traverses the target.
Linear Energy Transfer (LET) | The rate of energy deposition per unit length, given in units of keV/µm.
Track averaged LET | The LET measured and averaged across a number of mono-energetic ions.
Radiation quality | The collected properties of the radiation used, including parameters such as LET, species, etc.
Nanodosimetry | A dosimetric technique that measures energy depositions on a similar length scale to the structure of DNA, i.e. at the nano-scale.
Pristine Bragg peak | The profile of energy vs position for a mono-energetic fixed beam of particles.
Spread Out Bragg Peak (SOBP) | A series of pristine Bragg peaks, at different energies, that are weighted in order to produce a smooth plateau of high dose across a target.
Fractionation | A dose of radiation is often prescribed in fractions in order to give healthy tissue a chance to repair. A typical course of radiotherapy will normally be 40-60 Gy given in doses of 1.5-2 Gy per day.
Hypo/Hyper fractionation | Hypo fractionation uses fewer exposures of higher dose, while hyper fractionation uses more exposures, each of a lower dose (although more than one dose may be delivered on the same day).
Direct/Indirect damage | Direct damage refers to DNA damage caused by the physical processes of the interaction of the beam. Indirect damage results from indirect or secondary interaction with the beam, i.e. when a free radical formed by the beam attacks the DNA.
Lesion | A site where the DNA structure is damaged.
Aberration | Repair of damaged DNA in which incorrect partners have been joined, resulting in a mix-up of the genetic code.
Cluster Density | A measure of the average local DSB density in a cell nucleus.
alpha/beta | A measure that gives an indication of how sensitive a cell is to radiation. This ratio is derived from the linear quadratic model of cell survival.
Synapsis | The joining together of two exposed ends of the DNA helix.
V(D)J recombination | A process in the immune system where double strand breaks are purposefully induced and fixed in different combinations to increase the diversity of antibodies etc.
intra/inter track | Intra track refers to events created along a single ion track. Inter track refers to events from different ion tracks.
Monte Carlo | The method of simulation through repeated random sampling.
Sub-diffusion | Diffusion where the mean squared displacement does not scale linearly with time, resulting in the object being more spatially confined.
Fractional Langevin Motion | A form of sub-diffusive motion where the object is confined by a visco-elastic boundary.

Table A1.1 Definitions of some field specific terms.

A1.2 LET and Dose Calculation

The track averaged linear energy transfer (LETt) of the particle is determined through a separate Geant4-DNA simulation. A particle of a given energy is simulated crossing a water box of side length 10 µm, corresponding to the distance travelled across the cell nucleus and cytoplasm. Upon entering the box, the particle's coordinates and energy are recorded. The position and energy are then recorded for the particle's final step within the volume. The LETt is calculated as the change in energy divided by the distance travelled (assuming a straight path between the first and final step). Secondary electrons are “killed” within the simulation; this assumes that all secondary energy is deposited locally (unrestricted LET). The process is repeated for 50,000 primaries. The calculated LETt for each of the 50,000 primaries forms an LETt distribution, shown for some of the proton, alpha, and carbon-12 (6+) energies as the normalised distribution:


Figure A1.1 The probability distribution functions for the LET of mono-energetic protons, alphas, and carbon ions. Higher energy particles have lower LET and vice versa.

The average LETt is calculated for each of the primary energies, with the standard error in the mean taken as the uncertainty. This gives the track averaged LETt across the irradiation volume. This is shown for protons, alphas, and carbons for the energy range used in this work:

Figure A1.2 The average LET of the mono energetic ions simulated across a 10 µm thick water phantom.
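The per-primary LETt calculation and its averaging can be illustrated with a minimal Python sketch. This is not the Geant4-DNA scoring code used in this work; the function name `track_averaged_let`, the tuple layout, and the example values are assumptions made purely for illustration.

```python
import math
import statistics

def track_averaged_let(entries, exits):
    """Hypothetical helper: entries/exits hold (x_um, y_um, z_um, energy_keV) per primary.

    Returns (list of per-primary LET in keV/um, mean LET, standard error in the mean).
    """
    lets = []
    for (x0, y0, z0, e0), (x1, y1, z1, e1) in zip(entries, exits):
        # Straight-line distance between the first and final step in the volume.
        path_um = math.dist((x0, y0, z0), (x1, y1, z1))
        # LET = energy lost by the primary / distance travelled.
        lets.append((e0 - e1) / path_um)
    mean_let = statistics.fmean(lets)
    # Standard error in the mean, taken as the uncertainty on the average LET.
    sem = statistics.stdev(lets) / math.sqrt(len(lets))
    return lets, mean_let, sem

# Illustrative values only (not thesis data): three primaries crossing 10 um.
entries = [(0.0, 0.0, 0.0, 1000.0)] * 3
exits = [(0.0, 0.0, 10.0, 800.0), (0.0, 0.0, 10.0, 790.0), (0.0, 0.0, 10.0, 810.0)]
lets, mean_let, sem = track_averaged_let(entries, exits)
```

In the simulations described here the same quantities would be accumulated over 50,000 primaries per energy.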

The average energy deposited by a primary, of a given energy, crossing the 10 µm box is also recorded. The deposited energy is converted to a dose by dividing by the mass of the irradiated volume. In this work, a primary starts on a disc of radius equal to the cell nucleus (2.5 µm). This gives a cylindrical irradiation volume, with length of 10 µm (the nucleus and cytoplasm). Within the simulation the volumes are constructed of liquid water, giving an irradiated mass of 1.96E-13 kg. The dose per primary for protons, alphas, and carbons in the energy range used in this work is shown as a function of the primary LETt across the nucleus:

Figure A1.3 The dose delivered to the cell nucleus per primary, showing higher dose depositions from high LET primaries.
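The conversion from deposited energy to dose can be checked with a short sketch. The function names and the 100 keV example deposit are illustrative assumptions; the water density is taken as 1000 kg/m^3, which reproduces the irradiated mass of 1.96E-13 kg quoted above.

```python
import math

KEV_TO_JOULE = 1.602176634e-16  # 1 keV expressed in joules
WATER_DENSITY = 1000.0          # kg/m^3, assumed density of liquid water

def cylinder_mass_kg(radius_um, length_um, density=WATER_DENSITY):
    """Mass of a cylindrical irradiation volume (hypothetical helper)."""
    radius_m = radius_um * 1e-6
    length_m = length_um * 1e-6
    return density * math.pi * radius_m ** 2 * length_m

def dose_per_primary_gy(energy_deposited_kev, mass_kg):
    """Dose (Gy = J/kg) from the energy deposited by one primary."""
    return energy_deposited_kev * KEV_TO_JOULE / mass_kg

# Cylinder of radius 2.5 um and length 10 um, as described in the text.
mass = cylinder_mass_kg(radius_um=2.5, length_um=10.0)
# 100 keV is an illustrative energy deposit, not a value from this work.
example_dose = dose_per_primary_gy(100.0, mass)
```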

Using the dose per primary data, the number of primaries required to deliver a given dose is calculated. However, this often results in non-integer values. To overcome this, the irradiation field is overextended. Initially the primary is randomly placed on a disc with radius equal to that of the cell nucleus (2.5 µm). This ensures that every simulated particle crosses the nucleus, making efficient use of computing resources. By increasing the disc radius and the number of primaries, it is possible to control the average number of primaries crossing the nucleus, and therefore to give an average dose to the cylinder containing the nucleus. For example, an irradiation disc of radius r_disc can be set up so that, of the y initial primaries, on average only x fall within the disc corresponding to the nuclear radius, r_nucleus. Here x is equal to the number of primaries required for a given dose. This is shown schematically:


$$r_{disc} = \sqrt{\frac{y}{x}\, r_{nucleus}^{2}}$$

Figure A1.4 Schematic of the irradiation setup. Primary particles originate from a disc (dashed circle) with random coordinates. The radius (r_disc) is changed so that on average x particles originate in the disc covering the nucleus (red circle).

Here y is an integer value representing the number of primaries used, taken as four times the required number of primaries, x, rounded to the nearest integer. This results in a slight under- or over-dosing for a single simulation; however, the effect averages out over multiple simulations.

The ion range is determined across a water phantom, and is shown for some of the primary proton and alpha energies. For the carbon ion energies investigated, the range was greater than the water phantom length (1000 µm).
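The overextended irradiation field described above can be sketched as follows. The function name `irradiation_disc`, its arguments, and the guard against zero primaries are assumptions added for illustration; only the factor of four and the relation between the disc radius, y, and x come from the text.

```python
import math

def irradiation_disc(dose_gy, dose_per_primary_gy, r_nucleus_um=2.5, oversample=4):
    """Return (x, y, r_disc_um) for an overextended irradiation field (hypothetical helper).

    x: primaries required through the nucleus for the dose (may be non-integer)
    y: integer number of primaries actually simulated (oversample * x, rounded)
    r_disc_um: disc radius such that, on average, x of the y primaries cross the nucleus
    """
    x = dose_gy / dose_per_primary_gy
    # Guard (an addition for this sketch): always simulate at least one primary.
    y = max(1, round(oversample * x))
    # The fraction of primaries landing on the nucleus is (r_nucleus / r_disc)^2 = x / y.
    r_disc_um = math.sqrt(y / x) * r_nucleus_um
    return x, y, r_disc_um

# Illustrative numbers only: 1 Gy with 0.08 Gy per primary.
x, y, r_disc = irradiation_disc(dose_gy=1.0, dose_per_primary_gy=0.08)
expected_in_nucleus = y * (2.5 / r_disc) ** 2
```

With these example numbers, 12.5 primaries are required, 50 are simulated, and the disc radius is twice the nuclear radius, so on average 12.5 of the 50 primaries cross the nucleus.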


Figure A1.5 The simulated range distribution of mono-energetic protons and alphas crossing liquid water. Range is measured as the position of the final simulation step made by the ion. All carbon ion energies investigated had a range greater than 1000 µm.

A1.3 Predicted DSB Yield Compared to Literature Data

Figure A1.6 The predicted DSB yield across the LET range. Yields are shown for a range of sensitive nucleus fractions (15%-20%). 15% reproduces the yields reported by Meylan et al.

The DSB yield is simulated with our model, varying the sensitive percentage of the cell nucleus between 15% and 20% and assuming that the genome consists of 6 Gbp. Error bars show the standard error in the mean for LETt (50,000 repeats) and the DSB yield (2,500 repeats). The DSB yields are compared to yields reported by other simulations in the literature(109,168,246,255,263). Selecting 15% of the nucleus as sensitive reproduces the reported DSB yields of Meylan et al.(168). The simulation of Meylan et al. consists of a detailed fibroblast model including DNA damage from both direct and indirect effects, and is able to reproduce experimental yields of DSB induction.

A1.4 Damage Complexity and Misrepair

To investigate the effect of damage complexity on the predicted misrepair, the complexity is forced to be either “simple” or “complex”. For the “simple” case each DSB is placed in the repair simulation as only two damaged backbones. For the “complex” case the most complex break is selected from the damage library. Figure A1.7 shows the effect of “simple” or “complex” breaks on the fraction of misrepaired DSBs. Here, we do not see any significant change in misrepair between the two cases.

Figure A1.7 The predicted DSB misrepair when each DSB is populated by the simple or complex form. Random shows the case of randomly selecting the complexity from the break library. No significant differences are seen between the cases. Error bars show the standard error in the mean.

A1.5 Fit of Cluster Density and Misrepair

Linear fits between the cluster density and the fraction of misrepair are calculated for different radii, R.

Figure A1.8 The average number of neighbouring DSBs calculated for a range of radii against the fraction of misrepair. Linear fits to the complete data set are shown by solid lines.
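As a minimal sketch of the quantity plotted above, the cluster density can be taken as the average number of neighbouring DSBs within a radius R of each DSB. The brute-force function below and its example coordinates are illustrative assumptions, not the implementation used in this work.

```python
import math

def cluster_density(dsb_positions_nm, radius_nm):
    """Average number of other DSBs within radius_nm of each DSB (hypothetical helper)."""
    n = len(dsb_positions_nm)
    if n == 0:
        return 0.0
    total_neighbours = 0
    for i, p in enumerate(dsb_positions_nm):
        for j, q in enumerate(dsb_positions_nm):
            # Count every other DSB that lies within the chosen radius.
            if i != j and math.dist(p, q) <= radius_nm:
                total_neighbours += 1
    return total_neighbours / n

# Illustrative DSB coordinates in nm (not simulation output).
positions = [(0.0, 0.0, 0.0), (50.0, 0.0, 0.0), (500.0, 0.0, 0.0)]
density_70 = cluster_density(positions, 70.0)      # two DSBs are mutual neighbours
density_1000 = cluster_density(positions, 1000.0)  # every DSB sees the other two
```

Repeating the calculation for several values of R gives one cluster density per radius, which can then be fitted against the fraction of misrepair.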

The goodness of the linear fit at each radius is calculated as the Pearson chi-square. The goodness of fit is normalised to the maximum chi-square and shows a minimum at 70 nm, although the chi-square values are similar for radii below 100 nm.


Figure A1.9 The goodness of fit between misrepair and cluster density calculated at a range of different radii. The best fit is found for cluster densities calculated at 70 nm; however, the goodness of fit values are similar for radii below 100 nm.
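The radius selection can be sketched as a least-squares linear fit followed by a Pearson chi-square, normalised to the largest value across radii. The function names and data below are hypothetical, and the sketch assumes that the fitted (expected) values are all positive.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def pearson_chi_square(xs, ys):
    """Sum of (observed - expected)^2 / expected for the linear fit."""
    slope, intercept = linear_fit(xs, ys)
    expected = [slope * x + intercept for x in xs]
    return sum((o - e) ** 2 / e for o, e in zip(ys, expected))

def best_radius(density_by_radius, misrepair):
    """Return (radius with the smallest normalised chi-square, all normalised values)."""
    chi2 = {r: pearson_chi_square(d, misrepair) for r, d in density_by_radius.items()}
    # Normalise to the maximum; fall back to 1.0 if every fit is perfect.
    max_chi2 = max(chi2.values()) or 1.0
    normalised = {r: c / max_chi2 for r, c in chi2.items()}
    return min(normalised, key=normalised.get), normalised

# Illustrative data only: cluster densities at two radii against the misrepair fraction.
misrepair = [0.1, 0.2, 0.3, 0.4]
density_by_radius = {70: [1.0, 2.0, 3.0, 4.0], 200: [1.0, 2.0, 3.0, 5.0]}
radius, normalised = best_radius(density_by_radius, misrepair)
```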

A1.6 Cluster Density and LET

The cluster density as a function of LETt can be described by a second order polynomial. This is shown for protons, alphas, and carbon ions, with the cluster density calculated with a radius of 70 nm:

Figure A1.10 LET and cluster density for protons, alphas, and carbon ions. The cluster density can be approximated from a 2nd order polynomial, shown by solid lines.
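A second order polynomial fit of this kind can be sketched with a small least-squares solver. The function `fit_quadratic` and the example points are assumptions for illustration; they do not reproduce the fitted coefficients of this work.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c; returns (a, b, c)."""
    # Power sums for the normal equations: s[k] = sum(x^k), t[k] = sum(y * x^k).
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix for the unknowns ordered as (c, b, a).
    m = [
        [s[0], s[1], s[2], t[0]],
        [s[1], s[2], s[3], t[1]],
        [s[2], s[3], s[4], t[2]],
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, 3):
            factor = m[row][col] / m[col][col]
            for k in range(col, 4):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    coeffs = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):
        acc = m[row][3] - sum(m[row][k] * coeffs[k] for k in range(row + 1, 3))
        coeffs[row] = acc / m[row][row]
    c, b, a = coeffs
    return a, b, c

# Illustrative points lying exactly on y = 0.5*x^2 + 2*x + 1 (not thesis data).
let_values = [0.0, 1.0, 2.0, 3.0, 4.0]
densities = [0.5 * x ** 2 + 2.0 * x + 1.0 for x in let_values]
a, b, c = fit_quadratic(let_values, densities)
```

Where NumPy is available, `numpy.polyfit(x, y, 2)` performs the same kind of fit in a single call.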

A1.7 Geant4-DNA and the Carbon Ion Discontinuity

The results of this work have highlighted anomalous behaviour in the interactions of carbon-12 (6+) simulated through Geant4-DNA. For the default DNA physics list the only carbon interaction modelled is ionisation, handled by the “G4DNARuddIonisationExtendedModel”. Within this model the kinetic energy transferred from the primary carbon ion to the liberated electron is calculated either relativistically or classically. This change in calculation method becomes apparent when calculating the primary carbon LETt at a high primary energy resolution. However, it is possible to force Geant4-DNA to always use the relativistic calculation, which should remain valid at non-relativistic energies.

Figure A1.11 The carbon ion LET as a function of primary energy, showing a discontinuity between 1191.3 MeV and 1191.6 MeV. The discontinuity is attributed to a change in calculation method, where Geant4 switches between classical and relativistic calculations.

Here we see the change in behaviour occurring between carbon ions at 1191.3 MeV and 1191.6 MeV. Forcing relativistic calculation removes the discontinuity. This has implications for any study using Geant4-DNA to simulate carbon ions at high LETt, or approaching their Bragg peak. The energy transfer currently proposed by Geant4-DNA may be considerably lower than in reality.

A1.8 Simulation Results Compared to Experimental 53BP1 Foci at 24 Hours

The number of 53BP1 foci 24 hours post proton irradiation is extracted from Chaudhary et al. (332) and replotted alongside our prediction for residual DSBs at 24 hours. All points from “This Work” and for “Chaudhary et al. (2016)” are for 1 Gy, except the lowest LET in “Chaudhary et al. (2016)”. The error bars for both data sets show the standard error in the mean.


Figure A1.12 Experimental values of residual DSBs from Chaudhary et al. compared to the predicted yield of residual DSBs following 24 hours of repair.

A1.9 Incorrect Rejoining Time

Our model predicts that misrepair occurs rapidly. This is shown by scoring the times at which misrepair occurs for 1 Gy of protons at various energies. The majority of misrepair occurs before 95 seconds and there is no increase in the frequency of misrepair events after this time.


Figure A1.13 The time at which misrepair occurs for DSBs created by 1 Gy of protons at a range of energies. Misrepair occurs rapidly, with the majority occurring before 95 seconds.

A. Appendix 2

Parameter | Derivation | Source

McMahon Damage Model
DSB yield per Gy per Gbp | Fixed based on experimental reports of DSB yield | McMahon et al. (162)
Mean energy deposit per DSB | Fit to observed proton RBE values | McMahon et al. (163)

McMahon Repair Model
Fast and slow DNA repair rates, fraction of complex DSBs | Fit to panel of DNA repair kinetic studies | McMahon et al. (162)
Misrejoining range | Fit to published rates of misrepair and chromosome aberration formation | McMahon et al. (162)

Henthorn Damage Model
Sensitive fraction of the cell nucleus (15%) | Fitted so that clustering of accepted energy depositions gives the same yield of DSBs as reported in the literature | Meylan et al. (168)
Probability of OH radicals causing a strand break | Probability chosen so that 65% of strand breaks are from indirect effects | Friedland et al. (109)
Probability of direct strand break induction | Linear probability based on energy deposited to the DNA, varying between 5-37.5 eV. Simulated plasmid model compared to experiment | Souici et al. (296), Urushibara et al. (297), Ushigome et al. (298), Vysin et al. (260)

Warmenhoven Repair Model
Ku Inhibition, Release from Inhibition, Ku Recruitment, and DNA-PKcs Recruitment | Fitted to live cell imaging of fluorescent foci formation | Uematsu et al. (325), Andrin et al. (363), Hartlerode et al. (364), Li et al. (365)
Interaction range to form DSB synaptic complexes | Adopted same value as suggested by Friedland et al. | Friedland et al. (174,271)
Rate of Failure of Synaptic Complex, Stabilisation of Synaptic Complex, Clean-up of Complexities at Synaptic Complex | Fitted to live cell imaging of fluorescent foci formation and recovery after photobleaching | Uematsu et al. (325), Davis et al. (366), Weterings et al. (367), Hammel et al. (368)
Final Ligation Steps | Fitted to repair kinetics as measured by fluorescent foci vs. time | Chaudhary et al. (332) Figure 3.
Magnitude of DSB end Mobility | Fitted to repair kinetics as measured by fluorescent foci vs. LET | Chaudhary et al. (332) Figure 6.
Overall Pathway Development | Combined suggested mechanisms. | Friedland et al. (174,271), Uematsu et al. (325), Graham et al. (326), Lee et al. (324), Girst et al. (315), Soutoglou et al. (314)

Table A2.1: The parameters used, and where they were derived from, in all the models investigated in the work of Chapter 6.
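The direct strand break entry in Table A2.1 describes a linear probability over 5-37.5 eV. A minimal sketch of one reading of that rule is given below. The table does not state the end-point values, so the choice of probability 0 at or below 5 eV and 1 at or above 37.5 eV is an assumption, and the function name `direct_break_probability` is hypothetical rather than taken from the thesis code.

```python
def direct_break_probability(energy_ev, lower_ev=5.0, upper_ev=37.5):
    """Probability that an energy deposit in the DNA induces a direct strand break.

    Assumed shape (not confirmed by the table): 0 at or below lower_ev,
    1 at or above upper_ev, and a straight line in between.
    """
    if energy_ev <= lower_ev:
        return 0.0
    if energy_ev >= upper_ev:
        return 1.0
    # Linear interpolation between the two thresholds.
    return (energy_ev - lower_ev) / (upper_ev - lower_ev)
```

Under this assumption, a deposit of 21.25 eV, halfway between the two thresholds, gives a break probability of 0.5.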

| Proton Energy (MeV) | Average LET across target (keV/µm) ± SEM |
|---|---|
| 0.975 | 29.78 ± 0.01 |
| 1.175 | 25.268 ± 0.009 |
| 1.5 | 20.59 ± 0.01 |
| 1.8 | 17.780 ± 0.006 |
| 2.2 | 15.190 ± 0.006 |
| 2.5 | 13.720 ± 0.006 |
| 3.5 | 10.600 ± 0.006 |
| 5.5 | 7.420 ± 0.006 |
| 8.5 | 5.260 ± 0.006 |
| 34 | 1.77 ± 0.06 |

| Photons | LET at target (keV/µm) |
|---|---|
| Co-60 | 0.2 |

Table A2.2: Cell models were irradiated by beams of the above energies for which the corresponding LET was calculated as the energy deposited across a 10 µm water slab.
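The caption says only that LET was calculated as the energy deposited across a 10 µm water slab. The sketch below shows one way such a value and its SEM could be obtained. It assumes that the energy deposited by each primary is divided by the slab thickness and then averaged over primaries. The function name `mean_let_kev_per_um` and the input values in the usage note are hypothetical and are not data from this work.

```python
import math
import statistics


def mean_let_kev_per_um(energy_deposits_kev, slab_thickness_um=10.0):
    """Return (mean LET, SEM) in keV/µm from per-primary energy deposits in keV.

    Assumption: LET per primary = energy deposited in the slab / slab thickness.
    At least two deposits are needed to estimate the SEM.
    """
    lets = [e / slab_thickness_um for e in energy_deposits_kev]
    mean = statistics.fmean(lets)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(lets) / math.sqrt(len(lets))
    return mean, sem
```

For example, two hypothetical primaries depositing 50 keV and 54 keV in the slab would give a mean LET of 5.2 keV/µm with an SEM of 0.2 keV/µm.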


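| Figure | Model | Equation | Parameter | Value |
|---|---|---|---|---|
| 6.4B | McMahon-McMahon | $f(x) = mx + c$ | Gradient | $(9.8 \pm 0.2) \times 10^{-3}$ |
| | | | Intercept | $0$ |
| 6.4B | Henthorn-Warmenhoven | $f(x) = c \cdot e^{ax}$ | Coefficient | $(2.58 \pm 0.08) \times 10^{-2}$ |
| | | | Exponent | $(6.7 \pm 0.1) \times 10^{-2}$ |
| 6.4C | McMahon-McMahon | $f(x) = mx + c$ | Gradient | $(5.73 \pm 0.06) \times 10^{-1}$ |
| | | | Intercept | $(8 \pm 1) \times 10^{-3}$ |
| 6.4C | Henthorn-Warmenhoven | $f(x) = mx + c$ | Gradient | $(5.73 \pm 0.06) \times 10^{-1}$ |
| | | | Intercept | $(8 \pm 36) \times 10^{-4}$ |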

Table A2.3: Equations and parameter values used for the correlations fitted to data in Figure 6.4.
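As an illustration of how the fits in Table A2.3 can be evaluated, the sketch below encodes the central parameter values and omits the quoted uncertainties. The names `FITS` and `evaluate_fit` are hypothetical. The variable `x` stands for whatever quantity is plotted on the horizontal axis of the corresponding panel of Figure 6.4, which this table does not state.

```python
import math

# Central values copied from Table A2.3; uncertainties are not propagated.
FITS = {
    ("6.4B", "McMahon-McMahon"): lambda x: 9.8e-3 * x + 0.0,
    ("6.4B", "Henthorn-Warmenhoven"): lambda x: 2.58e-2 * math.exp(6.7e-2 * x),
    ("6.4C", "McMahon-McMahon"): lambda x: 5.73e-1 * x + 8e-3,
    ("6.4C", "Henthorn-Warmenhoven"): lambda x: 5.73e-1 * x + 8e-4,
}


def evaluate_fit(panel, model, x):
    """Evaluate the fitted correlation for a figure panel and model pairing."""
    return FITS[(panel, model)](x)
```

Keeping the fits in a dictionary keyed by panel and model pairing makes it straightforward to loop over all four correlations when replotting.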
