
Evolution of ?

Understanding Mutation Rate Plasticity

A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Science and Engineering

2018

Huw WE Richards

School of Earth and Environmental Sciences

Faculty of Science and Engineering

Table of Contents

i. Abbreviations
ii. Abstract
iii. Declaration
iv. Acknowledgments
1. Introduction
1.1. General Introduction
1.2. Evolvability
1.2.1. Robustness
1.2.2. Modularity
1.3. Evolution of Mutation Rates
1.3.1. Scaling of Mutation Rates across
1.3.2. Selection on the Mutation Rate
1.3.3. Variation in Mutation Rates
1.3.4. Adaptiveness of Variable Mutation Rates
1.4. Measuring Mutation Rates
1.4.1. Luria-Delbrück Fluctuation assay
1.4.2. Mutation Accumulation
1.4.3. Measuring Environmental Dependence of Mutation Rates
1.4.4. Caveats in using the Fluctuation assay
1.5. Mutation Rate Plasticity
1.5.1. Stress-Induced Mutagenesis
1.5.1.1. SOS Response
1.5.1.2. General Stress Response
1.5.2. Density Associated Mutation Rate Plasticity
1.5.2.1. Quorum Sensing
1.6. Summary of Thesis
1.6.1. Chapter 2: Spontaneous Mutation Rate is a Plastic Trait Associated with Population Density across Domains of Life
1.6.1.1. Author Contributions
1.6.2. Chapter 3: Evolution of Density Associated Mutation Rate Plasticity in two disparate species of Archaea
1.6.2.1. Author Contributions
1.6.3. Chapter 4: Evolution of Density Associated Mutation Rate Plasticity in strains of Escherichia coli
1.6.3.1. Author Contributions
1.6.4. Chapter 5: Highly conserved molecular mechanisms modulate density associated mutation rate plasticity in Escherichia coli
1.6.4.1. Author Contributions
1.6.5. Discussion Chapter
1.6.5.1. Author Contributions
1.7. References
2. Spontaneous Mutation Rate is a Plastic Trait Associated with Population Density across Domains of Life
2.1. Abstract
2.2. Introduction
2.3. Results
2.4. Discussion
2.5. Materials and Methods
2.5.1. Strains of bacteria and yeast used in Chapter 2
2.5.2. Media
2.5.3. Fluctuation tests with bacteria
2.5.4. Fluctuation tests with yeast
2.5.5. Estimation of Mutation Rates
2.5.6. Statistical Analysis
2.5.7. Whole
2.5.8. Published mutation rate search criteria
2.5.9. Phylogeny used in analysing published mutation rates
2.6. References
2.7. Appendix
3. Evolution of Density Associated Mutation Rate Plasticity in two disparate species of Archaea
3.1. Abstract
3.2. Introduction
3.3. Materials and Methods
3.3.1. Strains used in this study
3.3.2. Media
3.3.3. Fluctuation tests in Archaea
3.3.4. Estimation of Mutation Rates
3.3.5. BLAST Analysis of Archaeal genetic and sequences
3.3.6. Statistical Analysis
3.4. Results
3.4.1. DAMP in the Archaea
3.4.2. BLAST analysis of Archaeal and sequences
3.5. Discussion
3.6. References
3.7. Appendix
3.7.1. Alignments from the BLASTp analysis
3.7.1.1. Amino Acid Sequence alignment of Escherichia coli MutT to Sulfolobus acidocaldarius DSM 639
3.7.1.2. Amino Acid Sequence alignment of PCD1 to Haloferax volcanii DS2
3.7.1.3. Amino Acid Sequence alignment of Escherichia coli MutT to Haloferax volcanii DS2
3.7.2. Media Preparation for both Archaea species
3.7.2.1. Sulfolobus acidocaldarius media
3.7.2.2. Haloferax volcanii media
3.7.3. Statistical model outputs
3.7.3.1. Model output for Figure 3.1A
3.7.3.2. Model output for Figure 3.1B
3.7.3.3. Model output for Figure 3.2
4. Evolution of Density Associated Mutation Rate Plasticity within strains of Escherichia coli
4.1. Abstract
4.2. Introduction
4.3. Materials and Methods
4.3.1. Strains used in this study
4.3.2. Media
4.3.3. Fluctuation tests
4.3.3.1. Variation of mutation rate and DAMP in strains of E. coli
4.3.3.2. Interaction between SIM and DAMP in strains of E. coli
4.3.4. Estimation of mutation rates
4.3.5. ECOR phylogeny
4.3.6. Phylogenetic analysis
4.3.7. Statistical analysis
4.4. Results
4.4.1. Mutation rates and DAMP in ECOR collection
4.4.1.1. Variation in both average mutation rate and DAMP in ECOR collection
4.4.1.2. Variation in average mutation rate and DAMP robust to effects of
4.4.1.3. Fitness effects of resistance mutation
4.4.1.4. Variation in average mutation rate and DAMP with weighted median estimate of fitness effects
4.4.2. Phylogenetic analysis of average mutation rate and DAMP in ECOR collection
4.4.3. Interplay between SIM and DAMP in two isolates of E. coli
4.5. Discussion
4.6. References
4.7. Appendix
4.7.1. Statistical model outputs
4.7.1.1. Model output for Figure 4.2A
4.7.1.2. Model output for Figure 4.3A
4.7.1.3. Model output for Figure 4.4A
4.7.1.4. Model output for Figure 4.5A
4.7.1.5. Model output for Figure 4.8
4.7.2. Diagnostic plots of BayesTraits Analysis
4.7.2.1. Diagnostic plots for analysis of average mutation rate not co-estimating fitness effects of the resistance mutation
4.7.2.2. Diagnostic plots for degree of DAMP not co-estimating fitness effects of the resistance mutation
4.7.2.3. Diagnostic plots for analysis of average mutation rate co-estimating fitness effects of the resistance mutation
4.7.2.4. Diagnostic plots for degree of DAMP co-estimating fitness effects of the resistance mutation
4.7.2.5. Diagnostic plots for analysis of the correlation between the average mutation rate and degree of DAMP when not co-estimating the fitness effect of the resistance mutation
4.7.2.6. Diagnostic plots for analysis of the correlation between the average mutation rate and degree of DAMP when not co-estimating the fitness effect of the resistance mutation
4.7.3. Strains of bacteria used in Chapter 4
5. Highly conserved molecular mechanisms modulate density associated mutation rate plasticity in Escherichia coli
5.1. Abstract
5.2. Introduction
5.3. Materials and Methods
5.3.1. Strains of bacteria used in Chapter 5
5.3.2. Media
5.3.3. Fluctuation tests
5.3.3.1. Co- Fluctuation tests
5.3.3.2. Fluctuation tests for monoculture experiments
5.3.4. Estimation of mutation rates
5.3.5. Statistical analysis
5.4. Results
5.4.1. Intercellular control of Mutation Rates in E. coli
5.4.2. Role of other E. coli NUDIX in modulating DAMP
5.4.2.1. Fitness effects in NUDIX knockouts
5.4.3. Mutation rates at different points in the culture cycle
5.4.4. Intracellular nucleotide pool concentrations at different time points and their effect on mutation rate
5.5. Discussion
5.6. References
5.7. Appendix
5.7.1. Statistical model outputs
5.7.1.1. Model output for Figure 5.3A
5.7.1.2. Model output for Figure 5.3B
5.7.1.3. Model output for Figure 5.3C
5.7.1.4. Model output for Figure 5.3D
5.7.1.5. Model output for Figure 5.4
5.7.1.6. Model output for Figure 5.6
5.7.1.7. Model output for Figure 5.7A
5.7.1.8. Model output for Figure 5.8
6. Discussion chapter
6.1. Introduction
6.2. Summary of findings in each experimental chapter
6.2.1. Chapter 2: Spontaneous mutation is a plastic trait associated with population density across domains of life
6.2.2. Chapter 3: Evolution of density associated mutation rate plasticity in two disparate species of Archaea
6.2.3. Chapter 4: Evolution of Density Associated Mutation Rate Plasticity in strains of Escherichia coli
6.2.4. Chapter 5: Highly conserved molecular mechanisms modulate density associated mutation rate plasticity in Escherichia coli
6.2.5. Considerations on DAMP and fluctuation tests
6.3. Discussion of results and suggestions for future work
6.3.1. Further investigation of DAMP at evolutionary scales
6.3.2. Investigation of DAMP across physiological scales
6.3.3. Molecular mechanisms controlling DAMP in Archaea
6.3.4. Social aspects involved in modulating DAMP
6.3.5. Adaptiveness of DAMP
6.3.6. DAMP and the growth cycle
6.3.7. Single investigation into DAMP
6.4. Conclusions
6.5. References

Word Count – 45,765

List of Figures

Figure 1.1 Effect of Increased Robustness on Adaptation
Figure 1.2 Depiction of Modular
Figure 1.3 Drift-barrier hypothesis on the evolution of mutation rate
Figure 1.4 Indirect selection on the mutation rate
Figure 1.5 Species specific mutation rates
Figure 1.6 Luria-Delbrück Fluctuation test
Figure 1.7 Environmental adjustment of mutation rate in response to stress
Figure 1.8 SOS Response Mechanism
Figure 1.9 Mutation Rate Plasticity in Escherichia coli
Figure 1.10 Modulation of Mutation Rate depends upon environmental cues associated with luxS in Escherichia coli
Figure 2.1 Mutation rates published from 1943 to 2017 in relation to final population density
Figure 2.2 Slope values for species included in published mutation rate analysis
Figure 2.3 Density associated mutation rate plasticity (DAMP) in bacteria and yeast, with population density calculated from luminescence and cell counts respectively
Figure 2.4 Density associated mutation rate plasticity (DAMP) in bacteria and yeast, with population density calculated from colony forming units (CFU)
Figure 2.5 Effect of Fitness differences between resistant and non-resistant strains estimated from slopes in Figure 2.3
Figure 2.6 Relative fitness of rifampicin resistant mutants of Escherichia coli REL606 (mutant A) and REL607 (mutant B) at different population densities
Figure 2.7 All data from Figure 2.3 overlaid on published data in Figure 2.1
Figure 2.8 DAMP in strains of Escherichia coli with deficiencies in various DNA repair and other systems
Figure 2.9 DAMP in cells lacking mutation avoidance or correction genes in E. coli and S. cerevisiae, with population density calculated from luminescence and colony forming units respectively
Figure 2.10 DAMP in cells deficient in mutation avoidance or correction genes in E. coli and S. cerevisiae, with population density calculated from colony forming units and cell counts respectively
Figure 2.11 Mutation rate in relation to Ne for all genotypes tested
Figure 2.12 Number of mutational events m per space and time in response to final population size Nt for all genotypes tested
Figure 2.13 Density-associated mutation-rate plasticity (DAMP) in Vesicular stomatitis virus hosted by different cell lines
Figure 2.14 Phylogeny used in analysing published mutation rates
Figure 2.15 Calibration curves for final population density measured by counting colony forming units (CFU), against luminescence, assayed with the BacTiter-Glo assay (arbitrary units - AU)
Figure 3.1 Density Associated Mutation Rate Plasticity in Archaea
Figure 3.2 Density Associated Mutation Rate Plasticity in Sulfolobus acidocaldarius with population density estimated via an ATP-based assay
Figure 4.1 Calibration curves for final population density measured by counting colony forming units (CFU), against luminescence, assayed with the BacTiter-Glo assay (LUM)
Figure 4.2 Variation of Average Mutation rate and DAMP in Escherichia coli strains with population density estimated via an ATP-based assay and mutation rate estimated without the fitness effect of mutation
Figure 4.3 Variation of Average Mutation rate and DAMP in Escherichia coli strains with population density estimated via direct CFU counts and mutation rate estimated without the fitness effect of mutation
Figure 4.4 Variation of Average Mutation rate and DAMP in Escherichia coli strains with population density estimated via an ATP-based assay and mutation rate co-estimated with the fitness effect of mutation
Figure 4.5 Variation of Average Mutation rate and DAMP in Escherichia coli strains with population density estimated via direct CFU counts and mutation rate co-estimated with the fitness effect of mutation
Figure 4.6 Average fitness effects of the resistance mutation to rifampicin
Figure 4.7 Association between final population density and fitness of resistance mutation to rifampicin
Figure 4.8 Average Mutation rate and DAMP in Escherichia coli strains with population density estimated via an ATP-based assay and mutation rate estimated with a weighted median of fitness cost of mutation
Figure 4.9 Average mutation rate plotted onto the phylogeny of Escherichia coli strains analysed for phylogenetic signal
Figure 4.10 Degree of DAMP plotted onto the phylogeny of Escherichia coli strains analysed for phylogenetic signal
Figure 4.11 Correlation between the average mutation rate and degree of DAMP for strains of Escherichia coli
Figure 4.12 Interaction between SIM and DAMP across a nutrient gradient in two isolates from the ECOR collection
Figure 5.1 Mutation rate in both wild-type and ΔlsrK Escherichia coli in presence of different knockout strains
Figure 5.2 Fitness effects of resistance in both wild-type and ΔlsrK Escherichia coli when grown in different co-
Figure 5.3 Mutation rates in Escherichia coli strains deficient in various house-cleaning genes
Figure 5.4 Calibration curves for final population density measured by counting colony forming units (CFU), against luminescence, assayed with the BacTiter-Glo assay (LUM)
Figure 5.5 Fitness effects of resistance mutations in both Escherichia coli wild-type and strains deficient in house-cleaning genes
Figure 5.6 Interaction between fitness effects of the resistance mutation and the population density in both Escherichia coli wild-type and strains deficient in house-cleaning genes
Figure 5.7 Changes in DAMP, number of mutational events and population density at different time points in the growth cycle
Figure 5.8 Relationship between intracellular ATP levels and mutation rate at different points in the growth curve

List of Tables

Table 2.1 Strains of bacteria and yeast used in Chapter 2
Table 2.2 Breseq analysis of mutations identified in genome sequence for two ΔmutT Keio strains
Table 2.3 List of papers from which mutation rate estimates are taken for analysis of published mutation rates in Figure 2.1
Table 5.1 Strains of bacteria used in Chapter 5

Abbreviations

CC – Cell Counts

CFU – Colony Forming Units

D – Final population density: the estimated number of cells per ml at the end of the culture period

DAMP – Density-associated mutation-rate plasticity

LUM – Luminescence

m – Number of mutational events

MMR – Methyl-directed DNA mismatch repair

N0 – The initial population size of cells.

Nt – The population size at the end of the culture period

Ne – The effective population size

QS – Quorum sensing

SIM – Stress-Induced Mutagenesis

Abstract

Mutation rates are crucial to an . By creating variation within an organism’s genome, they produce the variation needed by natural selection. Therefore, the rate at which mutations arise could affect the rate at which an organism adapts and evolves. However, since most mutations are deleterious to an organism’s fitness, the mutation rate should be minimised as far as possible. This occurs in a wide range of species spanning the where mutation rate associates inversely with the organism’s effective population size, which equates to the balance between the powers of selection and genetic drift. Such mutation rates are variable: rather than having evolved to a constant value, they vary depending upon the environment. Through the work in this thesis I answer questions regarding the evolution and mechanistic control of this environmental mutation rate plasticity with regard to the population density of a culture. Mutation rates have previously been shown to have an inverse association with population density at a locus in one bacterium. In the first experimental chapter, this association is found within the last 75 years of published literature and is then shown empirically to be present, but variable, in both pro- and eukaryotes. Intriguingly, this negative density associated mutation rate plasticity (DAMP) requires, in both domains, the same intracellular housekeeping Nudix hydrolase protein, which hydrolyses the mutagenic nucleotide 8-oxo-dGTP. I extend this work further in the second experimental chapter, discovering that DAMP is present within two species of the final domain of life to be empirically tested, the Archaea, showing DAMP’s evolution at broad evolutionary scales. DAMP is then shown to also evolve at a fine evolutionary scale in the third experimental chapter. Between strains of the bacterium E. coli, both the average mutation rate and the degree of DAMP exhibited have evolved. Furthermore, there is evidence for both a phylogenetic signal of DAMP and an association of its degree with the average mutation rate.
In the final experimental chapter I investigate the molecular mechanisms involved in modulating DAMP, discovering four more Nudix hydrolase genes that affect DAMP in contrasting ways. DAMP is then shown to be pervasive across the culture cycle, with the mutation rate affected by the concentration of the intracellular nucleotide pool. These results all combine to indicate that DAMP is a highly evolving trait, which potentially has a wide phylogenetic spread and ancient evolutionary origin.

Declaration

No portion of the work referred to in this thesis has been submitted in support of an application for another degree or qualification of this or any other university or other institute of learning.

Copyright statement

I. The author of this thesis (including any appendices and/or schedules to this thesis) owns certain copyright or related rights in it (the “Copyright”) and s/he has given The University of Manchester certain rights to use such Copyright, including for administrative purposes.

II. Copies of this thesis, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has from time to time. This page must form part of any such copies made.

III. The ownership of certain Copyright, patents, designs, trademarks and other intellectual property (the “Intellectual Property”) and any reproductions of copyright works in the thesis, for example graphs and tables (“Reproductions”), which may be described in this thesis, may not be owned by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions.

IV. Further information on the conditions under which disclosure, publication and commercialisation of this thesis, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy (see http://documents.manchester.ac.uk/DocuInfo.aspx?DocID=24420), in any relevant Thesis restriction declarations deposited in the University Library, The University Library’s regulations (see http://www.library.manchester.ac.uk/about/regulations/) and in The University’s policy on Presentation of Theses.

Acknowledgements

I wish to thank both my supervisors, Dr. Chris Knight and Prof. Andrew McBain, for all their help and support throughout my PhD. I would also like to thank my advisors, Dr. John Fitzpatrick and Prof. Daniela Delneri, for their helpful suggestions and discussions. I am also thankful to all members, past and present, of the Knight Lab for their help. I would specifically like to thank Dr. Rok Krašovec and Dr. Danna Gifford for their help in both laboratory work and discussions relating to this work.

I would also like to thank Prof Sonja Albers and Dr Michaela Wagner from the University of Freiburg for their help in getting the Sulfolobus up and running and providing the . The same thanks are also given to Prof Thorsten Allers from the University of Nottingham for his help and gift of Haloferax.

I am also indebted to my friends and family who have supported me throughout my PhD and have always been there when needed.

Finally, I would like to thank my fiancée Jennifer, without whom this PhD, and most certainly this thesis, would not have been completed and written.


Chapter 1: Introduction

1.1 General Introduction

All must accurately replicate their DNA as they grow and divide (Mott and Berger 2007). Maintaining genetic fidelity is of paramount importance to an organism, as even small changes in its genome can have profound negative fitness consequences (Fisher 1958; Eyre-Walker and Keightley 2007). Due to these deleterious effects, organisms have evolved several genetic fidelity mechanisms that can either prevent pre-replication or correct an post-replication, some of which are highly conserved across domains of life (Iyama and Wilson 2013). These mechanisms, however, are not perfect, meaning that the small proportion of replication mistakes that occur and are not corrected will be incorporated into the genome as mutations and inherited by future generations.

Mutations are nonetheless not all deleterious, and may in fact be advantageous for an organism by creating the variation required by natural selection (Loewe and Hill 2010). By altering the genetic information of an organism, mutations may in turn affect the resulting phenotype. It is this variation in phenotype that the forces of natural selection can act upon, thus allowing an organism to evolve. Therefore, the rate at which mutations arise in an organism’s genome could affect the rate at which an organism can evolve, or its ‘Evolvability’ (Wagner and Altenberg 1996; Kirschner and Gerhart 1998; Pigliucci 2008), making mutations the fuel of evolution.

Here I will first explain the concept of evolvability and the components that allow for improved adaptation due to the occurrence of mutations. I will then move on to a more specific section on the evolution of mutation rates, followed by an overview of the methods used to estimate mutation rates and a discussion of the most appropriate methodology for measuring environmental dependence in mutation rates. Finally, I will explain the phenomenon of mutation rate plasticity in relation to the environment, before outlining the content of the experimental chapters of this thesis.

1.2 Evolvability

Evolvability can be defined as an organism’s ability to generate heritable phenotypic variation (Kirschner and Gerhart 1998). This phenotypic variation relies upon variation in the underlying genotype, which is then inherited by subsequent generations. As mentioned above, the rate at which mutations arise could directly affect an organism’s evolvability by increasing the variation that natural selection can utilise.

There is evidence for the evolution of evolvability in nature (Radman et al. 2000), primarily due to the existence of variation in mutation rates both across and within species (Drake et al. 1998; Baer et al. 2007; Galhardo et al. 2007). By modulating its mutation rate in response to the environment, an organism may augment its own evolvability, and thus facilitate adaptation to novel environments (Sniegowski et al. 1997). Increased mutation rates may also present an advantage in evolutionary arms races. Empirical evidence supports this: increased mutation rates evolved in populations of Pseudomonas fluorescens when cultured in the presence of bacteriophage (Pal et al. 2007), whereas no such increase was seen in populations evolved without phage. Additionally, those populations that had evolved mutator phenotypes due to mutations in their DNA repair mechanisms (specifically their mismatch repair genes) had a higher probability of driving the phage population to extinction, showing an evolutionary advantage of these bacteria against phage due to the increase in mutation rate.

The ability to move from phenotype to phenotype relies upon the underlying genotype-phenotype (GP) map (Alberch 1991), which has, possibly, evolved for enhanced evolvability (Pigliucci 2008; Wagner 2008, 2012; Draghi et al. 2010; Ibáñez-Marcelo and Alarcón 2014). The GP map dictates the relationship between a phenotype and the genotypes that encode it. In essence, movement across the map is facilitated by the fact that the same phenotype can be encoded by various different genotypes, allowing an organism to resist perturbations in its genome. The degree to which an organism is able to resist these perturbations is termed its genetic robustness (Hindré et al. 2012). Additionally, not all of an organism’s characters interact with one another to the same degree. Instead, some characters may be more closely connected with one another than with another set of characters; this is termed an organism’s modularity (Hansen 2003; Griswold 2006; Wagner et al. 2007; Clune et al. 2013). In this way, each of these sets of characters is grouped into a module comprising closely connected characters. These modules are then interconnected with other modules, but at a lower degree of connectivity than is present within each module. As such, a change in a character from one module is more likely to affect other characters within its own module than those in other modules. The fitness benefit associated with a change in one character decreases in proportion to the number of characters affected (Orr 2000; Welch and Waxman 2003). Modularity occurs at both the phenotypic and genetic level. Genes can affect more than one phenotype; genes that affect a set of phenotypic characters are grouped together in a module with these phenotypes, with both genes and phenotypes having less effect upon characters outside this module.

1.2.1 Robustness

Robustness allows an organism to acquire cryptic genetic variation in the population (Figure 1.1), due to the neutral fitness effects that mutations have; robustness can thus be thought of in terms of the distribution of fitness effects (Eyre-Walker and Keightley 2007). A highly robust organism will have a greater frequency of neutral fitness effects resulting from mutation, facilitating the creation of genetic variation in the population (Charlesworth et al. 1995). Conversely, organisms with low robustness will have a greater frequency of non-neutral fitness effects, meaning that, because most mutations are deleterious, such cryptic variation cannot be created. Whilst unseen by natural selection, cryptic variation is a valuable source of evolutionary potential (Gibson and Dworkin 2004; Paaby and Rockman 2014). An organism can benefit from robustness not only by developing cryptic variation, but also by gaining benefits from any potential epistatic relationships with future mutational events (de Visser et al. 2011). Altering a population’s standing genetic variation facilitates adaptation by creating new opportunities as the environment or target of selection changes over time (Figure 1.1) (Massey and Buckling 2002; Good et al. 2017). This has been demonstrated in an RNA enzyme, where populations that were able to build up cryptic variation adapted more quickly to a novel substrate than populations that did not possess cryptic variation (Hayden et al. 2011), due to their improved ability to explore new genotypes that only become advantageous in new environments. The importance of the genetic background is highlighted by the differing fitness effects caused by parallel mutations occurring in different genotypes of the same species (Vogwill et al. 2016). Robustness therefore alters an organism’s fitness landscape (Wright 1932) by broadening the organism’s fitness peak and flattening the overall fitness landscape. This alteration in fitness landscape topology has been termed ‘survival of the flattest’ (Lauring et al. 2013), with experimental evidence supporting this from an RNA plant viroid (Codoñer et al. 2006) and vesicular stomatitis virus (Novella et al. 2013).

Figure 1.1 Effect of increased robustness on adaptation. The rate at which mutations arise in the genome, and the extent to which an organism can resist any perturbation due to these mutations, can affect the rate at which an organism adapts to new selection pressures. Mutation occurs in the original population (left-hand dots) and any individual that resists this perturbation is highlighted by colour, with different colours denoting different mutations. (a) highlights how low genetic robustness may limit adaptation. Here the population has a low level of standing variation due to individuals not resisting perturbations and being eliminated by purifying selection. Due to this low level of standing variation, the population shows a lack of adaptability to new selective pressures (i.e. only being able to adapt to the new ‘red’ selective pressure). (b) shows the benefit to an organism of having increased robustness, as a large level of standing variation is able to accumulate in the population, thus allowing adaptation to occur to all new selective pressures. Taken and adapted from Massey and Buckling (2002).

1.2.2 Modularity

Whereas robustness allows for change in the genotype without a change in phenotype, modularity protects phenotypic characters from change despite a change in another phenotypic character (Raff and Raff 2000; Hansen 2003; Griswold 2006; Wagner et al. 2007). Modularity in relation to genetic mutations can be thought of in terms of pleiotropy, where a mutation in a single gene affects two or more unrelated phenotypic traits (Stearns 2010), and genes that affect a set of phenotypic characters are grouped together in modules alongside the phenotypes they affect (Figure 1.2). These pleiotropic effects were previously thought to be very common and led to the idea of universal pleiotropy, in which a mutation at any locus in the genome could affect any trait (Fisher 1930). This hypothesis led to theoretical studies suggesting that the rate at which an organism adapts to an environment is inversely related to the number of traits that organism possesses, termed the ‘cost of complexity’ (Orr 2000). Recent work, however, has questioned this by finding that, rather than being universal, pleiotropy in several eukaryotic species is actually rather limited for most genes (Wang et al. 2010). Furthermore, there is a highly modular structure to these pleiotropic genes, potentially reducing the probability of a random mutation being deleterious as, rather than affecting multiple unrelated traits in various ways, it would affect related traits in the same way (Welch and Waxman 2003; Martin and Lenormand 2006). Additionally, mutations with a greater degree of pleiotropy have a greater relative per-trait effect size, meaning that the probability that a beneficial mutation is fixed and the fitness gain conveyed by that mutation both increase with organismal complexity. The combination of these two effects (increased fixation and greater fitness increase) counteracts the lower frequency with which beneficial mutations occur in a complex organism, meaning the highest rate of adaptation is predicted to occur in organisms of intermediate levels of complexity, thus presenting a route to the evolution of complex organisms in light of the ‘cost of complexity’ hypothesis.

Figure 1.2 Depiction of modular pleiotropy. Modular pleiotropy occurs when sets of an organism’s phenotypic traits are co-determined by a set of genes (circles). The number of traits that a gene will affect (shown by black lines) varies by gene and is not uniform. Instead, rather than all genes affecting all traits, these gene-trait relationships are highly modular: genes in module 1 are more likely to affect phenotypes in module 1 than those in module 2, and vice versa. Taken from Wagner and Zhang (2011).

1.3 Evolution of Mutation Rates

Mutation is a fundamental, pervasive factor (Lynch et al. 2016). Calculating the rate at which mutations occur usually entails first estimating the number of mutational events, m, that have occurred in the experiment. This is then divided by the final population size of the culture, Nt, to give the mutation rate. In this way, the mutation rate is an estimate of the probability that a mutation occurs in a cell in a generation (Pope et al. 2008).
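As a minimal illustration of the calculation described above, the following Python sketch divides an estimate of m by Nt. The function name and the example numbers are hypothetical and are not taken from any experiment in this thesis; in practice m is itself estimated from the distribution of mutant counts across parallel cultures rather than observed directly.

```python
def mutation_rate(m: float, n_t: float) -> float:
    """Mutation rate per cell per generation, computed as m / Nt.

    m   -- estimated number of mutational events in the culture
    n_t -- final population size of the culture (Nt)
    """
    if n_t <= 0:
        raise ValueError("final population size must be positive")
    if m < 0:
        raise ValueError("number of mutational events cannot be negative")
    return m / n_t


# Hypothetical values, chosen only for illustration.
example_m = 2.5      # estimated mutational events
example_nt = 1.0e9   # final number of cells in the culture
rate = mutation_rate(example_m, example_nt)
print(f"{rate:.2e} mutations per cell per generation")
```

The input checks simply guard against a zero or negative population size, which would make the ratio meaningless.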

Across disparate species, mutation rates vary by several orders of magnitude, ranging from ~10⁻⁴ to 10⁻¹¹ mutations per cell per replication event (Drake et al. 1998). The highest mutation rates so far discovered have been found in RNA viruses (Duffy et al. 2008), with the highest mutation rate yet reported being that of HIV-1, at 4.1 × 10⁻³ mutations per base per cell infection (equivalent to a replication event for this virus and the type of mutation measured) (Cuevas et al. 2015). Conversely, the lowest mutation rate yet discovered belongs to the unicellular ciliate Paramecium tetraurelia, which has a mutation rate of 1.94 × 10⁻¹¹ mutations per base-pair per generation (Sung et al. 2012b). Such a broad range in mutation rates across a wide range of organisms shows both that mutation rates have not evolved to a biochemically possible minimum and that they are modulated not only by universal physical constraints but also by other, evolutionary mechanisms.

1.3.1 Scaling of Mutation Rates across species

An organism’s mutation rate is hypothesised to evolve until it reaches a point where further refinement is blocked by the power of random genetic drift, which is inversely proportional to an organism’s effective population size (Ne) (Figure 1.3) (Lynch 2011; Sung et al. 2012a; Lynch et al. 2016). This so-called ‘drift-barrier hypothesis’ (Sung et al. 2012a) of mutation rate evolution arises because a point will be reached where the selective advantage that would be attained by gaining greater DNA replication fidelity is too small to overcome random genetic drift (Sniegowski and Raynes 2013), essentially equalising the probability of fixation between a beneficial mutation and any other random mutation. Across the tree of life, including both uni- and multicellular organisms, such an inverse association between mutation rates and Ne has been seen (Figure 1.3b) (Lynch et al. 2016), supporting the drift-barrier hypothesis of how mutation rates evolve.
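The logic of the drift barrier can be illustrated with Kimura's diffusion approximation for the fixation probability of a new mutation. The sketch below assumes a haploid Wright-Fisher population whose census size equals Ne, and a non-negative selection coefficient s. These assumptions, the function names and the example values are illustrative only and are not drawn from the studies cited above.

```python
import math


def fixation_probability(s: float, n_e: float) -> float:
    """Fixation probability of a single new mutant (haploid diffusion approximation).

    s   -- selection coefficient of the mutation (assumed >= 0 here)
    n_e -- effective population size (assumed equal to census size)
    """
    if s == 0.0:
        return 1.0 / n_e  # neutral expectation: initial frequency 1/Ne
    # (1 - exp(-2s)) / (1 - exp(-2*Ne*s)), written with expm1 for accuracy at small s
    return math.expm1(-2.0 * s) / math.expm1(-2.0 * n_e * s)


def relative_to_neutral(s: float, n_e: float) -> float:
    """Fixation probability expressed as a multiple of the neutral value 1/Ne."""
    return fixation_probability(s, n_e) * n_e


# Hypothetical fidelity-improving mutation with a very small benefit.
s_small = 1e-6
for n_e in (1e4, 1e6, 1e8):
    ratio = relative_to_neutral(s_small, n_e)
    print(f"Ne = {n_e:.0e}: fixation is {ratio:.2f} times the neutral expectation")
```

With s held fixed, the mutation fixes at close to the neutral rate in the smallest population and far above it in the largest, which is the sense in which drift limits the selective advantages that selection can act upon.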

Figure 1.3 The drift-barrier hypothesis on the evolution of mutation rates. (a) The overall refinement any trait reaches (i.e. closeness to overall biochemical perfection) will be greater where the effective population size is larger, due to greater selection efficacy (red arrows) and a reduction in the power of random genetic drift (black arrows). The actual refinement level a trait reaches will be determined by the point at which the power of selection is overpowered by genetic drift, as the benefit of increased refinement will become less than the cost of increasing such refinement. (b) Regression of the base-substitution mutation rate against the effective population size (Ne). Blue points indicate Eubacteria, green points denote unicellular eukaryotes and red points show multicellular eukaryotes. Numbers correspond to species, which can be found in the original publication. Taken from Lynch et al. (2016).

The exact point at which this barrier to improved refinement is reached depends on the rates at which mutator and anti-mutator alleles are produced. There is empirical evidence in support of this drift barrier to mutation rate evolution, in opposition to the hypothesis that mutation rates have reached a biomolecular minimum, as bacterial populations initially founded with mutators evolve lower mutation rates over time (McDonald et al. 2012; Wielgoss et al. 2012; Turrientes et al. 2013; Williams et al. 2013; Sprouffske et al. 2018). This reduction does not come about through alterations in the originally constructed mutator allele, but via compensatory mutations at other loci. This indicates that there are sites in the genome able to improve replication fidelity that are evidently unexploited, perhaps due to the blocking effect of genetic drift. Additionally, the rate of mutation in the human germline approaches the lowest estimate for unicellular species (6 × 10⁻¹¹ per base per cell division) (Lynch 2010a). This is despite humans having one of the highest overall mutation rates (at 1.35 × 10⁻⁸; number 8 on Figure 1.3b), and is actually several orders of magnitude lower than that found in some tissues (2.7 × 10⁻⁸ to 1.47 × 10⁻⁹ for retina cells and lymphocytes respectively) (Lynch 2010b, 2016; Behjati et al. 2014), showing that the mutation rate can be greatly decreased even for species with small values of Ne.

The drift-barrier hypothesis encompasses the key population genetic parameters of selection, genetic drift and mutation, suggesting its relevance in regulating the final level of molecular refinement for any biological trait. One drawback of the previous study proposing this mechanism of mutation rate evolution (Sung et al. 2012a) is that the estimation of Ne depends indirectly upon the base-substitution mutation rate, the mutation rate used in that study. Whilst the study presented statistical analyses indicating that the potential correlation in estimating these parameters is unlikely to be responsible for the negative association presented, a study where the mutation rate and Ne are estimated independently is needed. Fortunately, this potential caveat has been resolved by investigating the association of Ne with another form of mutation, insertion-deletion (indel) mutations (Sung et al. 2016). This investigation supports the previous results of the drift-barrier hypothesis, extending it to a different mutation type and additionally providing independence between estimations of the mutation rate and Ne. Due to the greater propensity for indels to arise within protein-coding genes and their effect of causing frameshifts, they will have a direct effect on fitness, which is not always the case for base-substitutions (Eyre-Walker and Keightley 2007). This, alongside the generally deleterious nature of indels, means selection will be more efficient in modulating the rate at which indels occur, resulting in a close association with Ne. These two results show that the drift-barrier hypothesis provides a consistent hypothesis for mutation rate evolution across disparate organisms of varying complexity, and provides a uniform explanation for the wide range seen in mutation rate estimates.

In general, as shown in Figure 1.3b, mutation rates are elevated in multicellular compared to unicellular eukaryotes (Lynch 2008), but still follow the relationship between the mutation rate and Ne. However, as noted above, there is a large discrepancy between the mutation rates found in the two cell types of these multicellular organisms, the somatic and germline cells (Milholland et al. 2017), with mutation rates also differing between the different somatic tissues of the same organism (Lynch 2010a). The mutation rate is several orders of magnitude higher in somatic cell lines compared to the germline for a wide range of species. Indeed, human somatic cells will accumulate mutations 4 to 25 times quicker than germline cells (Lynch 2010b). This difference in mutation rate occurs despite the fact that the same repair mechanisms are utilised in both cell types (Marcon and Moens 2005). Recent evidence has suggested that this difference in mutation rate arises due to differences in the damage or repair rates that occur during transcription in these two cell lines (Chen et al. 2017). If mutations occur within the somatic cells then the fitness effects are felt directly by the individual, whereas mutations occurring within the germline result in phenotypic effects being felt by the progeny in subsequent generations. In this way, it is the germline mutations that will affect the long-term fitness of a population, whilst somatic mutations primarily affect the individual’s fitness. This highlights the importance of maintaining genomic integrity and protecting the germline, whilst the somatic cells are of lower evolutionary importance, and so are somewhat disposable.

For a population to remain viable in the long term, the loss of fitness caused by the accumulation of deleterious mutations must be balanced by the action of natural selection in removing such mutations from the population. Within asexual populations, such mutations can fix due to their association with beneficial mutations (Lang et al. 2013), and their removal from the population is slow as they are located within several backgrounds, meaning natural selection is inefficient. Conversely, in sexual populations undergoing recombination, natural selection is more efficient at purging such mutations from the population by combining them into a single background, speeding their removal from the population (Sohail et al. 2017). Additionally, recombination rescues beneficial mutations from deleterious backgrounds and, similarly to its action on deleterious mutations, accumulates beneficial mutations arising in different backgrounds into a single genetic background (McDonald et al. 2016). Thus sex relieves both the tendency for clonal interference to occur and the chance for deleterious mutations to hitchhike to fixation within advantageous backgrounds (McDonald et al. 2016; Peabody et al. 2017). Selection acting on mutation rate modifiers is weak in populations with recombination, such that mutators will only increase in frequency in sexual populations under highly restrictive conditions (Leigh 1970). All in all, sex seems to make natural selection more efficient at sorting between mutations of differing fitness effects and could allow species to tolerate a high mutation rate, as E. coli populations that exhibited a high mutation rate and were capable of sexual recombination outperformed other populations where the mutation rate was lower or sexual recombination was not possible (Peabody et al. 2017).

1.3.2 Selection on the Mutation Rate

Understanding the evolution of mutation rates can also be aided by considering selection on the frequencies of mutator alleles found in populations (Sniegowski et al. 2000; Raynes and Sniegowski 2014). Selection on mutator alleles can act either directly or indirectly. Direct selection occurs when a mutator allele has a pleiotropic effect on fitness through traits other than its mutator effect. Such pleiotropic effects have been shown in viruses, where improved replication fidelity results in a decreased rate of replication, therefore presenting a direct cost to the organism (Furió et al. 2005, 2007). Conversely, a mutator allele in the mismatch repair system of the bacterium Pseudomonas aeruginosa, specifically in the mutS gene, bestows resistance against the effects of hydrogen peroxide, thus presenting a direct fitness benefit for the organism, depending upon the environment (Torres-Barceló et al. 2013).

Indirect selection, on the other hand, relies upon a link between the mutator allele and any mutations it generates that positively affect fitness. Due to this close association (termed ‘linkage disequilibrium’), the mutator allele is able to ‘piggyback’ with the beneficial mutation into the next generation, termed ‘second-order’ selection (Tenaillon et al. 1999) (Figure 1.4). Evidence for the spread of a mutator allele due to these indirect effects was first presented in the bacterium Escherichia coli, where a mutator strain, lacking its mutT gene, outcompeted wild-type cells and spread through the population when co-cultured with them, provided the mutator strain was at a high enough frequency in the population (Chao and Cox 1983). This spread was shown to be due to indirect selection, and not any direct fitness effects, by two results. First, there was an initial pause in the spread of the mutator strain, and second, the mutator strain went extinct if at a low enough frequency at the beginning of the experiment. Therefore the success of the mutator strain was due to selection on beneficial mutations that arose with greater frequency in the mutator strain than in the wild type, with the mutator allele being linked to these mutations, and not to any direct fitness effects conferred by the mutT deletion. This work has been supported by further competition experiments, where double mutant strains of E. coli, deficient in both mismatch repair and DNA proof-reading genes, out-competed and replaced a population fixed for a single mutator strain, deficient only in the mismatch repair machinery (Gentile et al. 2011). This replacement was again due to hitchhiking with beneficial mutations and not any intrinsic fitness benefits conferred by the strain. A further example also comes from E. coli, where evolved clones that had a lower competitive fitness won competition assays over other, fitter clones due to their ability to create new variation (Woods et al. 2011).

Figure 1.4 Indirect selection on the mutation rate. Indicates how the frequency of an allele that affects the mutation rate (mutator allele, red circles) can be dependent upon linkage with the beneficial mutations (green squares) it creates. (A) When the beneficial mutation and mutator allele are closely linked, the mutator allele hitchhikes and increases in frequency alongside the mutation as a result of selection acting to fix the mutation due to its positive fitness effects. (B) If the linkage between the mutation and mutator allele breaks down, due to recombination for example, then the mutator allele is selected against, due to the deleterious mutations it produces, and so proceeds to extinction, whilst the beneficial mutation again sweeps to fixation. This only shows the link between a mutator allele and its beneficial mutation. If the mutator were linked with a deleterious mutation, then the mutator would not fix in the population and would become extinct, unless its link were removed, due to the mutation being selected against. Taken from Sniegowski et al. (2000).
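The frequency dependence reported by Chao and Cox (1983) can be given a rough intuition with a back-of-envelope calculation: if beneficial mutations arise in proportion to subpopulation size multiplied by mutation rate, then the share of new beneficial mutations expected to arise on the mutator background depends on the mutator's starting frequency. This is a deliberate simplification that ignores deleterious load, clonal interference and drift; the function name and the 100-fold value are assumptions made purely for illustration and are not taken from that study.

```python
def mutator_share_of_beneficials(freq: float, fold_increase: float) -> float:
    """Expected share of new beneficial mutations arising in the mutator subpopulation.

    freq          -- starting frequency of the mutator strain (0 to 1)
    fold_increase -- mutator mutation rate relative to wild type (hypothetical)
    """
    if not 0.0 <= freq <= 1.0:
        raise ValueError("frequency must lie between 0 and 1")
    if fold_increase <= 0:
        raise ValueError("fold increase must be positive")
    mutator_supply = freq * fold_increase   # relative supply from mutator cells
    wild_type_supply = 1.0 - freq           # relative supply from wild-type cells
    return mutator_supply / (mutator_supply + wild_type_supply)


# Hypothetical 100-fold mutator at a range of starting frequencies.
for freq in (1e-6, 1e-4, 1e-2, 0.5):
    share = mutator_share_of_beneficials(freq, 100.0)
    print(f"starting frequency {freq:g}: share of beneficial mutations {share:.4f}")
```

Under these assumptions a rare mutator contributes only a small fraction of the beneficial mutations that arise, so it has little opportunity to hitchhike, whereas a more common mutator supplies most of them.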

Since the majority of mutations are deleterious, selection would be expected to act to lower the mutation rate once the organism has become adapted to the new environment, or once the linkage between the beneficial mutation and the mutator allele is broken. Alternatively, indirect selection may still act to reduce these high mutation rates. Such an example is found in the E. coli lines of the long-term evolution experiment (Lenski et al. 1991). One of these populations evolved a one base-pair insertion in its mutT allele, resulting in a frameshift that increased the mutation rate in subsequent populations by ~100-fold (Barrick et al. 2009). However, two separate mutations in the mutY gene, which have an anti-mutator effect in a mutT mutant, reduced the mutation rate by up to 60%, alleviating the effect imposed by the mutT mutation (Wielgoss et al. 2013).
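The net effect of these two changes can be checked with simple arithmetic. Taking the figures quoted above at face value (a ~100-fold increase from the mutT frameshift, followed by a reduction of up to 60% from the mutY mutations), the lineage would still sit at roughly 40 times the ancestral rate. The sketch below assumes the reduction applies to the elevated mutT rate; the function name is hypothetical.

```python
def net_fold_change(fold_increase: float, fractional_reduction: float) -> float:
    """Mutation rate relative to the ancestor after an increase and a later reduction.

    fold_increase        -- fold rise caused by the mutator allele (e.g. 100)
    fractional_reduction -- later proportional reduction, between 0 and 1 (e.g. 0.60)
    """
    if fold_increase <= 0:
        raise ValueError("fold increase must be positive")
    if not 0.0 <= fractional_reduction <= 1.0:
        raise ValueError("fractional reduction must lie between 0 and 1")
    return fold_increase * (1.0 - fractional_reduction)


# Approximate values quoted in the text: ~100-fold increase, up to 60% reduction.
remaining = net_fold_change(100.0, 0.60)
print(f"Mutation rate remains about {remaining:.0f}-fold above the ancestral rate")
```

The compensatory mutations therefore alleviate, rather than reverse, the mutator phenotype under these figures.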

1.3.3 Variation in Mutation Rates

Whilst the rate at which mutations arise varies between species (Drake et al. 1998; Lynch et al. 2016; Sung et al. 2016), there are also examples of wide-ranging mutation rates among isolates of the same species, in some cases spanning over an order of magnitude (Figure 1.5) (Sniegowski et al. 1997; Oliver 2000; Kohlmann et al. 2018). This variation could occur due to the existence of hyper-mutators within the population. Hyper-mutators are strains that have a deficiency in their DNA repair mechanisms, resulting in constitutively higher mutation rates (Matic et al. 1997; Bridges 2001; Watson et al. 2004). Increased mutation rates, as previously mentioned, may confer an adaptive advantage by augmenting the ability of a population to produce the mutations needed for adaptation, which is particularly valuable if multiple mutations are needed in combination to facilitate adaptation (Boe 1992; Drake et al. 2005; Drake 2007), as is the case for disease in response to combination therapies (Ascierto and Marincola 2011; Mokhtari et al. 2017). An increased frequency of mutations is particularly important when environmental conditions change (Travis and Travis 2002; Tanaka et al. 2003). Mutation rates have also been demonstrated to vary at a particular site within a particular genotype (mutation rate plasticity, MRP). Whether and to what degree mutation rates should vary at this fine evolutionary level is particularly intriguing.

Figure 1.5 Species-specific mutation rates. This recent study highlights the wide range of mutation rates (rate per cell per generation) both between and within species. Not only do the mean mutation rates vary greatly between species, as shown by the horizontal black bars, but the variability of mutation rates within each species is also apparent. For instance, Hafnia alvei isolates display a range of mutation rates spanning almost three orders of magnitude (each point represents an individual strain of the particular species). Black points indicate mutation rate estimates obtained from mutant counts, whereas grey points represent experiments in which no mutants occurred. In these circumstances the occurrence of a single mutant was assumed, thus giving the limit of mutation rate detection in that species. Species from left to right: Enterobacter cloacae complex, Enterobacter aerogenes, Citrobacter freundii complex, Hafnia alvei, Providencia rettgeri, Providencia stuartii, Serratia marcescens, Serratia liquefaciens and Morganella morganii. The number of tested isolates is given by the species name, and the mean mutation rate per species is given at the top of the plot, highlighted by the horizontal black line. This mean includes both mutation rates and limit of detection rates. Taken from (Kohlmann et al. 2018).

1.3.4 Adaptiveness of Variable Mutation Rates

The evolution of variable mutation rates need not, however, have occurred for any adaptive reason. For instance, the increase in mutation rate due to stress may just be the result of differing efficacies of selection. Since stressful conditions can be thought of as exceptional circumstances for the organism, with non-stress being the norm, selection on genetic mechanisms used under stress would have a lower efficacy, causing these to be less effective than their non-stress counterparts (Van Dyken and Wade 2010). This ‘drift’ hypothesis (not to be confused with the drift-barrier hypothesis mentioned above) (MacLean et al. 2013) is supported by evidence of codon bias in Pseudomonas aeruginosa. P. aeruginosa, like many bacteria, possesses strong codon bias, where genes with a higher expression level are enriched with a subset of codons that are, presumably, best matched by the most common tRNA (Sharp et al. 2010). Therefore, by comparing the codon bias between two genes with common , one expressed in non-stressful environments and the other expressed under stress, it is possible to explore the effective strength of selection on each. As would be expected, selection is stronger on the gene expressed in the absence of stress than on the gene expressed under stress (DnaE and DnaE2, respectively; both being DNA polymerases in the genus Pseudomonas) (MacLean et al. 2013). Additionally, a higher rate of nonsynonymous changes relative to synonymous changes (dN/dS) in DnaE2 than in DnaE across Pseudomonas spp. (means of 0.87 and 0.05 respectively; MacLean et al. 2013) also supports the finding that selection acts with greater efficacy on the more highly expressed gene. Nonsynonymous mutations are under greater levels of selection than synonymous mutations, because they change the amino acid sequence. Therefore a high dN/dS implies that nonsynonymous mutations are fixed at a similar rate to synonymous mutations, and so are under relaxed selection, whereas a low dN/dS shows strong selection on nonsynonymous mutations relative to synonymous mutations.
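As a rough illustration of how a dN/dS ratio is read, the following Python sketch computes it from raw substitution and site counts. The function name and all counts are hypothetical and chosen only to contrast a constrained gene with a relaxed one; they are not the data of MacLean et al. (2013), and real analyses additionally correct for multiple substitutions at the same site.

```python
def dn_ds(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Return dN/dS from raw counts (hypothetical helper, no multiple-hit correction)."""
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    if syn_subs <= 0:
        raise ValueError("dN/dS is undefined when no synonymous changes are observed")
    dn = nonsyn_subs / nonsyn_sites  # nonsynonymous changes per nonsynonymous site
    ds = syn_subs / syn_sites        # synonymous changes per synonymous site
    return dn / ds


# Illustrative counts only (assumed, not taken from any study).
relaxed = dn_ds(nonsyn_subs=60, nonsyn_sites=750, syn_subs=20, syn_sites=250)
constrained = dn_ds(nonsyn_subs=6, nonsyn_sites=750, syn_subs=20, syn_sites=250)

print(f"relaxed gene:     dN/dS = {relaxed:.2f}")
print(f"constrained gene: dN/dS = {constrained:.2f}")
```

A ratio near one means nonsynonymous changes are fixed about as readily as synonymous ones, whereas a ratio well below one means most nonsynonymous changes have been removed by selection.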

Nevertheless, theoretical studies have supported adaptive reasons for mutation rate variability to evolve (Ram and Hadany 2012; Alexander et al. 2017). Under stressful conditions, selection acts to minimise mutation rates only in individuals whose fitness is above the population mean, whilst favouring increased mutation rates in individuals below the population mean (Ram and Hadany 2012). This builds on previous work showing that increased mutation rates in response to stress do not lower the population’s fitness (Agrawal 2002; Shaw and Baer 2011); by also including the effect of beneficial mutations, such an increase can be favoured even in constant environments (Ram and Hadany 2012). Indeed, populations with variable mutation rates were fitter in the long term than populations exhibiting a constant mutation rate in both constant and fluctuating environments. Additionally, populations with heterogeneous mutation rates tend to produce individuals carrying multiple mutations at a higher frequency, whilst reducing the mutation load exerted on well-adapted populations over the long term (Alexander et al. 2017). This production of so-called ‘higher order’ mutants can be advantageous as it could allow a population to cross valleys in its fitness landscape, which require higher mutation rates to cross (Clune et al. 2008), and reduce the effect of clonal interference by grouping multiple mutations onto the same genetic background (Desai et al. 2007). The findings of such studies suggest that variable mutation rates present an advantageous strategy for the population, as they allow both a higher mean fitness and improved exploration of the genotypic landscape, offering a greater advantage in future adaptive challenges.

1.4 Measuring Mutation Rates

Understanding the evolution of mutation rates requires an accurate methodology with which to measure them. However, because mutations are inherently rare, mutation rates are very difficult to measure. Additionally, as mutations are random occurrences there is no way of knowing where and when they will arise in a culture. Furthermore, the environment in which the organism is cultured affects the rate at which mutations occur, potentially biasing the mutation rate estimate. Many methods have been developed to measure mutation rates accurately (Luria and Delbrück 1943; Foster 2006; Kondrashov and Kondrashov 2010).

1.4.1 Luria-Delbrück Fluctuation assay

One of the primary methods of measuring mutation rates is also one of the oldest: the Luria-Delbrück fluctuation assay (Luria and Delbrück 1943), which uses phenotypic assays of mutation. This method involves culturing cells under non-selective conditions before plating them onto selective media (Figure 1.6). Any colonies that grow on the selective media therefore result from spontaneous mutations that occurred during the growth period, conferring resistance to the selective agent. Indeed, Luria and Delbrück first developed this assay in 1943 to show that the mutations by which E. coli become resistant to bacteriophage T1 were spontaneous and not induced by selective pressure from the bacteriophage. Since its inception, the fluctuation assay has been used countless times in organisms from all domains of life (Jacobs and Grogan 1997; Sniegowski et al. 1997; Lang and Murray 2008), viruses (Combe and Sanjuán 2014) and even cancer cells (Tlsty et al. 1989), showing how applicable this method is across organisms with widely differing evolutionary histories. The method’s resilience is also demonstrated by the fact that, despite being conceived 75 years ago and despite the other methods of measuring mutation rates now available, it is still used to estimate mutation rates in a wide range of organisms today (Kohlmann et al. 2018).

Figure 1.6 The Luria-Delbrück Fluctuation Assay. The assay as used by Luria and Delbrück to identify that mutations conferring resistance to phage occurred spontaneously and were not induced. This has subsequently been modified to investigate spontaneous mutation in many different species; however, these all follow the same principles. Briefly, (A) a culture of sensitive cells is diluted into many independent cultures and grown under non-selective conditions. These cultures are then plated out onto selective plates containing the appropriate selective agent. Plating subsamples of the same culture results in approximately the same number of resistant colonies appearing (left side), whereas when many independent cultures are plated out there is a large amount of variation in the number of resistant colonies appearing (right side). These results together indicate that any colonies that appear have resulted from spontaneous mutations randomly occurring in the culture cycle and were not induced by the selective agent. (B) The number of colonies appearing upon the selective plates depends upon the point at which the mutation occurs in the culture cycle. Mutations occurring early in the culture cycle will produce more colonies, as the mutant lineage goes through more cell divisions (‘Early mutation’ tree), whereas mutations occurring later in the culture cycle will go through fewer generations and so produce fewer colonies (‘Late mutation’ tree). Taken from (Barton et al. 2007).
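The logic of the assay can be sketched with a small simulation. The Python example below grows many independent cultures by synchronous doubling, lets mutants arise at random during growth, and then applies the classical p0 estimator (the expected number of mutational events per culture is m = -ln(p0), where p0 is the proportion of cultures without mutants). It is a minimal sketch under simplifying assumptions (no phenotypic lag, no fitness cost of the mutation, every cell plated), and the function names, seed and parameter values are illustrative rather than taken from any of the cited studies.

```python
import math
import random
import statistics


def poisson(lam, rng):
    """Draw a Poisson variate with Knuth's method (adequate for the small means used here)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def grow_culture(mu, n0, generations, rng):
    """Grow one culture by synchronous doubling; return (mutants, total cells)."""
    wild, mutants = n0, 0
    for _ in range(generations):
        new = poisson(mu * wild, rng)   # mutations arising during this round of division
        mutants = 2 * mutants + new     # existing mutants divide, new ones are added
        wild = 2 * wild - new
    return mutants, wild + mutants


def p0_mutation_rate(mutant_counts, final_cells):
    """Estimate the mutation rate per cell division from the fraction of mutant-free cultures."""
    p0 = sum(1 for c in mutant_counts if c == 0) / len(mutant_counts)
    if p0 == 0:
        raise ValueError("p0 method needs at least one culture without mutants")
    m = -math.log(p0)                   # expected mutational events per culture
    return m / final_cells


rng = random.Random(1)                  # fixed seed for reproducibility
TRUE_MU = 1e-8                          # assumed mutation rate per cell division
results = [grow_culture(TRUE_MU, 1000, 17, rng) for _ in range(2000)]
counts = [mutants for mutants, _ in results]
final_cells = results[0][1]

estimate = p0_mutation_rate(counts, final_cells)
print(f"estimated mutation rate: {estimate:.2e}")
print(f"mean mutants per culture: {statistics.mean(counts):.1f}")
print(f"variance of mutants:      {statistics.pvariance(counts):.1f}")
```

The variance of the mutant counts greatly exceeds their mean because a mutation arising early founds a large clone, which is the fluctuation that gives the assay its name. More efficient estimators than p0 are generally preferred in practice; p0 is used here only because it is the simplest to state.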

1.4.2 Mutation Accumulation

Another primary method used for estimating mutation rates is the mutation accumulation (MA) study (Halligan and Keightley 2009a). In these studies, a population of organisms is passed through several genetic bottlenecks such that the effect of selection is minimised, leaving random genetic drift as the dominant force. Due to this, all mutations, except the most deleterious, will accumulate in the genome. Thanks to the advent of whole genome sequencing (WGS) technology, it is then possible to identify the number and type of mutations that have accumulated over the course of the experiment. By also knowing the number of generations that have passed, it is possible to estimate the mutation rate. This marrying of MA experiments with WGS has provided a very strong method for estimating mutation rates accurately in a large number of diverse organisms (Halligan and Keightley 2009b; Lynch et al. 2016).
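The arithmetic behind an MA estimate can be written out in a few lines. The Python sketch below divides the number of mutations found by sequencing by the total number of line-generations, and then by the genome size, to give per-genome and per-site rates. The function name and the numbers are hypothetical, and the sketch assumes that every line was propagated for the same number of generations and that every site in the genome could be called.

```python
def ma_mutation_rates(total_mutations, n_lines, generations_per_line, genome_size_bp):
    """Return (per-genome, per-site) mutation rates per generation from an MA experiment."""
    if min(n_lines, generations_per_line, genome_size_bp) <= 0:
        raise ValueError("lines, generations and genome size must all be positive")
    line_generations = n_lines * generations_per_line  # total generations across all lines
    per_genome = total_mutations / line_generations
    per_site = per_genome / genome_size_bp
    return per_genome, per_site


# Illustrative values only (assumed): 120 mutations across 40 lines,
# each propagated for 5000 generations, with a 4.6 Mbp genome.
per_genome, per_site = ma_mutation_rates(120, 40, 5000, 4_600_000)
print(f"per genome per generation: {per_genome:.2e}")
print(f"per site per generation:   {per_site:.2e}")
```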

1.4.3 Measuring Environmental Dependence of Mutation Rates

MA experiments, along with other methods utilising genome sequencing, such as genomic comparisons between parent and offspring (Campbell and Eichler 2013) or maximum depth sequencing (Jee et al. 2016), are particularly laborious ways of estimating the mutation rate. They are therefore unsuited to studying a process as dynamic as the environmental dependence of mutation rates. Instead, such environmental dependence can be studied using the fluctuation assay.

1.4.4 Caveats in using the Fluctuation assay

Despite this ability to measure the environmental dependence of mutation rates, there are still caveats that must be accounted for before using fluctuation assays. First, there is the potential for the mutant to confer an unknown selective advantage in an environment that is thought to be non-selective. This occurred in a study that purported to show that stress-induced mutagenesis varies in colonies of E. coli (Bjedov et al. 2003). However, the marker they used, resistance to the antibiotic rifampicin conferred by mutations in the rpoB gene, is actually advantageous and allows for additional growth in the older colonies due to the of acetate (Wrande et al. 2008; Bergman et al. 2014). Second, as the fluctuation test utilises phenotypic markers of mutation, there is the inherent problem that the phenotype is expressed only some time after the mutation occurs, a phenomenon termed ‘phenotypic lag’. It has recently been shown that a delay in phenotypic expression of between three and four generations occurs in fluctuation tests (Sun et al. 2018). This is much larger than previously believed (Newcombe 1948; Kendal and Frost 1988) and is due to multi-fork replication, which causes cells to become effectively polyploid. The mutation is therefore masked by the wild-type copies of the gene in the same cell, and can only be expressed after several rounds of replication have created the homozygous mutant. Thankfully, despite this effective polyploidy, mutation rates estimated as either the per-copy or per-genome rate remain valid, and it is only the per-cell rate that cannot be estimated reliably. Despite these caveats, a well formulated fluctuation test (Foster 2006; Lang 2018), allied with novel methods of estimating mutation rates that are able to take into account other population dynamic processes such as fitness and (Mazoyer et al. 2017), provides a strong tool for understanding the environmental dependence of mutation rates.

1.5 Mutation Rate Plasticity (MRP)

As mentioned previously, mutation rates are variable and can change depending upon the environment in which a particular genotype finds itself (Massey and Buckling 2002). This is advantageous for the organism as it will be able to generate variation at different rates depending on its circumstances. This mutation rate plasticity has most commonly been thought of in relation to stress, termed stress-induced mutagenesis (Matic 2013).

MRP has also been mathematically predicted to be advantageous when associated with an organism’s fitness (Belavkin et al. 2016). As an organism’s fitness decreases, an increase in mutation rate would increase the frequency at which beneficial mutations arise, moving the organism back towards the fitness peak. Conversely, when at or close to a fitness peak, a reduction in mutation rate is advantageous, as the majority of mutations that occur will be deleterious and so push the organism off or away from the peak.

Such a fitness-associated MRP has recently been discovered in the bacterium Escherichia coli (Krašovec et al. 2014a). Here mutation rate was found to be inversely related to fitness, as expected from mathematical models. Further investigation into which component of fitness was responsible revealed that the final population density of the culture was the primary factor: decreasing population density results in at least a three-fold increase in mutation rate. This MRP has also been found to be partially controlled by the highly conserved luxS gene (Vendeville et al. 2005; Pereira et al. 2013; Santiago-Rodriguez et al. 2014; Rao et al. 2016), specifically its metabolic role in the activated-methyl cycle (Halliday et al. 2010).

This finding poses several evolutionary questions. In particular, why has this behaviour evolved and been maintained (is it adaptive, or is it simply the result of drift, as proposed for stress-induced mutagenesis)? Additionally, how prevalent and variable is this trait between and within species, and does its evolution follow any particular pattern? Furthermore, what other molecular mechanisms are involved in controlling it, and are they as ubiquitous in the natural world as luxS?

1.5.1 Stress-Induced Mutagenesis

The first hurdle in understanding mutation in response to stress is to define what we mean by stress. Here I will define stress as it appears in the literature (MacLean et al. 2013), where a stressor is any environmental variable that decreases the fitness of a bacterium, for example by reducing its growth rate or competitiveness. One common form of stress is oxidative stress resulting from reactive oxygen species (ROS), which are highly reactive towards biomolecules (Winterbourn 2008), such as DNA bases. Due to its low redox potential (David et al. 2007), the DNA base guanine is at high risk of oxidation to the mutagenic base 8-oxo-dGTP, which can cause double-strand breaks (Cheng et al. 1992). A mechanism has evolved to prevent such deleterious effects: the GO system (Michaels and Miller 1992). The GO system encodes an enzyme that removes 8-oxo-dGTP from the intracellular nucleotide pool by hydrolysis before it can be incorporated into DNA (Setoyama et al. 2011). It also encodes enzymes that correct DNA into which 8-oxo-dGTP has been incorporated, either through direct excision of the oxidised base (Tchou et al. 1991), or via removal of the adenosine that pairs with 8-oxo-dGTP, thus providing a clear target for its subsequent removal (Michaels et al. 1992). The GO system therefore comprises both mutation avoidance and mutation correction mechanisms. This definition of stress is not completely ideal, as it allows any non-optimal factor to be called stress, which is probably an over-simplification, and it does not account for cases where the individual has high fitness but is in poor condition, such as an animal wearing itself out by getting many offspring into the next generation.

Theoretical models have shown that mutator alleles induced by stressful environments do not lower a population’s mean fitness in a constant, non-stressful environment. Indeed, they can actually be favoured and selected for in preference to non-mutator or constitutive mutator (elevated mutation rates irrespective of environmental conditions) alleles (Ram and Hadany 2012). This is because selection acts to lower mutation rates only in those individuals adapted to the environment (i.e. whose fitness is above the population mean), whilst increasing the rates in those individuals not yet adapted (i.e. whose fitness is below the population mean). This theoretical prediction that increased mutation rates under stress can be beneficial in a constant environment relies upon the assumption that back mutations will occur, where individuals are transient mutators showing mutation rate plasticity dependent upon the environment in which they find themselves (Figure 1.7).

Figure 1.7 Environmental adjustment of mutation rate in response to stress. Here an environmental cue subjects a population to stress, causing an increased mutational state. This produces new variation in the population, including adapted individuals, who are then selected for and increase in number. This highlights how an increase in mutation rate in response to stress increases the probability of successful adaptation to the new selective pressure. Taken from (Massey and Buckling 2002).

It has been proposed that stress may affect the mutation rate through two distinct mechanisms (MacLean et al. 2013). The first is where the stressor directly increases an organism’s mutation rate by direct damage to DNA or inhibition of enzymes involved in DNA fidelity mechanisms (Friedberg 2006). This first mechanism has been termed Stress-associated mutagenesis (SAM), as the mutagenesis is associated with the effects of stress directly, and not with a resultant induction of genes expressed under stress. In this way, the association between stress and mutagenesis occurs purely by chance: there is no reason to expect the stressor to increase the mutation rate unless it affects DNA fidelity mechanisms. The second and more interesting mechanism is where the stressor indirectly induces mutagenesis by affecting the expression of genes that raise the mutation rate, for example by causing the expression of error-prone DNA polymerases (Foster 2007; Galhardo et al. 2007; Saint-Ruf et al. 2007). This has been termed Stress-induced mutagenesis (SIM) (Matic 2013), as the stressor induces mutagenesis rather than causing it directly as in SAM.

Bacteria have evolved many mechanisms to minimise the effects of stress (Foster 2005; Matic 2016). Two of the most well-studied stress responses are the SOS response (Friedberg 2006) and the General Stress Response (Battesti et al. 2011). I will now discuss how each of these affects mutation rates.

1.5.1.1 SOS Response

When a stressor damages DNA it creates single-stranded DNA (ssDNA), either directly, by the stressor damaging the DNA, or indirectly, for example by impeding DNA replication. This ssDNA is the signal that induces the SOS response (Figure 1.8) (Sassanfar and Roberts 1990), which minimises the lethality and mutagenicity of such damage (Friedberg 2006). The ssDNA is first bound by the recombinase RecA to form a nucleoprotein complex that induces the cleavage of the SOS repressor, LexA, thus inducing the SOS response (Little et al. 1980), which has at least 40 genes under its regulatory control (Fernández De Henestrosa et al. 2000; Courcelle et al. 2001; Foster 2007). Many of these encode enzymes whose roles include DNA repair and synthesis beyond replication-blocking lesions (Foster 2005).

Figure 1.8 SOS Response Mechanism. In benign conditions, LexA binds to the promoter region of the SOS regulon, inhibiting its induction (yellow ovals on SOS box). After DNA damage occurs, however, double strand breaks (DSBs) result in single stranded DNA (ssDNA). This ssDNA is then bound by the RecA recombinase (grey circles), which causes the cleavage of LexA from the SOS promoter region, leading to induction of the regulon and expression of the genes under its regulation. Once the DSB is repaired, RecA is no longer bound to ssDNA and so stops the cleavage of LexA dimers, resulting in the repression of the SOS regulon. (Taken from Qin et al. 2015).

There has been a recent discovery of a large and widely distributed group of error-prone DNA polymerases, the Y family (Ohmori et al. 2001; Goodman 2002), that are able to replicate damaged DNA, but at the cost of low-fidelity replication of not only damaged but also undamaged DNA templates. Two of these polymerases, Pol IV and Pol V, are induced as part of the SOS response (Foster 2007). In non-stressful environments neither Pol IV nor Pol V is important to the spontaneous mutation rate (Kuban et al. 2004, 2006), for two different reasons.

Since it is important for an organism to reduce mutation rates in non-stressful environments, it can be expected that levels of error-prone polymerases will be tightly regulated. Indeed this is true for Pol V, levels of which are barely detectable without induction of the SOS response (Foster 2007). Interestingly though, levels of Pol IV do not seem to be as tightly regulated, as they can reach 250 molecules per cell (Kim et al. 2001). Despite this large number of copies, Pol IV does not, rather puzzlingly, have a large effect on the spontaneous mutation rate under normal growth conditions in cells. However, Pol IV does contribute to the spontaneous mutation rate of genes on both a plasmid (Kuban et al. 2004) and bacteriophage (Brotcorne-Lannoye and Maenhaut-Michel 1986).

When induced via the SOS response, however, both Pol IV and Pol V are strong contributors to the spontaneous mutation rate. By using mutant strains that constitutively express the SOS response it is possible to demonstrate the relative effects of these two polymerases. Pol IV is responsible for a three-fold increase in mutation rates, compared to Pol V, which causes a ten-fold increase (Kuban et al. 2006). Without constitutive SOS expression, overproduction of both polymerases shows an even stronger effect, with Pol IV increasing mutation rates one hundred-fold (Kim et al. 1997), whilst a Pol V homolog inserted in a plasmid (used due to the difficulty in activating Pol V) increases the rate fifty-fold (Schlacher et al. 2006). The SOS response is not, however, induced only by direct DNA damage from external stressors. Other causes of DNA damage, such as replication mistakes and damage caused by intermediaries in metabolic cycles, can also prove to be strong inducers of the SOS response (O’Reilly and Kreuzer 2004). Since the SOS response is repressed by LexA, if cellular levels of LexA were to decrease, in aging colonies for instance (Taddei et al. 1995), then this would also cause an SOS response without direct stimulation by DNA damage. Thus the SOS response can be involved in both mechanisms (SAM and SIM) of mutagenesis in response to stress.

1.5.1.2 General Stress Response

When bacteria either enter stationary phase or experience nutrient limitation, a general stress response is induced (Foster 2007; Battesti et al. 2011). It is termed the general stress response because there is no single specific inducer, such as a lack of a particular amino acid or sugar. The general stress response is regulated by a sigma factor (a subunit of RNA polymerase) called RpoS (also known as σ38) (Battesti et al. 2011). During the exponential growth phase, the majority of RNA polymerase molecules contain the housekeeping sigma factor (RpoD), and RpoS has its levels and activity tightly controlled. Upon encountering stressful conditions, however, both are upregulated and the holoenzyme containing RpoS, called Eσ38, becomes widespread, inducing the transcription of genes that encode survival-enhancing proteins. At present approximately 500 genes (~10% of the total genome) have been identified in E. coli that are regulated (either directly or indirectly) by RpoS (Weber et al. 2005). This collection of genes is known as the RpoS (or σ38) regulon.

The general stress response possesses two mechanisms that allow it to increase spontaneous mutation rates in stressed bacteria (Foster 2007). The first mechanism is upregulation of the error-prone polymerase Pol IV. Within the first ten hours of E. coli cells reaching stationary phase, Pol IV levels increased three-fold (Foster 2007). Despite Pol IV also being regulated by the SOS response (see above), this regulation seems to be independent and solely controlled by RpoS. Activation of Pol IV in non-growing cells via the general stress response allows for adaptive changes in populations undergoing a long stationary phase (Yeiser et al. 2002). The second mechanism by which the general stress response augments spontaneous mutation rates is downregulation of the methyl-directed mismatch repair (MMR) mechanism (Foster 2007). The MMR mechanism is a crucial aspect of genetic fidelity, and is widely conserved across domains of life (Fukui 2010). In bacteria such as E. coli, the MMR machinery checks newly replicated strands of DNA and corrects any base mismatches before they become fixed (Kunkel and Erie 2005; Li 2008). It is able to distinguish newly synthesised strands from the template strand because the template strand is methylated whereas newly synthesised strands are transiently unmethylated, hence the methyl-directed portion of its name. MMR also maintains species’ genetic integrity by preventing recombination between non-identical DNA sequences (Matic et al. 1995). MMR proteins maintain their activity during stationary phase (Foster 1999), but two proteins in the MMR mechanism, MutH and MutS, are downregulated by RpoS (Tsui et al. 1997).

The majority of mutator strains found in both nature and laboratories possess a defective MMR mechanism (LeClerc et al. 1996; Matic et al. 1997; Denamur and Matic 2006; Matic 2016), primarily due to inactivation of either the mutS or mutL gene (Denamur and Matic 2006). The proteins encoded by these two genes, MutS and MutL, have been highly conserved throughout evolutionary history (Fukui 2010; Matic 2013). The MutS protein can recognise seven of the eight possible base mismatches, with C-C, the least frequent mismatch error, being the sole unrecognisable mismatch. The MutL protein meanwhile couples the MutS complex (the MutS protein bound to the mismatched section of DNA) with other proteins in the repair mechanism. Inactivation of either mutS or mutL greatly increases mutation rates, with a hundred-fold increase in the rate of transitions and up to a thousand-fold increase in the rate of frameshifts (Denamur and Matic 2006). Conversely, overexpression of MutS during stationary phase reduces the mutation rate, with another study finding a further reduction in mutation rates when MutS and MutL are overexpressed concurrently (Denamur and Matic 2006). Interestingly, other studies have found that overexpression of MutL alone is sufficient (Denamur and Matic 2006).

1.5.2 Density Associated Mutation Rate Plasticity

Many of the previously stated causes of MRP due to stress, either genetic or environmental, are confounded by direct mutagenic effects or physiological responses induced by the environment. Additionally, mathematical models suggest that mutation rates should also be associated with the fitness of an organism: organisms with higher fitness should minimise the mutation rate (Belavkin et al. 2016).

To uncouple these confounding factors and test this theoretical association, the mutation rate of wild-type E. coli was investigated under non-selective conditions using fluctuation tests (Krašovec et al. 2014a). This confirmed the theory by finding such an inverse association with absolute fitness (number of generations per day, Figure 1.9a). An organism’s fitness is made up of various components, such as the initial population size, nutrient availability and population density. After separating these components, it was found that mutation rate was primarily affected by final population density (Figure 1.9b), where the two are again inversely related: mutation rate increases with a decrease in population density. With the realisation

that population density affects mutation rate, this opens the possibility that the underlying mechanism is due to quorum sensing, a phenomenon that has been shown to control numerous other microbial traits (Krašovec et al. 2014b; Whiteley et al. 2017).

Figure 1.9 Mutation rate plasticity in Escherichia coli strains. A Relationship between the absolute fitness (measured in generations per day) of wild-type E. coli cells and the rate of mutations in their rpoB gene, encoding resistance to the antibiotic rifampicin. In accordance with results from theoretical studies, mutation rate can be inversely related with fitness: cells with higher fitness showed a decreased mutation rate. B shows that of all the components to make up an organism’s level of fitness, the sole important factor involved in this inverse association is the final population density that the culture
coliE.K-12, coli B and strains (b) to eitherfinal ancestral or evolved in minimal glucose 17 explore variation in mutation rate withindent a single variable, genotype,wabs, is we affectedpopulation by many density parameters, (D) in E. coli includingB strains. In amedium, the line is for the 20,000 fitted curve generations . We conducted experiments 15 reaches. Taken from (Krašovec et al. 2014a) used E. coli K-12 cells in classical fluctuationculture volume, assays inoculum. These size,(log viability,2(m) 7.9 productivity0.39 wabs) and from nutrient Model 1 (seein Methods). which conditions In b, dark and were varied by manipulating [glc] (50– 16 ¼ À  1 assays identify mutational events in theavailabilityrpoB gene. by In counting addition, simplylight blue modifying indicate, respectively, available nutrientsthe Ara À (REL606)1,500 mg and l AraÀ ),þ the(REL607) strains paired (ancestral or evolved), culture cells resistant to the antibiotic rifampicinconfounds (RifR different) arising potentially in ancestral causal B strains, effects, and such red as indicates changes the in strainvolume evolved (1, for 1.5, 20,000 10 and 15 ml) and culture period (B24 or 45 h). the absence of rifampicin (non-selectivethe environment). time spent Different in differentgenerations phases of (REL8593A). the culture Circles cycle. are monocultures, It is We fittedsquares a are linear cocultures; model for mutation rate containing each 1 amounts of nutrient were provided (50–1000therefore mg l À unclearof glucose), which factorthin lines or factors link estimates are determining from two strains the in theof same the coculture. factors The testing line is hypotheses (i–iv) above and their allowing cells to achieve different numbersobserved of generations MRP. To per identify day suchthe fitted factors, curve we (log sought2(m) to15 manipulate4.7 log2(D))interactions, from Model 2. sequentially Note that removing non-significant effects. 
We Quorum sensing is responsible for a large number of density¼ À  -dependent behaviours (that is, different absolute fitness, wabsseveral). We find of them variation independently of mutation of each rate and other population within density the same axes arefound logarithmic. that the only significant effect on mutation rate among experiment. Specifically, we tested four non-mutually exclusive those tested was final population density (D)alone(Fig.1b):a mutation rate related to changing wabs (Fig. 1a): mutation such as virulence and biofilm formation (Waters and Bassler 2005). One of the most rate doubles with every reduction inhypothesesw by about2.6 (1.9–3.9, this MRP it is (i) a direct response to nutrient reduction in D of 77% (61–96%, 95% CI) gives a doubling in abs the number of cell divisions, (iii) final population density (D), 95% confidence interval (CI)) generations(glucose) per concentration; day (testing (ii) an effect intrinsic to the strain (for mutation rate (testing slope by ANOVA: N 80, F1,43 14, or (iv) relativeimportant fitness and (w conserved rel). As before, genes we involved assayed4 in mutation quorum sensing is the luxS¼ gene ¼ slope by analysis of variance (ANOVA):example,N cellular30, F1,28 ageing35, or different cumulativeR effects of stress P 6.0 10 À ; Model 2 in Methods; see Supplementary Figs 1–3 6 ¼ ¼ rates to Rif , except this time we used cocultures¼  of two strains. P 2.4 10 À ; Model 1 in Methods). because of different amounts of time or numbers of divisions in for equivalent plots of the other three factors and Supplementary ¼  different phases of the cultureThis cycle); allowed(Vendeville et al. 2005; Pereira et al. 2013) (iii) us related to test to strains population with differentNote 1 for fitnesses testsluxS plays a significant role in the creation of in modelling the assumptions). In other words, we density; or (iv) related to thesame competitiveness environment at of the the same biological time. 
Tofind accomplish strong evidencethis, we used for an effect of final population density on MRP mediated by population density. In Fig. 1a, the indepen- E. coli B strains either ancestral or evolved in minimal glucose environment. These hypotheses each make different predictions;17 mutation rate (hypothesis (iii) above), but no such evidence dent variable, wabs, is affected by manyspecifically, parameters, under including each hypothesis,medium for respectively, 20,000 generations we expect. Wesupporting conducted alternative experiments hypotheses (numbers (i), (ii) and (iv) culture volume, inoculum size, viability, productivity and nutrient in which conditions were varied by manipulating50 [glc] (50– 16 mutation rate to relate to (i) glucose concentration1 ([glc]), (ii) above). availability . In addition, simply modifying available nutrients 1,500 mg l À ), the strains paired (ancestral or evolved), culture confounds different potentially causal effects,2 such as changes in volume (1, 1.5, 10 andNATURE 15 ml) COMMUNICATIONS and culture period| 5:3742 (B | DOI:24 or 10.1038/ncomms4742 45 h). | www.nature.com/naturecommunications the time spent in different phases of the culture cycle. It is We fitted a linear model for mutation rate containing each & 2014 Macmillan Publishers Limited. All rights reserved. therefore unclear which factor or factors are determining the of the factors testing hypotheses (i–iv) above and their observed MRP. To identify such factors, we sought to manipulate interactions, sequentially removing non-significant effects. We several of them independently of each other within the same found that the only significant effect on mutation rate among experiment. 
Specifically, we tested four non-mutually exclusive those tested was final population density (D)alone(Fig.1b):a hypotheses about this MRP it is (i) a direct response to nutrient reduction in D of 77% (61–96%, 95% CI) gives a doubling in (glucose) concentration; (ii) an effect intrinsic to the strain (for mutation rate (testing slope by ANOVA: N 80, F1,43 14, 4 ¼ ¼ example, cellular ageing or different cumulative effects of stress P 6.0 10 À ; Model 2 in Methods; see Supplementary Figs 1–3 because of different amounts of time or numbers of divisions in for¼ equivalent plots of the other three factors and Supplementary different phases of the culture cycle); (iii) related to population Note 1 for tests of modelling assumptions). In other words, we density; or (iv) related to the competitiveness of the biological find strong evidence for an effect of final population density on environment. These hypotheses each make different predictions; mutation rate (hypothesis (iii) above), but no such evidence specifically, under each hypothesis, respectively, we expect supporting alternative hypotheses (numbers (i), (ii) and (iv) mutation rate to relate to (i) glucose concentration ([glc]), (ii) above).

2 NATURE COMMUNICATIONS | 5:3742 | DOI: 10.1038/ncomms4742 | www.nature.com/naturecommunications & 2014 Macmillan Publishers Limited. All rights reserved. of quorum sensing signals (Xavier and Bassler 2003; Walters and Sperandio 2006), suggesting its possible role in this relationship. Its role was indeed confirmed in this mutation rate plasticity as mutation rates in relation to density for ΔluxS mutants had a slope not significantly different to zero (Krašovec et al. 2014a). This reinforces the conclusion that this plasticity is density-dependent and regulated by luxS.
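The notion of a slope relating mutation rate to density can be made concrete with a minimal sketch. It assumes, purely for illustration and not as the published analysis, that the plasticity is summarised by an ordinary least-squares slope of log2 mutation rate against log2 final population density. The function names and the data values are hypothetical. Under this assumption a slope near zero corresponds to no density dependence, and a negative slope can be re-expressed as the fractional fall in density that doubles the mutation rate.

```python
import math


def fit_loglog_slope(densities, rates):
    """Least-squares fit of log2(rate) = intercept + slope * log2(density)."""
    xs = [math.log2(d) for d in densities]
    ys = [math.log2(r) for r in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def density_reduction_for_doubling(slope):
    """Fractional fall in density that doubles the mutation rate (slope < 0)."""
    if slope >= 0:
        raise ValueError("a doubling with falling density needs a negative slope")
    # A rise of 1 in log2(rate) needs a change of 1/slope in log2(density).
    return 1.0 - 2.0 ** (1.0 / slope)


# Hypothetical, noise-free data generated with a true slope of -0.5.
densities = [1e8, 2e8, 4e8, 8e8, 1.6e9]
rates = [2.0 ** (-20.0 - 0.5 * math.log2(d)) for d in densities]

slope, intercept = fit_loglog_slope(densities, rates)
print(round(slope, 3), round(density_reduction_for_doubling(slope), 3))
```

In a real analysis the slope would carry a confidence interval, and it is that interval, rather than the point estimate, that would show whether a mutant's slope differs from zero.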

luxS encodes an enzyme active in the activated methyl cycle (Winzer et al. 2003), where it splits S-ribosyl homocysteine (SRH) to form homocysteine (HCY) and DPD (4,5-dihydroxy-2,3-pentanedione). DPD is subsequently used to produce the quorum-sensing signal autoinducer-2 (AI-2), whilst HCY is used in the activated methyl cycle and is the precursor to a number of molecules, including the alternative, structurally unidentified signal autoinducer-3 (AI-3). Which luxS role, signalling or metabolic, is involved in this plasticity was investigated by adding to the media either synthetic DPD, to allow AI-2 production, or aspartate, which is metabolised to produce HCY. The density-dependent relationship was restored by addition of aspartate but not synthetic DPD, indicating that it is the metabolic action of luxS in the activated methyl cycle that is responsible for the observed relationship (Krašovec et al. 2014a).

This finding was supported by the consistency in MRP between the K-12 and B strains of E. coli. Compared to K-12, E. coli B strains lack much of the lsr operon that is utilised in both the recognition and detection of AI-2. Indeed, a ΔlsrK mutant that lacks a kinase needed for uptake and processing of AI-2 shows MRP not significantly different to wild-type K-12.

The effect of the luxS deletion is, however, still due to some form of environmental cue, as shown by co-culturing K-12 cells with either other K-12 or ΔluxS mutant cells (Figure 1.10) (Krašovec et al. 2014a). Those K-12 cells in the presence of ΔluxS mutant cells had a greater mutation rate (an average increase of a third) than those in the presence of wild-type K-12 cells, despite overall population density being similar. Thus, the available evidence indicates that this socially dependent plasticity is a result of the role luxS plays in the activated methyl cycle and is controlled via environmental cues.

Figure 1.10 Modulation of mutation rate depends upon environmental cues associated with luxS in Escherichia coli. Mutation rate in wild-type E. coli cells depends upon the strain of E. coli they are co-cultured with, independent of population density effects. Wild-type cells exhibit a higher average mutation rate (solid black line represents median value) when co-cultured with ΔluxS mutant cells than when co-cultured with other wild-type cells. This indicates that some form of sensing of the social environment is an important factor in modulating mutation rates in E. coli. Taken from (Krašovec et al. 2014a).

This socially mediated plasticity is believed to be functionally independent of previous examples of SIM (Krašovec et al. 2014a). The evidence for this comes from understanding the expression of genes involved in the stress responses. As mentioned earlier, much of the increase in mutation rate due to stress relates to the use of error-prone polymerases (Galhardo et al. 2007) and the downregulation of mutation repair proteins (Saint-Ruf et al. 2007), both of which are controlled by the RpoS sigma factor, which is expressed during the later stages of the growth cycle (Battesti et al. 2011). We would therefore expect maximum rpoS expression to occur at high densities, giving a positive relationship between mutation rate and population density, which is the opposite of the relationship that was found.

More evidence comes from investigating published E. coli expression data (Meysman et al. 2014) where, as expected, expression of the stress responses is positively correlated with error-prone polymerase expression and negatively correlated with mismatch repair gene expression (Krašovec et al. 2014a). Interestingly though, contrary to what would be expected if this relationship were related to SIM, the general stress response and error-prone polymerase expression positively correlate with expression of genes induced by increasing density. These two pieces of evidence, when taken with previously published reports that RpoS expression is not functionally linked to density when cells are grown in minimal media (Ihssen and Egli 2004), as in this study, strongly suggest that social MRP is indeed independent of previous examples of SIM.

1.5.2.1 Quorum Sensing

It is widely accepted that bacterial species are able to co-ordinate their gene expression in relation to their population density through chemical cell-cell signals (West et al. 2006, 2012; Schuster et al. 2013), where a response is only initiated when the signal concentration reaches a threshold (Williams et al. 2007). This behaviour is termed quorum sensing (QS) and has been shown to control a wide variety of traits in a diverse array of species (West et al. 2012).

These signals have a dual function. First, they are taken up by other members of the population and regulate the gene expression of the receiver. Second, they are also taken up by the producer, creating a positive feedback that increases the producer's own signal production. In this way, these signals are said to be both autoinducing and autoregulating. The presence of this positive feedback loop leads to the hypothesis that the benefits of quorum sensing are density dependent, with greater benefits at higher densities (West et al. 2006; Diggle et al. 2007; Williams et al. 2007). This was supported experimentally using the opportunistic pathogen Pseudomonas aeruginosa: the fitness benefits of utilising QS were greater at higher population densities, and this was not due to reduced QS at lower densities (Darch et al. 2012). The view that these signals are adapted molecules that evolved for cell-cell communication has, however, been challenged (Redfield 2002; West et al. 2012). It has been proposed that they are instead used to assess the rate at which they diffuse away from their producer cell (signaller). This has been termed ‘Diffusion Sensing’ (DS) (Redfield 2002) and proposes that, by providing information about the environmental diffusion rate, these molecules allow their producers to identify the optimal time to release extracellular factors for the greatest benefit.

Quorum sensing (QS) and DS are not in direct opposition, because a role for molecules in signalling between individuals, as in QS, does not exclude diffusion playing a key role in the process. Indeed, there is a growing body of empirical evidence supporting the importance of both social cell-cell interactions and diffusion rates in QS regulation (Dulla and Lindow 2008; Yang et al. 2010). Simulations have recently suggested that it is possible to resolve both the social and physical environment by utilising signals with different decay rates in different environments, resulting in combinatorial responses to the signal concentrations (Cornforth et al. 2014). These simulations are supported by evidence in Pseudomonas aeruginosa, where the decay rates of its two primary signal molecules are significantly different in different environments. Empirical evidence also shows that genes involved in secretion of such molecules are synergistically controlled by so-called ‘AND-gate’ responses to multiple signals, meaning that secretion of such molecules is effective.
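The logic of an ‘AND-gate’ response to two signals can be illustrated with a toy sketch. This is not the model of Cornforth et al. (2014). It assumes that each signal accumulates in proportion to population density and is lost through both chemical decay and physical removal (mass transfer), so that its steady-state concentration is production × density / (decay + mass transfer). All function names, parameter values and thresholds are hypothetical.

```python
def steady_state_signal(production, density, decay, mass_transfer):
    """Steady state of dS/dt = production*density - (decay + mass_transfer)*S."""
    return production * density / (decay + mass_transfer)


def and_gate_response(density, mass_transfer,
                      stable_threshold=10.0, fragile_threshold=2.0):
    """Respond only if BOTH signals exceed their own threshold."""
    # Two hypothetical signals that differ only in their chemical decay rate.
    stable = steady_state_signal(1.0, density, decay=0.1,
                                 mass_transfer=mass_transfer)
    fragile = steady_state_signal(1.0, density, decay=1.0,
                                  mass_transfer=mass_transfer)
    return stable >= stable_threshold and fragile >= fragile_threshold


# Hypothetical environments: (population density, mass-transfer rate).
scenarios = {
    "dense, enclosed": (2.5, 0.05),
    "dense, flushed": (2.5, 0.5),
    "sparse, enclosed": (1.5, 0.01),
}
for label, (density, mass_transfer) in scenarios.items():
    print(label, and_gate_response(density, mass_transfer))
```

In this sketch the sparse, enclosed population would trigger a rule based on the stable signal alone, because little signal is lost, but the fragile signal stays below its threshold. Requiring both signals therefore restricts the response to conditions of high density and low mass transfer.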

Another explanation for QS has come in the form of ‘competition sensing’ (CS) (Cornforth and Foster 2013). This hypothesises that bacterial stress responses react to the detection of ecological interference competition, where one individual reduces another’s fitness (Hibbing et al. 2010). Thus, the bacterium detects harmful factors affecting the cell, or the effects they produce, and instigates the stress responses for protection. CS thus differs from QS since it detects the effects of ecological competition rather than specific signals produced by a specific genotype. This has been highlighted as being advantageous (Cornforth and Foster 2013): by responding to harm rather than signals, CS is non-specific and so does not induce the expenditure of resources against a non-harmful genotype. CS and QS are not completely conflicting, however, as they may both provide information on the optimal time to secrete density-dependent products.

1.6 Summary of Thesis

This thesis comprises four distinct but linked experimental chapters investigating the prevalence, evolution and mechanism of mutation rate plasticity in relation to population density. I will now outline the aims and content of each of the following four experimental chapters individually and explain the contributions I made to each chapter.

1.6.1 Chapter 2: Spontaneous mutation rate is a plastic trait associated with population density across domains of life

The scope of this chapter is to investigate the prevalence and mechanism of Density Associated Mutation Rate Plasticity (DAMP). This is investigated through three avenues: i) compiling mutation rates estimated using fluctuation tests since their inception by Luria and Delbrück, to test for an association between mutation rate and population density in the published literature; ii) empirically testing for an association between mutation rate and population density in strains of bacteria and yeast, to explore the prevalence of the association between closely and distantly related species; and iii) carrying out fluctuation tests in mutants deficient in mutation repair proteins, to identify any genes involved in the molecular mechanism of this mutation rate plasticity. First, a strong negative association between mutation rate and population density is found in the last 75 years of published literature. This comprises almost 500 mutation rate estimates collected in species from all domains of life, including viruses. Once all previously known factors that affect the variance of the mutation rate estimate have been accounted for, population density is able to account for 93% of the remaining previously unidentified variation. This association is then tested empirically at multiple loci in species from different domains of life, showing that DAMP is present but variable in both pro- and eukaryotes. In Escherichia coli, DAMP is present to the same degree at two separate loci in the genome: rpoB, targeted by rifampicin, and gyrA, targeted by nalidixic acid. DAMP is, however, not evident in a closely related gammaproteobacterium, Pseudomonas aeruginosa, showing that this plasticity has evolved between closely related species. At a broader evolutionary level, DAMP is evident to varying degrees in three strains of Saccharomyces cerevisiae. Perhaps most intriguingly, DAMP in both these domains relies upon a protein scavenging a mutagenic oxidised nucleotide (8-oxo-dGTP). Such a prevalent association and conserved mechanism suggest that mutation rate has varied with population density since the early origins of life.
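The fluctuation tests referred to above estimate mutation rate from the distribution of mutants across parallel cultures. As a minimal sketch, the classical p0 method takes the fraction of cultures containing no mutants as the zero class of a Poisson distribution, so that the expected number of mutational events per culture is m = −ln(p0), and dividing by the final number of cells gives an approximate per-cell rate. This is only the simplest available estimator and is not necessarily the one used for the estimates in this thesis. The function name and all counts below are hypothetical.

```python
import math


def p0_mutation_rate(mutant_counts, final_cells):
    """Approximate mutation rate per cell using the p0 method.

    mutant_counts: resistant colonies observed in each parallel culture.
    final_cells: final number of cells per culture (assumed equal across cultures).
    """
    cultures = len(mutant_counts)
    without_mutants = sum(1 for count in mutant_counts if count == 0)
    if without_mutants == 0 or without_mutants == cultures:
        raise ValueError("p0 method needs some, but not all, cultures without mutants")
    p0 = without_mutants / cultures
    mutations_per_culture = -math.log(p0)  # zero class of a Poisson distribution
    return mutations_per_culture / final_cells


# Hypothetical colony counts from ten parallel cultures per treatment.
low_density = [0, 1, 0, 3, 0, 0, 12, 0, 1, 0]
high_density = [0, 2, 0, 1, 0, 45, 0, 3, 1, 0]

rate_low = p0_mutation_rate(low_density, final_cells=2e8)
rate_high = p0_mutation_rate(high_density, final_cells=2e9)
print(rate_low > rate_high)
```

With these invented counts the lower-density cultures give the higher estimated rate per cell, which is the direction of the association described in this chapter.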

The work presented in this chapter has recently been published in PLOS Biology:

Krašovec R*, Richards H*, Gifford DR, Hatcher C, Faulkner KJ, Belavkin RV, et al. (2017) Spontaneous mutation rate is a plastic trait associated with population density across domains of life. PLoS Biol 15(8): e2002731 (*denotes co-first authors).

For ease of reading, this chapter has been re-formatted from the published manuscript so that figures and tables (not including the tables related to the raw data, S1 Table and S4 Table) appear where they are mentioned in the text. The numbering of the figures has also been changed from the figure number given in the published manuscript to the order in which they appear in the text. The supplementary information submitted with the manuscript is attached in the appendix at the end of the chapter, along with the acknowledgments included in the publication. Abbreviations from the paper have been moved to the general list of abbreviations towards the beginning of this thesis.

1.6.1.1 Author Contributions: HR was responsible for the collection of published mutation rates, both directly from papers and through correspondence with authors, and for their overall statistical analysis, including production of the phylogeny used. HR also conducted fluctuation tests in E. coli MG1655 (for resistance to both rifampicin and nalidixic acid) and S. cerevisiae S288c. HR also wrote the manuscript alongside RK and CGK, with comments from co-authors; RK conducted fluctuation tests in E. coli MG1655 and P. aeruginosa PAO1 (for resistance to rifampicin), S. cerevisiae (both sigma and BY1472 strains) and E. coli knockout mutants (for resistance to both rifampicin and nalidixic acid), and analysed these results; DG conducted analysis of whole genome sequencing; CH conducted fluctuation tests in E. coli MG1655 (resistance to rifampicin); KJF conducted fluctuation tests in E. coli MG1655 (resistance to rifampicin).

1.6.2 Chapter 3: Evolution of Density Associated Mutation Rate Plasticity in two disparate species of Archaea

The scope of this chapter is to investigate the evolution of DAMP in the final domain of life, the Archaea. The work in Chapter 2 proposes that DAMP also occurs within the Archaea. However, this suggestion was not empirically tested in that work. Here I fill this gap in our knowledge by estimating mutation rates in two evolutionarily distant species of Archaea, Sulfolobus acidocaldarius and Haloferax volcanii. DAMP is present in both species, but to varying degrees, further supporting the conclusions drawn in the first experimental chapter regarding DAMP’s ancient evolutionary origins. Furthermore, these two species possess putative proteins that carry out the same mutation avoidance mechanism that modulates DAMP, suggesting that not only is DAMP widespread and ancient, but so might be its mechanism.

1.6.2.1 Author Contributions: HR was responsible for designing, carrying out and analysing all experiments within this chapter. HR also wrote the chapter, which incorporated comments from both PhD supervisors.

1.6.3 Chapter 4: Evolution of Density Associated Mutation Rate Plasticity within strains of Escherichia coli

This chapter investigates the evolution of DAMP at the fine evolutionary scale of strains within a species. This rounds out the previous two chapters and completes the investigation of DAMP’s evolution across different evolutionary scales. I find that there is significant variation in both mutation rate and DAMP between the isolates, showing that DAMP has also evolved at fine evolutionary scales. Utilising the whole genome sequences of these strains, I build a phylogeny to investigate whether there is any association between phylogenetic relatedness and the level of mutation rate and/or DAMP. I find evidence for a slight phylogenetic signal in DAMP but not in the average mutation rate, showing that more closely related strains exhibit a similar degree of DAMP, though not to the degree expected from the phylogenetic tree. Because this phylogenetic relationship is weaker than expected, both mutation rate and DAMP appear to be fast-evolving traits. Furthermore, there is evidence for an interaction between average mutation rate and DAMP, with strains exhibiting low mutation rates also exhibiting high degrees of DAMP. This hints at an adaptive role for DAMP in providing organisms with low mutation rates a route to increased novelty production. These results combine to support the findings in the first experimental chapter of a wide and evolutionarily ancient origin of DAMP, which has evolved both within species and between distantly related species across the tree of life, whilst also proposing a speed and cause for DAMP’s evolution.

1.6.3.1 Author Contributions: HR was responsible for designing, carrying out and analysing all experiments within this chapter. HR also wrote the chapter, which incorporated comments from both PhD supervisors.

1.6.4 Chapter 5: Highly conserved molecular mechanisms modulate density associated mutation rate plasticity in Escherichia coli

The scope of this chapter is to further our knowledge of the mechanisms and growth dynamics that affect the mutation rate and degree of DAMP. Following on from the first published article detailing a relationship between population density and mutation rate in E. coli (Krašovec et al. 2014a), the dependence of mutation rates on the intercellular environment is further explored. By using co-cultures of E. coli strains, it becomes evident that the mutation rate depends upon both the strain tested and the strain it is co-cultured with. The chapter then moves on to investigate the role of other Nudix hydrolase genes in modulating DAMP. This work discovers that there are three additional Nudix hydrolase genes (nudF, nudI and nudJ) involved in modulating DAMP in E. coli. In addition to their roles in cleansing the intracellular nucleotide pool, nudF and nudJ play roles in metabolic pathways, supporting previous findings of a metabolic role in modulating DAMP (Krašovec et al. 2014a). The chapter ends by investigating the role of the growth cycle in controlling the degree of DAMP exhibited. This finds that mutation rate is affected by time spent in both lag and stationary phase, though the stationary-phase result is confounded by the fixed length of the experiments conducted. Time in lag phase also affects the degree of DAMP, with longer lag times resulting in flatter DAMP slopes. Whilst DAMP is associated with an aspect of the growth curve, it is evident from late lag phase, suggesting that the mechanisms involved in its control are expressed across the breadth of the culture cycle. Mutation rate is also associated with the level of the intracellular nucleotide ATP, supporting evidence from both earlier in the chapter and Chapter 2 that the intracellular nucleotide pools play a role in determining the mutation rate.

1.6.4.1 Author Contributions: HR was responsible for designing, carrying out and analysing all experiments within this chapter. HR also wrote the chapter, which incorporated comments from both PhD supervisors.

1.6.5 Discussion Chapter

This chapter first summarises the findings of each experimental chapter before discussing these findings and suggesting future work that would further test them and expand our knowledge of their implications.

1.6.5.1 Author contributions: This chapter was written by HR with comments from supervisors incorporated.

1.7 References

Agrawal, A. F. 2002. Genetic loads under fitness-dependent mutation rates. J. Evol. Biol. 15:1004–1010.

Alberch, P. 1991. From genes to phenotype: dynamical systems and evolvability. Genetica 84:5–11.

Alexander, H. K., S. I. Mayer, and S. Bonhoeffer. 2017. Population Heterogeneity in Mutation Rate Increases the Frequency of Higher-Order Mutants and Reduces Long-Term Mutational Load. Mol. Biol. Evol. 34:419–436. Oxford University Press.

Ascierto, P. A., and F. M. Marincola. 2011. Combination therapy: the next opportunity and challenge of medicine. J. Transl. Med. 9:115.

Baer, C. F., M. M. Miyamoto, and D. R. Denver. 2007. Mutation rate variation in multicellular eukaryotes: causes and consequences. Nat. Rev. Genet. 8:619–31.

Barrick, J. E., D. S. Yu, S. H. Yoon, H. Jeong, T. K. Oh, D. Schneider, R. E. Lenski, and J. F. Kim. 2009. Genome evolution and adaptation in a long-term experiment with Escherichia coli. Nature 461:1243–1247. Nature Publishing Group.

Barton, N. H., D. E. Briggs, J. A. Eisen, D. Goldstein, and N. H. Patel. 2007. Evolution. Cold Spring Harbor Laboratory Press, New York, NY.

Battesti, A., N. Majdalani, and S. Gottesman. 2011. The RpoS-mediated general stress response in Escherichia coli. Annu. Rev. Microbiol. 65:189–213.

Behjati, S., M. Huch, R. Van Boxtel, W. Karthaus, D. C. Wedge, A. U. Tamuri, I. Martincorena, M. Petljak, L. B. Alexandrov, G. Gundem, P. S. Tarpey, S. Roerink, J. Blokker, M. Maddison, L. Mudie, B. Robinson, S. Nik-Zainal, P. Campbell, N. Goldman, M. Van De Wetering, E. Cuppen, H. Clevers, and M. R. Stratton. 2014. Genome sequencing of normal cells reveals developmental lineages and mutational processes. Nature 513:422–425.

62 Belavkin, R. V., A. Channon, E. Aston, J. Aston, R. Krašovec, and C. G. Knight. 2016.

Monotonicity of fitness landscapes and mutation rate control. J. Math. Biol.

73:1491–1524.

Bergman, J. M., M. Wrande, and D. Hughes. 2014. Acetate availability and utilization

supports the growth of mutant sub-populations on aging bacterial colonies. PLoS

One 9:e109255.

Bjedov, I., O. Tenaillon, B. Gerard, V. Souza, E. Denamur, M. Radman, F. Taddei, and I.

Matic. 2003. Stress-Induced Mutagenesis in Bacteria. Science (80-. ). 300:1404–

1409.

Boe, L. 1992. Translational errors as the cause of mutations in Escherichia coil. Mol.

Gen. Genet. 231:469–471.

Bridges, B. A. 2001. Hypermutation in bacteria and other cellular systems. Philos.

Trans. R. Soc. Lond. B. Biol. Sci. 356:29–39.

Brotcorne-Lannoye, A., and G. Maenhaut-Michel. 1986. Role of RecA protein in

untargeted UV mutagenesis of bacteriophage lambda: evidence for the

requirement for the dinB gene. Proc. Natl. Acad. Sci. U. S. A. 83:3904–8.

Campbell, C. D., and E. E. Eichler. 2013. Properties and rates of germline mutations in

humans.

Chao, L., and E. C. Cox. 1983. Competition between high and low mutating

strains of Escherichia coli. Evolution 37:125–134.

Charlesworth, D., B. Charlesworth, and M. T. Morgan. 1995. The pattern of neutral

molecular variation under the model. Genetics 141:1619–

1632.

Chen, C., H. Qi, Y. Shen, J. Pickrell, and M. Przeworski. 2017. Contrasting determinants

of mutation rates in Germline and Soma. Genetics 207:255–267.

Cheng, K. C., D. S. Cahill, H. Kasai, S. Nishimura, and L. A. Loeb. 1992. 8-Hydroxyguanine,

an abundant form of oxidative DNA damage, causes G→T and A→C

substitutions. J. Biol. Chem. 267:166–172.

Clune, J., D. Misevic, C. Ofria, R. E. Lenski, S. F. Elena, and R. Sanjuán. 2008. Natural

Selection Fails to Optimize Mutation Rates for Long-Term Adaptation on Rugged

Fitness Landscapes. PLoS Comput. Biol. 4:e1000187. Public Library of Science.

Clune, J., J. Mouret, and H. Lipson. 2013. The evolutionary origins of modularity. Proc.

R. Soc. B Biol. Sci. 280:20122863.

Codoñer, F. M., J.-A. Darós, R. V Solé, and S. F. Elena. 2006. The fittest versus the

flattest: experimental confirmation of the quasispecies effect with subviral

pathogens. PLoS Pathog. 2:e136.

Combe, M., and R. Sanjuán. 2014. Variation in RNA Virus Mutation Rates across Host

Cells. PLoS Pathog. 10.

Cornforth, D. M., and K. R. Foster. 2013. Competition sensing: the social side of

bacterial stress responses. Nat. Rev. Microbiol. 11:285–93. Nature Publishing

Group.

Cornforth, D. M., R. Popat, L. McNally, J. Gurney, T. C. Scott-Phillips, A. Ivens, S. P.

Diggle, and S. P. Brown. 2014. Combinatorial quorum sensing allows bacteria to

resolve their social and physical environment. Proc. Natl. Acad. Sci. 111:4280–

4284.

Courcelle, J., A. Khodursky, B. Peter, P. O. Brown, and P. C. Hanawalt. 2001.

Comparative Gene Expression Profiles Following UV Exposure in Wild-Type and

SOS-Deficient Escherichia coli. Genetics 158:41–64.

Cuevas, J. M., R. Geller, R. Garijo, J. López-Aldeguer, and R. Sanjuán. 2015. Extremely

High Mutation Rate of HIV-1 In Vivo. PLOS Biol. 13:e1002251. Public Library of

Science.

Darch, S. E., S. A. West, K. Winzer, and S. P. Diggle. 2012. Density-dependent fitness

benefits in quorum-sensing bacterial populations. Proc. Natl. Acad. Sci. U. S. A.

109:8259–63.

David, S. S., V. L. O’Shea, and S. Kundu. 2007. Base-excision repair of oxidative DNA

damage. Nature 447:941–950.

de Visser, J. A. G. M., T. F. Cooper, and S. F. Elena. 2011. The causes of epistasis. Proc.

R. Soc. B Biol. Sci. 278:3617–3624.

Denamur, E., and I. Matic. 2006. Evolution of mutation rates in bacteria. Mol.

Microbiol. 60:820–7.

Desai, M. M., D. S. Fisher, and A. W. Murray. 2007. The Speed of Evolution and

Maintenance of Variation in Asexual Populations. Curr. Biol. 17:385–394. Elsevier.

Diggle, S. P., A. Gardner, S. A. West, and A. S. Griffin. 2007. Evolutionary theory of

bacterial quorum sensing: when is a signal not a signal? Philos. Trans. R. Soc.

Lond. B. Biol. Sci. 362:1241–9.

Draghi, J. A., T. L. Parsons, G. P. Wagner, and J. B. Plotkin. 2010. Mutational robustness

can facilitate adaptation. Nature 463:353–5.

Drake, J. W. 2007. Too Many Mutants with Multiple Mutations. Crit. Rev. Biochem. Mol. Biol. 424.

Drake, J. W., A. Bebenek, G. E. Kissling, and S. Peddada. 2005. Clusters of mutations

from transient hypermutability. Proc. Natl. Acad. Sci. U. S. A. 102:12849–54.

National Academy of Sciences.

Drake, J. W., B. Charlesworth, D. Charlesworth, and J. F. Crow. 1998. Rates of

Spontaneous Mutation. Genetics 148:1667–1686.

Duffy, S., L. A. Shackelton, and E. C. Holmes. 2008. Rates of evolutionary change in

viruses: patterns and determinants. Nat. Rev. Genet. 9:267–276. Nature

Publishing Group.

Dulla, G., and S. E. Lindow. 2008. Quorum size of Pseudomonas syringae is small and

dictated by water availability on the leaf surface. Proc. Natl. Acad. Sci. U. S. A.

105:3082–7.

Eyre-Walker, A., and P. D. Keightley. 2007. The distribution of fitness effects of new

mutations. Nat. Rev. Genet. 8:610–8.

Fernández De Henestrosa, A. R., T. Ogi, S. Aoyagi, D. Chafin, J. J. Hayes, H. Ohmori, and

R. Woodgate. 2000. Identification of additional genes belonging to the LexA

regulon in Escherichia coli. Mol. Microbiol. 35:1560–1572.

Fisher, R. 1958. The Genetical Theory of Natural Selection. 2nd Edition. Dover, New

York.

Fisher, R. A. 1930. The genetical theory of natural selection. Clarendon Press, Oxford.

Foster, P. 2007. Stress-induced mutagenesis in bacteria. Crit. Rev. Biochem. Mol. Biol.

42:373–397.

Foster, P. L. 1999. Are adaptive mutations due to a decline in mismatch repair? The

evidence is lacking. Mutat. Res. 436:179–84.

Foster, P. L. 2006. Methods for determining spontaneous mutation rates. Methods

Enzym. 409:195–213.

Foster, P. L. 2005. Stress responses and genetic variation in bacteria. Mutat. Res.

569:3–11.

Friedberg, E. C. 2006. DNA repair and mutagenesis. ASM Press, Washington, D.C.

Fukui, K. 2010. DNA Mismatch Repair in Eukaryotes and Bacteria. J. Nucleic Acids

2010:1–16.

Furió, V., A. Moya, and R. Sanjuán. 2005. The cost of replication fidelity in an RNA virus.

Proc. Natl. Acad. Sci. U. S. A. 102:10233–7.

Furió, V., A. Moya, and R. Sanjuán. 2007. The cost of replication fidelity in human

immunodeficiency virus type 1. Proceedings. Biol. Sci. 274:225–30. The Royal

Society.

Galhardo, R. S., P. J. Hastings, and S. M. Rosenberg. 2007. Mutation as a stress

response and the regulation of evolvability.

Gentile, C. F., S.-C. Yu, S. A. Serrano, P. J. Gerrish, and P. D. Sniegowski. 2011.

Competition between high- and higher-mutating strains of Escherichia coli. Biol.

Lett. 7:422–4. The Royal Society.

Gibson, G., and I. Dworkin. 2004. Uncovering cryptic genetic variation.

Good, B. H., M. J. McDonald, J. E. Barrick, R. E. Lenski, and M. M. Desai. 2017. The

dynamics of molecular evolution over 60,000 generations. Nature, doi:

10.1038/nature24287.

Goodman, M. F. 2002. Error-prone repair DNA polymerases in prokaryotes and

eukaryotes. Annu. Rev. Biochem. 71:17–50.

Griswold, C. 2006. Pleiotropic mutation, modularity and evolvability. Evol. Dev. 93:81–

93.

Halliday, N. M., K. R. Hardie, P. Williams, K. Winzer, and D. A. Barrett. 2010.

Quantitative liquid chromatography-tandem mass spectrometry profiling of

activated methyl cycle metabolites involved in LuxS-dependent quorum sensing in

Escherichia coli. Anal. Biochem. 403:20–29.

Halligan, D. L., and P. D. Keightley. 2009a. Spontaneous Mutation Accumulation Studies

in Evolutionary Genetics. Annu. Rev. Ecol. Evol. Syst. 40:151–172.

Halligan, D. L., and P. D. Keightley. 2009b. Spontaneous Mutation Accumulation

Studies in Evolutionary Genetics. Annu. Rev. Ecol. Evol. Syst. 40:151–172.

Hansen, T. 2003. Is modularity necessary for evolvability? Remarks on the relationship

between pleiotropy and evolvability. Biosystems 69:83–94.

Hayden, E. J., E. Ferrada, and A. Wagner. 2011. Cryptic genetic variation promotes

rapid evolutionary adaptation in an RNA enzyme. Nature 474:92–95.

Hibbing, M. E., C. Fuqua, M. R. Parsek, and S. B. Peterson. 2010. Bacterial competition:

surviving and thriving in the microbial jungle. Nat. Rev. Microbiol. 8:15–25.

Hindré, T., C. Knibbe, G. Beslon, and D. Schneider. 2012. New insights into bacterial

adaptation through in vivo and in silico experimental evolution. Nat. Rev.

Microbiol. 10:352–65. Nature Publishing Group.

Ibáñez-Marcelo, E., and T. Alarcón. 2014. The topology of robustness and evolvability

in evolutionary systems with genotype-phenotype map. J. Theor. Biol. 356:144–

62. Elsevier.

Ihssen, J., and T. Egli. 2004. Specific growth rate and not cell density controls the

general stress response in Escherichia coli. Microbiology 150:1637–48.

Iyama, T., and D. M. Wilson. 2013. DNA repair mechanisms in dividing and non-dividing

cells. DNA Repair (Amst). 12:620–636.

Jacobs, K. L., and D. W. Grogan. 1997. Rates of spontaneous mutation in an archaeon

from geothermal environments. J. Bacteriol. 179:3298–3303.

Jee, J., A. Rasouly, I. Shamovsky, Y. Akivis, S. R. Steinman, B. Mishra, and E. Nudler.

2016. Rates and mechanisms of bacterial mutagenesis from maximum-depth

sequencing. Nature 534:693–696. Nature Publishing Group.

Kendal, W. S., and P. Frost. 1988. Pitfalls and practice of Luria-Delbrück fluctuation

analysis: a review. Cancer Res. 48:1060–1065. American Association for Cancer

Research.

Kim, S. R., G. Maenhaut-Michel, M. Yamada, Y. Yamamoto, K. Matsui, T. Sofuni, T.

Nohmi, and H. Ohmori. 1997. Multiple pathways for SOS-induced mutagenesis in

Escherichia coli: an overexpression of dinB/dinP results in strongly enhancing

mutagenesis in the absence of any exogenous treatment to damage DNA. Proc.

Natl. Acad. Sci. U. S. A. 94:13792–7.

Kim, S. R., K. Matsui, M. Yamada, P. Gruz, and T. Nohmi. 2001. Roles of chromosomal

and episomal dinB genes encoding DNA pol IV in targeted and untargeted

mutagenesis in Escherichia coli. Mol. Genet. 266:207–215.

Kirschner, M., and J. Gerhart. 1998. Evolvability. Proc. Natl. Acad. Sci. U. S. A. 95:8420–

8427.

Kohlmann, R., T. Bähr, and S. G. Gatermann. 2018. Species-specific mutation rates for

ampC derepression in Enterobacterales with chromosomally encoded inducible

AmpC β-lactamase. J. Antimicrob. Chemother. 73:1530–1536.

Kondrashov, F. A., and A. S. Kondrashov. 2010. Measurements of spontaneous rates of

mutations in the recent past and the near future. Philos. Trans. R. Soc. Lond. B.

Biol. Sci. 365:1169–76.

Krašovec, R., R. V Belavkin, J. A. D. Aston, A. Channon, E. Aston, B. M. Rash, M.

Kadirvel, S. Forbes, and C. G. Knight. 2014a. Mutation rate plasticity in rifampicin

resistance depends on Escherichia coli cell-cell interactions. Nat. Commun.

5:3742.

Krašovec, R., R. V Belavkin, J. A. D. Aston, A. Channon, E. Aston, B. M. Rash, M.

Kadirvel, S. Forbes, and C. G. Knight. 2014b. Where antibiotic resistance

mutations meet quorum-sensing. Microb. Cell 1:250–252.

Kuban, W., M. Banach-Orlowska, R. M. Schaaper, P. Jonczyk, and I. J. Fijalkowska. 2006.

Role of DNA polymerase IV in Escherichia coli SOS mutator activity. J. Bacteriol.

188:7977–80.

Kuban, W., P. Jonczyk, D. Gawel, K. Malanowska, R. M. Schaaper, and I. J. Fijalkowska.

2004. Role of Escherichia coli DNA polymerase IV in in vivo replication fidelity. J.

Bacteriol. 186:4802–7.

Kunkel, T. A., and D. A. Erie. 2005. DNA mismatch repair. Annu. Rev. Biochem.

74:681–710.

Lang, G. I. 2018. Measuring Mutation Rates Using the Luria-Delbrück Fluctuation Assay.

Pp. 21–31 in M. Muzi-Falconi and G. Brown, eds. Genome Instability. Methods in

Molecular Biology. Humana Press, New York, NY.

Lang, G. I., and A. W. Murray. 2008. Estimating the per-base-pair mutation rate in the

yeast Saccharomyces cerevisiae. Genetics 178:67–82.

Lang, G. I., D. P. Rice, M. J. Hickman, E. Sodergren, G. M. Weinstock, D. Botstein, and

M. M. Desai. 2013. Pervasive genetic hitchhiking and clonal interference in forty

evolving yeast populations. Nature 500:571–574. Nature Publishing Group.

Lauring, A. S., J. Frydman, and R. Andino. 2013. The role of mutational robustness in

RNA virus evolution. Nat. Rev. Microbiol. 11:327–36. Nature Publishing Group.

LeClerc, J., B. Li, W. Payne, and T. Cebula. 1996. High Mutation Frequencies Among

Escherichia coli and Salmonella Pathogens. Science 274:1208–1211.

Leigh, E. G., Jr. 1970. Natural Selection and Mutability. Am. Nat. 104:301–305.

Lenski, R. E., M. R. Rose, S. C. Simpson, and S. C. Tadler. 1991. Long-Term Experimental

Evolution in Escherichia coli. I. Adaptation and Divergence During 2,000 Generations. Am. Nat.

138:1315–1341.

Li, G.-M. 2008. Mechanisms and functions of DNA mismatch repair. Cell Res. 18:85–98.

Little, J. W., S. H. Edmiston, L. Z. Pacelli, and D. W. Mount. 1980. Cleavage of the

Escherichia coli lexA protein by the recA protease. Proc. Natl. Acad. Sci. U. S. A.

77:3225–9.

Loewe, L., and W. G. Hill. 2010. The population genetics of mutations: good, bad and

indifferent. Philos. Trans. R. Soc. B Biol. Sci. 365:1153–1167.

Luria, S. E., and M. Delbrück. 1943. Mutations of Bacteria from Virus Sensitivity to Virus

Resistance. Genetics 28:491–511.

Lynch, M. 2010a. Evolution of the mutation rate. Trends Genet. 26:345–52. Elsevier

Ltd.

Lynch, M. 2016. Mutation and Human Exceptionalism: Our Future Genetic Load. Genetics, doi:

10.1534/genetics.115.180471.

Lynch, M. 2010b. Rate, molecular spectrum, and consequences of human mutation.

Proc. Natl. Acad. Sci. U. S. A. 107:961–8.

Lynch, M. 2008. The cellular, developmental and population-genetic determinants of

mutation-rate evolution. Genetics 180:933–943. Genetics.

Lynch, M. 2011. The lower bound to the evolution of mutation rates. Genome Biol.

Evol. 3:1107–18.

Lynch, M., M. S. Ackerman, J.-F. Gout, H. Long, W. Sung, W. K. Thomas, and P. L. Foster.

2016. Genetic drift, selection and the evolution of the mutation rate. Nat. Rev.

Genet. 17:704–714.

MacLean, R. C., C. Torres-Barceló, and R. Moxon. 2013. Evaluating evolutionary models

of stress-induced mutagenesis in bacteria. Nat. Rev. Genet. 14:221–227.

Marcon, E., and P. B. Moens. 2005. The evolution of meiosis: Recruitment and

modification of somatic DNA-repair proteins. BioEssays 27:795–808. Wiley-

Blackwell.

Martin, G., and T. Lenormand. 2006. A general multivariate extension of

Fisher’s geometrical model and the distribution of mutation fitness

effects across species. Evolution 60:893–907.

Massey, R. C., and A. Buckling. 2002. Environmental regulation of mutation rates at

specific sites. Trends Microbiol. 10:580–584.

Matic, I. 2016. Molecular mechanisms involved in the regulation of mutation rates in

bacteria.

Matic, I. 2013. Stress-Induced Mutagenesis. P. 275 in D. Mittelman, ed. Stress-Induced

Mutagenesis. Springer New York, New York, NY.

Matic, I., M. Radman, F. Taddei, B. Picard, C. Doit, E. Bingen, E. Denamur, and J. Elion.

1997. Highly Variable Mutation Rates in Commensal and Pathogenic Escherichia

coli. Science 277:1833–1834.

Matic, I., C. Rayssiguier, and M. Radman. 1995. Interspecies gene exchange in bacteria:

The role of SOS and mismatch repair systems in evolution of species. Cell 80:507–

515.

Mazoyer, A., R. Drouilhet, S. Despréaux, and B. Ycart. 2017. flan: An R Package for

Inference on Mutation Models. R J. 9:334–351.

McDonald, M. J., Y. Y. Hsieh, Y. H. Yu, S. L. Chang, and J. Y. Leu. 2012. The evolution of

low mutation rates in experimental mutator populations of Saccharomyces

cerevisiae. Curr. Biol. 22:1235–1240.

McDonald, M. J., D. P. Rice, and M. M. Desai. 2016. Sex speeds adaptation by altering

the dynamics of molecular evolution. Nature 531:233–236. Nature Publishing

Group.

Meysman, P., P. Sonego, L. Bianco, Q. Fu, D. Ledezma-Tejeida, S. Gama-Castro, V.

Liebens, J. Michiels, K. Laukens, K. Marchal, J. Collado-Vides, and K. Engelen. 2014.

COLOMBOS v2.0: An ever expanding collection of bacterial expression compendia.

Nucleic Acids Res. 42.

Michaels, M. L., C. Cruz, A. P. Grollman, and J. H. Miller. 1992. Evidence that MutY and

MutM combine to prevent mutations by an oxidatively damaged form of guanine

in DNA. Proc. Natl. Acad. Sci. U. S. A. 89:7022–7025.

Michaels, M. L., and J. H. Miller. 1992. The GO System Protects Organisms from the

Mutagenic Effect of the Spontaneous Lesion 8-Hydroxyguanine (7,8-Dihydro-8-Oxoguanine).

J. Bacteriol. 174:6321–6325.

Milholland, B., X. Dong, L. Zhang, X. Hao, Y. Suh, and J. Vijg. 2017. Differences between

germline and somatic mutation rates in humans and mice. Nat. Commun.

8:15183. Nature Publishing Group.

Mokhtari, R. B., T. S. Homayouni, N. Baluch, E. Morgatskaya, S. Kumar, B. Das, and

H. Yeger. 2017. Combination therapy in combating cancer. Oncotarget 8:38022–38043.

Impact Journals.

Mott, M. L., and J. M. Berger. 2007. DNA replication initiation: Mechanisms and

regulation in bacteria.

Newcombe, H. B. 1948. Delayed Phenotypic Expression of Spontaneous Mutations in

Escherichia Coli. Genetics 33:447–476.

Novella, I. S., J. B. Presloid, C. Beech, and C. O. Wilke. 2013. Congruent evolution of

fitness and genetic robustness in vesicular stomatitis virus. J. Virol. 87:4923–8.

O’Reilly, E. K., and K. N. Kreuzer. 2004. Isolation of SOS constitutive mutants of

Escherichia coli. J. Bacteriol. 186:7149–60.

Ohmori, H., E. C. Friedberg, R. P. P. Fuchs, M. F. Goodman, F. Hanaoka, D. Hinkle, T. A.

Kunkel, C. W. Lawrence, Z. Livneh, T. Nohmi, L. Prakash, S. Prakash, T. Todo, G. C.

Walker, Z. Wang, and R. Woodgate. 2001. The Y-family of DNA Polymerases.

Oliver, A. 2000. High Frequency of Hypermutable Pseudomonas aeruginosa in Cystic

Fibrosis Lung Infection. Science 288:1251–1253.

Orr, H. A. 2000. Adaptation and the Cost of Complexity. Evolution 54:13–20.

Paaby, A. B., and M. V. Rockman. 2014. Cryptic genetic variation: Evolution’s hidden

substrate.

Pal, C., M. D. Maciá, A. Oliver, I. Schachar, and A. Buckling. 2007. Coevolution with

viruses drives the evolution of bacterial mutation rates. Nature 450:1079–1081.

Nature Publishing Group.

Peabody, G. L. V., H. Li, and K. C. Kao. 2017. Sexual recombination and increased

mutation rate expedite evolution of Escherichia coli in varied fitness landscapes.

Nat. Commun. 8:2112. Nature Publishing Group.

Pereira, C. S., J. A. Thompson, and K. B. Xavier. 2013. AI-2-mediated signalling in

bacteria. FEMS Microbiol. Rev. 37:156–81.

Pigliucci, M. 2008. Is evolvability evolvable? Nat. Rev. Genet. 9:75–82.

Pope, C. F., D. M. O’Sullivan, T. D. McHugh, and S. H. Gillespie. 2008. A practical guide

to measuring mutation rates in antibiotic resistance. Antimicrob. Agents

Chemother. 52:1209–14.

Qin, T.-T., H.-Q. Kang, P. Ma, P.-P. Li, L.-Y. Huang, and B. Gu. 2015. SOS response and its

regulation on the fluoroquinolone resistance. Ann. Transl. Med. 3.

Radman, M., F. Taddei, and I. Matic. 2000. Evolution-driving genes. Res. Microbiol.

151:91–5.

Raff, E. C., and R. A. Raff. 2000. Dissociability, modularity, evolvability. Evol. Dev.

2:235–237.

Ram, Y., and L. Hadany. 2012. The Evolution of Stress-Induced Hypermutation in

Asexual Populations. Evolution 66:2315–2328.

Rao, R. M., S. N. Pasha, and R. Sowdhamini. 2016. Genome-wide survey and phylogeny

of S-Ribosylhomocysteinase (LuxS) enzyme in bacterial genomes. BMC Genomics

17.

Raynes, Y., and P. D. Sniegowski. 2014. Experimental evolution and the dynamics of

genomic mutation rate modifiers. Heredity (Edinb). 113:375–380. Nature

Publishing Group.

Redfield, R. 2002. Is quorum sensing a side effect of diffusion sensing? Trends

Microbiol. 8:365–370.

Saint-Ruf, C., J. Pesut, M. Sopta, and I. Matic. 2007. Causes and consequences of DNA

repair activity modulation during stationary phase in Escherichia coli. Crit. Rev.

Biochem. Mol. Biol. 42:259–70.

Santiago-Rodriguez, T. M., A. R. Patrício, J. I. Rivera, M. Coradin, A. Gonzalez, G. Tirado,

R. J. Cano, and G. A. Toranzos. 2014. luxS in bacteria isolated from 25- to 40-million-year-old

amber. FEMS Microbiol. Lett. 350:117–24.

Sassanfar, M., and J. W. Roberts. 1990. Nature of the SOS-inducing signal in Escherichia

coli. The involvement of DNA replication. J. Mol. Biol. 212:79–96.

Schlacher, K., P. Pham, M. M. Cox, and M. F. Goodman. 2006. Roles of DNA Polymerase

V and RecA protein in SOS damage-induced mutation.

Schuster, M., D. J. Sexton, S. P. Diggle, and E. P. Greenberg. 2013. Acyl-homoserine

lactone quorum sensing: from evolution to application. Annu. Rev. Microbiol.

67:43–63.

Setoyama, D., R. Ito, Y. Takagi, and M. Sekiguchi. 2011. Molecular actions of

Escherichia coli MutT for control of spontaneous mutagenesis. Mutat. Res. -

Fundam. Mol. Mech. Mutagen. 707:9–14.

Sharp, P. M., L. R. Emery, and K. Zeng. 2010. Forces that influence the evolution of

codon bias. Philos. Trans. R. Soc. B Biol. Sci. 365:1203–1212.

Shaw, F. H., and C. F. Baer. 2011. Fitness-dependent mutation rates in finite

populations. J. Evol. Biol. 24:1677–84.

Sniegowski, P. D., P. J. Gerrish, T. Johnson, and A. Shaver. 2000. The evolution of

mutation rates: separating causes from consequences. Bioessays 22:1057–66.

Sniegowski, P., P. Gerrish, and R. Lenski. 1997. Evolution of high mutation rates in

experimental populations of E. coli. Nature 387:703–705.

Sniegowski, P., and Y. Raynes. 2013. Mutation Rates: How Low Can You Go? Curr. Biol.

23:R147–R149.

Sohail, M., O. A. Vakhrusheva, J. H. Sul, S. L. Pulit, L. C. Francioli, Genome of

the Netherlands Consortium, Alzheimer’s Disease Neuroimaging Initiative,

L. H. van den Berg, J. H. Veldink, P. I. W. de Bakker, G. A. Bazykin, A. S.

Kondrashov, and S. R. Sunyaev. 2017. Negative selection in humans and fruit flies

involves synergistic epistasis. Science 356:539–542. American Association for the

Advancement of Science.

Sprouffske, K., J. Aguilar-Rodríguez, P. Sniegowski, and A. Wagner. 2018. High mutation

rates limit evolutionary adaptation in Escherichia coli. PLOS Genet. 14:e1007324.

Public Library of Science.

Stearns, F. W. 2010. One hundred years of pleiotropy: A retrospective.

Sun, L., H. K. Alexander, B. Bogos, D. J. Kiviet, M. Ackermann, and S. Bonhoeffer. 2018.

Effective polyploidy causes phenotypic delay and influences bacterial evolvability.

PLOS Biol. 16:e2004644. Public Library of Science.

Sung, W., M. S. Ackerman, M. M. Dillon, T. G. Platt, C. Fuqua, V. S. Cooper, and M.

Lynch. 2016. Evolution of the Insertion-Deletion Mutation Rate Across the Tree of

Life. G3: Genes|Genomes|Genetics 6:2583–2591.

Sung, W., M. S. Ackerman, S. F. Miller, T. G. Doak, and M. Lynch. 2012a. Drift-barrier

hypothesis and mutation-rate evolution. Proc. Natl. Acad. Sci. 109:18488–18492.

Sung, W., A. E. Tucker, T. G. Doak, E. Choi, W. K. Thomas, and M. Lynch. 2012b.

Extraordinary genome stability in the ciliate Paramecium tetraurelia. Proc. Natl.

Acad. Sci. U. S. A. 109:19339–44. National Academy of Sciences.

Taddei, F., I. Matic, and M. Radman. 1995. cAMP-dependent SOS induction and

mutagenesis in resting bacterial populations. Proc. Natl. Acad. Sci. 92:11736–

11740.

Tanaka, M. M., C. T. Bergstrom, and B. R. Levin. 2003. The evolution of mutator genes

in bacterial populations: The roles of environmental change and timing. Genetics

164:843–854.

Tchou, J., H. Kasai, S. Shibutani, M. H. Chung, J. Laval, A. P. Grollman, and S. Nishimura.

1991. 8-oxoguanine (8-hydroxyguanine) DNA glycosylase and its substrate

specificity. Proc. Natl. Acad. Sci. U. S. A. 88:4690–4.

Tenaillon, O., B. Toupance, and H. Le Nagard. 1999. Mutators, Population Size,

Adaptive Landscape and the Adaptation of Asexual Populations of Bacteria.

Genetics 152:485–493.

Tlsty, T. D., B. H. Margolin, and K. Lum. 1989. Differences in the rates of gene

amplification in nontumorigenic and tumorigenic cell lines as measured by Luria-

Delbrück fluctuation analysis. Proc. Natl. Acad. Sci. U. S. A. 86:9441–9445.

Torres-Barceló, C., G. Cabot, A. Oliver, A. Buckling, and R. C. Maclean. 2013. A trade-off

between oxidative stress resistance and DNA repair plays a role in the evolution

of elevated mutation rates in bacteria. Proceedings. Biol. Sci. 280:20130007. The

Royal Society.

Travis, J. M. J., and E. R. Travis. 2002. Mutator dynamics in fluctuating environments.

Proc. R. Soc. B Biol. Sci. 269:591–597.

Tsui, H. C. T., G. Feng, and M. E. Winkler. 1997. Negative regulation of mutS and mutH

repair gene expression by the Hfq and RpoS global regulators of Escherichia coli K-

12. J. Bacteriol. 179:7476–7487.

Turrientes, M.-C., F. Baquero, B. R. Levin, J.-L. Martínez, A. Ripoll, J.-M. González-Alba,

R. Tobes, M. Manrique, M.-R. Baquero, M.-J. Rodríguez-Domínguez, R. Cantón,

and J.-C. Galán. 2013. Normal mutation rate variants arise in a Mutator (Mut S)

Escherichia coli population. PLoS One 8:e72963.

Van Dyken, J. D., and M. J. Wade. 2010. The genetic signature of conditional

expression. Genetics 184:557–70. Genetics.

Vendeville, A., K. Winzer, K. Heurlier, C. M. Tang, and K. R. Hardie. 2005. Making

“sense” of metabolism: autoinducer-2, LuxS and pathogenic bacteria. Nat. Rev.

Microbiol. 3:383–96.

Vogwill, T., M. Kojadinovic, and R. C. MacLean. 2016. Epistasis between antibiotic

resistance mutations and genetic background shape the fitness effect of

resistance across species of Pseudomonas. Proc. R. Soc. B Biol. Sci. 283:20160151.

Wagner, A. 2008. Robustness and evolvability: a paradox resolved. Proc. Biol. Sci.

275:91–100.

Wagner, A. 2012. The role of robustness in phenotypic adaptation and innovation.

Proc. R. Soc. B Biol. Sci. 279:1249–58.

Wagner, G., and L. Altenberg. 1996. Perspective: Complex adaptations and the evolution of

evolvability. Evolution 50:967–976.

Wagner, G. P., M. Pavlicev, and J. M. Cheverud. 2007. The road to modularity. Nat.

Rev. Genet. 8:921–31.

Wagner, G. P., and J. Zhang. 2011. The pleiotropic structure of the genotype-

phenotype map: the evolvability of complex organisms. Nat. Rev. Genet. 12:204–213.

Walters, M., and V. Sperandio. 2006. Quorum sensing in Escherichia coli and

Salmonella. Int. J. Med. Microbiol. 296:125–31.

Wang, Z., B.-Y. Liao, and J. Zhang. 2010. Genomic patterns of pleiotropy and the

evolution of complexity. Proc. Natl. Acad. Sci. U. S. A. 107:18034–9. National

Academy of Sciences.

Waters, C. M., and B. L. Bassler. 2005. Quorum sensing: cell-to-cell communication in

bacteria. Annu. Rev. Cell Dev. Biol. 21:319–46.

Watson, M. E., J. L. Burns, and A. L. Smith. 2004. Hypermutable Haemophilus

influenzae with mutations in mutS are found in cystic fibrosis sputum.

Microbiology 150:2947–58.

Weber, H., T. Polen, J. Heuveling, V. F. Wendisch, and R. Hengge. 2005. Genome-wide

analysis of the general stress response network in Escherichia coli: σS-dependent

genes, promoters, and sigma factor selectivity. J. Bacteriol. 187:1591–1603.

Welch, J. J., and D. Waxman. 2003. Modularity and the cost of complexity. Evolution

57:1723–1734. Wiley/Blackwell.

West, S. A., A. S. Griffin, A. Gardner, and S. P. Diggle. 2006. Social evolution theory for

microorganisms. Nat. Rev. Microbiol. 4:597–607.

West, S. A., K. Winzer, A. Gardner, and S. P. Diggle. 2012. Quorum sensing and the

confusion about diffusion. Trends Microbiol. 20:586–94. Elsevier Ltd.

Whiteley, M., S. P. Diggle, and E. P. Greenberg. 2017. Progress in and promise of

bacterial quorum sensing research.

Wielgoss, S., J. E. Barrick, O. Tenaillon, M. J. Wiser, W. J. Dittmar, S. Cruveiller, B.

Chane-Woon-Ming, C. Médigue, R. E. Lenski, and D. Schneider. 2013. Mutation

rate dynamics in a bacterial population reflect tension between adaptation and

genetic load. Proc. Natl. Acad. Sci. U. S. A. 110:222–7.

Wielgoss, S., J. E. Barrick, O. Tenaillon, M. J. Wiser, W. J. Dittmar, S. Cruveiller, B.

Chane-Woon-Ming, C. Médigue, R. E. Lenski, and D. Schneider. 2012. Mutation

rate dynamics in a bacterial population reflect tension between adaptation and

genetic load. Proc Natl Acad Sci 110:222–227.

Williams, L. N., A. J. Herr, and B. D. Preston. 2013. Emergence of DNA polymerase ε

antimutators that escape error-induced extinction in yeast. Genetics 193:751–

770.

Williams, P., K. Winzer, W. C. Chan, and M. Cámara. 2007. Look who’s talking:

communication and quorum sensing in the bacterial world. Philos. Trans. R. Soc.

Lond. B. Biol. Sci. 362:1119–34.

Winterbourn, C. C. 2008. Reconciling the chemistry and biology of reactive oxygen

species. Nat Chem Biol 4:278–286.

Winzer, K., K. R. Hardie, and P. Williams. 2003. LuxS and Autoinducer-2: Their

Contribution to Quorum Sensing and Metabolism in Bacteria.

Woods, R., J. Barrick, and T. Cooper. 2011. Second-Order Selection for Evolvability in a

Large Escherichia coli Population. Science 331:1433–1436.

Wrande, M., J. Roth, and D. Hughes. 2008. Accumulation of mutants in “aging”

bacterial colonies is due to growth under selection, not stress-induced

mutagenesis. Proc. Natl. Acad. Sci. 105:11863–11868.

Wright, S. 1932. The roles of Mutation, Inbreeding, Crossbreeding and Selection in

Evolution. Proc. Sixth Int. Congr. Genet. 355–366.

Xavier, K. B., and B. L. Bassler. 2003. LuxS quorum sensing: more than just a numbers

game. Curr. Opin. Microbiol. 6:191–197.

Yang, J., B. A. Evans, and D. E. Rozen. 2010. Signal diffusion and the mitigation of social

exploitation in pneumococcal competence signalling. Proc. Biol. Sci. 277:2991–9.

Yeiser, B., E. D. Pepper, M. F. Goodman, and S. E. Finkel. 2002. SOS-induced DNA

polymerases enhance long-term survival and evolutionary fitness. Proc. Natl.

Acad. Sci. 99:8737–8741.


Chapter 2: Spontaneous Mutation Rate Is a Plastic Trait Associated with Population Density across Domains of Life


2.1 Abstract

Rates of random, spontaneous mutation can vary plastically, dependent upon the environment. Such plasticity affects evolutionary trajectories and may be adaptive. We recently identified an inverse plastic association between mutation rate and population density at one locus in one species of bacterium. It is unknown how widespread this association is, whether it varies among organisms and what molecular mechanisms of mutagenesis or repair are required for this mutation rate plasticity.

Here we address all three questions. We identify a strong negative association between mutation rate and population density across 70 years of published literature, comprising hundreds of mutation rates estimated using phenotypic markers of mutation (fluctuation tests) from all domains of life and viruses. We test this relationship experimentally, determining that there is indeed density-associated mutation-rate plasticity (DAMP) at multiple loci in both eukaryotes and bacteria, with up to 23-fold lower mutation rates at higher population densities. We find that the degree of plasticity varies, even among closely related organisms. Nonetheless, in each domain tested, DAMP requires proteins scavenging the mutagenic oxidized nucleotide 8-oxo-dGTP. This implies that phenotypic markers give a more precise view of mutation rate than previously believed: having accounted for other known factors affecting mutation rate, controlling for population density can reduce variation in mutation rate estimates by 93%. Widespread DAMP, which we manipulate genetically in disparate organisms, also provides a novel trait to use in the fight against the evolution of antimicrobial resistance. Such a prevalent environmental association and conserved mechanism suggest that mutation has varied plastically with population density since the early origins of life.


2.2 Introduction

The probability of spontaneous genetic mutations occurring during replication evolves among organisms (Lynch et al. 2016). This mutation rate can also vary at a particular site in a particular genotype, dependent upon the environment (Massey and Buckling 2002). Specifically, mutation rate can increase with endogenous and exogenous factors (Maki 2002). Indeed, any factor that affects the balance between mutagenesis and DNA repair can modify the mutation rate. These include intracellular nucleotide pools (Gon et al. 2011), organism age (Kong et al. 2012) and factors affecting the expression (Gutierrez et al. 2013) and stochastic presence or absence of low copy number repair proteins (Uphoff et al. 2016). Where such mutation/repair-balance factors depend on the environment, the result is mutation rate plasticity. Plastic mutation rates have been most thoroughly addressed for stress-induced mutagenesis. This may involve the induction of error-prone polymerases, for instance in the Escherichia coli SOS response (Galhardo et al. 2007). We have recently identified a novel mode of mutation rate plasticity in response to population density in E. coli. This plasticity has no obvious association with stress: the densest populations, which experience the most competition, show the lowest mutation rates (Krašovec et al. 2014a).

Understanding mutation rate plasticity is hampered by the difficulty of accurately measuring any mutation rate. Spontaneous mutation rates have long been estimated in microbes using counts of cells gaining a phenotypic marker of mutation in environments lacking selection for that marker: the ‘fluctuation test’ created by Luria and Delbrück (Luria and Delbrück 1943). Alternative approaches to measuring rates of mutation in the absence of selection, such as accumulation of mutations through many population bottlenecks (Halligan and Keightley 2009), directly comparing genome sequences of parents and offspring (Campbell and Eichler 2013) or targeted population sequencing (Jee et al. 2016), are much more laborious and thus poorly suited to potentially dynamic responses. Therefore, well-conducted fluctuation tests (Foster 2006) remain the most appropriate tool to assay environmental dependence in mutation rates.
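As a minimal illustration of how a fluctuation test turns mutant counts into a rate, the sketch below uses the simple p0 method from the original Luria and Delbrück approach: the fraction of parallel cultures with no mutants estimates the expected number of mutational events per culture, m, which is then divided by the final population size Nt. This is an assumption made for illustration, not necessarily the estimator used in this chapter; the function name and the example counts are hypothetical.

```python
import math

def p0_mutation_rate(mutant_counts, final_population_size):
    """Estimate m and the mutation rate from parallel cultures (p0 method).

    mutant_counts: number of mutant colonies observed in each culture.
    final_population_size: Nt, cells per culture (assumed equal across cultures).
    """
    n_cultures = len(mutant_counts)
    n_without_mutants = sum(1 for count in mutant_counts if count == 0)
    if n_without_mutants in (0, n_cultures):
        raise ValueError("p0 method needs some, but not all, cultures without mutants")
    p0 = n_without_mutants / n_cultures
    m = -math.log(p0)                      # Poisson: P(no mutation) = exp(-m)
    return m, m / final_population_size    # rate per cell per generation

# Hypothetical data: 10 cultures, 6 of which show no resistant colonies.
m, rate = p0_mutation_rate([0, 0, 1, 0, 3, 0, 0, 12, 0, 1], 2e8)
print(f"m = {m:.3f}, mutation rate = {rate:.2e}")
```

The p0 method discards the information in the non-zero counts, which is one reason likelihood-based estimators are generally preferred in practice.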

Population density affects many traits, particularly in microbes (Miller and Bassler 2001). Its association with mutation rate has great potential to affect evolutionary trajectories (Belavkin et al. 2016), in ways relevant to the evolution of antimicrobial resistance (Krašovec et al. 2014a). However, this plasticity associated with population density is thus far poorly understood: its prevalence across domains of life is unknown. Whether it varies among organisms, enabling its evolution, remains to be tested. A little is known about the relationship of mutation rate with density perception in one organism (Krašovec et al. 2014b), but the required downstream mechanisms of mutation or repair remain uncharacterised. Here we address each of these issues. We demonstrate that there is indeed density-associated mutation-rate plasticity (DAMP) across domains of life: high population density is associated with low mutation rates. DAMP differs between closely related organisms, indicating that this trait does indeed evolve. Strikingly, the same mutation avoidance mechanism is required to modulate mutation rate in response to population density in both prokaryotes and eukaryotes.

2.3 Results

In order to test the nature and prevalence of mutation rate plasticity, we considered over 70 years of published mutation rates estimated by fluctuation test. We collated 474 individual mutation rate estimates from 68 independent studies that conducted fluctuation tests in organisms of 26 species from all domains of life and viruses, where the density an organism’s population reached, D, was either reported or could be obtained from the original authors (Figure 2.1 and Table 2.3).

Figure 2.1 Mutation rates published from 1943 to 2017 in relation to final population density. Colour indicates organism and shape indicates domain: Archaea (circle), Bacteria (triangle), Eukaryota (square), and viruses/bacteriophage (diamond). Mutation rates were estimated using a wide variety of phenotypic markers (N=70). The genetic basis and number of mutations giving each phenotype varies and is not known for all markers (i.e. different mutation rates may be generated from the same underlying per base-pair rate). Population densities are estimated by different techniques (cell counts, colony- or plaque-forming units). See Results, Materials and Methods and Model 2.1 in Appendix for statistical analysis accounting for these and other differences. Note the logarithmic axes. Originally denoted as Fig 1 in publication.

There is a clear negative association between mutation rate and D (Spearman’s ρ = -0.66), spanning more than six orders of magnitude in both measures. This association potentially involves both between- and within-organism variation in mutation rate. We therefore analysed this relationship using a linear mixed-effects model (Model 2.1 in Appendix) accounting for various features of the original experiments (organism, culture media, phenotypic marker, publication and phylogenetic relationships among organisms). We find that, having taken all these factors into account, changes in D explain 93% of variation in published mutation rate estimates (N=474, LR1 = 22, P=2.9×10⁻⁶; Model 2.1). The model identifies substantial between-organism variation in a negative within-organism association of mutation rate with D (slope varies from -0.46 – -0.98; Figure 2.2). The average slope across organisms is -0.67 (-0.89 – -0.48 CI), meaning that mutation rate doubles with a 64% reduction in D (54-76% CI), quantitatively similar to the plasticity reported for E. coli B strains (Krašovec et al. 2014a), where the figure is 77% (61-96% CI).
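The conversion from a log-log slope to "the reduction in D that doubles the mutation rate" can be checked with a few lines of arithmetic. The sketch below assumes only that mutation rate is proportional to D raised to the power of the slope, as implied by a straight line on log-log axes; the function name is introduced here for illustration.

```python
def density_reduction_for_doubling(slope):
    """Fractional reduction in D that doubles mutation rate, given a log-log slope.

    If rate is proportional to D**slope, then doubling the rate requires D to change
    by a factor of 2**(1/slope); for a negative slope that factor is below 1.
    """
    if slope >= 0:
        raise ValueError("a negative slope is required for a reduction in D to raise the rate")
    fold_change_in_density = 2 ** (1 / slope)
    return 1 - fold_change_in_density

# Average slope and its confidence limits as reported in the text.
for slope in (-0.67, -0.89, -0.48):
    print(slope, f"{100 * density_reduction_for_doubling(slope):.0f}%")
```

Run on the reported average slope and its confidence limits, this reproduces the 64% (54-76%) figures quoted above.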

Figure 2.2 Slope values for species included in published mutation rate analysis. Each point represents the estimate of the within-species slope of log2 (mutation rate) with log2 (population density) from mixed effect Model 2.1 (Appendix), which includes a random effect of organism on slope. Each value therefore represents the best linear unbiased prediction (BLUP) for that organism. The vertical line represents the overall estimated slope, fitted as a fixed effect in Model 2.1. Originally denoted as Fig S2 in publication.

Despite the striking association shown in Figure 2.1, the relationship could originate in various processes, including, but not limited to, mutation rate plasticity. We consider several hypotheses: 1. Technical bias: 1a – The same estimate of final population size (Nt) is typically used to calculate both D and the mutation rate. Therefore any error in Nt could itself lead to a negative association between mutation rate and D. 1b – Fluctuation tests typically assume that the phenotypic markers used are selectively neutral; however, in practice, this is not always the case. If cultures grown to higher D also, typically, go through more generations, a systematic tendency towards phenotypic markers being costly could lead to under-estimated mutation rates specifically at high D. 2. Reporting bias: there could be an under-representation of reported low mutation rates at low D and high mutation rates at high D. This is expected because standard volume microbial cultures with low D may not have sufficient mutational events at marker loci to achieve good estimates of low mutation rates. Similarly, in dense populations, high mutation rates can produce more mutants than it is practical to count. 3. Density-Associated Mutation-rate Plasticity (DAMP): the relationship in Figure 2.1 is consistent with DAMP across domains of life. However, the data in Figure 2.1 come from diverse studies using very different experimental and analytical set-ups. It remains for us to test the association at different marker loci in different organisms within a single experimental and analytical framework.
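Hypothesis 1a can be made concrete with a small simulation. The sketch below holds the true mutation rate constant, adds multiplicative measurement error to Nt, and then uses the same erroneous Nt for both the rate (m/Nt) and the density (Nt/volume). All parameter values and function names are arbitrary assumptions for illustration; they are not taken from the experiments described here.

```python
import math
import random

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def simulate_shared_error_slope(n_cultures=2000, true_rate=1e-9,
                                volume_ml=1.0, error_sd=0.5, seed=1):
    """Log-log slope of estimated rate on estimated density under a constant true rate."""
    rng = random.Random(seed)
    log_density, log_rate = [], []
    for _ in range(n_cultures):
        true_nt = 10 ** rng.uniform(7, 9)            # true final population size
        m = true_rate * true_nt                      # expected mutational events
        measured_nt = true_nt * math.exp(rng.gauss(0, error_sd))  # noisy Nt
        log_density.append(math.log2(measured_nt / volume_ml))
        log_rate.append(math.log2(m / measured_nt))  # same noisy Nt reused here
    return ols_slope(log_density, log_rate)

print(simulate_shared_error_slope(error_sd=0.5))  # negative despite no plasticity
print(simulate_shared_error_slope(error_sd=0.0))  # approximately zero
```

Because the error in Nt enters the density and the rate with opposite signs, a spurious negative slope appears even without any plasticity, which is why the two quantities need to be estimated by independent methods to exclude this explanation.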

To test these hypotheses we focused on the two most diverged genetic model organisms in Figure 2.1: the bacterium E. coli (strain MG1655) and the eukaryotic yeast Saccharomyces cerevisiae (strains S288C, BY4742 and Sigma1278b). Using fluctuation tests we estimated mutation rates in batch cultures at two marker loci in each organism: rpoB and gyrA in E. coli and 25S ribosomal proteins and URA3 in S. cerevisiae (see Materials and Methods). These confer resistance to antibiotics rifampicin, nalidixic acid, hygromycin B and 5-Fluoro-orotic acid (5-FOA) respectively. In each case we varied culture volume and added nutrients to give different D (see Materials and Methods). To test hypothesis 1a (correlation of errors), we estimated Nt by two independent methods for each organism: an ATP-based luminescence assay (LUM) and colony-forming units (CFU) for bacteria; haemocytometer cell counts (CC) and CFU for yeast. Using CFU to estimate mutation rate (typical in Figure 2.1) and any of the three methods to estimate D (Figures 2.3 and 2.4), we find significant variation of mutation rates with changing D: mutation rate varies from 5-fold to 23-fold across both loci in all organisms, where mutation rate is lower at high D (Figure 2.4). This refutes hypothesis 1a that the broad association between population density and mutation rate (Figure 2.1) is caused by a correlation of errors.

Figure 2.3 Density-associated mutation-rate plasticity (DAMP) in bacteria and yeast, with population density calculated from luminescence and cell counts respectively. (A) Mutation rates to rifampicin (triangles) and nalidixic acid (circles) resistance in E. coli MG1655 (dark blue, N=77) and P. aeruginosa PAO1 (light blue, N=40); lines from Model 2.2 in Appendix: Wald test t80=11; P=1.7×10⁻¹⁸ that E. coli slope is zero; t80=1.2, P=0.22 that P. aeruginosa slope is zero. Lines are plotted separately for the two E. coli markers because there are more mutations conferring rifampicin resistance (in the rpoB gene) than nalidixic acid resistance (in the gyrA gene), resulting in different overall rates. (B) Mutation rates to hygromycin B (squares) and 5-FOA (diamonds) resistance in S. cerevisiae BY4742 (brown, N=46), Sigma1278b (orange, N=41) and S288C (red, N=24); lines from Model 2.3 in Appendix: t81=9.6; P=4.3×10⁻¹⁵ Wald test that average S. cerevisiae slope with D is zero; likelihood ratio test that slopes are equal among strains LR1=17, P=4.1×10⁻⁵. Open shapes denote mutation rate estimates which would typically be excluded because the estimated number of mutational events per culture, m, is either below 0.3 or above 30. Population densities estimated by ATP-based assay in (A) and direct cell counts in (B). See main text for the slope estimates of the given lines. Note the logarithmic axes. Originally denoted as Fig 2 in publication.

Figure 2.4 Density-associated mutation-rate plasticity (DAMP) in bacteria and yeast with population density calculated from colony forming units (CFU). Data as in Figure 2.3 but using CFU to estimate both population density and mutation rate. (A) Mutation rates to rifampicin (triangles) and nalidixic acid (circles) resistance in E. coli MG1655 (dark blue; N=77) and P. aeruginosa PAO1 (light blue; N=40). Lines are from Model 2.5 in Appendix; t80=14; P=6.5×10⁻²³ that E. coli slope is zero and t80=0.81, P=0.42 that P. aeruginosa slope is zero. (B) Mutation rates to hygromycin B (squares) and 5-FOA (diamonds) resistance in S. cerevisiae BY4742 (brown; N=46), Sigma1278b (orange; N=59) and S288C (red; N=39). Lines are from Model 2.6 in Appendix: t105=4.3; P=3.2×10⁻⁵ Wald test that average S. cerevisiae slope is zero. Open shapes denote mutation rate estimates which would typically be excluded because the estimated number of mutational events per culture, m, is either below 0.3 or above 30. Note the logarithmic axes. Original figure denoted as Fig S4 in publication.

To test hypothesis 1b (costs of marker mutations) for the association between population density and mutation rate, we first considered the fitness effects of resistance mutations. Others have found small negative effects (12% on average in P. aeruginosa (Hall, Iles, and MacLean 2011)) and sometimes positive effects on fitness of mutations at the rpoB locus considered in Figure 2.3A (LaCroix et al. 2015; Wrande, Roth, and Hughes 2008). Consistent with this, the fitness effects of mutations in our experiments are, on average, close to neutral (Figure 2.5) and, where present, are similar across population densities (Figure 2.6). Secondly, we re-analysed the data in Figure 2.3 assuming different average fitness effects of resistance mutations. We find that, even assuming large fitness costs (>50%), there is only a small flattening of the negative slope (Figure 2.5). Thus it is highly unlikely that the differential growth of resistant strains is responsible for the slopes seen in Figure 2.3, refuting hypothesis 1b that the association in Figure 2.1 may be accounted for by technical bias driven by differential growth of mutant strains.

Figure 2.5 Effect of fitness differences between resistant and non-resistant strains on the estimated slopes in Figure 2.3. The estimated slope (with its standard error, grey ribbon) of mutation rate against population density, D, for E. coli (A) as shown in Fig 2.3A and estimated by Model 2.2 in Appendix, having used different assumed average relative fitnesses (w) in the calculation of each mutation rate. The vertical red line indicates equal fitness of resistant and non-resistant strains, as assumed in the main analysis. Values of w below 1 indicate a cost of resistance and values greater than 1 indicate a selective advantage to resistant strains. The vertical dashed black lines indicate the 95% CI of fitness values w estimated directly from the data, jointly with the number of mutational events m. Because our fluctuation tests used relatively limited numbers of cultures (see methods), it was not possible to jointly estimate m and w from the data in all cases. The interval shown is the confidence interval on the mean across the N = 56 fluctuation tests where it was possible to make joint estimates. (B) The same as part A but for the slope of mutation rate against population density D for strain BY4742 as shown in Fig 2.3B and estimated by Model 2.3 (Appendix); the confidence interval on the mean relative fitness of resistant strains was calculated across N = 84 fluctuation tests in this case. Originally denoted as Fig S5 in publication.

Figure 2.6 Relative fitness of rifampicin resistant mutants of Escherichia coli REL606 (mutant A) and REL607 (mutant B) at different population densities. Rifampicin resistant mutant A and rifampicin resistant mutant B were competed against a rifampicin susceptible parent strain with the opposite arabinose marker (REL607 and REL606, respectively) in Davis minimal medium with 100 and 250 mg of glucose per L. Originally denoted as Fig S6 in publication.

Typically, mutation rates where the estimated number of mutational events per culture (m) is either too low (<0.3) or too high (>30) are excluded, and may not be reported (Foster 2006). To test hypothesis 2 (reporting bias) we kept close account of all estimates. Those 34 estimates that would typically be excluded are shown as open symbols in Figures 2.3 and 2.4. As expected, these points fall in either the lower left or upper right of the data. Nonetheless, the association between D and mutation rate is similar and highly significant with or without these data (-0.68 [-0.80 – -0.56 CI] versus -0.68 [-0.79 – -0.56 CI] in E. coli and, on average, -0.65 [-0.79 – -0.52 CI] versus -0.48 [-0.63 – -0.33] in S. cerevisiae), refuting hypothesis 2, that the negative association between population density and mutation rate in Figure 2.1 is caused by reporting bias. In contrast, our findings from different loci and different organisms within a single experimental and analytical framework are consistent with the pattern across the literature (Figures 2.1 and 2.7), strongly supporting hypothesis 3, that there is negative density-associated mutation-rate plasticity (DAMP) across the domains of life.

Figure 2.7 All data from Figure 2.3 overlaid on published data used in Figure 2.1. Mutation rates in E. coli MG1655 (dark blue triangles), P. aeruginosa PAO1 (pale blue triangles) and S. cerevisiae (red squares) overlaid on published mutation rates collected from the literature (grey symbols). Green triangles represent mutation rate estimates for monocultures of wild-type E. coli from Krašovec et al. (2014a), which are not included in Figure 2.1. See main text and Fig 2.3 for more details. Originally denoted as Fig S7 in publication.

Our assays in S. cerevisiae identify significant variation in DAMP slope among strains (Figure 2.3B): the slope for S288C and Sigma1278b is -0.98 (-1.2 – -0.73 CI) whereas in BY4742 it is -0.32 (-0.40 – -0.25 CI). This limited difference in plasticity suggests that this trait may evolve. To investigate the extent of inter-organism variation in DAMP we tested another model organism, much more closely related to E. coli: Pseudomonas aeruginosa PAO1 (both are Gram-negative gamma Proteobacteria). We find P. aeruginosa has a greatly reduced DAMP slope relative to E. coli, not significantly different from zero (Figures 2.3A and 2.4A, slope estimate -0.15 [-0.40 – +0.095 CI], Models 2.2 and 2.5 in Appendix). This indicates that, while DAMP is very widespread, it has evolved among closely related organisms.

Diverse mechanisms, some broadly conserved, could in principle be modulated to give DAMP. These include polymerases used for DNA replication and repair, some of which are more error-prone than others (Goodman 2002), and systems that repair mutational mismatches (Kunkel and Erie 2005) or remove mutagenic nucleotides before they can be incorporated into DNA (Michaels and Miller 1992). To identify mechanisms by which the observed DAMP occurs, we tested E. coli strains deleted for genes involved in these processes (see Materials and Methods). A strain lacking the error-prone polymerase Pol IV (encoded by dinB), implicated in stress-induced mutagenesis (Galhardo et al. 2007), displays DAMP (Figure 2.8; -1.1 [-1.3 – -1.00 CI]). E. coli’s methyl-directed DNA mismatch repair (MMR) system was hypothesized to be involved in DAMP (Krašovec et al. 2014a). Despite >100 fold increases in mutation rates, strains lacking MMR proteins (MutH, MutL and MutS) still display DAMP, albeit with a less steep slope (Figure 2.8; -0.20 [-0.30 – -0.11 CI]). We also considered other systems potentially associated with DAMP: a strain lacking MetI (Figure 2.9), a transporter protein responsible for feeding methionine to the activated methyl cycle, which was previously implicated in DAMP (Krašovec et al. 2014a). Despite growing over only a relatively narrow range of densities, the metI deletant shows DAMP with a slope indistinguishable from the dinB deletant, as does a strain lacking Dam methylase, required for MMR to identify the ‘correct’ DNA strand (Figure 2.8). Finally, a strain lacking Endonuclease VIII (encoded by nei in E. coli) also shows DAMP, with the greatest slope among all these strains (Figure 2.8, -1.5 [-1.8 – -1.3 CI]).

Figure 2.8 DAMP in strains of Escherichia coli with deficiencies in various DNA repair and other systems. Mutation rate in strains from the Keio collection (Baba et al. 2006) deleted for genes encoding deoxyadenosine methylase (Δdam, N=23, yellow triangles), error-prone DNA polymerase Pol IV (ΔdinB, N=25, pink triangles), methionine transporter MetI (ΔmetI, N=10, light green triangles), endonuclease VIII (Δnei, N=15, dark green triangles), methyl mismatch repair (MMR) which comprises MutS (ΔmutS, N=28, light blue circles), that binds mismatched DNA and MutL (ΔmutL, N=27, dark blue circles) that interacts with MutS to activate endonuclease MutH (ΔmutH, N=24, medium blue circles). All lines result from Model 2.7 in Appendix. Wald tests that average slope for Δdam, ΔdinB and ΔmetI is zero: t92=17; P=4.2×10⁻³⁰; that average slope for Δnei is zero: t91=13; P=6.0×10⁻²²; that average slope for MMR strains is zero: t91=4.3; P=5.1×10⁻⁵. See Materials and Methods for further strain details. Densities estimated by ATP-based assay (LUM). As in Figure 2.3, triangles and circles indicate mutation rates to rifampicin and nalidixic acid resistance respectively. Note the logarithmic axes. Originally denoted as Fig 3 in publication.

In contrast to the systems tested above, mutT deletion removed the dependence of mutation rate on D (Figures 2.9A and 2.10A, likelihood ratio test of slope N=63, LR1=1.5; P=0.22, Model 2.8 in Appendix). MutT is the mutation avoidance component of the GO system, protecting cells from mutagenic effects of damaged nucleotides (Michaels and Miller 1992). Specifically, the MutT Nudix hydrolase removes 8-oxo-dGTP from the free-nucleotide pool. This prevents AT to GC transversions, which occur when 8-oxo-dGTP is incorporated in place of a TTP nucleotide during DNA synthesis (Maki and Sekiguchi 1992). The GO system also includes mutation correction proteins MutM and MutY, which target 8-oxo-dGTP once incorporated in DNA (Michaels and Miller 1992). Strains lacking either protein display DAMP (Figures 2.9B and 2.10B; Model 2.10 in Appendix). Thus, while neither error-prone polymerase Pol IV nor the MMR system with Dam methylase is necessary (Figure 2.8), DAMP in E. coli (Figure 2.3A) requires scavenging the oxidised nucleotide 8-oxo-dGTP from the cellular pool.

Figure 2.9 DAMP in cells lacking mutation avoidance or correction genes in E. coli and S. cerevisiae, with population density calculated from luminescence and colony forming units respectively. (A) Mutation rates to nalidixic acid resistance in two independent E. coli ΔmutT strains which share no secondary mutations (JW0097-1 and JW0097-3 in Table 2.1 in Material and Methods), dark and light blue respectively (Model 2.8 in Appendix). (B) Mutation rates to rifampicin resistance in E. coli ΔmutM and ΔmutY strains (medium blue and light green respectively), with equal slopes -1.1 (-1.2 – -1.04, CI) (N=46, Model 2.10 in Appendix). (C) Mutation rate to hygromycin B resistance in S. cerevisiae BY4742 and Sigma1278b PCD1-Δ strains (brown and orange respectively, Model 2.12 in Appendix). (D) Mutation rate to 5-FOA resistance in S. cerevisiae Sigma1278b MLH1-Δ (slope -1.22 [-1.44 – -1.01 CI], N=26, Model 2.14 in Appendix). Open shapes as in Figure 2.3. Population density measured by ATP-based assay in (A) and (B) and via colony forming units counts in (C) and (D). See also main text for the slope estimates of the given lines. Note the logarithmic axes. Originally denoted as Fig 4 in publication.

Figure 2.10 DAMP in cells deficient in mutation avoidance or correction genes in E. coli and S. cerevisiae, with population density calculated from colony forming units and cell counts respectively. Data as in Fig 2.9 but using alternative methods to estimate population density. (A) Mutation rates to nalidixic acid resistance in two independent E. coli Keio ΔmutT strains JW0097-1 (N=30) and JW0097-3 (N=33) (dark and light blue respectively). Both lines result from Model 2.9 in Appendix; likelihood ratio test of slope, N=63, LR1=0.29; P=0.59. (B) Mutation rates to rifampicin resistance in E. coli ΔmutM (N=23) and ΔmutY (N=23) strains (blue and green respectively). Lines result from Model 2.11 in Appendix. Wald tests that slope is zero for ΔmutM t32=12.9; P=3.5×10⁻¹⁴ and ΔmutY t32=9.8; P=3.3×10⁻¹¹. (C) Mutation rate to hygromycin B resistance in S. cerevisiae BY4742 (N=22) and Sigma1278b PCD1-Δ (N=20) strains (brown and orange respectively). Lines result from Model 2.13 in Appendix. (D) Mutation rate to 5-FOA resistance in S. cerevisiae Sigma1278b MLH1-Δ. Line results from Model 2.15 in Appendix. Wald test that the slope is zero: N=14, t10=3.5; P=0.0035. Final Density (D) measured by CFU in (A) and (B), and by direct cell counts in (C) and (D). Open shapes denote mutation rate estimates which would typically be excluded because the estimated number of mutational events per culture, m, is either below 0.3 or above 30. Originally denoted as Fig S8 in publication.

Eukaryotes possess more, and more diverse, genes and systems for mutation avoidance and correction than bacteria. Nonetheless, the mutation avoidance mechanism required for DAMP in E. coli is conserved across domains of life (McLennan 2006). PCD1 encodes the yeast 8-oxo-dGTPase functionally homologous to bacterial MutT (Nunoshiba et al. 2004). We tested PCD1-Δ strains in two different S. cerevisiae backgrounds (BY4742 and Sigma1278b). As in E. coli ΔmutT, mutation rates are greatly increased in PCD1-Δ, with no evidence of DAMP (Figures 2.9C and 2.10C, likelihood ratio test of slope N=42, LR1=0.18, P=0.67, Model 2.12 in Appendix). In contrast, a strain lacking MLH1, a yeast gene homologous to the E. coli MMR gene mutL, displays DAMP (Figures 2.9D and 2.10D; Model 2.14 in Appendix). Therefore, DAMP not only occurs across domains of life, but, in the bacteria and eukaryotes tested, requires the same specific mutation avoidance mechanism.

2.4 Discussion

The negative association between published mutation rates and population density (D, Figure 2.1) is remarkably tight. Much of the between-organism variation in microbial mutation rate is associated with variation in genome-size, where larger genomes tend to have smaller per-base-pair mutation rates (Drake 1991). We account for this in our analysis of the data in Figure 2.1 by allowing mutation rates to vary among organisms and accounting for their phylogeny (explicitly including a typical genome size for each organism makes little difference to this analysis, Model 2.1 in Appendix). However, the within-organism variation in mutation rate associated with D is strong enough to explain differences in mutation rate estimates for the same organism within and between laboratories, without assuming any inconsistency in the fluctuation test itself. For instance, the estimates for Salmonella at the bottom of Figure 2.1 vary by over an order of magnitude, but diverge little from a negatively sloping straight line. This suggests that, once population density is taken into account, fluctuation tests can give a more precise estimate of mutation rate than previously believed.

Fluctuation tests, however, have drawbacks, such as the possibility of unanticipated selection on mutant cells in a supposedly non-selective environment (Wrande et al. 2008). We avoided that specific issue here by using short incubation times and making estimates at multiple loci. The more general drawback of using fluctuation tests for considering environmental correlates (shared with most other methods for estimating mutation rates) is that they average across the time-varying environment of batch culture. Thus almost any environmental variable, including population density, has no fixed value and is itself associated with many other characteristics of the culture, both environmental and organismal (e.g. times spent in different phases of the culture cycle). Thus, while we have demonstrated an association of mutation rate with final population density D (Figure 2.3) and the dependence of that association on a downstream mechanism (Figure 2.9), the link between 8-oxo-dGTPase and D remains unclear. Nonetheless, two important things can be said about this link. First, DAMP is observed across a wide range of environmental conditions and organisms (Figures 2.1 and 2.3); this argues that features of particular environments, such as the starting nutrient concentration, as manipulated here, are unlikely to provide a general link between D and mutation rate. Second, we have previously demonstrated that in one organism, E. coli, cell-cell interactions are involved in DAMP and deletion of a quorum sensing gene (luxS) breaks the association between D and mutation rate (Krašovec et al. 2014a), demonstrating that population interactions can be important for DAMP. Such quorum sensing mechanisms occur widely (even in some phages (Erez et al. 2017)). Future work therefore needs to focus on asking whether DAMP is associated with particular environmental molecules.

Mutation rate is a central population genetic parameter. Across organisms, it is negatively associated with the other central population genetic parameter, effective population size (Ne) (Lynch et al. 2016). In our experiments, despite two orders of magnitude variation in each measure, we find no consistent association between mutation rate and Ne (Figure 2.11). This is unsurprising because the proposed reason for the negative association across organisms is that selection for replication fidelity is more efficient at higher Ne, meaning that, over the long-term, average mutation rates evolve to be lower at higher Ne (Lynch et al. 2016). In our short-term experiments there is little opportunity for such evolutionary change to occur, so we do not see this association. Nonetheless, this reinforces the clear distinction between within-organism plasticity and among-organism variation in mutation rate. Both have shaped mutation rates in the published literature (Figure 2.1) and we are able to separate the two both statistically and by focused experiments (Figure 2.3). There may be links between the causes of among-organism variation and within-organism plasticity in mutation rate, for instance in the differing opportunities for selection on replication fidelity in polymerases expressed in common or rare environmental conditions (MacLean, Torres-Barceló, and Moxon 2013). However, the evolutionary causes and effects of within-organism plasticity in mutation rate in general, and DAMP in particular, need further investigation.

[Figure 2.11 plot. Panel titles recovered: non-plastic, slightly plastic. x-axis: effective population size (Ne); y-axis: mutation rate (10⁻⁹ generation⁻¹). Legend keys: genotype (BY4742, dam, dinB, metI, MG1655, MLH1_sigma, mutH, mutL, mutM, mutS, mutT1, mutT2, mutY, nei, PAO1, PCD1_by, PCD1_sigma, S288C, Sigma_1278b) and marker (5-fluoroorotic acid, hygromycin B, nalidixic acid, rifampicin).]

Figure 2.11 Mutation rate in relation to Ne for all genotypes tested. All mutation rates determined in this study are shown in relation to Ne, which is calculated as the harmonic mean across generations of the population size as it increases from N0 to Nt. The plotted lines come from Model 2.16 in Appendix (N=580), and the data are separated into panels, primarily for clarity, according to the degree of mutation rate plasticity identified in Figs 2.3, 2.8 and 2.9. Originally denoted as Fig S9 in publication.
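The harmonic-mean calculation of Ne described in this caption can be sketched as follows. The sketch assumes, purely for illustration, that the population doubles every generation from N0 until it reaches Nt; the growth model and the function name are assumptions, not details given in the text.

```python
def effective_population_size(n0, nt):
    """Harmonic mean of population size across generations, growing from n0 to nt.

    Assumes one doubling per generation, with the final generation capped at nt.
    """
    if n0 <= 0 or nt < n0:
        raise ValueError("require 0 < n0 <= nt")
    sizes = []
    n = float(n0)
    while n < nt:
        sizes.append(n)
        n *= 2
    sizes.append(float(nt))                 # final generation reaches Nt
    return len(sizes) / sum(1.0 / size for size in sizes)

# Hypothetical culture: 1,000 cells inoculated, 1e9 cells at the end.
print(f"{effective_population_size(1e3, 1e9):.3g}")
```

Because a harmonic mean is dominated by its smallest terms, Ne calculated this way sits much closer to N0 than to Nt, which is consistent with Ne being orders of magnitude below the final population sizes of batch cultures.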

The evolutionary causes of plasticity in mutation rates need not be adaptive (MacLean et al. 2013). Nonetheless, mutation is an evolutionary mechanism, so any plastic variation in mutation rates will have consequences for evolutionary trajectories (Alexander, Mayer, and Bonhoeffer 2017). What the evolutionary consequences might be depends on how mutation rate associates with the environment. For evolutionary computing, where mutation rate is controlled, understanding the effect of that control is an important area of research (Karafotias, Hoogendoorn, and Eiben 2015). In biology, constitutively high mutation rates can evolve under specific circumstances (Oliver 2000; Sniegowski, Gerrish, and Lenski 1997), but incur the costs of many, typically deleterious, mutations. If plasticity is such that mutation rate is inversely related to absolute organismal fitness, either in mathematical studies of evolutionary systems (Belavkin et al. 2016) or in population genetic models (Ram and Hadany 2012), organisms may benefit from a high mutation supply rate without paying the full evolutionary cost of a constitutively raised mutation rate. In some circumstances, DAMP can result in such a negative association of mutation rate with fitness (Krašovec et al. 2014a), but the evolutionary effects of this remain to be tested.

The probability of a particular mutational event occurring (e.g. the emergence of spontaneous antibiotic resistance) might be expected to increase with D, as denser populations, containing more cells, will have had more opportunity for mutation. But for the DAMP described here, this increase is offset by a reduction in the mutation rate. This offsetting means that, for organisms with DAMP, numbers of mutational events per space and time vary much less with final population size than expected from a fixed mutation rate per generation (Figure 2.12). Population genetic models typically consider mutations per replication to be constant for an organism. However, we find that the approximate constant is the number of mutational events per space and time (Figure 2.12). This is consistent with observations of invariant numbers of mutations per time in Mycobacterium infections (Ford et al. 2011) and indeed in human somatic (Alexandrov et al. 2015) and germ cells (Gutierrez et al. 2013).

Figure 2.12 Number of mutational events m per space and time in response to final population size Nt for all genotypes tested. The estimated number of mutational events, m, is elsewhere divided through by Nt to give the mutation rate per generation. Here all mutation rates determined in this study are plotted against Nt, having divided through by both the culture time in hours (typically 24 h) and the culture volume in ml. The black dashed line indicates a slope of 1 (doubling Nt is associated with doubling m for a given volume and time), which is the expectation for a fixed (non-plastic) mutation rate. The coloured lines come from Model 2.17 in Appendix (N=580), and the data are separated into panels, primarily for clarity, according to the degree of mutation rate plasticity identified in Figs 2.3, 2.8 and 2.9. Originally denoted as Fig S10 in publication.

Both the occurrence across domains of life (Fig 2.3) and the conserved mutation avoidance mechanism required (Fig 2.9) point to an ancient evolutionary origin for DAMP. Furthermore, Figure 2.1 suggests that DAMP also occurs in viruses and bacteriophage. Any variation in mutation rates in viruses and phage lacking mutation avoidance or correction mechanisms must be mediated by the host environment. Consistent with this, we see different mutation rates, but similar DAMP, for the same RNA virus in different host cells (Figure 2.13, reanalysed from Combe and Sanjuán 2014). DAMP itself therefore seems closely related to basic processes of replication common to all organisms. Nonetheless, our findings are limited to organisms where it is possible to assay mutation rate by fluctuation tests. This excludes multicellular eukaryotes, so how our findings might apply to them is unclear. Recent findings of variation in mitochondrial mutation rates at different population sizes and densities of the nematode highlight the challenge of separating out what population density could mean at the organism, tissue, cellular and sub-cellular (e.g. mitochondrial) levels (Konrad et al. 2017). Even so, if it were possible to manipulate microbial DAMP clinically as well as genetically (Fig 2.9), for instance as a strategy to slow the rate at which antibiotic resistance arises (Oldfield and Feng 2014), that could be applicable across the breadth of microbes, including pathogenic viruses.


Figure 2.13 Density-associated mutation-rate plasticity (DAMP) in Vesicular stomatitis virus hosted by different cell lines. Data from Sanjuan et al. (2010); plaque-forming units were used to estimate population density of viral particles. Viral mutation rates to monoclonal antibody resistance were estimated in different host cells grown in normal (21%) oxygen levels at 37°C (squares) or 28°C (triangles) and in low (1%) oxygen levels at 37°C (circles). Hosts were baby hamster kidney cells (BHK), CT26 colon cancer cells (CT26), wild-type and Δp53 primary mouse embryonic fibroblasts (MEF and MEF_p53, respectively), Neuro-2a cells (Neuro), ovarian cells of the moth (sf21) and cells of mosquito larvae (C6/36). Lines are from Model 2.18 in Appendix (N=34; likelihood ratio test that host environment has no effect on the viral mutation rate: LR₇=68, P=4.3×10⁻¹²). Note the logarithmic axes. Originally denoted as Fig S11 in publication.


2.5 Materials and Methods

2.5.1 Strains of bacteria and yeast used in Chapter 2

Table 2.1 A list of all bacteria and yeast strains used in this chapter

Strain | Genotype | Source or reference
E. coli MG1655 | | Karina B. Xavier
E. coli JW3350-2 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, Δdam-722::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW0221-1 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔdinB749::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW2799-1 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔmutH756::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW4128-1 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔmutL720::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW3610-2 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔmutM744::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW2703-2 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔmutS738::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW0097-1 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔmutT790::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW0097-3 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔmutT790::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW2928-1 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔmutY736::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW0704-1 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, Δnei764::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW0194-1 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔmetI723::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
P. aeruginosa PAO1 | | Johanna M. Schwingel
S. cerevisiae BY4742 | | Daniela Delneri
S. cerevisiae Sigma1287b | | Daniela Delneri
S. cerevisiae S288C | | Daniela Delneri
S. cerevisiae Sigma1287b | PCD1-Δ | Daniela Delneri
S. cerevisiae BY4742 | PCD1-Δ | Daniela Delneri
S. cerevisiae Sigma1287b | MLH1-Δ | Daniela Delneri

2.5.2 Media.

We used MilliQ water for all media. Tetrazolium arabinose agar (TA), Davis minimal medium (DM) and M9 minimal medium were prepared according to Lenski et al. (1991). Luria-Bertani medium (LB), yeast extract peptone medium (YP) and yeast nitrogen base (YNB) were prepared according to manufacturers’ instructions. Magnesium sulphate heptahydrate, thiamine hydrochloride, carbon source (3 g/l L-arabinose or various concentrations of D-glucose) and 2,3,5-triphenyltetrazolium chloride (Sigma T8877) were sterile filtered and added to cooled medium. Selective TA medium was supplemented with freshly prepared rifampicin (50 µg/ml) or nalidixic acid (30 µg/ml). Selective YP medium was supplemented with freshly prepared 5-FOA (1,000 µg/ml) or hygromycin B (300 µg/ml). Sterile saline (8.5 g/l NaCl) was used for all cell dilutions. All media were solidified as necessary with 15 g/l of agar (Difco).

2.5.3 Fluctuation tests with bacteria.

We did fluctuation tests with Escherichia coli and Pseudomonas aeruginosa as described in Krašovec et al. (2014a). In short, strains were first inoculated from frozen stock and grown in liquid LB medium at 37°C, then transferred to non-selective liquid DM (for E. coli) or M9 (for P. aeruginosa) supplemented with a particular concentration of glucose (25–300 mg l⁻¹) and allowed to grow overnight, shaking, at 37°C. E. coli and P. aeruginosa were again diluted into fresh DM or M9 medium, respectively, giving an initial population size (N0) of around 10,000 (range 2.5×10² – 1.3×10⁵) and 5,000 (range 2.5×10³ – 1.2×10⁴), respectively. Parallel cultures of various volumes (0.5–10 ml) were grown to saturation for 24 hours at 37°C in 96 deep-well plates or 50 ml falcon tubes. The position of each culture on a 96-well plate was chosen randomly. Final population size (Nt) of each culture was determined by two independent techniques. First, Nt was determined from colony forming units (CFU), by plating an appropriate dilution on solid non-selective TA medium. Second, Nt was estimated from net luminescence (LUM), measured using a Promega GloMax luminometer and the Promega Bac-Titer Glo kit according to the manufacturer's instructions. We measured luminescence of each culture 0.5 and 510 seconds after adding the Bac-Titer Glo reagent and calculated net luminescence as LUM = luminescence(510 s) – luminescence(0.5 s). Each estimate of Nt is an average of 3 cultures. Evaporation (routinely monitored by weighing the plate before and after the 24 h incubation) was accounted for in the Nt value determined by CFU and was also used in statistical modelling as a variance covariate. We obtained the observed number of mutants resistant to rifampicin or nalidixic acid, r, by plating the entirety of the remaining cultures onto solid selective TA medium, which allows spontaneous mutants to form colonies. Plates were incubated at 37°C and mutants were counted at the earliest possible time after plating: 44–48 hours for rifampicin plates and 68–72 hours for nalidixic acid plates.

2.5.4 Fluctuation tests with yeast.

We did fluctuation tests with yeast in a similar way to those with bacteria (see above). Strains were inoculated from frozen stock into liquid YP medium with 20 mg/ml of glucose at 30°C (200 rpm) and then transferred to non-selective liquid YNB medium supplemented with a particular percentage of YP (v/v), except for S288C, where YP was not added. We then allowed cultures to grow overnight at 30°C (200 rpm). Overnight cultures were again diluted into fresh medium, giving N0 of around 5,000 per parallel culture (range 5×10² – 5.1×10⁴). Parallel cultures of various volumes (0.35–10 ml) were grown in yeast nitrogen base with 25–8,000 mg l⁻¹ glucose and 0–7% v/v yeast extract peptone medium in 96 deep-well plates or in 50 ml falcon tubes to saturation for 48 or 72 hours at 30°C (200 rpm). We positioned each culture on the plate randomly. Nt was determined from colony forming units (CFU), by plating an appropriate dilution on solid non-selective YP medium, and from cell counts (CC) made with a haemocytometer (Cellometer Auto M10, Nexcelom) according to the manufacturer’s instructions. Nt was calculated from 3 cultures per mutation rate estimate, with both CFU and CC determined for each culture. Evaporation was accounted for in the Nt value determined by CFU and also used in statistical modelling as a variance covariate. We obtained the observed number of mutants resistant to 5-FOA or hygromycin B, r, by plating the entirety of the remaining cultures onto solid selective YP medium. Plates were incubated at 30°C and mutants were counted at the earliest possible time after plating, which for both markers was 68–72 hours.

For Figures 2.3A, 2.3B, 2.8, 2.9A, 2.9B, 2.9C and 2.9D we used 21, 14, 14, 8, 5, 5 and 3 independent experimental blocks, respectively, carried out on different days. Within an experimental block, multiple 96-well plates, or groups of falcon tubes, were used.

Any individual mutation rate estimate requires multiple parallel cultures (16 or 17), which were all carried out on a particular plate, or group of falcon tubes. For Figs 2.3A, 2.3B, 2.8, 2.9A, 2.9B, 2.9C and 2.9D the median number of plates used (with interquartile range) was 16 (15–16), 16 (16–16), 16 (16–16), 16 (16–16), 16 (16–16), 16 (16–16) and 16 (15–16), respectively.

2.5.5 Estimation of mutation rates.

To calculate the number of mutational events, m, from the observed number of mutants we employed the Ma-Sandri-Sarkar maximum-likelihood method implemented by the FALCOR web tool (Hall et al. 2009) or rSalvador (Zheng 2015, 2016). The mutation rate per cell per generation is calculated as m divided by the final population size, Nt. The median (with interquartile range) of the coefficient of variation for Nt estimated with CFU and with the ATP-based luminescence assay is 15.9 (9.6–24.8) and 10.9 (6.9–18.4), respectively. Ne is calculated as the harmonic mean of the population size across generations.
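To make the estimation step concrete, the following is a minimal sketch of a Ma-Sandri-Sarkar style maximum-likelihood estimate of m. It is not the FALCOR or rSalvador implementation used in this study: it assumes the basic Luria-Delbrück (Lea-Coulson) mutant distribution with no correction for plating efficiency or differential growth, uses a plain golden-section search that assumes a single likelihood peak within the search bounds, and all function names and the example counts are hypothetical.

```python
import math


def ld_probabilities(m, r_max):
    """P(r mutants), r = 0..r_max, from the Ma-Sandri-Sarkar recursion:
    p0 = exp(-m);  p_r = (m / r) * sum_{i<r} p_i / (r - i + 1)."""
    p = [math.exp(-m)]
    for r in range(1, r_max + 1):
        p.append((m / r) * sum(p[i] / (r - i + 1) for i in range(r)))
    return p


def log_likelihood(m, counts):
    """Log-likelihood of m given the mutant counts of the parallel cultures."""
    p = ld_probabilities(m, max(counts))
    return sum(math.log(p[r]) for r in counts)


def mss_mle(counts, lo=1e-6, hi=50.0, tol=1e-8):
    """Golden-section search for the m in [lo, hi] maximising the likelihood.
    The bounds are assumptions and should bracket the true maximum."""
    phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - phi * (b - a), a + phi * (b - a)
    fc, fd = log_likelihood(c, counts), log_likelihood(d, counts)
    while b - a > tol:
        if fc > fd:  # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - phi * (b - a)
            fc = log_likelihood(c, counts)
        else:        # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + phi * (b - a)
            fd = log_likelihood(d, counts)
    return (a + b) / 2


if __name__ == "__main__":
    # Hypothetical mutant counts r from 16 parallel cultures.
    counts = [0, 1, 0, 2, 5, 0, 1, 3, 0, 1, 0, 0, 2, 1, 0, 40]
    n_t = 2.0e8  # hypothetical final population size per culture
    m_hat = mss_mle(counts)
    print("m =", m_hat, " mutation rate per generation =", m_hat / n_t)
```

The final line follows the definition given above: the mutation rate per cell per generation is m divided by Nt. For real data, an established tool such as those cited is preferable, since it also provides confidence intervals.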

2.5.6 Statistical analysis.

All statistical analysis was carried out in R, using v3.2.4 with the spaMM v1.7.2 package (Model 2.1) and v3.3.1 with the nlme v3.1 package (Models 2.2 to 2.18) for linear mixed effects modelling (Pinheiro and Bates 2000; Rousset and Ferdy 2014). This enabled the inclusion within the same model of experimental factors (fixed effects), blocking effects (random effects) and factors affecting variance (heteroscedasticity) as described in Statistical models. In all cases the log2 mutation rate was used.
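As a rough illustration of what working on the log2 scale means for the slope of mutation rate against population density, the sketch below fits an ordinary least-squares line to log2-transformed values. This is a deliberate simplification and an assumption on our part: it is not the mixed effects modelling described above (it has no random effects or variance structure), it assumes density is also log2-transformed, and the function name and data are hypothetical.

```python
import math


def log2_slope(densities, rates):
    """Ordinary least-squares slope and intercept of log2(rate) on log2(density).

    A slope of -1 means the mutation rate halves when density doubles;
    a slope of 0 means no density association.
    """
    x = [math.log2(d) for d in densities]
    y = [math.log2(r) for r in rates]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Hypothetical densities (cells per ml) and mutation rates per generation.
    densities = [1e7, 2e7, 4e7, 8e7]
    rates = [8e-9, 4e-9, 2e-9, 1e-9]
    slope, intercept = log2_slope(densities, rates)
    print("slope =", slope, " intercept =", intercept)
```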

2.5.7 Whole genome sequencing.

E. coli genomes were sequenced on the Illumina HiSeq2500 platform using 2×250 bp paired-end reads. Sequencing and initial read quality checking were provided by MicrobesNG (http://www.microbesng.uk). Reads from strains derived from MG1655 and from the Keio collection were aligned to the E. coli str. K-12 substr. MG1655 (NC_000913.3) and E. coli BW25113 (NZ_CP009273.1) genomes, respectively. Mutations (i.e. single nucleotide substitutions, small and large indels, and copy number variants) were predicted using breseq-0.27.2 with the default settings (Deatherage and Barrick 2014; Table 2.2).

Table 2.2 Breseq analysis of mutations identified in genome sequence for two ΔmutT

Keio strains. See Materials and Methods for more details about the analysis.

Differences from the reference common to all Keio strains are not shown. Sequence data available at the European Nucleotide Archive (accession number ERP024110, http://www.ebi.ac.uk/ena/data/view/ERP024110).

Position | Mutation | ΔmutT 3795 / ΔmutT 3796 | Annotation | Gene | Description
308335 | T>G | yes | E333A (GAG>GCG) | NP_414825.1 < | ECP production outer membrane protein
1262200 | A>C | yes | L226L (CTT>CTG) | NP_415726.1 < | 4-diphosphocytidyl-2-C-methylerythritol kinase
1344259 | A>C | Yes | S30A (TCT>GCT) | NP_415800.1 < | global regulator of transcription; DeoR family
1513869 | T>G | yes | G13G (GGT>GGG) | NP_415959.1 > | putative ABC transporter permease
1871819 | A>C | Yes | K479Q (AAA>CAA) | NP_416299.1 > | putative membrane-anchored diguanylate cyclase
2193558 | A>C | Yes | L204* (TTA>TGA) | NP_416616.4 < | antiporter inner membrane protein
2405801 | A>C | Yes | Y281D (TAT>GAT) | NP_416792.1 < | transcriptional repressor of flagellar, motility and chemotaxis genes
2406280 | A>C | yes | V121G (GTG>GGG) | NP_416792.1 < | transcriptional repressor of flagellar, motility and chemotaxis genes
2959804 | A>C | Yes | L875L (CTT>CTG) | NP_417299.1 < | exonuclease V (RecBCD complex), gamma chain
2994210 | T>G | Yes | I91R (ATA>AGA) | pbl > | , peptidoglycan-binding enzyme family
3256922 | A>C | yes | N363K (AAT>AAG) | YP_026203.3 < | putative transporter
3283396 | A>C | yes | E85A (GAA>GCA) | NP_417606.1 > | tagatose 6-phosphate aldolase 1, kbaY subunit
3441340 | T>G | Yes | A112A (GCA>GCC) | NP_417755.1 < | 30S ribosomal subunit protein S4
3828787 | A>C | Yes | intergenic (122/158) | NP_418110.1 < / > NP_418111.1 | glutamate transporter/xanthine permease
3859103 | A>C | yes | F338L (TTT>TTG) | NP_418135.1 < | putative transporter
3992438 | T>G | Yes | L429* (TTA>TGA) | NP_418250.1 > | adenylate cyclase
3995699 | A>C | yes | E39D (GAA>GAC) | NP_418255.1 > | DUF484 family protein
4009304 | T>G | yes | F45L (TTT>TTG) | YP_026266.1 > | lysophospholipase L2
4087643 | T>G | yes | I214M (ATT>ATG) | NP_418332.1 > | DUF3829 family lipoprotein
4187714 | A>C | yes | K789Q (AAA>CAA) | NP_418415.1 > | RNA polymerase, beta prime subunit
4361043 | T>G | yes | N298H (AAT>CAT) | NP_418557.1 < | cadBA operon transcriptional activator
4463045 | A>C | yes | intergenic (385/+9) | NP_418659.1 < / < NP_418660.1 | anaerobic ribonucleoside-triphosphate reductase/trehalose-6-P hydrolase


2.5.8 Published mutation rate search criteria.

We identified studies that used Luria-Delbrück fluctuation tests for estimating mutation rates. We considered all papers citing the original reference (Luria and Delbrück 1943) and further searched the Google Scholar and Web of Science databases with the keywords “mutation rate”, “Luria Delbruck”, “fluctuation test” and “fluctuation assay”. We also considered papers cited by papers identified in this way.

We collected mutation rate estimations from studies spanning over 70 years, starting with Luria and Delbrück’s pioneering paper in 1943 outlining the fluctuation test (Luria and Delbrück 1943). In all we collected 474 mutation rate estimations from 68 separate studies (Table 2.3), covering 26 different organisms from across domains of life (Archaea, Bacteria, Eukaryota) and Viruses. From these studies we recorded the mutation rate estimation, the estimator used for calculating mutation rate, the final population density (D) of parallel cultures, identity of the non-selective medium, the organism studied, the selective marker used and its concentration and the study the estimate came from. We excluded estimates that i) involved microorganisms cultured in intentionally selective conditions, ii) used genetically manipulated or mutator strains or iii) did not plate the entire culture volume onto the selective media. Any of this information that was not included in the published article was collected via a direct communication with the corresponding author.

Table 2.3 List of papers from which mutation rate estimates are taken for analysis of published mutation rates in Figure 2.1.

1. C. Aguilar et al., Deletion of the 2-acyl-glycerophosphoethanolamine cycle improve glucose metabolism in Escherichia coli strains employed for overproduction of aromatic compounds. Microb. Cell. Fact. 14, 194 (2015).

2. F. I. Arias-Sánchez, A. R. Hall, Effects of antibiotic resistance alleles on bacterial evolutionary responses to viral parasites. Biol. Lett. 12, 20160064 (2016).


3. L. Boe, Translational errors as the cause of mutations in Escherichia coli. Mol. Gen. Genet. 231, 469-471 (1992).

4. H. Boshoff, M. Reed, C. B. III, V. Mizrahi, DnaE2 Polymerase Contributes to In Vivo Survival and the Emergence of Drug Resistance in Mycobacterium tuberculosis. Cell 113, 183-193 (2003).

5. K. Bradwell, M. Combe, P. Domingo-Calap, R. Sanjuán, Correlation between mutation rate and genome size in riboviruses: Mutation rate of bacteriophage Qβ. Genetics 195, 243-251 (2013).

6. S. Broomfield, B. L. Chow, W. Xiao, MMS2, encoding a -conjugating-enzyme-like protein, is a member of the yeast error-free pathway. Proc. Natl. Acad. Sci. U. S. A. 95, 5678-5683 (1998).

7. C. R. Busch, J. DiRuggiero, MutS and MutL are dispensable for maintenance of the genomic mutation rate in the halophilic archaeon Halobacterium salinarum NRC-1. PLoS ONE 5, 1-8 (2010).

8. M. Combe, R. Sanjuán, Variation in RNA Virus Mutation Rates across Host Cells. PLoS Pathogens 10, (2014).

9. B. Csörgo, T. Fehér, E. Tímár, F. R. Blattner, G. Pósfai, Low-mutation-rate, reduced-genome Escherichia coli: an improved host for faithful maintenance of engineered genetic constructs. Microb. Cell. Fact. 11, 11 (2012).

10. J. M. Cuevas, S. Duffy, R. Sanjuán, rate of bacteriophage ΦX174. Genetics 183, 747-749 (2009).

11. H. L. David, Probability Distribution of Drug-Resistant Mutants in Unselected Populations of Mycobacterium tuberculosis. Appl. Envir. Microbiol. 20, 810-814 (1970).

12. M. Demerec, Studies of the Streptomycin-Resistance System of Mutations in E. Coli. Genetics 36, 585-597 (1951).

13. M. Demerec, U. Fano, Bacteriophage-Resistant Mutants in Escherichia coli. Genetics 30, 119 (1945).

14. P. Domingo-Calap, M. Pereira-Gómez, R. Sanjuán, Nucleoside analogue mutagenesis of a single-stranded DNA virus: evolution and resistance. J. Virol. 86, 9640-9646 (2012).

15. M. S. Esposito, C. V. Bruschi, Diploid yeast cells yield homozygous spontaneous mutations. Curr. Genet. 23, 430-434 (1993).

16. M. S. Esposito, R. M. Ramirez, C. V. Bruschi, Nonrandomly-associated forward mutation and mitotic recombination yield yeast diploids homozygous for recessive mutations. Curr. Genet. 26, 302-307 (1994).

17. T. Feher, B. Cseh, K. Umenhoffer, I. Karcagi, G. Posfai, Characterization of cycA mutants of Escherichia coli. An assay for measuring in vivo mutation rates. Mutat. Res. 595, 184-190 (2006).


18. C. B. Ford et al., Use of whole genome sequencing to estimate the mutation rate of Mycobacterium tuberculosis during latent infection. Nat. Genet. 43, 482-486 (2011).

19. C. B. Ford et al., Emergence of Drug Resistant Tuberculosis. Nat. Genet. 45, 784-790 (2013).

20. V. Furió, A. Moya, R. Sanjuán, The cost of replication fidelity in an RNA virus. Proc. Natl. Acad. Sci. U. S. A. 102, 10233-10237 (2005).

21. W. E. Glaab, L. S. Mitchell, J. E. Miller, K. Vlasakova, T. R. Skopek, 5-Fluorouracil forward mutation assay in Salmonella: Determination of mutational target and spontaneous mutational spectra. Mutat. Res. 578, 238-246 (2005).

22. B. Grimberg, C. Zeyl, The Effects of Sex and Mutation Rate on Adaptation in Test Tubes. Evolution 59, 431-438 (2005).

23. B. G. Hall, Activation of the bgl operon by . Mol. Biol. Evol. 15, 1-5 (1998).

24. F. Hassim, A. O. Papadopoulos, B. D. Kana, B. G. Gordhan, A combinatorial role for MutY and Fpg DNA glycosylases in mutation avoidance in Mycobacterium smegmatis. Mutation Research - Fundamental and Molecular Mechanisms of Mutagenesis 779, 24-32 (2015).

25. E. Huitric et al., Rates and mechanisms of resistance development in Mycobacterium tuberculosis to a novel diarylquinoline ATP synthase inhibitor. Antimicrob. Agents Chemother. 54, 1022-1028 (2010).

26. K. L. Jacobs, D. W. Grogan, Rates of spontaneous mutation in an archaeon from geothermal environments. J. Bacteriol. 179, 3298-3303 (1997).

27. S. M. Karve et al., Escherichia coli populations in unpredictably fluctuating environments evolve to face novel stresses through enhanced efflux activity. J. Evol. Biol. 28, 1131-1143 (2015).

28. P. Komp Lindgren, A. Karlsson, D. Hughes, Mutation rate and evolution of fluoroquinolone resistance in Escherichia coli isolates from patients with urinary tract infections. Antimicrob. Agents Chemother. 47, 3222-3232 (2003).

29. K. Kurthkoti et al., A distinct physiological role of MutY in mutation prevention in mycobacteria. Microbiology 156, 88-98 (2010).

30. K. Kurthkoti, P. Kumar, R. Jain, U. Varshney, Important role of the nucleotide excision repair pathway in Mycobacterium smegmatis in conferring protection against commonly encountered DNA-damaging agents. Microbiology 154, 2776-2785 (2008).

31. G. I. Lang, A. W. Murray, Estimating the per-base-pair mutation rate in the yeast Saccharomyces cerevisiae. Genetics 178, 67-82 (2008).

32. G. I. Lang, A. W. Murray, Mutation rates across budding yeast chromosome VI Are correlated with replication timing. Genome Biol. Evol. 3, 799-811 (2011).


33. D. H. Lee, R. J. Miles, J. R. Inal, Antibiotic sensitivity and mutation rates to antibiotic resistance in Mycoplasma mycoides ssp. mycoides. Epidemiol. Infect. 98, 361-368 (1987).

34. R. E. Lenski et al., Sustained fitness gains and variability in fitness trajectories in the long-term evolution experiment with Escherichia coli. Proc Biol Sci 282, 20152292 (2015).

35. S. F. Levy et al., Quantitative using high-resolution tracking. Nature 519, 181-186 (2015).

36. H. Long et al., Background mutational features of the -resistant bacterium Deinococcus radiodurans. Mol. Biol. Evol. 32, 2383-2392 (2015).

37. H. Long et al., Mutation rate, spectrum, topology, and context-dependency in the DNA mismatch repair-deficient Pseudomonas fluorescens ATCC948. Genome Biol. Evol. 7, 262-271 (2014).

38. S. E. Luria, M. Delbrück, Mutations of bacteria from virus sensitivity to virus resistance. Genetics 28, 491-511 (1943).

39. E. E. Machowski, S. Barichievy, B. Springer, S. I. Durbach, V. Mizrahi, In vitro analysis of rates and spectra of mutations in a polymorphic region of the Rv0746 PE_PGRS gene of Mycobacterium tuberculosis. J. Bacteriol. 189, 2190-2195 (2007).

40. M. D. Maciá et al., Efficacy and potential for resistance selection of antipseudomonal treatments in a mouse model of lung infection by hypermutable Pseudomonas aeruginosa. Antimicrob. Agents Chemother. 50, 975-983 (2006).

41. R. R. Mackwan, G. T. Carver, J. W. Drake, D. W. Grogan, An unusual pattern of spontaneous mutations recovered in the halophilic archaeon Haloferax volcanii. Genetics 176, 697-702 (2007).

42. R. R. Mackwan, G. T. Carver, G. E. Kissling, J. W. Drake, D. W. Grogan, The rate and character of spontaneous mutation in Thermus thermophilus. Genetics 180, 17-25 (2008).

43. V. S. Malshetty, R. Jain, T. Srinath, K. Kurthkoti, U. Varshney, Synergistic effects of UdgB and Ung in mutation prevention and protection against commonly encountered DNA damaging agents in Mycobacterium smegmatis. Microbiology 156, 940-949 (2010).

44. A. E. Minias, A. M. Brzostek, P. Minias, J. Dziadek, The deletion of rnhB in Mycobacterium smegmatis does not affect the level of RNase HII substrates or influence genome stability. PLoS ONE 10, e0115521 (2015).

45. M. R. Monti, V. Miguel, M. V. Borgogno, C. E. Argaraña, Functional analysis of the interaction between the mismatch repair protein MutS and the replication processivity factor β clamp in Pseudomonas aeruginosa. DNA Repair 11, 463-469 (2012).

46. H. B. Newcombe, Delayed Phenotypic Expression of Spontaneous Mutations in Escherichia Coli. Genetics 33, 447-476 (1948).

47. H. B. Newcombe, R. Hawirko, Spontaneous Mutation to Streptomycin Resistance and Dependence in Escherichia coli. J. Bacteriol. 57, 565-572 (1949).

48. H. B. Newcombe, G. J. Mc, On the nonadaptive nature of change to full streptomycin resistance in Escherichia coli. J. Bacteriol. 62, 539-544 (1951).

49. S. Oide et al., Thermal and solvent stress cross-tolerance conferred to Corynebacterium glutamicum by adaptive laboratory evolution. Appl. Environ. Microbiol. 81, 2284-2298 (2015).

50. M. Pereira-Gómez, R. Sanjuán, Delayed lysis confers resistance to the nucleoside analogue 5-Fluorouracil and alleviates mutation accumulation in the single-stranded DNA bacteriophage ϕX174. J. Virol. 88, 5042-5049 (2014).

51. M. Pereira-Gómez, R. Sanjuán, Effect of mismatch repair on the mutation rate of bacteriophage ϕX174. Virus Evolution 1, vev010 (2015).

52. C. Rajanna et al., A strain of Yersinia pestis with a mutator phenotype from the Republic of Georgia. FEMS Microbiol. Lett. 343, 113-120 (2013).

53. C. Riesenfeld, M. Everett, L. J. V. Piddock, B. G. Hall, Adaptive mutations produce resistance to ciprofloxacin. Antimicrob. Agents Chemother. 41, 2059-2060 (1997).

54. C. W. Russell, M. A. Mulvey, The Extraintestinal Pathogenic Escherichia coli Factor RqlI Constrains the Genotoxic Effects of the RecQ-Like RqlH. PLoS Pathogens 11, 1-29 (2015).

55. S. J. Schrag, P. A. Rota, W. J. Bellini, Spontaneous mutation rate of measles virus: direct estimation based on mutations conferring monoclonal antibody resistance. J. Virol. 73, 51-54 (1999).

56. S. Shewaramani, T. J. Finn, S. C. Leahy, R. Kassen, P. B. Rainey, C. D. Moon, Anaerobically Grown Escherichia coli Has an Enhanced Mutation Rate and Distinct Mutational Spectra. PLoS Genet. 13(1): e1006570 (2017).

57. P. Siminoff, Development of Bacterial Resistance To Antibiotics. J. Bacteriol. 77, 79-85 (1959).

58. M. Sussman, S. G. Bradley, Mutant yeast strains resistant to arsenate and azide. J. Bacteriol. 66, 52-59 (1953).

59. C. Torres-Barceló, M. Kojadinovic, R. Moxon, R. C. MacLean, The SOS response increases bacterial fitness, but not evolvability, under a sublethal dose of antibiotic. Proc. R. Soc. London, B 282, 20150885 (2015).

60. J. E. Turse, J. Pei, T. A. Ficht, Lipopolysaccharide-deficient Brucella variants arise spontaneously during infection. Front. Microbiol. 2, 1-12 (2011).

61. A. J. Vogler et al., Molecular Analysis of Rifampin Resistance in Bacillus anthracis and Bacillus cereus. Antimicrob. Agents Chemother. 46, 511-513 (2002).

62. G. Wang et al., Spontaneous Mutations That Confer Antibiotic Resistance in Helicobacter pylori. Antimicrob. Agents Chemother. 45, 727-733 (2001).

63. T. Watanabe, T. Fukasawa, D. Ushiba, Probable absence of direct induction of bacterial resistance to streptomycin. J. Bacteriol. 73, 770-777 (1957).

64. M. E. Watson, J. L. Burns, A. L. Smith, Hypermutable Haemophilus influenzae with mutations in mutS are found in cystic fibrosis sputum. Microbiology 150, 2947-2958 (2004).

65. J. Werngren, S. E. Hoffner, Drug-susceptible Mycobacterium tuberculosis Beijing genotype does not develop mutation-conferred resistance to rifampin at an elevated rate. J. Clin. Microbiol. 41, 1520-1524 (2003).

66. S. Wielgoss et al., Mutation rate dynamics in a bacterial population reflect tension between adaptation and genetic load. Proc. Natl. Acad. Sci. U. S. A. 110, 222-227 (2013).

67. C. Zeyl, M. Mizesko, J. A. G. M. De Visser, in laboratory yeast populations. Evolution 55, 909-917 (2001).

Where the number of observed mutants per plate was available and the estimator was not the Ma-Sandri-Sarkar maximum-likelihood method, we recalculated the mutation rate using this method as implemented by the FALCOR web tool (Hall et al. 2009). When only the proportion of plates without mutants was available, we recalculated the mutation rate using the P0 method (Foster 2006), that is, as −ln(P0)/Nt, where P0 is the proportion of plates containing no mutant colonies. For viral mutation rates, we recorded the mutation rate as substitutions per strand copying. Where the published mutation rates were not in this format, they were converted using equation 10 in Sanjuan et al. (2010).
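The P0 calculation described above is simple enough to sketch directly. The function name and the example numbers below are hypothetical, and the sketch assumes that the whole of every culture was plated, as required by the inclusion criteria above.

```python
import math


def p0_mutation_rate(plates_without_mutants, total_plates, n_t):
    """Mutation rate per generation by the P0 method: -ln(P0) / Nt,
    where P0 is the proportion of plates with no mutant colonies."""
    p0 = plates_without_mutants / total_plates
    if not 0.0 < p0 <= 1.0:
        # P0 = 0 gives an undefined (infinite) estimate.
        raise ValueError("P0 must be greater than 0 and at most 1")
    m = -math.log(p0)  # expected number of mutational events per culture
    return m / n_t


if __name__ == "__main__":
    # Hypothetical example: 8 of 16 plates had no mutants, Nt = 1e9 cells.
    print(p0_mutation_rate(8, 16, 1e9))
```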

2.5.9 Phylogeny used in analysing published mutation rates.

To take account of the fact that organisms may show similarity (including in mutation rates) through common ancestry, we accounted for phylogenetic relatedness using a correlation matrix within our Model 2.1. This matrix was derived from a phylogeny constructed from a combination of three published phylogenies. All the bacteria and archaea came from the ‘All-Species Living Tree’ Project (LTP) (Yarza et al. 2008) version LTPs123, and this phylogeny was used as the base to which other organisms were added. Saccharomyces cerevisiae was taken from Lane and Darst (2010), and viruses were taken from Nasir and Caetano-Anollés (2015). Branch lengths for S. cerevisiae and the viruses were scaled to correspond to those in the LTP tree as follows. For S. cerevisiae, average branch lengths from the tips to the last common ancestor of the archaea Halobacterium and Sulfolobus were compared between the LTP and Lane and Darst (2010) trees. The ratio between them was applied to the branch length of S. cerevisiae in the Lane and Darst (2010) phylogeny before adding it to the LTP tree at the branch point of the bacteria and archaea. For the viruses, four tips common to both trees (P. aeruginosa, Thermus thermophilus, S. cerevisiae and Halobacterium) were selected, and the distances of each to a shared common ancestor (P. aeruginosa with T. thermophilus, and S. cerevisiae with Halobacterium) in the two trees were plotted against each other. A straight line was then fitted through the origin and the gradient of that line was used to correct the branch lengths of the viruses in the Nasir and Caetano-Anollés (2015) tree before they were added to the combined LTP/S. cerevisiae tree at the branch point of the three domains.
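The through-origin fit used to rescale the viral branch lengths can be sketched as follows. This is an illustration under stated assumptions: we take the source-tree distances as x and the target-tree distances as y, so that a source branch length is multiplied by the gradient, and the distance values are invented placeholders rather than values from the published trees.

```python
def gradient_through_origin(x, y):
    """Least-squares gradient of a straight line y = g * x forced through
    the origin: g = sum(x_i * y_i) / sum(x_i ** 2)."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def rescale_branch(length_in_source_tree, gradient):
    """Convert a branch length from the source tree's scale to the target's."""
    return length_in_source_tree * gradient


if __name__ == "__main__":
    # Placeholder tip-to-ancestor distances for the four shared tips.
    source_tree = [0.40, 0.35, 0.60, 0.55]  # hypothetical, source-tree scale
    target_tree = [0.21, 0.17, 0.31, 0.27]  # hypothetical, target-tree scale
    g = gradient_through_origin(source_tree, target_tree)
    print("gradient =", g, " rescaled branch =", rescale_branch(0.8, g))
```

Forcing the line through the origin encodes the assumption that zero distance in one tree corresponds to zero distance in the other, so a single multiplicative factor converts between scales.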

Some organisms in our analysis were not present in these phylogenies. These were: separate serovars of Salmonella enterica (serovars Typhimurium and Enteritidis) which we treated as subspecies of Salmonella enterica (indica and enterica, present in the tree); Vesicular stomatitis virus and Measles virus, which were combined into their common Order of Mononegavirales; and Bacteriophage ΦX174, which was positioned at Bacteriophage M13. The final phylogeny is shown in Figure 2.14.


Figure 2.14 Phylogeny used in analysing published mutation rates.

Phylogeny used to control for relatedness in Model 2.1 analysing data in Figure 2.1. See

Materials and Methods for construction and usage. Originally denoted as Fig S1 in publication.


2.6 References

Alexander, Helen K., Stephanie I. Mayer, and Sebastian Bonhoeffer. 2017. “Population Heterogeneity in Mutation Rate Increases the Frequency of Higher-Order Mutants and Reduces Long-Term Mutational Load.” Molecular Biology and Evolution 34(2):419–36.

Alexandrov, Ludmil B. et al. 2015. “Clock-like Mutational Processes in Human Somatic Cells.” Nature Genetics 47(12):1402–7.

Baba, Tomoya et al. 2006. “Construction of Escherichia coli K-12 in-Frame, Single-Gene Knockout Mutants: The Keio Collection.” Molecular Systems Biology 2(1):2006.0008.

Belavkin, Roman V. et al. 2016. “Monotonicity of Fitness Landscapes and Mutation Rate Control.” Journal of Mathematical Biology 73(6–7):1491–1524.

Campbell, Catarina D., and Evan E. Eichler. 2013. “Properties and Rates of Germline Mutations in Humans.” Trends in Genetics 29(10):575–84.

Combe, Marine, and Rafael Sanjuán. 2014. “Variation in RNA Virus Mutation Rates across Host Cells.” PLoS Pathogens 10(1).

Deatherage, DE, and JE Barrick. 2014. “Identification of Mutations in Laboratory-Evolved Microbes from next-Generation Sequencing Data Using Breseq.” Pp. 165–88 in Engineering and Analyzing Multicellular Systems: Methods and Protocols, edited by L Sun and W Shou. New York: Springer New York.

Drake, John W. 1991. “A Constant Rate of Spontaneous Mutation in DNA-Based Microbes.” Proceedings of the National Academy of Sciences USA 88:7160–64.

Erez, Zohar et al. 2017. “Communication between Viruses Guides Lysis-Lysogeny Decisions.” Nature 541(7638):488–93.

Ford, Christopher B. et al. 2011. “Use of Whole Genome Sequencing to Estimate the Mutation Rate of Mycobacterium tuberculosis during Latent Infection.” Nature Genetics 43(5):482–86.

Foster, Patricia L. 2006. “Methods for Determining Spontaneous Mutation Rates.” Methods Enzymol 409:195–213.

Galhardo, Rodrigo S., P. J. Hastings, and Susan M. Rosenberg. 2007. “Mutation as a Stress Response and the Regulation of Evolvability.” Critical Reviews in Biochemistry and Molecular Biology 42(5):399–435.

Gon, Stéphanie, Rita Napolitano, Walter Rocha, Stéphane Coulon, and Robert P. Fuchs. 2011. “Increase in dNTP Pool Size during the DNA Damage Response Plays a Key Role in Spontaneous and Induced-Mutagenesis in Escherichia coli.” Proceedings of the National Academy of Sciences 108(48):19311–16.

Goodman, Myron F. 2002. “Error-Prone Repair DNA Polymerases in Prokaryotes and Eukaryotes.” Annual Review of Biochemistry 71:17–50.

Gutierrez, A. et al. 2013. “β-Lactam Antibiotics Promote Bacterial Mutagenesis via an RpoS-Mediated Reduction in Replication Fidelity.” Nature Communications 4(1):1610.

Hall, Alex R., James C. Iles, and R. Craig MacLean. 2011. “The Fitness Cost of Rifampicin Resistance in Pseudomonas aeruginosa Depends on Demand for RNA Polymerase.” Genetics 187(3):817–22.

Hall, Brandon M., Chang-Xing Ma, Ping Liang, and Keshav K. Singh. 2009. “Fluctuation Analysis CalculatOR: A Web Tool for the Determination of Mutation Rate Using Luria-Delbruck Fluctuation Analysis.” Bioinformatics (Oxford, England) 25(12):1564–65.

Halligan, Daniel L., and Peter D. Keightley. 2009. “Spontaneous Mutation Accumulation Studies in Evolutionary Genetics.” Annual Review of Ecology, Evolution, and Systematics 40(1):151–72.

Jee, Justin et al. 2016. “Rates and Mechanisms of Bacterial Mutagenesis from Maximum-Depth Sequencing.” Nature 534(7609).

Karafotias, Giorgos, Mark Hoogendoorn, and A. E. Eiben. 2015. “Parameter Control in Evolutionary Algorithms: Trends and Challenges.” IEEE Transactions on Evolutionary Computation 19(2):167–87.

Kong, Augustine et al. 2012. “Rate of de Novo Mutations and the Importance of Father’s Age to Disease Risk.” Nature 488(7412):471–75.

Konrad, Anke et al. 2017. “Mitochondrial Mutation Rate, Spectrum and Heteroplasmy in Caenorhabditis elegans Spontaneous Mutation Accumulation Lines of Differing Population Size.” Molecular Biology and Evolution 34(6):msx051.

Krašovec, Rok et al. 2014a. “Mutation Rate Plasticity in Rifampicin Resistance Depends on Escherichia coli Cell-Cell Interactions.” Nature Communications 5:3742.

Krašovec, Rok et al. 2014b. “Where Antibiotic Resistance Mutations Meet Quorum-Sensing.” Microbial Cell 1(7):250–52.

Kunkel, Thomas A., and Dorothy A. Erie. 2005. “DNA Mismatch Repair.” Annual Review of Biochemistry 74(1):681–710.

LaCroix, Ryan A. et al. 2015. “Use of Adaptive Laboratory Evolution to Discover Key Mutations Enabling Rapid Growth of Escherichia coli K-12 MG1655 on Glucose Minimal Medium.” Applied and Environmental Microbiology 81(1):17–30.

Lane, William J., and Seth A. Darst. 2010. “Molecular Evolution of Multisubunit RNA Polymerases: Sequence Analysis.” Journal of Molecular Biology 395(4):671–85.

Lenski, Richard E., Michael R. Rose, Suzanne C. Simpson, and Scott C. Tadler. 1991. “Long-Term Experimental Evolution in Escherichia coli. I. Adaptation and Divergence During 2,000 Generations.” The American Naturalist 138:1315–41.

Luria, S. E., and M. Delbrück. 1943. “Mutations of Bacteria from Virus Sensitivity to Virus Resistance.” Genetics 28(6):491–511.

Lynch, Michael et al. 2016. “Genetic Drift, Selection and the Evolution of the Mutation Rate.” Nature Reviews Genetics 17(11):704–14.

MacLean, R. Craig, Clara Torres-Barceló, and Richard Moxon. 2013. “Evaluating Evolutionary Models of Stress-Induced Mutagenesis in Bacteria.” Nature Reviews Genetics 14(3):221–27.

Maki, Hisaji. 2002. “Origins of Spontaneous Mutations: Specificity and Directionality of Base-Substitution, Frameshift, and Sequence-Substitution Mutageneses.” Annual Review of Genetics 36(1):279–303.

Maki, Hisaji, and Mutsuo Sekiguchi. 1992. “MutT Protein Specifically Hydrolyses a Potent Mutagenic Substrate for DNA Synthesis.” Nature 355(6357):273–75.

Massey, Ruth C., and Angus Buckling. 2002. “Environmental Regulation of Mutation Rates at Specific Sites.” Trends in Microbiology 10(12):580–84.

McLennan, A. G. 2006. “The Nudix Hydrolase Superfamily.” Cellular and Molecular Life Sciences 63(2):123–43.

Michaels, Mark Leo, and Jeffrey H. Miller. 1992. “The GO System Protects Organisms from the Mutagenic Effect of the Spontaneous Lesion 8-Hydroxyguanine (7,8-Dihydro-8-Oxoguanine).” Journal of Bacteriology 174(20):6321–25.

Miller, MB, and BL Bassler. 2001. “Quorum Sensing in Bacteria.” Annual Review of Microbiology 55:165–99.

Nakagawa, Shinichi, and Holger Schielzeth. 2013. “A General and Simple Method for Obtaining R2 from Generalized Linear Mixed-Effects Models.” Edited by Robert B. O’Hara. Methods in Ecology and Evolution 4(2):133–42.

Nasir, Arshan, and Gustavo Caetano-Anollés. 2015. “A Phylogenomic Data-Driven Exploration of Viral Origins and Evolution.” Science Advances 1(8):e1500527.

Nunoshiba, Tatsuo et al. 2004. “A Novel Nudix Hydrolase for Oxidized Purine Nucleoside Triphosphates Encoded by ORF YLR151c (PCD1 Gene) in Saccharomyces cerevisiae.” Nucleic Acids Research 32(18):5339–48.

Oldfield, Eric, and Xinxin Feng. 2014. “Resistance-Resistant Antibiotics.” Trends in Pharmacological Sciences 35(12):664–74.

Oliver, A. 2000. “High Frequency of Hypermutable Pseudomonas aeruginosa in Cystic Fibrosis Lung Infection.” Science 288(5469):1251–53.

Paradis, Emmanuel, Julien Claude, and Korbinian Strimmer. 2004. “APE: Analyses of Phylogenetics and Evolution in R Language.” Bioinformatics 20(2):289–90.

Pinheiro, J., and D. Bates. 2000. Mixed Effects Models in S and S-Plus. New York: Springer.

Ram, Yoav, and Lilach Hadany. 2012. “The Evolution of Stress-Induced Hypermutation in Asexual Populations.” Evolution 66(7):2315–28.

Rousset, François, and Jean-Baptiste Ferdy. 2014. “Testing Environmental and Genetic Effects in the Presence of Spatial Autocorrelation.” Ecography 37(8):781–90.

Sanjuan, R., Miguel R. Nebot, Nicola Chirico, Louis M. Mansky, and Robert Belshaw. 2010. “Viral Mutation Rates.” Journal of Virology 84(19):9733–48.

Sniegowski, PD, PJ Gerrish, and RE Lenski. 1997. “Evolution of High Mutation Rates in Experimental Populations of E. coli.” Nature 387:703–5.

Uphoff, Stephan et al. 2016. “Stochastic Activation of a DNA Damage Response Causes Cell-to-Cell Mutation Rate Variation.” Science 351(6277):1094–97.

Wrande, Marie, JR Roth, and Diarmaid Hughes. 2008. “Accumulation of Mutants in ‘Aging’ Bacterial Colonies Is Due to Growth under Selection, Not Stress-Induced Mutagenesis.” Proceedings of the National Academy of Sciences 105(12):11863–68.

Yarza, Pablo et al. 2008. “The All-Species Living Tree Project: A 16S rRNA-Based Phylogenetic Tree of All Sequenced Type Strains.” Systematic and Applied Microbiology 31(4):241–50.

Zheng, Qi. 2015. “A New Practical Guide to the Luria-Delbrück Protocol.” Mutation Research - Fundamental and Molecular Mechanisms of Mutagenesis 781:7–13.

Zheng, Qi. 2016. “A Second Look at the Final Number of Cells in a Fluctuation Experiment.” Journal of Theoretical Biology 401:54–63.

2.7 Appendix - Statistical Models and Acknowledgments

Model 2.1

The model fitted to the data presented in Figure 2.1 (analysis of published mutation rates) describes the log2-transformed mutation rates from the published literature as a function of the mean-centred log2-transformed final population density, D (fixed effect). The effects on the intercept of the organism (26 levels), the culture environment (44 levels), the marker and its concentration (70 levels) and the article in which the estimate was published (68 levels) were all initially included as partially crossed random effects. We also initially included random slope effects for all these factors. Because of known differences in estimator accuracy, we also allowed for different levels of variance for the different estimators used to calculate mutation rate. To do this, we created indicator variables for the four estimators used and incorporated each indicator variable within the model, allowing a different individual-level random effect for each estimation method.
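The transformation of the predictor described above (log2, then mean-centring) can be illustrated with a short sketch. The density values are invented and the function name is hypothetical. Centring does not change the fitted slope; it moves the intercept so that it refers to the rate expected at the mean log2 density rather than at a density of 1.

```python
import math


def centred_log2(values):
    """Return (mean-centred log2 values, mean of the log2 values)."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    logs = [math.log2(v) for v in values]
    mean = sum(logs) / len(logs)
    return [x - mean for x in logs], mean


# Hypothetical final population densities (cells per ml), for illustration only.
densities = [2.0 ** 20, 2.0 ** 24, 2.0 ** 28]
centred, mean_log2 = centred_log2(densities)
print(centred, mean_log2)
```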

The spaMM package allows for incorporating a correlation matrix. We used the correlation expected under a Brownian model of evolution across the phylogenetic tree described in the Materials and Methods (Figure 2.2), calculated via the ape package (Paradis, Claude, and Strimmer 2004), using $V_{ij} = \gamma \times t_a$, where $t_a$ is the distance between the root and the last common ancestor of taxa $i$ and $j$, and $\gamma$ is a constant. We used the Akaike Information Criterion (AIC) to reduce this maximal model by removing unnecessary effects. During model simplification, models with each of the above random effects on the intercept and either organism or published article on slope were equivalent (difference of 0.3 in AIC between models). The final minimal adequate model reported consists of random effects of culture environment, organism, marker and its concentration and published article on the intercept, and a random effect of organism on the slope with D. Further details are given in the ANOVA table below. The proportion of variance explained by the fixed effect (D) was calculated based on the equations given in Nakagawa and Schielzeth (2013). Specifically, we calculated the variance explained by the fixed effect ($\sigma^2_f = 8.05$), the variance explained by the random effects (the sum of the variance explained by each random effect $l$, $\sigma^2_L = 12.02$) and the total variance (the sum of the variance explained by the fixed effect, the random effects and the residual variance, i.e. $\sigma^2_t = \sigma^2_f + \sigma^2_L + \sigma^2_\varepsilon = 20.68$) to give the proportion of variance explained by the fixed effect of D, accounting for the random effects: $\sigma^2_f / (\sigma^2_t - \sigma^2_L) = 0.929$. Explicit inclusion of genome size (median total length in Mb as listed for each species name at www.ncbi.nlm.nih.gov/genome on 2nd June 2017) as a fixed effect in this model has little effect (difference of 0.03 in AIC between models; proportion of variance explained by D and genome size together, having accounted for variance explained by the random effects, = 0.930).
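The arithmetic behind the reported proportion can be checked directly from the three variance components quoted above (8.05, 12.02 and 20.68). The function name below is hypothetical; the calculation simply restates the ratio given in the text.

```python
def fixed_effect_share(var_fixed, var_random, var_total):
    """Proportion of the non-random-effect variance explained by the fixed effect."""
    remaining = var_total - var_random
    if remaining <= 0:
        raise ValueError("total variance must exceed random-effect variance")
    return var_fixed / remaining


var_fixed, var_random, var_total = 8.05, 12.02, 20.68

# Residual variance implied by the total: total - fixed - random.
var_residual = var_total - var_fixed - var_random

share = fixed_effect_share(var_fixed, var_random, var_total)
print(round(var_residual, 2), round(share, 2))
```

Subtracting the random-effect variance from the denominator means the ratio compares the fixed effect only against the fixed-plus-residual variance, which is why the value is so much higher than the share of the total variance (8.05 / 20.68).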

ANOVA table and fitted values for Model 2.1 (Figure 2.1).

See Materials and Methods for more details.

| Fixed Effects | Value | SE |
|---|---|---|
| Intercept | 5.1 | 0.52 |
| log2(D)centred | -0.67 | 0.085 |

| Random Effects | SD |
|---|---|
| Environment (intercept) | 0.75 |
| Marker & Concentration (intercept) | 4.8 |
| Paper (intercept) | 2.4 |
| Organism (intercept) | 1.1 |
| Organism (slope) | 0.04 |
| Maximum Likelihood estimator | 4.3×10^-9 |
| p0 estimator | 0.23 |
| Mean estimator | 2.8 |
| Residual (median estimator) | 0.61 |

Model 2.2

The model shown in Fig 2.3A (wild-type bacterial strains) fits log2 mutation rate against mean-centred log2 D (estimated via luminescence, see Model 2.4 below) allowing for differences in intercept and slope among the three treatments (different genotype/marker combinations). For this and all subsequent models, an initial model was fit by restricted maximum likelihood (REML) including all fixed effects (in this case D, treatment and their interaction) and all random effects (experimental plate nested within experimental block nested within experimenter, each affecting the intercept). A series of variants of this model was constructed allowing differences in variance (i.e. heteroscedasticity) associated with one or two covariates.

Potential variance covariates considered were: experimental block, organism, strain and genotype identity, selective marker and liquid growth media used, all treated as discrete effects with a different variance at each level. Continuous variance covariates allowed variance to change as a power function of the covariate. Potential continuous variance covariates considered were: the fitted values of the response variable, the initial population size (N0), the number of mutational events estimated (m), the coefficient of variation (CV) and standard deviation in that estimate, the initial glucose concentration, the final population size (Nt) and its standard deviation, D (estimated by colony forming units, CFU, cell count, CC, or net luminescence, LUM, as available), net luminescence per cell (LUM/Nt), gross luminescence, absolute fitness, the number of generations, the generation time, the percentage of YP, phi (1-Nt/N0), number, volume and incubation time of parallel cultures used in the fluctuation test, proportion of weight remaining following evaporation during the growth of parallel cultures in the liquid media, upper bound, lower bound and range of the mutation rate estimate.
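A continuous variance covariate of the kind described above can be sketched as follows. The parameterisation used here, variance = σ² × |v|^(2δ), is the one used by the `varPower` variance function in the R package nlme; whether the exponents reported in this appendix correspond to δ or to 2δ is not stated in the text, so that mapping is an assumption. Function names and numbers are illustrative.

```python
def power_variance(sigma2, covariate, delta):
    """Residual variance sigma2 * |v|**(2 * delta) for covariate value v."""
    return sigma2 * abs(covariate) ** (2.0 * delta)


def inverse_variance_weights(covariates, delta):
    """Relative weights (1 / variance) implied by the power structure."""
    return [1.0 / power_variance(1.0, v, delta) for v in covariates]


# Hypothetical covariate values, e.g. numbers of parallel cultures.
covariates = [1.0, 2.0, 4.0]
print([power_variance(0.5, v, 0.65) for v in covariates])
print(inverse_variance_weights(covariates, 1.0))
```

With δ = 0 the structure collapses to constant variance, so the homoscedastic model is a special case and the variants can be compared by AIC.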

The model variant with the lowest AIC was then chosen. In this case this model allowed variance to change with [standard deviation of the estimated number of mutational events × number of parallel cultures]^1.3. The two slopes of mutation rate with D estimated for Escherichia coli at different genotypic markers were very similar and this model was therefore simplified by combining these slopes to estimate a single value. No further simplification of the fixed effects was possible without significantly reducing the goodness of fit of the model. Here and below, a significant reduction in the goodness of fit was taken to mean P < 0.05 by likelihood ratio test, comparing models fit with and without particular fixed effects by maximum likelihood. Further details are given in the ANOVA table and in the diagnostic plots.
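The two comparison rules used here, lowest AIC for the variance structure and a likelihood ratio test at P < 0.05 for fixed effects, can be sketched with the standard library alone. The log-likelihoods below are hypothetical, and the chi-square tail is computed from closed forms plus a standard recurrence rather than from any package used in the thesis.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x < 0 or df < 1:
        raise ValueError("x must be >= 0 and df must be a positive integer")
    half = x / 2.0
    if df % 2 == 1:
        tail, k = math.erfc(math.sqrt(half)), 1   # closed form for df = 1
    else:
        tail, k = math.exp(-half), 2              # closed form for df = 2
    while k < df:                                 # step up two df at a time
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail


def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion; lower is better."""
    return 2.0 * n_parameters - 2.0 * log_likelihood


def likelihood_ratio_test(loglik_full, loglik_reduced, df_difference):
    """Return (test statistic, P value) for nested maximum-likelihood fits."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(max(statistic, 0.0), df_difference)


# Hypothetical maximum-likelihood fits with and without one fixed-effect term.
statistic, p_value = likelihood_ratio_test(-100.0, -103.0, 1)
keep_term = p_value < 0.05   # dropping the term significantly worsens the fit
print(statistic, p_value, keep_term)
```

Fixed effects are compared on maximum-likelihood fits, as the text notes, because REML likelihoods are not comparable between models with different fixed effects.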

ANOVA table and fitted values for Model 2.2 (Figure 2.3A).

See Materials and Methods for more details.

| Term | Degrees of freedom | Value | SE | F | P |
|---|---|---|---|---|---|
| Intercept (MG1655_rifampicin50) | 1 | 3.3 | 0.21 | 135 | 2.8×10^-26 |
| log2(D)centred | 1 | -0.68 | 0.059 | 129 | 1.7×10^-18 |
| genotype_marker (MG1655_nalidixic_acid30) | 2 | -4.1 | 0.32 | 88 | 1.8×10^-9 |
| (PAO1_rifampicin50) | | -1.9 | 0.32 | | |
| log2(D)centred:genotype_marker | 1 | 0.53 | 0.14 | 15 | 2.7×10^-4 |
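Because both the response and D are log2-transformed, the fitted slope is a power-law exponent: a slope of -0.68 for log2(D)centred means that mutation rate scales as D^-0.68. The sketch below converts that slope into a fold-change in mutation rate for a given fold-change in density. The function name is hypothetical, and the example ignores the genotype-specific interaction term.

```python
def rate_fold_change(slope, density_fold):
    """Fold-change in mutation rate implied by a log-log slope."""
    if density_fold <= 0:
        raise ValueError("density_fold must be positive")
    return density_fold ** slope


slope = -0.68                                 # baseline slope from the table above
per_doubling = rate_fold_change(slope, 2.0)   # effect of doubling D
per_tenfold = rate_fold_change(slope, 10.0)   # effect of a ten-fold rise in D
print(round(per_doubling, 3), round(per_tenfold, 3))
```

Under this reading, each doubling of final population density is associated with mutation rate falling to roughly 62% of its previous value.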

Diagnostic plots for Model 2.2.

Standardised residuals by fitted values and normal quantile-quantile plot of standardised residuals.

Model 2.3

The model shown in Fig 2.3B (wild-type yeast strains) fits log2 mutation rate against mean-centred log2 D (estimated from direct cell counts), interacting with fixed effects of genotype treatments. Random effects of experimental plate nested within experimental block on the intercept were also included. The best model allowed variance to change with both the [culture volume]^2.7 and the experimental block. The two slopes of mutation rate with D estimated for strains S288c and Sigma1278b were very similar, and this model was therefore simplified as above to combine these slopes to estimate a single value. No further simplification of the fixed effects was possible without significantly reducing the goodness of fit of the model. Further details are given in the ANOVA table and in the diagnostic plots.

ANOVA table and fitted values for Model 2.3 (Figure 2.3B).

See Materials and Methods for more details.

| Term | Degrees of freedom | Value | SE | F | P |
|---|---|---|---|---|---|
| Intercept (BY4742) | 1 | 10 | 0.49 | 309 | 6.0×10^-35 |
| log2(D)centred | 1 | -0.32 | 0.038 | 313 | 5.5×10^-13 |
| genotype (S288C) | 2 | -5.8 | 1.1 | 690 | 5.3×10^-4 |
| (Sigma_1278b) | | -4.7 | 0.13 | | |
| log2(D)centred:genotype | 1 | -0.66 | 0.13 | 25 | 3.6×10^-6 |

Diagnostic plots for Model 2.3

Standardised residuals by fitted values and normal quantile-quantile plot of standardised residuals.

Model 2.4

The model shown in Figure 2.15 (calibration curve) fits log2 D (measured using colony forming units) against mean-centred log2 of luminescence (in arbitrary units from the BacTiter-Glo assay, LUM), organism (i.e. species) and their interaction and random effects of genotype on the slope and intercept and of experimental plate nested within experimental block on the intercept. The best model allowed variance to change with [experimental block] × [fitted values of D]^-1.4. No simplification of the fixed effects was possible without significantly reducing the goodness of fit of the model. This model was used to calibrate the luminescence values to give the population densities used in Model 2.2, Model 2.7, Model 2.8 and Model 2.10 (allowing a different calibration curve for each genotype). Further details are given in the ANOVA table and in the diagnostic plots.

Figure 2.15 Calibration curves for final population density measured by counting colony forming units (CFU), against luminescence, assayed with the BacTiter-Glo assay (arbitrary units - AU). Calibration curves shown are from Model 2.4 (N=368) for E. coli and P. aeruginosa strains used in Fig 2.3, 2.8 and 2.9 and Model 2.2, Model 2.7, Model 2.8 and Model 2.10. Originally denoted as Fig S3 in publication.

ANOVA table and fitted values for Model 2.4 (Figure 2.15).

See Materials and Methods for more details.

| Term | Degrees of freedom | Value | SE | F | P |
|---|---|---|---|---|---|
| Intercept (E. coli) | 1 | 27 | 0.049 | 349696 | |
| log2(LUM)centred | 1 | 0.70 | 0.049 | 243 | 1.4×10^-33 |
| organism (P. aeruginosa) | 1 | 0.48 | 0.17 | 5.6 | 0.038 |
| log2(LUM)centred:organism | 1 | 0.38 | 0.17 | 5.0 | 0.027 |
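As a rough illustration of how such a calibration might be applied, the sketch below predicts log2 D from a luminescence reading using only the fixed-effect values in the table above. Several things are assumptions: the P. aeruginosa terms are treated as offsets from the E. coli intercept and slope, the genotype-level random effects are ignored, and the mean log2 luminescence used for centring (`lum_log2_mean`) is not reported here, so a placeholder value is used.

```python
import math

# Fixed-effect estimates read from the Model 2.4 table.
INTERCEPT_ECOLI = 27.0
SLOPE_ECOLI = 0.70
INTERCEPT_OFFSET_PAERUGINOSA = 0.48   # assumed to be an offset from E. coli
SLOPE_OFFSET_PAERUGINOSA = 0.38       # assumed to be an offset from E. coli


def predict_log2_density(luminescence, organism, lum_log2_mean):
    """Predicted log2 D for a luminescence reading in arbitrary units."""
    if luminescence <= 0:
        raise ValueError("luminescence must be positive")
    centred = math.log2(luminescence) - lum_log2_mean
    intercept, slope = INTERCEPT_ECOLI, SLOPE_ECOLI
    if organism == "P. aeruginosa":
        intercept += INTERCEPT_OFFSET_PAERUGINOSA
        slope += SLOPE_OFFSET_PAERUGINOSA
    elif organism != "E. coli":
        raise ValueError("organism must be 'E. coli' or 'P. aeruginosa'")
    return intercept + slope * centred


# Placeholder centring constant; the real value is not given in this section.
HYPOTHETICAL_MEAN = 10.0
log2_d = predict_log2_density(2.0 ** 12, "E. coli", HYPOTHETICAL_MEAN)
print(log2_d, 2.0 ** log2_d)
```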

Diagnostic plots for Model 2.4.

Standardised residuals by fitted values and normal quantile-quantile plot of standardised residuals.

Model 2.5

The model shown in Figure 2.4A (wild-type bacterial strains) is very similar to Model 2.2, but using D estimated by CFU rather than LUM. The best model allowed variance to change with [estimated number of mutational events]^-0.38 and [number of parallel cultures]^1.8. The two slopes of mutation rate with D estimated for Escherichia coli at different genotypic markers were very similar and this model was therefore simplified by combining these slopes to estimate a single value. No further simplification of the fixed effects was possible without significantly reducing the goodness of fit of the model. Further details are given in the ANOVA table and in the diagnostic plots.

ANOVA table and fitted values for Model 2.5 (Figure 2.4A).

See Materials and Methods for more details.

| Term | Degrees of freedom | Value | SE | F | P |
|---|---|---|---|---|---|
| Intercept (MG1655_rifampicin50) | 1 | 3.3 | 0.14 | 302 | 3.3×10^-37 |
| log2(D)centred | 1 | -0.73 | 0.053 | 165 | 6.5×10^-23 |
| genotype_marker (MG1655_nalidixic_acid30) | 2 | 4.2 | 0.25 | 150 | 3.1×10^-11 |
| (PAO1_rifampicin50) | | 2.1 | 0.27 | | |
| log2(D)centred:genotype_marker | 1 | 0.63 | 0.13 | 25 | 3.4×10^-6 |

Diagnostic plots for Model 2.5.

Standardised residuals by fitted values and normal quantile-quantile plot of standardised residuals.

Model 2.6

The model shown in Figure 2.4B (wild-type yeast strains) is very similar to Model 2.3 but using D estimated by CFU. The best model allowed variance to change with [generation time × culture volume]^0.66. The two slopes of mutation rate with D estimated for strains BY4742 and Sigma1278b were very similar, and this model was therefore simplified as above to combine these slopes to estimate a single value. No further simplification of the fixed effects was possible without significantly reducing the goodness of fit of the model. Further details are given in the ANOVA table and in the diagnostic plots.

ANOVA table and fitted values for Model 2.6 (Figure 2.4B).

See Materials and Methods for more details.

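| Term | Degrees of freedom | Value | SE | F | P |
|---|---|---|---|---|---|
| Intercept (S288C) | 1 | 5.3 | 0.77 | 349 | 3.9×10^-10 |
| log2(D)centred | 1 | -0.69 | 0.16 | 112 | 3.2×10^-5 |
| genotype (BY4742) | 2 | 4.6 | 0.89 | 225 | 1.4×10^-6 |
| (Sigma_1278b) | | 0.28 | 0.89 | | |
| log2(D)centred:genotype | 1 | 0.48 | 0.17 | 8.1 | 0.0052 |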

Diagnostic plots for Model 2.6.

Standardised residuals by fitted values and normal quantile-quantile plot of standardised residuals.

Model 2.7

The model shown in Figure 2.8 (various bacterial gene knockouts) fits log2 mutation rate against mean-centred log2 population D (estimated via luminescence, see Model 2.4), interacting with a fixed effect of “MMR” (mutH / mutL / mutS), “dinB”, “dam”, “nei” and “metI”. The model includes random effects of experimental plate nested within experimental block, and allows variance to change as [fitted values of the mutation rate]^-5.1 × [upper bound of the mutation rate]^-1.7. Slopes of mutation rate with D estimated for “dinB”, “dam” and “metI” were very similar and this model was therefore simplified by combining these slopes to estimate a single value. Further details are given in the ANOVA table and in the diagnostic plots.

ANOVA table and fitted values for Model 2.7 (Figure 2.8).

See Materials and Methods for more details.

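| Term | Degrees of freedom | Value | SE | F | P |
|---|---|---|---|---|---|
| Intercept (Δdam) | 1 | 4.2 | 0.12 | 8272 | 6.5×10^-55 |
| log2(D)centred | 1 | -1.1 | 0.067 | 232 | 4.2×10^-30 |
| system (ΔdinB) | 4 | -0.019 | 0.15 | | |
| (ΔmetI) | | -0.34 | 0.25 | | |
| (MMR) | | 1.55 | 0.14 | | |
| (Δnei) | | -0.30 | 0.23 | | |
| log2(D)centred:(MMR) | 2 | 0.93 | 0.082 | 97 | 3.5×10^-19 |
| log2(D)centred:(nei) | | -0.40 | 0.14 | | |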

Diagnostic plots for Model 2.7

Standardised residuals by fitted values and normal quantile-quantile plot of standardised residuals.

Model 2.8

The initial model fit to the data in Figure 2.9A (E. coli ΔmutT strains) was similar to Model 2.2 above with fixed effects of D (estimated with luminescence, see Model 2.4), genotype and their interaction and random effects on the intercept of plate nested within block. However, model simplification proceeded further to a minimal adequate model with no effect of D on mutation rate at all, the only significant fixed effect being a difference in intercept between the two strains considered. Variance changed as [fitted values of the mutation rate]^6.5 × [upper bound of the mutation rate]^-2.2. Further details are given in the ANOVA table and in the diagnostic plots.

ANOVA table and fitted values for Model 2.8 (Figure 2.9A).

See Materials and Methods for more details.

| Term | Degrees of freedom | Value | SE | F | P |
|---|---|---|---|---|---|
| Intercept (ΔmutT:JW0097-1) | 1 | 5.4 | 0.17 | 929 | 4.4×10^-31 |
| genotype (ΔmutT:JW0097-3) | 1 | -1.1 | 0.12 | 90 | 5.1×10^-12 |

Diagnostic plots for Model 2.8

Standardised residuals by fitted values and normal quantile-quantile plot of standardised residuals.

Model 2.9

The model shown in Figure 2.10A (E. coli ΔmutT strains) fits log2 mutation rate against mean-centred log2 D (estimated with CFU), genotype and their interaction and random effects on the intercept of plate nested within block. Model simplification proceeded to a minimal adequate model with no effect of D on mutation rate at all, the only significant fixed effect being a difference in intercept between the two strains considered. Variance changed with [fitted values of the mutation rate]^5.3 and [lower bound of the mutation rate]^-1.5. No further simplification of the fixed effects was possible without significantly reducing the goodness of fit of the model. Further details are given in the ANOVA table and in the diagnostic plots.

ANOVA table and fitted values for Model 2.9 (Figure 2.10A).

See Materials and Methods for more details.

| Term | Degrees of freedom | Value | SE | F | P |
|---|---|---|---|---|---|
| Intercept (ΔmutT:JW0097-1) | 1 | 5.5 | 0.16 | 1041 | 5.9×10^-32 |
| genotype (ΔmutT:JW0097-3) | 1 | -1.1 | 0.12 | 87 | 8.6×10^-12 |

Diagnostic plots for Model 2.9.

Standardised residuals by fitted values and normal quantile-quantile plot of standardised residuals.

Model 2.10

The model fit to the data in Figure 2.9B (E. coli ΔmutM and ΔmutY strains) is similar to Model 2.8 above with fixed effects of D (estimated with luminescence, see Model 2.4), genotype and their interaction and random effects on the intercept of plate nested within block. The two slopes were very similar and this model was therefore simplified by combining these slopes to estimate a single value. Variance changed with [standard deviation of the estimated number of mutational events × genotype]. Further details are given in the ANOVA table and in the diagnostic plots.

ANOVA table and fitted values for Model 2.10 (Figure 2.9B).

See Materials and Methods for more details.

| Term | Degrees of freedom | Value | SE | F | P |
|---|---|---|---|---|---|
| Intercept (ΔmutM) | 1 | 4.7 | 0.038 | 38081 | 2.7×10^-45 |
| log2(D)centred | 1 | -1.1 | 0.041 | 716 | 2.3×10^-24 |
| genotype (ΔmutY) | 1 | -0.40 | 0.050 | 62 | 4.8×10^-9 |

Diagnostic plots for Model 2.10

Standardised residuals by fitted values and normal quantile-quantile plot of standardised residuals.

Model 2.11

The model shown in Figure 2.10B (E. coli ΔmutM and ΔmutY strains) fits log2 mutation rate against mean-centred log2 D (estimated with CFU), genotype and their interaction and random effects on the intercept of plate nested within block. Variance changed with [fitted values of the mutation rate]^-7.0 and [upper bound of the mutation rate]^3.1. No further simplification of the fixed effects was possible. Further details are given in the ANOVA table and in the diagnostic plots.

ANOVA table and fitted values for Model 2.11 (Figure 2.10B).

See Materials and Methods for more details.

| Term | Degrees of freedom | Value | SE | F | P |
|---|---|---|---|---|---|
| Intercept (ΔmutM) | 1 | 4.1 | 0.13 | 906 | 1.8×10^-25 |
| log2(D)centred | 1 | -0.97 | 0.075 | 248 | 3.5×10^-14 |
| genotype (ΔmutY) | 1 | -0.20 | 0.070 | 4.4 | 0.0437 |
| log2(D)centred:genotype | 1 | 0.21 | 0.11 | 4.0 | 0.0528 |

Diagnostic plots for Model 2.11

Standardised residuals by fitted values and normal quantile-quantile plot of standardised residuals.

Model 2.12

The initial model fit to the data in Figure 2.9C (S. cerevisiae PCD1-Δ strains) fits log2 mutation rate against mean-centred log2 D (estimated with CFU), genotype and their interaction and random effects on the intercept of plate nested within block. However, model simplification removed all effects of genotype and D. Variance changed as [fitted values of the mutation rate]^11 and [upper bound of the mutation rate]^-1.7. Further details are given in the ANOVA table and in the diagnostic plots.

ANOVA table and fitted values for Model 2.12 (Figure 2.9C) and Model 2.13 (Figure 2.10C).

See Materials and Methods for more details.

| Term | Degrees of freedom | Value | SE | F | P |
|---|---|---|---|---|---|
| Intercept | 1 | 12 | 0.72 | 270 | 3.1×10^-16 |

Diagnostic plots for Model 2.12 and Model 2.13.

Standardised residuals by fitted values and normal quantile-quantile plot of standardised residuals.

Model 2.13

The initial model fit to the data in Figure 2.10C (S. cerevisiae PCD1-Δ strains) fits log2 mutation rate against mean-centred log2 D (estimated from direct cell counts), genotype and their interaction and random effects on the intercept of plate nested within block. However, this simplified to a model identical to Model 2.12. The ANOVA table and diagnostic plots are the same as for Model 2.12.

Model 2.14

The model shown in Figure 2.9D (S. cerevisiae MLH1-Δ strain) fits log2 mutation rate against mean-centred log2 D (estimated with CFU) with random effects on the intercept of plate nested within block. Variance changed as [fitted values of the mutation rate]^4.0 and [lower bound of the mutation rate]^-1.7. No further simplification of the fixed effects was possible without significantly reducing the goodness of fit of the model. Further details are given in the ANOVA table and in the diagnostic plots.

ANOVA table and fitted values for Model 2.14 (Figure 2.9D).

See Materials and Methods for more details.

| Term | Degrees of freedom | Value | SE | F | P |
|---|---|---|---|---|---|
| Intercept | 1 | 3.6 | 0.82 | 18 | 3.1×10^-4 |
| log2(D)centred | 1 | -1.2 | 0.10 | 146 | 2.2×10^-10 |

Diagnostic plots for Model 2.14

Standardised residuals by fitted values and normal quantile-quantile plot of standardised residuals.

Model 2.15

The model shown in Figure 2.10D (S. cerevisiae MLH1-Δ strain) fits log2 mutation rate against mean-centred log2 D (estimated from direct cell counts) with random effects on the intercept of plate nested within block. Variance changed as [estimated standard deviation of Nt × fitted values of the mutation rate]^-0.73. No simplification of the fixed effects was possible without significantly reducing the goodness of fit of the model. Further details are given in the ANOVA table and in the diagnostic plots.

ANOVA table and fitted values for Model 2.15 (Figure 2.10D).

See Materials and Methods for more details.

| Term | Degrees of freedom | Value | SE | F | P |
|---|---|---|---|---|---|
| Intercept | 1 | 3.0 | 3.3 | 0 | 0.99 |
| log2(D)centred | 1 | -0.80 | 0.23 | 12 | 0.0060 |

Diagnostic plots for Model 2.15.

Standardised residuals by fitted values and normal quantile-quantile plot of standardised residuals.

Model 2.16

The model shown in Figure 2.11 (all strains) fits log2 mutation rate against log2 effective population size (Ne), treatment (genotype/marker combination) and their interaction and random effects on the intercept of plate nested within block. Variance changed with genotype and [D]^-0.18. Further details are given in the ANOVA table and in the diagnostic plots.

ANOVA table and fitted values for Model 2.16 (Figure 2.11).

See Materials and Methods for more details.

| Term | Degrees of freedom | Value | SE | F | P |
|---|---|---|---|---|---|
| Intercept (BY4742) | 1 | -3.8 | 2.6 | 2038 | 0.15 |
| log2(Ne) | 1 | 0.67 | 0.20 | 185 | 6.9×10^-4 |
| genotype_marker | 19 | | | 62 | |
| mutT2_nalidixic_acid30 | | 8.9 | 3.6 | | |
| nei_rifampicin50 | | 17 | 5.6 | | |
| metI_rifampicin50 | | 28 | 8.6 | | |
| MG1655_rifampicin50 | | 12 | 5.1 | | |
| MG1655_nalidixic_acid30 | | 12 | 9.5 | | |
| PAO1_rifampicin50 | | 3.0 | 6.0 | | |
| S288C_5-fluoroorotic_acid1000 | | -3.8 | 6.8 | | |
| BY4742_hygromycinB300 | | 15 | 4.2 | | |
| Sigma_1278b_hygromycinB300 | | 7.7 | 4.6 | | |
| mutH_nalidixic_acid30 | | 10 | 3.2 | | |
| mutL_nalidixic_acid30 | | 5.9 | 3.6 | | |
| dinB_rifampicin50 | | 9.8 | 5.1 | | |
| mutS_nalidixic_acid30 | | 7.2 | 5.0 | | |
| dam_rifampicin50 | | 21 | 8.9 | | |
| mutM_rifampicin50 | | 7.4 | 5.3 | | |
| mutY_rifampicin50 | | 5.0 | 5.1 | | |
| PCD1_by_hygromycinB300 | | 4.3 | 3.9 | | |
| PCD1_sigma_hygromycinB300 | | 1.7 | 5.2 | | |
| MLH1_sigma_5-fluoroorotic_acid1000 | | 7.9 | 4.3 | | |
| log2(Ne):genotype_marker | 19 | | | 2.7 | |
| log2(Ne):mutT2_nalidixic_acid30 | | -0.75 | 0.27 | | |
| log2(Ne):nei_rifampicin50 | | -1.4 | 0.43 | | |
| log2(Ne):metI_rifampicin50 | | -2.0 | 0.55 | | |
| log2(Ne):MG1655_rifampicin50 | | -0.98 | 0.33 | | |
| log2(Ne):MG1655_nalidixic_acid30 | | -1.1 | 0.51 | | |
| log2(Ne):PAO1_rifampicin50 | | -0.57 | 0.40 | | |
| log2(Ne):S288C_5-fluoroorotic_acid1000 | | 0.29 | 0.52 | | |
| log2(Ne):BY4742_hygromycinB300 | | -0.77 | 0.30 | | |
| log2(Ne):Sigma_1278b_hygromycinB300 | | -0.58 | 0.31 | | |
| log2(Ne):mutH_nalidixic_acid30 | | -0.70 | 0.24 | | |
| log2(Ne):mutL_nalidixic_acid30 | | -0.39 | 0.27 | | |
| log2(Ne):dinB_rifampicin50 | | -0.81 | 0.38 | | |
| log2(Ne):mutS_nalidixic_acid30 | | -0.51 | 0.39 | | |
| log2(Ne):dam_rifampicin50 | | -1.7 | 0.68 | | |
| log2(Ne):mutM_rifampicin50 | | -0.62 | 0.40 | | |
| log2(Ne):mutY_rifampicin50 | | -0.45 | 0.38 | | |
| log2(Ne):PCD1_by_hygromycinB300 | | 0.020 | 0.28 | | |
| log2(Ne):PCD1_sigma_hygromycinB300 | | 0.19 | 0.35 | | |
| log2(Ne):MLH1_sigma_5-fluoroorotic_acid1000 | | -0.74 | 0.31 | | |

Diagnostic plots for Model 2.16

Standardised residuals by fitted values and normal quantile-quantile plot of standardised residuals.

Model 2.17

The model shown in Figure 2.12 (all strains) fits log2([number of mutational events] / [culture volume × culture time]) against log2 D, treatment (genotype/marker combination) and their interaction and random effects on the intercept of plate nested within block. Variance changed with [standard deviation of the estimated number of mutational events × selective marker]. Further details are given in the ANOVA table and in the diagnostic plots.

ANOVA table and fitted values for Model 2.17 (Figure 2.12).

See Materials and Methods for more details.

Degrees of Value SE F P freedom Intercept (BY4742) 1 -12 1.7 732 3.0×10-12 log2(Nt) 1 0.53 0.072 42 5.3×10-13 genotype_marker 14 3.2 19 54 dam_rifampicin50 dinB_rifampicin50 15 3.0 metI_rifampicin50 15 4.6 MG1655_nalidixic_acid30 -3.6 4.4 MG1655_rifampicin50 3.7 2.7 MLH1_sigma_5-fluoroorotic_acid1000 19 7.1 mutH_nalidixic_acid30 -1.8 2.6 mutL_nalidixic_acid30 -5.0 2.6 mutM_rifampicin50 17 3.5 mutS_nalidixic_acid30 -3.4 2.8 mutT1_nalidixic_acid30 -18 2.6 mutT2_nalidixic_acid30 -9.6 4.0 mutY_rifampicin50 12 3.7 nei_rifampicin50 17 3.6 PAO1_rifampicin50 -15 3.7 PCD1_by_hygromycinB300 -7.2 3.2 PCD1_sigma_hygromycinB300 -5.8 3.2 S288C_5-fluoroorotic_acid1000 0.020 3.6 Sigma_1278b_hygromycinB300 0.89 3.5 log2(Nt):genotype_marker -0.67 0.13 19 17 log2(Nt):dam_rifampicin50 log2(Nt):dinB_rifampicin50 -0.75 0.12 log2(Nt):metI_rifampicin50 -0.73 0.17 log2(Nt):MG1655_nalidixic_acid30 -0.26 0.15 log2(Nt):MG1655_rifampicin50 -0.33 0.11 log2(Nt):MLH1_sigma_5-fluoroorotic_acid1000 -0.90 0.26 log2(Nt):mutH_nalidixic_acid30 -0.02 0.10 log2(Nt):mutL_nalidixic_acid30 0.087 0.11 log2(Nt):mutM_rifampicin50 -0.80 0.14 log2(Nt):mutS_nalidixic_acid30 0.011 0.11 log2(Nt):mutT1_nalidixic_acid30 0.56 0.10 log2(Nt):mutT2_nalidixic_acid30 0.19 0.16 log2(Nt):mutY_rifampicin50 -0.61 0.15 log2(Nt):nei_rifampicin50 -0.82 0.13 log2(Nt):PAO1_rifampicin50 0.26 0.14 log2(Nt):PCD1_by_hygromycinB300 0.41 0.14 log2(Nt):PCD1_sigma_hygromycinB300 0.36 0.15 log2(Nt):S288C_5-fluoroorotic_acid1000 -0.19 0.15 log2(Nt):Sigma_1278b_hygromycinB300 -0.20 0.14

Diagnostic plots for Model 2.17.

Standardised residuals by fitted values and normal quantile-quantile plot of standardised residuals.

Model 2.18

The initial model shown in Figure 2.13 (Vesicular stomatitis virus) fits log2 mutation rate against mean-centred log2 D (estimated with plaque forming units), the effect of the host cell (host cell/genotype combination) and the combined fixed effect of oxygen and temperature. No random effects were included. After removing the interaction, no further simplification of the fixed effects was possible without significantly reducing the goodness of fit of the model. Further details are given in the ANOVA table and in the diagnostic plots.

ANOVA table and fitted values for Model 2.18 (Figure 2.13).

See Materials and Methods for more details.

Term                  Degrees of freedom  Value    SE      F       P
Intercept (BHK)       1                   15       0.26    32471   1.9×10^-26
log2(D)centred        1                   -0.48    0.088   24      1.7×10^-5
host_genotype         7                                    26      9.1×10^-8
  (C6/36)                                 -2.6     0.34
  (CT26)                                  -0.38    0.31
  (MEF)                                   -0.18    0.31
  (MEF_p53)                               -1.0     0.31
  (Neuro)                                 -0.32    0.32
  (S2)                                    -3.5     0.34
  (sf21)                                  -2.9     0.34
oxygen_temperature    2                                    0.69
  (normal_28)                             0.16     0.37
  (normal_37)                             -0.87    0.31

Diagnostic plots for Model 2.18.

Standardised residuals by fitted values and normal quantile-quantile plot of standardised residuals.

Acknowledgments for Chapter

We thank Johanna M. Schwingel for P. aeruginosa PAO1, Karina B. Xavier for E. coli MG1655, Richard E. Lenski for E. coli B and Daniela Delneri for all yeast strains. We thank César Aguilar, Sutirth Dey, Jaroslaw Dziadek, Bhavna Gordhan, Alina Górna, Dennis Grogan, Gregory Lang, Sasha Levy, Christina Moon, Shinichi Oide, Marianoel Pereira-Gómez, Colin Russell, Rafael Sanjuán, Gavin James Sherlock, Clara Torres-Barceló, Arjan de Visser, Sebastian Wielgoss and Clifford Zeyl for providing data. We thank Mark Foster and John Parfitt for technical assistance. Genome sequencing was provided by MicrobesNG (http://www.microbesng.uk), which is supported by the BBSRC (grant number BB/L024209/1). We thank David Robertson, James McInerney, John Fitzpatrick and Chris Thompson for critical readings of the manuscript.

Chapter 3: Density Associated Mutation Rate Plasticity in two disparate species of Archaea

3.1 Abstract

The spontaneous mutation rate is a crucial variable that can determine an organism’s evolvability, but it is itself dependent upon the environment. Recently, mutation rate was shown to vary with the social environment in both bacteria and yeast. Specifically, mutation rate is inversely associated with the final population density reached by the culture: denser populations exhibit a lower mutation rate than less dense populations. Here I test for this Density Associated Mutation Rate Plasticity (DAMP) in the final domain of life yet to be tested empirically, the Archaea. DAMP was identified in two distantly related species of Archaea (Sulfolobus acidocaldarius and Haloferax volcanii), indicating that DAMP has evolved at a broad evolutionary scale and is present in all domains of life. Mirroring the results previously found in Escherichia coli, the degree of DAMP exhibited by S. acidocaldarius does not differ between the genes tested, suggesting that DAMP acts to the same degree across the genome. It was shown previously that DAMP is modulated via the same mutation avoidance mechanism in both E. coli and Saccharomyces cerevisiae: the cleansing of the highly mutagenic nucleotide 8-oxo-dGTP from the intracellular nucleotide pool (by mutT and PCD1 respectively). A BLAST analysis of these two Archaea revealed protein sequences similar to E. coli MutT in S. acidocaldarius and to S. cerevisiae PCD1 in H. volcanii, indicating the possibility that DAMP is modulated via the same mechanism in all domains of life. The difference in protein similarity between these two Archaea highlights the intermediary position that the Archaea hold between Bacteria and Eukaryotes. Together, these results substantiate the previous assertion of DAMP’s presence in this domain and imply a highly conserved mechanism for DAMP’s control throughout all domains of life.

3.2 Introduction

Spontaneous mutations generate the genetic variation present within a species, upon which selection and genetic drift act (Lynch et al. 2016). In this way, mutations are viewed as the fuel of evolution, and the rate at which they occur can dictate an organism’s evolvability (Galhardo et al. 2007). The rate at which these mutations arise, however, is not constant. Instead, the mutation rate is a variable trait that depends upon the environment (Elena and de Visser 2003; Saint-Ruf and Matic 2006). In diverse microbes it is now apparent that mutation rate, alongside a multitude of other behaviours, is related to the population density of the organism (Krašovec et al. 2014a, 2017), modulated via intercellular communication (Krašovec et al. 2014b). Specifically, mutation rates are lower in populations with a high final population density than in lower density populations. Such Density Associated Mutation Rate Plasticity (DAMP) has been shown to be present in both bacteria (Escherichia coli) and eukaryotes (Saccharomyces cerevisiae) (Krašovec et al. 2017) and involves the same mutation avoidance mechanism, cleansing the intracellular nucleotide pool of the highly mutagenic nucleotide 8-oxo-dGTP (Michaels and Miller 1992). Whilst this work is suggestive of an ancient origin of DAMP, owing to its broad phylogenetic distribution and shared mechanism, it lacks experimental evidence from the final domain of life, the Archaea.

Archaea are commonly found in extreme environments, such as those with extremes of temperature, pH and salinity. Within these environments there is great potential for DNA to become damaged, and as such the Archaea have evolved a multitude of DNA repair mechanisms, some of which are conserved across domains ( and Allers 2018). Additionally, the eukaryotes are proposed to have evolved from within the Archaeal domain (Eme et al. 2017), and Archaea possess mechanisms of DNA replication that are similar in mode to those of eukaryotes (Barry and Bell 2006). In this way the Archaea present an intermediary between the two domains previously tested.

The investigation of DAMP within the Archaea could therefore help inform us about the evolution of DAMP across the tree of life, and potentially even its evolutionary origins.

DAMP has been proposed to occur within the Archaea through the analysis of mutation rates published over the last 75 years (Krašovec et al. 2017). This suggestion is, however, problematic and potentially misleading for two reasons. First, two of the three archaeal species in that analysis contributed only one mutation rate estimate each. Second, each species was included only as a random effect in the statistical model, so the slope given by the model for each species (the Best Linear Unbiased Predictor, BLUP) relates strongly to the average slope for the whole dataset. Asserting a species-specific slope from these data therefore overreaches what the available data can support. Here I resolve these problems by empirically exploring DAMP’s presence within the Archaea via fluctuation tests.

The evolution of DAMP in the Archaea is tested by estimating mutation rates via fluctuation tests in two species from two highly diverged phyla within the domain: Sulfolobus acidocaldarius, a thermoacidophilic archaeon from the Crenarchaeota (Brock et al. 1972); and Haloferax volcanii, a halophilic mesophilic archaeon from the Euryarchaeota (Mullakhanbhai and Larsen 1975). Additionally, mutation rates in S. acidocaldarius were estimated at two genetic markers to investigate DAMP’s variation across the genome. Finally, the use of genetic knockout mutants of E. coli and S. cerevisiae previously allowed for the exploration of the mechanisms controlling DAMP (Krašovec et al. 2014a, 2017); however, such mutants are not readily available for these archaeal species. Nevertheless, it is still possible to explore the mechanistic control of DAMP in these two species. Both S. acidocaldarius (Chen et al. 2005) and H. volcanii (Hartman et al. 2010) have had their whole genomes sequenced, allowing them to be searched for matches to genes and proteins already found to modulate DAMP, thus making up for the lack of gene knockout libraries in these two Archaea. The nucleotide and amino acid sequences for all mechanisms previously found to modulate DAMP in either E. coli or S. cerevisiae were tested against both S. acidocaldarius and H. volcanii DNA and protein sequences to determine whether such mechanisms are also at work in this domain.

3.3 Materials and Methods

3.3.1 Strains used in this study Sulfolobus acidocaldarius belongs to the phylum Crenarchaeota and is a hyperthermophilic, acidophilic archaeon that grows optimally at ~80°C and pH 3 (Quehenberger et al. 2017). S. acidocaldarius strain DSM 639 was kindly provided by Prof. Sonja Albers of the University of Freiburg, Germany. Haloferax volcanii belongs to the Euryarchaeota and is a halophilic, mesophilic archaeon that grows in hypersaline environments (Soppa 2006). H. volcanii grows optimally at 42°C in media containing 1.5-2M NaCl. H. volcanii strain DS2 was kindly provided by Prof. Thorsten Allers of the University of Nottingham, UK.

3.3.2 Media MilliQ water was used for all media. For Sulfolobus acidocaldarius, Brock basal medium (BBM) was prepared according to Brock et al. (1972). 0.5M CaCl2, 1M MgCl2, Tryptone (0.1% wt/v) and various concentrations of D-xylose were sterilised before being added to BBM after autoclaving. 5-Fluoroorotic acid (5-FOA; 100µg/ml) and Novobiocin salt (4µg/ml) were freshly prepared and added to BBM for selective media. All media were solidified when needed using Gelrite (12g/l; Sigma-Aldrich). Uracil (20µg/ml) was added to all solidified media. BBM without a carbon source was used for cell dilutions.

Haloferax volcanii Casamino acid (Hv-Ca) media was prepared according to the instructions of Thorsten Allers. Sterile Hv-Ca salts and 0.5M KPO4 buffer were added after autoclaving. 5-FOA (100µg/ml) was prepared fresh and added to Hv-Ca media after autoclaving for use as selective media. Uracil (10µg/ml) was added to the selective media alongside 5-FOA. Cell dilutions were carried out in the salt base solution of Hv-Ca media, that is, media without any carbon source. Media were solidified as needed by adding 15g/l of agar (Difco).

Detailed instructions for preparing BBM for S. acidocaldarius and Hv-Ca media for H. volcanii are given in the Appendix at the end of this chapter.

3.3.3 Fluctuation tests with Archaea Fluctuation tests with both Archaea species followed the same basic principle as used previously with bacteria and yeast (Krašovec et al. 2017), but with several alterations.

For S. acidocaldarius, glycerol stock was inoculated into BBM with 0.1% (wt/v) Tryptone and 0.2% (wt/v) D-xylose as carbon sources and grown at 75ºC with moderate shaking. After ~24 hours, S. acidocaldarius was diluted into fresh BBM with 0.1% Tryptone and various D-xylose concentrations, giving an initial population size, N0, of approximately 1.3 x 10^6 (range 3.5 x 10^3 – 2.2 x 10^6). Independent cultures (1.75ml) were pipetted into the inner 60 wells of a 96 deep-well plate, with the outer wells being filled with 2ml of sterile Milli-Q water. As previously done for E. coli (Krašovec et al. 2017), final population density was estimated both by direct counts of colony forming units (CFU) and by net luminescence from an ATP-based assay (LUM). For CFU counts, an appropriate dilution was plated onto non-selective solid BBM and incubated at 75°C. The LUM measurement was made on a Promega GloMax luminometer using the Promega Bac-Titer Glo kit as detailed in the manufacturer’s instructions. Luminescence was first measured 0.5 seconds after injection of the Bac-Titer Glo reagent and then measured again for the same culture 330 seconds after injection. Net luminescence was calculated as the luminescence at 330 s minus the luminescence at 0.5 s. Evaporation was monitored daily, accounted for in the calculation of final population size (Nt) and also included in statistical models as a potential covariate. The sterile water in the outer wells of the 96-well plate was topped up daily to make up for any lost to evaporation (no new liquid was added to the inner 60 wells). We obtained the observed number of mutants resistant to 5-FOA (100µg/ml) and Novobiocin (4µg/ml), r, by plating the entirety of the remaining cultures onto solid selective BBM medium that allows spontaneous mutants to form colonies. Selective 5-FOA and Novobiocin plates were incubated at C for 8 days.

For H. volcanii, glycerol stock was inoculated into 0.5% (wt/v) Hv-Ca media and grown at 42°C with shaking at 200rpm. After 2 days, H. volcanii was diluted and inoculated into fresh Hv-Ca media of various concentrations (0.0625 – 0.5% wt/v), giving an N0 of 1200 (range 375 - 2400). Independent cultures (1-1.5ml) were grown in 96 deep-well plates at 42°C for three days. Population density was estimated only by CFU as there was no clear association between LUM and CFU, potentially due to the high salt concentrations in the media. CFU was determined by plating appropriate dilutions of culture onto non-selective solid media. Evaporation was accounted for as detailed above. We obtained the observed number of mutants resistant to 5-FOA (100µg/ml), r, by plating the entirety of the remaining cultures onto solid selective Hv-Ca medium (also containing uracil at 10µg/ml) that allows spontaneous mutants to form colonies. Selective plates were incubated at 42°C for 7 days.

3.3.3 Estimation of mutation rates To calculate the number of mutational events, m, from the observed number of mutants I used the Ma-Sandri-Sarkar maximum-likelihood method implemented by the R package flan v0.6 (Mazoyer et al. 2017). The mutation rate per cell per generation is then calculated as m divided by the final population size, Nt (as determined by CFU).
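The estimation step can be illustrated with a minimal Python sketch. This is not the flan implementation used for the analyses: it assumes the basic Luria-Delbrück model (no fitness cost of the mutation, no plating-efficiency correction), computes the mutant-count probabilities with the Ma-Sandri-Sarkar recursion and locates the maximum-likelihood m with a simple golden-section search. The function names, search bounds and example counts are hypothetical and chosen only for illustration.

```python
import math

def ld_pmf(m, r_max):
    """Probabilities p_0..p_r_max of observing r mutants, given m mutational
    events, from the Ma-Sandri-Sarkar recursion:
    p_0 = exp(-m);  p_r = (m / r) * sum_{i<r} p_i / (r - i + 1)."""
    p = [math.exp(-m)]
    for r in range(1, r_max + 1):
        s = sum(p[i] / (r - i + 1) for i in range(r))
        p.append(m / r * s)
    return p

def log_likelihood(m, counts):
    """Log-likelihood of the observed mutant counts for a given m."""
    p = ld_pmf(m, max(counts))
    return sum(math.log(p[r]) for r in counts)

def estimate_m(counts, lo=1e-3, hi=100.0, iters=100):
    """Maximise the log-likelihood over log(m) by golden-section search.
    The bounds lo and hi are assumptions suited to small example data."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log(lo), math.log(hi)

    def f(x):
        return log_likelihood(math.exp(x), counts)

    for _ in range(iters):
        c, d = b - g * (b - a), a + g * (b - a)
        if f(c) > f(d):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return math.exp((a + b) / 2.0)

def mutation_rate(m, n_t):
    """Mutation rate per cell per generation: m divided by final population size."""
    return m / n_t

# Hypothetical mutant counts from ten parallel cultures, for illustration only.
counts = [0, 1, 0, 3, 2, 0, 12, 1, 0, 5]
m_hat = estimate_m(counts)
print(m_hat, mutation_rate(m_hat, n_t=2.0e8))
```

In practice a dedicated package such as flan is preferable, since it also handles the model extensions and confidence intervals that this sketch leaves out.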

3.3.4 BLAST Analysis of Archaeal genetic and protein sequences Previous work on DAMP uncovered the same molecular mechanism in both pro- and eukaryotes (Krašovec et al. 2017), specifically the cleansing of intracellular nucleotide pools of the mutagenic oxidised nucleotide 8-oxo-dGTP (Michaels and Miller 1992). Additionally, DAMP in E. coli is controlled via intercellular communication dependent upon the highly conserved gene luxS and its role in the activated methyl cycle (Krašovec et al. 2014a). Both S. acidocaldarius (Chen et al. 2005) and H. volcanii (Hartman et al. 2010) have had their whole genomes sequenced, allowing them to be searched for matches to both nucleotide and protein sequences of genes already found to modulate DAMP, thus making up for the lack of gene knockout libraries in these two species. Due to the intermediary position that the Archaea hold between bacteria and eukaryotes, sequences from both E. coli and S. cerevisiae were tested against the published sequences for S. acidocaldarius DSM 639 (Genbank accession No: CP000077) and H. volcanii DS2 (Genbank accession Nos: CP001953-57). E. coli K-12 MG1655 (Genbank accession No: U00096) mutT and luxS nucleotide and amino acid sequences and S. cerevisiae S288c (Genbank accession Nos: NC_001133-48 and NC_001224) PCD1 nucleotide and amino acid sequences were downloaded from KEGG (Kyoto Encyclopaedia of Genes and Genomes: https://www.genome.jp/kegg/). BLAST analysis was also carried out on KEGG, using BLASTn for nucleotide similarity and BLASTp for protein similarity. Both utilised BLAST v2.2.9+ (Altschul et al. 1997; Schaffer et al. 2001). The proportion of identical or similar nucleotides or amino acids was extracted from the output alongside the E-value, which gives a level of significance for the similarity of the sequences. Sequences were determined to be similar to the query sequence when E-values were below 0.001. See the Appendix for detailed alignments of the sequences determined to be similar to the query sequence.

3.3.5 Statistical analysis All statistical analysis was executed in R v3.5.0 using nlme v3.1-137 package for linear mixed effects modelling (Pinheiro et al. 2018). This enabled the inclusion within the same model of experimental factors (fixed effects), blocking effects (random effects) and factors affecting variance (heteroscedasticity). Two factors were incorporated in each model to account for such heteroscedasticity, with models being compared by AIC to determine the best model. See the figure legends for details of factors accounting for heteroscedasticity in each specific model.
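The AIC comparison described above can be sketched as follows. This is only an illustration in Python of the criterion itself (AIC = 2k - 2 ln L, lower is better), not the nlme code used for the analyses. The candidate model labels, log-likelihoods and parameter counts are hypothetical placeholders, not values from the fitted models.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower values are preferred."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical candidates: label -> (maximised log-likelihood, number of parameters).
candidates = {
    "no variance function": (-61.4, 5),
    "variance as power of covariate A": (-55.0, 6),
    "variance as power of covariate B": (-57.9, 6),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # model with the lowest AIC
print(best, scores[best])
```

The extra variance parameter is only retained when the gain in log-likelihood outweighs the penalty of 2 per parameter.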

3.4 Results

3.4.1 DAMP in the Archaea

In order to test the suggestion of DAMP’s presence in the Archaea (Krašovec et al. 2017), mutation rates were estimated in two disparate species from this domain: Sulfolobus acidocaldarius, at two loci, and Haloferax volcanii, at one locus. In both of these species there is a significant inverse association between population density and mutation rate (Sulfolobus: N = 45, LR7,8 = 29, P = 7.22 x 10^-8; Haloferax: N = 12, LR1 = 24.01, P = 9.57 x 10^-7; Figure 3.1), with S. acidocaldarius having a slope of -0.92 (±0.11 SE; Figure 3.1A), whilst H. volcanii shows a lower degree of DAMP with a slope of -0.80 (±0.07 SE; Figure 3.1B). The degree of DAMP, as in previous work on E. coli, does not differ significantly between the loci tested in S. acidocaldarius (N = 45, LR8,9 = 1.80, P = 0.18), although the average mutation rate does differ significantly between these two loci (N = 45, LR7,8 = 42.35, P = 7.63 x 10^-11; Figure 3.1A). To account for any potential correlation of errors, population density was estimated via two independent methods for S. acidocaldarius, as done before for E. coli (Krašovec et al. 2017; see Materials and Methods). As found in E. coli, there is a significant degree of DAMP whether population density is estimated by direct counts of CFU (Figure 3.1A) or by an ATP-based assay (N = 45, LR6,7 = 6.67, P = 9.81 x 10^-3; slope = -0.62 ± 0.20 SE; Figure 3.2). Similarly, with the ATP-based estimate there is a significant difference in average mutation rate between the selective agents tested (N = 45, LR6,7 = 35.04, P = 3.24 x 10^-9; Figure 3.2), again without any interaction between selective agent and population density (N = 45, LR7,8 = 0.03, P = 0.8524; Figure 3.2). Estimating population density via the ATP-based assay therefore produces a shallower slope than estimating it via direct colony counts. Using a calibrated, independent estimate of population density removes the possibility that the inverse association arises from correlated errors between the axes, as can happen when direct cell counts are used to estimate both mutation rate and population density. The flattening of the slope shows that correlation of errors is a potential problem, but DAMP remains significant once it is removed and is therefore present in S. acidocaldarius. These results from Archaea support previous work on DAMP by showing clear evidence for the evolution of DAMP at this broad evolutionary scale and extending its prevalence to the final domain of life.
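The P values quoted above can be checked approximately from the likelihood ratio (LR) statistics. The Python sketch below assumes that each comparison is between nested models differing by one parameter, so that the LR statistic is referred to a chi-square distribution with one degree of freedom. That assumption is inferred from the subscripts (e.g. 7 versus 8 parameters) rather than stated in the text, and the function name is hypothetical.

```python
import math

def lr_p_value_df1(lr):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom,
    using the identity P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(lr / 2.0))

# LR statistics and P values as reported in the text above.
reported = {
    "S. acidocaldarius, density (CFU)": (29.0, 7.22e-8),
    "H. volcanii, density (CFU)": (24.01, 9.57e-7),
    "S. acidocaldarius, locus": (42.35, 7.63e-11),
    "S. acidocaldarius, density (ATP)": (6.67, 9.81e-3),
}

for label, (lr, p_reported) in reported.items():
    print(label, lr_p_value_df1(lr), p_reported)
```

Small discrepancies are expected where the LR statistic is reported to only two significant figures.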

[Figure 3.1, panels A and B. y-axis: Mutation rate (10^-9 generation^-1); x-axis: Population density (ml^-1).]

Figure 3.1 Density Associated Mutation Rate Plasticity in Archaea. A Mutation rates to 5-FOA (diamonds) and Novobiocin (circles) in Sulfolobus acidocaldarius DSM 639. Lines for the two markers are plotted separately due to the higher overall mutation rate of 5-FOA resistance than Novobiocin resistance. The standard deviation of the estimated number of mutational events crossed with the log(net luminescence per cell) was included in the model as a variance power function to account for heteroscedasticity. B Mutation rates to 5-FOA in Haloferax volcanii DS2. The inverse coefficient of variation crossed with the fitted values of the mutation rate was included in the model as a variance power function to account for heteroscedasticity. Population density for both species was estimated from direct CFU counts. Note that both axes are logarithmic.

Figure 3.2 Density Associated Mutation Rate Plasticity in Sulfolobus acidocaldarius with population density estimated via an ATP-based assay. Mutation rates to 5-FOA (diamonds) and Novobiocin (circles) in Sulfolobus acidocaldarius DSM 639. Lines for the two markers are plotted separately due to the higher overall mutation rate of 5-FOA resistance than Novobiocin resistance. The standard deviation of the estimated number of mutational events crossed with the log(gross luminescence) was included in the model as a variance power function to account for heteroscedasticity. Population density for S. acidocaldarius was estimated via an ATP-based assay. Note that both axes are logarithmic.

3.4.2 BLAST Analysis of Archaeal nucleotide and amino acid sequences

Nucleotide and protein sequences for previously identified mechanisms modulating DAMP (Krašovec et al. 2014a, 2017) were compared to the published sequences for S. acidocaldarius DSM 639 and H. volcanii DS2. This identified similar protein sequences, but no similar nucleotide sequences, between the previously identified mechanisms that modulate DAMP and the sequences of these two archaeal species.

S. acidocaldarius possesses several protein sequences that bear a strong similarity to the MutT protein of E. coli. These genes are designated Saci 0153 (E-value = 2e-7), Saci 0121 (E-value = 6e-7) and Saci 0013 (E-value = 4e-4). A fourth gene fell just outside the pre-selected significance threshold (Saci 0550: E-value = 0.005). All four of these genes encode members of the NUDIX family of proteins, as shown by the presence of the NUDIX motif in all these proteins. Indeed, the similarity of Saci 0013 to MutT appears to derive only from the presence of this NUDIX motif (see Appendix for alignment), and not from the wider sequence of the protein. No S. acidocaldarius proteins showed similarity to the S. cerevisiae PCD1 protein sequence. H. volcanii, by contrast, shows similarity to the protein involved in modulating DAMP in S. cerevisiae, the MutT homolog PCD1 (Hv 0762: E-value = 4e-4). In addition, several other H. volcanii proteins show similarity to E. coli MutT, although all of these fall at or above the significance threshold (E-values ranging from 0.001 to 0.013; Hv 0030: E-value = 0.001). As for S. acidocaldarius, all these sequences are members of the NUDIX hydrolase family, as shown by the presence of the NUDIX motif. DAMP is also modulated in E. coli through the action of luxS in the activated methyl cycle (Krašovec et al. 2014a). Neither the nucleotide nor the amino acid sequence of E. coli luxS shows any similarity to sequences in either S. acidocaldarius or H. volcanii. Together, these results suggest that DAMP in these Archaea may be primarily modulated via the same mutation avoidance mechanism as previously identified in both bacteria and yeast.
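The similarity criterion from the Materials and Methods (E-value below 0.001) can be applied to the hits named above with a short Python sketch. The E-values are those reported in the text; the data structure and function name are hypothetical and serve only to make the strict threshold explicit.

```python
E_VALUE_THRESHOLD = 0.001  # a hit counts as similar only if its E-value is below this

# (archaeal sequence, query protein, E-value) as reported in the text above.
hits = [
    ("Saci 0153", "E. coli MutT", 2e-7),
    ("Saci 0121", "E. coli MutT", 6e-7),
    ("Saci 0013", "E. coli MutT", 4e-4),
    ("Saci 0550", "E. coli MutT", 0.005),
    ("Hv 0762", "S. cerevisiae PCD1", 4e-4),
    ("Hv 0030", "E. coli MutT", 0.001),
]

def is_similar(e_value, threshold=E_VALUE_THRESHOLD):
    """Strict comparison: an E-value equal to the threshold does not pass."""
    return e_value < threshold

similar = [name for name, _query, e_value in hits if is_similar(e_value)]
print(similar)
```

Because the comparison is strict, Hv 0030 (E-value = 0.001) sits exactly at the threshold and is not counted as similar, matching the description of it falling "at or above" the level of significance.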

3.5 Discussion

DAMP has evolved across diverse organisms from different domains of life (Krašovec et al. 2017). Whilst DAMP’s prevalence has been empirically supported in both bacteria and eukaryotes, its presence in the final domain of life, the Archaea, although proposed from analysis of the published literature (Krašovec et al. 2017), lacked empirical evidence. Here this has been resolved. Mutation rate is inversely related to population density in two evolutionarily distant species of Archaea: Sulfolobus acidocaldarius and Haloferax volcanii (Figures 3.1 and 3.2). This relationship, as found previously in E. coli, does not differ between the loci tested (S. acidocaldarius; Figure 3.1), consistent with DAMP acting across the genome. Furthermore, DAMP may share the same modulation mechanism as previously found in both bacteria and yeast (Krašovec et al. 2017), though this suggestion comes from analysis of the amino acid sequences of these two archaeal species, and not from empirical experiments.

Rather intriguingly, the DAMP-modulating protein to which the archaeal sequences were similar differed between S. acidocaldarius and H. volcanii. S. acidocaldarius showed greatest similarity to E. coli MutT, whereas H. volcanii showed greatest similarity to S. cerevisiae PCD1. Both MutT and PCD1 carry out the same function of cleansing the intracellular nucleotide pool of the mutagenic nucleotide 8-oxo-dGTP (Michaels and Miller 1992; Nunoshiba et al. 2004). However, the fact that S. acidocaldarius shows bacterial similarity whilst H. volcanii shows eukaryotic similarity is particularly interesting. The eukaryotes are proposed to have evolved from within the Archaea, specifically from within the TACK superphylum, consisting of Thaumarchaeota, Aigarchaeota, Crenarchaeota (including S. acidocaldarius) and Korarchaeota (Guy and Ettema 2011). It might therefore have been expected a priori that the protein sequence of S. acidocaldarius would show the greater similarity to PCD1. However, eukaryotes tend to have inherited genes involved in informational processes, such as DNA repair, from the Archaea (Leipe et al. 1999), specifically the Euryarchaeota. Such information-processing genes may have been inherited from an archaeal lineage that had itself acquired them by horizontal gene transfer from the Euryarchaeota, possibly explaining the similarity to PCD1 in H. volcanii (Yutin et al. 2008).

Additionally, there is no similarity in nucleotide or amino acid sequence with the other mechanism shown to modulate DAMP, that of luxS and its role in the activated methyl cycle (Krašovec et al. 2014a). This is not necessarily surprising, as luxS has not been described in the archaeal domain and is also missing in eukaryotes (Doherty et al. 2006). Whereas the action of luxS is part of a two-step process in E. coli (alongside the pfs gene), this is a single step in eukaryotes, archaea and some other bacteria (Winzer et al. 2003). In E. coli the action of the pfs gene (also called mtn/mtnN/yadA) precedes that of luxS, and indeed pfs expression, rather than luxS expression, is the limiting factor in Autoinducer-2 production (Beeston and Surette 2002). For these reasons the lack of luxS similarity is not surprising, and the modulation of DAMP via the activated methyl cycle needs exploring empirically not only in the Archaea but also in eukaryotes.

Population density was estimated here, as before (Krašovec et al. 2017), via a method independent of the final population size used in estimating the mutation rate, in order to remove the potential for DAMP to arise from a correlation of errors. In the previous study such a method resulted in only a small change in DAMP slope for both E. coli and S. cerevisiae (Krašovec et al. 2017). However, when using such independent methods here the reduction in DAMP slope is greater, with the S. acidocaldarius slope being reduced by approximately one third (from -0.92 to -0.62; Figures 3.1A and 3.2). There is therefore a clear potential for correlation of errors to occur, although the use of independent estimates corrects for this. Independent methods for estimating population density could not be used for H. volcanii, because there was no clear relationship between the ATP-based assay and population density from direct CFU counts from which to produce a calibration curve. This is possibly due to the high salt concentrations needed by this species in its media, which can inhibit the activity of the firefly luciferase used in this ATP-based assay (Denburg and Mcelroy 1970). This inhibition is thought to occur due to a localised conformational change in the enzyme caused by the binding of anions, with chloride anions showing the greatest inhibition. Even if the correlation of errors in H. volcanii matched that in S. acidocaldarius and reduced the slope by one third, this would still result in a strong DAMP slope (-0.8 to -0.53), implying that DAMP’s presence in H. volcanii is robust to the correlation of errors. Nevertheless, the possibility of such a correlation means that calibrated, independent methods should be used to ensure that no spurious results are given.
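The slope arithmetic in the paragraph above can be made explicit with a short Python sketch. It uses only the slopes reported in this chapter. The variable names are hypothetical, and the final lines add one interpretive assumption: that a slope on log-log axes corresponds to a power-law relationship between mutation rate and density.

```python
# Slopes reported in this chapter (log mutation rate against log population density).
slope_saci_cfu = -0.92   # S. acidocaldarius, density from direct CFU counts
slope_saci_atp = -0.62   # S. acidocaldarius, density from the ATP-based assay
slope_hvol_cfu = -0.80   # H. volcanii, density from direct CFU counts

# Relative reduction in slope when the independent density estimate is used.
relative_reduction = 1.0 - slope_saci_atp / slope_saci_cfu

# Projected H. volcanii slope if it were reduced by the same one third.
projected_hvol = slope_hvol_cfu * (1.0 - 1.0 / 3.0)

# Assumed power-law reading of a log-log slope: fold change in mutation rate
# for each doubling of population density.
fold_change_per_doubling = 2.0 ** projected_hvol

print(round(relative_reduction, 3), round(projected_hvol, 2), fold_change_per_doubling)
```

Under this reading, any negative projected slope still implies a lower mutation rate at higher density, which is the sense in which the H. volcanii result is described as robust.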

In conclusion, DAMP is evident in two species from divergent phyla of Archaea (Figure 3.1). Additionally, comparison of protein sequences identified similarity to the mechanisms previously found to modulate DAMP, specifically the cleansing of the intracellular nucleotide pool of highly mutagenic nucleotides. This similarity differs between the two species, with S. acidocaldarius showing similarity to the bacterial protein and H. volcanii to the eukaryotic protein. This result highlights the intermediary place that Archaea hold between the Bacteria and Eukaryotes, though it contrasts with the proposed positioning of the Eukaryotes within the TACK superphylum (Guy and Ettema 2011; Eme et al. 2017) and hints at an ancient horizontal gene transfer event in early evolution. The proteins identified here all belong to the NUDIX superfamily, which is highly conserved throughout life (McLennan 2006), suggesting that DAMP may be widespread and modulated via a conserved mechanism in all three domains. However, the proteins identified in these Archaea have not been empirically tested, and indeed their functions have primarily been inferred from sequence analysis. Care should therefore be taken in interpreting these similarities until such empirical evidence is forthcoming. The results presented here, though, indicate that DAMP is prevalent at a wide evolutionary scale, being present in all three domains of life, suggesting deep evolutionary origins for such socially mediated mutation rate plasticity.

3.6 References

Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402. Oxford University Press.

Barry, E. R., and S. D. Bell. 2006. DNA Replication in the Archaea. Microbiol. Mol. Biol. Rev. 70:876–887.

Beeston, A. L., and M. G. Surette. 2002. pfs-Dependent Regulation of Autoinducer 2 Production in Salmonella enterica Serovar Typhimurium. J. Bacteriol. 184:3450–3456.

Brock, T. D., K. M. Brock, R. T. Belly, and R. L. Weiss. 1972. Sulfolobus: A new genus of sulfur-oxidizing bacteria living at low pH and high temperature. Arch. Mikrobiol. 84:54–68.

Chen, L., K. Brügger, M. Skovgaard, P. Redder, Q. She, E. Torarinsson, B. Greve, M. Awayez, A. Zibat, H.-P. Klenk, and R. A. Garrett. 2005. The Genome of Sulfolobus acidocaldarius, a Model Organism of the Crenarchaeota. J. Bacteriol. 187:4992–4999.

Denburg, J. L., and W. D. Mcelroy. 1970. Anion Inhibition of Firefly Luciferase. Arch. Biochem. Biophys. 141:668–675.

Doherty, N., M. T. G. Holden, S. N. Qazi, P. Williams, and K. Winzer. 2006. Functional analysis of luxS in Staphylococcus aureus reveals a role in metabolism but not quorum sensing. J. Bacteriol. 188:2885–97. American Society for Microbiology (ASM).

Elena, S. F., and J. A. G. de Visser. 2003. Environmental stress and the effects of mutation. J. Biol. 2:12. BioMed Central.

177 Eme, L., A. Spang, J. Lombard, C. W. Stairs, and T. J. G Ettema. 2017. Archaea and the

origin of eukaryotes. , doi: 10.1038/nrmicro.2017.133.

Galhardo, R. S., P. J. Hastings, and S. M. Rosenberg. 2007. Mutation as a stress

response and the regulation of evolvability.

Guy, L., and T. J. G. Ettema. 2011. The archaeal “TACK” superphylum and the origin of

eukaryotes. Elsevier.

Hartman, A. L., C. Norais, J. H. Badger, S. Delmas, S. Haldenby, R. Madupu, J. Robinson,

H. Khouri, Q. Ren, T. M. Lowe, J. Maupin-Furlow, M. Pohlschroder, C. Daniels, F.

Pfeiffer, T. Allers, and J. A. Eisen. 2010. The Complete Genome Sequence of

Haloferax volcanii DS2, a Model Archaeon. PLoS One 5:e9605. Public Library of

Science.

Krašovec, R., R. V Belavkin, J. A. D. Aston, A. Channon, E. Aston, B. M. Rash, M.

Kadirvel, S. Forbes, and C. G. Knight. 2014a. Mutation rate plasticity in rifampicin

resistance depends on Escherichia coli cell-cell interactions. Nat. Commun.

5:3742.

Krašovec, R., R. V Belavkin, J. A. D. Aston, A. Channon, E. Aston, B. M. Rash, M.

Kadirvel, S. Forbes, and C. G. Knight. 2014b. Where antibiotic resistance

mutations meet quorum-sensing. Microb. Cell 1:250–252.

Krašovec, R., H. Richards, D. R. Gifford, C. Hatcher, K. J. Faulkner, R. V. Belavkin, A.

Channon, E. Aston, A. J. McBain, and C. G. Knight. 2017. Spontaneous mutation

rate is a plastic trait associated with population density across domains of life.

PLoS Biol. 15:e2002731.

Leipe, D. D., L. Aravind, and E. V Koonin. 1999. Did DNA replication evolve twice

independently? Nucleic Acids Res. 27:3389–3401. Oxford University Press.

Lynch, M., M. S. Ackerman, J.-F. Gout, H. Long, W. Sung, W. K. Thomas, and P. L. Foster.

2016. Genetic drift, selection and the evolution of the mutation rate. Nat. Rev.

Genet. 17:704–714.

Mazoyer, A., R. Drouilhet, S. Despréaux, and B. Ycart. 2017. flan: An R Package for

Inference on Mutation Models. R J. 9:334–351.

McLennan, A. G. 2006. The Nudix hydrolase superfamily. Cell. Mol. Life Sci. 63:123–

143.

Michaels, M. L., and J. H. Miller. 1992. The GO System Protects Organisms from the

Mutagenic Effect of the Spontaneous Lesion 8-Hydroxyguanine (7,8-Dihydro-8-

Oxoguanine). J. Bacteriol. 174:6321–6325.

Mullakhanbhai, M. F., and H. Larsen. 1975. Halobacterium volcanii spec. nov., a Dead

Sea halobacterium with a moderate salt requirement. Arch. Microbiol. 104:207–

214. Springer-Verlag.

Nunoshiba, T., R. Ishida, M. Sasaki, S. Iwai, Y. Nakabeppu, and K. Yamamoto. 2004. A

novel Nudix hydrolase for oxidized purine nucleoside triphosphates encoded by

ORFYLR151c (PCD1 gene) in Saccharomyces cerevisiae. Nucleic Acids Res.

32:5339–5348. Oxford University Press.

Pinheiro, J., D. Bates, S. DebRoy, D. Sarkar, and R. C. Team. 2018. nlme: Linear and

Nonlinear Mixed Effects Models. R Packag. version 3.1-137.

Quehenberger, J., L. Shen, S. V. Albers, B. Siebers, and O. Spadiut. 2017. Sulfolobus - A

potential key organism in future biotechnology. Front. Microbiol. 8:2474.

Saint-Ruf, C., and I. Matic. 2006. Environmental tuning of mutation rates. Environ.

Microbiol. 8:193–9.

Schaffer, A. A., L. Aravind, T. L. Madden, S. Shavirin, J. L. Spouge, Y. I. Wolf, E. V.

Koonin, and S. F. Altschul. 2001. Improving the accuracy of PSI-BLAST protein

database searches with composition-based statistics and other refinements.

Nucleic Acids Res. 29:2994–3005.

Soppa, J. 2006. From genomes to function: Haloarchaea as model organisms.

Microbiology 152:585–590.

White, M. F., and T. Allers. 2018. DNA repair in the archaea—an emerging picture.

FEMS Microbiol. Rev. 42:514–526. Oxford University Press.

Winzer, K., K. R. Hardie, and P. Williams. 2003. LuxS and Autoinducer-2: Their

Contribution to Quorum Sensing and Metabolism in Bacteria.

Yutin, N., K. S. Makarova, S. L. Mekhedov, Y. I. Wolf, and E. V Koonin. 2008. The deep

archaeal roots of eukaryotes. Mol. Biol. Evol. 25:1619–1630. Oxford University

Press.

3.7 Appendix

3.7.1 Amino acid alignments from the BLASTp analysis

3.7.1.1 Amino acid sequence alignment of Escherichia coli MutT to Sulfolobus acidocaldarius DSM 639

>sai:Saci_0153 ADP-ribose pyrophosphatase Length=146

Score = 43.5 bits (101), Expect = 2e-07, Method: Compositional matrix adjust.
Identities = 30/120 (25%), Positives = 57/120 (48%), Gaps = 2/120 (2%)

mutT 1   MKKLQIAVGIIRNENNEIFITRRAADAHMANKLEFPGGKIEMGETPEQAVVRELQEEVGI 60
         M++ +AVG + + N++ + +R + N PGGK+E GET AV RE++EE +
0153 1   MERPLVAVGGVILKGNKVLLVKRRNPPNKGN-WAIPGGKVEYGETLVDAVKREMKEETAL 59

mutT 61  TPQHFSLFEKLEYEFPDRHITLWFWLVERWEGEPW-GKEGQPGEWMSLVGLNADDFPPAN 119
         + L +E H ++ ++ + GE G + +++ L L ++ P
0153 60  DVEPIELLAVVEIIKEGYHYVIFDFICKVLNGELNPGSDATSADFLGLDELRRENVSPTT 119

>sai:Saci_0121 nudT; 7,8-dihydro-8-oxoguanine triphosphatase Length=153

Score = 42.0 bits (97), Expect = 6e-07, Method: Compositional matrix adjust.
Identities = 28/99 (28%), Positives = 47/99 (47%), Gaps = 4/99 (4%)

mutT 1   MKKLQIAVGIIRNENNEIFITR-RAADAHMANKLEFPGGKIEMGETPEQAVVRELQEEVG 59
         M+ ++ +G+++ + +FI + R N GGK+ ETP + V REL EE+G
0121 1   MRFMETCLGVVKRGDFFLFIRKLRGIGTGYINS---AGGKLRDTETPRECVERELYEELG 57

mutT 60  ITPQHFSLFEKLEYEFPDRHITLWFWLVERWEGEPWGKE 98
         I K+E+ + + +LV +EG P E
0121 58  IRVLSSERVGKIEFLYGQEKYLMHVYLVTEFEGIPRASE 96

>sai:Saci_0013 nudF; ADP-ribose pyrophosphatase Length=175

Score = 34.7 bits (78), Expect = 4e-04, Method: Compositional matrix adjust.
Identities = 18/40 (45%), Positives = 25/40 (63%), Gaps = 2/40 (5%)

mutT 34  EFPGGKIEMGETPEQAVVRELQEEVGITPQHFSLFEKLEY 73
         EFP G IE GE+ + REL+EE+G P LF+ L++
0013 62  EFPAGTIEEGESEDLTARRELEEEIGYVP--LKLFKVLKF 99

>sai:Saci_0550 NUDIX domain protein Length=155

Score = 31.6 bits (70), Expect = 0.005, Method: Compositional matrix adjust.
Identities = 22/71 (31%), Positives = 33/71 (46%), Gaps = 10/71 (14%)

mutT 1   MKKLQIAVGIIRNENNEIFITRRAA------DAHMANKLEFPGGKIEMGETPEQAVVREL 54
         M+ + AV + ++ +I I +R HMA PGG+ E E E +RE
0550 1   MEDCKAAVVALISKVGKILIIKRKEKPGDPWSGHMA----LPGGRREDHEECESTAIREC 56

mutT 55  QEEVGITPQHF 65
         EEV I P++
0550 57  YEEVRIKPRNL 67

3.7.1.2 Amino acid sequence alignment of Saccharomyces cerevisiae PCD1 to Haloferax volcanii DS2

>hvo:HVO_0762 NUDIX family hydrolase Length=198

Score = 37.7 bits (86), Expect = 4e-04, Method: Compositional matrix adjust.
Identities = 29/95 (31%), Positives = 43/95 (45%), Gaps = 5/95 (5%)

PCD1 41  AVIILLFIGMKGELRVLLTKRSRTLRSFSGDVSFPGGKADYFQETFESVARREAEEEIGL 100
         A +I + +L TKR+ L G +SFPGG + + A RE+ EEIGL
0762 20  AAVIAPVVFRDERPHILFTKRADHLGEHPGQMSFPGGGREPSDADLRATALRESNEEIGL 79

PCD1 101 PHDPEVLHKEFGMKLDNLVMDMPCYLSRTFLSVKP 135
         PE F +LD+ + + Y F++ P
0762 80  --RPE--EAAFHGRLDD-IRTVTDYAVSPFVATVP 109

3.7.1.3 Amino acid sequence alignment of Escherichia coli MutT to Haloferax volcanii DS2

>hvo:HVO_0030 NUDIX family hydrolase Length=139

Score = 33.5 bits (75), Expect = 0.001, Method: Compositional matrix adjust.
Identities = 20/71 (28%), Positives = 32/71 (45%), Gaps = 4/71 (6%)

Query 7   AVGIIRNENNEIFITRRAADAHMANKLEFPGGKIEMGETPEQAVVRELQEEVGITPQHFS 66
          A G++R ++ + + R + P GK+E GET + VRE++EE
Sbjct 11  AGGLLRRDDGRLCLVHRP----RYDDWSLPKGKLEPGETLVETAVREVREETRCEVDCGR 66

Query 67  LFEKLEYEFPD 77
          + EY PD
Sbjct 67  FAGRYEYRVPD 77

3.7.2 Media preparation for both Archaea species

3.7.2.1 Sulfolobus acidocaldarius media

Sulfolobus acidocaldarius media preparation protocol (provided by Dr. Michaela Wagner from the University of Freiburg).

First prepare the trace element solution, which is 2000x concentrated. You can store this solution for a long time at room temperature. As the Brock components will be autoclaved later, you do not have to autoclave/sterilise the trace element solution.

1) Prepare Trace Element Solution

Makes 1 L trace element solution (2000x):

→ Stick to the order

Na2B4O7.10H2O 9.0 g → Add H2SO4 until it dissolves (probably several ml; the end pH does not really matter)

ZnSO4.7H2O 0.44 g

CuCl2.2H2O 0.10 g

Na2MoO4.2H2O 0.06 g

VOSO4.2H2O 0.06 g

CoSO4.7H2O 0.02 g

MnCl2.4H2O 3.60 g

The preparation of the trace element solution seemed to be a problem in some labs. But I think if you stick to the order and go to acidic pH, all ingredients should dissolve easily.

2) Prepare Brock solutions:

Brock I (1000x): CaCl2.2H2O – 70g/L

Then autoclave to sterilise

Brock II+III (100x): (NH4)2SO4 - 130g/L

MgSO4.7H2O – 25g/L

KH2PO4 – 28g/L

50 ml trace element solution

1.5 ml H2SO4 (1:1)

Then autoclave to sterilise

Fe-solution (1000x): FeCl3.6H2O 20g/L

Then filter sterilise

3) Autoclave 1 L demineralised water in a bottle

Then you can prepare the medium at the flame

4) Prepare 1 L Brock basal medium: 1 ml Brock I solution

10 ml Brock II + III solution

1 ml Fe-solution

Make up volume to 1L with autoclaved demineralised water from step 3.

5) Afterwards, adjust the pH to around 3.0-3.5. I usually use 50 % H2SO4 for this (around 120 µl). Always check the pH by pipetting a few µl onto a pH strip.

Note: It is normal that the Brock basal medium becomes a little bit cloudy/milky. You can store this prepared Brock basal solution for 1 month at room temperature.
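The stock strengths quoted in this protocol can be cross-checked with simple dilution arithmetic. The following is a minimal Python sketch, not part of the original protocol; the function names are hypothetical, and it assumes that the Brock II+III recipe (including the 50 ml of trace element solution) is made up to 1 L, as implied by the g/L amounts.

```python
# Stock strengths and volumes added per litre of Brock basal medium,
# as stated in the protocol above.
STOCKS = {
    "Brock I": {"factor": 1000, "ml_per_litre": 1.0},
    "Brock II+III": {"factor": 100, "ml_per_litre": 10.0},
    "Fe-solution": {"factor": 1000, "ml_per_litre": 1.0},
}


def final_strength(factor, ml_added, final_volume_ml=1000.0):
    """Working strength after dilution (1.0 means 1x)."""
    return factor * ml_added / final_volume_ml


def trace_factor_in_brock_ii_iii(trace_factor=2000, ml_trace=50.0,
                                 stock_volume_ml=1000.0):
    """Strength of the trace elements once mixed into the Brock II+III stock.

    Assumes the Brock II+III stock is made up to 1 L (an assumption).
    """
    return trace_factor * ml_trace / stock_volume_ml


for name, stock in STOCKS.items():
    strength = final_strength(stock["factor"], stock["ml_per_litre"])
    print(f"{name}: {strength:g}x in the final medium")

print(f"Trace elements in Brock II+III: {trace_factor_in_brock_ii_iii():g}x")
```

Under these assumptions every stock ends up at 1x in the final medium, and the 2000x trace element solution is carried at the same 100x strength as the Brock II+III stock that contains it.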

______

3.7.2.2 Haloferax volcanii media

Haloferax volcanii media preparation protocol, provided by Dr. Thorsten Allers from the University of Nottingham.

1. Make Concentrated Salt Water solution as a base for the Hv-Ca media.

Concentrated Salt Water (SW), 30% w/v

Distilled H2O (warm) ~4 litres

NaCl 1200 g

MgCl2·6H2O 150 g

MgSO4·7H2O 175 g

KCl 35 g

1 M Tris.HCl pH 7.5 100 ml

Warm water on hot plate to help dissolve salts.

Add distilled H2O to final volume of 5 litres.

2. Make up 10x concentrated Casamino acid solution.

Makes enough for 5 bottles of 333ml media.

Distilled water: 130 ml (approximately)

Casamino acids: 8.5 g

Slowly add 3 ml 1 M KOH while stirring.

Once dissolved, add distilled H2O to make up a final volume of 170 ml.

Do not autoclave, but use the same day.

3. Make the Media

Into a 500ml bottle, measure out:

− 100 ml Distilled H2O

− 200 ml 30% SW solution

− 33 ml Casamino acid solution

Autoclave to sterilise media.

If needing solid plates, into a 500ml bottle measure out:

− 100 ml Distilled H2O

− 200 ml 30% SW solution

− 5g Agar

Microwave until agar is dissolved (times for 5 bottles):

20 minutes on 100%, 20 minutes on 70% and 10 minutes on 50% twice. If agar still requires dissolving, put in again on 50% for 10 minutes.

Then add 33ml of casamino acid solution to bottle and autoclave.

4. Add solutions to Hv-Ca after autoclave.

Let bottles cool to ~57°C before adding the following to each bottle of 333ml media:

− Hv-Ca salt solution – 2.8ml

− 0.5M KPO4 buffer – 666µl

Hv-Ca salts solution recipe (makes 83ml of salt solution):

− 60 ml 0.5M CaCl2

− 10 ml Trace elements

− 13 ml Thiamine and Biotin

For Trace elements, add to 250ml H2O:

MnCl2·4H2O – 90 mg

ZnSO4·7H2O – 110 mg

FeSO4·7H2O – 575 mg

CuSO4·5H2O – 12.5 mg

For Thiamine and Biotin (19.5ml):

− 12ml Thiamine (1 mg/ml)

− 7.5ml Biotin (0.2 mg/ml)

Filter sterilise salt solution and then store at 4°C

0.5M KPO4 buffer solution (makes 100 ml):

− 61.5 ml 1M K2HPO4

− 38.5 ml 1M KH2PO4

Check pH = 7 and then filter sterilise. Store at 4°C
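The volumes and final concentrations implied by this protocol follow from the usual dilution relation C1·V1 = C2·V2. The sketch below is an illustrative check in Python, not part of the original protocol; the helper name `dilute` is hypothetical, and the small volume added to each bottle after autoclaving is ignored.

```python
def dilute(stock_conc, stock_volume_ml, final_volume_ml):
    """C1*V1 = C2*V2: concentration after diluting a stock into a final volume."""
    return stock_conc * stock_volume_ml / final_volume_ml


# Recipe totals quoted in the protocol (ml)
hv_ca_salts_ml = 60 + 10 + 13      # CaCl2 + trace elements + thiamine and biotin
thiamine_biotin_ml = 12 + 7.5      # thiamine + biotin
kpo4_buffer_ml = 61.5 + 38.5       # K2HPO4 + KH2PO4

# 10x casamino acid stock: 8.5 g in 170 ml, expressed in g/l
casamino_stock_g_per_l = 8.5 / 170 * 1000

# 33 ml of that stock in a 333 ml bottle
casamino_final_g_per_l = dilute(casamino_stock_g_per_l, 33, 333)

# 666 µl of 0.5 M (500 mM) KPO4 buffer in a 333 ml bottle, in mM
kpo4_final_mM = dilute(500.0, 0.666, 333)

print(f"Hv-Ca salts total: {hv_ca_salts_ml} ml")
print(f"Thiamine and biotin total: {thiamine_biotin_ml} ml")
print(f"KPO4 buffer total: {kpo4_buffer_ml} ml")
print(f"Casamino acids: {casamino_stock_g_per_l:.1f} g/l stock, "
      f"{casamino_final_g_per_l:.2f} g/l final")
print(f"KPO4 buffer: {kpo4_final_mM:.2f} mM final")
```

The component volumes sum to the 83 ml, 19.5 ml and 100 ml totals stated above, the casamino acid stock works out at about ten times its final concentration, and the phosphate buffer ends up at roughly 1 mM.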

3.7.3 Statistical Model outputs

3.7.3.1 Model output for Figure 3.1A

Log2 Mutation Rate

Predictors                         Estimates   CI               p
(Intercept)                        8.14        7.67 – 8.61      <0.001
Mean-centred Population Density    -0.92       -1.15 – -0.69    <0.001
Novobiocin                         -4.03       -4.48 – -3.57    <0.001
Observations                       45
R2 / Omega-squared                 0.852 / 0.839

Population density was estimated via direct CFU counts.

3.7.3.2 Model output for Figure 3.1B

Log2 Mutation Rate

Predictors                         Estimates   CI               p
(Intercept)                        8.01        7.53 – 8.49      <0.001
Mean-centred Population Density    -0.80       -0.96 – -0.63    <0.001
Observations                       12
R2 / Omega-squared                 0.797 / 0.791

Population density was estimated via direct CFU counts.

3.7.3.3 Model output for Figure 3.2

Log2 Mutation Rate

Predictors                                    Estimates   CI               p
(Intercept)                                   8.47        8.17 – 8.76      <0.001
Mean-centred Calibrated Population Density    -0.58       -1.01 – -0.16    0.009
Novobiocin                                    -3.06       -3.86 – -2.27    <0.001
Observations                                  45
R2 / Omega-squared                            0.749 / 0.735

Population density was estimated via an ATP-based assay.
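Because the response in these models is the log2 mutation rate, a treatment coefficient can be read as a fold-change by raising 2 to the power of the estimate. The short Python sketch below illustrates this for the two Novobiocin estimates reported above. It is an interpretive aid only: the function name is hypothetical, and it assumes the coefficients are additive effects on the log2 scale, as the table headings indicate.

```python
def fold_change(log2_estimate):
    """Convert an additive effect on the log2 scale into a multiplicative fold-change."""
    return 2.0 ** log2_estimate


# Novobiocin estimates from the model outputs for Figure 3.1A and Figure 3.2
novobiocin_cfu = fold_change(-4.03)   # density estimated via CFU counts
novobiocin_atp = fold_change(-3.06)   # density estimated via the ATP-based assay

print(f"CFU-based model: {1 / novobiocin_cfu:.1f}-fold lower mutation rate")
print(f"ATP-based model: {1 / novobiocin_atp:.1f}-fold lower mutation rate")
```

On this reading, the Novobiocin coefficients correspond to mutation rates roughly 16-fold and 8-fold lower, respectively, than the reference level at mean population density.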


Chapter 4: Evolution of Density Associated Mutation Rate Plasticity within strains of Escherichia coli

4.1 Abstract

Mutation provides the fuel for natural selection. The rate at which these rare perturbations arise in an organism’s genome could therefore play a key role in its evolution. The mutation rate, however, is not constant and may vary plastically within a single genotype, depending upon the environment. One form of such environmental mutation rate plasticity occurs where the mutation rate is inversely associated with population density. This so-called Density Associated Mutation Rate Plasticity (DAMP) occurs, but is variable, in all domains of life, requiring the same mutation avoidance mechanism in both pro- and eukaryotes. To further investigate DAMP’s evolution and variability at fine evolutionary scales, mutation rates were estimated in dozens of strains of the bacterium Escherichia coli. Across these strains, both the average mutation rate and DAMP vary significantly, with more closely related strains exhibiting a more similar degree of DAMP, though such a relationship is not evident for the average mutation rate. Furthermore, having corrected for phylogenetic relatedness between these strains, there is slight evidence that a strain’s average mutation rate and degree of DAMP are themselves associated with one another. Specifically, strains exhibiting lower mutation rates possessed a greater degree of DAMP than those strains with a higher average mutation rate. This may help to explain the evolution of DAMP. DAMP could have evolved either as a mechanism to reduce the mutational load or as a mechanism for increased novelty generation. This result suggests that DAMP may have evolved for the latter. These findings show that DAMP has not only evolved a wide phylogenetic distribution, but is also a fast-evolving trait within a single species. Taken together, these results indicate that DAMP is a highly labile trait that has evolved an association with an organism’s mutation rate, potentially modulating the evolvability of such organisms.

4.2 Introduction

Mutation is central to population genetics, and yet understanding the rate at which mutations spontaneously arise remains an enduring task in evolutionary biology (Lynch et al. 2016). Whilst mutation allows for an organism’s evolution, the mutation rate itself, like any trait, can evolve, as is evident from the wide range of mutation rates seen in the natural world (Sung et al. 2012, 2016). It is becoming increasingly apparent, however, that mutation rates do not just vary interspecifically, but can also vary considerably within a single genome (Krašovec et al. 2014, 2017; Kohlmann et al. 2018), depending upon the environment (Saint-Ruf and Matic 2006; Swings et al. 2017). Such mutation rate plasticity has predominantly been investigated in relation to stressful environments (Foster 2007; Matic 2016). Recent findings, however, have identified that mutation rates can also vary plastically in response to signals from the social environment (Krašovec et al. 2014). Specifically, an organism’s mutation rate is inversely associated with the final population density it reaches in a culture, a phenomenon termed Density Associated Mutation rate Plasticity (DAMP) (Krašovec et al. 2017).

Under stressful environmental conditions, molecular mechanisms are initiated that result in an increased mutation rate (Matic 2016), a process termed Stress-Induced Mutagenesis (SIM) (Williams and Foster 2012). This increased mutation rate allows the generation of novel phenotypes that can survive and adapt to the stress (Galhardo et al. 2007; Swings et al. 2017). Conversely, under DAMP the mutation rate decreases as the population density of a culture increases (Krašovec et al. 2017). These two plastic phenomena, SIM and DAMP, act on the mutation rate with opposing effects, producing a mutation rate minimum at intermediate population density (Krašovec et al. 2018). Additionally, DAMP, whilst interacting with SIM, does not obviously associate with its mechanisms (Krašovec et al. 2014, 2018). Such an interaction may minimise mutation rates at intermediate nutrient and population density levels, reducing evolutionary change in benign environments, while raising mutation rates at low and high nutrient and population density levels, where increased adaptation and the ability to cope with increased competition, respectively, are needed.

DAMP is present in diverse organisms, with both pro- and eukaryotes exhibiting DAMP’s inverse association between mutation rate and density (Krašovec et al. 2017). In both these domains, DAMP requires a Nudix hydrolase responsible for degrading the same mutagenic nucleotide, 8-oxo-dGTP (Michaels and Miller 1992). Whilst present at this broad evolutionary scale, the degree of plasticity exhibited varies considerably, even between closely related species (Krašovec et al. 2017). DAMP is present in E. coli, to the same degree at two loci, but is absent in Pseudomonas aeruginosa, despite both being Gram-negative Gammaproteobacteria. DAMP even varies at a finer evolutionary scale, where the degree exhibited differs between strains of the yeast Saccharomyces cerevisiae, even at the same genomic locus (Krašovec et al. 2017). Together, these results suggest that DAMP is a wide-ranging trait that has evolved at fine scales within a species, though this remains to be tested empirically.

DAMP’s evolution at fine evolutionary scales was investigated by estimating mutation rates, again via fluctuation tests, in dozens of strains of the same species, the bacterium E. coli. These strains, from the E. coli reference (ECOR) collection (Ochman and Selander 1984), were isolated from a broad range of hosts and locales and represent the range of genetic variation of the species. The whole genomes of these strains were then sequenced to investigate the phylogenetic distribution of both average mutation rate and DAMP. This allowed a strain’s average mutation rate and DAMP slope to be mapped onto the phylogenetic tree and analysed to determine whether there was any phylogenetic signal within these strains that could indicate the mode of evolution involved in DAMP. Furthermore, by controlling for the relatedness between strains it is possible to investigate whether there is any interaction between the average mutation rate and the degree of DAMP exhibited by the ECOR strains.

DAMP could have evolved for two possible reasons. First, for strains with a low average mutation rate, DAMP may provide an opportunity for increased novelty production through increases in mutation rate. Alternatively, DAMP may allow a strain that has a higher average mutation rate to reduce its mutational burden. Finally, the ECOR strains allow for further investigation into the interplay between the two modes of mutation rate plasticity: SIM and DAMP. As mentioned above, these two plastic traits interact with one another along a nutrient gradient: DAMP decreases mutation rate at low nutrient and population density levels, with SIM increasing mutation rates at higher nutrient and population density levels. This previous interaction was investigated in E. coli MG1655, which exhibits DAMP. Two questions therefore remain. First, do SIM and DAMP interact in a similar mode in natural isolates of E. coli? Secondly, does the increase due to SIM occur even in strains not exhibiting any noticeable degree of DAMP, or does a lack of DAMP indicate a low plasticity potential of their genome? To address these questions, mutation rates were estimated in two strains from the ECOR collection that have differing degrees of DAMP and are closely related both to each other and to MG1655.

4.3 Materials and Methods

4.3.1 Strains used in this study The strains used in this study are provided in a Table at the end of this Chapter in the Appendix.

4.3.2 Media MilliQ water was used for all media. For Escherichia coli, tetrazolium arabinose agar (TA) and Davis minimal medium (DM) were prepared according to Lenski et al. (1991), and Luria-Bertani medium (LB) was prepared according to the manufacturer’s instructions. Magnesium sulphate heptahydrate, thiamine hydrochloride, carbon source (3g/l L-arabinose or various concentrations of D-glucose) and 2,3,5-triphenyltetrazolium chloride (Sigma T8877) were sterile filtered before being added to media. Selective TA medium was supplemented with freshly prepared rifampicin (50µg/ml). For all cell dilutions sterile saline (8.5 g/l NaCl) was used. All media were solidified as necessary with 15 g/l of agar (Difco).

4.3.3 Fluctuation tests

4.3.3.1 Variation of mutation rate and DAMP in strains of E. coli Fluctuation tests were carried out as previously described (Krašovec et al. 2014, 2017). Briefly, strains were inoculated from frozen glycerol stock into liquid LB medium and grown at 37°C with moderate shaking. After ~7 hours, the strains were transferred to non-selective liquid DM media supplemented with 250 mg/l of glucose as a single carbon source and allowed to grow overnight at 37°C, again with moderate shaking. The following morning, strains were again diluted into fresh DM, giving an initial population size (N0) of around 3700 (range 285 - 12180). Various volumes (0.5–0.95 ml) of parallel cultures were grown for 24 hours at 37°C with shaking at 250rpm in 96 deep-well plates. The position of each culture on a 96-well plate was chosen randomly, as was the order of treatment (that is, a strain and its specific glucose concentration) across multiple 96-well plates. Final population size (Nt) of each culture was determined by two independent techniques. First, Nt was determined by colony forming units (CFU), where an appropriate dilution was plated on solid non-selective TA medium. Three wells were set aside for each treatment (a particular glucose concentration for a particular strain) for Nt estimation by CFU, and a minimum of two wells were used for each CFU estimate. Additionally, estimates of Nt using net luminescence (LUM) were determined using a Promega GloMax luminometer and the Promega BacTiter-Glo kit, according to the manufacturer's instructions. We measured the luminescence of each culture 0.5 and 450 seconds after adding the BacTiter-Glo reagent and calculated net luminescence as $\mathrm{LUM} = \mathrm{luminescence}_{450\,\mathrm{s}} - \mathrm{luminescence}_{0.5\,\mathrm{s}}$. Several strains did not exhibit a clear association between net luminescence and population density estimated via direct CFU counts, and these were therefore excluded from the analysis (strains: ECOR 40, 49, 55, 56). A plot of the calibration curves for the ECOR collection, together with the model used, is presented in Figure 4.1. Evaporation (routinely monitored by weighing each plate before and after the 24h incubation) was accounted for in the Nt value determined by CFU and was also used in statistical modelling as a variance covariate. We obtained the observed number of mutants resistant to rifampicin, r, by plating the entirety of the remaining cultures onto solid selective TA medium that allows spontaneous mutants to form colonies. Selective rifampicin plates were incubated at 37°C for 72 hours.
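The net luminescence calculation and its calibration against CFU-based density can be sketched as follows. This Python example is a simplified stand-in, not the analysis used here: it replaces the mixed-effects calibration model with an ordinary least-squares fit on log10-transformed values, the function names are hypothetical, and the readings in the example are invented purely for illustration.

```python
import math


def net_luminescence(lum_450s, lum_0_5s):
    """Net luminescence: reading at 450 s minus reading at 0.5 s."""
    return lum_450s - lum_0_5s


def fit_log_log(net_lum, cfu_density):
    """Ordinary least-squares fit of log10(density) on log10(LUM).

    Returns (intercept, slope). This ignores the random effects and
    variance structure of a mixed-effects calibration model.
    """
    xs = [math.log10(v) for v in net_lum]
    ys = [math.log10(v) for v in cfu_density]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_density(net_lum, intercept, slope):
    """Predicted population density for a net luminescence value."""
    return 10 ** (intercept + slope * math.log10(net_lum))


# Invented example readings (luminescence at 450 s and 0.5 s) with CFU densities
readings = [(2.1e5, 1.0e4), (1.1e6, 2.0e4), (5.2e6, 5.0e4)]
densities = [6.0e6, 3.0e7, 1.4e8]

lum = [net_luminescence(late, early) for late, early in readings]
intercept, slope = fit_log_log(lum, densities)
print(f"slope = {slope:.2f}, predicted density at LUM 1e6: "
      f"{predict_density(1e6, intercept, slope):.3g}")
```

Working on the log scale mirrors the logarithmic axes of the calibration plot; any strain whose readings do not follow such a relationship would need to be excluded, as described above.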

[Figure 4.1: population density (ml−1) plotted against net luminescence (LUM)]

Figure 4.1 Calibration curves for final population density measured by counting colony forming units (CFU), against luminescence, assayed with the BacTiter-Glo assay (LUM). Calibration curves for all ECOR plots. Net luminescence was modelled against population density with a random effect of slope on strain nested within the 96 well plate ID, nested within experimental block. The identity of the strain was also included as a variance component (heteroscedasticity) along with power functions of the Nt estimate and the standard deviation in gross luminescence. This model had an R-squared value of 0.792. The overall correlation of net luminescence and population density estimated via CFU counts is 0.53 (0.48 – 0.58 CI). Note both axes are logarithmic.

4.3.3.2 Interaction between SIM and DAMP in strains of E. coli Fluctuation tests to investigate the interaction of SIM and DAMP in two isolates of the ECOR collection were conducted as previously described (Krašovec et al. 2018). Specifically, strains were inoculated from frozen glycerol stock into fresh LB media and grown at 37°C with shaking at 120rpm. This culture was then diluted into a fresh mixture of DM and LB media and grown overnight at 37°C with shaking at 120rpm. The next day cells were inoculated into a specific fresh media mix of DM and LB (1 to 57% LB), a range which included the interaction point of SIM and DAMP previously described (~10% LB; Krašovec et al. 2018). This gave an initial N0 of approximately 1400 and 1900 (range 500 – 3700 and 200 – 5250) for ECOR 1 and 8 respectively. Independent parallel cultures of various volumes (0.5 – 1ml) were grown in 96 deep-well plates for 24 hours at 37°C with shaking at 250rpm. The position of each culture within the 96 well plate was allocated randomly. Nt was estimated via CFU only, as described above. Evaporation was accounted for as described above. We obtained the observed number of mutants resistant to rifampicin, r, by plating the entirety of the remaining cultures onto solid selective TA medium that allows spontaneous mutants to form colonies. Selective rifampicin plates were incubated at 37°C for 72 hours.

4.3.5 Estimation of mutation rates To calculate the number of mutational events, m, from the observed number of mutants we employed the Ma-Sandri-Sarkar maximum-likelihood method implemented by the R package flan v0.6 (Mazoyer et al. 2017). The mutation rate per cell per generation is calculated as m divided by the final population size, Nt (as determined by CFU). The flan package also allows the fitness of mutant cells to be estimated alongside m. Whether the mutation rate includes an estimate of fitness cost or not is stated in both the main text and figure legends.
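The logic of the Ma-Sandri-Sarkar estimate can be illustrated with a minimal Python sketch. This is not the flan implementation: it assumes mutants carry no fitness cost and that every mutant is plated, it uses the standard Luria-Delbrück recursion $p_0 = e^{-m}$, $p_r = \frac{m}{r}\sum_{i=0}^{r-1} \frac{p_i}{r-i+1}$, and it maximises the likelihood with a simple golden-section search whose bounds are an arbitrary choice. All function names are hypothetical.

```python
import math


def ld_probabilities(m, r_max):
    """P(r mutants) for r = 0..r_max under the Luria-Delbrück distribution."""
    p = [math.exp(-m)]
    for r in range(1, r_max + 1):
        p.append(m / r * sum(p[i] / (r - i + 1) for i in range(r)))
    return p


def log_likelihood(m, counts):
    """Log-likelihood of m given the mutant counts of the parallel cultures."""
    p = ld_probabilities(m, max(counts))
    return sum(math.log(p[r]) for r in counts)


def estimate_m(counts, lo=1e-6, hi=None, tol=1e-8):
    """Maximum-likelihood m by golden-section search on [lo, hi].

    The default upper bound, max(counts) + 1, is an assumption of this sketch.
    """
    if hi is None:
        hi = max(counts) + 1.0
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = log_likelihood(c, counts), log_likelihood(d, counts)
    while b - a > tol:
        if fc > fd:          # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = log_likelihood(c, counts)
        else:                # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = log_likelihood(d, counts)
    return (a + b) / 2


def mutation_rate(counts, n_t):
    """Mutation rate per cell per generation: m divided by final population size."""
    return estimate_m(counts) / n_t
```

For example, `mutation_rate(counts, n_t)` takes the list of mutant colony counts from the parallel cultures of one treatment and the corresponding Nt. Incorporating a mutant fitness cost, as flan allows, would change the recursion and is not covered by this sketch.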

Due to the lack of power in estimating both fitness and m from the same small set of data, another method of accounting for fitness effects was used. The weighted median fitness value for the resistance mutation was estimated for each strain from the fitness estimates determined by flan. This median fitness value for each strain was then reintroduced into the model for estimating m, thereby allowing m to be estimated while accounting for the different average fitness effect in each strain.

4.3.6 ECOR Phylogeny Whole genomes for all strains in our ECOR collection were sequenced by MicrobesNG at the University of Birmingham (https://microbesng.uk).

The two exceptions for this were ECOR 24 and ECOR 71, due to potential contamination and low of the sequencing respectively. To rectify the sequences for these two strains, sequences were downloaded from Enterobase

(https://enterobase.warwick.ac.uk/; Alikhan et al. 2018). The ECOR collection has been shown to be potentially problematic with duplicates and discrepancies (Galardini et al.

2017), particularly in the positioning of strains in the phylogeny compared to their phylogroup as detailed on the ECOR website

(http://www.shigatox.net/new/reference-strains/ecor.html). All strains were included in the analysis even if their position in the phylogenetic tree did not match their phylogrouping. The phylogeny used in the analysis was created using ParSNP

(Treangen et al. 2014), which creates the phylogeny by a whole genome nucleotide alignment across all the strains. The whole genome of Escherichia fergusonii ATCC

35469 (Accession Number: CU928158) was included as the out-group that allowed for the identification of the root. All sequences were aligned to E. coli K-12 MG1655

(RefSeq Accession: NC_000913.3) in the creation of the phylogeny. Identification and alteration of the ECOR tree’s root position was done using FigTree v1.4.2

(http://tree.bio.ed.ac.uk/software/figtree/).

4.3.7 Phylogenetic analysis Investigation of a phylogenetic signal for the average mutation rate and DAMP was conducted through the fitContinuous command in the

198 geiger package v2.0.6 (Harmon et al. 2008) and BayesTraits v3.0.1

(www.evolution.rdg.ac.uk) (Pagel and Meade 2016), fitted through the Bayes Traits wrapper package (btw version 2.0; http://rgriff23.github.io/projects/btw.html) in R.

The geiger package allows for fitting many different evolutionary models to a continuous trait on a phylogeny, determining the best model by differences in AIC. This function was used as a preliminary test for phylogenetic signal and to narrow down the choice of model to be passed onto BayesTraits. In each instance the best evolutionary model involved Pagel’s lambda and so this was the evolutionary model estimated in

BayesTraits. The trait was fitted on the phylogeny using random walk using MCMC models. MCMC models were run for 1,001,000 iterations, with the first 1,000 discarded as burn-in. These models also used a stepping stone sampler with 1000 stones each being sampled 10000 times. Priors for the phylogenetically corrected mean of the data and lambda were estimated by first running the random walk maximum likelihood model. The values from these models were then used as the midpoint for the uniform prior used in the MCMC analysis, as detailed in the

BayesTraits manual. If the posterior distribution of these parameters seemed truncated from the use of these priors, then the limits of these priors were increased in the appropriate direction. This was done by plotting out the frequency distribution of the lambda estimates from the model. All priors used were uninformative.

Convergence of three independent MCMC runs was checked by plotting the posterior distribution and the trace plots for the parameters of the models. Diagnostic plots for the MCMC runs can be found in the Appendix of the Chapter. Specifically, diagnostic plots for the model likelihood (Lh), the phylogenetically corrected mean of the data

(Alpha-1) and the estimate for Pagel’s lambda comprised of a frequency plot of the density, the autocorrelation through the run, the running mean of the value and a

trace plot showing the sampling throughout the run. Additionally, effective sample sizes were estimated for this model as an estimate of the number of independent samples used to estimate each parameter mean, with all values being at least 200, indicating that the model had been fitted adequately. Furthermore, Gelman and Rubin’s Potential Scale Reduction Factor (PSRF) was also estimated (Gelman and Rubin 1992; Brooks and Gelman 1998). PSRF is based upon the comparison of within-chain and between-chain variance, where a value of 1 indicates that the chains have converged. If the PSRF value was 1.05 or above, the model was run for more iterations to allow the chains to converge. Lambda values were taken as the mean value from these three independent runs. Correlation between average mutation rate and DAMP was also tested using the same methods and parameters in BayesTraits as detailed above. Diagnostic plots for these analyses are also presented in the Appendix at the end of this Chapter. The plots consist of two phylogenetically corrected means for the two traits (Alpha-1 and Alpha-2 being average mutation rate and DAMP respectively). Diagnostic plots for the correlation value are also included. The correlation value was taken as the mean value from three independent runs. Significance of MCMC models was calculated as log Bayes Factors, as specified in the BayesTraits manual (Pagel and Meade 2016). Briefly, a positive Bayes Factor provides evidence for the hypothesis, with values below 2 providing weak evidence, values over 2 indicating positive evidence and values over 5 providing strong evidence. Visualisations of the trait on the phylogeny were created using the phytools package v0.6-44 (Revell 2012). The colouration within these plots is from the phytools package and not indicative of the BayesTraits analysis. All phylogenetic signal analyses were conducted on mutation rate and DAMP estimated using population density from the ATP-based assay, with the mutation rate estimated either with or without an estimate of the fitness cost of the mutation. The average mutation rate was corrected from the log value of the models analysing DAMP’s presence in the ECOR strains.
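The convergence check described above can be illustrated with a minimal sketch of the basic Gelman and Rubin statistic for a single parameter. This is not the code used for the analyses in this chapter: the function name `psrf` and the example chains are hypothetical, and the additional sampling-variability correction applied by some implementations is deliberately omitted.

```python
from statistics import mean, variance


def psrf(chains):
    """Basic potential scale reduction factor for one parameter.

    chains: list of equal-length lists of posterior samples,
    one list per independent MCMC run.
    """
    n = len(chains[0])  # samples per chain
    chain_means = [mean(c) for c in chains]
    # W: average of the within-chain sample variances
    w = mean([variance(c) for c in chains])
    # B: between-chain variance, scaled by the chain length
    b = n * variance(chain_means)
    # Pooled estimate of the posterior variance
    var_hat = (n - 1) / n * w + b / n
    return (var_hat / w) ** 0.5


# Hypothetical lambda samples from three runs (illustrative values only)
mixed = [[0.30, 0.35, 0.40, 0.33],
         [0.31, 0.36, 0.38, 0.34],
         [0.29, 0.37, 0.41, 0.32]]
stuck = [[0.10, 0.12, 0.11, 0.13],
         [0.50, 0.52, 0.51, 0.53],
         [0.90, 0.92, 0.91, 0.93]]

needs_more_iterations_mixed = psrf(mixed) >= 1.05
needs_more_iterations_stuck = psrf(stuck) >= 1.05
```

Chains that sample the same region give a value close to 1, whereas chains stuck in different regions inflate the between-chain variance and push the statistic well above the 1.05 threshold used here.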

4.3.8 Statistical analysis

All statistical analysis was executed in R v3.5.0 using the nlme v3.1-137 package for linear mixed effects modelling (Pinheiro et al. 2018). This enabled the inclusion within the same model of experimental factors (fixed effects), blocking effects (random effects) and factors affecting variance (heteroscedasticity). Two factors were incorporated in each model to account for such heteroscedasticity, with models being compared by AIC to determine the best model. Due to known differences in variance between strains of the ECOR collection, an effect of strain was included in the model in addition to the two other factors accounting for heteroscedasticity. See the figure legends for details of factors accounting for heteroscedasticity in each specific model. The components that make up the variation seen in the ECOR collection were separated out via the use of mixture models in the flexmix package v2.3-14 (Leisch 2004), grouping strains together by similarity of DAMP and mutation rate. The flexmix package does not allow for the inclusion of either random effects or variance factors in the model structure. In all cases the log2 mutation rate was used.
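As a simplified illustration of what a DAMP slope represents, the sketch below fits an ordinary least-squares line to log2 mutation rate against log2 population density for one hypothetical strain. It is an assumption-laden stand-in, not the mixed effects model described above: it ignores random effects and variance structure, it assumes density is also log2-transformed, and the function name `damp_slope` and the data are invented for illustration.

```python
import math


def damp_slope(densities, mutation_rates):
    """OLS slope and intercept of log2(mutation rate) on log2(density)."""
    x = [math.log2(d) for d in densities]
    y = [math.log2(r) for r in mutation_rates]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical cultures: mutation rate scales with density to the power -0.5
densities = [5e7, 1e8, 2e8, 4e8]
rates = [4e-9 * (d / 5e7) ** -0.5 for d in densities]
slope, intercept = damp_slope(densities, rates)
```

A negative slope corresponds to mutation rate falling as density rises, i.e. DAMP, while a slope near zero corresponds to no density association.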

4.4 Results

4.4.1 Mutation rates and DAMP in ECOR collection

4.4.1.1 Variation in both the average mutation rate and DAMP in ECOR collection

In order to dissect the causes of DAMP’s variation and ascertain any potential signal for its evolution, work must be undertaken at fine evolutionary scales. To address this question of DAMP’s evolution, mutation rates were estimated in 62 natural isolates of the bacterium E. coli, representing its full phylogenetic range (Ochman and Selander 1984). There is significant variation in both the average mutation rate (N = 830, F61, 484 = 27.97, P = 3.76 x 10^-123) and the degree of DAMP (N = 830, F61, 484 = 2.61, P = 7.25 x 10^-9) between these ECOR strains (Figure 4.2A). The slope of the wild-type MG1655 strain from this model is -0.67 [-0.38 to -0.96 CI], which is highly similar to the previous estimate of -0.68 [-0.56 to -0.80 CI] (Krašovec et al. 2017).

[Figure 4.2. Panel A: mutation rate (10^-9 generation^-1) against population density (ml^-1). Panel B: mixture model components 1 to 5, same axes.]

Figure 4.2 Variation in Average Mutation rate and DAMP in Escherichia coli strains with population density estimated via an ATP-based assay and mutation rate estimated without an estimate of the fitness cost of mutation. A Mutation rates where the fitness cost of the mutation was set to 1 (i.e. no fitness cost of mutation). Variation in slopes significantly differs between strains (N = 829, LR190 = 112.99, P = 5.91 x 10^-5). Different coloured lines and points represent the different ECOR strains. The identity of the ECOR strain, a power function of the logged gross luminescence and the number of mutational events, m, per volume were included as variance components. B Mixture model output of mutation rates for ECOR strains presented in Figure 4.2A. The model grouped the strains together according to their level of DAMP and their average mutation rates (see main text for details). Population density was estimated via an ATP-based assay in both plots. Note all axes are logarithmic.

This variation in both mutation rate and DAMP between these ECOR strains can be further explored through the use of mixture models. These models subset the data and group together strains that exhibit a similar mutation rate and degree of DAMP, teasing apart the components that make up the results presented above. This analysis separated the data into 5 components (Figure 4.2B). The first component comprises 7 strains that exhibit a low mutation rate (1.00 x 10^-9) with no significant degree of DAMP (slope -0.30 ± 0.16; Figure 4.2B.1). The second component again comprises 7 strains, which this time exhibit a moderate average mutation rate (3.27 x 10^-9) and do exhibit DAMP (slope -0.42 ± 0.16; Figure 4.2B.2). The third component comprises 5 strains with a high mutation rate (8.32 x 10^-9) that exhibit DAMP (slope -0.30 ± 0.10; Figure 4.2B.3). The fourth component contains 29 strains that exhibit a similar degree of DAMP to component 2 (slope -0.40 ± 0.07 vs -0.42 ± 0.16), but a lower average mutation rate (2.12 x 10^-9; Figure 4.2B.4). Conversely, component 5 comprises 14 strains that exhibit a similar average mutation rate to that found in component 2 (3.53 x 10^-9 vs 3.27 x 10^-9), but with a lower degree of DAMP (slope -0.32 ± 0.10; Figure 4.2B.5).

As previously found for E. coli, when not using an independent method for estimating population density and mutation rate (instead using CFU counts for both), the results do not qualitatively change (Figure 4.3). The variation between strains in both average mutation rate (N = 830, F61, 484 = 24.47, P = 9.15 x 10^-113) and degree of DAMP (N = 830, F61, 484 = 1.84, P = 2.61 x 10^-4) remains significant (Figure 4.3A). Moreover, this variation can still be split up into five components as before, though with differences in relationships between mutation rate and DAMP estimates (Figure 4.3B). The combination of both of these analyses points to evolution of both the mutation rate and DAMP, not only at large evolutionary scales between domains of life as has been previously identified (Chapters 2 and 3), but also at the very fine evolutionary scale of strains within the same species.

[Figure 4.3. Panel A: mutation rate (10^-9 generation^-1) against population density (ml^-1). Panel B: mixture model components 1 to 5, same axes.]

Figure 4.3 Variation in Average Mutation rate and DAMP in Escherichia coli strains with population density estimated via direct CFU counts and mutation rate estimated without an estimate of the fitness cost of mutation. A Mutation rates where the fitness cost of the mutation was set to 1 (i.e. no fitness cost of mutation). Variation in slopes significantly differs between strains (N = 829, LR190 = 112.99, P = 5.91 x 10^-5). Different coloured lines and points represent the different ECOR strains. The identity of the ECOR strain, a power function of the logged gross luminescence and the number of mutational events, m, per volume were included as variance components. B Mixture model output of mutation rates for ECOR strains presented in Figure 4.3A. The model grouped the strains together according to their level of DAMP and their average mutation rates. Specifically, these values are as follows (number of strains in component, mutation rate, slope): 1) 26, 2.00 x 10^-9, -0.46 ± 0.06; 2) 6, 7.93 x 10^-9, -0.32 ± 0.07; 3) 4, 2.53 x 10^-9, -0.66 ± 0.14; 4) 7, 0.97 x 10^-9, -0.21 ± 0.14; 5) 19, 3.35 x 10^-9, -0.47 ± 0.05. Population density was estimated via direct CFU counts. Note all axes are logarithmic.

4.4.1.2 Variation in mutation rate and DAMP robust to fitness effects of resistance mutation

Fitness effects from the resistance mutation could potentially affect the relationship of DAMP (Krašovec et al. 2017). To counteract this possibility, the mutation rate was co-estimated with a fitness estimate of the mutation. This results in strong evidence for DAMP’s evolution at this fine evolutionary scale (Figure 4.4A). Between these strains there is significant variation in both the average mutation rate (N = 830, F61, 484 = 23.55, P = 7.03 x 10^-110) and the degree of DAMP exhibited (N = 830, F61, 484 = 1.69, P = 1.46 x 10^-3). The average mutation rate for the wild-type MG1655 strain is 3.73 x 10^-9 (3.52 to 3.94 x 10^-9 CI), with a DAMP slope of -0.74 [-0.40 to -1.09 CI]. This agrees well with both the previous estimate of DAMP in this strain (-0.68 [-0.56 to -0.80 CI] (Krašovec et al. 2017)) and the estimate presented above (-0.67 [-0.38 to -0.96 CI]; Figure 4.2A). These results corroborate the previous assertion that DAMP is impervious to fitness effects of such mutations (Krašovec et al. 2017).

This variation between strains in both their average mutation rate and DAMP can again be further broken down, and the relationships can be grouped into subsets of strains that possess a similar average mutation rate and degree of DAMP. By utilising mixture models, the association between mutation rate and DAMP was teased apart, resulting in five separate components with different average mutation rate and DAMP relationships (Figure 4.4B). First, there are 35 strains that have a low mutation rate (2.36 x 10^-9) with a moderate level of DAMP (slope -0.38 ± 0.06; Figure 4.4B.1). Second, there are 7 strains exhibiting a slightly higher average mutation rate (3.46 x 10^-9) with a strong degree of DAMP (slope -0.42 ± 0.15; Figure 4.4B.2). Third, 4 strains have the highest mutation rate (9.92 x 10^-9) and also exhibit DAMP, but to a lower degree (slope -0.27 ± 0.12; Figure 4.4B.3). The fourth component contains 7 strains that exhibit a low mutation rate with a moderate level of DAMP (mutation rate 1.04 x 10^-9, slope -0.34 ± 0.17; Figure 4.4B.4). The final component has 9 strains with an intermediate average mutation rate but no obvious DAMP (average mutation rate 4.30 x 10^-9, slope -0.09 ± 0.10; Figure 4.4B.5).

[Figure 4.4. Panel A: mutation rate (10^-9 generation^-1) against population density (ml^-1). Panel B: mixture model components 1 to 5, same axes.]

Figure 4.4 Variation of Average Mutation rate and DAMP in Escherichia coli strains with population density estimated via an ATP-based assay and mutation rate co-estimated with the fitness effect of mutation. A Mutation rates and the fitness estimate of mutation were co-estimated in strains of Escherichia coli conferring resistance to the antibiotic rifampicin. Variation in slopes and intercepts significantly differs between strains (see main text for details). Different coloured lines and points represent the different ECOR strains. In addition to the identity of the strains, the estimated number of mutational events and its standard deviation were included in the model as variance power functions to account for heteroscedasticity. B Mixture model output of mutation rates for ECOR strains presented in Figure 4.4A. The model grouped the strains together according to their level of DAMP and their average mutation rates (see main text for details). Note all axes are logarithmic.

As previously found for E. coli, when not using an independent method for estimating population density and mutation rate (instead using CFU counts for both), the results do not qualitatively change (Figure 4.5). The variation between strains in both average mutation rate (N = 830, F61, 484 = 21.85, P = 2.82 x 10^-104) and degree of DAMP (N = 830, F61, 484 = 1.62, P = 3.32 x 10^-3) remains significant (Figure 4.5A). Moreover, this variation can still be split up into five components as before, though with differences in relationships between mutation rate and DAMP estimates (Figure 4.5B). The combination of both of these analyses points to the robustness of DAMP to the fitness effects of mutation.

[Figure 4.5. Panel A: mutation rate (10^-9 generation^-1) against population density (ml^-1). Panel B: mixture model components 1 to 5, same axes.]

Figure 4.5 Variation of Average Mutation rate and DAMP in Escherichia coli strains with population density estimated via direct CFU counts and mutation rate co-estimated with the fitness effect of mutation. A Mutation rates and the fitness estimate of mutation were co-estimated in strains of Escherichia coli conferring resistance to the antibiotic rifampicin. Variation in slopes significantly differs between strains (N = 829, LR190 = 92.88, P = 0.0053). Different coloured lines and points represent the different ECOR strains. In addition to the identity of the strains, the mutation rate and the mutation rate plus 2 standard deviations were included in the model as variance power functions to account for heteroscedasticity. B Mixture model output of mutation rates for ECOR strains presented in Figure 4.5A. The model grouped the strains together according to their level of DAMP and their average mutation rates. Specifically, these values are as follows (number of strains in component, mutation rate, slope ± SE): 1) 12, 4.39 x 10^-9, -0.19 ± 0.07; 2) 5, 9.65 x 10^-9, -0.40 ± 0.08; 3) 4, 2.35 x 10^-9, -0.73 ± 0.09; 4) 13, 1.07 x 10^-9, -0.34 ± 0.12; 5) 28, 2.43 x 10^-9, -0.37 ± 0.06. Note all axes are logarithmic.

4.4.1.3 Fitness effects of resistance mutations

Fitness effects of mutations are known to vary greatly depending upon both the identity of the strain and the environment, and therefore have the potential to affect the mutation rate estimate. It is now possible to estimate the fitness effects of resistance alongside the mutation rate, as conducted above. The fitness effects of such resistance mutations were, as anticipated, mostly deleterious and varied significantly between strains (N = 830, F61, 545 = 2.26, P = 7.57 x 10^-7; Figures 4.6 and 4.7), and indeed with population density (Slope: 0.45 [0.38 to 0.53 CI]; N = 830, F61, 545 = 128.75, P = 6.20 x 10^-27; Figure 4.7), with less dense cultures tending to have greater fitness costs of mutation than high density cultures. Such fitness effects have been predicted to affect the degree of DAMP, with greater deleterious fitness effects causing reduced DAMP, though the differences in DAMP they cause are slight (Krašovec et al. 2017). The positive relationship between the relative fitness of mutants and density possibly suggests that mutation rates at low density are underestimated, meaning the above results of DAMP in the ECOR collection appear robust to the fitness costs associated with the resistance mutation. Interestingly, one strain (ECOR 28) had, on average, a slight, though non-significant, positive fitness effect from the resistance mutation (1.08 ± 0.14 SE; Figure 4.6), and did not exhibit DAMP (Wald test that the slope differs from 0: -0.30 ± 0.17 SE, t484 = -1.73, P = 0.085).

[Figure 4.6. Box plot; y axis: strain of Escherichia coli (ECOR strains, MG1655 and Wound); x axis: relative fitness effect of mutation.]

Figure 4.6 Average fitness effects of the resistance mutation to rifampicin. Fitness effect was co-estimated with the mutation rate. The box plot shows the distribution of the average relative fitness of a resistant mutant compared to the wild-type for each E. coli strain used in this study. Black bars indicate the median and the boxes represent the interquartile range, whilst the whiskers extend to data points no more than 1.5 times the interquartile range from the edge of the box and points indicate any data points falling beyond the whiskers. Note logarithmic axis. Red vertical line represents no fitness effect, negative or positive, of the resistance mutation.

[Figure 4.7. y axis: fitness effect of mutation; x axis: population density (ml^-1).]

Figure 4.7 Association between final population density and fitness of resistance mutation to rifampicin. Fitness effect was co-estimated with the mutation rate. Lines represent the output of the model described in the main text. There was no significant interaction between strain and population density, but there was a significant effect of each term individually (see main text). The model included variance components (heteroscedasticity) of the identity of the strain and a power function of the Nt estimate. Population density was estimated via direct CFU counts. Note logarithmic axes.

4.4.1.4 Variation in mutation rate and DAMP with weighted median estimate of fitness effects

The co-estimation of fitness effects alongside mutation rate, as conducted above in section 4.4.1.2, possibly suffers from a lack of power in estimating these two factors from the same data. To attempt to manage this, the analysis from above was repeated with a weighted median fitness estimate of mutation for each strain, calculated from the fitness estimates previously obtained. This was then included in the mutation rate estimation. The weighted median takes the standard deviation of each fitness estimate into account when estimating the median. When using mutation rates estimated with a weighted median fitness estimate, the average mutation rate again differs significantly between strains (N = 830, F61, 484 = 20.18, P = 1.66 x 10^-98; Figure 4.8), as does the degree of DAMP, though with a considerably lower level of significance (N = 830, F61, 484 = 1.42, P = 0.025; Figure 4.8). This reduction in significance could be due to the fact that the weighted median estimate of fitness cost only represents the average per strain and does not account for the density association, whereas the co-estimation of mutation rate and fitness effect does. Additionally, the fitness estimate comes via the number of mutants in a culture rather than any empirical analysis where such fitness costs are independently estimated. Therefore, potentially the best mutation rate estimates to use are those where either no fitness effect is estimated (Figure 4.2), or where mutation rate and fitness cost are co-estimated (Figure 4.4) as, despite the potential drawback of low power, these estimates can account for subtle differences in fitness effect. This result suggests that the fitness cost associated with the resistance mutation may indeed affect DAMP, but also shows the need for accurate estimates of such fitness effects to account for such potential effects. Nonetheless, the results from these ECOR strains, where mutation rate is estimated either with or without a fitness cost, suggest that DAMP has evolved at a fine evolutionary scale between strains of E. coli.
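The idea of a weighted median can be sketched as follows. The text states only that the standard deviation of each fitness estimate is taken into account; the inverse-variance weighting used here (weight = 1 / SD^2) is an assumption made for illustration, as are the function name `weighted_median` and the example values.

```python
def weighted_median(values, sds):
    """Lower weighted median, weighting each value by 1 / sd**2.

    The inverse-variance weighting is an illustrative assumption,
    not necessarily the scheme used in the analysis.
    """
    weights = [1.0 / (sd ** 2) for sd in sds]
    pairs = sorted(zip(values, weights))  # sort by value
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        # The first value at which half of the total weight is reached
        if cumulative >= half:
            return value


# Hypothetical per-culture fitness estimates for one strain
fitness = [0.3, 0.6, 1.2]
precise_first = weighted_median(fitness, [0.1, 0.4, 0.4])
equal_precision = weighted_median(fitness, [0.2, 0.2, 0.2])
```

With equal standard deviations the result matches the ordinary median, whereas a single precisely estimated culture pulls the weighted median towards its own value.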

[Figure 4.8. y axis: mutation rate (10^-9 generation^-1); x axis: population density (ml^-1).]

Figure 4.8 Average Mutation rate and DAMP in Escherichia coli strains with population density estimated via an ATP-based assay and mutation rate estimated with a weighted median of fitness cost of mutation. Mutation rates were estimated for each strain of Escherichia coli with a weighted median estimate of the fitness effect of the mutation conferring resistance to the antibiotic rifampicin included in the initial mutation rate estimation. Variation in slopes differs significantly between strains (see main text for details), as does the average mutation rate. Different coloured lines and points represent the different ECOR strains. Population density was estimated via an ATP-based assay. Note all axes are logarithmic.

4.4.2 Phylogenetic Analysis of the average mutation rate and DAMP in the ECOR collection

Species tend to display similar values across a range of traits depending on how closely related they are, because they shared the same trait value prior to the point at which they split into separate species. To investigate whether the average mutation rate and DAMP in the E. coli strains tested are similarly more alike in more closely related strains, phylogenetic signal (Pagel’s λ) was estimated, after preliminary testing (see Methods section), via BayesTraits. For the average mutation rate, calculated either with or without a fitness estimate, there is no evidence for a phylogenetic signal (Bayes Factor = -1.40 and -1.88 for with and without fitness estimate respectively; Figure 4.9). For DAMP, when using mutation rates estimated either with or without fitness, there is positive evidence for a phylogenetic signal (λ = 0.35, Bayes Factor = 2.02 for both; Figure 4.10). BayesTraits does not allow for the inclusion of measurement error but does allow for more complex evolutionary models that can ascertain correlation between traits. The potential correlation between the average mutation rate and degree of DAMP was therefore tested and phylogenetically corrected through analysis in BayesTraits. There is only marginal evidence for a positive correlation between average mutation rate and degree of DAMP, such that strains with higher mutation rates have a lower degree of DAMP, i.e. a flatter slope (mutation rates co-estimated with fitness cost: correlation = 0.12, Bayes Factor = 0.86; mutation rates not co-estimated with fitness cost: correlation = 0.10, Bayes Factor = 1.59; Figure 4.11). This correlation does not differ when only conducted in natural isolates of E. coli (removing the typical lab strain MG1655), for mutation rates estimated either with or without a fitness effect. These results suggest that DAMP and mutation rate have evolved between closely related organisms and may even themselves be interconnected, with strains possessing a lower average mutation rate exhibiting a greater degree of DAMP.
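The interpretation of the log Bayes Factors reported above can be made explicit with a short sketch. The thresholds follow the scale given in the Methods (below 2 weak, over 2 positive, over 5 strong). The form of the log Bayes Factor, twice the difference in log marginal likelihoods, is stated here as an assumption about the BayesTraits convention, and the function names and the handling of values exactly equal to 2 or 5 are illustrative choices.

```python
def log_bayes_factor(log_ml_complex, log_ml_simple):
    """Log Bayes Factor as twice the difference in log marginal likelihoods.

    Assumed convention; the marginal likelihoods themselves are not
    reported in the text, so this function is shown for context only.
    """
    return 2.0 * (log_ml_complex - log_ml_simple)


def interpret(log_bf):
    """Map a log Bayes Factor onto the evidence scale used in this chapter."""
    if log_bf <= 0:
        return "no evidence"
    if log_bf < 2:
        return "weak evidence"
    if log_bf <= 5:
        return "positive evidence"
    return "strong evidence"


# Values reported in the text
reported = {
    "signal, mutation rate (with fitness)": -1.40,
    "signal, mutation rate (without fitness)": -1.88,
    "signal, DAMP": 2.02,
    "correlation (with fitness)": 0.86,
    "correlation (without fitness)": 1.59,
}
summary = {name: interpret(value) for name, value in reported.items()}
```

On this scale the phylogenetic signal for DAMP sits only just above the threshold for positive evidence, and both correlation estimates fall in the weak range, consistent with the cautious wording used in the text.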

[Figure 4.9. Panels A and B: phylogeny of the ECOR strains, MG1655 and Wound. Colour scale: average mutation rate, 1.02 to 12.27 (A) and 1.11 to 13.5 (B). Scale bar: length = 0.1.]

Figure 4.9 Average mutation rate plotted onto the phylogeny of Escherichia coli strains analysed for phylogenetic signal. Mutation rates are plotted on the phylogeny with lower average mutation rates being coloured blue and higher mutation rates coloured red. Length in both legends corresponds to the branch lengths in the tree. A Mutation rates were estimated without assuming a fitness effect of the resistance mutation. B Mutation rates were co-estimated alongside the fitness cost of mutation. Average mutation rate is the log2-corrected intercept (i.e. where 2 is raised to the power of the intercept estimate from the model) for each strain from the model presented in Figures 4.2A and 4.4A. The colouration and therefore the ancestral reconstruction comes from the Geiger package and not the BayesTraits analysis.

[Figure. Panels A and B: phylogeny of the ECOR strains, MG1655 and Wound. Colour scale: DAMP, -2.09 to 0.2 (A) and -1.98 to 0.05 (B). Scale bar: length = 0.1.]

Figure 4.10 Degree of DAMP plotted onto the phylogeny of Escherichia coli strains analysed for phylogenetic signal. The degree of DAMP for each strain is plotted on the phylogeny with greater degrees (i.e. greater negative slope) being coloured blue and low degrees (i.e. flatter slopes) coloured red. Length in both legends corresponds to the branch lengths in the tree. A Mutation rates were estimated without assuming a fitness cost of the resistance mutation. B Mutation rates were estimated alongside the fitness cost of mutation. DAMP slopes were estimated in relation to population density estimated using an ATP-based assay in both plots. The colouration and therefore the ancestral reconstruction come from the Geiger package and not the BayesTraits analysis.

[Figure 4.11. Panels A and B. y axis: degree of DAMP; x axis: average mutation rate (10^-9 generation^-1).]

Figure 4.11 Correlation between the average mutation rate and degree of DAMP for strains of Escherichia coli. A Mutation rate was estimated without co-estimating the fitness effect of the resistance mutation, whereas B has mutation rate estimated alongside a fitness cost of that mutation. DAMP was estimated by the association of mutation rate with population density estimated via an ATP-based assay for both plots. Mutation rate is the log-corrected intercept for each strain from the model presented in Figures 4.2 and 4.4 for A and B respectively, with DAMP slope being the corrected interaction of mutation rate and calibrated population density from the same model. The blue line represents the mean correlation estimate derived from BayesTraits analysis (see main text). The intercept is the phylogenetically corrected mean estimate of DAMP (-0.75 and -55 for A and B respectively). Note that the x axis is logarithmic.

4.4.3 Interplay between SIM and DAMP in two isolates of E. coli

The observation of DAMP adds to the previously identified type of environmental mutation rate plasticity, stress-induced mutagenesis (SIM) (Williams and Foster 2012). Whilst DAMP does not obviously associate with the mechanisms involved in SIM (Krašovec et al. 2014), it does interact with it (Krašovec et al. 2018). Specifically, across a nutrient gradient these two forces interact to produce a mutation rate minimum at intermediate levels (approximately 7 x 10^8 cells/ml). This work identifies a close and subtle relationship between mutation rate and ecological factors, although the variation in responses has yet to be tested. As can be seen from the ECOR collection, there is significant variation in the degree of DAMP, and so it could be expected that such a relationship between SIM and DAMP would also vary. This variation in the interaction between SIM and DAMP is strongly suggested by some preliminary results from two strains of the ECOR collection (ECOR 1 and 8). These strains were chosen as they are both closely related to E. coli MG1655, the strain where this interaction was investigated (Krašovec et al. 2018), and they exhibited different degrees of DAMP (ECOR 1 slope = -0.08 ± 0.21 SE, ECOR 8 slope = -0.82 ± 0.40 SE). In these strains, there is no obvious increase in mutation rate with increasing nutrient levels and there is a significant non-linear interaction between mutation rate and strain (LR11,12 = 6.03, P = 0.014; Figure 4.12). Although the nutrient levels used here do not cover the full nutrient level distribution used in the published study identifying the DAMP/SIM interaction, they do cover the point where a mutation rate minimum occurs in wild-type MG1655 E. coli. Therefore it seems that, just as with DAMP, SIM and the interaction between the two have evolved between strains of E. coli.

[Figure 4.12. y axis: mutation rate (10^-9 generation^-1); x axis: LB concentration (%).]

Figure 4.12 Interaction between DAMP and SIM across a nutrient gradient in two isolates from the ECOR collection. Mutation rates in two E. coli isolates from the ECOR collection (ECOR 1, red; ECOR 8, green) across a nutrient gradient (LB concentration). ECOR 1 does not exhibit DAMP in minimal media, whilst ECOR 8 does (see main text). The identity of the strain and power functions of the fitted values of the model and the mutation rate plus 2 standard deviations were included to account for heteroscedasticity. Note both axes are logarithmic.

4.5 Discussion

Evolution of mutation rates at fine scales has been studied extensively, revealing large variations in mutation rates within species (Sniegowski et al. 1997; Oliver 2000; Kohlmann et al. 2018). On the other hand, DAMP’s evolution at fine scales has been seen between two closely related species of bacteria (E. coli and Pseudomonas aeruginosa), and three strains of Saccharomyces cerevisiae (Krašovec et al. 2017). Here DAMP’s evolution at such a fine scale has been further extended by uncovering its relationship in natural isolates of the bacterium E. coli (Figure 4.2): both DAMP and the average mutation rate differ significantly between these bacterial strains, although detecting such a difference requires accurate estimation of the fitness effects of mutation. Moreover, there is slight evidence that DAMP is constrained by phylogenetic inertia within these strains, so that more closely related strains exhibit similar levels of DAMP. This level of signal indicates that evolution has been occurring closer to the tips of the tree than the root. There is also some weak evidence that the average mutation rate and degree of DAMP may be positively associated in these strains, highlighting a potential evolutionary explanation for the evolution of DAMP.

Incorporating a fitness estimate into the estimate of mutation rate changes the estimated values of average mutation rate and DAMP coming out of the model; however, the overall result remains the same. This occurs despite there potentially being a density-dependent fitness effect of the resistance mutation. This differs from the empirical estimate of the fitness effect previously derived from this system, though in a single different strain of E. coli (Krašovec et al. 2017), where there was no difference in the fitness effect of the resistance mutation between high and low population density cultures. The fitness estimate from the flan package in R (Mazoyer et al. 2017) comes from a mathematical understanding of the principles of the fluctuation test. Specifically, due to exponential growth, most mutations occur late in the growth cycle, and the fitness estimate is determined via the number of mutants observed on the selective plate. If there are no so-called ‘jackpots’, that is cultures where there are a lot of mutants, then the relative fitness of normal cells to mutant cells will be high. This will decrease with increasing numbers of mutant colonies on the selective media. Due to the discrepancy with the previous estimate of fitness effects in E. coli between high and low density cultures, until the fitness estimates from the flan package can be quantitatively verified, the use of co-estimated fitness effects must be treated with caution.
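The statement that most mutations occur late in the growth cycle can be illustrated with a deliberately idealised calculation. The sketch assumes synchronous doubling from a single cell and a constant mutation probability per division, so that the expected share of mutations in any period equals the share of cell divisions in that period. These simplifications and the function name are assumptions for illustration and are not part of the flan model.

```python
def fraction_of_divisions_in_last(k, generations):
    """Share of all cell divisions that fall in the final k generations.

    Assumes synchronous doubling from one cell: generation g contains
    2**(g - 1) divisions, giving 2**generations - 1 divisions in total.
    """
    total = 2 ** generations - 1
    late = 2 ** generations - 2 ** (generations - k)
    return late / total


# Roughly 10^9 cells are reached after 30 doublings from a single cell
last_one = fraction_of_divisions_in_last(1, 30)
last_three = fraction_of_divisions_in_last(3, 30)
```

Under these assumptions about half of all divisions, and hence about half of the expected mutations, fall in the final generation alone, and each such mutation leaves very few mutant descendants. The rare mutation arising early instead leaves many descendants, producing a jackpot culture.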

Through the investigation of closely related isolates it was possible to determine whether there was any phylogenetic signal in the evolution of either the average mutation rate or the degree of DAMP between strains. Intriguingly, whilst there is no evidence of phylogenetic signal for the trait investigated in these strains (the mutation rate), there is phylogenetic signal for the plasticity of this trait (DAMP). The low level of phylogenetic signal suggests that evolution is occurring more at the tips of the tree than towards the root, with the phylogenetic tree explaining only a small proportion of DAMP’s variation between strains. This suggests that DAMP may be a fast-evolving, labile trait among these closely related strains. Previous studies have tended to relate the level of phylogenetic signal to the speed at which a trait evolves. Specifically, lower phylogenetic signal has been proposed to signify a highly labile trait (Blomberg et al. 2003), as the values of such a trait differ more between closely related species than would be expected from the tree topology. A high phylogenetic signal, on the other hand, potentially indicates evolutionary conservatism, with the trait evolving through neutral evolution or genetic drift (Swenson and Enquist 2007) and the tree’s topology explaining much of the variation in the trait. This assertion is potentially problematic, however (Revell et al. 2008), as several different processes may give the same level of phylogenetic signal, meaning such assumptions should be treated with care. Additionally, fluctuation tests are inherently noisy and the analysis does not allow for the inclusion of measurement error, which may also contribute to the low phylogenetic signal seen here. As such, conclusions regarding the rate and process of DAMP’s evolution are suggestive rather than confirmed. Future work could investigate whether the tempo of evolution varies through the tree and whether this tempo is dependent upon the degree of plasticity exhibited by the strain.
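The idea that a tree can explain much or little of a trait's variation can be made concrete with one common measure of phylogenetic signal, Pagel's λ. The sketch below is purely illustrative: the statistic actually used for the analysis is defined in the Methods rather than here, and the three-tip covariance matrix is hypothetical. λ scales the shared-history (off-diagonal) entries of the phylogenetic variance-covariance matrix, so λ = 1 keeps the tree as it is and λ = 0 treats all tips as independent.

```python
def pagel_lambda_transform(vcv, lam):
    """Return a copy of a phylogenetic variance-covariance matrix in which
    the off-diagonal entries (shared branch length) are multiplied by lam.

    lam = 1 leaves the matrix unchanged (trait covariance follows the tree);
    lam = 0 removes all covariance between tips (no phylogenetic signal).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("this sketch expects lam in [0, 1]")
    n = len(vcv)
    return [
        [vcv[i][j] if i == j else lam * vcv[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical tree: tips A and B share 0.8 of their history, C is an outgroup.
EXAMPLE_VCV = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

if __name__ == "__main__":
    for lam in (1.0, 0.25, 0.0):
        print(lam, pagel_lambda_transform(EXAMPLE_VCV, lam))
```

A low fitted λ corresponds to the situation described above, in which close relatives resemble each other less than the topology alone would predict.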

Such mutation rate plasticity has been predominantly thought about in terms of SIM (Foster 2005), which has recently been found to interact with DAMP along a nutrient gradient (Krasovec et al. 2018). Exploring this interaction in two strains closely related to MG1655, the strain in which it was originally observed, failed to reveal it. The degree of DAMP differed between the two strains and, whilst their DAMP profiles were still evident (ECOR 1 showed no decrease in mutation rate, whereas ECOR 8 did), there was no increase due to SIM, despite the nutrient gradient including the point where the previous increase due to SIM was seen (Krasovec et al. 2018). This could be due to several reasons. First, the interaction point may vary between strains, so that the nutrient gradient did not cover the whole range needed. This is possible; however, the nutrient gradient used did show a levelling out of the DAMP slope in ECOR 8, meaning that there must have been some initiation of the SIM response in this strain within this nutrient gradient for this levelling off to occur. Secondly, the increase in mutation rate due to SIM is down to the actions of the error-prone polymerases Pol IV and Pol V (dinB and umuC respectively). A lack of these genes would therefore result in no increase due to SIM. This second explanation seems unlikely for two reasons. First, both strains are closely related to MG1655, so the likelihood of them lacking these genes is very low; moreover, these strains have now been sequenced and both contain these two genes, identical to those of MG1655. Secondly, when knockouts of these two genes were used in MG1655, which shows DAMP, there was a negative association throughout the nutrient gradient, unlike in ECOR 8, where the slope flattens out. It is intriguing that these two strains, one possessing DAMP and one not, show a similar level of response to the increasing nutrient gradient, and thus to SIM. Further experiments should extend the nutrient gradient to determine whether the levelling out in ECOR 8 is its ‘maximal’ response to stress or whether the SIM interaction occurs, though at a later point than in MG1655. It would also be interesting to determine whether the increase due to SIM in ECOR 1 matches that of strains with DAMP, or whether the lack of DAMP indicates low environmental genetic plasticity in general. Nevertheless, strains of a species differ in this interaction between mutation rate plasticity traits, which could alter their evolutionary adaptability. This remains highly speculative, though, and requires further experimentation to determine the scale and variation of the interactions between SIM and DAMP in different natural isolates. It would then be possible to test the phylogenetic relatedness of this variable interaction, identify the relative mechanistic control and explore the potential of this behaviour.

Variable mutation rates need not have evolved for adaptive reasons and could instead be due to differing efficiencies of selection upon the genetic mechanisms involved (MacLean et al. 2013). Such heterogeneous mutation rates could, however, confer an adaptive advantage on an organism by increasing the chance of adaptive mutations (Alexander et al. 2017) and maintaining an organism’s position on a fitness peak (Belavkin et al. 2016). If DAMP had evolved for adaptive reasons, we might expect organisms with a low average mutation rate to possess greater degrees of DAMP, as it is these organisms that need greater plasticity in mutation rate to produce the novelty required for adaptation. From the analysis of the ECOR collection, there seems to be slight support for this relationship between mutation rate and DAMP. In a phylogenetically controlled correlation analysis there is a slight positive correlation, where strains exhibiting greater degrees of DAMP have a lower overall average mutation rate (Figure 4.11). Whether this relationship between low mutation rate and high DAMP plays such an adaptive role is yet to be tested empirically through experimental evolution (Kawecki et al. 2012). Nonetheless, the fact that such a relationship between mutation rate and DAMP is evident, and the fact that DAMP has evolved between closely related organisms, are suggestive of such a role.

The significant variation between strains of the same species, E. coli (Figures 4.2 – 4.5), shows that DAMP has evolved not only at large evolutionary scales, between domains of life (Chapters 2 and 3), but also at the finest scales, between strains of the same species. Additionally, there is evidence that more closely related organisms possess similar degrees of DAMP (Figure 4.10), although DAMP varies more than would be expected from the phylogenetic tree. This is suggestive of a highly labile trait that is changing quickly between closely related individuals, though this remains speculative. Furthermore, DAMP and mutation rate are interconnected, with strains that have a lower average mutation rate exhibiting a greater degree of DAMP (Figure 4.11), suggesting that DAMP evolved to increase adaptation in organisms that tend to exhibit a low rate of novelty production.

4.6 References

Alexander, H. K., S. I. Mayer, and S. Bonhoeffer. 2017. Population Heterogeneity in

Mutation Rate Increases the Frequency of Higher-Order Mutants and Reduces

Long-Term Mutational Load. Mol. Biol. Evol. 34:419–436.

Alikhan, N. F., Z. Zhou, M. J. Sergeant, and M. Achtman. 2018. A genomic overview of

the population structure of Salmonella. PLoS Genet. 14:e1007261.

Belavkin, R. V., A. Channon, E. Aston, J. Aston, R. Krašovec, and C. G. Knight. 2016.

Monotonicity of fitness landscapes and mutation rate control. J. Math. Biol.

73:1491–1524.

Blomberg, S. P., T. Garland, and A. R. Ives. 2003. Testing for phylogenetic signal

in comparative data: behavioral traits are more labile. Evolution 57:717–745.

Brooks, S. P., and A. Gelman. 1998. General Methods for Monitoring Convergence of

Iterative Simulations. J. Comput. Graph. Stat. 7:434–455.

Foster, P. 2007. Stress-induced mutagenesis in bacteria. Crit. Rev. Biochem. Mol. Biol.

42:373–397.

Foster, P. L. 2005. Stress responses and genetic variation in bacteria. Mutat. Res.

569:3–11.

Galardini, M., A. Koumoutsi, L. Herrera-Dominguez, J. A. C. Varela, A. Telzerow, O.

Wagih, M. Wartel, O. Clermont, E. Denamur, A. Typas, and P. Beltrao. 2017.

Phenotype inference in an Escherichia coli strain panel. Elife 6:e31035.

Galhardo, R. S., P. J. Hastings, and S. M. Rosenberg. 2007. Mutation as a stress

response and the regulation of evolvability.

Gelman, A., and D. B. Rubin. 1992. Inference from Iterative Simulation Using Multiple

Sequences. Stat. Sci. 7:457–472.

Harmon, L. J., J. T. Weir, C. D. Brock, R. E. Glor, and W. Challenger. 2008. GEIGER:

investigating evolutionary . 24:129–131.

Kawecki, T. J., R. E. Lenski, D. Ebert, B. Hollis, I. Olivieri, and M. C. Whitlock. 2012.

Experimental evolution. Elsevier.

Kohlmann, R., T. Bähr, and S. G. Gatermann. 2018. Species-specific mutation rates for

ampC derepression in Enterobacterales with chromosomally encoded inducible

AmpC β-lactamase. J. Antimicrob. Chemother. 73:1530–1536.

Krašovec, R., R. V Belavkin, J. A. D. Aston, A. Channon, E. Aston, B. M. Rash, M.

Kadirvel, S. Forbes, and C. G. Knight. 2014. Mutation rate plasticity in rifampicin

resistance depends on Escherichia coli cell-cell interactions. Nat. Commun.

5:3742.

Krasovec, R., H. Richards, D. R. Gifford, R. V Belavkin, A. Channon, E. Aston, A. J.

McBain, and C. G. Knight. 2018. Opposing effects of population density and stress

on Escherichia coli mutation rate. ISME J. In press.

Krašovec, R., H. Richards, D. R. Gifford, C. Hatcher, K. J. Faulkner, R. V. Belavkin, A.

Channon, E. Aston, A. J. McBain, and C. G. Knight. 2017. Spontaneous mutation

rate is a plastic trait associated with population density across domains of life.

PLoS Biol. 15:e2002731.

Leisch, F. 2004. FlexMix: A General Framework for Finite Mixture Models and Latent

Class Regression in R. J. Stat. Softw. 11:1–18.

Lenski, R. E., M. R. Rose, S. C. Simpson, and S. C. Tadler. 1991. Long-Term Experimental

Evolution in Escherichia coli. I. Adaptation and Divergence During. Am. Nat.

138:1315–1341.

Lynch, M., M. S. Ackerman, J.-F. Gout, H. Long, W. Sung, W. K. Thomas, and P. L. Foster.

2016. Genetic drift, selection and the evolution of the mutation rate. Nat. Rev.

Genet. 17:704–714.

MacLean, R. C., C. Torres-Barceló, and R. Moxon. 2013. Evaluating evolutionary models

of stress-induced mutagenesis in bacteria. Nat. Rev. Genet. 14:221–227.

Matic, I. 2016. Molecular mechanisms involved in the regulation of mutation rates in

bacteria.

Mazoyer, A., R. Drouilhet, S. Despréaux, and B. Ycart. 2017. flan: An R Package for

Inference on Mutation Models. R J. 9:334–351.

Michaels, M. L., and J. H. Miller. 1992. The GO System Protects Organisms from the

Mutagenic Effect of the Spontaneous Lesion 8-Hydroxyguanine (7,8-Dihydro-8-

Oxoguanine). J. Bacteriol. 174:6321–6325.

Ochman, H., and R. K. Selander. 1984. Standard Reference Strains of Escherichia coli

from Natural Populations. J. Bacteriol. 157:690–693.

Oliver, A. 2000. High Frequency of Hypermutable Pseudomonas aeruginosa in Cystic

Fibrosis Lung Infection. Science 288:1251–1253.

Pagel, M., and A. Meade. 2016. BayesTraits v3 Manual.

Pinheiro, J., D. Bates, S. DebRoy, D. Sarkar, and R. C. Team. 2018. nlme: Linear and

Nonlinear Mixed Effects Models. R Packag. version 3.1-137.

Revell, L. J. 2012. phytools: An R package for phylogenetic comparative biology (and

other things). Methods Ecol. Evol. 3:217–223.

Revell, L. J., L. J. Harmon, and D. C. Collar. 2008. Phylogenetic Signal, Evolutionary

Process, and Rate. Syst. Biol. 57:591–601.

Saint-Ruf, C., and I. Matic. 2006. Environmental tuning of mutation rates. Environ.

Microbiol. 8:193–9.

Sniegowski, P., P. Gerrish, and R. Lenski. 1997. Evolution of high mutation rates in

experimental populations of E. coli. Nature 387:703–705.

Sung, W., M. S. Ackerman, M. M. Dillon, T. G. Platt, C. Fuqua, V. S. Cooper, and M.

Lynch. 2016. Evolution of the Insertion-Deletion Mutation Rate Across the Tree of

Life. G3 Genes|Genomes|Genetics 6:2583–2591.

Sung, W., M. S. Ackerman, S. F. Miller, T. G. Doak, and M. Lynch. 2012. Drift-barrier

hypothesis and mutation-rate evolution. Proc. Natl. Acad. Sci. 109:18488–18492.

Swenson, N. G., and B. J. Enquist. 2007. Ecological and evolutionary determinants of a

key plant functional trait: Wood density and its -wide variation across

latitude and elevation. Am. J. Bot. 94:451–459.

Swings, T., B. van Den Bergh, S. Wuyts, E. Oeyen, K. Voordeckers, K. J. Verstrepen, M.

Fauvart, N. Verstraeten, and J. Michiels. 2017. Adaptive tuning of mutation rates

allows fast response to lethal stress in Escherichia coli. Elife 6:e22939.

Treangen, T. J., B. D. Ondov, S. Koren, and A. M. Phillippy. 2014. The Harvest suite for

rapid core-genome alignment and visualization of thousands of intraspecific

microbial genomes. Genome Biol. 15:524.

Williams, A. B., and P. L. Foster. 2012. Stress-Induced Mutagenesis. Ecosal Plus 4.

4.7 Appendix

4.7.1 Statistical Model outputs

4.7.1.1 Model output for Figure 4.2A

log2 Mutation Rate

Predictors Estimates CI p

(Intercept) 3.62 3.45 – 3.79 <0.001

CalCent -0.67 -0.96 – -0.38 <0.001

1 -0.98 -1.35 – -0.61 <0.001

10 -2.12 -2.40 – -1.83 <0.001

11 -1.89 -2.24 – -1.54 <0.001

12 -2.41 -2.70 – -2.12 <0.001

13 -0.74 -1.04 – -0.45 <0.001

14 -1.43 -1.81 – -1.04 <0.001

15 -1.67 -1.96 – -1.39 <0.001

16 -0.80 -1.11 – -0.50 <0.001

17 -1.42 -1.76 – -1.08 <0.001

18 -1.03 -1.35 – -0.71 <0.001

19 -0.86 -1.12 – -0.60 <0.001

2 -2.55 -2.89 – -2.21 <0.001

21 -3.02 -3.38 – -2.66 <0.001

22 -2.34 -2.59 – -2.08 <0.001

25 -2.67 -3.03 – -2.31 <0.001

26 -1.44 -1.93 – -0.95 <0.001

27 -2.45 -2.86 – -2.03 <0.001

28 -1.50 -1.79 – -1.21 <0.001

3 -1.62 -1.95 – -1.28 <0.001

30 -2.36 -2.76 – -1.96 <0.001

31 -1.69 -2.06 – -1.33 <0.001

32 -1.84 -2.19 – -1.48 <0.001

33 -1.41 -1.81 – -1.01 <0.001

34 -2.11 -2.54 – -1.68 <0.001

35 -2.48 -2.81 – -2.16 <0.001

36 -1.74 -2.10 – -1.37 <0.001

37 -2.16 -2.47 – -1.85 <0.001

38 -2.91 -3.34 – -2.48 <0.001

39 -2.68 -3.10 – -2.27 <0.001

4 -1.12 -1.54 – -0.70 <0.001

41 -1.90 -2.24 – -1.56 <0.001

42 -2.29 -2.57 – -2.01 <0.001

43 -1.49 -1.85 – -1.13 <0.001

44 -2.17 -2.60 – -1.74 <0.001

45 -1.64 -1.99 – -1.28 <0.001

46 -1.36 -1.71 – -1.01 <0.001

47 -2.59 -2.90 – -2.28 <0.001

48 -1.99 -2.64 – -1.33 <0.001

5 -1.65 -1.97 – -1.32 <0.001

50 -2.12 -2.49 – -1.76 <0.001

51 -0.39 -1.38 – 0.59 0.468

54 -2.31 -2.69 – -1.92 <0.001

57 -1.49 -2.12 – -0.85 <0.001

58 -1.90 -2.26 – -1.54 <0.001

59 -1.52 -1.92 – -1.12 <0.001

6 -1.75 -2.24 – -1.26 <0.001

60 -3.00 -3.37 – -2.64 <0.001

63 -2.37 -2.87 – -1.87 <0.001

64 -3.59 -4.11 – -3.07 <0.001

65 -1.98 -2.57 – -1.39 <0.001

66 -1.59 -1.92 – -1.27 <0.001

67 -3.04 -3.53 – -2.55 <0.001

68 -2.40 -2.83 – -1.97 <0.001

69 -2.08 -2.40 – -1.75 <0.001

7 -1.97 -2.57 – -1.37 <0.001

70 -2.48 -2.85 – -2.11 <0.001

71 -2.29 -2.65 – -1.93 <0.001

72 -2.47 -2.84 – -2.10 <0.001

8 -2.55 -3.03 – -2.06 <0.001

9 -2.21 -2.49 – -1.93 <0.001

Wound -2.95 -3.38 – -2.52 <0.001

CalCent:Strain1 0.55 0.06 – 1.04 0.041

CalCent:Strain10 -0.05 -0.49 – 0.39 0.831

CalCent:Strain11 0.60 0.10 – 1.09 0.029

CalCent:Strain12 0.32 -0.14 – 0.77 0.209

CalCent:Strain13 0.13 -0.32 – 0.57 0.605

CalCent:Strain14 0.30 -0.24 – 0.83 0.312

CalCent:Strain15 0.28 -0.14 – 0.69 0.227

CalCent:Strain16 0.52 0.11 – 0.93 0.021

CalCent:Strain17 -0.02 -0.56 – 0.53 0.960

CalCent:Strain18 0.19 -0.22 – 0.59 0.409

CalCent:Strain19 0.26 -0.10 – 0.63 0.190

CalCent:Strain2 -0.18 -0.67 – 0.32 0.518

CalCent:Strain21 0.12 -0.37 – 0.60 0.670

CalCent:Strain22 0.78 0.37 – 1.19 0.001

CalCent:Strain25 0.13 -0.30 – 0.57 0.574

CalCent:Strain26 -0.03 -0.68 – 0.61 0.925

CalCent:Strain27 -0.30 -1.05 – 0.45 0.467

CalCent:Strain28 0.36 -0.06 – 0.78 0.125

CalCent:Strain3 0.20 -0.29 – 0.70 0.461

CalCent:Strain30 0.14 -0.89 – 1.16 0.811

CalCent:Strain31 0.35 -0.20 – 0.90 0.245

CalCent:Strain32 0.18 -0.34 – 0.70 0.527

CalCent:Strain33 -0.30 -0.91 – 0.30 0.361

CalCent:Strain34 -0.71 -1.56 – 0.13 0.125

CalCent:Strain35 0.20 -0.33 – 0.72 0.497

CalCent:Strain36 -0.15 -0.85 – 0.56 0.704

CalCent:Strain37 -0.71 -1.28 – -0.14 0.025

CalCent:Strain38 -0.12 -0.96 – 0.72 0.795

CalCent:Strain39 0.12 -0.43 – 0.67 0.699

CalCent:Strain4 0.87 0.40 – 1.34 0.001

CalCent:Strain41 -0.02 -0.56 – 0.52 0.951

CalCent:Strain42 0.11 -0.31 – 0.53 0.638

CalCent:Strain43 -0.10 -0.85 – 0.64 0.801

CalCent:Strain44 0.58 0.07 – 1.08 0.038

CalCent:Strain45 0.45 0.00 – 0.89 0.070

CalCent:Strain46 -0.26 -0.91 – 0.40 0.482

CalCent:Strain47 0.27 -0.20 – 0.74 0.295

CalCent:Strain48 -1.42 -1.98 – -0.85 <0.001

CalCent:Strain5 0.47 -0.03 – 0.97 0.089

CalCent:Strain50 0.37 -0.14 – 0.87 0.191

CalCent:Strain51 0.79 0.39 – 1.18 <0.001

CalCent:Strain54 0.27 -0.53 – 1.06 0.545

CalCent:Strain57 0.76 -0.15 – 1.67 0.131

CalCent:Strain58 -0.49 -1.33 – 0.34 0.286

CalCent:Strain59 -0.01 -0.93 – 0.91 0.984

CalCent:Strain6 0.10 -0.51 – 0.71 0.771

CalCent:Strain60 -0.32 -1.01 – 0.37 0.396

CalCent:Strain63 -0.15 -0.85 – 0.55 0.699

CalCent:Strain64 -0.52 -1.52 – 0.49 0.353

CalCent:Strain65 -0.31 -1.40 – 0.78 0.604

CalCent:Strain66 -0.51 -1.09 – 0.07 0.111

CalCent:Strain67 0.41 -0.22 – 1.05 0.240

CalCent:Strain68 -0.23 -0.91 – 0.44 0.535

CalCent:Strain69 -0.16 -0.63 – 0.31 0.535

CalCent:Strain7 0.17 -0.55 – 0.90 0.665

CalCent:Strain70 -0.48 -1.27 – 0.31 0.270

CalCent:Strain71 0.03 -0.56 – 0.62 0.922

CalCent:Strain72 0.36 -0.19 – 0.91 0.238

CalCent:Strain8 -0.16 -0.84 – 0.51 0.659

CalCent:Strain9 0.13 -0.22 – 0.49 0.490

CalCent:StrainWound 0.46 -0.06 – 0.98 0.111

Observations 830

R2 / Omega-squared 0.621 / 0.619

Mutation Rate was estimated without co-estimating the fitness effect of the mutation rate. Population density was calibrated using an ATP-based assay and then mean centred for the analysis (CalCent). The model was run using MG1655 as a reference.
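As a reading aid for the table above, the coefficients can be combined into strain-specific values. The sketch below assumes standard treatment (dummy) coding with MG1655 as the reference level, so that the CalCent coefficient is the MG1655 slope, each bare strain number is that strain's offset at mean density, and each CalCent:Strain term is the difference in slope from MG1655. The coefficient values are copied from the table for ECOR strains 1 and 48; the function names are illustrative and not taken from the analysis code.

```python
# Coefficients copied from the model output for Figure 4.2A (reference: MG1655).
INTERCEPT = 3.62          # log2 mutation rate of MG1655 at mean density
REFERENCE_SLOPE = -0.67   # CalCent: slope for MG1655

STRAIN_OFFSET = {"1": -0.98, "48": -1.99}       # rows labelled "1" and "48"
SLOPE_INTERACTION = {"1": 0.55, "48": -1.42}    # CalCent:Strain1, CalCent:Strain48


def strain_slope(strain: str) -> float:
    """Slope of log2 mutation rate against centred density for one strain."""
    return REFERENCE_SLOPE + SLOPE_INTERACTION[strain]


def strain_level_at_mean_density(strain: str) -> float:
    """Fitted log2 mutation rate for one strain at mean density (CalCent = 0)."""
    return INTERCEPT + STRAIN_OFFSET[strain]


if __name__ == "__main__":
    for s in ("1", "48"):
        print(s, round(strain_slope(s), 2), round(strain_level_at_mean_density(s), 2))
```

Read this way, a positive interaction term indicates a shallower negative slope than MG1655 (less DAMP), and a negative term indicates a steeper one.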

4.7.1.2 Model output for Figure 4.3A

log2 Mutation Rate

Predictors Estimates CI p

(Intercept) 3.35 3.15 – 3.54 <0.001

Dcent -0.44 -0.64 – -0.23 <0.001

1 -0.85 -1.19 – -0.52 <0.001

10 -1.81 -2.11 – -1.51 <0.001

11 -1.59 -1.93 – -1.26 <0.001

12 -2.06 -2.35 – -1.77 <0.001

13 -0.59 -0.88 – -0.29 <0.001

14 -1.33 -1.74 – -0.92 <0.001

15 -1.40 -1.72 – -1.08 <0.001

16 -0.66 -1.02 – -0.30 <0.001

17 -1.29 -1.59 – -0.99 <0.001

18 -0.81 -1.23 – -0.38 <0.001

19 -0.54 -0.91 – -0.18 0.003

2 -2.24 -2.60 – -1.89 <0.001

21 -2.63 -2.93 – -2.33 <0.001

22 -1.95 -2.22 – -1.68 <0.001

25 -2.37 -2.68 – -2.05 <0.001

26 -1.52 -1.95 – -1.10 <0.001

27 -2.18 -2.51 – -1.84 <0.001

28 -1.28 -1.57 – -0.99 <0.001

3 -1.40 -1.76 – -1.05 <0.001

30 -2.00 -2.29 – -1.71 <0.001

31 -1.53 -1.91 – -1.15 <0.001

32 -1.60 -1.89 – -1.31 <0.001

33 -1.39 -1.78 – -1.00 <0.001

34 -2.20 -2.59 – -1.81 <0.001

35 -2.14 -2.51 – -1.78 <0.001

36 -1.58 -1.98 – -1.18 <0.001

37 -1.87 -2.21 – -1.53 <0.001

38 -2.54 -2.99 – -2.10 <0.001

39 -2.20 -2.56 – -1.83 <0.001

4 -1.10 -1.44 – -0.75 <0.001

41 -1.74 -2.08 – -1.40 <0.001

42 -1.95 -2.22 – -1.68 <0.001

43 -1.64 -2.07 – -1.20 <0.001

44 -1.81 -2.19 – -1.42 <0.001

45 -1.30 -1.60 – -1.00 <0.001

46 -1.24 -1.57 – -0.91 <0.001

47 -2.31 -2.59 – -2.03 <0.001

48 -1.88 -2.46 – -1.30 <0.001

5 -1.42 -1.77 – -1.07 <0.001

50 -1.74 -2.08 – -1.40 <0.001

51 -2.57 -3.17 – -1.98 <0.001

54 -2.06 -2.42 – -1.69 <0.001

57 -2.02 -2.44 – -1.59 <0.001

58 -1.69 -2.02 – -1.35 <0.001

59 -1.55 -1.98 – -1.12 <0.001

6 -1.59 -2.05 – -1.14 <0.001

60 -2.64 -2.99 – -2.30 <0.001

63 -2.23 -2.62 – -1.85 <0.001

64 -3.46 -3.79 – -3.13 <0.001

65 -1.95 -2.37 – -1.53 <0.001

66 -1.64 -2.02 – -1.26 <0.001

67 -2.66 -3.04 – -2.27 <0.001

68 -2.04 -2.39 – -1.69 <0.001

69 -1.92 -2.27 – -1.58 <0.001

7 -1.91 -2.32 – -1.50 <0.001

70 -2.30 -2.62 – -1.99 <0.001

71 -2.08 -2.47 – -1.69 <0.001

72 -2.03 -2.37 – -1.70 <0.001

8 -2.35 -2.76 – -1.93 <0.001

9 -1.79 -2.09 – -1.49 <0.001

Wound -2.46 -2.80 – -2.12 <0.001

Dcent:Strain1 0.43 0.08 – 0.77 0.017

Dcent:Strain10 -0.15 -0.49 – 0.19 0.399

Dcent:Strain11 0.17 -0.20 – 0.54 0.361

Dcent:Strain12 0.05 -0.29 – 0.38 0.790

Dcent:Strain13 -0.00 -0.33 – 0.33 1.000

Dcent:Strain14 0.23 -0.25 – 0.72 0.350

Dcent:Strain15 0.16 -0.21 – 0.53 0.386

Dcent:Strain16 0.34 -0.07 – 0.76 0.105

Dcent:Strain17 0.06 -0.27 – 0.39 0.704

Dcent:Strain18 0.13 -0.21 – 0.47 0.451

Dcent:Strain19 0.19 -0.12 – 0.51 0.230

Dcent:Strain2 -0.22 -0.59 – 0.16 0.259

Dcent:Strain21 0.00 -0.34 – 0.35 0.979

Dcent:Strain22 0.44 0.11 – 0.77 0.009

Dcent:Strain25 -0.06 -0.40 – 0.28 0.734

Dcent:Strain26 0.15 -0.33 – 0.64 0.532

Dcent:Strain27 -0.30 -0.78 – 0.17 0.214

Dcent:Strain28 0.16 -0.17 – 0.49 0.351

Dcent:Strain3 0.17 -0.25 – 0.59 0.429

Dcent:Strain30 -0.28 -0.67 – 0.10 0.148

Dcent:Strain31 0.18 -0.27 – 0.63 0.442

Dcent:Strain32 -0.14 -0.47 – 0.18 0.389

Dcent:Strain33 -0.27 -0.70 – 0.17 0.226

Dcent:Strain34 -0.26 -0.82 – 0.30 0.366

Dcent:Strain35 0.13 -0.35 – 0.61 0.596

Dcent:Strain36 0.11 -0.40 – 0.62 0.675

Dcent:Strain37 -0.30 -0.66 – 0.07 0.108

Dcent:Strain38 -0.05 -0.80 – 0.70 0.892

Dcent:Strain39 -0.06 -0.43 – 0.31 0.754

Dcent:Strain4 0.57 0.23 – 0.92 0.001

Dcent:Strain41 0.06 -0.33 – 0.45 0.766

Dcent:Strain42 -0.10 -0.46 – 0.26 0.580

Dcent:Strain43 0.32 -0.31 – 0.96 0.318

Dcent:Strain44 0.00 -0.46 – 0.46 0.993

Dcent:Strain45 0.05 -0.28 – 0.38 0.758

Dcent:Strain46 -0.18 -0.60 – 0.23 0.392

Dcent:Strain47 0.10 -0.22 – 0.42 0.529

Dcent:Strain48 -0.12 -0.57 – 0.34 0.616

Dcent:Strain5 0.03 -0.39 – 0.46 0.877

Dcent:Strain50 0.03 -0.38 – 0.44 0.887

Dcent:Strain51 -0.57 -0.94 – -0.20 0.003

Dcent:Strain54 0.06 -0.32 – 0.45 0.743

Dcent:Strain57 -0.31 -0.62 – 0.01 0.058

Dcent:Strain58 -0.14 -0.56 – 0.28 0.512

Dcent:Strain59 0.13 -0.40 – 0.65 0.637

Dcent:Strain6 -0.03 -0.52 – 0.46 0.904

Dcent:Strain60 -0.02 -0.35 – 0.32 0.929

Dcent:Strain63 0.07 -0.38 – 0.52 0.763

Dcent:Strain64 0.23 -0.30 – 0.75 0.399

Dcent:Strain65 -0.26 -0.60 – 0.08 0.140

Dcent:Strain66 -0.07 -0.58 – 0.43 0.776

Dcent:Strain67 0.20 -0.20 – 0.60 0.326

Dcent:Strain68 -0.22 -0.59 – 0.15 0.240

Dcent:Strain69 -0.06 -0.45 – 0.34 0.782

Dcent:Strain7 0.11 -0.31 – 0.52 0.607

Dcent:Strain70 -0.37 -0.84 – 0.09 0.116

Dcent:Strain71 -0.02 -0.47 – 0.43 0.930

Dcent:Strain72 0.09 -0.29 – 0.48 0.636

Dcent:Strain8 -0.31 -0.75 – 0.13 0.166

Dcent:Strain9 -0.08 -0.39 – 0.23 0.614

Dcent:StrainWound 0.13 -0.23 – 0.48 0.481

Observations 830

R2 / Omega-squared 0.716 / 0.703

Mutation Rate was estimated without co-estimating the fitness effect of the mutation rate. Population density was estimated via direct CFU counts and then mean centred for the analysis (Dcent). The model was run using MG1655 as a reference.
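Because the response is on a log2 scale, a strain offset in the table above corresponds to a fold difference in mutation rate relative to MG1655 at mean density. The sketch below shows that conversion for two offsets copied from the table (ECOR strains 1 and 64). It assumes the bare strain numbers are main effects under treatment coding with MG1655 as the reference; the function name is illustrative.

```python
# Strain offsets (log2 scale) copied from the model output for Figure 4.3A.
STRAIN_OFFSET_LOG2 = {"1": -0.85, "64": -3.46}


def fold_relative_to_reference(offset_log2: float) -> float:
    """Convert a log2 offset into a fold difference from the reference strain."""
    return 2.0 ** offset_log2


if __name__ == "__main__":
    for strain, offset in STRAIN_OFFSET_LOG2.items():
        fold = fold_relative_to_reference(offset)
        # 1 / fold is how many times lower the fitted rate is than MG1655's.
        print(strain, round(fold, 3), round(1.0 / fold, 1))
```

On this reading, the fitted mutation rate of strain 64 at mean density is roughly an order of magnitude below that of MG1655, whereas strain 1 is within about two-fold.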

4.7.1.3 Model output for Figure 4.4A

log2 Mutation Rate

Predictors Estimates CI p

(Intercept) 3.73 3.52 – 3.94 <0.001

CalCent -0.74 -1.09 – -0.40 <0.001

1 -1.02 -1.40 – -0.64 <0.001

10 -2.13 -2.48 – -1.79 <0.001

11 -2.00 -2.42 – -1.58 <0.001

12 -2.46 -2.80 – -2.13 <0.001

13 -0.82 -1.12 – -0.52 <0.001

14 -1.57 -1.98 – -1.16 <0.001

15 -1.76 -2.09 – -1.43 <0.001

16 -0.91 -1.36 – -0.45 <0.001

17 -1.31 -1.74 – -0.87 <0.001

18 -1.05 -1.47 – -0.62 <0.001

19 -0.85 -1.17 – -0.52 <0.001

2 -2.68 -3.09 – -2.27 <0.001

21 -3.11 -3.54 – -2.68 <0.001

22 -2.37 -2.67 – -2.07 <0.001

25 -2.78 -3.19 – -2.37 <0.001

26 -1.62 -2.15 – -1.09 <0.001

27 -2.56 -3.11 – -2.02 <0.001

28 -1.58 -1.94 – -1.22 <0.001

3 -1.66 -2.08 – -1.24 <0.001

30 -2.42 -2.85 – -2.00 <0.001

31 -1.81 -2.25 – -1.36 <0.001

32 -1.96 -2.35 – -1.57 <0.001

33 -1.60 -2.04 – -1.15 <0.001

34 -2.44 -2.94 – -1.93 <0.001

35 -2.61 -3.01 – -2.21 <0.001

36 -1.85 -2.29 – -1.41 <0.001

37 -2.32 -2.72 – -1.92 <0.001

38 -3.18 -3.67 – -2.69 <0.001

39 -2.91 -3.49 – -2.34 <0.001

4 -1.21 -1.64 – -0.78 <0.001

41 -1.92 -2.35 – -1.50 <0.001

42 -2.40 -2.72 – -2.08 <0.001

43 -1.75 -2.25 – -1.25 <0.001

44 -2.38 -2.89 – -1.87 <0.001

45 -1.63 -2.04 – -1.21 <0.001

46 -1.37 -1.82 – -0.91 <0.001

47 -2.66 -3.03 – -2.28 <0.001

48 -2.39 -3.20 – -1.58 <0.001

5 -1.67 -2.05 – -1.30 <0.001

50 -2.19 -2.61 – -1.77 <0.001

51 -0.83 -1.84 – 0.18 0.106

54 -2.33 -2.78 – -1.89 <0.001

57 -1.65 -2.37 – -0.93 <0.001

58 -1.93 -2.31 – -1.55 <0.001

59 -1.74 -2.24 – -1.24 <0.001

6 -1.93 -2.49 – -1.37 <0.001

60 -3.03 -3.49 – -2.58 <0.001

63 -2.41 -3.08 – -1.73 <0.001

64 -3.59 -4.24 – -2.94 <0.001

65 -1.94 -2.68 – -1.19 <0.001

66 -1.83 -2.23 – -1.43 <0.001

67 -3.24 -3.80 – -2.69 <0.001

68 -2.54 -3.06 – -2.02 <0.001

69 -2.22 -2.64 – -1.81 <0.001

7 -2.03 -2.73 – -1.33 <0.001

70 -2.61 -3.03 – -2.19 <0.001

71 -2.37 -2.81 – -1.93 <0.001

72 -2.46 -2.88 – -2.05 <0.001

8 -2.71 -3.24 – -2.19 <0.001

9 -2.22 -2.55 – -1.89 <0.001

Wound -3.02 -3.49 – -2.54 <0.001

CalCent:Strain1 0.66 0.13 – 1.19 0.015

CalCent:Strain10 0.04 -0.51 – 0.58 0.898

CalCent:Strain11 0.63 0.01 – 1.26 0.048

CalCent:Strain12 0.44 -0.09 – 0.96 0.103

CalCent:Strain13 0.14 -0.32 – 0.59 0.554

CalCent:Strain14 0.40 -0.18 – 0.97 0.181

CalCent:Strain15 0.34 -0.16 – 0.83 0.181

CalCent:Strain16 0.63 0.08 – 1.19 0.026

CalCent:Strain17 0.07 -0.62 – 0.75 0.848

CalCent:Strain18 0.29 -0.22 – 0.80 0.265

CalCent:Strain19 0.38 -0.06 – 0.81 0.088

CalCent:Strain2 -0.04 -0.65 – 0.56 0.892

CalCent:Strain21 0.16 -0.41 – 0.74 0.577

CalCent:Strain22 0.81 0.33 – 1.29 0.001

CalCent:Strain25 0.23 -0.28 – 0.74 0.369

CalCent:Strain26 0.23 -0.50 – 0.96 0.540

CalCent:Strain27 -0.34 -1.35 – 0.66 0.506

CalCent:Strain28 0.53 -0.00 – 1.05 0.051

CalCent:Strain3 0.44 -0.18 – 1.07 0.162

CalCent:Strain30 0.23 -0.93 – 1.40 0.694

CalCent:Strain31 0.30 -0.37 – 0.97 0.381

CalCent:Strain32 0.28 -0.29 – 0.86 0.335

CalCent:Strain33 -0.30 -0.98 – 0.38 0.389

CalCent:Strain34 -0.45 -1.43 – 0.53 0.364

CalCent:Strain35 0.34 -0.30 – 0.97 0.300

CalCent:Strain36 0.01 -0.77 – 0.78 0.989

CalCent:Strain37 -0.84 -1.59 – -0.08 0.031

CalCent:Strain38 0.05 -0.85 – 0.96 0.909

CalCent:Strain39 0.16 -0.59 – 0.92 0.671

CalCent:Strain4 0.81 0.29 – 1.32 0.002

CalCent:Strain41 0.07 -0.60 – 0.74 0.845

CalCent:Strain42 0.10 -0.37 – 0.57 0.691

CalCent:Strain43 0.17 -0.80 – 1.15 0.727

CalCent:Strain44 0.47 -0.15 – 1.08 0.135

CalCent:Strain45 0.49 -0.05 – 1.02 0.074

CalCent:Strain46 -0.15 -0.99 – 0.70 0.736

CalCent:Strain47 0.38 -0.18 – 0.93 0.181

CalCent:Strain48 -1.11 -1.89 – -0.32 0.006

CalCent:Strain5 0.54 -0.06 – 1.13 0.078

CalCent:Strain50 0.42 -0.19 – 1.03 0.177

CalCent:Strain51 0.73 0.13 – 1.34 0.017

CalCent:Strain54 0.52 -0.31 – 1.35 0.219

CalCent:Strain57 0.72 -0.27 – 1.71 0.155

CalCent:Strain58 -0.48 -1.33 – 0.37 0.265

CalCent:Strain59 0.11 -1.11 – 1.34 0.855

CalCent:Strain6 0.25 -0.42 – 0.92 0.465

CalCent:Strain60 -0.26 -1.11 – 0.60 0.557

CalCent:Strain63 -0.13 -0.97 – 0.71 0.764

CalCent:Strain64 -0.69 -1.88 – 0.49 0.250

CalCent:Strain65 -0.49 -1.75 – 0.78 0.450

CalCent:Strain66 -0.34 -1.03 – 0.34 0.321

CalCent:Strain67 0.51 -0.22 – 1.25 0.170

CalCent:Strain68 -0.10 -0.96 – 0.75 0.812

CalCent:Strain69 -0.07 -0.65 – 0.50 0.801

CalCent:Strain7 0.32 -0.53 – 1.16 0.461

CalCent:Strain70 -0.21 -1.07 – 0.65 0.640

CalCent:Strain71 0.19 -0.54 – 0.92 0.612

CalCent:Strain72 0.46 -0.16 – 1.08 0.146

CalCent:Strain8 -0.08 -0.83 – 0.67 0.838

CalCent:Strain9 0.25 -0.16 – 0.66 0.230

CalCent:StrainWound 0.46 -0.12 – 1.04 0.117

Observations 830

R2 / Omega-squared 0.651 / 0.651

Mutation Rate was estimated with co-estimating the fitness effect of the mutation rate. Population density was calibrated using an ATP-based assay and then mean centred for the analysis (CalCent). The model was run using MG1655 as a reference.

4.7.1.4 Model output for Figure 4.5A

log2 Mutation Rate

Predictors Estimates CI p

(Intercept) 3.54 3.32 – 3.76 <0.001

Dcent -0.49 -0.72 – -0.26 <0.001

1 -0.92 -1.28 – -0.56 <0.001

10 -1.90 -2.23 – -1.57 <0.001

11 -1.73 -2.09 – -1.38 <0.001

12 -2.19 -2.50 – -1.88 <0.001

13 -0.70 -1.00 – -0.40 <0.001

14 -1.43 -1.85 – -1.02 <0.001

15 -1.54 -1.87 – -1.22 <0.001

16 -0.66 -1.11 – -0.21 0.004

17 -1.25 -1.59 – -0.90 <0.001

18 -0.90 -1.35 – -0.44 <0.001

19 -0.60 -0.98 – -0.22 0.002

2 -2.32 -2.70 – -1.94 <0.001

21 -2.73 -3.08 – -2.38 <0.001

22 -2.09 -2.37 – -1.80 <0.001

25 -2.51 -2.84 – -2.17 <0.001

26 -1.59 -2.02 – -1.15 <0.001

27 -2.26 -2.64 – -1.88 <0.001

28 -1.41 -1.74 – -1.07 <0.001

3 -1.46 -1.84 – -1.07 <0.001

30 -2.09 -2.41 – -1.77 <0.001

31 -1.65 -2.05 – -1.26 <0.001

32 -1.70 -2.01 – -1.39 <0.001

33 -1.52 -1.91 – -1.13 <0.001

34 -2.35 -2.75 – -1.95 <0.001

35 -2.23 -2.61 – -1.86 <0.001

36 -1.72 -2.15 – -1.29 <0.001

37 -2.05 -2.42 – -1.68 <0.001

38 -2.72 -3.17 – -2.27 <0.001

39 -2.30 -2.71 – -1.89 <0.001

4 -1.24 -1.61 – -0.88 <0.001

41 -1.82 -2.21 – -1.44 <0.001

42 -2.06 -2.35 – -1.77 <0.001

43 -1.78 -2.23 – -1.33 <0.001

44 -1.99 -2.39 – -1.60 <0.001

45 -1.32 -1.66 – -0.99 <0.001

46 -1.26 -1.63 – -0.89 <0.001

47 -2.41 -2.72 – -2.10 <0.001

48 -2.07 -2.65 – -1.49 <0.001

5 -1.47 -1.83 – -1.11 <0.001

50 -1.86 -2.21 – -1.51 <0.001

51 -2.66 -3.27 – -2.05 <0.001

54 -2.09 -2.49 – -1.70 <0.001

57 -2.24 -2.71 – -1.77 <0.001

58 -1.80 -2.15 – -1.44 <0.001

59 -1.72 -2.16 – -1.29 <0.001

6 -1.78 -2.23 – -1.32 <0.001

60 -2.71 -3.10 – -2.32 <0.001

63 -2.31 -2.74 – -1.87 <0.001

64 -3.58 -3.95 – -3.21 <0.001

65 -2.07 -2.52 – -1.61 <0.001

66 -1.75 -2.16 – -1.34 <0.001

67 -2.81 -3.22 – -2.39 <0.001

68 -2.26 -2.62 – -1.89 <0.001

69 -2.04 -2.42 – -1.65 <0.001

7 -1.99 -2.44 – -1.54 <0.001

70 -2.42 -2.75 – -2.08 <0.001

71 -2.17 -2.59 – -1.75 <0.001

72 -2.15 -2.48 – -1.81 <0.001

8 -2.47 -2.89 – -2.04 <0.001

9 -1.91 -2.23 – -1.59 <0.001

Wound -2.63 -2.98 – -2.29 <0.001

Dcent:Strain1 0.47 0.09 – 0.85 0.016

Dcent:Strain10 -0.07 -0.45 – 0.31 0.728

Dcent:Strain11 0.27 -0.13 – 0.66 0.185

Dcent:Strain12 0.15 -0.20 – 0.49 0.413

Dcent:Strain13 -0.02 -0.35 – 0.31 0.917

Dcent:Strain14 0.29 -0.20 – 0.78 0.247

Dcent:Strain15 0.16 -0.21 – 0.53 0.393

Dcent:Strain16 0.35 -0.16 – 0.87 0.180

Dcent:Strain17 0.11 -0.27 – 0.50 0.568

Dcent:Strain18 0.19 -0.17 – 0.56 0.303

Dcent:Strain19 0.23 -0.11 – 0.56 0.181

Dcent:Strain2 -0.15 -0.55 – 0.25 0.458

Dcent:Strain21 0.01 -0.39 – 0.41 0.955

Dcent:Strain22 0.48 0.13 – 0.83 0.008

Dcent:Strain25 0.03 -0.33 – 0.39 0.863

Dcent:Strain26 0.23 -0.26 – 0.72 0.360

Dcent:Strain27 -0.36 -0.90 – 0.18 0.193

Dcent:Strain28 0.23 -0.15 – 0.61 0.232

Dcent:Strain3 0.25 -0.21 – 0.70 0.286

Dcent:Strain30 -0.15 -0.58 – 0.28 0.485

Dcent:Strain31 0.19 -0.28 – 0.67 0.419

Dcent:Strain32 -0.07 -0.41 – 0.27 0.695

Dcent:Strain33 -0.25 -0.68 – 0.19 0.271

Dcent:Strain34 -0.22 -0.79 – 0.35 0.449

Dcent:Strain35 0.16 -0.35 – 0.66 0.541

Dcent:Strain36 0.18 -0.37 – 0.72 0.527

Dcent:Strain37 -0.30 -0.70 – 0.10 0.145

Dcent:Strain38 0.07 -0.67 – 0.81 0.853

Dcent:Strain39 -0.05 -0.48 – 0.38 0.817

Dcent:Strain4 0.58 0.22 – 0.94 0.002

Dcent:Strain41 0.12 -0.33 – 0.57 0.598

Dcent:Strain42 -0.06 -0.44 – 0.32 0.764

Dcent:Strain43 0.36 -0.30 – 1.02 0.284

Dcent:Strain44 0.05 -0.42 – 0.51 0.844

Dcent:Strain45 0.08 -0.29 – 0.45 0.667

Dcent:Strain46 -0.09 -0.56 – 0.37 0.698

Dcent:Strain47 0.16 -0.19 – 0.51 0.377

Dcent:Strain48 -0.12 -0.57 – 0.33 0.603

Dcent:Strain5 0.13 -0.30 – 0.57 0.550

Dcent:Strain50 0.05 -0.37 – 0.46 0.828

Dcent:Strain51 -0.49 -0.86 – -0.11 0.011

Dcent:Strain54 0.26 -0.15 – 0.68 0.213

Dcent:Strain57 -0.27 -0.61 – 0.08 0.129

Dcent:Strain58 -0.11 -0.55 – 0.33 0.627

Dcent:Strain59 0.17 -0.37 – 0.70 0.535

Dcent:Strain6 0.01 -0.49 – 0.51 0.963

Dcent:Strain60 0.08 -0.29 – 0.46 0.667

Dcent:Strain63 0.06 -0.45 – 0.57 0.816

Dcent:Strain64 0.22 -0.38 – 0.81 0.478

Dcent:Strain65 -0.21 -0.57 – 0.15 0.254

Dcent:Strain66 -0.05 -0.59 – 0.48 0.846

Dcent:Strain67 0.26 -0.17 – 0.69 0.238

Dcent:Strain68 -0.20 -0.58 – 0.19 0.315

Dcent:Strain69 -0.08 -0.53 – 0.36 0.709

Dcent:Strain7 0.17 -0.29 – 0.63 0.470

Dcent:Strain70 -0.33 -0.82 – 0.16 0.187

Dcent:Strain71 0.05 -0.44 – 0.53 0.854

Dcent:Strain72 0.22 -0.15 – 0.60 0.244

Dcent:Strain8 -0.26 -0.71 – 0.19 0.265

Dcent:Strain9 0.01 -0.33 – 0.35 0.951

Dcent:StrainWound 0.19 -0.17 – 0.56 0.298

Observations 830

R2 / Omega-squared 0.723 / 0.711

Mutation Rate was estimated with co-estimating the fitness effect of the mutation rate. Population density was estimated with direct CFU counts and then mean centred for the analysis (Dcent). The model was run using MG1655 as a reference.

4.7.1.5 Model output for Figure 4.8

log2 Mutation Rate

Predictors Estimates CI p

(Intercept) 3.34 3.10 – 3.58 <0.001

CalCent -0.13 -0.52 – 0.26 0.507

1 -0.85 -1.22 – -0.47 <0.001

10 -1.88 -2.23 – -1.53 <0.001

11 -1.61 -2.01 – -1.22 <0.001

12 -2.18 -2.52 – -1.83 <0.001

13 -0.55 -0.89 – -0.22 0.001

14 -1.35 -1.78 – -0.93 <0.001

15 -1.48 -1.81 – -1.15 <0.001

16 -0.67 -1.04 – -0.30 <0.001

17 -1.12 -1.52 – -0.71 <0.001

18 -0.84 -1.38 – -0.29 0.003

19 -0.55 -0.95 – -0.15 0.007

2 -2.25 -2.67 – -1.84 <0.001

21 -2.68 -3.03 – -2.32 <0.001

22 -2.02 -2.33 – -1.71 <0.001

25 -2.34 -2.72 – -1.95 <0.001

26 -1.45 -2.01 – -0.90 <0.001

27 -2.21 -2.64 – -1.78 <0.001

28 -1.28 -1.63 – -0.93 <0.001

3 -1.47 -1.87 – -1.07 <0.001

30 -2.08 -2.52 – -1.64 <0.001

31 -1.60 -2.02 – -1.18 <0.001

32 -1.69 -2.10 – -1.29 <0.001

33 -1.42 -1.88 – -0.95 <0.001

34 -2.24 -2.67 – -1.82 <0.001

35 -2.26 -2.65 – -1.88 <0.001

36 -1.66 -2.09 – -1.24 <0.001

37 -2.01 -2.36 – -1.65 <0.001

38 -2.66 -3.10 – -2.21 <0.001

39 -2.37 -2.75 – -2.00 <0.001

4 -1.08 -1.48 – -0.69 <0.001

41 -1.71 -2.09 – -1.34 <0.001

42 -2.06 -2.38 – -1.74 <0.001

43 -1.66 -2.17 – -1.15 <0.001

44 -1.99 -2.45 – -1.53 <0.001

45 -1.40 -1.81 – -0.99 <0.001

46 -1.20 -1.64 – -0.76 <0.001

47 -2.40 -2.75 – -2.04 <0.001

48 -2.22 -2.86 – -1.58 <0.001

5 -1.51 -1.92 – -1.10 <0.001

50 -1.85 -2.27 – -1.43 <0.001

51 -1.15 -1.90 – -0.40 0.003

54 -2.07 -2.52 – -1.62 <0.001

57 -1.58 -2.17 – -1.00 <0.001

58 -1.75 -2.16 – -1.35 <0.001

59 -1.69 -2.15 – -1.22 <0.001

6 -1.70 -2.25 – -1.16 <0.001

60 -2.42 -2.78 – -2.05 <0.001

63 -2.14 -2.57 – -1.70 <0.001

64 -3.43 -3.88 – -2.99 <0.001

65 -1.81 -2.41 – -1.20 <0.001

66 -1.59 -1.97 – -1.20 <0.001

67 -2.62 -3.05 – -2.18 <0.001

68 -2.11 -2.56 – -1.66 <0.001

69 -2.01 -2.38 – -1.63 <0.001

7 -1.95 -2.43 – -1.46 <0.001

70 -2.28 -2.68 – -1.88 <0.001

71 -2.08 -2.52 – -1.64 <0.001

72 -2.05 -2.42 – -1.69 <0.001

8 -2.37 -2.85 – -1.88 <0.001

9 -1.92 -2.29 – -1.55 <0.001

W -2.46 -2.86 – -2.06 <0.001

CalCent:Strain1 0.13 -0.43 – 0.68 0.659

CalCent:Strain10 -0.54 -1.08 – -0.00 0.050

CalCent:Strain11 -0.06 -0.66 – 0.54 0.842

CalCent:Strain12 -0.26 -0.80 – 0.28 0.342

CalCent:Strain13 -0.33 -0.83 – 0.17 0.196

CalCent:Strain14 -0.21 -0.84 – 0.42 0.517

CalCent:Strain15 -0.18 -0.69 – 0.32 0.473

CalCent:Strain16 0.04 -0.48 – 0.56 0.868

CalCent:Strain17 -0.48 -1.13 – 0.18 0.152

CalCent:Strain18 -0.22 -0.80 – 0.37 0.465

CalCent:Strain19 -0.01 -0.50 – 0.49 0.980

CalCent:Strain2 -0.55 -1.15 – 0.05 0.074

CalCent:Strain21 -0.46 -1.01 – 0.10 0.108

CalCent:Strain22 0.23 -0.27 – 0.74 0.365

CalCent:Strain25 -0.29 -0.80 – 0.22 0.271

CalCent:Strain26 -0.24 -1.03 – 0.56 0.557

CalCent:Strain27 -0.71 -1.48 – 0.06 0.070

CalCent:Strain28 -0.15 -0.69 – 0.39 0.587

CalCent:Strain3 -0.19 -0.81 – 0.44 0.560

CalCent:Strain30 -0.35 -1.51 – 0.81 0.555

CalCent:Strain31 -0.24 -0.86 – 0.39 0.457

CalCent:Strain32 -0.32 -0.93 – 0.29 0.304

CalCent:Strain33 -0.70 -1.43 – 0.03 0.061

CalCent:Strain34 -0.66 -1.47 – 0.14 0.106

CalCent:Strain35 -0.30 -0.93 – 0.32 0.340

CalCent:Strain36 -0.31 -1.05 – 0.42 0.405

CalCent:Strain37 -1.21 -1.85 – -0.57 <0.001

CalCent:Strain38 -0.39 -1.20 – 0.42 0.342

CalCent:Strain39 -0.44 -0.97 – 0.09 0.106

CalCent:Strain4 0.24 -0.28 – 0.75 0.367

CalCent:Strain41 -0.48 -1.08 – 0.13 0.125

CalCent:Strain42 -0.39 -0.88 – 0.10 0.122

CalCent:Strain43 -0.08 -1.09 – 0.93 0.873

CalCent:Strain44 -0.06 -0.65 – 0.53 0.841

CalCent:Strain45 0.00 -0.56 – 0.56 0.991

CalCent:Strain46 -0.70 -1.50 – 0.10 0.086

CalCent:Strain47 -0.17 -0.72 – 0.39 0.554

CalCent:Strain48 -0.59 -1.30 – 0.13 0.107

CalCent:Strain5 -0.05 -0.70 – 0.60 0.882

CalCent:Strain50 -0.07 -0.71 – 0.56 0.819

CalCent:Strain51 0.08 -0.45 – 0.61 0.768

CalCent:Strain54 -0.15 -1.05 – 0.74 0.734

CalCent:Strain57 -0.22 -0.97 – 0.52 0.552

CalCent:Strain58 -0.84 -1.72 – 0.05 0.063

CalCent:Strain59 -0.60 -1.68 – 0.48 0.278

CalCent:Strain6 -0.26 -0.93 – 0.41 0.451

CalCent:Strain60 -0.41 -1.09 – 0.27 0.234

CalCent:Strain63 -0.49 -1.10 – 0.13 0.120

CalCent:Strain64 -0.35 -1.37 – 0.66 0.497

CalCent:Strain65 -0.60 -1.72 – 0.52 0.294

CalCent:Strain66 -0.72 -1.39 – -0.05 0.035

CalCent:Strain67 -0.23 -0.86 – 0.41 0.481

CalCent:Strain68 -0.70 -1.41 – 0.01 0.053

CalCent:Strain69 -0.42 -0.96 – 0.12 0.124

CalCent:Strain7 -0.26 -0.91 – 0.39 0.426

CalCent:Strain70 -0.79 -1.71 – 0.12 0.090

CalCent:Strain71 -0.41 -1.09 – 0.27 0.241

CalCent:Strain72 -0.17 -0.72 – 0.39 0.555

CalCent:Strain8 -0.51 -1.18 – 0.17 0.139

CalCent:Strain9 -0.31 -0.79 – 0.16 0.199

CalCent:StrainW -0.15 -0.69 – 0.40 0.592

Observations 830

R2 / Omega-squared 0.672 / 0.649

Mutation rate was estimated using the weighted median of the fitness effect of the resistance mutation. Population density was estimated with an ATP-based assay and then mean centred for the analysis (CalCent). The model was run using MG1655 as a reference.
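As a reading aid for the model outputs above, the following minimal sketch shows how a per-strain intercept and density slope could be recovered from the tabulated coefficients. It assumes the usual treatment-contrast coding, in which each strain row is an offset from the MG1655 reference; the use of MG1655 as the reference is stated above, but the contrast coding itself is an assumption. The function name `strain_line` is hypothetical and the numbers are copied from the Figure 4.8 table.

```python
# Sketch only: assumes treatment contrasts with MG1655 as the reference level.

def strain_line(ref_intercept, ref_slope, d_intercept=0.0, d_slope=0.0):
    """Return (intercept, slope) of log2 mutation rate against mean-centred density.

    ref_intercept / ref_slope are the "(Intercept)" and "CalCent" rows;
    d_intercept / d_slope are the strain row and the "CalCent:StrainX" row.
    """
    return ref_intercept + d_intercept, ref_slope + d_slope


# Reference strain (MG1655): no offsets are added.
mg1655 = strain_line(3.34, -0.13)

# Strain 37: rows "37" (-2.01) and "CalCent:Strain37" (-1.21).
strain37 = strain_line(3.34, -0.13, d_intercept=-2.01, d_slope=-1.21)

print("MG1655 (intercept, slope):", mg1655)
print("Strain 37 (intercept, slope):", tuple(round(v, 2) for v in strain37))
```

Under this reading, the interaction rows are differences in slope relative to MG1655, so a non-significant interaction means the strain's slope is not distinguishable from the reference slope, not that the slope is zero.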

4.7.2 Diagnostic plots of BayesTraits analysis

4.7.2.1 Diagnostic plots for analysis of average mutation rates not co-estimating fitness effects of the resistance mutation

4.7.2.2 Diagnostic plots for analysis of the degree of DAMP not co-estimating fitness effects of the resistance mutation


4.7.2.3 Diagnostic plots for analysis of average mutation rates co-estimating fitness effects of the resistance mutation

4.7.2.4 Diagnostic plots for analysis of the degree of DAMP co-estimating fitness effects of the resistance mutation


4.7.2.5 Diagnostic plots for analysis of the correlation between the average mutation rate and degree of DAMP when not co-estimating the fitness effects of the resistance mutation


4.7.2.6 Diagnostic plots for analysis of the correlation between the average mutation rate and degree of DAMP when co-estimating the fitness effects of the resistance mutation

4.7.3 Strains of bacteria used in Chapter 4

ECOR strains are natural isolates from the E. coli Reference collection, as described in Ochman and Selander (1984), and were obtained from the STEC centre at Michigan State University (http://www.shigatox.net/new/).

Organism | Isolate | O | H | Host | Locale | Group | Notes
Escherichia coli | ECOR-01 | N | N | human | USA (Iowa) | A | Group A strain from a healthy person
| ECOR-02 | N | 32 | human | USA (NY) | A | Group A strain from a healthy person
| ECOR-03 | 1 | NM | dog | USA (Mass.) | A | Group A strain from a healthy dog
| ECOR-04 | N | N | human (child) | USA (Iowa) | A | Group A strain from a healthy child
| ECOR-05 | 79 | NM | human | USA (Iowa) | A | Group A strain from a healthy person
| ECOR-06 | N | NM | human (child) | USA (Iowa) | A | Group A strain from a healthy child
| ECOR-07 | 85 | N | orangutan | USA (Wash.) | A | Group A strain from a healthy orangutan in captivity
| ECOR-08 | 86 | NM | human | USA (Iowa) | A | Group A strain from a healthy person
| ECOR-09 | N | NM | human | Sweden | A | Group A strain from a healthy person
| ECOR-10 | 6 | 10 | human | Sweden | A | Group A strain from a healthy person
| ECOR-11 | 6 | 10 | human | Sweden | A | Group A strain from a patient with a UTI and acute cystitis
| ECOR-12 | 7 | 32 | human | Sweden | A | Group A strain from a healthy person
| ECOR-13 | N | N | human | Sweden | A | Group A strain from a healthy person
| ECOR-14 | M | N | human | Sweden | A | Group A strain from a patient with a UTI and acute pyelonephritis
| ECOR-15 | 25 | NM | human | Sweden | A | Group A strain from a healthy person
| ECOR-16 | N | 10 | leopard | USA (Wash.) | A | Group A strain from a healthy leopard in captivity
| ECOR-17 | 106 | NM | pig | Indonesia | A | Group A strain from a healthy pig
| ECOR-18 | 5 | NM | Celebese ape | USA (Wash.) | A | Group A strain from a healthy Celebese ape in captivity
| ECOR-19 | 5 | N | Celebese ape | USA (Wash.) | A | Group A strain from a healthy Celebese ape in captivity
| ECOR-21 | 121 | N | steer | Bali | A | Group A strain from a healthy steer
| ECOR-22 | N | N | steer | Bali | A | Group A strain from a healthy steer
| ECOR-25 | N | N | dog | USA (NY) | A | Group A strain from a healthy dog
| ECOR-26 | 104 | 21 | human (infant) | USA (Mass.) | B1 | Group B1 strain from a healthy infant
| ECOR-27 | 104 | NM | giraffe | USA (Wash.) | B1 | Group B1 strain from a healthy giraffe in captivity
| ECOR-28 | 104 | NM | human (child) | USA (Iowa) | B1 | Group B1 strain from a healthy child
| ECOR-30 | 113 | 21 | bison | Canada | B1 | Group B1 strain from a healthy bison
| ECOR-31 | 79 | 43 | leopard | USA (Wash.) | E | Group E strain from a healthy leopard in captivity
| ECOR-32 | 7 | 21 | giraffe | USA (Wash.) | B1 | Group B1 strain from a healthy giraffe in captivity
| ECOR-33 | 7 | 21 | sheep | USA (Calif.) | B1 | Group B1 strain from a healthy sheep
| ECOR-34 | 88 | NM | dog | USA (Mass.) | B1 | Group B1 strain from a healthy dog
| ECOR-35 | 1 | NM | human | USA (Iowa) | D | Group D strain from a healthy person
| ECOR-36 | 79 | 25 | human | USA (Iowa) | D | Group D strain from a healthy person
| ECOR-37 | N | N | marmoset | USA (Wash.) | E | Group E strain from a healthy marmoset in captivity
| ECOR-38 | 7 | NM | human | USA (Iowa) | D | Group D strain from a healthy person
| ECOR-39 | 7 | NM | human | Sweden | D | Group D strain from a healthy person
| ECOR-40 | 7 | NM | human | Sweden | D | Group D strain from a patient with a UTI and acute pyelonephritis
| ECOR-41 | 7 | NM | human | Tonga | D | Group D strain from a healthy person
| ECOR-42 | N | 26 | human | USA (Mass.) | E | Group E strain from a healthy person
| ECOR-43 | N | N | human | Sweden | E | Group E strain from a healthy person
| ECOR-44 | N | N | cougar | USA (Wash.) | D | Group D strain from a healthy cougar in captivity
| ECOR-45 | N | M | pig | Indonesia | B1 | Group B1 strain from a healthy pig
| ECOR-46 | 1 | 6 | ape | USA (Wash.) | D | Group D strain from a healthy ape in captivity
| ECOR-47 | M | 18 | sheep | New Guinea | D | Group D strain from a healthy sheep
| ECOR-48 | N | M | human | Sweden | D | Group D strain from a patient with a UTI and acute cystitis
| ECOR-49 | 2 | NM | human | Sweden | D | Group D strain from a healthy person
| ECOR-50 | 2 | N | human | Sweden | D | Group D strain from a patient with a UTI and acute pyelonephritis
| ECOR-51 | 25 | N | human (infant) | USA (Mass.) | B2 | Group B2 strain from a healthy infant
| ECOR-54 | 25 | 1 | human | USA (Iowa) | B2 | Group B2 strain from a healthy person
| ECOR-55 | 25 | 1 | human | Sweden | B2 | Group B2 strain from a patient with a UTI and acute pyelonephritis
| ECOR-56 | 6 | 1 | human | Sweden | B2 | Group B2 strain from a healthy person
| ECOR-57 | N | NM | gorilla | USA (Wash.) | B2 | Group B2 strain from a healthy gorilla in captivity
| ECOR-58 | 112 | 8 | lion | USA (Wash.) | B1 | Group B1 strain from a healthy lion in captivity
| ECOR-59 | 4 | 40 | human | USA (Mass.) | B2 | Group B2 strain from a healthy person
| ECOR-60 | 4 | N | human | Sweden | B2 | Group B2 strain from a patient with a UTI and acute cystitis
| ECOR-63 | N | NM | human | Sweden | B2 | Group B2 strain from a healthy person
| ECOR-64 | 75 | NM | human | Sweden | B2 | Group B2 strain from a patient with a UTI and acute cystitis
| ECOR-65 | N | 10 | celebese ape | USA (Wash.) | B2 | Group B2 strain from a healthy celebese ape in captivity
| ECOR-66 | 4 | 40 | celebese ape | USA (Wash.) | B1 | Group B1 strain from a healthy celebese ape in captivity
| ECOR-67 | 4 | 43 | goat | Indonesia | B1 | Group B1 strain from a healthy goat
| ECOR-68 | N | NM | giraffe | USA (Wash.) | B1 | Group B1 strain from a healthy giraffe in captivity
| ECOR-69 | N | NM | celebese ape | USA (Wash.) | B1 | Group B1 strain from a healthy celebese ape in captivity
| ECOR-70 | 78 | NM | gorilla | USA (Wash.) | B1 | Group B1 strain from a healthy gorilla in captivity
| ECOR-71 | 78 | NM | human | Sweden | B1 | Group B1 strain from a person with asymptomatic bacteriuria
| ECOR-72 | 144 | 8 | human | Sweden | B1 | Group B1 strain from a patient with a UTI and acute pyelonephritis
| Wound isolate | | | | UK | | Kindly provided by Prof. Andrew McBain
| MG1655 | | | | | A | Wild-type lab strain

Chapter 5: Highly Conserved Molecular Mechanisms Control Density Associated Mutation Rate Plasticity in Escherichia coli

5.1 Abstract

The rate at which mutations arise in a genome will affect the evolutionary trajectory of that organism. Across a wide phylogenetic distribution, mutation rates behave plastically depending upon environmental conditions, such as stressful factors and the social environment, specifically the density of individuals in a population. Whilst the mechanisms that modulate plastic mutation rates in response to stress are well characterised, the mechanisms involved in social-environment mutation rate plasticity have only recently begun to be investigated. The results presented here add further explanation to the mechanisms controlling the social phenomenon of density associated mutation rate plasticity (DAMP), which is the inverse association of mutation rate with final population density. There is evidence that mutation rates are controlled via intercellular environmental signalling, although the fitness effects of resistance mutations may affect this. DAMP in Escherichia coli is modulated by the Nudix hydrolase protein MutT (also known as NudA). Here four further Nudix hydrolase proteins, specifically NudB, NudF, NudI and NudJ, are shown to also control DAMP in this bacterium. DAMP is present to the same degree at different time points in the culture cycle, suggesting that the mechanisms modulating DAMP are common across all growth points and not dependent upon the time-specific induction of genes. Mutation rates across the growth cycle are shown to associate with the level of intracellular ATP, increasing at both low and high concentrations and producing a mutation rate minimum at approximately 0.5 µM ATP/cell. These results add to our knowledge of the molecular mechanisms that modulate DAMP and, due to the highly conserved nature of the genes involved, they point to the widespread prevalence and ancient evolutionary origins of this socially controlled trait.

5.2 Introduction

Spontaneous mutation is the fuel that drives evolution. The rate at which mutations occur could, by generating the variation required by natural selection, affect an organism’s adaptability. Across species and domains, an organism’s average mutation rate has evolved in relation to its effective population size, owing to a balance between the powers of selection and genetic drift (Lynch et al. 2016). Specifically, selection reduces the mutation rate because most mutations are deleterious. This reduction will continue until the benefits gained by a further reduction are small enough to be overcome by the power of genetic drift, presenting a barrier to further reductions in mutation rate (Sung et al. 2012). An organism’s mutation rate has not simply evolved to a constant, however, but can behave plastically depending upon environmental circumstances (Elena and de Visser 2003; Saint-Ruf and Matic 2006). Such environmental mutation rate plasticity has evolved in a range of species over a broad evolutionary scale, in response to both stressful (MacLean et al. 2013) and social (Krašovec et al. 2014; Krašovec et al. 2017) environments. Specifically, mutation rates increase in response to stress, termed Stress Induced Mutagenesis (SIM) (Foster 2007), and decrease with increasing final population density, termed Density Associated Mutation Rate Plasticity (DAMP) (Krašovec et al. 2017).

Whilst the molecular regulation of mutation in response to stress is well understood (Matic 2016), the mechanisms involved in modulating mutation in DAMP have only recently begun to be investigated. Nonetheless, such molecular mechanisms have been identified and involve highly conserved genetic systems (Krašovec et al. 2014; Krašovec et al. 2017). In E. coli, mutation rate is regulated via intercellular communication that relies upon the highly conserved luxS gene (Krašovec et al. 2014). Interestingly, though, it is not the role luxS has in the production of the quorum sensing molecule AI-2 (Pereira et al. 2013) that affects DAMP, but rather its metabolic role in the activated methyl cycle (Halliday et al. 2010). This result is supported by the presence of DAMP in a strain lacking the lsrK gene, which is involved in processing the AI-2 signal (Krašovec et al. 2014). lsrK catalyses the phosphorylation of AI-2 to AI-2P, which then binds to lsrR, the lsr operon repressor, allowing the transcription of the lsr operon (Xavier and Bassler 2005). Both these genes co-regulate a multitude of other genes within E. coli, with lsrR affecting the generation of several small RNAs previously linked to quorum sensing in Vibrio harveyi (Li et al. 2007).

Additionally, DAMP requires the same mutation avoidance mechanism in both pro- and eukaryotes (Krašovec et al. 2017). This mechanism utilises the same Nudix hydrolase protein, which degrades the highly mutagenic oxidised nucleotide 8-oxo-dGTP from the intracellular nucleotide pools (Michaels and Miller 1992). Nudix hydrolases are highly conserved across domains of life and provide key house-cleaning roles, cleansing the intracellular environment of cellular metabolic by-products (Galperin et al. 2006). These proteins can be grouped by the substrate they act upon, such as ADP-ribose, dinucleoside polyphosphates, nucleoside sugars and deoxyribonucleoside polyphosphates (Dunn et al. 1999), although there is some crossover in substrate usage between these proteins (McLennan 2013).

Questions regarding the mechanistic control of DAMP at various levels have been raised previously (Krašovec et al. 2014) and expanded in the experimental chapters of this thesis. Specifically, these concern how mutation rates are controlled upstream at the population level, which downstream genetic mechanisms underpin DAMP, and how DAMP varies across the growth cycle. Here I address each of these questions in turn, expanding our knowledge of the molecular mechanisms modulating DAMP at these various scales. To address the first question, regarding the upstream control of mutation rates via intercellular interactions, I shall conduct co-cultures of E. coli as previously described (Krašovec et al. 2014). Mutation rates will be estimated in both wild-type and ΔlsrK cells in the presence of certain strains of E. coli, specifically either wild-type, ΔlsrK, ΔlsrR or ΔluxS, thereby further investigating any potential role of AI-2 from different perspectives. The previous work investigated the upstream mechanism of luxS, specifically the role it plays in both the activated methyl cycle and the production of AI-2. The downstream mechanism of AI-2, namely the lsr operon, and specifically the AI-2 kinase lsrK and its repressor lsrR, remained untested. Therefore, by conducting these co-cultures, the previous assertion of the role of luxS will be tested both upstream, as before, and now additionally downstream. To answer the second question, regarding the genetic mechanisms controlling DAMP, I will estimate mutation rates in knockouts of other Nudix hydrolase genes present in E. coli at various population densities to determine whether any of these genes also have a role in modulating DAMP. To answer the final question, of DAMP’s presence at different points in the growth cycle, I shall estimate mutation rates in wild-type E. coli via fluctuation tests after various culture lengths, corresponding to different phases of the E. coli growth cycle.

5.3 Materials and Methods

5.3.1 Strains of bacteria used in Chapter 5 Strains of bacteria used in this study are listed in detail in Table 5.1.

Table 5.1. Escherichia coli strains used in this study.

Strain | Genotype | Source or reference
E. coli MG1655 | F-, rph-1 | Karina B. Xavier
E. coli BW25113 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection parent (Baba et al. 2006)
E. coli JW1854-1 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔnudB720::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW5548-1 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔnudC767::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW5335-1 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔnudD749(del)::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW3360-5 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔnudE732::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW3002-1 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔnudF731::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW3610-2 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔnudG757::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW2798-1 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔrppHC754::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW2245-1 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔnudI732::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW1120-1 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔnudJ722::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli JW2451-1 | F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔnudK787::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514 | Keio collection (Baba et al. 2006)
E. coli KX1102 | Ara+, ΔlacZYA::Cm | Karina B. Xavier
E. coli KX1228 | Ara+, ΔluxS | Karina B. Xavier
E. coli 1440 | Ara+, ΔlsrK::Cm | Karina B. Xavier
E. coli 1448 | Ara+, ΔlsrK | Karina B. Xavier
E. coli AS12 | Ara+, ΔlsrR | Karina B. Xavier

5.3.2 Media MilliQ water was used for all media. For bacteria, tetrazolium arabinose agar (TA) and Davis minimal medium (DM) were prepared according to Lenski et al. (1991), and lysogeny broth medium (LB) was prepared according to the manufacturer’s instructions. Magnesium sulphate heptahydrate, thiamine hydrochloride, carbon source (3 g/l L-arabinose or various concentrations of D-glucose) and 2,3,5-triphenyltetrazolium chloride (Sigma T8877) were sterile filtered and added to cooled medium. The antibiotics chloramphenicol (50 µg/ml) and rifampicin (50 µg/ml) were prepared freshly in ethanol and methanol respectively and added to TA media as required. Chloramphenicol was used to distinguish between strains used in co-cultures, and rifampicin was used for detecting spontaneous mutations. For all cell dilutions sterile saline (8.5 g/l NaCl) was used. All media were solidified as necessary with 15 g/l of agar (Difco).

5.3.3 Fluctuation tests

5.3.3.1 Co-culture Fluctuation tests Fluctuation tests were carried out as explained previously (Krašovec et al. 2014). Briefly, strains were inoculated from frozen glycerol stock and grown in liquid LB medium at 37°C, shaking at 120rpm. After approximately 8 hours, the strains were diluted and transferred to non-selective liquid DM supplemented with 250 mg/l of glucose as the sole carbon source and allowed to grow overnight at 37°C, again with 120rpm shaking. The following morning, strains were again diluted into fresh DM with appropriate concentrations of glucose, giving the initial population size (N0). For co-cultures, the culture was made up of equal proportions of each strain, with the different strains being distinguished by resistance to the antibiotic chloramphenicol (Cm). The mean N0 of the strain having its mutation rate estimated was approximately 7700 (range 4200 – 9850), with the co-cultured strain having an N0 of approximately 8500 (range 4300 – 14400). All co-cultures used a total volume of 1ml. Cultures were grown for 24 hours at 37°C with shaking at 250rpm in 96 deep-well plates. The position of each culture on a 96-well plate was chosen randomly. Nt was estimated via CFU counts on non-selective TA and TA-Cm media to distinguish between the two strains. Evaporation (routinely monitored by weighing the plate before and after the 24h incubation) was accounted for in the Nt value determined by CFU and was also considered in statistical modelling as a potential variance covariate. We obtained the observed number of mutants resistant to rifampicin, r, by plating the entirety of the remaining cultures onto solid selective TA medium that allows spontaneous mutants to form colonies. Selective media contained both rifampicin and chloramphenicol. Selective plates were incubated at 37°C for 72 hours.

5.3.3.2 Fluctuation tests for monoculture experiments Fluctuation tests were carried out as previously described (Krašovec et al. 2017). As above, strains were inoculated from frozen glycerol stock into LB and grown for ~8 hours before being transferred to DM and grown overnight. The N0 for these monocultures was approximately 4200 (range 300 – 8700). Parallel cultures were grown at various volumes (0.5 – 1.5 ml) for a specific time depending upon the experiment (ranging from 16 – 24 hours) at 37°C with shaking at 250rpm in 96 deep-well plates. Nt was determined via two methods. First, by colony forming units (CFU), where an appropriate dilution was plated on solid non-selective TA medium. Second, by net luminescence (LUM), determined using a Promega GloMax luminometer and the Promega BacTiter-Glo kit according to the manufacturer's instructions. We measured the luminescence of each culture 0.5 and 450 seconds after adding the BacTiter-Glo reagent and calculated net luminescence as LUM = luminescence(450 s) – luminescence(0.5 s). Evaporation (routinely monitored by weighing the plate before and after the 24h incubation) was accounted for in the Nt value determined by CFU and was also considered in statistical modelling as a potential variance covariate. We obtained the observed number of mutants resistant to rifampicin, r, by plating the entirety of the remaining cultures onto solid selective TA medium that allows spontaneous mutants to form colonies. Selective plates were incubated at 37°C for 48-72 hours.

5.3.4 Estimation of mutation rates To calculate the number of mutational events, m, from the observed number of mutants we employed the Ma-Sandri-Sarkar maximum-likelihood method implemented by the R package flan v0.6 (Mazoyer et al. 2017). The mutation rate per cell per generation is calculated as m divided by the final population size, Nt (as determined by CFU). The flan package also allows the fitness of mutant cells to be estimated alongside m. The mutation rate co-estimated with the fitness effect was used as the default mutation rate estimate. Because there is little power to estimate both fitness and m from the same small set of data, a second method of accounting for fitness effects was also used. The weighted median fitness value for the resistance mutation was estimated for each strain from the fitness estimates determined as above. This median fitness value for each strain was then reintroduced into the model for estimating m, thereby allowing m to be estimated with the average fitness effect specific to each strain. For monocultures, the mutation rate per cell per generation is calculated as m divided by the final population size, Nt. For co-cultures, the Nt used is that of the strain being tested, as determined via plating on chloramphenicol TA plates.
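The estimation step above can be illustrated with a minimal sketch. It is not the flan implementation: it assumes the basic Lea-Coulson model (mutants as fit as non-mutants, whole culture plated), uses the Ma-Sandri-Sarkar recursion for the mutant-count distribution, and maximises the likelihood with a simple golden-section search. The function names, the search bracket and the example counts are all hypothetical.

```python
import math


def lea_coulson_pmf(m, r_max):
    """P(r mutants) for r = 0..r_max via the Ma-Sandri-Sarkar recursion."""
    p = [math.exp(-m)]
    for r in range(1, r_max + 1):
        p.append(m / r * sum(p[i] / (r - i + 1) for i in range(r)))
    return p


def log_likelihood(m, counts):
    """Log-likelihood of m given mutant counts from parallel cultures."""
    p = lea_coulson_pmf(m, max(counts))
    return sum(math.log(p[r]) for r in counts)


def estimate_m(counts, tol=1e-9):
    """Maximum-likelihood m by golden-section search on log(m).

    The bracket [1e-4, max(counts) + 1] is an assumption of this sketch;
    if every count is zero the search simply returns the lower bound.
    """
    def f(x):
        return log_likelihood(math.exp(x), counts)

    a, b = math.log(1e-4), math.log(max(counts) + 1.0)
    g = (math.sqrt(5) - 1) / 2
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:          # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = f(c)
        else:                # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = f(d)
    return math.exp((a + b) / 2)


def mutation_rate(m, n_t):
    """Mutation rate per cell per generation: m divided by final population size."""
    return m / n_t


# Hypothetical rifampicin-resistant colony counts from ten parallel cultures.
example_counts = [0, 1, 0, 2, 5, 0, 1, 3, 0, 12]
m_hat = estimate_m(example_counts)
print("m =", m_hat, "rate =", mutation_rate(m_hat, 2.0e9))  # Nt of 2e9 is illustrative
```

Co-estimating the mutant fitness effect, as done in the thesis, changes the mutant-count distribution and is not covered by this sketch.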

5.3.5 Statistical analysis All statistical analysis was executed in R v3.5.0 (2018) using the nlme package v3.1-137 for linear mixed effects modelling (Pinheiro and Bates 2000). This enabled the inclusion within the same model of experimental factors (fixed effects), blocking effects (random effects) and factors affecting variance (heteroscedasticity) as described in Statistical models. Two factors were incorporated in each model to account for such heteroscedasticity, with models being compared by AIC to determine the best model. See the figure legends for details of factors accounting for heteroscedasticity in each specific model. In all cases the log2 mutation rate was used.
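The likelihood-ratio statistics reported in the Results carry two subscripts. A minimal sketch follows, assuming these subscripts are the parameter counts of the two nested models being compared, so that the statistic is referred to a chi-squared distribution with degrees of freedom equal to their difference. This interpretation is an assumption, although the P values quoted in Section 5.4.1 are consistent with it. The helper names are hypothetical and only the Python standard library is used.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df >= 1.

    Uses the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    for the regularised upper incomplete gamma function, with a = df / 2, y = x / 2.
    """
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # df = 1
    while a < df / 2.0 - 1e-12:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def lr_test_p(lr_stat, n_params_small, n_params_large):
    """P value for a likelihood-ratio test between two nested models."""
    return chi2_sf(lr_stat, n_params_large - n_params_small)


# Statistics quoted in Section 5.4.1: (LR, smaller model, larger model).
for lr, k_small, k_large in [(4.45, 14, 20), (0.09, 13, 14), (14.03, 7, 13)]:
    print(lr, k_small, k_large, round(lr_test_p(lr, k_small, k_large), 3))
```

The same function covers the chi-squared statistics reported with a single subscript, where that subscript is the degrees of freedom.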

5.4 Results

5.4.1 Intercellular control of Mutation rates in Escherichia coli

Here I have further tested the role of intercellular interactions in modulating mutation rate. Both wild-type and lsrK deletant (ΔlsrK) E. coli strains were grown in co-cultures with either wild-type, ΔlsrK, ΔluxS or ΔlsrR (wild-type only) strains. Fitness effects associated with resistance mutations could potentially bias the mutation rate estimates. To counteract this potential bias, fitness effects of resistance mutations were co-estimated alongside mutation rates. These mutation rate estimates were analysed against an effect of treatment (the combination of target strain and co-cultured strain), the population density of the strain being tested and their interaction. There is no significant interaction between population density and treatment (N = 147, $LR_{14,20} = 4.45$, P = 0.62; Figure 5.1), nor is there a significant effect of population density of the target strain on mutation rate overall (N = 147, $LR_{13,14} = 0.09$, P = 0.76; Figure 5.1). There is, however, a significant effect of treatment on mutation rate (N = 147, $LR_{7,13} = 14.03$, P = 0.03; Figure 5.1). This difference appears to come mainly from the difference between the ΔlsrK and wild-type strains, as there is no significant difference between any of the wild-type co-cultures, and even the wild-type/ΔlsrK target co-culture is not significantly different from a wild-type/wild-type target co-culture (Wald test that mutation rate differs from wild-type/wild-type: $t_{108} = -1.10$, P = 0.28; Figure 5.1). Instead, it is when either ΔlsrK or ΔluxS is co-cultured with ΔlsrK that a reduced mutation rate is produced (ΔlsrK: Wald test that mutation rate differs from wild-type/wild-type: $t_{108} = -2.64$, $P = 9.5 \times 10^{-3}$; ΔluxS: Wald test that mutation rate differs from wild-type/wild-type: $t_{108} = -2.35$, P = 0.021; Figure 5.1). In order to fully test whether there is a true interaction between target strain and co-cultured strain, ΔlsrR was removed from the analysis as it was only co-cultured with wild-type strains. After doing this, there is no significant interaction between target strain and co-cultured strain (N = 122, $LR_{9,11} = 5.10$, P = 0.078), and no overall effect of co-cultured strain at all (N = 122, $\chi^2_2 = 0.47$, P = 0.79). Instead, it seems the overall effect comes more from the identity of the target strain, as ΔlsrK has a significantly lower mutation rate than wild-type (N = 122, $\chi^2_1 = 11.91$, $P = 5.58 \times 10^{-4}$). Fitness effects for co-cultures are all on average costly to the organism (below 1, the black horizontal line in Figure 5.2) and are significantly affected by treatment (N = 147, $LR_{6,12} = 25.48$, $P = 3 \times 10^{-4}$) but not by the population density of the target strain (N = 147, $LR_{12,13} = 5.41 \times 10^{-4}$, P = 0.98). When ΔlsrR is removed as above to investigate the interaction between target and co-cultured strains, such an interaction is significant (N = 122, $LR_{9,11} = 17.60$, $P = 2 \times 10^{-4}$). This suggests that differences in mutation rate between treatments may in fact be due to different fitness effects in these different environments.

[Figure 5.1: box plot. Vertical axis: mutation rate ($10^{-9}$ generation$^{-1}$), logarithmic scale. Horizontal axis: co-cultured strain (ΔlsrK, ΔluxS, MG1655 for one target strain; ΔlsrK, ΔlsrR, ΔluxS, MG1655 for the other).]

Figure 5.1 Mutation rate in both wild-type and ΔlsrK Escherichia coli in the presence of different knockout strains. The ΔlsrK strain is depicted as red boxes and wild-type MG1655 as green boxes. Black horizontal lines represent the median for that treatment, with the boxes depicting the interquartile range (IQR). Whiskers extend from the box to the most extreme point within 1.5xIQR of the upper or lower quartile. Mutation rate differs by treatment identity, but seems to be more closely associated with the identity of the strain tested (see main text for details). Mutation rates were co-estimated alongside fitness effects of the resistance mutation. In addition to the identity of the target strain, the mutation rate and the mutation rate minus 2 standard deviations were included in the model as variance power functions to account for heteroscedasticity. Mutation rate to rifampicin resistance is estimated under non-selective conditions. Note the logarithmic axis for mutation rate.

[Figure 5.2: box plot. Vertical axis: fitness effect of resistance mutation, logarithmic scale. Horizontal axis: co-cultured strain (ΔlsrK, ΔluxS, MG1655 for one target strain; ΔlsrK, ΔlsrR, ΔluxS, MG1655 for the other).]

Figure 5.2 Fitness effects of resistance mutations in both wild-type and ΔlsrK Escherichia coli when grown in different co-cultures. Colours match those in Figure 5.1: red is the ΔlsrK strain and green is wild-type MG1655 Escherichia coli. Black horizontal lines in boxes represent the median for that treatment, with the boxes depicting the interquartile range (IQR). Whiskers extend from the box to the most extreme point within 1.5xIQR of the upper or lower quartile, with dots representing outliers lying more than 1.5x the IQR from the relevant quartile (lower or upper). The black horizontal line across the top of the plot represents neutral fitness effects. Note the logarithmic axis for fitness effects.

5.4.2 Role of other E. coli NUDIX genes in modulating DAMP

Another previously identified mechanism modulating DAMP is the cleansing of a mutagenic oxidised nucleotide from the intracellular nucleotide pool. This mechanism is highly conserved through all domains of life; it is therefore reasonable to hypothesise that other genes in this superfamily are also involved in modulating DAMP. Single-gene knockouts for these genes were tested for DAMP by assaying their mutation rate to rifampicin resistance at various population densities. There is significant variation between gene knockouts in both their interaction with population density (DAMP) (N = 245, $\chi^2_{11}$ = 102.05, P = $6.99 \times 10^{-17}$) and average mutation rate (N = 245, $\chi^2_{11}$ = 729.89, P = $2.11 \times 10^{-149}$) (Figure 5.3A). There is a high degree of variation between the average mutation rates of these strains (analysed with a mean-centred population density) compared to wild-type E. coli MG1655, with only one strain (ΔnudG) not differing significantly in mutation rate from wild-type MG1655 (Wald test for difference in average mutation rate to MG1655: $t_{173}$ = -1.86, P = 0.064). Interestingly, all the remaining strains exhibit an average mutation rate lower than wild-type MG1655. Conversely, the variation in DAMP between strains is less pronounced, with only two strains (ΔnudF and ΔnudJ) differing from MG1655 (MG1655 slope: -0.81 ± 0.17 SE; ΔnudF slope: -0.16 ± 0.20, Wald test for difference in DAMP to MG1655: $t_{173}$ = 3.19, p = $1.68 \times 10^{-3}$; ΔnudJ slope: -0.005 ± 0.20, Wald test for difference in DAMP to MG1655: $t_{173}$ = 4.14, p = $5.40 \times 10^{-5}$; Figure 5.3A). Indeed, when tested explicitly, ΔnudF and ΔnudJ have slopes not significantly different from 0 (Wald test that slope is different from 0: ΔnudF, N = 245, $\chi^2_1$ = 0.66, p = 0.42; ΔnudJ, N = 245, $\chi^2_1$ = $6 \times 10^{-4}$, p = 0.98). This extends the previous finding of molecular modulation of DAMP to two new Nudix hydrolase house-cleaning genes in E. coli.
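The Wald tests against a slope of zero can be approximately checked by hand from the rounded slopes and standard errors quoted above. The following is a minimal sketch, not the model actually fitted: it assumes the statistic is simply the squared ratio of estimate to standard error, referred to a chi-squared distribution with 1 degree of freedom, and the function name is illustrative only.

```python
import math

def wald_chi2_1df(slope, se):
    """Wald chi-squared statistic (1 df) and p-value for H0: slope = 0."""
    chi2 = (slope / se) ** 2
    # Upper tail of a chi-squared distribution with 1 df:
    # P(X > x) = erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Rounded slopes and standard errors as quoted in the text
for strain, slope, se in [("nudF", -0.16, 0.20), ("nudJ", -0.005, 0.20)]:
    chi2, p = wald_chi2_1df(slope, se)
    print(f"{strain}: chi2 = {chi2:.4f}, p = {p:.2f}")
```

Because the inputs are rounded, the statistics differ slightly from those reported (for example 0.64 rather than 0.66 for ΔnudF), but the p-values agree to two decimal places.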

[Figure 5.3 plot, panels A to D: mutation rate ($10^{-9}$ generation$^{-1}$) against population density (ml$^{-1}$).]

Figure 5.3 Mutation rates in Escherichia coli strains deficient in various house-cleaning genes. Mutation rates in strains from the Keio collection (Baba et al. 2006) deleted for genes encoding specific Nudix hydrolases alongside wild-type strains MG1655 and BW25113 (the Keio parental strain). These strains are: ΔnudB - Dihydroneopterin triphosphate diphosphatase; ΔnudC - NADH pyrophosphatase; ΔnudD - GDP-mannose mannosyl hydrolase; ΔnudE - ADP compounds hydrolase NudE; ΔnudF - ADP-ribose pyrophosphatase; ΔnudG - CTP pyrophosphohydrolase; ΔnudH - RNA pyrophosphohydrolase; ΔnudI - Nucleoside triphosphatase; ΔnudJ - Phosphatase; ΔnudK - GDP-mannose pyrophosphatase. Lines represent the output from the statistical model testing for variation in the association between mutation rate and population density (see main text for details). A: Mutation rate co-estimated with fitness of the resistance mutation, including all mutation rate estimates. B: Mutation rate co-estimated with fitness of the resistance mutation, excluding mutation rates where the number of mutational events, m, is below 0.3. C: Mutation rate estimated with a weighted mean of the fitness effect of the resistance mutation, including all mutation rate estimates. D: Mutation rate estimated with a weighted mean of the fitness effect of the resistance mutation, excluding mutation rates where the number of mutational events, m, falls below 0.3. Colours denote the different strains, ranging from red/orange for BW25113 and MG1655 to purple/pink for ΔnudJ/ΔnudK (see Figure 5.6 for an accurate scale of colours for the respective strains). Population density was estimated via an ATP-based assay in all plots. Calibration curve shown in Figure 5.4. Note that both scales are logarithmic.

[Figure 5.4 plot: population density (ml$^{-1}$) against net luminescence (LUM).]

Figure 5.4 Calibration curves for final population density measured by counting colony forming units (CFU), against luminescence, assayed with the BacTiter-Glo assay (LUM). Calibration curves for all strains in the Figure 5.3 plots. Note both axes are logarithmic.

It is common practice to exclude mutation rate estimates from fluctuation tests where the estimated number of mutational events, m, is below 0.3, because current estimation methods are unreliable below this threshold (Foster 2006). Such points tend to fall in the lower left-hand corner of the panel, meaning the differences in slope could be due to points below this threshold, though this is not true for wild-type E. coli (Krašovec et al. 2017). To account for this potential confounding effect of low m in the Nudix knockout strains, and to test the robustness of the results reported above, mutation rates were excluded as suggested (Foster 2006) and the analysis repeated. Despite removing these points, the results do not greatly change (Figure 5.3B). There is slightly more significant variation in average mutation rate, with all strains now significantly different from wild-type MG1655. The variation in DAMP also remains similar, again with ΔnudF and ΔnudJ having significantly different slopes compared to MG1655 (MG1655 slope: -0.87 ± 0.15 SE; Wald test that DAMP differs from MG1655: ΔnudF, slope: -0.21 ± 0.18, $t_{159}$ = 3.74, P = $2.53 \times 10^{-4}$; ΔnudJ, slope: -0.07 ± 0.17, $t_{159}$ = 4.77, P = $4.13 \times 10^{-6}$; Figure 5.3B), but additionally ΔnudI now also has a significantly different slope to MG1655 (slope: -0.40 ± 0.18, $t_{159}$ = 2.53, P = $1.23 \times 10^{-2}$). Therefore the relationship identified in Figure 5.3A is robust to the potential confounding of mutation rates estimated with m below the suggested threshold. Indeed, the exclusion of these points suggests that another gene may be involved in modulating DAMP.
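The robustness check described above amounts to dropping estimates with m below 0.3 and re-fitting the slope of mutation rate against population density on logarithmic axes. The sketch below illustrates the idea only: it uses ordinary least squares rather than the model with variance functions used in this chapter, and the data values and names (`M_THRESHOLD`, `loglog_slope`, `estimates`) are hypothetical.

```python
import math

M_THRESHOLD = 0.3  # exclusion threshold on m suggested by Foster (2006)

def loglog_slope(points):
    """Least-squares slope of log10(mutation rate) on log10(population density)."""
    xs = [math.log10(density) for density, _ in points]
    ys = [math.log10(rate) for _, rate in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical estimates: (population density per ml, m, mutation rate per generation)
estimates = [
    (5e7, 0.20, 4.0e-9),
    (1e8, 0.35, 3.5e-9),
    (2e8, 0.50, 2.5e-9),
    (5e8, 0.80, 1.6e-9),
]

all_points = [(d, r) for d, m, r in estimates]
kept_points = [(d, r) for d, m, r in estimates if m >= M_THRESHOLD]

slope_all = loglog_slope(all_points)
slope_kept = loglog_slope(kept_points)
print(f"slope with all estimates: {slope_all:.2f}")
print(f"slope with m >= {M_THRESHOLD}: {slope_kept:.2f}")
```

If the two slopes are similar, as in the analysis reported here, the negative association is unlikely to be driven by the unreliable low-m estimates.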

5.4.2.1 Fitness effects in the NUDIX gene knockouts

In the above analysis, as in that on co-cultures, fitness effects of the resistance mutation were accounted for by estimating these effects alongside the mutation rate. Fitness effects differed significantly between knockout strains, with all mutations on average being slightly deleterious (Figure 5.5), and also interacted with the population density of the culture to different degrees between strains (Effect of strain: N = 245, $\chi^2_{11}$ = 33.67, P = $4.10 \times 10^{-4}$; Strain interaction with population density: N = 245, $\chi^2_{11}$ = 29.82, P = $1.69 \times 10^{-3}$; Figure 5.6). However, as described in Chapter 4, these estimates of fitness effect potentially suffer from a lack of power in their estimation. This was again accounted for by estimating the weighted median fitness for each strain and then re-estimating the mutation rate using this new estimate of fitness. Accounting for this potential lack of power changes the results little, with both average mutation rate and DAMP still varying significantly between strains (Average mutation rate: N = 245, $\chi^2_{11}$ = 607.02, P = $4.41 \times 10^{-123}$; DAMP: N = 245, $\chi^2_{11}$ = 111.61, P = $8.73 \times 10^{-19}$; Figure 5.3C). There is now less variation in the average mutation rate between strains, with three strains (ΔnudC, ΔnudD and ΔnudG) exhibiting average mutation rates not significantly different from wild-type MG1655. For DAMP, ΔnudF, ΔnudI and ΔnudJ remain significantly different from wild-type MG1655 (Wald test that DAMP differs from MG1655: ΔnudF – $t_{173}$ = 2.81, P = $5.45 \times 10^{-3}$; ΔnudI – $t_{173}$ = 2.05, P = $4.23 \times 10^{-2}$; ΔnudJ – $t_{173}$ = 3.79, P = $4.23 \times 10^{-3}$; Figure 5.3C). Again, removing points with m less than 0.3 reveals another gene potentially involved in modulating DAMP (Figure 5.3D), though, interestingly, in the opposite direction to those previously reported (i.e. deletion steepens the DAMP slope). This gene is nudB, whose deletion strain has a slope of -1.02 (± 0.11 SE; Wald test that DAMP differs significantly from MG1655: $t_{158}$ = -3.29, P = $1.23 \times 10^{-3}$). This suggests that, whilst fitness effects and less reliable mutation rate estimates may affect DAMP slightly, they do not greatly change the overall result, as the results are qualitatively similar whether or not these factors are accounted for (Figure 5.3).

[Figure 5.5 plot: fitness effect of resistance mutation for each Escherichia coli strain (BW25113, MG1655, ΔnudB to ΔnudK).]

Figure 5.5 Fitness effects of resistance mutations in both Escherichia coli wild-type and strains deficient in house-cleaning genes. Fitness effects of the mutation conferring resistance to the antibiotic rifampicin were estimated alongside the mutation rate. Black horizontal lines in boxes represent the median for that treatment, with the boxes depicting the interquartile range. The black vertical line across the top of the plot represents neutral fitness effects. Colours denoted here match the colours for the specific strains in all other plots within this chapter. Note the logarithmic axis for fitness effects.

[Figure 5.6 plot: fitness effect of resistance mutation against population density (ml$^{-1}$).]

Figure 5.6 Interaction between fitness effects of the resistance mutation and the population density in both Escherichia coli wild-type and strains deficient in house-cleaning genes. Fitness effects of the mutation conferring resistance to the antibiotic rifampicin were estimated alongside the mutation rate and plotted against final population density, estimated via an ATP-based assay. There is a significant overall negative association between fitness and population density (N = 245, $\chi^2_{11}$ = 9.92, P = $1.63 \times 10^{-3}$), and this association differs significantly between strains (see main text). Colours denote different strains as noted in Figure 5.6. Note both axes are logarithmic.

5.4.3 Mutation rates at different points in the culture cycle

DAMP associates mutation rate with the final population density of a culture; however, the population density changes over the course of the culture. This could mean that the mutation rate and DAMP vary at different points in the culture cycle. To investigate this, cultures of E. coli were grown for different culture lengths in separate experiments: 16, 18, 20, 22 and 24 hours. There is an overall association of population density and mutation rate across all culture lengths (N = 60, $LR_{9,10}$ = 40.91, P = $6.81 \times 10^{-9}$; Figure 5.7A). Whilst culture length affects the average mutation rate (N = 60, $LR_{6,10}$ = 25.17, P = $4.64 \times 10^{-5}$; Figure 5.7A), it does not significantly affect the degree of DAMP (N = 60, $LR_{10,14}$ = 3.45, P = 0.49; Figure 5.7A). This variation in average mutation rate is due to the change in the number of mutational events, m, and population size over the course of the culture cycle (Figure 5.7B and C). Mutation rate decreases between 16 hours and 18 hours due to an increase in final population size, with a much smaller increase in m over the same period. DAMP does not differ between these two time points, though due to the small population density range at 16 hours compared to the other time points, the slope at 16 hours should be treated with caution. Mutation rate then increases from 18 hours to 22 hours due to an increase in m, with population size not increasing to the same degree. There is a final decrease in average mutation rate from 22 hours to 24 hours due to a reduction in both m and population size. This shows that DAMP is prevalent across the growth cycle of E. coli and highlights the interaction between changes in m and population size that sets the population’s mutation rate at different time points.
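The interplay between m and population size can be illustrated with the standard fluctuation-test relation in which the mutation rate per generation is taken as m divided by the final population size. The values below are hypothetical and chosen only to reproduce the qualitative pattern described above; they are not the data from this chapter, and the names are illustrative.

```python
def mutation_rate(m, n_final):
    """Mutation rate per cell per generation as m divided by final population size."""
    return m / n_final

# Hypothetical values: culture length (hours) -> (m, final population size)
time_course = {
    16: (1.0, 2.0e7),
    18: (1.2, 6.0e7),   # population size rises much more than m: rate falls
    22: (2.5, 8.0e7),   # m rises more than population size: rate rises
    24: (2.0, 7.5e7),   # m falls proportionally more than population size: rate falls
}

rates = {hours: mutation_rate(m, n) for hours, (m, n) in time_course.items()}
for hours in sorted(rates):
    print(f"{hours} h: {rates[hours]:.2e} per generation")
```

The direction of change in mutation rate between time points therefore depends on which of m and population size changes proportionally more.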

[Figure 5.7 plot. A: mutation rate ($10^{-9}$ generation$^{-1}$) against population density (ml$^{-1}$). B and C: number of mutational events and total number of cells against length of culture (hours).]

Figure 5.7 Changes in DAMP, number of mutational events and population density at different time points in the growth cycle. A: DAMP occurs at all time points tested (16 – 24 hrs), with no change in degree (see main text for details). Colours denote the different time points: red – 16 hrs, olive – 18 hrs, green – 20 hrs, blue – 22 hrs, pink – 24 hrs. Solid lines are the DAMP slopes from the model. Mutation rate was estimated without the fitness effects of the mutation. Population density was estimated using direct CFU counts. Note both axes are logarithmic. B: Change in the number of mutational events at different time points in the growth cycle. Colours denote the different glucose environments used: blue – low glucose (80 mg L$^{-1}$), red – intermediate glucose (125 mg L$^{-1}$), green – high glucose (250 mg L$^{-1}$). The number of mutational events was co-estimated with the fitness effects of the mutation. Points are means from 3 separate estimates, with error bars representing the standard error of the mean. C: Change in the final population size at different time points in the growth cycle. Colours denote the different glucose environments used: blue – low glucose (80 mg L$^{-1}$), red – intermediate glucose (125 mg L$^{-1}$), green – high glucose (250 mg L$^{-1}$). Population size was estimated from direct CFU counts. Points are means from 3 separate estimates, with error bars representing the standard error of the mean.

5.4.4 Intracellular nucleotide pool concentrations at different time points and their effect on mutation rate

Changes in intracellular nucleotide pools have already been shown to modulate DAMP (Figure 5.3 and Krašovec et al. 2017), and these pools alter with the physiological state of the cell (Buckstein et al. 2008). To determine the effect any such changes in this pool have on the mutation rate, estimates of the level of ATP in the cell, obtained with an ATP-based assay, were tested against mutation rate. This analysis shows that the relationship between ATP levels within the cell and the mutation rate varies significantly between culture times ($F_{4,43}$ = 13.16, p < $1 \times 10^{-4}$, Figure 5.8). At 16 hrs there is a strong positive relationship between intracellular ATP and mutation rate (slope = 2.42 ± 0.43 SE). The relationship remains positive after 18 and 20 hrs, but with a less steep slope (slopes of 0.43 ± 0.45 and 0.33 ± 0.46 respectively). After 22 hrs the relationship changes from positive to strongly negative (-1.64 ± 0.57), before returning to a strong positive relationship after 24 hrs (1.02 ± 0.75). These results show the intricate relationship between changes in the intracellular nucleotide pool and mutation rate across the growth cycle of E. coli.

[Figure 5.8 plot: mutation rate ($10^{-9}$ generation$^{-1}$) against ATP per cell (µM).]

Figure 5.8 Relationship between intracellular ATP levels and mutation rate at different points in the growth curve. Mutation rate is positively related with ATP levels per cell after 16 to 20 hrs of incubation (see main text for details), with ATP levels decreasing through these time points. This relationship then becomes negative after 22 hrs incubation, with the lowest intracellular levels of ATP. After 24 hrs this relationship becomes positive again alongside an increase in the intracellular levels of ATP. These relationships combine to produce a mutation rate minimum at approximately 0.5 µM ATP within the cell. Colours denote the different time points at which mutation rates were estimated: red – 16 hrs, yellow – 18 hrs, green – 20 hrs, purple – 22 hrs, pink – 24 hrs. ATP levels were estimated using an ATP-based assay that generates luminescence. A calibration curve of known ATP concentrations was used to generate the ATP concentration from these luminescence readings. Note both axes are logarithmic.
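The conversion from luminescence to ATP concentration via a calibration curve can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: it assumes the calibration is linear on logarithmic axes, and the standards, readings and function names (`fit_loglog`, `luminescence_to_atp`) are hypothetical.

```python
import math

def fit_loglog(xs, ys):
    """Least-squares fit of log10(y) = intercept + slope * log10(x)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def luminescence_to_atp(lum, slope, intercept):
    """Convert a luminescence reading to an ATP concentration using the fitted curve."""
    return 10 ** (intercept + slope * math.log10(lum))

# Hypothetical standards: luminescence readings for known ATP concentrations (µM)
standard_lum = [1e4, 1e5, 1e6]
standard_atp = [0.01, 0.1, 1.0]

slope, intercept = fit_loglog(standard_lum, standard_atp)
atp = luminescence_to_atp(5e5, slope, intercept)
print(f"estimated ATP: {atp:.2f} µM")
```

Fitting on logarithmic axes keeps the standards evenly weighted across several orders of magnitude, which is one reason such calibration curves are commonly plotted that way.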

5.5 Discussion

Previous experimental work on DAMP uncovered a role of intercellular signalling mediated by the highly conserved luxS gene (Krašovec et al. 2014). Here, whilst there is an increase in average mutation rate when wild-type E. coli is co-cultured with a luxS knockout strain, this difference is non-significant. This is also the case when wild-type cells are co-cultured with a knockout strain of lsrK, the gene responsible for phosphorylating the quorum sensing molecule AI-2 (Pereira et al. 2013). Interestingly, there is no significant effect of co-cultured strain on the mutation rate in wild-type E. coli. There is, however, a difference in mutation rate in ΔlsrK when co-cultured with either ΔlsrK or ΔluxS, compared to when co-cultured with wild-type cells. This is partly due to the lower average mutation rate of the ΔlsrK mutant strain compared to the wild-type strain. The fact that ΔlsrK exhibits a mutation rate similar to wild-type cells when co-cultured with wild-type cells, and greater than in ΔlsrK/ΔlsrK co-cultures, suggests that intercellular environmental signalling controls the strain’s mutation rate.

This result from the co-cultures is not consistent with the previous investigation of the upstream mechanism of luxS (Krašovec et al. 2014). Such inconsistency could be explained through two routes. First, there is an ongoing discussion of the so-called replication crisis in science, where results from published articles are not always validated when the work is repeated later by different researchers, or indeed by the original researchers. Secondly, the fluctuation test is a very intricate method for estimating mutation rates that requires excellent accuracy in technique. These co-cultures were the first experiments conducted in my PhD, and so I was still learning such techniques. This can be seen in the data, as there is a large spread in the mutation rates in all these co-cultures, much greater than was seen previously (Krašovec et al. 2014). I believe this inconsistency comes from the latter for a couple of reasons. First, the previous work not only investigated the co-culture but also supplemented the knockout with the precursor molecules in the activated methyl cycle and with AI-2, showing the former is the role of luxS. Secondly, this work was initially done alongside the researcher who originally conducted the co-cultures, replicating the relationship he had previously shown (Krašovec et al. 2014). This is supported by the fact that, as the PhD progressed, the variation in the mutation rate estimates I obtained from fluctuation tests decreased such that small differences in DAMP and mutation rate could be separated, as evidenced by the monoculture experiments within this chapter. Therefore, despite the inconsistency the results here present, the previous findings on luxS still hold, owing to the further experiments in that original paper (Krašovec et al. 2014) and experiments since reaffirming the lack of DAMP in this strain (Krašovec et al. 2018).

More recently DAMP has been shown to be modulated via the cleansing of the intracellular nucleotide pool (Krašovec et al. 2017). The results here add weight to this finding by identifying that DAMP is modulated by four other Nudix hydrolase genes involved in house-cleaning activities in the intracellular environment (Galperin et al. 2006). nudB is identified as a potential DAMP controller, where deletion increases the slope of DAMP. nudB catalyses the hydrolysis of dihydroneopterin triphosphate and nucleoside triphosphates, preferring dATP. nudB is also involved in glycogen cycling and its deletion causes high levels of glycogen to build up intracellularly. nudF deletion causes an overall reduction in the degree of DAMP exhibited. nudF is an ADP-ribose pyrophosphatase that acts on a variety of ADP sugars to produce AMP, and catalyses the limiting step in the gluconeogenic process, inhibiting carbon flow towards glycogen biosynthesis (Moreno-Bruna et al. 2001). Intracellular glycogen turnover has been proposed to act as a capacitor, where the constant recycling of glycogen controls the cell’s usage of carbon and energy during exponential growth (Belanger and Hatfull 1999). Therefore, the deletion of nudF not only affects the intracellular nucleotide pool but also affects the metabolic regulation within the cell. As both of these factors are known to affect DAMP (Krašovec et al. 2014, 2017), it is unclear which is the reason for the reduction in DAMP seen here. It is particularly suggestive that nudB and nudF affect both intracellular glycogen levels (increasing and decreasing respectively) and the degree of DAMP (steepening and flattening respectively) in opposite ways, which strongly suggests a role for glycogen/glucose cycling in modulating DAMP in E. coli.

Intriguingly, nudF also responds to intercellular signals, with greater binding to the cell membrane at higher population densities (Morán-Zorzano et al. 2008). Deletion of nudI, like that of nudF, causes a flattening of DAMP’s slope. nudI’s role in the cell is as a nucleoside triphosphatase, breaking down nucleoside triphosphates (dUTP, dCTP and dTTP) to the corresponding monophosphate (Xu et al. 2006). dUTP, nudI’s preferential substrate (Xu et al. 2006), can be particularly problematic to an organism as its incorporation in DNA causes DNA glycosylases to act and could lead to double strand breaks in DNA, undermining DNA integrity (Kouzminova and Kuzminov 2004). Thus its role in DAMP seems to be down to its role in the recycling and cleansing of the intracellular nucleotide pool. Deletion of nudJ causes an overall flattening of the DAMP slope, resulting in a slope not different from 0. nudJ encodes a phosphatase with several functions, such as in the thiamine metabolic process (Lawhorn et al. 2004) and also as a non-specific nucleoside tri- and diphosphatase, though with greatest affinity to GDP (Xu et al. 2006). Recently, a Nudix domain fused to a thiamine biosynthesis gene (Tnr3) in Schizosaccharomyces pombe was shown to act in a similar manner to E. coli nudJ (Goyer et al. 2013). In addition to its hydrolytic role in thiamine synthesis, this yeast Nudix domain on Tnr3 also acted as a metabolite proof-reader (Linster et al. 2013; Van Schaftingen et al. 2013), by correcting an error made by a metabolic enzyme (Goyer et al. 2013). Therefore, like nudF, the role of nudJ in modulating DAMP may be a combination of both its metabolic proof-reading mechanisms and its role in altering the composition of the intracellular nucleotide pool.

Whilst mutation rate is affected by aspects of the growth curve, such as stationary phase (Loewe et al. 2003; Kivisaar 2010) and growth rate (Nishimura et al. 2017; Maharjan and Ferenci 2018), DAMP is not, as the degree of DAMP remains the same across multiple time points. Over the large range of densities estimated there is a strong overall negative association between population density and mutation rate. The average mutation rate does differ through the culture cycle due to the changes in population density and number of mutational events across the growth cycle. The number of mutational events slowly increases through the growth cycle before a large increase in early stationary phase, due to a corresponding increase in population size. It is therefore interesting to see that during the fastest growth period there is such a low increase in the number of mutational events, perhaps contrary to what would be expected a priori.

Since several genes have been identified that affect the intracellular nucleotide pool, this change could also be associated with available nucleotides. Specifically, this increase could be due to a reduction in the levels of ATP in the intracellular nucleotide pool through the growth cycle, with cells in early stationary phase possessing the lowest levels of ATP. These cells would have used up a large proportion of their ATP during growth and so have lower energy levels to carry out house-keeping or metabolic functions. This is supported by the associated decrease in mutation rate at 24 hours with an increased level of intracellular ATP. Despite such interactions with mutation rate, the intracellular levels of dATP do not obviously affect DAMP. dATP levels are modulated intracellularly via hydrolysis by nudB. As nudB’s effect on DAMP was only revealed later through further examination of the data, this perhaps suggests that intracellular dATP levels affect mutation rate but not DAMP.

The realisation that DAMP is prevalent across phases of the growth cycle suggests that its mechanisms are fundamental to the overall cycle of the cell and generic processes of growth, rather than down to genes preferentially expressed at a specific point in the growth cycle. Additionally, DAMP is modulated by a highly conserved family of genes responsible for cleansing the intracellular environment of metabolic substrates common to all forms of life (McLennan 2006). The combination of such findings adds further support to the hypothesis (Krašovec et al. 2017) that DAMP is a widespread, ancient phenotypic trait pervasive across the tree of life. The interplay between intracellular molecules and their cycling dynamics on either mutation rate or DAMP, or indeed both, requires further exploration through the generation of models that can accurately encompass the kinetics of the genes identified here and by delving down to the single cell level. Nevertheless, the potential environmental control of mutation rates could pose a novel strategy in fighting the growing threat of antibiotic resistant bacteria.

5.6 References

Baba, T., T. Ara, M. Hasegawa, Y. Takai, Y. Okumura, M. Baba, K. A. Datsenko, M. Tomita, B. L. Wanner, and H. Mori. 2006. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol. Syst. Biol. 2:2006.0008. EMBO Press.

Belanger, A. E., and G. F. Hatfull. 1999. Exponential-phase glycogen recycling is essential for growth of Mycobacterium smegmatis. J. Bacteriol. 181:6670–6678. American Society for Microbiology.

Buckstein, M. H., J. He, and H. Rubin. 2008. Characterization of nucleotide pools as a function of physiological state in Escherichia coli. J. Bacteriol. 190:718–26. American Society for Microbiology.

Dunn, C. A., S. F. O’Handley, D. N. Frick, and M. J. Bessman. 1999. Studies on the ADP-ribose pyrophosphatase subfamily of the Nudix hydrolases and tentative identification of trgB, a gene associated with tellurite resistance. J. Biol. Chem. 274:32318–32324. American Society for Biochemistry and Molecular Biology.

Elena, S. F., and J. A. G. de Visser. 2003. Environmental stress and the effects of mutation. J. Biol. 2:12. BioMed Central.

Foster, P. 2007. Stress-induced mutagenesis in bacteria. Crit. Rev. Biochem. Mol. Biol. 42:373–397.

Foster, P. L. 2006. Methods for determining spontaneous mutation rates. Methods Enzymol. 409:195–213.

Galperin, M. Y., O. V. Moroz, K. S. Wilson, and A. G. Murzin. 2006. House cleaning, a part of good housekeeping. Mol. Microbiol. 59:5–19. Wiley/Blackwell.

Goyer, A., G. Hasnain, O. Frelin, M. A. Ralat, J. F. Gregory, and A. D. Hanson. 2013. A cross-kingdom Nudix enzyme that pre-empts damage in thiamin metabolism. Biochem. J. 454:533–42. Portland Press Limited.

Halliday, N. M., K. R. Hardie, P. Williams, K. Winzer, and D. A. Barrett. 2010. Quantitative liquid chromatography-tandem mass spectrometry profiling of activated methyl cycle metabolites involved in LuxS-dependent quorum sensing in Escherichia coli. Anal. Biochem. 403:20–29.

Kivisaar, M. 2010. Mechanisms of stationary-phase mutagenesis in bacteria: mutational processes in pseudomonads. FEMS Microbiol. Lett. 312:1–14. Wiley/Blackwell.

Kouzminova, E. A., and A. Kuzminov. 2004. Chromosomal fragmentation in dUTPase-deficient mutants of Escherichia coli and its recombinational repair. Mol. Microbiol. 51:1279–1295. Wiley/Blackwell.

Krašovec, R., R. V. Belavkin, J. A. D. Aston, A. Channon, E. Aston, B. M. Rash, M. Kadirvel, S. Forbes, and C. G. Knight. 2014. Mutation rate plasticity in rifampicin resistance depends on Escherichia coli cell-cell interactions. Nat. Commun. 5:3742.

Krašovec, R., H. Richards, D. R. Gifford, R. V. Belavkin, A. Channon, E. Aston, A. J. McBain, and C. G. Knight. 2018. Opposing effects of population density and stress on Escherichia coli mutation rate. ISME J. In press.

Krašovec, R., H. Richards, D. R. Gifford, C. Hatcher, K. J. Faulkner, R. V. Belavkin, A. Channon, E. Aston, A. J. McBain, and C. G. Knight. 2017. Spontaneous mutation rate is a plastic trait associated with population density across domains of life. PLoS Biol. 15:e2002731.

Lawhorn, B. G., S. Y. Gerdes, and T. P. Begley. 2004. A genetic screen for the identification of thiamin metabolic genes. J. Biol. Chem. 279:43555–9. American Society for Biochemistry and Molecular Biology.

Lenski, R. E., M. R. Rose, S. C. Simpson, and S. C. Tadler. 1991. Long-Term Experimental Evolution in Escherichia coli. I. Adaptation and Divergence During 2,000 Generations. Am. Nat. 138:1315–1341.

Li, J., C. Attila, L. Wang, T. K. Wood, J. J. Valdes, and W. E. Bentley. 2007. Quorum sensing in Escherichia coli is signaled by AI-2/LsrR: Effects on small RNA and biofilm architecture. J. Bacteriol. 189:6011–6020.

Linster, C. L., E. Van Schaftingen, and A. D. Hanson. 2013. Metabolite damage and its repair or pre-emption. Nature Publishing Group.

Loewe, L., V. Textor, and S. Scherer. 2003. High Deleterious Genomic Mutation Rate in Stationary Phase of Escherichia coli. American Association for the Advancement of Science.

Lynch, M., M. S. Ackerman, J.-F. Gout, H. Long, W. Sung, W. K. Thomas, and P. L. Foster. 2016. Genetic drift, selection and the evolution of the mutation rate. Nat. Rev. Genet. 17:704–714.

MacLean, R. C., C. Torres-Barceló, and R. Moxon. 2013. Evaluating evolutionary models of stress-induced mutagenesis in bacteria. Nat. Rev. Genet. 14:221–227.

Maharjan, R. P., and T. Ferenci. 2018. The impact of growth rate and environmental factors on mutation rates and spectra in Escherichia coli. Environ. Microbiol. Rep., doi: 10.1111/1758-2229.12661. Wiley/Blackwell.

Matic, I. 2016. Molecular mechanisms involved in the regulation of mutation rates in bacteria.

Mazoyer, A., R. Drouilhet, S. Despréaux, and B. Ycart. 2017. flan: An R Package for Inference on Mutation Models. R J. 9:334–351.

McLennan, A. G. 2013. Substrate ambiguity among the nudix hydrolases: biologically significant, evolutionary remnant, or both? Cell. Mol. Life Sci. 70:373–385. SP Birkhäuser Verlag Basel.

McLennan, A. G. 2006. The Nudix hydrolase superfamily. Cell. Mol. Life Sci. 63:123–

143.

Michaels, M. L., and J. H. Miller. 1992. The GO System Protects Organisms from the

Mutagenic Effect of the Spontaneous Lesion 8-Hydroxyguanine (7,8-Dihydro-8-

Oxoguanine). J. BACrERIOLOGY 174:6321–6325.

Morán-Zorzano, M. T., M. Montero, F. J. Muñoz, N. Alonso-Casajús, A. M. Viale, G.

Eydallin, M. T. Sesma, E. Baroja-Fernández, and J. Pozueta-Romero. 2008.

Cytoplasmic Escherichia coli ADP sugar pyrophosphatase binds to cell membranes

in response to extracellular signals as the cell population density increases. FEMS

Microbiol. Lett. 288:25–32. Oxford University Press.

Moreno-Bruna, B., E. Baroja-Fernandez, F. J. Munoz, A. Bastarrica-Berasategui, A.

Zandueta-Criado, M. Rodriguez-Lopez, I. Lasa, T. Akazawa, and J. Pozueta-Romero.

2001. Adenosine diphosphate sugar pyrophosphatase prevents glycogen

biosynthesis in Escherichia coli. Proc. Natl. Acad. Sci. 98:8128–8132.

Nishimura, I., M. Kurokawa, L. Liu, and B.-W. Ying. 2017. Coordinated Changes in

Mutation and Growth Rates Induced by Genome Reduction. MBio 8:e00676-17.

Pereira, C. S., J. A. Thompson, and K. B. Xavier. 2013. AI-2-mediated signalling in

bacteria. FEMS Microbiol. Rev. 37:156–81.

Pinheiro, J., and D. Bates. 2000. Mixed effects models in S and S-Plus. Springer, New

York.

Saint-Ruf, C., and I. Matic. 2006. Environmental tuning of mutation rates. Environ.

Microbiol. 8:193–9.

Sung, W., M. S. Ackerman, S. F. Miller, T. G. Doak, and M. Lynch. 2012. Drift-barrier

hypothesis and mutation-rate evolution. Proc. Natl. Acad. Sci. 109:18488–18492.

R Core Team. 2018. R: A language and environment for statistical computing. R Foundation

for Statistical Computing, Vienna.

Van Schaftingen, E., R. Rzem, A. Marbaix, F. Collard, M. Veiga-Da-Cunha, and C. L.

Linster. 2013. Metabolite proofreading, a neglected aspect of intermediary

metabolism. J. Inherit. Metab. Dis. 36:427–434.

Xavier, K. B., and B. L. Bassler. 2005. Regulation of uptake and processing of the

quorum-sensing autoinducer AI-2 in Escherichia coli. J. Bacteriol. 187:238–48.

Xu, W., C. A. Dunn, S. F. O’Handley, D. L. Smith, and M. J. Bessman. 2006. Three new

Nudix hydrolases from Escherichia coli. J. Biol. Chem. 281:22794–22798.

5.7 Appendix

5.7.1 Statistical model outputs

5.7.1.1 Model output for Figure 5.3A

Response: log2 mutation rate

Predictors                                            Estimates  CI               p
(Intercept)                                           4.91       4.42 – 5.41      <0.001
Mean-centred calibrated population density (CalCent)  -0.81      -1.15 – -0.47    <0.001
BW25113                                               -0.52      -0.87 – -0.16    0.005
nudB                                                  -1.18      -1.63 – -0.73    <0.001
nudC                                                  -0.72      -1.37 – -0.07    0.029
nudD                                                  -0.49      -0.94 – -0.05    0.031
nudE                                                  -0.61      -1.04 – -0.18    0.006
nudF                                                  -1.42      -1.86 – -0.98    <0.001
nudG                                                  -0.44      -0.90 – 0.03     0.064
nudH                                                  -2.74      -3.55 – -1.93    <0.001
nudI                                                  -0.84      -1.30 – -0.37    <0.001
nudJ                                                  -2.32      -2.74 – -1.89    <0.001
nudK                                                  -0.88      -1.32 – -0.45    <0.001
CalCent:StrainBW25113                                 0.00       -0.36 – 0.36     0.987
CalCent:StrainnudB                                    -0.07      -0.53 – 0.38     0.751
CalCent:StrainnudC                                    -0.17      -0.60 – 0.27     0.450
CalCent:StrainnudD                                    -0.02      -0.43 – 0.38     0.910
CalCent:StrainnudE                                    0.20       -0.21 – 0.60     0.342
CalCent:StrainnudF                                    0.65       0.25 – 1.05      0.002
CalCent:StrainnudG                                    0.09       -0.35 – 0.52     0.689
CalCent:StrainnudH                                    -0.18      -0.63 – 0.27     0.431
CalCent:StrainnudI                                    0.39       -0.02 – 0.80     0.064
CalCent:StrainnudJ                                    0.81       0.42 – 1.19      <0.001
CalCent:StrainnudK                                    -0.01      -0.40 – 0.38     0.964
Observations                                          245
R2 / Omega-squared                                    0.741 / 0.741

Population density was estimated via an ATP-based assay. Mutation rate was co-estimated with the fitness effect of the resistance mutation. Model was run with strain MG1655 as a reference. The Intercept and CalCent values represent the average mutation rate estimate and DAMP slope estimate respectively for MG1655.
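Because MG1655 is the reference level, the DAMP slope of any other strain is the CalCent estimate plus that strain's CalCent:Strain interaction term. The following is a minimal sketch of that arithmetic in Python, using the point estimates from the model output for Figure 5.3A. The function and variable names are illustrative assumptions, not part of the original analysis, and the calculation ignores the uncertainty expressed by the confidence intervals.

```python
# Point estimates copied from the model output for Figure 5.3A.
# CalCent is the DAMP slope of the reference strain, MG1655.
REFERENCE_SLOPE = -0.81

# CalCent:Strain interaction terms: difference in slope relative to MG1655.
SLOPE_OFFSETS = {
    "BW25113": 0.00, "nudB": -0.07, "nudC": -0.17, "nudD": -0.02,
    "nudE": 0.20, "nudF": 0.65, "nudG": 0.09, "nudH": -0.18,
    "nudI": 0.39, "nudJ": 0.81, "nudK": -0.01,
}


def damp_slopes(reference_slope, offsets):
    """Return the implied DAMP slope for the reference and every other strain."""
    slopes = {"MG1655": reference_slope}
    for strain, offset in offsets.items():
        # Rounded to the two decimal places reported in the table.
        slopes[strain] = round(reference_slope + offset, 2)
    return slopes


slopes = damp_slopes(REFERENCE_SLOPE, SLOPE_OFFSETS)
for strain, slope in slopes.items():
    print(f"{strain:8s} {slope:+.2f}")
```

Read this way, the strains with significant positive interaction terms (nudF and nudJ) are those whose implied slope is closest to zero, that is, those with the weakest DAMP.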

5.7.1.2 Model output for Figure 5.3B

Response: log2 mutation rate

Predictors                                            Estimates  CI               p
(Intercept)                                           5.04       4.57 – 5.51      <0.001
Mean-centred calibrated population density (CalCent)  -0.87      -1.16 – -0.57    <0.001
BW25113                                               -0.56      -0.87 – -0.25    0.001
nudB                                                  -1.23      -1.62 – -0.85    <0.001
nudC                                                  -0.92      -1.48 – -0.36    0.001
nudD                                                  -0.58      -0.97 – -0.20    0.003
nudE                                                  -0.69      -1.06 – -0.32    <0.001
nudF                                                  -1.45      -1.83 – -1.06    <0.001
nudG                                                  -0.49      -0.90 – -0.09    0.017
nudH                                                  -3.00      -3.72 – -2.29    <0.001
nudI                                                  -0.89      -1.30 – -0.49    <0.001
nudJ                                                  -2.43      -2.79 – -2.07    <0.001
nudK                                                  -0.94      -1.31 – -0.56    <0.001
CalCent:StrainBW25113                                 0.04       -0.27 – 0.36     0.797
CalCent:StrainnudB                                    -0.20      -0.59 – 0.20     0.325
CalCent:StrainnudC                                    -0.14      -0.51 – 0.23     0.465
CalCent:StrainnudD                                    0.05       -0.30 – 0.40     0.772
CalCent:StrainnudE                                    0.25       -0.10 – 0.60     0.168
CalCent:StrainnudF                                    0.66       0.31 – 1.01      <0.001
CalCent:StrainnudG                                    0.10       -0.28 – 0.48     0.600
CalCent:StrainnudH                                    -0.22      -0.62 – 0.17     0.269
CalCent:StrainnudI                                    0.46       0.10 – 0.83      0.012
CalCent:StrainnudJ                                    0.80       0.47 – 1.12      <0.001
CalCent:StrainnudK                                    0.01       -0.33 – 0.34     0.975
Observations                                          231
R2 / Omega-squared                                    0.762 / 0.756

Population density was estimated via an ATP-based assay. Mutation rate was co-estimated with the fitness effect of the resistance mutation. Model was run with strain MG1655 as a reference. The Intercept and CalCent values represent the average mutation rate estimate and DAMP slope estimate respectively for MG1655. Mutation rates with an estimated number of mutational events below 0.3 were excluded from the analysis.

5.7.1.3 Model output for Figure 5.3C

Response: log2 mutation rate

Predictors                                            Estimates  CI               p
(Intercept)                                           4.65       4.23 – 5.07      <0.001
Mean-centred calibrated population density (CalCent)  -0.66      -0.96 – -0.36    <0.001
BW25113                                               -0.49      -0.79 – -0.19    0.002
nudB                                                  -1.24      -1.62 – -0.87    <0.001
nudC                                                  -0.47      -0.98 – 0.04     0.071
nudD                                                  -0.26      -0.63 – 0.11     0.163
nudE                                                  -0.42      -0.78 – -0.06    0.022
nudF                                                  -1.04      -1.42 – -0.66    <0.001
nudG                                                  -0.35      -0.73 – 0.03     0.073
nudH                                                  -2.30      -2.95 – -1.66    <0.001
nudI                                                  -0.64      -1.03 – -0.25    0.002
nudJ                                                  -2.08      -2.43 – -1.72    <0.001
nudK                                                  -0.71      -1.08 – -0.34    <0.001
CalCent:StrainBW25113                                 -0.03      -0.34 – 0.29     0.864
CalCent:StrainnudB                                    -0.28      -0.66 – 0.10     0.145
CalCent:StrainnudC                                    -0.14      -0.52 – 0.23     0.452
CalCent:StrainnudD                                    -0.13      -0.48 – 0.21     0.450
CalCent:StrainnudE                                    0.23       -0.14 – 0.60     0.219
CalCent:StrainnudF                                    0.51       0.15 – 0.86      0.005
CalCent:StrainnudG                                    -0.07      -0.43 – 0.30     0.719
CalCent:StrainnudH                                    -0.05      -0.42 – 0.33     0.812
CalCent:StrainnudI                                    0.37       0.01 – 0.73      0.042
CalCent:StrainnudJ                                    0.65       0.31 – 0.98      <0.001
CalCent:StrainnudK                                    -0.03      -0.37 – 0.32     0.884
Observations                                          245
R2 / Omega-squared                                    0.749 / 0.748

Population density was estimated via an ATP-based assay. Mutation rate was co-estimated with the weighted median fitness estimate of the resistance mutation. This was obtained by taking the fitness estimates co-estimated with the mutation rate and calculating the weighted median of these values. Model was run with strain MG1655 as a reference. The Intercept and CalCent values represent the average mutation rate estimate and DAMP slope estimate respectively for MG1655.

5.7.1.4 Model output for Figure 5.3D

Response: log2 mutation rate

Predictors                                            Estimates  CI               p
(Intercept)                                           4.67       4.28 – 5.06      <0.001
Mean-centred calibrated population density (CalCent)  -0.58      -0.84 – -0.32    <0.001
BW25113                                               -0.51      -0.77 – -0.25    <0.001
nudB                                                  -1.17      -1.49 – -0.85    <0.001
nudC                                                  -0.53      -0.96 – -0.10    0.016
nudD                                                  -0.27      -0.58 – 0.04     0.090
nudE                                                  -0.41      -0.71 – -0.10    0.009
nudF                                                  -0.91      -1.23 – -0.58    <0.001
nudG                                                  -0.30      -0.63 – 0.02     0.070
nudH                                                  -2.35      -2.90 – -1.80    <0.001
nudI                                                  -0.57      -0.91 – -0.23    0.001
nudJ                                                  -2.08      -2.38 – -1.78    <0.001
nudK                                                  -0.64      -0.96 – -0.32    <0.001
CalCent:StrainBW25113                                 -0.08      -0.36 – 0.20     0.566
CalCent:StrainnudB                                    -0.44      -0.77 – -0.12    0.008
CalCent:StrainnudC                                    -0.21      -0.54 – 0.11     0.202
CalCent:StrainnudD                                    -0.22      -0.51 – 0.08     0.149
CalCent:StrainnudE                                    0.21       -0.12 – 0.54     0.204
CalCent:StrainnudF                                    0.36       0.05 – 0.66      0.024
CalCent:StrainnudG                                    -0.18      -0.49 – 0.13     0.261
CalCent:StrainnudH                                    -0.12      -0.44 – 0.21     0.468
CalCent:StrainnudI                                    0.35       0.04 – 0.66      0.026
CalCent:StrainnudJ                                    0.60       0.31 – 0.89      <0.001
CalCent:StrainnudK                                    -0.11      -0.41 – 0.18     0.453
Observations                                          230
R2 / Omega-squared                                    0.768 / 0.766

Population density was estimated via an ATP-based assay. Mutation rate was co-estimated with the weighted median fitness estimate of the resistance mutation. This was obtained by taking the fitness estimates co-estimated with the mutation rate and calculating the weighted median of these values. Model was run with strain MG1655 as a reference. The Intercept and CalCent values represent the average mutation rate estimate and DAMP slope estimate respectively for MG1655. Mutation rates with an estimated number of mutational events below 0.3 were excluded from the analysis.

5.7.1.5 Model output for Figure 5.4

Response: log2 CFU population density

Predictors                                      Estimates  CI               p
(Intercept)                                     26.83      26.49 – 27.18    <0.001
Mean-centred Net Luminescence                   1.07       0.85 – 1.30      <0.001
BW25113                                         0.52       0.33 – 0.72      <0.001
nudB                                            0.26       -0.05 – 0.57     0.101
nudC                                            0.35       -0.25 – 0.95     0.247
nudD                                            0.44       0.16 – 0.72      0.002
nudE                                            0.46       0.19 – 0.74      0.001
nudF                                            0.39       0.12 – 0.66      0.005
nudG                                            0.49       0.20 – 0.77      0.001
nudH                                            -1.92      -2.53 – -1.32    <0.001
nudI                                            0.36       0.05 – 0.66      0.021
nudJ                                            0.15       -0.16 – 0.46     0.333
nudK                                            0.38       0.09 – 0.66      0.009
Mean-centred Net Luminescence:StrainBW25113     -0.14      -0.37 – 0.10     0.266
Mean-centred Net Luminescence:StrainnudB        -0.60      -0.86 – -0.34    <0.001
Mean-centred Net Luminescence:StrainnudC        -0.23      -0.51 – 0.06     0.114
Mean-centred Net Luminescence:StrainnudD        -0.38      -0.63 – -0.13    0.003
Mean-centred Net Luminescence:StrainnudE        -0.21      -0.47 – 0.04     0.099
Mean-centred Net Luminescence:StrainnudF        -0.42      -0.66 – -0.18    0.001
Mean-centred Net Luminescence:StrainnudG        -0.53      -0.78 – -0.29    <0.001
Mean-centred Net Luminescence:StrainnudH        0.12       -0.22 – 0.46     0.485
Mean-centred Net Luminescence:StrainnudI        -0.20      -0.46 – 0.06     0.129
Mean-centred Net Luminescence:StrainnudJ        -0.13      -0.39 – 0.14     0.350
Mean-centred Net Luminescence:StrainnudK        -0.30      -0.54 – -0.05    0.017
Observations                                    723
R2 / Omega-squared                              0.841 / 0.841

Calibration model between population density estimated via CFU and the net luminescence of that culture. Model was run with MG1655 as a reference.
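A calibration model of this form can be turned into a strain-specific prediction of population density from a luminescence reading. The sketch below shows the calculation for three strains using the point estimates reported for Figure 5.4. It assumes the luminescence value is supplied on the same transformed, mean-centred scale that was used when fitting the model, which this appendix does not state; the function and dictionary names are hypothetical.

```python
# Point estimates copied from the model output for Figure 5.4 (MG1655 reference).
INTERCEPT = 26.83   # log2 CFU density of MG1655 at mean luminescence
LUM_SLOPE = 1.07    # change in log2 CFU density per unit centred luminescence

# Strain main effects and luminescence:strain interaction terms (subset only).
STRAIN_INTERCEPT = {"MG1655": 0.0, "BW25113": 0.52, "nudH": -1.92}
STRAIN_SLOPE = {"MG1655": 0.0, "BW25113": -0.14, "nudH": 0.12}


def calibrated_log2_density(strain, centred_luminescence):
    """Predict log2 CFU population density for one strain.

    `centred_luminescence` is assumed to be on the scale used to fit the model.
    """
    intercept = INTERCEPT + STRAIN_INTERCEPT[strain]
    slope = LUM_SLOPE + STRAIN_SLOPE[strain]
    return intercept + slope * centred_luminescence


for strain in STRAIN_INTERCEPT:
    print(strain, round(calibrated_log2_density(strain, 1.0), 2))
```

The large negative main effect for nudH means that the same luminescence reading corresponds to a markedly lower CFU density in that strain, which is why a per-strain calibration is used rather than a single pooled curve.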

5.7.1.6 Model output for Figure 5.6

Response: log2 fitness estimate of the resistance mutation

Predictors                                            Estimates  CI               p
(Intercept)                                           -0.59      -1.25 – 0.08     0.082
CalCent                                               -0.14      -0.85 – 0.57     0.696
BW25113                                               0.14       -0.55 – 0.82     0.696
nudB                                                  -0.31      -1.16 – 0.55     0.483
nudC                                                  0.13       -0.63 – 0.89     0.738
nudD                                                  -0.19      -1.13 – 0.74     0.685
nudE                                                  -0.03      -0.80 – 0.74     0.934
nudF                                                  -0.49      -1.30 – 0.32     0.236
nudG                                                  0.40       -0.37 – 1.17     0.309
nudH                                                  0.55       -0.94 – 2.03     0.469
nudI                                                  -0.22      -1.09 – 0.64     0.611
nudJ                                                  -0.28      -1.11 – 0.56     0.514
nudK                                                  0.13       -0.62 – 0.88     0.733
StrainBW25113:CalCent                                 0.42       -0.32 – 1.16     0.262
StrainnudB:CalCent                                    0.33       -0.73 – 1.38     0.541
StrainnudC:CalCent                                    -0.11      -0.91 – 0.70     0.797
StrainnudD:CalCent                                    0.18       -0.79 – 1.16     0.713
StrainnudE:CalCent                                    0.53       -0.29 – 1.36     0.201
StrainnudF:CalCent                                    0.50       -0.37 – 1.36     0.260
StrainnudG:CalCent                                    -0.33      -1.19 – 0.53     0.457
StrainnudH:CalCent                                    2.05       0.82 – 3.28      0.001
StrainnudI:CalCent                                    0.58       -0.41 – 1.56     0.248
StrainnudJ:CalCent                                    0.30       -0.59 – 1.18     0.508
StrainnudK:CalCent                                    0.46       -0.32 – 1.24     0.248
Observations                                          245
R2 / Omega-squared                                    0.215 / 0.215

Model investigating the relationship between fitness of the resistance mutation and population density estimated via an ATP-based assay. Calibrated population density was mean-centred for the analysis (CalCent). Model was run with strain MG1655 as a reference.

5.7.1.7 Model output for Figure 5.7A

Response: log2 mutation rate

Predictors                        Estimates  CI               p
(Intercept)                       3.51       2.96 – 4.06      <0.001
Mean-centred population density   -0.89      -1.03 – -0.75    <0.001
18 h                              -0.07      -0.94 – 0.79     0.844
20 h                              0.42       -3.92 – 4.76     0.434
22 h                              1.82       0.84 – 2.79      0.004
24 h                              0.93       -3.48 – 5.33     0.228
Observations                      60
R2 / Omega-squared                0.881 / 0.880

Model investigating variation in DAMP between different culture lengths of experiment. Model was run with reference to the 16 hour culture. Mutation rate was estimated without co-estimating the fitness effect of the mutation.

5.7.1.8 Model output for Figure 5.8

Response: log2 mutation rate

Predictors                                          Estimates  CI               p
(Intercept)                                         0.34       -1.39 – 2.07     0.694
Mean-centred Population density                     -0.72      -0.96 – -0.49    <0.001
log2(ATP_cell_µM)                                   2.82       1.15 – 4.49      0.001
18 h                                                3.05       0.57 – 5.53      0.024
20 h                                                3.59       -9.40 – 16.58    0.176
22 h                                                5.54       2.64 – 8.45      0.003
24 h                                                3.36       -8.62 – 15.35    0.174
log2(ATP_cell_µM):Competition_time_18h              -2.61      -4.18 – -1.03    0.002
log2(ATP_cell_µM):Competition_time_20h              -2.60      -4.13 – -1.07    0.001
log2(ATP_cell_µM):Competition_time_22h              -2.59      -4.65 – -0.54    0.015
log2(ATP_cell_µM):Competition_time_24h              -3.51      -5.43 – -1.59    0.001
Mean-centred Population density:log2(ATP_cell_µM)   0.22       0.02 – 0.42      0.031
Observations                                        60
R2 / Omega-squared                                  0.890 / 0.889

Model investigating variation in DAMP between different culture lengths of experiment. Model was run with reference to the 16 hour culture. Mutation rate was estimated without co-estimating the fitness effect of the mutation.


Chapter 6: Discussion

6.1 Introduction

Mutation rates within a single genotype are not constant but vary with the environment (Massey and Buckling 2002; Saint-Ruf and Matic 2006). Such environmental mutation rate plasticity occurs in response to an organism’s social environment. Specifically, mutation rate is inversely associated with the final population density a culture reaches (Krašovec et al. 2014; Krašovec et al. 2017).

Within the experimental chapters of this thesis I have expanded our knowledge of the prevalence, evolution and molecular mechanisms involved in this Density Associated Mutation rate Plasticity (DAMP) (Krašovec et al. 2017). Here I will summarise the findings of each experimental chapter, before suggesting future work that would build upon the findings on DAMP presented in this thesis.

6.2 Summary of Findings in each experimental chapter

6.2.1 Chapter 2: Spontaneous mutation rate is a plastic trait associated with population density across domains of life

DAMP’s prevalence in the natural world is investigated through two avenues (Krašovec et al. 2017). First, re-analysing mutation rate estimates from 75 years of published literature uncovers an inverse association between mutation rate and population density in a wide range of species spanning the tree of life. Second, this association is tested experimentally, finding that DAMP is indeed present in highly diverged species. Specifically, DAMP occurs in both Escherichia coli and Saccharomyces cerevisiae. Interestingly, DAMP, though widespread, is itself variable. DAMP occurs in E. coli, but is absent in Pseudomonas aeruginosa, another gamma Proteobacterium. Furthermore, DAMP even varies between strains of S. cerevisiae. DAMP’s molecular modulation in these divergent species is highly conserved. In both these species DAMP is modulated via cleansing of the intracellular nucleotide pool of 8-oxo-dGTP by a Nudix hydrolase protein, whereas other mutational correction mechanisms are not involved in modulating DAMP.

6.2.2 Chapter 3: Evolution of density associated mutation rate plasticity in two disparate species of Archaea

The evolution of DAMP within the final domain of life to be empirically tested, the Archaea, was revealed by identifying DAMP in two divergent species from phylogenetically distant phyla: Sulfolobus acidocaldarius and Haloferax volcanii. Investigation at two genetic markers confirmed the previous result from Chapter 2 that DAMP is present to the same degree across the genome. Using the published genome and amino acid sequences, I identified proteins similar to those previously shown to modulate DAMP in bacteria and yeast. Interestingly, this differs between the species of Archaea, highlighting their intermediary placement between the bacterial and eukaryotic domains of life.

6.2.3 Chapter 4: Evolution of Density Associated Mutation Rate Plasticity within strains of Escherichia coli

At fine evolutionary scales, between strains of the bacterium E. coli, both DAMP and the average mutation rate have also evolved. This relationship is unaffected by the variation in fitness effects between the different strains, showing DAMP’s robustness to the fitness effects of the resistance mutation. Across the strains tested there is a slight phylogenetic signal for the degree of DAMP. This analysis also reveals an association between mutation rate and DAMP, with strains possessing a low mutation rate exhibiting a higher degree of DAMP.

6.2.4 Chapter 5: Highly conserved molecular mechanisms control density associated mutation rate plasticity in Escherichia coli

Finally, further investigation of DAMP’s mechanism and its variation across the growth cycle was conducted. Mutation rates are affected by intercellular environmental signals, though this depends upon the co-cultured strain and the fitness effects of mutations. Three Nudix hydrolase genes are also found to modulate the degree of DAMP in E. coli. DAMP’s presence across the growth cycle was confirmed by conducting fluctuation tests at different time points, identifying its occurrence to the same degree at all points tested. Despite this, the average mutation rate at these points differs due to the dynamics of the number of mutational events across these time points. This variation could also be due to changing levels in the intracellular nucleotide pool and the cell’s energetic needs across the growth curve.

6.2.5 Considerations of DAMP and fluctuation tests

DAMP as a phenomenon relies upon the approach used and upon accurate estimation of both the mutation rate and the final population density of the culture. As discussed in the experimental chapters of this thesis, if the same estimate of population size is used in both the mutation rate and the density estimates, there may be a correlation of errors that results in just such an association as DAMP. Therefore, throughout this thesis an independent measure of population density has been used, that of an ATP-based assay. As seen in Chapter 3, use of this independent measure resulted in a reduced slope for Sulfolobus acidocaldarius, showing that such a correlation is indeed a potential problem. The use of such an independent method of estimation is therefore essential for fully evaluating DAMP. Moreover, when the number of mutational events is low, and in the ECOR analysis (Chapter 4), there seems to be an association between mutation rate and population density when both are estimated via CFU (Figure 4.3 – bottom left of plot). Here the points follow such a tight association as to be worrying. This disappears with the use of the independent measure of population density (Figure 4.2). Therefore, despite differences in the correlation of the calibration curves between strains (Figure 4.1), the use of the independent method reduces the possibility of spurious results due to a low number of mutational events and the correlation of errors.
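The correlation-of-errors problem can be illustrated with a small simulation. The sketch below is not the analysis used in this thesis; it is a toy model under stated assumptions: the true mutation rate is constant (so there is no real DAMP), true culture sizes vary between cultures, and both the CFU and ATP measurements carry independent log-normal error. All parameter values and function names are hypothetical, and culture volume is ignored because it only shifts log density by a constant.

```python
import random


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate_slopes(n_cultures=2000, density_sd=1.0, error_sd=0.5, seed=1):
    """Return (slope using the shared CFU estimate, slope using independent ATP)."""
    rng = random.Random(seed)
    log2_rate_true = -30.0  # constant true mutation rate: no real DAMP
    shared_x, indep_x, rates = [], [], []
    for _ in range(n_cultures):
        log2_n_true = 30.0 + rng.gauss(0.0, density_sd)
        log2_m = log2_rate_true + log2_n_true        # events scale with true N
        log2_n_cfu = log2_n_true + rng.gauss(0.0, error_sd)
        log2_n_atp = log2_n_true + rng.gauss(0.0, error_sd)
        rates.append(log2_m - log2_n_cfu)            # estimated rate = m / N_cfu
        shared_x.append(log2_n_cfu)                  # density from the same CFU count
        indep_x.append(log2_n_atp)                   # density from the independent assay
    return ols_slope(shared_x, rates), ols_slope(indep_x, rates)


shared_slope, independent_slope = simulate_slopes()
print(f"slope with shared CFU estimate:      {shared_slope:+.3f}")
print(f"slope with independent ATP estimate: {independent_slope:+.3f}")
```

In this toy model the expected slope with the shared estimate is -error_sd² / (density_sd² + error_sd²), a spurious negative association, whereas the expected slope with the independent estimate is zero. The size of the artefact therefore depends on how large the measurement error is relative to the real spread of densities.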

Another important factor to consider is that fluctuation tests result in mutant and non-mutant cells growing within the same culture during the experiment. One feature of the fluctuation test is the non-selective nature of the environment. All the fluctuation tests conducted within this thesis used non-selective conditions, primarily chemically defined Davis minimal medium. The use of rifampicin has been shown to be problematic in long cultures owing to the ability of rpoB mutants to utilise acetate (Bergman et al. 2014). This selective advantage is not expected to occur here and affect DAMP, as acetate only becomes available in aged cultures and the longest culture time using rifampicin was 24 hours. Additionally, if such an adaptive mutation were contributing to or affecting DAMP, then the results of fluctuation tests with different culture lengths would follow a different path. Specifically, the mutation rate would be higher at longer culture times, the opposite of the result seen.

Another reason why this does not contribute to the profile of DAMP is that, in this study, two genetic markers were used in the same organism without a difference in DAMP occurring. The likelihood of both mutations possessing the same adaptive benefit or fitness cost is very low, suggesting DAMP’s robustness to this possibility.

Despite the non-selective nature of fluctuation tests, there is still the possibility of DAMP occurring due to the fitness effects of mutation. As most mutations are deleterious (Eyre-Walker and Keightley 2007), DAMP could be explained by slower growth of mutants relative to non-mutant cells in high density populations, resulting in the negative association of DAMP. However, it seems this is not the case. Mathematical calculations of the mutation rate can now include the estimation of fitness effects of mutations (Mazoyer et al. 2017). This relies upon a mathematical understanding of the time when mutations occur and leverages the mutant counts to estimate the relative fitness of mutant and non-mutant cells. Whilst this estimate is made from mathematical predictions rather than empirical measurements, it is suggestive of greater fitness costs occurring at low density than at high density (Chapters 4 and 5).

This is the opposite of what would be expected if the fitness effects of mutation were to contribute to DAMP and rather implies that mutation rates at low density are underestimated, suggesting a strengthening of DAMP. This method of estimating mutation rate also allows for the explicit testing of what fitness effects would affect DAMP and to what degree. This shows that fitness effects need to be much more severe than is seen in empirical studies to play a role in DAMP, and even then the effect is only very minimal (Chapter 2). Additionally, co-estimating fitness effects of mutation does not change the results meaningfully in any of the chapters of this thesis.

All in all, these results highlight that DAMP is robust to the fitness effects of mutation. However, empirical work should look at estimating fitness effects of mutation in the ECOR collection, specifically in strains with and without DAMP. These effects should be quantified and compared to those presented in the Chapter, especially the possible density dependent fitness effect. This would both confirm the co-estimated fitness effect and determine the costs in plastic and non-plastic strains, clarifying the possible role of fitness effects in DAMP. Until then, the use of mutation rates without fitness effect estimation is probably the preferred option for mutation rate analysis.

Overall the results presented in this thesis show that fluctuation tests are still a valid method for estimating mutation rate, especially in the realm of environmental mutation rate plasticity. Nevertheless, care must be taken in how these tests are conducted. Specifically, fluctuation tests require a large number of independent cultures that provide a reasonable number of mutational events, so that one can be confident in the mutation rate estimated. A large amount of work is therefore required to produce enough mutation rate estimates to overcome the inherent noise of environmental mutation rates. Mutation rates are difficult to measure and all methods possess downsides. The fluctuation test, despite being conceived over three quarters of a century ago, is still an applicable method to estimate mutation rates today.
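To make the dependence on the number of cultures and mutational events concrete, the classical p0 method of the fluctuation test can be sketched in a few lines. This is a simplified illustration only: the estimates in this thesis were obtained by maximum likelihood (Mazoyer et al. 2017), not by the p0 method, and the function name and example data below are hypothetical. Under the Luria-Delbrück model the proportion of cultures with no mutants is p0 = exp(-m), where m is the expected number of mutational events per culture.

```python
import math


def p0_mutation_rate(mutant_counts, final_population_size):
    """Estimate mutational events per culture (m) and mutation rate by the p0 method.

    mutant_counts: number of mutant colonies observed in each parallel culture.
    final_population_size: final number of cells per culture (Nt).
    """
    n_cultures = len(mutant_counts)
    n_zero = sum(1 for count in mutant_counts if count == 0)
    if n_zero == 0 or n_zero == n_cultures:
        # p0 of 0 or 1 gives an infinite or zero estimate of m.
        raise ValueError("p0 method needs some, but not all, cultures without mutants")
    m = -math.log(n_zero / n_cultures)   # p0 = exp(-m)
    return m, m / final_population_size


# Hypothetical data: 20 parallel cultures, half of them without mutants.
counts = [0, 0, 3, 0, 1, 0, 12, 0, 0, 2, 0, 1, 0, 5, 0, 1, 0, 48, 2, 1]
m, rate = p0_mutation_rate(counts, final_population_size=1e8)
print(f"m = {m:.3f} events per culture, mutation rate = {rate:.2e}")
```

Because the estimate rests entirely on the fraction of empty cultures, it becomes unstable when very few cultures contain mutants or very few are empty, which is one reason why many parallel cultures with a reasonable number of mutational events are needed.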

6.3 Discussion of results and suggestions for future work

6.3.1 Further investigation of DAMP at evolutionary scales

An association between mutation rate and population density was first identified at one locus in one bacterium (Krašovec et al. 2014). This relationship has been extended to multiple loci in all domains of life (Chapter 3 and Krašovec et al. 2017), suggesting that this trait is prevalent, but variable, across the tree of life. The fact that DAMP occurs in species from a wide range of organisms present in a diverse array of ecosystems means that the chemical composition of the environment is unlikely to provide the link between population density and mutation rate. Rather, it is more likely that conserved mechanisms cause and affect DAMP. This is suggested by the presence of mutation rate estimates from viruses within the published literature analysis (Krašovec et al. 2017). Reanalysis of such data (Combe and Sanjuán 2014) identified different mutation rates but similar DAMP between host cells, indicating that the basic processes involved in replication play a role in DAMP. Whilst the evidence from viruses is tantalising, the presence of DAMP in viruses still needs to be tested empirically. Specifically, it is unknown whether it is the density of the virus or the density of the host cells that is important for expressing DAMP, if at all. Additionally, the presence of DNA repair genes differs between viruses, and so such tests could ascertain the relative importance of DNA repair genes and how the DNA repair genes of the virus interact. Finally, whilst DAMP is now known to be present in all three domains of life, its distribution at intermediate scales (i.e. at the family or genus scale) is still unknown. Specifically, the only bacteria so far tested are both Gram-negative gamma Proteobacteria, meaning its prevalence in other phyla of bacteria is still unknown. Knowledge of DAMP’s degree over this intermediate evolutionary scale would provide greater evidence of DAMP’s evolution and the evolutionary history of such a trait.

6.3.2 Investigation of DAMP across physiological scales

Whilst the exploration of DAMP in both viruses and other classes of organisms intermediate to those already tested would allow for further knowledge of DAMP across evolutionary scales, its presence across physiological scales of organisation would remain unknown. For example, the genes and genomes of mitochondria have been extensively studied and found to behave profoundly differently from nuclear genomes (Birky 2001). Recent advances in transferring mitochondria between cells allow for the possible manipulation of both the number and genetic make-up of a cell’s mitochondria (Yang and Koob 2012; Wada et al. 2017). Thus by manipulating such factors it could be possible to test for the presence of DAMP in these subcellular structures, identifying whether mitochondria can sense their density within the cell or rely upon cues from the host cell. Conversely, at the level of multicellularity, population density is a more challenging issue to resolve. Intercellular communication occurs within multicellular organisms (Yamasaki et al. 1995; Brücher and Jamall 2014), suggesting that, if this communication were a prerequisite for DAMP to evolve, then a relationship such as DAMP may indeed be possible. Fluctuation tests have found a use in studying mutation rates in some multicellular tissues, in particular cancer (Kendal and Frost 1988). Intriguingly, a pattern similar to DAMP has previously been discovered in mouse cells, where cells kept at high density exhibited a lower mutation rate (Boesen et al. 1994). This suggests that DAMP may even be present in multicellular organisms, though probably at the tissue, rather than whole organism, level.

6.3.3 Molecular mechanisms controlling DAMP in Archaea

DAMP in both pro- and eukaryotes requires the same Nudix hydrolase protein, which cleanses the intracellular nucleotide pool of the mutagenic 8-oxo-dGTP nucleotide (8-oxo-dGTPase) (Krašovec et al. 2017). The involvement of highly conserved Nudix hydrolase proteins has been extended to three other genes involved in both metabolic and house-cleaning processes. Whilst the role of such proteins has not been empirically tested in the Archaea displaying DAMP, their whole genome sequences allow both nucleotide and protein sequence similarity to be identified. Intriguingly, when the protein sequences of the Nudix hydrolases identified are tested against those of both S. acidocaldarius and H. volcanii, there is evidence for the presence of all these proteins in these species. There is no such relationship between the genetic sequences of these archaea and bacteria. This suggests two possible hypotheses for how DAMP evolved. First, DAMP may have been present in their last common ancestor but, through the passage of time, nucleotide changes have accumulated in the genetic sequences, rendering them unrecognisably different. Second, the proteins used may have evolved in both these domains through convergent evolution. Whilst the conservation of these amino acid sequences is suggestive of a highly conserved mechanism involved in DAMP, these genes in both these species are still not fully characterised and require full experimental testing to identify their role, if any, in modulating DAMP in these species.

Another avenue of investigation into Nudix genes’ control of DAMP is to measure the respective levels of these Nudix hydrolases in vivo. A problem with this approach is that the most common methods used for assaying such levels are laborious and have low sensitivity (Takagi et al. 2012; Dong et al. 2015). Such work has shown, however, that for E. coli MutT the intracellular protein level is kept constant, with only half the maintained cellular level needed to preserve the mutation frequency, suggesting that the intracellular level of the MutT protein is maintained in preparation for any sudden change in oxygen pressure (Setoyama et al. 2011). Recently, a new method has been developed for identifying intracellular activity levels of the human MTH1 protein, homologous to bacterial MutT (Hayakawa et al. 1995). It uses a chimeric nucleotide that produces a luminescence signal, due to the cleavage of ATP, signalling the protein’s activity level (Ji et al. 2016). This method could elucidate the varying activity of 8-oxo-dGTPase in cells across population density environments. Additionally, the cancer drug (S)-crizotinib inhibits the activity of MTH1 (Huber et al. 2014; Dong et al. 2015), suggesting its potential to inhibit this activity in other species. Adding various concentrations of this inhibitor could determine whether there is a specific level of intracellular 8-oxo-dGTPase activity required for DAMP or whether there is a gradual decline in DAMP with declining protein activity. Furthermore, adding this inhibitor to species without gene knockouts, such as S. acidocaldarius and H. volcanii, would determine whether DAMP in these species does indeed utilise this mechanism, without the need to produce gene knockouts.

Whilst the use of the ECOR collection allowed for the investigation of DAMP at fine evolutionary scales, the results gained from this group of natural isolates also allow for further investigation of DAMP’s molecular mechanisms. A recent study conducting numerous phenotypic tests on a wide range of E. coli isolates calculated what the authors term a disruption score for the genes of these isolates (Galardini et al. 2017). Specifically, for each strain and protein coding gene they estimated the impact of nonsynonymous substitutions, giving a probability for each gene that its function was disrupted relative to E. coli MG1655. Therefore, by obtaining the whole genome sequences of the ECOR strains used in this study, it would be possible to compute such a score and, using random forests, identify any gene that, when disrupted, is associated with DAMP, the average mutation rate, or both. Alternatively, such an analysis could be carried out on the raw SNP data coming from such whole genome sequencing.

6.3.4 Social aspects involved in modulating DAMP

Another aspect of DAMP’s mechanism that requires elucidating is what is actually meant by population density. Whilst DAMP describes an association between the final population density of a culture and the mutation rate, which is dependent upon the actions of Nudix hydrolase proteins, a direct link between the action of such proteins and population density remains unclear. Intriguingly, one Nudix hydrolase identified here (nudF) has been shown to respond to intercellular signalling, with greater cell membrane binding at high density (Morán-Zorzano et al. 2008). Such a link has not so far been found for the other Nudix hydrolase genes. Moving forward, such an association between the mechanisms of DAMP and the density of the culture needs to be identified for a greater understanding of the processes involved in DAMP.

In addition to the potential use of the ECOR collection detailed above, the collection could be used to investigate the intercellular signalling control of DAMP. The significant variation in degrees of DAMP between these strains could be due to different levels of intercellular signalling, which could be investigated by conducting co-cultures between these strains, alongside the investigation of the genetic differences. These co-cultures could be conducted in a similar way to that described in this thesis (chapter 4) and previously (Krašovec et al. 2014). This would require the strains to be genetically modified, either with an antibiotic resistance marker as before (Krašovec et al. 2014) or possibly tagged with a fluorescent protein, as used to investigate intercellular interactions in bacteria (Stempler et al. 2017; Laganenka and Sourjik 2018) and between bacteria and epithelial cells (Ismail et al. 2016). Co-cultures could also utilise transwell plates, where the cells being tested are held in compartments of the well separated by a filter that allows the movement of signals and metabolites between the two cultures. Therefore, whilst the cells are separated physically, the chemical interactions remain unaffected, allowing the examination of microbial interactions (Chodkowski and Shade 2017). Co-culturing ECOR strains in these transwell plates would further reveal the role of cellular interactions in modulating DAMP, for example by co-culturing a strain with a high degree of DAMP alongside a strain with no DAMP. If variation in DAMP between such strains is due to differential cellular interactions then a strain without DAMP could exhibit DAMP when co-cultured with a DAMP strain. Conversely, if there is no difference in behaviour then this suggests that differences in other genetic mechanisms between strains have the main role in determining different levels of DAMP.

6.3.5 Adaptiveness of DAMP

Variation in mutation rates is suggested to be adaptive (Alexander et al. 2017). However, despite finding such a relationship as DAMP, the adaptive benefits of DAMP remain to be tested, and indeed DAMP may not necessarily have evolved for adaptive reasons, as has been proposed for SIM (MacLean et al. 2013). Experimental evolution is the ideal route for answering such a question. Growing strains with differing levels of both DAMP and average mutation rate in different environments, both static and fluctuating, and measuring the fitness changes of those strains relative to the initial strain, would show how quickly a strain adapted to an environment and whether this is related to the level of DAMP. Plasticity of a trait that directly affects fitness is expected to be favoured by selection, as proposed for mutation rate plasticity (Belavkin et al. 2016). This is expected under fluctuating environmental conditions, though not under static environmental conditions if the plasticity is costly (Gomez-Mestre and Jovani 2013). Additionally, it is becoming apparent that evolution within environments depends on the mix of mutations available to the organism, which varies by environment (Maharjan and Ferenci 2018), thus affecting potential adaptive pathways (Ferenci and Maharjan 2015). Therefore, by investigating the different mutational spectra across the density range in strains exhibiting various levels of DAMP, the differences in adaptive potential between and within strains across a population density range can be elucidated.
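One way the fitness change relative to the initial strain could be quantified is sketched below in Python. It assumes a head-to-head competition assay and the commonly used ratio of realised growth rates (Malthusian parameters) of the evolved and ancestral strains; this choice of measure, the function name `relative_fitness` and the cell counts are illustrative assumptions rather than a protocol taken from this thesis.

```python
import math

def relative_fitness(evolved_start, evolved_end, ancestor_start, ancestor_end):
    """Ratio of realised growth rates over one competition.

    Each argument is a cell count (or density) for one competitor at the
    start or end of the competition. A value above 1 means the evolved
    strain outgrew the ancestor.
    """
    m_evolved = math.log(evolved_end / evolved_start)
    m_ancestor = math.log(ancestor_end / ancestor_start)
    return m_evolved / m_ancestor

# Hypothetical counts: both strains start at 5e5 cells; the evolved
# strain reaches twice the final count of the ancestor.
w = relative_fitness(5e5, 8e7, 5e5, 4e7)
print(round(w, 3))
```

Comparing such values between strains with different degrees of DAMP, in static and in fluctuating environments, would give the rate of adaptation referred to above.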

6.3.6 DAMP and the growth cycle

The density of a culture changes through the course of the culture cycle, and aspects of the growth curve have been shown to affect mutation rate (Loewe et al. 2003; Kivisaar 2010; Nishimura et al. 2017; Maharjan and Ferenci 2018). By altering the initial population size of the culture, it is possible to manipulate the time spent in lag phase. Therefore, by then altering the total culture time, it would be possible to alter the ratio of time spent in stationary or lag phase, and so determine the relative importance of these phases in determining the degree of DAMP. The fact that DAMP is present across the growth cycle suggests that it is not controlled by time-specific gene expression but rather is modulated by processes expressed throughout the culture cycle.

6.3.7 Single cell investigation into DAMP

Fluctuation tests average the mutation rate over the whole culture; however, the observed mutation rate could be due to only a few cells mutating. It is known that genetically identical cells exhibit different levels of gene expression despite being in the same uniform environment (Elowitz et al. 2002; Ozbudak et al. 2002), potentially as a bet-hedging method to maximise survival (Veening et al. 2008). Therefore, a greater understanding of DAMP requires the use of microfluidic chips to experimentally test mutation rate and gene expression at the single-cell level (Ryley and Pereira-Smith 2006). Within this system it is possible to monitor the behaviour of single cells and so, by modulating the environment through the addition of various substrates, such as nucleotides, the environmental signal affecting the level of mutation rate could be determined. Variation in mutation rate and its associated fitness effects has recently been followed in single cells of the bacterium E. coli (Robert et al. 2018). This shows the variation of mutation rate at the level of the individual, and such use of microfluidics and fluorescent tagging of genes would allow for a greater understanding of the processes underlying DAMP and its interaction with SIM.
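The point that a fluctuation test yields a single, culture-wide figure can be made concrete with a short Python sketch. It uses the classic p0 method of Luria and Delbrück, in which the expected number of mutations per culture is m = -ln(p0), where p0 is the proportion of parallel cultures with no mutants, and the mutation rate is m divided by the final number of cells per culture. This estimator is chosen for brevity and is not necessarily the one used in this thesis; the function name and the mutant counts are invented for illustration.

```python
import math

def p0_mutation_rate(mutant_counts, final_cells_per_culture):
    """Estimate the mutation rate per cell from parallel cultures (p0 method)."""
    n_cultures = len(mutant_counts)
    n_zero = sum(1 for count in mutant_counts if count == 0)
    if n_zero == 0 or n_zero == n_cultures:
        raise ValueError("p0 method needs some, but not all, cultures without mutants")
    p0 = n_zero / n_cultures
    m = -math.log(p0)  # expected mutational events per culture
    return m / final_cells_per_culture

# Hypothetical mutant counts from 12 parallel cultures of 2e8 cells each.
counts = [0, 0, 1, 0, 3, 0, 0, 12, 0, 1, 0, 0]
rate = p0_mutation_rate(counts, final_cells_per_culture=2e8)
print(rate)
```

The estimate is one number per set of cultures. It cannot distinguish a uniformly elevated rate from a small subpopulation of cells with a much higher rate, which is why single-cell measurements are needed.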

6.4 Conclusions

In conclusion, the results presented in this thesis expand our knowledge of the prevalence, evolution and underlying mechanisms involved in DAMP. DAMP is widespread across the tree of life, with species exhibiting DAMP present in all three domains of life. Mutation rates and DAMP have evolved between strains of E. coli, with more closely related strains exhibiting similar levels of each trait. Furthermore, in both pro- and eukaryotes, DAMP requires the same Nudix hydrolase protein cleansing the intracellular nucleotide pool of the highly mutagenic nucleotide 8-oxo-dGTP, which is also predicted to occur in both species of Archaea tested. DAMP, in E. coli at least, also requires three other Nudix hydrolase genes that, like 8-oxo-dGTPase, control the intracellular nucleotide pool, whilst also playing a role in metabolic pathways, which also affects DAMP. Whilst population density and mutation rate vary across the culture cycle, DAMP does not, suggesting DAMP relies upon mechanisms expressed across all points of the culture cycle. DAMP is present across the growth cycle to a similar degree, although this varies with the length of time spent in lag phase. Mutation rates across the culture cycle depend upon the levels of the intracellular nucleotide pool. The presence of DAMP across all domains of life and the highly conserved nature of the mechanisms involved in its modulation suggest a wide evolutionary presence of DAMP, with mutation rate varying with ecological factors such as population density since the early origins of life.

6.5 References

Alexander, H. K., S. I. Mayer, and S. Bonhoeffer. 2017. Population Heterogeneity in Mutation Rate Increases the Frequency of Higher-Order Mutants and Reduces Long-Term Mutational Load. Mol. Biol. Evol. 34:419–436.

Belavkin, R. V., A. Channon, E. Aston, J. Aston, R. Krašovec, and C. G. Knight. 2016. Monotonicity of fitness landscapes and mutation rate control. J. Math. Biol. 73:1491–1524.

Bergman, J. M., M. Wrande, and D. Hughes. 2014. Acetate availability and utilization supports the growth of mutant sub-populations on aging bacterial colonies. PLoS One 9:e109255.

Birky, C. W. 2001. The Inheritance of Genes in Mitochondria and Chloroplasts: Laws, Mechanisms, and Models. Annu. Rev. Genet. 35:125–148.

Boesen, J. J. B., M. J. Niericker, N. Dieteren, and J. W. I. M. Simons. 1994. How variable is a spontaneous mutation rate in cultured mammalian cells? Mutat. Res. 307:121–129.

Brücher, B. L. D. M., and I. S. Jamall. 2014. Cell-cell communication in the tumor microenvironment, , and anticancer treatment. Karger Publishers.

Chodkowski, J. L., and A. Shade. 2017. A Synthetic Community System for Probing Microbial Interactions Driven by Exometabolites. mSystems 2:e00129-17.

Combe, M., and R. Sanjuán. 2014. Variation in RNA Virus Mutation Rates across Host Cells. PLoS Pathog. 10.

Dong, L., H. Wang, J. Niu, M. Zou, N. Wu, D. Yu, Y. Wang, and Z. Zou. 2015. Echinacoside induces apoptotic cancer cell death by inhibiting the nucleotide pool sanitizing enzyme MTH1. Onco. Targets. Ther. 8:3649–3664.

Elowitz, M. B., A. J. Levine, E. D. Siggia, and P. S. Swain. 2002. Stochastic gene expression in a single cell. Science 297:1183–6.

Eyre-Walker, A., and P. D. Keightley. 2007. The distribution of fitness effects of new mutations. Nat. Rev. Genet. 8:610–8.

Ferenci, T., and R. Maharjan. 2015. Mutational heterogeneity: A key ingredient of bet-hedging and evolutionary divergence? BioEssays 37:123–130.

Galardini, M., A. Koumoutsi, L. Herrera-Dominguez, J. A. C. Varela, A. Telzerow, O. Wagih, M. Wartel, O. Clermont, E. Denamur, A. Typas, and P. Beltrao. 2017. Phenotype inference in an Escherichia coli strain panel. Elife 6:e31035.

Gomez-Mestre, I., and R. Jovani. 2013. A heuristic model on the role of plasticity in adaptive evolution: Plasticity increases adaptation, population viability and genetic variation. Proc. R. Soc. B Biol. Sci. 280:20131869.

Hayakawa, H., M. Kuwano, A. Taketomi, K. Sakumi, and M. Sekiguchi. 1995. Generation and Elimination of 8-Oxo-7,8-Dihydro-2’-Deoxyguanosine 5’-Triphosphate, A Mutagenic Substrate for DNA Synthesis, in Human Cells. Biochemistry 34:89–95.

Huber, K. V. M., E. Salah, B. Radic, M. Gridling, J. M. Elkins, A. Stukalov, A. S. Jemth, C. Göktürk, K. Sanjiv, K. Strömberg, T. Pham, U. W. Berglund, J. Colinge, K. L. Bennett, J. I. Loizou, T. Helleday, S. Knapp, and G. Superti-Furga. 2014. Stereospecific targeting of MTH1 by (S)-crizotinib as an anticancer strategy. Nature 508:222–227.

Ismail, A. S., J. S. Valastyan, and B. L. Bassler. 2016. A Host-Produced Autoinducer-2 Mimic Activates Bacterial Quorum Sensing. Cell Host Microbe 19:470–480.

Ji, D., A. A. Beharry, J. M. Ford, and E. T. Kool. 2016. A Chimeric ATP-Linked Nucleotide Enables Luminescence Signaling of Damage Surveillance by MTH1, a Cancer Target. J. Am. Chem. Soc. 138:9005–9008.

Kendal, W. S., and P. Frost. 1988. Pitfalls and practice of Luria-Delbrück fluctuation analysis: a review. Cancer Res. 48:1060–1065.

Kivisaar, M. 2010. Mechanisms of stationary-phase mutagenesis in bacteria: mutational processes in pseudomonads. FEMS Microbiol. Lett. 312:1–14.

Krašovec, R., R. V. Belavkin, J. A. D. Aston, A. Channon, E. Aston, B. M. Rash, M. Kadirvel, S. Forbes, and C. G. Knight. 2014. Mutation rate plasticity in rifampicin resistance depends on Escherichia coli cell-cell interactions. Nat. Commun. 5:3742.

Krašovec, R., H. Richards, D. R. Gifford, C. Hatcher, K. J. Faulkner, R. V. Belavkin, A. Channon, E. Aston, A. J. McBain, and C. G. Knight. 2017. Spontaneous mutation rate is a plastic trait associated with population density across domains of life. PLoS Biol. 15:e2002731.

Laganenka, L., and V. Sourjik. 2018. Autoinducer 2-dependent Escherichia coli biofilm formation is enhanced in a dual-species coculture. Appl. Environ. Microbiol. 84:e02638-17.

Loewe, L., V. Textor, and S. Scherer. 2003. High Deleterious Genomic Mutation Rate in Stationary Phase of Escherichia coli. American Association for the Advancement of Science.

MacLean, R. C., C. Torres-Barceló, and R. Moxon. 2013. Evaluating evolutionary models of stress-induced mutagenesis in bacteria. Nat. Rev. Genet. 14:221–227.

Maharjan, R. P., and T. Ferenci. 2018. The impact of growth rate and environmental factors on mutation rates and spectra in Escherichia coli. Environ. Microbiol. Rep., doi: 10.1111/1758-2229.12661.

Massey, R. C., and A. Buckling. 2002. Environmental regulation of mutation rates at specific sites. Trends Microbiol. 10:580–584.

Mazoyer, A., R. Drouilhet, S. Despréaux, and B. Ycart. 2017. flan: An R Package for Inference on Mutation Models. R J. 9:334–351.

Morán-Zorzano, M. T., M. Montero, F. J. Muñoz, N. Alonso-Casajús, A. M. Viale, G. Eydallin, M. T. Sesma, E. Baroja-Fernández, and J. Pozueta-Romero. 2008. Cytoplasmic Escherichia coli ADP sugar pyrophosphatase binds to cell membranes in response to extracellular signals as the cell population density increases. FEMS Microbiol. Lett. 288:25–32.

Nishimura, I., M. Kurokawa, L. Liu, and B.-W. Ying. 2017. Coordinated Changes in Mutation and Growth Rates Induced by Genome Reduction. MBio 8:e00676-17.

Ozbudak, E. M., M. Thattai, I. Kurtser, A. D. Grossman, and A. van Oudenaarden. 2002. Regulation of noise in the expression of a single gene. Nat. Genet. 31:69–73.

Robert, L., J. Ollion, J. Robert, X. Song, I. Matic, and M. Elez. 2018. Mutation dynamics and fitness effects followed in single cells. Science 359:1283–1286.

Ryley, J., and O. M. Pereira-Smith. 2006. Microfluidics device for single cell gene expression analysis in Saccharomyces cerevisiae. Yeast 23:1065–1073.

Saint-Ruf, C., and I. Matic. 2006. Environmental tuning of mutation rates. Environ. Microbiol. 8:193–9.

Setoyama, D., R. Ito, Y. Takagi, and M. Sekiguchi. 2011. Molecular actions of Escherichia coli MutT for control of spontaneous mutagenesis. Mutat. Res. - Fundam. Mol. Mech. Mutagen. 707:9–14.

Stempler, O., A. K. Baidya, S. Bhattacharya, G. B. Malli Mohan, E. Tzipilevich, L. Sinai, G. Mamou, and S. Ben-Yehuda. 2017. Interspecies nutrient extraction and toxin delivery between bacteria. Nat. Commun. 8:315.

Takagi, Y., D. Setoyama, R. Ito, H. Kamiya, Y. Yamagata, and M. Sekiguchi. 2012. Human MTH3 (NUDT18) protein hydrolyzes oxidized forms of guanosine and deoxyguanosine diphosphates: comparison with MTH1 and MTH2. J. Biol. Chem. 287:21541–9.

Veening, J.-W., W. K. Smits, and O. P. Kuipers. 2008. Bistability, , and Bet-Hedging in Bacteria. Annu. Rev. Microbiol. 62:193–210.

Wada, K.-I., K. Hosokawa, Y. Ito, and M. Maeda. 2017. Quantitative control of mitochondria transfer between live single cells using a microfluidic device. doi: 10.1242/bio.024869.

Yamasaki, H., M. Mesnil, Y. Omori, N. Mironov, and V. Krutovskikh. 1995. Intercellular communication and carcinogenesis. Mutat. Res. - Fundam. Mol. Mech. Mutagen. 333:181–188.

Yang, Y.-W., and M. D. Koob. 2012. Transferring isolated mitochondria into tissue culture cells. Nucleic Acids Res. 40:e148–e148.