Zurich Open Repository and Archive University of Zurich Main Library Strickhofstrasse 39 CH-8057 Zurich www.zora.uzh.ch

Year: 2013

The genetic basis of microbial adaptations to new environments in laboratory evolution

Dhar, Riddhiman

Posted at the Zurich Open Repository and Archive, University of Zurich ZORA URL: https://doi.org/10.5167/uzh-91557 Dissertation

Originally published at: Dhar, Riddhiman. The genetic basis of microbial adaptations to new environments in laboratory evolu- tion. 2013, University of Zurich, Faculty of Science. The Genetic Basis of Microbial Adaptations to New Environments in Laboratory Evolution

Dissertation Zur Erlangung der Naturwissenschaftlichen Doktorwürde (Dr. sc. nat.)

vorgelegt der Mathematisch-naturwissenschaftlichen Fakultät der Universität Zürich

von Riddhiman Dhar aus Indien

Promotionskomitee: Prof. Dr. Andreas Wagner (Vorsitz) Prof. Dr. Martin Ackermann Prof. Dr. Owen Petchey Dr. Michael Krützen

Zürich, 2013

Die vorliegende Arbeit wurde von der Mathematisch-naturwissenschaftlichen Fakultät der Universität Zürich im Frühjahrssemester 2013 als Dissertation angenommen.

Promotionskomitee: Prof. Dr. Andreas Wagner (Vorsitz) Prof. Dr. Martin Ackermann Prof. Dr. Owen Petchey Dr. Michael Krützen

Content

Summary 5

Zusammenfassung 7

Chapter 1: Introduction 9

1.1 Environmental stressors...... 9 1.1.1 Natural stressors...... 9 1.1.1.1 Abiotic stressors...... 9 1.1.1.2 Biotic stressors...... 12 1.1.2 Man-made stressors ...... 12 1.2 The stress response ...... 13 1.2.1 The cellular stress response ...... 14 1.2.1.1 The short term stress response ...... 14 1.2.1.2 Long term adaptation to stressors ...... 15 1.2.2 Organismal stress response...... 17 1.3 Applications of the cellular stress response...... 18 1.4 Yeast as a model system for studying cellular stress response...... 18 1.5 The short term stress response in yeast...... 19 1.5.1 General stress response...... 19 1.5.2 Stressor-specific response...... 21 1.6 Long term adaptation to stressors in yeast...... 21 1.7 Mechanisms of evolutionary adaptation to changing environments...... 22 1.7.1 Stochastic switching ...... 22 1.7.2 Stress cross-protection ...... 23 1.7.3 Anticipatory regulation ...... 24 1.8 Gene duplication ...... 25 1.8.1 Gene duplication as a mechanism of adaptation to new environments ...... 26 1.8.1.1 Gene dosage increase...... 26 1.8.1.2 Neo-functionalization ...... 27 1.8.1.3 Sub-functionalization and Amplification-mutagenesis...... 28 1.9 Experimental evolution...... 29 1.10 Aim of the Thesis...... 30 Bibliography ...... 31

Chapter 2: Adaptation of Saccharomyces cerevisiae to saline stress through laboratory evolution 60

Appendix to Chapter 2 98

Chapter 3: Yeast adapts to a changing stressful environment by evolving cross-protection and anticipatory gene regulation 140

Appendix to Chapter 3 176

Chapter 4: Increased gene dosage plays the most important role in the initial stages of evolution of duplicate TEM-1 beta lactamase genes 208

Appendix to Chapter 4 243

Acknowledgements 306

Curriculum Vitae 307

Summary

Microbes regularly face environmental stressors in their natural habitat. These stressors cause damage to cellular components, inhibit the function of macromolecules, and can lead to cell death. Microbes have evolved stress response mechanisms that allow them to survive and proliferate in such stressful environments. Their stress responses can be of two types, physiological short-term stress responses, and evolutionary adaptations to stress. A short term stress response allows microbes to tolerate stressors that persist for a short period of time, and usually involves transient changes in cellular processes. Longer-term evolutionary adaptation to stress often requires genetic and epigenetic changes that allow microbes to permanently live in a stressful environment. In microbes, short term stress responses are widely studied, whereas evolutionary adaptation to stressors is poorly understood. In my thesis, I combine laboratory evolution in microbes with genomic analysis to uncover the molecular changes associated with long term adaptation to environmental stressors.

Salt in high concentrations is an environmental stressor that has important implications in industrial processes and agriculture. To understand the molecular basis of long term adaptation to salt stress, I examined the genomic changes in three salt-adapted experimental yeast populations using different high-throughput genomic techniques. I observed modest changes in the expression of several genes, widespread changes in ploidy in all populations, and a non- synonymous fitness-increasing mutation in one yeast population. Several important conclusions emerged from this study. First, long term adaptation to salt stress is polygenic in nature – we observed modest changes in the expression of multiple genes rather than high expression changes of a few genes. Second, evolutionary changes in gene expression are more important for adaptation to salt stress than mutational changes in protein coding genes in our yeast populations.

Multiple environmental stressors can often occur together or alternate in a cyclic manner. Thus, some microbes have evolved mechanisms such as stress cross-protection and anticipatory regulation that provide significant benefits in such environments. Although these mechanisms are known to exist in microbes, it is unknown how easily they can evolve. To ask whether they can evolve in the laboratory, I analyzed gene expression changes in yeast populations that had adapted to a regular cycle of salt stress and oxidative stress. I demonstrated that stress-cross

5 protection and anticipatory regulation in these yeast populations can evolve within only 300 generations.

Gene duplications are one kind of genetic change that can help microbes adapt to novel and stressful environments. Such adaptations can occur via several mechanisms, such as an increase in gene dosage, neo-functionalization, and sub-functionalization. In all these mechanisms, mutational changes in coding regions, changes in gene expression or a combination of both drive adaptation. Their relative importance in retaining duplicate genes is unclear. Further, we do not know whether the environment can influence this importance. To address this knowledge gap, I performed an experimental evolution study using duplicate TEM-1 genes in E. coli in a variety of selection conditions. I observed an increase in gene dosage in all selection conditions, highlighting the importance of gene expression changes in the initial stages of adaptation.

Taken together, my thesis work characterizes genomic changes associated with long term adaptation to environmental stressors in yeast populations. It also shows that complex mechanisms of adaptation can evolve within a relatively short period of time. Furthermore, it indicates the importance of increased gene dosage caused by gene duplication in the adaptation to novel environments.

6 Zusammenfassung

Mikroorganismen sind in ihrer natürlichen Umwelt fortwährend verschiedenen Stressfaktoren ausgesetzt. Diese Stressfaktoren schädigen zelluläre Komponenten, stören die Funktion von Makromolekülen und können zum Zelltod führen. Mikroorganismen haben verschiedene Mechanismen entwickelt, welche ihnen das Überleben angesichts solcher Stressfaktoren ermöglichen. Diese Mechanismen sind grundsätzlich von zweierlei Art, kurzfristige physiologische Massnahmen und evolutionäre Anpassung an Stressfaktoren. Kurzfristige physiologische Massnahmen erlauben es Mikroorganismen, über kurze Zeit bestehende Störungen in der Umwelt zu überdauern. Solche Massnahmen erfordern in der Regel nur vorübergehende Änderungen in zellulären Prozessen. Evolutionäre Anpassungen erfordern oft genetische und epigenetische Änderungen, welche es den Mikroorganismen erlauben, dauerhaft in einer Umgebung mit Stressfaktoren zu überleben und sich dort zu vermehren. Kurzfristige physiologische Massnahmen von Mikroorganismen wurden bereits intensiv studiert, aber evolutionäre Anpassungen an permanente Stressfaktoren wurden noch nicht umfassend erforscht. In meiner Dissertation wende ich Methoden zur Analyse von Genomen auf im Labor sich über einen längeren Zeitraum entwickelnde Mikroorganismen an, um molekulare Veränderungen zur dauerhaften Anpassung an Stressfaktoren aufzudecken.

Salz in hohen Konzentrationen ist ein Stressfaktor von grosser Bedeutung in industriellen Prozessen und in der Landwirtschaft. Um molekulare Veränderungen zur dauerhaften Anpassung an Salz in hohen Konzentrationen zu finden, untersuchte ich Veränderungen im Genom von drei im Labor an hohe Salzkonzentrationen adaptierte Hefepopulationen mit Hilfe von Hochdurchsatztechnologien. Ich fand Veränderungen mässigen Ausmaßes in der Expression mehrerer Gene, umfassende Änderungen in der Ploidie in allen Populationen, und eine nicht- synonyme, die Fitness erhöhende Mutation in einer Hefepopulation. Verschiedene wichtige Schlüsse lassen sich aus dieser Studie ziehen. Erstens ist die dauerhafte Anpassung an erhöhte Salzkonzentrationen von polygenischer Natur, denn ich beobachtete nur mässig grosse Änderungen in der Expression mehrerer Gene statt grosse Änderungen in der Expression weniger Gene. Zweitens sind evolutionäre Anpassungen in der Genexpression wichtiger für die Anpassung an dauerhaft erhöhte Salzkonzentrationen als durch Mutation bedingte Veränderungen in den protein-kodierenden Genen unserer Hefepopulationen.

7

Verschiedene Stressfaktoren treten in der Umwelt oft gemeinsam auf oder wechseln sich zyklisch ab. Einige Mikroorganismen haben deshalb Mechanismen zum Überleben in solchen Situationen entwickelt, beispielsweise Kreuzprotektion („cross-protection“) gegen mehrere Stressfaktoren gleichzeitig oder Mechanismen zur vorausschauenden Regulation zellulärer Prozesse. Obwohl solche Mechanismen bekannt sind, ist noch nicht bekannt, wie einfach oder schwierig ihre Evolution in Mikroorganismen ist. Um diese Frage zu beantworten, untersuchte ich Änderungen in der Genexpression von Hefepopulationen, welche sich an einen regelmässigen Zyklus von abwechselnd durch hohe Salzkonzentrationen verursachten Stress und oxidativen Stress angepasst hatten. Ich zeigte, dass Kreuzprotektion und vorausschauende Regulation im Labor innerhalb von nur 300 Generationen auftauchen können.

Genduplikationen gehören zu den genetischen Veränderungen, welche es Mikroorganismen erlauben, sich an neue und stressreiche Umgebungen anzupassen. Diese Anpassungen durch Genduplikation erfolgen über verschiedene Mechanismen, beispielsweise durch eine Erhöhung der Gendosierung, durch eine Neufunktionalisierung oder durch eine Subfunktionalisierung. Bei all diesen Mechanismen wird die evolutionäre Anpassung angetrieben durch mutationsbedingte Veränderungen in protein-kodierenden Regionen des Genoms, durch Änderungen in der Genexpression, oder durch beides zugleich. Es ist noch unklar, welcher dieser beiden Faktoren wichtiger ist für die Erhaltung von duplizierten Genen. Zudem ist unbekannt, ob die Umwelt die Gewichtung der beiden Faktoren beeinflusst. Um diese offene Frage zu beantworten, führte ich im Labor eine Evolutionsstudie mit duplizierten TEM-1 Genen in Escherichia coli unter verschiedenen, selektiven Umweltbedingungen durch. Ich beobachtete eine Erhöhung der Gendosierung unter sämtlichen Umweltbedingungen, was die Bedeutung von Änderungen in der Gendosierung am Anfang von evolutionären Anpassungen unterstreicht.

Insgesamt beschreibt meine Dissertation die Änderungen am Genom, welche mit der dauerhaften Anpassung von Hefepopulationen an Stressfaktoren einhergehen. Meine Arbeit zeigt weiter, dass auch komplexe Mechanismen innerhalb relativ kurzer Zeit evoluieren können. Zudem weist sie auf die Bedeutung einer erhöhten Gendosierung durch Genduplikation bei der Anpassung an neue Umweltbedingungen hin.

8 Chapter 1

Introduction

1.1 Environmental stressors Microbial cells and their macromolecules require specific environmental conditions for functioning optimally, including temperature, pH and ionic concentrations (Quinn, 1988; Somero, 1995; Alexov 2004; Dubyak, 2004; Talley and Alexov, 2010). These optimal conditions can vary between organisms and molecules. Organisms live in different environments and thus, the biological molecules inside them have gradually adapted for optimal function in their native environment over evolutionary time (Robb and Clark, 1999; Stetter, 1999; Rothschild and Mancinelli, 2001; Greaves and Warwicker, 2007; Garcia-Moreno, 2009; Tadeo et al., 2009; Ortega et al., 2011). However, even in cases of very stable environments organisms need to withstand some environmental changes such as those associated with seasons. Dramatic changes in the environmental condition can result in non-optimal functioning of the biological molecules and in certain cases, can completely inhibit their functions. In consequence, such changes could reduce the growth of the organisms and in extreme cases, result in cell death or death of the organism (Weiser and Hargiss, 1946; Domingo, 1994; Allakhverdiev et al., 2000; Vermeulen et al., 2008; Fulda et al., 2010; Cramer et al., 2011; Skirycz et al., 2011). Environmental conditions that detrimentally affect organisms are generally referred to as stressors. Stressors can be classified into two main categories – natural stressors – stressors that occur naturally in the environment and man-made stressors – stressors that are produced artificially due to human activity.

1.1.1 Natural stressors Natural stressors occur naturally in the environment. Some of these stressors are of non- biological origin (referred to as abiotic stressors) whereas others are due to biological organisms (referred to as biotic stressors).

1.1.1.1 Abiotic stressors Abiotic stressors include heat stress, cold stress, osmotic stress, salt stress, oxidative stress, pH stress and radiation stress. Heat stress is caused by higher than normal temperatures; whereas

9 Chapter 1

cold stress is caused by lower than normal temperatures. High temperatures inhibit proper folding of protein molecules and thereby inhibit the functioning of enzymes and protein complexes (Ahern and Klibanov, 1988; Jaenicke, 1991; Somero, 1995). Conversely, low temperatures inhibit cell growth via various mechanisms. They inhibit membrane transport, lower the stability of RNA and proteins, and cause non-optimal functioning of enzymes (Willing and Leopold, 1983; Privalov, 1990; Drobnis et al., 1993; Phadtare, 2004). Extremely low temperatures cause cellular water to freeze and cause freezing damage to cells (Mazur, 1970; Meryman, 1974; Burke, 1976; Gao and Critser, 2000).

Osmotic stress is caused by an imbalance between extracellular and intracellular solute concentrations. When extracellular solute concentrations are higher than the intracellular solute concentrations, intracellular water flows out of cells by osmosis. This results in higher than normal ionic concentrations inside cells which inhibit normal functioning of enzymes and cellular components (Hsiao, 1973; Brown, 1976; Binzel et al., 1988; Wood, 1999; Hohmann, 2002; Munns, 2002). Such environmental stress is referred to as hyperosmotic stress. Hyperosmotic stress can be caused by high extracellular concentrations of sugars or salts (Wood, 1999; Hohmann, 2002). In the opposite scenario, when intracellular solute concentrations are higher than the extracellular solute concentrations, extracellular water flows into the cell. This results in lower than optimal ionic concentrations inside cells, and leads to increased turgor pressure on the cellular membrane. In extreme cases, this increase pressure can cause bursting of cells (Wood, 1999; Hohmann, 2002).

Salt stress is caused by the presence of high salt (sodium chloride) in the extracellular medium. In addition to imposing hyperosmotic stress on cells, high salt causes hyperionic stress due to presence of high intracellular ionic concentrations that are toxic for cells (Binzel et al., 1988; Apse et al., 1999; Maathuis and Amtmann, 1999; Serrano et al., 1999; Hohmann, 2002).

pH stress is caused by a departure from a normal pH in the extracellular environment. Such changes in pH change in the polarity of amino acid side chains and can thus interfere with the proper functioning of enzymes and protein complexes (Kawai et al., 2005; Re et al., 2008; Chan and Warwicker, 2009; Talley and Alexov, 2010).

10 Chapter 1

Oxygen is essential for the survival of aerobic organisms. Oxygen participates in metabolic pathways in the cell. In the process, it generates several reactive oxygen species (ROS). Exposure to ROS can lead to oxidative damage of DNA, proteins and lipid molecules in the cell and can thus inhibit cell growth. Aerobic organisms have cellular antioxidants that tightly control the generation of ROS in the cells. However, changes in the environment such as the presence of hydrogen peroxide or menadione can lead to high concentrations of ROS inside cells (Cadenas, 1989; Halliwell and Gutteridge, 1999; Toledano et al., 2003). This environmental condition is referred to as oxidative stress. Oxidative stress causes a reduction in cell growth and, in extreme cases, cell death, through oxidative damage due to excess ROS (Halliwell and Gutteridge, 1984; Cadenas, 1989; Jamieson, 1992; Berlett and Stadtman ER, 1997; Cadet et al., 2003; Toledano et al., 2003). Conversely, a lack of oxygen can inhibit functioning of cells of aerobic organisms, a condition referred to as hypoxic stress (Voesenek et al., 2006; Majmundar et al., 2010).

Metal stress is caused by excess amounts of metals in the environment. High metal concentration can be toxic for cells not only due to their non-specific binding and inhibition of cellular molecules and complexes, but also due to generation of reactive oxygen species (Dietz et al., 1999; Ercal et al., 2001; Hall, 2002).

UV radiation from the sun is another stressor that occurs in the natural environment. Prolonged exposure to UV causes dimerization of thymine molecules in DNA (Setlow, 1966; Tyrrell, 1973) as well as DNA damage through oxidative damage to bases and DNA strand breaks (Wang and , 1986; Kvam and Tyrrell, 1997; Slieman and Nicholson, 2000; Cadet et al., 2005; Rastogi et al., 2010). Such damage to DNA can inhibit DNA replication and transcription (Protic-Sabljic and Kraemer, 1986; Mitchell et al., 1989; Donahue et al., 1994). This eventually leads to non- optimal amount of proteins and enzymes in a cell and results in inhibition of cell growth and, in extreme cases, cell death. Other kinds of high energy radiations such as radioactivity could also cause DNA damage (Löbrich et al., 1996; Newman et al., 1997; Trainor et al., 2012).

11 Chapter 1

1.1.1.2 Biotic stressors Biotic stressors are of biological origin and can be of several types, the first of which comprises antimicrobial compounds and microbial toxins. Antimicrobial compounds encompass a vast range of molecules that inhibit the growth of microbes and include antibiotics. Naturally occurring antibiotics are compounds produced by some groups of bacteria that inhibit the growth of other groups of microbes (Fleming, 1929; Walsh, 2003; Clardy, 2006). Antibiotics can inhibit growth in many ways. Some antibiotics interfere with the formation of the cell wall, whereas others inhibit the functioning of ribosomes (Wise and Park, 1965; Tomasz, 1979; Misumi and Tanaka, 1980; Chopra and Roberts, 2001). Similar compounds are synthesized by higher plants that inhibit the growth of fungi on a plant surface. These compounds are generally known as anti-fungal compounds (Grayer and Harborne, 1994; Grayer and Kokubun, 2001). Toxins are usually produced by microbes that inhibit the growth of other microbes but can also have detrimental effects on higher animals (Middlebrook and Dorland, 1984; Lubran, 1988; Montecucco et al., 1996).

Biotic stressors can also occur in the form of infection of an organism by microbes. One such example is infection of bacteria by bacteriophages. In the lytic cycle, bacteriophages use the resources of the host bacterium cell for its own replication and eventually cause cell death (Bertani, 1953; Rolfe and Campbell, 1977; Young, 1992). Similar examples are found in the infection of plants and animals by viruses, bacteria and fungi (Goens and Perdue, 2004; Waites and Talkington, 2004; Kämper et al., 2006; Butler et al., 2009; Dorer et al., 2009; Fisher et al., 2009; Heitman, 2011).

1.1.2 Man-made stressors Apart from naturally occurring stressors, biological organisms face stressors that are generated by human activity. Several of these stressors are compounds generated as by-products of industrial production, whereas others are specifically designed to cause stress. Processes such as mining, chemical industries and petroleum industries generate several inorganic and organic compounds that are released into the water and the environment (Nriagu and Pacyna, 1988; Lewis et al., 2011). These compounds can reduce the growth of the biological organisms due to toxic effects on cells and can interfere with biological processes such as development and

12 Chapter 1

reproduction (Casini et al., 1985; Domingo, 1994; Dodson and Hanazato, 1995; McCord, 1996; Giller et al., 1998; Lamhamdi et al., 2011). Compounds that are specifically designed and synthesized to kill organisms that infect plants, animals, and humans include herbicides, pesticides and antimicrobial drugs (Duke, 1990; Dresser and Rybak, 1998; Russell, 2002; Casida, 2009; González-Lamothe et al., 2009; Malmström et al., 2009).

Antimicrobial drugs inhibit the growth of microbes by interfering with essential biochemical and cellular reactions (Wood and Austrian, 1942; Brown, 1962; De Clercq, 1998; Krauth-Siegel and Coombs, 1999; Opperdoes and Michels, 2001; Bermingham and Derrick, 2002; Hawser et al., 2006). Reverse transcriptase inhibitors are one example of antimicrobial drugs that are used for treatment of HIV/AIDS (De Clercq, 1998; Gallant et al., 2003; Waters et al., 2007). The human immuno-deficiency virus (HIV) contains a reverse transcriptase enzyme that converts its RNA genome into cDNA which then infects human cells (Katz and Skalka, 1994; Sarafianos et al., 2009). The reverse transcriptase inhibitors inhibit the reverse transcription reaction and can thus stop viral infection (De Clercq, 1998; Gallant et al., 2003; Waters et al., 2007). Another example of antimicrobial drugs is the inhibitors of the folate biosynthesis pathways. Tetrahydrofolic acid is an essential cellular cofactor required for synthesis of purines and amino acids (Woods, 1964; Wagner, 1995). Biosynthesis of tetrahydrofolic acid in bacterial cell involves several enzymes, two of which are dihydrofolate reductase and dihyropterate synthase (Griffin and Brown, 1964; Reynolds and Brown, 1964; Burg and Brown, 1966; Richey and Brown, 1969; Mathis and Brown, 1970; Suzuki and Brown, 1974). Several antimicrobial drugs inhibit functioning of these enzymes, which causes a reduction in the cellular concentration of tetrahydrofolic acid, and ultimately inhibits cell growth (Wood and Austrian, 1942; Brown, 1962; Brown, 1967; McCullough and Maren, 1973; Lever et al., 1986; Bermingham and Derrick, 2002; Hawser et al., 2006; Hevener et al., 2010).

1.2 The stress response Organisms have evolved mechanisms that enable them to withstand the stressors that occur in their environment. These mechanisms are generally known as the stress response. In both unicellular and multicellular organisms, it is individual cells that ultimately respond to stressors. This is known as the cellular stress response. In multicellular organisms, in addition to the

13 Chapter 1

cellular stress response, there are higher order mechanisms that can help combat environmental stressors. These responses at the organismal level can comprise coordinated stress responses of multiple cells. I will discuss these responses in more detail in the following sections.

1.2.1 The cellular stress response All environmental stressors cause some sort of damage to molecules and components inside cells. The cellular response is a mechanism that protects cells against such damage and aids in repairing damaged components. The cellular stress response can differ from one stressor to another and may depend on the duration of exposure to the stressor. I will discuss these kinds of responses separately below.

1.2.1.1 The short term stress response The short term cellular stress response plays the most important role in the survival and protection of the cells. It usually happens in multiple stages. First, changes in cellular conditions trigger defense mechanisms that protect the cell from further damage. Second, cells become adapted, repair damaged cellular components and can survive and reproduce even under the stressful condition. Throughout these stages, cells monitor changes in the stressful environmental condition through cellular sensors and adjust the cellular response mechanisms accordingly (Wood, 1999; Hohmann, 2002; Hohmann and Mager, 2003). I will discuss an example of these phases in the hyperosmotic stress response and the oxidative stress response in yeast. Similar phases are also observed in response to other environmental stressors.

Hyperosmotic stress causes rapid loss of water from cells resulting in high intracellular concentrations of ions that inhibit cellular reactions. In addition, loss of water leads to a reduction in turgor pressure, to shrinking of cells, and to collapse of cytoskeletal structures (Hohmann, 2002). In response, cells first arrest growth. Further, cells activate the HOG MAP kinase pathway which induces expression of downstream genes that synthesize osmoprotectant molecules such as glycerol or trehalose (Yancey et al., 1982; Van Wuytswinkel et al., 2000; Hohmann, 2002; O'Rourke and Herskowitz, 2004). These osmoprotectant molecules accumulate inside the cell and slowly restore osmotic balance between the intracellular and extracellular medium. Finally, with restoration of osmotic balance, the damaged cytoskeleton is repaired and

14 Chapter 1

the cells grow and proliferate (Hohmann, 2002). Throughout these phases, two osmo-sensors Sln1p and Sho1p monitor environmental conditions (Maeda et al., 1994; Posas and Saito, 1997; Ostrander and Gorman, 1999; Hohmann, 2002). These osmo-sensors sense changes in the extracellular osmotic conditions and control expression of genes required for osmo-adaptation via MAP kinase signaling pathways (Van Wuytswinkel et al., 2000; Hohmann, 2002; O'Rourke and Herskowitz, 2004).

Oxidative stress is harmful to cells due to oxidative damage to proteins, lipids and DNA molecules by excess ROS (Halliwell and Gutteridge, 1984; Cadenas, 1989; Jamieson, 1992; Berlett and Stadtman ER, 1997; Cadet et al., 2003; Toledano et al., 2003). Upon exposure to oxidative stress, yeast cells activate antioxidant enzymes such as superoxide dismutases, catalase and peroxidases, which remove the harmful affects of ROS molecules via chemical modifications (Autor, 1982; Cohen et al., 1985; van Loon et al., 1986; Grant et al., 1998; Luikenhuis et al., 1998; Culotta, 2000; Grant, 2001; Sturtz et al., 2001; Toledano et al., 2003; Lushchak and Gospodaryov, 2005). In this process, Yap1p act as a sensor of oxidant concentrations inside yeast cells and regulate expression of antioxidant enzymes (Stephen et al., 1995; Delaunay et al., 2000; Ikner and Shiozaki, 2005; Temple et al., 2005; Okazaki et al., 2007).

1.2.1.2 Long term adaptation to stressors The short term stress response usually involves changes in the regulation of genes that are essential for adaptation to an environmental stressor. Such changes are usually mediated by changes in gene expression and are usually transient (Gasch et al., 2000; Causton et al., 2001; Petersohn et al., 2001; Price et al., 2001). In contrast, long exposure to an environmental stressor, ranging from hundreds to thousands of generations, result in heritable changes in cells that allow permanent adaptation to a persistently stressful environment. These heritable changes can be genetic or epigenetic and result in cellular and physiological adaptations for better survival in an altered environment.

Extremophiles are good examples of long term adaptation to environmental stressors (Macelroy, 1974; Madigan and Marrs, 1997; Rothschild and Mancinelli, 2001). “Extremophile” generally refers to an organism that lives in natural environments with extreme temperatures, pH, salinity,

15 Chapter 1

radiation or pressure (Macelroy, 1974; Madigan and Marrs, 1997; Rothschild and Mancinelli, 2001). These organisms have evolved mechanisms that allow them to survive in such extreme environments. Hyperthermophiles are one such group of extremophiles. They live and proliferate at high temperatures often exceeding 80°C (Stetter, 1999; Rothschild and Mancinelli, 2001). At such temperatures, DNA double strands become unstable and are spontaneously denaturated. However, in hyperthermophiles, an enzyme called reverse gyrase stabilizes genomic DNA by introducing positive superhelical turns (Kikuchi and Asai, 1984; Rodríguez and Stock, 2002). This enzyme is unique to hyperthermophiles and is only active at high temperatures (Kikuchi and Asai, 1984; Bouthier de la Tour et al., 1990). Similar examples are found in the evolutionary adaptation of proteins to high temperatures (Hensel et al., 1992; Klump et al., 1992; Robb and Clark, 1999).

Resistance to antibiotics and antimicrobial compounds is another example of long term adaptation to environmental stressors through genetic changes. Microbes are often exposed to antimicrobial compounds and they have evolved several mechanisms to survive such stressors. Some microbes have antibiotic resistance genes that remove the harmful effects of antibiotic molecules by modifying them chemically (Sutcliffe, 1978; Miyamura et al., 1977; Gray and Fitch, 1983; Speer et al., 1992; Davies, 1994; Damblon et al., 1996; Schnappinger and Hillen, 1996; Briggs and Fratamico, 1999). Some microbes have evolved drug efflux systems that transport antimicrobial compounds out of the cell, thereby removing their toxic effects (Nilsen et al., 1996; Sutcliffe et al., 1996; Marshall and Piddock, 1997; Zgurskaya and Nikaido, 2000; Poole, 2001; Webber and Piddock, 2003). Several antimicrobial compounds inhibit the growth of microbes by binding and inhibiting functions of macromolecules involved in essential cellular processes. Resistance to antimicrobials can also occur through genetic changes that alter these macromolecules in such a way that the antimicrobial compounds can not bind and inhibit their functions (Weisblum, 1995; Triglia et al., 1997; Sköld, 2000; Tait-Kamradt et al., 2000; Lee et al., 2001; Farrell et al., 2003; Meka et al., 2004; Wolter et al., 2005).

Long term adaptation to new environments can also occur through epigenetic changes and through cellular “memory” that remodel expression of genes required for evolutionary adaptation (Turner, 1998; Turner, 2002; Ringrose and Paro, 2004; Casadesús and Low, 2006;

16 Chapter 1

Zacharioudakis et al., 2007). These changes are passed on across generations and can aid in survival of the offspring in a stressful environment (Adam et al., 2008; Chinnusamy and Zhu, 2009; Seong et al., 2011; Crews et al., 2012).

1.2.2 Organismal stress response In addition to the stress responses at the cellular level, multicellular organisms can adapt to environmental changes through responses at the level of the organism. In higher animals and plants, the short term stress response at the organismal level can be mediated by hormones (Vaadia, 1976; Charmandari et al., 2005; Achard et al., 2006; Achard et al., 2007; Wang et al., 2007) and is often intertwined with components of the cellular stress response (Blake et al., 1991; Cvoro et al., 1998; Deane et al., 1999). In addition, long term adaptation to stressors at the organismal level can also occur through changes in body structure or behavior. Adaptations in the root structure of mangrove and wetland plants are one such example. These plants grow in soils that experience regular flooding, which results in low oxygen content in the soil. Thus, to survive in such anaerobic conditions, these plants have evolved aerial roots that allow direct uptake of oxygen from air, and aerenchyma tissues that increase internal oxygen transfer between tissues (Justin and Armstrong, 1987; Jackson and Armstrong, 1999; Pi et al., 2009). Another example of organismal adaptations to stressors is the mammalian immune system. Immune systems are formed by the highly coordinated action of multiple cell types that help their hosts to survive biotic stressors (Kawai and Akira, 2006; Pancer and Cooper, 2006; Medzhitov, 2007). Other examples of organismal adaptations to stressors are migration and hibernation of animals in winter. In winter, birds from northern latitudes often migrate to temperate and warm climates (McNamara et al., 1998; Bairlein et al., 2012). Such migration helps them avoid cold stress and nutrient depletion in winter. Some groups of fish become dormant in winter, which helps them survive extreme cold and food depletion (Lyman et al., 1982; Crawshaw, 1984; Gieser et al., 1996; Campbell et al., 2008).

Unicellular organisms also have stress response mechanisms that resemble the organismal stress response in multicellular organisms, in that a group of cells participates in the stress response process. One such example is bacterial biofilms (Costerton et al., 1987; Hall-Stoodley et al., 2004). Biofilms consist of multiple bacterial cells embedded in an extracellular matrix

17 Chapter 1

(Sutherland, 2001; Flemming and Wingender, 2010). Biofilms often show increased resistance to environmental stressors, including antibiotics, in comparison to their individual counterparts (Cochran et al., 2000; Lewis, 2001; Stewart and Costerton, 2001; Kubota et al., 2009; Høiby et al., 2010; Bridier et al., 2011). Several mechanisms have been described that explain increased resistance, one of which is the non-permeability of antibiotics molecules through the extracellular matrix of biofilms (Brown et al., 1988; Evans et al., 1991; Hoyle et al., 1992; Anderl et al., 2000; Mah and O'Toole, 2001; Stewart, 2002).

1.3 Applications of the cellular stress response Studying the stress response in general and the cellular stress response in particular has important implications for agriculture, industrial processes, and medicine. In agriculture, for example, abiotic stressors are one of the major factors that contribute to the loss of crops (Boyer, 1982; Moffat, 2002; Mittler, 2006). Thus, a detailed understanding of the stress response in plants might allow researchers to engineer plants and protect them from such stressors in the fields. Yeast is used for various industrial processes, such as baking bread, producing ethanol, and making wine. In these processes, yeast cells encounter a diverse set of stressors that inhibit their growth and cellular metabolism (Codón et al., 1998; Trainotti and Stambuk, 2001; Pérez-Torrado et al., 2005; Gibson et al., 2007). Thus, understanding the stress response in yeast cells could have a significant positive effect on the industry (Attfield, 1997; Zheng et al., 2011). Finally, the stress response is also associated with ageing, apoptosis, the immune response and cancer (Jolly and Morimoto, 2000; Herr and Debatin, 2001; Padgett and Glaser, 2003; Todd et al., 2008; Fulda et al., 2010; Kourtis and Tavernarakis, 2011). A better understanding of mechanisms behind the stress response might thus allow researchers to devise better therapeutic strategies (Tomida and Tsuruo, 1999; Brown and Bicknell, 2001; Herr and Debatin, 2001; Fulda, 2010).

1.4 Yeast as a model system for studying cellular stress response For various reasons, yeast is widely used as a model organism for studying stress response in eukaryotes. First, the MAPK pathways that play important roles in the stress response in yeast are conserved in all eukaryotes, from fungi to higher animals and plants (Kosako et al., 1993; Neiman AM, 1993; Waskiewicz and Cooper, 1995). These pathways are also important for the stress response in all eukaryotes (Galcheva-Gargova et al., 1994; Han et al., 1994; Waskiewicz

18 Chapter 1

and Cooper, 1995; Jonak et al., 1996; Kovtun et al., 2000; Kyriakis and Avruch, 2001; Teige et al, 2004). Second, several cellular components and proteins that play important roles in the stress response mechanisms are conserved in eukaryotes (DiRuggiero et al., 1999; Kültz D, 2003; Thompson et al., 2008). One such example is heat shock proteins that are central to the stress response in eukaryotes (Lindquist and Craig, 1988; Lindquist, 1992; Gething, 1997; Feder and Hofmann, 1999). Another example is the use of similar calcium-dependent mechanisms and ion detoxification mechanisms for salt stress response in yeast and plants (Mendoza et al., 1994; Bressan et al., 1998; Pardo et al., 1998; Lee et al., 1999; Gisbert et al., 2000; Quintero et al., 2000; Yang et al., 2001; Quintero et al., 2002). Finally, yeast has been used for a long time as a model system for experimental studies. Thus, in addition to the known genome sequence and the well-understood functions of many genes, many molecular and genetic tools exist that can aid in detailed experimental analysis of the stress response in yeast (Oliver et al., 1998; Johnston M, 2000; Forsburg, 2001; Kumar and Snyder, 2001).

1.5 The short term stress response in yeast The short term stress response to environmental stressors in yeast has been studied for a long time through analysis of global changes in gene expression in response to stressors (Jelinsky and Samson, 1999; Gasch et al., 2000; Posas et al., 2000; Rep et al., 2000; Causton et al., 2001; Yale and Bohnert, 2001), as well as through studies on individual genes (Mendizabal et al., 1998; Higgins et al., 2002; Warringer et al., 2003; Outten et al., 2005; van Voorst et al., 2006; Yoshikawa et al., 2009).These studies have uncovered thousands of genes that change expression in response to various stressors and are essential for survival of yeast cells under these stressors. Approximately 66% of the genes in the yeast genome respond to one environmental stressor or another (Causton et al., 2001).

1.5.1 General stress response Various environmental stressors can inhibit growth and pose a threat to the survival of yeast cells in various ways. However, approximately 10 to 15% of all the genes in yeast respond in the same way to multiple environmental stressors (Gasch et al., 2000; Causton et al., 2001). These observations suggest that various environmental stressors actually pose some common cellular threats to the yeast cells which elicit common stress response mechanisms. This is usually

19 Chapter 1

referred to as a “general stress response” or a “common environmental response” or an “environmental stress response” (Hecker and Völker, 1990; Gasch et al., 2000; Causton et al., 2001). Approximately 30 to 40% of the genes (depending on the study) involved in the general stress response are induced in response to multiple stressors in yeast, whereas about 60-70% of these genes are repressed (Gasch et al., 2000; Causton et al., 2001). The genes induced in general stress response are involved in various cellular functions such as carbohydrate metabolism, protein folding, protection against reactive oxygen species and protein degradation (Gasch et al., 2000; Causton et al., 2001). Several of these genes contain stress response element (STRE) sequences in their promoter regions (Moskvina et al., 1998; Treger et al., 1998). Expression of these genes is induced by binding of two transcription factors Msn2p and Msn4p to these elements under stressful conditions (Martínez-Pastor et al., 1996; and McEntee, 1996). The genes repressed in the general stress response include genes associated with translation & protein synthesis, nucleotide biosynthesis as well as tRNA synthesis (Gasch et al., 2000; Causton et al., 2001).

The general stress response reflects common threats to the survival of yeast cells imposed by diverse stressors which is uncovered by the molecular functions of the genes that participate in the general stress response. For example, multiple environmental stressors could cause misfolding of proteins important for cellular functions. Thus, higher expression of molecular chaperones that aid in proper folding of the proteins can allow the cells to maintain normal cellular functions (Lindquist, 1992; Becker and Craig, 1994; Feder and Hofmann, 1999). Similarly, induction of genes that are part of the protein degradation machinery can ensure swift and efficient degradation of misfolded proteins that could otherwise be toxic to the cells (Bucciantini et al., 2002; Geiler-Samerotte et al., 2011). Several environmental stressors cause damage to cellular components through reactive oxygen species (Cazalé et al., 1998; Dietz et al., 1999; Ercal et al., 2001; Hall, 2002; Lin and Kao, 2002). Induction of genes associated with detoxification of reactive oxygen species can protect the cells against such oxidative damage. In contrast, the genes repressed in the general stress response are associated with cell growth and proliferation, and are non-essential for initial survival of cells under stressful conditions (Gasch et al., 2000; Causton et al., 2001).

20 Chapter 1

1.5.2 Stressor-specific response In addition to posing common threats to cells, various environmental stressors affect particular aspects of cell growth and survival in yeast. Thus, yeast cells have evolved specific stress response mechanisms that are also reflected in patterns of changes in gene expression (Gasch et al., 2000; Causton et al., 2001). For example, salt stress and acid stress impose ionic stress on yeast cells due to changes in ions and ionic concentrations inside cells (Apse et al., 1999; Holyoak et al., 1999; Maathuis and Amtmann, 1999; Serrano et al., 1999; Hohmann, 2002). In these stressors, expression of the ENA1 gene is induced (Causton et al., 2001).This gene encodes for a protein that functions as the primary sodium extrusion pump in the yeast cells (Haro et al., 1991; Wieland et al., 1995). High expression of this protein helps in restoration of cellular ionic concentration. Knocking out this gene in yeast reduces resistance to salt stress and tolerance to high metal concentration (Kinclova-Zimmermannova et al., 2006; Ye et al., 2008). Another example of stressor-specific response is induction of the gene LHS1 in the yeast cells in response to heat stress (Causton et al., 2001). Heat stress causes misfolding of protein molecules required for important cellular functions. This gene encodes for a molecular chaperone that helps in proper folding of proteins, thereby allowing them to perform their cellular roles even under heat stress (Baxter et al., 1995; Saris et al., 1997).

1.6 Long term adaptation to stressors in yeast Long term adaptation of yeast to environmental stressors is a poorly studied topic compared to the short term stress response. There are few studies that exposed yeast cells continuously to environmental stressors in the lab for a few hundred generations, and subsequently investigated the genomic changes associated with such long term adaptation. These studies have uncovered nucleotide polymorphisms, changes in gene expression, gene amplifications, deletions and chromosomal rearrangements associated with adaptations to the stressful environments in the laboratory (Adams and Oeller, 1986; Adams et al., 1992; Ferea et al., 1999; Dunham et al., 2002; Gresham et al., 2008). Another approach of investigating adaptation to stressors has been to study the genomic changes in industrial strains of yeast that regularly face various environmental stressors during the industrial processes (Bidenne et al., 1992; Codón et al., 1998; Lucena et al., 2007). These studies have also uncovered genetic changes associated with adaptation to environmental stressors.

21 Chapter 1

1.7 Mechanisms of evolutionary adaptation to changing environments In Nature, some environmental stressors occur in a regular cycle whereas others occur randomly. Moreover, some environmental stressors often occur together or in succession. For example, when microbes enter mammalian guts from their natural environment, they experience temperature stress caused by an increase in temperature and pH stress due to low pH in the gut. Similarly, temperature and light levels change along day–night cycles and soil properties change with seasons (Fierer et al., 2003; Bapiri et al., 2010; van der Linden et al., 2010; Evans and Wallenstein, 2012). For example, very high temperatures or heat stress can cause a loss of moisture from the soil resulting in drought stress on the soil microbes and plants (Craufurd and Peacock, 1993; Machado and Paulsen, 2001). Since such changes are common in the natural environment, biological organisms have evolved several evolutionary adaptation mechanisms of which three are well described.

1.7.1 Stochastic switching Stochastic switching refers to variation in expression of genes among the individuals of a population (McAdams and Arkin, 1997; Raser and O'Shea, 2005; Raj and van Oudenaarden, 2008). Stochasticity in gene expression occurs due to several cellular factors (Guptasarma, 1995; Swain et al., 2002; Raser and O'Shea; 2004). Such stochasticity can generate phenotypic heterogeneity even among isogenic individuals (McAdams and Arkin, 1997; Raser and O'Shea, 2005; Raj and van Oudenaarden, 2008), and could be especially beneficial in stochastic environmental fluctuations (Thattai and van Oudenaarden, 2004; Kussell and Leibler, 2005; Acar et al., 2008; Donaldson-Matasci et al., 2008; Gaál et al., 2010). The phenotypic heterogeneity generated by stochastic gene expression can ensure that at least some of the individuals in a population would always express those genes that would aid them to survive in the presence of certain environmental stressors (Attfield et al., 2001; Booth, 2002; Sumner and Avery, 2002; Wolf et al., 2005; Blake et al., 2006; Bishop et al., 2007). Such a phenomenon is also known as “bet-hedging” (Montgomery, 1974; Philippi and Seger, 1989; de Jong et al., 2011). Stochastic phenotypic variations is a common feature of biological systems (Balaban et al., 2004; Maamar et al., 2007; Childs et al., 2010; Raj et al., 2010; Levy et al., 2012). For example, about 20% of the genes in the yeast Saccharomyces cerevisiae contain TATA box in the promoter region

22 Chapter 1

(Basehoar et al., 2004), whose sequence can affect the stochastic variation in expression level of a gene (Raser and O'Shea, 2004; Blake et al., 2006). Stress induced genes in yeast are significantly enriched for TATA box containing promoters (Basehoar et al., 2004), suggesting that the stochastic variation in the expression of stress response genes could be beneficial for cells under fluctuating environments. Evolution of stochastic phenotype switching as a stable survival strategy has also been observed in an experimental population of Pseudomonas fluorescens that were grown in a fluctuating environment (Beaumont et al., 2009; Libby and Rainey, 2011; Rainey et al., 2011). Stochastic variation can also be advantageous for survival of bacterial populations facing a periodic appearance of antibiotics (Balaban et al., 2004; Keren et al., 2004). It can also make bacterial cells transiently competent that might allow uptake of beneficial DNA molecules (Maamar et al., 2007). Furthermore, stochastic variation in gene expression is also important for generating antigenic diversity in pathogenic organisms that can help them evade immune reactions (van Ham et al., 1993; Moxon et al., 1994).

1.7.2 Stress cross-protection A second mechanism of adaptation to changing environments is stress cross-protection. Stress cross-protection is a phenomenon where exposure to mild doses of one stressor protects a cell or an organism against a subsequent lethal dose of the same or a different environmental stressor. There are numerous examples of stress cross-protection in phage, bacteria, yeast, plants and mammalian cell lines (Mitchel and Morrison, 1982, 1983; Coote et al., 1991; Völker et al., 1992; Humphrey et al., 1993; Leyer and Johnson, 1993; Lou and Yousef, 1997; Wang and Doyle, 1998; Maleck et al., 2000; Cullum et al., 2001; Pereira et al., 2001; Talalay and Fahey, 2001; Chinnusamy et al., 2004; Durrant and Dong, 2004; Kandror et al., 2004; Palhano et al., 2004; Charng et al., 2006; Greenacre and Brocklehurst, 2006; Berry and Gasch, 2008). However, such cross-protection among various stressors is not universal, and could be symmetric or asymmetric in nature. In symmetric cross-protection, exposure to stressor 1 would protect cells against subsequent occurrence of stressor 2, while exposure to stressor 2 would also protect cells against subsequent occurrence of stressor 1. Asymmetric cross-protection would work in only one of these directions. An example of symmetric stress cross-protection can be found in the yeast Saccharomyces cerevisiae. Exposure to mild salt stress has been shown to protect yeast cells against a subsequent severe dose of oxidative stress, while exposure to mild oxidative stress also

23 Chapter 1

protected yeast cells against a subsequent salt stress (Berry and Gasch, 2008). Asymmetric stress cross-protection also exists in yeast. Hydrogen peroxide and menadione both cause oxidative stress in yeast (Jamieson, 1992; Toledano et al., 2003). Exposure to menadione protects yeast cells against exposure to hydrogen peroxide but not vice versa (Jamieson, 1992). Stress cross- protection could occur by induction of common stress response proteins (such as heat shock proteins) that are required for adaptation to both environmental stressors (Lu et al., 1993; Scholz et al., 2005). On the other hand, cross-protection could also occur via distinct adaptation mechanisms that could be dependent on the first stressor (Berry et al., 2011).

1.7.3 Anticipatory regulation Some microbes have evolved a predictive mechanism that allows them to anticipate future environmental changes based on past environmental changes. This mechanism of adaptation is akin to a mechanism of “learning” in higher animals (Pavlov, 1927; Pearce and Hall, 1980). We can understand it through one hypothetical example. Let us consider a scenario where stressor 1 serves as an anticipatory signal for stressor 2. In anticipatory regulation, when cells are exposed to stressor 1, they change expression of not only those genes that allow them to adapt to stressor 1 but also of some of the genes that are necessary for adaptation to the stressor 2. Thus, even though such changes in gene expression could be neutral or even maladaptive in stressor 1, anticipatory regulation could be beneficial for cells when environmental changes are periodic in nature, and where stressor 2 always occurs after stressor 1. A theoretical study has investigated the environmental scenarios where such predictive mechanism could be beneficial (Mitchell and Pilpel, 2011). Similar to stress cross-protection, anticipatory regulation can be symmetric or asymmetric in nature (Mitchell et al., 2009).

Anticipatory regulation exists in Escherichia coli and yeast Saccharomyces cerevisiae. One of the environmental habitats of E. coli that shows a predictable environmental change is the mammalian gastrointestinal tract, where E. coli is first exposed to the sugar lactose followed by exposure to maltose (Savageau, 1998). Thus, anticipatory regulation, where lactose acts as an anticipatory signal for maltose would be beneficial for the cells. Indeed, exposure to lactose not only results in induction of genes required for utilization of lactose but also induces genes required for utilization of maltose (Mitchell et al., 2009). Passage of the E. coli cells from the

24 Chapter 1

outside environment to the mammalian gastrointestinal tract is also accompanied by increased temperature and lowered oxygen concentration (Tagkopoulos et al., 2008). Anticipatory regulation could be beneficial for cells under such regular environmental changes, and the sets of genes that respond to temperature increase and to low oxygen intersect significantly (Tagkopoulos et al., 2008). Yet another example of predictable environmental change is the wine production process where yeast cells experience heat stress followed by ethanol stress which is then followed by oxidative stress (Mitchell et al., 2009). Here again, anticipatory regulation could be beneficial. Indeed, heat stress can be an anticipatory signal for oxidative stress in yeast (Mitchell et al., 2009). Similar anticipatory mechanisms are present in Vibrio cholerae and Candida albicans (Schild et al., 2007; Rodaki et al., 2009).

1.8 Gene duplication Long term adaptation to new environments often requires genetic changes. Gene duplication is one kind of genetic change that could help in such adaptations (Kondrashov, 2012). Gene duplication refers to duplication of genetic material in a genome. Duplication can occur at the whole genome level – a phenomenon called whole genome duplication (WGD). Duplication could also occur at the level of a single gene, or at the level of a segment of the genome. Gene duplication is a special case of gene amplification, where the copy number of a gene or a genomic segment can increase more than twofold.

Gene duplication is one of the fundamental forces of evolution. It has been estimated that gene duplications occur as frequently as mutations (Lynch and Conery, 2000). Gene duplications are considered to have played an important role in the evolution of many organisms (Wolfe and Shields, 1997; Makalowski, 2001; Samonte and Eichler, 2002; Jaillon et al., 2004; Kellis et al., 2004; Stankiewicz et al., 2004; Cheng et al., 2005; Dehal and Boore, 2005; Donoghue and Purnell, 2005; Zahn et al. 2005; Scannell et al., 2006; Wapinski et al., 2007; Marques-Bonet et al., 2009; Santini et al., 2009; Jiao et al., 2011). Many researchers also consider gene duplication as an important step in speciation (Sidow, 1996; Spring, 1997; Lynch and Force, 2000a; Taylor et al., 2001). Furthermore, gene duplications have contributed to the evolution of gene networks and protein complexes (Teichmann and Babu, 2004; Hittinger and Carroll, 2007; Pereira-Leal et al., 2007; Szklarczyk et al., 2008).

25 Chapter 1

One of the most important contributions of gene duplication is its role in evolutionary innovations in biological organisms (Ohno, 1970; Orgel, 1977; Walsh, 1995; Long, 2001; Hughes, 2005). Such innovations can often help organisms survive in and adapt to new environments. In addition to creating new gene functions, there are several other mechanisms by which gene duplications could aid in adaptation to new environments. I will discuss all of them in detail in the following sections.

1.8.1 Gene duplication as a mechanism of adaptation to new environments 1.8.1.1 Gene dosage increase Gene duplication can increase the concentration of an expressed gene product which can be beneficial for cells in a new environment (Kondrashov et al., 2002). The earliest example comes from an experimental population of Escherichia coli grown in a lactose limited medium over many generations in a chemostat. This gave rise to fitter strains that synthesized higher amount of ß-galactosidase, the enzyme essential for utilization of lactose. This increase in expression was due to amplification of the lacZ gene that encodes the ß-galactosidase enzyme (Horiuchi et al., 1962; Horiuchi et al., 1963). Since then, several studies have associated gene amplifications and duplications with adaptation of bacteria and yeast to nutrient-limited and novel environments (Straus, 1975; Sonti and Roth, 1989; Brown et al., 1998; Riehle et al., 2001; Dunham et al., 2002; Infante et al., 2003; Reams and Neidle, 2003). Gene amplifications have also been suggested to be involved in metal tolerance in bacteria, yeast, drosophila and plants (Fogel and Welch, 1982; Maroni et al., 1987; Kondratyeva et al., 1995; Tohoyama et al., 1996; van Hoof et al., 2001; Adamo et al., 2012). Gene amplifications, through increased dosage, are also thought to be responsible for antibiotic resistance in bacteria (Rownd and Mickel, 1971; Clewell et al., 1975; Normark et al., 1977; Matthews and Stewart, 1988; Nichols and Guay, 1989; Ives and Bott, 1990; Musher et al., 2002; Nilsson et al., 2006; Sandegren and Andersson, 2009), insecticide resistance in insects (Tabashnik, 1990; Raymond et al., 1993), and drug resistance in human pathogens (Foote et al., 1989; Segovia et al., 1994; Nair et al., 2007).

An example of adaptation to a novel environment through increased gene dosage can be found in the yeast, Saccharomyces cerevisiae. Conant and Wolfe (Conant and Wolfe 2007) observed that

26 Chapter 1

several enzymes in the glycolytic pathway in yeast were retained in two copies since an ancient whole genome duplication (WGD). They proposed that duplicate metabolic genes could increase the glycolytic flux and ethanol production in yeast. They speculated that the increased glycolytic flux was selectively advantageous for yeast in a glucose rich environment. A recent experimental study investigated the fitness contributions of the duplicate metabolic genes in yeast using knock-out strains (DeLuna et al., 2008), and showed that the majority of the duplicate metabolic genes in yeast retains a significant functional overlap (DeLuna et al., 2008), suggesting that increased metabolic flux was indeed beneficial for yeast cells (Kuepfer et al. 2005, DeLuna et al., 2008).

1.8.1.2 Neo-functionalization In his book “Evolution by Gene Duplication” (Ohno, 1970), Susumu Ohno proposed that as the duplicate gene copy is redundant in function, it experiences relaxed selection and thus, can accumulate mutations and evolve new functions. These new functions can be beneficial when an organism adapts to a new environment. There are several known examples of such neo- functionalization of duplicate genes (Chen et al., 1997; Zhang et al., 2002; Rodríguez-Trelles et al., 2003; Lynch, 2007). Two main models have been proposed to explain how duplicate genes could facilitate evolution of new functions, the Innovation-Amplification-Divergence (IAD) model (Bergthorsson et al., 2007; Näsvall et al., 2012), and the Escape from Adaptive Conflict (EAC) model (Des Marais and Rausher, 2008; Barkman and Zhang, 2009).

Proteins, in addition to their primary functions, often can have secondary or minor functions (Copley, 2003; Khersonsky et al., 2006). Although these functions might not be useful in the native environment of an organism, they can become beneficial in a new environment. In such a new environment, gene amplification would increase the concentration of protein product, resulting in an increased activity of these beneficial minor functions. Subsequently, on evolutionary time scales, one of the duplicate copies can improve its minor function by accumulating mutations, which would then help in adaptation to the new environment. Indeed, amplification of genes with minor functions is beneficial for bacteria under stressful environmental conditions (Soo et al., 2011). A recent study in Salmonella typhimurium has experimentally demonstrated how such minor activity could be beneficial in a nutrient-limited

27 Chapter 1

environment (Näsvall et al., 2012). They also observed that, within 3000 generations, one of the duplicate copies improved its minor function which resulted in increased growth of the cells in this nutrient-limited environment (Näsvall et al., 2012).

In the EAC model, a single gene with its primary and secondary functions is constrained from specializing on any of these functions because of conflicts arising due to pleiotropic effects - mutations that improve one function worsen other. Gene duplication can resolve this conflict by allowing each copy to specialize on one function or the other (Des Marais and Rausher, 2008; Barkman and Zhang, 2009). Such functional specialization can often aid in adaptation to new environments. One example of neo-functionalization consistent with the EAC model is the evolution of type III anti-freeze protein in Antarctic zoarcid fish from an old sialic acid synthase (SAS) gene (Deng et al., 2010). The ancestral SAS gene encodes an SAS enzyme present in microbes and vertebrates (Gunawan et al., 2005; Reaves et al., 2008). In contrast, the AFPIII gene is a secretory protein that causes freezing point depression through binding ice and interfering with ice crystal growth. The ancestral SAS gene also has rudimentary ice-binding capability. After duplication, the N-terminal signal sequence in one of the daughter genes was replaced with a new signal sequence that allowed secretion of the protein product. Subsequently, the AFPII gene specialized for the anti-freeze function through accumulation of mutations, and is now essential for the survival of zoarcid fish in the cold Antarctic ocean (Deng et al., 2010).

1.8.1.3 Sub-functionalization and Amplification-mutagenesis In 1999, Force et al. proposed that since duplicate genes are redundant, both the genes experience partially relaxed selection (Force et al., 1999). These genes can accumulate mutations such that each gene comes to perform only a part of the ancestral function, but together they maintain the original function (Force et al., 1999; Lynch and Force, 2000b). This model is referred to as the Duplication-Degenration-Complementation (DDC) model (Force et al., 1999). Although sub-functionalization can not directly contribute to adaptation to new environments, it can allow maintenance of duplicate genes for long period of time and could eventually allow evolution of new functions (He and Zhang, 2005; Rastogi and Liberles, 2005; MacCarthy and Bergman, 2007). Similarly, it has been suggested that gene duplications can accelerate evolution in new or stressful environments by providing more genetic material for mutation to work on.

28 Chapter 1

This is referred to as the Amplification-Mutagnesis hypothesis (Hendrickson et al., 2002; Roth and Andersson, 2004). The role of this mechanism in adaptation to nutrient-limited and stressful environments has been demonstrated experimentally (Andersson et al., 1998; Hendrickson et al., 2002).

1.9 Experimental evolution The main objectives of the study of evolution are to explain the diversity of biological organisms on earth, to understand the relationships between genotypes and phenotypes and to investigate the roles of various factors, environmental and genetic, in the evolution of organisms. Evolutionary processes are primarily studied from the data on existing populations of biological organisms (Hardison et al., 1997; Batzoglou et al., 2000; Boffelli et al., 2003; Bowers et al., 2003; Cliften et al., 2003; Hardison, 2003; Kellis et al., 2003; Thomas et al., 2003; Edwards 2009). These populations are products of past evolution and were shaped by evolutionary forces of mutation, selection, and neutral drift as well as by past environmental conditions and population structure (Nei et al., 1975; Via and Lande, 1985; Griffiths et al., 2000; Elena et al., 2007; Hartl and Clark, 2007; Handel and Rozen, 2009). Thus, knowledge of past environment or population can sometimes be essential to correctly infer the evolutionary processes in the past. This can often be an obstacle because data on past environmental conditions and population structure are not always available.

Experimental evolution provides an alternative approach for studying evolution that can overcome some of the limitations mentioned above. In experimental evolution, a laboratory population of an organism is grown under controlled environmental conditions (Garland and Rose, 2009, Kawecki, 2012). Such studies retain information about environmental condition and population structure, and they allow researchers to understand the evolutionary process in great detail in real time. One of the most notable studies of experimental evolution is the evolution of Escherichia coli for more than 50,000 generations in the laboratory (Lenski, 2011; Le Gac et al., 2012). Experimental evolution studies have been performed on a diverse set of organisms ranging from viruses, bacteria, and yeast to insects, nematodes, fish and plants (Ebert et al., 2002; Velicer and Yu, 2003; Leu and Murray, 2006; Magalhães et al., 2007; Hollis et al., 2009; Koskella and Lively, 2009; Morran et al., 2009; Burke et al., 2010; Barrett et al., 2011; Roels and

29 Chapter 1

Kelly, 2011; Meyer et al., 2012). These studies have allowed researchers to explore the roles of evolutionary processes and the environment, as well as to investigate the relationship between these factors (Bryant and Meffert, 1993; Travisano et al., 1995; Reboud and Bell, 1997; Rainey and Travisano, 1998; Whitlock et al., 2002; Rundle, 2003; Beaumont et al., 2009; Kolss et al., 2009; Yeaman et al., 2010). In addition, they have allowed researchers to experimentally test several existing hypotheses on evolution as well as posit new ones (Dodd, 1989; Moya et al., 1995; Burch and Chao, 1999; Kerr et al., 2006; Porcher et al., 2006; Woods et al., 2006; Kuzdzal-Fick et al., 2011; Manhes and Velicer, 2011; Morran et al., 2011). One such example is the evolution of Escherichia coli strains with different mutation rates in the laboratory (Visser et al., 1999). High mutation rates are often observed in pathogenic bacteria, and it has been suggested that such high mutation rates could be beneficial in adaptive evolution (LeClerc et al., 1996; Matic et al., 1997; Taddei et al., 1997). However, by evolving E. coli strains with different mutation rates in the laboratory, the above-mentioned study established that high mutation rates could be beneficial only in limited circumstances (Visser et al., 1999). Another example of experimental evolution is a study of evolution of reproductive isolation in the fungus Neurospora (Dettman et al., 2008). According to the ecological speciation model, adaptation of a population to divergent environments can lead to reproductive isolation (Dobzhansky, 1937; Muller, 1942; Orr and Turelli, 2001; Schluter, 2001; Rundle and Nosil, 2005). This prediction was confirmed by an experimental evolution study in Neurospora, which observed lower reproductive fitness for mating of lineages adapted to divergent selection conditions, compared to lineages selected in the same environment (Dettman et al., 2008).

1.10 Aim of the Thesis In this work, I used experimental evolution and genomic techniques to examine the mechanisms of adaptation in microbes to new laboratory environments. In Chapter 2 of the thesis, I describe my research on the investigation of the genomic changes associated with adaptation of yeast (Saccharomyces cerevisiae) populations to saline stress evolved in the laboratory for 300 generations. I studied genomic changes in these populations through next generation sequencing, expression microarrays and pulsed field gel electrophoresis (PFGE). Although the stress response has been widely studied in many organisms, most of these studies focused on responses on very short time scales of the order of hours (Jelinsky and Samson, 1999; Gasch et al., 2000;

30 Chapter 1

Posas et al., 2000; Rep et al., 2000; Causton et al., 2001; Yale and Bohnert, 2001). Very few studies have investigated long term adaptations where a stressor can persist for hundreds or thousands of generations (Adams et al., 1992; Ferea et al., 1999; Riehle et al., 2001; Dunham et al., 2002; Bennett and Lenski, 2007; Gresham et al., 2008; Rudolph et al., 2010). To my knowledge, this is the first study to investigate the mechanisms of long term adaptation to salt stress in an organism. In this work, I prepared samples for genome sequencing, carried out the PFGE experiments, and analyzed all data. In Chapter 3 of the thesis, I describe my research on the adaptation of yeast to a regular cycling environment. To this end, we evolved yeast in the laboratory for 300 generations in a changing environment and examined fitness changes and gene expression changes in these evolved populations. With this work, we asked whether yeast cells can evolve predictive mechanisms such as stress cross-protection and anticipatory regulation when exposed to a regular periodic environment. These predictive mechanisms exist in microbes but how easily they can evolve was not known before (Völker et al., 1992; Chinnusamy et al., 2004; Palhano et al., 2004; Berry and Gasch, 2008; Tagkopoulos et al., 2008; Mitchell et al., 2009). In this work, I carried out some of the fitness experiments and analyzed all data. In Chapter 4 of the thesis, I describe my work on the experimental evolution of duplicate genes in Escherichia coli. Duplicate genes are an important factor for evolution of new protein functions that could allow organisms to survive in new environments (Ohno, 1970; Walsh, 1995; Riehle et al., 2001; Dunham et al., 2002; Tohoyama et al., 1996; van Hoof et al., 2001; Adamo et al., 2012; Musher et al., 2002; Nilsson et al., 2006; Nair et al., 2007). Duplicate genes can facilitate adaptation to new environments through various mechanisms (Kondrashov et al., 2002; Bergthorsson et al., 2007; Des Marais and Rausher, 2008; Kondrashov, 2012), but their relative importance in this process is unclear. In this chapter, I test the role of these mechanisms by evolving duplicate genes in novel environments in the laboratory. I performed all the experiments except the construction of the mutator strain and analyzed all data.

Bibliography

Acar M, Mettetal JT, van Oudenaarden A, 2008. Stochastic switching as a survival strategy in fluctuating environments. Nat Genet. 40:471-475.

Achard P, Baghour M, Chapple A, Hedden P, Van Der Straeten D, Genschik P, Moritz T, Harberd NP, 2007. The plant stress hormone ethylene controls floral transition via DELLA-dependent regulation of floral meristem-identity genes. Proc Natl Acad Sci USA 104:6484-6489.

31 Chapter 1

Achard P, Cheng H, De Grauwe L, Decat J, Schoutteten H, Moritz T, Van Der Straeten D, Peng J, Harberd NP, 2006. Integration of plant responses to environmentally activated phytohormonal signals. Science 311:91-94.

Adam M, Murali B, Glenn NO, Potter SS, 2008. Epigenetic inheritance based evolution of antibiotic resistance in bacteria. BMC Evol Biol. 8:52.

Adamo GM, Lotti M, Tamás MJ, Brocca S, 2012. Amplification of the CUP1 gene is associated with evolution of copper tolerance in Saccharomyces cerevisiae. Microbiology 158:2325-2335.

Adams J, Oeller PW, 1986. Structure of evolving populations of Saccharomyces cerevisiae: adaptive changes are frequently associated with sequence alterations involving mobile elements belonging to the Ty family. Proc Natl Acad Sci USA 83:7124-7127.

Adams J, Puskas-Rozsa S, Simlar J, Wilke CM, 1992. Adaptation and major chromosomal changes in populations of Saccharomyces cerevisiae. Curr Genet. 22:13-19.

Ahern TJ, Klibanov AM, 1988. Analysis of processes causing thermal inactivation of enzymes. Methods Biochem Anal. 33:91-127.

Alexov E, 2004. Numerical calculations of the pH of maximal protein stability. The effect of the sequence composition and three-dimensional structure. Eur J Biochem. 271:173-185.

Allakhverdiev SI, Sakamoto A, Nishiyama Y, Inaba M, Murata N, 2000. Ionic and osmotic effects of NaCl-induced inactivation of photosystems I and II in Synechococcus sp. Plant Physiol. 123:1047-1056.

Anderl JN, Franklin MJ, Stewart PS, 2000. Role of antibiotic penetration limitation in Klebsiella pneumoniae biofilm resistance to ampicillin and ciprofloxacin. Antimicrob Agents Chemother. 44:1818-1824.

Andersson DI, Slechta ES, Roth JR, 1998. Evidence that gene amplification underlies adaptive mutability of the bacterial lac operon. Science 282:1133-1135.

Apse MP, Aharon GS, Snedden WA, Blumwald E, 1999. Salt Tolerance Conferred by Overexpression of a Vacuolar Na +/H+ Antiport in Arabidopsis. Science 285:1256-1258.

Attfield PV, 1997. Stress tolerance: the key to effective strains of industrial baker's yeast. Nat Biotechnol. 15:1351- 1357.

Attfield PV, Choi HY, Veal DA, Bell PJ, 2001. Heterogeneity of stress gene expression and stress resistance among individual cells of Saccharomyces cerevisiae. Mol Microbiol. 40:1000-1008.

Autor AP, 1982. Biosynthesis of mitochondrial manganese superoxide dismutase in Saccharomyces cerevisiae. Precursor form of mitochondrial superoxide dismutase made in the cytoplasm. J Biol Chem. 257:2713-2718.

Bairlein F, Norris DR, Nagel R, Bulte M, Voigt CC, Fox JW, Hussell DJ, Schmaljohann H, 2012. Cross-hemisphere migration of a 25 g songbird. Biol Lett. 8:505-507.

Balaban NQ, Merrin J, Chait R, L, Leibler S, 2004. Bacterial persistence as a phenotypic switch. Science 305:1622-1625.

Bapiri A, Bååth E, Rousk J, 2010. Drying-rewetting cycles affect fungal and bacterial growth differently in an arable soil. Microb Ecol. 60:419-428.

Barkman T, Zhang J, 2009. Evidence for escape from adaptive conflict? Nature 462:E1; discussion E2-3.

32 Chapter 1

Barrett RD, Paccard A, Healy TM, Bergek S, Schulte PM, Schluter D, Rogers SM, 2011. Rapid evolution of cold tolerance in stickleback. Proc Biol Sci. 278:233-238.

Basehoar AD, Zanton SJ, Pugh BF, 2004. Identification and distinct regulation of yeast TATA box-containing genes. Cell 116:699-709.

Batzoglou S, Pachter L, Mesirov JP, Berger B, Lander ES, 2000. Human and mouse gene structure: Comparative analysis and application to exon prediction. Genome Res. 10: 950–958.

Baxter BK, James P, Evans T, Craig EA, 1996. SSI1 encodes a novel Hsp70 of the Saccharomyces cerevisiae endoplasmic reticulum. Mol Cell Biol. 16:6444-6456.

Beaumont HJ, Gallie J, Kost C, Ferguson GC, Rainey PB, 2009. Experimental evolution of bet hedging. Nature 462:90-93.

Becker J, Craig EA, 1994. Heat-shock proteins as molecular chaperones. Eur J Biochem. 219:11–23.

Bennett AF, Lenski RE, 2007. An experimental test of evolutionary trade-offs during temperature adaptation. Proc Natl Acad Sci USA 104:8649-8654.

Bergthorsson U, Andersson DI, Roth JR, 2007. Ohno's dilemma: evolution of new genes under continuous selection. Proc Natl Acad Sci USA 104:17004-17009.

Berlett BS, Stadtman ER, 1997. Protein oxidation in aging, disease, and oxidative stress. J Biol Chem. 272:20313- 20316.

Bermingham A, Derrick JP, 2002. The folic acid biosynthesis pathway in bacteria: evaluation of potential for antibacterial drug discovery. Bioessays 24:637-648.

Berry DB, Gasch AP, 2008. Stress-activated genomic expression changes serve a preparative role for impending stress in yeast. Mol Biol Cell 19:4580-4587.

Berry DB, Guan Q, Hose J, Haroon S, Gebbia M, Heisler LE, Nislow C, Giaever G, Gasch AP, 2011. Multiple means to the same end: the genetic basis of acquired stress resistance in yeast. PLoS Genet. 7:e1002353.

Bertani G, 1953. Lysogenic versus lytic cycle of phage multiplication. Cold Spring Harb Symp Quant Biol. 18:65- 70.

Bianchi E, Conio G, Ciferri A, Puett D, Rajagh L, 1967. The role of pH, temperature, salt type, and salt concentration on the stability of the crystalline, helical, and randomly coiled forms of collagen. J Biol Chem. 242:1361-1369.

Bidenne C, Blondin B, Dequin S, Vezinhet F, 1992. Analysis of the chromosomal DNA polymorphism of wine strains of Saccharomyces cerevisiae. Curr Genet. 22:1-7.

Binzel ML, Hess FD, Bressan RA, Hasegawa PM, 1988. Intracellular compartmentation of ions in salt adapted tobacco cells. Plant Physiol. 86: 607-614.

Bishop AL, Rab FA, Sumner ER, Avery SV. 2007. Phenotypic heterogeneity can enhance rare-cell survival in 'stress-sensitive' yeast populations. Mol Microbiol. 63:507-520.

Blake MJ, Udelsman R, Feulner GJ, Norton DD, Holbrook NJ, 1991. Stress-induced heat shock protein 70 expression in adrenal cortex: an adrenocorticotropic hormone-sensitive, age-dependent response. Proc Natl Acad Sci

33 Chapter 1

USA 88:9873-9877.

Blake WJ, Balázsi G, Kohanski MA, Isaacs FJ, Murphy KF, Kuang Y, Cantor CR, Walt DR, Collins JJ. 2006. Phenotypic consequences of promoter-mediated transcriptional noise. Mol Cell 24:853-865.

Boffelli D, McAuliffe J, Ovcharenko D, Lewis KD, Ovcharenko I, Pachter L, Rubin EM, 2003. Phylogenetic shadowing of primate sequences to find functional regions of the human genome. Science 299:1391-1394.

Bolstad DB, Bolstad ES, Wright DL, Anderson AC, 2008. Dihydrofolate reductase inhibitors: developments in antiparasitic chemotherapy. Expert Opin Ther Pat. 18:143-157.

Booth IR. 2002. Stress and the single cell: intrapopulation diversity is a mechanism to ensure survival upon exposure to stress. Int J Food Microbiol. 78:19-30.

Bouthier de la Tour C, Portemer C, Nadal M, Stetter KO, Forterre P, Duguet M, 1990. Reverse gyrase, a hallmark of the hyperthermophilic archaebacteria. J Bacteriol. 172:6803-6808.

Bowers JE, Chapman BA, Rong J, Paterson AH, 2003. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422:433-438.

Boyer JS, 1982. Plant productivity and environment. Science 218:443-448.

Bressan RA, Hasegawa PM, Pardo JM, 1998. Plants use calcium to resolve salt stress. Trends Plant Sci. 3: 411-412.

Bridier A, Briandet R, Thomas V, Dubois-Brissonnet F, 2011. Resistance of bacterial biofilms to disinfectants: a review. Biofouling. 27:1017-1032.

Briggs CE, Fratamico PM, 1999. Molecular characterization of an antibiotic resistance gene cluster of Salmonella typhimurium DT104. Antimicrob Agents Chemother. 43:846-849.

Brown AD, 1976. Microbial Water Stress. Bacteriol. Rev. 40:803-846.

Brown CJ, Todd KM, Rosenzweig RF, 1998. Multiple duplications of yeast hexose transport genes in response to selection in a glucose-limited environment. Mol Biol Evol. 15:931-942.

Brown GM, 1962. The biosynthesis of folic acid. II. Inhibition by sulfonamides. J Biol Chem. 237:536-540.

Brown GM, 1967. Inhibition by sulfonamides of the biosynthesis of folic acid. Int J Lepr Other Mycobact Dis. 35:580-589.

Brown MR, Allison DG, Gilbert P, 1988. Resistance of bacterial biofilms to antibiotics: a growth-rate related effect? J Antimicrob Chemother. 22:777-780.

Brown NS, Bicknell R, 2001. Hypoxia and oxidative stress in breast cancer. Oxidative stress: its effects on the growth, metastatic potential and response to therapy of breast cancer. Breast Cancer Res. 3:323-327.

Bryant EH, Meffert LM, 1993. The effect of serial founder flush cycles on quantitative genetic variation in the housefly. Heredity 70:122–129.

Bucciantini M, Giannoni E, Chiti F, Baroni F, Formigli L, Zurdo J, Taddei N, Ramponi G, Dobson CM, Stefani M, 2002. Inherent toxicity of aggregates implies a common mechanism for protein misfolding diseases. Nature 416:507-511.

Burch CL, Chao L, 1999. Evolution by small steps and rugged landscapes in the RNA virus w6. Genetics 151, 921– 927.

34 Chapter 1

Burg AW, Brown GM, 1966. The biosynthesis of folic acid. VI. Enzymatic conversion of carbon atom 8 of guanosine triphosphate to formic acid. Biochim Biophys Acta 117:275–278.

Burke MJ, 1976. Freezing and injury in plants. Ann Rev Plant Physiol. 27:507-528.

Burke MK, Dunham JP, Shahrestani P, Thornton KR, Rose MR, Long AD, 2010. Genome-wide analysis of a long- term evolution experiment with Drosophila. Nature 467:587-590.

Butler G et al., 2009. Evolution of pathogenicity and sexual reproduction in eight Candida genomes. Nature 459:657-662.

Cadenas E, 1989. Biochemistry of oxygen toxicity. Annu Rev Biochem. 58:79-110.

Cadet J, Douki T, Gasparutto D, Ravanat JL, 2003. Oxidative damage to DNA: formation, measurement and biochemical features. Mutat Res. 531:5-23.

Cadet J, Sage E, Douki T, 2005. Ultraviolet radiation-mediated damage to cellular DNA. Mutat Res. 2005 571:3-17.

Campbell HA, Fraser KP, Bishop CM, Peck LS, Egginton S, 2008. Hibernation in an antarctic fish: on ice for winter. PLoS One 3:e1743.

Casadesús J, Low D, 2006. Epigenetic gene regulation in the bacterial world. Microbiol Mol Biol Rev. 70:830-856.

Casida JE, 2009. Pest toxicology: the primary mechanisms of pesticide action. Chem Res Toxicol. 22:609-619.

Casini AF, Pompella A, Comporti M, 1985. Liver glutathione depletion induced by bromobenzene, iodobenzene, and diethylmaleate poisoning and its relation to lipid peroxidation and necrosis. Am J Pathol. 118:225-237.

Causton HC, Ren B, Koh SS, Harbison CT, Kanin E, Jennings EG, Lee TI, True HL, Lander ES, Young RA, 2001. Remodeling of yeast genome expression in response to environmental changes. Mol Biol Cell 12:323-337.

Cazalé AC, Rouet-Mayer MA, Barbier-Brygoo H, Mathieu Y, Laurière C, 1998. Oxidative Burst and Hypoosmotic Stress in Tobacco Cell Suspensions. Plant Physiol. 116:659-669.

Chan P, Warwicker J, 2009. Evidence for the adaptation of protein pH-dependence to subcellular pH. BMC Biol.7:69.

Charmandari E, Tsigos C, Chrousos G, 2005. Endocrinology of the stress response. Annu Rev Physiol. 67:259-284.

Charng YY, Liu HC, Liu NY, Hsu FC, Ko SS, 2006. Arabidopsis Hsa32, a novel heat shock protein, is essential for acquired thermotolerance during long recovery after acclimation. Plant Physiol. 140:1297-1305.

Chen L, DeVries AL, Cheng CH, 1997. Evolution of antifreeze glycoprotein gene from a trypsinogen gene in Antarctic notothenioid fish. Proc Natl Acad Sci USA 94:3811-3816.

Cheng et al., 2005. A genome-wide comparison of recent chimpanzee and human segmental duplications. Nature 437:88-93.

Childs DZ, Metcalf CJ, Rees M, 2010. Evolutionary bet-hedging in the real world: empirical evidence and challenges revealed by plants. Proc Biol Sci. 277:3055-3064.

Chinnusamy V, Schumaker K, Zhu JK, 2004. Molecular genetic perspectives on cross-talk and specificity in abiotic stress signalling in plants. J Exp Bot. 55:225-236.

35 Chapter 1

Chinnusamy V, Zhu JK, 2009. Epigenetic regulation of stress responses in plants. Curr Opin Plant Biol. 12:133-139.

Chopra I, Roberts M, 2001. Tetracycline antibiotics: mode of action, applications, molecular biology, and epidemiology of bacterial resistance. Microbiol Mol Biol Rev. 65:232-260.

Clardy J, Fischbach MA, Walsh CT, 2006. New antibiotics from bacterial natural products. Nat Biotechnol. 24:1541-1550.

Clewell DB, Yagi Y, Bauer B, 1975. Plasmid-determined tetracycline resistance in Streptococcus faecalis: evidence for gene amplification during growth in presence of tetracycline. Proc Natl Acad Sci USA. 72:1720-1724.

Cliften P, Sudarsanam P, Desikan A, Fulton L, Fulton B, Majors J, Waterston R, Cohen BA, Johnston M, 2003. Finding functional features in Saccharomyces genomes by phylogenetic footprinting. Science 301:71-76.

Cochran WL, McFeters GA, Stewart PS, 2000. Reduced susceptibility of thin Pseudomonas aeruginosa biofilms to hydrogen peroxide and monochloramine. J Appl Microbiol. 88:22-30.

Codón AC, Benítez T, Korhola M, 1998. Chromosomal polymorphism and adaptation to specific industrial environments of Saccharomyces strains. Appl Microbiol Biotechnol. 49:154-163.

Cohen G, Fessl F, Traczyk A, Rytka J, Ruis H, 1985. Isolation of the catalase A gene of Saccharomyces cerevisiae by complementation of the cta1 mutation. Mol Gen Genet. 200:74-79.

Conant GC, Wolfe KH, 2007. Increased glycolytic flux as an outcome of whole-genome duplication in yeast. Mol Syst Biol. 3:1-12.

Coote PJ, Cole MB, Jones MV, 1991. Induction of increased thermotolerance in Saccharomyces cerevisiae may be triggered by a mechanism involving intracellular pH. J Gen Microbiol. 137:1701-1708.

Copley SD, 2003. Enzymes with extra talents: moonlighting functions and catalytic promiscuity. Curr Opin Chem Biol. 7:265-272.

Costerton JW, Cheng KJ, Geesey GG, Ladd TI, Nickel JC, Dasgupta M, Marrie TJ, 1987. Bacterial biofilms in nature and disease. Annu Rev Microbiol. 41:435-64.

Cramer GR, Urano K, Delrot S, Pezzotti M, Shinozaki K, 2011. Effects of abiotic stress on plants: a systems biology perspective. BMC Plant Biol. 11:163.

Craufurd PQ, Peacock JM, 1993. Effect of Heat and Drought Stress on Sorghum (Sorghum Bicolor). II. Grain Yield. Exp Agri 29:77-86.

Crawshaw LI., 1984. Low-temperature dormancy in fish. Am J Physiol. 246:R479-R486.

Crews D, Gillette R, Scarpino SV, Manikkam M, Savenkova MI, Skinner MK, 2012. Epigenetic transgenerational inheritance of altered stress responses. Proc Natl Acad Sci USA 109:9143-9148.

Cullum AJ, Bennett AF, Lenski RE, 2001. Evolutionary adaptation to temperature. IX. Preadaptation to novel stressful environments of Escherichia coli adapted to high temperature. Evolution 55:2194-2202.

Culotta VC, 2000. Superoxide dismutase, oxidative stress, and cell metabolism. Curr Top Cell Regul. 36:117-132.

Cvoro A, Dundjerski J, Trajković D, Matić G, 1998. Association of the rat liver glucocorticoid receptor with Hsp90 and Hsp70 upon whole body hyperthermic stress. J Steroid Biochem Mol Biol. 67:319-325.

36 Chapter 1

Damblon C, Raquet X, Lian LY, Lamotte-Brasseur J, Fonze E, Charlier P, Roberts GC, Frère JM, 1996. The catalytic mechanism of beta-lactamases: NMR titration of an active-site lysine residue of the TEM-1 enzyme. Proc Natl Acad Sci USA 93:1747-1752.

Davies J, 1994. Inactivation of antibiotics and the dissemination of resistance genes. Science 264:375-382.

De Clercq E, 1998. The role of non-nucleoside reverse transcriptase inhibitors (NNRTIs) in the therapy of HIV-1 infection. Antiviral Res. 38:153-179. de Jong IG, Haccou P, Kuipers OP. 2011. Bet hedging or not? A guide to proper classification of microbial survival strategies. Bioessays 33:215-223.

Deane EE, Kelly SP, Lo CK, Woo NY, 1999. Effects of GH, prolactin and cortisol on hepatic heat shock protein 70 expression in a marine teleost Sparus sarba. J Endocrinol. 161:413-421.

Dehal P, Boore JL, 2005. Two Rounds of Whole Genome Duplication in the Ancestral Vertebrate. Plos Biol. 3:1700-1708.

Delaunay A, Isnard AD, Toledano MB, 2000. H2O2 sensing through oxidation of the Yap1 transcription factor. EMBO J. 19:5157-5166.

DeLuna A, Vetsigian K, Shoresh N, Hegreness M, Colón-González M, Chao S, Kishony R, 2008. Exposing the fitness contribution of duplicated genes. Nat Genet. 40:676-681.

Deng C, Cheng CH, Ye H, He X, Chen L, 2010. Evolution of an antifreeze protein by neofunctionalization under escape from adaptive conflict. Proc Natl Acad Sci USA 107:21593-21598.

Des Marais DL, Rausher MD, 2008. Escape from adaptive conflict after duplication in an anthocyanin pathway gene. Nature 454:762-765.

Dettman JR, Anderson JB, Kohn LM, 2008. Divergent adaptation promotes reproductive isolation among experimental populations of the filamentous fungus Neurospora. BMC Evol Biol. 8:35.

Dietz K-J, Baier M, Krämer U, 1999. Free radicals and reactive oxygen species as mediators of heavy metal toxicity in plants. In: Heavy metal stress in plants: Molecules to ecosystem. Eds.: Prasad MNV, Hagemeyer J. Springer- Verlag, Berlin.

DiRuggiero J, Brown JR, Bogert AP, Robb FT, 1999. DNA repair systems in archaea: mementos from the last universal common ancestor? J Mol Evol. 49:474-484.

Dobzhansky T, 1937. Genetics and the Origin of Species. Columbia University Press, New York.

Dodd DMB, 1989. Reproductive isolation as a consequence of adaptive divergence in Drosophila pseudoobscura. Evolution 43: 1308–1311

Dodson SI, Hanazato T, 1995. Commentary on effects of anthropogenic and natural organic chemicals on development, swimming behavior, and reproduction of Daphnia, a key member of aquatic ecosystems. Environ Health Perspect. 103:7-11.

Domingo JL, 1994. Metal-induced developmental toxicity in mammals: a review. J Toxicol Environ Health 42:123- 141.

Donahue BA, Yin S, Taylor JS, Reines D, Hanawalt PC, 1994. Transcript cleavage by RNA polymerase II arrested

37 Chapter 1 by a cyclobutane pyrimidine dimer in the DNA template. Proc Natl Acad Sci USA 91:8502–8506.

Donaldson-Matasci MC, Lachmann M, Bergstrom CT, 2008. Phenotypic diversity as an adaptation to environmental uncertainty. Evol Ecol Res. 10: 493–515.

Donoghue PJC, Purnell MA, 2005. Genome duplication, extinction and vertebrate evolution. Trends Ecol Evol. 20: 313-319.

Dorer MS, Talarico S, Salama NR, 2009. Helicobacter pylori's unconventional role in health and disease. PLoS Pathog. 5:e1000544.

Dresser LD, Rybak MJ, 1998. The pharmacologic and bacteriologic properties of oxazolidinones, a new class of synthetic antimicrobials. Pharmacotherapy 18:456-462.

Drobnis EZ, Crowe LM, Berger T, Anchordoguy TJ, Overstreet JW, Crowe JH, 1993. Cold shock damage is due to lipid phase transitions in cell membranes: a demonstration using sperm as a model. J Exp Zool. 265:432-437.

Dubyak GR, 2004. Ion homeostasis, channels, and transporters: an update on cellular mechanisms. Adv Physiol Educ. 28:143-154.

Duke SO, 1990. Overview of herbicide mechanisms of action. Environ Health Perspect. 87:263-271.

Dunham MJ, Badrane H, Ferea T, Adams J, Brown PO, Rosenzweig F, Botstein D, 2002. Characteristic genome rearrangements in experimental evolution of Saccharomyces cerevisiae. Proc Natl Acad Sci USA. 99:16144-16149.

Durrant WE, Dong X, 2004. Systemic acquired resistance. Annu Rev Phytopathol. 42:185-209.

Ebert D, Haag C, Kirkpatrick M, Riek M, Hottinger JW, Pajunen VI, 2002. A selective advantage to immigrant genes in a Daphnia metapopulation. Science 295:485-488.

Edwards SV, 2009. Natural selection and phylogenetic analysis. Proc Natl Acad Sci USA. 106:8799-8800.

Elena SF, Wilke CO, Ofria C, Lenski RE, 2007. Effects of population size and mutation rate on the evolution of mutational robustness. Evolution 61:666-674.

Ercal N, Gurer-Orhan H, Aykin-Burns N, 2001. Toxic metals and oxidative stress part I: mechanisms involved in metal-induced oxidative damage. Curr Top Med Chem. 1:529-39.

Evans DJ, Allison DG, Brown MR, Gilbert P, 1991. Susceptibility of Pseudomonas aeruginosa and Escherichia coli biofilms towards ciprofloxacin: effect of specific growth rate. J Antimicrob Chemother. 27:177-184.

Evans SE, Wallenstein MD, 2012. Soil microbial community response to drying and rewetting stress: does historical precipitation regime matter? Biogeochemistry 109:101-116.

Farrell DJ, Douthwaite S, Morrissey I, Bakker S, Poehlsgaard J, Jakobsen L, Felmingham D, 2003. Macrolide resistance by ribosomal mutation in clinical isolates of Streptococcus pneumoniae from the PROTEKT 1999-2000 study. Antimicrob Agents Chemother. 47:1777-1783.

Feder ME, Hofmann GE, 1999. Heat-shock proteins, molecular chaperones, and the stress response: evolutionary and ecological physiology. Annu Rev Physiol. 61:243-282.

Ferea TL, Botstein D, Brown PO, Rosenzweig RF, 1999. Systematic changes in gene expression patterns following adaptive evolution in yeast. Proc Natl Acad Sci USA 96:9721-9726

38 Chapter 1

Fierer N, Schimel JP, Holden PA, 2003. Influence of drying-rewetting frequency on soil bacterial community structure. Microb Ecol. 45:63–71.

Fisher MC, Garner TW, Walker SF, 2009. Global emergence of Batrachochytrium dendrobatidis and amphibian chytridiomycosis in space, time, and host. Annu Rev Microbiol. 63:291-310.

Fleming A, 1929. On antibacterial action of culture of penicillium, with special reference to their use in isolation of B. influenzae. British J Exp Path.. 10:226–236.

Flemming H-C, Wingender J, 2010. The biofilm matrix. Nat Rev Microbiol. 8:623-633.

Fogel S, Welch JW, 1982. Tandem gene amplification mediates copper resistance in yeast. Proc Natl Acad Sci USA 79:5342-5346.

Foote SJ, Thompson JK, Cowman AF, Kemp DJ, 1989. Amplification of the multidrug resistance gene in some chloroquine-resistant isolates of P. falciparum. Cell 57:921-930.

Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait J, 1999. Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151:1531-1545.

Forsburg SL, 2001. The art and design of genetic screens: yeast. Nat Rev Genet. 2:659-668.

Fulda S, 2010. Evasion of apoptosis as a cellular stress response in cancer. Int J Cell Biol. 2010:370835.

Fulda S, Gorman AM, Hori O, Samali A, 2010. Cellular stress responses: cell survival and cell death. Int J Cell Biol. 2010:214074.

Gaál B, Pitchford JW, Wood AJ, 2010. Exact results for the evolution of stochastic switching in variable asymmetric environments. Genetics 184:1113–1119.

Galcheva-Gargova Z, Dérijard B, Wu IH, Davis RJ, 1994. An osmosensing signal transduction pathway in mammalian cells. Science 265:806-808.

Gallant et al., 2003. Nucleoside and nucleotide analogue reverse transcriptase inhibitors: a clinical review of antiretroviral resistance. Antivir Ther. 8:489-506.

Gao D, Critser JK, 2000. Mechanisms of cryoinjury in living cells. ILAR J. 41:187-196.

Garcia-Moreno B, 2009. Adaptations of proteins to cellular and subcellular pH. J Biol. 8:98.

Garland, T., Jr., and M. R. Rose, eds. 2009. Experimental evolution: concepts, methods, and applications of selection experiments. University of California Press, Berkeley, California.

Gasch AP, Spellman PT, Kao CM, Carmel-Harel O, Eisen MB, Storz G, Botstein D, Brown PO, 2000. Genomic expression programs in the response of yeast cells to environmental changes. Mol Biol Cell 11:4241-4257.

Geiler-Samerotte KA, Dion MF, Budnik BA, Wang SM, Hartl DL, Drummond DA, 2011. Misfolded proteins impose a dosage dependent fitness cost and trigger a cytosolic unfolded protein response in yeast. Proc Natl Acad Sci USA 108:680-685.

Gething MJ, ed. 1997. Guidebook to Molecular Chaperones and Protein-Folding Catalysts. Oxford Univ. Press, Oxford, UK.

Gibson BR, Lawrence SJ, Leclaire JP, Powell CD, Smart KA, 2007. Yeast responses to stresses associated with industrial brewery handling. FEMS Microbiol Rev. 31:535-569.

39 Chapter 1

Gieser F, Hulbert AJ, Nicol SC, 1996. Adaptations to the Cold. University of New England Press.

Giller KE, Witter E, Mcgrath SP, 1998. Toxicity of heavy metals to microorganisms and microbial processes in agricultural soils: a review. Soil Biol Biochem. 30: 1389-1414.

Gisbert C et al., 2000. The Yeast HAL1 Gene Improves Salt Tolerance of Transgenic Tomato. Plant Physiol. 123: 393-402.

Goens SD, Perdue ML, 2004. Hepatitis E viruses in humans and animals. Anim Health Res Rev. 5:145-156.

González-Lamothe R, Mitchell G, Gattuso M, Diarra MS, Malouin F, Bouarab K, 2009. Plant antimicrobial agents and their effects on plant and human pathogens. Int J Mol Sci. 10:3400-3419.

Grant CM, 2001. Role of the glutathione/glutaredoxin and thioredoxin systems in yeast growth and response to stress conditions. Mol Microbiol. 39:533-541.

Grant CM, Perrone G, Dawes IW, 1998. Glutathione and catalase provide overlapping defenses for protection against hydrogen peroxide in the yeast Saccharomyces cerevisiae. Biochem Biophys Res Commun. 253:893-898.

Gray GS, Fitch WM, 1983. Evolution of antibiotic resistance genes: the DNA sequence of a kanamycin resistance gene from Staphylococcus aureus. Mol Biol Evol. 1:57-66.

Grayer RJ, Harborne JB, 1994. A survey of antifungal compounds from higher plants, 1982-1993. Phytochemistry 37:19-42.

Grayer RJ, Kokubun T, 2001. Plant--fungal interactions: the search for phytoalexins and other antifungal compounds from higher plants. Phytochemistry 56:253-63.

Greaves RB, Warwicker J, 2007. Mechanisms for stabilisation and the maintenance of solubility in proteins from thermophiles. BMC Struct Biol. 7:18.

Greenacre EJ, Brocklehurst TF, 2006. The Acetic Acid Tolerance Response induces cross-protection to salt stress in Salmonella typhimurium. Int J Food Microbiol. 112:62-65.

Gresham D, Desai MM, Tucker CM, Jenq HT, Pai DA, Ward A, DeSevo CG, Botstein D, Dunham MJ, 2008. The repertoire and dynamics of evolutionary adaptations to controlled nutrient-limited environments in yeast. PLoS Genet. 4:e1000303.

Griffin MJ, Brown GM, 1964. The Biosynthesis Of Folic Acid. III. Enzymatic Formation Of Dihydrofolic Acid From Dihydropteroic Acid And Of Tetrahydropteroylpolyglutamic Acid Compounds From Tetrahydrofolic Acid. J Biol Chem. 239:310-316.

Griffiths AJF, Miller JH, Suzuki DT, Lewontin RC, Gelbart WM, 2000. A synthesis of forces: variation and divergence of population (chapter 26). In: An Introduction to Genetic Analysis. W. H. Freeman, New York.

Gunawan J, Simard D, Gilbert M, Lovering AL, Wakarchuk WW, Tanner ME, Strynadka NC, 2005. Structural and mechanistic analysis of sialic acid synthase NeuB from Neisseria meningitidis in complex with Mn2+, phosphoenolpyruvate, and N-acetylmannosaminitol. J Biol Chem. 280:3555-3563.

Guptasarma P. 1995. Does replication-induced transcription regulate synthesis of the myriad low copy number proteins of Escherichia coli? Bioessays 17:987-997.

Hall JL, 2002. Cellular mechanisms for heavy metal detoxification and tolerance. J Exp Bot. 53: 1-11.

40 Chapter 1

Hall-Stoodley L, Costerton JW, Stoodley P, 2004. Bacterial biofilms: from the natural environment to infectious diseases. Nat Rev Microbiol. 2:95-108.

Halliwell B, Gutteridge JM, 1984. Free radicals, lipid peroxidation, and cell damage. Lancet 2:1095.

Halliwell B, Gutteridge JMC, 1999. Free Radicals in Biology and Medicine. Oxford University Press, Oxford, UK.

Han J, Lee JD, Bibbs L, Ulevitch RJ, 1994. A MAP kinase targeted by endotoxin and hyperosmolarity in mammalian cells. Science 265:808-811.

Handel A, Rozen DE, 2009. The impact of population size on the evolution of asexual microbes on smooth versus rugged fitness landscapes. BMC Evol Biol. 9:236.

Hardison RC, 2003. Comparative Genomics. PLoS Biol. 1:e58.

Hardison RC, Oeltjen J, Miller W, 1997. Long human-mouse sequence alignments reveal novel regulatory elements: a reason to sequence the mouse genome. Genome Res. 7:959-966.

Haro R, Garciadeblas B, Rodríguez-Navarro A, 1991. A novel P-type ATPase from yeast involved in sodium transport. FEBS Lett. 291:189-191.

Hartl DL, Clark AG, 2007. Principles of Population Genetics. Sinauer associates, Sunderland, Massachusetts

Hawser S, Lociuro S, Islam K, 2006. Dihydrofolate reductase inhibitors as antibacterial agents. Biochem Pharmacol. 71:941-948.

He X, Zhang J, 2005. Rapid subfunctionalization accompanied by prolonged and substantial neofunctionalization in duplicate gene evolution. Genetics 169:1157-1164.

Hecker M, Völker U, 1990. General stress proteins in Bacillus subtilis. FEMS Microbiol Lett. 74:197–213.

Heitman J, 2011. Microbial Pathogens in the Fungal Kingdom. Fungal Biol Rev. 25: 48–60.

Hendrickson H, Slechta ES, Bergthorsson U, Andersson DI, Roth JR, 2002. Amplification-mutagenesis: evidence that "directed" adaptive mutation and general hypermutability result from growth with a selected gene amplification. Proc Natl Acad Sci USA 99:2164-2169.

Hensel R, Jakob I, Scheer H, Lottspeich F, 1992. Proteins from hyperthermophilic archaea: stability towards covalent modification of the peptide chain. Biochem Soc Symp. 58:127-133.

Herr I, Debatin KM, 2001. Cellular stress response and apoptosis in cancer therapy. Blood 98:2603-2614.

Hevener KE, Yun MK, Qi J, Kerr ID, Babaoglu K, Hurdle JG, Balakrishna K, White SW, Lee RE, 2010. Structural studies of pterin-based inhibitors of dihydropteroate synthase. J Med Chem. 53:166-177.

Higgins VJ, Alic N, Thorpe GW, Breitenbach M, Larsson V, Dawes IW, 2002. Phenotypic analysis of gene deletant strains for sensitivity to oxidative stress. Yeast 19:203-214.

Hittinger CT, Carroll SB, 2007. Gene duplication and the adaptive evolution of a classic genetic switch. Nature 449:677-681.

Høiby N, Bjarnsholt T, Givskov M, Molin S, Ciofu O, 2010. Antibiotic resistance of bacterial biofilms. Int J Antimicrob Agents. 35:322-332.

41 Chapter 1

Hohmann S, 2002. Osmotic stress signaling and osmoadaptation in yeasts. Microbiol Mol Biol Rev. 66:300-372.

Hohmann, S, Mager WH, 2003. Yeast Stress Responses. Springer-Verlag.

Hollis B, Fierst JL, Houle D, 2009. Sexual selection accelerates the elimination of a deleterious mutant in Drosophila melanogaster. Evolution 63:324-333.

Holyoak CD, Bracey D, Piper PW, Kuchler K, Coote PJ, 1999. The Saccharomyces cerevisiae weak-acid-inducible ABC transporter Pdr12 transports fluorescein and preservative anions from the cytosol by an energy-dependent mechanism. J Bacteriol. 181:4644-4652.

Horiuchi T, Horiuchi S, Novick A, 1963. The genetic basis of hypersynthesis of ß-galactosidase. Genetics 48,157- 169.

Horiuchi T, Tomizawa JI, Novick A, 1962. Isolation and properties of bacteria capable of high rates of beta- galactosidase synthesis. Biochim Biophys Acta. 55:152-63.

Hoyle BD, Alcantara J, Costerton JW, 1992. Pseudomonas aeruginosa biofilm as a diffusion barrier to piperacillin. Antimicrob Agents Chemother. 36:2054-2056.

Hsiao TC, 1973. Plant responses to water stress. Ann Rev Plant Physiol. 24:519-570.

Hughes AL, 2005. Gene duplication and the origin of novel proteins. Proc Natl Acad Sci USA 102:8791-8792.

Humphrey TJ, Richardson NP, Statton KM, Rowbury RJ, 1993. Effects of temperature shift on acid and heat tolerance in Salmonella enteritidis phage type 4. Appl Environ Microbiol. 59:3120-3122.

Ikner A, Shiozaki K, 2005. Yeast signaling pathways in the oxidative stress response. Mutat Res. 569:13-27.

Infante JJ, Dombek KM, Rebordinos L, Cantoral JM, Young ET, 2003. Genome-wide amplifications caused by chromosomal rearrangements play a major role in the adaptive evolution of natural yeast. Genetics 165:1745-1759.

Ives CL, Bott KF, 1990. Characterization of chromosomal DNA amplifications with associated tetracycline resistance in Bacillus subtilis. J Bacteriol. 172:4936-4944.

Jackson MB, Armstrong W, 1999. Formation of Aerenchyma and the Processes of Plant Ventilation in Relation to Soil Flooding and Submergence. Plant Biol. 1:274–287.

Jaenicke R, 1991. Protein stability and molecular adaptation to extreme conditions. Eur J Biochem. 202:715-28

Jaillon O et al., 2004. Genome duplication in the teleost fish Tetraodon nigroviridis reveals the early vertebrate proto-karyotype. Nature 431:946-957.

Jamieson DJ, 1992. Saccharomyces cerevisiae has distinct adaptive responses to both hydrogen peroxide and menadione. J Bacteriol. 174:6678-6681.

Jelinsky SA, Samson LD, 1999. Global response of Saccharomyces cerevisiae to an alkylating agent. Proc Natl Acad Sci USA 96:1486-1491.

Jiao Y et al., 2011.Ancestral polyploidy in seed plants and angiosperms. Nature 473:97-100.

Johnston M, 2000. The yeast genome: on the road to the Golden Age. Curr Opin Genet Dev. 10:617-623.

42 Chapter 1

Jolly C, Morimoto RI, 2000. Role of the heat shock response and molecular chaperones in oncogenesis and cell death. J Natl Cancer Inst. 92:1564-1572.

Jonak C, Kiegerl S, Ligterink W, Barker PJ, Huskisson NS, Hirt H, 1996. Stress signaling in plants: a mitogen- activated protein kinase pathway is activated by cold and drought. Proc Natl Acad Sci USA 93:11274-11279.

Justin SHFW, Armstrong W, 1987.The Anatomical Characteristics of Roots and Plant Response to Soil Flooding. New Phytologist 106:465–495.

Kämper J et al., 2006.Insights from the genome of the biotrophic fungal plant pathogen Ustilago maydis. Nature 444:97-101.

Kandror O, Bretschneider N, Kreydin E, Cavalieri D, Goldberg AL, 2004. Yeast adapt to near-freezing temperatures by STRE/Msn2,4-dependent induction of trehalose synthesis and certain molecular chaperones. Mol Cell 13:771- 781.

Katz RA, Skalka AM, 1994. The retroviral enzymes. Annu Rev Biochem. 63:133-73.

Kawai C, Prado FM, Nunes GL, Di Mascio P, Carmona-Ribeiro AM, Nantes IL, 2005. pH-Dependent interaction of cytochrome c with mitochondrial mimetic membranes: the role of an array of positively charged amino acids. J Biol Chem. 280:34709–34717.

Kawai T, Akira S, 2006. Innate immune recognition of viral infection. Nat Immunol. 7:131-137.

Kawecki TJ, Lenski RE, Ebert D, Hollis B, Olivieri I, Whitlock MC, 2012. Experimental evolution. Trends Ecol Evol. 27:547-560.

Kellis M, Birren BW, Lander ES, 2004. Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature 428:617-624.

Kellis M, Patterson N, Endrizzi M, Birren B, Lander ES, 2003. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423: 241–254.

Keren I, Kaldalu N, Spoering A, Wang Y, Lewis K, 2004. Persister cells and tolerance to antimicrobials. FEMS Microbiol Lett. 230:13-18.

Kerr B, Neuhauser C, Bohannan BJ, Dean AM, 2006. Local migration promotes competitive restraint in a host- pathogen 'tragedy of the commons'. Nature 442:75-78.

Khersonsky O, Roodveldt C, Tawfik DS, 2006. Enzyme promiscuity: evolutionary and mechanistic aspects. Curr Opin Chem Biol. 10:498-508.

Kikuchi A, Asai K, 1984. Reverse gyrase—a topoisomerase which introduces positive superhelical turns into DNA. Nature 309:677-681.

Kinclova-Zimmermannova O, Gaskova D, Sychrova H, 2006. The Na+,K+/H+ -antiporter Nha1 influences the plasma membrane potential of Saccharomyces cerevisiae. FEMS Yeast Res. 6:792-800.

Klump H, Di Ruggiero J, Kessel M, Park JB, Adams MW, Robb FT, 1992. Glutamate dehydrogenase from the hyperthermophile Pyrococcus furiosus. Thermal denaturation and activation. J Biol Chem. 267:22681-22685.

Kolss M, Vijendravarma RK, Schwaller G, Kawecki TJ, 2009. Life-history consequences of adaptation to larval nutritional stress in Drosophila. Evolution 63:2389-2401.

43 Chapter 1

Kondrashov FA, 2012. Gene duplication as a mechanism of genomic adaptation to a changing environment. Proc Biol Sci. 279:5048-5057.

Kondrashov FA, Rogozin IB, Wolf YI, Koonin EV, 2002. Selection in the evolution of gene duplications. Genome Biol. 3:RESEARCH0008.

Kondratyeva TF, Muntyan LN, Karavaiko GI, 1995. Zinc- and arsenic-resistant strains of Thiobacillus ferrooxidans have increased copy numbers of chromosomal resistance genes. Microbiology 141, 1157-1162.

Kosako H, Nishida E, Gotoh Y, 1993. cDNA cloning of MAP kinase kinase reveals kinase cascade pathways in yeasts to vertebrates. EMBO J. 12:787-794.

Koskella B, Lively CM, 2009. Evidence for negative frequencydependent selection during experimental coevolution of a freshwater snail and a sterilizing trematode. Evolution 63:2213–2221.

Kourtis N, Tavernarakis N, 2011. Cellular stress response pathways and ageing: intricate molecular relationships. EMBO J. 30:2520-2531.

Kovtun Y, Chiu WL, Tena G, Sheen J, 2000. Functional analysis of oxidative stress-activated mitogen-activated protein kinase cascade in plants. Proc Natl Acad Sci USA 97:2940-2945.

Krauth-Siegel RL, Coombs GH, 1999. Enzymes of parasite thiol metabolism as drug targets. Parasitol Today. 15:404-409.

Kubota H, Senda S, Tokuda H, Uchiyama H, Nomura N, 2009. Stress resistance of biofilm and planktonic Lactobacillus plantarum subsp. plantarum JCM 1149. Food Microbiol. 26:592-597.

Kuepfer L, Sauer U, Blank LM, 2005. Metabolic functions of duplicate genes in Saccharomyces cerevisiae. Genome Res. 15:1421-1430.

Kültz D, 2003. Evolution of the cellular stress proteome: from monophyletic origin to ubiquitous function. J Exp Biol. 206:3119-3124.

Kumar A, Snyder M, 2001. Emerging technologies in yeast genomics. Nat Rev Genet. 2:302-312.

Kussell E, Leibler S, 2005. Phenotypic diversity, population growth, and information in fluctuating environments. Science 309:2075–2078.

Kuzdzal-Fick JJ, Fox SA, Strassmann JE, Queller DC, 2011. High relatedness is necessary and sufficient to maintain multicellularity in Dictyostelium. Science 334: 1548–1551.

Kvam E, Tyrrell RM, 1997. Induction of oxidative DNA base damage in human skin cells by UV and near visible radiation. Carcinogenesis 18:2379-2384.

Kyriakis JM, Avruch J, 2001. Mammalian mitogen-activated protein kinase signal transduction pathways activated by stress and inflammation. Physiol Rev. 81:807-869.

Lamhamdi M, Bakrim A, Aarab A, Lafont R, Sayah F, 2011. Lead phytotoxicity on wheat (Triticum aestivum L.) seed germination and seedlings growth. C R Biol. 334:118-126.

Le Gac M, Plucain J, Hindré T, Lenski RE, Schneider D, 2012. Ecological and evolutionary dynamics of coexisting lineages during a long-term experiment with Escherichia coli. Proc Natl Acad Sci USA 109:9487-9492.

LeClerc JE, Li B, Payne WL, Cebula TA, 1996. High mutation frequencies among Escherichia coli and Salmonella pathogens. Science 274:1208-1211.

44 Chapter 1

Lee JC, Oh JY, Cho JW, Park JC, Kim JM, Seol SY, Cho DT, 2001. The prevalence of trimethoprim-resistance- conferring dihydrofolate reductase genes in urinary isolates of Escherichia coli in Korea. J Antimicrob Chemother. 47:599-604.

Lee JH, Montagu MV, Verbruggen N, 1999. A highly conserved kinase is an essential component for stress tolerance in yeast and plant cells. Proc Natl Acad Sci USA 96: 5873-5877

Lenski RE, 2011. Evolution in action: a 50,000-generation salute to Charles Darwin. Microbe 6:30-33.

Leu JY, Murray AW, 2006. Experimental evolution of mating discrimination in budding yeast. Curr Biol. 16:280- 286.

Lever OW Jr, Bell LN, Hyman C, McGuire HM, Ferone R, 1986. Inhibitors of dihydropteroate synthase: substituent effects in the side-chain aromatic ring of 6-[[3-(aryloxy)propyl]amino]-5-nitrosoisocytosines and synthesis and inhibitory potency of bridged 5-nitrosoisocytosine-p-aminobenzoic acid analogues. J Med Chem. 29:665-670.

Levy SF, Ziv N, Siegal ML, 2012. Bet hedging in yeast by heterogeneous, age-correlated expression of a stress protectant. PLoS Biol. 10:e1001325.

Lewis K, 2001. Riddle of biofilm resistance. Antimicrob Agents Chemother. 45:999-1007.

Lewis M, Pryor R, Wilking L, 2011. Fate and effects of anthropogenic chemicals in mangrove ecosystems: a review. Environ Pollut. 159:2328-2346.

Leyer GJ, Johnson EA, 1993. Acid adaptation induces cross-protection against environmental stresses in Salmonella typhimurium. Appl Environ Microbiol. 59:1842-1847.

Libby E, Rainey PB, 2011. Exclusion rules, bottlenecks and the evolution of stochastic phenotype switching. Proc Biol Sci. 278:3574-3583.

Lin CC, Kao CH, 2002. Osmotic stress-induced changes in cell wall peroxidase activity and hydrogen peroxide level in roots of rice seedlings. Plant Growth Reg. 37:177-184.

Lindquist S, 1992. Heat-shock proteins and stress tolerance in microorganisms. Curr Opin Genet Dev. 2:748-755.

Lindquist S, Craig EA, 1988. The heat-shock proteins. Annu Rev Genet. 22:631-677.

Löbrich M, Cooper PK, Rydberg B, 1996. Non-random distribution of DNA double-strand breaks induced by particle irradiation. Int J Radiat Biol. 70:493–503.

Long M, 2001. Evolution of novel genes. Curr Opin Genet Dev. 11:673-680.

Lou Y, Yousef AE, 1997. Adaptation to sublethal environmental stresses protects Listeria monocytogenes against lethal preservation factors. Appl Environ Microbiol. 63:1252-1255.

Lu D, Maulik N, Moraru II, Kreutzer DL, Das DK, 1993. Molecular adaptation of vascular endothelial cells to oxidative stress. Am J Physiol. 264:C715-C722.

Lubran MM, 1988. Bacterial toxins. Ann Clin Lab Sci. 18:58-71.

Lucena BT, Silva-Filho EA, Coimbra MR, Morais JO, Simões DA, Morais MA Jr, 2007. Chromosome instability in industrial strains of Saccharomyces cerevisiae batch cultivated under laboratory conditions. Genet Mol Res. 6:1072- 1084.

45 Chapter 1

Luikenhuis S, Perrone G, Dawes IW, Grant CM, 1998. The yeast Saccharomyces cerevisiae contains two glutaredoxin genes that are required for protection against reactive oxygen species. Mol Biol Cell 9:1081-1091.

Lushchak VI, Gospodaryov DV, 2005. Catalases protect cellular proteins from oxidative modification in Saccharomyces cerevisiae. Cell Biol Int. 29:187-192.

Lyman CP, Willis JS, Malan A, 1982. Hibernation and Torpor in Mammals and Birds. Academic Press, New York.

Lynch M, Conery JS, 2000. The evolutionary fate and consequences of duplicate genes. Science 290:1151-1155.

Lynch M, Force A, 2000a. The origin of interspecies genomic incompatibility via gene duplication. Am Nat. 156, 590-605.

Lynch M, Force A, 2000b. The probability of duplicate gene preservation by subfunctionalization. Genetics 154:459-473.

Lynch VJ, 2007. Inventing an arsenal: adaptive evolution and neofunctionalization of snake venom phospholipase A2 genes. BMC Evol Biol. 7:2.

Maamar H, Raj A, Dubnau D, 2007. Noise in gene expression determines cell fate in Bacillus subtilis. Science 317:526-529.

Maathuis FJM, Amtmann A, 1999. K+ Nutrition and Na+ Toxicity : The Basis of Cellular K+/Na+ Ratios. Annals of Bot. 84:123-133.

MacCarthy T, Bergman A, 2007. The limits of subfunctionalization. BMC Evol Biol. 7:213.

Macelroy RD, 1974. Some comments on the evolution of extremophiles. Biosystems 6: 74–75.

Machado S, Paulsen GM, 2001. Combined effects of drought and high temperature on water relations of wheat and sorghum. Plant and Soil 233:179-187.

Madigan MT, Marrs BL, 1997. Extremophiles. Sci Am. 276:82–87.

Maeda T, Wurgler-Murphy SM, Saito H, 1994. A two-component system that regulates an osmosensing MAP kinase cascade in yeast. Nature 369:242–245.

Magalhães S, Fayard J, Janssen A, Carbonell D, Olivieri I, 2007. Adaptation in a spider mite population after long- term evolution on a single host plant. J Evol Biol. 20:2016-2027.

Mah TF, O'Toole GA, 2001. Mechanisms of biofilm resistance to antimicrobial agents. Trends Microbiol. 9:34-39.

Majmundar AJ, Wong WJ, Simon MC, 2010. Hypoxia-inducible factors and the response to hypoxic stress. Mol Cell 40:294-309.

Makalowski W, 2001. Are we polyploids? A brief history of one hypothesis. Genome Res. 11:667-670.

Maleck K, Levine A, Eulgem T, Morgan A, J, Lawton KA, Dangl JL, Dietrich RA, 2000.The transcriptome of Arabidopsis thaliana during systemic acquired resistance. Nat Genet. 26:403-410.

Malmström E et al., 2009. Protein C inhibitor--a novel antimicrobial agent. PLoS Pathog. 5:e1000698.

Manhes P, Velicer GJ, 2011. Experimental evolution of selfish policing in social bacteria. Proc. Natl. Acad. Sci

46 Chapter 1

USA 108:8357–8362.

Maroni G, Wise J, Young JE, Otto E, 1987. Metallothionein gene duplications and metal tolerance in natural populations of Drosophila melanogaster. Genetics 117:739-744.

Marques-Bonet T et al., 2009. A burst of segmental duplications in the genome of the African great ape ancestor. Nature 457:877-881.

Marshall NJ, Piddock LJ, 1997. Antibacterial efflux systems. Microbiologia 13:285-300.

Martínez-Pastor MT, Marchler G, Schüller C, Marchler-Bauer A, Ruis H, Estruch F, 1996. The Saccharomyces cerevisiae zinc finger proteins Msn2p and Msn4p are required for transcriptional induction through the stress response element (STRE). EMBO J. 15:2227-2235.

Mathis JB, Brown GM, 1970. The biosynthesis of folic acid. XI. Purification and properties of dihydroneopterin aldolase. J Biol Chem. 245:3015–3025.

Matic I, Radman M, Taddei F, Picard B, Doit C, Bingen E, Denamur E, Elion J, 1997. Highly variable mutation rates in commensal and pathogenic Escherichia coli. Science 277:1833-1834.

Matthews PR, Stewart PR, 1988. Amplification of a section of chromosomal DNA in methicillin-resistant Staphylococcus aureus following growth in high concentrations of methicillin. J Gen Microbiol. 134:1455-1464.

Mazur P, 1970. Cryobiology: the freezing of biological systems. Science 168:939-949.

McAdams HH, Arkin A. 1997. Stochastic mechanisms in gene expression. Proc Natl Acad Sci USA 94:814-819.

McCord JM, 1996. Effects of positive iron status at a cellular level. Nutr Rev. 54:85-88.

McCullough JL, Maren TH, 1973. Inhibition of dihydropteroate synthetase from Escherichia coli by sulfones and sulfonamides. Antimicrob Agents Chemother. 3:665-669.

McNamara JM, Welham RK, Houston AI, 1998. The Timing of Migration within the Context of an Annual Routine J Avian Biol. 29:416-423.

Medzhitov R, 2007. Recognition of microorganisms and activation of the immune response. Nature 449:819-826.

Meka et al., 2004. Linezolid resistance in sequential Staphylococcus aureus isolates associated with a T2500A mutation in the 23S rRNA gene and loss of a single copy of rRNA. J Infect Dis. 190:311-317.

Mendizabal I, Rios G, Mulet JM, Serrano R, de Larrinoa IF, 1998. Yeast putative transcription factors involved in salt tolerance. FEBS Lett. 425: 323-328.

Mendoza I, Rubioli F, Rodriguez-Navarroli A, Pardo JM, 1994. The Protein Phosphatase Calcineurin Is Essential for NaCl Tolerance of Saccharomyces cerevisiae. J Biol Chem. 269: 8792-8796.

Meryman HT, 1974. Freezing injury and its prevention in living cells. Annu Rev Biophys Bioeng. 3:341-363.

Meyer JR, Agrawal AA, Quick RT, Dobias DT, Schneider D, Lenski RE, 2010. Parallel changes in host resistance to viral infection during 45,000 generations of relaxed selection. Evolution 64:3024-3034.

Meyer JR, Dobias DT, Weitz JS, Barrick JE, Quick RT, Lenski RE, 2012. Repeatability and contingency in the evolution of a key innovation in phage lambda. Science 335:428-432.

47 Chapter 1

Middlebrook JL, Dorland RB, 1984. Bacterial toxins: cellular mechanisms of action. Microbiol Rev. 48:199-221.

Misumi M, Tanaka N, 1980. Mechanism of inhibition of translocation by kanamycin and viomycin: a comparative study with fusidic acid. Biochem Biophys Res Commun. 92:647-654.

Mitchel RE, Morrison DP, 1982. Heat-shock induction of ionizing radiation resistance in Saccharomyces cerevisiae, and correlation with stationary growth phase. Radiat Res. 90:284-291.

Mitchel RE, Morrison DP, 1983. Heat-shock induction of ultraviolet light resistance in Saccharomyces cerevisiae. Radiat Res. 96:95-99.

Mitchell A, Pilpel Y, 2011. A mathematical model for adaptive prediction of environmental changes by microorganisms. Proc Natl Acad Sci USA 108:7271-7276.

Mitchell A, Romano GH, Groisman B, Yona A, Dekel E, Kupiec M, Dahan O, Pilpel Y, 2009. Adaptive prediction of environmental changes by microorganisms. Nature 460:220-224.

Mitchell DL, Vaughan JE, Nairn RS, 1989. Inhibition of transient gene expression in Chinese hamster ovary cells by cyclobutane dimers and (6-4) photoproducts in transfected ultraviolet-irradiated plasmid DNA. Plasmid 21: 21- 30.

Mittler R, 2006. Abiotic stress, the field environment and stress combination. Trends Plant Sci. 11:15-19.

Miyamura S, Ochiai H, Nitahara Y, Nakagawa Y, Terao M, 1977. Resistance mechanism of chloramphenicol in Streptococcus haemolyticus, Streptococcus pneumoniae and Streptococcus faecalis. Microbiol Immunol. 21:69-76.

Moffat AS, 2002. Finding new ways to protect drought-stricken plants. Science 296:1226-1229.

Montecucco C, Schiavo G, Rossetto O, 1996. The mechanism of action of tetanus and botulinum neurotoxins. Arch Toxicol Suppl. 18:342-354.

Montgomery S. 1974. Hedging one's evolutionary bets. Nature 250:704-705.

Morran LT, Parmenter MD, Phillips PC, 2009. Mutation load and rapid adaptation favour outcrossing over self- fertilization. Nature 462:350-352.

Morran LT, OG, Gelarden IA, Parrish RC 2nd, Lively CM, 2011. Running with the Red Queen: host- parasite coevolution selects for biparental sex. Science 333:216-218.

Moskvina E, Schüller C, Maurer CT, Mager WH, Ruis H, 1998. A search in the genome of Saccharomyces cerevisiae for genes regulated via stress response elements. Yeast 14:1041-1050.

Moxon ER, Rainey PB, Nowak MA, Lenski RE, 1994. Adaptive evolution of highly mutable loci in pathogenic bacteria. Curr Biol. 4:24-33.

Moya A, Galiana A, Ayala FJ, 1995. Founder-effect speciation theory: failure of experimental corroboration. Proc Natl Acad Sci USA 92:3983-3986.

Muller HJ, 1942. Isolating mechanisms, evolution, and temperature. Biol Symp. 1942, 6:71-125.

Munns R, 2002. Comparative physiology of salt and water stress. Plant Cell Environ. 25:239-250.

Musher DM, Dowell ME, Shortridge VD, Flamm RK, Jorgensen JH, Le Magueres P, Krause KL, 2002. Emergence of macrolide resistance during treatment of pneumococcal pneumonia. N Engl J Med. 346:630-631.

48 Chapter 1

Nair S, Nash D, Sudimack D, Jaidee A, Barends M, Uhlemann AC, Krishna S, Nosten F, Anderson TJ, 2007. Recurrent gene amplification and soft selective sweeps during evolution of multidrug resistance in malaria parasites. Mol Biol Evol. 24:562-573.

Näsvall J, Sun L, Roth JR, Andersson DI, 2012. Real-time evolution of new genes by innovation, amplification, and divergence. Science 338:384-387.

Nei M, Maryuama T, Chakraborty R, 1975. The bottleneck effect and genetic variability in populations. Evolution 29:1-10.

Neiman AM, 1993. Conservation and reiteration of a kinase cascade. Trends Genet. 9:390-394.

Newman HC, Prise KM, Folkard M, Michael BD, 1997. DNA double-strand break distributions in x-ray and alpha- particle irradiated V79 cells: evidence for non-random breakage. Int J Radiat Biol 71:347–363.

Nichols BP, Guay GG, 1989. Gene amplification contributes to sulfonamide resistance in Escherichia coli. Antimicrob Agents Chemother. 33:2042-2048.

Nilsen IW, Bakke I, Vader A, Olsvik O, El-Gewely MR, 1996. Isolation of cmr, a novel Escherichia coli chloramphenicol resistance gene encoding a putative efflux pump. J Bacteriol. 178:3188-3193.

Nilsson AI, Zorzet A, Kanth A, Dahlström S, Berg OG, Andersson DI, 2006. Reducing the fitness cost of antibiotic resistance by amplification of initiator tRNA genes. Proc Natl Acad Sci USA 103:6976-6981.

Normark S, Edlund T, Grundström T, Bergström S, Wolf-Watz H, 1977. Escherichia coli K-12 mutants hyperproducing chromosomal beta-lactamase by gene repetitions. J Bacteriol. 132:912-922.

Nriagu JO, Pacyna JM, 1988. Quantitative assessment of worldwide contamination of air, water and soils by trace metals. Nature 333:134-139.

Ohno S, 1970. Evolution by Gene Duplication. Springer-Verlag, Berlin.

Okazaki S, Tachibana T, Naganuma A, Mano N, Kuge S, 2007. Multistep disulfide bond formation in Yap1 is required for sensing and transduction of H2O2 stress signal. Mol Cell 27:675-688.

Oliver SG, Winson MK, Kell DB, Baganz F, 1998. Systematic functional analysis of the yeast genome. Trends Biotechnol. 16:373-378.

Opperdoes FR, Michels PA, 2001. Enzymes of carbohydrate metabolism as potential drug targets. Int J Parasitol. 31:482-490.

Orgel LE, 1977. Gene-duplication and the origin of proteins with novel functions. J Theor Biol. 67:773.

O'Rourke SM, Herskowitz I, 2004. Unique and redundant roles for HOG MAPK pathway components as revealed by whole-genome expression analysis. Mol Biol Cell 15:532-542.

Orr HA, Turelli M, 2001. The evolution of postzygotic isolation: accumulating Dobzhansky-Muller incompatibilities. Evolution 55:1085-1094.

Ortega G, Laín A, Tadeo X, López-Méndez B, Castaño D, Millet O, 2011. Halophilic enzyme activation induced by salts. Sci Rep. 1:6.

Ostrander DB, Gorman JA, 1999. The extracellular domain of the Saccharomyces cerevisiae Sln1p membrane osmolarity sensor is necessary for kinase activity. J Bacteriol. 181:2527–2534.

49 Chapter 1

Outten CE, Falk RL, Culotta VC, 2005. Cellular factors required for protection from hyperoxia toxicity in Saccharomyces cerevisiae. Biochem J. 388:93-101.

Padgett DA, Glaser R, 2003. How stress influences the immune response. Trends Immunol. 24:444-448.

Palhano FL, Orlando MT, Fernandes PM, 2004. Induction of baroresistance by hydrogen peroxide, ethanol and cold-shock in Saccharomyces cerevisiae. FEMS Microbiol Lett. 233:139-145.

Pancer Z, Cooper MD, 2006. The evolution of adaptive immunity. Annu Rev Immunol. 24:497-518. Pardo JM et al., 1998. Stress signaling through Ca2+/ calmodulin-dependent protein phosphatase calcineurin mediates salt adaptation in plants. Proc Natl Acad Sci USA 95: 9681-9686.

Pavlov IP, 1927. Conditioned Reflexes. Oxford University Press, Oxford, UK.

Pearce JM, Hall G, 1980. A model for Pavlovian learning: Variations in the effectiveness of conditioned but not of unconditioned stimuli. Psycho Rev. 87:532-552.

Pereira MD, Eleutherio EC, Panek AD, 2001. Acquisition of tolerance against oxidative damage in Saccharomyces cerevisiae. BMC Microbiol. 1:11.

Pereira-Leal JB, Levy ED, Kamp C, Teichmann SA, 2007. Evolution of protein complexes by duplication of homomeric interactions. Genome Biol. 8:R51.

Pérez-Torrado R, Bruno-Bárcena JM, Matallana E, 2005. Monitoring stress-related genes during the process of biomass propagation of Saccharomyces cerevisiae strains used for wine making. Appl Environ Microbiol. 71:6831- 6837.

Petersohn A, Brigulla M, Haas S, Hoheisel JD, Völker U, Hecker M, 2001. Global analysis of the general stress response of Bacillus subtilis. J Bacteriol. 183:5617-5631.

Phadtare S, 2004. Recent developments in bacterial cold-shock response. Curr Issues Mol Biol. 6:125-136.

Philippi T, Seger J. 1989. Hedging one's evolutionary bets, revisited. Trends Ecol Evol. 4:41-44.

Pi N, Tam NFY, Wu Y, Wong MH, 2009. Root anatomy and spatial pattern of radial oxygen loss of eight true mangrove species. Aquatic Bot. 90:222–230.

Poole K, 2001. Multidrug efflux pumps and antimicrobial resistance in Pseudomonas aeruginosa and related organisms. J Mol Microbiol Biotechnol. 3:255-264.

Porcher E, Giraud T, Lavigne C, 2006. Genetic differentiation of neutral markers and quantitative traits in predominantly selfing metapopulations: confronting theory and experiments with Arabidopsis thaliana. Genet Res. 87:1-12.

Posas F, Chambers JR, Heyman JA, Hoeffler JP, de Nadal E, Ariño J, 2000. The transcriptional response of yeast to saline stress. J Biol Chem. 275:17249-17255.

Posas F, Saito H, 1997. Osmotic activation of the HOG MAP kinase pathway via Ste11p MAPKKK: scaffold role of Pbs2p MAPKK. Science 276:1702–1705.

Price CW, Fawcett P, Cérémonie H, Su N, Murphy CK, Youngman P, 2001. Genome-wide analysis of the general stress response in Bacillus subtilis. Mol Microbiol. 41:757-774.

Privalov PL, 1990. Cold denaturation of proteins. Crit Rev Biochem Mol Biol. 25:281-305.

50 Chapter 1

Protic-Sabljic M, Kraemer KH, 1986. One pyrimidine dimer inactivates expression of a transfected gene in xeroderma pigmentosum cells. Proc Natl Acad Sci USA 82: 6622-6626

Quinn PJ, 1988. Effects of temperature on cell membranes. Symp Soc Exp Biol. 42:237-258. Quintero FJ, Blatt MR, Pardo JM, 2000. Functional conservation between yeast and plant endosomal Na+/H+ antiporters. FEBS Lett. 471: 224-228.

Quintero FJ, Ohta M, Shi H, Zhu J-K, Pardo JM, 2002. Reconstitution in yeast of the Arabidopsis SOS signaling pathway for Na+ homeostasis. Proc Natl Acad Sci USA 99: 9061-9066.

Rainey PB, Beaumont HJ, Ferguson GC, Gallie J, Kost C, Libby E, Zhang XX, 2011. The evolutionary emergence of stochastic phenotype switching in bacteria. Microb Cell Fact. 10:S14.

Rainey PB, Travisano M, 1998. Adaptive radiation in a heterogeneous environment. Nature 394:69–72.

Raj A, Rifkin SA, Andersen E, van Oudenaarden A, 2010. Variability in gene expression underlies incomplete penetrance. Nature 463:913-918.

Raj A, van Oudenaarden A. 2008. Nature, nurture, or chance: stochastic gene expression and its consequences. Cell 135:216-226.

Raser JM, O'Shea EK, 2004. Control of stochasticity in eukaryotic gene expression. Science 304:1811-1814.

Raser JM, O'Shea EK, 2005 Noise in gene expression: origins, consequences, and control. Science 309:2010-2013.

Rastogi RP, Richa, Kumar A, Tyagi MB, Sinha RP, 2010. Molecular mechanisms of ultraviolet radiation-induced DNA damage and repair. J Nucleic Acids 2010:592980.

Rastogi S, Liberles DA, 2005. Subfunctionalization of duplicated genes as a transition state to neofunctionalization. BMC Evol Biol.5:28.

Raymond M, Poulin E, Boiroux V, Dupont E, Pasteur N, 1993. Stability of insecticide resistance due to amplification of esterase genes in Culex pipiens. Heredity 70:301–307.

Re F, Sesana S, Barbiroli A, Bonomi F, Cazzaniga E, Lonati E, Bulbarelli A, Masserini M, 2008. Prion protein structure is affected by pH-dependent interaction with membranes: a study in a model system. FEBS Lett. 582:215– 220.

Reams AB, Neidle EL, 2003. Genome plasticity in Acinetobacter: new degradative capabilities acquired by the spontaneous amplification of large chromosomal segments. Mol Microbiol. 47:1291-1304.

Reaves ML, Lopez LC, Daskalova SM, 2008. Replacement of the antifreeze-like domain of human N- acetylneuraminic acid phosphate synthase with the mouse antifreeze-like domain impacts both N-acetylneuraminic acid 9-phosphate synthase and 2-keto-3-deoxy-D-glycero-D-galacto-nonulosonic acid 9-phosphate synthase activities. BMB Rep. 41:72-78.

Reboud X, Bell G, 1997. Experimental evolution in Chlamydomonas. III. Evolution of specialist and generalist types in environments that vary in space and time. Heredity 78:507-514.

Rep M, Krantz M, Thevelein JM, Hohmann S, 2000.The transcriptional response of Saccharomyces cerevisiae to osmotic shock. Hot1p and Msn2p/Msn4p are required for the induction of subsets of high osmolarity glycerol pathway-dependent genes. J Biol Chem. 275:8290-8300.

51 Chapter 1

Reynolds JJ, Brown GM, 1964. The Biosynthesis of Folic Acid. IV. Enzymatic synthesis of dihydrofolic acid from guanine and ribose compounds. J Biol Chem. 239:317–325.

Richey DP, Brown GM, 1969. The biosynthesis of folic acid. IX. Purification and properties of the enzymes required for the formation of dihydropteroic acid. J Biol Chem. 244:1582–1592.

Riehle MM, Bennett AF, Long AD, 2001. Genetic architecture of thermal adaptation in Escherichia coli. Proc Natl Acad Sci USA 98:525-530.

Ringrose L, Paro R, 2004. Epigenetic regulation of cellular memory by the Polycomb and Trithorax group proteins. Annu Rev Genet. 38:413-443.

Robb FT, Clark DS, 1999. Adaptation of proteins from hyperthermophiles to high pressure and high temperature. J Mol Microbiol Biotechnol. 1:101-105.

Rodaki A, Bohovych IM, Enjalbert B, Young T, Odds FC, NA, Brown AJ, 2009. Glucose promotes stress resistance in the fungal pathogen Candida albicans. Mol Biol Cell 20:4845-4855.

Rodríguez AC, Stock D, 2002. Crystal structure of reverse gyrase: insights into the positive supercoiling of DNA. EMBO J. 21:418-426.

Rodríguez-Trelles F, Tarrío R, Ayala FJ, 2003. Convergent neofunctionalization by positive Darwinian selection after ancient recurrent duplications of the xanthine dehydrogenase gene. Proc Natl Acad Sci USA 100:13413-13417.

Roels SAB, Kelly JK, 2011. Rapid evolution caused by pollinator loss in Mimulus guttatus. Evolution 65:2541– 2552.

Rolfe BG, Campbell JH, 1977. Genetic and physiological control of host cell lysis by bacteriophage lambda. J Virol. 23:626-636.

Roth JR, Andersson DI, 2004. Amplification-mutagenesis--how growth under selection contributes to the origin of genetic diversity and explains the phenomenon of adaptive mutation. Res Microbiol. 155:342-351.

Rothschild LJ, Mancinelli RL, 2001. Life in extreme environments. Nature 409:1092-1101.

Rownd R, Mickel S, 1971. Dissociation and reassociation of RTF and r-determinants of the R-factor NR1 in Proteus mirabilis. Nat New Biol. 234:40-43.

Rudolph B, Gebendorfer KM, Buchner J, Winter J, 2010. Evolution of Escherichia coli for Growth at High Temperatures. J Biol Chem. 285:19029-19034.

Rundle HD, 2003. Divergent environments and population bottlenecks fail to generate premating isolation in Drosophila pseudoobscura. Evolution 57:2557-2565.

Rundle HD, Nosil P, 2005. Ecological speciation. Ecol Lett. 8: 336–352.

Russell AD, 2002. Mechanisms of antimicrobial action of antiseptics and disinfectants: an increasingly important area of investigation. J Antimicrob Chemother. 49:597-599.

Sabeti PC et al., 2007. Genome-wide detection and characterization of positive selection in human populations. Nature 449:913-918.

Samonte RV, Eichler EE, 2002. Segmental duplications and the evolution of the primate genome. Nat Rev Genet. 3:65-72.

52 Chapter 1

Sandegren L, Andersson DI, 2009. Bacterial gene amplification: implications for the evolution of antibiotic resistance. Nat Rev Microbiol. 7:578-588.

Santini F, Harmon LJ, Carnevale G, Alfaro ME, 2009. Did genome duplication drive the origin of teleosts? A comparative study of diversification in ray-finned fishes. BMC Evol Biol. 9:194.

Sarafianos SG, Marchand B, Das K, Himmel DM, Parniak MA, Hughes SH, Arnold E, 2009. Structure and function of HIV-1 reverse transcriptase: molecular mechanisms of polymerization and inhibition. J Mol Biol. 385:693-713.

Saris N, Holkeri H, Craven RA, Stirling CJ, Makarow M, 1997. The Hsp70 homologue Lhs1p is involved in a novel function of the yeast endoplasmic reticulum, refolding and stabilization of heat-denatured protein aggregates. J Cell Biol. 137:813-824.

Savageau MA, 1998. Demand theory of gene regulation. II. Quantitative application to the lactose and maltose operons of Escherichia coli. Genetics 149:1677–1691.

Scannell DR, Byrne KP, Gordon JL, Wong S, Wolfe KH, 2006. Multiple rounds of speciation associated with reciprocal gene loss in polyploid yeasts. Nature 440:341-345.

Schild S, Tamayo R, Nelson EJ, Qadri F, Calderwood SB, Camilli A, 2007. Genes induced late in infection increase fitness of Vibrio cholerae after release into the environment. Cell Host Microbe 2:264-277.

Schluter D, 2001. Ecology and the origin of species. Trends Ecol Evol. 16:372-380.

Schmitt AP, McEntee K, 1996. Msn2p, a zinc finger DNA-binding protein, is the transcriptional activator of the multistress response in Saccharomyces cerevisiae. Proc Natl Acad Sci USA 93:5777-5782.

Schnappinger D, Hillen W, 1996. Tetracyclines: antibiotic action, uptake, and resistance mechanisms. Arch Microbiol. 165:359-369.

Scholz H, Franz M, Heberlein U, 2005. The hangover gene defines a stress pathway required for ethanol tolerance development. Nature 436:845-847.

Segovia M, 1994. Leishmania gene amplification: a mechanism of drug resistance. Ann Trop Med Parasitol. 88, 123-130.

Seong KH, Li D, Shimizu H, Nakamura R, Ishii S, 2011. Inheritance of stress-induced, ATF-2-dependent epigenetic change. Cell 145:1049-1061.

Serrano R et al., 1999. A glimpse of the mechanisms of ion homeostasis during salt stress. J Exp Bot. 50:1023-1036.

Setlow RB, 1966. Cyclobutane-type pyrimidine dimers in polynucleotides. Science 153:379-386.

Sharp PM, Hahn BH, 2010. The evolution of HIV-1 and the origin of AIDS. Philos Trans R Soc Lond B Biol Sci. 365:2487-2494.

Sidow A, 1996. Gen(om)e duplications in the evolution of early vertebrates. Curr Opin Genet Dev. 6:715-722.

Skirycz A et al., 2011. Pause-and-stop: the effects of osmotic stress on cell proliferation during early leaf development in Arabidopsis and a role for ethylene signaling in cell cycle arrest. Plant Cell 23:1876-1888.

Sköld O, 2000. Sulfonamide resistance: mechanisms and trends. Drug Resist Updat. 3:155-160.

Slieman TA, Nicholson WL, 2000. Artificial and solar UV radiation induces strand breaks and cyclobutane pyrimidine dimers in Bacillus subtilis spore DNA. Appl Environ Microbiol. 66:199-205.

53 Chapter 1

Somero GN, 1995. Proteins and temperature. Annu Rev Physiol. 57:43-68

Sonti RV, Roth JR, 1989. Role of gene duplications in the adaptation of Salmonella typhimurium to growth on limiting carbon sources. Genetics 123:19-28.

Soo VW, Hanson-Manful P, Patrick WM, 2011. Artificial gene amplification reveals an abundance of promiscuous resistance determinants in Escherichia coli. Proc Natl Acad Sci USA 108:1484-1489.

Speer BS, Shoemaker NB, Salyers AA, 1992. Bacterial resistance to tetracycline: mechanisms, transfer, and clinical significance. Clin Microbiol Rev. 5: 387–399.

Spring J, 1997. Vertebrate evolution by interspecific hybridisation – are we polyploids? FEBS Lett. 400:2-8.

Stankiewicz P, Shaw CJ, Withers M, Inoue K, Lupski JR, 2004. Serial segmental duplications during primate evolution result in complex human genome architecture. Genome Res. 14:2209-2220.

Stephen DW, Rivers SL, Jamieson DJ, 1995. The role of the YAP1 and YAP2 genes in the regulation of the adaptive oxidative stress responses of Saccharomyces cerevisiae. Mol Microbiol. 16:415-423.

Stetter KO, 1999. Extremophiles and their adaptation to hot environments. FEBS Lett. 452:22-25.

Stewart PS, 2002. Mechanisms of antibiotic resistance in bacterial biofilms. Int J Med Microbiol. 292:107-113.

Stewart PS, Costerton JW, 2001. Antibiotic resistance of bacteria in biofilms. Lancet 358:135-138.

Straus DS, 1975. Selection for a large genetic duplication in Salmonella typhimurium. Genetics 80:227-237.

Sturtz LA, Diekert K, Jensen LT, Lill R, Culotta VC, 2001. A fraction of yeast Cu,Zn-superoxide dismutase and its metallochaperone, CCS, localize to the intermembrane space of mitochondria. A physiological role for SOD1 in guarding against mitochondrial oxidative damage. J Biol Chem. 276:38084-38089.

Subbayya IN, Ray SS, Balaram P, Balaram H, 1997. Metabolic enzymes as potential drug targets in Plasmodium falciparum. Indian J Med Res. 106:79-94.

Sumner ER, Avery SV. 2002. Phenotypic heterogeneity: differential stress resistance among individual cells of the yeast Saccharomyces cerevisiae. Microbiology 148:345-351.

Sutcliffe J, Tait-Kamradt A, Wondrack L, 1996. Streptococcus pneumoniae and Streptococcus pyogenes resistant to macrolides but sensitive to clindamycin: a common resistance pattern mediated by an efflux system. Antimicrob Agents Chemother. 40:1817-1824.

Sutcliffe JG, 1978. Nucleotide sequence of the ampicillin resistance gene of Escherichia coli plasmid pBR322. Proc Natl Acad Sci USA 75:3737-3741.

Sutherland IW, 2001. The biofilm matrix--an immobilized but dynamic microbial environment. Trends Microbiol. 9:222-227.

Suzuki Y, Brown GM, 1974. The biosynthesis of folic acid. XII. Purification and properties of dihydroneopterin triphosphate pyrophosphohydrolase. J Biol Chem. 249:2405–2410.

Swain PS, Elowitz MB, Siggia ED. 2002. Intrinsic and extrinsic contributions to stochasticity in gene expression. Proc Natl Acad Sci USA 99:12795-12800.

54 Chapter 1

Szklarczyk R, Huynen MA, Snel B, 2008.Complex fate of paralogs. BMC Evol Biol. 8:337.

Tabashnik BE, 1990. Implications of gene amplification for evolution and management of insecticide resistance. J Econ Entomol. 83, 1170-1176.

Taddei F, Radman M, Maynard-Smith J, Toupance B, Gouyon PH, Godelle B, 1997. Role of mutator alleles in adaptive evolution. Nature 387:700-702.

Tadeo X, López-Méndez B, Trigueros T, Laín A, Castaño D, Millet O, 2009. Structural basis for the amino acid composition of proteins from halophilic archea. PLoS Biol. 7:e1000257.

Tagkopoulos I, Liu YC, Tavazoie S, 2008. Predictive behavior within microbial genetic networks. Science 320:1313-1317.

Tait-Kamradt A, Davies T, Cronan M, Jacobs MR, Appelbaum PC, Sutcliffe J, 2000. Mutations in 23S rRNA and ribosomal protein L4 account for resistance in pneumococcal strains selected in vitro by macrolide passage. Antimicrob Agents Chemother. 44:2118-2125.

Talalay P, Fahey JW, 2001. Phytochemicals from cruciferous plants protect against cancer by modulating carcinogen metabolism. J Nutr. 131:3027S-3033S.

Talley K, Alexov E, 2010. On the pH-optimum of activity and stability of proteins. Proteins 78:2699-2706.

Taylor JS, Van de Peer Y, Meyer A, 2001. Genome duplication, divergent resolution and speciation. Trends Genet. 17, 299-301.

Teichmann SA, Babu MM, 2004. Gene regulatory network growth by duplication. Nat Genet. 36:492-496.

Teige M, Scheikl E, Eulgem T, Dóczi R, Ichimura K, Shinozaki K, Dangl JL, Hirt H, 2004. The MKK2 pathway mediates cold and salt stress signaling in Arabidopsis. Mol Cell 15:141-152.

Temple MD, Perrone GG, Dawes IW, 2005. Complex cellular responses to reactive oxygen species. Trends Cell Biol. 15:319-326.

Thattai M, van Oudenaarden A. 2004. Stochastic gene expression in fluctuating environments. Genetics 167:523- 530.

Thomas JW et al., 2003. Comparative analyses of multi-species sequences from targeted genomic regions. Nature 424: 788–793.

Thompson DM, Lu C, Green PJ, Parker R, 2008. tRNA cleavage is a conserved response to oxidative stress in eukaryotes. RNA 14:2095-2103.

Todd DJ, Lee AH, Glimcher LH, 2008. The endoplasmic reticulum stress response in immunity and autoimmunity. Nat Rev Immunol. 8:663-674.

Tohoyama H, Shiraishi E, Amano S, Inouhe M, Joho M, Murayama T, 1996. Amplification of a gene for metallothionein by tandem repeat in a strain of cadmium-resistant yeast cells. FEMS Microbiol Lett. 136: 269-273.

Toledano MB, Delaunay A, Biteau B, Spector D, Azevedo D. 2003. Oxidative stress responses in yeast. In: Hohman S, Mager WH, editors. Yeast stress responses. Springer-Verlag, Berlin.

Tomasz A, 1979. The mechanism of the irreversible antimicrobial effects of penicillins: how the beta-lactam antibiotics kill and lyse bacteria. Annu Rev Microbiol. 33:113–137.

55 Chapter 1

Tomida A, Tsuruo T, 1999. Drug resistance mediated by cellular stress response to the microenvironment of solid tumors. Anti-Cancer Drug Des. 14:169-177.

Trainor C, Butterworth KT, McGarry CK, McMahon SJ, O'Sullivan JM, Hounsell AR, Prise KM, 2012. DNA damage responses following exposure to modulated radiation fields. PLoS One 7:e43326.

Trainotti N, Stambuk BU, 2001. NaCl stress inhibits maltose fermentation by Saccharomyces cerevisiae. Biotech Lett. 23:1703-1707.

Travisano M, Mongold JA, Bennett AF, Lenski RE, 1995. Experimental tests of the roles of adaptation, chance, and history in evolution. Science 267:87-90.

Treger JM, Magee TR, McEntee K, 1998. Functional analysis of the stress response element and its role in the multistress response of Saccharomyces cerevisiae. Biochem Biophys Res Commun. 243:13-19.

Triglia T, Menting JG, Wilson C, Cowman AF, 1997. Mutations in dihydropteroate synthase are responsible for sulfone and sulfonamide resistance in Plasmodium falciparum. Proc Natl Acad Sci USA 94:13944-13949.

Turner BM, 1998. Histone acetylation as an epigenetic determinant of long-term transcriptional competence. Cell Mol Life Sci. 54:21-31.

Turner BM, 2002. Cellular Memory and the Histone Code. Cell 111:285 – 291.

Tyrrell RM, 1973. Induction of pyrimidine dimers in bacterial DNA by 365 nm radiation. Photochem Photobiol. 17:69-73.

Vaadia Y, 1976. Plant Hormones and Water Stress. Phil Trans Roy Soc. 273: 513-522. van der Linden AM, Beverly M, Kadener S, Rodriguez J, Wasserman S, Rosbash M, Sengupta P. 2010. Genome- wide analysis of lightand temperature-entrained circadian transcripts in Caenorhabditis elegans. PLoS Biol. 8:e1000503. van Ham SM, van Alphen L, Mooi FR, van Putten JP, 1993. Phase variation of H. influenzae fimbriae: transcriptional control of two divergent genes through a variable combined promoter region. Cell 73:1187-1196. van Hoof et al., 2001. Enhanced copper tolerance in Silene vulgaris (Moench) Garcke populations from copper mines is associated with increased transcript levels of a 2b-type metallothionein gene. Plant Physiol. 126:1519- 1526. van Loon AP, Pesold-Hurt B, Schatz G, 1986. A yeast mutant lacking mitochondrial manganese-superoxide dismutase is hypersensitive to oxygen. Proc Natl Acad Sci USA 83:3820-3824. van Voorst F, Houghton-Larsen J, Jønson L, Kielland-Brandt MC, Brandt A, 2006. Genome-wide identification of genes required for growth of Saccharomyces cerevisiae under ethanol stress. Yeast 23:351-359.

Van Wuytswinkel O, Reiser V, Siderius M, Kelders MC, Ammerer G, Ruis H, Mager WH, 2000. Response of Saccharomyces cerevisiae to severe osmotic stress: evidence for a novel activation mechanism of the HOG MAP kinase pathway. Mol Microbiol. 37:382-397.

Velicer GJ, Yu YT, 2003. Evolution of novel cooperative swarming in the bacterium Myxococcus xanthus. Nature 425:75-78.

Vermeulen N, Keeler WJ, Nandakumar K, Leung KT, 2008. The bactericidal effect of ultraviolet and visible light on Escherichia coli. Biotechnol Bioeng. 99:550-556.

56 Chapter 1

Via S, Lande R, 1985. Genotype-Environment Interaction and the Evolution of Phenotypic Plasticity. Evolution 39: 505-522.

Visser JAGM, Zeyl CW, Gerrish PJ, Blanchard JL, Lenski RE, 1999. Diminishing returns from mutation supply rate in asexual populations. Science 283:404-406.

Voesenek LA, Colmer TD, Pierik R, Millenaar FF, Peeters AJ, 2006. How plants cope with complete submergence. New Phytol. 170:213-226.

Völker U, Mach H, Schmid R, Hecker M, 1992. Stress proteins and cross-protection by heat shock and salt stress in Bacillus subtilis. J Gen Microbiol. 138:2125–2135.

Waites KB, Talkington DF, 2004. Mycoplasma pneumoniae and its role as a human pathogen. Clin Microbiol Rev. 17:697-728.

Wagner C, 1995. Biochemical role of folate in cellular metabolism. In: Folate in Health and Disease (Bailey, L.B., ed.). Marcel Dekker, New York.

Walsh C, 2003. Antibiotics: Actions, Origins, Resistance. ASM Press. Washington, DC.

Walsh JB, 1995. How often do duplicated genes evolve new functions? Genetics 139:421-428.

Wang G, Doyle MP, 1998. Heat shock response enhances acid tolerance of Escherichia coli O157:H7. Lett Appl Microbiol. 26:31-34.

Wang TC, Smith KC, 1986. Postreplication repair in ultraviolet-irradiated human fibroblasts: formation and repair of DNA double-strand breaks. Carcinogenesis 7:389-92.

Wang Y et al., 2007. Arabidopsis EIN2 modulates stress response through abscisic acid response pathway. Plant Mol Biol. 64:633-644.

Wapinski I, Pfeffer A, Friedman N, Regev A, 2007. Natural history and evolutionary principles of gene duplication in fungi. Nature 449:54-61.

Warringer J, Ericson E, Fernandez L, Nerman O, Blomberg A, 2003. High-resolution yeast phenomics resolves different physiological features in the saline response. Proc Natl Acad Sci USA 100: 15724–15729.

Waskiewicz AJ, Cooper JA, 1995. Mitogen and stress response pathways: MAP kinase cascades and phosphatase regulation in mammals and yeast. Curr Opin Cell Biol. 7:798-805.

Waters L, John L, Nelson M, 2007. Non-nucleoside reverse transcriptase inhibitors: a review. Int J Clin Pract. 61:105-118.

Webber MA, Piddock LJ, 2003. The importance of efflux pumps in bacterial antibiotic resistance. J Antimicrob Chemother. 51:9-11.

Weisblum B, 1995. Erythromycin resistance by ribosome modification. Antimicrob Agents Chemother. 39:577-585.

Weiser RS, Hargiss CO, 1946. Studies on the death of bacteria at low temperatures; the comparative effects of crystallization, vitromelting, and devitrification on the mortality of Escherichia coli. J Bacteriol. 52:71-79.

Whitlock MC, Phillips PC, Fowler K, 2002. Persistence of changes in the genetic covariance matrix after a bottleneck. Evolution 56:1968-1975.

57 Chapter 1

Wieland J, Nitsche AM, Strayle J, Steiner H, Rudolph HK, 1995. The PMR2 gene cluster encodes functionally distinct isoforms of a putative Na+ pump in the yeast plasma membrane. EMBO J. 14:3870-3882.

Willing RP, Leopold AC, 1983. Cellular expansion at low temperature as a cause of membrane lesions. Plant Physiol. 71:118-1121.

Wise EM Jr, Park JT, 1965. Penicillin: its basic site of action as an inhibitor of a peptide cross-linking reaction in cell wall mucopeptide synthesis. Proc Natl Acad Sci USA 54:75-81.

Wolf DM, Vazirani VV, Arkin AP. 2005. Diversity in times of adversity: probabilistic strategies in microbial survival games. J Theor Biol 234:227-253.

Wolfe KH, Shields DC, 1997. Molecular evidence for an ancient duplication of the entire yeast genome. Nature 387:708-713.

Wolter N, Smith AM, Farrell DJ, Schaffner W, Moore M, Whitney CG, Jorgensen JH, Klugman KP, 2005. Novel mechanism of resistance to oxazolidinones, macrolides, and chloramphenicol in ribosomal protein L4 of the pneumococcus. Antimicrob Agents Chemother. 49:3554-3557.

Wood JM, 1999. Osmosensing by bacteria: signals and membrane-based sensors. Microbiol Mol Biol Rev. 63:230- 262.

Wood WB, Austrian R, 1942. Studies On The Antibacterial Action Of The Sulfonamide Drugs : II. The Possible Relation Of Drug Activity to Substances Other Than p-Aminobenzoic Acid. J Exp Med. 75:383-394.

Woods DD, 1964. The Function Of Folic Acid In Cellular Metabolism. Proc R Soc Med. 57:388-390.

Woods R, Schneider D, Winkworth CL, Riley MA, Lenski RE, 2006. Tests of parallel molecular evolution in a long-term experiment with Escherichia coli. Proc Natl Acad Sci USA 103:9107-9112.

Yale J, Bohnert HJ, 2001. Transcript expression in Saccharomyces cerevisiae at high salinity. J Biol Chem. 276:15996-16007.

Yancey PH, Clark ME, Hand SC, Bowlus RD, Somero GN, 1982. Living with water stress: evolution of osmolyte systems. Science 217:1214-1222.

Yang SX, Zhao YX, Zhang Q, He YK, Zhang H, Luo D, 2001. HAL1 mediate salt adaptation in Arabidopsis thaliana. Cell Res. 11: 142-148.

Ye T, Elbing K, Hohmann S, 2008. The pathway by which the yeast protein kinase Snf1p controls acquisition of sodium tolerance is different from that mediating glucose regulation. Microbiology 154:2814-2826.

Yeaman S, Chen Y, Whitlock MC, 2010. No effect of environmental heterogeneity on the maintenance of genetic variation in wing shape in Drosophila melanogaster. Evolution 64:3398-3408.

Yoshikawa K, Tanaka T, Furusawa C, Nagahisa K, Hirasawa T, Shimizu H, 2009. Comprehensive phenotypic analysis for identification of genes affecting growth under ethanol stress in Saccharomyces cerevisiae. FEMS Yeast Res. 9: 32–44.

Young R, 1992. Bacteriophage lysis: mechanism and regulation. Microbiol Rev. 56:430-481.

Zacharioudakis I, Gligoris T, Tzamarias D, 2007. A yeast catabolic enzyme controls transcriptional memory. Curr Biol. 17:2041-2046.

58 Chapter 1

Zahn LM, Kong H, Leebens-Mack JH, Kim S, Soltis PS, Landherr LL, Soltis DE, Depamphilis CW, Ma H, 2005. The evolution of the SEPALLATA subfamily of MADS-box genes: a preangiosperm origin with multiple duplications throughout angiosperm history. Genetics 169:2209-2223.

Zhang J, Zhang YP, Rosenberg HF, 2002. Adaptive evolution of a duplicated pancreatic ribonuclease gene in a leaf- eating monkey. Nat Genet. 30:411-415.

Zheng DQ, Wu XC, Tao XL, Wang PM, Li P, Chi XQ, Li YD, Yan QF, Zhao YH, 2011. Screening and construction of Saccharomyces cerevisiae strains with improved multi-tolerance and bioethanol fermentation performance. Bioresour Technol. 102:3020-3027.

Zgurskaya HI, Nikaido H, 2000. Multidrug resistance mechanisms: drug efflux across two membranes. Mol Microbiol. 37:219-225.

59 Chapter 2

Adaptation of Saccharomyces cerevisiae to saline stress through laboratory evolution

This chapter was published as “Dhar R, Sägesser R, Weikert C, Yuan J, Wagner A, 2011. Adaptation of Saccharomyces cerevisiae to saline stress through laboratory evolution. Journal of Evolutionary Biology 24:1135-1153.”

Abstract Most laboratory evolution studies that characterize evolutionary adaptation genomically focus on genetically simple traits that can be altered by one or few mutations. Such traits are important, but they are few compared to complex, polygenic traits influenced by many genes. We know much less about complex traits, and about the changes that occur in the genome and in gene expression during their evolutionary adaptation. Salt stress tolerance is such a trait. It is especially attractive for evolutionary studies, because the physiological response to salt stress is well-characterized on the molecular and transcriptome level. This provides a unique opportunity to compare evolutionary adaptation and physiological adaptation to salt stress. The yeast Saccharomyces cerevisiae is a good model system to study salt stress tolerance, because it contains several highly conserved pathways that mediate the salt stress response. We evolved three replicate lines of yeast under continuous salt (NaCl) stress for 300 generations. All three lines evolved faster growth rate in high salt conditions than their ancestor. In these lines, we studied gene expression changes through microarray analysis, and genetic changes through next generation population sequencing. We found two principal kinds of gene expression changes, changes in basal expression (82 genes), and changes in regulation (62 genes). The genes that change their expression involve several well-known physiological stress response genes, including CTT1, MSN4, HLR1. Next generation sequencing revealed only one high frequency single nucleotide change, in the gene MOT2, that caused increased fitness when introduced into the ancestral strain. Analysis of DNA content per cell revealed ploidy increases in all the three lines. Our observations suggest that evolutionary adaptation of yeast to salt stress is associated with genome size increase and modest expression changes in several genes.

60 Chapter 2

Introduction The ability to characterize the changes that occur during evolutionary adaptation on a genome- wide scale has been a boon for the field of laboratory evolution. Most published studies focus on traits with a simple basis, where changes of major effects in one or few changes can alter a trait during laboratory evolution experiments (Ferea et al., 1999; Blanc & Adams, 2003; Velicer et al., 2006; Stanek et al., 2009). Such traits are important, but they are in the minority. The vast majority of traits have a complex, polygenic basis (Benfey & Protopapas, 2005). We know much less about how genomic change and change in gene expression occurs in such polygenic traits. Our study is a step towards answering this question. We here focus on a prototypical polygenic trait, an organism’s response to high concentrations of salt in its environment. The physiological response of an organism to such salt stress is well-studied on the molecular and on the transcriptome levels (Posas et al., 2000; Rep et al., 2000; Causton et al., 2001; Yale & Bohnert, 2001). This fact provides another important motivation to study the salt stress response in an evolutionary context. It allows us to ask whether evolutionary adaptation to salt stress is similar to physiological adaptation, using a genome-scale approach that relies on transcriptome changes in response to salt stress.

Environmental fluctuations and stressors constantly challenge organisms in the wild. Organisms thus use cellular mechanisms to adapt to and to survive environmental fluctuations. Hyperosmotic stress is one prominent environmental stressor, where a cell experiences higher solute concentration outside the cell than inside. This causes water loss from the cell, resulting in a higher intracellular concentration of ions and metabolites, and eventual arrest of cellular activity. Hyperosmotic stress is caused by high concentrations of sugar or salt, e.g., sodium chloride (NaCl). High salt stress is a special case of hyperosmotic stress and has similar effects on a cell as high concentration of sugars (Gasch et al., 2000; Causton et al., 2001). In addition, it causes hyperionic stress due to high extracellular concentrations of Na+ and Cl- ions, which are imported into the cell and can disrupt cellular ionic equilibrium. Tolerance to Na+ stress thus needs additional ion transport and detoxification mechanisms along with those required for the hyperosmotic stress response (Serrano, 1996; Apse et al., 1999; Gaxiola et al., 1999; Maathuis & Amtmann, 1999; Serrano et al., 1999; Zhu, 2001). The hyperosmotic stress response in yeast is

61 Chapter 2 mediated by the high osmolarity glycerol (HOG) pathway, which is a MAPK pathway (Brewster et al.,1993; Dihazi et al., 2004, Saito & Tatebayashi, 2004; reviewed in Hohmann 2002).Two cell membrane-bound sensors Sho1p and Sln1p detect osmotic change, which results in activation of Hog pathway genes, which, in turn, leads to the activation of the downstream genes associated with salt tolerance and adaptation (Martinez-Pastor et al., 1996; Schmitt & McEntee, 1996; Gorner et al., 1998; Ostrander & Gorman, 1999; Rep et al., 1999; Reiser et al., 2000; Rep et al., 2000). Whole transcriptome studies by Gasch et al, Posas et al, Yale and Bonhart, Causton et al., and Rep et al. (Gasch et al., 2000; Posas et al., 2000; Rep et al., 2000; Causton et al., 2001; Yale & Bohnert, 2001) have identified hundreds of genes whose expression levels are affected by hyperosmotic stress. The genes induced during osmotic stress response include the genes involved in synthesis and regulation of the cellular osmolytes glycerol and trehalose. Another group of genes, specifically activated under saline stress, are associated with ion homeostasis. Some of the genes affected by osmotic stress show transient expression changes, whereas others show expression changes that are stable in time. In addition to genome wide approaches, many studies have characterized the roles of individual genes in the osmotic stress response of yeast (Haro et al., 1991; Garciadeblas et al., 1993; Mai & Breeden, 1997; Ganster et al., 1998; Mendizabal et al., 1998; Tsujimoto et al., 2000; Betz et al., 2002; Goossens et al., 2002; Hirata et al., 2003; Heath et al., 2004; ).

Yeast is a good model system for studying osmo-adaptation in eukaryotes, since other fungi and plants share many of the stress response pathways and proteins involved in osmo-adaptation in yeast. First, the mitogen-activated protein kinase (MAP kinase) cascade, central to the stress response in yeast, is a conserved eukaryotic signal transduction pathway present from fungi to plants (Neiman, 1993; Kosako et al., 1993; Galcheva-Gargova et al., 1994; Han et al., 1994; Waskiewicz & Cooper, 1995; Jonak et al., 1999; Kyriakis & Avruch, 2001). Stress signaling in plants is also carried out by MAPK pathways, which are activated by cold, drought, salt, heat, and oxidative stress (Jonak et al., 1996; Kovtun et al., 2000; Teige et al., 2004). Second, yeast and plants have highly similar genes required for stress tolerance (Mendoza et al., 1994; Bressan et al., 1998; Pardo et al., 1998; Lee et al., 1999; Sanders, 2000; Hasegawa & Bressan, 2000; Quintero et al., 2002; Zhu, 2002). Third, yeast and plants have similar membrane ion transport

62 Chapter 2

and detoxification systems (Gaxiola et al., 1999; Quinteroa et al., 2000). For example, HAL family genes are important for ion homeostasis in yeast as well as in plants (Haro et al., 1991; Gaxiola at al., 1992; Ferrando et al., 1995; Murguía et al., 1996; Rios et al., 1997; Mulet et al., 1999; Espinosa-Ruiz et al., 1999; Gisbert et al., 2000; Yang et al., 2001).

With one exception (Samani & Bell, 2010), all previous studies of salt stress adaptation in yeast focused on physiological adaptation. Such adaptation occurs on time scales up to a few hours. The mechanisms of longer term evolutionary adaptation to salt stress are not known. Such adaptations occur on time scales of hundred generations or more. One aim of this study is to compare a population’s evolutionary response to salt stress with its physiological response on the transcriptome level. Does evolutionary adaptation mirror physiological adaptation? Does it affect largely the same genes as the physiological response?

A second aim is to investigate the genetic basis of evolutionary adaptation of yeast to high saline stress, as far as this is possible for a complex trait. Does the adaptation come about through accumulation of identifiable beneficial point mutations? Does it involve chromosomal rearrangements, as observed in evolution of yeast in glucose-limited or phosphate-limited media? Or does it take place through genetic and epigenetic changes altering expression of genes that help cells adapt to high salt? These are some of the questions we ask.

Understanding of evolutionary principles of salt tolerance could also be important for biotechnological applications. Firstly, yeast cells experience high salt concentration in many industrial fermentation processes, and improvement in performance of yeast in such conditions would benefit the industry immensely (Attfield, 1997; Trainotti & Stambuk, 2001; Zheng et al., 2011). Secondly, the principles of salt tolerance in yeast might be useful for engineering fungi and crop plants for salt tolerance, as both classes of organisms share many components of their stress response systems.

In this contribution, we exposed yeast to high salt concentrations for 300 generations in the laboratory in medium containing sodium chloride (NaCl). We compared the fitness and viability

63 Chapter 2

of the evolved lines with the starting yeast strain (ancestral strain), followed by characterization of gene expression changes and genetic changes such as mutations, copy number variations and chromosomal alterations.

Results To test for evolutionary adaptation of yeast cells to osmotic stress, we carried out laboratory evolution experiments through 30 serial transfers in batch cultures, comprising approximately 300 cell generations (see Methods). This number of generations is sufficient to show evolutionary adaptations in yeast (Adams & Oeller, 1986; Adams et al., 1992; Dunham et al., 2002; Gresham et al., 2008). For our experiments, we used haploid populations to avoid any potential masking of adaptively significant alleles in the diploid stage (Zeyl et al., 2003). We carried out three parallel replicate evolution lines in which yeast cells grew and divided in yeast peptone medium supplemented with galactose as the sole carbon source, and with 0.5M NaCl (YPGN), which is a high salt medium that exposes cells to high osmolarity stress. We refer to these lines as S lines (S1, S2 and S3). As a control, we also carried out three replicates where the growth medium did not contain NaCl (lines C1,C2 and C3).

Evolutionary adaptation to NaCl Before embarking on our experiments, we asked whether NaCl affects the fitness of our ancestral strain. Only in this case would we expect that NaCl exerts a selection pressure to which the strain can adapt. Fitness has two main components in our experiments. These are viability on the one hand, and growth or cell division rate on the other hand. We found that osmotic stress does not affect viability of the ancestral strain significantly (Figure 1a). Not surprisingly then, viability does not increase over the course of our experiment for lines evolved on NaCl (Figure 1b) and for lines evolved without NaCl (Figure 1c).

In contrast to viability, population growth rate is affected by salt stress, as shown in supplementary figure 1. The figure indicates that the ancestral strain grows significantly more slowly in medium containing NaCl. Thus, the fitness of the ancestral strain is lower in salt compared to that of the ancestral strain in medium without any salt. Having established that NaCl

64 Chapter 2

does affect the fitness of the ancestral strain through its growth rate, we asked if the growth rate and thus, relative fitness w (see Methods) of three salt evolved lines increase after 300 generations of evolution (figure 2a). Fitness increased significantly relative to the ancestral strain, such that the final evolved lines had a relative population mean fitness between w=1.11 and w=1.17, which corresponds to a decrease in the average cell doubling time between 8.2% and 12.3% relative to the ancestral strain (see Methods).

Figure 1: (a-c) Viability of the ancestral strain and the evolved lines in medium containing 0.5M NaCl. Cells from the ancestral strain and the evolved lines from overnight cultures were transferred into YPG medium. Cell samples from these cultures were withdrawn after 16 hours (during exponential growth phase) as well as after 24 hours (stationary phase), and then diluted and plated. The plates were incubated at 30ºC for five days and the number of colonies was counted. The relative viability of both the ancestral strain and the evolved lines were estimated from the ratio of colonies formed on agar plates with YPGN medium to that of plates with YPG medium. Relative viabilities were measured for (a) the ancestral strain, (b) lines evolved in YPGN medium (S lines), and (c) lines evolved in YPG medium (C lines). From the figures, it is clear that NaCl does not affect the viability of the ancestral strain. The viabilities of the evolved lines S1, S2 and S3 do not increase during the course of adaptation.

65 Chapter 2

Figure 2: Fitness Assay of the evolved yeast lines. Fitness assays were performed by inoculating equal numbers of cells from an evolved line and from the GFP-tagged reference yeast strain in the same vessel. The cells were grown for 24hrs, and the percentage of non-GFP- tagged cells was counted using FACS. The ratio of the cell numbers of the evolved lines to that of the ancestral strain gives the relative fitness of the evolved lines. (a) Relative fitness of the three replicate yeast lines evolved in medium with salt (S1,S2 and S3) in YPGN medium, and the three replicate yeast lines evolved without NaCl (C1,C2,C3) measured in YPGN medium. (b) Relative fitness of the lines S1, S2 and S3 in medium without NaCl. The lines evolved in salt show an increase in growth rate by 8% to 12% in NaCl medium compared to the ancestral strain. These lines also show a growth rate increase in medium without NaCl. However, this increase in growth rate is consistently lower than the increase in NaCl medium. In addition, the lines evolved in medium without NaCl show lower fitness in NaCl medium. Taken together, it can be concluded that part of the adaptation in the salt evolved lines is specific to NaCl.

Since our high salt medium is a complex medium, it is likely that part of the evolutionary response we observe also reflects adaptation to medium components different from NaCl. To ask whether this is the case, we also measured the relative fitness of the three S lines in the control medium (without NaCl). Not surprisingly, the fitness in this medium had also increased (figure 2b) relative to the ancestral strain, which suggests that at least part of the evolutionary adaptation we see is not specific to NaCl as a stressor. However, two lines of evidence show that a substantial fraction of the fitness increase is specific to NaCl. The first is that the fitness increase, when measured in the absence of NaCl, appears substantially and significantly lower than when measured in the presence of NaCl (figures 2a and 2b; Mann-Whitney U test, p<0.0001 for S1, S2 and S3). The second line of evidence is provided by our three control lines that had evolved without the addition of NaCl to the medium. Figure 2a shows that in every single line the fitness increase, when measured in medium with NaCl, is consistently and significantly lower than for

66 Chapter 2 the lines that had evolved in NaCl (Mann-Whitney U test, p<0.0001 for comparisons between S1 & C1, S2 & C2 and S3 & C3). Specifically, the increase in relative fitness w was at least 54 percent higher in the lines evolved in NaCl compared to the lines evolved without NaCl. In sum, a significant and substantial part of the evolutionary adaptation we observe is due to the selection pressure provided by NaCl.

Gene Expression analysis of the evolved lines To compare the expression levels of genes in the evolved lines and in the ancestral strain, we performed whole gene transcriptome analysis using yeast microarrays (Affymetrix). Because previous experiments had shown that many yeast genes change expression in response to salt and to other stressors (Gasch et al., 2000; Posas et al., 2000; Rep et al., 2000; Causton et al., 2001; Yale & Bohnert, 2001), we expected that this would also hold for our lines. Thus, we first asked which genes respond to salt in a similar manner in the ancestral strain and in all evolved lines. We will refer to such genes for brevity as shared genes. For all the analyses presented below we used all the replicate lines, as well as replicate microarray experiments (see methods).

In total, we observed 581 shared genes that were induced in salt and 580 shared genes that were repressed in salt with 2-fold expression change (|log2(fold change)|≥1, t-test, p<0.05, False

Discovery Rate <10%). Figure 3a shows a “volcano plot” (-log10(p-value) vs log2(fold change)) for the expression of allyeast genes in response to NaCl in the S lines and the ancestral strain. Figure 3b shows the extent of expression change in the evolved lines (vertical axis) and in the ancestral strain (horizontal axis) for all genes with similar response to NaCl in the evolved lines and the ancestral strain (|log2(fold change)|≥1, t-test, p<0.05, False Discovery Rate <10%). The genes we labeled by name in the figure include some known stress response genes (e.g., GPD1, SIP18), some genes whose expression changed substantially (e.g., FMP48, NOG1), and genes in both categories (e.g., GRE1). Many of these shared genes are directly associated with saline stress, hyperosmotic stress, or the general stress response, and were shown to be affected by hyper-osmolarity and salt stress in previous studies of the physiological and osmotic stress response (Gasch et al., 2000; Posas et al., 2000; Rep et al., 2000; Causton et al., 2001; Yale & Bohnert, 2001). Among the 581 shared induced genes, 192 genes were shown to be affected by stress in previous studies; among the 580 shared repressed genes, 56 were affected by stress in

67 Chapter 2

previous studies.

Figure 3: The genes showing similar (t-test, p-value<0.05, False Discovery Rate<10%) induction or repression by NaCl in the ancestral strain and in the evolved S lines (a) Volcano plot, showing the p-values on the Y axis, and the fold expression change for these genes on the X axis. The labeled genes include genes associated with the stress response (e.g., GPD1, SIP18), genes with high level of expression change (e.g., FMP48, NOG1), and genes in both categories (e.g., GRE1). (b) Scatter plot for the fold-change of the common genes in the ancestral strain (X-axis) and in the evolved S lines (Y-axis). Although most of the common genes change expression to a similar extent in the ancestral strain and the S lines, a small number of genes change expression to a different extent.

68 Chapter 2

Next, we classified shared genes using the Comprehensive Yeast Genome Database (CYGD) classification from the Munich Information Center for Protein Sequences (MIPS) (Güldener et al., 2005). A detailed analysis is given in the supplementary material (supplementary figure 2 and supplementary text). Here, we only discuss two gene classes in more detail. First, genes associated with 'cell rescue, defense and virulence' contain many general stress response genes, as well as genes that respond specifically to saline stress. One would expect that such genes are induced in response to salt stress, and our data show that induced genes in this category are overrepresented, and repressed genes in this category are underrepresented (supplementary figure 2 and supplementary text). This group of genes was also found to be significantly overrepresented among genes whose deletion reduced the growth rate of yeast in salt (Warringer et al., 2003). A second class of genes are genes involved in protein synthesis. The physiological response to stress can cause repression of protein synthesis (Gasch et al., 2000). In support of this observation, our data show that significantly fewer genes associated with protein synthesis are induced, and significantly more genes than expected are repressed (supplementary figure 2 and supplementary text). Again, this group of genes was significantly overrepresented among genes whose deletion increased salt-resistance (Warringer et al., 2003).

Differentially Expressed Genes While an analysis of genes regulated similarly in ancestral and evolved strains is instructive, we were more interested in genes that are expressed differently. These are genes whose expression adapted evolutionarily to salt stress. They can be subdivided into two main categories. The first comprises genes whose regulation has changed in the evolved lines. Figure 4a and b show two hypothetical examples of genes in this category. The second category comprises genes whose basal level of expression changed in the evolved lines, even in the absence of salt stress (figure 4c and d). This second category of genes may be especially important, because our selection conditions imposed continuous salt stress. It is thus conceivable that cells whose expression is ancestrally regulated in response to salt stress simply increase or decrease their basal expression to the level of regulated expression in the ancestral strain in the absence of salt. We note that these two categories have multiple subcategories, and there can be genes where both the basal expression and regulation can change. Supplementary figure 3 shows an overview of all

69 Chapter 2

possibilities.

We distinguished genes in the main categories by calculating two different Z-scores for each

gene, a Z-score for change in basal expression (Zb) and a Z-score for change in regulation (Zr) (see Methods) (Mukhopadhyay et al., 2006). In this analysis, we considered genes with |Z- score|≥1.5 as differentially expressed.

Figure 4: Schematic diagram for the type of expression change that can occur as a result of evolutionary adaptation in the evolved S lines compared to the ancestral strain. Each bar chart reflects the expression level of a hypothetical gene in the ancestral and evolved strain, in the absence and presence of salt. The scenarios that are depicted are as follows (a) No basal change, increased regulation (b) No basal change, increased repression (c) Basal increase in expression but no change in regulation (d) Basal decrease in expression but no change in regulation.

70 Chapter 2

Genes whose regulation changes In all, there are 62 genes whose regulation changes during our experiment (figure 5a). As we mentioned above, multiple types of such change are possible. Firstly, a gene’s induction in salt can increase. This will occur if the gene is induced more strongly in the evolved lines than the ancestral strain (supplementary figure 3b). Secondly, a gene’s induction in salt can become reduced (supplementary figure 3c). Thirdly, a gene’s repression in salt can be increased, i.e., the gene becomes repressed to a greater extent in the evolved lines (supplementary figure 3d). Fourth, a gene’s repression in salt can be reduced (supplementary figure 3e). Finally, a gene that was not regulated in the ancestral strain can become regulated in response to salt (supplementary figure 3f and g). Figure 5a plots the extent to which genes changed in their regulation against

their Zr scores. Some of the genes with significant changes in the regulation are labeled. Figure 6a and b show the five genes whose induction or repression changed most significantly (based on

Zr). We next discuss some of these genes.

There are only four genes in total which showed increased induction in the evolved lines. Among these four genes, two (PUT4 and PCL5) also showed a decrease in basal expression in the evolved lines. Four further genes showed new induction in the evolved lines. One of them, CUP1-1 (figure 5a), has previously been observed to be up-regulated in response to osmotic stress (Yale & Bohnert, 2001). Five genes showed reduced induction in our experiment. One of the genes, HSP30, encodes a stress-responsive protein that negatively regulates the H(+)-ATPase Pma1p. It is induced by heat shock, treatment with organic acid and ethanol, and glucose starvation (Panaretou & Piper, 1992; Piper et al., 1994; Piper et al., 1997).

In contrast to the few genes which showed increased induction, a total of 37 genes showed increased repression. Out of these 37 genes, one gene (HLR1) is directly involved in the stress response. This gene encodes a protein involved in regulation of cell wall composition, as well as in the osmotic stress response (Alonso-Monge et al., 2001). 12 genes showed new repression in our experiment. Among these 12 genes, the genes GCV2 and GCV1 were previously found to be up-regulated in osmotic stress (Yale & Bohnert, 2001). We did not find any gene with reduced repression.

71 Chapter 2

Figure 5: Genes with differential expression in the S lines compared to the ancestral strain (|Z-score|≥1.5). The plots show the level of expression change (vertical axis) and the Z-score (horizontal axis) for genes with (a) changed regulation, and (b) changed basal expression in the evolved lines. One should note that the basal change in expression is represented in terms of fold change, where the expression change is calculated as the ratio of the expressions in the evolved lines and the ancestral strain. The calculation of regulated expression change involves an additional subtraction of basal expression change and thus, can not be represented as fold change.

72 Chapter 2

Figure 6: The five genes with the highest Z-score in each of four categories of expression change in the evolved lines. Categories are (a) change in the level of induction, (b) change in the level of repression, (c) increase in basal expression, and (d) decrease in basal expression. The columns show, from left to right, the systematic gene name, the extent of expression change, the Z-score, the gene’s common name, the type of expression change (in (a) and b) only), and a reference, if the gene had responded physiologically to hyperosmotic stress or saline stress in previous studies.

73 Chapter 2

We next classified genes whose regulation changed in any of these ways according to their functions, using the CYGD classification of yeast genes (Güldener et al., 2005) (supplementary figure 4), and compared their distribution among functional categories with the proportions of the yeast genome in each category. The distribution of the number of genes among different classes is non-random for genes whose repression changes (49 genes in total; Chi-square test, p<0.001, df=17), whereas the distribution is not significantly different from random for the genes whose induction changes (13 genes; p>0.1, df=17). We then asked whether genes whose regulation changes occur preferentially in specific functional categories. Interestingly, for genes with change in induction level, such genes are (marginally) enriched only in the functional class of cell rescue, defense and virulence (p=0.0444). Genes whose repression changes are enriched in the functional classes transcription (p=0.0006), protein synthesis (p=0.0008), and proteins with binding or catalytic function (p=0.0003). The classes protein fate (folding, modification, destination; p=0.0232), cellular transport (p=0.0048), cell cycle/DNA processing (p=0.0134) and cell rescue, defense and virulence (p=0.0230) contain fewer genes whose repression changes than expected by chance alone (all p-values based on an exact binomial test).

Genes with basal level expression change Next we turn to genes that show a change in basal expression. There are 82 such genes. Thirty eight of them increased their basal expression, whereas 44 genes reduced their basal expression.

Figure 5b plots the extent of change in basal expression (log2-transformed) of all the genes against their corresponding Zb values. Some of the genes with significant changes in the basal expression level are labeled. Figure 6c and d lists the top five genes (based on Zb) with increased and decreased basal expression, respectively. The distribution of genes among different functional classes (supplementary figure 4) is significantly different from what would be expected by chance alone for the genes with increase in basal expression (p<0.05,df=17, Chi- square test), and also for the genes with a decrease in basal expression (p<0.001,df=17). For genes with an increase in basal expression, the class ‘cell rescue and defense’ (stress response) shows significantly more genes than expected (p=0.0059, exact binomial test), and the class protein fate (p=0.0409) shows significantly fewer genes than expected by chance alone. For genes with a decrease in basal expression, the class ‘cell type differentiation’ (cell wall,

74 Chapter 2 sporulation, spore wall etc.) (p=0.0165) shows more genes than expected, and the classes ‘transcription’ (p=0.0125) as well as ‘protein fate’ (folding, modification, destination) (p=0.0014) contain fewer genes than expected.

Two well-established stress response genes showed an increase in basal expression in the evolved lines. One of them is CTT1, whose basal expression increased by ~1.6 fold compared to the ancestral strain. This gene was found to be induced in four previous genome-wide studies of physiological stress adaptation in yeast (Gasch et al., 2000; Rep et al., 2000; Causton et al., 2001; Yale & Bonhert, 2001). The second known stress response gene is the transcription factor MSN4, whose expression was increased by approximately 1.5 fold. A more detailed analysis of individual genes can be found in the supplementary material (supplementary text).

Whole Genome (Re)Sequencing Analysis Some laboratory evolution studies have identified few beneficial mutations of large fitness effects in evolving populations (Blanc & Adams, 2003; Velicer et al., 2006; Stanek et al., 2009). To find out if there are any such mutations in our study system, we subjected the ancestral strain and population samples from two of our evolved lines (S1 and S2) to deep sequencing at about 10X coverage and the third line (S3) to ~50X coverage, using the Roche 454 Genome Sequencer. In our analysis of this sequence data, we aimed to identify only changes that may have swept to high frequency or fixation and 10X coverage is sufficient for that purpose. We chose candidate SNPs based on the criteria for the changed (derived) nucleotide described in supplementary methods and our approach is deliberately conservative. We identified 56 candidate SNPs and 7 deletions in total from all the three evolved (S) lines.

We sequenced all 56 candidate SNPs and 7 candidate deletions, but only one of them turned out to be a true change. This change was a SNP unique to line S2, and occurred at a frequency exceeding 50 percent, based both on next generation sequencing data and PCR sequencing data. This SNP is a non-synonymous G to A mutation (amino acid change: G230D) in the MOT2 (YER068W) gene. The gene MOT2 is a subunit of the CCR4-NOT complex which has roles in transcription regulation, mRNA degradation, and post-transcriptional modifications (Cade & Errede, 1994; Liu et al., 1998; Badarinarayana et al., 2000; Denis et al., 2001; Panasenko et al.,

75 Chapter 2

2006; Mersman et al., 2009), with no known role in salt tolerance.

Figure 7: Fitness analysis (using FACS) of the ancestral strain into which the mutant MOT2 gene of line S2 was introduced, and the corresponding wild-type MOT2 gene as a control. (a) Fitness in medium with NaCl (b) Fitness in medium without NaCl. The data show that the mutant MOT2 gene increases the fitness in salt medium as well as in medium without salt, and to similar extents.

To test whether the MOT2 SNP alone provided any fitness benefit to yeast in salt, we replaced the wild-type variant of MOT2 in the genome of the ancestral strain with the mutant (see Methods). Three replicate competition assays using FACS (see Methods) showed a relative population fitness w=1.043±0.008 for the mutant MOT2 allele (figure 7a) in medium with NaCl, compared to w=0.999±0.014 for the wildtype MOT2 allele. However, the fitness increase of the mutant MOT2 could only explain 25.3 percent of the total amount of fitness increase in line S2, whose relative fitness was w=1.17, indicating that there are other factors that contribute to the fitness increase of the line S2. In addition, we note that the MOT2 mutation also conferred a substantial fitness increase in the absence of NaCl (figure 7b), indicating that this mutation is not specifically adaptive to salt stress.

Duplication and PFGE We next turned to copy number variations as sources of evolutionary adaptation. Since large- scale chromosomal rearrangements and aneuploidies are frequent in yeast laboratory evolution (Adams et al., 1992; Dunham et al., 2002; Rancati et al., 2008), and are also observed in yeast

76 Chapter 2

gene knock-out strains (Hughes et al., 2000), we first wanted to know whether such rearrangements were numerous in our evolved lines. To this end, we first performed pulsed field gel electrophoresis (PFGE) analysis of whole chromosomes for two clones of the ancestral strain, as well as for two clones from each of the evolved lines. This experiment revealed no changes in any of the lines, except for one additional band (at ~500kb) that occurred exclusively in line S3 (figure 8a). PFGE for chromosomes digested with the NotI restriction enzyme also revealed a single novel band (~180kb), which was again unique to line S3 (figure 8b). Taken together, these observations suggest that high frequency copy number changes on a chromosomal scale are not rampant in our evolved lines, and the same arrangements do not occur across lines. All other distinguishable fragments in the gel, apart from the novel band of size ~180kb in line S3, were as predicted for a haploid yeast genome from a computational restriction digest of the genome with NotI. Additionally, chromosome-wide read coverage data from next generation sequencing (see below) supports the notion that there are no copy number changes involving large chromosome fragments (supplementary figure 5). However, a change in ploidy in the evolved lines that would affect entire chromosomes is not detectable by any of these two methods.

Gene Amplifications and Deletions from Next Generation Sequencing data We next turned to smaller scale copy number changes that we could detect in our next generation sequencing data. To detect gene duplication and deletions, a segmentation algorithm (see supplementary methods) was used along with analysis of DNA breakpoints from the reads (see supplementary methods). The results of both analyses are described in detail in the supplementary text.

We did not observe any large amplification or deletion in the genomes of any of the lines (and specifically in line S3) that could explain the appearance of novel bands in the pulsed field electrophoresis data from figure 8. This observation, together with read coverage data of individual chromosomes (supplementary figure 5) suggests that these novel bands were results of a non-duplicative translocation event in the line S3. However, for a haploid genome, a translocation event would displace one of the original bands in the PFGE, which we did not observe. Thus, such a translocation event is only consistent with the PFGE data if the affected chromosome occurs in more than one copy. In other words, it is consistent with a ploidy change

77 Chapter 2

in line S3.

Figure 8: Pulsed field gel electrophoresis (PFGE) analysis of the three replicate salt evolved lines S1, S2 & S3 and the ancestral strain. Two clones from each of the lines S1,S2 and S3 were subjected to PFGE analysis along with the ancestral strain. (a) The PFGE for the yeast chromosomes from the evolved lines S1, S2 and S3 and the ancestral strain. The chromosomes are shown on the left side. M represents the yeast chromosome PFG marker (NEB Catalog# N0345S). Only line S3 shows a new band (red arrow) of size approximately 500kb. (b) PFGE analysis of the NotI digested chromosomes from the lines S1, S2, S3 and the ancestral strain. M represents the Mid-range PFG marker I from NEB (Catalog# N3551S), whose fragments range between 15 kb and 291 kb. Again, only line S3 shows a new band (red arrow) in the gel, which has a size of approximately 180kb.

Ploidy of the Evolved Yeast Lines Changes in ploidy of haploid yeast strains under salt stress have been observed before (Gerstein et al., 2006). Neither PFGE nor chromosome wide read coverage data would be able to detect such changes. To estimate the ploidy level of evolved lines in comparison to the ancestral strain, we grew ancestral strain and population samples of the evolved lines for twenty four hours at 30oC and estimated the cell densities. We then isolated genomic DNA from a defined number of cells and quantified the DNA amount, which allowed us to calculate the DNA content per yeast cell for all the lines. We found that the DNA content per cell of lines S1 and S2 had increased by 78.7 percent and 85.4 percent, respectively, from that of the ancestral strain. For the line S3, the DNA content per cell was more than double that of the ancestral strain (119.91 percent increase

78 Chapter 2

from the ancestral strain)(figure 9). These observations suggest that all the evolved lines have become massively aneuploid.

Figure 9: DNA content per yeast cell of the ancestral strain and the evolved lines. The ancestral strain and population samples of the evolved lines were grown for 24 hours at 30oC. For each of the cultures, the cell density was estimated. Genomic DNA was isolated from a defined number of cells and DNA quantification was carried out with a NanoDrop ND-1000 spectrophotometer. The DNA content per cell was then calculated for each line. The figure suggests that all the evolved lines have become massively aneuploid.

A consistent cell size increase during our experiment Increases in ploidy are often associated with cell size increases. Our lines are no exception, as exemplified by the histogram of figure 10a, which shows the distribution of cell diameters of the ancestral strain and the evolved line S2. This increase is also microscopically visible (figure 10b). The mean cell diameter increased from 2.35μm (±0.70μm) to 6.73μm (±2.34μm). Along with a significant increase in the mean cell diameter (t-test, p-value <2.2x10-16), the coefficient

of variation (Cv), as defined by the ratio of standard deviation and mean, also increased -6 significantly (t-test for distributions of Cv for samples drawn from two distributions, p<10 ). In other words, not only did cells become larger during laboratory evolution, they also became

79 Chapter 2 more variable in size. Cell size changes in laboratory evolution experiments are not unprecedented. They have been observed in evolving E. coli populations (Lenski & Travisano, 1994; Lenski & Mongold, 2000; Philippe et al., 2009), as well as in Staphylococcus aureus populations in NaCl medium (Vijaranakul et al., 1995). However, the effect of cell size on survival and fitness is generally poorly understood.

Figure 10: Cell size comparison between ancestral strain and one evolved line. (a) Histogram of cell diameters of the ancestral strain and the S2 line, showing that the salt evolved cells are bigger than the ancestral cells. (b) Microphotograph of cell samples from the ancestral strain and the salt evolved lines.

80 Chapter 2

Discussion

To investigate evolutionary adaptation to long-term osmotic stress, we evolved yeast cells in the laboratory for 300 generations in high salt (NaCl) medium. We observed that salt reduces the growth rate of our ancestral strain by 11 percent. Consequently, the final cell density after 24 hours of growth is approximately 3 times lower compared to cells grown in the same medium but without salt. Our three replicate evolved lines grow approximately 8 to 12 percent faster than the ancestral strain in high salt medium, and part of this increase reflects adaptation specific to salt.

We analyzed the gene expression levels in all the evolved lines as well as in the ancestral strain. Although there were many shared genes between that respond to salt in similar manner in the evolved lines and in the ancestral strain , we also observed multiple differentially expressed genes in the evolved lines compared to the ancestral strain. These differentially expressed genes can be divided into two categories. The first category comprises genes with changes in their basal expression level in the evolved lines compared to the ancestor (i.e., even in the absence of salt). The second category comprises genes regulated differently in response to salt in the evolved lines.

Multiple genes whose expression shows an evolutionary response in our experiments were also affected by hyperosmotic stress or salt stress in previous studies of the physiological stress response. Changed expression of many genes in the physiological stress response in general, and in the hyperosmotic stress response in particular is transient (Gasch et al., 2000; Causton et al., 2001), and may depend on the time and conditions in which it is measured. It is thus to be expected that different studies show limited comparability with respect to the identity of these genes.

Some of the genes whose expression evolved are known to be associated with the stress response. For example, two genes, CTT1 and GAC1, showed an increase in basal expression in the evolved lines compared with the ancestral strain. Both these genes were also up-regulated in response to salt in our ancestral strain. CTT1 encodes a cytosolic catalase and protects the cell

81 Chapter 2

from oxidative damage by reactive oxygen species (Jamieson, 1998; Lushchak & Gospodaryov, 2005; et al., 2008). This gene has been shown to be up-regulated in four previous studies of physiological stress response and salt stress response (Gasch et al., 2000; Rep et al., 2000; Causton et al., 2001; Yale & Bohnert, 2001). The gene GAC1 is associated with glycogen synthesis (Wu et al., 2001) and is also induced during osmo-adaptation (Posas et al., 2000). Both these genes contain stress responsive elements (STRE) in their promoters (Schuller et al., 1994; Moskvina et al., 1998) and are known to be activated under various stress conditions. A third gene, CUP1-1, showed increased induction in the evolved lines and also was up-regulated in response to NaCl in our ancestral strain. This gene was also shown to be up-regulated in one previous study of physiological stress response (Yale & Bohnert, 2001). Yet another gene, HLR1, showed increased repression in the evolved lines, and was down-regulated in the ancestral strain in NaCl medium. This gene encodes a protein involved in maintaining cell wall composition and is also involved in response to osmotic stress (Alonso-Monge et al., 2001).

There are three main physiological components of the salt stress response. They affect adaptation time to salt, growth rate in salt, and efficiency of growth in salt (Warringer et al., 2003). In our experiments, evolutionary adaptation to salt must affect one or more of these three components, because the viability itself of yeast cells does not change between the ancestral strain and the evolved lines. There are various ways in which these components could change during evolution. Firstly, some of the genes whose basal expression changes might be directly involved in the initial adaptation to salt. Increasing or decreasing the basal expression of those genes to the level needed for salt adaptation, might enable the cells respond to salt much more quickly as opposed to changing their expression state. For example, some ion transporter genes (e.g. FRE5) have an increased basal expression in the evolved lines in our experiment. Higher levels of these transporter proteins at the initial stages of the salt stress response would help cells achieve ion homeostasis much more quickly. Secondly, some of the differentially expressed genes might not be directly related to the salt stress response but encode transcription factors that control salt stress response genes. For example, the transcription factor MSN4, that controls many stress response genes, has increased basal expression levels in our evolved yeast lines. Higher levels of this transcription factor during the initial adaptation phase could ensure faster induction of the

82 Chapter 2

genes required for salt stress response. Thirdly, there are genes that show evolutionary change in regulation in the evolved lines. Higher level of induction or repression of these genes might increase the growth rate and/or efficiency of growth in salt medium.

We found 16 genes that gained regulation (new induction and new repression) in the evolution experiment, suggesting that new ways of salt stress adaptation could also arise during evolution. However, we also observed loss of regulation, at least to some extent, for five genes (genes with reduced induction and reduced repression) in the evolved lines. Because we performed our evolution experiment under constant environmental stress, losing regulation might be advantageous for evolutionary adaptation to salt stress. If the affected genes are involved in adaptation to multiple stressors, losing regulation could actually be beneficial both under constant or fluctuating environmental stressors. However, if the affected genes are specific for a particular stress, a loss in regulation may also have a cost, because cells would become physiologically less flexible under fluctuating environmental stresses. Among the five genes with loss of regulation, one gene, HSP30, can be induced by several stressors (Piper et al., 1997). On the contrary, the mRNA level of this gene was shown to be diminished in NaCl medium by Rep et al (Rep et al., 2000).

The hyperosmotic and salt stress responses are complex and influenced by many genes. It is an open question whether a small number of genetic changes with large effects could dramatically increase salt stress resistance, and thus explain the fitness increase in our evolved lines. To find out, we also sequenced the ancestral strain and population samples of the evolved lines using deep sequencing with genome coverages between 10- and 50-fold to identify whether any high frequency polymorphisms arose in the evolved lines during our evolution experiment. For two reasons, we chose deep population sequencing over clone sequencing. First, population sequencing gives a comprehensive view of polymorphisms in a population. Second, it can also permit estimation of polymorphism frequencies, although any such estimate would be more qualitative than quantitative at our sequence coverage. At the very least, population sequencing can detect high frequency polymorphisms. Such polymorphisms are of the greatest interest to us, because they would correspond to adaptive mutations with strong fitness effects that rose to high

83 Chapter 2

frequencies. We note that the effective population sizes in our experiment are so large (>2x106) that no neutral polymorphisms would be expected to rise to high frequencies during the experiment’s short duration except perhaps through hitchhiking with an advantageous mutation. (Clone sequencing can detect low frequency polymorphisms, but for a heterogeneous population, this method requires sequencing of many clones, and is thus currently prohibitively costly.)

Only one of our evolved lines (S2) contained a high frequency single nucleotide polymorphism (SNP). This SNP occurs in more than 50 percent of the population, and causes an amino acid substitution in the protein encoded by the MOT2 gene. When introduced into the ancestral strain, this mutation causes a fitness increase that explains 25.3 percent of the increase in fitness w in the evolved strain. However, the mutation causes a fitness increase also in the absence of salt, which means that its effects do not reflect a specific adaptation to salt.

Multiple genetic changes of low population frequency could be present in our evolved lines, and these changes might explain a part of the fitness increase we see. Such low frequency SNPs might be present for several reasons. First, individual mutations may confer only modest fitness benefits and thus increase in frequency slowly. Second, some mutations with very strong fitness effects may have occurred, but late in the experiment, and thus might simply not have had enough time to rise to high frequency. Third, the fitness effect of alleles might be determined by epistatic interactions with other mutations elsewhere in the genome (Elena & Lenski, 1997; Elena & Lenski, 2001; Morgan & Feldman, 2001). Eventually genotypes with such alleles might rise to high frequency, but at time scales much longer than those of our experiment, because multiple mutations might have to occur before any one epistatic combination with strong benefits arises. Finally, and perhaps most importantly, there could be clonal interference between multiple beneficial polymorphisms in the population and this might prevent any particular polymorphism from rising to high frequency in the population (Kao & Sherlock, 2008).

Similar to our analysis on SNPs, our analysis on copy number variations, both based on PFGE and next generation sequencing, did not reveal any large scale copy number changes shared across strains. However, we observed ploidy increases in all the evolved lines, and this could be

84 Chapter 2

the reason behind the increase in cell size of these lines.

Evidence from plants suggests that an increase in ploidy and the resultant increase in cell size can be advantageous under salt-stressed conditions. For example, within a species, plants with higher ploidy can cope with salt stress better than plants with lower ploidy (Saleh et al., 2008). Polyploid plants have higher water content and lower osmotic pressure than diploid plant (Noggle, 1946). Their increased water content is due to a decrease in the surface-to-volume ratio of cells (Stebbins, 1950). In addition, any water content decrease due to high salinity is smaller in polyploid plants than that in diploid plants (Tal & Gardi, 1976). Gerstein et al. observed that initially haploid yeast lines increased genome size significantly faster under salt stress than in unstressed conditions, suggesting an advantage for higher ploidy or increased cell size also in yeast (Gerstein et al., 2006).

Changes in ploidy could help yeast cells adapt faster physiologically, for example by affecting expression of genes important for salt stress tolerance. Could all the changes in gene expression that we observed in our experiment be caused solely by ploidy changes or by the concomitant increase in cell size? To find out, we compared the set of genes that changed expression in our evolved lines with the set of genes that are known to change expression after cell size increases (Wu et al., 2010) and ploidy changes (Galitski et al., 1999). Remarkably, only three genes that evolve changed expression in our experiment (out of 144 differentially expressed genes in our dataset) are among the genes affected by cell size increase or ploidy changes. Two of these genes showed increased basal expression in our experiment. One of them, YIL169C, was also observed to be induced with increase in cell size (Wu et al., 2010). The other gene, COS8, was observed to change expression with a change in ploidy level (Galitski et al., 1999). The third gene, YLR042C, showed a basal decrease in expression in our evolved lines. It was also observed to be repressed in cells with increased size (Wu et al., 2010).

Overall, the vast majority of changes that we observe in gene expression cannot be caused by changed ploidy or cell size. Because our whole genome sequencing analysis revealed no small genetic changes of strong phenotypic effects, the gene expression changes we observe could be

85 Chapter 2 caused by multiple genetic changes of modest individual effects. Some of these changes might affect a gene’s expression in cis, others in trans, for example through changes in the amino acid sequence of transcriptional regulators (Wittkopp et al., 2004; Emerson & Li, 2010). In addition, some expression changes may be caused by epigenetic change (e.g., changes in DNA methylation) or alterations in cellular memory (Turner, 2002; Ringrose & Paro, 2004; Zhang et al., 2005; Zacharioudakis et al., 2007).

We found several genes with modest changes in expression level in our evolved lines, and no gene with very strong expression change. Our observations stand in contrast to the only laboratory evolution study we know of that investigated evolutionary adaptation to a stressor. The study asked how E. coli cells adapt to heat stress. It observed very high level of evolutionary expression change in two proteins, GroEL and GroES, that are known heat shock genes (Rudolph et al., 2010). The reason for this difference to our work is probably that adaptation to salt stress requires several molecular functions simultaneously, namely ionic detoxification to achieve ionic equilibrium, maintaining cellular water activity, and osmolyte synthesis. Each of the evolutionarily altered genes in our experiment may perform or control some of these functions and thus, may have only a modest individual fitness contribution when differentially expressed. In addition, gene interactions (Phillips 2008; He et al., 2010) could also contribute to a fitness increase. This scenario resembles recent observations in genome wide association studies (GWAS) for complex human traits and diseases, where researchers have observed many genetic variants with modest individual effects on a phenotype (Frayling, 2007; Barrett et al., 2008; Cooper et al., 2008; Visscher, 2008; Zeggini et al., 2008; Hindorff et al., 2009; Manolio et al., 2009; Visscher & Montgomery, 2009; Park et al., 2010; Yang et al., 2010).

In sum, adaptation to salt stress is associated with gene expression changes, and with a DNA content increase in all three evolved lines. It is also associated with one high frequency mutation in one of the evolved lines, and with one chromosomal rearrangement in another. Previous studies suggest that the DNA content increase we observe might facilitate evolutionary adaptation to high salt. However, it is not sufficient to explain the expression changes in 144 genes that our evolved lines show. These expression changes may be caused by multiple genetic

86 Chapter 2

changes of low frequency (that might vary among subpopulations) or by epigenetic changes. Several of the genes that change expression through evolutionary adaptation are also involved in the physiological stress response, which supports a causal role for these genes in evolutionary adaptation to salt stress. Although the fine-tuning of several genes helps shape the complex, polygenic trait we study, some of their expression changes might also reflect cellular constraints on evolution. Future work will allow us to disentangle these two contributors to evolutionary adaptation in gene expression.

Methods Strains and media All laboratory evolution experiments started from the same clone of haploid yeast strain BY4741, which is referred to as the ancestral strain. The 3 replicate yeast lines evolved in NaCl are referred to as lines S1, S2 and S3. The growth rates of the evolved lines as well as the ancestral strain were measured relative to a BY4741 strain in which the CWP2 gene was GFP- tagged (termed as the reference strain). For serial transfers, cells were cultured in YP and 2% galactose (YPG) and YPG supplemented with 0.5 M NaCl (YPGN).

Serial transfer Six parallel serial transfer experiments were started from one single clone of the ancestral strain. In each parallel experiment, 50 ml yeast culture was grown for at 30oC. Every 24 hours, 50μl of grown culture was transferred into fresh culture medium; 30 such transfer cycles were carried out for a total of approximately 300 generations (Each transfer cycle involved approximately log21000≈10 cell generations). In three of the parallel experiments, the culture medium was YPG , whereas in the other three parallel experiments, the medium was YPGN.

Viability assays To estimate cell viabilities, cultures of growing yeast cells were sampled after 16 hours (during exponential growth phase), as well as after 24 hours (during stationary phase), diluted and plated. The plates were incubated at 30oC for 5 days and the number of colonies was counted. The relative viabilities of both the ancestral strain and the evolved lines were estimated from the ratio

87 Chapter 2

of colonies forming on agar plates containing YPG+ 0.5M NaCl to that of plates containing only YPG. All the measurements were carried out in three biological replicates.

Competition assays To compare growth rates of the evolved lines with that of the ancestral strain, competition assays were carried out in triplicates using fluorescence activated cell sorting (FACS). Cells from frozen glycerol stocks were grown overnight in 4ml YPD medium (30oC, 220rpm) until late logarithmic phase (<1.5x108 cells/ml). For the competition assay, approximately equal cell numbers of the reference strain and of the competing strain were mixed and grown for 24 hours at 30oC. The relative cell numbers at the beginning and at the end of the competition experiment were determined using FACS (supplementary methods) and growth rate differences were estimated as described in supplementary methods. For calculation of fitness, it was taken into consideration that the cell numbers at the beginning of the assay may have differed between the reference strain and the competing strain.

Whole genome transcriptome analysis The mRNA expression levels in the ancestral strain and in the evolved lines were analyzed using a GeneChip Yeast Genome 2.0 Array (Affymetrix). Equal number of cells from the ancestral strain and the evolved lines were grown in YPG medium for 16 hours. The cells were then either induced with 0.5M NaCl, or grown uninduced for 20 further minutes. The microarray analysis was carried out for two replicates each for the ancestral strain in YPG and YPGN (4 in total) and for 4 replicate population samples (One for S1, two for S2 and one for S3) each in YPG and YPGN for the S lines (8 in total). “Shared” genes, genes that respond to NaCl in a similar manner between the ancestral strain and the evolved lines, with significant up-regulation or down-regulation were identified using a t-test at p=0.05, False Discovery Rate (FDR)<10% and

|log2(fold change)|≥1.

Genes with changes in basal expression or change in regulation were identified based on two Z scores, Zr and Zb (see supplementary methods). Genes whose absolute Z-scores exceeded a value of 1.5 were considered to be differentially expressed. The differentially expressed genes were

88 Chapter 2

then grouped into different classes using the CYGD functional classification for yeast genes (Güldener et al., 2005).

Whole Genome Sequencing and SNP identification The ancestral strain, as well as population samples of the NaCl evolved lines at generation 300 were sequenced at approximately 10X coverage using next generation pyrosequencing (Margulies et al., 2005) (genome sequencer FLX LR system, Roche). The line S3 was further sequenced to a total of ~50X coverage. Candidate SNPs and indels were identified using blastn (Altschul et al., 1990) followed by MUSCLE (Edgar, 2004) based on the criteria described in supplementary methods. PCR sequencing was done to confirm candidate SNPs and indels.

Pulsed-field gel electrophoresis (PFGE) To identify large scale chromosomal rearrangements, two clones from each of the evolved lines (S1, S2 and S3) and the ancestral strain were analyzed by PFGE. Agarose plugs were prepared with the CHEF Yeast Genomic DNA Plug Kit (Bio-Rad) according to the manufacturer’s protocol. For restriction enzyme digestion, agarose plugs were incubated with NotI restriction enzyme for 16hrs at 37oC mixture in an appropriate restriction buffer. The digested or undigested plugs were then loaded into the wells of 1% agarose gels. Electrophoresis was carried out in a CHEF-DRIII system (Bio-Rad). The gels were stained using ethidium bromide (EtBr) and photographed.

Gene Amplification and Deletions from Next Generation Sequencing data In addition to identifying very short indels within individual reads, larger gene duplications and deletions in the evolved lines, compared to the ancestral strain, were identified using read coverage information from the sequencing data. Furthermore, DNA breakpoint analysis was performed to confirm those candidate copy number changes.

Ploidy determination of the evolved lines To estimate the ploidy level of the evolved lines in comparison to the ancestral strain, the ancestral strain and population samples of the evolved lines were grown for 24 hours at 30oC.

89 Chapter 2

Genomic DNA was isolated from a defined number of cells following the protocol described in Sambrook & Russell, 2001(Chapter 6).

Cell size measurement Population samples of ancestral and evolved cells were grown in YPG medium up to mid-log phase. The cell density was adjusted to be the same for all the samples (~1x107cells/ml). 20ul of cells were then loaded onto disposable counting chambers and the cell sizes, were measured using a Cellometer M10 (Nexcelom Biosciences), following the manufacturer's protocol.

Acknowledgements We would like to acknowledge the help of the Functional Genomics Center Zurich for carrying out next generation sequencing and microarray experiments. AW acknowledges support through SNF grants 315200-116814, 315200-119697, and 315230-129708, as well as through the YeastX project of SystemsX.ch, and the University Research Priority Program in systems biology at the University of Zurich.

References Adams, J. & Oeller, P.W. 1986. Structure of evolving populations of Saccharomyces cerevisiae: Adaptive changes are frequently associated with sequence alterations involving mobile elements belonging to the Ty family. Proc. Natl. Acad. Sci. USA 83: 7124-7127.

Adams, J., Puskas-Rozsa, S., Simlar, J. & Wilke, C.M. 1992. Adaptation and major chromosomal changes in populations of Saccharomyces cerevisiae. Curr. Genet. 22: 13-19.

Alonso-Monge, R., Real, E., Wojda, I., Bebelman, J.P., Mager, W.H. & Siderius, M. 2001. Hyperosmotic stress response and regulation of cell wall integrity in Saccharomyces cerevisiae share common functional aspects. Mol. Microbiol. 41: 717-730.

Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. 1990. Basic local alignment search tool. J. Mol. Biol. 215: 403-410.

Apse, M.P., Aharon, G.S., Snedden, W.A. & Blumwald, E. 1999. Salt Tolerance Conferred by Overexpression of a Vacuolar Na +/H+ Antiport in Arabidopsis. Science 285: 1256-1258.

Attfield, P.V. 1997. Stress Tolerance: The key to effective strains of industrial baker's yeast. Nat. Biotech. 15: 1351- 1357.

Badarinarayana, V., Chiang, Y.C. & Denis, C.L. 2000. Functional interaction of CCR4-NOT proteins with TATAA- binding protein (TBP) and its associated factors in yeast. Genetics 155: 1045-1054.

Barrett, J.C., Hansoul, S., Nicolae, D.L., Cho, J.H., Duerr, R.H., Rioux, J.D. et al. 2008. Genome-wide association

90 Chapter 2 defines more than 30 distinct susceptibility loci for Crohn’s disease. Nat. Genet. 40: 955-962.

Benfey, P.N. & Protopapas, A. D. 2005. Essentials of genomics. Prentice-Hall, Upper Saddle River, NJ.

Betz, C., Zajonc, D., Moll, M. & Schweizer, E. 2002. ISC1-encoded inositol phosphosphingolipid phospholipase C is involved in Na+/Li+ halotolerance of Saccharomyces cerevisiae. Eur. J. Biochem. 269: 4033-4039.

Blanc, V.M. & Adams, J. 2003. Evolution in Saccharomyces cerevisiae: Identification of Mutations Increasing Fitness in Laboratory Populations. Genetics 165: 975-983.

Bressan, R.A., Hasegawa, P.M. & Pardo J.M. 1998. Plants use calcium to resolve salt stress. Trends Plant Sci. 3: 411-412.

Brewster, J.L., de Valoir, T., Dwyer, N.D., Winter, E. & Gustin, M.C. 1993. An Osmosensing Signal Transduction Pathway in Yeast. Science 259: 1760-1763.

Cade, R.M. & Errede, B. 1994. MOT2 encodes a negative regulator of gene expression that affects basal expression of pheromone-responsive genes in Saccharomyces cerevisiae. Mol. Cell. Biol. 14: 3139-3149.

Causton, H.C., Ren, B., Koh, S.S., Harbison, C.T., Kanin, E., Jennings, E.G. et al. 2001. Remodeling of Yeast Genome Expression in Response to Environmental Changes. Mol. Biol. Cell 2: 323-337.

Cooper, J.D., , D.J., Smiles, A.M., Plagnol, V., Walker, N.M., Allen, J.E. et al. 2008. Meta-analysis of genome-wide association study data identifies additional type 1 diabetes risk loci. Nat. Genet. 40: 1399-1401.

Denis, C.L., Chiang, Y.C., Cui, Y. & Chen, J. 2001. Genetic evidence supports a role for the yeast CCR4-NOT complex in transcriptional elongation. Genetics 158: 627-634.

Dihazi, H., Kessler, R. & Eschrich, K. 2004. High Osmolarity Glycerol (HOG) Pathway-induced Phosphorylation and Activation of 6-Phosphofructo-2-kinase Are Essential for Glycerol Accumulation and Yeast Cell Proliferation under Hyperosmotic Stress. J. Biol. Chem. 279: 23961-23968.

Dunham, M.J., Badrane, H., Ferea, T., Adams, J., Brown, P.O., Rosenzweig, F. et al. 2002. Characteristic genome rearrangements in experimental evolution of Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 99: 16144- 16149.

Edgar, R.C. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792-1797.

Elena, S.F. & Lenski, R.E. 1997. Test of synergistic interactions among deleterious mutations in bacteria. Nature 390: 395-398.

Elena, S.F. & Lenski, R.E. 2001. Epistasis between new mutations and genetic background and a test of genetic canalization. Evolution 55: 1746-1752.

Emerson, J.J. & Li, W-H. 2010. The genetic basis of evolutionary change in gene expression levels. Phil. Trans. R. Soc. B 365: 2581-2590.

Espinosa-Ruiz, A., Belles, J.M., Serrano, R. & Culianez-Macla, V. 1999. Arabidopsis thaliana AtHAL3: A flavoprotein related to salt and osmotic tolerance and plant growth. Plant J. 20: 529-539.

Ferea, T.L., Botstein, D., Brown, P.O. & Rosenzweig, R.F. 1999. Systematic changes in gene expression patterns following adaptive evolution in yeast. Proc. Natl. Acad. Sci. USA. 96: 9721-9726.

Ferrando, A., Kron, S.J., Rios, G., Fink, G.R. & Serrano, R. 1995. Regulation of cation transport in Saccharomyces

91 Chapter 2 cerevisiae by the salt tolerance gene HAL3. Mol. Cell. Biol. 15: 5470-5481.

Frayling, T.M. 2007. Genome-wide association studies provide new insights into type 2 diabetes aetiology. Nat. Rev. Genet. 8: 657-662.

Galcheva-Gargova, Z., Derijard, B., Wu, I.H. & Davis, R.J. 1994. An osmosensing signal transduction pathway in mammalian cells. Science 265: 806-808.

Galitski, T., Saldanha, A.J., Styles, C.A., Lander, E.S. & Fink, G.R. 1999. Ploidy regulation of gene expression. Science 285: 251-254.

Ganster, R.W., McCartney, R.R. & Schmidt, M.C. 1998. Identification of a Calcineurin-Independent Pathway Required for Sodium Ion Stress Response in Saccharomyces cerevisiae. Genetics 150: 31-42.

Garciadeblas, B., Rubio, F., Quintero, F.J., Bañuelos, M.A., Haro, R. & Rodríguez-Navarro, A. 1993. Differential expression of two genes encoding isoforms of the ATPase involved in sodium efflux in Saccharomyces cerevisiae. Mol. Gen. Genet. 236: 363-368.

Gasch, A.P., Spellman, P.T., Kao, C.M., Carmel-Harel, O., Eisen, M.B., Storz, G. et al. 2000. Genomic Expression Programs in the Response of Yeast Cells to Environmental Changes. Mol. Biol. Cell 11: 4241-4257.

Gaxiola, R., de Larrinoa, I.F., Villalba, J.M. & Serrano, R. 1992. A novel and conserved salt-induced protein is an important determinant of salt tolerance in yeast. EMBO J. 11: 3157-3164.

Gaxiola, R.A., Rao, R., Sherman, A., Grisafi, P., Alper, S.L. & Fink, G.R. 1999. The Arabidopsis thaliana proton transporters, AtNhx1 and Avp1, can function in cation detoxification in yeast. Proc. Natl. Acad. Sci. USA 96: 1480- 1485.

Gerstein, A.C., Chun, H.J.E., Grant, A. & Otto, S.P. 2006. Genomic convergence toward diploidy in Saccharomyces cerevisiae. PLoS Genet. 2: e145.

Gisbert, C., Rus, A.M., Bolarín, M.C., Lopez-Coronado, J.M., Arrillaga, I., Montesinos, C. et al. 2000. The Yeast HAL1 Gene Improves Salt Tolerance of Transgenic Tomato. Plant Physiol. 123: 393-402.

Goossens, A., Forment, J. & Serrano, R. 2002. Involvement of Nst1p/YNL091w and Msl1p, a U2B'' splicing factor, in Saccharomyces cerevisiae salt tolerance. Yeast 19: 193-202.

Gorner, W., Durchschlag, E., Martinez-Pastor, M.T., Estruch, F., Ammerer, G. & Hamilton, B. et al. 1998. Nuclear localization of the C2H2 zinc finger protein Msn2p is regulated by stress and protein kinase A activity. Genes Dev. 12: 586-597.

Gresham, D., Desai, M.M., Tucker, C.M., Jenq, H.T., Pai, D.A., Ward, A. et al. 2008. The Repertoire and Dynamics of Evolutionary Adaptations to Controlled Nutrient-Limited Environments in Yeast. PLoS Genet. 4: e1000303.

Güldener, U., Münsterkötter, M., Kastenmüller, G., Strack, N., van Helden, J., Lemer, C. et al. 2005. CYGD: the Comprehensive Yeast Genome Database. Nucl. Acids Res. 33: D364-D368.

Han, J., Lee, J.D., Bibbs, L. & Ulevitch, R.J. 1994. A MAP kinase targeted by endotoxin and hyperosmolarity in mammalian cells. Science 265: 808-811.

Haro, R., Garciadeblas, B. & Rodríguez-Navarro, A. 1991. A novel P-type ATPase from yeast involved in sodium transport. FEBS Lett. 291: 189-191.

Hasegawa, P.M. & Bressan, R.A. 2000. Plant Cellular and Molecular Responses to High Salinity. Annu. Rev. Plant. Physiol. Plant. Mol. Biol. 51: 463-499.

92 Chapter 2

He, X., Qian, W., Wang, Z., Li, Y. & Zhang, J. 2010. Prevalent positive epistasis in Escherichia coli and Saccharomyces cerevisiae metabolic networks. Nat. Genet. 42: 272-276.

Heath, V.L., Shaw, S.L., Roy, S. & Cyert, M.S. 2004. Hph1p and Hph2p, Novel Components of Calcineurin- Mediated Stress Responses in Saccharomyces cerevisiae. Eukaryotic Cell 3: 695-704

Herrero, E., Ros, J., Bellí, G. & Cabiscol, E. 2008. Redox control and oxidative stress in yeast cells. Biochim. Biophys. Acta 1780: 1217-1235.

Hindorff, L.A., Sethupathy, P., Junkins, H.A., Ramos, E.M., Mehta, J.P., Collins, F.S. et al. 2009. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl. Acad. Sci. USA 106: 9362–9367.

Hirata, Y., Andoh, T., Asahara, T. & Kikuchi, A. 2003. Yeast Glycogen Synthase Kinase-3 Activates Msn2p- dependent Transcription of Stress Responsive Genes. Mol. Biol. Cell. 14: 302-312.

Hohmann, S. 2002. Osmotic Stress Signaling and Osmoadaptation in Yeasts. Micro. Mol. Biol. Rev. 66: 300–372.

Hughes, T.R., Roberts, C.J., Dai, H., Jones, A.R., Meyer, M.R., Slade, D. et al. 2000. Widespread aneuploidy revealed by DNA microarray expression profiling. Nat. Genet. 25: 333-337.

Jamieson, D.J. 1998. Oxidative stress responses of the yeast Saccharomyces cerevisiae. Yeast 14: 1511-1527.

Jonak, C., Kiegerl, S., Ligterink, W., Barker, P.J., Huskisson, N.S. & Hirt, H. 1996. Stress signaling in plants: A mitogen-activated protein kinase pathway is activated by cold and drought. Proc. Natl. Acad. Sci. USA 93: 11274- 11279.

Jonak, C., Ligterink, W. & Hirt, H. 1999. MAP kinases in plant signal transduction. Cell. Mol. Life Sci. 55: 204-213.

Kao, K.C. & Sherlock, G. 2008. Molecular characterization of clonal interference during adaptive evolution in asexual populations of Saccharomyces cerevisiae. Nat. Genet. 40: 1499-1504.

Kosako, H., Nishida, E. & Gotoh, Y. 1993. cDNA cloning of MAP kinase kinase reveals kinase cascade pathways in yeasts to vertebrates. EMBO J. 12: 787-794.

Kovtun, Y., Chiu, W-L., Tena, G., Sheen, J. 2000. Functional analysis of oxidative stress-activated mitogen- activated protein kinase cascade in plants. Proc. Natl. Acad. Sci. USA 97: 2940-2945.

Kyriakis, J.M. & Avruch, J. 2001. Mammalian Mitogen-Activated Protein Kinase Signal Transduction Pathways Activated by Stress and Inflammation. Physiol. Rev. 81: 807-869.

Lee, J.H., Montagu, M.V. & Verbruggen, N. 1999. A highly conserved kinase is an essential component for stress tolerance in yeast and plant cells. Proc. Natl. Acad. Sci. USA 96: 5873-5877.

Lenski, R.E. & Travisano, M. 1994. Dynamics of adaptation and diversification: A 10,000-generation experiment with bacterial populations. Proc. Natl. Acad. Sci. USA 91: 6808-6814.

Lenski, R.E. & Mongold, J.A. 2000. Cell size, shape, and fitness in evolving populations of bacteria. In: Scaling in Biology (J. H. Brown & G. B. West, eds), pp. 221-235, Oxford University Press.

Liu, H.Y., Badarinarayana, V., Audino, D.C., Rappsilber, J., Mann, M. & Denis, C.L. 1998. The NOT proteins are part of the CCR4 transcriptional complex and affect gene expression both positively and negatively. EMBO J. 17: 1096-1106.

93 Chapter 2

Lushchak, V.I. & Gospodaryov, D.V. 2005. Catalases protect cellular proteins from oxidative modification in Saccharomyces cerevisiae. Cell Biol. Int. 29: 187-192.

Maathuis, F.J.M. & Amtmann, A. 1999. K+ Nutrition and Na+ Toxicity : The Basis of Cellular K+/Na+ Ratios. Annals of Bot. 84: 123-133.

Mai, B. & Breeden, L. 1997. Xbp1, a Stress-Induced Transcriptional Repressor of the Saccharomyces cerevisiae Swi4/Mbp1 Family. Mol. Cell. Biol. 17: 6491-6501.

Manolio, T.A., Collins, F.S., Cox, N.J., Goldstein, D.B., Hindorff, L.A., Hunter, D.J. et al. 2009. Finding the missing heritability of complex diseases. Nature 461: 747-753.

Margulies, M., Egholm, M., Altman, W.E., Attiya, S., Bader, J.S., Bemben, L.A. et al. 2005. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437: 376-380.

Martinez-Pastor, M.T., Marchler, G., Schuller, C., Marchler-Bauer, A., Ruis, H. & Estruch, F. 1996. The Saccharomyces cerevisiae zinc finger proteins Msn2p and Msn4p are required for transcriptional induction through the stress-response element (STRE). EMBO J. 15: 2227-2235.

Mendizabal, I., Rios, G., Mulet, J.M., Serrano, R. & de Larrinoa, I.F. 1998. Yeast putative transcription factors involved in salt tolerance. FEBS Lett. 425: 323-328.

Mendoza, I., Rubioli, F., Rodriguez-Navarroli, A. & Pardo, J.M. 1994. The Protein Phosphatase Calcineurin Is Essential for NaCl Tolerance of Saccharomyces cerevisiae. J. Biol. Chem. 269: 8792-8796.

Mersman, D.P., Du, H.N., Fingerman, I.M., South, P.F. & Briggs, S.D. 2009. Polyubiquitination of the demethylase Jhd2 controls histone methylation and gene expression. Genes Dev. 23: 951-962.

Morgan, L.W. & Feldman, J.F. 2001. Epistatic and Synergistic Interactions Between Circadian Clock Mutations in Neurospora crassa. Genetics 159: 537-543.

Moskvina, E., Schuller, C., Maurer, C.T.C., Mager, W.H. & Ruis, H. 1998. A Search in the Genome of Saccharomyces cerevisiae for Genes Regulated via Stress Response Elements. Yeast 14: 1041-1050.

Mulet, J.M., Leube, M.P., Kron, S.J., Rios, G., Fink, G.R. & Serrano, R. 1999. A novel mechanism of ion homeostasis and salt tolerance in yeast: the Hal4 and Hal5 protein kinases modulate the Trk1-Trk2 potassium transporter. Mol. Cell. Biol. 19: 3328-3337.

Mukhopadhyay, A., He, Z., Alm, E.J., Arkin, A.P., Baidoo, E.E., Borglin, S.C. et al. 2006: Salt Stress in Desulfovibrio vulgaris Hildenborough: an Integrated Genomics Approach. J. Bacteriol. 188: 4068-4078.

Murguía, J.R., Bellés, J.M. & Serrano, R. 1996. The yeast HAL2 nucleotidase is an in vivo target of salt toxicity. J. Biol. Chem. 271: 29029-29033.

Neiman, A.M. 1993. Conservation and reiteration of a kinase cascade. Trends Genet. 9: 390-394.

Noggle, R. 1946.The physiology of polyploidy in plants. I. Review of the literature. Lloydia 9; 153-173.

Ostrander, D.B. & Gorman, J.A. 1999. The Extracellular Domain of the Saccharomyces cerevisiae Sln1p Membrane Osmolarity Sensor Is Necessary for Kinase Activity. J. Bacteriol. 181: 2527-2534.

Panaretou, B. & Piper, P.W. 1992. The plasma membrane of yeast acquires a novel heat-shock protein (hsp30) and displays a decline in proton-pumping ATPase levels in response to both heat shock and the entry to stationary phase. Eur. J. Biochem. 206: 635-640.

94 Chapter 2

Panasenko, O., Landrieux, E., Feuermann, M., Finka, A., Paquet, N. & Collart, M.A. 2006. The yeast Ccr4-Not complex controls ubiquitination of the nascent-associated polypeptide (NAC-EGD) complex. J. Biol. Chem. 281: 31389-31398.

Pardo, J.M., Reddy, M.P., Yang, S., Maggio, A., Huh, G-H., Matsumoto, T. et al. 1998. Stress signaling through Ca2+/ calmodulin-dependent protein phosphatase calcineurin mediates salt adaptation in plants. Proc. Natl. Acad. Sci. USA 95: 9681-9686.

Park, J.H., Wacholder, S., Gail, M.H., Peters, U., Jacobs, K.B., Chanock, S.J. et al. 2010. Estimation of effect size distribution from genome-wide association studies and implications for future discoveries. Nat. Genet. 42: 570-575.

Philippe, N., Pelosi, L., Lenski, R.E. & Schneider, D. 2009. Evolution of Penicillin-Binding Protein 2 Concentration and Cell Shape during a Long-Term Experiment with Escherichia coli. J. Bacteriol. 191: 909-921.

Phillips, P.C. 2008. Epistasis - the essential role of gene interactions in the structure and evolution of genetic systems. Nat. Rev. Genet. 9: 855-867.

Piper, P.W., Ortiz-Calderon, C., Holyoak, C., Coote, P. & Cole, M. 1997. Hsp30, the integral plasma membrane heat shock protein of Saccharomyces cerevisiae, is a stress-inducible regulator of plasma membrane H(+)-ATPase. Cell Stress Chaperones 2: 12-24.

Piper, P.W., Talreja, K., Panaretou, B., Moradas-, P., Byrne, K., Praekelt, U.M. et al. 1994. Induction of major heat-shock proteins of Saccharomyces cerevisiae, including plasma membrane Hsp30, by ethanol levels above a critical threshold. Microbiology 140: 3031-3038.

Posas, F., Chambers, J.R., Heyman, J.A., Hoeffler, J.P., de Nadal, E. & Arino, J. 2000. The Transcriptional Response of Yeast to Saline Stress. J. Biol. Chem. 275: 17249-17255.

Quinteroa, F.J., Blatt, M.R. & Pardo, J.M. 2000. Functional conservation between yeast and plant endosomal Na+/H+ antiporters. FEBS Lett. 471: 224-228.

Quintero, F.J., Ohta, M., Shi, H., Zhu, J-K. & Pardo, J.M. 2002. Reconstitution in yeast of the Arabidopsis SOS signaling pathway for Na+ homeostasis. Proc. Natl. Acad. Sci. USA 99: 9061-9066.

Rancati, G., Pavelka, N., Fleharty, B., Noll, A., Trimble, R., Walton, K. et al. 2008. Aneuploidy underlies rapid adaptive evolution of yeast cells deprived of a conserved cytokinesis motor. Cell 135: 879-893.

Reiser, V., Salah, S.M. & Ammerer, G. 2000. Polarized localization of yeast Pbs2 depends on osmostress, the membrane protein Sho1 and Cdc42. Nat. Cell. Biol. 2: 620-627.

Rep, M., Reiser, V., Gartner, U., Thevelein, J.M., Hohmann, S., Ammerer, G. et al. 1999. Osmotic Stress-Induced Gene Expression in Saccharomyces cerevisiae Requires Msn1p and the Novel Nuclear Factor Hot1p. Mol. Cell. Biol. 19: 5474-5485.

Rep, M., Krantz, M., Thevelein, J.M. & Hohmann, S. 2000. The transcriptional response of Saccharomyces cerevisiae to osmotic shock. J. Biol. Chem. 275: 8290-8300.

Ringrose, L. & Paro, R. 2004. Epigenetic regulation of cellular memory by the Polycomb and Trithorax group proteins. Annu. Rev. Genet. 38:413-443.

Rios, G., Ferrando, A. & Serrano, R. 1997. Mechanisms of salt tolerance conferred by overexpression of the HAL1 gene in Saccharomyces cerevisiae. Yeast 13: 515-528.

Rudolph, B., Gebendorfer, K.M., Buchner, J. & Winter, J. 2010. Evolution of Escherichia coli for Growth at High Temperatures. J. Biol. Chem. 285: 19029-19034.

95 Chapter 2

Saito, H. & Tatebayashi, K. 2004. Regulation of the Osmoregulatory HOG MAPK Cascade in Yeast. J. Biochem. 136: 267-272.

Saleh, B., Allario, T., Dambier, D., Ollitrault, P. & Morillon, R. 2008. Tetraploid citrus rootstocks are more tolerant to salt stress than diploid. C. R. Biologies, 331: 703-710.

Samani, P. & Bell, G. 2010. Adaptation of experimental yeast populations to stressful conditions in relation to population size. J. Evol. Biol. 23: 791-796.

Sambrook, J. & Russell, D.W. 2001. Molecular Cloning: A laboratory Manual, 3rd edn. Cold Spring Harbor Laboratory Press, New York.

Sanders, D. 2000. Plant biology: The salty tale of Arabidopsis. Curr. Biol. 10: R486-R488

Schmitt, A.P. & McEntee, K. 1996. Msn2p, a zinc finger DNA-binding protein, is the transcriptional activator of the multistress response in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 93: 5777-5782.

Schuller, C., Brewster, J.L., Alexander, M.R., Gustin, M.C. & Ruis, H. 1994. The HOG pathway controls osmotic regulation of transcription via the stress response element (STRE) of the Saccharomyces cerevisiae CTT1 gene. EMBO J. 13: 4382-4389.

Serrano, R. 1996. Salt tolerance in plants and microorganisms: toxicity targets and defense responses. Int. Rev. Cytol. 165: 1-52.

Serrano, R., Mulet, J.M., Rios, G., Marquez, J.A., de Larrinoa, I.F., Leube, M.P. et al. 1999. A glimpse of the mechanisms of ion homeostasis during salt stress. J. Exp. Bot. 50: 1023-1036.

Stanek, M.T., Cooper, T.F. & Lenski, R.E. 2009. Identification and dynamics of a beneficial mutation in a long-term evolution experiment with Escherichia coli. BMC Evol. Biol. 9:302.

Stebbins, G. L. 1950. Variation and evolution in plants. Columbia University Press, New York.

Tal, M., & Gardi, I. 1976. Physiology of polyploid plants: water balance in autotetraploid and diploid tomato under low and high salinity. Physiol. Plantarum 38: 257-261.

Teige, M., Scheikl, E., Eulgem, T., Doczi, R., Ichimura, K., Shinozaki, K. et al. 2004. The MKK2 Pathway Mediates Cold and Salt Stress Signaling in Arabidopsis. Mol. Cell 15: 141-152.

Tsujimoto, Y., Izawa, S. & Inoue, Y. 2000. Cooperative Regulation of DOG2, Encoding 2-Deoxyglucose-6- Phosphate Phosphatase, by Snf1 Kinase and the High-Osmolarity Glycerol–Mitogen-Activated Protein Kinase Cascade in Stress Responses of Saccharomyces cerevisiae. J. Bacteriol. 182: 5121–5126.

Trainotti, N. & Stambuk, B. U. 2001. NaCl stress inhibits maltose fermentation by Saccharomyces cerevisiae. Biotech. Lett. 23: 1703-1707.

Turner, B.M. 2002. Cellular Memory and the Histone Code. Cell 111: 285-291.

Velicer, G.J., Raddatz, G., Keller, H., Deiss, S., Lanz, C., Dinkelacker, I. et al. 2006. Comprehensive mutation identification in an evolved bacterial cooperator and its cheating ancestor. Proc. Natl. Acad. Sci. USA 103: 8107- 8112.

Vijaranakul, U., Nadakavukaren, M.J., de Jonge, B.L., Wilkinson, B.J. & Jayaswal, R.K. 1995. Increased Cell Size and Shortened Peptidoglycan Interpeptide Bridge of NaCl-Stressed Staphylococcus aureus and Their Reversal by Glycine Betaine. J. Bacteriol. 177: 5116-5121.

96 Chapter 2

Visscher, P.M. 2008. Sizing up human height variation. Nat. Genet. 40: 489-490.

Visscher, P.M. & Montgomery, G.W. 2009. Genome-wide Association Studies and Human Disease: From Trickle to Flood. J. Am. Med. Assoc. 302: 2028-2029.

Warringer, J., Ericson, E., Fernandez, L., Nerman, O. & Blomberg, A. 2003. High-resolution yeast phenomics resolves different physiological features in the saline response. Proc. Natl. Acad. Sci. USA 100: 15724-15729.

Waskiewicz, A.J. & Cooper, J.A. 1995. Mitogen and stress response pathways: MAP kinase cascades and phosphatase regulation in mammals and yeast. Curr. Opin. Cell Biol. 7:798-805.

Wittkopp, P.J., Haerum, B.K. & Clark, A.G. 2004. Evolutionary changes in cis and trans gene regulation. Nature 430: 85-88.

Wu, C-Y., Rolfe, P.A., Gifford, D.K. & Fink, G.R. 2010. Control of Transcription by Cell Size. PLoS Biol 8: e1000523.

Wu, X., Hart, H., Cheng, C., Roach, P.J. & Tatchell, K. 2001. Characterization of Gac1p, a regulatory subunit of protein phosphatase type I involved in glycogen accumulation in Saccharomyces cerevisiae. Mol. Genet. Genomics 265: 622-635.

Yale, J. & Bohnert, H.J. 2001. Transcript Expression in Saccharomyces cerevisiae at High Salinity. J. Biol. Chem. 276: 15996-16007.

Yang, S.X., Zhao, Y.X., Zhang, Q., He, Y.K., Zhang, H. & Luo, D. 2001. HAL1 mediate salt adaptation in Arabidopsis thaliana. Cell Res. 11: 142-148.

Yang, J., Benyamin, B., McEvoy, B.P., Gordon, S., Henders, A.K., Nyholt, D.R. et al. 2010. Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42: 565-569.

Zacharioudakis, I., Gligoris, T. & Tzamarias, D. 2007. A Yeast Catabolic Enzyme Controls Transcriptional Memory. Curr. Biol. 17: 2041–2046.

Zeggini, E., Scott, L.J., Saxena, R., Voight, B.F., Marchini, J.L., Hu, T. et al. 2008. Meta-analysis of genome-wide association data and large-scale replication identifies additional susceptibility loci for type 2 diabetes. Nat. Genet. 40: 638-645.

Zeyl, C., Vanderford, T. & Carter, M. 2003. An Evolutionary Advantage of Haploidy in Large Yeast Populations. Science 299: 555-558.

Zhang, Y., Fatima, N. & Dufau, M.L. 2005. Coordinated Changes in DNA Methylation and Histone Modifications Regulate Silencing/Derepression of Luteinizing Hormone Receptor Gene Transcription. Mol. Cell. Biol. 25: 7929- 7939.

Zheng, D.Q., Wu, X.C., Tao, X.L., Wang, P.M., Li, P., Chi, X.Q. et al. 2011. Screening and construction of Saccharomyces cerevisiae strains with improved multi-tolerance and bioethanol fermentation performance. Bioresour. Technol. 102:3020-3027.

Zhu, J-K. 2001. Plant salt tolerance. Trends Plant Sci. 6: 66-71.

Zhu, J-K. 2002. Salt and Drought Stress Signal Transduction in Plants. Annu. Rev. Plant. Biol. 53: 247-273.

97 Appendix to Chapter 2

Supplementary figures

Supplementary Figure 1 - Effect of NaCl on the growth rate of the ancestral yeast strain. Equal number of ancestral strain cells were inoculated into YPG medium and YPGN medium (YPG supplemented with 0.5M NaCl), and grown for 24 hours. Cell numbers were counted manually after 24 hours. The experiment shows that NaCl reduces the growth rate of the ancestral strain, and after 24 hours the cell numbers in YPGN medium are almost three times lower than the cell numbers in YPG medium.

98 Appendix to Chapter 2

Supplementary figure 2 - Different functional gene classes [94] are enriched or impoverished for genes that respond similarly (t-test, |log2(fold change)|≥1, p-value<0.05, False Discovery Rate<10%) to NaCl in all the evolved yeast lines, and in the ancestral strain. The vertical axis shows the functional classes of genes we consider, and the horizontal axis shows the fraction of genes in each category listed on the bottom of the figure. Asterisks (‘*’) indicate classes where the number of genes differs significantly from that expected by chance alone, as indicated by a binomial test.

99 Appendix to Chapter 2

Supplementary figure 3 - Schematic diagram for the type of expression change that can occur as a result of evolutionary adaptation in the evolved S lines compared to the ancestral strain. Each bar chart reflects the expression level of a hypothetical gene in the ancestral and evolved strain, in the absence and presence of salt. The scenarios that are depicted on the following pages are as follows - (a) Basal change in expression but no change in regulation (b) Increased induction (c) Reduced induction (d) Increased repression (e) Reduced repression (f) New induction (g) New repression

Supplementary figure 3a: No change in Regulation

100 Appendix to Chapter 2

Supplementary figure 3b: Increased induction

101 Appendix to Chapter 2

Supplementary figure 3c: Reduced Induction

102 Appendix to Chapter 2

Supplementary figure 3d: Increased Repression

103 Appendix to Chapter 2

Supplementary figure 3e: Reduced Repression

104 Appendix to Chapter 2

Supplementary figure 3f: New Induction

105 Appendix to Chapter 2

Supplementary figure 3g: New Repression

106 Appendix to Chapter 2

Supplementary figure 4 - Different functional gene classes (Güldener et al., 2005) are enriched or impoverished for genes that change expression in the salt evolved yeast lines (S lines), compared to the ancestral strain. The vertical axis shows the functional classes of genes we consider, and the horizontal axis shows the fraction of genes in each category listed on the bottom of the figure. Asterisks (‘*’) indicate classes where the number of genes differs significantly from that expected by chance alone, as indicated by an exact binomial test.

107 Appendix to Chapter 2

Supplementary figure 5 - Chromosomal read coverage ratio for all the chromosomes between each salt evolved yeast lines and the ancestral strain. The X-axis represents the chromosome numbers. M stands for mitochondrion. The read coverage does not dramatically exceed that of the ancestral strain for any of the evolved lines, and for any of the chromosomes. This observation suggests that there are no chromosome-scale duplications or deletions in any of the evolved lines. The new bands that appear in the PFGE analysis for line S3 could be a non- duplicative chromosomal translocation event.

108 Appendix to Chapter 2

Supplementary figure 6 - Relationship between %G+C content and read coverage. The %G+C content was calculated for all possible reads of length 250bp in a genomic window of 10kb and plotted against the average number of reads per base pair in that 10kb window. The blue lines indicate the least square fits of the average number of reads versus %G+C content. We can see that the %G+C content has no significant effect on the average read coverage in all the lines. The plots show %G+C content vs Average read coverage per base pair in (a) Ancestral line (b) S1 line (c) S2 line and (d) S3 line.

109 Appendix to Chapter 2

Supplementary figure 7 - Length distribution of amplifications and deletions from the lines S1, S2, and S3 from the actual data (A) and the randomized data (R) (a) with false positive rate (FPR) <10% (b) without controlling FPR.

110 Appendix to Chapter 2

Supplementary figure 8 - Copy number distribution of amplifications and deletions from the lines S1, S2, and S3 from the actual data and the randomized data (a) with FPR<10% (b) without controlling for FPR. The X-axis represents the absolute change in copy number. For deletions, the copy number change actually represents the population frequency of the deletions.

111 Appendix to Chapter 2

Supplementary figure 9 - Effect of random sampling and the sensitivity of the segmentation algorithm (sensitivity) in detecting amplifications and deletions of various lengths and copy numbers. The X-axis represents the length of the amplified or deleted region (in basepairs), the Y-axis represents the Copy number of amplifications and deletions spiked into the data using random sampling and the Z-axis shows the sensitivity of the segmentation algorithm (see Methods). A sensitivity of 0.8 means that the algorithm could correctly detect an amplification/deletion of a particular size and copy number in 80% of the time.

112 Appendix to Chapter 2

Supplementary figure 10 - The steps of segmentation algorithm for calling amplifications and deletions based on coverage ratio (CR).

113 Appendix to Chapter 2

Supplementary text

Functional classification (Güldener et al., 2005) of genes that respond similarly to salt stress between the ancestral strain and the evolved lines We observed 581 common genes that were induced in salt and 580 common genes that were repressed in salt with |log2(fold change)|≥1 (t-test, p-value<0.05, False Discovery Rate <10%). We classified genes with similar expression in ancestral and evolved strain using the Comprehensive Yeast Genome Database (CYGD) classification from the Munich Information Center for Protein Sequences (MIPS)(Güldener et al., 2005).

The distribution of the number of genes in various classes is significantly different from what would be expected by chance alone, for both the up-regulated (Chi-square test, p<0.001, df=17) and down-regulated (p<0.001, df=17) genes. Also the numbers of genes in individual functional classes that are regulated in similar ways in the ancestral and evolved lines are significantly different from the number expected by chance alone, that is, from the number of genes in each class (exact binomial test). These classes can again be categorized into two groups - the classes that have more genes than expected and the classes that have fewer genes than expected by chance alone. For genes induced by salt, the classes associated with energy production (p- value<0.01) and cell rescue/defense (p-value<0.01) are overrepresented, whereas the classes associated with cell cycle and DNA processing (p-value<0.01), transcription (p-value<0.01), protein synthesis (p-value<0.01), cellular transport (p-value<0.01), and biogenesis of cellular components (p-value=0.03) are underrepresented.

For the genes repressed by salt, the classes transcription (p-value=0.01), protein synthesis (p- value<0.01), protein with binding function or cofactor requirement (structural or catalytic) (p- value<0.01), cell fate (p-value=0.02), biogenesis of cellular components (p-value=0.03), cell type differentiation (p-value=0.03), and development (systemic) (p-value=0.03) are overrepresented; the classes energy(p-value=0.01), cell cycle and DNA processing (p- value=0.01), protein fate (folding, modification, destination) (p-value<0.01), cellular transport (p-value<0.01), cell rescue, defense and virulence (p-value=0.02) are underrepresented.

Genes whose regulation changes in the evolved lines (S lines) There are only four genes in total which showed increased induction in the evolved lines. All four genes were up-regulated (log2(fold-change)≥1) in our ancestral strain in response to NaCl. Among these four genes, two of them (PUT4 and PCL5) also showed decrease in basal expression in the evolved lines. Four further genes showed new induction in the evolved lines. These genes were induced in the evolved lines in response to NaCl, whereas they did not show any expression change in the ancestral strain in NaCl. Among these four genes is CUP1-1 (Figure 6a), a metallothionein gene that binds copper and is associated with resistance to high concentrations of copper and cadmium (Karin et al., 1984; Winge et al., 1985). CUP1-1 has also been observed to be up-regulated in response to osmotic stress in a previous study (Yale & Bohnert, 2001). Five genes showed reduced induction in our experiment. One of the genes, HSP30, encodes a stress-responsive protein that negatively regulates the H(+)-ATPase Pma1p. It is induced by heat shock, treatment with organic acid and ethanol, and glucose starvation (Panaretou & Piper, 1992; Piper et al., 1994; Piper et al., 1997). This gene was shown previously to be down-regulated in physiological adaptation to salt stress (Rep et al., 2000). However, this

114 Appendix to Chapter 2

gene was found to be up-regulated in our ancestral strain in NaCl medium.

In contrast to the few genes which showed increased induction, a total of 37 genes showed increased repression. All 37 genes were also down-regulated (log2(fold-change)≤-1) in our ancestral strain in response to NaCl. Out of these 37 genes, one gene (HLR1) is directly involved in the stress response. This gene encodes a protein involved in regulation of cell wall composition, as well as in the osmotic stress response (Alonso-Monge et al., 2001). Four of these genes are shared with the down-regulated genes observed in the study by Rep et al (Rep et al., 2000). These are MET6 (associated with Methionine biosynthesis) (Masselot & De Robichon- Szulmajster, 1975), UTP4 (RNA binding and rRNA processing) (Dragon et al., 2002; Grandi et al., 2002; Gallagher et al., 2004), RPF2 (rRNA processing, ribosomal proteins, RNA binding) (Morita et al., 2002), and RPA34 (RNA polymerase I subunit A34.5) (Gadal et al., 1997). One of the genes, NCA3, was previously found to be up-regulated in response to osmotic stress (Yale & Bohnert, 2001). Two genes (YKR041W and PFK27) with increased repression also showed increase in basal expression in our experiment. 12 genes showed new repression in our experiment. Among these 12 genes, the genes GCV2 and GCV1 were previously found to be up- regulated in osmotic stress (Yale & Bohnert, 2001). We did not find any gene with reduced repression.

Genes with basal change in expression in the evolved lines (S lines) Two well-established stress response genes showed an increase in basal expression in the evolved lines. One of them is CTT1, whose basal expression increased by ~1.6 fold compared to the ancestral strain. We note that the level of induction of CTT1 in response to NaCl was more or less the same in the evolved lines and the ancestral strain (Figure 4b). This gene was found to be induced in four previous genome-wide studies of physiological stress adaptation in yeast (Gasch et al., 2000; Rep et al., 2000; Causton et al., 2001; Yale & Bohnert, 2001).The second known stress response genes is the transcription factor MSN4, whose expression was increased by approximately 1.5 fold. There are several other genes with increased basal expression that have previously been detected to increase expression in response to osmotic stress, including VID24 (Yale & Bohnert, 2001), RTN2 (Rep et al., 2000; Yale & Bohnert, 2001), and SFK1 (Yale & Bohnert, 2001).

Some of the genes with increase in basal expression such as HPF1, MIG2, PDR12, GAC1, PFK27 and PGU1 are involved in carbohydrate metabolism. Two of these genes have been associated with the stress response. For example, the gene PFK27 encodes the enzyme 6- phosphofructo-2-kinase that catalyzes the synthesis of fructose-2,6-bisphosphate (Boles et al., 1996). This gene is regulated by protein kinase A TPK1 (Kretschmer & Fraenkel, 1991), which regulates processes involved in osmotic tolerance (Norbeck & Blomberg, 2000). PFK27 is over- expressed in response to hyperosmotic stress (Yale & Bohnert, 2001). Another gene, GAC1, associated with glycogen synthesis (Wu et al., 2001), is also induced during osmo-adaptation (Posas et al., 2000) This protein binds to the Hsf1p heat shock transcription factor and induces HSF-regulated genes under heat shock (Lin & Lis, 1999). The promoter of GAC1 contains a stress responsive element STRE sequence and the expression is controlled by protein kinase A (Ward et al., 1995). We note that a contradictory report suggests that over-expression of GAC1 results in increased salt sensitivity in yeast (Wu et al., 2001).Three further genes FIT2, FIT3 and FRE5 that show increase in basal expression are associated with ion transport. The gene FIT2

115 Appendix to Chapter 2

showed the highest level of basal increase in expression (3.7-fold) in our experiment. This gene encodes a mannoprotein that is incorporated into the cell wall and is involved in iron uptake (Philpott et al., 2002). This gene is also induced in the hyperosmotic stress response (Yale & Bohnert, 2001). Another member of the same gene family, FIT3, also increased its basal expression (~2.2 fold). This gene is also associated with iron uptake in yeast cells (Philpott et al., 2002). The third gene FRE5 encodes a putative ferric reductase and is predicted to be involved in iron uptake. It is also associated with homeostasis of ions (Na+,K+,Ca2+) in yeast (Yun et al., 2001). There are several other genes, that show increase in basal expression, associated with cellular transport such as TPO1, TPO2 and TPO3 (amine/polyamine transport) (Tomitori et al. 1999; Tomitori et al., 2001), AQR1 (Carbohydrate transport) (Tenreiro et al., 2002), and RSB1 (fatty acid transport) (Kihara & Igarashi, 2002). Among the 44 genes whose basal expression level decreased, the gene TIR1 showed ~2.1 fold reduction in basal expression. TIR1 encodes for a cell wall mannoprotein whose expression is downregulated at acidic pH, and induced by cold shock and anaerobiosis (Marguet et al., 1988; et al., 1995; Kitagaki et al., 1997; Cohen et al., 2001). Another gene, STE3 showed a ~2.7 fold decrease in expression level compared to the ancestral strain. This protein, a pheromone receptor (Hagen et al., 1986), interacts with MAP kinase cascade pathway proteins such as STE2, STE4, and STE20 (Jensen et al., 2009).

We also note that some genes showed reduced basal expression during evolutionary adaptation experiment, whereas previous studies found them to be up-regulated during physiological adaptation. One of them is PUT4 (~1.75 fold decrease in expression), a proline permease required for high-affinity transport of proline (Andréasson et al., 2004). It was found to be up- regulated in three previous experiments of physiological stress response (Posas et al., 2000; Rep et al., 2000; Yale & Bohnert, 2001). The other four genes are YLR042C (Rep et al., 2000; Yale & Bohnert, 2001), YLR031W (Rep et al., 2000), YHR022C (Rep et al., 2000), and YGR146C (Rep et al., 2000), whose functions are not known.

Gene Amplifications and Deletions from Next Generation Sequencing Data We observed 325 putative copy number changes (119 amplifications and 206 deletions), 227 changes (96 amplifications and 131 deletions), and 273 changes (166 amplifications and 107 deletions) in lines S1, S2 and S3 respectively (False Positive Rate <10%). The median length of these changes was 1551bp, 1735bp and 1177bp for the lines S1, S2 and S3 respectively. Supplementary figure 7 shows histograms of the length distribution of these changes for all three lines. The amplification and deletions are usually small in size (<2000bp) and affect mostly a single gene or a part of a single gene and intergenic regions. Supplementary figure 8 shows analogous data for the apparent copy number distributions. We emphasize that the copy number change we see in our approach, would be affected by both copy number and allele frequency of a variant in the population. This is why we refer to apparent copy number here.

The amplifications and deletions, when present in high frequency should be detectable from DNA breakpoints in the read data. For a deleted region, reads containing the junction between the adjoining regions of the deletion would be detectable, whereas for amplifications, the junction would vary depending on where the amplification is present in the chromosome. To analyze breakpoints, reads that map to two non-adjacent regions of the genome were taken. A breakpoint was considered to be present if at least two independent (non-duplicate) reads show the same breakpoint within 50bp of each other. We note that the yeast strain BY4741 genome

116 Appendix to Chapter 2

contains four known gene deletions. These are his3∆1 (chromosomeXV 721947-722609), leu2∆0 (chromosomeIII, 91324-92418), met15∆0 (chromosomeXII, 732544-733878), and ura3∆0 (chromosomeV,116167-116970). In addition, the mitochondrion being circular, there should be reads connecting the end of the mitochondrion and the beginning of the mitochondrion. To validate the quality of our sequence data and analysis method, we attempted to identify these sequence signatures. We were able to correctly identify the breakpoints corresponding to all fourdeletions as well as the junction in mitochondrial DNA in the ancestral strain as well as in all the evolved lines.

When applying breakpoint analysis to our evolve lines, we obtained 16, 31 and 87 breakpoints for lines S1, S2 and S3 respectively. We compared all breakpoints from our DNA breakpoint analysis with the amplifications and deletions obtained from our segmentation algorithm, but found no overlap between the two sets of breakpoints. There are several factors that could be important for inability to detect those breakpoints through segmentation analysis. Firstly, the sequence coverage is modest, hence there may not be enough reads sampled from a given region with a breakpoint. For example, the deletion corresponding to met15∆0 in chromosomeXII, which is present in the whole population, is covered by only 3 reads with the corresponding breakpoint in the ancestral strain. Secondly, the frequency of an amplification or deletion would also affect the number of reads containing a breakpoint, and further reduce the number of reads that indicate a breakpoint.

Deletions whose length is of the order of the length of yeast genes can be either full or partial deletions. Either of these changes can decrease gene expression. Full gene amplification may increase an affected gene’s expression, whereas the effects of partial gene amplification are more difficult to predict. For example, when occurring inside a gene, an amplification might lead to an altered function, but when occurring in a regulatory region, amplification might change the expression level of the gene.

To investigate if there is any association between gene/regulatory region amplification or deletions with the changes in gene expression that we see from our microarray experiments, we mapped amplifications and deletions to the yeast genome. Out of 144 genes with differential expression, 18 genes also have amplifications or deletions associated with a gene or its regulatory region. Overall, however, a randomization assay (see Methods) suggests that the overlap between the amplifications and deletions from the sequencing data and the expression data can be observed by chance alone (p-value=0.50 from the empirical distribution).

Validating indels genetically from our data is not straightforward for three reasons. First, the position of an amplification in the genome is unknown. Second, in heterogeneous populations like ours there might be variation between individuals in the population. The apparent copy number identified from the above analysis is dependent on the population frequency of an indel, as well on as its actual change in copy number in those individuals where it occurs. For example, an indel of very high actual copy number change and low frequency and an indel of high frequency but low copy number change might appear identical in our analysis. Third, to confirm deletions genetically, the positions of the breakpoints have to be estimated. Identification of breakpoints with reasonable accuracy is quite difficult in our case for the reasons mentioned in the DNA breakpoint analysis section above.

117 Appendix to Chapter 2

Supplementary references

Alonso-Monge, R., Real, E., Wojda, I., Bebelman, J.P., Mager, W.H. & Siderius, M. 2001. Hyperosmotic stress response and regulation of cell wall integrity in Saccharomyces cerevisiae share common functional aspects. Mol. Microbiol. 41: 717-730.

Andréasson, C., Neve, E.P. & Ljungdahl, P.O. 2004. Four permeases import proline and the toxic proline analogue azetidine-2-carboxylate into yeast. Yeast 21: 193-199.

Boles, E., Göhlmann, H.W., & Zimmermann, F.K. 1996. Cloning of a second gene encoding 5-phosphofructo-2- kinase in yeast, and characterization of mutant strains without fructose-2,6-bisphosphate. Mol. Microbiol. 20: 65-76.

Causton, H.C., Ren, B., Koh, S.S., Harbison, C.T., Kanin, E., Jennings, E.G. et al. 2001. Remodeling of Yeast Genome Expression in Response to Environmental Changes. Mol. Biol. Cell 2: 323-337.

Cohen, B.D., Sertil, O., Abramova, N.E., Davies, K.J. & Lowry, C.V. 2001. Induction and repression of DAN1 and the family of anaerobic mannoprotein genes in Saccharomyces cerevisiae occurs through a complex array of regulatory sites. Nucleic Acids Res. 29: 799-808.

Dragon, F., Gallagher, J.E., Compagnone-Post, P.A., Mitchell, B.M., Porwancher, K.A., Wehner, K.A. et al. 2002. A large nucleolar U3 ribonucleoprotein required for 18S ribosomal RNA biogenesis. Nature 417: 967-970.

Gadal, O., Mariotte-Labarre, S., Chedin, S., Quemeneur, E., Carles, C., Sentenac, A. et al. 1997. A34.5, a nonessential component of yeast RNA polymerase I, cooperates with subunit A14 and DNA topoisomerase I to produce a functional rRNA synthesis machine. Mol. Cell. Biol. 17: 1787-1795.

Gallagher, J.E., Dunbar, D.A., Granneman, S., Mitchell, B.M., Osheim, Y., Beyer, A.L. et al. 2004. RNA polymerase I transcription and pre-rRNA processing are linked by specific SSU processome components. Genes Dev. 18: 2506-2517.

Gasch, A.P., Spellman, P.T., Kao, C.M., Carmel-Harel, O., Eisen, M.B., Storz, G. et al. 2000. Genomic Expression Programs in the Response of Yeast Cells to Environmental Changes. Mol. Biol. Cell 11: 4241-4257.

Grandi, P., Rybin, V., Bassler, J., Petfalski, E., Strauss, D., Marzioch, M. et al. 2002. 90S pre-ribosomes include the 35S pre-rRNA, the U3 snoRNP, and 40S subunit processing factors but predominantly lack 60S synthesis factors. Mol. Cell 10: 105-115.

Güldener, U., Münsterkötter, M., Kastenmüller, G., Strack, N., van Helden, J., Lemer, C. et al. 2005. CYGD: the Comprehensive Yeast Genome Database. Nucl. Acids Res. 33: D364-D368.

Hagen, D.C., McCaffrey, G. & Sprague, G.F. Jr. 1986. Evidence the yeast STE3 gene encodes a receptor for the peptide pheromone a factor: gene sequence and implications for the structure of the presumed receptor. Proc. Natl. Acad. Sci. USA 83: 1418-1422.

Jensen, L.J., Kuhn, M., Stark, M., Chaffron, S., Creevey, C., Muller, J. et al. 2009. STRING 8--a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res. 37: D412-416.

Karin, M., Najarian, R., Haslinger, A., Valenzuela, P., Welch, J. & Fogel, S. 1984. Primary structure and transcription of an amplified genetic locus: the CUP1 locus of yeast. Proc. Natl. Acad. Sci. USA 81: 337-341.

Kihara, A. & Igarashi, Y. 2002. Identification and characterization of a Saccharomyces cerevisiae gene, RSB1, involved in sphingoid long-chain base release. J. Biol. Chem. 277: 30048-30054.

Kitagaki, H., Shimoi, H. & Itoh, K. 1997. Identification and analysis of a static culture-specific cell wall protein,

118 Appendix to Chapter 2

Tir1p/Srp1p in Saccharomyces cerevisiae. Eur. J. Biochem. 249: 343-349.

Kowalski, L.R., Kondo, K. & Inouye, M. 1995. Cold-shock induction of a family of TIP1-related proteins associated with the membrane in Saccharomyces cerevisiae. Mol. Microbiol. 15: 341-353.

Kretschmer, M. & Fraenkel, D.G. 1991. Yeast 6-phosphofructo-2-kinase: sequence and mutant. Biochemistry 30: 10663-10672.

Lin, J.T. & Lis, J.T. 1999. Glycogen synthase phosphatase interacts with heat shock factor to activate CUP1 gene transcription in Saccharomyces cerevisiae. Mol. Cell. Biol. 19: 3237-3245.

Marguet, D., Guo, X.J. & Lauquin, G.J. 1988. Yeast gene SRP1 (serine-rich protein). Intragenic repeat structure and identification of a family of SRP1-related DNA sequences. J. Mol. Biol. 202: 455-470.

Masselot, M. & De Robichon-Szulmajster, H. 1975. Methionine biosynthesis in Saccharomyces cerevisiae. I. Genetical analysis of auxotrophic mutants. Mol. Gen. Genet. 139: 121-132.

Morita, D., Miyoshi, K., Matsui, Y., Toh-E, A., Shinkawa, H., Miyakawa, T. et al. 2002. Rpf2p, an evolutionarily conserved protein, interacts with ribosomal protein L11 and is essential for the processing of 27 SB Pre-rRNA to 25 S rRNA and the 60 S ribosomal subunit assembly in Saccharomyces cerevisiae. J. Biol. Chem. 277: 28780-28786.

Norbeck, J. & Blomberg, A. 2000. The level of cAMP-dependent protein kinase A activity strongly affects osmotolerance and osmo-instigated gene expression changes in Saccharomyces cerevisiae. Yeast 16: 121-137.

Panaretou, B. & Piper, P.W. 1992. The plasma membrane of yeast acquires a novel heat-shock protein (hsp30) and displays a decline in proton-pumping ATPase levels in response to both heat shock and the entry to stationary phase. Eur. J. Biochem. 206: 635-640.

Philpott, C.C., Protchenko, O., Kim, Y.W., Boretsky, Y. & Shakoury-Elizeh, M. 2002. The response to iron deprivation in Saccharomyces cerevisiae: expression of siderophore-based systems of iron uptake. Biochem. Soc. Trans. 30: 698-702.

Piper, P.W., Talreja, K., Panaretou, B., Moradas-Ferreira, P., Byrne, K., Praekelt, U.M. et al. 1994. Induction of major heat-shock proteins of Saccharomyces cerevisiae, including plasma membrane Hsp30, by ethanol levels above a critical threshold. Microbiology 140: 3031-3038.

Piper, P.W., Ortiz-Calderon, C., Holyoak, C., Coote, P. & Cole, M. 1997. Hsp30, the integral plasma membrane heat shock protein of Saccharomyces cerevisiae, is a stress-inducible regulator of plasma membrane H(+)-ATPase. Cell Stress Chaperones 2: 12-24.

Posas, F., Chambers, J.R., Heyman, J.A., Hoeffler, J.P., de Nadal, E. & Arino, J. 2000. The Transcriptional Response of Yeast to Saline Stress. J. Biol. Chem. 275: 17249-17255.

Rep, M., Krantz, M., Thevelein, J.M. & Hohmann, S. 2000. The Transcriptional Response of Saccharomyces cerevisiae to Osmotic Shock. J. Biol. Chem. 275: 8290-8300.

Tenreiro, S., Nunes, P.A., Viegas, C.A., Neves, M.S., Teixeira, M.C., Cabral, M.G. et al. 2002. AQR1 gene (ORF YNL065w) encodes a plasma membrane transporter of the major facilitator superfamily that confers resistance to short-chain monocarboxylic acids and quinidine in Saccharomyces cerevisiae. Biochem. Biophys. Res. Commun. 292: 741-748.

Tomitori, H., Kashiwagi, K., Sakata, K., Kakinuma, Y. & Igarashi, K. 1999. Identification of a gene for a polyamine transport protein in yeast. J. Biol. Chem. 274: 3265-3267.

Tomitori, H., Kashiwagi, K., Asakawa, T., Kakinuma, Y., Michael, A.J. & Igarashi, K. 2001. Multiple polyamine transport systems on the vacuolar membrane in yeast. Biochem. J. 353: 681-688.

119 Appendix to Chapter 2

Ward, M.P., Gimeno, C.J., Fink, G.R. & Garrett, S. 1995. SOK2 may regulate cyclic AMP-dependent protein kinase-stimulated growth and pseudohyphal development by repressing transcription. Mol. Cell. Biol. 15: 6854- 6863.

Winge, D.R., Nielson, K.B., Gray, W.R. & Hamer, D.H. 1985. Yeast metallothionein. Sequence and metal-binding properties. J. Biol. Chem. 260: 14464-14470.

Wu, X., Hart, H., Cheng, C., Roach, P.J. & Tatchell, K. 2001. Characterization of Gac1p, a regulatory subunit of protein phosphatase type I involved in glycogen accumulation in Saccharomyces cerevisiae. Mol. Genet. Genomics 265: 622-635.

Yale, J. & Bohnert, H.J. 2001. Transcript Expression in Saccharomyces cerevisiae at High Salinity. J. Biol. Chem. 276: 15996-16007.

Yun, C.W., Bauler, M., Moore, R.E., Klebba, P.E. & Philpott, C.C. 2001. The role of the FRE family of plasma membrane reductases in the uptake of siderophore-iron in Saccharomyces cerevisiae. J. Biol. Chem. 276: 10218- 10223.

120 Appendix to Chapter 2

Supplementary tables

Supplementary table 1 - Genes with changed induction in our experiment. The columns show, from left to right, systematic gene name, the level of expression change, the Z-score, the gene’s common name, the type of expression change, and a reference, if the gene had responded physiologically to hyperosmotic stress or saline stress in previous studies.

Log2 (Expression Common Previous Gene Z-score Type of change Change) Name studies YOR348C 0.8821 2.088 PUT4 Increased induction 1,2,4 YHR053C 0.7017 1.8401 CUP1-1 Increased induction 4 YKL049C 0.554 1.8101 CSE4 New induction 4 YAR010C 0.9355 1.7493 --- New induction 4 YHR071W 0.5834 1.726 --- Increased induction 4 YKL061W 0.5397 1.6606 --- New induction 4 YJL107C 0.6874 1.563 --- Increased induction 4 YBL086C 0.5343 1.5317 --- Increased induction 4 YCR021C −1.1641 −2.8828 --- Reduced induction 4 YHR021W-A −0.8585 −2.4177 --- Reduced induction 4 YDR218C −0.734 −1.9778 --- Reduced induction 4 YCR104W −0.6064 −1.8442 --- Reduced induction 4 YOR394W −0.7475 −1.5461 --- Reduced induction 4

121 Appendix to Chapter 2

Supplementary table 2 - Genes with changed repression in our experiment. The columns show, from left to right, systematic gene name, the level of expression change, the Z-score, the gene’s common name, the type of expression change, and a reference, if the gene had responded physiologically to hyperosmotic stress or saline stress in previous studies.

Gene Log2 Z-score Common Type of change Previous (Expression Name studies Change) YMR189W −1.0952 −2.9498 GCV2 New repression 4 YER091C −0.9458 −2.6921 MET6 Increased repression 2 YNR053C −1.0932 −2.6838 NOG2 Increased repression --- YOR339C −0.8424 −2.613 UBC11 Increased repression --- YHR066W −0.9644 −2.4702 SSF1 Increased repression --- YDR449C −0.9666 −2.4321 UTP6 Increased repression --- YNL308C −0.707 −2.3274 KRI1 Increased repression --- YOL010W −0.7639 −2.3083 RCL1 Increased repression --- YLR401C −0.7603 −2.2784 DUS3 Increased repression --- YLR222C −0.6988 −2.1319 UTP13 Increased repression --- YPL058C −0.7239 −2.0957 PDR12 New repression --- YGL209W −1.2042 −2.0794 MIG2 New repression --- YKR041W −1.273 −2.0507 --- Increased repression --- YCL054W −0.6863 −2.0498 SPB1 Increased repression --- YJL157C −0.8386 −2.0116 FAR1 Increased repression --- YCR057C −0.9464 −1.975 PWP2 Increased repression --- YKR056W −0.7558 −1.9696 TRM2 Increased repression --- YDR324C −0.7305 −1.9689 UTP4 Increased repression 2 YOL136C −0.7598 −1.9429 PFK27 Increased repression 4 YJL069C −0.7755 −1.899 UTP18 Increased repression --- YPL093W −0.77 −1.884 NOG1 Increased repression --- YDR496C −0.6033 −1.8561 PUF6 Increased repression --- YOR359W −0.7522 −1.8506 VTS1 Increased repression --- YPL130W −1.5479 −1.8346 SPO19 New repression --- YPR145W −0.5736 −1.8201 ASN1 Increased repression --- YKR081C −0.6162 −1.8102 RPF2 Increased repression 2 YJL116C −1.0038 −1.7928 NCA3 Increased repression 4

122 Appendix to Chapter 2

YKR069W −0.5795 −1.7782 MET1 New repression --- YGR128C −0.716 −1.7692 UTP8 Increased repression --- YGL078C −0.7441 −1.7642 DBP3 Increased repression --- YIL091C −0.5854 −1.76 --- Increased repression --- YGR283C −0.6895 −1.7597 --- Increased repression --- YNL119W −0.5689 −1.746 NCS2 Increased repression --- YBL108C- −0.5368 −1.7208 PAU9 New repression --- A YLR276C −0.7402 −1.7094 DBP9 Increased repression --- YGL184C −0.5572 −1.7079 STR3 Increased repression --- YBR298C- −0.8594 −1.7013 --- New repression --- A YDL167C −0.6209 −1.6472 NRP1 Increased repression --- YER056C −0.607 −1.6453 FCY2 Increased repression --- YDR528W −0.7456 −1.6061 HLR1 Increased repression --- YJL148W −0.6971 −1.6036 RPA34 Increased repression 2 YPL250C −0.5142 −1.5792 ICY2 Increased repression --- YDR398W −0.8879 −1.5762 UTP5 Increased repression --- YBL029W −0.6805 −1.5658 --- New repression --- YDR019C −0.9154 −1.5595 GCV1 New repression 4 YLR009W −0.4854 −1.5402 RLP24 Increased repression --- YPR167C −0.4788 −1.5302 MET16 Increased repression --- YML018C −0.4979 −1.529 --- Increased repression --- YPR065W −0.4581 −1.5024 ROX1 New repression ---

123 Appendix to Chapter 2

Supplementary table 3 - Genes with increased basal expression in our experiment. The columns show, from left to right, systematic gene name, the level of expression change, the Z- score, the gene’s common name, and a reference, if the gene had responded physiologically to hyperosmotic stress or saline stress in previous studies.

Gene Log2 (Fold Z-score Common Name Previous studies Change) YNL034W 1.3572 4.2657 ------YOR384W 1.2084 3.8216 FRE5 --- YOR049C 1.4081 3.5885 RSB1 --- YGL192W 1.5698 3.3941 IME4 --- YOR382W 1.8809 3.3935 FIT2 4 YIL169C 1.4536 2.75 ------YOL155C 1.3007 2.5886 HPF1 --- YGL209W 1.2428 2.5377 MIG2 --- YLR460C 0.8062 2.2763 ------YIR042C 0.7058 2.2012 ------YKR041W 1.0952 2.1158 ------YPL058C 0.7199 2.1019 PDR12 --- YBR105C 0.7067 2.0065 VID24 4 YDL048C 0.726 1.9886 STP4 --- YOL136C 0.6849 1.9879 PFK27 4 YGR088W 0.6911 1.9075 CTT1 2,3,4,5 YGR138C 0.8664 1.9041 TPO2 2 YGR109W- 0.9663 1.8699 ------B YDL204W 0.6419 1.868 RTN2 2,3 YNL065W 0.9231 1.8404 AQR1 --- YHL048W 0.652 1.833 COS8 --- YPL130W 1.3476 1.8013 SPO19 --- YBL029W 0.7047 1.78 ------YOR376W- 0.606 1.7793 ------A YFR032C 1.3441 1.7731 ------YAL016C- 0.6157 1.7623 ------B

124 Appendix to Chapter 2

YGL258W- 0.5373 1.7516 ------A YOR383C 1.1259 1.7318 FIT3 --- YLL028W 0.8465 1.7185 TPO1 4 YGL188C- 0.7349 1.701 ------A YMR013W- 0.5637 1.6962 ------A YJR153W 0.587 1.6759 PGU1 --- YKL051W 0.7722 1.6732 SFK1 4 YOR178C 0.538 1.6618 GAC1 1 YHL024W 0.5964 1.5922 RIM4 --- YKL062W 0.5898 1.5671 MSN4 --- YPR156C 1.0814 1.5463 TPO3 --- YOR107W 0.9521 1.5359 RGS2 ---

125 Appendix to Chapter 2

Supplementary table 4 - Genes with reduced basal expression in our experiment. he columns show, from left to right, systematic gene name, the level of expression change, the Z- score, the gene’s common name, and a reference, if the gene had responded physiologically to hyperosmotic stress or saline stress in previous studies.

Gene Log2 (Fold Z-score Common Name Previous studies Change) YKL178C −1.2774 −3.9263 STE3 --- YML058W- −1.2118 −3.7933 HUG1 --- A YJR078W −1.2138 −3.5631 BNA2 --- YER011W −1.0741 −3.361 TIR1 --- YDR281C −1.0859 −2.789 PHM6 --- YGR032W −1.1234 −2.6625 GSC2 --- YLR042C −0.9415 −2.6593 --- 2,4 YIR027C −0.7257 −2.3068 DAL1 --- YFL011W −0.8321 −2.265 HXT10 --- YFR023W −0.6752 −2.2328 PES4 --- YDR317W −0.7271 −2.2271 HIM1 --- YCL073C −0.6927 −2.2158 ------YJL038C −0.7085 −2.0857 LOH1 --- YAL063C- −0.6505 −2.0358 ------A YHL012W −0.6736 −1.9643 ------YFR012W −0.6097 −1.9552 ------YLR031W −0.7318 −1.9047 --- 2 YKL107W −0.6838 −1.8989 ------YOR348C −0.7881 −1.8902 PUT4 1,2,4 YJL133C-A −0.701 −1.8566 ------YOL014W −0.624 −1.8389 ------YBR157C −0.6091 −1.8342 ICS2 --- YFL012W −0.5668 −1.7958 ------YLR231C −0.6046 −1.7803 BNA5 --- YGL263W −0.6434 −1.7662 COS12 --- YOR328W −0.5443 −1.7523 PDR10 ---

126 Appendix to Chapter 2

YLR307W −0.5494 −1.7491 CDA1 --- YNR064C −0.6117 −1.7477 ------YFR012W- −0.5421 −1.7243 ------A YHR054C −0.5642 −1.711 --- 4 YLR030W −0.542 −1.7033 ------YCR106W −0.5893 −1.6905 RDS1 --- YMR040W −0.5822 −1.6384 YET2 --- YLR152C −0.5575 −1.6332 ------YDR403W −0.5302 −1.604 DIT1 --- YCR005C −0.5163 −1.601 CIT2 --- YDL024C −0.4966 −1.5963 DIA3 --- YLR054C −0.5923 −1.5898 OSW2 --- YGR059W −0.5395 −1.5567 SPR3 --- YFL059W −0.4753 −1.5559 SNZ3 --- YHR071W −0.5104 −1.53 PCL5 --- YGR146C −0.4714 −1.5213 --- 2 YHR022C −0.4976 −1.519 --- 2 YBL098W −0.5021 −1.511 BNA4 ---

Supplementary references

1. Posas, F., Chambers, J.R., Heyman, J.A., Hoeffler, J.P., de Nadal, E., & Arino, J. 2000. The Transcriptional Response of Yeast to Saline Stress. J. Biol. Chem. 275: 17249-17255.

2. Rep, M., Krantz, M., Thevelein, J.M., & Hohmann, S. 2000. The Transcriptional Response of Saccharomyces cerevisiae to Osmotic Shock. J. Biol. Chem. 275: 8290-8300.

3. Gasch, A.P., Spellman, P.T., Kao, C.M., Carmel-Harel, O., Eisen, M.B., Storz, G. et al. 2000. Genomic Expression Programs in the Response of Yeast Cells to Environmental Changes. Mol. Biol. Cell 11: 4241-4257.

4. Yale, J. & Bohnert, H.J. 2001. Transcript Expression in Saccharomyces cerevisiae at High Salinity. J. Biol. Chem. 276: 15996-16007.

5. Causton, H.C., Ren, B., Koh, S.S., Harbison, C.T., Kanin, E., Jennings, E.G. et al. 2001. Remodeling of Yeast Genome Expression in Response to Environmental Changes. Mol. Biol. Cell 2: 323-337.

127 Appendix to Chapter 2

Supplementary methods

Strains and media The haploid yeast strain BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0; American Type Culture Collection ATCC#201388) was used in this study. This strain is derived from the reference strain S288C (Brachmann et al., 1998; Goffeau et al., 1996). All laboratory evolution experiments started from the same clone of BY4741, which is referred to as the ancestral strain. Evolved lines refer to the strains derived from the ancestral strain through serial transfer cycles, as described below. The 3 replicate yeast lines evolved in NaCl are referred to as lines S1, S2 and S3. The growth rates of the evolved lines as well as the ancestral strain were measured relative to a BY4741 strain in which the CWP2 gene was GFP-tagged. This GFP-tagged strain is referred to as the reference strain. Ancestral and reference strains were obtained from EUROSCARF (http://web.uni-frankfurt.de/fb15/mikro/euroscarf/yeast.html) and Invitrogen (www.invitrogen.com and http://yeastgfp.ucsf.edu) (Ghaemmaghami et al., 2003), respectively. Although the reference strain’s GFP-tag might affect fitness in some environments, such changes do not affect our measurements, because the fitness changes of both evolved and ancestral strains were calculated relative to the reference strain.

The growth media in this study included YPD, which consists of 2% peptone, 1% yeast extract, adjusted to pH 5.8, and 2% glucose, supplied after sterilization. For serial transfers, we cultured cells in YP and 2% galactose (YPG) and YPG supplemented with 0.5 M NaCl (YPGN). Media for serial transfer were supplemented with 50 μg/ml ampicillin and 25 μg/ml tetracycline to minimize the risk of contamination.

Serial transfer Six parallel serial transfer experiments were started with an overnight YPD culture derived from one single clone of the ancestral strain, from which 50μl were transferred into 50ml fresh medium. In each parallel experiment, 50 ml yeast culture was grown in an incubating shaker for 24 hours at 220rpm and 30oC, after which cultures had reached stationary phase. Every 24 hours, 50μl of grown culture was transferred into fresh culture medium; 30 such transfer cycles were carried out for a total of approximately 300 generations (Each transfer cycle involved approximately log21000≈10 cell generations). In three of the parallel experiments, the culture medium was YPG for each transfer, whereas in the other three parallel experiments, the medium was YPGN. Before each transfer, 1.5ml of culture were withdrawn, supplemented with 25% glycerol, and stored at -80oC for future analysis. Cultures were periodically examined for contamination by plating, as well as microscopically.

Viability assays To investigate whether the ancestral strain and the evolved lines can survive high concentrations of NaCl, cell viabilities were estimated as follows. A sample of cells from an overnight culture was transferred into YPG medium. Cells of this culture were sampled after 16 hours (during exponential growth phase), as well as after 24 hours (during stationary phase), diluted and plated. The plates were incubated at 30oC for 5 days and the number of colonies were counted. The relative viabilities of both the ancestral strain and the evolved lines were estimated from the ratio of colonies forming on agar plates containing YPG+ 0.5M NaCl to that of plates containing only YPG. All the measurements were carried out in three biological replicates.

128 Appendix to Chapter 2

Competition assays The reference strain, ancestral strain, and evolved lines have matching auxotrophies (Ghaemmaghami et al., 2003), which makes comparison of their growth rates unproblematic. To compare growth rates of the evolved lines with that of the ancestral strain, competition assays were carried out using fluorescence activated cell sorting (FACS), as described below. Cells from frozen glycerol stocks were grown overnight in 4ml YPD medium (30oC, 220rpm). The cultures were allowed to grow until late logarithmic phase (<1.5x108 cells/ml). Cell numbers were then counted manually with a Neubauer cytometer. In each culture, cell numbers were then adjusted to 2.5x107 cells/ml with YPD. These pre-culturing steps ensured that cells were in comparable physiological states with high viability before competition. For the competition assay, equal cell numbers (2.5×106) of the reference strain and of the competing strain were mixed in 10ml YPGN medium (5.0×105ml-1 final cell density). A 2.0ml aliquot of this culture was centrifuged, and the cells were resuspended in 1.0ml PBSE buffer (10mM Na2HPO4, 2mM KH2PO4, 137mM NaCl, 2.7mM KCl, 0.1mM EDTA, pH 7.4). These cells were stored overnight o at 4 C. FACS was then used to measure the relative cell numbers Nr(0) of the reference strain, and Ne(0) of an evolved line (or Na(0) of the ancestral strain, depending on the experiment) at the beginning of the competition assay. Competition assays were carried out in three replicates. For each replicate, 2ml cell aliquots were taken and grown for 24 hours in an incubating shaker at o 220rpm and 30 C. The relative cell numbers Nr(1) and Ne(1) (or Na(1)) at the end of 24 hours were counted using FACS.

Fluorescent activated cell sorting (FACS) analysis At the end of each competition assay, the cell densities were estimated visually. Then, 20μl, 50μl or 100μl of the suspended cells were added to 1.0ml PBSE for high, medium or low cell densities respectively. The cell ratios Nr(0)/Nx(0) and Nr(1)/Nx(1), where “x” stands for either “e” for an evolved line or “a” for the ancestral strain, were then determined with a Beckman Coulter Cytomics FC500 fluorescence activated cell sorter. An argon ion emitting light at 488nm was used as a light source for the GFP excitation, and an optical long-pass filter with a cut-off of 500nm was used to discriminate against unspecific fluorescence and scattering light against GFP fluorescence at 520nm. Since our evolved lines consist of populations with heterogeneous cell sizes, the analysis was done in the ungated mode. The flow rate was adjusted to keep the total data rate below 3000 events per second. For each sample, 10000 cells were counted. The histogram of fluorescing and non-fluorescing cells provided the relative number of cells of the reference strain (GFP-tagged) and the competing strain (evolved line or ancestral strain).

Estimating growth rate differences From the competition assays growth rate differences were determined as follows. If the reference

rtr etr strains R and some other strain E grow according to Nr t)( = e Nr )0( and Ne t)( = e Ne )0( , where rr and re are strain-specific growth rates, and where Ni(t) are population sizes of the respective strains at time t, then

Ntee() N(00) N e( ) =rexp⎣⎦⎡⎤()er−−rt =m exp ⎣⎡() em rt ⎦⎤ Ntrr() N()00 N r()

Here, me is the Malthusian fitness of strain E and mr is the Malthusian fitness of strain R. If the population numbers were measured through FACS after t=1 day of competition (as done in our

129 Appendix to Chapter 2

-1 experiment), me (dimension [d ]) can be estimated as

⎛⎞NN(1/) ( 1) mm=− ln er er ⎜⎟ ⎝⎠NNer()0/ () 0 The same calculation can be used to estimate ma. To compare the cell numbers of an evolved line to that of the ancestral strain, the dimensionless ratio N=Ne/Na was estimated. This ratio is equivalent to N ⎛⎞NN()1/( 0) N=e =ee =exp r−− r = exp m m ⎜⎟()ea[] e a NNaa⎝⎠()1/ N a ( 0 ) Therefore, ⎛⎞N ⎛⎞NN(1/) ( 0) mm=− lne = ln ee ea ⎜⎟ ⎜⎟ ⎝⎠NNNaaa⎝⎠()1/ ( 0 ) Since, in our case, the growth of yeast cells is density dependent and is limited to a 1000 fold -1 increase per culture transfer cycle, ma can be estimated as ma=mav=ln(1000)/(1d)=6.9078d . The selection coefficient s is defined as the difference in Darwinian fitness between an evolved line and the ancestral strain, i.e., s=(me-ma)/ma=w-1 (Lenski et al., 1991), where s>0 indicates that the evolved line has an advantage over the ancestral strain. 10 N −1serves as a useful approximation, for estimating the increase in growth rate of the evolved lines.

Whole genome transcriptome analysis The mRNA expression levels in the ancestral strain and in the evolved lines were analyzed using a GeneChip Yeast Genome 2.0 Array (Affymetrix). Equal number of cells from the ancestral strain and the evolved lines were grown in YPG medium for 16 hours. The cells were then either induced with 0.5M NaCl or grown uninduced for 20 further minutes. Total RNA was isolated using the RiboPure-Yeast RNA isolation Kit (Ambion). The quality of the isolated RNA was determined with a NanoDrop ND 1000 spectrophotometer (NanoDrop Technologies, Delaware, USA) and a Bioanalyzer 2100 (Agilent, Waldbronn, Germany). Only those samples were further processed whose absorption ratio at 260 nm and 280 nm was between 1.8–2.1, and whose a 28S/18S rDNA ratio was within 1.5-2. Total RNA samples (50 ng) were reverse- transcribed into double-stranded cDNA, and then in vitro transcribed in the presence of biotin labeled nucleotides using the GeneChip® 3' IVT Express Kit (Affymetrix). The quality and quantity of the biotinylated cRNA was determined using the NanoDrop ND 1000 and Bioanalyzer 2100. Biotin-labeled cRNA samples (7.5 µg) were then fragmented randomly to 35- 200 bp at 94°C in Fragmentation Buffer (Affymetrix) and were mixed in 100 µl of Hybridization Mix (Affymetrix) containing Hybridization Controls and Control Oligonucleotide B2 (Affymetrix Inc., P/N 900454). Samples were then hybridized to GeneChip® Yeast Genome 2.0 Arrays for 16 h at 45°C. Subsequently, arrays were washed using the Affymetrix Fluidics Station 450 FS450_0003 protocol. An Affymetrix GeneChip Scanner 3000 (Affymetrix Inc.) was used to measure the fluorescent intensity emitted by the labeled target. Raw data processing was performed using the Affymetrix AGCC software. After hybridization and scanning, probe cell intensities were calculated and summarized for the respective probe sets by means of the MAS5 algorithm (Hubbell et al., 2002). To compare the expression values of the genes from chip to chip, global scaling was performed and a signed-rank call algorithm was applied (Liu et

130 Appendix to Chapter 2 al., 2002). The normalized signal intensities were then logarithmically (base 2) transformed.

The microarray analysis was carried out for 2 replicates each for the ancestral strain in YPG and YPGN (4 in total) and for 4 replicate population samples (1 for S1,2 for S2 and 1 for S3) each in YPG and YPGN for the S lines (8 in total). For identifying “shared” genes, genes that respond to NaCl in a similar manner between the ancestral strain and the evolved lines, the average intensities were calculated for the replicates in YPG (for the ancestral strain and the evolved lines jointly) and for the replicates in YPGN (for the ancestral strain and for the evolved lines jointly). The genes with significant up-regulation or down-regulation were identified using a t- test at p=0.05, False Discovery Rate (FDR)<10% and |log2(fold change)|≥1.

To identify genes that whose basal expression or whose regulation changed (see Results), the average intensities were calculated for the ancestral strain in YPG and YPGN, and for the S lines in YPG and YPGN. For each gene, the following two Z-scores were then calculated.

(SAYPGN−−NC YPGN) ( SA YPG−NC YPG ) Z− scoreforchangeinregulation() Zr = ⎛⎞ Variance 0.09+∑⎜⎟ ⎝⎠No. of replicates and,

(SAYPG− NC YPG ) Z− scoreforbasalexpressionchange() Zb = ⎛⎞ Variance 0.09+∑⎜⎟ ⎝⎠No. of replicates where, SYPGN is the average log2-transformed expression level of the focal gene in the S lines in YPGN medium, ANCYPGN is the average log2-transformed expression level of the focal gene in the ancestral strain in YPGN medium, SYPG is the average log2-transformed expression level of the focal gene in the S lines in YPG medium, and ANCYPG is the average log2-transformed expression level of the focal gene in the ancestral strain in YPG medium.

The Z-scores take into account the mean change in fold expression of a particular gene, and also the variation in expression of the gene among the replicates under different conditions (Mukhopadhyay et al., 2006). In both calculations, 0.09 is a pseudo-variance (Mukhopadhyay et al., 2006). The variance in the denominator of the expressions above is calculated for the expression level of a gene within replicate arrays in the ancestral strain and within replicate arrays in the evolved lines, in a particular medium. It is divided by the number of replicate arrays and summed over the different media in which the expression values were measured, as indicated by the summation sign in the formula (Mukhopadhyay et al., 2006). For calculation of Zr, the variances of the expression level of a gene are calculated within the replicate arrays in each of the following cases: in the ancestral strain in YPG, in the S lines in YPG, in the ancestral strain in YPGN and in the S lines in YPGN. For calculation of Zb, the variances of a gene are calculated within the replicate arrays in the ancestral strain in YPG and in the S lines in YPG. The formulae used here are modified from (Mukhopadhyay et al., 2006) for several reasons. First, even a small change in gene expression might have an important contribution towards salt tolerance. Hence, the pseudo-variance is set lower, so that genes with small expression changes

131 Appendix to Chapter 2 can be detected. Secondly, the variance is divided by the number of replicates for each condition, as the actual variance would be proportional to the sample variance and inversely proportional to the number of replicates. Finally, the aim of this analysis was to detect the changes in expression of genes that are common across all the salt evolved (S) lines. A gene might not change its expression level to the same extent in all evolved lines, thus generating variance in expression among the replicates. Hence, the Z-score threshold for identifying differentially expressed genes was set lower than the Z-score threshold used in (Mukhopadhyay et al., 2006).

It should be noted that some genes with change in regulation could be miscategorized as follows. First, a gene could be mistakenly classified as having changed its regulation if it shows a change in its basal expression, and if its level of induction or repression is unchanged (Figure 5c and d). To remove such genes from the set of genes whose regulation changed, we subtracted the difference in basal expression (in the absence of NaCl) between ancestral and evolved lines from the expression change in response to NaCl, when calculating the Z-value Zr for a change in regulation. Second, a gene might have changed both its basal expression and be regulated differently (e.g., Supplementary figure 3 b,c,d and e). Subtracting the difference in basal expression change in calculating the Z-score Zr avoided this confounding factor as well.

Genes whose absolute Z-scores exceeded a value of 1.5 were considered to be differentially expressed. The differentially expressed genes were then grouped into different classes using the CYGD functional classification for yeast genes (Güldener et al., 2005).

The t-test to determine expression changes was not used here for two reasons. Firstly, the number of replicate arrays for the ancestral strain are small (2 replicates). And secondly, as mentioned above, a gene might be up-regulated or down-regulated to different levels in the different S lines, thus generating higher variance in expression level. A t-test would give high p- values for such a gene, and would thus cause the gene to be discarded from subsequent analysis, even though it showed the same qualitative expression change in all replicate lines.

Whole Genome Sequencing and SNP identification The ancestral strain, as well as population samples of the NaCl evolved lines at generation 300 were sequenced at approximately 10X coverage using next generation pyrosequencing (Margulies et al., 2005) (genome sequencer FLX LR system, Roche). The line S3 was further sequenced to a total of ~50X coverage. To identify SNPs in these regions, the read sequences were mapped to the yeast reference genome, as obtained from the UCSC Genome Browser; http://genome.ucsc.edu/) using blastn (Altschul et al., 1990) with default parameters. The reads with less than 90% identity to the reference genome, or with a blast E-value greater than 10-4 or both were discarded. For further analysis, only the regions of the genome that were covered by at least three independent reads in both the ancestral strain and the evolved line were considered. These regions covered ~90% of the genome in all the S lines. The reads mapping to the same position of the genome were then aligned using the Multiple Sequence Alignment program MUSCLE with default parameters (Edgar, 2004) to identify single nucleotide polymorphisms (SNPs) and indels. To identify SNPs and insertion or deletions (indels) present in evolved lines but not in the ancestral strain, the sequence reads of evolved lines were aligned together with reads from the ancestral strain during multiple sequence alignment, preserving information about the origin of each read. Finally, multiple sequence alignments were checked manually to remove

132 Appendix to Chapter 2 any alignment artifacts.

SNPs candidates were chosen based on the following main criteria for a changed nucleotide: (1) it does not occur in the ancestral strain, (2) at least 3 independent (non-duplicate) reads in an evolved line carry it, and (3) it is more than 5bp away from any homo-polymeric stretch longer than 4bp. The second criterion reduces the impact of sequencing error, whereas the third criterion deals with the sequencing and alignment problems associated with long homo-polymeric stretches (Lynch et al., 2008). In addition, for calling small indels, a fourth criterion was used, namely that a majority (>50%) of the reads show the insertion or deletion. This criterion aims to ensure a low error rate in calling small indels, because pyro-sequencing has a known high error rate associated with indels (Margulies et al., 2005). In addition to applying these criteria, two independent analyses were carried out, one for reads with base quality scores Q≥20, and one for Q>0 (Margulies et al., 2005). The higher base quality score ensures higher reliability in SNP calling, whereas a base quality score cutoff of Q>0 allows identification of all putative mutations. We note that increasing the base quality scores beyond Q=20 would not have revealed any new SNPs or indels. In addition, firstly, reads mapping to multiple regions of the yeast genome and secondly, duplicate reads, that is, reads with same start position were also considered for analysis, to ensure that no potential mutations were missed. For a read mapping to multiple regions of the genome and showing a particular SNP (or indel), the contribution of the read to the SNP (or indel) count was considered to be the inverse of the number of genomic regions that the read mapped to in the genome. The approximate frequency of a mutation was estimated as the ratio of the number of reads with the mutation to the sequence coverage of that particular base position in the focal evolved line.

For a given candidate SNP, population samples of the evolved lines were sequenced by PCR if at least 50 percent of reads contained the mutation. For mutations that occurred in fewer than 50 percent of reads, both population samples and individual clones were sequenced by PCR. The number of clones sequenced was greater than the inverse of the fraction of reads that carried the mutation. PCR sequencing was done in both directions using forward and reverse primers. For sequencing, the PCR sequencing signal chromatograms were used for identification of likely polymorphisms, and for a crude estimation of the frequency of the mutation in the population.

Reconstruction of wildtype and mutant MOT2 gene in ancestral strain To reintroduce the MOT2 mutation into the ancestral yeast strain, a well established selection- counter selection procedure was used (Scherer & Davis, 1979; Boeke et al.,1984; Widlund & Davis, 2005; Akada et al., 2006). Wildtype and mutant copies of the MOT2 gene were isolated by PCR from the clones of the line S2, and were confirmed by PCR sequencing. The PCR products were then treated with the restriction enzyme EcoRI to generate staggered ends, and ligated into the plasmid vector pRS306 (Sikorski & Hieter, 1989) that had been linearized with EcoRI. The ligation product was transformed into competent E. coli HB101 cells and selected on plates containing 50μg/ml ampicillin. Plasmids were isolated using the QIAGEN plasmid isolation kit, linearized with BglII, and transformed into competent ancestral yeast cells. The cells were then selected in plates lacking uracil, which selects for plasmid integration into the genome. The surviving clones were counter-selected in ura+ plates containing 5-fluoro-orotic acid (5-FOA) at a final concentration of 1mg/ml, to promote excision of vector sequences from the genome, such that only the MOT2 insert would be preserved. The correct insertion of

133 Appendix to Chapter 2 wildtype and mutant MOT2 sequences into the ancestral yeast genome was confirmed by PCR sequencing. The fitness of the reconstructed ancestral strains were then determined using FACS protocol as discussed before.

Pulsed-field gel electrophoresis (PFGE) To identify large scale chromosomal rearrangements, two clones from each of the evolved lines (S1, S2 and S3) and the ancestral strain were analyzed by PFGE. Agarose plugs were prepared with the CHEF Yeast Genomic DNA Plug Kit (Bio-Rad) according to the manufacturer’s protocol. For restriction enzyme digestion, agarose plugs were incubated with NotI restriction enzyme for 16hrs at 37oC mixture in an appropriate restriction buffer. The digested or undigested plugs were then loaded into the wells of 1% agarose gels. Electrophoresis was carried out in a CHEF-DRIII system (Bio-Rad). For separating NotI digested chromosomes, the gel was run at 6 V/cm and at 15°C for 24 hours, where the switch times were ramped from 1-25 seconds with MidRange I PFG Marker (New England Biolabs Catalog# N3551S). For the undigested chromosomes, the gel was run with Yeast Chromosome PFG Marker ( New England Biolabs Catalog# N0345S) at 6 V/cm and at 15°C for 26 hours, where the switch time was 70 seconds for the first 15 hours and 120 seconds for the next 11 hours. The gels were stained using ethidium bromide (EtBr) and photographed.

Gene Amplifications and Deletions from Next Generation Sequencing data To detect amplifications and deletions, we used a segmentation algorithm based on moving CR averages that we now describe. For each base of the genome, the CR was calculated based on the following formula –

⎛⎞Coverage in the evolved line at that position⎛ Coverage in the ancestral strain at that position ⎞ CR = log2 ⎜⎟− log2 ⎜⎟ ⎝⎠Average read coverage in the evolved line⎝ Average read coverage in the ancestral strain ⎠

The CR was set to zero for a region where either the coverage in the evolved line was zero or the coverage in the ancestral line was zero or both were equal to zero. For a genomic region with no copy number change, the CR is expected to be close to zero, whereas for a region with increased copy number in the evolved line, the CR would be above zero and for a deleted region in the evolved line, the CR would be negative. The whole genome was first partitioned into regions of positive CR and regions of negative CR (without using any threshold value). If a region’s CR stayed above or below zero for less than 250bp, the region was discarded from the analysis. The cutoff length was chosen as 250bp because the median length of the reads is ~250 bp for all our sequence data except a part of S3 (which has a median length of ~450bp that would increase the minimum threshold length). It would therefore be extremely difficult to detect duplications or deletions shorter than 250bp by this approach.

The average read coverages per base were 9.69, 9.03, 8.57 and 46.59 for the ancestral strain, line S1, line S2 and line S3, respectively. For calculation of read coverage, duplicate reads were taken into account (Chiang et al., 2009). The repetitive regions in the yeast genome were discarded from this analysis, as these regions might show coverage variation caused by non- unique alignments of reads to genomic regions. We are aware that this causes a limitation of our work, because the repetitive regions are prone to change their copy numbers and would not be detected.

134 Appendix to Chapter 2

In each of the remaining partitioned regions, the mean CR in a window of length X base pairs was calculated from the CR value of each base pair. More precisely, this window was slid from the starting position of the partitioned region through the end of the region in one base pair increments, and the mean CR was calculated for each of these sliding windows. The window length x was varied from 250 bp up to the length N of the region in one base pair increments. If the mean CR value for a window of length X, covering the genomic interval (A,A+X) exceeded the threshold values of ≥1.0 or ≤-1.0, the region (A,A+X) was a candidate for a duplication and deletion, and stored for further analysis. If two regions, (A,A+X) and (A+X+Y,A+X+Y+Z) that had both either amplifications or deletions were separated by one or more segments in the region (A+X,A+X+Y) that did not have any amplification or deletion, these two regions were merged into a single segment (A,A+X+Y+Z), and the moving window procedure just described was performed for the merged region. Among all candidate indels, this approach identified the longest window of length X whose mean CR exceeded the threshold was retained as an indel. In case of multiple non-overlapping windows each with the CR mean exceeding the threshold, all of the windows were retained as separate copy number variants (supplementary figure 10).

There are several factors in addition to indels that could cause variation in read coverage across the genome and between samples, and that could thus affect the CR ratio and bias our results. One major factor that has been described before to affect read coverage is the G+C content (%G+C) of genomic regions (Dohm et al., 2008; Chiang et al., 2009; Harismendy et al., 2009). However, in our case, no significant association between %G+C content and the read coverage was observed in the ancestral strain as well as in the evolved S lines (Supplementary figure 6). The correlation coefficients between the %G+C content and read coverages were -0.0207 (p- value=0.4713), -0.0401 (p-value=0.1620), -0.0222 (p-value=0.4385) and 0.0066 (p-value=0.818) for the ancestral strain, line S1, line S2 and line S3, respectively.

There are numerous other factors, such as sample preparations (e.g., nebulization, binding of adapter to the fragments), concentration of genomic fragments, emulsion PCR efficiency, pyro- sequencing efficiency etc.,that could cause variation in coverage between our samples. If any of these factors are correlated with %G+C content, those would not influence our analysis, because %G+C content has no significant association with read coverage in any of the samples. In a systematic study of read coverage biases, Harismendy et al. (Harismendy et al., 2009) have investigated some of the factors that might be associated with variation in the read coverage. They showed that factors such as pooling of samples and amplicon bias could explain only a small part of the variation in the read coverage in 454 sequencing. Also, the pattern of non- uniform sequence coverage was reproducible across four different samples sequenced with a single sequencing technology, but this pattern varied from one sequencing technology to other. The last observation suggests that many of these factors, whose effects can not be estimated directly, do not vary significantly across samples. Our analysis compares samples from the evolved lines and the ancestral strain, and if the observations from ref. Harismendy et al., 2009 hold, these factors would not affect our analysis. Beyond these general observations, we note that read coverage biases are not well understood, and hence can not yet be estimated from the 454 sequencing data. They are a potential limitation of our analysis.

One of the major differences of our data from the analysis in ref. Harismendy et al., 2009 is the

135 Appendix to Chapter 2 average read coverage per base of the genome. The genomes in Harismendy et al., 2009 were sequenced to a saturating level of sequence coverage, whereas the sequence coverages in our samples are below the saturating coverage except for one sample (Line S3). Low sequencing coverage introduces random sampling effects which could cause variations between sequencing coverages of the different samples in our experiment. Random sampling can affect our analysis in two different ways. Firstly, it can give rise to falsely called amplifications and deletions due to fluctuations in read coverage. Secondly, it can hinder the detection of true copy number variants, especially the small ones, due to insufficient coverage. To address the first issue, randomization analyses were performed to estimate the chance of falsely calling amplification and deletions because of random fluctuations in read coverage (false positive rate or FPR). To analyze the effect of random sampling on detection of true variants, artificial amplifications and deletions were spiked into the simulated datasets. The sensitivity of our algorithm to detect such amplifications and deletions was estimated from these datasets. We next describe these analyses in greater detail.

False Positive Indels To estimate the false positive rate due to random sampling, three randomizations were carried out for the three evolved S lines (one randomization per line), where fragments of length 250bp were sampled from all possible reads of length 250bp from the reference yeast genome with equal probability, until the average read coverage per base observed in the actual sequence data had been reached. For these three randomized or simulated next generation sequence data sets, putative amplifications and deletions of at least 250bp, were determined exactly as for the real data, using the segmentation algorithm we described above. This process of randomization and identification of amplifications/deletions was repeated 1,000 times for each evolved line to get an estimate of, first, the number of false positive copy number changes per sequence data set, second, the length distributions of falsely called amplifications or deletions and third, the distribution of actual changes in copy numbers of falsely called amplifications and deletions.

Supplementary figure 7b shows the length distribution of falsely called amplifications and deletions (from the randomized data) and the same distribution for the amplifications and deletions from the real data. Supplementary figure 8b shows the copy number distribution of the falsely called amplifications and deletions (from the randomized data ), along with the copy number distribution of amplifications and deletions our algorithm detected from the real data.

We call such copy number changes "apparent" for the following reason. For deletions, a change in copy number we detect actually signifies the population frequency of a deletion rather than the actual change in the copy number, because a region with a single copy in the genome can not be deleted more than once in a haploid genome. If a deletion is present in the population at a frequency 1, then there would not be any read seen in that region in the evolved line. Thus, if a deletion is present at a frequency of 0.5, the number of reads in these regions would be approximately half the number of reads in the ancestral strain. In general, if the copy number of a deleted region shows an apparent m fold decrease in the evolved line compared to the ancestral line, the population frequency of the deletion could be approximately estimated as (1-1/m). Similarly, the copy number change in duplications is also a combination of a duplication’s population frequency and the actual copy number change in it.

136 Appendix to Chapter 2

The estimates on the number of falsely called amplifications and deletions from the simulated datasets were used to determine a likelihood value (L) for each of the amplified or deleted regions in the real data. This likelihood value is based on the size of the actual amplification/deletion and its change in copy number. The likelihood value L describes the likelihood that an amplification or deletion from the real data is an artifact of random sampling. In the simulated datasets, the number of falsely called copy number variants itself represents the likelihood value, as none of these variants are true amplifications or deletions. To control the number of false positive amplifications and deletions in the real data, the following procedure was used. A threshold in the likelihood value was set (Lthr). The amplifications and deletions with L≥Lthr were considered false positives and discarded from our analysis. For a particular Lthr, the remaining number of actual amplifications and deletions from the real data (Nreal) and the remaining number of falsely called amplifications and deletions from the simulated data (Nsim) were calculated. Finally, Lthr was adjusted such that Nsim/Nreal<0.1, implying a false positive rate (FPR) of less than 10%.

False Negative Indels As mentioned above, random sampling can also hinder the detection of true copy number variants. Hence, it becomes necessary to estimate, first, the impact of random sampling on the detection of true amplifications and deletions and, second, the sensitivity of our segmentation algorithm to detect those amplifications and deletions from such data. To this end, amplifications and deletions were spiked into our simulated sequencing data. To do so, the above randomization method was used, but with one important difference. The probability of choosing reads in some of genomic regions of defined length was made higher or lower to represent amplifications or deletions, respectively. This probability reflected increases or decreases in copy number. For example, to generate an amplified region in the evolved line with copy number m compared to a single copy in the ancestral strain, the probability of sampling the reads in the amplified region in the evolved line is m times the sampling probabilities of the reads in the non-amplified region, whereas the sampling probabilities of all the reads in the ancestral line was set to be equal. Similarly, for generating a deletion where the copy number was reduced by m fold in the evolved line, the probability of sampling the reads in the deleted region in the evolved line was set m times lower than the sampling probabilities of the reads in the non-deleted regions. (Our earlier point that copy number changes in deletions reflect population frequencies of a deletion applies here as well.)

In this spiking approach, simulated sequence datasets were generated through randomization as above, with amplifications and deletions of different lengths and different copy numbers spiked into the evolved lines. For each evolved line, a single copy number variant of particular length and copy number change was introduced in the genome. The same segmentation algorithm was then applied to detect the copy number variant. This process was iterated 100 times for each of the amplifications and deletions of various lengths and changes in copy number. For a variant of particular length and a particular copy number, the sensitivity of the algorithm was defined as the number of times the algorithm was able to correctly identify the region of spiked-in amplification or deletion, and the type of change (amplification or deletion) out of these 100 iterations.

Supplementary figure 9 shows the sensitivity of the segmentation algorithm as estimated by this

137 Appendix to Chapter 2 approach. The sensitivity for detecting amplification of the shortest lengths (250 bp) is very low (7% for a doubling of copy number in line S1), whereas detecting deletion of that length is moderate (70% for a deletion with frequency of ~0.5 in line S1). The sensitivity increases with an increase in the size of the amplified or deleted region, and also with increase in copy number (for amplifications) or frequency (for deletions). For example, in line S1, the algorithm correctlys detect amplifications of a 2.25kbp region that double the regions copy number 85 percent of the times, and 2.25kbp amplifications that triple copy number 100 percent of the times.

Ploidy determination of the evolved lines To estimate the ploidy level of the evolved lines in comparison to the ancestral strain, the ancestral strain and population samples of the evolved lines were grown for 24 hours at 30oC. For each of the cultures, the cell density was estimated using Haemocytometer. Genomic DNA was isolated from a defined number of cells following the protocol described in (Sambrook & Russell, 2001, Chapter 6). DNA quantification was carried out with a NanoDrop ND-1000 spectrophotometer, followed by a calculation of DNA content per cell.

Cell size measurement Population samples of ancestral and evolved cells were grown in YPG medium up to mid-log phase. The cell density was adjusted to be the same for all the samples (~1x107cells/ml). 20ul of cells were then loaded onto disposable counting chambers and the cell sizes, were measured using a Cellometer M10 (Nexcelom Biosciences), following the manufacturer's protocol.

Supplementary references Akada, R., Kitagawa, T., Kaneko, S., Toyonaga, D., Ito, S., Kakihara, Y. et al. 2006. PCR-mediated seamless gene deletion and marker recycling in Saccharomyces cerevisiae. Yeast 23: 399-405.

Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. 1990. Basic local alignment search tool. J. Mol. Biol. 215: 403-410.

Boeke, J.D., LaCroute, F. & Fink, G.R. 1984. A positive selection for mutants lacking orotidine-5'-phosphate decarboxylase activity in yeast: 5-fluoro-orotic acid resistance. Mol. Gen. Genet. 197: 345-346.

Brachmann, C.B., Davies, A., Cost, G.J., Caputo, E., Li, J., Hieter, P. et al. 1998. Designer Deletion Strains derived from Saccharomyces cerevisiae S288C: a Useful set of Strains and Plasmids for PCR-mediated Gene Disruption and Other Applications. Yeast 14: 115-132.

Chiang, D.Y., Getz, G., Jaffe, D.B., O'Kelly, M.J., Zhao, X., Carter, S.L. et al. 2009. High-resolution mapping of copy-number alterations with massively parallel sequencing. Nat. Methods 6: 99-103.

Dohm, J.C., Lottaz, C., Borodina, T., & Himmelbauer, H. 2008. Substantial biases in ultra-short read data sets from high-throughput DNA sequencing. Nucleic Acids Res. 36: e105.

Edgar, R.C. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792-1797.

Ghaemmaghami, S., Huh, W.K., Bower, K., Howson, R.W., Belle, A., Dephoure, N. et al. 2003. Global analysis of protein expression in yeast. Nature 425: 737-741.

Goffeau, A., Barrell, B.G., Bussey, H., Davis, R.W., Dujon, B., Feldmann, H. et al. 1996. Life with 6000 Genes. Science 274: 546,563-567.

138 Appendix to Chapter 2

Güldener, U., Münsterkötter, M., Kastenmüller, G., Strack, N., van Helden, J., Lemer, C. et al. 2005. CYGD: the Comprehensive Yeast Genome Database. Nucl. Acids Res. 33: D364-D368.

Harismendy, O., Ng, P.C., Strausberg, R.L., Wang, X., Stockwell, T.B., Beeson, K.Y. et al. 2009. Evaluation of next generation sequencing platforms for population targeted sequencing studies. Genome Biol. 10: R32.

Hubbell, E., Liu, W-M. & Mei, R. 2002. Robust estimators for expression analysis. Bioinformatics 18: 1585-1592.

Lenski, R.E., Rose, M.R., Simpson, S.C. & Tadler, S.C. 1991. Long-Term Experimental Evolution in Escherichia coli. I. Adaptation and Divergence During 2,000 Generations. The American Naturalist 138: 1315-1341.

Liu, W-M., Mei, R., Di, X., Ryder, T.B., Hubbell, E., Dee, S. et al. 2002. Analysis of high density expression microarrays with signed-rank call algorithms. Bioinformatics 18: 1593-1599.

Lynch, M., Sung, W., Morris, K., Coffey, N., Landry, C.R., Dopman, E.B. et al. 2008. A genome-wide view of the spectrum of spontaneous mutations in yeast. Proc. Natl. Acad. Sci. USA 105: 9272-9277.

Margulies, M., Egholm, M., Altman, W.E., Attiya, S., Bader, J.S., Bemben, L.A. et al. 2005. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437: 376-380.

Mukhopadhyay, A., He, Z., Alm, E.J., Arkin, A.P., Baidoo, E.E., Borglin, S.C. et al. 2006: Salt Stress in Desulfovibrio vulgaris Hildenborough: an Integrated Genomics Approach. J. Bacteriol. 188: 4068-4078.

Sambrook, J. & Russell, D.W. 2001. Molecular Cloning: A laboratory Manual, 3rd edn. Cold Spring Harbor Laboratory Press, New York.

Scherer, S. & Davis, R.W. 1979. Replacement of chromosome segments with altered DNA sequences constructed in vitro. Proc. Natl. Acad. Sci. USA 76: 4951-4955.

Sikorski, R.S. & Hieter, P. 1989. A system of shuttle vectors and yeast host strains designed for efficient manipulation of DNA in Saccharomyces cerevisiae. Genetics 122: 19-27.

Widlund, P.O. & Davis, T.N. 2005. A high-efficiency method to replace essential genes with mutant alleles in yeast. Yeast 22: 769-774.

139 Chapter 3

Yeast adapts to a changing stressful environment by evolving cross-protection and anticipatory gene regulation

This chapter was published as “Dhar R, Sägesser R, Weikert C, Wagner A, 2012. Yeast adapts to a changing stressful environment by evolving cross-protection and anticipatory gene regulation. Molecular Biology and Evolution, Advance online access.” http://mbe.oxfordjournals.org/content/early/2012/11/28/molbev.mss253.long

Abstract Organisms can protect themselves against future environmental change. An example is cross- protection, where physiological adaptation against a present environmental stressor can protect an organism against a future stressor. Another is anticipation, where an organism uses information about its present environment to trigger gene expression and other physiological changes adaptive in future environments. ‘Predictive’ abilities like this exist in organisms that have been exposed to periodic changes in environments. It is unknown how readily they can evolve. To answer this question, we carried out laboratory evolution experiments in the yeast Saccharomyces cerevisiae. Specifically, we exposed three replicate populations of yeast to environments that varied cyclically between two stressors, salt stress and oxidative stress, every 10 generations, for a total of 300 generations. We evolved six replicate control populations in only one of these stressors for the same amount of time. We analyzed fitness changes and genome-scale expression changes in all these evolved populations. Our populations evolved asymmetric cross protection, where oxidative stress protects against salt stress, but not vice versa. Gene expression data also suggests the evolution of anticipation, as well as basal gene expression changes that occur uniquely in cyclic environments. Our study shows that highly complex physiological states that are adaptive in future environments can evolve on very short evolutionary time scales.

Introduction Organisms are continually challenged by changing environments. They thus have evolved physiological adaptations to cope with such change. These mechanisms include genome-scale

140 Chapter 3

gene expression changes on short, physiological time scales. Such changes have been most thoroughly studied in stressful environments, such as environments characterized by osmotic stress, oxidative stress, or temperature stress (Jamieson 1998; Gasch et al. 2000; Posas et al. 2000; Rep et al. 2000; Causton et al. 2001; Petersohn et al. 2001; Yale and Bohnert 2001; Zheng et al. 2001; Weber and Jung 2002; Toledano et al. 2003; Weber et al. 2005; Gasch 2007; Chen et al. 2008; Jozefczuk et al. 2010). Some genes respond to multiple stressors in a ‘general’ or ‘common’ environmental stress response, whereas others respond to specific stressors (Gasch et al. 2000; Causton et al. 2001; Petersohn et al. 2001; Higgins et al. 2002; Warringer et al. 2003; Chinnusamy et al. 2004; Phadtare and Inouye 2004; Weber et al. 2005; Yoshikawa et al. 2009; Rutherford et al. 2010).

Some environmental stressors may be extremely rare, whereas others may occur more frequently, and yet others may even recur on a regular basis, such as fluctuation in temperature and nutrient level in circadian cycles (Wijnen and Young 2006; van der Linden et al. 2010), or changes in soil properties over longer periods of time (Fierer et al. 2003; Bapiri et al. 2010; DeAngelis et al. 2010). Several physiological mechanisms can help organisms prepare for recurring stressors. First, rare and stochastic recurrences may require stochastic switching of gene expression (Thattai and van Oudenaarden 2004; Kussell and Leibler 2005; Blake et al. 2006; Beaumont et al. 2009; Salathé et al., 2009; Gaál et al. 2010, Rainey et al. 2011). This is a form of bet-hedging (Montgomery 1974; Philippi and Seger 1989; de Jong et al. 2011) where different, otherwise identical individuals of a population express different genes, which may protect their carriers against specific stressors (Attfield et al. 2001, Booth 2002; Meyers and Bull 2002; Sumner and Avery 2002). While most of a population would be vulnerable to the stressor, a small part of it would be protected.

A second relevant mechanism is evolved cross-protection. In cross-protection, one environmental stressor protects cells against a second stressor (Völker et al. 1992; Cullum et al. 2001; Greenacre and Brocklehurst 2006; Berry and Gasch 2008; Berry et al. 2011). For example, exposure of yeast cells to salt stress can improve their fitness in oxidative stress (Berry and Gasch 2008; Berry et al. 2011). Cross-protection is by no means universal among stressors. That is, not every stressor cross-protects against other stressors (Völker et al. 1992; Berry and Gasch

141 Chapter 3

2008; Berry et al. 2011). We know little about whether and how cross-protection can change in evolution.

A third mechanism is anticipation. Here, a present environment serves as a signal for future environmental change (Tagkopoulos et al. 2008; Wolf et al. 2008; Mitchell et al. 2009). Cells use this signal to pre-adapt to the future change. An example involves the environmental changes that E. coli typically experiences as it is ingested by a mammal. Specifically, ambient temperature increases rapidly immediately after ingestion, and then ambient oxygen levels decrease as cells enter the gastrointestinal tract. E. coli cells exposed to high temperatures express genes that may be adaptive for the subsequent drop in oxygen levels (Tagkopoulos et al. 2008). Another example is exposure to lactose, which helps prepare E. coli cells for exposure to maltose. This anticipation reflects the temporal order in which these sugars appear in the mammalian digestive tract (Mitchell et al. 2009). Similarly, exposure of yeast cells to heat stress helps them prepare for oxidative stress. The two stressors may follow one another during the wine production process for which yeasts are indispensable (Mitchell et al. 2009). Analogous predictive mechanisms are present in Candida albicans and Vibrio cholerae (Schild et al. 2007; Rodaki et al. 2009). Theoretical work has studied the costs and benefits of such adaptive predictions (Mitchell and Pilpel 2011).

In contrast to well-studied physiological adaptations to stressful environments, the process of evolutionary adaptations to such environments is less well studied (Völker et al. 1992; Jamieson 1998; Gasch et al. 2000; Posas et al. 2000; Rep et al. 2000; Causton et al. 2001; Petersohn et al. 2001; Yale and Bohnert 2001; Zheng et al. 2001; Weber and Jung 2002; Toledano et al. 2003; Weber et al. 2005; Greenacre and Brocklehurst 2006; Gasch 2007; Chen et al. 2008; Jozefczuk et al. 2010; Ferea et al. 1999; Cullum et al. 2001; Cooper et al. 2003; Alcántara-Díaz et al. 2004; Fong et al. 2005; Pelosi et al. 2006; Bennett and Lenski 2007; Hughes et al. 2007a, 2007b; Jasmin and Kassen 2007; Cooper and Lenski 2010; Rudolph et al. 2010; Samani and Bell 2010; Dhar et al. 2011; Goldman and Travisano 2011). We here use laboratory evolution to study evolutionary adaptation of yeast cells to changing stressful environments, an especially poorly studied subject. Among the small amount of relevant work is one study that asked how E. coli cells adapt evolutionarily to fluctuating acidity. It showed that environmental generalists emerge

142 Chapter 3

which are better adapted to varying acidity than their ancestors (Hughes et al. 2007b), but left the mechanisms behind their adaptation open. Another study showed that anticipation can readily be reduced on short evolutionary time scales (Tagkopoulos et al. 2008), but left open the question whether anticipation can arise just as easily.

In our experiment, we chose two cyclically varying environments that are characterized by different stressors. Specifically, the stressors we used are salt stress caused by sodium chloride, and oxidative stress caused by hydrogen peroxide. Sodium chloride (NaCl) causes hyperosmotic stress to yeast cells. In addition, it imposes hyperionic stress on cells due to the presence of high concentrations of Na+ and Cl- ions, and necessitates ion detoxification mechanisms (Apse et al. 1999; Maathuis and Amtmann 1999; Serrano et al. 1999; Hohmann 2002). Hydrogen peroxide generates reactive oxygen species that are responsible for oxidative damage through the oxidation of metabolites, enzymes, and DNA, thus inhibiting metabolism and growth (Jamieson 1998; Toledano et al. 2003).

We evolved three yeast populations in parallel, in an environment characterized by cyclically varying stressors. In the first part of each environmental cycle, we exposed the cells to salt stress (0.5M sodium chloride) for 10 generations. In the second part, we exposed them to oxidative stress (1mM hydrogen peroxide) for another 10 generations. We alternated between these environments for a total of 300 generations, i.e., for 15 cycles, and called the yeast populations thus evolved SO populations (figure 1a). In control experiments, we evolved three yeast populations under continuous salt stress for 300 generations (S populations), and yet another three populations under continuous oxidative stress for 300 generations (O populations) (figure 1a).

To study the evolutionary adaptation of all nine populations to our stressors, we carried out competition assays to measure the fitness of evolved populations relative to the ancestral strain, using a yeast strain labeled with green fluorescent protein (see Materials and Methods). To understand functional genomic changes that may be associated with these fitness changes, we used gene expression microarrays to study transcriptome-wide gene expression changes in the evolved and ancestral populations.

143 Chapter 3

Figure 1: Experimental design and hypothetical examples of gene expression changes (a) We evolved 9 yeast populations in parallel in the laboratory. Three of these populations (SO populations) were exposed to a fluctuating environment, where the cells were first exposed to salt stress for 10 generations and then to oxidative stress for the next 10 generations. We repeated this cycle 15 times for a total of 300 generations. We also evolved six other populations as controls. Three of these populations were continuously exposed to salt stress for 300 generations (S populations), while the other three were exposed to continuous oxidative stress for 300 generations (O populations). Panels (b-d) show physiological expression changes of a hypothetical gene before (solid line) and after (dotted line) evolutionary adaptation. The x-axis shows time (on a physiological time scale) after the organism has been exposed to a stressor; the y-axis shows the expression level of the gene. (b) basal increase in expression of a gene even in the absence of a stressor. (c) increase in the induction of the gene in the stressor. (d) reduction in repression of the gene in the stressor after evolution. Other changes and combinations thereof are possible (figure S2). Expression changes are shown as linear, but need not be linear.

144 Chapter 3

With these experiments, we asked several questions. Does evolutionary adaptation to salt stress and oxidative stress reflect a general stress response, or a response specific to a given stressor? Are there tradeoffs between fitness in the two stressors? And most importantly, do abilities such as cross-protection or anticipation evolve? We found that within a mere 300 generations, our populations adapted evolutionarily to the fluctuating stressors. They evolved changes in fitness and gene expression that suggest several, non-exclusive mechanisms to cope with cyclical change, including cross-protection, anticipation, and a change in basal gene expression levels.

Results Populations adapt evolutionarily in a partially stressor-specific manner Our study populations evolved increased fitness in both salt stress and oxidative stress (Figure 2). Specifically, when exposed to salt stress, the salt-evolved (S-) populations increased their fitness relative to the ancestor (w=1.28±0.04; Figure 2a). Their fitness also increased when measured in an environment without salt stress, but to a significantly lesser extent (Mann- Whitney U test, p=4x10-5; figure 2a and figure 2c; see also ref. Dhar et al. 2011). The peroxide- evolved (O-) populations also increased their fitness in oxidative stress relative to the ancestor (w=1.13±0.02; p=0.004; one sample Wilcoxon signed rank test). Their fitness increased in the absence of oxidative stress as well, but to a significantly lesser extent (Mann-Whitney U test, p=0.004; figure 2a and figure 2c). And the same holds for the SO populations, which cycle between two environments (Figure 2; Mann Whitney U tests, p=0.004 in salt stress and in oxidative stress).

A second supporting line of evidence comes from yeast populations evolved in growth medium without stressors, which are referred to as control (C) populations in Figure 2. These populations allowed us to completely rule out the possibility that the adaptations in the S, O and SO populations are solely due to adaptations to the general growth medium or due to the experimental evolution protocol. In salt stress, the S, O and SO populations had significantly higher fitness than the control populations, suggesting that the adaptations in the S, O and SO populations are partially specific to salt stress (w=1.18±0.01 for control populations; p=4.1×10-5 for comparison between S and control populations; p=0.004 for comparison between O and

145 Chapter 3

control populations; p=4.1×10-5 for comparison between SO and control populations; Figure 2a). In oxidative stress, the O and SO populations but not the S populations had significantly higher fitness than the control populations suggesting that the adaptations in O and SO populations are partially oxidative stress specific (w=1.05±0.02 for control populations; p=0.26 for comparison between S and control populations; p=4.1×10-5 for comparison between O and control populations; p=4.1×10-5 for comparison between SO and control populations; Figure 2b). Taken together, these observations show that evolutionary adaptation has occurred, and that part of that adaptation reflects the presence of a stressor in the environment.

Previous studies on the physiological response of yeast and other organisms (e.g., E. coli, B. subtilis) have shown that organisms have a general stress response that is independent of the stressors that they are exposed to (Gasch et al. 2000; Causton et al. 2001; Petersohn et al. 2001; Weber et al. 2005;). Do perhaps all evolutionary adaptations that occurred in our populations affect only this general stress response? This possibility leads to two predictions, neither of which is born out by our data.

First, we would expect that our S and O populations would not differ in fitness in any of the stressors. However, this is not the case. Specifically, consider the S populations which we evolved under salt stress. Their fitness increased significantly relative to the ancestor in salt stress (w=1.28±0.04; one sample Wilcoxon signed rank test, p=0.004; Figure 2a), an increase that is significantly higher than the fitness increase in O populations under salt stress (Mann Whitney U test, p=0.003). Analogously, O populations show a significant fitness increase in oxidative stress (one sample Wilcoxon signed rank test, p=0.004). This increase is also significantly higher than the fitness increase in S populations under oxidative stress (Mann Whitney U test, p=0.0005). Taken together, these observations suggest that evolutionary adaptation to these stressors does not just affect a general stress response.

A second prediction is that SO populations should not differ at all in their fitness from S populations and from O populations. This prediction is also wrong. The cycling populations increased their fitness under both salt stress (w=1.24±0.05; Wilcoxon signed rank test, p=0.004) and oxidative stress (w=1.13±0.04, p=0.004) compared to the ancestor. Importantly, under

146 Chapter 3

oxidative stress, the SO populations had a higher fitness than the S populations (w=1.04±0.05 for S populations; Mann-Whitney U test p=0.004). Under oxidative stress, their fitness was indistinguishable from that of the O populations, and under salt stress, they also had the same fitness as the S and O populations (Figure 2, Mann Whitney U tests, p>0.19). In sum, two different lines of evidence argue that part of the evolutionary adaptation we see is partially specific to the stressor applied, and does not just affect a general stress response.

Cells exposed to oxidative stress evolve cross-protection against salt stress We next turned to analyzing the changes in gene expression that accompanied evolutionary adaptation. Although parallel populations in our experiment may well differ in how individual genes change their expression, we focused on common patterns rather than idiosyncratic changes in individual populations. For this reason we pooled expression data from our replicate populations for statistical analysis.

To interpret our gene expression measurements, it is important to distinguish between two kinds of adaptation, physiological and evolutionary adaptation. Physiological adaptation refers to changes in gene expression within a short time after exposure to a stressor. In contrast, evolutionary adaptation reflects changes in gene expression that occur after long-term exposure to stressors, that is, over many generations. Changes caused by evolutionary adaptation can be further subdivided into basal expression changes and changes in regulation. Basal expression changes are changes that occur even in the absence of stressor in an evolved population relative to the ancestor, whereas regulatory changes arise through changed regulation of genes in response to stressors. Figure 1b through 1d show several hypothetical examples of the kinds of evolutionary expression changes that can occur in our evolutionary experiments. (Figure S2 contains a more exhaustive list.)

147 Chapter 3

Figure 2: Fitness analysis of the evolved populations in salt stress and in oxidative stress. The fitness of each evolved yeast population relative to the ancestral strain was measured with a competition assay against a GFP-tagged reference yeast strain using FACS. A relative fitness of one suggests that the evolved population is as fit as the ancestral strain. (a) Relative fitness of the evolved yeast populations under salt stress. All populations show increased salt-specific fitness in this medium, including O populations and control (C) populations that never experienced salt stress (w=1.28±0.04 standard deviations for S populations, w=1.22±0.02 for O populations, w=1.24±0.05 for SO populations, w=1.18±0.01 for C populations). We note that S, O and SO populations show a significantly higher fitness increase than the C populations, suggesting salt specific adaptations in these populations. (b) Relative fitness of the evolved yeast populations under oxidative stress. The O and the SO populations, but not the S populations, which never experienced oxidative stress during this experiment, show a stressor-specific fitness increase under oxidative stress (w=1.04±0.05 for S populations, w=1.13±0.02 for O populations, w=1.13±0.04 for SO populations and w=1.05±0.02 for C populations). Again, O and SO populations but not S populations show significantly higher fitness increase compared to the C populations, suggesting that at least part of the adaptations in these populations is due to adaptation to oxidative stress. (c) Relative fitness of the evolved yeast populations in medium without stressors - w=1.12±0.03 for S populations, w=1.09±0.03 for O populations, w=1.07±0.02 for SO populations and w=1.10±0.01 for C populations. In each panel, the horizontal line inside a box corresponds to the mean fitness, the height of a box corresponds to the mean±1 standard error (s.e.), and the whisker corresponds to the mean± 1 standard deviation (s.d.). An asterisk indicates a significant difference in a pairwise comparison of populations (Mann- Whitney U test).

148 Chapter 3

Berry and Gasch (Berry and Gasch 2008) showed that the exposure of yeast cells to a primary stressor can prepare them for future secondary stressors that could be different from the primary stressor. We wanted to investigate if this kind of cross-protection can evolve in our experiment, that is, whether long term exposure of yeast cells to one stressor can protect these cells against another stressor. We note that cross-protection could be symmetric or asymmetric in nature (figure 3a). In symmetric cross-protection, adaptation to stressor 1 would protect the cells against stressor 2 (1→2) and vice versa (2→1). In contrast, asymmetric cross-protection would work in only one direction, i.e., 1→2 or 2→1, but not both. In terms of our experiment, asymmetric cross-protection means that evolutionary adaptation to salt stress could protect cells against oxidative stress, whereas evolutionary adaptation to oxidative stress might not protect cells against salt stress, or vice versa.

Two lines of evidence from our experiments point to evolution of asymmetric cross-protection, where long term adaptation to oxidative stress protects yeast cells against salt stress, but not vice versa. First, the O populations show a significant fitness increase in salt compared to the ancestor (w=1.22±0.02; p=0.004), although they were never exposed to salt stress during our evolution experiment (figure 2a). Their fitness in salt is also significantly higher than their fitness in medium without stressor (Mann Whitney U test, p=4×10-5). In contrast, although the S populations show a (small but significant) fitness increase under oxidative stress relative to the ancestor (w=1.04±0.05; p=0.008), their fitness in oxidative stress is lower than in medium without stressor (Mann Whitney U test, p=0.008; figure 2b and figure 2c).Thus, this fitness increase in the S populations does not reflect an adaptation to the oxidative stressor in the growth medium. These observations suggest that previous exposure to oxidative stress protects cells against salt stress, but not vice versa. The second line of evidence comes from fitness measurements that we carried out in two different cycles of exposure to stressors. Briefly, SO and O populations show higher fitness in the O→S cycle than in the S→O cycle, suggesting evolution of asymmetric cross-protection (results S1).

149 Chapter 3

Figure 3: Symmetric and asymmetric cross-protection. (a) The types of stress cross- protection that can occur in our experiments. Symmetric cross-protection means that adaptation to stressor 1 protects the cells against stressor 2 and vice versa. In contrast, in case of asymmetric cross-protection, adaptation to only one stressor protects against the other, but not vice versa. Asymmetric cross-protection could be of two different types, as shown in the figure. (b) Venn diagram showing overlap between genes with changed regulation in oxidative stress (upper three numbers) and in salt stress (lower three numbers). The middle three numbers, in the intersection of the ellipses, correspond to genes that change regulation both in oxidative stress and in salt stress. For example, in S populations, 13 and 61 genes show changes in regulation in oxidative stress and in salt stress respectively, with only one gene that is common between them. See text for details. The purple asterisks indicate significant difference in overlap between different populations.

150 Chapter 3

Both the O and the SO populations had been exposed to oxidative stress for prolonged periods during the evolution experiment, one of them continuously, and the other multiple times. This raised the question whether such prolonged exposure is a requirement for the cross-protection we see. In other words, is the cross-protection we observed an evolutionary response to prolonged or repeated oxidative stress, or is it independent of such prolonged exposure? Our experiments allow us to distinguish between these possibilities, because the S populations were not exposed to oxidative stress during the evolution experiment. If cross-protection is an evolved feature, we would predict that it does not occur in S populations. Indeed, in these populations, there is no fitness difference between the O→S cycle (figure 4a, left-most data ) and the S→O cycle (figure 4b; left-most data) (Mann Whitney U test, p=0.1043). That is, in populations not exposed to oxidative stress during the evolution experiment, pre-exposure to oxidative stress does not increase fitness. In sum, our observations suggest that our populations evolved asymmetric cross- protection, where pre-exposure to oxidative stress can protect a population against salt stress.

Evolved asymmetric cross-protection is reflected in shared genes that change regulation

We next asked whether the evolved cross-protection we observed had also left an evolutionary signature in gene expression changes. To find out, we turned our attention to genes that changed their regulation during our evolution experiment (see Materials and Methods). The observations we made regard quantitative differences in the number of genes that change regulation in the S population, on the one hand, and in the O and SO populations, on the other hand. We first turn to the S populations, where 61 genes changed their regulation relative to the ancestral strain in the salt-evolved S populations – when the S populations are exposed to salt (figure 3b, left-most data). Many fewer genes (13 genes) changed their regulation in the S populations when these populations are exposed to oxidative stress. Importantly, the intersection of these two sets of genes is very small: only one gene (1.64 percent of 61 genes) changed regulation in response to both salt and oxidative stress (figure 3b, left-most data). We tested the null-hypothesis that this overlap could occur by chance alone, i.e., for sets of 61 and 13 genes drawn at random from the yeast genome (see methods S1). The answer is yes (randomization test p=0.131; methods S1).

151 Chapter 3

Figure 4: Fitness analysis of the evolved populations in one S→O transition and in one O→S transition. (a) Relative fitness in the S→O transition. In these competition assays, we first let yeast cells grow for 24 hours under salt stress, then for another 24 hours under oxidative stress, and measured competitive fitness thereafter. (w=1.07±0.04 for S populations, w=1.12±0.01 for O populations and w=1.14±0.02 for SO populations) (b) Relative fitness in the O→S transition. As in (a), except that cells were first exposed to oxidative stress and then to salt stress. (w=1.09±0.05 for S populations, w=1.14±0.01 for O populations and w=1.15±0.02 for SO populations). For O populations and SO populations, the fitness increase is significantly higher in the O→S transition compared to the fitness in the S→O transition (p=2.37x10-8 for O populations and p=0.001 for SO populations). However, this does not hold for S populations (p=0.1043). The line in the box represents the mean fitness, the box represents mean± 1 standard error (s.e.) and the whisker represents mean± 1 standard deviation (s.d.). The asterisks indicate a significant difference in fitness (Mann-Whitney U test).

152 Chapter 3

This overlap is much greater in O populations (figure 3b, right-most data). Analogously to what we observed in the S populations, more genes change their regulation when O populations are exposed to oxidative stress (348 genes) than when they are exposed to salt stress (132). However, now the overlap between these two sets of genes (35 genes or 10.06 percent of 348 genes) is greater than expected by chance alone (randomization test p<10-7). In addition, the overlap is also significantly greater than in the salt-evolved populations (chi-square test p=0.0392, df=1, see methods S2).We note that this significant overlap does not just result from the greater number of genes that change their regulation in O than in S populations (figure 3b, right-most data and left-most data), because our tests account for this fact (see methods S1). Among these 35 genes, there are two genes that show a change in induction in salt as well as in peroxide. One gene, PUT4 was induced in response to osmotic stress in three previous physiological studies of the stress response (Posas et al. 2000; Rep et al. 2000; Yale and Bohnert 2001). The second gene, YDL199C, encodes a putative transporter protein that has been shown to reduce oxidative stress resistance when knocked out (Higgins et al. 2002). In addition, O populations harbor 33 genes that show change in repression in salt as well as in peroxide (results S3).

An analogous pattern holds in SO populations (figure 3b, middle data). Here, the overlap between the set of genes that changed regulation in salt and oxidative stress (40 genes) is greater than expected by chance alone (randomization test p<10-7). It is also significantly greater than the overlap in S populations (chi-square test p=0.0045 df=1). One of the affected genes, PUT4 is shared with O populations. A second gene, DDR48, was found to be upregulated in salt as well as in peroxide in previous studies of physiological stress adaptation (Gasch et al. 2000; Posas et al. 2000; Rep et al. 2000; Causton et al. 2001; Yale and Bohnert 2001). Yet another gene, YOR152C, is upregulated as part of the common environmental stress response (Gasch et al. 2000) (results S3).

We note parenthetically that in the SO populations, the number of genes that change in regulation is almost equal between salt stress (186 genes) and oxidative stress (180 genes). These two numbers are significantly more similar to each other than in the S and O populations (chi- square test at df=1, p<0.0001 both for comparison between S and SO populations and between O

153 Chapter 3 and SO populations; see methods S3), which is consistent with the fact that these populations have been evolved in both stressors for an equal amount of time.

In sum, asymmetric cross-protection is associated with a significantly increased fraction of genes that change their expression in both salt and oxidative stress, when populations evolve under sustained (O populations) or repeated (SO populations) oxidative stress.

In the supplementary material, we discuss in more detail the numbers of genes that change either basal expression or regulation in S, O, and SO populations, as well as the genes with expression changes that are shared between these populations (figure S3 and results S2).. Also in the supplementary material, we report another observation that is consistent with evolved asymmetric cross-protection. Specifically, genes changing their expression evolutionarily in O and SO populations are significantly more similar in their function to each other than to genes that change their expression in S populations (figure S4 and results S4). We also associate genes that change regulation in salt stress and in oxidative stress in these populations with phenotype data for null, reduction in function and over-expression mutants from Saccharomyces Genome Database (SGD) (figure S13) (phenotype_data.tab from http://www.yeastgenome.org/download- data/curation/; Cherry et al. 2012).

Evolved anticipation of stressors in SO populations

We next asked whether asymmetric cross-protection could explain all of the fitness increase we observe in SO populations. If this was so, the fitness of O populations and SO populations would be identical under cycling conditions. However, this is not the case. Even though the fitness of O and SO populations does not differ significantly in salt stress (p=0.1903) or in oxidative stress (p=0.5457), fitness is higher in SO populations both in the O→S cycle (figure 4a, p=0.001) and in the S→O cycle (figure 4b, p=9x10-10). This higher fitness could have at least two non- exclusive explanations.

First, the SO populations may have evolved adaptations unique to them. In the supplementary material, we show that our gene expression data supports this possibility: In the SO populations,

154 Chapter 3 but not in the S or O populations, a distinct group of genes has changed their basal expression in the course of the experiment (results S5 and figures S5 and S6). These include genes well-known to be involved in the physiological stress response, such as the gene FRT2, which can promote growth in conditions of high Na+ concentrations (Heath et al. 2004), and the genes FRE7 and HMX1. Knockout mutations in these genes can reduce oxidative stress resistance (Higgins et al. 2002; Collinson et al. 2011).

Second, the cycling (SO) populations may have acquired the ability to anticipate the next stressor in each cycle. Adapting physiologically to that stressor before it arises, for example through gene expression changes, might help explain the higher fitness we observe in the cycling populations. This possibility gives rise to several predictions about changes in gene regulation that we tested next.

The SO population shows evolutionary adaptation in the regulation of two sets of genes, one upon exposure to stressor S (S-specific genes), and the other upon exposure to stressor O (O- specific genes). In this population S-specific genes would change their expression after exposure to S, i.e., as a physiological adaptation to S. If the appearance of stressor S also served as an anticipatory signal for stressor O, one would expect that stressor S also triggers the expression of many O-specific genes. More specifically, in population SO the number of these O-specific genes should be greater than in a population that was only exposed to stressor S during its evolution. If so, one could conclude that the population has acquired the ability to ‘interpret’ stressor S as anticipating stressor O. Conversely, when population SO is exposed to stressor O, it would show changes in the expression of O-specific genes. If it also shows changes in the expression of many S-specific genes, one might conclude that stressor O can help the population anticipate stressor S. These simple considerations show that anticipation can be symmetric or asymmetric (figure 5a; Mitchell et al. 2009). Symmetric anticipation means that each stressor serves as a signal to anticipate the other stressor. Asymmetric anticipation means that only one of the stressor can help anticipate the other stressor.

The scenario from the preceding paragraph leads to three specific predictions. First, in the SO populations, the sets of genes that evolved changed regulation in response to S and O should

155 Chapter 3

show a statistically significant overlap and, more importantly, this overlap should be greater than in the S and O populations that were never exposed to the fluctuating environment – otherwise we could not conclude that the overlap is an evolved response to the fluctuating environment. We already discussed earlier that the sets of genes with regulatory changes in both salt stress and oxidative stress overlap to a significantly greater extent in SO populations than in S populations (figure 3b, chi-square test p=0.0045 df=1). The overlap is also significantly higher than in O populations (chi-square test p=0.0046, df=1), thus confirming the prediction.

A second prediction is that the overlap of the sets of genes changing expression in physiological response to both salt and oxidative stress should be higher in the SO populations than in the

ancestor. This is indeed the case (chi-square test, p <0.0001, df=1, figure S7). Also, the overlap of these sets of genes is significantly higher in the SO populations when compared to the S populations (chi-square test, p<0.0001, df=1), but not when compared to the O populations (chi- square test, p=0.70, df=1) (figure S7). We discuss the latter observation further in the supplementary results (results S6).

We note that these two predictions do not distinguish between symmetric or asymmetric anticipation. However, a third prediction does. It is illustrated by the schematic of figure 5b. Let

us denote by A1 the fraction of genes that have changed regulation both after exposure to salt in SO populations and after exposure to oxidative stress in O populations. If salt has become an

anticipatory signal for oxidative stress in SO populations, then A1 should be significantly greater

than A2 (figure 5b), which denotes the fraction of genes that change regulation both after exposure to salt stress in S populations and to oxidative stress in O populations. This is indeed

the case (chi-square test, p <0.0001 df=1). In terms of absolute numbers, there are 45 genes in set

A1, i.e., genes that changed regulation in response to salt stress in SO populations, and that changed regulation in response to oxidative stress in O populations. There are, however, only 12

genes in set A2, that is, genes with changed regulation in O populations in response to oxidative stress, and with changed regulation in S populations in response to salt stress.

156 Chapter 3

Figure 5: Evolved symmetric and asymmetric anticipation and its detection in the SO populations. (a) The concepts of symmetric and asymmetric anticipation. Symmetric anticipation means that each stressor serves as a signal to anticipate the other stressor. Asymmetric anticipation means that only one of the stressors can help anticipate the other stressor. (b) Criteria and data to detecting anticipation. A1 is the fraction of genes with changed regulation in salt stress in SO populations and in oxidative stress in O populations; A2 is the fraction of genes with changed regulation both in salt stress in S populations and in oxidative stress in O populations; A3 is the fraction of genes with changed regulation both in salt stress in S populations and to oxidative stress in SO populations. The criteria for evidence for no anticipation, symmetric anticipation and asymmetric anticipation are summarized below the Venn diagram. In our experiments, we found that the set A1 contains 45 genes and is significantly greater than set A2, which contains 12 genes. Set A3 contains 12 genes, a number that does not differ significantly from the size of set A2. The purple asterisk indicates a significant difference in the sizes of the sets A1 and A2.

157 Chapter 3

Conversely, if oxidative stress has become a signal in SO populations to anticipate the salt stress response, then A3 (figure 5b) should be significantly greater than A2. That is, the fraction of genes that change their expression when exposed to oxidative stress in SO populations, and that also change their expression in response to salt stress in S populations, should be significantly greater than the fraction of genes that change regulation both after exposure to salt stress in S populations and to oxidative stress in O populations. This is not the case (chi-square test, p=0.05, df=1). The absence of evidence for oxidative stress as an anticipatory signal for salt stress could be due to the fact that one cannot completely disentangle the effects of cross-protection and anticipation (see Discussion).

In sum, three lines of evidence based on gene expression changes in our evolved populations suggested that asymmetric anticipatory regulation evolved in our lines, where salt stress can help anticipate oxidative stress, but not vice versa. The genes most likely to be affected by this anticipatory regulation are genes that are contained in set A1 but not in set A2 (figure 5b). There are 33 such genes (table S1). Three of them have a known role in the stress response. Specifically, a null mutant of gene YGR035C shows decreased hyperosmotic stress resistance (Yoshikawa et al. 2009), and the genes SPS4 and HXT2, are associated with tolerance of and adaptation to salt in yeast (Warringer et al. 2003).

Discussion

All three kinds of populations (S, O, and SO) that we studied adapted evolutionarily to their respective stressors within a mere 300 generations, and experienced a fitness increase between 10 and 30 percent relative to the ancestor. Fitness differences between the populations indicate that their evolutionary adaptation is not due to a common stress response, but at least partly specific to salt or oxidative stress. Our data also speak to the possibility of a trade-off in fitness, which would occur if the elevated fitness of S populations comes at the price of a reduced fitness under oxidative stress, and vice versa for O populations. However, no such trade-off is evident.

We observed the evolution of asymmetric cross-protection (figure 2a and 2b), where exposure to oxidative stress protected the cells against salt stress, but not vice versa. This assessment is based

158 Chapter 3

on fitness differences of populations exposed to oxidative stress, and is accompanied by changes in gene regulation in the evolved populations. Specifically, the sets of genes that changed their regulation in response to oxidative and salt stress share many more genes in populations where cross-protection arose than in the ancestor. We emphasize that cross-protection is an evolutionary adaptation that occurred during our experiment, and that depends on prolonged exposure to oxidative stress, because our S populations do not show it. We note that the cross- protection evolved not only in the cycling SO populations, but also in the O populations.

One question that remains unanswered is why long term adaptation to oxidative stress protects yeast cells against salt stress but not vice versa. We discuss two candidate explanations here. First, it is possible that our O populations took an evolutionary path for adaptation to oxidative stress that also partially overlaps with the evolutionary path of adaptation to salt stress. This idea is supported by the fact that similar functional gene classes are significantly enriched or impoverished for genes with regulation change in peroxide in O populations, and for genes with regulation change in salt in S populations (figure S6). A second possible explanation comes from the observation that many environmental stressors induce the generation or accumulation of reactive oxygen species inside cells, thereby causing oxidative stress. Studies from other yeast species and plants suggest that some oxidative stress response genes might also be associated with the osmotic stress response (Gossett et al. 1996; Cazalé et al. 1998; Lin and Kao 2002; Chao et al. 2009). Thus, long term adaptation to oxidative stress might partially alleviate the harmful effects of reactive oxygen species caused by other stressors. A related phenomenon has also been observed for bacterial antibiotic resistance, where many antibiotics work through generating reactive oxygen species in the cells (Albesa et al. 2004; Dwyer et al. 2007; Hassett and Imlay 2007; Kohanski et al. 2007). Thus, adaptations to oxidative stress or induction of oxidative stress response pathways through cellular signaling molecules help bacteria survive these antibiotics (Dwyer et al. 2009; Lee and Collins 2011; Poole 2012; Vega et al. 2012).

Cross-protection alone cannot explain all of the fitness increase in the SO populations in our experiment, because the cycling populations have significantly higher fitness than the O populations under cycling conditions. We investigated two potential non-exclusive causes for this observation through their gene expression signatures. First, the SO populations may have

159 Chapter 3

experienced adaptations unique to them, which may include unique changes in gene expression or regulation. We showed that candidate genes with such changes indeed exist. Specifically, many genes changed their basal expression specifically in SO populations (results S5 and figure S5 and S6). Second, the SO populations may have ’learned‘ about the periodic nature of the environmental change, and become able to anticipate it. Three lines of evidence for this possibility exist through the proportion of genes that change expression in the SO populations, compared to the S, O, and ancestral populations (figures 3b and 5b, figure S7, and results S6). More specifically, this evidence argues for asymmetric anticipation, where salt stress helps anticipate oxidative stress, but not vice versa, in SO populations. Figure 6 summarizes our observations schematically.

The evolutionary adaptations we characterize are complex and involve many genes. It is thus not surprising that our observations leave open questions. First, we have not been able to disentangle the effects of cross-protection and anticipation on fitness in the cycling SO populations. Doing so may in general be very difficult, for the following reason. In cross-protection, a physiological state attained in one stressor protects against a second stressor. In comparison, in anticipation, exposure to the first stressor is necessary for triggering a second protective physiological state. To distinguish between cross-protection and anticipation, it is thus essential to know whether a cell’s physiological state when exposed to the second stressor is a result of physiological adaptation to the first stressor, or whether it is merely triggered by exposure to the first stressor but specifically protects against the second stressor. Because physiological states may change almost immediately after any environmental change (Gasch et al. 2000; Causton et al. 2001), it is difficult to make this distinction.

While our experiments did not allow us to clearly separate cross-protection and anticipation, they revealed specific gene expression signatures that point towards anticipatory regulation in the cycling SO populations. The fitness changes we observed point in the same direction, because the SO populations have significantly higher fitness than the O populations both in the O→S and in the S→O transition. Using control populations is crucial in this regard, because in their absence, one might falsely attribute the adaptations we see in cycling populations solely to anticipation, instead of to a combination of cross-protection and anticipation.

160 Chapter 3

Figure 6: Summary of observations for cycling populations. The rounded boxes correspond to the two different stressors between which the SO populations cycle (arrows). Fitness and gene expression data provided evidence for evolved cross-protection of oxidative stress against salt stress (light grey ellipse). Gene expression data also indicated asymmetric anticipation of oxidative stress by salt stress (dark grey ellipse), and basal expression changes in genes that were unique to cycling populations (white ellipse). Which of these three features affect fitness most strongly cannot be distinguished from our experiments.

A second, related limitation is that we do not know whether cross-protection, anticipation, or basal expression changes increase fitness more strongly. The reason is again the inability to completely separate anticipation and cross-protections in cycling populations. However, the data hint that cross-protection has the greatest effect on fitness, because the cycling populations have higher fitness in the O→S transition than in the S→O transition (Figure 4a and 4b). A third open question is why cross-protection and anticipation are asymmetric, and why they occur in opposite directions.

161 Chapter 3

Multiple genes that changed their regulation and expression in our experiments have been previously implicated in physiological adaptation to salt or oxidative stress (results S3 and S5). It may be tempting to search for single genes that may be causally responsible for the adaptations we see. However, two lines of evidence suggest that such a search may not be successful. Recent work has shown (Berry et al. 2011) that the genes required for physiological adaptation to a stressor do not only depend on the stressor, but also on other stressors that preceded it. Such interdependencies make the identification of single genes that are unconditionally responsive to a stressor difficult. Second, both the physiological stress response and evolutionary adaptations to stress involves many genes. For example, evolutionary adaptations to salt stress are complex, polygenic and do not involve measurable allele frequency increases in single adaptive alleles, at least on the time scale of a laboratory evolution experiment (Dhar et al. 2011).

In sum, past work on physiological adaptation has demonstrated the existence of both cross- protection and anticipation (Völker et al. 1992; Cullum et al. 2001; Greenacre and Brocklehurst 2006; Schild et al. 2007; Berry and Gasch 2008; Tagkopoulos et al. 2008; Wolf et al. 2008; Mitchell et al. 2009; Rodaki et al. 2009; Berry et al. 2011). The present work shows that both phenomena, as well as relevant basal gene expression changes, can evolve within a mere 300 generations of laboratory evolution. Future work may be needed to show which of these changes are more important in the adaptation of organisms to varying environments.

Materials and Methods

Strains and media We used the haploid yeast strain BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0; ATCC#201388) in this study. We started all laboratory evolution experiments from the same clone of BY4741, which we refer to as the ancestral strain. We refer to populations derived from the ancestral strain through serial transfer cycles as evolved populations. We refer to the three replicate yeast populations evolved in salt as S populations, three replicate populations evolved in hydrogen peroxide stress as O populations, and three replicate populations evolved in periodically occurring salt stress and oxidative stress as SO populations. We also evolved three replicate control (C) populations in general growth medium without salt stress and oxidative

162 Chapter 3

stress. We estimated the growth rates of the evolved populations as well as that of the ancestral strain relative to a BY4741 strain in which the CWP2 gene is tagged with green fluorescent protein (GFP). We refer to this GFP-tagged strain as the reference strain. We obtained ancestral and reference strains from EUROSCARF (http://web.uni- frankfurt.de/fb15/mikro/euroscarf/yeast.html) and Invitrogen (www.invitrogen.com and http://yeastgfp.yeastgenome.org/) (Ghaemmaghami et al. 2003), respectively. For serial transfers, we cultured cells in YP (consisting of 2% peptone, 1% yeast extract) and 2% galactose (YPG),

YPG supplemented with 0.5 M NaCl (YPGS) and YPG supplemented with 1mM H2O2 (YPGO). We supplemented all media for serial transfer with 50 μg/ml ampicillin and 25 μg/ml tetracycline to minimize the risk of contamination. We verified the stability of hydrogen peroxide in YPG medium using a Merckoquant peroxide test kit (Merck Catalogue No. 1100810001), which showed that after 24 hours at 30oC with shaking, the concentration of hydrogen peroxide in YPG medium still exceeded 80% of the initial hydrogen peroxide concentration.

We chose the concentrations of salt and hydrogen peroxide based on growth rate and viability measurements of ancestral yeast strain in these stressors (figures S11 and S12). Briefly, we found that salt stress from 0.5M NaCl reduces the growth rate of yeast so that the final cell number after 24 hours of growth is approximately 1/3rd of the cell number without stressors but it does not affect the viability of yeast cells even after growth for 24 hours. In contrast, under oxidative stress from 1mM peroxide the final cell number after 24 hours of growth is about 87% of the cell number without stressor. However, oxidative stress at 1 mM peroxide leads to a loss in cell viability, and higher peroxide concentration causes a population size reduction greater than 90 percent, when the stressor is applied only for three hours.

Serial transfer We started nine parallel serial transfer experiments with an overnight culture derived from one single clone of the ancestral strain. In each parallel experiment, we grew 50 ml of yeast culture in an incubating shaker for 24 hours at 220rpm and 30oC, after which cultures had reached stationary phase. Every 24 hours, we transferred 50μl of stationary culture into 50 ml of fresh culture medium. We carried out 30 such transfer cycles for a total of approximately 300 generations (Each transfer cycle involved approximately log21000≈10 cell generations). In three

163 Chapter 3

of the parallel experiments, we used YPGS medium for each transfer, in three other parallel experiments, we used YPGO medium and in another three experiments, we used YPGS and YPGO medium alternatively. Before each transfer, we froze 1.5ml of culture supplemented with 25% glycerol, and stored this cell suspension at -80oC for future analysis. We examined cultures periodically for contamination by microscope and through plating.

Competition assays To compare growth rates of the evolved populations with that of the ancestral strain, we performed competition assays using fluorescence activated cell sorting (FACS), as described below. We grew cells overnight from frozen glycerol stocks in 4ml YPD medium (30oC, 220rpm) until they had reached late logarithmic phase (<1.5x108 cells/ml). To estimate cell counts, we used a Neubauer cytometer. In each culture, we then adjusted the cell numbers to 2.5x107 cells/ml with YPD. These pre-culturing steps ensured that cells were in comparable physiological states with high viability before competition. For the competition assay, we mixed equal cell numbers (2.5×106) of the reference strain and of the competing strain in 10ml YPGS or YPGO medium (5.0×105ml-1 final cell density). We collected a 2.0ml aliquot of this culture,

centrifuged it, and resuspended the cells in 1.0ml PBSE buffer (10mM Na2HPO4, 2mM

KH2PO4,137mM NaCl, 2.7mM KCl, 0.1mM EDTA, pH 7.4). We stored these cells overnight at o 4 C. We used FACS to measure the relative cell numbers Nr(0) of the reference strain, and Ne(0)

of an evolved population (or Na(0) of the ancestral strain, depending on the experiment) at the beginning of the competition assay. We carried out all the competition assays in three replicates. For each replicate, we grew up 2ml cell aliquots for 24 hours in an incubating shaker at 220rpm o and 30 C. At the end of 24 hours we counted the relative cell numbers Nr(1) and Ne(1) (or

Na(1)) using FACS with a Beckman Coulter Cytomics FC500 fluorescence activated cell sorter. The histogram of fluorescing and non-fluorescing cells provided the relative number of cells of the reference strain (GFP-tagged) and the competing strain (evolved population or ancestral strain).

For measurement of fitness in a change from oxidative stress to salt stress (O→S), we first grew cells subject to oxidative stress, i.e., in YPGO medium following the same protocol as above (24 hours, 220rpm, 30oC). We then diluted the cells 1000 times into fresh YPGS medium and grew

164 Chapter 3

them for another 24 hours (220 rpm, 30oC). We then counted the relative cell numbers for the reference strain (Nr(2)) and the evolved (Ne(2)) or ancestral strain (Na(2)), depending on the experiment, using same FACS protocol as described above. Fitness measurement in the S→O transition proceeded exactly analogously, except that cells were first grown in YPGS for 24 hours, and then in YPGO for another 24 hours.

Estimating growth rate differences From the competition assays we determined the growth rate differences as follows. If the

rtr reference strains R and some other strain E grow according to Ntrr()= eN (0) and

rte Ntee()= eN (0), where rr and re are strain-specific growth rates, and where Ni(t) are population sizes of the respective strains at time t, then

Ntee() N (0) N e (0) =−exp[(rrter ) ] = exp[( mmt e − r ) ] Ntrr() N (0) N r (0)

Here, me is the Malthusian fitness of strain E and mr is the Malthusian fitness of strain R. Since we measured population numbers through FACS after t=1 day of competition, we estimated me (dimension [d-1]) as ⎛⎞NN⎛ FLFL ⎞ er(1) / (1) (100− e (1)) / e (1) mmer−=ln⎜⎟ = ln ⎜ ⎟ N(0) / N (0) (100− FL (0)) / FL (0) ⎝⎠er⎝ e e ⎠ where FLe(0) represents the percentage of GFP-tagged reference cells in the medium at the beginning of the competition assay, and FLe(1) represents the percentage of these cells after 1 day of competition. Similarly,

⎛⎞NNar(1) / (1)⎛ (100− FLF a (1)) /L a (1) ⎞ mmar−=ln⎜⎟ = ln ⎜ ⎟ ⎝⎠Nar(0) / N (0)⎝ (100− FL a (0)) / FL a (0) ⎠

To compare the cell numbers of an evolved population to that of the ancestral strain, we

calculated the dimensionless ratio N=Ne/Na. This ratio is equivalent to

NNNeee⎛⎞(1) / (0) Nr==⎜⎟ =exp(ea −=r ) exp(m e −m a ) NNNaaa⎝⎠(1) / (0)

165 Chapter 3

Therefore,

⎛⎞⎛NNee(1) /NFe (0) ⎞⎛ (100−−LF ee (1)) /LF (1) ⎞⎛ (100LF aa (1)) /L (1) ⎞ mmea−=ln⎜⎟⎜ = ln ⎟⎜ = ln ⎟⎜ − ln ⎟ ⎝⎠⎝NNNaaa(1) / (0) ⎠⎝ (100−− FLF ee (0)) /LF (0) ⎠⎝ (100LF aa (0)) /L (0) ⎠ −1 ()dimension ⎣⎦⎡⎤d . Since the growth of our yeast cells is density dependent and is limited to a 1000 fold increase per -1 culture transfer cycle, we can estimate ma as ma=mav=ln(1000)/(1d)=6.9078d (where mav is the average Malthusian fitness for the ancestral strain, as estimated from this1000-fold increase). We define the selection coefficient s as the difference in Darwinian fitness between an evolved population and the ancestral strain, i.e., s=(me-ma)/ma=w-1 (Lenski et al. 1991), where s>0 (or w>1) indicates that the evolved population has an advantage over the ancestral strain.

For measurement of fitness in the O→S and S→O transitions, we have

NN(1) (0) N (0) ee=−exp[(rr )] = exp[( mm − )] e er e r NNrr(1) (0) N r (0) and NNee(2) (1) / 1000 N e (1) =− exp[( rr er )] = exp[( mm e − r )] . NNrr(2) (1) /1000 N r (1)

NNee(2) (1) N e (0) This yields =−exp[()mmer] = exp[2() mm er −] NNrr(2) (1) N r (0)

Thus, following a similar derivation as that above,

NNee⎛⎞(2) / N e (0) Nm==⎜⎟ =exp[2(ea −m )] NNaa⎝⎠(2) / N a (0) Therefore,

11⎛⎞NNee ⎛(2) /Ne (0) ⎞ mmea−=ln⎜⎟ = ln ⎜ ⎟ 22⎝⎠NNaa ⎝(2)/Na(0) ⎠

11⎛⎞(100−−FLee (2)) / FL (2)⎛⎞ (100 FL aa (2)) / FL (2) −1 = ln⎜⎟− ln⎜⎟ ()dimension ⎣⎦⎡⎤d . 2⎝⎠ (100−−FLee (0)) / FL (0) 2⎝⎠ (100 FL aa (0)) / FL (0)

166 Chapter 3

2 -1 -1 6 As above, ma=mav*=ln(1000 )/(2d)= 2*6.9078/2 d =6.9078 d ,because there is a 10 fold increase in the number of cells over two cycles i.e., in two days. These considerations yield

1 ⎛⎞NN(2) / (0) ln ee ⎜⎟ 2⎝⎠NNaa (2) / (0) s/1=−()mmea m a =−= w 6.908

Whole genome transcriptome analysis

We analyzed the mRNA expression levels in the ancestral strain and in the evolved populations using a GeneChip Yeast Genome 2.0 Array (Affymetrix). We grew up equal number of cells from the ancestral strain and the evolved populations in YPG medium for 16 hours. We then either induced the cells with 0.5M NaCl or with 1mM H2O2, or grew them uninduced for 20 further minutes as a control. We isolated total RNA using the RiboPure-Yeast RNA isolation Kit (Ambion), and determined the quality of the isolated RNA with a NanoDrop ND 1000 spectrophotometer (NanoDrop Technologies, Delaware, USA) and a Bioanalyzer 2100 (Agilent, Waldbronn, Germany). We only processed samples further if the absorption ratio at 260 nm and 280 nm was between 1.8–2.1, and 28S -18S rDNA ratio was within 1.5-2 . We then reverse- transcribed total RNA samples (50 ng) into double-stranded cDNA, and then in vitro transcribed the cDNA in the presence of biotin-labeled nucleotides using the GeneChip® 3' IVT Express Kit (Affymetrix). We determined the quality and quantity of the biotinylated cRNA using the NanoDrop ND 1000 and Bioanalyzer 2100. We then fragmented Biotin-labeled cRNA samples (7.5 µg) randomly to 35-200 bp at 94°C in Fragmentation Buffer (Affymetrix), and mixed in 100 µl of Hybridization Mix (Affymetrix) containing Hybridization Controls and Control Oligonucleotide B2 (Affymetrix Inc., P/N 900454). We then hybridized samples to GeneChip® Yeast Genome 2.0 Arrays for 16 h at 45°C. Subsequently, we washed arrays using the Affymetrix Fluidics Station 450 FS450_0003 protocol. We used an Affymetrix GeneChip Scanner 3000 (Affymetrix Inc.) to measure the fluorescent intensity emitted by the labeled target. After hybridization and scanning, we calculated probe cell intensities and summarized for the respective probe sets by means of the MAS5 algorithm (Hubbell et al. 2002). To compare the expression values of the genes from chip to chip, we performed global scaling using a signed- rank call algorithm (Liu et al. 2002). We then transformed the normalized signal intensities

167 Chapter 3 logarithmically (base 2). Genes with log 2 transformed values of greater than 4 in all of the arrays were considered present in our data.

We carried out microarray analyses (i) for two replicates each for the ancestral strain in YPG, YPGS and YPGO (six in total); (ii) for four replicate population samples for the S populations in YPG, YPGS and YPGO, that is,12 analyses in total; (iii) for three replicate population samples for the O populations in YPG, YPGS and YPGO, that is, nine analyses in total; (iv) for three replicate population samples for the SO populations in YPG, YPGS and YPGO (nine analyses in total).

To identify genes that show physiological expression change in stressor, we considered only those genes for which |EEYPGX−≥ YPG | 1where EYPGX represents the averaged log2-transformed expression level of a gene in the stressor YPGX over all the replicate measurements and EYPG represents the averaged log2-transformed expression of the same gene in YPG medium without any stressor averaged over all replicate measurements.

To identify genes whose basal expression or whose regulation changed (see Results), we calculated the average expression levels for the ancestral strain and the evolved populations in YPG, YPGS and YPGO. For each gene, we then calculated the following two Z-scores.

()EYPGX− ANC YPGX−−() E YPG ANC YPG Z-score for change in regulation Zr = ⎛⎞Variance 0.09 + ∑⎜⎟ ⎝⎠No. of replicates and, EAYPG− NC YPG Z-score for basal expression change Zb = ⎛⎞Variance 0.09 + ∑⎜⎟ ⎝⎠No. of replicates

where EYPGX is the average log2-transformed expression level of the focal gene in the evolved populations in YPGX medium (YPGS or YPGO), ANCYPGX is the average log2-transformed expression level of the focal gene in the ancestral strain in YPGX medium, EYPG is the average log2-transformed expression level of the focal gene in the evolved populations in YPG medium,

168 Chapter 3

and ANCYPG is the average log2-transformed expression level of the focal gene in the ancestral strain in YPG medium (Mukhopadhyay et al. 2006; Dhar et al. 2011).

The Z-scores take into account the mean expression change of a particular gene, and also the variation in gene expression among the replicates under various conditions (Mukhopadhyay et al. 2006). In both calculations, 0.09 is a pseudo-variance (Mukhopadhyay et al. 2006). We calculated the Variance term in the denominator of the above equations for the expression level of a gene within replicate arrays in the ancestral strain, and within replicate arrays in the evolved populations, always in a given medium. The Variance term is divided by the number of replicate arrays, and summed over the different media in which the expression values were measured, as indicated by the summation sign in the formula. For calculation of Zr, we calculated the variances of expression levels for the replicate arrays in each of the following cases: in the ancestral strain in YPG, in the evolved populations in YPG, in the ancestral strain in YPGS or

YPGO and in the evolved populations in YPGS or YPGO. For calculation of Zb, we calculated the variances of a gene within the replicate arrays in the ancestral strain in YPG, and in the evolved populations in YPG. The formulae used here are modified from (Mukhopadhyay et al. 2006) for several reasons. First, in our experiment we use expression data from three biological replicate populations for each kind of evolved populations (S,O or SO). Thus, the probability of finding the same genes being differentially expressed in three independent replicate populations by chance alone will be very low. Second, the aim of this analysis was to detect the genes with changed expression that are shared among replicate evolved populations. A gene might not change its expression level to the same extent in parallel evolved populations, thus generating variance in expression among the replicates. To compensate for this increased variance, we set the Z-score threshold for identifying differentially expressed genes lower than the Z-score threshold used in (Mukhopadhyay et al. 2006). Third, we divided the variance by the number of replicates for each condition, as the actual variance would be proportional to the sample variance and inversely proportional to the number of replicates. Overall, we considered genes whose absolute Z-scores exceeded a value of 1.5 to be differentially expressed. We then classified the genes into two main categories of change in regulation, namely change in induction and change in repression, based on the physiological response of these genes to the stressors in the ancestral strain.

169 Chapter 3

The choice of the Z-score threshold determines the minimal change in expression that we can detect, taking the variance among replicates into account. If the sum of the variances in the above equations was equal to zero, and the Z-score threshold was set as 1.5 (as we do here) then all genes with an absolute value of log2(fold change) greater than 0.45 would be identified as genes with evolutionary expression changes. If the variance among replicates is greater than zero, the absolute value of the log2(fold change) would also need to be higher to identify a gene to be differentially expressed. For example, if the sum of the variances among replicates was equal to

1, then the absolute value of log2(fold change) would have to be greater than 1.57 to identify a gene as differentially expressed.

Acknowledgements We thank the Functional Genomics Center Zurich for its service in generating microarray data. This work was supported by Swiss National Science Foundation (grant 315230-129708), as well as through the YeastX project of SystemsX.ch, and the University Priority Research Program in Systems Biology at the University of Zurich.

Literature cited

Albesa I, Becerra MC, Battán PC, Páez PL. 2004. Oxidative stress involved in the antibacterial action of different antibiotics. Biochem Biophys Res Commun. 317:605-609.

Alcántara-Díaz D, Breña-Valle M, Serment-Guerrero J. 2004. Divergent adaptation of Escherichia coli to cyclic ultraviolet light exposures. Mutagenesis 19:349-354.

Apse MP, Aharon GS, Snedden WA, Blumwald E. 1999. Salt Tolerance Conferred by Overexpression of a Vacuolar Na +/H+ Antiport in Arabidopsis. Science 285:1256-1258.

Attfield PV, Choi HY, Veal DA, Bell PJ. 2001. Heterogeneity of stress gene expression and stress resistance among individual cells of Saccharomyces cerevisiae. Mol Microbiol. 40:1000-1008.

Bapiri A, Bååth E, Rousk J. 2010. Drying-rewetting cycles affect fungal and bacterial growth differently in an arable soil. Microb Ecol. 60:419-428.

Beaumont HJ, Gallie J, Kost C, Ferguson GC, Rainey PB. 2009. Experimental evolution of bet hedging. Nature 462:90-93.

Bennett AF, Lenski RE. 2007 An experimental test of evolutionary trade-offs during temperature adaptation. Proc Natl Acad Sci USA 104 Suppl 1:8649-8654.

Berry DB, Gasch AP. 2008. Stress-activated genomic expression changes serve a preparative role for impending stress in yeast. Mol Biol Cell 19:4580-4587.

170 Chapter 3

Berry DB, Guan Q, Hose J, Haroon S, Gebbia M, Heisler LE, Nislow C, Giaever G, Gasch AP. 2011. Multiple means to the same end: the genetic basis of acquired stress resistance in yeast. PLoS Genet. 7:e1002353.

Blake WJ, Balázsi G, Kohanski MA, Isaacs FJ, Murphy KF, Kuang Y, Cantor CR, Walt DR, Collins JJ. 2006. Phenotypic consequences of promoter-mediated transcriptional noise. Mol Cell. 24:853-865.

Booth IR. 2002. Stress and the single cell: intrapopulation diversity is a mechanism to ensure survival upon exposure to stress. Int J Food Microbiol. 78:19-30.

Causton HC, Ren B, Koh SS, Harbison CT, Kanin E, Jennings EG, Lee TI, True HL, Lander ES, Young RA. 2001. Remodeling of yeast genome expression in response to environmental changes. Mol Biol Cell 12: 323–337.

Cazalé AC, Rouet-Mayer MA, Barbier-Brygoo H, Mathieu Y, Laurière C. 1998. Oxidative Burst and Hypoosmotic Stress in Tobacco Cell Suspensions Plant Physiol. 116:659-669.

Chao HF, Yen YF, Ku MS. 2009. Characterization of a salt-induced DhAHP, a gene coding for alkyl hydroperoxide reductase, from the extremely halophilic yeast Debaryomyces hansenii. BMC Microbiol. 9:182.

Chen D, Wilkinson CR, Watt S, Penkett CJ, Toone WM, Jones N, Bähler J. 2008. Multiple Pathways Differentially Regulate Global Oxidative Stress Responses in Fission Yeast. Mol Biol Cell 19: 308–317.

Cherry JM, Hong EL, Amundsen C et al. (22 co-authors). 2012. Saccharomyces Genome Database: the genomics resource of budding yeast. Nucleic Acids Res. 40:D700-705.

Chinnusamy V, Schumaker K, Zhu J-K. 2004. Molecular genetic perspectives on cross-talk and specificity in abiotic stress signalling in plants. J Exp Bot. 55: 225–236.

Collinson EJ, Wimmer-Kleikamp S, Gerega SK, Yang YH, Parish CR, Dawes IW, Stocker R. 2011. The yeast homolog of heme oxygenase-1 affords cellular antioxidant protection via the transcriptional regulation of known antioxidant genes. J Biol Chem. 286:2205-2214.

Cooper TF, Rozen DE, Lenski RE. 2003. Parallel changes in gene expression after 20,000 generations of evolution in Escherichia coli. Proc Natl Acad Sci USA 100:1072-1077.

Cooper TF, Lenski RE. 2010. Experimental evolution with E. coli in diverse resource environments. I. Fluctuating environments promote divergence of replicate populations. BMC Evol Biol. 10:11.

Cullum AJ, Bennett AF, Lenski RE. 2001. Evolutionary adaptation to temperature. IX. Preadaptation to novel stressful environments of Escherichia coli adapted to high temperature. Evolution 55:2194-2202.

DeAngelis KM, Silver WL, Thompson AW, Firestone MK. 2010. Microbial communities acclimate to recurring changes in soil redox potential status. Environ Microbiol. 12:3137-3149. de Jong IG, Haccou P, Kuipers OP. 2011. Bet hedging or not? A guide to proper classification of microbial survival strategies. Bioessays 33:215-223.

Dhar R, Sägesser R, Weikert C, Yuan J, Wagner A. 2011. Adaptation of Saccharomyces cerevisiae to saline stress through laboratory evolution. J Evol Biol. 24:1135-1153.

Dwyer DJ, Kohanski MA, Hayete B, Collins JJ. 2007. Gyrase inhibitors induce an oxidative damage cellular death pathway in Escherichia coli. Mol Syst Biol. 3:91.

Dwyer DJ, Kohanski MA, Collins JJ. 2009. Role of reactive oxygen species in antibiotic action and resistance. Curr Opin Microbiol. 12:482-489.

171 Chapter 3

Ferea TL, Botstein D, Brown PO, Rosenzweig RF. 1999. Systematic changes in gene expression patterns following adaptive evolution in yeast. Proc Natl Acad Sci USA 96:9721-9726.

Fierer N, Schimel JP, Holden PA. 2003. Influence of drying-rewetting frequency on soil bacterial community structure. Microb Ecol. 45:63-71.

Fong SS, Joyce AR, Palsson BØ. 2005. Parallel adaptive evolution cultures of Escherichia coli lead to convergent growth phenotypes with different gene expression states. Genome Res. 15:1365-1372.

Gaál B, Pitchford JW, Wood AJ. 2010. Exact results for the evolution of stochastic switching in variable asymmetric environments. Genetics 184:1113-1119.

Gasch AP, Spellman PT, Kao CM, Carmel-Harel O, Eisen MB, Storz G, Botstein D, Brown PO. 2000. Genomic expression programs in the response of yeast cells to environmental changes. Mol Biol Cell 11: 4241–4257.

Gasch AP. 2007. Comparative genomics of the environmental stress response in ascomycete fungi. Yeast 24: 961– 976.

Ghaemmaghami S, Huh WK, Bower K, Howson RW, Belle A, Dephoure N, O'Shea EK, Weissman JS. 2003. Global analysis of protein expression in yeast. Nature 425:737-741.

Goldman RP, Travisano M. 2011. Experimental evolution of ultraviolet radiation resistance in Escherichia coli. Evolution 65:3486-3498.

Gossett DR, Banks SW, Millhollon EP, Lucas MC. 1996. Antioxidant Response to NaCl Stress in a Control and an NaCl-Tolerant Cotton Cell Line Grown in the Presence of Paraquat, Buthionine Sulfoximine, and Exogenous Glutathione. Plant Physiol. 112:803-809.

Greenacre EJ, Brocklehurst TF. 2006. The Acetic Acid Tolerance Response induces cross-protection to salt stress in Salmonella typhimurium. Int J Food Microbiol. 112:62-65.

Hassett DJ, Imlay JA. 2007. Bactericidal antibiotics and oxidative stress: a radical proposal. ACS Chem Biol. 2:708- 710.

Heath VL, Shaw SL, Roy S, Cyert MS. 2004. Hph1p and Hph2p, novel components of calcineurin-mediated stress responses in Saccharomyces cerevisiae. Eukaryot Cell 3:695-704.

Higgins VJ, Alic N, Thorpe GW, Breitenbach M, Larsson V, Dawes IW. 2002. Phenotypic analysis of gene deletant strains for sensitivity to oxidative stress. Yeast 19:203-214.

Hohmann S. 2002 Osmotic stress signaling and osmoadaptation in yeasts. Microbiol Mol Biol Rev. 66:300-372.

Hubbell E, Liu W-M, Mei R. 2002. Robust estimators for expression analysis. Bioinformatics 18:1585-1592.

Hughes BS, Cullum AJ, Bennett AF. 2007a. Evolutionary adaptation to environmental pH in experimental lineages of Escherichia coli. Evolution 61:1725-1734.

Hughes BS, Cullum AJ, Bennett AF. 2007b. An experimental evolutionary study on adaptation to temporally fluctuating pH in Escherichia coli. Physiol Biochem Zool. 80:406-421.

Jamieson DJ. 1998. Oxidative stress responses of the yeast Saccharomyces cerevisiae. Yeast 14: 1511–1527.

Jasmin JN, Kassen R. 2007. Evolution of a single niche specialist in variable environments. Proc Roy Soc B: Biol Sci. 274:2761-2767.

Jozefczuk S, Klie S, Catchpole G, Szymanski J, Cuadros-Inostroza A, Steinhauser D, Selbig J, Willmitzer L. 2010.

172 Chapter 3

Metabolomic and transcriptomic stress response of Escherichia coli. Mol Syst Biol. 6: 364.

Kohanski MA, Dwyer DJ, Hayete B, Lawrence CA, Collins JJ. 2007. A common mechanism of cellular death induced by bactericidal antibiotics. Cell 130:797-810.

Kussell E, Leibler S. 2005. Phenotypic diversity, population growth, and information in fluctuating environments. Science 309:2075-2078.

Lee HH, Collins JJ. 2011. Microbial environments confound antibiotic efficacy. Nat Chem Biol. 8:6-9.

Lenski RE, Rose MR, Simpson SC, Tadler SC. 1991. Long-Term Experimental Evolution in Escherichia coli. I. Adaptation and Divergence During 2,000 Generations. Am Nat. 138:1315-1341.

Lin CC, Kao CH. 2002. Osmotic stress-induced changes in cell wall peroxidase activity and hydrogen peroxide level in roots of rice seedlings. Plant Growth Reg. 37: 177–184.

Liu WM, Mei R, Di X, Ryder TB, Hubbell E, Dee S, Webster TA, Harrington CA, Ho MH, Baid J, et al. (11 co- authors). 2002. Analysis of high density expression microarrays with signed-rank call algorithms. Bioinformatics 18:1593-1599.

Maathuis FJM, Amtmann A. 1999. K+ Nutrition and Na+ Toxicity : The Basis of Cellular K+/Na+ Ratios. Annals of Bot. 84:123-133.

Meyers LA, Bull JJ. 2002. Fighting change with change: adaptive variation in an uncertain world. Trends Ecol Evol.17:551-557.

Mitchell A, Romano GH, Groisman B, Yona A, Dekel E, Kupiec M, Dahan O, Pilpel Y. 2009. Adaptive prediction of environmental changes by microorganisms. Nature 460:220-224.

Mitchell A, Pilpel Y. 2011. A mathematical model for adaptive prediction of environmental changes by microorganisms. Proc. Natl. Acad. Sci. USA 108:7271-7276.

Montgomery S. 1974. Hedging one's evolutionary bets. Nature 250:704-705.

Mukhopadhyay A, He Z, Alm EJ et al. (24 co-authors). 2006. Salt Stress in Desulfovibrio vulgaris Hildenborough: an Integrated Genomics Approach. J Bacteriol. 188:4068-4078.

Pelosi L, Kühn L, Guetta D, Garin J, Geiselmann J, Lenski RE, Schneider D. 2006. Parallel changes in global protein profiles during long-term experimental evolution in Escherichia coli. Genetics 173:1851-1869.

Petersohn A, Brigulla M, Haas S, Hoheisel JD, Völker U, Hecker M. 2001. Global Analysis of the General Stress Response of Bacillus subtilis. J Bacteriol. 183: 5617–5631.

Phadtare S, Inouye M. 2004. Genome-wide transcriptional analysis of the cold shock response in wild-type and cold-sensitive, quadruple-csp-deletion strains of Escherichia coli. J Bacteriol. 186:7007-7014.

Philippi T, Seger J. 1989. Hedging one's evolutionary bets, revisited. Trends Ecol Evol. 4:41-44.

Poole K. 2012. Bacterial stress responses as determinants of antimicrobial resistance. J Antimicrob Chemother. 67:2069-2089.

Posas F, Chambers JR, Heyman JA, Hoeffler JP, de Nadal E, Ariño J. 2000. The transcriptional response of yeast to saline stress. J Biol Chem. 275: 17249–17255.

Rainey PB, Beaumont HJ, Ferguson GC, Gallie J, Kost C, Libby E, Zhang XX. 2011. The evolutionary emergence

173 Chapter 3 of stochastic phenotype switching in bacteria. Microb Cell Fact. 10 Suppl 1:S14.

Rep M, Krantz M, Thevelein JM, Hohmann S. 2000. The Transcriptional Response of Saccharomyces cerevisiae to osmotic shock. J Biol Chem. 275: 8290–8300.

Rodaki A, Bohovych IM, Enjalbert B, Young T, Odds FC, Gow NA, Brown AJ. 2009. Glucose promotes stress resistance in the fungal pathogen Candida albicans. Mol Biol Cell 20:4845-4855.

Rudolph B, Gebendorfer KM, Buchner J, Winter J. 2010. Evolution of Escherichia coli for Growth at High Temperatures. J Biol Chem. 285:19029-19034.

Rutherford BJ, Dahl RH, Price RE, Szmidt HL, Benke PI, Mukhopadhyay A, Keasling JD. 2010. Functional genomic study of exogenous n-butanol stress in Escherichia coli. Appl Environ Microbiol. 76:1935-1945.

Samani P, Bell G. 2010. Adaptation of experimental yeast populations to stressful conditions in relation to population size. J Evol Biol. 23:791-796.

Schild S, Tamayo R, Nelson EJ, Qadri F, Calderwood SB, Camilli A. 2007. Genes induced late in infection increase fitness of Vibrio cholerae after release into the environment. Cell Host Microbe 2:264-277.

Sumner ER, Avery SV. 2002. Phenotypic heterogeneity: differential stress resistance among individual cells of the yeast Saccharomyces cerevisiae. Microbiology 148:345-51.

Serrano R, Mulet J, Rios G et al. (11 co-authors). 1999. A glimpse of the mechanisms of ion homeostasis during salt stress. J Exp Bot. 50:1023-1036.

Tagkopoulos I, Liu YC, Tavazoie S. 2008. Predictive behavior within microbial genetic networks. Science 320:1313-1317.

Thattai M, van Oudenaarden A. 2004. Stochastic gene expression in fluctuating environments. Genetics 167:523- 530.

Toledano MB, Delaunay A, Biteau B, Spector D, Azevedo D. 2003. Oxidative stress responses in yeast. In: Hohman S, Mager WH, editors. Yeast Stress Responses. Berlin: Springer-Verlag. p. 241-303.

Salathé M, van Cleve J, Feldman MW. 2009. Evolution of stochastic switching rates in asymmetric fitness landscapes. Genetics 182:1159-1164. van der Linden AM, Beverly M, Kadener S, Rodriguez J, Wasserman S, Rosbash M, Sengupta P. 2010. Genome- wide analysis of light- and temperature-entrained circadian transcripts in Caenorhabditis elegans. PLoS Biol. 8:e1000503.

Vega NM, Allison KR, Khalil AS, Collins JJ. 2012. Signaling-mediated bacterial persister formation. Nat Chem Biol. 8:431-433.

Völker U, Mach H, Schmid R, Hecker M. 1992. Stress proteins and cross-protection by heat shock and salt stress in Bacillus subtilis. J Gen Microbiol. 138:2125-2135.

Warringer J, Ericson E, Fernandez L, Nerman O, Blomberg A. 2003. High-resolution yeast phenomics resolves different physiological features in the saline response. Proc Natl Acad Sci. USA 100: 15724–15729.

Weber A, Jung K. 2002. Profiling Early Osmostress-Dependent Gene Expression in Escherichia coli Using DNA Macroarrays. J Bacteriol. 184: 5502–5507.

Weber H, Polen T, Heuveling J, Wendisch VF, Hengge R. 2005. Genome-wide analysis of the general stress

174 Chapter 3 response network in Escherichia coli: sigmaS-dependent genes, promoters, and sigma factor selectivity. J Bacteriol. 187:1591-1603.

Wijnen H, Young MW. 2006. Interplay of circadian clocks and metabolic rhythms. Annu Rev Genet. 40:409-448.

Wolf DM, Fontaine-Bodin L, Bischofs I, Price G, Keasling J, Arkin AP. 2008. Memory in microbes: quantifying history-dependent behavior in a bacterium. PLoS One 3:e1700.

Yoshikawa K, Tanaka T, Furusawa C, Nagahisa K, Hirasawa T, Shimizu H. 2009. Comprehensive phenotypic analysis for identification of genes affecting growth under ethanol stress in Saccharomyces cerevisiae. FEMS Yeast Res. 9: 32–44.

Yale J, Bohnert HJ. 2001. Transcript expression in Saccharomyces cerevisiae at high salinity. J Biol Chem. 276: 15996–16007.

Zheng M, Wang X, Templeton LJ, Smulski DR, LaRossa RA, Storz G. 2001. DNA Microarray-Mediated Transcriptional Profiling of the Escherichia coli Response to Hydrogen Peroxide. J Bacteriol. 183: 4562–4570.

175 Appendix to Chapter 3

Figure S1: Fitness of the ancestral strain relative to the reference strain. The two strains are from identical genetic backgrounds with the exception that the reference strain contains a GFP tag at the CWP2 gene. The fitness of the ancestral strain relative to the reference strain is as follows – in medium without stressors w=0.97±0.02, in salt stress w=0.93±0.04, and in oxidative stress w=1.03±0.07. Although the reference strain’s GFP tag might affect fitness in some environments, such changes do not affect our measurements, because the fitness changes of the evolved populations and ancestral strain were calculated relative to the reference strain.

176 Appendix to Chapter 3

Figure S2: Schematic diagrams showing hypothetical examples of several types of evolutionary gene expression changes. Panels (a-e) show physiological expression changes of a hypothetical gene before (solid line) and after (dotted line) evolutionary adaptation. The x-axis shows time (on a physiological time scale) after the organism has been exposed to a stressor; the y-axis shows the expression level of the gene. The types of evolutionary gene expression changes that are depicted here are as follows (a) Basal change in expression but no change in regulation (b) Change in induction, with and without basal change (c) Change in repression, with and without basal change (d) New induction, with and without basal change (e) New repression, with and without basal change. The gene expression change over time is assumed to be linear with time but it need not be linear.

177 Appendix to Chapter 3

178 Appendix to Chapter 3

179 Appendix to Chapter 3

180 Appendix to Chapter 3

Results S1: Fitness measurements in cycles of exposure to stressors

We carried out fitness measurements in two different cycles of exposure to stressors. In the first cycle, we exposed evolved populations to oxidative stress for 24 hours (ca. 10 generations), followed by 24 hours of salt stress (O→S cycle, see Methods). In the second cycle, we exposed the populations first to salt stress and then to oxidative stress (S→O cycle, see Methods). If asymmetric stress protection exists, we would predict that the fitness of populations in the O→S cycle is higher than that of populations in the S→O cycle. This is indeed the case for two of our populations. That is, in the O populations, fitness was significantly higher in the O→S cycle compared to the fitness in the S→O cycle (Mann Whitney U test, p=2×10-8; figure 4a and 4b, right-most data), and the same holds for the SO populations (Mann Whitney U test, p=0.0001; figure 4a and 4b, middle data).

Results S2: Genes that change either basal expression or regulation in S, O and SO populations

The evolutionary changes that we observe in gene expression are of two kinds – basal changes and changes in regulation. Basal changes refer to the change in expression of genes even in the absence of any stressor in the medium.

There are 81 genes in the S populations, 105 genes in the O populations, and 390 genes in the SO populations that show a basal expression change, including genes with a basal increase and a basal decrease in expression (figure S3a). Among all these genes, 50 genes are shared between all populations, 6 genes between S and O populations, 14 genes between S and SO populations, and 32 genes between O and SO populations.

Among the genes that change their regulation under salt stress in the evolved populations, there are 61, 132 and 186 such genes in S, O and SO populations respectively (figure S3b). There are 42 genes with changed regulation in salt stress that are shared between S, O and SO populations. In addition, 8 genes are shared between S and O populations, 4 genes between S and SO populations and 52 genes between O and SO populations.

Finally, there are 13, 348 and 180 genes that show change in regulation in oxidative stress in S, O and SO populations respectively (figure S3c). Out of all these genes, 3 genes are shared by all populations, 7 genes between S and O populations, and 87 genes between O and SO populations, whereas S and SO population do not share any common gene.

181 Appendix to Chapter 3

Figure S3: Number of common genes with evolutionary expression changes between S, O and SO populations. Evolutionary gene expression changes can be of two types – basal change and regulation change. (a) There are 81 genes in the S populations, 105 genes in the O populations, and 390 genes in the SO populations that show a basal expression change. Among all these genes, 50 genes are shared between all populations, 6 genes between S and O populations, 14 genes between S and SO populations, and 32 genes between O and SO populations. (b) There are 61, 132 and 186 genes that change their regulation under salt stress in S, O and SO populations respectively. There are 42 genes with changed regulation in salt stress that are shared between S, O and SO populations. In addition, 8 genes are shared between S and O populations, 4 genes between S and SO populations and 52 genes between O and SO populations. (c) There are 13, 348 and 180 genes that show change in regulation in oxidative stress in S, O and SO populations respectively. Out of all these genes, 3 genes are shared by all populations, 7 genes between S and O populations, and 87 genes between O and SO populations, whereas S and SO population do not share any genes with regulation changes in oxidative stress.

182 Appendix to Chapter 3

Results S3: Genes that changed regulation in salt and oxidative stress in both O and SO populations

In O populations, there are two genes, both belonging to the functional class 'Cellular transport, transport facilities and transport routes' (Güldener et al. 2005), that show a change in induction in salt as well as in peroxide. One gene, PUT4 is induced in response to osmotic stress in three physiological studies of the stress response (Posas et al. 2000; Rep et al. 2000; Yale and Bohnert 2001). The second gene, YDL199C, encodes a putative transporter protein that has been shown to reduce oxidative stress resistance when knocked out (Higgins et al. 2002). In addition, O populations harbor 33 genes, belonging to ten different protein functional classes, that show change in repression in salt as well as in peroxide. Among these 33 genes, 18 genes (SPB1, UTP5, LCP5, DBP3, PXR1, IMP3, DBP8, TRM2, SRP40, DBP9, UTP14, ESF2, YOR287C, RRS1, ARX1, KRI1, NOC2, and YLR063W) are repressed as part of a common environmental stress response (Gasch et al. 2000).

In SO populations, we observed 10 genes, from eight protein functional classes, with a change in induction in salt as well as in peroxide. One of these genes, PUT4 is shared with O populations. A second gene, DDR48, has been shown to be upregulated in salt as well as in peroxide in previous studies of physiological stress adaptation (Gasch et al. 2000; Posas et al. 2000; Rep et al. 2000; Causton et al. 2001; Yale and Bohnert 2001). Yet another gene, YOR152C, is up-regulated as part of the common environmental stress response (Gasch et al. 2000).

Also in SO populations, there are 30 genes, from 12 protein functional classes, that show a change in repression in salt as well as in peroxide medium. Among these 30 genes, 19 genes (MAK5, ENP1, ESF1, UTP5, LCP5, DBP3, UTP11, TRM2, SRP40, DBP9, DUS3, UTP14, RLP7, ARX1, KRI1, NOC2, YLR063W, YLR073C, and KRE33) have been shown to be repressed as part of a common environmental stress response during physiological adaptation (Gasch et al. 2000). Out of these 30 genes, 17 genes (ARX1, DBP9,YLR063W, SRP40, GFD2, YOR293C-A, UBC11, KRI1, DBP3, BAS1, YBL029W, TRM2, NOC2, YNL162W-A, UTP14, UTP5, and LCP5) also changed repression in salt and in peroxide in the O populations.

Results S4: Genes that change their regulation in O and SO populations have more similar functions than genes that change their regulation in S populations

We developed a measure of functional distance Df among sets of genes that indicates differences between known functional annotations of two sets of genes (see methods S4). Based on this measure, we compared functional distances among genes that change their regulation in O, S, and SO populations (see methods S4). The mean value of Df for these genes is significantly lower for O and SO populations than for the other two pairs of populations (Mann-Whitney U test, p=0.02 for comparison with Df between S and O populations; p=0.001 for comparison with Df between S and SO populations). The distribution of Df between O populations and SO populations has the lowest mean (S & O populations: Df = 0.0344±0.0396, O & SO populations: Df = 0.0221±0.0260, S & SO populations: Df = 0.0386±0.0444, figure S4).

183 Appendix to Chapter 3

Figure S4: Functional distances between evolved populations based on functional classification of genes with evolutionary expression change. We calculated a measure of the functional similarity of genes with changed expression between evolved populations using a distance measure Df described in the methods S4. Genes in the evolved yeast populations can in principle show six types of gene expression changes. For each of these six expression change categories, we determined the fraction of genes that fall into these functional categories (see methods S4). From these set of fractions, we calculated the functional distance given by Df between S and O populations, between S and SO populations and between O and SO populations (see methods S4). The data suggests that the genes changing their expression in O and SO populations are significantly more similar to each other in function than they are to genes that change their expression in S populations. The line in the box represents the mean functional distance, the box represents mean± 1 standard error (s.e.) and the whisker represents mean± 1 standard deviation (s.d.). The asterisk indicates a significant difference in fitness (Mann-Whitney U test).

184 Appendix to Chapter 3

Results S5: Gene expression changes specific to the SO populations.

In this analysis we asked whether an evolutionary signature of gene expression or regulation change exists specifically for SO populations. If so, the affected genes may help explain why SO populations have higher fitness than O populations under salt stress. Specifically, we analyzed variation in gene expression in the S populations, O populations and SO populations using principal component analysis (PCA). We carried out this analysis separately for genes that changed their basal expression (figure S5a), their regulation under salt stress (figure S5b), and their regulation under oxidative stress (figure S5c) (see methods S5). Figure S5a shows that basal gene expression changes in SO populations (red dots) are well-separated from basal changes in S and O populations. This separation holds to a much lesser extent for regulatory changes in salt stress (figure S5b). It is absent for such changes in oxidative stress (figure S5c). Thus, based on this analysis, most of the evolved gene expression change that is specific to SO populations should affect basal expression. This is also born out by the numbers of genes that change basal expression (figure S3). In total, 130 genes change their basal expression, excluding genes where this change is specific to the SO populations; 294 genes change their basal expression only in the SO population (69.34 percent). This percentage is significantly greater than the 38.10 percent of genes that change their regulation only in SO populations in response to salt stress (chi-squared test, p<0.0001, df=1) It is also significantly greater than the 26.39 percent of genes that change their regulation in only SO populations in response to oxidative stress (chi-squared test, p<0.0001, df=1)

We next studied the annotated functions of genes with changes in basal expression that are unique to SO populations. Based on Z-score calculation (see Methods), there are 393 genes that show basal expression change in SO populations, compared to 82 genes in S populations and 106 genes in O populations. We classified the products of these genes into functional categories, using the CYGD classification (Güldener et al. 2005). For each of these functional classes, we asked whether genes in a given population and expression change category occur preferentially in this class. To this end, we used an exact binomial test and corrected for multiple hypothesis testing (p<0.05, FDR<10%) (see methods S6). The results are displayed in a color matrix in figure S6. This matrix indicates whether a functional class (matrix rows) is enriched (yellow) or impoverished (blue) for genes in a given population and expression change category (matrix columns) are. Several functional classes are enriched or impoverished for genes whose changes are unique to SO populations (right-most 6 columns of figure S6). For example, genes with a decrease in basal expression are enriched in the classes 'cellular communication /signal transduction mechanisms', 'interaction with environment' and 'development'. Fewer such genes than expected by chance alone are involved in ‘protein synthesis’. And fewer genes with change in repression in salt are involved in ‘development’ than expected by chance alone. More such genes are involved in the 'biogenesis of cellular components'. In sum, the functions of genes whose expression or regulation changes in SO populations fall into multiple classes, some of which are enriched with genes only in SO populations.

Finally, we analyzed specific genes with basal expression changes unique to SO populations. Several of these genes have been implicated in the physiological stress response by previous studies. First, the gene FRT2 encodes a protein in the membrane of the endoplasmic reticulum. It is known to promote growth in conditions of high Na+ concentrations, under alkaline pH, or

185 Appendix to Chapter 3

under cell wall stress (Heath et al. 2004). FRT2 also showed increased induction to salt stress in SO populations. Second, the gene MSB2 encodes an osmosensor that acts in parallel to the SHO1 mediated stress signaling pathway (O'Rourke and Herskowitz 2002). Third, the gene SSK22 encodes the MAP kinase kinase kinase of the HOG1 signaling pathway (Posas and Saito 1998), which is the major signaling pathway of the osmotic stress response. Both MSB2 and SSK22, are members of the MAPK signaling pathway. Fourth, TRK2, encodes a component of Trk1p-Trk2p potassium transport system in yeast (Ko et al. 1990; Ko and Gaber 1991). Expression changes in this gene are likely biologically significant, because potassium transport systems are important for ion homeostasis under hyperionic stress (Maathuis and Amtmann 1999). Fifth, ARG82, causes increase in sensitivity to heat and decrease in ionic stress resistance when knocked out (Dubois and Messenguy 1994; Dubois et al. 2002). Sixths and sevenths, the genes FRE7 and HMX1, reduce oxidative stress resistance in yeast when knocked out individually (Higgins et al. 2002; Collinson et al. 2011). Moreover, there are seven genes (MFA1, MFA2, SDP1, ASG7, AGA2, FUS1, and KIN82) that were affected by environmental stress or osmotic stress in previous studies of physiological stress adaptation. Another seven genes (SOK2, RHO5, FAR1, BAR1, VHS3, SIR1, and AXL1) reduce hyperosmotic stress response when knocked out individually (Yoshikawa et al. 2009). Except for STE3 and FUS1, all the genes described above show a decrease in basal expression that is restricted to SO populations.

One intriguing observation is that there are 13 genes that encode pheromone receptors (STE2, STE3), are involved in pheromone signaling (STE5, STE12, STE18, SST2, GPA1, FUS3, STE4, PRM2, PRM5, and PRM6), or function as transcription factors for gene regulation in response to pheromones (KAR4). What could be the significance of a basal decrease in expression of these pheromone signaling genes? We do not have firm answers, but mention several possibilities. First, the changes in expression of these genes could merely be caused by changes in expression of other genes in SO populations. (There are 307 genes that show decreased basal expression in the SO populations and out of these 307 genes, 247 are unique to SO populations.) The second possibility is that genes associated with pheromone signaling might be associated with the stress response. This is not far-fetched. For example, expression of three genes, MFA1, MFA2 and FUS1, involved in pheromone signaling (Michaelis and Herskowitz 1988; Chen et al. 1997; Nelson et al. 2004), is affected in the physiological stress response (Yale and Bohnert 2001). Third, it is possible that there is cross talk between MAPK pathways for stress signaling, which might explain the changes in expression of these genes. Fourth, the genes responsible for stress adaptation might be controlled by several signaling pathways, including the pheromone signaling pathway. Thus, changes in the expression level of these genes might result in better regulation of downstream stress response genes.

186 Appendix to Chapter 3

Figure S5: Principal component analysis (PCA) of evolutionary changes in gene expression. To investigate the similarities and differences of evolutionary gene expression changes between S, O, and SO populations, we performed PCA based on all genes in the microarray data (see methods S5). The figure shows the results of PCA analysis for genes in the evolved populations that change their (a) basal expression (b) regulation in salt (c) regulation in peroxide. This analysis suggests that genes with basal expression changes in SO populations (red dots) are well- separated from genes with basal changes in S and O populations. This separation holds to a much lesser extent for regulatory changes in salt stress and it is absent for such changes in oxidative stress.

187 Appendix to Chapter 3

Figure S6: Protein functional classes that are enriched or impoverished for genes with evolutionary expression change. We classified genes with evolutionary change in expression in S, O and SO populations according to the CYGD functional classification (Güldener et al. 2005). Each row in the matrix shown below represents a protein functional class that is indicated on the right side of the figure and each column represents a type of evolutionary gene expression change in a specific kind of evolved population (indicated on top). For each matrix entry we performed a statistical test to determine whether the number of genes in the corresponding functional class could occur by chance alone, given the total number of genes in the column (see methods S6). A square is black if the number of genes observed in that class is not significantly different from what would be expected by chance alone. A square is yellow (blue) if the number of genes in that functional class is significantly higher (lower) than expected by chance alone (exact binomial test, p<0.05 and FDR<10%).

188 Appendix to Chapter 3

Results S6: Genes that that respond physiologically to salt and oxidative stress in SO populations Consider the set of genes that change expression in a population in response to both salt and oxidative stress, and refer to this set of genes as shared or overlapping between the two stress responses. We found that this set of genes is not significantly greater in SO than in O populations (chi-square test, p=0.70, df=1) (methods S2, figure S7). To explain this observation, which apparently contradicts what one would predict for the evolution of anticipation, we note that overlaps between two sets of genes can be influenced by different adaptive processes. For example, since both O populations and SO populations evolved cross-protection, it is possible that the overlap in O populations (365 genes) could be caused by evolved cross-protection in these populations, whereas the overlap in SO population could be caused by a combination of evolved cross-protection and anticipation (figure S7). If so, one would predict be that the identity of shared genes would differ between O populations and SO populations. Indeed, we found that there are 77 shared genes (out of 365 genes) in O populations (figure S7), that are unique, i.e, they are neither involved in the salt stress response nor in the oxidative stress response in any of the other populations (S populations, O populations, and ancestral strain). Similarly, out of 355 shared genes in SO populations (figure S7), 75 genes are unique to these populations. By comparison, S populations have only two such unique genes and the ancestral strain has five such unique genes. Thus, although the shared genes between SO and O populations do not differ in number, they differ in identity. This difference is consistent with the view that different adaptive processes – impossible to disentangle from our experiments – have contributed to the size and composition of the shared gene sets in O and SO populations.

189 Appendix to Chapter 3

Figure S7: Genes that respond physiologically to salt stress and oxidative stress. In the ancestral strain (black circles), there are 1117 genes that respond physiologically to salt stress and 143 genes that respond physiologically to oxidative stress. There are 59 genes in common between these two sets of genes. In S populations (green circles), there are 1258 genes that respond physiologically to salt stress and 224 genes that respond physiologically to oxidative stress. There are 94 genes that are in common between these two sets of genes. This number of common genes is significantly greater than in the ancestral strain (Large ‘<’ sign, asterisk; p=0.0104). In O populations (blue circles), there are 1302 genes that respond physiologically to salt stress and 629 genes that respond physiologically to oxidative stress. There are 365 genes in common between these two sets of genes. This overlap is also significantly greater than the overlap in the ancestral strain (p<10-16) and the overlap in S populations (p<10-16). In SO populations (red circles), there are 1368 genes that respond physiologically to salt stress and 538 genes that respond physiologically to oxidative stress. There are 355 genes in common between these two sets of genes. Again, this overlap is significantly greater than the overlap in the ancestral strain (p<10-16) and in S populations (p<10-16). However, this overlap is similar to the overlap in O populations (p=0.70).

190 Appendix to Chapter 3

Figure S8: Comparing overlaps between two sets of genes using a chi-square test.

191 Appendix to Chapter 3

192 Appendix to Chapter 3

Figure S9: Chi-square tests to compare the numbers of genes with changed regulation in salt stress and in oxidative stress between two populations.

193 Appendix to Chapter 3

Figure S10: Calculating functional similarity between two populations using the distance measure Df (methods S4).

194 Appendix to Chapter 3

Figure S11: Growth of ancestral yeast strain in salt stress and oxidative stress. Cell density (cells per ml) of the ancestral yeast strain after 24 hours of growth in (a) 0.5M NaCl and (b) 1mM H2O2. Cell numbers after 24 hours of growth were determined manually with haemocytometer.

(a)

(b)

195 Appendix to Chapter 3

Figure S12: Viability of the ancestral yeast strain. (a) Viability of ancestral yeast strain after 16 hours and 24 hours of growth in medium supplemented with 0.5M NaCl (b) Viability of ancestral yeast strain after 3 hours of growth in medium supplemented with various concentrations of H2O2. Viability of the ancestral yeast strain was determined by plating at 30°C and is calculated relative to the viability of the ancestral strain in medium without stressors.

(a)

Viability of yeast after 16 and 24 hrs of growth in 0.5M NaCl 100 90 80

70 60 50 40 30

20 Viabilityof yeast cells (%) 10 0 16hours 24hours

(b)

Viability of yeast after growth in hydrogen peroxide for 3 hrs

100 90 80 70

60

50

40 30 20

Viability ofViability yeast cells (%) 10

0 00.51 2 510 Hydrogen peroxide conc. (mM)

196 Appendix to Chapter 3

Figure S13: Mutant phenotypes, associated with stress response, of genes that show change in regulation in our evolved populations. We used phenotype data for null, reduction in function and over-expression mutants from Saccharomyces Genome Database (SGD) (http://www.yeastgenome.org/download-data/curation/phenotype_data.tab; Cherry et al. 2012). The x-axis distinguishes the populations and the class of genes with regulatory changes under the indicated condition. For example, “S-oxidative stress” refers to genes that changed their regulation under oxidative stress in populations evolved under salt stress (S populations). The y- axis shows the number of genes that can be associated with salt stress/osmotic stress response and oxidative stress response phenotype in the phenotype data from SGD for each category on the x-axis. For example, among the 13 genes that changed regulation in oxidative stress in the S populations, only one gene could be associated with the salt stress/osmotic stress response phenotype and one gene could be associated with the oxidative stress response phenotype from the phenotype data from SGD.

GO phenotypes of genes with change in regulation 30 Salt stress response 25 Oxidative stress response

20

15

10 Number ofNumber genes 5

0 S-salt S-oxidative O-salt O-oxidative SO-salt SO- stress stress stress stress stress oxidative stress

197 Appendix to Chapter 3

Figure S14: Localization of genes with increase in basal expression in the yeast chromosomes for (a) S populations (b) O populations and (c) SO populations. In all the populations, chromosome 7 and chromosome 15 contain a higher percentage of genes in this category, compared to the other chromosomes.

(a)

S populations 30

20

10 0

Percentage of genes Percentage r4 r7 1 2 3 5 6 hr1 hr 9 r1 r1 c chr2 chr3 ch chr5 chr6 ch chr8 c chr10 chr1 ch chr1 chr14 ch chr1

(b)

O populations 30 20 10 0

Percentage of genes 4 5 9 12 13 chr1 chr2 chr3 chr chr chr6 chr7 chr8 chr chr10 chr11 chr chr chr14 chr15 chr16

(c)

SO populations

30

20

10

0

Percentage of genes 1 2 3 4 6 7 8 9 hr hr hr hr hr hr hr hr 10 r15 c c c c chr5 c c c c hr c chr11 chr12 chr13 chr14 ch chr16

198 Appendix to Chapter 3

Table S1: List of unique genes in set A1 that are not present in set A2 from figure 5a

Gene name Function (in CYGD (Güldener et al. 2005)) (Common name) YOR145C (PNO1) Protein required for cell viability YOR287C Weak similarity to PITSLRE protein kinase isoforms YCR098C (GIT1) Glycerophosphoinositol transporter also able to mediate low-affinity phosphate transport YBR104W (YMC2) Member of the mitochondrial carrier (MCF) family - unknown function YNL227C (JJJ1) J-protein (Type III), similar to dnaJ-like proteins YGR035C Conserved hypothetical protein YKL172W (EBP2) Protein required for pre-rRNA processing and ribosomal subunit assembly YKR044W (UIP5) Ulp1 Interacting integral membrane protein YOR306C (MCH5) Transporter YNL162W-A Unknown function YML093W Nucleolar protein, component of the small subunit (SSU) processome (UTP14) containing the U3 snoRNA that is involved in processing of pre-18S rRNA YER127W (LCP5) Ngg1p interacting protein YFL065C Unknown function YDR101C (ARX1) Protein required for 60S pre-ribosome formation YHR148W (IMP3) Component of the U3 small nucleolar ribonucleoprotein YOR313C (SPS4) Sporulation-specific protein YLR063W Unknown function localised to cytoplasm YDR299W (BFR2) Protein involved in protein transport steps at the Brefeldin A blocks YKR092C (SRP40) Suppressor of mutant AC40 of RNA polymerase I and III YCL036W (GFD2) “Great for Full DEAD box protein activity”- unknown function, identified as a high-copy suppressor of a dbp5 mutation YOR315W (SFG1) putative transcription factor required for growth of superficial pseudohyphae YOR293C-A Hypothetical protein - identified by MS Proteomic Analyses YLR223C (IFH1) Pre-rRNA processing machinery control protein YNR054C (ESF2) Essential nucleolar protein involved in pre-18S rRNA processing YBR142W (MAK5) ATP-dependent RNA helicase YMR011W (HXT2) Glucose transporter YKR099W (BAS1) Myb-related transcription factor YBR247C (ENP1) Effects N-glycosylation, Protein associated with U3 and U14 snoRNAs, required for pre-rRNA processing and 40S ribosomal subunit synthesis YOR004W (UTP23) Essential nucleolar protein that is a component of the SSU (small subunit) processome involved in 40S ribosomal subunit biogenesis YKL099C (UTP11) U3 snoRNP protein YOR206W (NOC2) Crucial for intranuclear movement of ribosomal precursor particles

199 Appendix to Chapter 3

YNL002C (RLP7) Nucleolar protein related to ribosomal protein L7 YPL119C-A Hypothetical protein

200 Appendix to Chapter 3

Table S2: List of unique genes in set A3 that are not present in set A2 in figure 5a

Gene name (Common Function (in CYGD (Güldener et al. 2005)) name) YLR401C (DUS3) Member of dihydrouridine synthase family YBL108C-A (PAU9) Unknown function YOL136C (PFK27) 6-phosphofructose-2-kinase, isoenzyme 2

201 Appendix to Chapter 3

Methods S1: Do two sets of genes overlap to a greater extent than expected by chance alone?

Let us consider two sets of genes A and B that contain |A| and |B| genes, respectively, as well as a number of genes O=||AB∩ shared (or overlapping) between both sets. We would like to know whether the number of shared genes O could be observed by chance alone, if the two sets of genes were to be drawn at random from the yeast genome. To find out, we used a randomization test. We created samples X and Y of |A| and |B| genes, respectively, at random from those genes in the yeast genome for which gene expression data was available from the microarray (5668 genes in total). We then computed the number of genes Orand=|X ∩Y | in which X and Y 7 7 overlap. We repeated this procedure n=10 times, thus obtaining 10 values for Orand. We then established the fraction p (of these values in which O≥Orand, which yields a p-value for this randomization test. We note that this test is limited in that the smallest obtainable p-value is determined by the sample size n. We call the number O of shared genes significant if p lies below some pre-defined cutoff value, such as p<0.05.

Methods S2: Do two pairs of gene sets differ significantly in the genes they share?

Some of our analyses required us to compare the extent to which two pairs of gene sets shared genes. Specifically, consider a population 1 with two sets of genes A and B, with |A| and |B| members, as well as the number of members in their intersection |AB∩ | . Next consider a population 2 with two sets of genes C and D, and the number of members in their intersection|CD∩ |. We wanted to know whether the first two sets of genes C and D overlap to a significantly greater extent than the second two sets of genes, A and B, or vice versa (figure S8). To this end, we carried out the following two-part chi-squared test.

For population 1, the number of genes that is shared and not shared between the two sets is given by |AB∩ | and |AB∪−∩ | | AB | , respectively. The total number of genes is given by

N1=|AB∪ | (see figure S8). Similarly, in population 2, the number of genes that is shared and not shared between the two sets is given by |CD∩ | and|CD∪−∩ | | CD | , respectively, and the total number of genes is N2=||CD∪ (figure S8). In the first part of the test, we use the observations from population 1 as the expected values for population 2. Thus, the expected ||AB∩ ||AB∪−∩|| AB number of shared and non-shared genes would be N2 × and N2 × N1 N1 2 respectively. We calculate the value of χ1 (1 degree of freedom, df) as follows

2 2 ⎛⎞NAB×∩||⎛ N × (|| ABAB ∪−∩||) ⎞ ⎜⎟||CD∩−22⎜ (|| CD ∪−∩|| CD) − ⎟ 2 ⎝⎠NN1 ⎝1 ⎠ χ1 =+ N22×∩|| AB N × (|| AB ∪−∩|| AB)

NN11

202 Appendix to Chapter 3

and obtain a p-value that we designate as p1. In the second part of the test, we use the observations from population 2 as the expected values for population 1. Thus, the expected ||CD∩ ||CD∪−∩|| CD number of shared and non-shared genes would be N1 × and N1 × N2 N2 2 respectively. We calculate the value of χ2 (df=1) as follows

2 2 ⎛⎞N×∩|| CD⎛ N × (|| CD ∪−∩|| CD) ⎞ ⎜⎟||AB∩−11⎜ (|| AB ∪−∩|| AB) − ⎟ 2 ⎝⎠NN2 ⎝2 ⎠ χ2 =+ NCD11×∩|| N × (|| CD ∪−∩|| CD)

NN22 and obtain a p-value that we designate as p2 (figure S8). We report an overall p-value p=max(p1, p2) as a summary result from both tests.

Methods S3: Does the number of genes with altered regulation differ significantly between two populations?

We also used a two-part chi-square test to ask whether the number of genes with changed regulation in salt stress, and the number of genes with changed regulation in oxidative stress differ significantly between two populations. For this analysis, we considered genes that show changed regulation only in one of the stressors, but not genes that show changed regulation in both salt stress and oxidative stress. For example, let us assume that in population 1, there are P genes that show changed regulation only under salt stress, and Q genes that show changed regulation only under oxidative stress (figure S9). Similarly, in population 2, there are R genes that show changed regulation only under salt stress, and S genes that show changed regulation only under oxidative stress. This data gives rise to the 2x2 table shown in figure S9. In this table, the row totals are given by N1=P+Q and N2=R+S. The first part of the test uses the observed numbers from population 1 as the expected numbers for population 2. Specifically, the expected P Q number of genes in the two categories for population 2 are N2 × and N2 × . One can then N1 N1 2 calculate the value of χ1 as follows

22 ⎛⎞NP22××⎛⎞ NQ ⎜⎟RS−−⎜⎟ 2 ⎝⎠NN11⎝⎠ χ1 =+ NP22×× NQ NN 11 2 to obtain a p-value (p1) from the χ distribution with df=1.

203 Appendix to Chapter 3

In the second part of the test, we use the observed numbers from population 2 as the expected numbers for population 1. Specifically, the expected number of genes in the two categories are R S N × N × 2 1 and 1 . The value of χ2 is given by N2 N2

22 ⎛⎞NR××⎛⎞ NS ⎜⎟PQ−−11⎜⎟ NN χ 2 =+⎝⎠22⎝⎠ 2 NR×× NS 11 NN22

2 It yields a p-value (p2) from the χ distribution with df=1. We report an overall p-value p=max(p1,p2) as a summary result from both tests.

Methods S4: Calculation of functional similarity of genes with evolutionary expression change

To estimate the functional similarity of genes with changed expression in different populations, we classified genes with evolutionary change in expression (either basal change or change in regulation) in the evolved yeast populations according to the function of their protein products, using the CYGD protein functional classification (Güldener et al. 2005) with 18 different functional categories of genes. The CYGD classification (Güldener et al. 2005) contains 27 protein function categories, of which 9 categories do not contain any gene in yeast and were thus discarded from our analysis. Genes in the evolved yeast populations can in principle show six types of gene expression changes, namely a basal expression increase, a basal expression decrease, changed induction in salt stress, changed repression in salt stress, changed induction in oxidative stress, and changed repression in oxidative stress. For each of these six expression change categories, we determined the fraction of genes that fall into the 18 functional categories

aij, given by fij, = , thus constructing a 6×18 table of these fractions (with 108 entries in total) Ni 18 18 (see figure S10). We note that ∑ aNij, = i, thus ∑ fij, =1. In total we arrived in this way at three j=1 j=1 such tables, one for each of the S, O, and SO populations. From the table entries (figure S10), we calculated a pair-wise functional distance between two populations as

Dijf (, )()AB− =− |( f ij , ) A ( f ij , ) B | for 16≤ i ≤ , 11≤ j ≤ 8. Means and standard deviations of the resulting distributions of distances (based on all 108 table entries) are shown in figure S4. We used a non-parametric Mann-Whitney U test to ask whether these functional distances differ significantly among pairs of populations.

Methods S5: PCA analysis

To identify similarities and differences in gene expression in evolved yeast populations, we performed principal component analysis (PCA) on all the genes in the microarray data. To this

204 Appendix to Chapter 3 end, we represented the expression data of each population as a vector of gene expression in a space of N=5621 dimensions, as many as there are genes in our data set. Specifically, we used the following kinds of gene expression vectors in this analysis. The first is a vector (B)k of basal expression changes in replicate population k. The entries (Bi)k of this vector (corresponding to the basal expression change of gene i) are given by (Bi)k= ()EAiYPG,, k− iavg_ YPG . Here, (Ei,YPG)k represents the log2-transformed expression level of gene i in population k and in medium without any stressor. Ai,avg_YPG represents the log2-transformed expression level of gene i in the ancestral strain averaged over the replicates and in medium without any stressor. Relatedly, we define the vector (RX)k that describes the change in regulation of gene i in population k under stressor X as

(Ri,X)k= ()()(EEAAi, YPGX k−− i , YPG k i ,_ avg YPGX − i ,_ avg YPG ). Here, (Ei,YPGX )k represents the log2- transformed expression level of gene i in population k and in medium with stressor X (X=S in salt stress and X=O in oxidative stress), and Ai,avg_YPGX represents the log2-transformed expression level of gene i in stressor X in the ancestral strain, averaged over the replicates. We performed PCA analysis using the perl module (de Hoon et al. 2004) of the cluster program in (Eisen et al. 1998).

Methods S6: Testing for significant enrichment of gene function categories for differently expressed genes

We grouped differentially expressed genes into different classes using the Comprehensive Yeast Genome Database (CYGD) functional classification for yeast genes (Güldener et al. 2005). There are 18 functional categories in total, after discarding categories that do not contain any genes in yeast. First, we classified all the genes that are present in the microarray (in total 5621 genes) into the CYGD functional categories, and calculated the expected fraction of genes in all the 18 functional classes. Let us assume that mk genes are classified into class k, and as ρk the 18 mk expected fraction of genes in class k, i.e., ρk = for k=1 to 18 where Nm= ∑ k . Now, let us N k =1 assume that Yk genes with changed expression in a given experiment fall into class k, and define 18 the quantity AY= ∑ k , i.e., the total number of genes with changed expression. To ask if a given k =1 functional class is enriched or impoverished for genes with a given type of expression change, we performed an exact binomial test for each functional class as follows. We define a random variable X that describes the number of genes in class i under the null hypothesis that X is binomially distributed with parameters A and ρk (X~B(A,ρk)). If Yk

If Yk>A×ρk , the functional class contains more genes than expected by chance, and the p-value is A A! iA()−i pXkk=≥=Pr(Y )∑ ρρ k (1 −k ) iY= iAi!!()− k

205 Appendix to Chapter 3

We calculated p-values for all functional classes and for all different types of expression changes in our populations. We corrected for multiple hypothesis testing, with a False Discovery Rate (FDR) of less than 10 percent in all analyses using the Benjamini-Hochberg procedure (Benjamini and Hochberg 1995).

Supplementary references

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Royal Stat Soc, Series B (Methodological) 57: 289–300.

Causton HC, Ren B, Koh SS, Harbison CT, Kanin E, Jennings EG, Lee TI, True HL, Lander ES, Young RA. 2001. Remodeling of yeast genome expression in response to environmental changes. Mol Biol Cell 12: 323–337.

Chen P, Sapperstein SK, Choi JD, Michaelis S. 1997. Biogenesis of the Saccharomyces cerevisiae mating pheromone a-factor. J. Cell Biol. 136:251-269.

Cherry JM, Hong EL, Amundsen C et al. (22 co-authors). 2012. Saccharomyces Genome Database: the genomics resource of budding yeast. Nucleic Acids Res. 40:D700-705.

Collinson EJ, Wimmer-Kleikamp S, Gerega SK, Yang YH, Parish CR, Dawes IW, Stocker R. 2011. The yeast homolog of heme oxygenase-1 affords cellular antioxidant protection via the transcriptional regulation of known antioxidant genes. J Biol Chem. 286:2205-2214. de Hoon MJ, Imoto S, Nolan J, Miyano S. 2004. Open source clustering software. Bioinformatics 20:1453-1454.

Dubois E, Messenguy F. 1994. Pleiotropic function of ArgRIIIp (Arg82p), one of the regulators of arginine metabolism in Saccharomyces cerevisiae. Role in expression of cell-type-specific genes. Mol Gen Genet. 243:315- 324.

Dubois E, Scherens B, Vierendeels F, Ho MM, Messenguy F, Shears SB. 2002. In Saccharomyces cerevisiae, the inositol polyphosphate kinase activity of Kcs1p is required for resistance to salt stress, cell wall integrity, and vacuolar morphogenesis. J Biol Chem. 277:23755-23763.

Eisen MB, Spellman PT, Brown PO, Botstein D. 1998. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci. USA 95:14863-14868.

Gasch AP, Spellman PT, Kao CM, Carmel-Harel O, Eisen MB, Storz G, Botstein D, Brown PO. 2000. Genomic expression programs in the response of yeast cells to environmental changes. Mol Biol Cell 11: 4241–4257.

Güldener U, Münsterkötter M, Kastenmüller G et al. (20 co-authors). 2005. CYGD: the Comprehensive Yeast Genome Database. Nucl Acids Res. 33:D364-D368.

Heath VL, Shaw SL, Roy S, Cyert MS. 2004. Hph1p and Hph2p, novel components of calcineurin-mediated stress responses in Saccharomyces cerevisiae. Eukaryot Cell 3:695-704.

Higgins VJ, Alic N, Thorpe GW, Breitenbach M, Larsson V, Dawes IW. 2002. Phenotypic analysis of gene deletant strains for sensitivity to oxidative stress. Yeast 19:203-214.

Ko CH, Buckley AM, Gaber RF. 1990. TRK2 is required for low affinity K+ transport in Saccharomyces cerevisiae. Genetics 125:305-312.

206 Appendix to Chapter 3

Ko CH, Gaber RF. 1991. TRK1 and TRK2 encode structurally related K+ transporters in Saccharomyces cerevisiae. Mol Cell Biol. 11:4266-4273.

Maathuis FJM, Amtmann A. 1999. K+ Nutrition and Na+ Toxicity : The Basis of Cellular K+/Na+ Ratios. Annals of Bot. 84:123-133.

Michaelis S, Herskowitz I. 1988. The a-factor pheromone of Saccharomyces cerevisiae is essential for mating. Mol Cell Biol. 8:1309-1318.

Nelson B, Parsons AB, Evangelista M, Schaefer K, Kennedy K, Ritchie S, Petryshen TL, Boone C. 2004. Fus1p interacts with components of the Hog1p mitogen-activated protein kinase and Cdc42p morphogenesis signaling pathways to control cell fusion during yeast mating. Genetics 166:67-77.

O'Rourke SM, Herskowitz I. 2002. A third osmosensing branch in Saccharomyces cerevisiae requires the Msb2 protein and functions in parallel with the Sho1 branch. Mol Cell Biol. 22:4739-4749.

Posas F, Saito H. 1998. Activation of the yeast SSK2 MAP kinase kinase kinase by the SSK1 two-component response regulator. EMBO J. 17:1385-1394.

Posas F, Chambers JR, Heyman JA, Hoeffler JP, de Nadal E, Ariño J. 2000. The transcriptional response of yeast to saline stress. J Biol Chem. 275: 17249–17255.

Rep M, Krantz M, Thevelein JM, Hohmann S. 2000. The Transcriptional Response of Saccharomyces cerevisiae to osmotic shock. J Biol Chem. 275: 8290–8300.

Yale J, Bohnert HJ. 2001. Transcript expression in Saccharomyces cerevisiae at high salinity. J Biol Chem. 276: 15996–16007.

Yoshikawa K, Tanaka T, Furusawa C, Nagahisa K, Hirasawa T, Shimizu H. 2009. Comprehensive phenotypic analysis for identification of genes affecting growth under ethanol stress in Saccharomyces cerevisiae. FEMS Yeast Res. 9: 32–44.

207 Chapter 4

Increased gene dosage plays the most important role in the initial stages of evolution of duplicate TEM-1 beta lactamase genes

(To be submitted for publication)

Abstract Gene duplication is important in evolution, because it provides new raw material for evolutionary adaptations. Several existing hypotheses differ in their emphasis on gene dosage, sub-functionalization, and neo-functionalization as prominent causes in duplicate retention and diversification. Little experimental data exists on the relative importance of gene expression changes and genetic changes in the evolution of duplicate genes. Furthermore, we do not know how strongly the environment could affect this importance. To address these questions, we performed evolution experiments with the TEM-1 beta lactamase gene in E. coli to study the initial stages of duplicate gene evolution in the laboratory. We mimicked tandem duplication by inserting two copies of the TEM-1 gene on the same plasmid. We then subjected these copies to repeated cycles of mutagenesis and selection in various environments that contained antibiotics in different combinations and concentrations. Our experiments showed that gene dosage is the most important factor in the initial stages of duplicate gene evolution, and overshadows the importance of point mutations in the coding region.

Introduction Gene duplication is one of the fundamental driving forces of molecular evolution. Although it had been studied for a long time (Kuwada, 1911; Serebrovsky, 1938; Bridges, 1936; Stephens, 1951; Anderson and Roth, 1977; reviewed in Taylor and Raes, 2004), only the genomic era revealed its full extent, showing that about 20-60 percent of a genome’s genes can have duplicates (Himmelreich et al., 1996; Klenk et al., 1997; Tomb et al., 1997; Arabidopsis Genome Initiative, 2000; Rubin et al., 2000; Li et al., 2001; Kellis et al., 2004). Gene duplication, which occurs approximately as frequently as mutation (Lynch and Conery, 2000), is important for the evolution of new protein functions (Ohno, 1970; Hughes, 1994; Walsh, 1995), and may also

208 Chapter 4

facilitate speciation (Sidow, 1996; Spring, 1997; Lynch and Force, 2000a; Bailey et al., 2002; Simillion et al., 2002; Bowers et al., 2003; Dehal and Boore, 2005; Marques-Bonet et al., 2009). Most duplicate gene copies are lost by deletion or become pseudogenes (Rouquier et al., 1998; Rouquier et al., 2000; Balakirev and Ayala, 2003), yet many duplicate genes have persisted in the genomes of organisms in all branches of life (Tekaia and Dujon, 1999; Zhang, 2003; Vavouri et al., 2008), further underscoring the high incidence of gene duplication.

Three main classes of hypotheses exist to explain the retention of duplicate genes (reviewed in Hahn, 2009). The first hypothesis highlights the role of gene dosage. Because gene duplication increases the expression level of a gene, this increased expression can be beneficial and help retain both duplicates (Horiuchi et al., 1963; Kondrashov et al., 2002; Conant et al., 2007). Several biological examples highlight the role of increased dosage in the preservation of duplicate genes. Several duplicate genes in yeast metabolism may have been retained since an ancient genome duplication, because their higher dosage was beneficial in a glucose rich environment (Conant and Wolfe, 2007). Indeed, several ancient duplicate metabolic genes in yeast still show significant functional overlap (DeLuna et al., 2008), which is important, because it suggests a role of increased gene dosage in duplicate retention. Gene amplifications may also be beneficial in nutrient limited environments, for adaptation to stressors, and for resistance to antibiotics, presumably due to the increased gene dosage they cause (Straus, 1975; Otto et al., 1986; Matthews and Stewart, 1988; Nichols and Guay, 1989; Sonti and Roth, 1989; Ives and Bott, 1990; Kondratyeva et al., 1995; Brown et al., 1998; Riehle et al., 2001; Dunham et al., 2002; Reams and Neidle, 2003).

The second hypothesis emphasizes the evolution of new functions – neo-functionalization (Ohno, 1970; Hughes, 1994; Walsh, 1995; Zhang et al., 2002; Hooper and Berg, 2003; Hughes, 2005; Francino, 2005; Conant and Wolfe, 2008). Ohno first suggested that one copy of a duplicate gene pair experiences relaxed selection and can evolve a new function through accumulating mutations (Ohno, 1970). If this new function is beneficial for an organism, both original and new gene copies would be retained. Several models for neo-functionalization such as the Innovation- Amplification-Divergence (IAD) model (Bergthorsson et al., 2007; Soo et al., 2011; Näsvall et

209 Chapter 4

al., 2012), and the Escape from Adaptive Conflict (EAC) model (Des Marais and Rausher, 2008; Barkman and Zhang, 2009; Deng et al., 2010) have since been proposed.

The third hypothesis suggests sub-functionalization as a possible mechanism. Force et al. (Force et al., 1999) proposed that two copies of a gene acquire mutations that change the genes' activity in such a way that the two copies become complementary to each other in function. In consequence, both copies are required to maintain the original function (Duplication- Degeneration-Complementation or DDC model) (Force et al., 1999; Lynch and Force, 2000b; Lynch et al., 2001). Apart from these three main classes of hypotheses, theoretical studies have suggested diverse further scenarios for gene duplicate retention (Nowak et al., 1997; Wagner, 2000; Wagner, 2002; Gu et al., 2003; Kafri et al., 2006; Kafri et al., 2008).

Changes in gene dosage (gene expression) and in gene coding regions play a role in all three main hypotheses. Although comparative studies address the importance of gene dosage changes or sequence changes for the retention of duplicate genes (Kondrashov et al., 2002; Zhang et al., 2002; Rodríguez-Trelles et al., 2003; Tocchini-Valentini et al., 2005; Conant et al., 2007; Lynch, 2007; Tirosh and Barkai, 2007; Woolfe and Elgar, 2007; Des Marais and Rausher, 2008; Kleinjan et al., 2008; Deng et al., 2010), very little pertinent experimental evidence exist (Holloway et al., 2007; Näsvall et al., 2012). Moreover, we do not know how environmental conditions can influence the retention of duplicate genes via one mechanism or the other. Although some experimental work studied the evolution of duplicate genes (Holloway et al., 2007; Näsvall et al., 2012), none of it examined the relative importance of these mechanisms in the retention process, nor did it study the role of environmental conditions in the evolutionary outcome. To address these questions, we constructed a tandem duplication of the TEM-1 beta lactamase gene on a plasmid. We subjected the two duplicates to repeated rounds of mutagenesis and selection on the native substrate ampicillin, on novel substrates, and on a combination of native and novel substrates, to ask what role the chemical environments play in the retention of duplicate genes. We were especially interested in the role that gene dosage (and the resulting changes in gene expression) or changes in coding regions play in the initial stages of duplicate gene evolution. Do environmental conditions influence this role? We found that gene dosage increases plays the most important role in retaining duplicate genes, regardless of the antibiotic

210 Chapter 4

environment. They are affected by four molecular processes that could alter TEM-1 gene expression in our evolved populations.

Results Duplication confers fitness advantage To study the evolution of duplicate genes in the laboratory, we created a tandem duplication of the TEM-1 gene on a plasmid, where both duplicates had the same transcriptional orientation. We transformed this plasmid into E. coli DH5α cells and refer to the resulting strain as the ancestral D strain (AncD). We also constructed a control strain with only one copy of the TEM-1 gene on the plasmid (ancestral S or AncS) (figure 1, see Methods). We first wanted to know if the two strains differ in fitness. The answer is yes, using cell density after seven hours of growth in medium supplemented with antibiotics as a proxy for fitness (see Methods). Fitness decreased in both ancestral strains as we increased the ampicillin concentration in the medium. However, the AncD strain suffered a significantly smaller reduction than the AncS strain when exposed to ampicillin (figure 2a, p=0.03, ANOVA - see supplementary methods S1). This advantage of the double-copy strain was especially large at high concentration of ampicillin (>250µg/ml, figure 2a; supplementary table S1a). It is not surprising, because the AncD strain has a higher dosage of the gene encoding TEM-1 beta lactamase, and we would thus expect it to hydrolyze its native substrate ampicillin more efficiently. This difference should become more important at higher, more damaging concentrations of ampicillin, just as we observed.

Figure 1: Ancestral strains of our experiments. Our starting strains AncS and AncD are derivatives of E. coli DH5α cells that contain one (AncS) or two (AncD) copies of the TEM-1 gene on plasmid pUA66 (Zaslaver et al., 2006).

211 Chapter 4

In a next step, we asked whether this difference also exists in a novel substrate such as cefotaxime. We used cefotaxime because it has previously been shown that the TEM-1 gene can acquire activity towards cefotaxime by accumulating mutations (Stemmer, 1994; Zaccolo and Gherardi, 1999; Barlow and Hall, 2002). We repeated the experiment described above, starting from very low concentrations. As on ampicillin, increasing cefotaxime concentrations led to a fitness decrease in both ancestral strains (figure 2b; supplementary table S1b). However, unlike in ampicillin, the double-copy AncD strain did not have a significant advantage over the single- copy AncS strain (ANOVA, p=0.90).

We then asked whether the two antibiotics interact in reducing fitness, and whether this interaction favors the duplicate strain. To this end, we measured growth of the ancestral strains in medium supplemented with various concentrations of ampicillin and cefotaxime (figure 2c and supplementary figure S1). In low concentrations of ampicillin (50 µg/ml and 100µg/ml) supplemented with cefotaxime, the double-copy strain performed like the single-copy strain (ANOVA; p=0.37 in 50µg/ml ampicillin supplement; p=0.08 in 100 µg/ml ampicillin supplement; supplementary figure S1; supplementary table S1c). However, at larger ampicillin concentration, the double copy strain had higher fitness than the single copy strain (ANOVA; p=0.0003 in 150µg/ml ampicillin supplement; p=0.0009 in 200 µg/ml ampicillin supplement; figure 2c and supplementary figure S1; supplementary table S1c).

We note that the concentrations of ampicillin and cefotaxime required to reduce fitness to a given level is considerably lower when ampicillin and cefotaxime are both present in the medium than when only one of them is present. This observation suggests a synergistic effect of these two antibiotics (Gunnison et al., 1955; Moellering et al., 1971; Farrar and Newsome, 1973). In this case, higher gene dosage can be beneficial, because the AncD strain can hydrolyze ampicillin more efficiently and thus can withstand higher concentrations of cefotaxime. We made similar observations for imipenem, another novel substrate (supplementary figures S2 and S3; supplementary results S1). A different fitness measure, namely the rate at which cell density increases over time provides further evidence in support of our observations (supplementary figure S4; supplementary results S2). Taken together, the results show that two copies of the TEM-1 gene can cause benefits due to the higher gene dosage they provide.

212 Chapter 4

Figure 2: Gene duplication confers a significant fitness advantage on some antibiotic concentrations (AncS and AncD). To measure fitness, we cultured cells from the ancestral strains for 7 hours in medium supplemented with antibiotics, and calculated the fold-change in cell density (y-axis) on different concentrations (x-axis) of (a) ampicillin, (b) cefotaxime, and (c) cefotaxime with a constant 200 µg/ml concentration of ampicillin. The red curve shows the data for the AncD strain, and the blue curve shows the data for the AncS strain. Note a general decrease in fitness as antibiotic concentrations increase. On ampicillin (panel a) the AncS strain declined significantly more in fitness than the AncD strain (ANOVA, supplementary methods S1; p=0.03). No such advantage of the AncD strain existed on cefotaxime (ANOVA, p=0.90), but on cefotaxime supplemented with ampicillin (200 µg/ml), this advantage became more pronounced than on ampicillin alone (ANOVA; p=0.0009). Supplementary Figure S1 contains fitness data on further antibiotic concentrations.

213 Chapter 4

Experimental evolution We next asked whether the inherent fitness difference we observed between the AncD and AncS strains is maintained in the long term. To do so, we subjected the ancestral strains to 12 cycles of mutagenesis and selection. Briefly, we mutagenized plasmids in a mutator E. coli strain (see Methods), selected mutant offspring on various antibiotic-containing media (see Methods, figure 3), and iterated this procedure for twelve rounds of mutagenesis and selection. We refer to the evolved populations as EvoS (evolved from AncS) and EvoD (evolved from AncD). We performed each evolution experiment in 5 replicates for each of the two ancestral strains, and for each of seven selection conditions, creating a total of 70 replicate evolved populations (see Methods; supplementary figure S5). However, we will mainly discuss results from one selection condition, i.e., cefotaxime in combination with ampicillin, which gave rise to populations that we call CA lines (see Methods). This selection condition is of major interest, because it has been suggested to accelerate divergence between duplicate copies, where one copy would maintain the original function while the second copy would evolve a new function (Ohno, 1970, Bergthorsson et al., 2007). We also used several other selection conditions, and did so for two reasons. First, we wanted to ensure that our inferences from the CA lines are not only due to an effect specific to a particular novel antibiotic, namely cefotaxime. Second, we wanted to study the effect of different environmental selection conditions, to find out whether duplicate gene copies diverge in function or sub-functionalize depending on the environmental condition.

Evolved lines show increased fitness At the end of the evolution experiment, we measured the fitness of the evolved lines to see whether they had adapted to their respective selection conditions, and indeed they had. For the CA lines, which had evolved in cefotaxime and ampicillin, we measured fitness on several ampicillin and cefotaxime concentrations. Both the EvoS and EvoD lines showed a significant fitness increase in all combinations (table 1; figure 4 and supplementary figure S6a). These observations suggest that the evolved CA lines have adapted to ampicillin and cefotaxime during the course of the experiment. However, in none of these conditions did the EvoD lines show a higher fitness increase than the EvoS lines. To the contrary, in two conditions, the EvoS lines showed a significantly higher fitness increase than the EvoD lines (figure 4 and supplementary

214 Chapter 4 figure S6a; Wilcoxon rank sum test; p=0.007 in 200 ug/ul ampicillin and 0.003 ug/ul cefotaxime; p=1.2×10-5 in 200 ug/ul ampicillin and 0.006 ug/ul cefotaxime). We discuss the reasons below.

Figure 3: Experimental evolution design. Our evolution experiment had four steps (see Methods). Briefly, we transformed the plasmids containing single or double copy TEM-1 inserts into the mutator E. coli TB90 cells. We then isolated the mutagenized plasmids and transformed them into the E. coli DH5α strain. We then let these cells grow for 16 hours in LB medium supplemented with 25µg/ml of kanamycin. In the final step, we diluted the cells 10-fold into appropriate antibiotic selection medium, let them grow in the selection medium for 7 hours and subsequently isolated the plasmids. We repeated these four steps 12 times, thus subjecting the ancestral strains to 12 rounds of mutagenesis and selection. We refer to the populations thus evolved as the EvoS and EvoD lines. We emphasize that only the plasmids evolved in our experiment and not the cells, since we changed the host cells at each step of the experiment.

Fitness measurements in the other evolved lines suggested that most of the lines had adapted to their respective selection conditions (supplementary table S2; supplementary results S3; supplementary figures S7 to S12). And again, not only did the EvoD lines not increase their fitness beyond that of the EvoS lines, in some antibiotics concentrations, the EvoS lines acquired

215 Chapter 4

significantly higher fitness than the EvoD lines (supplementary results S3; supplementary table S2; supplementary figures S7 to S12). We note that even our ‘neutral lines’ – they were not exposed to antibiotic except for the kanamycin required to maintain the plasmid – also showed an increase in fitness in ampicillin. We discuss these phenomena in detail in the supplementary section (supplementary results S4).

Figure 4: Laboratory evolution significantly increases fitness. We measured the fitness of the CA lines (evolved on ampicillin and cefotaxime) relative to the AncD strain as a relative fold- change in cell density after 7 hours of growth (see Methods). A relative fitness of 1 would mean that an evolved line and the ancestral strain have identical fitness. We measured fitness of the CA lines in 200 µg/ml ampicillin + 0.006 µg/ml cefotaxime. Both the EvoS-CA lines and EvoD- CA lines show a significant fitness increase relative to the AncD strain (rf=5.12±1.05 for EvoS- CA lines; rf=3.25±0.69 for EvoD-CA lines). Note that the fitness increase in the EvoS-CA lines was significantly higher than that in the EvoD-CA lines (p=1.2×10-5, Wilcoxon rank sum test). Bars show mean relative fitness values (averaged over measurements in triplicate for each of the replicate lines) whiskers show one standard deviation (s.d.), and asterisks indicate a significant fitness difference.

216 Chapter 4

Plasmid copy number increase in the evolved lines We then turned our attention to investigate the changes at the molecular level that could lead to the increased fitness in these evolved lines, and might explain the difference in fitness between the EvoS and the EvoD lines. Because of recombinational instabilities associated with double- copy inserts, we had chosen not to re-clone the inserts in each step of the experiment, but to subject the whole plasmid sequence to evolutionary change. This means that plasmid copy numbers could change in the evolved lines, and indeed they did. To estimate changes in plasmid copy number, we isolated plasmids from the evolved lines and the ancestral strains, and compared their concentrations in an agarose gel, while accounting for variation in cell number before plasmid isolation (see Methods). The ancestral plasmid contains the SC101 origin of replication and is present inside the bacterial cell in 5 to 10 copies (Hasunuma and Sekiguchi, 1977; Armstrong et al., 1984). Plasmid copy numbers increased significantly in both the EvoS- CA and EvoD-CA lines (figure 5; 6.29±2.08 (s.d.) fold in the EvoS-CA lines; 8.69±1.31 (s.d.) fold in the EvoD-CA lines, based on measurements from 5 replicate lines). The single-copy and double-copy lines did not differ significantly in their plasmid copy number increase (Wilcoxon rank sum test, p=0.09). We also observed plasmid copy number increases in all other evolved lines (supplementary figure S13; supplementary results S5; supplementary table S6). They suggest that the accompanying increase in gene dosage is likely to be beneficial for the cells under all conditions employed in our experiment.

We note that the neutral lines also show increase in plasmid copy number. The likely explanation is that increase in copy number in these lines occurs due to selection unrelated to the selection on the TEM-1 gene. If some plasmids in our experiment have higher copy number, they would have higher chance of being transformed and propagated in E. coli cells in each round of experiment (supplementary results S5).

We also observed high frequency SNPs in the origin of replication of plasmids in all the evolved lines where we sequenced the origin (supplementary results S5). These SNPs are likely to be responsible for the increased plasmid copy number.

217 Chapter 4

Figure 5: In CA lines plasmid copy number increased several fold. We estimated the change in plasmid copy number in the evolved lines compared to the ancestral strain (see Methods). The horizontal axis shows the names of the evolved lines in which the changes were measured, and the vertical axis shows the estimated fold-change in plasmid copy number. We observed a 6.29±2.08-fold increase in copy number in the EvoS-CA lines and a 8.69±1.31-fold increase in the EvoD-CA lines. The two lines showed no significant difference in the copy number increase (p=0.09). Columns represent means and whiskers indicate one standard deviation (s.d.).

Loss of one copy in the EvoD lines counter increase in gene dosage A second molecular change that we observed in our populations is a loss of one TEM-1 copy in many individuals of the double-copy (EvoD) lines. Thus, even though we had taken care to use recombination-deficient E.coli strains, and had avoided recombinogenic re-cloning of double- copy constructs during each round, as well as recombinogenic PCR-based mutagenesis, gene loss had occurred at the end of the evolution experiment. To estimate its extent, we performed a simple PCR based screen (see Methods) to determine the fraction of individuals in the EvoD lines that still retained duplicate gene copies. It revealed that 7 to 25 % of the individuals in the five replicate EvoD-CA lines retained duplicate copies of the gene (figure 6). Gene loss also occurred in all other EvoD lines. Its incidence varied widely among replicates and conditions (supplementary figure S14; supplementary table S7).

218 Chapter 4

Figure 6: Some initially double-copy individuals lose one copy of the TEM1 gene in EvoD- CA lines. The vertical axis shows the percentage of individuals that still retained duplicate copies of the TEM-1 gene (see Methods), and the x-axis shows the biological replicate of the EvoD-CA lines for which we measured duplicate retention. The percentage of individuals with retained duplicates varied from 7% in the EvoD-CA2 replicate to 25% in EvoD-CA4 replicate.

Gene loss would lead to reduced gene dosage in the EvoD lines, and its high incidence in the CA lines (>50% of individuals) could have two reasons. First, selection may favor individuals with a single TEM-1 copy. The single-copy individuals may have risen to high frequency during the evolution experiment for this reason. Second, the loss of one gene copy from individuals with duplicate genes may simply be very frequent, such that more and more individuals lose gene copies over the course of the evolution experiment. In other words, gene copy loss could be driven by selection or by frequent recombination.

219 Chapter 4

Figure 7: Fitness of the clones from the EvoD-CA lines. From the EvoD-CA lines we isolated clones that still contained duplicate genes (EvoD-CA-D) and clones that had lost one TEM-1 copy (EvoD-CA-S). We then measured fitness of these clones in various combinations of ampicillin and cefotaxime concentrations, as indicated on the horizontal axis. We performed this analysis for 33 clones containing duplicate genes and 33 clones containing a single gene copy. In medium containing 100 µg/ml ampicillin and 0.009 µg/ml cefotaxime, and in medium containing 200 µg/ml ampicillin and 0.009 µg/ml cefotaxime, the EvoD-CA-D and EvoCA-S clones showed a significant fitness increase relative to the ancestor, and this increase was significantly higher in the EvoD-CA-D clones compared to the EvoD-CA-S clones (p=0.012 and p=0.003 respectively). Columns represent means and whiskers indicate one standard deviation (s.d.). The red asterisk indicates a significant difference in fitness.

The following observations argue for the second scenario of frequent copy loss. First, we noted that a substantial fraction of the immediate offspring of clones that contained double-copy plasmids themselves harbored single copy plasmids (supplementary figure S23). Second, we measured the fitness of EvoD-CA lines, and asked whether the clones that had retained duplicate gene copies (referred to as EvoD-CA-D lines) have higher fitness than the clones that had lost a copy (referred to as EvoD-CA-S lines). Although the rapid plasmid loss we observed complicates this analysis, it can still give us some indication of the fitness difference between these two different types of clones. We measured the fitness of 33 clones each for the EvoD-CA- S and EvoD-CA-D lines in medium supplemented with ampicillin and cefotaxime (figure 7). The EvoD-CA-D lines showed significant higher fitness than the EvoD-CA-S lines in medium supplemented with 100µg/ml ampicillin and 0.009 µg/ml cefotaxime (p=0.01), as well as in

220 Chapter 4

medium with 200µg/ml ampicillin and 0.009 µg/ml cefotaxime (p=0.003; figure 7; table 1). We note that this observation is consistent with the advantage that double copy TEM-1 genes provide over single copy TEM-1 genes in the ancestral strain (figure 2a, 2c and supplementary figure 1). A third, indirect line of evidence comes from the fact that duplicate copies were retained at all through the end of the 12-round evolution experiment. Loss of one TEM-1 copy creates plasmids of smaller size that can transform host cells at higher efficiency than the larger double-copy plasmids (Hanahan, 1983). Thus, if duplicate genes were not beneficial, it is likely that all of the duplicate copies would be lost in the EvoD lines during the evolution experiment. The mere fact that many individuals retained duplicate copies argues that they are beneficial.

Taken together, all these observations suggest that the loss of one gene copy is caused by the high frequency of loss in the EvoD lines, rather than by the selective advantage of a single TEM- 1 gene copy in evolved lines.

EvoD lines have a unique regulatory mutation

Because we expected that point mutations in our TEM-1 genes might influence evolutionary adaptation, we sequenced the coding and near-upstream regions of TEM-1 genes in both single and double-copy inserts using 454 high-throughput sequencing (Margulies et al., 2005). We were especially interested in mutations that rise to high frequency during the evolution experiment (see Methods), because they indicate the action of positive selection. To identify them, we sequenced TEM-1 genes in the EvoS and EvoD lines after 1, 4, 8 and 12 rounds of mutagenesis and selection. For the EvoD lines, we only sequenced the individuals that had retained duplicate genes (see Methods). We observed several mutations at low population frequency (<5%) in all the evolved lines. However, we found only one high frequency mutation. It occurred only in the individuals with duplicate gene copies in the EvoD lines. This mutation was a T→ C transition and mapped to the upstream region of the gene copy on the left, as indicated by the red asterisk in figure 8a (T-53C at position -53 relative to +1 at the transcription start site). This mutation reached a frequency of almost 80% in two of the five replicate EvoD-CA lines (EvoD-CA1 and EvoD-CA2; figure 8a) as well as in several other EvoD lines (supplementary figure S15). Also, the frequency of this mutation was highest after round 8 in several EvoD lines and decreased

221 Chapter 4

thereafter (figure 8a and supplementary figure S15). In addition, the mutation was present in one of the starting clones of the EvoD lines (AncD clone 9, that gave rise to EvoD-CA4; supplementary figure S19), but its frequency decreased in this population from 80 percent after round one to a very low frequency of less than 1 percent at the end of round 12 ( figure 8a). Similar trends existed in the other EvoD lines, except for the neutral lines, where this mutation persisted at high frequency at round 12 (supplementary figure S15).

The features of this mutation indicate how it affects fitness. First, its occurrence in the upstream region suggests a regulatory role. Second, it rose to high frequency only in the double-copy EvoD lines, which suggests that the single-copy EvoS lines do not tolerate it. Both features could be explained if the mutation caused a reduction in the expression of the TEM-1 copy next to it, a reduction that EvoD lines can tolerate, because they have a second, intact TEM-1 gene with high expression. Third, the frequency of this mutation decreased with increasing antibiotic concentration beyond round 8 of the evolution experiment. This again suggests a role in down- regulation the expression of one of the copies.

To test the fitness effect of this mutation, we reconstructed it in the AncS strain using site- directed mutagenesis (see Methods). We also reconstructed a control with the ancestral sequence using the same method, to ensure that the process of reconstruction did not bias our results. We measured the fitness of the mutant and this ancestral control in medium supplemented with 0, 100 and 200 µg/ml of ampicillin. As would be expected from a down-regulating mutation, we saw a significant fitness decrease in the mutant compared to the ancestral control at 100 and 200 µg/ml of ampicillin (p=0.0001 in 100 µg/ml ampicillin and p=1.6×10-5 in 200 µg/ml ampicillin; figure 8b), thus suggesting that this regulatory mutation reduces gene expression of one of the TEM-1 copies in the EvoD lines. The observation that the mutation shows no fitness differences in the absence of ampicillin (figure 8b) further supports this assertion. We note that we measured fitness here at antibiotic concentrations that differed from original experimental conditions. However, this does not affect our inference of the general effect of the regulatory mutation on gene expression through fitness measurements in ampicillin.

222 Chapter 4

Figure 8: (a) Change in frequency of the regulatory mutation T-53C. We sequenced plasmid inserts in the evolved lines to identify mutations, and observed only one high frequency mutation. It occurred only in the EvoD lines, and had high frequency only at some time points. It maps to the upstream region of the promoter of the left copy of the TEM-1 gene, as indicated by the asterisk to the left of the boxed region with the red P (promoter) and blue TEM-1 (coding region) lettering. The graphs below show the mutation’s frequency (vertical axes) in the EvoD-CA lines at different rounds of the experiment (horizontal axes). The frequency of this mutation was highest at round 8 in the EvoD-CA1, EvoD-CA2 and EvoD-CA3 lines (approximately 80%, 60% and 20% respectively) and had decreased dramatically by round 12. The mutation was present in the starting clone for the EvoD-CA4 line, persists to high frequency after round one, and decreases in frequency thereafter (from >80% in round 1 to <1% in round 12). The mutation occurred only in low frequency (<5%) in the EvoD-CA5 line at rounds 1, 4 and 8. (b) Fitness of the reconstructed mutant. To test the effect of the mutation T-53C on the fitness of the EvoD lines, we reconstructed this mutant (“Mut”) in the ancestral strain (AncS) by site directed mutagenesis (see Methods). We also reconstructed a control with the ancestral sequence, using the same method to ensure that the method of reconstruction did not bias our results (“Anc”). We then measured the fitness of the reconstructed mutant and the control on 0, 100 and 200 µg/ml of ampicillin. Fitness was significantly lower in the mutant than in the ancestral reconstruction on 100 and 200 µg/ml of ampicillin (p=0.0001 and p=1.6×10-5 respectively), but not in the absence of ampicillin. Columns represent means and whiskers indicate one standard deviation (s.d.).The red asterisk indicates a significant fitness difference in fitness.

223 Chapter 4

Table 1: Relative fitness values of the EvoS-CA, EvoD-CA, EvoD-CA-S and EvoD-CA-D lines. The table shows the fitness of the evolved lines in medium supplemented with ampicillin and cefotaxime and relative to the fitness of the AncD strain (see Methods). A relative fitness significantly greater than 1 suggests a fitness increase in the evolved populations. We tested whether the relative fitness is significantly greater than 1 using a one-sample Wilcoxon signed rank test. The p-value resulting from this test is shown in the rightmost column.

Antibiotic Relative fitness (rf) p-value (Wilcoxon Line concentrations (µg/ml) mean±sd signed rank test) EvoS-CA Amp 200+Cef 0.006 5.12±1.05 6.1×10-5 EvoD-CA Amp 200+Cef 0.006 3.25±0.69 6.1×10-5

EvoD-CA-S Amp100+Cef0.006 3.24±0.76 <2.2×10-16 EvoD-CA-D Amp100+Cef0.006 3.28±0.76 <2.2×10-16 EvoD-CA-S Amp100+Cef0.009 3.80±1.39 <2.2×10-16 EvoD-CA-D Amp100+Cef0.009 4.62±2.15 <2.2×10-16 EvoD-CA-S Amp200+Cef0.006 7.20±4.28 <2.2×10-16 EvoD-CA-D Amp200+Cef0.006 7.60±4.12 <2.2×10-16 EvoD-CA-S Amp200+Cef0.009 4.52±1.83 <2.2×10-16 EvoD-CA-D Amp200+Cef0.009 5.76±2.85 <2.2×10-16

Another possibility is that this regulatory mutation reduces plasmid stability or replicability, and thus reduces fitness. However, this is not the case. In the EvoD-N3, EvoD-N4 and EvoD-N5 lines, this mutation was present at high frequency (>30%) and the copy numbers of the plasmids were also very high compared to the ancestor (~14 fold increase; supplementary figures S13 and S15; supplementary table S6).

Loss of one copy along with low frequency mutations can explain the lower fitness of the EvoD lines

We had expected the EvoD lines to evolve higher fitness than the EvoS lines, in accordance with existing hypotheses about duplicate gene evolution (Ohno, 1970; Bergthorsson, 2007). Surprisingly, the opposite had occurred in several lines (figure 4; supplementary figures S6). One

224 Chapter 4

candidate explanation relates to the ability of duplicate genes to buffer the effects of otherwise deleterious mutations: Loss-of-function mutations in one gene copy should be neutral or tolerable as long as the other copy provides the needed function. (Ohno, 1970; Wagner, 2000; Gu et al., 2003; Kellis et al., 2004; Hughes, 2005). However, if one of the copies is lost after such mutations have occurred, the deleterious effects of these mutations would be exposed and would reduce fitness (figure 9). Over time, such gene deletions would disappear from the population except in the following circumstances. First, these deletions are created at a very high rate, as we observed in our experiment (figure 6). Second, in a heterogeneous population that contains fitter individuals which hydrolyze antibiotic at a higher rate than less fit individuals, selection can be weakened in the later stages of the selection experiment, because most of the antibiotic may have been hydrolyzed by that time. Such weaker selection may lead to a slower elimination of single- copy individuals with deleterious mutations.

Figure 9: Loss of buffering in individuals with only one copy of the TEM-1 gene in the EvoD lines. Individuals with duplicate copies of the TEM-1 gene may be able to withstand some mutations that would otherwise be deleterious in a single copy of the gene. If one of the copies accumulates a deleterious mutation (indicated by the red asterisk) the second copy would be able to buffer its deleterious effects. However, this deleterious effect would be exposed when the second copy is lost.

In other words, we postulate that the double-copy lines have lower fitness, because their frequent loss of gene copies exposes deleterious mutations that have occurred since the experiment’s

225 Chapter 4

beginning. To test this hypothesis, it is necessary to show that deleterious mutations can be tolerated better in the EvoD lines. If double-copy lines can indeed buffer deleterious mutations better, then we would expect a significantly higher sequence diversity in EvoD than in EvoS lines, at least at some time points during the evolution experiment. This is indeed the case at round 8 in EvoD-CA and EvoS-CA (Wilcoxon rank sum test, p=0.03; supplementary figure S16f). We also determined ratios of non-synonymous to synonymous substitutions, but for several reasons (supplementary results S8) they are not suitable for inferring selection on the TEM-1 genes in our experiments.

Finally, another line of evidence, which has been discussed earlier (figure 7), supports the notion that the frequent loss of gene copies is responsible for the lower fitness of EvoD lines. The EvoD-CA-D clones that have retained two TEM-1 copies have higher fitness than single-copy EvoD-CA-S, just as they had at the beginning of the experiment (figure 2 and supplementary figure S1).

In sum, these observations suggest that the lowered fitness of EvoD lines is caused by a combination of deleterious mutations and frequent gene copy loss in these lines.

Discussion To test for the relative importance of expression changes and genetic changes in the evolution of duplicate genes, we constructed a tandem duplication of the TEM-1 beta lactamase gene on a plasmid in E. coli. Initial fitness measurements suggested that cells with two copies of the gene (AncD strain) have greater fitness than cells with only one copy (AncS strain) in antibiotic- containing media. These results indicate an advantage of increased TEM-1 gene dosage in the AncD strain.

We then subjected the plasmids to 12 rounds of experimental evolution in the form of mutagenesis and selection. We used several antibiotic conditions to investigate the effect of the selection environment on the outcome of the evolution experiment, and observed a fitness increase in the evolved lines. In some conditions, the EvoS lines had higher fitness than the EvoD lines.

226 Chapter 4

While investigating the molecular mechanisms behind this fitness increase, we encountered four factors that could affect fitness by changing TEM-1 gene dosage and, as a result, gene expression. First, we observed that gene duplication can increase fitness because it increases gene dosage (figure 1). Second, during the evolution experiment we saw an average 6 to 8 fold beneficial increase in plasmid copy number (figure 5), which also increases gene dosage. We observed high frequency SNPs in the origin of replication of plasmids in the evolved lines. These SNPs are likely to explain the increased plasmid copy number in the evolved lines, although none of them match previously known copy number mutants for this origin of replication (Armstrong et al., 1984). Third, we saw a frequent loss of one gene duplicate in the EvoD lines (figure 6), which reduced not only gene dosage but also fitness (figure 7). Finally, we found a high frequency, putative regulatory mutation in the EvoD lines that may have down-regulated expression of one TEM-1 gene copy (figure 8). It is also relevant here that no point mutations in coding regions (supplementary figures S16 and S17; supplementary results S6) rose to high frequency during the experiment, suggesting that their contributions to improving fitness were small. Taken together, these observations suggest that an increase in gene dosage through increased plasmid copy number was the most important factor behind the fitness increase in our evolved lines.

We observed that the ‘neutral’ lines which did not experience any antibiotic selection, also showed an increased fitness in ampicillin, which was likely due to the increased plasmid copy number in these lines. This increase can occur due to selection at the plasmid level that is unrelated to selection on the TEM-1 gene, where a plasmid with higher copy number can be preferably transformed and propagated (supplementary results S5). Was such selection at the plasmid level responsible for all the adaptations that we see in our evolved lines? The answer is no, because we observed several distinct changes in the lines selected in antibiotics compared to the neutral lines (see supplementary results S7 for details). First, plasmid copy number increases in the EvoD lines selected in novel antibiotics, and in combinations of native and novel antibiotics were different in extent to copy number increases in the neutral EvoD lines (supplementary figure S13). This could be due to detrimental effects of either high cost of TEM- 1 gene expression, or high metabolic cost associated with high plasmid copy numbers in the selection conditions. Second, apart from the EvoD-A↑ lines, in all other EvoD lines selected in

227 Chapter 4

antibiotics, one gene copy was lost at a higher rate than in the EvoD-N lines (supplementary figure S14), which may be caused by higher instability of double-copy inserts in antibiotic medium. Finally, in almost all the EvoD lines, the regulatory mutation T-53C rose to high frequency at round 8, and subsequently dropped to low frequency at round 12, except in the neutral lines (supplementary figure S15). We speculate that this regulatory mutation might down-regulate expression of one TEM-1 gene copy, and if so, it was beneficial until round 8 but became detrimental with increasing antibiotic concentration beyond round 12. In the neutral lines, this mutation was retained at high frequency. This mutation could be beneficial in the EvoD-N lines, which also show very high fold-increase in plasmid copy number, by reducing expression cost.

Gene loss coupled with low frequency deleterious mutations could explain the lower fitness we observed in EvoD lines compared to EvoS lines in some conditions. Duplicate genes would be able to tolerate deleterious mutations in one copy if the other copy remains functional. Indeed, we observed higher sequence diversity in double copy lines at round 8, suggesting that the double copy lines could buffer deleterious mutations better. However, the deleterious effects of these mutations would be exposed if the functional copy is lost. These deleterious mutations, when present in one copy of the TEM-1 gene, would be removed from a population except in two cases. First, the mutations might be created at a high frequency, which was indeed the case in our experiment (figure 6). Second, these mutations might be created at a later stage of the experiment, when antibiotic concentration in the medium has become lower due to hydrolysis by fitter individuals of the population. Taken together, our observations suggest that high gene loss coupled with deleterious mutations brought down the overall fitness of the EvoD lines.

How can factors that increase gene dosage, such as an increasing plasmid copy number, and factors that reduce gene dosage, such as gene loss and our putative regulatory mutation, exist simultaneously in our experiment? There are several possible answers. First, it is very likely that gene loss is just a by-product of tandem duplication, and that antibiotic selection does not increase the frequency of plasmids with single copy of the TEM-1 gene over the course of the evolution experiment. This idea is supported by the observation that close proximity of two regions of almost identical sequence can cause deletion of one copy even in recombinase

228 Chapter 4

deficient strains (Lovett et al., 1993). It is further supported by our observation that gene loss occurs at very high rate in our experiment, and cell carrying duplicate genes have fitness advantage over single gene, even at the end of the experiment (figure 7).

A second reason for co-existence of all these factors could be that an increase in gene dosage is beneficial up to a certain extent but detrimental beyond that, such that dosage level reduction can become beneficial. Two observations are consistent with this possibility. First, the putative regulatory mutation T-53C decreased in population frequency, as we increased the antibiotic concentration after round 8. The mutation may still have been beneficial for the cells at round 8, but became detrimental as we increased antibiotic concentrations after round 8. Second, we saw a very high copy number increase only in the EvoD lines without selection, and in the lines selected in ampicillin (supplementary figure S13), but not on in EvoD lines selected in novel substrates or in combinations of native and novel substrates. Taken together, it is likely that all these factors coexist because they help in adjusting the dosage level to optimum in the EvoD lines.

Clonal interference can be a reason behind the decrease in frequency of the T-53C mutation in several EvoD lines beyond round 8 (Miralles et al., 1999; Kao and Sherlock, 2008). We did not observe any high frequency mutation in TEM-1 genes, which would be required for such clonal interference, but note that such mutations could have occurred in the origin of replication or elsewhere in the plasmid which we did not sequence. We note that in our experiment, the duplicate gene was already fixed in the population and we were interested in its subsequent evolution. In comparison, in nature, gene duplicates often arise in a single or few cells and go to fixation in the population. Our study shows that duplicate genes can confer a significant fitness advantage in some environments over single copy gene. This advantage can help in fixation of the duplicate in the population. Indeed, it has been shown that the fixation of a duplicate gene can require positive selection (Clarke, 1994; Moore and Purugganan, 2003).

In sum, our study shows that an increase in gene dosage is initially the most important factor in the evolution of duplicate TEM-1 beta lactamase genes under the influence of antibiotics. This

229 Chapter 4

increase in gene dosage occurs through an increase in plasmid copy numbers, which creates a similar effect as gene amplification on a chromosome. However, in our case, the genes are present on different plasmids. This increase in gene dosage can provide an initial benefit to cells, and could pave the way for neo-functionalization under appropriate selection conditions through an accumulation of mutations (Andersson and Hughes, 2009; Näsvall et al., 2012).

Methods Bacterial strains and medium We used the E. coli strain DH5α for initial fitness measurements, for our evolution experiment, and for the fitness measurements after the evolution experiment. Because the mutagenic polymerase chain reaction that is often used to introduce genetic variation into evolving genes is highly recombinogenic, we used instead a mutator E. coli strain (E. coli TB90) for mutagenesis. This mutator strain, which we constructed in the laboratory, has the genetic background

∆recA::frt PBAD-dinB (in E. coli MG1655). Mutagenesis can be induced with the addition of L- (+) arabinose (Sigma) and the mutation rate can be controlled up to certain extent (supplementary methods S2; supplementary table S3). For all E.coli cultures, we used LB medium (Becton-Dickinson). All E. coli strains were grown at 37°C.

Construction of duplicate copy We cloned the gene TEM-1 beta lactamase gene along with a promoter region from the pBR322 plasmid (Sutcliffe, 1978) in the pUA66 plasmid (Zaslaver et al., 2006), using the recognition sites for restriction enzymes XhoI and BamHI (New England Biolabs). We transformed this plasmid into E. coli DH5α cells, referring to the resulting strain as the ancestral S (AncS) strain, where S stands for “single copy”. We then inserted a second, identical copy of the TEM-1 gene into the same plasmid using the BamHI restriction site, and transformed the resulting construct into E. coli DH5α cells. We refer to this strain as the ancestral D (AncD) strain (figure 1). We confirmed the sequences of the inserts using Sanger sequencing. We also performed initial fitness measurements on these lines. We then amplified the inserts of the AncS and AncD strains using PCR and recloned them into a modified plasmid pUA67 (supplementary figure S18) using the restriction sites for EcoRI and HindIII (New England Biolabs). We chose 10 clones for each of the two lines as starting populations for our evolution experiment (supplementary figure S19).

230 Chapter 4

Initial fitness measurements We measured the fitness of the ancestral strains (AncS and AncD) in several antibiotics and in several concentrations for each of the antibiotics. Briefly, we measured fitness in ampicillin (Sigma), the native substrate for the TEM-1 beta lactamase enzyme; in the novel substrates cefotaxime (Sigma) and imipenem (Sigma); and in combinations of native and novel substrates (ampicillin in combination with cefotaxime, and ampicillin in combination with imipenem). We used 25µg/ml kanamycin in each growth condition for the selection of the plasmid marker. For the fitness measurements, we cultured the ancestral strains overnight (for 16 hours) at 37°C in LB supplemented with 25µg/ml kanamycin, after which time the cultures reach stationary phase. We measured the optical density of the cells using a plate reader (Tecan infinite F200 pro) at 600nm (OD(0) at time t=0). We then diluted the cultures 100-fold into medium with antibiotics (total volume of 2 ml) and let the cells grow for 7 further hours in a 96 well plate (Sarstedt, MegaBlock 96 well plate) in a plate shaker (VWR, Incubating Microplate Shaker) at 600rpm. After 7 hours, we again measured the optical density of the cells (OD(7) at t=7 hours). We estimated fitness (f) by calculating the fold-change in cell density given by OD(7) f = . (OD (0) /100)

We used fold-change in cell density in 7 hours of growth as a proxy for fitness because this measure is influenced by three components of fitness. First, the fold-change in cell density is affected by a change in initial viability of cells in antibiotic medium. Second, fold-change in cell density is also influenced by differences in growth rate. Finally, this measure is also affected by a change in carrying capacity if the cells reach stationary phase within 7 hours. Thus, measurement of the cell density change gives us a far more complete picture of fitness compared to only one component of fitness that is widely used, namely cell viability. We here note that the growth phase of cells (exponential or stationary) at 7 hours is also influenced by the type of antibiotic and its concentration in the growth medium (see supplementary figure S4).

231 Chapter 4

Evolution experiment Our evolution experiment had four steps (figure 3). In the first step, we transformed the plasmids containing single or double copy TEM-1 inserts into the mutator E. coli TB90 cells using a chemical transformation protocol (Hanahan, 1983; Swords, 2003). We let the mutator cells with the plasmids grow for 16 hours in LB medium supplemented with 50µg/ml of kanamycin (Sigma) without inducing mutagenesis. In the second step, we transferred 150µl of the cells from the previous culture into 2 ml LB medium supplemented with 50µg/ml of kanamycin and containing 50µl of 10% (w/v) L-(+)-arabinose for mutagenesis. Under these mutagenic conditions, we allowed the cells to grow for 20 hours. We then isolated the plasmids using the alkaline-lysis method (Birnboim and Doly, 1979; Birnboim, 1983; Ehrt and Schnappinger, 2003) on a 96-well plate. In the third step, we transformed the mutagenized plasmids into the E. coli DH5α strain using a chemical transformation protocol and let the cells grow for 16 hours in LB medium supplemented with 25µg/ml of kanamycin. In the final step, we diluted the cells 10-fold into appropriate antibiotic selection medium. We let the cells grow in the selection medium for 7 hours and subsequently isolated the plasmids using the alkaline lysis method. We repeated these four steps 12 times, thus subjecting the ancestral strains to 12 rounds of mutagenesis and selection. We refer to the populations thus evolved as the EvoS and EvoD lines. We transformed the plasmids isolated at the end of each round of selection into E. coli DH5α cells, let them grow for 16 hours and stored them at -80°C after the addition of 15% glycerol (Sigma) (final concentration, w/v) for future use. We emphasize that only the plasmids evolved in our experiment and not the cells, since we changed the host cells at each step of the experiment.

Selection conditions We used several antibiotic conditions for our selection experiment (supplementary figure S5). For each condition, we evolved five replicate populations starting from the clones derived from the AncS strain and five replicate populations starting from clones derived from the AncD strain. In general, we refer to these evolved lines as EvoS and EvoD lines, but we additionally labeled them with the selection conditions in which they evolved. All selection media listed below also contained 25µg/ml of kanamycin to maintain the plasmid in the cell. In the first selection condition we subjected cells only to kanamycin without any selection for maintenance of the inserts. We call these evolved lines neutral lines, and label them EvoS-N and EvoD-N. We

232 Chapter 4

propagated EvoS-A and EvoD-A lines in 100µg/ml of ampicillin, and EvoS-A↑ and EvoD-A↑ lines in increasing concentrations of ampicillin (supplementary figure S20a). We subjected lines EvoS-C and EvoD-C to selection on increasing concentrations of cefotaxime (supplementary figure S20b); lines EvoS-I and EvoD-I to increasing concentrations of imipenem (supplementary figure S20c); lines EvoS-CA and EvoD-CA to 100µg/ml of ampicillin and increasing concentration of cefotaxime (supplementary figure S20d); and lines EvoS-IA and EvoD-IA to 100µg/ml of ampicillin and increasing concentration of imipenem (supplementary figure S20e). We denote the five replicate populations for each condition by the numbers 1-5 after a line’s name (e.g., EvoS-CA1, EvoD-CA1 etc.). We increased the antibiotics concentrations for the novel substrates for two reasons. First, we wanted to ensure that the cell growth in antibiotics does not occur just because of inoculum effect (Brook I, 1989; Thomson and Moland, 2001). Second, we put higher selection pressure for adaptation to the novel substrates to ensure that these lines adapt to the novel substrates and not just to the native substrate.

Fitness measurement of the evolved lines We used the same experimental protocol as for the ancestral strain to measure the fitness of the evolved lines, except that we calculated the fitness of the evolved lines relative to the fitness of the AncD strain. We measured fitness relative to AncD strain for two reasons. First, we intended to compare fitness of EvoS and EvoD lines. Thus, it was necessary to have a single reference fitness value in the ancestral strain. Second, in certain antibiotic conditions, the AncD strain had higher fitness than the AncS strain. Thus, any fitness increase in the EvoD lines relative to the AncS strain would not necessarily indicate fitness increase relative to the AncD strain. Conversely, any fitness increase in the EvoS lines relative to the AncD strain would certainly indicate fitness increase relative to the AncS strain. For an evolved line ‘Evo’,

ODEvo(7) OD Evo (7) (OD (0) /100) OD (0) relative fitness (rf) = Evo= Evo ODAncD(7) OD AncD (7)

(ODAncD (0) /100) OD AncD (0)

where ODAncD(0) is the cell density of the AncD strain at t=0; ODAncD(7) is the cell density of the

AncD strain after 7 hours; ODEvo(0) is the cell density of the evolved line at t=0; and ODEvo(7) is the cell density of the evolved line after 7 hours. A relative fitness value greater than 1 indicates that the evolved strain has a fitness advantage over the AncD strain.

233 Chapter 4

We note that the antibiotic concentrations used for fitness measurements were lower than the concentrations used for selection experiments. This is because in selection experiments, initial cell numbers at the beginning of the experiment were 10 times higher than initial number of cells used for fitness measurements. Thus, we used higher antibiotic concentrations in the selection experiments to avoid selecting cells that only survive due to an inoculum effect (Brook I, 1989; Thomson and Moland, 2001).

Measurement of plasmid copy number change To estimate the change in plasmid copy number during the evolution experiment, we isolated plasmids from the ancestral strain and from the evolved lines. We cultured cells in 5ml LB medium supplemented with 25µg/ml of kanamycin for 16 hours. Before plasmid isolation, we estimated cell density by measuring optical density of the culture at 600nm using a plate reader. We then performed plasmid isolation using the ChargeSwitch-Pro plasmid miniprep kit (Invitorgen), separated plasmid DNA on a 0.8% agarose gel, and estimated its concentration with the Genetools software (Syngene), which estimates sample DNA concentrations relative to reference DNA concentrations in an agarose gel. We calculated the change in plasmid copy ([]plasmid/ OD ) number in an evolved line ‘Evo’ relative to an ancestor as Evo Evo , where ()[]plasmidAnc/ OD Anc

[plasmid]Evo is the concentration of plasmid DNA isolated from the evolved line; ODEvo is the optical density of the evolved line from which the plasmid isolation was performed; [plasmid]Anc

is the concentration of the plasmid DNA isolated from the ancestral strain; and ODAnc is the optical density of the ancestral strain from which the plasmid isolation was performed.

Quantification of gene loss To estimate the extent of gene loss in the EvoD lines, we employed a simple PCR based screen. Because of our plasmid design, individuals that contain duplicate copies of the TEM-1 gene have a unique sequence in the region where the two copies are joined. We designed a primer pair (supplementary table S4) where one primer bound only to this unique region. Together, the two primers can amplify a 1 kbp-long PCR product that occurs only in individuals with two TEM-1 copies. To perform this screen, we streaked out evolved lines onto LB-agar plates to isolate

234 Chapter 4 clones. We then performed colony PCR on these clones with this primer pair, and determined the number of clones with duplicate copies of the gene. We screened at least 20 clones in each EvoD line.

High-throughput sequencing To study mutations in evolved lines, we sequenced all such lines after 1, 4, 8, and 12 rounds of mutagenesis and selection using high-throughput sequencing on the Roche 454 platform (Margulies et al., 2005). In addition, we sequenced the neutral lines (EvoS-N and EvoD-N) after 2 and 3 rounds of mutagenesis and selection. The inserts in our experiment (~ 1 kbp and ~ 2 kbp for single-gene and duplicate-gene lines) are longer than the maximum length of amplicons (ca. 400-500 bps) that can be reliably sequenced with current technology. We thus amplified the inserts in EvoS and EvoD lines in two and four separate amplicons, respectively. For the EvoD lines, we only sequenced individuals that had retained two copies of the TEM-1 gene. We also sequenced the AncS and AncD strains to estimate the PCR and sequencing error rates in our samples. We used Roche 454 barcodes (supplementary table S5) for sample multiplexing (see supplementary methods S3).

Sequence data analysis We discarded reads that were less than 200 base pairs long. Also, we used only the first 450 bases of reads for all data analyses, because we found that the sequencing error rate increases beyond that. Our sequence analysis consisted of several steps. First, based on the multiplexing barcodes, we assigned the reads from the sequencing data to different lines, and to different rounds of the evolution experiment. We then aligned the sequences using a progressive multiple sequence alignment based on dynamic programming (Lipman et al., 1989). Subsequently, we identified true-positive variants as those that (a) occurred in at least three reads, (b) persisted even after correcting for PCR and sequencing errors. (see supplementary methods S4). Finally, we mapped these variants to the TEM-1 gene(s).

Analysis of low frequency mutations Sequencing errors could easily be mistaken as low frequency mutations and could bias our analysis. To avoid them, we sequenced known TEM-1 genes to estimate sequencing error rates.

235 Chapter 4

We used these error rates to discard polymorphisms that occurred in our data due to PCR and sequencing errors (see supplementary methods S4). After error correction, we found several low frequency mutations in all the lines. We then calculated a measure of sequence diversity observed in an evolved line as Number of different polymorphisms in a line Sequence diversity (d) = . Read Coverage We note that in the numerator, we only counted polymorphisms without considering their frequency. We did not consider the frequency of these polymorphisms for two reasons. First, we were interested in the number of different types of polymorphisms these evolved lines accumulate rather than their frequencies of occurrence. Second, the EvoD lines had a regulatory mutation that rose to high frequency in the evolution experiment, whereas the EvoS lines did not have any high frequency mutation except those that were present in the starting clones. Thus, had we considered the frequency of this mutation while calculating sequence diversity, we would have observed higher diversity in most of the EvoD lines solely due to this mutation. We also note that a line (or gene segment) with greater read coverage would reveal more polymorphisms than a line with lower read coverage (supplementary figure S21). It is therefore essential to take the read coverage into account while calculating sequence diversity.

Mutant reconstruction We reconstructed the T-53C mutation in the ancestral TEM-1 gene using site directed mutagenesis. Since this mutation was very close to the left primer binding region of our insert, we designed a long primer with this particular mutation present in the primer. We then amplified the TEM-1 gene from AncS strain using this primer pair and cloned the amplicon in the pUA67 plasmid using EcoRI and HindIII restriction sites. In the same manner, we also reconstructed a wild type sequence to rule out any bias in our data due to the reconstruction procedure. We confirmed the sequence of the reconstructed inserts by Sanger sequencing.

Acknowledgements We thank the Functional Genomics Center Zurich for its service in generating microarray data. We also thank Dr. Martin Ackermann and Dr. Eric Hayden for helpful discussions. This work was supported by Swiss National Science Foundation (grant 315230-129708), as well as through

236 Chapter 4 the YeastX project of SystemsX.ch, and the University Priority Research Program in Systems Biology at the University of Zurich. RD acknowledges support from the Forschungskredit program of the University of Zurich.

References Arabidopsis Genome Initiative, 2000. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408:796-815.

Andersson DI, Hughes D, 2009. Gene amplification and adaptive evolution in bacteria. Annu Rev Genet. 43:167- 195.

Anderson RP, Roth JR, 1977. Tandem genetic duplications in phage and bacteria. Annu Rev Microbiol. 31:473–505.

Armstrong KA, Acosta R, Ledner E, Machida Y, Pancotto M, McCormick M, Ohtsubo H, Ohtsubo E, 1984. A 37×10(3) molecular weight plasmid-encoded protein is required for replication and copy number control in the plasmid pSC101 and its temperature-sensitive derivative pHS1. J Mol Biol. 175:331-348.

Bailey JA, Gu Z, Clark RA, Reinert K, Samonte RV, Schwartz S, Adams MD, Myers EW, Li PW, Eichler EE, 2002. Recent segmental duplications in the human genome. Science 297:1003-1007.

Balakirev ES, Ayala FJ, 2003. Pseudogenes: Are They “Junk” or Functional DNA? Annu Rev Genet. 37:123-151.

Barkman T, Zhang J, 2009. Evidence for escape from adaptive conflict? Nature 462: E1; discussion E2-3.

Barlow M, Hall BG, 2002. Predicting evolutionary potential: in vitro evolution accurately reproduces natural evolution of the tem beta-lactamase. Genetics 160:823-832.

Bergthorsson U, Andersson DI, Roth JR, 2007. Ohno’s dilemma: Evolution of new genes under continuous selection. Proc Natl Acad Sci USA 104:17004 –17009.

Birnboim HC, 1983. A rapid alkaline extraction method for the isolation of plasmid DNA. Methods Enzymol. 100:243–255.

Birnboim HC, Doly J. 1979. A rapid alkaline extraction procedure for screening recombinant plasmid DNA. Nucleic Acids Res. 7:1513–1523.

Bowers JE, Chapman BA, Rong J, Paterson AH, 2003. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422:433-438.

Bridges, CB, 1936. The Bar ‘gene’ a duplication. Science 83:210 – 211.

Brook I. 1989. Inoculum Effect. Clin Infect Diseases 11: 361-368.

Brown CJ, Todd KM, Rosenzweig RF, 1998. Multiple duplications of yeast hexose transport genes in response to selection in a glucose-limited environment. Mol Biol Evol. 15:931-942.

Clark AG, 1994. Invasion and maintenance of a gene duplication. Proc Natl Acad Sci USA 91:2950-2954.

Conant GC, Wolfe KH, 2007. Increased glycolytic flux as an outcome of whole-genome duplication in yeast. Mol Syst Biol. 3:1-12.

237 Chapter 4

Conant GC, Wolfe KH, 2008. Turning a hobby into a job: How duplicated genes find new functions. Nat Rev Genet. 9:938-950.

Dehal P, Boore JL, 2005. Two Rounds of Whole Genome Duplication in the Ancestral Vertebrate. Plos Biol. 3:1700-1708.

DeLuna A, Vetsigian K, Shoresh N, Hegreness M, Colon-Gonzalez M, Chao S, Kishony R, 2008. Exposing the fitness contribution of duplicated genes. Nat Genet. 40: 676-681.

Deng C, Cheng CH, Ye H, He X, Chen L, 2010. Evolution of an antifreeze protein by neofunctionalization under escape from adaptive conflict. Proc Natl Acad Sci USA 107: 21593–21598.

Des Marais DL, Rausher MD, 2008. Escape from adaptive conflict after duplication in an anthocyanin pathway gene. Nature 454:762-765.

Dunham MJ, Badrane H, Ferea T, Adams J, Brown PO, Rosenzweig F, Botstein D, 2002. Characteristic genome rearrangements in experimental evolution of Saccharomyces cerevisiae. Proc Natl Acad Sci USA 99:16144 –16149.

Ehrt S, Schnappinger D, 2003. Isolation of Plasmids from E. coli by Alkaline Lysis. In E. coli Plasmid Vectors Methods and Applications, Nicola Casali and Andrew Preston (editors), Methods in Molecular Biology, 235:75-78. Humana press.

Farrar WE Jr, Newsome JK, 1973. Mechanism of synergistic effects of beta-lactam antibiotic combinations on gram-negative bacilli. Antimicrob Agents Chemother. 4:109-114.

Force A, Lynch M, Pickett FB, Amores A, Yan Y, Postlethwait J, 1999. Preservation of Duplicate Genes by Complementary, Degenerative Mutations. Genetics 151:1531-1545.

Francino MP, 2005. An adaptive radiation model for the origin of new gene functions. Nat Genet. 37:573-577.

Gu Z, Steinmetz LM, Gu X, Scharfe C, Davis RW, Li W-H, 2003. Role of duplicate genes in genetic robustness against null mutations. Nature 421:63-66.

Gunnison JB, Kunishige E, Coleman VR, Jawetz E, 1955. The mode of action of antibiotic synergism and antagonism: the effect in vitro on bacteria not actively multiplying. J Gen Microbiol. 13:509-518.

Hahn MW, 2009. Distinguishing among evolutionary models for the maintenance of gene duplicates. J Hered. 100:605-617.

Hanahan D, 1983. Studies on transformation of Escherichia coli with plasmids. J Mol Biol. 166:557-580.

Hasunuma K, Sekiguchi M, 1977. Replication of plasmid pSC101 in Escherichia coli K12: requirement for dnaA function. Mol Gen Genet 154:225-230.

Himmelreich R, Hilbert H, Plagens H, Pirkl E, Li BC, Herrmann R, 1996. Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae. Nucleic Acids Res. 24:4420-4449.

Holloway AK, Palzkill T, Bull JJ, 2007. Experimental evolution of gene duplicates in a bacterial plasmid model. J Mol Evol. 64:215-222.

Hooper SD, Berg OG, 2003. On the nature of gene innovation: duplication patterns in microbial genomes. Mol Biol Evol. 20:945-954.

Horiuchi T, Horiuchi S, Novick A, 1963. The genetic basis of hypersynthesis of ß-galactosidase. Genetics 48:157- 169.

238 Chapter 4

Hughes AL, 1994. The evolution of functionally novel proteins after gene duplication. Proc R Soc Lond B Biol Sci. 256:119-124.

Hughes AL, 2005. Gene duplication and the origin of novel proteins. Proc Natl Acad Sci USA 102:8791-8792.

Ives CL, Bott KF, 1990. Characterization of chromosomal DNA amplifications with associated tetracycline resistance in Bacillus subtilis. J Bacteriol. 172:4936-4944.

Kafri R, Dahan O, Levy J, Pilpel Y, 2008. Preferential protection of protein interaction network hubs in yeast: Evolved functionality of genetic redundancy. Proc Natl Acad Sci USA 105:1243-1248.

Kafri R, Levy M, Pilpel Y, 2006. The regulatory utilization of genetic redundancy through responsive backup circuits. Proc Natl Acad Sci USA 103:11653-11658.

Kao KC, Sherlock G, 2008. Molecular characterization of clonal interference during adaptive evolution in asexual populations of Saccharomyces cerevisiae. Nat Genet. 40:1499-1504.

Kellis M, Birren BE, Lander ES, 2004. Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature 428:617-624.

Kleinjan DA, Bancewicz RM, Gautier P, Dahm R, Schonthaler HB, Damante G, Seawright A, Hever AM, Yeyati PL, van Heyningen V, Coutinho P, 2008. Subfunctionalization of duplicated zebrafish pax6 genes by cis-regulatory divergence. PLoS Genet. 4:e29.

Klenk HP et al., 1997. The complete genome sequence of the hyperthermophilic, sulphate-reducing archaeon Archaeoglobus fulgidus. Nature 390:364-370.

Kondratyeva TF, Muntyan LN, Karavaiko GI, 1995. Zinc- and arsenic-resistant strains of Thiobacillus ferrooxidans have increased copy numbers of chromosomal resistance genes. Microbiology 141:1157-1162.

Kondrashov FA, Rogozin IB, Wolf YI, Koonin EV, 2002. Selection in the evolution of gene duplications. Genome Biol. 3:1–8.

Kuwada Y, 1911. Meiosis in the pollen mother cells of Zea Mays L. Bot Mag. 25:1633.

Li WH, Gu Z, Wang H, Nekrutenko A, 2001. Evolutionary analyses of the human genome. Nature 409:847-849.

Lipman DJ, Altschul SF, Kececioglu JD, 1989. A tool for multiple sequence alignment. Proc Natl Acad Sci USA 86: 4412–4415.

Lovett ST, Drapkin PT, Sutera VA Jr, Gluckman-Peskind TJ, 1993. A sister-strand exchange mechanism for recA- independent deletion of repeated DNA sequences in Escherichia coli. Genetics 135:631-642.

Lynch M, Conery JS, 2000. The Evolutionary Fate and Consequences of Duplicate Genes. Science 290:1151-1155.

Lynch M, Force A, 2000a. The origin of interspecies genomic incompatibility via gene duplication. Am Nat 156:590-605.

Lynch M, Force A, 2000b. The Probability of Duplicate Gene Preservation by Subfunctionalization. Genetics 154:459-473.

Lynch M, O’Hely M, Walsh B, Force A, 2001. The Probability of Preservation of a Newly Arisen Gene Duplicate. Genetics 159:1789-1804.

Lynch VJ, 2007. Inventing an arsenal: adaptive evolution and neofunctionalization of snake venom phospholipase A2 genes. BMC Evol Biol. 7:2.

239 Chapter 4

Margulies M et al., 2005. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437:376- 380.

Marques-Bonet T, Kidd JM, Ventura M, Graves TA, Cheng Z, Hillier LDW, Jiang Z, Baker C, Malfavon-Borja R, Fulton LA, Alkan C, Aksay G, Girirajan S, Siswara P, Chen L, Cardone MF, Navarro A, Mardis ER, Wilson RK, and Eichler EE, 2009. A burst of segmental duplications in the genome of the African great ape ancestor. Nature 457:877-881.

Matthews PR, Stewart PR, 1988. Amplification of a section of chromosomal DNA in methicillin-resistant Staphylococcus aureus following growth in high concentrations of methicillin. J Gen Microbiol 134:1455-1464.

Miralles R, Gerrish PJ, Moya A, Elena SF, 1999. Clonal interference and the evolution of RNA viruses. Science 285:1745-1747.

Moellering RC Jr, Wennersten C, Weinberg AN, 1971. Studies on antibiotic synergism against enterococci. I. Bacteriologic studies. J Lab Clin Med. 77:821-828.

Moore RC, Purugganan MD, 2003. The early stages of duplicate gene evolution. Proc Natl Acad Sci USA 100:15682-15687.

Näsvall J, Sun L, Roth JR, Andersson DI, 2012. Real-Time Evolution of New Genes by Innovation, Amplification, and Divergence. Science 338:384-387.

Nichols BP, Guay GG, 1989. Gene amplification contributes to sulfonamide resistance in Escherichia coli. Antimicrob Agents Chemother. 12:2042-2048.

Nowak MA, Boerlijst MC, Cooke J, Smith JM, 1997. Evolution of genetic redundancy. Nature 388:167-171.

Ohno S, 1970. Evolution by gene duplication. Springer-Verlag, Heidelberg, Germany.

Otto E, Young JE, Maroni G, 1986. Structure and expression of a tandem duplication of the Drosophila metallothionein gene. Proc Natl Acad Sci USA 83:6025-6029.

Riehle MM, Bennett AF, Long AD, 2001. Genetic architecture of thermal adaptation in Escherichia coli. Proc Natl Acad Sci USA 98:525-530.

Reams AB, Neidle EL, 2003. Genome plasticity in Acinetobacter: new degradative capabilities acquired by the spontaneous amplification of large chromosomal segments. Mol Microbiol. 47:1291-1304.

Rodríguez-Trelles F, Tarrío R, Ayala FJ, 2003. Convergent neofunctionalization by positive Darwinian selection after ancient recurrent duplications of the xanthine dehydrogenase gene. Proc Natl Acad Sci USA 100:13413-13417.

Rouquier S, Blancher A, Giorgi D, 2000. The olfactory receptor gene repertoire in primates and mouse: Evidence for reduction of the functional fraction in primates. Proc Natl Acad Sci USA 97:2870-2874.

Rouquier S, Taviaux S, Trask B, Brand-Arpon V, van den Engh G, Demaille J, Giorgi D, 1998. Distribution of Olfactory Receptor Genes in the Human Genome. Nat Genet. 18: 243–250.

Rubin GM et al., 2000. Comparative genomics of the eukaryotes. Science 287:2204-2215.

Serebrovsky AS, 1938. Genes scute and achaete in Drosophila melanogaster and a hypothesis of gene divergency. C R Acad Sci. URSS 19:77–81.

Sidow A, 1996. Gen(om)e duplications in the evolution of early vertebrates. Curr Opin Genet Devel. 6:715-722.

240 Chapter 4

Simillion C, Vandepoele K, Van Montagu MC, Zabeau M, Van de Peer Y, 2002. The hidden duplication past of Arabidopsis thaliana. Proc Natl Acad Sci USA 99:13627-13632.

Sonti RV, Roth JR, 1989. Role of gene duplications in the adaptation of Salmonella typhimurium to growth on limiting carbon sources. Genetics 123:19–28.

Soo VW, Hanson-Manful P, Patrick WM, 2011. Artificial gene amplification reveals an abundance of promiscuous resistance determinants in Escherichia coli. Proc Natl Acad Sci USA 108:1484-1489.

Spring J, 1997. Vertebrate evolution by interspecific hybridisation – are we polyploids? FEBS Lett. 400:2-8.

Stemmer WP, 1994. Rapid evolution of a protein in vitro by DNA shuffling. Nature 370:389-391.

Stephens SG, 1951. Possible significance of duplications in evolution. Adv Genet. 4, 247–265.

Straus DS, 1975. Selection for a large genetic duplication in Salmonella typhimurium. Genetics 80:227-237.

Sutcliffe JG. 1978. Nucleotide sequence of the ampicillin resistance gene of Escherichia coli plasmid pBR322. Proc Natl Acad Sci USA 75:3737-3741.

Swords WE, 2003. Chemical Transformation of E. col. In E. coli Plasmid Vectors Methods and Applications, Nicola Casali and Andrew Preston (editors), Methods in Molecular Biology, 235:49-53. Humana press.

Taylor JS, Raes J, 2004. Duplication and divergence: The Evolution of New Genes and Old Ideas. Annu Rev Genet. 38:615–643.

Tekaia F, Dujon B, 1999. Pervasiveness of Gene Conservation and Persistence of Duplicates in Cellular Genomes. J Mol Evol 49:591–600.

Tirosh I, Barkai N, 2007. Comparative analysis indicates regulatory neofunctionalization of yeast duplicates. Genome Biol. 8:R50.

Thomson KS, Moland ES, 2001. Cefepime, piperacillin-tazobactam, and the inoculum effect in tests with extended- spectrum beta-lactamase-producing Enterobacteriaceae. Antimicrob Agents Chemother. 45:3548-3554.

Tocchini-Valentini GD, Fruscoloni P, and Tocchini-Valentini GP, 2005. Structure, function, and evolution of the tRNA endonucleases of Archaea: An example of subfunctionalization. Proc Natl Acad Sci USA 102:8933– 8938.

Tomb JF et al., 1997. The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 388:539- 547.

Vavouri T, Semple JI, Lehner B, 2008. Widespread conservation of genetic redundancy during a billion years of eukaryotic evolution. Trends Genet. 24:485-488.

Wagner A, 2000. The Role of Population Size, Pleiotropy and Fitness Effects of Mutations in the Evolution of Overlapping Gene Functions. Genetics 154:1389–1401.

Wagner A, 2000. Robustness against mutations in genetic networks of yeast. Nat Genet. 24:355-361.

Wagner A, 2002. Selection and gene duplication: a view from the genome. Genome Biol. 3:1-3.

Walsh JB, 1995. How Often Do Duplicated Genes Evolve New Functions? Genetics 139: 421-428.

Woolfe A, Elgar G, 2007. Comparative genomics using Fugu reveals insights into regulatory subfunctionalization. Genome Biol. 8:R53.

241 Chapter 4

Zaccolo M, Gherardi E, 1999. The effect of high-frequency random mutagenesis on in vitro protein evolution: a study on TEM-1 beta-lactamase. J Mol Biol. 285:775-783.

Zaslaver A, Bren A, Ronen M, Itzkovitz S, Kikoin I, Shavit S, Liebermeister W, Surette MG, Alon U, 2006. A comprehensive library of fluorescent transcriptional reporters for Escherichia coli. Nat Methods 3:623-628.

Zhang J, 2003. Evolution by gene duplication: an update. Trends Ecol Evol. 18: 292-298.

Zhang J, Zhang Y, Rosenberg HF, 2002. Adaptive evolution of a duplicated pancreatic ribonuclease gene in a leaf- eating monkey. Nat Genet. 30:411-415.

242 Appendix to Chapter 4

Supplementary figures

Figure S1: Fitness of the ancestral strains (AncS and AncD) in ampicillin + cefotaxime. We measured fitness of the ancestral strains in medium supplemented with (a) 50 µg/ml ampicillin + cefotaxime, (b) 100 µg/ml ampicillin + cefotaxime, (c) 150 µg/ml ampicillin + cefotaxime and (d) 200 µg/ml ampicillin + cefotaxime. The y-axis shows the fold increase in cell density after 7 hours of growth and the x-axis shows the concentrations of cefotaxime in which we measured fitness. The red curve shows the data for the AncD line, and the blue curve shows the data for the AncS strain. We observed a decrease in cell density of the ancestral strains as we increased the concentration of cefotaxime in the medium. However, there was no significant difference in this decrease between the AncS and the AncD lines in (a) and (b) (ANOVA, supplementary methods S1; p=0.37 in 50 µg/ml ampicillin + cefotaxime; p=0.08 in 100 µg/ml ampicillin + cefotaxime). In contrast, a significant difference existed in the decrease in growth between these two lines in (c) and (d) (p=0.0003 for data in (c); p=0.0009 for data in (d)). For each data point, squares correspond to mean cell densities (averaged over 3 measurements), and whiskers to one standard deviation (s.d.). The figure S1d is reproduced here from the main text figure 2c for completeness.

243 Appendix to Chapter 4

Figure S2: Fitness of the ancestral strains in imipenem. We measured fitness of the ancestral strains in medium supplemented with imipenem. The y-axis shows the fold increase in cell density after 7 hours of growth and the x-axis shows the concentrations of imipenem in which we measured fitness. The red curve shows the data for the AncD strain and the blue curve shows the data for the AncS strain. We observed a decrease in growth of the ancestral strains as we increased the concentration of imipenem in the medium. However, there was no significant difference in this decrease between the AncS and the AncD lines (ANOVA, supplementary methods S1; p=0.16). For each data point, squares correspond to mean cell densities (averaged over 3 measurements), and whiskers to one standard deviation (s.d.).

244 Appendix to Chapter 4

Figure S3: Initial fitness measurement of the ancestral strains (AncS and AncD) in ampicillin +imipenem. We measured fitness of the ancestral strains in medium supplemented with (a) 50 µg/ml ampicillin + imipenem, (b) 100 µg/ml ampicillin + imipenem, (c) 150 µg/ml ampicillin + imipenem and (d) 200 µg/ml ampicillin + imipenem. The y-axis shows the fold- increase in cell density after 7 hours of growth and the x-axis shows the concentrations of imipenem in which we measured fitness. The red curve shows the data for the AncD line, and the blue curve shows the data for the AncS strain. We observed a decrease in cell density of the ancestral strains as we increased the concentration of imipenem in the medium. However, there was no significant difference in this decrease between the AncS and the AncD lines in (a) and (b) (ANOVA, supplementary methods S1; p=0.99 for (a); p=0.09 for (b)). In contrast, a significant difference existed between these two lines in (c) and (d) (p=0.0002 for data in both (c) and (d)). For each data point, squares correspond to mean cell densities (averaged over 3 measurements), and whiskers to one standard deviation (s.d.).

245 Appendix to Chapter 4

Figure S4: Growth curves of the ancestral strains (AncS and AncD). We measured growth of the ancestral strains for 10 hours (see Methods S6) in medium supplemented with (a) 100 µg/ml ampicillin, (b) 300 µg/ml ampicillin, (c) 350 µg/ml ampicillin, (d) 0.105 µg/ml cefotaxime, (e) 0.06 µg/ml imipenem, (f) 100 µg/ml ampicillin + 0.006 µg/ml cefotaxime, (g) 200 µg/ml ampicillin + 0.003 µg/ml cefotaxime, (h) 100 µg/ml ampicillin + 0.04 µg/ml imipenem and (i) 200 µg/ml ampicillin + 0.03 µg/ml imipenem. In each panel, the x-axis shows the time in hours, and the y-axis shows the fold increase in cell density. The red curve indicates the data for the AncD strain and the blue curve for the AncS strain. We do not see any significant difference in growth curves between the AncS and AncD strains in (a) (repeated measures ANOVA, supplementary methods S1; p=0.23), (d) (p=0.50), (e) (p=0.52), (h) (p=0.07). In contrast, we observed a significant difference in (b) (p=7.1×10-7), (c) (p=1.5×10-7), (g) (p=6.2×10-13) and (i) (p=9.6×10-6), supporting our results from the fitness measurements of the ancestral strains. We saw differences between the growth curves of the AncS and the AncD lines in panel (f) with 100 µg/ml ampicillin and 0.006 µg/ml cefotaxime (p=2.0×10-3). Where significant differences occurred, they generally favored the double-copy AncD strain. Each data point shows mean cell density ± one standard deviation (s.d.).

246 Appendix to Chapter 4

Figure S4 (continued)

247 Appendix to Chapter 4

Figure S5: Nomenclature of the evolved lines based on the selection condition. We used several antibiotic conditions for our selection experiment. For each condition, we evolved five replicate populations starting from the clones derived from the AncS strain and five replicate populations starting from clones derived from the AncD strain. In general, we refer to these evolved lines as EvoS and EvoD lines, but we additionally labeled them with the selection conditions in which they evolved. All selection media listed below also contained 25µg/ml of kanamycin to maintain the plasmid in the cell. In the first selection condition we subjected cells only to kanamycin without any selection for maintenance of the inserts. We call these evolved lines neutral lines, and label them EvoS-N and EvoD-N. We propagated EvoS-A and EvoD-A lines in 100µg/ml of ampicillin, andEvoS-A↑ and EvoD-A↑ lines in increasing concentrations of ampicillin (figure S31a). We subjected lines EvoS-C and EvoD-C to selection on increasing concentrations of cefotaxime (figure S31b); lines EvoS-I and EvoD-I to increasing concentrations of imipenem (figure S31c); lines EvoS-CA and EvoD-CA to 100µg/ml of ampicillin and increasing concentration of cefotaxime (figure S31d); and lines EvoS-IA and EvoD-IA to 100µg/ml of ampicillin and increasing concentration of imipenem (figure S31e). We denote the five replicate populations for each condition by the numbers 1-5 after a line’s name (e.g., EvoS-CA1, EvoD-CA1 etc.). “↑” indicates an increase in concentration of an antibiotic over the rounds of the experiment.

EvoS EvoD Line Antibiotic(s) N1 N2 N3 N4 N5 N1 N2 N3 N4 N5 N Marker A1 A2 A3 A4 A5 A1 A2 A3 A4 A5 A Amp 100 A↑1 A↑2 A↑3 A↑4 A↑5 A↑1 A↑2 A↑3 A↑4 A↑5 A↑ Amp 175↑ C1 C2 C3 C4 C5 C1 C2 C3 C4 C5 C Cef 0.0105↑ I1 I2 I3 I4 I5 I1 I2 I3 I4 I5 I Imi 0.03↑ Amp 100+ CA1 CA2 CA3 CA4 CA5 CA1 CA2 CA3 CA4 CA5 CA Cef 0.003↑ Amp 100+ IA1 IA2 IA3 IA4 IA5 IA1 IA2 IA3 IA4 IA5 IA Imi 0.02↑

248 Appendix to Chapter 4

Figure S6: Fitness of the evolved CA lines. To test for adaptation of the lines evolved in cefotaxime and ampicillin (CA lines), we measured their fitness in medium supplemented with (a) ampicillin and cefotaxime, (b) ampicillin alone, and (c) cefotaxime alone. The y-axis shows the relative fitness measured relative to the fitness of the AncD strain. A relative fitness of greater than 1 suggests a fitness increase in an evolved line. We measured fitness of the CA lines in 100 µg/ml amp + 0.003 µg/ml cef, 100µg/ml amp + 0.006 µg/ml cef, 100µg/ml amp + 0.009 µg/ml cef, 200µg/ml amp + 0.003 µg/ml cef and 200µg/ml amp + 0.006 µg/ml cef. In all conditions, both EvoS-CA and EvoD-CA lines showed a significant fitness increase (table S1). In none of these conditions did the EvoD-CA lines show a higher fitness than the EvoS-CA lines. Surprisingly, in 200µg/ml amp + 0.003 µg/ml cef and 200µg/ml amp + 0.006 µg/ml cef, the EvoS-CA lines had higher fitness than the EvoD-CA lines (p=0.007 and p=1.2×10-5 respectively). To find out whether the CA lines had evolved increased activity on ampicillin, we also measured fitness of these lines in 100 µg/ml and 200 µg/ml ampicillin. No significant fitness increase had occurred in the CA lines in 100 µg/ml ampicillin, but both EvoS-CA and EvoD-CA lines showed increased fitness in 200 µg/ml ampicillin (see also table S1). To find out whether the CA lines evolved increased activity on the new substrate cefotaxime, we measured the fitness of these lines in 0.0105 µg/ml and 0.021 µg/ml cefotaxime. Both lines showed a significant fitness increase in cefotaxime (table S1). This increase was again higher in the EvoS-CA lines than in the EvoD-CA lines (p=3.5×10-6 and p=0.002 respectively). Bars indicate mean fitness, and whiskers show one standard deviation (s.d.). Asterisks indicate a significant difference in the fitness increase between lines. One of the panels in (a) is reproduced here from the main text figure 4 for completeness.

249 Appendix to Chapter 4

Figure S6 (continued)

(a)

250 Appendix to Chapter 4

Figure S6 (continued)

(b)

(c)

251 Appendix to Chapter 4

Figure S7: Fitness of the evolved N lines. We measured fitness of the evolved N lines in 100 and 200 µg/ml of ampicillin to check whether they show a fitness increase in ampicillin. We did not see any significant increase in fitness in 100 µg/ml ampicillin (supplementary table S2), but all the lines showed a significant increase in 200 µg/ml of ampicillin (p=5.3×10-3 for the EvoS-N lines, and p=6.1×10-4 for the EvoD-N lines; supplementary table S2). The lines showed an increase in plasmid copy number that can increase their fitness in ampicillin by increasing their gene dosage (see supplementary results S4). Bars indicate mean fitness, and whiskers indicate one standard deviation.

252 Appendix to Chapter 4

Figure S8: Fitness of the evolved A lines. We measured the fitness of A lines in 100 µg/ml and 200 µg/ml of ampicillin to verify whether they had adapted to ampicillin. Both the EvoS-A and the EvoD-A lines showed a significant fitness increase in the two concentrations of ampicillin (table S1), but we did not observe any significant difference in the fitness increase between the lines (p=0.19 in 100 µg/ml of ampicillin; p=0.06 in 200 µg/ml of ampicillin). Bars indicate mean fitness, and whiskers show one standard deviation.

253 Appendix to Chapter 4

Figure S9: Fitness of the evolved A↑ lines. We measured the fitness of the A↑ lines in 400 µg/ml of ampicillin because these lines were evolved in high (final) concentrations of ampicillin. Both lines showed a significant fitness increase in this medium (table S1), but we observed no difference in fitness between these lines (p=0.22). Bars indicate mean fitness, and whiskers show one standard deviation.

254 Appendix to Chapter 4

Figure S10: Fitness of the evolved C lines. We measured the fitness of C lines in 0.0105 µg/ml and 0.021 µg/ml of cefotaxime to verify whether these lines had adapted to cefotaxime. Only the EvoS-C lines showed a significant fitness increase in both these conditions, whereas the EvoD-C lines did not show any significant fitness increase (table S1). The fitness increase in the EvoS-C lines was significantly higher than that in the EvoD-C lines (p=3.4×10-5 in 0.0105 µg/ml cefotaxime and p=8.7×10-7 in 0.021 µg/ml cefotaxime). Bars indicate mean fitness, and whiskers show one standard deviation. Asterisks indicate a significant difference in fitness between lines.

255 Appendix to Chapter 4

Figure S11: Fitness of the evolved I lines. We measured fitness of the I lines in 0.02 µg/ml and 0.04 µg/ml of imipenem to see whether these lines had adapted to imipenem. We did not observe any significant increase in any of the lines (table S1). This observation suggests that the I lines did not evolve increased activity towards the novel substrate imipenem. Bars indicate mean fitness, and whiskers show one standard deviation

256 Appendix to Chapter 4

Figure S12: Fitness of the evolved IA lines. To test for adaptation of the IA lines (evolved in imipenem and ampicillin), we measured their fitness in medium supplemented with (a) ampicillin + imipenem, (b) ampicillin and (c) imipenem. Specifically, we measured fitness of these lines in 100 µg/ml amp + 0.02 µg/ml imi, 100µg/ml amp + 0.04 µg/ml imi, 100µg/ml amp + 0.06 µg/ml imi, 200µg/ml amp + 0.02 µg/ml imi, 200µg/ml amp + 0.04 µg/ml imi and 200µg/ml amp + 0.06 µg/ml imi. In all the conditions except the first one, both EvoS-IA and EvoD-IA lines showed a significant fitness increase (panel (a); supplementary table S2). In the first condition (100 µg/ml amp + 0.02 µg/ml imi, upper left panel), only the EvoS lines showed a significant fitness increase. In none of the conditions did the EvoD-IA lines show a higher fitness than the EvoS-IA lines. In five of these six conditions the EvoS-IA lines had higher fitness than the EvoD-IA lines (p=0.003 (panel (a) top left), 1.2×10-5 (panel (a) top right), 4.4×10-5 (panel (a) middle left), 0.04 (panel (a) middle right), and 0.0001 (panel (a) bottom left) respectively). We measured fitness in 100 µg/ml and 200 µg/ml ampicillin, and found no significant increase in 100 µg/ml ampicillin, but did observe a significant increase in 200 µg/ml ampicillin (panel (b); supplementary table S2). To find out whether the IA lines evolved increased activity towards the new substrate imipenem, we measured their fitness in 0.03 µg/ml and 0.06 µg/ml imipenem. Only the EvoS-IA lines showed a significant increase in 0.03 µg/ml imipenem, but did not show any fitness increase in 0.06 µg/ml imipenem (panel (c); supplementary table S2). The EvoD-IA lines did not show any fitness increase in imipenem (see also supplementary table S2). These observations suggest that most of the fitness increase in these lines is due to adaptation to a combination of the native substrate and the novel substrate. Bars indicate mean fitness, and whiskers show one standard deviation. Asterisks indicate a significant difference in fitness.

257 Appendix to Chapter 4

Figure S12 (continued)

(a)

258 Appendix to Chapter 4

Figure S12 (continued)

(b)

(c)

259 Appendix to Chapter 4

Figure S13: Increase of plasmid copy number in the evolved lines. We observed an increase in plasmid copy numbers in all the evolved lines. This increase was significantly higher for double-copy than for single-copy populations in the case of for N lines (p=0.03), A lines (p=0.008) and A↑ lines (p=0.008).Bars indicate mean fold-increase, and whiskers show one standard deviation. The asterisks indicate significant differences in the copy number increase. The data for the CA lines are reproduced here from figure 5 for completeness.

260 Appendix to Chapter 4

Figure S14: Frequency of individuals retaining duplicate copies of the TEM-1 gene in the EvoD lines. We observed a loss of one copy of the TEM-1 gene in all EvoD lines. To identify the extent of gene loss in our evolved populations, we employed a simple PCR based screen to identify individuals that retained duplicate copies of the gene at the end of the experiment. We observed loss of one copy of the TEM-1 gene in all the EvoD lines in our experiment (supplementary table S7), resulting in plasmids with only one copy of the gene. In the EvoD-N lines, this percentage ranged from 10% in replicate 1 to 50% in replicate 3, 4 and 5. In EvoD-A lines, this percentage varied between 5% and 75%, in EvoD-A↑ lines between 15% and 80%, in EvoD-C lines between 5% and 35%, in EvoD-I lines between 2% and 15%, in EvoD-CA lines between 7% and 25%, and finally, in EvoD-IA lines between 2% and 50%. The data for the CA lines are reproduced from figure 6 for completeness.

261 Appendix to Chapter 4

Figure S15: Change in population frequency of the T-53C mutation in the evolved lines over the rounds of the experiment. We sequenced the coding and near-upstream regions of TEM-1 genes in both single and double-copy inserts using 454 high-throughput sequencing (Margulies et al., 2005). We sequenced TEM-1 genes in the EvoS and EvoD lines after 1, 4, 8 and 12 rounds of mutagenesis and selection. For the EvoD lines, we only sequenced the individuals that had retained duplicate genes (see Methods). We found only one high frequency mutation. It occurred only in the individuals with duplicate gene copies in the EvoD lines. This mutation was a T→ C transition and mapped to the upstream region of the gene copy on the left, as indicated by the red asterisk (T-53C at position -53 relative to +1 at the transcription start site). This mutation reached a frequency of almost 80% in several EvoD lines. Also, the frequency of this mutation was highest after round 8 in several EvoD lines and decreased thereafter. In addition, the mutation was present in one of the starting clones of the EvoD lines (AncD clone 9, that gave rise to EvoD-A4, EvoD-C4 and EvoD-CA4 lines; supplementary figure S19), but its frequency decreased in this population from 80 percent after round one to a very low frequency of less than 1 percent at the end of round 12. Similar trends existed in the other EvoD lines, except for the neutral lines (EvoD-N3, EvoD-N4 and EvoD-N5 lines), where this mutation persisted at high frequency at round 12. The data for the CA lines are reproduced here from Figure 8a in the main text for completeness. Only replicates where the T-53C mutation occurred at high frequency are shown in the figure.

262 Appendix to Chapter 4

Figure S15 (continued)

263 Appendix to Chapter 4

Figure S15 (continued)

264 Appendix to Chapter 4

Figure S16: Change in the sequence diversities of the evolved lines over the course of the experiment. We calculated the sequence diversities of the (a) N lines, (b) A lines, (c) A↑ lines, (d) C lines, (e) I lines, (f) CA lines and (g) IA lines separately for rounds 1,4,8, and 12 of the evolution experiment. The y-axis shows the sequence diversity and the x-axis shows the round of the evolution experiment for which we calculated diversity (see Methods in the main text). The red asterisk shows a significant difference between the diversities. For this calculation, we merged the data from the 5 biological replicates in each of the lines. For each data point, we show mean ± 1 standard deviation (s.d.). Asterisks indicate a significant difference in sequence diversity.

265 Appendix to Chapter 4

Figure S17: Change in the sequence diversities in replicate lines. Here we show the changes in sequence diversities for individual replicates in each of the evolved lines.

266 Appendix to Chapter 4

Figure S18: Plasmid pUA67. We modified plasmid pUA66 (Zaslaver et al., 2006) by inserting additional restriction sites between XhoI and BamHI through cloning short oligonucleotides into these sites (5’ GAATTCTAGAGGTACCCGGGAAGCTT 3’). The restriction sites are marked as a multiple cloning site (MCS) in the figure. We refer to this modified plasmid as pUA67.

267 Appendix to Chapter 4

Figure S19: Clones derived from AncS and AncD strains that were used in our experiment. We constructed 10 clones each from AncS strain and AncD strain in the pUA67 plasmid. The numbers (1-10) in the table represent the clone number. Some of the clones contained mutations that were identified by sequencing. All the evolved lines can be mapped back to these starting clones by comparing position of the clone numbers in this table with the position of the evolved lines in table S5. For example, the clone 8 of AncS strain gave rise to EvoS-A3, EvoS-C3 and EvoS-CA3 lines.

AncS - clone AncD - clone Line Antibiotic(s) 1 2 3 4 5 1 2 3 4 5 N Marker (Kan) 6 7 8 9 10 6 7 8 9 10 A Amp 100 1 2 3 4 5 1 2 3 4 5 A↑ Amp 175↑ 6 7 8 9 10 6 7 8 9 10 C Cef 0.0105↑ 1 2 3 4 5 1 2 3 4 5 I Imi 0.03↑ 6 7 8 9 10 6 7 8 9 10 CA Amp 100 + Cef 0.003↑ 1 2 3 4 5 1 2 3 4 5 IA Amp 100 + Imi 0.02↑

268 Appendix to Chapter 4

Figure S20: Antibiotics concentrations used for selection experiments. We gradually changed the concentration of antibiotics in the medium in the course of the evolution experiment. The purpose of this change was two-fold. First, we wanted to avoid selecting cells that survived solely due to the well-known inoculum effect (Brook I, 1989; Thomson and Moland, 2001). Second, for evolution experiments involving the new antibiotics cefotaxime and imipenem, we wanted to ensure that cells can adapt to the novel antibiotics and not just to the native ampicillin. The horizontal axis below shows the time in rounds of an evolution experiment, and the vertical axis shows the concentration of the antibiotic. (a) For the A↑ lines, we increased the ampicillin concentration to test for the role of gene dosage in the maintenance of duplicate genes. (b) We increased the concentration of (b) cefotaxime in C lines; and (c) imipenem in I-lines. (d) In CA lines, we kept the ampicillin concentration fixed at 100 µg/ml and increased the cefotaxime concentration. (e) In IA lines, we also kept the ampicillin concentration fixed at 100 µg/ml and increased the imipenem concentration in the medium.

269 Appendix to Chapter 4

Figure S21: Correlation between the read coverage and the number of polymorphisms detected. It is possible that one would observe more polymorphisms in a line that has greater read coverage than others. To test this hypothesis, we calculated the correlation between the number of true polymorphisms (after error correction) that we detected from our analysis of the sequencing data and the minimum number of reads (or read coverage) required to detect that many polymorphisms. We note that we only count kinds of polymorphisms at all positions without considering their frequency. Indeed, a significant correlation exists between read coverage and the number of polymorphisms detected (Pearson’s correlation coefficient r=0.83, p=0.0003). Thus, it is essential to take the read coverage into account while calculating sequence diversity.

270 Appendix to Chapter 4

Figure S22: Change in frequency of the mutations that were present in the starting lines over the rounds of the experiment. We observed some high frequency sequence variants in the EvoS and EvoD lines after round 1 of our evolution experiment. These variants were present in our starting clones and were confirmed by sequencing. Specifically, they are the nucleotide changes (amino acid changes) G56A (V10I), A132G (D35G), A177G (D50G), A182G (N52D), T259C (C77C), C594G (T189R), and A866G (I282V). All coordinates are relative to the transcription start site +1. The individual panels show the changes in frequency (in %) of these mutations in the replicate lines over the rounds of the evolution experiment. The frequency of most of these mutations decreased over time and most of them had a frequency lower than 1% at round 12. These observations suggest that most of the mutations were purged from through strong selection in our experiment. It is also possible that the regulatory mutation (T-53C) that rose to high frequency in several EvoD lines or other mutations in the plasmid could drive some of these mutations to extinction. However, it is unlikely to be true in our experiment. If these mutations had no effect on fitness, the regulatory mutation (T-53C) in the EvoD line or other mutations in the plasmid would occur together in at least some of the evolved lines.

271 Appendix to Chapter 4

Figure S22 (continued)

272 Appendix to Chapter 4

Figure S22 (continued)

273 Appendix to Chapter 4

Figure S23: Analysis of the inserts in the plasmids isolated from the clones of the EvoD- CA1 line. To analyze the extent of gene loss, we used a simple PCR based screening to identify individuals that still retained duplicate copies of the TEM-1 gene. In addition, we isolated plasmids from clones derived from these individuals for further verification. Briefly, we isolated plasmids from these clones and digested them with EcoRI and HindIII enzyme which cut out the inserts. We then separated the digested products on an agarose gel to identify the inserts based on their size. We found that some clones, in addition to containing plasmids with duplicate copies of the gene, also contained some plasmids with only one copy of the gene. To ensure that this was not due to contamination of clones, we repeated the procedure, streaking out some of these clones on an LB-agar plate, isolating further more clones, and analyzing their plasmid content. Some clones still contained plasmids with only a single TEM-1 copy, and the same held when we repeated this procedure one more time. Panel (a) shows the digested products from the clones after two rounds of streaking, and panel (b) shows the digested products of double-copy clones after three rounds of streaking. The identifiers on the top of the gel indicate clone numbers as well as the expected inserts of their plasmids (single copy or double copy). There are two possibilities that could explain these observations. First, it is possible that the loss of one copy of the gene occurs at high frequency in these populations. This could lead to generation of plasmids with only one copy of the gene during a very short period of time. Second, it could be possible that some individuals could be transformed with both kinds of plasmids and could pass these plasmids on to their daughter cells. Unfortunately, our experiments do not allow us to distinguish between these two scenarios.

274 Appendix to Chapter 4

Figure S24: Barcodes in the amplicons for 454 sequencing (not to scale). The white region represents an amplicon, the green region a line-specific barcode, the blue region a universal sequence (forward and reverse), the red region a round specific-barcode, and the grey region 454 sequencing adapters (forward and reverse). Each amplicon was sequenced from the forward adapter region (left to right) to ensure that the barcodes were always sequenced.

275 Appendix to Chapter 4

Figure S25: Amplicons used for 454 sequencing and how they map to TEM-1 regions (not to scale). (a) Single-copy lines were sequenced with two amplicons, Single 1 and Single 2. These amplicons map to between -103 and +724, and between +522 and +1287 respectively, relative to the transcription start site (+1). The coding sequence ranges from +35 to +895. (b) Double-copy lines were sequenced with four amplicons, Duplicate 1, Duplicate 2, Duplicate 3 and Duplicate 4. These amplicons map, in this order, to the two gene copies as follows -103 to +724, +1590 to +2335, +522 to +1336 and +967 to +1773. All coordinates are relative to the transcription start (+1) site of the left copy. The coding sequences range from +35 to +895 and from +1103 to +1963. Although the amplicons are approximately 800bp in size, we considered only the first 450 bps in our analysis, because we had found in preliminary sequence analysis that base calling errors increase beyond 450 bps.

276 Appendix to Chapter 4

Figure S26: Error rates in the amplicons from the next generation sequencing data. We sequenced the ancestral strains, AncS and AncD to estimate error rates in PCR and 454 sequencing, and estimated these rates separately for indels and SNPs. There are two amplicons for the AncS strain, numbered 1 and 2, and four amplicons for AncD, numbered 1, 2, 3, and 4. We used only the first 450 bases of each 454 read for error calculation. The x-axis shows each of the amplicons of the AncS strain and the AncD strain. The y-axis shows error rates per base pair. Note that error rates vary among amplicons. We calculated the probability of a SNP/indel in the evolved lines to be a false positive based on this error rates (supplementary methods S4).

277 Appendix to Chapter 4

Figure S27: Error rates in two sets of reads in the 454 sequencing data. We used reads from two sequencing runs from the 454 sequencing platform, and sequenced the ancestral TEM-1 in both sequencing runs, which allowed us to estimate the error rates in two runs independently. The y-axis shows the error rate per base pair. The x-axis shows the error rates for set 1 reads, for set 2 reads and for the combined data. We note that the error rates are higher in the set 1 reads compared to the set2 reads. The red bar shows the SNP error rate, the blue bar shows the indel error rate, and the grey bar shows the total error rate (indel+SNP error rates).

278 Appendix to Chapter 4

Figure S28: Ka/Ks ratio in the evolved lines. We studied the ratio Ka/Ks of non-synonymous divergence at non-synonymous sites (Ka) to the synonymous divergence of synonymous sites (Ks). Specifically, we counted number of non-synonymous and synonymous mutations in each read from the sequencing data for five replicates in each evolved line and calculated the Ka/Ks ratio from it (see supplementary methods S6). The figure shows the Ka/Ks ratio for each evolved line over the rounds of the experiment. We note that several factors may render this ratio unsuitable for inferring the role of selection on the TEM-1 gene in our experiment (supplementary results S8).

279 Appendix to Chapter 4

Figure S29: Average number of nucleotide differences between sequence reads in pair-wise comparisons. The data here represents the number of nucleotide differences in pair-wise comparisons between reads, averaged over all pair-wise comparisons in a replicate line, and subsequently averaged over five replicates for each evolved line. We note that this difference is extremely low (<1) for all the evolved lines.

280 Appendix to Chapter 4

Supplementary results

Results S1: Initial fitness measurement of the ancestral strains. We wanted to know if there are inherent fitness differences between the AncD and AncS strain beyond what we had found on ampicillin and cefotaxime (main text, figure 2). We observed a reduction in growth of the ancestral strains as we increased the imipenem concentration in the medium (supplementary figure S2; supplementary table S1d), but this decrease was not significantly different between AncS and AncD(ANOVA, p=0.16). In low concentrations of ampicillin (50 µg/ml and 100µg/ml) supplemented with imipenem, there was also no significant difference in the reduction in growth between the AncS and AncD (ANOVA; p=0.99 in 50µg/ml ampicillin supplement; p=0.08 in 100 µg/ml ampicillin supplement; supplementary figure S3; supplementary table S1e). However, as we increased the supplemented ampicillin concentration, the AncD strain showed a significantly smaller fitness reduction than the AncS strain (ANOVA; p=0.0001 in 150µg/ml ampicillin supplement; p=0.0002 in 200 µg/ml ampicillin supplement; supplementary figure S3; supplementary table S1e). These results support our observations for the other novel antibiotic cefotaxime and suggest that these observations are not only specific to one novel antibiotic.

Results S2: Analysis of growth curves of the ancestral strains To investigate the differences in growth between the ancestral strains in more detail, we measured their growth for 10 hours (supplementary methods S5) in medium supplemented with 100 µg/ml ampicillin (supplementary figure S4a), 300 µg/ml ampicillin (supplementary figure S4b), 350 µg/ml ampicillin (supplementary figure S4c), 0.105 µg/ml cefotaxime (supplementary figure S4d), 0.06 µg/ml imipenem (supplementary figure S4e), 100 µg/ml ampicillin and 0.006 µg/ml cefotaxime (supplementary figure S4f), 200 µg/ml ampicillin and 0.003 µg/ml cefotaxime (supplementary figure S4g), 100 µg/ml ampicillin and 0.04 µg/ml imipenem (supplementary figure S4h) and 200 µg/ml ampicillin and 0.03 µg/ml imipenem (supplementary figure S4i). The red lines indicate the fold-change in cell density for the AncD strain over time, and the blue lines show the same for the AncS strain. We did not see any significant difference in growth curves between the AncS and AncD strains in 100 µg/ml ampicillin (ANOVA, p=0.23), in 0.105 µg/ml cefotaxime (p=0.50), in 0.06 µg/ml imipenem (p=0.52), and in 100 µg/ml ampicillin and 0.04 µg/ml imipenem (p=0.07). In contrast, we observed significant differences in 300 µg/ml ampicillin (p=7.1×10-7), in 350 µg/ml ampicillin (p=1.5×10-7), in 100 µg/ml ampicillin and 0.006 µg/ml cefotaxime (p=2.0×10-3), in 200 µg/ml ampicillin and 0.003 µg/ml cefotaxime (p=6.2×10- 13) and in 200 µg/ml ampicillin and 0.03 µg/ml imipenem (p=9.6×10-6). Where significant differences occurred, they generally favored the double-copy AncD strain. These results support our observations from the cell-density based fitness measurement of the ancestral strains in the main text.

Results S3: Fitness increase in the evolved lines We measured fitness of the evolved N lines in 100 and 200 µg/ml of ampicillin to check whether they show a fitness increase in ampicillin. We did not see any significant increase in fitness in 100 µg/ml ampicillin (supplementary table S2; supplementary figure S7), but all the lines showed a significant increase in 200 µg/ml of ampicillin (p=5.3×10-3 for the EvoS-N lines, and p=6.1×10-4 for the EvoD-N lines; supplementary table S2; supplementary figure S7).

281 Appendix to Chapter 4

We also measured fitness of the A lines in 100 µg/ml and 200 µg/ml of ampicillin to check whether they had adapted to ampicillin. The EvoS-A as well as the EvoD-A lines showed a significant fitness increase under both concentrations of ampicillin (supplementary table S2; supplementary figure S8). We did not see any significant difference in fitness increase between these lines (p=0.19 in 100 µg/ml of ampicillin; p=0.06 in 200 µg/ml of ampicillin).

Furthermore, we measured fitness of the A↑ lines in 400 µg/ml of ampicillin as these lines were evolved under high concentrations of ampicillin. Both lines showed significant fitness increase in this medium (supplementary table S2; supplementary figure S9), but no difference in fitness between the single-copy and double-copy lines (p=0.22).

Next, we measured fitness of the C lines in 0.0105 and 0.021 µg/ml of cefotaxime to see whether these lines had adapted to cefotaxime. Only the EvoS-C lines showed a significant fitness increase in both these conditions, whereas the EvoD-C lines did not (supplementary table S2; supplementary figure S10). The fitness increase in the EvoS-C lines was significantly higher than that in the EvoD-C lines (p=3.4×10-5 in 0.0105 µg/ml cefotaxime, and p=8.7×10-7 in 0.021 µg/ml cefotaxime).

Furthermore, we measured fitness of the I lines in 0.02 µg/ml and 0.04 µg/ml of imipenem to see whether these lines adapted to imipenem, but did not observe any significant fitness increase in any of these lines (supplementary table S2; supplementary figure S11).

In addition to measuring the fitness of CA lines in 200 µg/ml ampicillin and 0.006 µg/ml cefotaxime, as described in the main text, we also measures their fitness in medium supplemented with 100 µg/ml ampicillin and 0.003 µg/ml cefotaxime, 100µg/ml ampicillin and 0.006 µg/ml cefotaxime, 100µg/ml ampicillin and 0.009 µg/ml cefotaxime, and 200µg/ml ampicillin and 0.003 µg/ml cefotaxime. In all conditions both the EvoS-CA and EvoD-CA lines showed a significant fitness increase (supplementary table S2; supplementary figure S6a). Again, in two conditions, the EvoD-CA lines showed higher fitness than the EvoS-CA lines (p=0.007 and 1.2×10-5 respectively).

Adaptation to ampicillin and cefotaxime could occur through adaptation to the native substrate ampicillin to the novel substrate cefotaxime, or to a combination of both. No significant fitness increase occurred in any of the evolved lines in 100 µg/ml ampicillin, but both single-copy and double-copy lines had evolved increased fitness in 200 µg/ml ampicillin (supplementary table S2; supplementary figure S6b). We also measured fitness of these populations on cefotaxime alone, and found a significant fitness increase (supplementary table S2; supplementary figure S6c), suggesting adaptation to the novel substrate. Again, the higher fitness increase occurred in the EvoS-CA lines compared to the EvoD-CA lines (p=3.5×10-6). These observations suggest that the adaptation in these lines is driven by adaptation to both the native substrate ampicillin and the novel substrate cefotaxime.

We measured fitness of the IA lines in medium supplemented with 100 µg/ml amp + 0.02 µg/ml imi, 100µg/ml amp + 0.04 µg/ml imi, 100µg/ml amp + 0.06 µg/ml imi, 200µg/ml amp + 0.02 µg/ml imi, 200µg/ml amp + 0.04 µg/ml imi and 200µg/ml amp + 0.06 µg/ml imi. In the first condition, only the EvoS lines showed a significant fitness increase, whereas in all other

282 Appendix to Chapter 4

conditions, both EvoS-IA and EvoD-IA lines showed a significant fitness increase (supplementary table S2; supplementary figure S12a). In none of these conditions did the EvoD- IA lines show a higher fitness than the EvoS-IA lines. In contrast, in five of these six conditions the EvoS-IA lines showed higher fitness than the EvoD-IA lines (p=0.003, 1.2×10-5, 4.4×10-5, 0.04, and 0.0001 respectively).

Like for adaptation to cefotaxime and ampicillin, adaptation to imipenem and ampicillin could occur through adaptation to the native substrate ampicillin, to the novel substrate imipenem, or to a combination of both. As we observed before, neither single-copy nor dual-copy lines showed a significant increase in 100 µg/ml ampicillin, but both lines showed a significant fitness increase in 200 µg/ml ampicillin (supplementary table S2; supplementary figure S12b). To find out whether the IA lines evolved increased activity towards the new substrate imipenem, we measured the fitness of the IA lines in 0.03 µg/ml and 0.06 µg/ml imipenem. The EvoS-IA lines showed a significant fitness increase in 0.03 µg/ml imipenem, but not in 0.06 µg/ml imipenem (supplementary table S2; supplementary figure S12c). The EvoD-IA lines had not increased fitness in either medium. These observations suggest that most of the adaptation in these lines is likely driven by adaptation to the native substrate ampicillin.

Taken together, we see poorer adaptation to imipenem than to cefotaxime, which may be caused by the larger difference in chemical structure between imipenem and ampicillin compared to that between cefotaxime and ampicillin.

Results S4: Neutral lines show increased fitness and signs of selection on specific point mutations We measured the fitness of evolved N lines in 100 and 200 µg/ml of ampicillin to check whether they also show a fitness increase in ampicillin. While we did not see any significant fitness increase in 100 µg/ml ampicillin (supplementary table S2; supplementary figure S7), both single- copy and double-copy lines increased fitness in 200 µg/ml ampicillin (p=5.3×10-3 for the EvoS-N lines and p=6.1×10-4 for the EvoD-N lines). The increase in fitness was not significantly different between single-copy and double-copy lines (p=0.81). The likely reason is the plasmid copy number increase we observe in these and other lines (figure S13), as discussed in the main text, which effectively leads to an increase in gene dosage.

We also note that the fitness values of the EvoS-N and EvoD-N lines were significantly lower than those of the EvoS-A and EvoD-A lines, respectively, in 100 µg/ml ampicillin (p=0.01 for comparison between EvoS-N and EvoS-A lines; p=0.003 for comparison between EvoD-N and EvoD-A lines). In medium with 200 µg/ml ampicillin, the EvoS-N lines still had significantly lower fitness than the EvoS-A lines (p=0.02) whereas this was no longer true for the comparison between the EvoD-N and EvoD-A lines (p=0.71). To interpret these observations, it is important to be aware that many individuals in the EvoD lines lost a copy of the TEM-1 gene. Furthermore, we observed that the EvoD-A lines show a higher incidence of such loss than the EvoD-N lines (supplementary figure S14; supplementary table S7). The difference in the frequency of gene loss between the EvoD-N and EvoD-A lines might explain why fitness differences between these two lines disappear in 200 µg/ml ampicillin.

283 Appendix to Chapter 4

We also observed signs of selection acting on specific point mutations in the neutral lines. First, the regulatory mutation T-53C rose to high frequency in the EvoD-N lines (supplementary figure S15). Second, some of the neutral EvoS and EvoD lines contained mutations at high frequency after round 1 of our experiment (supplementary figure S22). Some of these nucleotide mutations (amino acid changes) are A177G (D50G) in the EvoS-N4 line, A182G (N52D) in the EvoS-N3 line, and T259C (C77C) in the EvoS-N5 and EvoD-N1 lines. These alleles later decreased in frequency and were present at a frequency of less than 1% after round 12. Although it is possible that these frequency changes are caused by neutral genetic drift, it is unlikely that drift would cause similar frequency changes in all the neutral lines, as we observed (supplementary figure S22). Taken together, these observations suggest that specific point mutations were subject to frequency changes caused by selection. There are two possible explanations. First, in the neutral condition, especially high expression of the TEM-1 gene could be detrimental, because such expression does not provide a benefit in the absence of the antibiotic. Increased gene dosage in the EvoD-N lines associated with increased plasmid copy number could then be deleterious. Thus, the regulatory mutation T-53C that presumably reduces expression of one copy could be beneficial and rise to high frequency in these lines. Second, the neutral lines could face selection at the level of transcription, translation and protein folding. For example, if a mutation reduces stability of the beta lactamase protein, this might lead to protein aggregation which can be toxic for a cell. In such a scenario, this mutation would be selected against in the evolved lines. The detrimental effects of these mutations could also be pronounced in case of increased dosage associated with increased plasmid copy number. For example, with increased gene dosage a mutation that causes protein aggregation can be more harmful for cells due to toxic effects of aggregates than at lower gene dosage.

Results S5: Plasmid copy number increase in the evolved lines We observed an increase in plasmid copy number in all evolved lines (supplementary figure S13; supplementary table S6). It is likely that the gene dosage increase associated with the increase in plasmid copy number is beneficial for the cells in all selection conditions. We also observed an increase in plasmid copy number in the neutral lines (N lines) which was selected only on the plasmid marker kanamycin during the course of the evolution experiment.

The likely explanation is that plasmid copy number could also increase through selection unrelated to the TEM-1 gene. Specifically, if some plasmids have higher copy numbers than others, then we will preferentially transform such plasmids into E. coli cells during each round of our experiment. Any individual plasmid that has a high copy number would leave effectively more ‘offspring’ over time. While, it is possible that this process is solely responsible for the increase in plasmid copy number we see under the influence of antibiotics, we note that this is probably not the case (supplementary results S7).

We did not see any significant difference in plasmid copy numbers between the EvoS and the EvoD populations for C lines (p=0.31), I lines (p=0.42), CA lines (p=0.1) and IA lines (p=1). However, this was not the case for N lines, A lines and A↑ lines. In these lines, the EvoD populations increased their plasmid copy numbers to a significantly greater extent than the EvoS populations (p=0.03 for N lines, p=0.008 for A lines, and p=0.008 for A↑ lines). One possible explanation is that this increase compensates for a reduction in fitness due to TEM-1 gene copy loss in these populations. We speculate that it may not have been observed in the other EvoD

284 Appendix to Chapter 4

populations, because an excessive increase in plasmid copy number may be detrimental in some selection conditions.

We also sequenced the origin of replication of the ancestral plasmid and some of the evolved plasmids using Sanger sequencing, to identify candidate SNPs that could cause an increase in plasmid copy number (see supplementary methods S7). Specifically, we sequenced the origin of replication of the evolved plasmid populations in EvoS-CA3 and EvoD-CA3 lines. Because we sequenced plasmids from heterogeneous populations, we would be able to discover only those mutations that rose to high frequency (>50%) in these populations. We observed two such mutations in the EvoS-CA3 line. One of these mutations, G960A, was present at a frequency of approximately 50%. The second mutation G996A was fixed in this population. In the EvoD-CA3 line, we observed only one mutation. This mutation is the G996A mutation that is also present in the EvoS-CA3 line. It rose to approximately 50% frequency in the EvoD-CA3 line. In addition, we sequenced the origin of replication of the evolved plasmids in the EvoD-N3 line. This is one of the many EvoD lines that had shown an especially high copy number increase and as such could have some unique mutations in the origin of replication. Indeed, we found 3 unique mutations in this line that were not observed in the EvoS-CA3 and the EvoD-CA3 lines. These three mutations were A981G, A997G, and A1066G, and all of them were present at approximately 50% frequency in the population.

The actual role of each of these mutations in the plasmid copy number increase is not clear, because none of them are identical to previously described copy number mutants for this origin of replication (Armstrong et al., 1984). However, all of these mutations were present in the coding region of the gene encoding the repA protein, an observation consistent with previous work (Armstrong et al., 1984). They are thus likely to be responsible for the increase in copy number in our evolved lines. Another line of evidence comes from the fact that all these mutations rose to high frequency in our populations, suggesting a beneficial effect of these mutations.

Results S6: Sequence diversities in the evolved lines We calculated the sequence diversities of all the evolved lines separately for rounds 1, 4, 8, and 12 of the evolution experiment. To calculate diversity, we only counted polymorphisms without considering their frequency and normalized with the read coverage (see Methods in the main text). In the N lines, we observed significantly higher sequence diversity in the EvoD-N lines compared to the EvoS-N lines only at round 8 (p=0.008; supplementary figure S16a). These lines should not show any difference in diversity as they never experienced antibiotic selection. However, there could be selection on TEM-1 gene, unrelated to antibiotic selection, such as selection for gene expression level. Thus, regulatory mutations might occur in these lines that could influence our diversity calculations. One example of such regulatory mutations is the T- 53C mutation that rose to high frequency in three of the five replicate EvoD-N lines. In the A lines, the EvoD-A lines showed higher sequence diversity than the EvoS-A lines at round 1 (p=0.008; supplementary figure S16b). In thethe A↑ lines, the EvoD-A↑ lines showed higher sequence diversity than the EvoS-A↑ lines at round 1 (p=0.04), at round 4 (p=0.04) and at round 8 (p=0.04; supplementary figure S16c). In the C lines, the EvoD-C lines showed higher sequence diversity than the EvoS-C lines at round 1 (p=0.008; supplementary figure S16d). We did not see any significant difference between diversities among the EvoD-I and EvoS-I lines in any round

285 Appendix to Chapter 4

of the evolution experiment (supplementary figure S16e). In the CA lines, the EvoD-CA lines showed higher sequence diversity than the EvoS-CA lines at round 8 (p=0.04; supplementary figure S16f). In the IA lines, the EvoD-IA lines showed higher sequence diversity than the EvoS- IA lines at round 8 (p=0.02; supplementary figure S16), whereas the EvoS-IA lines showed higher diversity than the EvoD-IA lines at round 12 (p=0.02; supplementary figure S16g). Supplementary figure S17 shows the change in diversities in the replicate populations in more detail.

Results S7: Evolved lines show signs of selection caused by antibiotics Experimental results suggest that the increased gene dosage due to an increase in plasmid copy number was the most important factor behind the fitness increase in our evolved lines. The neutral lines, which never experienced antibiotic selection, also showed a plasmid copy number increase that resulted in increased fitness of these lines in ampicillin. In supplementary results S5, we discussed a possible mode of selection at the plasmid level, unrelated to selection on the TEM-1 gene(s), which could cause this copy number increase in the neutral lines.

One important question is whether this selection at the plasmid level could be solely responsible for the fitness increase and the molecular changes we observed in all the evolved lines. Several lines of evidence suggest that this is not the case. First, plasmid copy number could increase through selection at the plasmid level, selection for increased dosage of the TEM-1 gene, or through a combination of both. We might not always be able to distinguish the role of each of these two types of selection in changing plasmid copy numbers. However, if an increase in gene dosage (due to an increase in plasmid copy number) was not beneficial for cells in antibiotic selection condition, high copy number plasmids would be selected against. Such selection would be reflected in the plasmid copy number in the lines that were exposed to antibiotics. This is precisely what we see in the EvoD lines under selection in novel antibiotics or in novel antibiotics together with ampicillin (supplementary figure S13; supplementary table S6). The EvoD-N lines show, on average, a 14 fold increase in plasmid copy number. In comparison, the EvoD-C, EvoD-I, EvoD-CA and EvoD-IA lines show a lower fold-increase in copy number (on average a 6-fold increase in EvoD-C lines, p=0.03, one sided Wilcoxon rank sum test; a 7-fold increase in EvoD-I lines , p=0.02; a 9-fold increase in EvoD-CA lines, p= 0.08; and a 6-fold increase in EvoD-IA lines, p=0.008; supplementary figure S13; supplementary table S6), suggesting selection against very high copy number plasmids in these antibiotic conditions.

Two reasons could be responsible. First, an increase in gene dosage due to increased plasmid copy number could be beneficial for cells, but only up to certain extent in our selection conditions. For example, a very high dosage could impose a very high expression cost, without providing enough benefit, on the cells under these conditions. Previous experimental work shows that optimal gene expression level is indeed dependent on the growth medium (Dekel and Alon, 2005). Second, maintaining a very high copy number plasmids could impose a very high metabolic cost on cells in these conditions. This is supported by the results of a previous study on gene duplication using two different plasmids. This study observed loss of one plasmid when cefotaxime concentration in the medium was increased (Holloway et al., 2007), suggesting a high plasmid maintenance cost on the cells.

286 Appendix to Chapter 4

A second difference is observed in the frequency of regulatory mutation T-53C that was present at high frequency in several EvoD lines. This mutation rose to high frequency at round 8 and subsequently dropped to low frequency (<1%) in almost all the EvoD lines except the EvoD-N lines (supplementary figure S15). In three replicate EvoD-N lines, where this mutation was present, it remained at high frequency (>30% in EvoD-N3, and >40% in EvoD-N4 and EvoD- N5). This mutation may down-regulate expression of one of the TEM-1 gene copies and thus, could be detrimental in the EvoD lines under selection in increasing antibiotic concentrations beyond round 8. In comparison, this mutation could be beneficial in the EvoD-N lines, which also show very high fold-increase in plasmid copy number, by reducing expression cost.

Finally, we observed loss of one of the duplicate gene copies in all the EvoD lines (supplementary figure S14; supplementary table S7). However, this loss occurred to different extents in different EvoD lines. Except for EvoD-A↑ lines, all other EvoD lines that experienced antibiotic selection showed a lower frequency of retention of duplicate gene copies than the neutral lines. In three of the five replicate EvoD-N lines, more than 50% of the individuals retained duplicate genes. In comparison, we did not observe such a high frequency of duplicate gene retention in any of the replicate EvoD-C, EvoD-I and EvoD-CA lines. In the EvoD-C lines, the replicate line with highest frequency of retention had only ~35% of the individuals containing duplicate copies of the gene (EvoD-C4). In both EvoD-I and EvoD-CA lines, the highest frequency of duplicate retention was ~25% (in EvoD-I5 and EvoD-CA4 respectively). In only one of the replicate EvoD-IA lines did 50% of the individuals retain duplicate gene copies (EvoD-IA5), whereas in all other replicate lines, less than 25% of the individuals retained duplicate gene copies. We observed a similar pattern in EvoD-A lines, where only in one replicate line ~80% of the individuals retained duplicate copies (EvoD-A1), whereas in all other replicate lines, less than 30% individuals retained duplicate copies. Higher gene loss in the EvoD lines selected in antibiotics could be due to two reasons. First, plasmids with single copy genes were selected in the antibiotics medium due to their higher fitness. Second, tandem duplicates can have a lower stability of tandem on a plasmid. However, our experiments suggest that the first possibility is unlikely to be true, at least in the EvoD-CA lines where clones with duplicate genes showed higher fitness than the single genes. Thus, the higher gene loss in the antibiotic selected medium is likely to be caused by higher instability of tandem duplicates in antibiotics medium. In the EvoD-A↑ lines, retention of duplicate genes may have been favored due to the increased gene dosage that can be beneficial in high ampicillin concentrations.

Results S8: Several factors can affect the Ka/Ks ratio in our experiment We calculated the Ka/Ks ratio for the evolved TEM-1 genes, which can indicate the type and the extent of selection operating on this gene in our experiment. There are several factors in our experiment that could affect the Ka/Ks ratio and thus, could prevent us from inferring the extent of selection on the TEM-1 gene. First and foremost, the number of mutations in our evolved lines is extremely small, which is reflected in very low average numbers of nucleotide differences between reads in pair-wise comparisons (supplementary figure S29). Such low diversity could introduce substantial sampling bias in our sequence data. Second, selection on the regulatory region of TEM-1 gene can also affect the Ka/Ks ratio. For example, the non-coding mutation T- 53C that rose to high frequency in several EvoD lines (supplementary figure S15) could influence the sequence changes we observe in coding regions due to hitchhiking. Third, the Ka/Ks ratio could also be affected by a mutational bias in the mutator E. coli strain used in our

287 Appendix to Chapter 4 experiment. Our mutator strain highly expresses the error prone DNA polymerase gene dinB that has been shown to have a high mutational bias (Wagner and Nohmi, 2000). More generally, such a mutational bias is common among mutator strains (Cox and Yanofsky, 1967; Schaaper RM, 1988). Fourth, analysis of selection using the Ka/Ks ratio considers all non-synonymous mutations to be deleterious or beneficial and all synonymous mutations to be neutral, which might not be the case. For example, the high value of Ka/Ks in EvoS-N lines (supplementary figure S28) (Round 1) is due to a single non-synonymous valine to isoleucine change (which are very similar amino acids whose substitution for each other is often neutral). Furthermore, in our experiment, selection may have acted not only on the TEM-1 gene but also on the KanR gene, as well as on plasmid stability. Such selection could also affect the Ka/Ks ratio in ways that could make its interpretation difficult.

288 Appendix to Chapter 4

Supplementary methods

Methods S1: Analysis of variance (ANOVA) To find out whether there is any fitness difference between the AncS strain and the AncD strain, we used an analysis of variance (ANOVA) to compare growth data obtained at various concentrations of a particular antibiotic. Briefly, we performed a two-way ANOVA considering the antibiotic concentration and the strain as two independent variables and the growth as the dependent variable. We also considered the interaction term between the two independent variables. We used “repeated measures” ANOVA (Larson, 2008; Sullivan 2008) for comparison of growth curves, where we considered growth measurements at different time points as repeated measurements and the ancestral strain as an independent variable.

Methods S2: Construction of the mutator strain - E. coli TB90 The genotype of the mutator E. coli strain is ∆recA::frt PBAD-dinB (in an MG1655 background). We used a strain lacking the recA gene to reduce recombination and gene loss among the duplicate TEM-1 copies. The dinB gene encodes an error prone DNA polymerase in E. coli (DNA polymeras IV; Wagner et al., 1999). We replaced the promoter of the dinB gene with the promoter of the arabinose operon (Schleif, 2000) to induce mutagenesis with L-(+)-arabinose, which results in high expression of the dinB gene. Because DNA polymerase IV is error prone, it introduces mutations in the replicating DNA. We calculated the mutation rate in this E. coli strain using the rate at which rifampicin resistant cells remerge through mutation (Maisnier-Patin et al., 2005). By this criterion, the strain had a 600-fold higher mutation rate than wild type E. coli in the presence of 0.05% L-(+)-arabinose (supplementary table S3).

Methods S3: Sample multiplexing for sequencing We used 4 regions of a picotiter plate for 454 sequencing (Margulies et al., 2005). For each round of the evolution experiment, we had 70 replicate lines of which 35 were EvoS lines and 35 were EvoD lines. For EvoS lines, we sequenced these 35 lines in 4 separate regions, i.e., we sequenced 9, 9, 9 and 8 EvoS lines per region, and did the same for EvoD lines. We designed 18 unique barcodes for each amplicon to distinguish the maximally 18 lines in a region of the plate. We added this barcode to the 5’ end of an amplicon along with a sequence that was identical for all the lines. In a next step, we used this universal sequence to introduce a round-specific barcode to distinguish between the same lines from different rounds of the evolution experiment (supplementary figure S24). We sequenced all amplicons from the same direction to ensure that the barcodes were always sequenced. Supplementary table S5 lists all the primers along with the barcodes used for this purpose.

Methods S4: Error correction for sequencing data We sequenced the ancestral strains, AncS and AncD, to estimate error rates in PCR and 454 sequencing. Because the inserts in our experiment are longer than the maximum length of amplicons that can be sequenced with current technology, we amplified the insert in the AncS strain in two separate amplicons numbered as 1 and 2. For the AncD strain, we amplified the insert in four amplicons numbered as 1, 2, 3, and 4 (supplementary figure S25). We sequenced all the EvoS lines with same two amplicons (i.e., in the same region of the constructs), and all the EvoD lines with the same four amplicons as in the ancestral strains. We used only the first 450 bases of each read for error calculation, and calculated error rates for SNPs and indels separately

289 Appendix to Chapter 4

(supplementary figure S26). We performed several error corrections before calling true SNPs and indels.

1. Test for two sets of sequencing data The sequencing data we have comes from two different sequencing runs, and we call the reads thus obtained “set1” reads and “set2” reads. Error rates could differ between these two different sets. Since we sequenced amplicons from the ancestral strains in both runs, we can calculate the error rates for each run. Our data confirmed that the error rate is different for these two sets of reads (supplementary figure S27). To avoid erroneous SNP/indel calls we used a statistical test of the hypothesis that a true SNP/indel should be observed in both sets of reads. Thus, a significant difference in the incidence of an SNP/indel between the two sets would suggest that the SNP/indel exists because of an error bias in one of the read sets.

Specifically, assume that a variant M (SNP or indel) has a relative frequency f in a given evolved (“test”) amplicon. This frequency can be calculated from the total coverage N (with N1 reads from set 1 and N2 reads from set 2), and from the number of reads O1 that carry the variant M in N1 , as well as the number of reads O2 that carry M in N2, as

()OO12+ f = . We define f1 and f2 as f111= ON/ and f222= ON/ . To ask whether the ()NN12+ frequencies observed from two sets of reads differ significantly from each other, we performed an exact binomial test. (We also used this test whenever f1=0 or f2=0, the rationale being that if a SNP occurs in both read sets, it is more likely to be a real mutation and less likely to be an artifact based on different error rates in two read sets.) To perform the test, we defined a random variable Z which denotes the number of occurrences of the sequence variant at a given position in the test amplicon due to errors. Z follows a binomial distribution B(N,r). We distinguish four cases to estimate its parameters. First, if f1< f2 and f1 is not equal to zero, then N=N2 and r= f1 . In this case, we can estimate the probability p that we observe the variant at frequency f2 or greater, given its frequency f1 in set 1 as N2 N 2 ! i ()Ni2 − pZO=≥=Pr(21 )∑ ff (1 −1 ) . iO= 2 iN!(2 − i )! Second, if f1< f2 and f1 =0, then N=N1 and r= f2, and we can calculate an analogous p-value as N 1 ! 0 (0)NN11− p ===Pr(Zf 0)22 (1 −f ) =− (1f 2 ) . 0!(N1 − 0)! Third, if f2< f1 and f2 is not equal to zero, then N=N1 and r= f2 and N1 N 1 ! i ()Ni1 − pZO=≥=Pr(12 )∑ ff (1 −2 ) . iO= 1 iN!(1 − i )! Fourth, if f2< f1 and f2 =0, then N=N2 and r= f1 and we calculate a p-value as

N2 ! 0 (0)NN22− p ===Pr(Zf 0)11 (1 −f ) =− (1f 1 ) . 0!(N2 − 0)! If we observed a p of less than 0.05 for a variant, we concluded that the incidence of the variant showed a greater difference between the read sets than expected by chance alone, and we eliminated the variant from further analysis.

290 Appendix to Chapter 4

2. Correction for overall error rate In the next step, we performed a correction based on the overall error rates for SNPs and indels that we calculated from the ancestral sequences. Let us assume that a mutation (SNP or indel) occurs in the test sequence at a frequency given by a/N. To ask whether this mutation could occur just because of errors in the sequencing data, we defined a random variable Z which denotes the number of occurrences of the mutation at that position in the test sequence due to errors. Z follows a binomial distribution, B(N,r), where r is an amplicon-specific error rate for SNPs or indels. To estimate this error rate for each amplicon, we used the maximum of the two error rates from read set 1 and 2 that we observed in each of the ancestral sequences. We tested the null hypothesis that the number of mutations in the test sequence could be observed due to sequencing error alone, by calculating a p-value with the expression

N N ! pZa=≥=Pr( ) rriN (1 − )()−i ∑ iN!(− i )! ia= If p<0.05, we concluded that the variant did not occur just because of sequencing errors. We then subjected it to further filtering for position-specific errors in amplicons, as described next.

3. Filtering for position-specific errors in amplicons Some errors may preferentially occur at certain positions of an amplicon in the ancestral sequence. If so, these errors might also preferentially occur at the same positions in the evolved sequences. To correct for this kind of error, we compiled a list of such repeated errors (base substitutions, indels) for each of the six amplicons from the ancestral strains, and for each position of the amplicon. We found four types of insertions, three types of deletions, and three types of substitutions that (erroneously) occurred at specific positions in the ancestral sequences. When we saw such changes in the evolved sequences, we eliminated them. By way of example, if we found that a C→G substitution occurred even once at position 154 in amplicon Single 1 in the ancestral sequences, we would discard this change from all our evolved sequences before analyzing them further.

Methods S5: Growth curves (cell density increases over time) To analyze how cell densities increase in a growing culture over time, we cultured cells of the ancestral strains for 16 hours at 37°C in LB medium supplemented with 25 µg/ml kanamycin. We measured the cell density of these starter cultures by measuring optical density at 600 nm (OD(0)). We then diluted the cells 100 times in 2ml LB medium supplemented with appropriate antibiotics and allowed them to grow for 10 hours on a 96 deep-well plate. Every hour we withdrew 20µl of the culture and mixed it with 80 µl LB before measuring optical density at OD() t 600nm (OD(t)). We then calculated the fold-change in cell density as . We (OD (0) /100) performed all such estimates in triplicate.

Methods S6: Estimation of the Ka/Ks ratio Ka/Ks ratio for a protein coding gene is defined as the ratio of the number of non-synonymous mutations per non-synonymous site to number of synonymous mutations per synonymous site (Li, 1997; Kreitman M, 2000). The Ka/Ks ratio can indicate the type (purifying, neutral or positive) and the extent of selection operating on a gene. To estimate the Ka/Ks ratio, we counted

291 Appendix to Chapter 4 the total number of non-synonymous and synonymous mutations in the sequence reads for five replicates in each evolved line. We then estimated the ratio as

(Number of non− synonymous mutations / Number of non-synonymous sites) (Number of synonymous mutations / Number of synonymous sites) (Number of non− synonymous mutations / Number of synonymous mutations) = (Number of non− synonymous sites / Number of synonymous sites) (Number of non− synonymous mutations / Number of synonymous mutations) = 3.19 , since (Number of non− synonymous sites / Number of synonymous sites)=3.19 for the TEM-1 gene.

Methods S7: Sequencing origin of replication of ancestral and evolved plasmids

The plasmid we used for our experiment contains the SC101 origin of replication (Zaslaver et al., 2006). At the end of our evolution experiment, all the evolved lines showed an increase in plasmid copy numbers which is likely to happen through mutations in the origin of replication. To look for candidate mutations in the plasmids that could explain their increased copy number, we sequenced the origin of replication region (NCBI GenBank ID: K00042.1) of plasmids in the ancestral strains and in some of the evolved lines. We considered a SNP to be a true positive SNP if it was observed after sequencing from both directions. We numbered the SNP positions based on the numbering in the origin of replication sequence given in the NCBI GenBank entry with identifier K00042.1. We estimated SNP frequencies from chromatograms in the Sanger sequencing.

292 Appendix to Chapter 4

Supplementary tables

Table S1: Fitness measurements of the ancestral strains

(a) in ampicillin

Fold-increase in cell Fold-increase in cell Ampicillin concentration density in the AncS strain density in the AncD strain (µg/ml) (mean±sd) (mean±sd) 0 117.91±4.89 112.04±1.95 50 124.63±4.34 116.52±1.42 100 124.82±3.94 116.96±1.43 150 126.10±4.82 114.87±3.27 200 127.82±5.49 117.46±2.80 250 107.03±3.84 113.72±3.50 300 88.91±2.75 112.68±1.66 350 64.20±7.55 97.36±3.22 400 52.60±12.32 78.02±4.66 450 31.01±2.17 61.06±6.48 500 24.85±1.48 44.36±5.37

(b) in cefotaxime

Fold-increase in cell Fold-increase in cell Cefotaxime concentration density in the AncS strain density in the AncD strain (µg/ml) (mean±sd) (mean±sd) 0 129.58±7.40 128.36±9.11 0.0015 139.15±1.02 131.29±4.53 0.0030 135.52±9.03 132.33±5.79 0.0045 134.63±2.79 123.69±2.16 0.0060 112.11±3.01 113.40±4.97 0.0075 106.64±5.03 104.64±4.08 0.0090 72.00±4.51 78.19±1.86 0.0105 40.63±5.98 45.81±4.07 0.0120 27.88±3.13 31.95±1.16 0.0135 19.61±0.35 20.92±0.72 0.0150 13.57±0.50 15.11±1.06 0.1650 12.55±0.48 13.11±0.96

293 Appendix to Chapter 4

(c) in ampicillin+cefotaxime

Ampicillin Cefotaxime Fold-increase in cell Fold-increase in cell concentration concentration density in the AncS density in the AncD (µg/ml) (µg/ml) strain (mean±sd) strain (mean±sd) 100 0 130.03±3.11 125.51±1.93 100 0.0015 146.26±2.66 136.62±3.93 100 0.0030 123.18±14.91 128.01±2.87 100 0.0045 109.93±5.43 118.70±2.03 100 0.0060 61.48±7.62 98.66±6.69 100 0.0075 29.05±2.58 52.72±15.58 100 0.0090 16.40±2.99 26.74±5.37 150 0 130.73±5.40 127.09±0.50 150 0.0015 120.52±4.89 122.91±1.94 150 0.0030 103.57±5.43 116.46±5.43 150 0.0045 47.21±7.07 95.75±5.03 150 0.0060 38.31±0.57 65.26±7.17 150 0.0075 21.18±1.93 34.43±2.99 150 0.0090 16.88±1.56 20.06±1.40

(d) in imipenem

Fold-increase in cell Fold-increase in cell Imipenem concentration density in the AncS strain density in the AncD strain (µg/ml) (mean±sd) (mean±sd) 0 123.15±1.26 126.58±4.64 0.005 113.27±0.86 115.36±2.40 0.010 114.99±1.90 109.95±1.85 0.015 109.31±3.76 106.69±2.27 0.020 106.39±2.70 106.57±1.64 0.025 101.34±4.73 101.52±2.64 0.030 97.71±3.48 99.64±3.49 0.035 91.76±1.93 93.11±3.47 0.040 84.31±1.40 86.61±1.87 0.045 75.52±2.98 78.71±4.84 0.050 65.70±3.85 69.42±4.41 0.055 51.47±1.56 59.38±4.08 0.060 43.14±3.53 50.19±4.17

294 Appendix to Chapter 4

(e) in ampicillin+imipenem

Ampicillin Imipenem Fold-increase in cell Fold-increase in cell concentration concentration density in the AncS density in the AncD (µg/ml) (µg/ml) strain (mean±sd) strain (mean±sd) 50 0 118.04±4.97 126.53±7.21 50 0.01 112.94±5.58 102.82±1.74 50 0.02 110.14±1.23 102.48±1.01 50 0.03 96.37±2.35 94.96±0.44 50 0.04 84.51±1.30 87.78±3.03 50 0.05 59.86±4.18 66.90±4.42 50 0.06 43.74±4.65 44.09±2.20 100 0 133.97±2.17 120.38±10.30 100 0.01 104.67±3.01 116.29±6.44 100 0.02 102.97±1.44 108.25±3.13 100 0.03 93.02±0.48 98.05±3.27 100 0.04 74.62±6.70 83.80±5.36 100 0.05 50.21±5.71 58.41±2.26 100 0.06 38.38±3.55 42.49±3.47 150 0 122.05±4.92 133.99±3.72 150 0.01 106.00±3.44 110.07±1.51 150 0.02 108.59±3.24 110.29±1.33 150 0.03 86.30±4.33 94.26±7.54 150 0.04 68.76±5.18 89.18±2.46 150 0.05 45.76±5.60 66.33±3.67 150 0.06 44.34±7.97 46.60±4.95 200 0 125.60±3.98 125.22±1.93 200 0.01 107.21±10.36 119.87±4.32 200 0.02 89.31±2.66 110.18±2.95 200 0.03 59.54±2.85 94.29±5.70 200 0.04 54.18±3.66 74.08±0.97 200 0.05 52.58±3.85 47.26±2.80 200 0.06 42.89±4.21 38.27±1.54

295 Appendix to Chapter 4

Table S2: Relative fitness values of the evolved lines

Line Antibiotics Relative fitness p-value concentrations (µg/ml) mean±sd EvoS-CA Amp100+Cef 0.003 1.65±0.34 6.1×10-5 EvoS-CA Amp100+Cef 0.006 2.52±0.53 6.1×10-5 EvoS-CA Amp100+Cef0.009 1.47±0.25 6.1×10-5 EvoS-CA Amp 200+Cef 0.003 2.52±0.44 6.1×10-5 EvoS-CA Amp 200+Cef 0.006 5.12±1.05 6.1×10-5 EvoS-CA Cef 0.0105 1.66±0.31 6.1×10-5 EvoS-CA Cef 0.021 1.60±0.26 6.1×10-5 EvoS-CA Amp100 0.96±0.18 0.36 EvoS-CA Amp200 1.10±0.08 4.3×10-4 EvoD-CA Amp100+Cef 0.003 1.53±0.21 6.1×10-5 EvoD-CA Amp100+Cef 0.006 2.22±0.44 6.1×10-5 EvoD-CA Amp100+Cef0.009 1.40±0.20 6.1×10-5 EvoD-CA Amp 200+Cef 0.003 2.14±0.22 6.1×10-5 EvoD-CA Amp 200+Cef 0.006 3.25±0.69 6.1×10-5 EvoD-CA Cef 0.0105 1.15±0.12 8.5×10-4 EvoD-CA Cef 0.021 1.30±0.18 6.1×10-5 EvoD-CA Amp100 0.87±0.10 3.1×10-4† EvoD-CA Amp200 1.05±0.06 5.4×10-4

EvoS-IA Amp100+Imi 0.02 1.10±0.07 6.1×10-4 EvoS-IA Amp100+ Imi 0.04 1.33±0.08 6.1×10-5 EvoS-IA Amp100+Imi0.06 1.21±0.10 6.1×10-5 EvoS-IA Amp 200+Imi 0.02 1.77±0.13 6.1×10-5 EvoS-IA Amp 200+Imi 0.04 1.55±0.07 6.1×10-5 EvoS-IA Amp200+Imi0.06 1.21±0.56 0.17 EvoS-IA Imi 0.03 1.14±0.07 6.1×10-5 EvoS-IA Imi 0.06 1.02±0.10 0.15 EvoS-IA Amp100 1.04±0.08 0.11 EvoS-IA Amp200 1.07±0.09 8.4×10-3 EvoD-IA Amp100+Imi 0.02 1.02±0.05 0.25 EvoD-IA Amp100+ Imi 0.04 1.16±0.08 6.1×10-5 EvoD-IA Amp100+Imi0.06 1.00±0.12 0.98 EvoD-IA Amp 200+Imi 0.02 1.67±0.11 6.1×10-5 EvoD-IA Amp 200+Imi 0.04 1.37±0.12 6.1×10-5 EvoD-IA Amp200+Imi0.06 1.12±0.37 0.25 EvoD-IA Imi 0.03 1.03±0.11 0.49 EvoD-IA Imi 0.06 0.95±0.12 0.10 EvoD-IA Amp100 1.00±0.05 0.68 EvoD-IA Amp200 1.04±0.08 0.04

EvoS-N Amp100 1.04±0.11 0.25

296 Appendix to Chapter 4

EvoS-N Amp200 1.10±0.13 5.3×10-3 EvoD-N Amp100 1.03±0.06 0.15 EvoD-N Amp200 1.09±0.07 6.1×10-4

EvoS-A Amp100 1.17±0.12 6.1×10-4 EvoS-A Amp200 1.21±0.13 6.1×10-5 EvoD-A Amp100 1.12±0.09 6.1×10-5 EvoD-A Amp200 1.12±0.10 1.2×10-4

EvoS-A↑ Amp400 1.69±0.15 6.1×10-5 EvoD-A↑ Amp400 1.80±0.24 6.1×10-5

EvoS-C Cef0.0105 1.58±0.19 6.1×10-5 EvoS-C Cef0.021 1.37±0.12 6.1×10-5 EvoD-C Cef0.0105 1.16±0.25 0.10 EvoD-C Cef0.021 1.02±0.15 0.98

EvoS-I Imi0.02 0.97±0.13 0.30 EvoS-I Imi0.04 1.00±0.15 0.60 EvoD-I Imi0.02 0.90±0.10 2.0×10-3 † EvoD-I Imi0.04 0.88±0.11 6.1×10-4†

† Fitness values significantly lower than 1

297 Appendix to Chapter 4

Table S3: Average mutation rate in the mutator E. coli strain (TB90)

% L-(+)-arabinose Average mutation rate (per base Relative Increase per generation) 0 1.46×10-8 1 0.003125 1.13×10-6 77 0.0125 3.02×10-6 206 0.05 8.64×10-6 590

298 Appendix to Chapter 4

Table S4: Primer pairs for screening of individuals containing duplicate genes in the EvoD lines

Forward primer 5’ GGCGTATCACGAGGCCCTTTCG 3’ Reverse primer 5' GGGTTCCGCGCACATTTCGGA 3'

299 Appendix to Chapter 4

Table S5: Primers with barcodes for sample multiplexing in 454 sequencing. (a) Round-specific primers

Forward primers Round 5’ Forward adapter – Barcode – Universal sequence (forward) 3’ 1 CCATCTCATCCCTGCGTGTCTCCGACTCAG – ACGAGTGCGT – GTCTGCACGCTCGACCTCTG 2 CCATCTCATCCCTGCGTGTCTCCGACTCAG – ACGCTCGACA – GTCTGCACGCTCGACCTCTG 3 CCATCTCATCCCTGCGTGTCTCCGACTCAG – AGACGCACTC – GTCTGCACGCTCGACCTCTG 4 CCATCTCATCCCTGCGTGTCTCCGACTCAG – AGCACTGTAG – GTCTGCACGCTCGACCTCTG 8 CCATCTCATCCCTGCGTGTCTCCGACTCAG – CTCGCGTGTC – GTCTGCACGCTCGACCTCTG 12 CCATCTCATCCCTGCGTGTCTCCGACTCAG – CGAGAGATAC – GTCTGCACGCTCGACCTCTG

Reverse primer Round 5’ Reverse adapter – Universal sequence (reverse) 3’ All CCTATCCCCTGTGTGCCTTGGCAGTCTCAG – CGACAGCATGGGATCAGCACG

(b) Line-specific primers

Forward primers Amplicon & ID 5’ Universal sequence(forward) – Barcode – primer binding site 3’ Single1-ID1 GTCTGCACGCTCGACCTCTG – ATACGACGTA – GAGGCCCTTTCGTCTTCACCTC Single1-ID2 GTCTGCACGCTCGACCTCTG – TCACGTACTA – GAGGCCCTTTCGTCTTCACCTC Single1-ID3 GTCTGCACGCTCGACCTCTG – CGTCTAGTAC – GAGGCCCTTTCGTCTTCACCTC Single1-ID4 GTCTGCACGCTCGACCTCTG – TCTACGTAGC – GAGGCCCTTTCGTCTTCACCTC Single1-ID5 GTCTGCACGCTCGACCTCTG – TGTACTACTC – GAGGCCCTTTCGTCTTCACCTC Single1-ID6 GTCTGCACGCTCGACCTCTG – ACGACTACAG – GAGGCCCTTTCGTCTTCACCTC Single1-ID7 GTCTGCACGCTCGACCTCTG – CGTAGACTAG – GAGGCCCTTTCGTCTTCACCTC Single1-ID8 GTCTGCACGCTCGACCTCTG – TACGAGTATG – GAGGCCCTTTCGTCTTCACCTC Single1-ID9 GTCTGCACGCTCGACCTCTG – TACTCTCGTG – GAGGCCCTTTCGTCTTCACCTC

Single2-ID1 GTCTGCACGCTCGACCTCTG – ATACGACGTA – GGGAACCGGAGCTGAATGAAGC Single2-ID2 GTCTGCACGCTCGACCTCTG – TCACGTACTA – GGGAACCGGAGCTGAATGAAGC Single2-ID3 GTCTGCACGCTCGACCTCTG – CGTCTAGTAC – GGGAACCGGAGCTGAATGAAGC

300 Appendix to Chapter 4

Single2-ID4 GTCTGCACGCTCGACCTCTG – TCTACGTAGC – GGGAACCGGAGCTGAATGAAGC Single2-ID5 GTCTGCACGCTCGACCTCTG – TGTACTACTC – GGGAACCGGAGCTGAATGAAGC Single2-ID6 GTCTGCACGCTCGACCTCTG – ACGACTACAG – GGGAACCGGAGCTGAATGAAGC Single2-ID7 GTCTGCACGCTCGACCTCTG – CGTAGACTAG – GGGAACCGGAGCTGAATGAAGC Single2-ID8 GTCTGCACGCTCGACCTCTG – TACGAGTATG – GGGAACCGGAGCTGAATGAAGC Single2-ID9 GTCTGCACGCTCGACCTCTG – TACTCTCGTG – GGGAACCGGAGCTGAATGAAGC

Duplicate1-ID1 GTCTGCACGCTCGACCTCTG – TAGAGACGAG – GAGGCCCTTTCGTCTTCACCTC Duplicate1-ID2 GTCTGCACGCTCGACCTCTG – TCGTCGCTCG – GAGGCCCTTTCGTCTTCACCTC Duplicate1-ID3 GTCTGCACGCTCGACCTCTG – ACATACGCGT – GAGGCCCTTTCGTCTTCACCTC Duplicate1-ID4 GTCTGCACGCTCGACCTCTG – ACGCGAGTAT – GAGGCCCTTTCGTCTTCACCTC Duplicate1-ID5 GTCTGCACGCTCGACCTCTG – ACTACTATGT – GAGGCCCTTTCGTCTTCACCTC Duplicate1-ID6 GTCTGCACGCTCGACCTCTG – ACTGTACAGT – GAGGCCCTTTCGTCTTCACCTC Duplicate1-ID7 GTCTGCACGCTCGACCTCTG – AGACTATACT – GAGGCCCTTTCGTCTTCACCTC Duplicate1-ID8 GTCTGCACGCTCGACCTCTG – AGCGTCGTCT – GAGGCCCTTTCGTCTTCACCTC Duplicate1-ID9 GTCTGCACGCTCGACCTCTG – AGTACGCTAT – GAGGCCCTTTCGTCTTCACCTC

Duplicate2-ID1 GTCTGCACGCTCGACCTCTG – TAGAGACGAG – GGGAACCGGAGCTGAATGAAGC Duplicate2-ID2 GTCTGCACGCTCGACCTCTG – TCGTCGCTCG – GGGAACCGGAGCTGAATGAAGC Duplicate2-ID3 GTCTGCACGCTCGACCTCTG – ACATACGCGT – GGGAACCGGAGCTGAATGAAGC Duplicate2-ID4 GTCTGCACGCTCGACCTCTG – ACGCGAGTAT – GGGAACCGGAGCTGAATGAAGC Duplicate2-ID5 GTCTGCACGCTCGACCTCTG – ACTACTATGT – GGGAACCGGAGCTGAATGAAGC Duplicate2-ID6 GTCTGCACGCTCGACCTCTG – ACTGTACAGT – GGGAACCGGAGCTGAATGAAGC Duplicate2-ID7 GTCTGCACGCTCGACCTCTG – AGACTATACT – GGGAACCGGAGCTGAATGAAGC Duplicate2-ID8 GTCTGCACGCTCGACCTCTG – AGCGTCGTCT – GGGAACCGGAGCTGAATGAAGC Duplicate2-ID9 GTCTGCACGCTCGACCTCTG – AGTACGCTAT – GGGAACCGGAGCTGAATGAAGC

Duplicate3-ID1 GTCTGCACGCTCGACCTCTG – ATAGAGTACT – GGGAACCGGAGCTGAATGAAGC Duplicate3-ID2 GTCTGCACGCTCGACCTCTG – CACGCTACGT –

301 Appendix to Chapter 4

GGGAACCGGAGCTGAATGAAGC Duplicate3-ID3 GTCTGCACGCTCGACCTCTG – CAGTAGACGT – GGGAACCGGAGCTGAATGAAGC Duplicate3-ID4 GTCTGCACGCTCGACCTCTG – CGACGTGACT – GGGAACCGGAGCTGAATGAAGC Duplicate3-ID5 GTCTGCACGCTCGACCTCTG – TACACACACT – GGGAACCGGAGCTGAATGAAGC Duplicate3-ID6 GTCTGCACGCTCGACCTCTG – TACACGTGAT – GGGAACCGGAGCTGAATGAAGC Duplicate3-ID7 GTCTGCACGCTCGACCTCTG – TACAGATCGT – GGGAACCGGAGCTGAATGAAGC Duplicate3-ID8 GTCTGCACGCTCGACCTCTG – TACGCTGTCT – GGGAACCGGAGCTGAATGAAGC Duplicate3-ID9 GTCTGCACGCTCGACCTCTG – TAGTGTAGAT – GGGAACCGGAGCTGAATGAAGC

Duplicate4-ID1 GTCTGCACGCTCGACCTCTG – ATACGACGTA – GGTGAAGATCCTTTTTGGGATCCGAA Duplicate4-ID2 GTCTGCACGCTCGACCTCTG – TCACGTACTA – GGTGAAGATCCTTTTTGGGATCCGAA Duplicate4-ID3 GTCTGCACGCTCGACCTCTG – CGTCTAGTAC – GGTGAAGATCCTTTTTGGGATCCGAA Duplicate4-ID4 GTCTGCACGCTCGACCTCTG – TCTACGTAGC – GGTGAAGATCCTTTTTGGGATCCGAA Duplicate4-ID5 GTCTGCACGCTCGACCTCTG – TGTACTACTC – GGTGAAGATCCTTTTTGGGATCCGAA Duplicate4-ID6 GTCTGCACGCTCGACCTCTG – ACGACTACAG – GGTGAAGATCCTTTTTGGGATCCGAA Duplicate4-ID7 GTCTGCACGCTCGACCTCTG – CGTAGACTAG – GGTGAAGATCCTTTTTGGGATCCGAA Duplicate4-ID8 GTCTGCACGCTCGACCTCTG – TACGAGTATG – GGTGAAGATCCTTTTTGGGATCCGAA Duplicate4-ID9 GTCTGCACGCTCGACCTCTG – TACTCTCGTG – GGTGAAGATCCTTTTTGGGATCCGAA

Reverse primers Amplicon & 5’ Universal sequence(reverse) – primer binding site 3’ ID Single1 (all) CGACAGCATGGGATCAGCACG – CAGCAATAAACCAGCCAGCCG Single2 (all) CGACAGCATGGGATCAGCACG – CCTTCGGGCATGGCACTCTTG Duplicate1(all) CGACAGCATGGGATCAGCACG – CAGCAATAAACCAGCCAGCCG Duplicate2 (all) CGACAGCATGGGATCAGCACG – CCTTCGGGCATGGCACTCTTG Duplicate3 (all) CGACAGCATGGGATCAGCACG – TACCGCGCCACATAGCAGAAC Duplicate4 (all) CGACAGCATGGGATCAGCACG – CAGCAATAAACCAGCCAGCCG

302 Appendix to Chapter 4

Table S6: Fold-increase in plasmid copy number in the evolved lines

Fold-increase in plasmid Fold-increase in plasmid Line Line copy number (mean±sd) copy number (mean±sd) EvoS-N 6.75±1.58 EvoD-N 14.17±4.93 EvoS-A 6.13±1.99 EvoD-A 19.28±4.17 EvoS-A↑ 5.49±0.57 EvoD-A↑ 18.74±3.17 EvoS-C 6.66±1.28 EvoD-C 6.07±1.93 EvoS-I 6.07±1.93 EvoD-I 7.03±2.71 EvoS-CA 6.29±2.08 EvoD-CA 8.69±1.31 EvoS-IA 6.02±1.79 EvoD-IA 6.29±2.33

303 Appendix to Chapter 4

Table S7: Percentage of individuals that retained duplicate genes in the EvoD lines

Line Replicate 1 Replicate 2 Replicate 3 Replicate 4 Replicate 5 EvoD-N 9.5 28.6 52.4 52.4 52.4 EvoD-A 76.2 4.8 23.8 9.5 19.1 EvoD-A↑ 81.0 66.7 38.0 33.3 14.3 EvoD-C 30.0 25.0 35.0 25.0 5.0 EvoD-I 15.0 2.7 10.0 5.0 25.0 EvoD-CA 11.1 7.4 15.0 25.0 15.6 EvoD-IA 1.9 21.9 21.4 11.9 50.0

304 Appendix to Chapter 4

Supplementary references

Armstrong KA, Acosta R, Ledner E, Machida Y, Pancotto M, McCormick M, Ohtsubo H, Ohtsubo E, 1984. A 37×10(3) molecular weight plasmid-encoded protein is required for replication and copy number control in the plasmid pSC101 and its temperature-sensitive derivative pHS1. J Mol Biol. 175:331-348.

Cox EC, Yanofsky C, 1967. Altered base ratios in the DNA of an Escherichia coli mutator strain. Proc Natl Acad Sci USA 58:1895-1902.

Dekel E, Alon U, 2005. Optimality and evolutionary tuning of the expression level of a protein. Nature 436:588-592.

Holloway AK, Palzkill T, Bull JJ, 2007. Experimental evolution of gene duplicates in a bacterial plasmid model. J Mol Evol. 64:215-222.

Kreitman M, 2000. Methods to detect selection in populations with applications to the human. Ann Rev Genom Hum Genet. 1: 539–559.

Larson MG, 2008. Analysis of variance. Circulation 117:115-121.

Li W-H, 1997. Molecular Evolution. Sinauer, Sunderland, MA.

Maisnier-Patin S, Roth JR, Fredriksson A, Nyström T, Berg OG, Andersson DI, 2005. Genomic buffering mitigates the effects of deleterious mutations in bacteria. Nat Genet. 37:1376-1379.

Margulies M et al., 2005. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437:376- 380.

Ohno S, 1970. Evolution by gene duplication. Springer-Verlag, Heidelberg, Germany.

Schaaper RM, 1988. Mechanisms of mutagenesis in the Escherichia coli mutator mutD5: role of DNA mismatch repair. Proc Natl Acad Sci USA 85:8126-8130.

Schleif R, 2000. Regulation of the L-arabinose operon of Escherichia coli. Trends Genet. 16:559-565.

Sullivan LM, 2008. Repeated measures. Circulation 117:1238-1243.

Wagner J, Nohmi T, 2000. Escherichia coli DNA polymerase IV mutator activity: genetic requirements and mutational specificity. J Bacteriol. 182:4587-4595.

Wagner J, Gruz P, Kim SR, Yamada M, Matsui K, Fuchs RP, Nohmi T, 1999. The dinB gene encodes a novel E. coli DNA polymerase, DNA pol IV, involved in mutagenesis. Mol Cell 4:281-286.

Zaslaver A, Bren A, Ronen M, Itzkovitz S, Kikoin I, Shavit S, Liebermeister W, Surette MG, Alon U, 2006. A comprehensive library of fluorescent transcriptional reporters for Escherichia coli. Nat Methods 3:623-628.

305 Acknowledgements

I would like to thank my supervisor, Prof. Andreas Wagner, for giving me this opportunity to work on exciting projects, and for his supervision. I am also grateful to other members of my committee for their valuable inputs on these projects.

I would like to express my gratitude to Drs. Rudolf Sägesser, Christian Weikert, Ju Yuan and Tobias Bergmiller for their collaboration, without which this work would not have been possible. I would also like to thank Drs. Lucy Poveda, Weihong Qi, Rita Lecca, Hubert Rehrauer, Andrea Patrignani and Catharine Aquino at the Functional Genomic Center, Zurich for generating sequencing and microarray data for my work, and for their technical inputs on data analyses. I am also thankful to Dr. Francis Parlange and Stefania Luisoni for their help with experimental techniques which I applied to my research.

It was a pleasure to be a part of the Wagner group, where I had many fruitful discussions on my research. I would especially like to thank Dr. Eric Hayden, for helpful discussions on laboratory techniques, and Manuel Bichsel, for his help in statistical analyses. I am also grateful to the members of the Ackermann group for their inputs on laboratory techniques. I would also like to thank Manuel Bichsel and Dr. Peter Szövényi for German translation of the thesis summary.

Finally, I would like to thank my family and friends for their support, without which I could not have done it.

306 Curriculum Vitae PERSONAL

Surname DHAR Name Riddhiman Date of Birth 01.04.1985 Nationality Indian

EDUCATION

2008-current Graduate student, Institute of Evolutionary Biology & Environmental Studies, University of Zurich

Dissertation Title: The genetic basis of microbial adaptations to new environments in laboratory evolution Supervisor: Prof. Andreas Wagner

2003-2008 B. Tech & M. Tech in Biotechnology & Biochemical Engineering, Indian Institute of Technology, Kharagpur, India

Master Thesis Title: Analyzing the catalytic mechanism of a protein tyrosine phosphatase (PtpB) from Staphylococcus aureus through site-directed mutagenesis Supervisor: Prof. Amit Kumar Das

2003 Higher Secondary School Examination, RKM Narendrapur, Kolkata, India

GRANTS

2011 Forschungskredit, University of Zurich

PUBLICATIONS

2012 Riddhiman Dhar, Rudolf Sägesser, Christian Weikert and Andreas Wagner, 2012. Yeast adapts to a changing stressful environment by evolving cross-protection and anticipatory gene regulation. Molecular Biology and Evolution, Advance online access. (http://mbe.oxfordjournals.org/content/early/2012/11/28/molbev.mss253.long)

2011 Riddhiman Dhar, Rudolf Sägesser, Christian Weikert, Ju Yuan and Andreas Wagner, 2011. Adaptation of Saccharomyces cerevisiae to saline stress through laboratory evolution. Journal of Evolutionary Biology 24:1135-1153.

307