The effect of N-terminal mutations on the thermostability of VPR, a cold active subtilisin-like serine protease

Árni Thorlacius

Faculty of Physical Sciences, University of Iceland, 2019

15 ECTS thesis submitted in partial fulfillment of a Baccalaureus Scientiarum degree in Biochemistry

Advisor Prof. Magnús Már Kristjánsson

Co-advisor Kristinn Ragnar Óskarsson, M.Sc.

Faculty of Physical Sciences, School of Engineering and Natural Sciences, University of Iceland, Reykjavik, May 2019

Copyright © 2019 Árni Thorlacius All rights reserved

Faculty of Physical Sciences, School of Engineering and Natural Sciences, University of Iceland, Dunhagi 5, 107 Reykjavik, Iceland

Telephone: 525 4000

Bibliographic information: Árni Thorlacius, 2019, The effect of N-terminal mutations on the thermostability of VPR, a cold active subtilisin-like serine protease, B.Sc. thesis, Faculty of Physical Sciences, University of Iceland.

Printing: Háskólaprent, Fálkagata 2, 107 Reykjavík
Reykjavik, Iceland, May 2019

Útdráttur (Icelandic abstract, in translation)

This research project builds on earlier research into the structural features that determine temperature adaptation in different proteins, in which two homologous enzymes were compared: a C-terminally truncated variant of the cold-adapted VPR (VPR∆C) from a psychrophilic Vibrio microbe, and the thermostable aqualysin I (AQUI) from the thermophilic microbe Thermus aquaticus. Mutations near the N-terminus affect interactions between the N-terminal loop and the main body of these enzymes and, consequently, their thermostability. The double proline mutation N3P/I5P increased the thermostability of VPR, whereas the W6F mutation decreased it. The purpose of this project was to express all three mutations simultaneously in one variant, VPR∆C/N3P/I5P/W6F, in order to examine their effect on thermostability and kinetic properties, i.e. to see whether the positive effect of N3P/I5P on the thermostability of VPR outweighs the negative effect of W6F. The N3P/I5P/W6F variant showed lower catalytic activity than the wild type enzyme. It was also lower than the catalytic activity of the N3P/I5P and W6F variants. Thermostability measurements on the N3P/I5P/W6F variant gave lower Tm and T50% values than for the wild type, ∼12-13 °C lower and ∼3 °C lower, respectively. Both Tm and T50% were lower than for the N3P/I5P variant, and Tm was also lower than for the W6F variant. T50% was, however, higher than for W6F and, moreover, ∼1 °C higher than Tm, which has never before been measured in any other variant of VPR. The structure around the active site may be more stable than elsewhere in the enzyme. A DSC measurement of VPR∆C/N3P/I5P/W6F supported this conclusion, indicating that at least one thermal event takes place at a higher temperature than the others. The positive effect of N3P/I5P on the thermostability of VPR therefore does not outweigh the negative effect of W6F.

Abstract

This research project is part of a larger project aimed at determining the structural basis for temperature adaptation in proteins by studying two structurally homologous subtilisin-like serine proteases: a C-terminally truncated form of the cold-active VPR (VPR∆C), from a psychrophilic Vibrio species, and the thermostable aqualysin I (AQUI), from the thermophile Thermus aquaticus. Substitutions close to the N-terminus affect interactions between the N-terminal loop and the main body of these enzymes, which in turn affects their thermostability. The double proline substitution N3P/I5P increased the thermostability of VPR, whereas the W6F substitution decreased it. The purpose of this project was to express all three mutations simultaneously in one variant, VPR∆C/N3P/I5P/W6F, and observe their effect on both thermostability and kinetic parameters, i.e. to see whether the stabilizing effect of N3P/I5P could overcome the destabilizing effect of W6F. The N3P/I5P/W6F variant displayed much lower catalytic activity than the wild type and both the N3P/I5P and W6F variants. Thermostability measurements gave lower Tm and T50% values than for the wild type enzyme, ∼12-13 °C lower and ∼3 °C lower, respectively. Both Tm and T50% were lower than those of the N3P/I5P variant, and Tm was also lower than that of the W6F variant. The T50% was, however, higher than that of the W6F variant and, more interestingly, ∼1 °C higher than Tm, which has never been observed before in any variant of VPR. This could be due to the scaffold around the active site of the enzyme being more stable than the surrounding structures. Indeed, a DSC measurement of VPR∆C/N3P/I5P/W6F revealed at least one thermal event occurring at a higher temperature than the rest. The stabilizing effect of N3P/I5P was apparently insufficient to overcome the destabilizing effect of W6F.

Contents

List of Figures

List of Tables

1 Introduction
  1.1 Protein stability
  1.2 Protein adaptations to temperature
    1.2.1 Thermodynamic stability
    1.2.2 Flexibility
  1.3 Proteases
    1.3.1 Serine proteases
    1.3.2 Subtilisin-like serine proteases
    1.3.3 VPR
  1.4 The aim of the project

2 Materials and methods
  2.1 Protein purification
    2.1.1 Purification of VPR∆C and VPR∆C/N3P/I5P/W6F
    2.1.2 Zaman-Verwilghen protein quantitation
    2.1.3 SDS-PAGE
  2.2 Enzymatic activity assays
  2.3 Michaelis-Menten kinetics
  2.4 Stability measurements
    2.4.1 Melting point (Tm) determination
    2.4.2 Thermal inactivation (T50%) measurements
    2.4.3 Differential scanning calorimetry (DSC)

3 Results and discussions
  3.1 Purification
    3.1.1 SDS-PAGE
  3.2 Michaelis-Menten kinetics
  3.3 Stability measurements

4 Conclusions

5 References

List of Figures

1.1 The trade-off relationship between thermostability and catalytic activity at low temperatures in homologous enzymes. Thermophilic enzymes display high thermostability but low catalytic activity at low temperatures and psychrophilic enzymes display opposite qualities. The area shaded blue is where natural enzymes are located, and to its left is the area where the minimal activity and stability requirements for practical enzymes are not met, which is colored pink. Enzymes that would be located in the upper-right corner, i.e. highly thermostable and active at low temperatures, can be developed in the laboratory, but are generally not found in nature (15).

1.2 (a) Diagram linking protein thermodynamic stability (∆G_U) and folding/unfolding rate constants (k_F/k_U), in a two-state model, where unfolding is reversible. TS denotes the high-energy transition state and ∆G‡_F/∆G‡_U the energy required to transition between the two states. (b) The three thermodynamic models used to explain increased thermodynamic stability in thermostable enzymes, compared to a mesophilic homolog (black solid line) (8).

1.3 Diagram of the Maxwell-Boltzmann distributions of molecules at different temperatures in Kelvin along with the activation energies necessary for two different reactions. At 173 K the portion of molecules that have enough kinetic energy for either reaction is negligible. At 273 K a small portion of molecules has enough energy for the first reaction (dark grey). At 373 K some molecules finally have the kinetic energy that is needed for the second reaction (6).

1.4 Diagram showing the different energy landscapes for psychro- and thermophilic enzymes. The wider the funnel the more unfolded conformations there are, which are high in Gibbs free energy. The longer the funnel the more stable the native states of the enzymes compared to their unfolded states. The valleys in the bottom house stable conformations and the ridges in between them indicate the energy barrier the enzymes need to overcome to switch between stable conformations (7).

1.5 Schematic representation of the mechanism for serine proteases. Numbering of residues corresponds to that of -like proteases (26).

1.6 Diagram of a substrate (P2’-P4) bound to the substrate binding sites (S2’-S4) of a subtilisin-like serine protease. Hydrogen bonds between substrate and protein are shown as dotted lines. Enzyme numbering is that of subtilisin BPN’. The catalytic residues D32, H64, and S221, and the residue N155 have been labeled (27).

1.7 Schematic representation showing the maturation of VPR and AQUI (30).

1.8 Three-dimensional model of VPR from a psychrotrophic Vibrio species (PDB ID: 1SH7), made using the program UCSF Chimera. Calcium ions are colored green, disulfide bonds yellow, and the N-terminal region orange. Residues of the catalytic triad (Asp37, His70 and Ser220), the stabilizing Trp6 residue, and disulfide bonds have been labeled, along with residues located at the site of an important salt bridge in AQUI (Asn15 and Lys257 in VPR).

1.9 Topology diagram of the structure of VPR (34).

1.10 Close-up of a three-dimensional model of VPR (light blue, PDB ID: 1SH7) superimposed onto a three-dimensional model of AQUI (orange, PDB ID: 4DZT), made using the program UCSF Chimera, which shows some of the differences between the amino acid composition of their N-termini.

3.1 SDS-PAGE of purified samples of VPR∆C and VPR∆C/N3P/I5P/W6F. Lane 1: PageRuler™ Prestained Protein Ladder. Lane 2: VPR∆C (Table 3.1). Lane 3: VPR∆C/N3P/I5P/W6F (Table 3.2). The size of the proteins in the protein ladder is marked in kDa.

3.2 Example of two kinetic assay data sets (black dots) for VPR∆C/N3P/I5P/W6F, along with error bars, fitted to Michaelis-Menten equations (red and blue lines). The kinetic parameters determined for the assay fitted to the red line were: KM = 0.193 mM, Vmax = 0.00186 mM/sec, kcat = 87.0 sec^−1, and kcat/KM = 452 sec^−1 mM^−1.

3.3 The CD spectra of VPR∆C (red) and VPR∆C/N3P/I5P/W6F (blue), between 200 nm and 250 nm. MRE denotes mean residue ellipticity.

3.4 The averaged and normalized melting curves for VPR∆C (red dots) and VPR∆C/N3P/I5P/W6F (blue dots), fitted to variable slope sigmoidal curves (black).

3.5 The averaged Arrhenius plots for VPR∆C (red dots) and VPR∆C/N3P/I5P/W6F (blue dots), along with linear fits (black).

3.6 Graph of raw data from DSC measurements for VPR∆C (red) and VPR∆C/N3P/I5P/W6F (blue), where specific heat capacity (Cp) is plotted as a function of temperature.

List of Tables

1.1 List of known clans that contain proteases of the serine protease class, families that belong to them, representative members of the clans, and their catalytic residues. This table was made from data gathered from the peptidase database MEROPS. Clans and families whose names start with P house proteins of the serine protease class along with proteases belonging to other classes (28).

1.2 Thermal stability and kinetic parameters of VPR∆C, and its mutants VPR∆C/N3P/I5P and VPR∆C/W6F. Expressed as mean values ± standard deviation of the mean.

2.1 Separating gel recipe.

2.2 Stacking gel recipe.

3.1 Purification table for VPR∆C expressed in Lemo21.

3.2 Purification table for VPR∆C/N3P/I5P/W6F expressed in Lemo21. Concentration of protein in the second step could not be estimated. That result and results that should have been derived from it are, therefore, marked as not applicable (N/A).

3.3 Kinetic parameters for the activity of VPR∆C and VPR∆C/N3P/I5P/W6F against sAAPF-pNA at 25 °C in assay buffer. Expressed as mean values ± standard deviation of the mean.

3.4 Thermal stability of VPR∆C and VPR∆C/N3P/I5P/W6F. Expressed as mean values ± standard deviation of the mean.

4.1 Main results of the project for VPR∆C (wild type) and VPR∆C/N3P/I5P/W6F.

1 Introduction

Life as we know it began around 3.5 to 4 billion years ago (1–4) and has spread to every corner of the planet. It can be found high upon tall peaks and deep down in oceanic trenches. The key ingredient to all life on Earth is water. Life can thrive in environments that are extreme with regard to temperature, pH, salinity, radiation, and pressure, as long as water is present. Organisms that inhabit such niches are called extremophiles, or polyextremophiles when they tolerate several extremes at once, and they have developed various adaptations in order to survive. Most extremophiles are microorganisms, though a few higher eukaryotes, such as some insects, also qualify as extremophiles.

Temperature poses multiple challenges to life. Most biomolecules, such as proteins and DNA, denature at temperatures between ∼50-100 °C, and the formation of ice crystals in water at temperatures below 0 °C ruins their structure (5). Organisms have adapted to extreme temperatures by evolving specialized molecular or cellular mechanisms that function at those temperatures, e.g. modified enzymes with higher or lower optimal temperatures (6, 7). Thermophiles and hyperthermophiles proliferate at high temperatures, ∼45-75 °C and ≥80 °C, respectively, and proteins from these organisms typically display high thermal stability (8, 9). Psychrophiles, on the other hand, proliferate at temperatures close to or below the freezing point of water. The proteins of these organisms must contend with reaction rates that decrease at lower temperatures. Their structures have to be more flexible, and therefore less stable, to be able to fulfill their function as catalysts (6, 7). Some psychrophiles also employ antifreeze proteins, which bind ice crystals and protect the organisms from damage inflicted by the buildup of ice (10, 11). Despite living at radically different temperatures, thermophiles and psychrophiles have many homologous proteins. Studying the differences between homologous proteins from these extremophiles and others may yield evidence as to which structural characteristics define cold- and heat-adapted proteins (12).

1.1 Protein stability

Proteins are large, complex macromolecules composed of one or more polypeptide chains made of amino acids in a linear sequence, linked by peptide bonds, which are a type of amide bond. Despite their diversity, most proteins are composed of the same 20 common L-amino acids, whose sequence is encoded by DNA in units called genes. The primary structure of a protein is its amino acid sequence. Secondary structure refers to the local conformation of parts of the peptide chain. Two folding patterns are particularly stable and common in proteins: α-helices and β-strands. These patterns are a result of the amino acid sequence itself and are often connected by so-called β-turns or loops. Tertiary structure refers to the overall three-dimensional arrangement of all the atoms in a polypeptide chain, and quaternary structure to the arrangement of separate chains relative to one another in proteins with more than one subunit. The tertiary and quaternary structures of globular proteins are very diverse and tightly packed. This structural diversity allows proteins to take part in a wide range of biological processes (13, 14).

Globular proteins include enzymes, highly specialized biological catalysts that are central to every biochemical process. Enzymes accelerate chemical reactions and are responsible for harnessing chemical energy from nutrients and for using that energy to construct biomolecules from simple precursors (14). Proteins, especially enzymes, are dynamic molecules with many different conformations that are vital to their function (6). To function, the native state of a globular enzyme has to be stable enough not to fall apart, but this stability has to be balanced with flexibility so that the enzyme can transition between vital conformations, i.e. enzymes have to be marginally stable (9). This trade-off becomes obvious when the thermostability and catalytic activity of homologous enzymes in psychrophilic and thermophilic organisms are compared (see Figure 1.1). Enzymes in thermophilic organisms are usually more rigid and, therefore, less active at low temperatures. Meanwhile, enzymes in psychrophilic organisms are usually less rigid and, therefore, more catalytically active at low temperatures, but they denature more quickly at higher ones (6, 15). The thermodynamic stability of protein structures is derived from two competing thermodynamic factors: enthalpy (∆H) and entropy (∆S). Their relationship is described by the following equation:

\[ \Delta G = \Delta H - T\,\Delta S \tag{1.1} \]

where ∆G denotes the Gibbs free energy of the system. Proteins are stabilized and held together by a number of factors, rooted in their primary structure, that contribute to enthalpy. These factors include various electrostatic forces, hydrophobic interactions and covalent bonds.

Figure 1.1: The trade-off relationship between thermostability and catalytic activity at low temperatures in homologous enzymes. Thermophilic enzymes display high thermostability but low catalytic activity at low temperatures and psychrophilic enzymes display opposite qualities. The area shaded blue is where natural enzymes are located, and to its left is the area where the minimal activity and stability requirements for practical enzymes are not met, which is colored pink. Enzymes that would be located in the upper-right corner, i.e. highly thermostable and active at low temperatures, can be developed in the laboratory, but are generally not found in nature (15).
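As a numerical illustration of Eq. 1.1, the following minimal sketch evaluates the Gibbs free energy of unfolding at a few temperatures and the temperature at which it reaches zero. The enthalpy and entropy values are hypothetical and chosen only for illustration; they are not measurements from this project. The sketch also assumes that ∆H and ∆S do not change with temperature, which is a simplification.

```python
def gibbs_free_energy(delta_h, delta_s, temperature):
    """Return dG = dH - T*dS (Eq. 1.1), in the energy units of delta_h."""
    return delta_h - temperature * delta_s


def midpoint_temperature(delta_h, delta_s):
    """Temperature (K) at which dG = 0, i.e. T = dH / dS."""
    return delta_h / delta_s


# Hypothetical unfolding parameters, for illustration only.
DELTA_H_UNFOLD = 400.0  # kJ/mol
DELTA_S_UNFOLD = 1.2    # kJ/(mol*K)

for temp_k in (278.15, 298.15, 318.15, 338.15):
    dg = gibbs_free_energy(DELTA_H_UNFOLD, DELTA_S_UNFOLD, temp_k)
    # A positive dG of unfolding means the folded state is favored.
    favored = "folded" if dg > 0 else "unfolded"
    print(f"T = {temp_k:6.2f} K  dG_unfold = {dg:7.2f} kJ/mol  ({favored} state favored)")

t_mid = midpoint_temperature(DELTA_H_UNFOLD, DELTA_S_UNFOLD)
print(f"dG_unfold = 0 at T = {t_mid:.1f} K")
```

With these assumed values the enthalpy term dominates at low temperature and the entropy term dominates at high temperature, which is the competition between the two factors that the equation expresses.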

Covalent bonds are by far the strongest bonds in proteins, with strengths of about 200 kJ/mol, whereas non-covalent interactions are only a few kJ/mol each. These strong bonds remain mostly intact during the lifetime of a protein, with the prominent exception of disulfide bonds. Disulfide bonds form between cysteine residues and can stabilize the protein greatly, but they can be broken by a reducing agent. They are, therefore, prevalent in extracellular proteins, where the environment is oxidizing, and uncommon in intracellular proteins, where conditions are kept highly reducing.

Proteins contain both hydrophobic and hydrophilic, or polar, groups. Salt bridges occur between residues whose side groups have opposite charges (6, 13, 16), as a consequence of the electrostatic force, which is described by Coulomb’s law:

\[ F = \frac{1}{4\pi\varepsilon} \cdot \frac{q_1 \cdot q_2}{r^2} \tag{1.2} \]

where F denotes the force between two charges, q_1 and q_2, ε the dielectric constant, and r the distance between the charges. In a vacuum, ε is equal to 1, whereas in water ε has a value of about 80 at 25 °C. Water greatly decreases the force between the charges by solvating them and screening them from each other. The inside of proteins has an ε value of around 4 and is consequently worse at screening charges. Unsurprisingly, naked charges are seldom found inside proteins, but are common on their surface, which is covered by a layer of water molecules (14, 16).
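The screening effect described by Eq. 1.2 can be sketched numerically as follows. The example is written in SI units, so the vacuum permittivity appears explicitly and ε is treated as a dimensionless relative value (1 in vacuum, about 4 inside a protein, about 80 in water). The 4 Å separation and the pair of unit charges are assumptions made for illustration, not values taken from the text.

```python
import math

VACUUM_PERMITTIVITY = 8.8541878128e-12  # F/m
ELEMENTARY_CHARGE = 1.602176634e-19     # C


def coulomb_force(q1, q2, distance, dielectric_constant):
    """Signed force in newtons between two point charges (SI form of Eq. 1.2).

    A negative value means the charges attract each other.
    """
    return (q1 * q2) / (
        4.0 * math.pi * VACUUM_PERMITTIVITY * dielectric_constant * distance ** 2
    )


# Hypothetical salt bridge: opposite unit charges 4 angstroms apart.
SEPARATION = 4.0e-10  # m

for medium, eps in (("vacuum", 1.0), ("protein interior", 4.0), ("water", 80.0)):
    force = coulomb_force(ELEMENTARY_CHARGE, -ELEMENTARY_CHARGE, SEPARATION, eps)
    print(f"{medium:17s} eps = {eps:4.0f}  F = {force:.3e} N")
```

Because the force scales with 1/ε, the same pair of charges interacts about twenty times more strongly in the protein interior than in water under these assumptions.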

Hydrogen bonds are formed by electrostatic dipoles: two electronegative atoms with a hydrogen atom in between, covalently bonded to one of them. Peptide groups have N-H and C=O dipoles which can form hydrogen bonds. Inside a protein, essentially all peptide groups form hydrogen bonds with either other peptide groups or side groups. So, despite being rather weak individually, around 15-20 kJ/mol between peptide groups, hydrogen bonds have a significant collective effect. The same is true for van der Waals interactions in proteins, the weakest of all the forces discussed here. All atoms consist of a positively charged nucleus surrounded by negatively charged electrons. When two atoms are close to each other, a temporary dipole in one atom induces a temporary redistribution of charge in the other, causing a weak attraction.

In solution, hydrophobic groups such as the side chains of phenylalanine, leucine, isoleucine and valine cannot form hydrogen bonds with water molecules. They therefore decrease the number of hydrogen bonds that nearby water molecules can form, which is energetically unfavorable. Water molecules respond by becoming more ordered around the hydrophobic groups, which allows them to form more hydrogen bonds but lowers the entropy of the system. Because thermodynamic equilibrium favors maximal entropy, hydrophobic groups come together in solution, thereby minimizing the hydrophobic area exposed to the solvent and the amount of ordered water. This hydrophobic interaction leads hydrophobic amino acids, like those described above, to group together in a hydrophobic core at the center of the protein (6, 13, 14, 16). The strength of hydrophobic interactions is temperature dependent: it increases with rising temperature until it reaches a maximum at around 100-140 °C, and subsequently decreases again (17). Increased packing does not appear to be a defining factor for thermostability, as structures of homologous proteins from thermo-, meso- and psychrophiles often display similar packing (18).

Together, these factors are the driving force behind the folding of proteins into their native structure. Without their contribution to enthalpy, the unfolded, or denatured state of the protein would be favored due to its considerably higher level of entropy, i.e. proteins would not fold into their native states (13).

1.2 Protein adaptations to temperature

As mentioned earlier, different organisms can survive at vastly different temperatures. Some psychrophilic microorganisms can survive in pockets of brine in polar sea ice, remaining metabolically active at temperatures down to −20 °C. Hyperthermophiles can be found at the brink of deep sea hydrothermal vents; the current record holder is Methanopyrus kandleri, from the domain Archaea, which can survive and reproduce at temperatures of up to 122 °C (7). The proteins of such extremophiles have evolved certain adaptations, which are discussed further here.

1.2.1 Thermodynamic stability

In a simplified two-state model, where a protein is either folded (F) or unfolded (U) and unfolding is reversible, the thermodynamic stability of a protein can be described with the following equation:

\[ \Delta G_U = -RT \ln(K_U) \tag{1.3} \]

where ∆G_U denotes the Gibbs free energy associated with unfolding, R the universal gas constant, T the temperature in Kelvin, and K_U the equilibrium constant defined as:

\[ K_U = \frac{k_U}{k_F} \tag{1.4} \]

where k_U and k_F are the rate constants for unfolding and folding, respectively. Increased kinetic stability may result from a slower rate of unfolding, or an increased rate of folding, or both (see Figure 1.2a). This kinetic stability is directly linked to the energy of the transition state. Three models have been suggested to explain the high stability of thermostable proteins (see Figure 1.2b), each describing how the roughly parabolic curve of ∆G_U as a function of temperature differs from that of a mesophilic homolog. The first proposes that a thermostable protein is more thermodynamically stable throughout its temperature range, i.e. ∆G_U is elevated at every temperature. The second suggests that the maximum value of ∆G_U and the shape of the parabola remain the same, but that the curve shifts to a higher temperature, making thermostable proteins less stable at lower temperatures. The third suggests that the height of the parabola for the thermostable protein matches that of the mesophilic homolog but that the curve broadens in both directions, i.e. the protein remains stable over a wider temperature range while the maximum ∆G_U value stays the same. Each model has received support from observations made on different thermostable proteins, as have combinations of them (8).
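Equations 1.3 and 1.4 can be illustrated with a short sketch that converts folding and unfolding rate constants into an equilibrium constant, a free energy of unfolding, and the fraction of molecules unfolded at equilibrium. The rate constants below are hypothetical, and the fraction-unfolded helper is an addition that follows from the two-state assumption rather than an equation given in the text.

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ/(mol*K)


def unfolding_equilibrium_constant(k_unfold, k_fold):
    """K_U = k_U / k_F (Eq. 1.4)."""
    return k_unfold / k_fold


def unfolding_free_energy(k_eq, temperature):
    """dG_U = -R*T*ln(K_U) (Eq. 1.3), in kJ/mol."""
    return -GAS_CONSTANT * temperature * math.log(k_eq)


def fraction_unfolded(k_eq):
    """Equilibrium fraction of unfolded molecules in a two-state model."""
    return k_eq / (1.0 + k_eq)


# Hypothetical rate constants (s^-1), for illustration only.
K_FOLD = 10.0
TEMPERATURE = 298.15  # K

for k_unfold in (1.0e-4, 1.0e-2, 10.0):
    k_eq = unfolding_equilibrium_constant(k_unfold, K_FOLD)
    dg_u = unfolding_free_energy(k_eq, TEMPERATURE)
    print(
        f"k_U = {k_unfold:8.1e} s^-1  K_U = {k_eq:8.1e}  "
        f"dG_U = {dg_u:6.2f} kJ/mol  unfolded = {fraction_unfolded(k_eq):.4f}"
    )
```

Slowing unfolding by a factor of 100 at a fixed folding rate raises ∆G_U by RT ln(100), about 11 kJ/mol at 25 °C, which shows how changes in either rate constant feed into the thermodynamic stability.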

Various structural features have been observed to contribute to the higher thermostability of proteins in thermophiles compared to their homologous counterparts in meso- or psychrophiles. The unfolded state of thermostable enzymes generally has fewer conformations than that of meso- or psychrophilic proteins, lowering its entropy, which promotes folding and increases stability (17). Thermophilic extracellular proteins often have more disulfide bonds (6). Crystal structures of thermostable proteins reveal that surface loops and the C- and N-termini are generally the most flexible regions of globular proteins and are likely the points of origin for thermal unfolding. Proline residues are more frequent in these regions in thermophilic enzymes than in mesophilic ones, which is likely due to their restrictive effect on the peptide chain (19). Glycine, the smallest amino acid, has the opposite effect, and decreasing the number of glycines in the protein reduces the overall flexibility of the peptide chain. These restrictive effects and the reduction of overall flexibility lower the conformational entropy of the unfolded state of the peptide chain. They also lower the entropy of the folded state, but the effect there is weaker, so the net result is a reduced entropic driving force for unfolding. Sequence comparisons indicate that shortening of surface loops and of the C- and N-terminal domains is another common adaptation, which once again reduces the entropy of the unfolded state (19).

Additionally, comparisons show that thermophilic enzymes have a higher percentage of charged amino acids such as lysine, arginine, and glutamate (6, 17). The dielectric constant (ε) of water decreases with rising temperature, strengthening ionic bonds such as salt bridges (Eq. 1.2). Another consequence of the decreasing dielectric constant is that the solvent's ability to shield the charges diminishes, so forming an ionic bond instead of bonding with the surrounding solvent molecules becomes less unfavorable. Some hyperthermophilic proteins have taken this to the extreme and evolved so-called ionic networks, in which multiple charged side chains interact. Networks formed by up to 18 residues have been found (17). An increased number of hydrogen bonds has been observed in thermophilic proteins as well. A greater number of hydrophobic interactions in the core of proteins has been shown to raise stability (6). Van der Waals interactions between these buried side groups serve to reinforce hydrophobic interactions. Other structural features that may enhance thermostability include increased secondary structure and additional interactions that anchor the C- and N-termini to the main body of the protein. None of the structural features mentioned here apply to all thermo- or hyperthermostable proteins (6, 17).

Figure 1.2: (a) Diagram linking protein thermodynamic stability (∆G_U) and folding/unfolding rate constants (k_F/k_U), in a two-state model, where unfolding is reversible. TS denotes the high-energy transition state and ∆G‡_F/∆G‡_U the energy required to transition between the two states. (b) The three thermodynamic models used to explain increased thermodynamic stability in thermostable enzymes, compared to a mesophilic homolog (black solid line) (8).

1.2.2 Flexibility

At the lower extreme of the temperature spectrum that can support life, we find psychrophiles. Central to these organisms’ struggle are chemical reaction rates, which drop exponentially with decreasing temperature. Temperature is a measure of the random kinetic motion of molecules, and the probability of two molecules, such as an enzyme and its substrate, colliding with enough energy to react depends on it. The average kinetic energy of molecules (E) at a certain absolute temperature (T) in Kelvin can be described with the following equation:

\[ E = \frac{3}{2} k_B T \tag{1.5} \]

where k_B is the Boltzmann constant (1.381·10^−23 J·K^−1) (6). This equation can be modified to describe the temperature dependence of reaction rate constants (k), giving the Eyring equation (Eq. 1.6):

\[ k = \kappa \frac{k_B T}{h} e^{-\Delta G^{\ddagger}/RT} \tag{1.6} \]

where κ denotes the transmission coefficient, h the Planck constant (6.626·10^−34 J·s) and ∆G‡ the Gibbs free energy of activation. What this means is that at low temperatures fewer molecules have the kinetic energy necessary for a reaction to occur (see Figure 1.3).

Figure 1.3: Diagram of the Maxwell-Boltzmann distributions of molecules at different temperatures in Kelvin along with the activation energies necessary for two different reactions. At 173 K the portion of molecules that have enough kinetic energy for either reaction is negligible. At 273 K a small portion of molecules has enough energy for the first reaction (dark grey). At 373 K some molecules finally have the kinetic energy that is needed for the second reaction (6).

Another problem for organisms exposed to cold environments is that the structure of their proteins becomes too rigid as a result of low thermal input from the environment. Psychrophilic organisms solve this dilemma with specific cold-adapted enzymes. Their distinguishing characteristic is their ability to maintain high specific activity at low temperatures, usually as a result of higher turnover numbers (kcat). In contrast to thermostable enzymes, these enzymes can have very flexible structures, or at least very flexible parts, usually those that participate in catalysis. This flexibility allows the enzymes to change their conformation rapidly, which is believed to be necessary during catalysis at low temperatures. This belief is reinforced by the fact that most cold-adapted enzymes are not very thermostable (7, 10, 11, 20) and by the difference in the energy landscapes of these proteins (see Figure 1.4). Their energy landscapes can be viewed and compared as two types of folding funnels. Located at the top of the funnels are the many different unfolded conformations of the proteins, which have high Gibbs free energy. Low energy, folded, native and catalytically active states occupy the bottom of each funnel. The length of the funnels reflects the difference in Gibbs free energy upon folding, i.e. the stability of the native active states increases in proportion to the length of the funnel. The psychrophilic enzyme is depicted as having a wide, shallow funnel with a wide bottom, which indicates that it has more conformations in its unfolded state, is less stable, and has few or no folding intermediates (7). The low ridges in the bottom imply that the energy barriers between stable conformations are low, making it easier for the enzyme to switch between them quickly. The thermophilic enzyme, by contrast, has a narrow, steep and long funnel, with few native and catalytically active states that are separated by greater energy barriers, and niches in the side of the funnel which may house folding intermediates (7).

Figure 1.4: Diagram showing the different energy landscapes for psychro- and thermophilic enzymes. The wider the funnel the more unfolded conformations there are, which are high in Gibbs free energy. The longer the funnel the more stable the native states of the enzymes compared to their unfolded states. The valleys in the bottom house stable conformations and the ridges in between them indicate the energy barrier the enzymes need to overcome to switch between stable conformations (7).

Nuclear magnetic resonance (NMR) studies on homologous adenylate kinases from thermophilic and mesophilic species seem to indicate that increased flexibility in key regions allows the enzyme to undergo large conformational changes. The enzyme maintains the balance of adenine nucleotides within the cell, and its catalytic cycle requires the closing of two lid-like structural domains over its active site. Around the lids are ultra-flexible regions, dubbed hinges, that show increased dynamics on the picosecond timescale. These hinge regions correspond to those where the peptide chain must change conformation in order for the lids to close. This suggests that these fast fluctuations are the origin of larger motions that occur on the micro- to millisecond timescale. The variable frequency of fluctuations in these homologous enzymes seems to derive from the differences in their amino acid sequences (21).
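To put the Eyring equation (Eq. 1.6) in numbers, the sketch below computes the rate constant at three temperatures for a single, hypothetical activation free energy. The value of 60 kJ/mol and the transmission coefficient of 1 are assumptions made for illustration. ∆G‡ is also treated as independent of temperature here, which real enzymes do not strictly obey.

```python
import math

BOLTZMANN = 1.380649e-23    # J/K
PLANCK = 6.62607015e-34     # J*s
GAS_CONSTANT = 8.314462618  # J/(mol*K)


def eyring_rate(delta_g_activation, temperature, transmission=1.0):
    """k = kappa * (k_B*T/h) * exp(-dG_act / (R*T)), Eq. 1.6.

    delta_g_activation is in J/mol and the result is in s^-1.
    """
    prefactor = transmission * BOLTZMANN * temperature / PLANCK
    return prefactor * math.exp(-delta_g_activation / (GAS_CONSTANT * temperature))


# Hypothetical activation free energy, for illustration only.
DELTA_G_ACT = 60_000.0  # J/mol

reference = eyring_rate(DELTA_G_ACT, 298.15)
for celsius in (5.0, 25.0, 45.0):
    kelvin = celsius + 273.15
    rate = eyring_rate(DELTA_G_ACT, kelvin)
    print(f"{celsius:4.0f} C  k = {rate:10.3f} s^-1  relative to 25 C: {rate / reference:.2f}")
```

Under these assumptions a drop from 25 °C to 5 °C reduces the rate several-fold, which is the problem a cold-adapted enzyme has to offset, for example through a lower activation free energy.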

The structural characteristics of cold-active enzymes, like those of thermophilic ones, are many and not ubiquitous. Crystal structure analyses of several cold-active enzymes have revealed that they generally have more hydrophilic residues, especially on their surface, resulting in fewer hydrophobic interactions with the solvent compared to their more thermostable counterparts. Longer loops and a reduced frequency of prolines in those loops have also been observed, and these structures occasionally contain more glycines (11). Some contain fewer disulfide bonds, salt bridges and hydrogen bonds, weaker interactions between secondary structures, and fewer hydrophobic interactions in their core. An increased number of anionic residues that are close to each other may give rise to repulsive charge-charge interactions that destabilize the protein structure (11). It is not clear exactly which structural characteristics make thermo- and psychrophilic enzymes work best at their respective temperatures. Care must be taken when tinkering with thermostability in mutagenesis studies: some mutations may have unintended consequences, and enzymes that are too flexible or too rigid make for inefficient catalysts (17).


1.3 Proteases

Proteases, also called proteinases or peptidases, are enzymes that catalyze the hydrolysis of peptide bonds and can be found in all living organisms. The origin of proteases most likely dates back to the earliest stages of protein evolution and they often evolved independently in different protein families. These enzymes were necessary for protein catabolism and the generation of amino acids from proteins in early life (22). Around 2-4% of genes in a typical genome encode different proteases (23). Proteases have evolved to perform many essential tasks in the cell. These tasks directly or indirectly influence many diverse biological processes, such as DNA replication and transcription, heat shock and unfolded protein responses, and autophagy and apoptosis (22). Most proteases are not highly specific when it comes to substrates and hydrolyse various protein substrates. Others are very specific and only hydrolyse one unique peptide bond, sometimes found only in a single type of protein. They also come in different shapes and sizes, from relatively small enzymes with one catalytic domain (∼20 kDa) to large enzyme complexes with multiple subunits (up to 6 MDa) (22).

Initially, proteases were broadly classified into two groups: exopeptidases, which catalyze hydrolysis at the N-terminus, the C-terminus, or both, of the polypeptide chain of the substrate, and endopeptidases, which hydrolyze peptide bonds between nonterminal amino acids in the polypeptide chain of the substrate. This classification has since been largely abandoned in favour of newer systems based on structural and mechanistic information on these enzymes (22). Proteases have now been divided into seven classes based on their mechanism of catalysis: aspartic, glutamic, asparagine, cysteine, serine, threonine, and metalloproteases. A few proteases have an unknown mechanism of catalysis and still remain unclassified (24). Aspartic, glutamic and metalloproteases make use of an activated water molecule as a nucleophile. The remaining four utilize an amino acid residue located in their active site as a nucleophile. Enzymes in these classes are grouped into families based on their sequence homology, and families into superfamilies or clans based on similarities in their three-dimensional structure. Proteases of different classes often find themselves together in the same clan (22).


1.3.1 Serine proteases

Serine proteases belong to a class of proteases where the catalytic nucleophile is the hydroxyl group on the side chain of a serine residue (25). This class is one of the most populous (22), with almost one third of all proteases belonging to it (26). They are widespread and have diverse functions. Many distinct clans of serine proteases have been defined (see Table 1.1), the largest being the chymotrypsin-like and subtilisin-like clans (27).

Table 1.1: List of known clans that contain proteases of the serine protease class, families that belong to them, representative members of the clans, and their catalytic residues. This table was made from data gathered from the peptidase database MEROPS. Clans and families whose names start with P house proteins of the serine protease class along with proteases belonging to other classes (28).

Clan | Families | Representative member | Catalytic residues
SB | S8 | Subtilisin | Asp, His, Ser
SB | S53 |  | Glu, Asp, Asp, Ser
SC | S9, S10, S15, S28, S33, S37, S82 | Serine carboxypeptidase D | Ser, Asp, His
SE | S11-13 | D-Ala-D-Ala carboxypeptidase B | Ser, Lys
SF | S24, S26 | UmuD protein | Ser, Lys/His
SH | S21, S73, S77, S78, S80 | Cytomegalovirus assemblin | His, Ser, His
SJ | S16, S50, S69 | Lon-A peptidase | Ser, Lys
SK | S14, S41, S49 | Peptidase Clp (type 1) | Ser, His, Asp
SO | S74 | Escherichia coli phage K1F endosialidase CIMCD self-cleaving protein | Ser, Lys
SP | S59 | Nucleoporin 145 | His, Ser
SR | S60 | Lactoferrin | Lys, Ser
SS | S66 | Murein tetrapeptidase LD-carboxypeptidase | Ser, Glu, His
ST | S54 | Rhomboid-1 | Ser, His
PA | S1, S3, S6, S7, S29-32, S39, S46, S55, S64, S65, S75 | Chymotrypsin-like | His, Asp, Ser
PB | S45, P2 | Penicillin G acylase precursor | Ser
PC | S51 | Dipeptidase E | Ser, His, Glu
PE | P1 | DmpA aminopeptidase | Ser
Unassigned | S48, S62, S68, S71, S72, S79, S81 |  |

The serine protease class was originally distinguished by the presence of a catalytic triad, composed of aspartate, histidine and serine residues, spanning the active site cleft. The location and order of these three residues differ between clans, as does their structural context, which indicates that they evolved separately. Novel catalytic triads and dyads have since been discovered. Another common feature among serine proteases is the oxyanion hole, a positively charged pocket that activates the carbonyl of the peptide substrate and stabilizes the oxyanion of the tetrahedral intermediate (26).

The catalytic mechanism appears to be similar for most serine proteases and is generally accepted (see Figure 1.5), although the residue or residues acting as the general base varies. All proteases face three main obstacles during catalysis: Amide bonds are very stable, proteases use an activated water as a nucleophile and water is a comparatively poor nucleophile, and amines are poor leaving groups. The mechanism detailed here is that of a serine protease with a "classic" catalytic triad. It starts with the active site Ser being deprotonated by a close-by His. The resulting His-H+ is stabilized by a neighbouring Asp. The deprotonated Ser attacks the C=O group of the peptide bond of the substrate yielding a tetrahedral intermediate, whose oxyanion is stabilized by the oxyanion hole. The amide group is protonated by His-H+, making it a better leaving group, and subsequently leaves yielding an acylenzyme intermediate. His deprotonates a water molecule, activating and empowering it to attack the acylenzyme intermediate yielding a second tetrahedral intermediate. Upon its collapse, Ser is expelled, yielding the carboxylic acid product (26).

Figure 1.5: Schematic representation of the mechanism for serine proteases. Numbering of residues corresponds to that of chymotrypsin-like proteases (26).


1.3.2 Subtilisin-like serine proteases

Subtilisin-like serine proteases, or subtilases, can be found in both prokaryotes and eukaryotes, and even in some viruses (27). Around 7000 members of this superfamily have been classified, with more than 43,000 known sequences and over 450 available three-dimensional structures (29). Most subtilases are synthesized as pre-pro-enzymes, then transported over a cell membrane and finally activated by cleavage of the pro-peptide. All subtilases have an N-terminal catalytic domain and either a signal peptide or activation peptide, or both (27). The pro-peptides appear to assist in folding and their subsequent cleavage makes the mature enzyme more kinetically stable towards unfolding (30). Some also contain a C-terminal extension, relative to subtilisin (27). Members of this clan have proved useful in protein engineering studies as they are quite malleable, yielding proteins that can interact with new types of substrates (23), or display increased thermostability or activity (30).

The core structure of subtilases is well conserved, despite the relatively low sequence homology between proteins in this superfamily (20, 30). Connecting loops between helices and strands, often at the surface of the protein, are more variable. The catalytic domain is highly variable with respect to sequence, except for the residues that form the catalytic triad (D32, H64, and S221), which are highly conserved (27). An important exception, though, is the sedolisin family (S53, see Table 1.1) whose members have a catalytic tetrad (Glu-Asp-Asp-Ser) (28). Another well conserved trait of subtilases is the oxyanion hole formed by the side chains of Asn155, Thr220, and the backbone NH of Ser221 (26). The substrate binding region of subtilases (see Figure 1.6) is a crevice large enough to accommodate at least six amino acid residues of a polypeptide substrate (P2’-P4). The substrate binds into the binding sites (S2’-S4), where both side chain and main chain interactions between the protein and substrate contribute to binding. The S2’ is a hydrophobic pocket of variable size, which depends on the orientation of a conserved aromatic side chain of a residue in the site. S1 is a large and elongated cleft bounded on one side by the oxyanion hole and the active site residue Ser. S2 is a smaller cleft that contains the active site residues His and Asp. S3 is not a distinct site as the P3 side chain faces away from the protein, towards the solvent. However, there are most likely some interactions between P3 and the side chain of a nearby residue that faces the same way. S4 is a distinct pocket that appears to have two subsites. In subtilisin, the residue Y104 is thought to form a flexible lid to the pocket. The binding specificity towards a substrate seems to be mainly determined by interactions between the P1-P4 residue side chains and the S1-S4 binding sites. Calcium ions are crucial for the activity and thermal stability of subtilases (27, 30). Sixteen different binding sites have been identified from crystal structures, which bind calcium ions with varying strength. Some are associated with certain families within the clan, others are specific to a single species (30). Disulfide bonds can be found in most members of the clan and they also promote stability. The number of calcium binding sites and disulfide bonds differs between subtilases (27).

Figure 1.6: Diagram of a substrate (P2’-P4) bound to the substrate binding sites (S2’-S4) of a subtilisin-like serine protease. Hydrogen bonds between substrate and protein are shown as dotted lines. Enzyme numbering is that of subtilisin BPN’. The catalytic residues D32, H64, and S221, and the oxyanion hole residue N155 have been labeled (27).

1.3.3 VPR

VPR is an extracellular protease from a psychrotrophic Vibrio species (PA44), a Gram-negative bacterium. The enzyme belongs to the proteinase K family of subtilisin-like serine proteases. The vpr gene from this species of Vibrio is a 1593 base pair sequence which encodes 530 residues, or a 55.7 kDa precursor protein. The precursor contains a 139 residue N-terminal prosequence, a 291 residue protease domain, and a 100 residue C-terminal domain (31). The N-terminal prosequence most likely functions as an intramolecular chaperone, assisting in the correct folding of the protein, before being cleaved off in an intramolecular autocatalytic reaction. The C-terminal prosequence has been shown to play an important role in the extracellular secretion of VPR. The prosequence might be required for the translocation of the 40.6 kDa precursor through the outer membrane of the bacteria. Following secretion, the C-terminal prosequence is removed by another intramolecular autocatalytic cleavage, which yields a fully functional, 29.7 kDa enzyme (see Figure 1.7). This also seems to hold true for recombinant VPR from an Escherichia coli expression system (31).

VPR (see Figure 1.8) is classified as an α/β-protein with a three-layer (αβα) sandwich architecture (32, 33), similar to other subtilases. The structure consists of six α-helices, one 3/10 helix, a β-sheet comprised of seven parallel strands and two β-sheets, each consisting of two antiparallel strands (see Figure 1.9) (34). The β-strands in the large sheet are in the order 2314567, with a left-handed crossover connection between 2 and 3, which is uncommon (30). VPR has three disulfide bonds (Cys67-Cys99, Cys163-Cys194, and Cys277-Cys281) and three calcium-binding sites. The active site consists of the catalytic triad Asp37, His70, and Ser220. The protein's substrate recognition and binding sites match those of other subtilases, as these sites are well conserved within the clan. They appear as a distinct cleft on the surface of the protein (34).

In this project, VPR∆C was used, a C-terminal truncated form of the enzyme that lacks a 15 residue C-terminal extension and, consequently, the disulfide bridge between C277 and C281 (see Figure 1.8). Its kinetic parameters are comparable to those of wild type VPR, although there is a small decrease in kcat and an elevation in KM. This form has been produced previously by this research group, for the purpose of mimicking the structure of the thermophilic homologue, aqualysin I (AQUI), from the thermophilic bacterium Thermus aquaticus. Wild type VPR shares 60% sequence identity with AQUI. This high sequence identity, despite the enzymes showing strong traits that reflect their natural environments, makes VPR∆C and AQUI an ideal pair for studies comparing adaptations to different temperatures (19). The enzymes share two calcium binding sites, but VPR contains one additional binding site, which is surprising considering that these sites are associated with increased thermal stability. VPR and AQUI also share two disulfide bonds, but VPR contains one additional bond, which again is surprising given the perception that disulfide bonds increase overall stability (30). Structural comparison of these enzymes suggests that the thermophilic AQUI contains additional salt bridges that do not appear in VPR, indicating their importance for thermostability. When these additional salt bridges were systematically mutated, the effects on thermostability varied. One in particular had a large effect on thermostability, namely the ion pair formed by the residues Asp17 and Arg259 in AQUI. VPR instead contains the residues Asn15 and Lys257 at the corresponding sites (see Figure 1.8). The substitution D17N caused a significant drop in thermostability in AQUI, and the reverse substitution N15D increased the thermostability of VPR (35, 36). These observations suggest that it is not the total number of ionic interactions that determines thermostability for these proteins; instead their importance varies and is tied to their structural contexts (30).

Structural comparison also revealed that AQUI has five prolines that are not present in VPR, four of which are located in surface loops. Two are located in the N-terminal loop and have the largest effect on thermostability in AQUI. Previous mutagenesis studies on the N-terminus of VPR have targeted two residues (N3 and I5). The proline substitution I5P resulted in an enzyme with higher thermostability and the N3P/I5P mutant was even more thermostable (see Table 1.2). In wild type VPR the N-terminal loop appears to be flexible and loose. When proline substitutions are introduced, the loop apparently becomes more rigid (see Figure 1.10). The introduced prolines stabilize a β-sheet, formed by the first two residues in the mature enzymes, which anchors the N-terminus to the main body of the protein (19).

Another mutation that has previously been studied is W6F. Trp6 is highly conserved in subtilases related to VPR and AQUI (see Figure 1.8). The VPR∆C/W6F variant lost some of its catalytic properties and the substitution was highly detrimental to the stability of the protein (see Table 1.2), indicating that the N-terminal region is important for the structural integrity of the protein. Indeed, molecular dynamics simulations indicated that Trp6 is vital for interactions between the N-terminus and the main body of the enzyme. Phenylalanine has a less bulky side group than tryptophan and this substitution might, therefore, make the N-terminus more flexible, weakening these interactions (37).

Figure 1.7: Schematic representation showing the maturation of VPR and AQUI (30).

Figure 1.8: Three-dimensional model of VPR from a psychrotrophic Vibrio species (PDB ID: 1SH7), made using the program UCSF Chimera. Calcium ions are colored green, disulfide bonds yellow, and the N-terminal region orange. Residues of the catalytic triad (Asp37, His70 and Ser220), the stabilizing Trp6 residue, and disulfide bonds have been labeled, along with residues located at the site of an important salt bridge in AQUI (Asn15 and Lys257 in VPR).

Figure 1.9: Topology diagram of the structure of VPR (34).

Table 1.2: Thermal stability and kinetic parameters of VPR∆C, and its mutants VPR∆C/N3P/I5P and VPR∆C/W6F. Expressed as mean values ± standard deviation of the mean.

Protein | Tm (°C) | T50% (°C) | kcat (s−1) | KM (mM) | kcat/KM (s−1 mM−1)
VPR∆C | 62.2 ± 0.2 | 53.8 ± 0.4 | 225.7 ± 12.0 | 0.177 ± 0.016 | 1238 ± 149
VPR∆C/N3P/I5P | 67.9 ± 0.4 | 60.3 ± 0.4 | 231.8 ± 10.5 | 0.187 ± 0.009 | 1243 ± 77
VPR∆C/W6F | 49.5 ± 0.3 | 40.6 ± 0.3 | 150.9 ± 31.9 | 0.145 ± 0.028 | 1098 ± 167

Figure 1.10: Close-up of a three-dimensional model of VPR (light blue, PDB ID: 1SH7) superimposed on to a three-dimensional model of AQUI (orange, PDB ID: 4DZT), made using the program UCSF Chimera, which shows some of the differences between the amino acid composition of their N-termini.


1.4 The aim of the project

This project is part of a larger one, conducted in the laboratory of Professor Magnús Már Kristjánsson, aimed at determining the structural basis for temperature adaptations in proteins. The group is currently working with two homologous subtilases from the proteinase K family: VPR, which was first isolated from a psychrophilic Vibrio species and is cold-active, and AQUI, which was first isolated from the thermophilic bacterium Thermus aquaticus and is thermostable. A C-terminal truncated form of VPR, VPR∆C, is used as its structure is more similar to AQUI. Members of the group have conducted numerous site-directed mutagenesis experiments on both AQUI and VPR and have observed changes in the behavior of these mutants when compared to the wild type enzymes (19, 35, 36, 38–41). This project was meant to build on previous work done by members of the group that explored the connection between flexibility at the N-terminus of VPR and its thermostability. Two mutated variants were selected: VPR∆C/N3P/I5P and VPR∆C/W6F. The N3P/I5P variant showed increased thermal stability due to increased interactions between the N-terminal region and the main structure of the enzyme, such as H-bonds and van der Waals interactions (19). The W6F variant displays decreased stability, likely due to a decrease in these same interactions. These three mutations were expressed simultaneously in one enzyme, VPR∆C/N3P/I5P/W6F, to observe the effects on both thermal stability and kinetic parameters.

2 Materials and methods

2.1 Protein purification

2.1.1 Purification of VPR∆C and VPR∆C/N3P/I5P/W6F

VPR∆C and VPR∆C/N3P/I5P/W6F were produced in E. coli (strain Lemo21) by Kristinn Ragnar Óskarsson (as described in (42)). Frozen pellets of E. coli cells containing VPR∆C or VPR∆C/N3P/I5P/W6F were suspended in 45 mL of buffer A (25 mM Tris/HCl (Sigma), 10 mM CaCl2 (Sigma), pH 8.0 at 25 °C) along with DNase (Sigma) and lysozyme (Sigma) at final concentrations of 1 µg/mL and 1 mg/mL, respectively. This mixture was gently shaken for 2 hours at room temperature. The cell mixture was flash frozen in liquid nitrogen (N2), thawed while being gently shaken at room temperature and then flash frozen again. This freeze-thaw cycle was performed three times, with the last thawing being carried out overnight at 4 °C with gentle shaking. The crude cell extract was centrifuged at 20,000 ×g and 4 °C for 45 minutes in a Beckman Coulter® Avanti J-26XP centrifuge. The supernatant was collected and (NH4)2SO4 (Honeywell Fluka) added until the solution was 80% saturated. It was centrifuged again at 20,000 ×g and 4 °C for 45 minutes. The precipitate was collected and suspended in 100 mL of buffer A.

All column purification steps were performed using a BioLogic LP workstation from BioRad. The protein solution from the ammonium sulfate precipitation was loaded onto a z-D-Phe-TETA affinity column which had been equilibrated with buffer A. Next, a wash was performed with buffer A containing 1 M NaCl, followed by a second wash with pure buffer A. The sample was eluted using 2 M guanidinium chloride (GdmCl) in buffer A, into 300 mL of 3 M (NH4)2SO4 in buffer A; the ratio was approximately 5:4.

The mixture was loaded immediately onto a phenyl-Sepharose hydrophobic column that had been equilibrated with 1 M (NH4)2SO4 in buffer A. The (NH4)2SO4 concentration was lowered to 0 M in two steps, first down to 0.25 M and then down to 0 M, using buffer A. The protein was eluted using 50% ethylene glycol in buffer A.

Finally, the sample was loaded onto a Q-Sepharose ion-exchange column which had been equilibrated with buffer A. A wash was performed using buffer A, followed by elution using a linear NaCl gradient from 0 M to 0.5 M. Active fractions were sorted and pooled into three aliquots according to their activity. Ethylene glycol was added to those aliquots to a final concentration of 20%, and they were then flash frozen using liquid nitrogen. All samples were stored at −20 °C.

2.1.2 Zaman-Verwilghen protein quantitation

To determine the protein concentration in samples from each step of the protein purification process, a Bradford assay was performed (Zaman-Verwilghen variation) (43). The samples were prepared by combining 2.75 mL of a Coomassie Brilliant Blue G-250 protein staining solution with 0.25 mL of a protein solution; dH2O was used for the blank. These mixtures were mixed using a vortex mixer and incubated for 15 minutes at room temperature. Following that, their absorbance at 620 nm was measured.

2.1.3 SDS-PAGE

To estimate the degree of purification of the proteins, sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) was carried out with the purified protein samples in home-cast discontinuous gels. First the separating gel (16% acrylamide) was prepared (see Table 2.1). It was poured into the cast and isopropanol was poured on top to obtain an even surface. The gel was left to set while the stacking gel (7% acrylamide) was prepared (see Table 2.2). The isopropanol was poured off and the stacking gel mixture placed in the cast along with a sample comb. These recipes are enough for 2 gels. Samples were inhibited using the serine protease inhibitor phenylmethylsulfonyl fluoride (PMSF) (Sigma) at a final concentration of 2.5 mM and then incubated for 15 minutes. The sample was mixed with dithiothreitol (DTT) (Sigma) to a final concentration of 0.25 mM and 4× lithium dodecyl sulfate (LDS) sample buffer (Invitrogen), and denatured at 80 °C for 10 minutes. 10 µL samples were loaded into the gel along with 5 µL of PageRuler™ Prestained Protein Ladder (Thermo Scientific). Following electrophoresis, the gel was fixed in a 2.5% H3PO4 and 50% EtOH solution for 30 minutes. Then, the gel was washed twice for 20 minutes in dH2O. The gel was stained overnight with a Blue Silver staining solution (0.12% w/v Coomassie Brilliant Blue G-250 dye, 10% H3PO4, 20% MeOH) (44), and then destained in dH2O. The gel was finally photographed with a Bio-Rad GelDoc™ EZ Imager and processed in Bio-Rad Image Lab™.

Table 2.1: Separating gel recipe.

Solution | Volume
40% w/v acrylamide/bis-acrylamide (37/1) (Sigma) | 6 mL
1.5 M Tris/HCl (pH 8.8) (Sigma) | 3.75 mL
10% w/v SDS (Sigma) | 150 µL
dH2O | 5 mL
Tetramethylethylenediamine (TEMED) (Merck) | 15 µL
10% w/v Ammonium persulfate (Sigma) | 75 µL

Table 2.2: Stacking gel recipe.

Solution | Volume
40% w/v acrylamide/bis-acrylamide (37/1) (Sigma) | 2 mL
0.5 M Tris/HCl (pH 6.8) (Sigma) | 0.5 mL
10% w/v SDS (Sigma) | 150 µL
dH2O | 9 mL
TEMED (Merck) | 30 µL
10% w/v Ammonium persulfate (Sigma) | 75 µL
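The acrylamide percentages quoted for the two gels can be checked against the recipe volumes in Tables 2.1 and 2.2. The following is a minimal sketch of that dilution arithmetic; the function name and the list layout (acrylamide stock first) are illustrative choices, not part of the original protocol.

```python
# Final acrylamide concentration of a gel mix from its recipe volumes (mL).
# Assumption for this sketch: the first entry of each list is the 40% w/v
# acrylamide/bis-acrylamide stock; the remaining entries only dilute it.
STOCK_PERCENT = 40.0

separating_ml = [6.0, 3.75, 0.150, 5.0, 0.015, 0.075]  # Table 2.1
stacking_ml = [2.0, 0.5, 0.150, 9.0, 0.030, 0.075]     # Table 2.2


def final_acrylamide_percent(volumes_ml):
    """Dilution rule C_final = C_stock * V_stock / V_total."""
    return STOCK_PERCENT * volumes_ml[0] / sum(volumes_ml)


print(f"separating gel: {final_acrylamide_percent(separating_ml):.1f}% acrylamide")
print(f"stacking gel:   {final_acrylamide_percent(stacking_ml):.1f}% acrylamide")
```

With these volumes the separating gel comes out at about 16% and the stacking gel at just under 7%, in line with the values stated in the text.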


2.2 Enzymatic activity assays

The activity of VPR∆C and VPR∆C/N3P/I5P/W6F was assayed using the synthetic substrate succinyl-Ala-Ala-Pro-Phe-p-nitroanilide (sAAPF-pNA) (Bachem). A 25 mM stock solution was prepared by dissolving the substrate in dimethyl sulfoxide (DMSO) and was kept at 4 °C when not in use. Assays were carried out at room temperature using a Cary50 Bio UV-Visible spectrometer from Varian with a 0.5 mM sAAPF-pNA solution, prepared by diluting the stock solution with a 100 mM Tris/HCl, 10 mM CaCl2, pH 8.6 (at 25 °C) solution (assay buffer). To determine the reaction rate (V) in mM/sec, the increase in absorbance at 410 nm was measured over a 30 second period and the molar absorption coefficient (ε) of 8480 M−1cm−1 was used (Eq. 2.1).

$$V = \frac{\Delta A/\text{sec}}{8480\ \text{M}^{-1}\text{cm}^{-1}} \cdot \frac{1000\ \text{mM}}{\text{M}} \tag{2.1}$$
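As a worked illustration of Eq. 2.1, the sketch below converts an absorbance slope into a reaction rate. The absorbance change used is a made-up reading, and the 1 cm path length is an assumption, since Eq. 2.1 does not state the cuvette width explicitly.

```python
# Molar absorption coefficient at 410 nm, as given in the text (M^-1 cm^-1).
EPSILON_410 = 8480.0


def reaction_rate_mM_per_s(delta_a_per_s, path_cm=1.0):
    """Eq. 2.1: V = (dA/s) / (epsilon * l) * 1000 mM/M.

    path_cm = 1.0 is an assumption of this sketch, not a value from the text.
    """
    return delta_a_per_s / (EPSILON_410 * path_cm) * 1000.0


# Hypothetical reading: absorbance rises by 0.12 over the 30 s window.
rate = reaction_rate_mM_per_s(0.12 / 30.0)
print(f"V = {rate:.3e} mM/s")
```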

2.3 Michaelis-Menten kinetics

Protein samples were thawed and dialysed against 2 L of assay buffer at 4 °C overnight. The concentrations of the protein stock solutions were calculated using the Beer-Lambert law (Eq. 2.2) by measuring their absorbance at 280 nm.

$$A = \varepsilon \cdot l \cdot c \tag{2.2}$$

Where A denotes the absorbance, ε the molar absorption coefficient, l the width of the cuvette (cm) and c the concentration of the sample (M). The molar absorption coefficients for VPR∆C and VPR∆C/N3P/I5P/W6F were calculated using the web-based program ProtParam (https://web.expasy.org/protparam/). They were 34,170 M−1cm−1 and 28,670 M−1cm−1, respectively (45). Samples with an activity of ∼1 U/mL were prepared by weighing protein stock on a precision scale and then diluting it with assay buffer.
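The concentration calculation from Eq. 2.2 can be sketched as follows. The A280 reading and the 1 cm path length are hypothetical example inputs, and the dictionary keys are shorthand labels chosen for this sketch.

```python
# Molar absorption coefficients at 280 nm from the text (M^-1 cm^-1).
EPSILON_280 = {
    "VPRdC": 34170.0,
    "VPRdC/N3P/I5P/W6F": 28670.0,
}


def concentration_uM(a280, variant, path_cm=1.0):
    """Eq. 2.2 solved for c = A / (epsilon * l), returned in micromolar."""
    return a280 / (EPSILON_280[variant] * path_cm) * 1e6


# Hypothetical reading of A280 = 0.30 in a 1 cm cuvette.
for name in EPSILON_280:
    print(f"{name}: {concentration_uM(0.30, name):.2f} uM")
```

The two coefficients differ by 5,500 M−1cm−1, which matches the per-tryptophan contribution ProtParam uses and is consistent with the loss of Trp6 in the W6F-containing variant.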

The activity of the enzyme was measured against seven different substrate (sAAPF-pNA) concentrations: 0.075 mM, 0.10 mM, 0.15 mM, 0.25 mM, 0.50 mM, 0.75 mM and 1.00 mM. Those were prepared by diluting a stock substrate solution with assay buffer. These substrate solutions were kept in a water bath to keep their temperature constant at 25 °C. Each measurement was carried out over a 30 second period using 950 µL of substrate solution and 50 µL of protein solution. The absorbance was measured using a UV-2700 spectrophotometer from Shimadzu and the temperature of the reaction mixture was maintained at 25 °C with a TCC-100 temperature controller from Shimadzu. The protein was measured in triplicate at each substrate concentration (one set) and the reaction mixtures for each concentration were pooled together and stored overnight to allow the reaction to complete. The reaction mixture was then diluted 10-fold and its absorbance at 410 nm measured, to obtain a more accurate estimate of the substrate concentration with the following equation derived from (2.2):

$$[\text{sAAPF-pNA}]\ (\text{mM}) = \frac{A}{8480\ \text{M}^{-1}\text{cm}^{-1}} \cdot \frac{1000\ \text{mM}}{\text{M}} \tag{2.3}$$

Two sets of measurements were performed per day and each protein was measured on three different days, with new protein and substrate solutions being prepared for each day. The results from the measurements were fitted to the Michaelis-Menten equation (2.4) using GraphPad Prism.

$$v = \frac{V_{max} \cdot [S]}{K_M + [S]} \tag{2.4}$$

Where v denotes the reaction rate at substrate concentration [S], Vmax the maximum rate and KM the substrate concentration needed to reach half of Vmax. KM is also a measure of an enzyme's affinity towards a certain substrate. The turnover number (kcat) is a measure of the number of molecules of substrate converted into product per second by the enzyme and is calculated from Vmax with the following equation:

$$k_{cat} = \frac{V_{max}}{[E]} \tag{2.5}$$

Where Vmax denotes the maximum reaction rate using a specific concentration of the enzyme, [E]. The specificity constant is used to evaluate an enzyme's catalytic efficiency. It has units of M−1s−1 and is defined as (46):

$$\text{Specificity constant} = \frac{k_{cat}}{K_M} \tag{2.6}$$

All kinetic parameters were estimated from three individual samples, for each protein, and are displayed as the mean value ± the standard deviation of the mean.
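The fitting in this work was done in GraphPad Prism. As a rough stand-in, the sketch below performs a least-squares fit of Eq. 2.4 in plain Python and then applies Eqs. 2.5 and 2.6. The rate data are synthetic and noise-free, generated from the VPR∆C values in Table 1.2, and the enzyme concentration of 1 nM is a hypothetical value chosen only for illustration.

```python
import math

# Substrate concentrations (mM) listed in the text.
S_MM = [0.075, 0.10, 0.15, 0.25, 0.50, 0.75, 1.00]


def michaelis_menten(s, vmax, km):
    """Eq. 2.4."""
    return vmax * s / (km + s)


def fit_michaelis_menten(s_values, v_values, km_lo=1e-4, km_hi=10.0):
    """Least-squares fit of Eq. 2.4.

    For a fixed KM the best Vmax has a closed form, so only KM needs a
    one-dimensional search (golden-section search, assuming a single minimum
    of the squared error between km_lo and km_hi).
    """
    def best_vmax(km):
        x = [s / (km + s) for s in s_values]
        return sum(v * xi for v, xi in zip(v_values, x)) / sum(xi * xi for xi in x)

    def sse(km):
        vmax = best_vmax(km)
        return sum((v - michaelis_menten(s, vmax, km)) ** 2
                   for s, v in zip(s_values, v_values))

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = km_lo, km_hi
    for _ in range(200):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    km = (a + b) / 2.0
    return best_vmax(km), km


# Hypothetical enzyme concentration in the cuvette: 1e-6 mM (1 nM).
ENZYME_MM = 1e-6
# Synthetic rates (mM/s) from kcat = 225.7 s^-1 and KM = 0.177 mM (Table 1.2).
rates = [michaelis_menten(s, 225.7 * ENZYME_MM, 0.177) for s in S_MM]

vmax, km = fit_michaelis_menten(S_MM, rates)
kcat = vmax / ENZYME_MM   # Eq. 2.5, in s^-1
specificity = kcat / km   # Eq. 2.6, in s^-1 mM^-1
print(f"KM = {km:.3f} mM, kcat = {kcat:.1f} s^-1, kcat/KM = {specificity:.0f} s^-1 mM^-1")
```

With real, noisy data a dedicated nonlinear regression tool also reports standard errors for the fitted parameters, which this sketch does not attempt.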


2.4 Stability measurements

2.4.1 Melting point (Tm) determination

Protein samples were thawed, inhibited using PMSF (final concentration 2.5 mM), incubated at room temperature for 15 minutes, and dialysed overnight at 4 °C against 2 L of a 25 mM glycine, 100 mM NaCl, 15 mM CaCl2, pH 8.6 (at 25 °C) buffer (Tm buffer). Each sample was diluted with the incubation buffer until its absorbance at 280 nm was ∼0.25-0.40. Prior to Tm measurements, a CD spectrum was recorded for each sample from 250 nm down to 200 nm at 25 °C using a J-1100 CD Spectrometer from Jasco in a 1 mm cuvette. The CD signal (mdeg) of the sample at 222 nm was then measured over a temperature gradient of 1 °C/min from 25 °C to 85 °C, which was maintained using a Jasco CTU-100 temperature controller. The cuvette was washed using fuming HNO3 and then Tm buffer between samples. Assuming that the protein exists in either an unfolded or a folded state, the following applies:

$$\text{Total protein} = f_U + f_F = 1 \tag{2.7}$$

Where fU denotes the fraction of unfolded protein and fF the fraction of folded protein. The change in the CD signal of the solution with temperature, resulting from denaturation of the protein, can be described using the following equation:

$$y = f_U \cdot y_U + f_F \cdot y_F \tag{2.8}$$

Where y is the total CD signal in mdeg. yU and yF are dependent on temperature, as the signal changes with temperature even if no conformational changes occur. They can be derived from linear extrapolation of the raw data before (yF) and after (yU) denaturation. The curve can then be normalized from 0 (folded) to 1 (unfolded) using the following equation for each measurement:

$$f_U = \frac{y_F - y}{y_F - y_U} \tag{2.9}$$

The melting point of a protein is defined as the temperature where equal amounts of the protein populate the unfolded and folded states (fU = 0.5). Fraction unfolded was calculated according to (2.9), plotted against temperature and fitted to a variable slope sigmoidal curve with GraphPad Prism, and the melting point was determined. The Tm was estimated from three individual samples, for each protein, and is displayed as the mean value ± the standard deviation of the mean.
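The normalization in Eq. 2.9 can be sketched as below. This is an illustration only: the melting trace is synthetic, the baselines are fitted to the first and last ten points (an arbitrary choice for this sketch), and the Tm is read off by linear interpolation at fU = 0.5 rather than by the variable slope sigmoidal fit in GraphPad Prism used in this work.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def fraction_unfolded(temps, signal, n_base=10):
    """Eq. 2.9, with yF and yU extrapolated from the pre- and
    post-transition baselines (first and last n_base points)."""
    sf, bf = linear_fit(temps[:n_base], signal[:n_base])
    su, bu = linear_fit(temps[-n_base:], signal[-n_base:])
    result = []
    for t, y in zip(temps, signal):
        y_f = sf * t + bf
        y_u = su * t + bu
        result.append((y_f - y) / (y_f - y_u))
    return result


def melting_point(temps, f_u):
    """Temperature at which fU crosses 0.5, by linear interpolation."""
    for i in range(len(temps) - 1):
        if f_u[i] < 0.5 <= f_u[i + 1]:
            step = (0.5 - f_u[i]) / (f_u[i + 1] - f_u[i])
            return temps[i] + step * (temps[i + 1] - temps[i])
    raise ValueError("fU never crosses 0.5 in the measured range")


# Synthetic two-state trace (mdeg) with sloping baselines, 25 to 85 degrees C.
TRUE_TM = 62.2  # taken from Table 1.2 only to generate plausible test data
temps = [25.0 + i for i in range(61)]
signal = []
for t in temps:
    fu = 1.0 / (1.0 + math.exp(-(t - TRUE_TM) / 1.5))
    y_folded = -20.0 + 0.02 * t
    y_unfolded = -5.0 + 0.01 * t
    signal.append(fu * y_unfolded + (1.0 - fu) * y_folded)  # Eq. 2.8

tm = melting_point(temps, fraction_unfolded(temps, signal))
print(f"Tm = {tm:.1f} degrees C")
```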

2.4.2 Thermal inactivation (T50%) measurements

T50% is defined as the temperature at which half of the enzyme activity has been lost after 30 minutes. When a protein is irreversibly denatured in a way that affects its active site, and therefore, its ability to function as it should, it has been inactivated. By measuring the decrease in enzyme activity at different temperatures, i.e. the rate of thermal inactivation, we can evaluate the proteins thermal stability. The temperature range used for measurements should contain the temperature associated with T50%.

Protein samples were thawed and dialysed overnight at 4 ◦C against 2 L of a 25 mM ◦ Tris, 15 mM CaCl2, 100 mM NaCl, 1 mM EDTA buffer, and pH 8.95 (at 25 C) buffer (T50% buffer). The enzyme solution was then diluted with T50% buffer until it contained ∼2 U/mL and 400 mL aliquots of the solution were placed in test tubes. The test tubes were incubated at six different temperatures in a water bath, to keep the temperature constant, and the exact temperature was measured using a HI935005 K-type thermocouple digital thermometer from Hanna. Their thermal inactivation was determined by measuring the activity at regular intervals (4-6 measurements) using a Cary50 Bio UV-Visible spectrometer (Varian) at room temperature. The activity data was fitted to first-order plots to obtain rate constants for thermal inactivation at each temperature. The equation for first-order reactions is:

$$\ln(v_t) = \ln(v_0) - kt \tag{2.10}$$

Where vt denotes the enzyme's activity at a given time (mM/s), v0 the initial activity, k the first-order rate constant (s−1) and t the time. This equation can be rearranged to obtain the following equation, to which the activity data were fitted:

$$\ln\left(\frac{v_t}{v_0}\right) = -kt \tag{2.11}$$
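As a minimal sketch of how a rate constant can be extracted with Eq. 2.11, the following Python snippet fits ln(vt/v0) against t with a least-squares line forced through the origin. It assumes that the first activity value was measured at t = 0 and serves as v0; the function name and the data are hypothetical and do not come from the measurements in this work.

```python
import math


def inactivation_rate_constant(times_s, activities):
    """Fit ln(v_t / v_0) = -k t (Eq. 2.11) by least squares through the origin.

    Assumes activities[0] was measured at t = 0 and serves as v_0.
    Returns k in s^-1.
    """
    v0 = activities[0]
    log_ratios = [math.log(v / v0) for v in activities]
    numerator = sum(t * y for t, y in zip(times_s, log_ratios))
    denominator = sum(t * t for t in times_s)
    return -numerator / denominator


# Hypothetical data: activity sampled every 5 min with a true k of 4.0e-4 s^-1.
times_s = [0.0, 300.0, 600.0, 900.0, 1200.0, 1500.0]
activities = [2.0 * math.exp(-4.0e-4 * t) for t in times_s]

k = inactivation_rate_constant(times_s, activities)
half_life_min = math.log(2.0) / k / 60.0
print(f"k = {k:.3e} s^-1, half-life = {half_life_min:.1f} min")
```

One such rate constant is obtained per incubation temperature; together they form the input to the Arrhenius analysis.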

The rate constants were then fitted to an Arrhenius plot, which describes their temperature dependence:

$$\ln(k) = -\frac{E_a}{R \cdot T} + \ln(A) \tag{2.12}$$

Where Ea denotes the activation energy of unfolding, A the pre-exponential factor, R the universal gas constant (8.314 J·mol−1K−1) and T the absolute temperature (K). The rate constant corresponding to the loss of half of the initial activity over 30 minutes was then calculated by rearranging Eq. 2.11 and solving it for vt/v0 = 0.5:

$$k_{50\%} = \frac{\ln\left(v_0/v_t\right)}{t} = \frac{-\ln(0.5)}{30\ \mathrm{min} \cdot 60\ \mathrm{s/min}} \tag{2.13}$$

T50% was obtained by rearranging Eq. 2.12 and solving it for k50%.

$$T_{50\%} = \frac{E_a}{R \cdot \left(\ln(A) - \ln(k_{50\%})\right)} \tag{2.14}$$

The T50% was estimated from three individual samples, for each protein, and is displayed as the mean value ± the standard deviation of the mean.
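The calculation of T50% from a set of rate constants can be sketched as follows. The snippet fits ln(k) against 1/T by ordinary least squares, so that the slope corresponds to −Ea/R and the intercept to ln(A), and then solves the Arrhenius relation for the temperature at which k equals the rate constant for 50% loss of activity in 30 minutes. The function names are hypothetical, and the rate constants are generated from assumed values (Ea = 250 kJ/mol, T50% = 50 °C) rather than taken from the measurements.

```python
import math

R = 8.314  # universal gas constant, J mol^-1 K^-1


def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def t50_from_rates(temps_c, rate_constants, minutes=30.0):
    """Return (T50% in deg C, Ea in J/mol, k50% in s^-1) from an Arrhenius fit."""
    inv_t = [1.0 / (t + 273.15) for t in temps_c]
    ln_k = [math.log(k) for k in rate_constants]
    slope, intercept = linear_fit(inv_t, ln_k)  # slope = -Ea/R, intercept = ln(A)
    e_a = -slope * R
    ln_a = intercept
    k50 = math.log(2.0) / (minutes * 60.0)  # half of the activity lost in `minutes`
    t50_kelvin = e_a / (R * (ln_a - math.log(k50)))
    return t50_kelvin - 273.15, e_a, k50


# Hypothetical rate constants generated from assumed Ea and T50% values.
TRUE_EA = 250e3      # J/mol (assumption)
TRUE_T50_C = 50.0    # deg C (assumption)
true_ln_a = math.log(math.log(2.0) / 1800.0) + TRUE_EA / (R * (TRUE_T50_C + 273.15))
temps_c = [46.0, 48.0, 50.0, 52.0, 54.0, 56.0]
rates = [math.exp(true_ln_a - TRUE_EA / (R * (t + 273.15))) for t in temps_c]

t50_c, e_a, k50 = t50_from_rates(temps_c, rates)
print(f"T50% = {t50_c:.2f} deg C, Ea = {e_a / 1000:.1f} kJ/mol, k50% = {k50:.3e} s^-1")
```

Temperatures must be converted to kelvin before the fit; with real data the six points scatter around the line, so T50% inherits the uncertainty of both the slope and the intercept.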

2.4.3 Differential scanning calorimetry (DSC)

Differential scanning calorimetry is a method for measuring the specific heat capacity (Cp) of thermally induced events, such as conformational changes in proteins during thermally induced denaturation, as a function of temperature (47). DSC measurements were carried out here to monitor thermal events occurring during thermal denaturation of the protease variants.

Protein samples were thawed, inhibited using PMSF (final concentration 2.5 mM), incubated at room temperature for 15 minutes, and dialysed overnight at 4 °C against 2 L of Tm buffer. The DSC experiments were carried out by Kristinn Ragnar Óskarsson (as described in (42)). A single experiment was performed for each variant.

3 Results and discussions

3.1 Purification

VPR∆C and VPR∆C/N3P/I5P/W6F were purified using the methods detailed in chapter 2.1. Expression and purification were successful for VPR∆C (Table 3.1), as yields of 70-90% should be expected. The higher yields observed following the second step are due to an underestimation of the enzyme's activity in the soluble fraction. This could be due to the presence of additional protein substrates for the enzyme, i.e. the enzyme hydrolyzes polypeptides present in the solution instead of the sAAPF-pNA. The lower yields observed following the z-D-Phe-TETA-Sepharose column could be due to the denaturing effects of the chaotropic salt GdmCl.

Table 3.1: Purification table for VPR∆C expressed in Lemo21.

| Step | Volume (mL) | Concentration (mg/mL) | Activity (U/mL) | Units (U) | Total protein (mg) | Specific activity (U/mg) | Yield (%) | Purification (fold) |
|---|---|---|---|---|---|---|---|---|
| Soluble fraction | 121 | 1.50 | 118 | 14278 | 182 | 78 | 100 | 1.0 |
| 80% (NH4)2SO4 precipitate | 102 | 1.36 | 171 | 17442 | 138 | 126 | 122 | 1.6 |
| z-D-Phe-TETA-Sepharose | 1050 | 0.10 | 13.5 | 14175 | 102 | 139 | 99 | 1.8 |
| Phenyl-Sepharose | 134 | 0.18 | 132 | 17688 | 25 | 722 | 124 | 9.2 |
| Q-Sepharose | 36 | 0.40 | 347 | 12579 | 14 | 875 | 88 | 11 |

Expression was successful for VPR∆C/N3P/I5P/W6F, but its purification was unsatisfactory (Table 3.2). The sample taken of the 80% (NH4)2SO4 precipitate was misplaced. Consequently, results from the Bradford assay (see Chapter 2.1.2), and results derived from it, could not be obtained. The purification table for VPR∆C/N3P/I5P/W6F is, therefore, incomplete. Exceptionally low yields were observed following the z-D-Phe-TETA-Sepharose column step. This was likely due to VPR∆C/N3P/I5P/W6F being more sensitive to the denaturing effects of GdmCl than VPR∆C, and a higher fraction of the enzyme, therefore, being fully denatured. Poor yields following the Q-Sepharose step are likely due to the protein being left overnight on the column after it had been loaded onto it. A large fraction of the enzyme was subsequently washed off the column and lost.
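The derived columns of a purification table follow from the total units and total protein at each step. The sketch below, with hypothetical function and variable names, recomputes specific activity, yield and fold purification from the Units and Total protein columns reported for VPR∆C in Table 3.1. Because the table entries are rounded, the recomputed values for the later steps agree with the tabulated ones only approximately.

```python
def purification_summary(steps):
    """Compute derived purification columns.

    steps: list of (name, total_units_U, total_protein_mg) in purification order;
    the first entry is used as the reference (100% yield, 1.0-fold).
    """
    _, ref_units, ref_protein = steps[0]
    ref_specific_activity = ref_units / ref_protein
    rows = []
    for name, units, protein in steps:
        specific_activity = units / protein  # U/mg
        rows.append({
            "step": name,
            "specific_activity": specific_activity,
            "yield_pct": 100.0 * units / ref_units,
            "fold": specific_activity / ref_specific_activity,
        })
    return rows


# Units (U) and total protein (mg) as reported for VPR-deltaC in Table 3.1.
table_3_1 = [
    ("Soluble fraction", 14278, 182),
    ("80% (NH4)2SO4 precipitate", 17442, 138),
    ("z-D-Phe-TETA-Sepharose", 14175, 102),
    ("Phenyl-Sepharose", 17688, 25),
    ("Q-Sepharose", 12579, 14),
]

rows = purification_summary(table_3_1)
for row in rows:
    print(f"{row['step']:<28} {row['specific_activity']:7.1f} U/mg "
          f"{row['yield_pct']:6.1f} % {row['fold']:5.1f}-fold")
```

A yield above 100% after the first step, as seen here, signals that activity was underestimated in the reference fraction rather than that enzyme was gained.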

Table 3.2: Purification table for VPR∆C/N3P/I5P/W6F expressed in Lemo21. Concentration of protein in the second step could not be estimated. That result and results that should have been derived from it are, therefore, marked as not applicable (N/A).

| Step | Volume (mL) | Concentration (mg/mL) | Activity (U/mL) | Units (U) | Total protein (mg) | Specific activity (U/mg) | Yield (%) | Purification (fold) |
|---|---|---|---|---|---|---|---|---|
| Soluble fraction | 88 | 2.32 | 194 | 17072 | 204 | 84 | 100 | 1.0 |
| 80% (NH4)2SO4 precipitate | 102 | N/A | 155 | 15810 | N/A | N/A | 93 | N/A |
| z-D-Phe-TETA-Sepharose | 935 | 0.06 | 3 | 3151 | 58 | 54 | 19 | 0.7 |
| Phenyl-Sepharose | 183 | 0.08 | 26 | 4721 | 15 | 319 | 28 | 3.8 |
| Q-Sepharose | 19 | 0.20 | 49 | 923 | 4 | 247 | 5 | 3.0 |


3.1.1 SDS-PAGE

Following SDS-PAGE of purified samples of VPR∆C and VPR∆C/N3P/I5P/W6F, the gel was photographed and analyzed (see Figure 3.1). Each sample lane contained one intense protein band at ∼34 kDa, which is consistent with previous results (42). The presence of faint protein bands of lower molecular mass than the intense bands suggests that a fraction of the protein in the samples underwent autoproteolytic cleavage. The two faint bands above the intense protein band in lane 2 were likely due to impurities, but they amounted to <1%.

Figure 3.1: SDS-PAGE of purified samples of VPR∆C and VPR∆C/N3P/I5P/W6F. Lane 1: PageRuler™ Prestained Protein Ladder. Lane 2: VPR∆C (Table 3.1). Lane 3: VPR∆C/N3P/I5P/W6F (Table 3.2). The size of the proteins in the protein ladder is marked in kDa.


3.2 Michaelis-Menten kinetics

Michaelis-Menten kinetic assay experiments were carried out on VPR∆C and VPR∆C/N3P/I5P/W6F in assay buffer, to see what effects the mutations had on the catalytic activity of the enzyme. The kinetic parameters were determined using the methods described in chapter 2.3 (see Table 3.3).

Figure 3.2: Example of two kinetic assay data sets (black dots) for VPR∆C/N3P/I5P/W6F, along with error bars, fitted to Michaelis-Menten equations (red and blue lines). The kinetic parameters determined for the assay fitted to the red line were: KM = 0.193 mM, Vmax = 0.00186 mM/sec, kcat = 87.0 sec−1, and kcat/KM = 452 sec−1mM−1.

Table 3.3: Kinetic parameters for the activity of VPR∆C and VPR∆C/N3P/I5P/W6F against sAAPF-pNA at 25 °C in assay buffer. Expressed as mean values ± standard deviation of the mean.

| Protein | KM (mM) | kcat (s−1) | kcat/KM (s−1mM−1) |
|---|---|---|---|
| VPR∆C | 0.184 ± 0.016 | 224.9 ± 18.5 | 1267 ± 139 |
| VPR∆C/N3P/I5P/W6F | 0.180 ± 0.013 | 83.0 ± 13.1 | 461 ± 101 |

The measured kinetic parameters for VPR∆C are very similar to previous measurements of this enzyme (42). KM for VPR∆C/N3P/I5P/W6F appears similar to that of VPR∆C, but kcat is clearly much lower. The specificity constant, kcat/KM, is therefore substantially lower for the N3P/I5P/W6F variant, which indicates a large decrease in the catalytic efficiency of the enzyme. Lower kcat/KM values have also been observed for both the N3P/I5P and W6F variants (see Table 1.2), but these were still much higher (≥1100 s−1mM−1) than the measured specificity constant of the N3P/I5P/W6F variant.
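To make the relationship between the reported parameters concrete, the sketch below evaluates the Michaelis-Menten equation, v = Vmax·[S]/(KM + [S]), with the values given for the red curve in Figure 3.2 and compares the mean specificity constants from Table 3.3. The function and variable names are hypothetical; the enzyme concentration is not stated in this section and is only inferred here as Vmax/kcat.

```python
def michaelis_menten_rate(substrate_mM, vmax, km_mM):
    """Initial rate v = Vmax * [S] / (KM + [S])."""
    return vmax * substrate_mM / (km_mM + substrate_mM)


# Parameters reported for the red curve in Figure 3.2 (N3P/I5P/W6F variant).
KM = 0.193        # mM
VMAX = 0.00186    # mM/s
KCAT = 87.0       # s^-1

rate_at_km = michaelis_menten_rate(KM, VMAX, KM)   # equals Vmax / 2 by definition
specificity = KCAT / KM                            # s^-1 mM^-1
implied_enzyme_mM = VMAX / KCAT                    # inferred, not a reported value

# Mean specificity constants from Table 3.3 (s^-1 mM^-1).
relative_efficiency = 461 / 1267                   # variant relative to wild type

print(f"v at [S] = KM: {rate_at_km:.5f} mM/s")
print(f"kcat/KM from rounded parameters: {specificity:.1f} s^-1 mM^-1")
print(f"implied enzyme concentration: {implied_enzyme_mM * 1e6:.1f} nM")
print(f"relative catalytic efficiency: {relative_efficiency:.2f}")
```

The ratio of the rounded kcat and KM values differs slightly from the kcat/KM reported in the caption, which is expected when the reported value was computed from unrounded fit parameters.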

3.3 Stability measurements

The melting points (Tm) of VPR∆C and VPR∆C/N3P/I5P/W6F were measured using a CD spectrometer and the raw data processed with the methods detailed in chapter 2.4.1, but first their CD spectra were measured (see Figure 3.3) to see if the mutations had an effect on the overall secondary structure of the protein.

Figure 3.3: The CD spectra of VPR∆C (red) and VPR∆C/N3P/I5P/W6F (blue), between 200 nm and 250 nm. MRE denotes mean residue ellipticity.

The depth of the curves was found to differ, but this is likely due to an overestimation of protein concentration for the VPR∆C/N3P/I5P/W6F variant. Some differences can also be observed in the shape of the curves. The relatively smaller peak at around 220 nm for the VPR∆C/N3P/I5P/W6F variant, as compared to VPR∆C, may indicate a higher content of β-strands relative to α-helices in the mutated variant. This higher ratio may imply that the mutations have a destabilizing effect on an α-helix, or multiple α-helices, in the protein. A partial unraveling of one or more α-helices would manifest itself in a similar way. Indeed, molecular dynamics simulations indicated that two helices (those shown on the top of VPR in Figure 1.8) may be partially unfolded in the W6F variant (37). The same could be true for the N3P/I5P/W6F variant.

When the raw data had been processed for all measurements (see Table 3.4), the normalized melting curves for each variant were averaged and plotted together on a single graph. The mutations clearly have a large destabilizing effect on the enzyme when compared to the wild type, manifested in a ∼12-13 °C decrease in Tm (see Figure 3.4).

Figure 3.4: The averaged and normalized melting curves for VPR∆C (red dots) and VPR∆C/N3P/I5P/W6F (blue dots), fitted to variable slope sigmoidal curves (black).

T50% measurements were also carried out for VPR∆C and VPR∆C/N3P/I5P/W6F and the data were analyzed according to the methods detailed in chapter 2.4.2 (see Table 3.4). Following that, the Arrhenius plots for each variant were averaged and the results plotted together on a single graph (see Figure 3.5). T50% was also lowered for the variant (∼3 °C), but to a lesser extent than Tm (see Table 3.4).

Figure 3.5: The averaged Arrhenius plots for VPR∆C (red dots) and VPR∆C/N3P/I5P/W6F (blue dots), along with linear fits (black).

Table 3.4: Thermal stability of VPR∆C and VPR∆C/N3P/I5P/W6F. Expressed as mean values ± standard deviation of the mean.

| Protein | Tm (°C) | T50% (°C) |
|---|---|---|
| VPR∆C | 60.6 ± 0.4 | 52.4 ± 0.2 |
| VPR∆C/N3P/I5P/W6F | 48.2 ± 0.5 | 49.4 ± 0.3 |

Both the Tm and T50% values for VPR∆C reported here were lower than those previously reported (42). This may be due to an overestimation of the concentration of Ca2+ in the buffer, which might be due to the new batch of CaCl2 used being more hydrated than previous ones. It has been shown that the stability of VPR is highly dependent on calcium ion concentration (42). Nevertheless, VPR∆C/N3P/I5P/W6F is clearly less thermostable than VPR∆C. The Tm was also lower than those for both the N3P/I5P and W6F variants (see Table 1.2). T50% was lower than that of the N3P/I5P variant, but still much higher than the T50% of the W6F variant (see Table 1.2). This indicates that the stabilizing effects of the N3P/I5P mutations do not completely cancel out the destabilizing effects of the W6F mutation. Interestingly, in the N3P/I5P/W6F variant, the estimated value for Tm is roughly 1 °C lower than that of T50%, something that has not been observed before in any variant of VPR (19, 35–41). Somehow, the enzyme is still catalytically active past its melting point. Possibly, the scaffolding around the active site is more stable than the structure surrounding it, allowing the enzyme to remain functional even as it starts denaturing.
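The differences discussed above can be written out explicitly from the values in Table 3.4. As a rough guide to their significance, the reported uncertainties are combined in quadrature below; this assumes that the errors of the two values being compared are independent, which is an assumption made for this illustration and not an analysis carried out in this work.

```latex
\begin{align*}
\Delta T_m &= 60.6 - 48.2 = 12.4\ ^\circ\mathrm{C},
  & \sigma &\approx \sqrt{0.4^2 + 0.5^2} \approx 0.6\ ^\circ\mathrm{C} \\
\Delta T_{50\%} &= 52.4 - 49.4 = 3.0\ ^\circ\mathrm{C},
  & \sigma &\approx \sqrt{0.2^2 + 0.3^2} \approx 0.4\ ^\circ\mathrm{C} \\
\left(T_{50\%} - T_m\right)_{\mathrm{N3P/I5P/W6F}} &= 49.4 - 48.2 = 1.2\ ^\circ\mathrm{C},
  & \sigma &\approx \sqrt{0.3^2 + 0.5^2} \approx 0.6\ ^\circ\mathrm{C}
\end{align*}
```

Under this assumption, the 1.2 °C by which T50% exceeds Tm in the N3P/I5P/W6F variant is about twice the combined uncertainty, so the inversion is small but not obviously within the noise of the measurements.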

A single DSC experiment was performed for each variant by Kristinn Ragnar Óskarsson. The raw, unprocessed data for both variants were then plotted together on a single graph (see Figure 3.6).

Figure 3.6: Graph of raw data from DSC measurements for VPR∆C (red) and VPR∆C/N3P/I5P/W6F (blue), where specific heat capacity (Cp) is plotted as a function of temperature.

No attempt was made to process the data for the N3P/I5P/W6F variant, since no post-denaturation heat capacity value could be determined. A clear difference between the location and shape of the curves can still be observed. The curve of VPR∆C shows only one smooth peak during denaturation. The peak appears at ∼64.5 °C, which is higher than the CD melting point, but that has been observed before (42). The curve of VPR∆C/N3P/I5P/W6F displays what may be two separate peaks, indicating at least one thermal event occurring at a higher temperature than the rest. The first peak appears at ∼48 °C and the second one at ∼57.5 °C. The rough latter part of the curve is likely due to the fact that the protein started precipitating, and that it did so to a much larger extent than the wild type enzyme. This could be due to new nucleation points arising as a consequence of the mutations, making it more thermodynamically favorable for the protein to aggregate.

4 Conclusions

The purpose of this project was to observe the combined effects of the stabilizing mutations N3P/I5P and the destabilizing mutation W6F on a single variant. Hence we were interested to see if the stabilizing effect of the double proline substitutions could counteract the detrimental effect of the W6F mutation on stability and activity of VPR. The mutated variant, VPR∆C/N3P/I5P/W6F, clearly behaves differently than the wild type enzyme, VPR∆C, as summarized in Table 4.1.

Table 4.1: Main results of the project for VPR∆C (wild type) and VPR∆C/N3P/I5P/W6F.

| Protein | Tm (°C) | T50% (°C) | KM (mM) | kcat (s−1) | kcat/KM (s−1mM−1) |
|---|---|---|---|---|---|
| Wild type | 60.6 ± 0.4 | 52.4 ± 0.2 | 0.184 ± 0.016 | 224.9 ± 18.5 | 1267 ± 139 |
| N3P/I5P/W6F | 48.2 ± 0.5 | 49.4 ± 0.3 | 0.180 ± 0.013 | 83.0 ± 13.1 | 461 ± 101 |

The N3P/I5P/W6F variant displays a much lower kcat value and a KM value similar to that of the wild type (Table 4.1), resulting in a drastically lower kcat/KM value. This means it shows much lower catalytic efficiency towards sAAPF-pNA than wild type VPR, as well as both the N3P/I5P and W6F variants (see Table 1.2). This could be due to the structure of the active site changing as a result of weaker interactions between the N-terminus and the main body of the enzyme, making it worse suited to catalyze the hydrolysis of this substrate. In other words, the stabilizing effect of the double proline mutations (N3P/I5P) is insufficient to overcome the destabilizing effect of the phenylalanine substitution (W6F). The mutated variant displays reduced Tm and T50% values and is, therefore, clearly less thermostable than the wild type VPR. It is also less thermostable than the N3P/I5P variant and similar Tm values were observed as for the W6F variant, but the T50% values for the N3P/I5P/W6F variant were much higher than those for the W6F variant (see Table 1.2). Tm and T50% values reported here for VPR∆C are slightly lower than those previously reported (42), likely due to an overestimation of calcium ion concentration in the buffers. What is most interesting, though, is the difference between Tm and T50% in VPR∆C/N3P/I5P/W6F. The value for T50% was ∼1 °C higher than that of Tm, which has not been observed previously in any other variant of VPR (19, 35–41). Results from DSC experiments for the N3P/I5P/W6F variant may indicate that at least one thermal event occurs at a higher temperature than the rest. That could mean that the scaffolding around the active site is more structurally sound than the rest of the protein, allowing it to retain some catalytic activity even as it has begun denaturing. This anomaly could of course be a result of something else entirely. Increased protein precipitation was likely attributable to new nucleation points being created by the mutations, making it easier for the protein to aggregate. Further mutagenic studies regarding interactions between the N-terminus and the main structure of the protein could yield more substantial evidence regarding thermal events that occur during thermal denaturation, and possibly even on how the protein folds and unfolds, using e.g. optical tweezers (48–50).

5 References

(1) Van Kranendonk, M. J., Webb, G. E., and Kamber, B. S., (2003). Geological and trace element evidence for a marine sedimentary environment of deposition and biogenicity of 3.45 Ga stromatolitic carbonates in the Pilbara Craton, and support for a reducing Archaean ocean. Geobiology 1, 91–108.
(2) Ohtomo, Y., Kakegawa, T., Ishida, A., Nagase, T., and Rosing, M. T., (2014). Evidence for biogenic graphite in early Archaean Isua metasedimentary rocks. Nat. Geosci. 7, 25–28.
(3) Dodd, M. S., Papineau, D., Grenne, T., Slack, J. F., Rittner, M., Pirajno, F., O’Neil, J., and Little, C. T. S., (2017). Evidence for early life in Earth’s oldest hydrothermal vent precipitates. Nature 543, 60–64.
(4) Schopf, J. W., Kitajima, K., Spicuzza, M. J., Kudryavtsev, A. B., and Valley, J. W., (2018). SIMS analyses of the oldest known assemblage of microfossils document their taxon-correlated carbon isotope compositions. Proc. Natl. Acad. Sci. U.S.A. 115, 53–58.
(5) Rothschild, L. J., and Mancinelli, R. L., (2001). Life in extreme environments. Nature 409, 1092–1101.
(6) Tattersall, G. J., Sinclair, B. J., Withers, P. C., Fields, P. A., Seebacher, F., Cooper, E. C., and Maloney, S. K., (2012). Coping with thermal challenges: Physiological adaptations to environmental temperatures. Compr. Physiol. 2, 2151–2202.
(7) Feller, G., (2010). Protein stability and enzyme activity at extreme biological temperatures. J. Phys. Condens. Matter 22, 323101.
(8) Luke, K. A., Higgins, C. L., and Wittung-Stafshede, P., (2007). Thermodynamic stability and folding of proteins from hyperthermophilic organisms. FEBS J. 274, 4023–4033.


(9) Jaenicke, R., and Böhm, G., (1998). The stability of proteins in extreme environments. Curr. Opin. Struct. Biol. 8, 738–748.
(10) D’Amico, S., Collins, T., Marx, J. C., Feller, G., and Gerday, C., (2006). Psychrophilic microorganisms: challenges for life. EMBO Rep. 7, 385–389.
(11) Smalås, A., Leiros, H., Os, V., and Willassen, N., (2000). Cold adapted enzymes. Biotechnol. Annu. Rev. 6, 1–57.
(12) Kristjánsson, M. M., Magnússon, Ó. T., Guðmundsson, H. M., Alfreðsson, G. Á., and Matsuzawa, H., (1999). Properties of a subtilisin-like proteinase from a psychrotrophic Vibrio species. Eur. J. Biochem. 260, 752–760.
(13) Branden, C., and Tooze, J., Introduction to Protein Structure; Garland Publishing: New York, 1999.
(14) Nelson, D. L., and Cox, M. M., Lehninger Principles of Biochemistry, 7th ed.; W.H. Freeman and Company: New York, 2013.
(15) Arnold, F., Wintrode, P., Miyazaki, K., and Gershenson, A., (2001). How enzymes adapt: Lessons from directed evolution. Trends Biochem. Sci. 26, 100–106.
(16) Williamson, M., How Proteins Work; Garland Science: New York, 2012.
(17) Daniel, R. M., Danson, M. J., Hough, D. W., Lee, C. K., Peterson, M. E., and Cowan, D. A., In Protein adaptation in extremophiles, Siddiqui, K. S., and Thomas, T., Eds.; Nova Science Publishers: New York, 2008; Chapter 1: Enzyme stability and activity at high temperatures, pp 1–34.
(18) Sterpone, F., and Melchionna, S., In Thermostable Proteins: Structural Stability and Design, Sen, S., and Nilsson, L., Eds.; CRC Press: Boca Raton, 2011; Chapter 2: Role of packing, hydration, and fluctuations on thermostability, pp 21–46.
(19) Arnórsdóttir, J., Sigtryggsdóttir, Á. R., Þorbjarnardóttir, S. H., and Kristjánsson, M. M., (2009). Effect of proline substitutions on stability and kinetic properties of a cold adapted subtilase. J. Biochem. 145, 325–329.
(20) Kristjánsson, M. M., In Handbook to Proteolytic Enzymes, Rawlings, N., and Salvesen, G., Eds., 3rd ed.; Academic Press: Cambridge, 2013; Chapter 695 - Cold adapted subtilases, pp 3161–3166.
(21) Henzler-Wildman, K., and Kern, D., (2007). Dynamic personalities of proteins. Nature 450, 964–972.
(22) López-Otín, C., and Bond, J. S., (2008). Proteases: Multifunctional enzymes in life and disease. J. Biol. Chem. 283, 30433–30437.
(23) Page, M. J., and Di Cera, E., (2008). Serine peptidases: Classification, structure and function. Cell. Mol. Life Sci. 65, 1220–1236.
(24) Rawlings, N., Waller, M., Barrett, A., and Bateman, A., (2014). MEROPS: The database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res. 42, https://www.ebi.ac.uk/merops/about/classification.shtml (20/04/2019), D503–D509.
(25) Rawlings, N., Waller, M., Barrett, A., and Bateman, A., (2014). MEROPS: The database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res. 42, https://www.ebi.ac.uk/merops/about/glossary.shtml (20/04/2019), D503–D509.
(26) Hedstrom, L., (2002). Serine protease mechanism and specificity. Chem. Rev. 102, 4501–4523.
(27) Siezen, R. J., and Leunissen, J. A. M., (1997). Subtilases: The superfamily of subtilisin-like serine proteases. Protein Sci. 6, 501–523.
(28) Rawlings, N., Waller, M., Barrett, A., and Bateman, A., (2014). MEROPS: The database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res. 42, https://www.ebi.ac.uk/merops/cgi-bin/clan_index (20/04/2019), D503–D509.
(29) Finn, R. D., Coggill, P., Eberhardt, R. Y., Eddy, S. R., Mistry, J., Mitchell, A. L., Potter, S. C., Punta, M., Qureshi, M., Sangrador-Vegas, A., Salazar, G. A., Tate, J., and Bateman, A., (2016). The Pfam protein families database: Towards a more sustainable future. Nucleic Acids Res. 44, http://pfam.xfam.org/family/PF00082 (7/05/2019), D279–D285.
(30) Kristjánsson, M. M., In Thermostable Proteins: Structural Stability and Design, Sen, S., and Nilsson, L., Eds.; CRC Press: Boca Raton, 2011; Chapter 4: Thermostable subtilases (subtilisin-like serine proteinases), pp 67–104.
(31) Arnórsdóttir, J., Smáradóttir, R. B., Magnússon, Ó. T., Þorbjarnardóttir, S. H., Eggertsson, G., and Kristjánsson, M. M., (2002). Characterization of a cloned subtilisin-like serine proteinase from a psychrotrophic Vibrio species. Eur. J. Biochem. 269, 5536–5546.


(32) Dawson, N. L., Lewis, T. E., Das, S., Lees, J. G., Lee, D., Ashford, P., Orengo, C. A., and Sillitoe, I., (2017). CATH: An expanded resource to predict protein function through structure and sequence. Nucleic Acids Res. 45, http://www.cathdb.info/version/latest/superfamily/3.40.50.20/classification (25/04/2019), D289–D295.
(33) Murzin, A. G., Brenner, S. E., Hubbard, T., and Chothia, C., (1995). SCOP: A structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247, http://scop.mrc-lmb.cam.ac.uk/scop/data/scop.b.d.fi.b.b.bd.html (30/04/2019), 536–540.
(34) Arnórsdóttir, J., Kristjánsson, M. M., and Ficner, R., (2005). Crystal structure of a subtilisin-like serine proteinase from a psychrotrophic Vibrio species reveals structural aspects of cold adaptation. FEBS J. 272, 832–845.
(35) Jónsdóttir, L. B., Ellertsson, B. Ö., Invernizzi, G., Magnúsdóttir, M., Þorbjarnardóttir, S. H., Papaleo, E., and Kristjánsson, M. M., (2014). The role of salt bridges on the temperature adaptation of aqualysin I, a thermostable subtilisin-like proteinase. Biochim. Biophys. Acta 1844, 2174–2181.
(36) Sigurðardóttir, A. G., Arnórsdóttir, J., Þorbjarnardóttir, S. H., Eggertsson, G., Suhre, K., and Kristjánsson, M. M., (2009). Characteristics of mutants designed to incorporate a new ion pair into the structure of a cold adapted subtilisin-like serine protease. Biochim. Biophys. Acta 1794, 512–518.
(37) Óskarsson, K. R., and Kristjánsson, M. M., Unpublished results.
(38) Arnórsdóttir, J., Magnúsdóttir, M., Friðjónsson, Ó. H., and Kristjánsson, M. M., (2011). The effect of deleting a putative salt bridge on the properties of the thermostable subtilisin-like proteinase, Aqualysin I. Protein Pept. Lett. 18, 545–551.
(39) Sigtryggsdóttir, Á. R., Papaleo, E., Þorbjarnardóttir, S. H., and Kristjánsson, M. M., (2014). Flexibility of cold- and heat-adapted subtilisin-like serine proteinases evaluated with fluorescence quenching and molecular dynamics. Biochim. Biophys. Acta 1844, 705–712.
(40) Óskarsson, K. R., Nygaard, M., Ellertsson, B. Ö., Þorbjarnardóttir, S. H., Papaleo, E., and Kristjánsson, M. M., (2016). A single mutation Gln142Lys doubles the catalytic activity of VPR, a cold adapted subtilisin-like serine proteinase. Biochim. Biophys. Acta 1864, 1436–1442.
(41) Óskarsson, K. R., Rational design of the cold active subtilisin-like serine protease VPR towards higher activity and thermostability, M.Sc. thesis, Univ. of Iceland, 2015.
(42) Óskarsson, K. R., and Kristjánsson, M. M., (2019). Improved expression, purification and characterization of VPR, a cold active subtilisin-like serine protease and the effects of calcium on expression and stability. Biochim. Biophys. Acta, Proteins Proteom. 1867, 152–162.
(43) Zaman, Z., and Verwilghen, R. L., (1979). Quantitation of proteins solubilized in sodium dodecyl sulfate-mercaptoethanol-tris electrophoresis buffer. Anal. Biochem. 100, 64–69.
(44) Candiano, G., Bruschi, M., Musante, L., Santucci, L., Ghiggeri, G. M., Carnemolla, B., Orecchia, P., Zardi, L., and Righetti, P. G., (2004). Blue silver: A very sensitive colloidal Coomassie G-250 staining for proteome analysis. Electrophoresis 25, 1327–1333.
(45) Pace, C. N., Vajdos, F., Fee, L., Grimsley, G., and Gray, T., (1995). How to measure and predict the molar absorption coefficient of a protein. Protein Sci. 4, 2411–2423.
(46) Wilson, K., and Walker, J., Principles and techniques of biochemistry and molecular biology, 7th ed.; Cambridge University Press: 2010.
(47) Chiu, M., and Prenner, E., (2011). Differential scanning calorimetry: An invaluable tool for a detailed thermodynamic characterization of macromolecules and their interactions. J. Pharm. Bioallied Sci. 3, 39–59.
(48) Cecconi, C., Shank, E. A., Bustamante, C., and Marqusee, S., (2005). Direct observation of the three-state folding of a single protein molecule. Science 309, 2057–2060.
(49) Neuman, K. C., and Nagy, A., (2008). Single-molecule force spectroscopy: optical tweezers, magnetic tweezers and atomic force microscopy. Nat. Methods 5, 491–505.
(50) Ritchie, D. B., and Woodside, M. T., (2015). Probing the structural dynamics of proteins and nucleic acids with optical tweezers. Curr. Opin. Struct. Biol. 34, 43–51.
