A Dissertation
Entitled
Analysis of the Interactions between the 5' to 3' Exonuclease
and the Single-Stranded DNA-Binding Protein from
Bacteriophage T4 and Related Phages
By Laurence S. Boutemy
Submitted as partial fulfillment of the requirements for
the Doctor of Philosophy in Chemistry
Advisor: Timothy C. Mueser, Ph.D.
College of Graduate Studies
The University of Toledo
August 2008
Copyright © 2008
This document is copyrighted material. Under copyright law, no parts of this document may be reproduced without the expressed permission of the author.
An Abstract of
Analysis of the Interactions between the 5' to 3' Exonuclease
and the Single-Stranded DNA-Binding Protein from
Bacteriophage T4 and Related Phages
Laurence S. Boutemy
Submitted as partial fulfillment of the requirements for
the Doctor of Philosophy in Chemistry
The University of Toledo
August 2008
DNA replication and repair is one of the most important cellular processes, since preserving the integrity of the DNA genome is essential to all forms of life.
Many proteins are involved in the DNA replication process, and their interaction ensures that the DNA is duplicated and repaired in a coordinated and efficient manner. Bacteriophage T4 is a very good model to study DNA replication, since it encodes all the proteins required at the replication fork, proteins which have been extensively characterized. However, how these proteins interact and coordinate the replication process is still largely unknown. One of these
interactions, which appears to govern the rate and efficiency of lagging strand synthesis, occurs between the 5’ to 3’ exonuclease RNase H and the single-stranded DNA-binding 32 protein. The interaction between these two proteins is the focus of this work.
RNase H and the 32 protein, as well as a number of mutants and truncations, were cloned, expressed and purified. These proteins were then used to form different variants of the RNase H + 32 protein complex, which were characterized through biophysical and structural studies. A crystal structure was obtained for the RNase H + 32-B truncation. This structure, along with the results obtained from the biophysical experiments, provides valuable information on how these two proteins interact to coordinate the lagging strand DNA replication.
Finally, the study of the interaction between RNase H and the 32 protein from bacteriophage Rb69, a phage related to bacteriophage T4, was also initiated.
ACKNOWLEDGEMENTS
First of all I would like to thank my advisor, Dr. Mueser, for his help and
guidance throughout these past five years. Thank you so much for teaching me
the ways of scientific research, when the scientific knowledge I had when I
arrived in Toledo was mostly academic. What I learned in the Mueser lab will be,
I am sure, invaluable for the rest of my research career. I also want to thank Dr.
B. Leif Hanson, who was like a second advisor to me, for his help and assistance
on everything from data collection to career advice. Many thanks to Dr. Funk, Dr.
Viola and Dr. Von Grafenstein, my committee members, for their helpful
suggestions, and to Dr. Huang and Dr. Slama as well, who were part of my
committee at some point. I would like to thank Dr. Charlie Jones and Dr. Nancy
Nossal from the National Institutes of Health in Bethesda, MD, for their precious
collaboration on this project. Thank you also to the UT Department of Chemistry and its staff, especially in the Instrumentation Center and the Chemistry
Stockroom.
I am very grateful for all the encouragement and support received from my family and friends back in France while I was in Toledo. Thank you so much for your love and your patience! I also want to thank my labmates, past and present, for their help and friendship, and for all the good times we have had over these five years in the lab. Finally, many thanks go to all the friends I have made in the
Toledo area and who made my stay in Ohio lots of fun and a memory I will forever treasure. I will miss you guys, thank you!
TABLE OF CONTENTS
ACKNOWLEDGEMENTS ...... v
TABLE OF CONTENTS ...... vi
LIST OF TABLES...... xiii
LIST OF FIGURES ...... xvii
LIST OF ABBREVIATIONS ...... xxiv
CHAPTER 1 - Background ...... 1
1.1. Bacteriophage T4 DNA Replication and Repair...... 1
1.1.1. Bacteriophage T4 is a Model for DNA Replication ...... 1
1.1.2. Bacteriophage T4 Single-Stranded DNA-Binding 32 Protein ...... 6
1.1.3. Bacteriophage T4 RNase H ...... 9
1.1.4. Known Interactions between Nucleases and Single-Stranded
DNA-Binding Proteins ...... 15
1.1.5. Project Goals ...... 15
1.2. Escherichia coli DNA-binding Protein from Starved Cells ...... 17
CHAPTER 2 - Methodology...... 20
2.1. Molecular Cloning ...... 20
2.1.1. Polymerase Chain Reaction (PCR)...... 20
2.1.2. Insertion of the PCR Product into an Entry or Expression Vector .... 22
2.1.3. Site-Directed Mutagenesis ...... 24
2.1.4. Transformation into Competent E. coli Cells ...... 27
2.1.5. Agarose Gel Electrophoresis ...... 28
2.1.6. Overview of the Molecular Cloning Process...... 28
2.2. Protein Expression...... 30
2.2.1. Small Scale Expression Studies ...... 30
2.2.2. Large Scale Protein Expression...... 32
2.2.3. SDS-PAGE Gel Electrophoresis ...... 33
2.3. Cell Lysis and Protein Solubility...... 33
2.4. Protein Purification ...... 34
2.5. Protein Preparation ...... 35
2.5.1. Dialysis...... 35
2.5.2. Protein Concentration ...... 36
2.5.3. Solubility Screen ...... 36
2.6. Protein Crystallization ...... 38
2.7. X-Ray Diffraction Data Collection ...... 40
2.7.1. Crystal Cryoprotection and Freezing...... 40
2.7.2. Data Collection...... 42
2.8. Data Processing and Structure Determination...... 43
2.8.1. Data Processing...... 43
2.8.2. Phasing...... 43
2.8.3. Model Building ...... 44
2.8.4. Structure Refinement and Validation ...... 45
2.8.5. Summary of the Data Processing and Model Building Process ...... 46
2.9. Non-Denaturing Gel Electrophoresis ...... 46
2.10. Scattering Studies...... 48
2.10.1. Dynamic Light Scattering (DLS)...... 48
2.10.2. Small-Angle X-Ray Scattering (SAXS)...... 49
2.11. Isothermal Titration Calorimetry (ITC)...... 52
2.12. Fluorescence Anisotropy Titration...... 54
2.13. DNA Purification and Annealing ...... 55
2.13.1. DNA Oligomer Purification ...... 55
2.13.2. DNA Substrate Annealing ...... 56
2.13.3. TBE Gel Electrophoresis...... 56
CHAPTER 3 - Bacteriophage T4 32 Protein and Its Truncations ...... 58
3.1. Introduction ...... 58
3.2. Bacteriophage T4 32 Protein...... 60
3.2.1. Introduction ...... 60
3.2.2. Cell Lysis...... 60
3.2.3. Protein Purification...... 61
3.3. Bacteriophage T4 32 Core Protein...... 62
3.4. Bacteriophage T4 32-A Protein ...... 65
3.5. Bacteriophage T4 32-B Protein ...... 65
3.5.1. Introduction ...... 65
3.5.2. Molecular Cloning ...... 66
3.5.3. Protein Expression...... 78
3.5.4. Cell Lysis...... 80
3.5.5. Protein Purification...... 81
3.5.6. Solubility Screen ...... 88
3.5.7. Dialysis and Concentration ...... 89
3.5.8. Crystal Screening and Optimization...... 89
3.5.9. Data Collection...... 92
3.5.10. Data Processing...... 94
3.5.11. Dynamic Light Scattering ...... 96
3.5.12. Small Angle X-Ray Scattering...... 97
3.6. Bacteriophage T4 32-B Mutants...... 106
3.6.1. Introduction ...... 106
3.6.2. Molecular Cloning ...... 107
3.6.3. Protein Expression and Solubility...... 112
3.6.4. Protein Purification...... 115
3.6.5. Cleaving of the His-Tag...... 118
3.7. Conclusion...... 123
CHAPTER 4 - Bacteriophage T4 RNase H ...... 125
4.1. Introduction ...... 125
4.2. Bacteriophage T4 Native and D132N Mutant RNase H...... 126
4.2.1. Protein Expression...... 126
4.2.2. Cell Lysis...... 127
4.2.3. Protein Purification...... 128
4.2.4. Dialysis and Concentration ...... 129
4.2.5. Scattering Studies...... 133
4.3. Bacteriophage T4 D132N ∆N RNase H...... 137
4.3.1. Protein Expression and Cell Lysis...... 137
4.3.2. Protein Purification...... 139
4.3.4. Solubility Screen ...... 142
4.3.5. Dialysis and Concentration ...... 143
4.4. Conclusion...... 143
CHAPTER 5 - Bacteriophage T4 RNase H + 32 Protein + DNA
Interactions ...... 144
5.1. Introduction ...... 144
5.2. Preliminary Complex Determination...... 144
5.2.1. Protein-Protein Interactions...... 145
5.2.2. Protein-Protein-DNA Interactions...... 147
5.2.3. Summary of the T4 RNase H + 32 Protein Complexes...... 153
5.3. D132N RNase H + 32-B Protein Interaction...... 154
5.3.1. Complex Preparation ...... 154
5.3.2. Structural Studies...... 154
5.3.3. Scattering Studies...... 185
5.3.4. Size Exclusion Chromatography ...... 195
5.3.5. Isothermal Titration Calorimetry ...... 199
5.3.6. Fluorescence Anisotropy Titration...... 202
5.3.7. Protein-Protein-DNA Crystallization ...... 208
5.3.8. 32-B Mutants Studies...... 211
5.3.9. Conclusion ...... 216
5.4. D132N ∆N RNase H + 32-B Protein Interaction ...... 218
5.4.1. Complex Preparation ...... 218
5.4.2. Protein-Protein Crystallization...... 218
5.4.3. Protein-Protein-DNA Complex ...... 223
5.4.4. 32-B Mutants Studies...... 223
5.5. D132N ∆N RNase H + 32 Protein Interaction...... 227
5.5.1. Complex Preparation ...... 227
5.5.2. Protein-Protein Crystallization...... 227
5.5.3. Protein-Protein-DNA Crystallization ...... 228
5.5.4. Fluorescence Anisotropy...... 229
5.6. D132N ∆N RNase H + 32 Core Protein Interaction...... 231
5.6.1. Complex Preparation ...... 231
5.6.2. Protein-Protein Crystallization...... 231
5.6.3. Fluorescence Anisotropy...... 233
5.7. Conclusion...... 234
CHAPTER 6 - Bacteriophage Rb69...... 236
6.1. Introduction ...... 236
6.2. Bacteriophage Rb69 Native RNase H ...... 238
6.2.1. Introduction ...... 238
6.2.2. Initial Cloning and Expression...... 239
6.2.3. Molecular Cloning ...... 244
6.2.4. Protein Expression and Solubility...... 252
6.2.5. Protein Purification...... 256
6.3. Bacteriophage Rb69 D132N RNase H...... 271
6.3.1. Introduction ...... 271
6.3.2. Molecular Cloning ...... 273
6.3.3. Protein Expression and Solubility...... 276
6.3.4. Protein Purification...... 278
6.3.5. Cleaving of the His-Tag...... 283
6.4. Bacteriophage Rb69 32-B and Future Work ...... 285
CHAPTER 7 - Escherichia coli DNA-Binding Protein from Starved
Cells ...... 286
7.1. Introduction ...... 286
7.2. Previous Work ...... 286
7.2.1. Expression and Purification...... 287
7.2.2. Characterization...... 287
7.2.3. X-Ray Diffraction Studies...... 289
7.2.4. Discussion and Future Work ...... 290
7.3. Project Follow-up ...... 290
7.3.1. Further Characterization of (SJT Ape FEN-1) Dps...... 292
7.3.2. X-Ray Diffraction Studies...... 296
7.4. Conclusion...... 303
BIBLIOGRAPHY ...... 305
APPENDICES ...... 310
LIST OF TABLES
Table 2.1 – PCR Reaction Setup...... 22
Table 2.2 – Vector Insertion Reaction Setup ...... 23
Table 2.3 – Site-Directed Mutagenesis PCR Reaction ...... 26
Table 2.4 – Cloning and Expression Hosts...... 27
Table 2.5 – Antibiotics Required by the Vectors / Cell Lines ...... 31
Table 2.6 – Solubility Screen Solutions ...... 37
Table 2.7 – Crystal Screens ...... 39
Table 2.8 – Cryoprotectants ...... 41
Table 3.1 – 32 Protein and Truncations Characteristics ...... 60
Table 3.2 – HPLC Buffers for T4 32 Protein Purification ...... 61
Table 3.3 – T4 32-B PCR Reactions ...... 70
Table 3.4 – T4 32-B Insertion in pET101 Reaction...... 71
Table 3.5 – T4 32-B Insertion in pENTR-D Reaction...... 72
Table 3.6 – T4 32-B Insertion in pDEST-C1 Reaction ...... 73
Table 3.7 – HPLC Buffers for T4 32-B Purification Scheme 1 ...... 81
Table 3.8 – HPLC Buffers for T4 32-B Purification Scheme 2 ...... 85
Table 3.9 – T4 32-B Crystal Screens...... 90
Table 3.10 – 32-B Data Processing Summary ...... 95
Table 3.11 – 32-B Protein Dynamic Light Scattering Results ...... 97
Table 3.12 – 32 Protein and Truncations Characteristics ...... 106
Table 3.13 – Site-Directed Mutagenesis PCR Reactions for the 32-B Mutants 109
Table 3.14 – Lysis and HPLC Buffers for the T4 32-B Mutants Purification ..... 115
Table 3.15 – TEV Protease Reaction Setup...... 119
Table 4.1 – RNase H characteristics ...... 126
Table 4.2 – HPLC buffers for T4 RNase H purification ...... 129
Table 4.3 – D132N RNase H Dynamic Light Scattering Results ...... 134
Table 5.1 – D132N RNase H and 32 Truncations Calculated pIs...... 145
Table 5.2 – RNase H + 32 Protein ± DNA Complexes ...... 153
Table 5.3 – RNase H + 32-B Crystal Screens ...... 155
Table 5.4 – D132N RNase H + 32-B Crystals Cryoprotection ...... 159
Table 5.5 – Crystallographic Data for the D132N RNase H + 32-B Dataset 1.. 162
Table 5.6 – Crystallographic Data for the D132N RNase H + 32-B Dataset 2.. 164
Table 5.7 - D132N RNase H + 32-B Crystal Data Collection and Processing .. 175
Table 5.8 - D132N RNase H + 32-B Molecular Replacement Results...... 176
Table 5.9 – Final Refinement and Validation Summary...... 178
Table 5.10 – Dynamic Light Scattering Results for the D132N
RNase H + 32-B Complex at 4 °C ...... 187
Table 5.11 – D132N and 32-B Molecular Weights...... 195
Table 5.12 – Protein Concentrations Used in the ITC Experiment ...... 200
Table 5.13 – Thermodynamic Parameters of the D132N RNase H + 32-B
Complex Formation ...... 201
Table 5.14 – Summary of the Dissociation Constants for the 32 Truncations +
D132N RNase H + Fork DNA Complex...... 206
Table 5.15 – D132N RNase H + 32-B + DNA Crystal Screens ...... 209
Table 5.16 – Summary of the Dissociation Constants for the 32-B Mutants +
D132N RNase H + fork DNA Complex ...... 215
Table 5.17 – D132N ∆N RNase H + 32-B Crystal Screens ...... 220
Table 5.18 – D132N ∆N RNase H + 32 Protein Crystal Screens...... 227
Table 5.19 – D132N ∆N RNase H + 32 Protein + DNA Crystal Screens ...... 228
Table 5.20 – Summary of the Dissociation Constants from the FA Titrations... 230
Table 5.21 – D132N ∆N RNase H + 32 Core Crystal Screens ...... 231
Table 6.1 – Rb69 RNase H Characteristics...... 239
Table 6.2 – Truncated Rb69 RNase H vs. T4 RNase H Characteristics...... 240
Table 6.3 – Rb69 RNase H PCR Reaction...... 245
Table 6.4 – Rb69 RNase H Insertion in pET101 Reaction ...... 246
Table 6.5 – Rb69 RNase H Insertion in pENTR-D Reaction ...... 247
Table 6.6 – Rb69 RNase H Insertion in pDEST-C1 Reaction...... 249
Table 6.7 – Lysis and HPLC buffers for Rb69 RNase H purification scheme 1 257
Table 6.8 – Lysis and HPLC buffers for Rb69 RNase H purification scheme 2 261
Table 6.9 – Lysis and HPLC buffers for Rb69 RNase H purification scheme 3 266
Table 6.10 – Rb69 D132N RNase H characteristics...... 272
Table 6.11 – Site-Directed Mutagenesis PCR Reaction for D132N RNase H .. 274
Table 6.12 – Lysis and HPLC buffers for Rb69 D132N RNase H purification .. 278
Table 6.13 – TEV Protease Reaction Setup...... 283
Table 7.1 – Dps Characteristics...... 288
Table 7.2 – DLS Results for (SJT Ape FEN-1) Dps...... 294
Table 7.3 – (BKC Ape FEN-1) Dps Data Processing Summary ...... 298
Table 7.4 – (BKC Ape FEN-1) Dps MolRep Molecular Replacement Summary
...... 299
Table 7.5 - (BKC Ape FEN-1) Dps Phaser Molecular Replacement Summary 300
LIST OF FIGURES
Figure 1.1 – Bacteriophage T4 DNA Replication Fork...... 2
Figure 1.2 – Interaction between T4 RNase H and the 32 Protein ...... 5
Figure 1.3 – T4 32 Protein Domains and Truncations ...... 7
Figure 1.4 – T4 32 Protein Core Domain Crystal Structure ...... 8
Figure 1.5 – Exonuclease vs. Endonuclease Activity ...... 10
Figure 1.6 – T4 RNase H Crystal Structure ...... 11
Figure 1.7 – T4 RNase H Active Site...... 12
Figure 1.8 – T4 RNase H Native vs. Metal Free Crystal Structures...... 13
Figure 1.9 – T4 RNase H + Fork DNA Crystal Structure ...... 14
Figure 1.10 – Ribbon structure of Dps...... 18
Figure 2.1 – Primer Design Scheme...... 21
Figure 2.2 – Site-Directed Mutagenesis Scheme ...... 25
Figure 2.3 – Summary of the Molecular Cloning Process...... 29
Figure 2.4 – X-Ray Diffraction Data Analysis Scheme ...... 47
Figure 2.5 – SAXS Data Analysis Scheme...... 52
Figure 2.6 – DNA Substrate Used in the Fluorescence Anisotropy Titrations .... 54
Figure 3.1 – 32 Protein Domains and 32 Truncations ...... 59
Figure 3.2 – T4 32 Protein Purification ...... 63
Figure 3.3 – T4 32-B Protein Amino-Acid Sequence ...... 66
Figure 3.4 – T4 32-B Initial Expression Plasmid Miniprep ...... 67
Figure 3.5 – T4 32-B PCR Primers...... 69
Figure 3.6 – Agarose Gels for T4 32-B Cloning...... 74
Figure 3.7 – T4 32-B Protein Expression and Solubility ...... 77
Figure 3.8 – T4 32-B Protein Expression...... 79
Figure 3.9 – T4 32-B Cell Lysis ...... 80
Figure 3.10 – T4 32-B Purification Scheme 1...... 82
Figure 3.11 – T4 32-B Purification Scheme 2...... 86
Figure 3.12 – T4 32-B Solubility Screen ...... 88
Figure 3.13 – T4 32-B Crystals after Screening...... 91
Figure 3.14 – T4 32-B Crystals after Optimization...... 92
Figure 3.15 – 32-B Crystal X-Ray Diffraction Images ...... 94
Figure 3.16 – 32-B Protein Dynamic Light Scattering Results...... 97
Figure 3.17 – 32-B SAXS Data Collection ...... 99
Figure 3.18 – 32-B GNOM Plots...... 101
Figure 3.19 – 32-B 3D Molecular Envelope...... 103
Figure 3.20 – Modeling of the A Domain of 32 Protein (Chadd) ...... 105
Figure 3.21 – T4 32-B Mutants Site-Directed Mutagenesis Primers...... 107
Figure 3.22 – Agarose Gels for the T4 32-B Mutants Cloning ...... 111
Figure 3.23 – T4 32-B Mutants Expression and Solubility ...... 113
Figure 3.24 – T4 I60D 32-B Purification ...... 116
Figure 3.25 – TEV Protease Cleavage Site...... 118
Figure 3.26 – 32-B Mutants TEV Protease Reactions ...... 121
Figure 3.27 – Cysteine Residues in the 32-B Protein ...... 122
Figure 3.28 – I151D 32-B Cross Linking...... 123
Figure 4.1 – SDS-PAGE of T4 native and D132N RNase H expression ...... 127
Figure 4.2 – SDS-PAGE of T4 D132N RNase H cell lysis...... 128
Figure 4.3 – T4 D132N RNase H purification ...... 130
Figure 4.4 – D132N RNase H Dynamic Light Scattering Results ...... 133
Figure 4.5 – D132N RNase H SAXS Data Collection ...... 134
Figure 4.6 – D132N RNase H GNOM Plots...... 135
Figure 4.7 – D132N RNase H 3D SAXS Molecular Envelopes...... 136
Figure 4.8 – SDS-PAGE of T4 D132N ∆N RNase Expression and Lysis ...... 138
Figure 4.9 – T4 D132N ∆N RNase H Purification ...... 140
Figure 4.10 – T4 D132N ∆N RNase H Solubility Screen Results...... 142
Figure 5.1 – D132N RNase H + 32 Truncations Native Gel ...... 146
Figure 5.2 – D132N ∆N RNase H + 32 Truncations Native Gel...... 147
Figure 5.3 – DNA Substrates...... 148
Figure 5.4 – RNase H + 32 Truncations + DNA Substrates Gel Shift Assays .. 150
Figure 5.5 – Native RNase H + 32-B Initial Crystal Hits...... 156
Figure 5.6 – D132N RNase H + 32-B Crystal Hits after Screening...... 157
Figure 5.7 – D132N RNase H + 32-B Crystals after Optimization ...... 158
Figure 5.8 – D132N RNase H + 32-B Crystal Data Collection 1...... 161
Figure 5.9 – D132N RNase H + 32-B Crystal Data Collection 2...... 163
Figure 5.10 – SDS-PAGE Gel of the D132N RNase H + 32-B Crystals ...... 166
Figure 5.11 – Intact Mass Spectrum of the D132N RNase H + 32-B Crystals.. 167
Figure 5.12 – RNase H + 32-B Crystals MALDI-TOF Results ...... 168
Figure 5.13 - D132N RNase H + 32-B Crystal Used in Data Collection 3...... 174
Figure 5.14 - D132N RNase H + 32-B Crystal Data Collection 3 Images...... 174
Figure 5.15 – D132N RNase H + 32-B Model Building and Refinement ...... 177
Figure 5.16 – Final D132N RNase H + 32-B Model...... 179
Figure 5.17 – Domain Movement Observed upon Binding ...... 181
Figure 5.18 – Electrostatic Surfaces...... 182
Figure 5.19 – Superposition of a Fork DNA Substrate...... 183
Figure 5.20 – 32-B Mutated Residues ...... 185
Figure 5.21 – Dynamic Light Scattering Results for the D132N
RNase H + 32-B Complex at 4 °C ...... 186
Figure 5.22 – D132N RNase H + 32-B SAXS Data Collection ...... 188
Figure 5.23 – D132N RNase H + 32-B SAXS Data Processing (GNOM)...... 189
Figure 5.24 – D132N RNase H + 32-B SAXS 3D Molecular Envelopes...... 190
Figure 5.25 – Best SASREF fit for the D132N RNase H + 32-B calculated
vs. experimental data ...... 192
Figure 5.26 – Best SASREF Model for the D132N RNase H + 32-B Complex. 192
Figure 5.27 – CRYSOL fit for the D132N RNase H + 32-B theoretical vs.
experimental data...... 193
Figure 5.28 – D132N RNase H + 32-B Complex Gel Filtration Assay ...... 197
Figure 5.29 – D132N RNase H + 32-B Complex Gel Filtration SDS-PAGE Gel ...... 198
Figure 5.30 – D132N RNase H + 32-B Isothermal Titration...... 201
Figure 5.31 – Fluorescence Anisotropy Titration Fork DNA Substrate ...... 203
Figure 5.32 – Fluorescence Anisotropy Titration of the 32-B + D132N
RNase H + fork DNA Complex ...... 205
Figure 5.33 – Fluorescence Anisotropy Titrations of the 32 Truncations + D132N
RNase H + Fork DNA Complex...... 207
Figure 5.34 – DNA Substrates Used in the Ternary Complex Screens ...... 208
Figure 5.35 – D132N RNase H + 32-B + Fork DNA Crystals after Screening .. 211
Figure 5.36 – Location of the 32-B Mutated Residues at the Interface between
D132N RNase H and 32-B ...... 212
Figure 5.37 – D132N RNase H + 32-B Mutants Native Gels...... 213
Figure 5.38 – Fluorescence Anisotropy Titration of the 32-B Mutants + D132N
RNase H + fork DNA Complex ...... 214
Figure 5.39 – D132N ∆N RNase H + 32-B Initial Crystals ...... 219
Figure 5.40 – D132N ∆N RNase H + 32-B Crystal Hits after Screening...... 220
Figure 5.41 – D132N ∆N RNase H + 32-B Crystal Optimization ...... 222
Figure 5.42 – D132N ∆N RNase H + 32-B Crystal Hits after Optimization ...... 222
Figure 5.43 – D132N ∆N RNase H + 32-B Mutants Native Gels ...... 224
Figure 5.44 – Fluorescence Anisotropy Titrations of the D132N ∆N RNase H +
32-B Mutants + Fork DNA Complex ...... 226
Figure 5.45 – D132N ∆N RNase H + 32 Protein Crystals after Screening...... 228
Figure 5.46 – Fluorescence Anisotropy Titrations of the 32 Truncations + D132N
∆N RNase H + Fork DNA Complex ...... 230
Figure 5.47 – D132N ∆N RNase H + 32 Core Crystals after Screening ...... 232
Figure 5.48 – D132N ∆N RNase H + 32 Core Crystals after Optimization ...... 232
Figure 6.1 – Bacteriophage Rb69 Genomic Map...... 237
Figure 6.2 – Sequence Alignment of T4 RNase H and Rb69 RNase H...... 238
Figure 6.3 – Sequence Alignment of T4 Native and Rb69 Truncated RNase H ...... 240
Figure 6.4 – pET101 Insert of the Rb69 rnh Gene...... 241
Figure 6.5 – Rb69 RNase H Expression...... 241
Figure 6.6 – Rb69 RNase H Cell Lysis ...... 242
Figure 6.7 – Nucleotide and Amino-Acid C-terminus Sequence Alignment of T4
and Rb69 RNase H ...... 243
Figure 6.8 – Rb69 Full Length RNase H PCR Primers...... 244
Figure 6.9 – Agarose Gels for Rb69 RNase H Cloning ...... 250
Figure 6.10 – Rb69 RNase H (pET101) Expression...... 253
Figure 6.11 – Rb69 RNase H (pDEST-C1) Expression and Cell Lysis...... 255
Figure 6.12 – Rb69 RNase H purification scheme 1...... 258
Figure 6.13 – Rb69 Full Length RNase H purification scheme 2...... 263
Figure 6.14 – Rb69 Full Length RNase H purification scheme 3...... 268
Figure 6.15 – Sequence Alignment of T4 RNase H and Rb69 RNase H...... 272
Figure 6.16 – Site-Directed Mutagenesis PCR primers for Rb69 D132N RNase H
Cloning ...... 273
Figure 6.17 – Agarose Gels for Rb69 D132N RNase H Cloning ...... 275
Figure 6.18 – Rb69 D132N RNase H Expression and Cell Lysis ...... 277
Figure 6.19 – Rb69 D132N RNase H purification...... 280
Figure 6.20 – TEV Protease Reaction Results ...... 284
Figure 7.1 – Amino-Acid Sequence of E. coli Dps...... 288
Figure 7.2 – Crystals of (BKC Tzi FEN-1) Dps and (BKC Ape FEN-1) Dps...... 289
Figure 7.3 – SDS-PAGE Gel of the Truncated Dps Samples ...... 291
Figure 7.4 – DLS Results for (SJT Ape FEN-1) Dps...... 293
Figure 7.5 – MALDI-TOF Mass Spectrometry Results ...... 294
Figure 7.6 – (BKC Ape FEN-1) Dps Crystals...... 296
Figure 7.7 – X-Ray Diffraction Images of the (BKC Ape FEN-1) Dps Crystals . 297
Figure 7.8 – MolRep vs. Phaser Solutions ...... 301
Figure 7.9 – Final Dps Model from the (BKC Ape FEN-1) Dps Crystals...... 302
Figure 7.10 – Final Dps Model from the (BKC Ape FEN-1) Dps Crystals...... 303
LIST OF ABBREVIATIONS
AEBSF...... 4-(2-aminoethyl)-benzenesulfonylfluoride
Afu ...... Archaeoglobus fulgidus
Ape ...... Aeropyrum pernix
APS ...... Advanced Photon Source
Bis-Tris HCl ...... 2,2-Bis(hydroxymethyl)-2,2’,2’’-nitrilotriethanol Hydrochloride
BME...... β-mercaptoethanol
bp ...... base pair
CAPS...... 3-(cyclohexylamino)-1-propanesulfonic acid
CC ...... Correlation Coefficient
CCD...... Charge-Coupled Device
CHES ...... Cyclohexyl-2-aminoethanesulfonic acid
COOT ...... Crystallographic Object-Oriented Toolkit
DLS ...... Dynamic Light Scattering
DNA...... Deoxyribonucleic Acid
dNTP ...... deoxyribonucleotide triphosphate
Dps ...... DNA binding protein from starved cells
dsDNA ...... double-stranded DNA
DTT ...... dithiothreitol
ε...... Extinction Coefficient
E. coli...... Escherichia coli
EDTA...... Ethylenediaminetetraacetic Acid
FA...... Fluorescence Anisotropy
FEN-1 ...... Flap Endonuclease 1
GOI...... Gene of Interest
HEPES ...... 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid
HPLC...... High Performance Liquid Chromatography
HSV-1...... Herpes Simplex Virus type 1
IDT...... Integrated DNA Technologies
IPTG ...... Isopropyl β-D-1-thiogalactopyranoside
ITC...... Isothermal Titration Calorimetry
kb...... kilobase
Kd...... Dissociation Constant
kDa ...... kilo Dalton
KOD...... Thermococcus kodakaraensis
LB ...... Luria-Bertani
LDS ...... Lithium dodecyl sulfate
MES...... 4-morpholineethanesulfonic acid monohydrate
MPD ...... 2-Methyl-2,4-Pentanediol
MS ...... Mass Spectrometry
NIH ...... National Institutes of Health
OB-fold ...... Oligosaccharide/Oligonucleotide-Binding Fold
PCR...... Polymerase Chain Reaction
PEG...... PolyEthylene Glycol
PEI...... Polyethylene Imine
Pfu ...... Pyrococcus furiosus
pI ...... Isoelectric Point
PIPES...... Piperazinebis(ethanesulfonic) acid
Pol III ...... Polymerase III holoenzyme
Rh ...... Hydrodynamic Radius
RNA...... Ribonucleic Acid
RP-A...... Replication Protein A
SAXS...... Small-Angle X-Ray Scattering
SDS-PAGE...... Sodium Dodecylsulfate–Polyacrylamide Gel Electrophoresis
SEC ...... Size Exclusion Chromatography
SOC...... Super Optimal Catabolite Repression Broth
SSB ...... Single-Stranded DNA-Binding Protein
ssDNA ...... single-stranded DNA
TAE ...... Tris-Acetate-EDTA buffer
TAPS ...... N-Tris(hydroxymethyl)methyl-3-aminopropanesulfonic acid
TBE ...... Tris-Borate-EDTA buffer
TCEP...... Tris-2-carboxyethylphosphine
TE...... Tris-EDTA buffer
TEV ...... Tobacco Etch Virus
TOPO ...... Topoisomerase I
Tris HCl ...... Tris(hydroxymethyl)aminomethane Hydrochloride
Tzi...... Thermococcus zilligii
UV ...... Ultra Violet
WT...... Wild Type
CHAPTER 1 - Background
1.1. Bacteriophage T4 DNA Replication and Repair
1.1.1. Bacteriophage T4 is a Model for DNA Replication
Bacteriophage T4 is a virus that infects E. coli bacteria. It follows a
lytic life cycle: after infecting the bacterial cell, it replicates its DNA genome using phage-encoded DNA replication proteins. Once the new DNA has been replicated and new phage particles have been assembled, the particles are released when the host E. coli cell bursts. (Karam, 1994)
DNA replication takes place in the 5’ to 3’ direction. Because of that directionality, the synthesis of the leading strand happens in a continuous fashion, while the synthesis of the lagging strand is discontinuous: Fragments of
DNA, called Okazaki fragments, are synthesized separately and then joined together to form the new lagging DNA strand. (Kornberg, 1992)
A model of the bacteriophage T4 DNA replication fork along with all the proteins involved in the process is presented in Figure 1.1. The T4 DNA replication proteins are named after their gene number, with the exception of
RNase H.
Figure 1.1 – Bacteriophage T4 DNA Replication Fork
The parent DNA strands are shown in black, the newly synthesized daughter strands in red and the primers in green. The direction of DNA replication, from the 5’ end to the 3’ end, is indicated by the arrows. RNase H and the single-stranded DNA-binding 32 proteins are in red, as the work presented here focuses on the interaction between these two proteins.
The DNA duplex is unwound by the 5’ to 3’ DNA helicase (gene 41 protein) in purple. The hexameric helicase is loaded on the DNA by the helicase loading protein (gene 59 protein), in red. The leading strand synthesis only involves a limited number of proteins: The DNA polymerase (gene 43 protein), in gray, synthesizes the new DNA strand in the 5’ to 3’ direction. It also has a 3’ to
5’ exonuclease activity. The processivity of the DNA polymerase is increased by its interaction with the trimeric clamp protein (gene 45 protein) in green, which is itself loaded on the DNA substrate by the clamp loader proteins (gene 44 and 62
proteins) in orange. The single-stranded DNA is protected from nucleases and re-annealing by the single-stranded DNA-binding proteins (gene 32 proteins), in blue. The synthesis of the lagging strand is somewhat more complicated. First of all, as mentioned previously, the lagging strand synthesis is discontinuous and occurs through the synthesis of Okazaki fragments, which then need to be linked together to form a continuous strand of DNA. Moreover, the DNA polymerase cannot initiate new chains and therefore requires RNA primers to start the synthesis of each Okazaki fragment. The short RNA primers (5 nucleotides) are synthesized by the primase (gene 61 protein) in bright yellow.
The DNA polymerase can then proceed from the primer and synthesize the
Okazaki fragment. As on the leading strand, the single-stranded DNA of the lagging strand is protected by the 32 proteins. The RNA primers are removed by the 5’ to 3’ nuclease RNase H, in pink. Finally, after removal of the primers and subsequent repair of the gaps, the nicks between the different Okazaki fragments are sealed by the DNA ligase (gene 30 protein) in light yellow. Another protein, not shown on the diagram, is the Dda helicase. Its role is to remove any other protein that might be blocking DNA replication. The coordination of DNA replication, for both the leading and the lagging strand, depends on the interactions between all of these proteins. (Nossal, 1992; Nossal,
1994)
The bacteriophage T4 replication proteins are similar to the DNA replication proteins of higher organisms (Nossal, 1994), but the T4 system is much simpler, as each activity at the replication fork is performed by a separate
protein. For instance, the E. coli Pol III holoenzyme is a complex made of ten types of subunits that has both polymerase and exonuclease activities, as well as the DNA-binding enhancing activity that is associated with the clamp and clamp loader proteins in T4 (Nossal, 1994). The relative simplicity of the bacteriophage
T4 replication fork, compared to the eukaryotic or prokaryotic DNA replication systems, makes it a very good model to study DNA replication. It is especially useful to study the protein-protein interactions at the replication fork, and how these interactions are involved in coordinating the synthesis of the leading and lagging strands.
One of the important protein-protein interactions at the T4 DNA replication fork is the one between RNase H and the single-stranded DNA-binding protein or
32 protein. The two proteins interact at the RNA primer location on the Okazaki fragment, as shown in Figure 1.2 below. The RNase H / 32 protein complex is especially important, as it appears to be a key player in the regulation of the lagging strand synthesis. Indeed, RNase H alone can only remove a short oligonucleotide (one to four nucleotides) before falling off the replication fork.
However, upon 32 protein binding, its processivity is dramatically increased, and the
32-assisted RNase H can then go through multiple rounds of DNA cleavage, and is able to remove up to 50 nucleotides each time it binds to the DNA duplex.
(Bhagwat et al., 1997)
The two proteins will be introduced separately in the following sections.
Figure 1.2 – Interaction between T4 RNase H and the 32 Protein
The proteins and DNA are color-coded similarly to the ones in Figure 1.1.
(1) The polymerase / clamp complex synthesizes the Okazaki fragment on the left, and displaces the 32 proteins as it moves along the DNA strand.
(2) RNase H binds where the RNA primer is located.
(3) RNase H cleaves the RNA primer along with some of the adjacent DNA, and then falls off the DNA replication fork. The DNA polymerase displaces the remaining 32 proteins and fills the gap left by RNase H.
(4) The DNA ligase comes in and seals the nick between the two Okazaki fragments.
1.1.2. Bacteriophage T4 Single-Stranded DNA-Binding 32 Protein
The single-stranded DNA-binding 32 protein from bacteriophage T4 is the
phage equivalent of SSB in E. coli or RP-A in humans (Nossal, 1994). It was first
isolated by Bruce Alberts in 1970 (Alberts, 1970). It cooperatively binds to the
single-stranded DNA during replication, in order to prevent it from re-annealing
and protect it from nucleases. It is also known as the “helix destabilizing protein”
or “DNA melting protein” for its ability to lower the melting temperature of
double-stranded DNA helices (Waidner et al., 2001). It also appears that the
ssDNA decorated with 32 proteins is not extended but rather forms a compact structure (Chastain, 2003).
The 32 protein plays a very important role at the replication fork, especially in lagging strand DNA synthesis. It stimulates the assembly of the polymerase-clamp complex on the lagging strand (Nossal, 1992). When the helicase is loaded by the helicase-loading protein (59 protein), which is much more efficient than loading without the help of the 59 protein, the 32 protein is required for leading strand synthesis (Jones et al., 2004). As mentioned before, the processivity of RNase H is greatly increased upon binding to the 32 protein (Bhagwat et al., 1997). The 32 protein is therefore central to the DNA replication fork, through its effect on the efficiency of both leading and lagging strand synthesis. These roles can, however, only be fulfilled through interactions with other proteins at the replication fork.
The 32 protein is a metalloprotein that contains one Zn2+ ion per molecule of 32 (Giedroc et al., 1986). That Zn (II) atom has a structural role, which was shown by proteolysis studies (Giedroc et al., 1987) and proton NMR studies (Pan et al., 1989).
The domains of the 32 protein were identified by limited proteolysis experiments (Karpel, 1990). They are described below in Figure 1.3.
Figure 1.3 – T4 32 Protein Domains and Truncations
B Domain (residues 1 to 16) | Core Domain (residues 17 to 253) | A Domain (residues 254 to 301)
• 32 Protein: amino acids 1 to 301
• 32 Core Protein: amino acids 17 to 253
• 32-B (32 minus B) Protein: amino acids 17 to 301
• 32-A (32 minus A) Protein: amino acids 1 to 253
The N-terminal or B (for basic) domain is responsible for the cooperativity of the 32 protein binding to DNA (Giedroc et al., 1991; Casas-Finet et al., 1992).
The C-terminus or A (acidic) domain plays a large role in the interaction of the 32 protein with other proteins at the replication fork, and has for instance been shown to bind to the T4 DNA polymerase (43 protein) (Hurley, 1993). It is not known, however, if the A domain is also involved in the RNase H interaction.
Finally, the core domain is responsible for ssDNA binding, but this DNA-protein interaction is enhanced in the presence of the A domain (Waidner et al., 2001).
Similarly, the C-terminus of the E. coli SSB protein was found to affect the
binding to DNA (Williams, 1983) and this domain was disordered in the crystal
structure, even in the presence of ssDNA (Savvides, 2004).
The crystal structure of the core domain of bacteriophage T4 32 protein
was solved in 1995 by Yousif Shamoo (Shamoo et al., 1995). The model of the
32 core domain, shown below in Figure 1.4, shows that the protein has an overall
OB-fold (oligosaccharide / oligonucleotide-binding fold).
Figure 1.4 – T4 32 Protein Core Domain Crystal Structure
The 32 core domain is colored as Jones’ rainbow, with the N-terminus in blue and the C-terminus in red. The Zn2+ ion is shown in gray. The figure was generated using PyMOL (DeLano and Lam, 2005).
The structure shows three main domains in the 32 core protein.
Subdomain I binds the Zn2+ ion, while subdomain II and the connecting region
form the DNA-binding cleft. This cleft is lined with positively charged residues on one side and hydrophobic residues on the other, allowing the single-stranded
DNA to bind through respective electrostatic interactions with the phosphate
backbone, and hydrophobic interactions with the bases. Moreover, the crystals
used to solve the structure did contain single-stranded DNA, for which only very
weak electron density was observed, indicating that the binding of ssDNA is not
sequence-dependent, and that the ssDNA can slide freely along that cleft
(Shamoo et al., 1995).
1.1.3. Bacteriophage T4 RNase H
Bacteriophage T4 RNase H is a member of the FEN-1 family of replication
and repair nucleases (Liu et al., 2004). It was first isolated in 1990 by Nancy
Nossal (Hollingsworth and Nossal, 1991). RNase H is a 5’ to 3’ nuclease, with
both exonuclease and endonuclease activities. It removes the five nucleotide
long RNA primers and about thirty nucleotides of the adjacent DNA before the
lagging strand Okazaki fragments are completed and ligated. RNase H acts as a
5’ to 3’ exonuclease on DNA/DNA or DNA/RNA duplex, and as an endonuclease
on fork and flap substrates. This endonuclease activity is necessary, in case the
RNA primer from one Okazaki fragment is being displaced by the polymerase/clamp complex while extending the next fragment. It was shown
however that the 5’ to 3’ exonuclease activity is the most prominent at the T4
DNA replication fork (Bhagwat and Nossal, 2001). In other members of the FEN-
1 family, such as the Aeropyrum pernix (Ape) FEN-1, the endonuclease activity is
mostly responsible for the RNA primer processing. A comparison of the two
mechanisms is presented in Figure 1.5.
Figure 1.5 – Exonuclease vs. Endonuclease Activity
(1) Exonuclease activity of RNase H – it removes the RNA primer and adjacent DNA before the polymerase/clamp complex reaches that particular Okazaki fragment.
(2) Endonuclease activity of FEN-1 – the primer is first displaced by the polymerase/clamp complex, and FEN-1 can then act on the flap DNA that is created and remove the RNA primer and adjacent DNA. The cut is made one nucleotide short of the junction between dsDNA and ssDNA.
RNase H is known to interact with two other proteins at the replication
fork, the 32 protein (single-stranded DNA-binding protein), which was described in the previous section, and the 45 protein (clamp protein). The 32 protein increases the processivity of the nuclease: upon 32 binding, RNase H can go through multiple cuts and remove an average of 30 nucleotides from each
Okazaki fragment, while it can only remove a maximum of four nucleotides by itself (Bhagwat et al., 1997). 32 also strongly inhibits the flap endonuclease activity of RNase H, when binding to the single-strand of the flap substrate
(Bhagwat et al., 1997). The processing of nicked or gapped substrates is stimulated by the binding of the 45 clamp protein (Gangisetty et al., 2005). It was shown that the 32 protein interaction occurs through the C-terminus of RNase H, and the interaction with the 45 protein through the N-terminus (Gangisetty et al., 2005).
The crystal structure of T4 RNase H was solved in 1996 by Timothy
Mueser (Mueser et al., 1996). It is shown below in Figure 1.6.
Figure 1.6 – T4 RNase H Crystal Structure
The RNase H protein is colored as Jones’ rainbow, with the N-terminus in blue and the C-terminus in red. The two Mg2+ ions in the active site are shown in gray. The figure was generated using PyMOL (DeLano and Lam, 2005).
The structure was solved in the presence of magnesium, which is required for the nuclease activity of the protein. RNase H is composed of a small subdomain and a large subdomain containing the N- and C-termini. The active site is located in the groove in between. The bridge region (residues 89 to 97) connecting the two subdomains is disordered in the crystal structure, and so are the first eleven residues of the N-terminus. A closer look at the active site is given in Figure 1.7.
Figure 1.7 – T4 RNase H Active Site
[Labeled residues: K199, D155, D157, Q22, D19, D132, D71, E130; metal ions: Mg1, Mg2]
The magnesium ions are shown in black and the water molecules directly coordinated to the metal in yellow. The residues shown in orange are negatively charged (Asp and Glu residues), the one in green is neutral (Gln) and the one in blue positively charged (Lys). They are all involved in Mg2+ binding to some extent.
The two magnesium ions have an octahedral coordination sphere. The
first one, Mg1, is directly coordinated to the D132 residue, and five water molecules. Mg2 has an inner sphere coordination of only water molecules. These water molecules are bound through a network of hydrogen bonds made with the
carboxyl groups of aspartate and glutamate residues clustered in the active site.
Many of these residues had previously been identified as important for catalysis since they are conserved throughout the FEN-1 family of nucleases, and their role has been deciphered by site-directed mutagenesis studies (Bhagwat et al., 1997). Other residues such as Q22 and K199 orient the carboxyl groups properly for the hydrogen bond interactions. Mg1 is important for the catalytic nuclease reaction, while Mg2 is a structural ion and plays a smaller role in catalysis (Mueser et al., 1996).
A number of aspartate mutants were designed as inactive versions of RNase H that could be used for protein-DNA studies (Bhagwat et al., 1997), including the D132N mutant. The crystal structure of the D132N RNase H without magnesium bound has been solved (Tomanicek) and is presented below in Figure 1.8, along with the native RNase H crystal structure.
Figure 1.8 – T4 RNase H Native vs. Metal Free Crystal Structures
The native RNase H is shown in gray and the metal-free D132N mutant in blue. The figure was generated using PyMOL (DeLano and Lam, 2005).
The two structures are very similar, but the N-terminus and the bridge region are ordered in the metal-free structure. This is an interesting result, as it is usually expected that metalloproteins are more ordered in the presence of metal ions, which is clearly not the case for RNase H. Other experimental evidence further confirmed this result (Mueser, personal communication). This might be an indication that the magnesium ions are not bound to the protein when it is inactive, but rather come along with the DNA substrate and activate the protein upon substrate binding. This hypothesis is still under ongoing investigation.
Yet another crystal structure of T4 RNase H has been solved, this time complexed with a fork DNA substrate (Devos et al., 2007). This is shown in
Figure 1.9.
Figure 1.9 – T4 RNase H + Fork DNA Crystal Structure
The figure was generated using PyMOL (DeLano and Lam, 2005)
1.1.4. Known Interactions between Nucleases and Single-Stranded DNA-Binding Proteins
Interactions between single-stranded DNA binding proteins and nucleases have been characterized in a number of different organisms.
The E. coli DNA replication system is one of the most studied, and the tetrameric E. coli SSB has been shown to interact with several E. coli nucleases. SSB was found to stimulate the activity of Polymerase II and the 3’ to 5’ Exonuclease I (Molineux and Gefter, 1975), and to bind to either one even in the absence of DNA. The Exonuclease I – SSB interaction was further characterized later on: SSB also enhances the dRpase (DNA deoxyribophosphodiesterase) activity of the protein (Sandigursky, 1993), and the two proteins interact through the SSB C-terminus (Genschel et al., 2000). E. coli SSB also interacts with and enhances the activity of the 5’ to 3’ RecJ exonuclease (Han et al., 2006).
In HSV-1 (Herpes Simplex Virus type 1), the 5’ to 3’ exonuclease alkaline nuclease, required for homologous recombination, performs strand exchange in association with the single-stranded DNA-binding protein ICP8 (Reuven et al.,
2003).
1.1.5. Project Goals
One of the focuses of the Mueser lab is to study structure-specific recognition of fork and flap DNA substrates at the DNA replication fork.
Bacteriophage T4 is a very good model system for that purpose, as it encodes all the proteins needed for DNA replication as separate entities, therefore making the system easier to study. The structures of most of these proteins have already been solved, so the next step is to look at protein-protein and protein-DNA complexes, in order to shed more light on how the different proteins involved at the replication fork come together to run and coordinate the DNA replication process.
One of the main interactions that occur at the replication fork is the one between the 5’ to 3’ RNase H and the single-stranded DNA-binding 32 protein.
This interaction appears to be one of the key players in keeping the lagging strand replication organized and efficient. As it was shown, the structures of
RNase H and the DNA-binding 32 core domain have been solved. The way
RNase H binds to its fork DNA substrate is also known, thanks to the RNase H +
DNA co-crystal structure. With this information in hand, the objective is to further characterize the interaction between RNase H and the 32 protein via structural and biophysical studies. The results from these studies, together with the RNase
H + DNA interaction studies already available, should provide critical information on the organization of the lagging strand replication of bacteriophage T4. And as we have seen in the previous section, the interactions between single-stranded
DNA-binding proteins and nucleases are ubiquitous in various life forms, making the information that will be obtained from the T4 system applicable to higher organisms’ DNA replication systems.
The work done on the RNase H – 32 protein interaction is described in
Chapters 3 through 6 of this dissertation.
1.2. Escherichia coli DNA-binding Protein from Starved Cells
Bacteria are constantly facing various changes in their environment, and
conditions such as low pH, oxidative stress or nutrient limitations can impair their
chances of survival. Therefore, they have developed a number of mechanisms to
enable them to survive under these stressful conditions. For instance, when E. coli enters the stationary phase, it starts expressing a non-specific DNA-binding protein, Dps (DNA-binding protein from starved cells), also called PexB (Grant et
al., 1998). Dps has been shown to protect DNA from oxidative damage (Martinez
and Kolter, 1997; Ilari et al., 2002), and induce compaction of the genomic DNA
during the transition from the exponential to the stationary growth phase, which
would also provide protection to the DNA (Azam and Ishihama, 1999).
The crystal structure of Dps was solved by Grant and coworkers (1998).
The Dps monomer folds as a four-helix bundle, which then associates as a
dodecamer to form a sphere-like structure measuring 90 Å in diameter, with a 45
Å hollow core. The ribbon structures of the monomer and dodecamer are shown
in Figure 1.10.
Dps appears to be a structural analogue of ferritin, an iron storage protein,
which also forms a four-helix bundle that associates as a 24-mer into a hollow
sphere. The ferritin 24-mer is 120 Å in diameter, the central core 80 Å in diameter
and can contain up to 4000 iron atoms. It has a ferroxidase site that allows it to
oxidize Fe2+ ions into Fe3+ ions that can then be incorporated into the ferrihydrite
core. Because of the structural similarities, it has been proposed that Dps
protects DNA from oxidative damage by storing Fe2+ ions, therefore preventing
them from generating hydroxyl free radicals through the Fenton reaction that
could then create single- and double-strand breaks. It has since been shown that
Dps also contains a ferroxidase center and can store up to 400 iron atoms inside
its hollow core (Ilari et al., 2002). This provides an original way of protecting DNA
from free radicals by preventing their formation, as opposed to oxidative repair
proteins such as the catalase or the superoxide dismutase that remove active
oxygen species after they appear.
Figure 1.10 – Ribbon structure of Dps
a b
a – Dps monomer, the sodium ion is shown in black, the N-terminus is in blue and the C-terminus in red. b – Dodecameric structure of Dps (12 monomers). PDB: 1dps (Grant et al., 1998) The figures were generated using PyMOL (DeLano and Lam, 2005)
The crystal structure does not provide any insight into the mechanism of
DNA binding by Dps, as the surface of the sphere is negatively charged.
However, the lysine-rich N-terminus of the protein was disordered in the crystals and is not present in the model. Once modeled in, it can be seen that within the two-dimensional hexagonal crystal lattice, each N-terminus along with two other neighboring N-termini are brought together and line the solvent channels of the crystal, which are then positively charged with nine lysine residues and can accommodate a DNA helix. As a result, the overall structure of Dps bound to
DNA would be several hexagonal sheet-like structures, with DNA threaded through the solvent channels (Grant et al., 1998). Indeed, N-terminal deletion studies showed that the N-terminus plays an important role in self-aggregation and DNA binding activities, but not in preventing oxidative damage (Ceci et al.,
2004). Finally, Azam and coworkers showed that Dps binds DNA ranging from 40 to 64 base pairs with a dissociation constant Kd of 172 - 178 nM (Azam and
Ishihama, 1999).
The work done on the E. coli Dps protein is described in Chapter 7 of this
dissertation.
CHAPTER 2 - Methodology
2.1. Molecular Cloning
2.1.1. Polymerase Chain Reaction (PCR)
The primers for the PCR reaction are designed according to the expression vector that is to be used later on. For the different proteins that were cloned for this project, the pET101 (Invitrogen) and the pDEST-C1 (Horanyi et al., 2006) expression vectors were used. The PCR product can be inserted directly in the pET101 vector, but in the case of pDEST-C1 it has to be inserted in the pENTR-D (Invitrogen) entry vector first. Maps of all these vectors are available in Figure 2.2. Both pET101 and pENTR-D vectors use directional TOPO-assisted cloning, and therefore require a CACC overhang on the 5’-end of the gene that is to be inserted. This CACC overhang has to be added to the forward primer. Another consideration is that pDEST-C1 inserts an N-terminal His-Tag. To allow cleavage of that tag after purification, it is recommended to insert a TEV protease cleavage site between the CACC overhang and the gene-coding sequence. A schematic overview of how to design the forward and reverse primers is provided in Figure 2.1. The annealing part of the primers should be extended in such a way that the melting temperature is around 55 °C.
Figure 2.1 – Primer Design Scheme
Forward Primer – pDEST-C1 insertion
GOI     5’- ATG NNN… -3’
primer  5’- C ACC GAG AAC CTC TAC TTC CAA GGA ATG NNN… -3’
Forward Primer – pET101 insertion
GOI     5’- ATG NNN… -3’
primer  5’- C ACC ATG NNN… -3’
        5’- C ACC ATG CTG NNN… -3’
Reverse Primer (Inverse Complement)
GOI     5’- …XXX TAA -3’
primer  3’- …NNN ATT -5’
        5’- TTA NNN… -3’
The CACC overhang is colored in red, the TEV protease cleavage site in blue. The ATG start codon is highlighted in green and the TAA (can also be TGA) stop codon in red.
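The primer design rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names are hypothetical, the melting temperature is estimated with the simple Wallace rule (2 °C per A/T, 4 °C per G/C) rather than with whatever calculator was actually used, and the demonstration gene is made up.

```python
# Illustrative sketch only: helper names and the Tm estimate are assumptions,
# not the procedure actually used to design the primers in this work.

TOPO_OVERHANG = "CACC"               # required for directional TOPO cloning
TEV_SITE = "GAGAACCTCTACTTCCAAGGA"   # GAG AAC CTC TAC TTC CAA GGA (Figure 2.1)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse (inverse) complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def rough_tm(seq):
    """Wallace-rule estimate in °C: 2 per A/T plus 4 per G/C (assumption)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def annealing_part(template, target_tm=55, min_len=12):
    """Shortest 5' stretch of the template whose estimated Tm reaches target_tm."""
    for length in range(min_len, len(template) + 1):
        if rough_tm(template[:length]) >= target_tm:
            return template[:length]
    return template  # sequence too short to reach the target; use all of it


def design_primers(gene, vector="pET101", target_tm=55):
    """Return (forward, reverse) primers, both written 5' to 3'."""
    gene = gene.upper()
    if vector == "pET101":
        prefix = TOPO_OVERHANG
    elif vector == "pDEST-C1":
        prefix = TOPO_OVERHANG + TEV_SITE  # TEV site between CACC and the gene
    else:
        raise ValueError("unknown vector: " + vector)
    forward = prefix + annealing_part(gene, target_tm)
    # The reverse primer anneals to the 3' end of the gene, stop codon included.
    reverse = annealing_part(reverse_complement(gene), target_tm)
    return forward, reverse


if __name__ == "__main__":
    # Made-up gene, for demonstration only.
    demo_gene = "ATG" + "GCTAAAGGTCCAGAAGTTCTGCATCGT" * 2 + "TAA"
    for vec in ("pET101", "pDEST-C1"):
        fwd, rev = design_primers(demo_gene, vec)
        print(vec, fwd, rev)
```

The Wallace rule is only a rough guide for short oligonucleotides, so a primer produced this way would still need to be checked with the melting temperature calculation of the supplier or the polymerase manual.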
The primers are ordered from Integrated DNA Technologies (IDT). Upon reception, they are redissolved in 1X TE buffer (20 mM Tris HCl pH 8.0 + 1 mM EDTA) and a final primer solution at 10 µM is prepared. The PCR reaction is then set up as described in Table 2.1. The polymerase usually used in this step is the ProofStart DNA Polymerase (Qiagen), but in some cases, such as a longer gene that requires improved fidelity, the PfuUltra DNA Polymerase (Stratagene) can be used as well. The annealing temperature is calculated as the melting temperature + 2 °C.
Table 2.1 – PCR Reaction Setup
PCR Reaction
Polymerase Buffer        1X
Primer solution          2 µM
dNTPs                    10 mM
MgSO4                    2.5 mM
Polymerase (1 U/µL)      1 µL
Template                 1 µL
Autoclaved water         Final vol. 50 µL
PCR Program
Activation               95 °C
Denaturation             95 °C
Annealing                Annealing temp.
Extension                68 to 70 °C
N cycles
Final Extension          Extension temp., 5 to 20 minutes
The concentrations indicated are final concentration in the PCR reaction. Autoclaved MQ water is added so that the final volume of the reaction is 50 µL. The temperatures and times used in the PCR program depend on the polymerase used and the length of the gene being amplified.
After the PCR reaction is done, the PCR product is run on a 1 % agarose gel (see Section 2.1.5), along with the appropriate DNA ladder (100 bp or 1 kb linear DNA ladder at this stage). If the product has the correct size, it is then gel purified using the MinElute kit (Qiagen).
2.1.2. Insertion of the PCR Product into an Entry or Expression Vector
After the PCR product has been purified, it is inserted into the vector of choice using the Gateway® technology. The entry (pENTR-D) and expression
(pET101 and pDEST-C1) vectors used for this particular project are presented in
Appendix 1.
The PCR product is first inserted in the vector using directional TOPO-assisted cloning. The Topoisomerase I is already covalently linked to the commercially available vector. In the case of pDEST-C1, the gene is then transferred from the entry vector to the expression vector by a recombination reaction, using the LR Clonase enzyme mix.
The way the different reactions are set up is described in the following tables. The reactions are then incubated at room temperature for 30 minutes to 2 hours. After the LR Clonase reaction is performed, the LR Clonase enzymes have to be digested. This is done by adding 1 µL of Proteinase K to the reaction and incubating at 37 °C for 10 minutes.
Table 2.2 – Vector Insertion Reaction Setup
pENTR-D Insertion Reaction
Purified PCR product 2 µL
Salt Solution 1 µL
pENTR-D vector 1 µL
Autoclaved water 2 µL
pET101 Insertion Reaction
Purified PCR product 2 µL
Salt Solution 1 µL
pET101 vector 1 µL
Autoclaved water 2 µL
LR Clonase Reaction
Gel Purified GOI in pENTR-D 1 µL
pDEST-C1 1 µL
LR Clonase Mix 2 µL
TE buffer (1X) 6 µL
The 1X TE buffer is made of 20 mM Tris HCl pH 8.0 + 1 mM EDTA
After the insertion reaction is done, the plasmid is transformed into competent
E. coli cells (see Section 2.1.4) and amplified before being run on an agarose gel
to check the success of the reaction.
2.1.3. Site-Directed Mutagenesis
The site-directed mutagenesis protocol described here is similar to the
QuikChange® (Stratagene) protocol.
Contrary to regular gene amplification by PCR, both primers here are centered on the codon that is to be mutated. The mutation is introduced by modifying the nucleotides of interest directly in the primers. These primers will therefore be able to anneal completely with the gene, except at the site of the mutated nucleotides. A scheme of the site-directed mutagenesis process is given in Figure 2.2. The modified primers are annealed on the template plasmid, and the mutated plasmid is amplified by PCR reaction. Finally, the original template is recognized as methylated DNA and digested using an appropriate restriction enzyme.
Figure 2.2 – Site-Directed Mutagenesis Scheme
[Diagram labels: Template plasmid; Annealing of the mutant primers and DNA replication; Mutated and template plasmids; Mutated plasmid]
The gene of interest that needs to be mutated is shown in red, the mutated gene in green and the rest of the vector in black. (1) The mutated primers are annealed on the template plasmid. The mutated nucleotides do not anneal with the template gene. (2) The PCR amplification reaction is performed, after which both the initial plasmid and the mutated one are present in solution. (3) The template plasmid, methylated as it was isolated from bacteria, is digested using the DpnI restriction enzyme. Only the mutated plasmid is left in solution, and it can now be transformed into competent cells.
The primers are designed so that the annealing temperature (calculated according to the QuikChange manual) is close to 75 °C. They are also ordered from Integrated DNA Technologies, and dissolved in TE buffer upon reception.
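As a rough illustration of this design step, the sketch below builds a pair of complementary mutagenic primers centered on one codon and estimates their melting temperature. It assumes the formula commonly quoted from the QuikChange manual, Tm = 81.5 + 0.41(%GC) - 675/N - %mismatch, which should be checked against the manual itself. The function names, the flank length and the example sequence are hypothetical.

```python
# Illustrative sketch only. The Tm formula is the one commonly quoted from the
# QuikChange manual (an assumption to verify); all names here are hypothetical.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def quikchange_tm(primer, n_mismatches):
    """Tm = 81.5 + 0.41(%GC) - 675/N - %mismatch, with N the primer length."""
    primer = primer.upper()
    n = len(primer)
    gc_percent = 100.0 * (primer.count("G") + primer.count("C")) / n
    mismatch_percent = 100.0 * n_mismatches / n
    return 81.5 + 0.41 * gc_percent - 675.0 / n - mismatch_percent


def mutagenic_primers(gene, codon_index, new_codon, flank=15):
    """Build forward/reverse primers centered on the codon to mutate.

    codon_index is 0-based (0 is the start codon); flank is the number of
    unchanged nucleotides kept on each side of the mutated codon.
    """
    gene = gene.upper()
    new_codon = new_codon.upper()
    start = 3 * codon_index
    if start - flank < 0 or start + 3 + flank > len(gene):
        raise ValueError("codon too close to the end of the sequence")
    old_codon = gene[start:start + 3]
    forward = gene[start - flank:start] + new_codon + gene[start + 3:start + 3 + flank]
    mismatches = sum(a != b for a, b in zip(old_codon, new_codon))
    return forward, reverse_complement(forward), mismatches


if __name__ == "__main__":
    # Made-up sequence: an Asp codon (GAT) changed to an Asn codon (AAT).
    demo_gene = "ATG" + "GAT" * 20
    fwd, rev, mm = mutagenic_primers(demo_gene, 10, "AAT")
    print(fwd, rev, round(quikchange_tm(fwd, mm), 1))
```

In practice the flank length would be increased or decreased until the estimated Tm approaches the target value given above.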
The polymerase used in the site-directed mutagenesis PCR reaction is the high fidelity Hot Start KOD DNA Polymerase (Novagen), since it is more efficient in replicating large pieces of DNA like plasmids than the ProofStart DNA
Polymerase used previously. The PCR reaction is set up as described in Table
2.3.
Table 2.3 – Site-Directed Mutagenesis PCR Reaction
PCR Reaction
KOD buffer (10X)             5 µL
Forward primer (2.5 µM)      6 µL
Reverse primer (2.5 µM)      6 µL
dNTPs (2 mM)                 5 µL
MgSO4 (25 mM)                5 µL
KOD Polymerase (1 U/µL)      1 µL
Template                     1 µL
Autoclaved water             21 µL
PCR Program
Activation                   95 °C, 2 minutes
Denaturation                 95 °C, 20 seconds
Annealing                    60 °C, 10 seconds
Extension                    70 °C, 2 minutes
20 cycles
Final extension              70 °C, 20 minutes
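The volumes in Table 2.3 can be checked with a short calculation: they should add up to a 50 µL reaction, and each final concentration follows from stock concentration × volume / total volume. The sketch below is only a bookkeeping aid; the dictionary layout and function name are illustrative assumptions, while the numbers are copied from the table.

```python
# Stock concentrations and volumes copied from Table 2.3; the data layout and
# the function name are illustrative assumptions.
REACTION = {
    "KOD buffer":     {"stock": 10.0, "unit": "X",    "volume_ul": 5.0},
    "Forward primer": {"stock": 2.5,  "unit": "µM",   "volume_ul": 6.0},
    "Reverse primer": {"stock": 2.5,  "unit": "µM",   "volume_ul": 6.0},
    "dNTPs":          {"stock": 2.0,  "unit": "mM",   "volume_ul": 5.0},
    "MgSO4":          {"stock": 25.0, "unit": "mM",   "volume_ul": 5.0},
    "KOD Polymerase": {"stock": 1.0,  "unit": "U/µL", "volume_ul": 1.0},
}
# Components without a stated stock concentration only contribute volume.
OTHER_VOLUMES_UL = {"Template": 1.0, "Autoclaved water": 21.0}


def final_concentrations(reaction, other_volumes):
    """Return (total volume in µL, {component: final concentration})."""
    total = sum(c["volume_ul"] for c in reaction.values())
    total += sum(other_volumes.values())
    finals = {
        name: c["stock"] * c["volume_ul"] / total  # C_final = C_stock * V / V_total
        for name, c in reaction.items()
    }
    return total, finals


if __name__ == "__main__":
    total, finals = final_concentrations(REACTION, OTHER_VOLUMES_UL)
    print("Total volume: %.0f µL" % total)
    for name, value in finals.items():
        print("%-15s %.3g %s" % (name, value, REACTION[name]["unit"]))
```

With the values of Table 2.3 this gives a total of 50 µL, 1X buffer, 0.3 µM of each primer, 0.2 mM dNTPs and 2.5 mM MgSO4.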
After the reaction, the PCR product is incubated with the DpnI restriction
enzyme for one hour at 37 °C. The mutated plasmid is then ready for
transformation.
2.1.4. Transformation into Competent E. coli Cells
There are two types of cell lines: the cloning host, used to amplify plasmids, and the expression host, for protein production. Each plasmid has to be transformed first into a competent cloning host. Once it has been assessed that the gene is inserted correctly in the vector, the expression plasmid can then be transformed into a competent expression host. The different cell lines used in this project are described in Table 2.4.
Table 2.4 – Cloning and Expression Hosts
Cloning Hosts        Expression Hosts
TOP10                BL21 (DE3) Star
XL10                 BL21 (DE3) Gold
DH5α                 BL21 (DE3) pLysS
Omnimax              BL21 (DE3) RILP
                     RosettaBlue™ (DE3)
                     T7 Express lacIq
The plasmid of interest (2 µL for cloning hosts, 0.5 to 0.8 µL for expression hosts) is added to 25 to 50 µL of competent cells. The mixture is incubated on ice for 30 minutes, heat-shocked at 42 °C for 30 seconds and incubated on ice again for a few minutes. SOC medium (250 µL) is then added to the cells, which are incubated at 37 °C for one hour. Finally, the cells can be plated on LB Agar (35 g/L) + antibiotics plates, or grown directly in liquid LB medium (25 g/L) + antibiotics. A list of the antibiotics required by the vectors and/or cell lines is given in Table 2.5. The plates or cell cultures are incubated at 37 °C until growth is achieved.
When the cells have been plated and colonies obtained, some of these colonies are picked and grown overnight in LB + antibiotics media. The cells are then harvested and the plasmid isolated using the MiniPrep kit from Qiagen. The plasmid can then be run on an agarose gel to check for insertion of the gene.
2.1.5. Agarose Gel Electrophoresis
The 1 % agarose gel is prepared by boiling Agarose MB (Midsci) in 20 mM
Tris-Acetate pH 8.0, 1 mM EDTA (TAE) buffer, which is then poured in the gel box, and the comb is placed on top of the gel. It is left on the bench at room temperature to cool down, then at 4°C overnight. The samples are prepared by mixing 2 µL of the DNA with 1 µL of Gel Loading Solution (0.25 % Bromophenol
Blue + 40 % glycerol). Once the samples are pipetted inside the wells, the gel is run at 90 V for 90 to 120 minutes. It is then stained in 0.01 % SYBR Gold nucleic acid gel stain (Invitrogen) for 30 minutes, and visualized using a UV transilluminator and a 515 nm emission filter.
2.1.6. Overview of the Molecular Cloning Process
A schematic summary of the molecular cloning process is given below in
Figure 2.3.
Figure 2.3 – Summary of the Molecular Cloning Process
Gene Amplification by PCR (pDEST-C1 route): Insertion in pENTR-D; Transformation in Cloning Host; LR Clonase Reaction, Insertion in pDEST-C1; Transformation in Cloning Host; Transformation in Expression Host; Small Scale Expression Studies; Gene Sequencing.
Gene Amplification by PCR (pET101 route): Insertion in pET101; Transformation in Cloning Host; Transformation in Expression Host; Small Scale Expression Studies; Gene Sequencing.
Site-Directed Mutagenesis PCR Reaction: Transformation in Cloning Host; Transformation in Expression Host; Small Scale Expression Studies; Gene Sequencing.
Legend: asterisks (*) in the diagram mark the steps checked on an Agarose Gel or an SDS-PAGE Gel.
2.2. Protein Expression
2.2.1. Small Scale Expression Studies
Once an expression plasmid has been obtained and successfully
transformed into an expression host, small scale expression studies are carried
out to confirm that the protein is indeed expressed.
After transformation, 100 µL of the transformed cells are added to 10 mL
of Luria Broth culture media at 25 g/L and 1 mM of the required antibiotics (see
Table 2.5), and grown overnight at 37 °C and 180 rpm in a New Brunswick
Scientific Innova 4000 Incubator Shaker. The next morning, 500 µL of the overnight cell culture are taken and used to inoculate a fresh solution of 10 mL of
LB at 25 g/L and 1 mM antibiotics. After the optical density at 600 nm OD600 reaches 0.4 to 0.6, 1 mL of the culture is taken, mixed with 1 mL of 50 % glycerol, and flash frozen on dry ice. This glycerol stock of cells is stored at -80°C and can be used later on to inoculate cell cultures for this particular protein preparation. At OD600 = 0.6, 1 mM of IPTG is added to induce protein expression.
A 1 mL sample for SDS-PAGE gel electrophoresis is also taken out to check that
no leaky expression of the protein of interest is occurring before IPTG induction
(0 hour sample). After three hours, another 1 mL SDS-PAGE sample is taken to
check for protein expression (3 hour sample).
Table 2.5 – Antibiotics Required by the Vectors / Cell Lines
Vector                   Antibiotic
pET 21a                  Ampicillin
pENTR-D                  Kanamycin
pET101                   Ampicillin
pDEST-C1                 Streptomycin
32 proteins plasmids     Ampicillin

Cell Line                Antibiotic
TOP10                    Streptomycin
XL10                     /
DH5α                     /
Omnimax                  Tetracycline
BL21 (DE3) Star          /
BL21 (DE3) Gold          /
BL21 (DE3) pLysS         Chloramphenicol
BL21 (DE3) RILP          Chloramphenicol
RosettaBlue™ (DE3)       Chloramphenicol and Tetracycline
T7 Express lacIq         Tetracycline

The antibiotics required by each vector are shown in the first table, and the ones required by each cell line in the second table. TOP10, XL10, DH5α and Omnimax are cloning hosts, the other cell lines are expression hosts.

Antibiotic               Stock Concentration     Final Concentration in Cell Culture
Ampicillin               25 mg/mL                25 µg/mL
Kanamycin                30 mg/mL                30 µg/mL
Streptomycin             50 mg/mL                200 µg/mL
Tetracycline             5 mg/mL                 5 µg/mL
Chloramphenicol          34 mg/mL                17 - 34 µg/mL
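The stock and final concentrations of Table 2.5 translate directly into the volume of stock to add to a culture. The following sketch is a convenience calculation only: the function name is hypothetical, the concentrations are copied from the table, and the upper bound of the chloramphenicol range is used as an arbitrary choice.

```python
# Concentrations copied from Table 2.5 as (stock in mg/mL, final in µg/mL).
# For chloramphenicol the upper bound of the 17 - 34 µg/mL range is used,
# which is an arbitrary choice made for this illustration.
ANTIBIOTICS = {
    "Ampicillin":      (25.0, 25.0),
    "Kanamycin":       (30.0, 30.0),
    "Streptomycin":    (50.0, 200.0),
    "Tetracycline":    (5.0, 5.0),
    "Chloramphenicol": (34.0, 34.0),
}


def stock_volume_ul(antibiotic, culture_ml):
    """Volume of stock (µL) needed to reach the final concentration.

    A stock at x mg/mL contains x µg per µL, so:
    volume (µL) = final (µg/mL) * culture volume (mL) / stock (µg/µL).
    """
    stock_mg_per_ml, final_ug_per_ml = ANTIBIOTICS[antibiotic]
    return final_ug_per_ml * culture_ml / stock_mg_per_ml


if __name__ == "__main__":
    for name in ANTIBIOTICS:
        print("%-16s %6.1f µL per 10 mL culture" % (name, stock_volume_ul(name, 10)))
```

For most of the antibiotics listed this is a 1:1000 dilution of the stock (10 µL per 10 mL of culture), while streptomycin at the listed concentrations corresponds to a 1:250 dilution.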
Once the SDS-PAGE gel electrophoresis confirms that the protein is expressed, large scale protein expression can be done, as described in the following section. On the other hand, if no protein expression can be seen, it might be necessary to transform the expression plasmid into a different expression host, or clone the gene in a different expression vector. If expression has been confirmed, the gene is also sent for sequencing to the Plant-Microbe Genomics Facility at Ohio State University.
2.2.2. Large Scale Protein Expression
After confirmation that the protein of interest is expressed in the chosen expression host, large scale expression can take place. Indeed, protein crystallization trials require large amounts of protein and large scale preparations are necessary.
Three Erlenmeyer flasks containing 100 mL of 25 g/L LB medium and 1 mM of the antibiotics needed are inoculated with a small amount of frozen cells taken from a glycerol stock stored at -80 °C. The cells are then grown at 37 °C and a shaking speed of 180 rpm. The next day, each of six flasks containing 1 L of 25 g/L LB medium and 1 mM antibiotics is inoculated with 50 mL of the overnight cell culture. The 6 L of cell culture are then left to grow at 37 °C and 180 rpm until the optical density at 600 nm (OD600) reaches around 0.6. At that
point, protein expression is induced by adding 1 mM IPTG or 1 mM Nalidixic
Acid, depending on the protein. Also, 1 mL of the cell culture can be taken to
make a glycerol stock of the expression host. After IPTG induction, the cells are
left in the shaker for another three hours. Like in the case of small scale
expression studies, 0 hour and 3 hour samples are taken to check on
SDS-PAGE that the protein is expressed. The cells are then harvested by
centrifugation at 5,000 rcf for 15 minutes using a Beckman Coulter™ TJ-25
33 centrifuge, after which the supernatant is discarded. Finally, the cell pellet is stored at -20 °C.
2.2.3. SDS-PAGE Gel Electrophoresis
The protein expression samples mentioned previously are first centrifuged in a bench-top centrifuge (5417C from Eppendorf) at 10,000 rcf for a few minutes. The supernatant is discarded, and the pellet is resuspended in 50 µL of BugBuster™ Protein Extraction Reagent (Novagen). Another 50 µL of 2X NuPage™ LDS Sample Buffer (Invitrogen) are then added. If samples are to be prepared from a protein solution, the 2X NuPage™ LDS Sample Buffer is added to the protein solution in a 1:1 volume ratio. The samples are boiled for 10 minutes and centrifuged at 10,000 rcf for a few minutes before being loaded on a NuPage™ 4-12 % Bis-Tris gel. The gel is run in 1X NuPage™ MES SDS Running Buffer, in an XCell SureLock™ gel box, at 200 V for 35 minutes.
Once the gel has finished running, it is stained overnight in Coomassie® G250 stain (Bio-Rad), then destained in a 30 % methanol + 10 % acetic acid destaining solution. Finally, the gel is dried in a 30 % methanol + 5 % glycerol drying solution.
2.3. Cell Lysis and Protein Solubility
The cell pellet stored at -20 °C is thawed in lysis buffer. The lysis buffer composition depends on the protein, and 10 mL are used for every gram of cells. Small amounts of lyophilized hen egg white lysozyme (Sigma) and AEBSF serine protease inhibitor (USB) are then added to the solution, which is stirred at room temperature for 20 minutes. The cells are kept on ice and lysed by sonication with a Branson™ Sonifier 250. The soluble portion, or lysate, is then separated from the insoluble cell debris pellet by centrifugation at 4 °C and 18,000 rpm for 30 minutes. Samples of the lysate and pellet are run on an SDS-PAGE gel to check the solubility of the protein of interest.
If the protein is insoluble and found in the pellet, solubility studies can be done. The salt content of the lysis buffer can be varied from no salt to 1 M salt, or a different pH or buffer can be used. BugBuster™ can also be used to extract the protein. The protein can be expressed at a lower temperature (17 °C) to slow down the rate of protein folding. Yet another option is to use a different expression host.
2.4. Protein Purification (Huber, 2000)
Protein purification was done using a BioLogic DuoFlow™ HPLC system (Bio-Rad) controlled by the BioLogic DuoFlow™ software version 5.0. The HPLC system is placed inside a 4 °C cabinet. The purification process is monitored by following the absorbance at 260 nm and 280 nm, as well as the conductivity. The different columns used for this work and the chemistry of each resin are presented in Appendix 2. The purification protocols vary from one protein to another. The isoelectric point (pI) is first calculated using the ExPASy tool (Gasteiger et al., 2003). If the pI is acidic, anion-exchange chromatography is used, and if the pI is basic, cation-exchange chromatography is used. Size exclusion chromatography is used as a polishing step to further clean a sample that has already been purified. Metal affinity, hydrophobic interaction and hydroxyapatite chromatography are used in special cases.
Two buffers are needed for most purification steps: a low salt buffer (buffer A) and a high salt buffer used for elution (buffer B). The resin has to be washed and equilibrated in the corresponding buffer A before the protein sample can be loaded. If the protein sample is a lysate, it also has to be filtered before loading. After the protein has been loaded, impurities are washed away with buffer A. The elution run is then started by slowly increasing the proportion of buffer B. Fractions are collected and the progress of the run is monitored by UV absorbance. The fractions suspected to contain the protein are then run on an SDS-PAGE gel. Once it is known which fractions contain the protein of interest, these fractions are pooled to be run on the next column, or dialyzed and concentrated if the protein is pure. Any leftover impurity on the resin is washed away with buffer B or a cleaning buffer, and the resin is then stored in 20 % ethanol until the next use.
2.5. Protein Preparation
2.5.1. Dialysis
The protein is dialyzed in Slide-A-Lyzer® dialysis cassettes (Pierce) that can accommodate up to 12 mL of protein solution. When larger volumes have to be dialyzed, SnakeSkin® Pleated Dialysis Tubing (Pierce) is used. Two pore sizes are available for the dialysis membrane, with a molecular weight cutoff (MWCO) of 3,500 or 10,000. The protein is left to dialyze, at slow stirring speed, at 4 °C or room temperature for a few hours, and the dialysis buffer is changed twice.
2.5.2. Protein Concentration
Small volumes of protein are concentrated in Microcon™ concentrators (Millipore), with a molecular weight cutoff of 3,000 Da (YM-3, yellow) or 10,000 Da (YM-10, green). For larger volumes, Amicon™ Ultra-4 or Ultra-15 (4 mL or 15 mL) centrifugal filter devices (Millipore) are used. The protein sample is spun at 3,000 rpm until the desired concentration is reached.
Protein concentration is calculated by measuring the absorbance at 280 nm with an Agilent 8453 UV-Visible Spectrophotometer (Agilent Technologies).
The protein sample may have to be diluted with buffer in order to get an absorbance reading between 0.1 and 1, which corresponds to the linear range of measurement. The concentration is calculated by multiplying the absorbance measured by the dilution factor and then dividing by the extinction coefficient of the protein, which was calculated with the ExPASy ProtParam tool (Gill and von
Hippel, 1989; Gasteiger et al., 2003).
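The concentration calculation described above can be sketched in a few lines. This is an illustration only: the function names, the 1 cm path length default, and the example extinction coefficient and molecular weight are assumptions, not values from this work.

```python
def concentration_molar(a280, dilution_factor, epsilon_per_m_cm, path_cm=1.0):
    """Protein concentration (mol/L) from a diluted A280 reading.

    epsilon_per_m_cm is the molar extinction coefficient in M^-1 cm^-1.
    The 1 cm path length default is an assumption.
    """
    if not 0.1 <= a280 <= 1.0:
        raise ValueError("A280 outside the linear range (0.1 to 1); adjust the dilution")
    return a280 * dilution_factor / (epsilon_per_m_cm * path_cm)


def concentration_mg_per_ml(a280, dilution_factor, epsilon_per_m_cm, mw_da, path_cm=1.0):
    """Same measurement expressed in mg/mL (mol/L x g/mol = g/L = mg/mL)."""
    return concentration_molar(a280, dilution_factor, epsilon_per_m_cm, path_cm) * mw_da


# Hypothetical protein: epsilon = 40,000 M^-1 cm^-1, MW = 35,000 Da,
# measured at A280 = 0.5 after a 10-fold dilution.
molar = concentration_molar(0.5, 10, 40000.0)
mg_ml = concentration_mg_per_ml(0.5, 10, 40000.0, 35000.0)
print(f"{molar * 1e6:.0f} µM, {mg_ml:.2f} mg/mL")
```

The range check mirrors the requirement that the reading fall between 0.1 and 1 before the dilution factor is applied.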
Finally, the protein sample is filtered with an Ultrafree®-MC centrifugal unit,
0.45 µm pore size (Millipore).
2.5.3. Solubility Screen
The solubility screen is used to find an optimum buffer condition, in order to enhance the protein solubility and therefore increase the chances of obtaining diffraction quality crystals (Collins et al., 2004; Izaac et al., 2006). The screen tests six chloride salts, six sodium salts and four buffers, as shown in Table 2.6, at either 100 mM or 1 M concentration.
Table 2.6 – Solubility Screen Solutions
Cations    Anions           Buffers
NH4Cl      Na Formate       Na MES pH 5.6
NaCl       Na Acetate       Na PIPES pH 6.5
KCl        Na Cacodylate    Na HEPES pH 7.5
LiCl       Na Sulfate       Na TAPS pH 8.5
MgCl2      Na Phosphate
CaCl2      Na Citrate
The protein is first precipitated by dialysis against deionized water or by addition of polyethylene glycol (PEG). The precipitated solution is aliquoted into 16 Eppendorf tubes and spun down at 20,000 rcf for 5 minutes. The supernatants are removed and kept as a control. Next, 25 µL of each salt or buffer listed in Table 2.6 is added to the tubes, the precipitate is resuspended and the solutions are incubated at room temperature for 20 minutes, before being centrifuged again. The protein concentration in the supernatant, corresponding to the amount of protein that went back into solution, is measured using the Bio-Rad protein assay, based on the Bradford assay (Bradford, 1976). For this assay, 995 µL of the 1X Bio-Rad protein assay reagent are mixed with 5 µL of each supernatant, and the absorbance at 595 nm is measured with the UV-visible spectrophotometer.
The salt and buffer conditions corresponding to the highest solubility values are then chosen as the optimum buffer conditions.
2.6. Protein Crystallization
Preliminary crystal screening is done using the Honeybee 963 crystallization robot from Genomic Solutions at the Ohio Crystallography Consortium, located in the Instrumentation Center. Corning™ trays (Hampton Research) are used when screening only one protein, and three-well Greiner™ trays (Hampton Research) can be used with up to three different protein samples per tray. These trays use the sitting drop vapor diffusion technique. A list of the screens available in the lab is shown in Table 2.7. Each screen contains 96 different crystallization conditions. The Honeybee robot transfers 100 µL of each condition into the wells of the Corning™ or Greiner™ tray, then transfers 0.5, 1 or 2 µL, depending on the program that is used, into the sitting drop depression. The same volume of protein is dispensed into each sitting drop depression. Finally, the tray is manually sealed with clear tape, and stored at room temperature in a cabinet or at 4 °C in a cold room. The results can be recorded manually with a Nikon SMZ1500 microscope, with pictures of the crystals taken with a Nikon CoolPix™ 990 digital camera, or automatically using the DCA Rhombix Imager (Kendro), also located in the Ohio Crystallography Consortium. Typically, results are recorded after one day, three days and one week, and then every week after that if some wells are still clear. To make sure the growing crystals are protein crystals and not salt crystals, 0.5 µL of Izit™ Crystal Dye is added to the drop and left to react for 30 minutes. The blue Izit™ dye fits only in the solvent channels of protein crystals and turns them blue, while the solvent channels in salt crystals are too small and the salt crystals remain colorless.
Table 2.7 – Crystal Screens
Screen                    Type                                                                   Origin
Crystal Screen I / II ™   Sparse Matrix                                                          Hampton Research
Wizard I / II ™           Random Sparse Matrix                                                   Emerald BioSystems
Cryo I / II ™             Sparse Matrix with Cryoprotectants                                     Emerald BioSystems
Index ™                   Combination of Grid Screen, Sparse Matrix, and Incomplete Factorial    Hampton Research
Natrix ™                  Sparse Matrix for Nucleic Acids                                        Hampton Research
Salt Rx ™                 High Salt Sparse Matrix                                                Hampton Research
MembFac ™                 Sparse Matrix for Membrane Proteins                                    Hampton Research
PEG Ion Screen            PEG vs. Salt Matrix                                                    In-House
Additive Screen           PEG vs. Additive Matrix                                                In-House
Once crystal hits are obtained from the crystal screening, these hits have to be optimized to grow diffraction quality crystals, which are very rarely obtained directly from the crystal screens. Twenty-four-well expansion trays are set up in VDX™ (Hampton Research) or Nextal™ (Qiagen) plates, this time using the hanging drop vapor diffusion technique. For each crystal hit condition, parameters such as the amount of precipitating agent, the amount of salt or the pH can be varied, and this is done using the A/B gradient technique (Senger and Mueser, 2005). Only two solutions have to be prepared, solution A corresponding to the lower end of the gradient and solution B corresponding to the higher end. Pipetting maps are available for 4x6, 2x12 and 1x24 setups. The A and B crystallization solutions are dispensed into each well according to the pipetting map with an EDP 10 mL programmable pipette (Rainin). The tray is then placed on a stirring device for a few minutes. It should be noted that in the case of VDX™ trays, the edges of the wells also have to be greased. Next, 1 µL or 2 µL of the protein is pipetted onto a 22 mm siliconized glass cover slide for the VDX™ tray, or onto the screw-in crystallization support provided with the Nextal™ tray, and 1 or 2 µL of the well solution is added to the protein drop. Finally, the glass cover slides are sealed upside down on top of the wells using tweezers, and the Nextal™ crystallization supports are simply screwed onto each well. Sitting drop vapor diffusion can also be used. In this case, the protein drop is placed in a polypropylene Micro-Bridge® (Hampton Research), which is then placed in the well of a VDX™ tray containing the crystallization solution. The wells are covered with clear tape. The advantage of the sitting drop method is that it allows for a larger protein drop, which might be needed to obtain large enough crystals. The trays are stored at either room temperature or 4 °C, and results are recorded in the same way as for the crystal screens after a few days.
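The idea of building a gradient from only two solutions can be illustrated with a short calculation. The sketch below is not the published pipetting map of Senger and Mueser (2005): it simply assumes a linear interpolation between solutions A and B across a row of wells, and the function name, the 1000 µL well volume and the PEG percentages are hypothetical.

```python
def ab_gradient(n_wells, well_volume_ul, conc_a, conc_b):
    """Volumes of A and B per well for a linear gradient (assumed, not the published map).

    Returns a list of (volume_A_ul, volume_B_ul, resulting_concentration) tuples,
    going from pure A in the first well to pure B in the last well.
    """
    wells = []
    for i in range(n_wells):
        fraction_b = i / (n_wells - 1) if n_wells > 1 else 0.0
        volume_b = well_volume_ul * fraction_b
        volume_a = well_volume_ul - volume_b
        concentration = conc_a + (conc_b - conc_a) * fraction_b
        wells.append((volume_a, volume_b, concentration))
    return wells


# Hypothetical row of 6 wells, 1000 µL each, from 10 % to 20 % precipitant.
for well, (vol_a, vol_b, conc) in enumerate(ab_gradient(6, 1000.0, 10.0, 20.0), start=1):
    print(f"well {well}: {vol_a:.0f} µL A + {vol_b:.0f} µL B -> {conc:.1f} %")
```

Because every well receives the same total volume, only the A:B ratio changes along the row, which is why two stock solutions are sufficient.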
2.7. X-Ray Diffraction Data Collection
2.7.1. Crystal Cryoprotection and Freezing
Once crystals are obtained, they have to be cryoprotected and flash frozen prior to exposure to X-Rays for data collection (Rodgers, 1994).
A list of cryoprotectants is available in Table 2.8. Crystals grown in polyethylene glycol or other polyalcohols are cryoprotected with organic cryoprotectants, while crystals grown in high salt conditions are typically cryoprotected with high salt cryoprotectants. The crystals are removed from the drop using a nylon CryoLoop™ (Hampton Research), and rinsed for a few minutes in the substitute mother liquor, which is composed of the crystallization condition as well as the dialysis condition of the protein before it was crystallized. The crystals are then placed in a drop containing the substitute mother liquor and the amount of cryoprotectant required to get an amorphous glass upon freezing (see Table 2.8), and quickly picked up from the drop and plunged into liquid nitrogen.
Table 2.8 – Cryoprotectants
Organic Cryoprotectants                              High Salt Cryoprotectants
Name                               Concentration     Name                Concentration
PEG 400                            25 – 35 % (v/v)   Sodium Chloride     3.0 M (5.0 M)
Ethylene Glycol                    11 – 30 % (v/v)   Sodium Nitrate      5.0 M (7.0 M)
2-Methyl-2,4-Pentanediol (MPD)     20 – 30 % (v/v)   Sodium Malonate     2.0 M (3.5 M)
Glycerol                           13 – 30 % (v/v)   Sodium Formate      4.0 M (7.0 M)
Glucose                            25 % (w/v)        Lithium Sulfate     2.0 M (2.5 M)
Xylitol                            22 % (w/v)        Lithium Nitrate     4.0 M (8.0 M)
                                                     Lithium Chloride    2.5 M (10.0 M)
The list of organic cryoprotectants is available in (Rodgers, 1994). The first molarity given for the high salt cryoprotectants is the minimum molarity required to obtain an amorphous glass freeze; the molarity in parentheses is the maximum solubility. It should be noted that NaCl and NaNO3 gave lower quality freezes than the other salts.
It is possible that the crystal cannot tolerate being kept in the cryoprotection solution, even for a few seconds, and will start degrading, cracking or dissolving. In that case, another cryoprotectant should be used. Another option is to increase the amount of cryoprotectant incrementally, from none to the concentration needed, and soak the crystal at each step for a few minutes.
Also, an alternative to flash freezing the crystal by plunging it into liquid nitrogen is to flash freeze the crystal directly in the nitrogen stream at 100 K while mounting the crystal on the diffractometer goniometer. The crystal can also be flash frozen in a helium stream between 15 and 20 K. The helium freeze is thought to reduce the lattice disruption that occurs while freezing.
2.7.2. Data Collection
X-Ray diffraction data collection can be done on the in-house FR-E Rigaku Diffractometer at the Ohio Crystallography Consortium, in the Instrumentation Center. However, when a more intense X-Ray source is needed, the in-house diffractometer is only used to screen crystals, and data are collected at the Advanced Photon Source (APS) synchrotron, at Argonne National Laboratory. Two beamlines were used for the present work, the Bio-CARS 14-BM-C beamline and the Ser-CAT 22-ID-D beamline.
2.8. Data Processing and Structure Determination
2.8.1. Data Processing
Once the dataset has been collected, it has to be processed and a
reflection file written before phasing can be done. Processing consists of several steps: indexing of the data to a particular space group, integration of all the
reflections and finally data reduction and merging of the equivalent reflections,
and scaling of the different resolution bins. The resolution is usually cut at the
scaling stage so that the Rmerge value is below 15 %.
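$$R_{merge} = 100 \times \left[ \sum_{hkl} \sum_{i=1}^{n} \left| F^{2}_{hkl,i} - \langle F^{2}_{hkl} \rangle \right| \right] \Big/ \sum_{hkl} \sum_{i=1}^{n} F^{2}_{hkl,i}$$ Equation 2.1
where $F^{2}_{hkl,i}$ is the intensity of the i-th measurement of each hkl reflection and $\langle F^{2}_{hkl} \rangle$ is the mean value of the measurements of the n equivalent reflections.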
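As an illustration of how the Rmerge statistic behaves, the sketch below evaluates it on a toy set of repeated intensity measurements. The function name `r_merge` and the example intensities are hypothetical; in practice this value is reported by the data processing programs rather than computed by hand.

```python
def r_merge(observations):
    """Rmerge (in %) from repeated intensity measurements.

    observations maps each (h, k, l) index to the list of intensities
    measured for that reflection and its symmetry equivalents.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_intensity = sum(intensities) / len(intensities)
        # Sum of absolute deviations from the mean for this reflection
        numerator += sum(abs(value - mean_intensity) for value in intensities)
        denominator += sum(intensities)
    return 100.0 * numerator / denominator


# Toy data set (made-up intensities)
toy_data = {
    (1, 0, 0): [100.0, 110.0, 90.0],
    (0, 2, 0): [50.0, 50.0],
}
print(f"Rmerge = {r_merge(toy_data):.1f} %")
```

Reflections whose equivalent measurements agree perfectly contribute nothing to the numerator, so Rmerge grows only with the scatter among equivalents.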
For this particular project, HKL2000 (Otwinowski and Minor, 1997), d*TREK
(Pflugrath, 1999) or MOSFLM (Leslie, 1992) were used for data processing. In
the case of MOSFLM processing, the scaling step is done using the program
SCALA in the CCP4 program suite (Bailey, 1994).
2.8.2. Phasing
For this particular project, molecular replacement phasing can be used since the structures of the separate RNase H and 32 core domain proteins have already been solved. All the molecular replacement programs used here, AMoRe, MolRep and Phaser (McCoy, 2007), are part of the CCP4 program suite (Bailey, 1994). They all carry out rotation and translation searches as well as rigid body refinement. AMoRe and MolRep output an R value and a correlation coefficient (CC) in order to assess the quality of the solution. CC is defined as:
$$CC = \langle \Delta F_{obs} \, \Delta F_{calc} \rangle \Big/ \left[ \langle (\Delta F_{obs})^{2} \rangle \langle (\Delta F_{calc})^{2} \rangle \right]^{1/2}$$ Equation 2.2
where the angle brackets denote averages over the reflections. The R value should be low and the correlation coefficient high. Phaser, a maximum-likelihood phasing program, outputs log-likelihood gains (LLG) as well as translation (TFZ) and rotation (RFZ) Z scores. For an acceptable solution, the TFZ value should be higher than 8 and the RFZ value higher than 5. All programs output a coordinate file with the solution after an initial rigid body refinement.
2.8.3. Model Building
The coordinate file from molecular replacement and the reflection file obtained after the initial rigid body refinement are fed into the model building program COOT (Emsley, 2004). The electron density is calculated from the reflection file as follows:
$$\rho(x,y,z) = \frac{1}{V} \sum_{hkl} F_{hkl} \cos\left[ 2\pi (hx + ky + lz - \phi_{hkl}) \right]$$ Equation 2.3
In COOT, the model can be adjusted to fit the electron density map better using the different building and refinement tools available.
2.8.4. Structure Refinement and Validation
After a round of model building, the coordinate file from COOT is refined
against the reflection file using the refinement program REFMAC (Bailey, 1994;
Murshudov, 1997), in the restrained refinement mode. In this mode, the bonds
and angles from the model are refined against the REFMAC dictionary of allowed
bond lengths, angles and atomic sizes. How much freedom is allowed compared
to the dictionary is input in the weight term: the higher the weight, the less
REFMAC will try to follow the dictionary and the more it will try to fit atoms in the
electron density. If the weight term is too high and the electron density map
quality is low, bonds can be broken, resulting in a model that does not make chemical sense.
REFMAC outputs R and Rfree values after the refinement process is completed. The R value is the error between the model and the data at that stage, and the Rfree value is the same error calculated on the fraction of the data (usually 5 %) that was left out of the refinement process. The Rfree value should be around 5 % higher than the R value.
The refined model is then fed into COOT once again and adjusted, after which another cycle of refinement is done. This process should be repeated until an R value of around 20 % is reached. For low resolution models, a slightly higher R value is acceptable. The water molecules are added last, using the ARP/wARP tool in REFMAC.
The final model is then validated using PROCHECK (Laskowski, 1993) in
the CCP4 program suite.
2.8.5. Summary of the Data Processing and Model Building Process
A summary of all the steps needed from data collection to model building
is given in Figure 2.4.
2.9. Non-Denaturing Gel Electrophoresis
Non-denaturing or native gel electrophoresis is used to study the interactions between proteins in their native state. The pH of the running buffer has to be chosen according to the pI of the proteins that are to be run. Indeed, if the pH of the solution is too close to the pI, the protein carries little net charge and will not move out of the well. For these studies, a pH of 6.5 was chosen for the running buffer.
A 0.6 % agarose gel is prepared by boiling Agarose MB (Midsci) in 40 mM Bis-Tris Acetate pH 6.5, 1 mM EDTA (TAE) buffer. The agarose is poured onto a GelBond® film (Amersham Biosciences), and the comb is placed in the middle of the gel. The agarose is left to solidify first at room temperature, then at 4 °C overnight. The protein samples are usually prepared at 50 µM or 100 µM. The protein is mixed with the TAE running buffer so that the sample contains 5 µL at the chosen concentration. Then, 1 µL of Gel Loading Solution and 4 µL of 50 % glycerol are added. The gel is run at 50 V for 2 to 5 hours. At the end of the run, it is transferred to a staining box, and is first fixed with a 0.05 % SDS solution for 30 minutes to denature the proteins, before being stained with SYPRO orange (staining solution: 20 µL of SYPRO orange protein gel stain (Invitrogen) in a 7.5 % acetic acid + 10 % methanol solution). The gel is visualized under UV light.
Figure 2.4 – X-Ray Diffraction Data Analysis Scheme
Data Collection: program available at the beamline
Auto-Indexing / Integration: HKL2000, d*TREK, MOSFLM
Data Reduction and Merging: HKL2000, d*TREK, SCALA
Molecular Replacement Phasing: MolRep, AMoRe, PHASER (maximum likelihood)
Initial Rigid Body Refinement: molecular replacement programs, REFMAC
Model Building: COOT
Refinement: REFMAC
Structure Validation: PROCHECK
2.10. Scattering Studies
2.10.1. Dynamic Light Scattering (DLS)
DLS requires samples at low concentration, with a minimal amount of
aggregates floating in solution. Therefore, the DLS samples are prepared around
1 mg/mL, filtered using a 0.1 µm pore size Ultrafree®-MC filter from Millipore, and
finally centrifuged for 20 minutes at 18,000 rcf. The measurements were taken using the DynaPro-Titan DLS instrument from Wyatt Technology in Dr. Viola’s
lab, and analyzed using the Dynamics version 6.7.3 software. The DLS cuvette
has to be extremely clean, with water background counts below 20. The protein sample is then injected in the cuvette and the counts recorded. The DLS instrument has a temperature control, allowing the user to make measurements
at 4 °C as well. The final graph is reported in terms of % mass versus
hydrodynamic radius Rh. The hydrodynamic radius is calculated using the
Stokes-Einstein equation:
$$R_h = \frac{kT}{6 \pi \eta D}$$ Equation 2.4
where k is the Boltzmann constant, T the temperature in K, η the solvent
viscosity and D the diffusion constant. More background information on light
scattering is available in (Tanford).
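The Stokes-Einstein relation can be evaluated directly once the diffusion constant is known. The following is a minimal sketch in SI units; the function name and the example values (a diffusion constant of 1.0 x 10^-10 m²/s in water at 20 °C) are illustrative assumptions and not measurements from this work.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant k


def hydrodynamic_radius_nm(diffusion_m2_per_s, temperature_k, viscosity_pa_s):
    """Hydrodynamic radius Rh (nm) from the Stokes-Einstein equation."""
    radius_m = BOLTZMANN_J_PER_K * temperature_k / (
        6.0 * math.pi * viscosity_pa_s * diffusion_m2_per_s
    )
    return radius_m * 1e9  # convert m to nm


# Hypothetical example: D = 1.0e-10 m^2/s, water at 20 °C (viscosity about 1.002e-3 Pa s)
rh_example = hydrodynamic_radius_nm(1.0e-10, 293.15, 1.002e-3)
print(f"Rh = {rh_example:.2f} nm")
```

Since Rh is inversely proportional to D, a species that diffuses twice as slowly is reported with twice the hydrodynamic radius, which is how aggregates appear as separate peaks at larger radii.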
The program also calculates the molecular weight corresponding to each
peak. The polydispersity is a good estimate of the homogeneity of the protein
sample: a sample that has a polydispersity of 15 % or lower is considered
homogeneous.
2.10.2. Small-Angle X-Ray Scattering (SAXS) (Koch et al., 2003)
SAXS data can only be collected on a homogeneous protein sample, so usually DLS experiments are performed first to check the status of the protein in solution. A typical sample for SAXS has a concentration of 3 to 5 mg/mL.
The SAXS data collection in this study was carried out at the Advanced Photon Source at Argonne National Laboratory, near Chicago, at the ChemMat-CARS 15-ID beamline.
The program used for data collection is 15-ID SAXS/WAXS v.3.294. A number of images have to be collected for the buffer, used as a blank, and the protein, for different exposure times. The best exposure time is then chosen and the corresponding images averaged. The data is plotted as the intensity versus the momentum transfer q, described by the following equation:
$$q = \frac{4 \pi \sin\theta}{\lambda}$$ Equation 2.5
The momentum transfer q is related to the resolution:
$$\text{Resolution (Å)} = \frac{2\pi}{q}$$ Equation 2.6
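The two relations above can be combined in a short calculation. In this sketch the function names are hypothetical, and θ is assumed to be half of the scattering angle, which is the usual convention but is not stated explicitly in the text; the wavelength of 1.0 Å is likewise only an example.

```python
import math


def momentum_transfer(theta_rad, wavelength_angstrom):
    """Momentum transfer q (1/Å); theta is assumed to be half the scattering angle."""
    return 4.0 * math.pi * math.sin(theta_rad) / wavelength_angstrom


def resolution_angstrom(q):
    """Real-space resolution (Å) corresponding to a momentum transfer q (1/Å)."""
    return 2.0 * math.pi / q


# Example: choose theta so that sin(theta) = 0.05 at a wavelength of 1.0 Å
theta_example = math.asin(0.05)
q_example = momentum_transfer(theta_example, 1.0)
d_example = resolution_angstrom(q_example)
print(f"q = {q_example:.4f} 1/Å, resolution = {d_example:.1f} Å")
```

Substituting one equation into the other gives a resolution of λ / (2 sin θ), so low q values correspond to large real-space distances.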
Once the data has been collected, it is processed using a number of programs that belong to the ATSAS suite of SAXS programs (Petoukhov, 2007).
First, a regularization and reduction program such as GNOM or PRIMUS is used to evaluate the size and shape of the particles in solution. PRIMUS only considers low q data, while GNOM takes into account higher resolution data as well. These programs use the momentum transfer plot to calculate a size distribution function p(r), which is then used to calculate the extrapolated intensity at q = 0, I(0), and the radius of gyration of the particle in solution, Rg. All these terms are defined as follows:
$$p(r) = \frac{1}{2\pi^{2}} \int_{0}^{\infty} q r I(q) \sin(qr) \, dq$$ Equation 2.5
$$I(0) = 4\pi \int_{0}^{D_{max}} p(r) \, dr$$ Equation 2.6
$$R_g^{2} = \frac{\int r^{2} p(r) \, dr}{2 \int p(r) \, dr}$$ Equation 2.7
The overall shape of the size distribution plot also gives an estimate of the overall shape of the particle: a Gaussian function corresponds to a spherical particle, while a function that tails off at higher radius corresponds to an elongated particle (Koch et al., 2003).
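The radius of gyration integral can be checked numerically on a case with a known answer. The sketch below is an illustration and not the GNOM algorithm: it uses the analytical p(r) of a uniform sphere (up to a scale factor) and a simple trapezoid rule, and all function names and the 60 Å diameter are hypothetical.

```python
import math


def trapezoid(y_values, x_values):
    """Trapezoid-rule integral of y over x."""
    return sum(
        0.5 * (y_values[i] + y_values[i + 1]) * (x_values[i + 1] - x_values[i])
        for i in range(len(x_values) - 1)
    )


def sphere_pr(r, d_max):
    """Unnormalised p(r) of a uniform sphere of diameter d_max."""
    x = r / d_max
    return x * x * (2.0 - 3.0 * x + x ** 3)


def radius_of_gyration(r_values, p_values):
    """Rg from p(r): Rg^2 = integral(r^2 p) / (2 integral(p))."""
    numerator = trapezoid([r * r * p for r, p in zip(r_values, p_values)], r_values)
    denominator = 2.0 * trapezoid(p_values, r_values)
    return math.sqrt(numerator / denominator)


d_max = 60.0  # hypothetical sphere diameter in Å
r_grid = [d_max * i / 2000 for i in range(2001)]
p_grid = [sphere_pr(r, d_max) for r in r_grid]
rg = radius_of_gyration(r_grid, p_grid)
expected = math.sqrt(3.0 / 5.0) * d_max / 2.0  # analytical Rg of a sphere
print(f"Rg = {rg:.2f} Å (analytical value {expected:.2f} Å)")
```

Because Rg is a ratio of two integrals of p(r), it does not depend on how p(r) is scaled, whereas I(0) does.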
The reduced data can then be used in different ways:
A 3D molecular envelope can be calculated with ab initio programs such as DAMMIN or GASBOR. DAMMIN only uses low resolution data, so the resolution has to be cut off while running GNOM. DAMMIN can be run in several modes, the best one being the “keep” mode, as it outputs all the possible models.
These models can then be averaged with the program DAMAVER.
If a crystal structure is available and only part of the protein of interest is unknown, the missing domain can be built and added to the known structure with the programs CHADD or CREDO.
When trying to dock two proteins with known structures, a rigid body modeling program called SASREF is available. It fits the atomic models together so that the final shape of the complex can account for the observed scattering data.
Finally, CRYSOL outputs calculated scattering data from a coordinate file.
This is a good way of comparing scattering data, obtained from a solution based
experiment, to X-Ray diffraction data, obtained from a crystal.
Other programs are available in the ATSAS suite of programs but were not used in this particular study.
A summary of the different steps of data processing and analysis is presented in Figure 2.5.
Figure 2.5 – SAXS Data Analysis Scheme
Data Collection: 15-ID SAXS / WAXS program
Data Reduction and Regularization: PRIMUS (low q range, Guinier plot), GNOM (entire q range)
Ab Initio Modeling: DAMMIN (low q range, low resolution), GASBOR (entire q range, higher resolution)
Building of Missing Fragments: CREDO (chain of dummy atoms), CHADD (same as CREDO, starts at the terminal residue)
Model Averaging: DAMAVER
Rigid Body Modeling: SASREF
Computation of Scattering Data from an Atomic Model: CRYSOL
2.11. Isothermal Titration Calorimetry (ITC) (Pierce et al., 1999)
ITC is used to evaluate the thermodynamic parameters of binding of two species in solution, in this case the binding of one protein to another. The two proteins have to be dialyzed in exactly the same buffer. In terms of concentrations, the protein used as the titrant (in the syringe) has to be roughly 20 times more concentrated than the one being titrated (in the cell). A total of four runs have to be performed: a buffer into buffer run to get the heat of injection, a buffer into titrated protein run and a titrant protein into buffer run to get the heats of dilution, and finally the protein into protein run. The heats of injection and dilution are then subtracted from the heats obtained in the final run to obtain the actual heat of binding.
The VP-ITC Instrument from MicroCal® in Dr. Funk’s lab was used in this particular study. Forty injections of 10 s were made, of 5 µL each except the first one which was 1 µL, with a 5 minute pause in between injections to allow the instrument to adjust the temperature of the cell. The protein mixture in the cell was stirred with the injection syringe at 270 rpm. The temperature of the cell was maintained at 20 °C, but this value can be modified to suit the experiment.
The data was analyzed using the program Origin 7.0 for ITC. After data processing, including removing the first couple of data points and subtracting the heats of injection and dilution, the titration curve obtained is fit using either a one-site or a two-site binding model. The heat of the reaction is obtained from the integration of the peaks, and the thermodynamic parameters of the reaction are calculated using the following equations:
∆G = ∆H − T∆S Equation 2.8
∆G = −RT ln K Equation 2.9
The ITC experiment gives access to the enthalpy ∆H, the entropy ∆S, the association constant K and the stoichiometry N of the reaction.
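The relations ∆G = −RT ln K and ∆G = ∆H − T∆S can be combined to obtain the entropy once K and ∆H have been fitted. The sketch below is only an illustration: the function name and the example values (K = 10^6 M^-1, ∆H = −40 kJ/mol) are hypothetical, and only the 20 °C cell temperature is taken from the text.

```python
import math

GAS_CONSTANT = 8.314462618  # R in J mol^-1 K^-1


def binding_thermodynamics(k_assoc, delta_h_j_per_mol, temperature_k):
    """Return (delta_G, delta_S) from the association constant and the enthalpy.

    delta_G = -R T ln K, then delta_S = (delta_H - delta_G) / T.
    """
    delta_g = -GAS_CONSTANT * temperature_k * math.log(k_assoc)
    delta_s = (delta_h_j_per_mol - delta_g) / temperature_k
    return delta_g, delta_s


# Hypothetical fit results at 20 °C (293.15 K)
delta_h = -40000.0
delta_g, delta_s = binding_thermodynamics(1.0e6, delta_h, 293.15)
print(f"dG = {delta_g / 1000:.2f} kJ/mol, dS = {delta_s:.1f} J/(mol K)")
```

In this made-up case the binding is enthalpy driven: the negative ∆S opposes binding, but it is outweighed by the favourable ∆H.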
2.12. Fluorescence Anisotropy Titration
Fluorescence anisotropy titrations are another way to obtain dissociation constants for the binding of one species to another. The titrated molecule, usually DNA, has to be labeled with a fluorophore. In this case HEX-Fluorescein was chosen. The excitation wavelength of HEX is 535 nm, and the emission wavelength is 556 nm. The fluorescent tag was attached to the fork DNA substrate presented below in Figure 2.6. To measure the binding of one protein to another, the first protein has to be bound to the DNA substrate before the titration with the other protein is started.
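Figure 2.6 – DNA Substrate Used in the Fluorescence Anisotropy Titrations
[Schematic of the fork DNA substrate. Strand 1: 5’, 5, 15, *, 3’. Strand 2: 5’, *, 15, 15, 3’. The numbers are the segment labels given in the figure; the asterisk (*) marks the HEX label.]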
The relative concentrations of the proteins and DNA substrate depend on the estimated dissociation constant: if the Kd is estimated to be in the micromolar range, then all the samples need to be prepared at a micromolar concentration. The concentrations therefore vary from one titration to another. The titrant protein is added in such a way that 50 % binding is achieved after three or four additions, and the final concentration of the titrant in the cell is about ten times that of the titrated protein/DNA sample. This is to ensure that 90 % or more of the titrated sample is bound at the end of the titration, resulting in a better estimate of the dissociation constant.
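The rule of thumb of a tenfold excess of titrant can be checked with a simple 1:1 binding model. The sketch below is an assumption made for illustration and is not the model fitted in DynaFit: it solves the exact quadratic for the concentration of complex, and the function names and concentrations are hypothetical.

```python
import math


def complex_concentration(titrated_total, titrant_total, kd):
    """Concentration of 1:1 complex from the exact quadratic solution.

    All three arguments must be in the same concentration units.
    """
    b = titrated_total + titrant_total + kd
    return (b - math.sqrt(b * b - 4.0 * titrated_total * titrant_total)) / 2.0


def fraction_bound(titrated_total, titrant_total, kd):
    """Fraction of the titrated species that is in the complex."""
    return complex_concentration(titrated_total, titrant_total, kd) / titrated_total


# Hypothetical titration: titrated sample at 1 µM, Kd = 1 µM, titrant raised to 10 µM
final_fraction = fraction_bound(1.0, 10.0, 1.0)
print(f"fraction bound at the end of the titration: {final_fraction:.3f}")
```

With the sample prepared at a concentration close to Kd, a tenfold excess of titrant brings the bound fraction to about 90 % in this model.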
A fluorometer from Photon Technology International, located in Dr. Dignam’s lab on the Health Science Campus of the University of Toledo, was used for this work. The data was collected using the program Felix, and analyzed using DynaFit version 3.28.058 (Kuzmic, 1996).
2.13. DNA Purification and Annealing
2.13.1. DNA Oligomer Purification
The lyophilized DNA oligomers are purchased from Integrated DNA Technologies (IDT). Typically, 1 to 2 µmoles are purchased for each oligomer. Upon receipt, the DNA is redissolved in 2 mL of autoclaved water. Each oligomer is then purified by anion-exchange chromatography on a BioCAD/SPRINT Perfusion Chromatography System, with a Poros HQ column. The buffers are composed of 10 mM ammonium hydroxide and 200 mM (buffer A) or 3 M (buffer B) ammonium acetate. The Poros HQ column is equilibrated with buffer A; the DNA sample is then loaded onto the column, eluted with a linear salt gradient and collected with a fraction collector. The absorbance is monitored at 296 nm. The fractions containing the pure DNA oligomer are kept, and after addition of 1 µL of 100X Tris HCl, EDTA buffer for each milliliter of DNA solution, they are concentrated overnight at medium heat in a Speed-Vac concentrating system with a refrigerated condensation trap. The next day, each DNA pellet is resuspended in 50 µL of autoclaved water. Samples are taken from the different fractions to check the purity of the oligomer on a 20 % Tris, boric acid, EDTA (TBE) polyacrylamide gel (Invitrogen) (see Section 2.13.3). Once it has been established that the DNA purification was successful, the fractions are pooled together. Absorbance at 260 nm is measured with the UV-Visible Spectrophotometer and the concentration of each oligomer is calculated.
2.13.2. DNA Substrate Annealing
The oligomers needed for the substrate are mixed in a 1:1 molar ratio in an Eppendorf tube. Sodium chloride is also added to the mixture to a final concentration of 100 mM, for stringency purposes. The tube is floated in a beaker containing cold water, and the water is then heated until just below the boiling point. At that time, the beaker is placed in a closed Styrofoam box at 4 °C overnight, to allow the DNA solution to cool down slowly. Finally, a 20 % TBE gel is run to check that the DNA substrates annealed properly.
2.13.3. TBE Gel Electrophoresis
The 20 % TBE polyacrylamide gels are purchased from Invitrogen, and the 5X TBE running buffer (445 mM Tris, 445 mM Boric Acid and 10 mM EDTA) from USB. The samples are prepared by mixing 3 µL of the DNA with 15 µL of
TBE sample buffer (1X TBE running buffer + 15 % glycerol). On the outside lanes, a mixture of 2 µL of Gel Loading Solution and 23 µL of TBE sample buffer
is run, in order to track how far the gel has run, since the DNA samples are colorless. The gel is run at 180 V for 75 minutes, at 4 °C. The bands
are visualized by UV shadowing: the gel is placed on a fluorescent silica gel coated sheet, and UV light (254 nm) is used to reveal the bands.
CHAPTER 3 - Bacteriophage T4 32 Protein
and Its Truncations
3.1. Introduction
The 32 protein from bacteriophage T4 is a single-stranded DNA-binding protein,
homologous to the SSB protein in other organisms. Binding of 32 protein
prevents re-annealing of the single strands of DNA during replication as well as
formation of secondary structures, and protects the single strands from nuclease activity. Additional background information on the 32 protein is available in
Section 1.1.2.
The three domains of 32 protein were identified by proteolytic cleavage
(Karpel, 1990). The N-terminus, also known as the B domain (for basic domain),
contains amino-acids 1 to 22 according to Karpel and coworkers. The C-terminus
or A (acidic) domain contains amino-acids 254 to 301. The intermediate domain
is also known as the core domain or 32 core protein. The core is thought to
interact with the single-stranded DNA, while the B domain is responsible for
cooperative binding of 32 proteins to one another, and the A domain for binding
to other proteins present at the replication fork. Several truncations of 32 protein
are available: the 32 core protein, the 32-A (32 minus A) protein missing the A
domain or C-terminus, and the 32-B (32 minus B) protein missing the B domain or N-terminus. It should be noted that the 32-B protein available in the Mueser lab is only missing the first 16 amino-acids instead of 22 (see Section 3.5.2). A schematic view of the different domains and truncations of 32 protein available in the lab is shown in Figure 3.1.
Figure 3.1 – 32 Protein Domains and 32 Truncations
[Schematic: B Domain (amino-acids 1 to 16), Core Domain (amino-acids 17 to 253), A Domain (amino-acids 254 to 301)]
• 32 Protein : amino-acid 1 to 301
• 32 Core Protein : amino-acid 17 to 253
• 32-B Protein : amino-acid 17 to 301
• 32-A Protein : amino-acid 1 to 253
The 32 protein domains presented above were discovered by Karpel and coworkers (Waidner et al., 2001). However, the B domain they described contains the first 22 amino-acids of the protein, while it was found that the 32-B protein used in the Mueser lab is actually missing only 16 amino-acids (see Section 3.5.2 for more details).
The protein characteristics for each 32 truncation were calculated using the ExPASy website (Gill and von Hippel, 1989; Gasteiger et al., 2003) and are summarized in Table 3.1.
Table 3.1 – 32 Protein and Truncations Characteristics
                   32         32 Core    32-B       32-A
Amino-acids        301        218        286        253
Molecular Weight   33.5 kDa   24.9 kDa   31.8 kDa   28.4 kDa
pI                 5.82       5.25       4.65       6.76
ε                  1.16       1.56       1.24       1.37
3.2. Bacteriophage T4 32 Protein
3.2.1. Introduction
The 32 protein construct was a gift from Dr. Nancy Nossal (NIH). The protein was expressed from the pAS6-2 plasmid transformed in the E. coli N8430 cell line. Cells containing the overexpressed recombinant 32 protein were available in the Mueser lab; they were lysed by sonication before the protein was purified.
3.2.2. Cell Lysis
The cells were lysed according to the cell lysis protocol described in
Section 2.3. The lysis buffer that was used is composed of 25 mM bis-Tris HCl pH 6.5, 50 mM NaCl, 1 mM EDTA and 1 mM DTT. A volume of 100 mL was used for every 10 grams of cells. Unfortunately, the cell lysis samples were not run on an SDS-PAGE gel, but a sample of the supernatant that was then purified can be
seen in Figure 3.2.a, lane 2. A large band corresponding to 32 protein is present,
meaning the protein was soluble after cell lysis.
3.2.3. Protein Purification
The supernatant containing 32 protein obtained after cell lysis was purified
using ion-exchange and hydrophobic interaction chromatography. Since the
calculated pI of the 32 protein is around 5.8, the protein is negatively charged at the pH 7.5 of the ion-exchange buffers, and anion-exchange chromatography
was used. The buffers used for the different purification steps are detailed in
Table 3.2.
Table 3.2 – HPLC Buffers for T4 32 Protein Purification
               Ion Exchange                 Hydrophobic
               (Q Sepharose, POROS HQ)      (POROS PE)
Buffers        25 mM Tris HCl pH 7.5        25 mM bis-Tris HCl pH 6.5
               2 % glycerol                 50 mM NaCl
               50-500 mM NaCl               2 % glycerol
                                            600 mM (NH4)2SO4
Conductivity   Buffer A: ~ 6 mS/cm          ~ 86 mS/cm
               Buffer B: ~ 43 mS/cm
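The NaCl concentration reached at any point of a linear ion-exchange gradient can be read off the % B trace. The sketch below illustrates this, assuming that the "50-500 mM NaCl" entry of Table 3.2 means 50 mM NaCl in buffer A and 500 mM in buffer B, and that the two buffers mix ideally; the function name is hypothetical.

```python
def nacl_mM(percent_b, buffer_a_mM=50.0, buffer_b_mM=500.0):
    """NaCl concentration (mM) delivered at a given % B of a two-buffer gradient."""
    if not 0.0 <= percent_b <= 100.0:
        raise ValueError("percent_b must be between 0 and 100")
    # Linear mixing of buffer A and buffer B
    return buffer_a_mM + (buffer_b_mM - buffer_a_mM) * percent_b / 100.0


# NaCl concentration halfway through the gradient (assumed buffer values)
midpoint_mM = nacl_mM(50.0)
```

Under these assumptions, a protein eluting at 50 % B would come off the column at 275 mM NaCl.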
The lysate was first loaded on the low-resolution anion-exchange Q
Sepharose. An example of a Q Sepharose run with the matching SDS-PAGE gel is presented in Figure 3.2.a. The elution from the Q Sepharose was then run on a hydrophobic column, the POROS PE, to remove any endogenous nuclease that might be present in solution. For that particular step, the conductivity of the protein sample had to be raised by addition of 3 M Ammonium Sulfate to match that of the PE buffer. The protein was eluted in the void fraction and no salt
gradient was required. The PE elution then had to be either dialyzed against or diluted with Q buffer A in order to lower the conductivity back to around 6 mS/cm. After that, the protein sample was run on the high-resolution anion-exchange column, the POROS HQ. A chromatogram and SDS-PAGE gel for that last step are shown in Figure 3.2.b. 32 protein was pure after the POROS HQ run, and could be concentrated before being kept at -80°C for further use.
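As a rough guide for the ammonium sulfate addition before the POROS PE step, the volume of 3 M stock needed to bring a sample to the 600 mM of the PE buffer can be estimated as sketched below. This assumes that the sample contains no ammonium sulfate to begin with and that volumes are additive; the function name and the 40 mL sample volume are hypothetical. In practice the match would still be confirmed with a conductivity measurement, since the other buffer components also contribute.

```python
def stock_volume_to_add(sample_mL, target_mM=600.0, stock_mM=3000.0):
    """Volume of stock (mL) to add so that the final mixture reaches target_mM.

    Solves stock_mM * v = target_mM * (sample_mL + v) for v, assuming the
    sample initially contains none of the salt and volumes are additive.
    """
    if not 0.0 < target_mM < stock_mM:
        raise ValueError("target must be positive and below the stock concentration")
    return sample_mL * target_mM / (stock_mM - target_mM)


# Hypothetical 40 mL Q Sepharose pool brought to 600 mM with 3 M stock
volume_mL = stock_volume_to_add(40.0)
```

For a 600 mM target and a 3 M stock, the required volume is one quarter of the sample volume.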
3.3. Bacteriophage T4 32 Core Protein
The 32 core protein was donated by Dr. Yousif Shamoo (Rice University).
The gene was inserted in the pKC30 vector (Yoakum, 1983; Rao, 1984) and the protein expressed in the AR120 E. coli cell line (Waidner et al., 2001). The protein expression and the cell lysis were performed by Jennifer M. Dlwgosh, according to the full length 32 protein expression and cell lysis protocols.
Concerning the purification of 32 core protein, the buffers and protocol are the same as the ones used for 32 protein (see Section 3.2.3). The Q Sepharose and POROS PE runs are similar to the ones from the 32 protein purification. On the other hand, 32 core protein does not seem to bind as strongly to the POROS HQ as 32 protein does, even though its calculated pI is lower. Nonetheless, the
32 core protein purification did not present any particular challenge and the pure protein was obtained in large quantities, about 90 milligrams from 10 grams of cells.
Figure 3.2 – T4 32 Protein Purification
Figure 3.2.a – Q Sepharose
[Chromatogram and SDS-PAGE gel]
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa)
2- Q Sepharose – Load
3- Q Sepharose – F. 23
4- Q Sepharose – F. 35
5- Q Sepharose – F. 57
6- Q Sepharose – F. 72
7- Q Sepharose – Rinse Fraction
8- Q Sepharose – Flow Through
On the chromatogram, the OD260 is shown in purple, the OD280 in green, the conductivity in red and the % B in black. The red stars indicate which fractions were run on an SDS-PAGE gel. 32 protein can be found in a fairly pure state already in fractions 48 to 78, which were pooled to be run on the POROS PE column. Fractions 31 to 47 as well as the rinse fraction also contain 32 protein but are still contaminated. These fractions were rerun on the Q Sepharose and the pure 32 protein eluted after that second Q Sepharose run was added to the first elution.
Figure 3.2.b – POROS HQ
[Chromatogram and SDS-PAGE gel]
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa)
2- POROS HQ – F. 18
3- POROS HQ – F. 23
4- POROS HQ – F. 29
5- POROS HQ – F. 33
6- POROS HQ – F. 39
7- POROS HQ – F. 43
32 protein is present in most fractions, but it was pure only in fractions 27 to 36. These fractions were pooled and concentrated.
3.4. Bacteriophage T4 32-A Protein
The 32-A protein was a gift from Dr. Richard Karpel (UMBC). The
truncated 32 gene was inserted in the pKC30 vector (Yoakum, 1983; Rao, 1984)
and the recombinant protein expressed from the pEKF1 plasmid in the AR120
E. coli cell line (Waidner et al., 2001). Protein expression and cell lysis were performed according to the 32 protein protocols. The 32-A protein was then purified over the Q Sepharose and Poros HQ anion-exchange columns. This work was done by Jennifer M. Dlwgosh.
3.5. Bacteriophage T4 32-B Protein
3.5.1. Introduction
The majority of the work done on the 32 truncations was accomplished
with the 32-B protein. Since the B domain responsible for cooperative binding is
missing but the A domain that is thought to be involved in interactions with other
DNA replication proteins is still present, the 32-B protein can still be used in
protein-protein complexes, but does not tend to aggregate and form filaments like
the full length 32 protein does.
The X-ray structure of the 32 core was solved (Shamoo et al., 1995) but the structure of the missing A and B domains is still unknown. In parallel to the protein-protein complex studies (see Chapter 5), structural and biophysical experiments were also carried out on the 32-B protein alone, in order to gain more insight into the role and structure of the A domain.
3.5.2. Molecular Cloning
The 32-B construct and glycerol stocks were donated by Dr. Richard
Karpel (UMBC). The protein was expressed from the pEKF2 plasmid transformed
in AR120 E. coli cells (Waidner et al., 2001).
Another student in the lab, Jennifer M. Dlwgosh, was also studying 32-B
protein, and therefore had her own preparations of 32-B. However, the different
batches of protein ran differently on SDS-PAGE gels, so the plasmid encoding
for the 32-B protein was sent for sequencing to make sure the protein that was
used in the present study was really missing the first 21 amino-acids. The
sequencing results are presented in Appendix 3.
It turns out that the 32-B protein that has been used in the lab is only missing the first 16 amino-acids, as compared to the 32*II protein described by
Karpel and coworkers (Waidner et al., 2001), which is missing the first 22 amino-acids. The “new” sequence for 32-B is shown in Figure 3.3. That sequence was used to calculate the protein characteristics that were presented in Table 3.1.
Figure 3.3 – T4 32-B Protein Amino-Acid Sequence
  1 MFKRKSTAEL AAQMAKLNGN KGFSSEDKGE WKLKLDNAGN GQAVIRFLPS KNDEQAPFAI
 61 LVNHGFKKNG KWYIETCSST HGDYDSCPVC QYISKNDLYN TDNKEYSLVK RKTSYWANIL
121 VVKDPAAPEN EGKVFKYRFG KKIWDKINAM IAVDVEMGET PVDVTCPWEG ANFVLKVKQV
181 SGFSNYDESK FLNQSAIPNI DDESFQKELF EQMVDLSEMT SKDKFKSFEE LNTKFGQVMG
241 TAVMGGAAAT AAKKADKVAD DLDAFNVDDF NTKTEDDFMS SSSGSSSSAD DTDLDDLLND
301 L
The 32-B protein sequence is in black; the 16 missing amino-acids at the N-terminus are shaded in gray. An additional Methionine from the start codon is also present at the N-terminus.
All the work with 32-B protein described in this dissertation was carried out
with this particular N-terminal truncation. Also, the amino-acid numbering was
kept as it is for the full length 32 protein. For example, with the I151D 32-B
mutant described below in Section 3.6, the number 151 refers to isoleucine 151 in the 32 protein sequence, not to amino-acid 151 in the 32-B protein sequence if the numbering were to start at the initial methionine.
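The numbering convention can be illustrated with a short Python sketch that builds the 32-B construct from the full length sequence of Figure 3.3 (initial methionine followed by amino-acids 17 to 301) and maps a full length position to its position in the construct. The variable and function names are hypothetical and only serve this illustration.

```python
# Full length 32 protein sequence, copied from Figure 3.3 (blocks of 10)
FULL_32 = "".join("""
MFKRKSTAEL AAQMAKLNGN KGFSSEDKGE WKLKLDNAGN GQAVIRFLPS KNDEQAPFAI
LVNHGFKKNG KWYIETCSST HGDYDSCPVC QYISKNDLYN TDNKEYSLVK RKTSYWANIL
VVKDPAAPEN EGKVFKYRFG KKIWDKINAM IAVDVEMGET PVDVTCPWEG ANFVLKVKQV
SGFSNYDESK FLNQSAIPNI DDESFQKELF EQMVDLSEMT SKDKFKSFEE LNTKFGQVMG
TAVMGGAAAT AAKKADKVAD DLDAFNVDDF NTKTEDDFMS SSSGSSSSAD DTDLDDLLND
L
""".split())

N_MISSING = 16  # amino-acids 1 to 16 are absent from the 32-B construct

# 32-B construct: start-codon methionine followed by amino-acids 17 to 301
B_32 = "M" + FULL_32[N_MISSING:]


def full_to_construct_position(full_pos):
    """Map 1-based full length numbering to the 1-based position in 32-B."""
    if not N_MISSING < full_pos <= len(FULL_32):
        raise ValueError("position is not present in the 32-B construct")
    return full_pos - N_MISSING + 1  # +1 accounts for the added methionine


# Isoleucine 151 of the full length protein, as used for the I151D mutant
position_in_construct = full_to_construct_position(151)
```

The construct obtained this way has 286 amino-acids, in agreement with Table 3.1, and isoleucine 151 of the full length protein is the 136th residue of the construct when counting from the initial methionine.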
The following Section 3.6 describes the two 32-B protein mutants that were made and the work that was done with them. These mutants were cloned using site-directed mutagenesis, with the native 32-B protein expression plasmid as a template for the PCR reaction. This pEKF2 plasmid, obtained from Dr.
Richard Karpel (UMBC), is very large: 15-20 kb, as can be seen in Figure 3.4.
Figure 3.4 – T4 32-B Initial Expression Plasmid Miniprep
[Agarose gel; ladder bands labeled 10 kb, 5 kb and 1 kb]
1- Initial pEKF2 32-B expression plasmid
2- Supercoiled DNA ladder
3- 1 kb DNA ladder
The plasmid containing the 32-B gene is supercoiled. Since it runs past the 2 kb supercoiled DNA ladder, its size can only be roughly estimated from looking at the gel. It is probably between 15 and 20 kb.
Because the template plasmid was so large, the PCR reactions for site-directed mutagenesis were all unsuccessful (more details are provided in Section 3.6). Therefore, the gene encoding 32-B had to be recloned in a different expression vector. This work is described below.
Two different expression vectors were chosen. One is the commercially
available pET101 (Invitrogen) that has the advantage of allowing direct insertion of the PCR product. However, protein expression is rarely obtained from this
vector. Another vector available is the pDEST-C1. It first requires an insertion of
the PCR product in an entry vector, the pENTR-D vector, the gene of interest
being then inserted in the pDEST-C1 expression vector through a transposition
reaction. Both pET101 and pENTR-D use TOPO-assisted directional cloning
while inserting the PCR product, which is why a CACC overhang is necessarily
present in the forward primer. The pDEST-C1 also adds an N-terminal His-Tag to
the protein expressed for purification purposes. That tag may interfere during
crystallization studies and it is preferable to add a TEV protease cleavage site in
the pDEST-C1 forward primer so that the His-Tag can be cleaved off after the
protein has been purified. The different primers are presented in Figure 3.5.
Despite the low success rate of expression from the pET101 vector, the pET101
cloning does not require a lot of time, which is why it was done in parallel with the
pDEST-C1 cloning.
Figure 3.5 – T4 32-B PCR Primers
Forward Primer – pDEST-C1 insertion
32-B                                 5’- ATG CTG AAT GGC AAT AAA GGT TTT TCT TCT… -3’
primer 5’- C ACC GAG AAC CTC TAC TTC CAA GGA CTG AAT GGC AAT AAA GGT TTT TCT TCT… -3’
5’- C ACC GAG AAC CTC TAC TTC CAA GGA CTG AAT GGC AAT AAA GGT TTT TCT TCT -3’
27 bp from 32-b, 52 bp total; 33% GC content; Tm = 54°C
Forward Primer – pET101 insertion
32-B         5’- ATG CTG AAT GGC AAT AAA GGT TTT TCT TCT… -3’
primer 5’- C ACC ATG CTG AAT GGC AAT AAA GGT TTT TCT TCT… -3’
5’- C ACC ATG CTG AAT GGC AAT AAA GGT TTT TCT TCT -3’
30 bp from 32-b, 34 bp total; 33% GC content; Tm = 54°C
Reverse Primer (Inverse Complement)
32-B   5’- …CTG GAT GAC CTT TTG AAT GAC CTT TAA -3’
primer 3’-  GAC CTA CTG GAA AAC TTA CTG GAA ATT -5’
5’- TTA AAG GTC ATT CAA AAG GTC ATC CAG -3’
27 bp total; 37% GC content; Tm = 55°C
Both forward primers were aligned with the nucleotide sequence of the 32-b gene. The start codon is highlighted in green. In red is the CACC overhang necessary for TOPO-assisted directional cloning, and the sequence coding for the TEV protease cleavage site is shown in blue in the pDEST-C1 primer. The reverse primer is shown as the inverse complement of 32-b, where the stop codon is highlighted in red.
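The primer design of Figure 3.5 can be checked with a few lines of Python. The sketch below rebuilds the reverse primer as the inverse complement of the 3’ end of the 32-b gene, assembles both forward primers from the CACC overhang, the TEV site sequence and the 5’ end of the gene, and computes the GC content of the gene-matching portions. The helper names are hypothetical, and the melting temperatures are not recomputed because the method used to obtain them is not stated.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove the spacing used in Figure 3.5 and upper-case the sequence."""
    return "".join(seq.split()).upper()


def reverse_complement(seq):
    """Return the inverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


gene_5prime = clean("CTG AAT GGC AAT AAA GGT TTT TCT TCT")  # codons after the ATG
gene_3prime = clean("CTG GAT GAC CTT TTG AAT GAC CTT TAA")  # last codons + stop
tev_site = clean("GAG AAC CTC TAC TTC CAA GGA")             # taken from the pDEST-C1 primer

reverse_primer = reverse_complement(gene_3prime)
pdest_forward = "CACC" + tev_site + gene_5prime
pet101_forward = "CACC" + "ATG" + gene_5prime
```

The values obtained this way (52 and 34 bases for the forward primers, 33 % GC for the 32-b portion of both forward primers and 37 % GC for the reverse primer) agree with those listed in the figure.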
The primers were ordered from Integrated DNA Technologies®. They were received as a lyophilized pellet and were resuspended in 1X TE buffer (20 mM
Tris HCl pH 8.0 + 1 mM EDTA) to make 250 µM solutions of each primer. A
primer solution was then made for each set of primers (the pET101 insertion and
the pDEST-C1 insertion), by mixing 2 µL of the forward primer, 2 µL of the reverse primer and adding 46 µL of EB buffer, in order to make a 10 µM primer mix solution for each reaction.
The PCR reactions for both the pET101 insertion and the pDEST-C1 insertion were run at the same time, according to the reaction setup described in
Table 3.3. The KOD polymerase was used in the reactions, as it has a higher fidelity and processivity than the regularly used Proof Start polymerase.
Table 3.3 – T4 32-B PCR Reactions
PCR Reaction                                                  PCR Program
KOD Buffer (10X)                                   5 µL       Activation        95 °C, 2 minutes
Primer solution (10 µM)                            1.5 µL     Denaturation      95 °C, 20 seconds
dNTPs (2 mM each)                                  5 µL       Annealing         54 °C, 10 seconds
25 mM MgSO4                                        3 µL       Extension         70 °C, 15 seconds
KOD Polymerase (1 U/µL)                            1 µL       30 cycles
Initial pEKF2 32-B expression plasmid (10 ng/µL)   1 µL       Final Extension   70 °C, 5 minutes
Autoclaved water                                   33.5 µL
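The volumes of Table 3.3 can be checked against the stock concentrations with a short calculation. The sketch below, in Python, sums the reaction volume and derives the final concentrations of the primers, dNTPs and added MgSO4. It assumes that each primer mix was made from 2 µL of a 250 µM primer stock in a total of 50 µL, as described in the text; the variable and function names are hypothetical, and any magnesium already contributed by the KOD buffer is not accounted for.

```python
# Volumes in microliters, as listed in Table 3.3
components_uL = {
    "KOD buffer (10X)": 5.0,
    "primer solution (10 uM)": 1.5,
    "dNTPs (2 mM each)": 5.0,
    "25 mM MgSO4": 3.0,
    "KOD polymerase (1 U/uL)": 1.0,
    "template plasmid (10 ng/uL)": 1.0,
    "autoclaved water": 33.5,
}
total_uL = sum(components_uL.values())


def final_concentration(stock, volume_uL):
    """Concentration after dilution into the full reaction (same unit as stock)."""
    return stock * volume_uL / total_uL


# Primer mix: 2 uL of each 250 uM primer stock in 50 uL total
primer_mix_uM = 250.0 * 2.0 / 50.0

primer_final_uM = final_concentration(primer_mix_uM, 1.5)
dntp_final_mM = final_concentration(2.0, 5.0)
mgso4_final_mM = final_concentration(25.0, 3.0)
template_ng = 10.0 * 1.0  # 1 uL of plasmid at 10 ng/uL
```

Under these assumptions the reaction volume is 50 µL, with each primer at 0.3 µM, each dNTP at 0.2 mM, 1.5 mM added MgSO4 and 10 ng of template plasmid.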
After the reaction, the two PCR products were run on a 1% agarose gel
that is shown in Figure 3.6.a. Both reactions were successful and were gel
purified using the MinElute kit from Qiagen. The amplified 32-b gene from lane 2
was then inserted in the pET101 vector, according to the reaction setup shown in
Table 3.4. The mixture was incubated at 25 °C for 30 minutes.
Table 3.4 – T4 32-B Insertion in pET101 Reaction
pET101 Insertion Reaction
Purified PCR product   2 µL
Salt Solution          1 µL
pET101 vector          1 µL
Autoclaved water       2 µL
The salt solution is provided with the pET101 cloning kit, and is composed of 1.2 M NaCl + 0.06 M MgCl2. The pET101 vector is already linearized and covalently linked to Topoisomerase I.
After the reaction, 2 µL of the product were transformed into 50 µL of
DH5α competent cells; 50 µL and 100 µL of the cells were then plated on separate
LB + Carbenicillin plates. The pET 101 vector carries the resistance gene to
Ampicillin, and Carbenicillin is an analog of Ampicillin that is less susceptible to
hydrolysis. Colonies were picked from both plates, grown overnight in LB +
Carbenicillin media, and the cells were collected the next day for plasmid
isolation using the Miniprep kit (Qiagen). Glycerol stocks were also made by
mixing 1 mL of the cell culture at OD600 = 0.6 with 1 mL of 50% glycerol, then
flash freezing the mixture on dry ice. The products of the pET101 insertion
reaction isolated after the Miniprep reactions were run on a 1% agarose gel,
presented in Figure 3.6.b. All the plasmids shown on the gel have the correct size.
The plasmids isolated from colonies 1 and 3 were chosen for transformation into
competent BL21 (DE3) Star cells (0.8 µL / 25 µL of cells). After transformation, the cells were not plated but grown directly overnight at 37 °C in LB +
Carbenicillin media (the BL21 (DE3) Star cell line does not carry any additional antibiotic resistance). These cultures were then used the next day to inoculate fresh LB + Carbenicillin media, and after glycerol stocks were taken at OD600 =
0.6, protein expression was induced by addition of 1 mM IPTG to the culture. The
cells were left at 37 °C for another three hours. Samples of protein expression
were taken after 0h and 3h, and were run on an SDS-PAGE gel, shown in
Figure 3.7.a. No protein was expressed around 31.5 kDa.
It was then decided to clone the 32-b gene in the pDEST-C1 vector. The
PCR product from Figure 3.6.a, lane 3 was inserted in the pENTR-D vector first,
as detailed in the reaction setup in Table 3.5. The reaction was incubated at
room temperature for 30 minutes.
Table 3.5 – T4 32-B Insertion in pENTR-D Reaction
pENTR-D Insertion Reaction
Purified PCR product   2 µL
Salt Solution          1 µL
pENTR-D vector         1 µL
Autoclaved water       2 µL
The salt solution is provided with the pENTR-D cloning kit, and is composed of 1.2 M NaCl + 0.06 M MgCl2. The pENTR-D vector is already linearized and covalently linked to Topoisomerase I.
A 2 µL sample was then transformed in 50 µL of competent DH5α cells.
The cells were plated on LB + Kanamycin plates (50 µL and 100 µL per plate
respectively), since the pENTR-D vector carries the Kanamycin resistance gene.
Colonies were picked from both plates, grown overnight in LB + Kanamycin
media and the cells were collected to isolate the pENTR-D plasmid. The different
plasmids isolated from the colonies were run on a 1% agarose gel, which is
shown in Figure 3.6.c. Only one plasmid has the estimated size for a pENTR-D +
32-b insert, from colony one. That plasmid was then gel purified before running
the LR Clonase transposition reaction, as described in Table 3.6.
Table 3.6 – T4 32-B Insertion in pDEST-C1 Reaction
LR Clonase Reaction
Gel Purified 32-b in pENTR-D 1 µL
pDEST-C1 1 µL
LR Clonase Mix 2 µL
TE buffer (1X) 6 µL
The 1X TE buffer is made of 20 mM Tris HCl pH 8.0 + 1 mM EDTA
The reaction was incubated at room temperature for 2 hours, and
terminated by addition of 1 µL of Proteinase K at 2 µg/µL, and then incubation at
37 °C for 10 minutes. The reaction product was transformed in DH5α competent
cells (2 µL plasmid / 50 µL of cells), then 50 µL and 100 µL of the cells were
plated on LB + Streptomycin plates. Colonies were picked from the plates and
grown overnight for plasmid isolation. These plasmids were run on a 1% agarose
gel displayed in Figure 3.6.d. All plasmids have the correct size, except that the one from colony 7 is contaminated with the pENTR-D insert.
The plasmid from colony 3 showed a stronger band so it was transformed in
competent T7 express cells.
Figure 3.6 – Agarose Gels for T4 32-B Cloning
Figure 3.6.a – T4 32-B PCR Reactions
[Agarose gel; ladder bands labeled 5 kb, 1 kb and 0.5 kb]
1- 100 bp DNA ladder
2- T4 32-b PCR product (pET101 insertion)
3- T4 32-b PCR product (pDEST-C1 insertion)
4- 1 kb DNA ladder
The 32-b gene is 865 bp long. The pET101 PCR product is also 865 bp, and the pDEST-C1 PCR product is 883 bp long because of the additional TEV protease cleavage site. Both PCR products have the correct size.
Figure 3.6.b – T4 32-B Insertion in pET101
[Agarose gel; ladder bands labeled 5 kb and 2 kb]
1- Supercoiled DNA ladder
2- Blank
3- T4 32-b insert in pET101 – colony 1
4- T4 32-b insert in pET101 – colony 2
5- T4 32-b insert in pET101 – colony 3
6- T4 32-b insert in pET101 – colony 4
The correct sizes are as follows:
• 32-b 865 bp
• pET101 5753 bp
• Total 6618 bp
The pET101 + 32-b constructs run between 6 and 7 kb, which indicates the correct insertion.
Figure 3.6.c – T4 32-B Insertion in pENTR-D
[Agarose gel; ladder bands labeled 5 kb and 2 kb]
1- Supercoiled DNA ladder
2- Blank
3- T4 32-b insert in pENTR-D – colony 1
4- T4 32-b insert in pENTR-D – colony 2
5- T4 32-b insert in pENTR-D – colony 3
6- T4 32-b insert in pENTR-D – colony 4
7- T4 32-b insert in pENTR-D – colony 5
The correct sizes are as follows:
• 32-b 883 bp
• pENTR-D 2580 bp
• Total 3463 bp
Only colony 1 has the right insert, as the corresponding plasmid is the only one running between 3 and 4 kb.
Figure 3.6.d – T4 32-B Insertion in pDEST-C1
[Agarose gel; ladder bands labeled 5 kb and 2 kb]
1- T4 32-b insert in pDEST-C1 – colony 1
2- T4 32-b insert in pDEST-C1 – colony 2
3- T4 32-b insert in pDEST-C1 – colony 3
4- T4 32-b insert in pDEST-C1 – colony 4
5- T4 32-b insert in pDEST-C1 – colony 5
6- T4 32-b insert in pDEST-C1 – colony 6
7- T4 32-b insert in pDEST-C1 – colony 7
8- Supercoiled DNA ladder
The correct sizes are as follows:
• 32-b 883 bp
• pDEST-C1 +5334 bp
• ccdB -1600 bp
• Total 4617 bp
All plasmids seem to have the correct size, but the one isolated from colony 7 is contaminated with the pENTR-D + 32-B plasmid.
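The expected sizes quoted in the legends of Figure 3.6 can be reproduced with simple bookkeeping. The Python sketch below assumes that the amplified product covers the 286 codons of 32-B plus the stop codon, preceded by the 4-base CACC overhang, and that the pDEST-C1 product carries the 21-base TEV site in place of the ATG start codon. The vector sizes and the 1600 bp ccdB cassette are those given in the legends; all names are hypothetical.

```python
CACC_OVERHANG_BP = 4
TEV_SITE_BP = 21           # GAG AAC CTC TAC TTC CAA GGA in the pDEST-C1 forward primer
CODING_BP = 3 * (286 + 1)  # 286 amino-acids (Table 3.1) plus the stop codon

pet101_insert_bp = CACC_OVERHANG_BP + CODING_BP
# The pDEST-C1 forward primer carries the TEV site instead of the ATG codon
pdest_insert_bp = CACC_OVERHANG_BP + TEV_SITE_BP + CODING_BP - 3

expected_bp = {
    "pET101 + 32-b": 5753 + pet101_insert_bp,
    "pENTR-D + 32-b": 2580 + pdest_insert_bp,
    "pDEST-C1 + 32-b": 5334 - 1600 + pdest_insert_bp,  # ccdB cassette is replaced
}


def kb_window(size_bp):
    """Return the pair of whole-kb marks that bracket a plasmid size."""
    lower = size_bp // 1000
    return lower, lower + 1
```

With these assumptions the two inserts come out at 865 and 883 bp and the three constructs at 6618, 3463 and 4617 bp, which places the pET101 construct between the 6 and 7 kb marks and the pENTR-D construct between the 3 and 4 kb marks, as described in the legends.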
After transformation, the cells were grown directly overnight in LB +
Streptomycin (pDEST-C1 vector) + Tetracycline (T7 express cell line). Protein expression was induced at OD600 = 0.6 with 1 mM IPTG. The results from the
protein expression in pDEST-C1 are shown in Figure 3.7.b. A protein is overexpressed around 37 kDa, corresponding to the expected molecular weight of 32-B with an N-terminal His-Tag.
The last step was to make sure the expressed 32-B from the pDEST-C1 plasmid was soluble. A 1 L culture was prepared and protein expression induced under the same conditions as described above. The cells were lysed and sonicated in the 32-B lysis buffer, which is composed of 40 mM Tris HCl pH 8.0,
100 mM NaCl, 10 mM MgCl2, 2 mM CaCl2 and 1 mM EDTA. Samples were taken
from the pellet and supernatant after cell lysis and run on an SDS-PAGE gel,
presented in Figure 3.7.c. Even though the amount of 32-B in the pellet is quite
significant, large amounts of the protein of interest are present in the
supernatant as well. The 32-B protein expressed with an N-terminal His-Tag from
the pDEST-C1 vector is therefore soluble, and the plasmid can be used in
site-directed mutagenesis experiments.
Figure 3.7 – T4 32-B Protein Expression and Solubility
Figure 3.7.a – T4 32-B Protein Expression from the pET101 Vector
[SDS-PAGE gel]
1- T4 32-B expression (plasmid from colony 1) – 0h sample
2- T4 32-B expression (plasmid from colony 1) – 3h sample
3- T4 32-B expression (plasmid from colony 3) – 0h sample
4- T4 32-B expression (plasmid from colony 3) – 3h sample
5- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa)
A protein is expressed around 40 kDa, which is too large to be T4 32-B, which has a molecular weight of 31.3 kDa.
Figure 3.7.b – T4 32-B Protein Expression from the pDEST-C1 Vector
[SDS-PAGE gel]
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa)
2- T4 32-B expression – 0h sample
3- T4 32-B expression – 3h sample
A 37-40 kDa protein, corresponding to the expected molecular weight of 32-B + His-Tag, is expressed.
Figure 3.7.c – T4 32-B Cell Lysis
[SDS-PAGE gel]
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0 and 21.5 kDa)
2- T4 32-B expression – 3h sample
3- T4 32-B cell lysis – pellet
4- T4 32-B cell lysis – supernatant
5- Molecular Weight Marker
After protein expression was assessed, the pDEST-C1 + 32-b plasmid that
was transformed in the T7 express cell line was sent for DNA sequencing. The
results are shown in Appendix 3. The 32-b gene was cloned correctly in the
pDEST-C1 vector.
The molecular cloning of 32-B protein in the pDEST-C1 vector was done solely for site-directed mutagenesis purposes, described in Section 3.6. All the remaining work on 32-B detailed below was done with protein expressed from
the pEKF2 plasmid.
3.5.3. Protein Expression
The glycerol stock for T4 32-B expression was obtained from Dr. Richard
Karpel (UMBC). The 32-b gene was cloned in the pEKF2 plasmid, derived from
the pKC30 vector (Waidner et al., 2001). The plasmid was then transformed in
the AR120 E. coli cell line.
The cells obtained from the glycerol stock were first grown overnight at
37 °C in 300 mL of LB + Ampicillin media. Fresh media (6 L) was then inoculated
the next day with the overnight culture and the cells grown until they reached
OD600 = 0.6. Protein expression was induced at that stage by adding 1 mM of
nalidixic acid: the pEKF2 plasmid has a PL (phage lambda) promoter, therefore
expression cannot be induced by addition of IPTG, as it is with the pET vectors that have a T7 promoter. Instead, nalidixic acid provokes an SOS response by inhibiting the cell DNA gyrase and creating DNA damage, which removes the repressor bound to the PL promoter and induces expression of the protein of interest (Little
and Mount, 1982; Shatzman and Rosenberg, 1987). After induction, the cells
were left to grow at 37 °C for another three hours, then centrifuged. Samples
were taken at 0h and 3h of protein expression, and run on an SDS-PAGE gel
shown below in Figure 3.8. T4 32-B is largely overexpressed around 32 kDa.
Figure 3.8 – T4 32-B Protein Expression
[SDS-PAGE gel]
1- T4 32-B expression – 0h sample
2- T4 32-B expression – 3h sample
3- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0 and 21.5 kDa)
A protein is overexpressed around 32 kDa, corresponding to the molecular weight of 32-B.
3.5.4. Cell Lysis
The cells obtained as described in the previous section were then lysed in
the following buffer: 40 mM Tris HCl pH 8.0, 100 mM NaCl, 10 mM MgCl2, 2 mM
CaCl2 and 1 mM EDTA. The cells were thawed in that buffer in the presence of
lysozyme and AEBSF, a protease inhibitor. They were then lysed by
sonication, after which a pellet and supernatant sample were taken. The results
from the cell lysis for 32-B are shown in Figure 3.9. The protein is present in both
the pellet and the lysate but that might be due to the high concentration of 32-B
in the cells after expression. Enough protein was present in the supernatant,
which was then purified as is described in the next section.
Figure 3.9 – T4 32-B Cell Lysis
[SDS-PAGE gel]
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0 and 21.5 kDa)
2- T4 32-B cell lysis – pellet
3- T4 32-B cell lysis – supernatant
After cell lysis, the protein is present in both the pellet and the lysate, but that might be due to the fact that a lot of protein was expressed and not enough lysis buffer was used to extract it all.
3.5.5. Protein Purification
T4 32-B protein was initially purified following the protocol of
Waidner et al. (2001). As the calculated pI of the protein is around 4.6 (Gill and
von Hippel, 1989), anion exchange chromatography was used. The lysate was
first loaded on the low resolution anion-exchange column Q Sepharose. The
POROS PE was the next step; it is a hydrophobic column that is used to remove
nucleases from the sample. Then, the high resolution anion-exchange POROS
HQ was used, and finally 32-B protein was further purified using size exclusion
chromatography, as it was not pure enough after the POROS HQ elution. The
buffers needed for each step are presented in Table 3.7.
Table 3.7 – HPLC Buffers for T4 32-B Purification Scheme 1
Ion Exchange Hydrophobic Size Exclusion
(Q Sepharose, POROS HQ) (POROS PE) (Superdex 75)
25 mM Tris HCl pH 7.5 25 mM bis-Tris HCl pH 6.5 25 mM bis-Tris HCl pH 6.5 50 mM NaCl 150 mM NH4Cl Buffers 10 % glycerol 600 mM (NH4)2SO4 2 mM EDTA 50-500 mM NaCl 1 mM EDTA 2 mM BME 2 mM BME
Buffer A: ~ 4 mS/cm Conductivity ~ 90 mS/cm ~ 19 mS/cm Buffer B: ~ 33 mS/cm
Chromatograms and SDS-PAGE gel for this purification scheme are
shown in Figure 3.10. The final yield for this purification was 13 mg of protein / L
of cell culture.
Figure 3.10 – T4 32-B Purification Scheme 1
Figure 3.10.a – Q Sepharose
[Chromatogram and SDS-PAGE gel]
1- Q Sepharose – Load
2- Q Sepharose – Flow Through
3- Q Sepharose – F. 67
4- Q Sepharose – F. 76
5- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa)
On the chromatogram, the OD260 is shown in purple, the OD280 in green, the conductivity in red and the % B in black. The red stars indicate which fractions were run on an SDS-PAGE gel. 32-B is present in fractions 64 to 76 that were pooled to be run on the POROS PE.
Figure 3.10.b – POROS HQ
[Chromatogram and SDS-PAGE gel]
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa)
2- POROS PE – Load
3- POROS PE – Flow Through
4- POROS PE – Void Fraction
5- POROS HQ – F. 21
6- POROS HQ – F. 23
7- POROS HQ – F. 26
8- POROS HQ – F. 32
9- POROS HQ – F. 36
10- POROS HQ – F. 42
11- POROS HQ – F. 50
After the POROS HQ elution, 32-B protein is present throughout the run, indicating that the protein has some solubility problems. Fractions 20 to 40 were pooled and run on the Superdex 75.
Figure 3.10.c – Superdex 75
[Chromatogram and SDS-PAGE gel]
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa)
2- Superdex 75 – F. 11
After the Superdex 75 run, 32-B is pure.
Even though 32-B was pure after the size exclusion column, it was noticed
that the protein may have solubility issues, from the broad shape of the peak that
was obtained with the POROS HQ. A solubility screen was performed on pure
32-B obtained after the Superdex 75 run. That work is described in Section 3.5.6.
It was found that 32-B is more soluble at pH 7.5, and in Na Citrate. A new set of buffers was devised, shown in Table 3.8. The purification scheme was kept the
same, but the buffers were changed.
Table 3.8 – HPLC Buffers for T4 32-B Purification Scheme 2
Ion Exchange Hydrophobic
(Q Sepharose, POROS HQ) (POROS PE)
25 mM Tris HCl pH 7.5 25 mM Tris HCl pH 7.5 50 mM NaCl Buffers 25 mM Na Citrate pH 7.5 600 mM (NH4)2SO4 0-1 M NaCl 1 mM EDTA 2 mM β-mercaptoethanol
Buffer A: ~ 16 mS/cm Conductivity ~ 90 mS/cm Buffer B: ~ 96 mS/cm
Again, the chromatograms and SDS-PAGE gels from the purification
scheme 2 are shown below in Figure 3.11. It should be noted that 32-B did not
stick tightly to the POROS HQ resin with the new buffers. It would be eluted off
the column with buffer A and would mostly be found in the rinse fraction, and then in the early fractions of the salt gradient elution. The protein was pure
enough after the POROS HQ run and did not need to be run on the size
exclusion column. The final yield for this scheme was 60 mg of pure protein per liter of cells. This is a five-fold improvement, in terms of solubility, compared to the initial purification.
Figure 3.11 – T4 32-B Purification Scheme 2
Figure 3.11.a – Q Sepharose
Lanes: 1- Q Sepharose – F. 20; 2- Q Sepharose – F. 46; 3- Q Sepharose – F. 50; 4- Q Sepharose – F. 56; 5- Q Sepharose – F. 72; 6- Q Sepharose – Flow Through; 7- Molecular Weight Marker
Molecular weight marker bands: 66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa
On the chromatogram, the OD260 is shown in purple, the OD280 in green, the conductivity in red and the % B in black. The red stars indicate which fractions were run on a SDS-PAGE gel. 32-B was found in fractions 44 to 58, which were pooled to be run on the POROS PE.
Figure 3.11.b – POROS HQ
Lanes: 1- Molecular Weight Marker; 2- POROS PE – Load; 3- POROS PE – Flow Through; 4- POROS PE – Void Fraction; 5- POROS HQ – F. 4; 6- POROS HQ – F. 6; 7- POROS HQ – F. 13; 8- POROS HQ – F. 27; 9- POROS HQ – Flow Through
Molecular weight marker bands: 66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa
32-B was found in fractions 4 to 21, and was pure so these fractions were pooled and concentrated.
3.5.6. Solubility Screen
32-B obtained from the Superdex 75 run was precipitated by dialysis against distilled water. The precipitate was then aliquoted in Eppendorf tubes, and the different solutions from the solubility screen, each at 100 mM, were added to the tubes. Each mixture was then thoroughly mixed and left to incubate at room temperature for 20 minutes, then centrifuged. The absorbance at 595 nm of each supernatant with Bradford reagent was measured, and the values plotted in a graph, shown in Figure 3.12. 32-B was most soluble in HEPES pH 7.5 and Na Citrate. HEPES was however replaced by Tris HCl pH 7.5 in the HPLC buffers for cost reasons.
Figure 3.12 – T4 32-B Solubility Screen
[Bar chart of the absorbance at 595 nm (axis from 0 to 0.7) of each supernatant. Solutions tested: H2O, TAPS pH 8.5, HEPES pH 7.5, PIPES pH 6.5, MES pH 5.6, Na Citrate, Na Phosphate, Na Sulfate, Na Cacodylate, Na Acetate, Na Formate, CaCl2, MgCl2, LiCl, KCl, NaCl, NH4Cl]
3.5.7. Dialysis and Concentration
32-B after purification was dialyzed in the T4 RNase H dialysis buffer,
since the two proteins were studied as a complex. This buffer is composed of
25 mM bis-Tris HCl pH 6.5, 150 mM NH4Cl, 10 mM MgCl2 and 2 mM
β-mercaptoethanol. 32-B was then concentrated before being stored at -80 °C.
The maximum concentration that was reached for the protein was around
100 mg/mL.
3.5.8. Crystal Screening and Optimization
The crystal structure of the core domain of 32 protein was solved in 1995
by Shamoo and coworkers (Shamoo et al., 1995) and it is described in Section
1.1.2. On the other hand, the structure of the two missing domains is still
unknown, and the attempts to get diffraction quality crystals of the full length 32
protein were all unsuccessful. Since 32-B protein was available in large
quantities, it was screened for crystallization in an effort to obtain the missing
structure of the A domain.
32-B was screened against different commercial screens, using either
Greiner or Corning trays. The trays were poured and the drops set up using the
Honeybee crystallization robot. The screens that were set up are summarized in
Table 3.9.
Table 3.9 – T4 32-B Crystal Screens
Crystal Screen Concentration Temperature Drop (µL)
Crystal Screen I and II 10 mg/mL Room Temp. 0.5 + 0.5
Index 10 mg/mL Room Temp. 0.5 + 0.5
PEG Ion Screen 24 mg/mL Room Temp. 0.5 + 0.5
Natrix 24 mg/mL Room Temp. 0.5 + 0.5
Wizard I and II 24 mg/mL Room Temp. 0.5 + 0.5
Salt RX 24 mg/mL Room Temp. 0.5 + 0.5
Additive Screen 10 mg/mL 4 °C 0.5 + 0.5
PEG Ion Screen 10 mg/mL 4 °C 0.5 + 0.5
Natrix 10 mg/mL 4 °C 0.5 + 0.5
Wizard I and II 10 mg/mL 4 °C 0.5 + 0.5
A large number of hits were obtained, but only around 20 of them were
confirmed to be protein crystals. They were mostly micro-crystals.
A number of these micro-crystal conditions were expanded on, but most of these expansions did not yield any successful results. One condition from Crystal Screen II, however, gave reproducible crystals upon expansion. It is presented in Figure 3.13. The crystals were stained using the Izit Dye, and turned blue, indicating that they were protein crystals.
Figure 3.13 – T4 32-B Crystals after Screening
Crystal Screen II – condition 41
1.0 M Lithium Sulfate, 0.1 M Tris HCl pH 8.5, 0.01 M NiCl2 · 6 H2O
Since the crystallization condition 41 from Crystal Screen II was the best one obtained and could be reproduced, it was chosen for further optimization experiments. Several attempts were made to improve the crystal quality: the Lithium Sulfate concentration was increased up to 2 M, and Lithium Sulfate was replaced with Ammonium Sulfate, Ammonium or Sodium Nitrate, Ammonium or Sodium Formate, and Ammonium or Sodium Malonate. The 32-B concentration was also varied between 20 and 35 mg/mL.
The best crystals were grown in 2 M or higher Ammonium or Lithium Sulfate, 100 mM Tris HCl pH 8.5 and 10 mM NiCl2. The crystals grown in Ammonium Sulfate, however, had a tendency to melt faster than the ones grown in Lithium Sulfate. Some pictures of crystals obtained in these different conditions are shown in Figure 3.14.
Figure 3.14 – T4 32-B Crystals after Optimization
1 – 2.0 M Ammonium Sulfate, 100 mM Tris HCl pH 8.5, 10 mM NiCl2 · 6 H2O; 32-B, 22 mg/mL; Room Temperature; 2 µL + 2 µL drop
2 – 2.2 M Ammonium Sulfate, 100 mM Tris HCl pH 8.5, 10 mM NiCl2 · 6 H2O; 32-B, 22 mg/mL; Room Temperature; 2 µL + 2 µL drop
3 – 1.8 M Lithium Sulfate, 100 mM Tris HCl pH 8.5, 10 mM NiCl2 · 6 H2O; 32-B, 33 mg/mL; Room Temperature; 1 µL + 2 µL + 2 µL drop
4 – 1.8 M Lithium Sulfate, 100 mM Tris HCl pH 8.5, 10 mM NiCl2 · 6 H2O; 32-B, 33 mg/mL; Room Temperature; 2 µL + 2 µL drop
3.5.9. Data Collection
The crystals grown in Lithium Sulfate, shown in Figure 3.14 pictures 3 and 4, had sharper edges, which could indicate a better crystal packing. They were therefore chosen for X-Ray diffraction studies.
The crystals first had to be flash-frozen in liquid nitrogen. Lithium sulfate at 2 M or higher concentration is a cryoprotectant, and since the crystals were grown in 1.8 M Lithium Sulfate, the salt concentration just had to be increased to cryoprotect the crystals. A substitute mother liquor was made containing 2 M Lithium Sulfate, 100 mM Tris HCl pH 8.5 and 10 mM NiCl2, together with 25 mM bis-Tris HCl pH 6.5, 150 mM NH4Cl and 2 mM EDTA to account for the dialysis buffer the protein was in before being crystallized. The crystals were soaked quickly in the substitute mother liquor before being plunged into liquid nitrogen.
The different crystals were then screened for X-Ray diffraction on the Rigaku FR-E high brilliance X-Ray diffractometer in the Ohio Crystallography Consortium, located in the Instrumentation Center. Out of the dozen crystals that were prepared that way, most only diffracted to 7 Å resolution or worse. One crystal however, from the drop shown in Figure 3.14 picture 4, did diffract to 5 Å and the preliminary data could be indexed more easily than with the other crystals. A 5 Å resolution dataset is usually not good enough to solve a protein structure, as there is not enough data to get the phase information. The crystal was therefore taken to the Argonne National Lab Advanced Photon Source (APS) synchrotron in Chicago. Synchrotron beams are more intense than in-house beams and better resolution data can be obtained with the same crystal. A dataset was collected by Dr. B. Leif Hanson to 4 Å resolution, at 0.90 Å X-Ray wavelength and 300 mm crystal to detector distance. Some diffraction images are shown in Figure 3.15.
Figure 3.15 – 32-B Crystal X-Ray Diffraction Images
a – Image 1 (0 to 1°); b – Image 45 (44 to 45°); c – Image 90 (89 to 90°)
The crystals grown in the ammonium sulfate condition were also flash
frozen in the 2 M Lithium Sulfate substitute mother liquor, but they did not diffract once in the X-Ray beam.
3.5.10. Data Processing
The dataset was processed using HKL2000 (Minor et al., 2002).
The initial indexing and integration were done with the space group P2,
but the Molecular Replacement attempts with AMoRe and MolRep, using the
1GPC 32 core pdb (Shamoo et al., 1995) as a search model, were all
unsuccessful. The space group P21 was tried next, without any success either.
Other space groups (P1, P4, P222, P422, C222 and their Laue subgroups) were
also used, but phasing with Molecular Replacement was unsuccessful in all
cases. A summary of the data reduction and phasing for the different space
groups is presented in Table 3.10.
Table 3.10 – 32-B Data Processing Summary
                        P1                  P2                  P4                  P222                P422                C222
Resolution              30 to 4.5 Å         50 to 4.2 Å         30 to 4.5 Å         30 to 4.5 Å         20 to 4.5 Å         20 to 4.5 Å
Unit Cell Dimensions    81.15 Å, 90.004°    81.12 Å, 90°        81.17 Å, 90°        81.18 Å, 90°        81.18 Å, 90°        114.75 Å, 90°
                        159.35 Å, 90.009°   159.29 Å, 90.06°    159.37 Å, 90°       159.34 Å, 90°       159.34 Å, 90°       114.83 Å, 90°
                        81.15 Å, 90.042°    81.17 Å, 90°        81.17 Å, 90°        81.13 Å, 90°        81.13 Å, 90°        159.35 Å, 90°
Rmerge *                3.8%                6.1%                6.4%                6.3%                6.8%                5.4%
Mosaicity (°)           0.67                0.71                0.66                /                   /                   /
Reflections             15842 (26996)       14992 (55076)       6060 (35591)        6517 (35609)        3441 (35593)        6395 (35616)
                        ~ 0.5 / atom        ~ 1 / atom          ~ 1 / atom          ~ 1 / atom          ~ 1.5 / atom        ~ 1 / atom
Completeness            65.6%               99.1%               99.2%               99.0%               99.2%               99.3%
# Molecules per A.S.U.  12                  6                   3                   3                   1 or 2              3
Molecular Replacement   Rfactor ~ 63%       Rfactor ~ 60%       Rfactor ~ 63%       Rfactor ~ 63%       Rfactor ~ 60%       Rfactor ~ 63%
                        CC ~ 42%            CC ~ 35%            CC ~ 45%            CC ~ 45%            CC ~ 40%            CC ~ 45%
For all space groups, the Rmerge values are below 10%, which is good. However, the Molecular Replacement statistics all indicate that no solution was found, as the Rfactor values are above 50 % and the correlation coefficients CC below 50 %.
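* The Rmerge value is defined as

\[
R_{merge} = 100 \times \left[ \sum_{hkl} \sum_{i=1}^{n} \left| F^{2}_{hkl} - \langle F^{2}_{hkl} \rangle_{i} \right| \right] \Big/ \sum_{hkl} \sum_{i=1}^{n} F^{2}_{hkl} \qquad \text{(Equation 3.1)}
\]

where $F^{2}_{hkl}$ is the intensity of each hkl reflection and $\langle F^{2}_{hkl} \rangle_{i}$ is the mean value of the i measurements of n equivalent reflections.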
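As an illustration of how the Rmerge statistic is computed, the short Python sketch below groups repeated measurements of equivalent reflections and applies Equation 3.1, taking the absolute deviation of each measurement from the mean of its group. The input format (a list of hkl index and intensity pairs) and the intensities themselves are hypothetical and only serve as a toy example; this is not the HKL2000 implementation.

```python
from collections import defaultdict


def r_merge(observations):
    """Return R_merge in percent, following Equation 3.1.

    `observations` is an iterable of (hkl, intensity) pairs in which
    equivalent reflections share the same hkl key (hypothetical format).
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean = sum(intensities) / len(intensities)
        # Sum of absolute deviations from the mean of the equivalent set.
        numerator += sum(abs(value - mean) for value in intensities)
        denominator += sum(intensities)
    return 100.0 * numerator / denominator


# Made-up intensities for two reflections, each measured several times.
toy_data = [
    ((1, 0, 0), 100.0), ((1, 0, 0), 110.0), ((1, 0, 0), 90.0),
    ((0, 2, 1), 50.0), ((0, 2, 1), 54.0),
]
print(f"R_merge = {r_merge(toy_data):.1f} %")
```

The lower the value, the better the agreement between repeated measurements, which is why values below 10% are considered good in the table above.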
No solution could be found after Molecular Replacement, and the space group could not be determined. This is most likely due to the low resolution of the data, and the low number of unique reflections per atom. As a rule of thumb, four unique reflections per atom are needed to ensure the final model is not biased by
the way the data was processed. Here, a maximum of 1.5 unique reflections per
atom was obtained with the highest symmetry space group P422, which is not enough. A new dataset should be collected on a better crystal, with a high
enough resolution so that the data could be phased with Molecular Replacement.
Unfortunately, no better crystal could be obtained.
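The reflections per atom values quoted in Table 3.10 can be approximated with a few lines of Python. The sketch below is only a rough check: it assumes that the first reflection count listed for each space group is the number of unique reflections, and it estimates the number of non-hydrogen atoms from the 286 residues of 32-B using an assumed average of 8 atoms per residue. Neither assumption comes from the data processing itself.

```python
# Values read from Table 3.10: the first reflection count listed for each
# space group and the number of molecules per asymmetric unit (A.S.U.).
# For P422 the table gives "1 or 2" molecules; 1 is used here.
TABLE_3_10 = {
    "P1": (15842, 12),
    "P2": (14992, 6),
    "P4": (6060, 3),
    "P222": (6517, 3),
    "P422": (3441, 1),
    "C222": (6395, 3),
}

RESIDUES_PER_MOLECULE = 286   # length of 32-B (Table 3.12)
ATOMS_PER_RESIDUE = 8         # ASSUMPTION: rough average of non-hydrogen atoms
RULE_OF_THUMB = 4             # reflections per atom quoted in the text


def reflections_per_atom(n_reflections, n_molecules):
    """Ratio of reflections to estimated non-hydrogen atoms in the A.S.U."""
    n_atoms = n_molecules * RESIDUES_PER_MOLECULE * ATOMS_PER_RESIDUE
    return n_reflections / n_atoms


for space_group, (n_reflections, n_molecules) in TABLE_3_10.items():
    ratio = reflections_per_atom(n_reflections, n_molecules)
    verdict = "enough" if ratio >= RULE_OF_THUMB else "not enough"
    print(f"{space_group:>5}: {ratio:.1f} reflections / atom ({verdict})")
```

Under these assumptions every space group stays well below the rule of thumb of four reflections per atom, in line with the conclusion drawn above.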
3.5.11. Dynamic Light Scattering
Dynamic Light Scattering experiments were run on the 32-B protein, in order to have a better idea of its state in solution, which could provide some information on the poor quality of the 32-B crystals.
The protein was dialyzed in its dialysis buffer: 25 mM bis-Tris HCl pH 6.5,
150 mM NH4Cl, 10 mM MgCl2 and 2 mM β-mercaptoethanol. A protein sample at
1 mg/mL was prepared, filtered using the Millipore Ultrafree-MC filters with a 0.1
µm pore size and finally spun down at 18,000 rcf for 20 min in order to eliminate
the larger aggregates. The DLS experiment was run at 4 °C as well as 20 °C.
The results are presented below in Figure 3.16 and Table 3.11.
The 32-B sample showed some signs of light aggregation, which only accounts for 0.5 % or less of the total mass. The interesting result, however, is that 32-B appears to be a monomer (35 kDa) at 4 °C, but starts dimerizing at Room Temperature: the 46 kDa estimated molecular weight is probably due to a mixed population of monomers and dimers in solution. This result was consistently obtained when the experiment was repeated.
Figure 3.16 – 32-B Protein Dynamic Light Scattering Results
4°C 20°C
Table 3.11 – 32-B Protein Dynamic Light Scattering Results
Rh (nm) % Pd MW (kDa) % Intensity % Mass
4°C 2.7 15.7 35 90.1 99.6
20°C 3.1 16.9 46 94.7 100.0
The equilibrium between monomers and dimers seems to be more prominent at room temperature, where the 32-B crystals were grown. That mixed population was certainly one of the reasons for the low resolution obtained with the X-Ray diffraction patterns. The extra A domain must also be rather flexible, therefore inducing more disorder in the crystal lattice and accounting for some of the low quality of the diffraction.
3.5.12. Small Angle X-Ray Scattering
When X-Ray crystallography cannot be used to obtain the structure of a protein, as in the case of 32-B where the crystals did not yield good enough data, solution-based techniques can be used instead. One of these techniques is Small-angle X-Ray Scattering or SAXS, which provides molecular envelopes of proteins in solution.
SAXS data was collected at the Argonne Advanced Photon Source in
Chicago, on the ChemMat-CARS 15-ID beamline, at 1.50 Å wavelength. The
resolution range of the data collected was 312 to 7.5 Å, corresponding to a
momentum transfer q range of 0.02 to 0.84 Å⁻¹.
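\[
\text{Resolution (Å)} = \frac{2\pi}{q} \qquad \text{(Equation 3.2)}
\]

\[
\text{Momentum Transfer } q = \frac{4\pi \sin\theta}{\lambda} \qquad \text{(Equation 3.3)}
\]

(where θ is half the scattering angle and λ the X-Ray wavelength).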
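The conversion between momentum transfer and resolution can be checked numerically. The following Python sketch simply evaluates Equations 3.2 and 3.3; the function names are chosen here for illustration, and θ is taken as half the scattering angle.

```python
import math


def resolution_from_q(q):
    """Equation 3.2: resolution in Å for a momentum transfer q in Å^-1."""
    return 2.0 * math.pi / q


def q_from_theta(theta_deg, wavelength):
    """Equation 3.3: q in Å^-1, with theta equal to half the scattering
    angle (in degrees) and the wavelength in Å."""
    return 4.0 * math.pi * math.sin(math.radians(theta_deg)) / wavelength


for q in (0.02, 0.84):
    print(f"q = {q:.2f} Å^-1 -> resolution = {resolution_from_q(q):.1f} Å")
```

With the rounded q limits of 0.02 and 0.84 Å⁻¹, this gives about 314 Å and 7.5 Å, close to the 312 to 7.5 Å range quoted above; the small difference at low q presumably comes from the rounding of q.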
32-B was dialyzed in its dialysis buffer, made of 25 mM bis-Tris HCl
pH 6.5, 150 mM NH4Cl, 10 mM MgCl2 and 2 mM β-mercaptoethanol. A 100 µM
(around 3 mg/mL) sample was made and all the data collected at that concentration. The buffer alone had to be run first. A number of images were collected for each exposure time, which were 5 s, 20 s and 40 s. The protein sample was then run and several images collected for each exposure time. The images corresponding to the same exposure were averaged. Figure 3.17 shows the scattering curves that were collected, plotting the intensity I(q) versus the momentum transfer q. In curve c, the buffer and protein scattering signals are superimposed for the three exposure times. The buffer signal is then subtracted from the protein signal and the scattering curve from the protein alone is obtained. This is shown in curve d. Finally, the curve corresponding to the most intense signal and the smallest error on the measurement, namely the 40 s exposure one, was chosen for data analysis. It is shown in curve e. The program used was 15-ID SAXS/WAXS v.3.294, which was obtained at the beamline.
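The averaging and buffer subtraction steps described above can be summarized with a minimal Python sketch. It is not the 15-ID SAXS/WAXS program: the function names and the toy intensities are hypothetical, and the curves are assumed to be recorded on the same q grid with the same exposure time.

```python
def average_curves(curves):
    """Average several I(q) curves recorded on the same q grid."""
    n_curves = len(curves)
    return [sum(values) / n_curves for values in zip(*curves)]


def subtract_buffer(protein_frames, buffer_frames):
    """Average each set of frames, then subtract buffer from protein."""
    protein = average_curves(protein_frames)
    buffer_only = average_curves(buffer_frames)
    return [p - b for p, b in zip(protein, buffer_only)]


# Made-up intensities at three q values, two frames per sample.
protein_frames = [[10.0, 8.0, 5.0], [12.0, 6.0, 5.0]]
buffer_frames = [[3.0, 2.0, 1.0], [1.0, 2.0, 1.0]]
print(subtract_buffer(protein_frames, buffer_frames))
```

Averaging before subtraction reduces the noise of each curve, which is consistent with the longest exposure giving the smallest measurement error.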
Figure 3.17 – 32-B SAXS Data Collection
c – Buffer and 32-B scattering curves for the 5 s, 20 s and 40 s exposures
d – 32-B scattering curves after buffer subtraction for the 5 s, 20 s and 40 s exposures
e – 40 s exposure curve chosen for data analysis
The programs used for data processing and model building are all part of
the Svergun ATSAS suite of SAXS data processing programs (Petoukhov, 2007).
The experimental data from the 15-ID SAXS/WAXS program was fed into a data
reduction and regularization program such as Primus or GNOM that evaluates
the shape and size of the particle in solution. Primus only considers low q data,
while GNOM takes into account higher resolution data as well. These programs
use the momentum transfer plot to calculate a size distribution function p(r),
which is then used to calculate the extrapolated intensity at q = 0 : I(0), and the
radius of gyration of the particle in solution Rg. The respective equations for
these calculations are shown below.
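\[
p(r) = \frac{1}{2\pi^{2}} \int_{0}^{\infty} q\,r\,I(q)\,\sin(qr)\,dq \qquad \text{(Equation 3.4)}
\]

\[
I(0) = 4\pi \int_{0}^{D_{max}} p(r)\,dr \qquad \text{(Equation 3.5)}
\]

\[
R_{g}^{2} = \frac{\int r^{2}\,p(r)\,dr}{2 \int p(r)\,dr} \qquad \text{(Equation 3.6)}
\]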
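To make Equations 3.5 and 3.6 more concrete, the Python sketch below integrates a tabulated p(r) with the trapezoidal rule. It is not GNOM and does not use the 32-B data: the test case is the textbook p(r) of a uniform sphere of 80 Å diameter (a hypothetical particle), for which the radius of gyration is known to be √(3/5) times the sphere radius, about 31 Å.

```python
import math


def trapezoid(x, y):
    """Integrate y(x) with the trapezoidal rule."""
    return sum((x[k + 1] - x[k]) * (y[k + 1] + y[k]) / 2.0
               for k in range(len(x) - 1))


def forward_intensity(r, p):
    """Equation 3.5: I(0) = 4 pi * integral of p(r) dr from 0 to Dmax."""
    return 4.0 * math.pi * trapezoid(r, p)


def radius_of_gyration(r, p):
    """Equation 3.6: Rg^2 = integral(r^2 p) / (2 * integral(p))."""
    r2p = [r_k * r_k * p_k for r_k, p_k in zip(r, p)]
    return math.sqrt(trapezoid(r, r2p) / (2.0 * trapezoid(r, p)))


def sphere_p_of_r(r, radius):
    """Unnormalized p(r) of a uniform sphere (textbook test case, not 32-B)."""
    x = r / (2.0 * radius)
    return r * r * (1.0 - 1.5 * x + 0.5 * x ** 3)


D_MAX = 80.0                                  # Å, hypothetical sphere diameter
r_grid = [D_MAX * k / 800 for k in range(801)]
p_grid = [sphere_p_of_r(r, D_MAX / 2.0) for r in r_grid]
print(f"Rg = {radius_of_gyration(r_grid, p_grid):.2f} Å")
```

For an elongated particle of the same maximum dimension, p(r) is skewed toward short distances and tails off at large r instead of being nearly symmetric as for the sphere.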
GNOM was used to process the 32-B data. The first 5 and last 250 or 300 data points were removed, depending on the highest resolution needed in the programs used next. The maximum dimension (Dmax) of the particle was estimated to be close to 80 Å and input in the program, where it was used to calculate the model data which were then superimposed onto the experimental data. That is shown in Figure 3.18, plot c. The size distribution function p(r) is plotted in Figure 3.18.d. The final radius of gyration for 32-B in solution was calculated to be 25.91 ± 0.05 Å. With the lower resolution data (172 to 45 Å), it was calculated to be 26.08 ± 0.06 Å, which is very close to the higher resolution (172 to 24 Å) value. These values are also consistent with the hydrodynamic radius obtained with the Dynamic Light Scattering experiments, which was around 30 Å. The shape of the p(r) curve provides some indication as to what the shape of the particle is. In the case of 32-B, the plot tails off at high radius, indicating that the protein is elongated.
Figure 3.18 – 32-B GNOM Plots
Plots are shown for the 172 to 24 Å and the 172 to 45 Å resolution ranges.
c – experimental data and model data
d – size distribution function p(r)
The output file from GNOM is then used in Ab Initio modeling programs like DAMMIN or GASBOR. These programs model a 3D molecular envelope of the protein. Both programs output a pdb file containing dummy atoms packed in the shape of the molecular envelope. While DAMMIN packs dummy atoms within the envelope, GASBOR packs them in a chain-compatible model. Other differences include : GASBOR uses the entire q range of the data (and therefore the higher resolution output from GNOM), and outputs the most probable model only, while DAMMIN, which uses only low q data (lower resolution output from
GNOM) can be run in several modes. The best mode with which to run DAMMIN is called “keep mode”, where all possible models are output. They can then be averaged using the program DAMAVER. Both DAMMIN and GASBOR were used to determine what the shape of 32-B is. The ribbon structure of the 32 core protein was then modeled manually in the different envelopes. Figure 3.19 shows the envelopes obtained from both programs. The dimensions of the envelopes obtained in both cases allow the modeling of two chains of 32 core. This is consistent with the dynamic light scattering (see Section 3.5.11) and analytical ultra-centrifugation (Dwlgosh, 2008) results that showed that 32-B is a dimer in solution. DAMMIN was used with both the high resolution and low resolution data. The higher resolution model has a more defined shape, but in both cases the two chains were modeled in a back to back fashion. In the case of GASBOR, the envelope is a little smaller and the two chains had to be modeled in an interlocked manner to fit the model. The DAMMIN model, where the two proteins interact through hydrophobic regions, is more likely than the interlocked model.
Moreover, the high resolution model from DAMMIN is the only one that leaves enough space for the missing C-terminal 62 residues, and also has a lower χ2 value.
Figure 3.19 – 32-B 3D Molecular Envelope
Figure 3.19.a – DAMMIN Models
Low resolution model (172 to 45 Å): average of ten models, χ2 = 2.1, dimensions: 90 Å × 60 Å × 50 Å
High resolution model (172 to 24 Å): average of five models, χ2 = 1.8, dimensions: 85 Å × 80 Å × 50 Å
The C-termini are labeled on each model.
Figure 3.19.b – GASBOR Model (172 to 24 Å resolution)
C-termini
χ2 = 2.1
Dimensions: 85 Å × 55 Å × 35 Å
The A domain, present in the 32-B protein but missing from the 32 core crystal structure, could also be modeled using the CREDO program package.
Among the programs contained in the package, CREDO and CHADD are the most widely used. CREDO creates a chain of dummy atoms corresponding to the missing fragment, while CHADD attaches that chain to the known terminal residue of the atomic structure. In addition, the dummy atoms output by CHADD are separated by 3.8 Å, corresponding to the average distance between the Cα atoms of two adjacent residues in a polypeptide chain.
The program CHADD was used to model the A domain of the 32-B protein. The model is shown in Figure 3.20. However, the χ2 value on that model is higher
than with the Ab Initio models from DAMMIN and GASBOR.
Figure 3.20 – Modeling of the A Domain of 32 Protein (Chadd)
C-terminus
χ2 = 3.3
The surface model output by CHADD is shown in transparent pink, the white spheres are water molecules. The X-Ray structure of 32 core was superposed onto the CHADD output surface. The missing C-terminal or A domain is shown on the left, linked to the 32 core C-terminal residue.
The Small-Angle X-Ray Scattering results for the 32-B protein provided
some more insight concerning the structure of that protein in solution. It was confirmed that the protein exists in the dimeric form in solution, which may only be an artifact of the truncation and is not necessarily physiologically relevant. A model of the missing A domain was calculated, as well as several models of the
overall 32-B dimer. The programs used for the data processing and modeling
however do not provide the user with a definite answer, and interpretation of the models is an important part of the SAXS data analysis, which is why several possibilities were presented. Only X-Ray crystallography can yield a definite structure for the 32-B protein. However, growing high resolution crystals of 32-B proved to be difficult, and the same problems were encountered by several groups while trying to grow crystals of the full length 32 protein. The extra
C-terminal A domain and N-terminal B domain are probably very flexible, which is
why only the 32 core could be crystallized into high enough resolution crystals.
This domain flexibility could also explain the high χ2 value on the modeling of the
A domain from the 32-B SAXS data.
3.6. Bacteriophage T4 32-B Mutants
3.6.1. Introduction
As described in Sections 5.3.2 and 5.3.8, three 32-B mutants were designed in order to further probe the interaction between 32-B and RNase H.
The mutants were as follows: W144E, I151D and I60D where a Tryptophan and
two Isoleucine residues at the interface of the interaction with RNase H were
respectively mutated into a Glutamate and two Aspartate residues. Unlike the
I151D and I60D mutants, the W144E mutant could not be cloned successfully.
The protein characteristics for the native 32-B protein in comparison to the two
successful mutants were calculated using ExPASy (Gill and von Hippel, 1989;
Gasteiger et al., 2003).
Table 3.12 – 32 Protein and Truncations Characteristics
32-B I151D 32-B I60D 32-B
Amino-acids 286 286 286
31.8 kDa 31.8 kDa Molecular Weight 31.8 kDa (36.3 kDa with His-Tag) (36.3 kDa with His-Tag)
pI 4.65 4.61 4.61
ε 1.24 1.25 1.25
3.6.2. Molecular Cloning
The forward and reverse primers for the site-directed mutagenesis PCR reaction were designed according to the QuikChange® manual (see Section
2.1.3), and are presented in Figure 3.21.
Figure 3.21 – T4 32-B Mutants Site-Directed Mutagenesis Primers
Figure 3.21.a – T4 W144E 32-B Primers
Forward Primer
32-b    5’– TAC CGC TTT GGT AAG AAA ATC TGG GAT AAA ATC AAT GCA ATG –3’
             Y   R   F   G   K   K   I   W   D   K   I   N   A   M
Primer  5’–     CGC TTT GGT AAG AAA ATC GAA GAT AAA ATC AAT GC –3’
                 R   F   G   K   K   I   E   D   K   I   N
5’– CGC TTT GGT AAG AAA ATC GAA GAT AAA ATC AAT GC –3’
Reverse Primer
32-b    3’– ATG GCG AAA CCA TTC TTT TAG ACC CTA TTT TAG TTA CGT TAC –5’
Primer  3’–     GCG AAA CCA TTC TTT TAG CTT CTA TTT TAG TTA CG –5’
5’– GC ATT GAT TTT ATC TTC GAT TTT CTT ACC AAA GCG –3’
35 bp total, Tm = 75.0 °C
Figure 3.21.b – T4 I151D 32-B Primers
Forward Primer
32-b    5’– AAA ATC AAT GCA ATG ATT GCG GTT GAT GTT GAA ATG –3’
             K   I   N   A   M   I   A   V   D   V   E   M
Primer  5’–     ATC AAT GCA ATG GAT GCG GTT GAT GTT G –3’
                 I   N   A   M   D   A   V   D   V
5’– ATC AAT GCA ATG GAT GCG GTT GAT GTT G –3’
Reverse Primer
32-b    3’– TTT TAG TTA CGT TAC TAA CGC CAA CTA CAA CTT TAC –5’
Primer  3’–     TAG TTA CGT TAC CTA CGC CAA CTA CAA C –5’
5’– C AAC ATC AAC CGC ATC CAT TGC ATT GAT –3’
28 bp total, Tm = 73.4 °C
Figure 3.21.c – T4 I60D 32-B Primers
Forward Primer
32-b    5’– CAA GCA CCA TTC GCA ATT CTT GTA AAT CAC GGT TTC –3’
             Q   A   P   F   A   I   L   V   N   H   G   F
Primer  5’–     GCA CCA TTC GCA GAT CTT GTA AAT CAC GG –3’
                 A   P   F   A   D   L   V   N   H
5’– GCA CCA TTC GCA GAT CTT GTA AAT CAC GG –3’
Reverse Primer
32-b    3’– GTT CGT GGT AAG CGT TAA GAA CAT TTA GTG CCA AAG –5’
Primer  3’–     CGT GGT AAG CGT CTA GAA CAT TTA GTG CC –5’
5’– CC GTG ATT TAC AAG ATC TGC GAA TGG TGC –3’
29 bp total, Tm = 76.5 °C
The forward primers for the site-directed mutagenesis PCR reactions are shown first. They were aligned with the nucleotide sequence of T4 32-B. The mutated nucleotide is shown in red. Highlighted in yellow is the primer that anneals with the original 32-b gene. The reverse primers are shown second, highlighted in the same manner.
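The primer pairs of Figure 3.21 can be checked with a short script. The Python sketch below is only an illustrative consistency check, not part of the QuikChange protocol: it verifies that each reverse primer is the reverse complement of its forward primer and counts the nucleotides that differ from the wild-type 32-b sequence. The dictionary layout and function names are chosen here for illustration; the sequences are copied from the figure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Wild-type segment aligned with the primer, forward and reverse primers
# (all written 5' to 3'), copied from Figure 3.21.
PRIMERS = {
    "W144E": {
        "wild_type": "CGC TTT GGT AAG AAA ATC TGG GAT AAA ATC AAT GC",
        "forward": "CGC TTT GGT AAG AAA ATC GAA GAT AAA ATC AAT GC",
        "reverse": "GC ATT GAT TTT ATC TTC GAT TTT CTT ACC AAA GCG",
    },
    "I151D": {
        "wild_type": "ATC AAT GCA ATG ATT GCG GTT GAT GTT G",
        "forward": "ATC AAT GCA ATG GAT GCG GTT GAT GTT G",
        "reverse": "C AAC ATC AAC CGC ATC CAT TGC ATT GAT",
    },
    "I60D": {
        "wild_type": "GCA CCA TTC GCA ATT CTT GTA AAT CAC GG",
        "forward": "GCA CCA TTC GCA GAT CTT GTA AAT CAC GG",
        "reverse": "CC GTG ATT TAC AAG ATC TGC GAA TGG TGC",
    },
}


def clean(sequence):
    """Remove the spaces used to separate codons in the figure."""
    return sequence.replace(" ", "").upper()


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return clean(sequence).translate(COMPLEMENT)[::-1]


def mismatch_positions(template, primer):
    """0-based positions where the primer differs from the template."""
    pairs = zip(clean(template), clean(primer))
    return [k for k, (a, b) in enumerate(pairs) if a != b]


for name, seqs in PRIMERS.items():
    pair_ok = reverse_complement(seqs["forward"]) == clean(seqs["reverse"])
    changed = mismatch_positions(seqs["wild_type"], seqs["forward"])
    print(f"{name}: {len(clean(seqs['forward']))} nt, "
          f"{len(changed)} mutated nucleotides, reverse complement OK: {pair_ok}")
```

The same check can be run on any other mutagenic primer pair before ordering, since a single typing error in either primer would show up as a failed reverse complement test.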
Initial attempts at the site-directed mutagenesis PCR reaction for all three mutants were made, using the pEKF2 plasmid as a template, but were all unsuccessful due to the large size of that template plasmid. Optimization of the reaction was also attempted, by using polymerases with a higher processivity such as the Pfu Ultra Polymerase or the KOD polymerase. Increasing the dNTP concentration, the MgCl2 concentration or the extension time, and changing the annealing temperature, were also tried, but without success. Finally, the 32-b gene was recloned in a different expression vector, the pDEST-C1 vector, as described in Section 3.5.2, and the new plasmid was then used as the template for the site-directed mutagenesis PCR reactions.
Out of the three mutants, only two could be cloned, the I151D and I60D
32-B mutants. All the PCR reactions for the third W144E 32-B mutant were unsuccessful, despite the attempts at optimizing the PCR reaction: different annealing temperatures were probed, as well as different concentrations of primers and template. Below is the description of the cloning procedures for the
I151D and I60D mutants. Table 3.13 shows the parameters for the two successful PCR reactions.
Table 3.13 – Site-Directed Mutagenesis PCR Reactions for the 32-B Mutants
Table 3.13.a – I151D 32-B Reaction
PCR Reaction PCR Program
KOD buffer (10X) 5 µL Activation 95 °C, 2 minutes
Forward primer (2.5 µM) 6 µL Denaturation 95 °C, 20 seconds
Reverse primer (2.5 µM) 6 µL Annealing 55 °C, 10 seconds
dNTPs (2 mM) 5 µL Extension 70 °C, 2 minutes
MgSO4 (25 mM) 5 µL 20 cycles
KOD Polymerase (1 U/µL) 1 µL Final extension 70 °C, 20 minutes
Template 1 µL
Autoclaved water 21 µL
Table 3.13.b – I60D 32-B Reaction
PCR Reaction PCR Program
KOD buffer (10X) 5 µL Activation 95 °C, 2 minutes
Forward primer (2.5 µM) 6 µL Denaturation 95 °C, 20 seconds
Reverse primer (2.5 µM) 6 µL Annealing 60 °C, 10 seconds
dNTPs (2 mM) 5 µL Extension 70 °C, 2 minutes
MgSO4 (25 mM) 5 µL 20 cycles
KOD Polymerase (1 U/µL) 1 µL Final extension 70 °C, 20 minutes
Template 1 µL
Autoclaved water 21 µL
Upon reception, the primers were first dissolved and diluted to 250 µM with 1X TE buffer, then a 2.5 µM stock was made for each primer to be used in the PCR reaction.
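The final concentrations in the reaction mix follow from the stock concentrations and volumes listed in Table 3.13, using C1 × V1 = C2 × V2. The Python sketch below is a simple worked calculation; the data structure and function name are chosen here for illustration, and the template concentration is not considered since it is not given in the table.

```python
# (component, stock concentration, unit, volume in µL) from Table 3.13
COMPONENTS = [
    ("KOD buffer", 10.0, "X", 5.0),
    ("Forward primer", 2.5, "µM", 6.0),
    ("Reverse primer", 2.5, "µM", 6.0),
    ("dNTPs", 2.0, "mM", 5.0),
    ("MgSO4", 25.0, "mM", 5.0),
    ("KOD Polymerase", 1.0, "U/µL", 1.0),
]
OTHER_VOLUMES_UL = [1.0, 21.0]   # template and autoclaved water

TOTAL_UL = sum(component[3] for component in COMPONENTS) + sum(OTHER_VOLUMES_UL)


def final_concentration(stock, volume_ul, total_ul=TOTAL_UL):
    """C1 * V1 = C2 * V2, solved for the final concentration C2."""
    return stock * volume_ul / total_ul


print(f"Total reaction volume: {TOTAL_UL:.0f} µL")
for name, stock, unit, volume_ul in COMPONENTS:
    print(f"{name}: {final_concentration(stock, volume_ul):.2f} {unit}")
```

The same relation gives the 1:100 dilution needed to go from the 250 µM primer solution to the 2.5 µM working stock.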
When the reactions were finished, the template was digested by adding 1 µL of the DpnI restriction enzyme at 20 U/µL, leaving only the mutated plasmids in solution, which were then run on 1% agarose gels, shown in Figure 3.22.a. At that point, the plasmids are still double-nicked, and they run higher than their actual size when compared to the supercoiled DNA ladder. These plasmids were then transformed into competent DH5α cells (2 µL of plasmid / 25 µL of cells), and 50 µL and 100 µL of each transformation were plated on LB + Streptomycin plates. Colonies were picked and grown in LB + Streptomycin media, and the plasmids isolated to be run on 1% agarose gels. These are shown in Figure 3.22.b. The I151D 32-B plasmid from colony 3 and the I60D 32-B plasmid from colony 4 were chosen for expression studies.
Figure 3.22 – Agarose Gels for the T4 32-B Mutants Cloning
Figure 3.22.a – T4 32-B Mutants PCR Reactions
Lanes: 1- Supercoiled DNA ladder; 2- T4 I151D 32-b in pDEST-C1 – Mutagenesis PCR product; 3- Supercoiled DNA ladder; 4- T4 I60D 32-b in pDEST-C1 – Mutagenesis PCR product; 5- 1 kb DNA ladder
Ladder bands labeled: 5 kb and 2 kb
Figure 3.22.b – T4 32-B Mutants (pDEST-C1) Miniprep
Lanes: 1- T4 I151D 32-b in pDEST-C1 – PCR product; 2- T4 I151D 32-b in pDEST-C1 – colony 1; 3- T4 I151D 32-b in pDEST-C1 – colony 2; 4- T4 I151D 32-b in pDEST-C1 – colony 3; 5- T4 I151D 32-b in pDEST-C1 – colony 4; 6- T4 I151D 32-b in pDEST-C1 – colony 5; 7- T4 I151D 32-b in pDEST-C1 – colony 6; 8- Supercoiled DNA ladder
Ladder bands labeled: 5 kb and 2 kb
32-b in pDEST-C1: 4617 bp. All plasmids have the correct size except the one isolated from colony 3.
Lanes: 1- T4 I60D 32-b in pDEST-C1 – colony 1; 2- T4 I60D 32-b in pDEST-C1 – colony 2; 3- T4 I60D 32-b in pDEST-C1 – colony 3; 4- T4 I60D 32-b in pDEST-C1 – colony 4; 5- T4 I60D 32-b in pDEST-C1 – colony 5; 6- T4 I60D 32-b in pDEST-C1 – colony 6; 7- Supercoiled DNA ladder
Ladder bands labeled: 5 kb and 2 kb
32-b in pDEST-C1: 4617 bp. All plasmids have the correct size.
3.6.3. Protein Expression and Solubility
The two plasmids described above were transformed into competent T7 express cells (0.8 µL / 50 µL of cells). After transformation, the cells were first plated on LB + Streptomycin + Tetracycline plates (50 µL and 100 µL of cells per plate, for each mutant). One colony was picked from each plate and grown overnight in LB + Streptomycin + Tetracycline media. That culture was then used to inoculate media and the cells were grown until OD600 = 0.6, when glycerol stocks were taken and protein expression was induced by adding 1 mM IPTG. 0 h and 3 h expression samples were taken and run on a SDS-PAGE gel. The results from protein expression are presented in Figure 3.23.a. Both the I151D and the I60D 32-B mutants were overexpressed.
The cells were then lysed, according to the lysis protocol described in Section 2.3, to check for protein solubility. The lysis buffer that was used was composed of 40 mM Tris-HCl pH 8.0, 100 mM NaCl, 10 mM MgCl2, 2 mM CaCl2 and 1 mM EDTA. Large amounts of protein can be found in both the pellet and the supernatant after cell lysis, as can be seen in Figure 3.23.b. However, this is not necessarily an indication of a solubility problem, since the amounts of 32-B mutants expressed were so large that not enough lysis buffer might have been used to extract all the protein from the cells.
Figure 3.23 – T4 32-B Mutants Expression and Solubility
Figure 3.23.a – T4 32-B Mutants Protein Expression
First gel, lanes: 1- T4 I151D 32-B expression (plasmid from colony 1) – 0h sample; 2- T4 I151D 32-B expression (plasmid from colony 1) – 3h sample; 3- T4 I151D 32-B expression (plasmid from colony 2) – 0h sample; 4- T4 I151D 32-B expression (plasmid from colony 2) – 3h sample; 5- Molecular Weight Marker
Molecular weight marker bands: 66.3, 55.4, 36.5, 31.0 and 21.5 kDa
Second gel, lanes: 1- T4 I60D 32-B expression (plasmid from colony 1) – 0h sample; 2- T4 I60D 32-B expression (plasmid from colony 1) – 3h sample; 3- T4 I60D 32-B expression (plasmid from colony 2) – 0h sample; 4- T4 I60D 32-B expression (plasmid from colony 2) – 3h sample; 5- Molecular Weight Marker
Molecular weight marker bands: 66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa
Figure 3.23.b – T4 32-B Mutants Cell Lysis
First gel, lanes: 1- T4 I151D 32-B cell lysis – pellet; 2- T4 I151D 32-B cell lysis – supernatant; 3- Molecular Weight Marker
Molecular weight marker bands: 66.3, 55.4, 36.5, 31.0 and 21.5 kDa
Second gel, lanes: 1- T4 I60D 32-B cell lysis – pellet; 2- T4 I60D 32-B cell lysis – supernatant; 3- Molecular Weight Marker
Molecular weight marker bands: 66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa
As was the case for T4 32-B expressed in pDEST-C1, the 32-B mutants are found in both the pellet and supernatant after cell lysis. A large enough portion of the protein is soluble to carry on with purification.
Once it was known that both mutants could be expressed in a soluble
manner, the two plasmids were sent for sequencing at the Plant-Microbe
Genomics Facility at Ohio State University. Both sequences were correct, as is
shown in Appendix 3, and the mutations were confirmed.
3.6.4. Protein Purification
The protocol that was designed for the 32-B protein after a solubility
screen was applied to the two 32-B mutants. Indeed, that purification scheme
yielded large amounts of pure protein. The different sets of buffers are detailed in
Table 3.14.
Table 3.14 – Lysis and HPLC Buffers for the T4 32-B Mutants Purification
                Lysis                     Ion Exchange                  Hydrophobic
                                          (Q Sepharose, POROS HQ)       (POROS PE)
Buffers         40 mM Tris HCl pH 8.0     25 mM Tris HCl pH 7.5         25 mM Tris HCl pH 7.5
                100 mM NaCl               25 mM Na Citrate pH 7.5       50 mM NaCl
                10 mM MgCl2               0 - 1 M NaCl                  2% glycerol
                2 mM CaCl2                                              600 mM (NH4)2SO4
                1 mM EDTA
Conductivity    ~ 18 mS/cm                Buffer A: ~ 16 mS/cm          ~ 110 mS/cm
                                          Buffer B: ~ 96 mS/cm
After cell lysis, the supernatant was first loaded on the low-resolution
anion-exchange Q Sepharose. The elution from the Q Sepharose was then
loaded on the hydrophobic POROS PE to get rid of any endogenous nuclease
that might be present in solution. The conductivity of the protein sample had to
be increased first by addition of 3 M Ammonium Sulfate. The protein was found
in the void fraction as expected after the POROS PE run. To decrease the
conductivity of the sample and match it back to that of Q buffer A, the protein was
dialyzed overnight in Q buffer A. It was then further purified with the
high-resolution anion-exchange column POROS HQ.
Examples of the chromatograms and corresponding SDS-PAGE gels for the I60D 32-B mutant purification are shown in Figure 3.24.
Figure 3.24 – T4 I60D 32-B Purification
Figure 3.24.a – Q Sepharose
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5, 14.4 kDa)
2- Q Sepharose – Load
3- Q Sepharose – F. 18
4- Q Sepharose – F. 33
5- Q Sepharose – F. 40
6- Q Sepharose – F. 45
7- Q Sepharose – F. 49
8- Q Sepharose – F. 58
9- Q Sepharose – Flow Through 1
10- Q Sepharose – Flow Through 2
11- Molecular Weight Marker
12- POROS PE – Void Fraction
13- POROS PE – Flow Through
On the chromatogram, the OD260 is shown in purple, the OD280 in green, the conductivity in red and the % B in black. The red stars indicate which fractions were run on an SDS-PAGE gel. Fractions 37 to 52 were pooled and run on the POROS PE column, while fractions 53 to 66 would have to be run on the Q Sepharose column again; they were pooled and kept at -80 °C until further purification. The POROS PE void fraction, containing I151D 32-B, and the flow through are shown in lanes 12 and 13.
Figure 3.24.b – POROS HQ
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5, 14.4 kDa)
2- POROS HQ – Load
3- POROS HQ – F. 12
4- POROS HQ – F. 17
5- POROS HQ – F. 20
6- POROS HQ – F. 25
7- POROS HQ – F. 31
8- POROS HQ – F. 42
9- POROS HQ – F. 52
10- POROS HQ – Rinse Fraction
11- POROS HQ – Flow Through
12- Molecular Weight Marker
I60D 32-B was found in the rinse fraction as well as in fractions 1 to 17; these were pooled and concentrated. More protein was found in fractions 18 to 27, which were added to the extra fractions from the Q Sepharose column and kept at -80 °C.
After the POROS HQ run, both mutants were pure enough and were concentrated. The final yields for the two mutants with the His-Tag still attached are as follows: 120 mg of I151D 32-B from a 6 L culture, and 120 mg of I60D 32-B from a 3 L culture. Some of each protein was flash frozen on dry ice in the presence of 25 % glycerol and kept at -80 °C, and the remainder was used in the TEV Protease reaction, in order to remove the His-Tag.
3.6.5. Cleaving of the His-Tag
The N-terminal His-Tag present in both mutants needs to be removed as it might interfere while studying the interaction with RNase H. The TEV protease is
a cysteine protease that specifically recognizes the amino-acid sequence
ENLYFQG and cleaves between the Glutamine and Glycine residues. The full
sequence of 32-B with the His-Tag and linker is shown below in Figure 3.25.
Figure 3.25 – TEV Protease Cleavage Site
1 MAHHHHHHVG TGSNDDDDKS TSLYKKAGSA AAPFTENLYF Q*GLNGNKGFS SEDKGEWKLK
61 LDNAGNGQAV IRFLPSKNDE QAPFAILVNH GFKKNGKWYI ETCSSTHGDY DSCPVCQYIS
121 KNDLYNTDNK EYSLVKRKTS YWANILVVKD PAAPENEGKV FKYRFGKKIW DKINAMIAVD
181 VEMGETPVDV TCPWEGANFV LKVKQVSGFS NYDESKFLNQ SAIPNIDDES FQKELFEQMV
241 DLSEMTSKDK FKSFEELNTK FGQVMGTAVM GGAAATAAKK ADKVADDLDA FNVDDFNTKT
301 EDDFMSSSSG SSSSADDTDL DDLLNDL
The 32-B protein sequence is highlighted in yellow. The starting Methionine is in green, the His-Tag is highlighted in pink and the TEV protease cleavage site in blue. The cut made by the protease is represented with a red star. The two isoleucines that were mutated are shown in red.
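The cleavage described above can be checked directly on the sequence in Figure 3.25. The following Python sketch is illustrative only: the function and variable names are hypothetical, and the sequence is transcribed from the figure with the spacing, residue numbers and cleavage star removed.

```python
# His-tagged 32-B sequence transcribed from Figure 3.25
# (block spacing, residue numbers and the cleavage star removed).
TAGGED_32B = (
    "MAHHHHHHVGTGSNDDDDKSTSLYKKAGSAAAPFTENLYFQGLNGNKGFSSEDKGEWKLK"
    "LDNAGNGQAVIRFLPSKNDEQAPFAILVNHGFKKNGKWYIETCSSTHGDYDSCPVCQYIS"
    "KNDLYNTDNKEYSLVKRKTSYWANILVVKDPAAPENEGKVFKYRFGKKIWDKINAMIAVD"
    "VEMGETPVDVTCPWEGANFVLKVKQVSGFSNYDESKFLNQSAIPNIDDESFQKELFEQMV"
    "DLSEMTSKDKFKSFEELNTKFGQVMGTAVMGGAAATAAKKADKVADDLDAFNVDDFNTKT"
    "EDDFMSSSSGSSSSADDTDLDDLLNDL"
)

TEV_SITE = "ENLYFQG"  # recognition sequence; the cut falls between Q and G


def tev_cleave(sequence, site=TEV_SITE):
    """Return (released N-terminal fragment, product) after one Q|G cut."""
    start = sequence.find(site)
    if start < 0:
        raise ValueError("no TEV recognition site found")
    cut = start + len(site) - 1  # index of the G that stays on the product
    return sequence[:cut], sequence[cut:]


tag, product = tev_cleave(TAGGED_32B)
print(len(tag), len(product), product[:10])
```

The released fragment ends with ENLYFQ and the product starts with the residual glycine, which is the N-terminus of the sequence shown in Figure 3.27.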
Once the TEV protease has cleaved the His-Tag, a residual Glycine residue is left at the N-terminus of the protein.
I151D and I60D 32-B were mixed with the TEV protease, in a 1:20 and
1:50 mass ratio respectively. The reaction setups are presented in Table 3.15.
β-Mercaptoethanol was added to initiate the reaction, since the TEV protease is a cysteine protease.
Table 3.15 – TEV Protease Reaction Setup
Table 3.15.a – I151D 32-B Reaction
TEV Protease Reaction
Component Concentration Volume Amount
I151D 32-B 20.9 mg/mL 1 mL ~20 mg
TEV protease 1.55 mg/mL 700 µL ~1 mg
β-mercaptoethanol 14.3 M 1 µL final concentration ~5 mM
Table 3.15.b – I60D 32-B Reaction
TEV Protease Reaction
Component Concentration Volume Amount
I60D 32-B 5 mg/mL 10 mL ~50 mg
TEV protease 1.55 mg/mL 700 µL ~1 mg
β-mercaptoethanol 14.3 M 2 µL final concentration 2.5 mM
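The protease-to-substrate mass ratios quoted above follow from the concentrations and volumes in Table 3.15. A minimal Python sketch of that arithmetic is given here; the helper name is hypothetical and the values are copied from the table.

```python
def mass_mg(conc_mg_per_ml, volume_ml):
    """Mass in mg from a concentration (mg/mL) and a volume (mL)."""
    return conc_mg_per_ml * volume_ml


tev_mg = mass_mg(1.55, 0.700)  # 700 µL of the TEV protease stock

substrates_mg = {
    "I151D 32-B": mass_mg(20.9, 1.0),
    "I60D 32-B": mass_mg(5.0, 10.0),
}

# Substrate mass per unit mass of protease
ratios = {name: mg / tev_mg for name, mg in substrates_mg.items()}

for name, ratio in ratios.items():
    print(f"{name}: 1:{ratio:.0f} (protease:substrate, by mass)")
```

With the unrounded protease mass (about 1.09 mg) the ratios come out near 1:19 and 1:46; with the rounded ~1 mg used in the table they correspond to the quoted 1:20 and 1:50.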
In the case of I151D 32-B, the protein was concentrated at around 20 mg/mL, which is apparently too high, as some precipitation occurred when the
TEV protease was added. The protein concentration was lowered to 5 mg/mL when the reaction was done with I60D 32-B. The protease reactions were left at room temperature overnight. Some light precipitate was present at the end of the reaction in both cases. The SDS-PAGE gels run for the two respective experiments are presented in Figure 3.26.
After the protease reaction, each mixture was run on the cobalt affinity Talon column. The TEV protease has a His-Tag, so it binds to the column along with the cleaved His-Tag, while the cleaved 32-B mutant protein flows through the column. The TEV protease and the His-Tag were eluted off the column with either a 0 to 250 mM imidazole gradient or a direct 250 mM imidazole elution. The SDS-PAGE gels for the purification of both 32-B mutants are also shown in Figure 3.26.
After purification over the Talon column, the cleaved proteins still showed some higher molecular weight impurities, some of them having twice the size of 32-B. This indicates that some cross-linking might be taking place, which is plausible since there are four cysteine residues in 32-B. They are highlighted in yellow in the amino-acid sequence below, in Figure 3.27.
Figure 3.26 – 32-B Mutants TEV Protease Reactions
Figure 3.26.a – I151D 32-B TEV Protease Reaction
1- I151D 32-B (+ His-Tag)
2- TEV Protease
3- I151D 32-B + TEV Protease – 0 h pellet
4- I151D 32-B + TEV Protease – 0 h sup.
5- I151D 32-B + TEV Protease – 16 h pellet
6- I151D 32-B + TEV Protease – 16 h sup.
7- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 kDa)
8- Talon – Load
9- Talon – F. 7
10- Talon – F. 19
11- Talon – F. 26
12- Talon – Flow Through
13- Talon – Wash Fraction (7.5 mM Imidazole)
14- Molecular Weight Marker
It can be seen how from lane 1 to 6 the molecular weight of I151D 32-B decreased, as the His-Tag was cleaved off. The precipitate is mostly formed of I151D 32-B, but it might be due to a concentration problem. After the reaction, the sample was run on the Talon column. I151D 32-B was eluted in the wash fraction (lane 13) and the His-Tag can be seen in lane 11, eluted with 250 mM Imidazole.
Figure 3.26.b – I60D 32-B TEV Protease Reaction
1- I60D 32-B (+ His-Tag)
2- TEV Protease
3- I60D 32-B + TEV Protease – 0 h
4- Talon – Load
5- Talon – Wash Fraction (7.5 mM Imidazole)
6- Talon – F. 25
7- Talon – 250 mM Imidazole Elution
8- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5, 14.4 kDa)
Here, most of the I60D 32-B did not precipitate after the reaction and was eluted in the wash fraction from the Talon column. The TEV protease and the His-Tag were eluted first with an imidazole gradient, and then directly with 250 mM imidazole (the sample had to be loaded on the column in several batches).
Figure 3.27 – Cysteine Residues in the 32-B Protein
1 GLNGNKGFSS EDKGEWKLKL DNAGNGQAVI RFLPSKNDEQ APFAILVNHG FKKNGKWYIE
61 TCSSTHGDYD SCPVCQYISK NDLYNTDNKE YSLVKRKTSY WANILVVKDP AAPENEGKVF
121 KYRFGKKIWD KINAMIAVDV EMGETPVDVT CPWEGANFVL KVKQVSGFSN YDESKFLNQS
181 AIPNIDDESF QKELFEQMVD LSEMTSKDKF KSFEELNTKF GQVMGTAVMG GAAATAAKKA
241 DKVADDLDAF NVDDFNTKTE DDFMSSSSGS SSSADDTDLD DLLNDL
The cysteine residues are highlighted in yellow, and the mutated isoleucine residues are shown in red.
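As a quick check of the cysteine count, the cleaved 32-B sequence from Figure 3.27 can be scanned programmatically. The sketch below is illustrative (hypothetical names; sequence transcribed from the figure with spacing and numbering removed) and numbers residues from the N-terminal glycine, as in the figure.

```python
# Cleaved 32-B sequence transcribed from Figure 3.27
CLEAVED_32B = (
    "GLNGNKGFSSEDKGEWKLKLDNAGNGQAVIRFLPSKNDEQAPFAILVNHGFKKNGKWYIE"
    "TCSSTHGDYDSCPVCQYISKNDLYNTDNKEYSLVKRKTSYWANILVVKDPAAPENEGKVF"
    "KYRFGKKIWDKINAMIAVDVEMGETPVDVTCPWEGANFVLKVKQVSGFSNYDESKFLNQS"
    "AIPNIDDESFQKELFEQMVDLSEMTSKDKFKSFEELNTKFGQVMGTAVMGGAAATAAKKA"
    "DKVADDLDAFNVDDFNTKTEDDFMSSSSGSSSSADDTDLDDLLNDL"
)


def residue_positions(sequence, residue):
    """1-based positions of a one-letter residue code in a sequence."""
    return [i for i, aa in enumerate(sequence, start=1) if aa == residue]


cys_positions = residue_positions(CLEAVED_32B, "C")
print(len(cys_positions), cys_positions)
```

The scan finds four cysteines, in agreement with the text, so intermolecular disulfide bonds are a plausible source of the dimer-sized bands.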
β-Mercaptoethanol was added to the protein samples, with a final
concentration of 50-100 mM, and most of the higher molecular weight bands
disappeared. This is shown for I151D 32-B in Figure 3.28.
Figure 3.28 – I151D 32-B Cross Linking
1- I151D 32-B after Talon column elution
2- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5, 14.4 kDa)
3- Molecular Weight Marker
4- I151D 32-B + 100 mM β-mercaptoethanol
A band is present in lane 1 around 66 kDa, corresponding to a dimer of I151D 32-B. This band disappears after addition of β-mercaptoethanol.
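The assignment of the ~66 kDa band to a dimer can be sanity-checked with a rough mass estimate. The sketch below assumes the common rule of thumb of about 110 Da per amino-acid residue (an approximation, not a value from this work) and uses the 286-residue length of the cleaved sequence in Figure 3.27; the names are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average residue mass (assumption)
N_RESIDUES_32B = 286             # length of the cleaved sequence in Figure 3.27

# Approximate masses in kDa
monomer_kda = N_RESIDUES_32B * AVERAGE_RESIDUE_MASS_DA / 1000.0
dimer_kda = 2 * monomer_kda

print(f"monomer ~ {monomer_kda:.1f} kDa, dimer ~ {dimer_kda:.1f} kDa")
```

The estimate gives roughly 31 kDa for the monomer and 63 kDa for the dimer, which is compatible with a band running close to the 66.3 kDa marker.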
Once the His-Tag had been cleaved from both mutants and the proteins had been brought to a monomeric state by addition of a reducing agent, they were concentrated before any further experiment was done.
3.7. Conclusion
This chapter described how the 32 protein and the three 32
truncations were expressed and purified.
A great deal of time was dedicated to the 32-B truncation. The protein had
to be re-cloned in a different expression vector, in order to clone mutants. These mutants were successfully cloned and expressed. The 32-B protein was also
characterized through a number of structural and biophysical studies. The X-Ray
diffraction studies were not successful as a phase solution could not be obtained
from the data. However, scattering experiments did provide some more insight
on the state of 32-B in solution: it appears to be in equilibrium between a
monomer and a dimer form; and the shape of the A domain, which was missing
from the 32 core crystal structure, was obtained.
CHAPTER 4 - Bacteriophage T4 RNase H
4.1. Introduction
Bacteriophage T4 RNase H is a 5’ to 3’ exonuclease associated with the DNA
replication fork. Its role is to cleave off the RNA/DNA duplex primers needed to
start the synthesis of the Okazaki fragments on the DNA lagging strand. More
background information on RNase H is provided in Section 1.1.3.
This chapter mainly presents the expression and purification of the
native RNase H and two of its mutants. The D132N active site mutant is an
inactive nuclease, needed for the assays that require the presence of DNA. The
D132N ∆N mutant is the N-terminal truncation of D132N RNase H, missing the first nine amino-acids. The N-terminus of RNase H is known to interact with the
45 protein (see Section 1.1.3).
The protein characteristics for the native RNase H as well as the D132N mutant and D132N ∆N double mutant were calculated using the ExPASy
website (Gill and von Hippel, 1989; Gasteiger et al., 2003) and are summarized
in Table 4.1.
Table 4.1 – RNase H characteristics
Native D132N D132N ∆N
Amino-acids 305 305 297
Molecular Weight 35.6 kDa 35.6 kDa 34.6 kDa
pI 8.61 8.75 9.11
ε 1.65 1.72 1.78
4.2. Bacteriophage T4 Native and D132N Mutant RNase H
4.2.1. Protein Expression
Glycerol stocks harboring the genes for the native and the D132N mutant
RNase H were obtained from Dr. Nancy Nossal (N.I.H.). The gene encoding for
RNase H was cloned into the pNN2202 plasmid derived from the pT7-7 vector
(Hollingsworth and Nossal, 1991) and transformed in MV1190 E. coli cells.
Glycerol stocks were made and stored at -80 °C.
The protein was expressed using the large scale protein expression protocol described in Section 2.2.2. Cells transformed with the native RNase H plasmid were grown overnight at 37 °C in 25 g/L LB containing 1 mM Chloramphenicol and 1 mM Ampicillin; that culture was then used to inoculate 6 L of LB + 1 mM Ampicillin. D132N RNase H did not require Chloramphenicol in the overnight culture. Protein expression in both cases was induced with 1 mM IPTG when the OD600 reached 0.6, and the cells were harvested after three hours at 37 °C and stored at -20 °C. The expression of RNase H is shown in Figure 4.1. A total amount of 10 to 12 grams of cells was typically obtained for both proteins.
Figure 4.1 – SDS-PAGE of T4 native and D132N RNase H expression
a – Native RNase H expression
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5, 14.4 kDa)
2- Native RNase H - 0h sample
3- Native RNase H - 3h sample
b – D132N RNase H expression
4- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5, 14.4 kDa)
5- D132N RNase H - 0h sample
6- D132N RNase H - 3h sample
4.2.2. Cell Lysis
The native and D132N mutant RNase H were extracted from the cells
using the same protocol, described in Section 2.3.
The lysis buffer contained 50 mM Tris HCl pH 7.5, 200 mM NH4Cl, 10 mM
MgCl2, 5% glycerol, 2 mM Dithiothreitol (DTT), 0.03% Polyethylene Imine (PEI).
A volume of 100 mL of buffer was used for every 10 g of cells. The success of the cell lysis was tested by SDS-PAGE, and an example of the D132N RNase H
cell lysis can be seen in Figure 4.2. RNase H is found in the lysate.
The lysate was either directly purified or stored at -80°C for later HPLC
use.
Figure 4.2 – SDS-PAGE of T4 D132N RNase H cell lysis
1- D132N RNase H lysis - pellet
2- D132N RNase H lysis - supernatant
3- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5, 14.4 kDa)
4.2.3. Protein Purification
As with the cell lysis, the native RNase H and the D132N mutant were purified using the same protocol. The purification was done in three steps: first a low-resolution cation-exchange column (SP Sepharose), then the hydroxyapatite column (HA) to remove the remaining DNA in solution, and finally a high-resolution cation-exchange column (POROS HS). The different buffers used for these runs are shown in Table 4.2.
Table 4.2 – HPLC buffers for T4 RNase H purification
              Ion Exchange (SP Sepharose, POROS HS)    Hydroxyapatite
Buffers       50 mM Tris HCl pH 7.5                    25 mM Tris HCl pH 7.5
              100 mM NH4Cl                             100 mM NaCl
              10 mM MgCl2                              1% glycerol
              0 - 750 mM NaCl                          0 - 1 M (NH4)2SO4
Conductivity  Buffer A: ~ 16 mS/cm                     Buffer A: ~ 13 mS/cm
              Buffer B: ~ 75 mS/cm                     Buffer B: ~ 130 mS/cm
The lysate was prefiltered, and its conductivity adjusted with 50 mM Tris
HCl pH 7.5 to match the conductivity of SP buffer A before it could be loaded onto the SP Sepharose. The protein was eluted from the column using a salt gradient. The fractions containing RNase H were pooled and the purity tested by
SDS-PAGE. A similar approach was used for the HA and POROS HS runs.
Examples of chromatograms and SDS-PAGE gels for D132N RNase H purification are shown in Figure 4.3.
4.2.4. Dialysis and Concentration
The pure RNase H was dialyzed against a buffer containing 25 mM
Bis-Tris HCl pH 6.5, 150 mM NH4Cl, 10 mM MgCl2 and 2 mM β-mercaptoethanol
(BME). Upon dialysis, the protein was concentrated up to 30 mg/mL and flash
frozen on dry ice after addition of a minimum of 15% glycerol. The frozen protein
was kept at -80°C until further use.
Figure 4.3 – T4 D132N RNase H purification
Figure 4.3.a – SP Sepharose
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 kDa)
2- SP Sepharose - F. 12
3- SP Sepharose - F. 13
4- SP Sepharose - F. 15
5- SP Sepharose - F. 16
6- SP Sepharose - F. 17
7- SP Sepharose - F. 18
8- SP Sepharose - F. 19
9- SP Sepharose - F. 20
10- SP Sepharose - F. 21
11- SP Sepharose - F. 22
12- SP Sepharose - F. 23
13- SP Sepharose - F. 24
14- SP Sepharose - F. 25
15- SP Sepharose - Flow Through
On the chromatogram, the OD260 is shown in purple, the OD280 in green, the conductivity in red and the % B in black. The red stars indicate which fractions were run on an SDS-PAGE gel. Here, fractions 13 to 19 contained D132N RNase H and were pooled to be run on the hydroxyapatite column. The pooled fractions are indicated with a red line on the chromatogram.
Figure 4.3.b – Hydroxyapatite
1- HA – Load
2- HA – F. 9
3- HA – F. 12
4- HA – Flow Through
5- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5, 14.4 kDa)
Figure 4.3.c – POROS HS
1- POROS HS – Load
2- POROS HS – F. 19
3- POROS HS – F. 21
4- POROS HS – Flow Through
5- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5, 14.4 kDa)
D132N RNase H can be found in fractions 19 and 21 on the SDS-PAGE gel, and looks pure. The fractions 19 to 22 were pooled to be dialyzed and concentrated.
4.2.5. Scattering Studies
Scattering studies were done on the D132N RNase H, to further characterize the protein in solution.
Dynamic Light Scattering
D132N RNase H was dialyzed against the dialysis buffer described in the previous section.
A 1 mg/mL sample was then prepared, filtered using a 0.1 µm pore size Millipore
Ultrafree-MC filtering device, and finally spun down at 18,000 rcf at 4 °C for 20 minutes. The DLS readings were taken at 4 °C and 20 °C. The results are presented below in Figure 4.4 and Table 4.3.
The 4 °C and 20 °C results are consistent with one another. The polydispersity of the sample indicates some slight aggregation, and the molecular weight calculated from a hydrodynamic radius of 2.5 nm is around 30 kDa, which is reasonably close to the theoretical 35.6 kDa of the protein (Table 4.1). D132N
RNase H therefore appears to be a monomer in solution.
Figure 4.4 – D132N RNase H Dynamic Light Scattering Results
4 °C 20 °C
Table 4.3 – D132N RNase H Dynamic Light Scattering Results
Rh (nm) % Pd MW (kDa) % Intensity % Mass
4 °C 2.5 14.0 30 94.4 100.0
20°C 2.5 14.0 28 89.7 100.0
Small Angle X-Ray Scattering
Small Angle X-Ray Scattering experiments were done on D132N RNase
H, as part of the controls needed for the SAXS experiments carried out on the
D132N RNase H + 32 protein complex (see Section 5.3.3).
The SAXS data were collected at the Argonne Advanced Photon Source
15-ID beamline, under the same conditions described in the previous chapter
(see Section 3.5.12). The sample was prepared similarly to the DLS sample, but the sample concentration was 100 µM (~ 3.5 mg/mL). Several images were collected with a 5 s, 20 s and 40 s exposure for the buffer and the protein sample, but only the 40 s ones were averaged and used for data processing. The data collected at 40 s exposure time is shown in Figure 4.5.
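The conversion between molar and mass concentration quoted above can be reproduced as follows. This is a minimal sketch with hypothetical helper names, using the D132N RNase H molecular weight from Table 4.1.

```python
MW_D132N_KDA = 35.6  # molecular weight of D132N RNase H (Table 4.1)


def micromolar_to_mg_per_ml(conc_um, mw_kda):
    """Convert µM to mg/mL for a protein of mass mw_kda (kDa)."""
    # µmol/L x kg/mol = mg/L; divide by 1000 to get mg/mL
    return conc_um * mw_kda / 1000.0


def mg_per_ml_to_micromolar(conc_mg_per_ml, mw_kda):
    """Convert mg/mL to µM for a protein of mass mw_kda (kDa)."""
    return conc_mg_per_ml * 1000.0 / mw_kda


saxs_mg_per_ml = micromolar_to_mg_per_ml(100.0, MW_D132N_KDA)
dls_um = mg_per_ml_to_micromolar(1.0, MW_D132N_KDA)

print(f"SAXS sample: {saxs_mg_per_ml:.2f} mg/mL; DLS sample: {dls_um:.1f} µM")
```

The 100 µM SAXS sample corresponds to about 3.6 mg/mL, in line with the ~ 3.5 mg/mL quoted, and the 1 mg/mL DLS sample to roughly 28 µM.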
Figure 4.5 – D132N RNase H SAXS Data Collection
The data reduction program GNOM was used next to process the data.
Two datasets were made, one at higher resolution (171 to 24 Å) and one at lower
resolution (171 to 45 Å). In both cases, the maximum dimension Dmax of the particle was estimated to be 65 Å. The radius of gyration obtained from the pair-distance distribution function p(r) was calculated to be 22.53 ± 0.02 Å for the higher resolution dataset, and 22.45 ± 0.02 Å for the lower resolution dataset. These values agree closely with one another, and are also consistent with the hydrodynamic radius obtained from the DLS experiments, which was 25 Å. The plots output by
GNOM for both datasets are shown in Figure 4.6.
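Two simple consistency checks can be made on these numbers. The sketch below assumes the common convention d = 2π/q relating a nominal resolution d to the momentum transfer q, uses the widely quoted sampling guideline q_min ≤ π/Dmax, and compares the Rg/Rh ratio with the value √(3/5) ≈ 0.775 expected for a uniform solid sphere. The function names are hypothetical and the snippet is not part of the GNOM workflow itself.

```python
import math


def resolution_to_q(d_angstrom):
    """Momentum transfer q (1/Å) for a resolution d (Å), assuming d = 2*pi/q."""
    return 2.0 * math.pi / d_angstrom


# q-ranges corresponding to the two datasets
q_min = resolution_to_q(171.0)
q_max_high = resolution_to_q(24.0)
q_max_low = resolution_to_q(45.0)

# Sampling guideline: the lowest q should not exceed pi / Dmax
d_max = 65.0
q_min_limit = math.pi / d_max

# Shape indicator: radius of gyration over hydrodynamic radius
rg, rh = 22.53, 25.0
ratio = rg / rh
sphere_ratio = math.sqrt(3.0 / 5.0)

print(f"q range (high res): {q_min:.4f} to {q_max_high:.4f} 1/Å")
print(f"q range (low res):  {q_min:.4f} to {q_max_low:.4f} 1/Å")
print(f"q_min limit for Dmax = {d_max:.0f} Å: {q_min_limit:.4f} 1/Å")
print(f"Rg/Rh = {ratio:.2f} (solid sphere: {sphere_ratio:.3f})")
```

Under these assumptions the lowest measured q lies below π/Dmax, and the Rg/Rh ratio of about 0.9 is above the solid-sphere value, which would be compatible with a particle that is not perfectly spherical.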
Figure 4.6 – D132N RNase H GNOM Plots
172 to 24 Å resolution 172 to 45 Å resolution
experimental data model data
Finally, the two datasets obtained from GNOM were used in ab initio modeling programs (DAMMIN and GASBOR). DAMMIN was used in the
“keep” mode with both datasets, and the multiple models were then averaged using DAMAVER. GASBOR uses only the higher resolution data. The models output by both programs are presented in Figure 4.7.
Figure 4.7 – D132N RNase H 3D SAXS Molecular Envelopes
Figure 4.7.a – DAMMIN Models
Low resolution model (172 to 45 Å)
Average of three models
χ2 = 4.2
Dimensions: 70 Å × 70 Å × 70 Å
High resolution model (172 to 24 Å)
Average of five models
χ2 = 2.1
Dimensions: 70 Å × 55 Å × 45 Å
Figure 4.7.b – GASBOR Model
χ2 = 2.1 Dimensions: 65 Å × 45 Å × 35 Å
The high resolution model from DAMMIN and the one from GASBOR are
fairly similar, with identical χ2 values. The ribbon structure from the RNase H
crystal structure can be fitted in the envelopes nicely. The DAMMIN model is a
little bigger as it takes the solvation of the molecule into account. On the other
hand, the low resolution model from DAMMIN appeared to be spherical, but this model is most likely unreliable, as its χ2 value (4.2) is twice that of the other two models.
4.3. Bacteriophage T4 D132N ∆N RNase H
4.3.1. Protein Expression and Cell Lysis
A glycerol stock of BL21 (DE3) pLysS cells containing the plasmid
pCJrnh1321 encoding T4 D132N ∆N RNase H was obtained from Dr. Charles
Jones at N.I.H.
D132N ∆N RNase H was expressed according to the protocol described in
Section 2.2.2. The overnight culture was grown in 25 g/L LB containing 1 mM
Ampicillin and 1 mM Chloramphenicol. That culture was then used to inoculate 6
liters of fresh LB and 1 mM Ampicillin. After induction with 1 mM IPTG, the cells
were incubated at 37 °C for three hours, then harvested and stored at -20 °C.
About 10 g of cells were obtained for 6 L of culture.
The expression of D132N ∆N RNase H can be seen on the SDS-PAGE gel shown in Figure 4.8.a.
The cells were lysed according to the same protocol that was used for the native and D132N RNase H (see Section 4.2.2). The results of the D132N ∆N
RNase H cell lysis can be seen on the SDS-PAGE gel in Figure 4.8.b.
Figure 4.8 – SDS-PAGE of T4 D132N ∆N RNase H Expression and Lysis
a
1- D132N ∆N RNase H expression – 0h sample
2- D132N ∆N RNase H expression – 3h sample
3- D132N ∆N RNase H expression – 3h sample
4- D132N ∆N RNase H expression – 3h sample
5- D132N ∆N RNase H expression – 3h sample
6- D132N ∆N RNase H expression – 3h sample
7- D132N ∆N RNase H cell lysis – pellet
8- D132N ∆N RNase H cell lysis – supernatant
9- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5, 14.4 kDa)
b
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5, 14.4 kDa)
2- D132N ∆N RNase H cell lysis – pellet
3- D132N ∆N RNase H cell lysis – supernatant
D132N ∆N RNase H expressed very well. After cell lysis, a large amount of RNase H is found in the supernatant, indicating that the protein is soluble.
4.3.2. Protein Purification
D132N ∆N RNase H was purified using cation-exchange chromatography.
The lysate was filtered before being loaded on the SP Sepharose column. The A and B buffers were similar to the ones used for native and D132N RNase H. The elution from the SP Sepharose was then further purified with the high resolution
POROS HS column.
Chromatograms and SDS-PAGE gels from the SP Sepharose and
POROS HS runs are shown in Figure 4.9.
Figure 4.9 – T4 D132N ∆N RNase H Purification
Figure 4.9.a – SP Sepharose
1- Molecular Weight Marker
2- SP Sepharose – Load
3- SP Sepharose – F. 9
4- SP Sepharose – F. 12
5- SP Sepharose – F. 15
6- SP Sepharose – F. 18
7- SP Sepharose – F. 21
8- SP Sepharose – F. 24
9- SP Sepharose – F. 27
10- SP Sepharose – F. 30
11- SP Sepharose – F. 33
12- SP Sepharose – F. 36
13- SP Sepharose – F. 43
14- SP Sepharose – Flow Through
15- Molecular Weight Marker
On the chromatogram, the OD260 is shown in purple, the OD280 in green, the conductivity in red and the % B in black. The red stars indicate which fractions were run on an SDS-PAGE gel. Here, fractions 1 to 36 contain D132N ∆N RNase H and were pooled to be run on the high resolution POROS HS column. The pooled fractions are indicated with a red line on the chromatogram.
Figure 4.9.b – POROS HS
1- Molecular Weight Marker
2- POROS HS – Load
3- POROS HS – F. 5
4- POROS HS – F. 11
5- POROS HS – F. 17
6- POROS HS – F. 19
7- POROS HS – F. 20
8- POROS HS – F. 21
9- POROS HS – F. 22
10- POROS HS – F. 23
11- POROS HS – F. 24
12- POROS HS – F. 26
13- POROS HS – F. 30
14- POROS HS – F. 33
15- Molecular Weight Marker
D132N ∆N RNase H can be found in fractions 19 through 33 on the SDS-PAGE gel, and looks pure. The fractions 17 to 35 were pooled to be concentrated.
4.3.4. Solubility Screen
When D132N ∆N RNase H was concentrated in the HPLC buffer (50 mM
Tris HCl pH 7.5, 100 mM NH4Cl, 10 mM MgCl2 and ~ 300 mM NaCl), the protein precipitated heavily. A solubility screen was therefore performed in order to determine the optimum buffer for D132N ∆N RNase H. The protocol for the solubility screen
is described in Section 2.5.3.
The results are shown in Figure 4.10. PIPES pH 6.5 and MgCl2 improved
the solubility of the protein; the dialysis buffer was therefore modified to 25 mM bis-Tris HCl pH 6.5, 150 mM NH4Cl, 10 mM MgCl2 and 2 mM BME. This is also the optimized buffer for the native and the D132N mutant RNase H proteins.
Figure 4.10 – T4 D132N ∆N RNase H Solubility Screen Results
[Bar chart: absorbance at 595 nm (axis 0 to 0.4) for each condition: Supernatant, H2O, TAPS pH 8.5, HEPES pH 7.5, PIPES pH 6.5, MES pH 5.6, Na Citrate, Na Phosphate, Na Sulfate, Na Cacodylate, Na Acetate, Na Formate, CaCl2, MgCl2, LiCl, KCl, NaCl, NH4Cl]
4.3.5. Dialysis and Concentration
D132N ∆N RNase H was dialyzed in the optimized buffer obtained from
the solubility screen. Upon concentration, it still precipitated, but not as heavily as
previously. It was also found that the addition of glycerol to a final amount of 25
to 30% reduced the precipitation to a minimal level. D132N ∆N RNase H could
be concentrated up to 15 mg/mL, but was never stable and would precipitate out
of solution over time. However, it was found to be more stable when it was
concentrated only to a maximum of 5 mg/mL. This seems to indicate that the
N-terminus of RNase H is important for solubility and correct folding of the protein.
4.4. Conclusion
The native T4 RNase H, the D132N mutant as well as the D132N ∆N
N-terminal truncation were all expressed and purified. Solubility studies were done with the D132N ∆N double mutant, since it showed some solubility
problems, and scattering studies were performed on the D132N mutant, as part
of the RNase H + 32 protein complex SAXS characterization.
CHAPTER 5 - Bacteriophage T4 RNase H
+ 32 Protein + DNA Interactions
5.1. Introduction
As previously mentioned in Chapter 1, the crystal structures of the core domain of 32 protein and of RNase H, both from bacteriophage T4, have been
solved (Shamoo et al., 1995; Mueser et al., 1996). However, how these two
proteins interact at the replication fork is still unknown. After the individual protein
structures, this is the next level of information needed, in order to get a better
understanding of how the different proteins at the replication fork come together to organize DNA replication.
All the experiments and assays described in this chapter were carried out with the D132N mutant of RNase H, since the nuclease activity of the native protein is incompatible with the presence of DNA.
5.2. Preliminary Complex Determination
There are a number of truncations available for both RNase H and 32
protein, which were described in Chapters 4 and 3, respectively. These truncations
might interact with one another differently. Another parameter to take into
account is the nature and length of the DNA substrate. In order to sort out the
different possibilities and identify the stronger complexes that are more likely to
crystallize and yield better data, a series of non-denaturing gels (for the protein-
protein complexes) and gel-shift assays (for the protein-protein-DNA complexes)
were run. The results obtained from these gels are described in this Section.
5.2.1. Protein-Protein Interactions
Native gel electrophoresis was used to identify the RNase H + 32 protein complexes. The two types of RNase H (the D132N mutant and the D132N ∆N
N-terminal truncation) were run in the presence of the four 32 protein variants: the full length 32 protein, and the 32-A, 32-B and 32 core truncations.
The methodology for non-denaturing gel electrophoresis is described in
Section 2.9. A summary of the pIs of the different proteins involved is shown in
Table 5.1. Since the RNase Hs are all basic with a pI of 8.5 or above, and the
32s are acidic with pIs around 5, the gels were run at pH 6.5.
Table 5.1 – D132N RNase H and 32 Truncations Calculated pIs
Protein Calculated pI
D132N RNase H 8.75
D132N ∆N RNase H 9.11
32 protein 5.82
32-A 6.76
32-B 4.65
32 core 5.25
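The choice of gel pH can be rationalized by comparing each calculated pI with the running pH: a protein is expected to carry a net positive charge below its pI and a net negative charge above it. The following sketch (hypothetical names; pI values from Table 5.1) applies that rule at pH 6.5. It is a qualitative guide only and does not compute the magnitude of the charge.

```python
GEL_PH = 6.5

# Calculated pI values from Table 5.1
CALCULATED_PI = {
    "D132N RNase H": 8.75,
    "D132N ∆N RNase H": 9.11,
    "32 protein": 5.82,
    "32-A": 6.76,
    "32-B": 4.65,
    "32 core": 5.25,
}


def expected_charge_sign(pi, ph):
    """Sign of the expected net charge of a protein with the given pI at pH ph."""
    if ph < pi:
        return "+"
    if ph > pi:
        return "-"
    return "0"


signs = {name: expected_charge_sign(pi, GEL_PH) for name, pi in CALCULATED_PI.items()}

for name, pi in CALCULATED_PI.items():
    print(f"{name:18s} pI {pi:5.2f}  charge {signs[name]}  (pI - pH = {pi - GEL_PH:+.2f})")
```

By this rule both RNase H variants are positively charged at pH 6.5, while 32 protein, 32-B and 32 core are negatively charged. The calculated pI of 32-A is only 0.26 units above the gel pH, so its net charge is predicted to be small.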
The first gel, presented in Figure 5.1, was run with D132N RNase H in the
presence of the different 32 proteins.
Figure 5.1 – D132N RNase H + 32 Truncations Native Gel
Lane groups: 32 (lanes 1 to 3), 32-A (lanes 4 to 6), 32-B (lanes 7 to 9), 32 core (lanes 10 to 12); gel polarity labels: (-) and (+)
1- 32 Protein
2- 32 + D132N RNase H
3- D132N RNase H
4- 32-A Protein
5- 32-A + D132N RNase H
6- D132N RNase H
7- 32-B Protein
8- 32-B + D132N RNase H
9- D132N RNase H
10- 32 core Protein
11- 32 core + D132N RNase H
12- D132N RNase H
The 32 proteins run towards the anode (positive electrode), while RNase
H runs towards the cathode. A weak complex is formed between D132N RNase
H and 32 protein (lane 2), as well as between D132N RNase H and the 32 core protein (lane 11). The strongest complex, however, is formed between D132N
RNase H and 32-B, as a strong band can be seen in between where the separate proteins run. 32-A does not form a complex with D132N RNase H.
A similar gel was run, but this time with the N-terminal truncation D132N
∆N RNase H. It is shown in Figure 5.2.
Figure 5.2 – D132N ∆N RNase H + 32 Truncations Native Gel
Lane groups: 32 (lanes 1 to 3), 32-A (lanes 4 to 6), 32-B (lanes 7 to 9), 32 core (lanes 10 to 12); gel polarity labels: (-) and (+)
1- 32 Protein
2- 32 + D132N ∆N RNase H
3- D132N ∆N RNase H
4- 32-A Protein
5- 32-A + D132N ∆N RNase H
6- D132N ∆N RNase H
7- 32-B Protein
8- 32-B + D132N ∆N RNase H
9- D132N ∆N RNase H
10- 32 core Protein
11- 32 core + D132N ∆N RNase H
12- D132N ∆N RNase H
With the D132N ∆N RNase H, stronger complexes are observed with 32
protein and the 32 core domain, while the 32-B complex is still strong. 32-A,
similarly to what happened with D132N RNase H, does not complex with D132N
∆N RNase H.
Following these non-denaturing gels, it was determined that the D132N
RNase H + 32-B complex, as well as the D132N ∆N RNase H + 32 / 32-B / 32
core complexes, were strong enough in terms of protein-protein interactions, to
justify further studies. This is also summarized in Section 5.2.3.
5.2.2. Protein-Protein-DNA Interactions
Even though several strong RNase H + 32 protein interaction complexes
were identified using non-denaturing electrophoresis, the assumption that these
protein-protein complexes will yield strong protein + DNA complexes cannot be
made. Some protein domains might move upon DNA binding, and the two proteins would then interact differently. This is the reason another series of gels was run for the ternary complexes.
The DNA substrates were designed in such a way that they mimic the natural DNA substrate occurring at the replication fork, where RNase H and 32 protein would bind. Two substrates were designed and used, a 3’-overhang and a fork DNA substrate. The natural substrate for RNase H binding at the replication fork closely resembles the 3’-overhang DNA, since RNase H is a
5’-exonuclease. However, RNase H is not locked in a particular place on that substrate and can slide along the DNA strand, which would then create heterogeneity issues when it comes to the formation of the ternary complex.
Therefore, another substrate was designed, where a short 5’-arm was added in order to keep RNase H positioned at the fork.
Figure 5.3 – DNA Substrates
[Schematic of the two DNA substrates, the 3’-overhang DNA and the fork DNA, with the strand ends (5’, 3’) labeled and the RNase H and 32 Protein positions indicated]
To identify the best RNase H + 32 protein + DNA complexes, gel-shift
assays were run. The methodology for these gels is not described in Chapter 2, as the gels were run by Dr. Charles Jones at N.I.H. In these assays, the DNA is labeled using 32P radio-labeling. The protein-DNA complexes are run on an
agarose gel, and complex formation can be observed by the retardation of the
labeled DNA substrate.
A total of five gels were run, probing the effect of the nature and size of the DNA substrate, as well as the effect of truncation of RNase H and 32 protein domains, on the formation of the ternary complex. These gels are shown in Figure 5.4. Figure 5.4.a shows the effect of the different DNA substrates on
RNase H + 32 protein + DNA binding. The gels in Figure 5.4.b were run to investigate the effect of the truncation of RNase H (full length versus ∆N RNase
H) as well as the 32 protein truncations on the formation of the ternary complex.
In Figure 5.4.a, it can be seen that both RNase H and 32 protein can bind to either the 3’-overhang or the fork DNA substrate. The longer the 3’-arm, the stronger the binding of 32 protein, as it has more room to bind (the DNA footprint of 32 protein is 5 nucleotides (Jensen et al., 1976)). It can also be seen that if a longer 5’-arm is present on the fork DNA, only RNase H can bind, and no ternary complex is formed. Finally, the short 5’-arm on the fork DNA can be either 4 or 6 nucleotides long, as this does not make a difference in RNase H or 32 protein binding. The gel shown in Figure 5.4.a.c was run with D19N RNase H, another inactive mutant of the protein, which does not degrade the DNA substrate.
Figure 5.4 – RNase H + 32 Truncations + DNA Substrates Gel Shift Assays
Figure 5.4.a – Comparison of DNA Substrates
c – [Gel image; bands labeled DNA + RNase H + 32, DNA + RNase H, and DNA.] Here we compare the length of the 3’-overhang and its effect on the binding of 32 protein. The longer 3’-overhang shows a stronger binding of 32 protein. The results with the 12/12 fork are shown for comparison: RNase H can bind more tightly to the fork substrate, which is expected, but the 12-mer on either the 3’-arm or the 5’-arm is too short to allow binding of 32.
d – A new substrate was designed following the results shown in the previous gel. A short 5’-arm was added to the different 3’-overhang substrates, in order to bind RNase H in a “locked” position, meaning it cannot slide along the 3’-overhang binding site anymore. Again, the longer the 3’-arm is, the stronger 32 protein binds. The length of the 5’-arm is not critical, as a 6-mer shows the same results as the 4-mer. These particular fork substrates were chosen to study the effect of RNase H or 32 protein truncations on the formation of the ternary complex, presented in Figure 5.4.b.
The gels shown in Figure 5.4.b show the effect of the protein truncations on the formation of the complex. All of these gels were run with the fork DNA substrate. The results can be divided into two categories: formation of the ternary complex with the full length D132N RNase H, or with the D132N ∆N truncation.
With D132N RNase H, the strongest binding is observed with the 32-B truncation. Fairly strong binding is also seen with the 32-A truncation, and weak binding with the full length 32 protein. The 32 core was not tested in these gels. With D132N ∆N RNase H, the same hierarchy is observed, except that all complexes appear stronger than their D132N RNase H counterparts. The third gel (Figure 5.4.b.d) shows again that 32-B binds more strongly than the full length 32 protein to the fork DNA + D132N ∆N RNase H. It also confirms that a longer 3’-arm on the fork DNA substrate gives a stronger ternary complex.
One thing that should be pointed out, however, is that radioactive labeling is very sensitive, and therefore nanomolar concentrations are sufficient to observe DNA bands on the gel. In the experiments described later on, concentrations as high as millimolar had to be used, for instance in the crystallization experiments, and the increase in concentration can drive the formation of complexes that appear to be weak at nanomolar concentrations.
Figure 5.4.b – Comparison of RNase H and 32 Protein Truncations
c – [Gel image; bands labeled 32 + RNase H + DNA and RNase H + DNA.] This gel compares the binding of wild type RNase H versus the N-terminal truncation, as well as 32 protein versus 32-B, while binding to the fork DNA that was previously described as the best substrate. It can be seen that with the wild type 32 protein, a stronger ternary complex is obtained with the ∆N RNase H. Strong complexes are obtained with 32-B, either with the wild type RNase H or the ∆N truncation.
d – This gel is similar to the previous one, but it compares the binding of 32 protein versus the 32-A truncation, again with the wild type and ∆N RNase Hs. It can be seen again, in a more obvious way than before, that 32 protein forms a stronger complex in the presence of ∆N RNase H. The binding of 32-A is also stronger with ∆N RNase H.
e – Here we compare the binding of 32 protein versus the 32-B truncation in the presence of ∆N RNase H and two different lengths of 3’-arms. 32-B can bind more strongly than 32 to the fork DNA loaded with ∆N RNase H, as was shown before. Also, the longer the 3’-arm is, the stronger the ternary complex that is formed, which is consistent with the results from the gel shown in Figure 5.4.a.
5.2.3. Summary of the T4 RNase H + 32 Protein Complexes
To summarize the information described in the two previous sections, the
different complexes identified were compiled in Table 5.2. For the protein-protein
interaction, D132N RNase H was found to interact with only the 32-B truncation,
while D132N ∆N RNase H interacts with 32-B as well as 32 protein and the 32
core domain. As for the ternary complex, the D132N RNase H + 32-B + DNA complex was identified, as well as the D132N ∆N RNase H + DNA complexes with either 32-B or 32 protein. The D132N RNase H + 32 protein + DNA complex was considered too weak. The complexes involving 32-A were not retained, as the bands on the gel were weaker than those for 32-B and were heavily smeared.
Table 5.2 – RNase H + 32 Protein ± DNA Complexes
Complex Type                   D132N RNase H           D132N ∆N RNase H
Protein-Protein Complex        D132N RNase H + 32-B    D132N ∆N RNase H + 32; D132N ∆N RNase H + 32-B; D132N ∆N RNase H + 32 core
Protein-Protein-DNA Complex    D132N RNase H + 32-B    D132N ∆N RNase H + 32; D132N ∆N RNase H + 32-B
It is interesting to see that RNase H and the 32 protein interact only in the presence of DNA, and even then weakly, whereas the binding is much stronger when the N-terminal domain of the 32 protein is cleaved off. It is possible that the B domain of the 32 protein moves upon DNA binding and allows RNase H to bind, which cannot be observed when DNA is not present.
The following sections describe the work that was done on the different
complexes previously identified.
5.3. D132N RNase H + 32-B Protein Interaction
5.3.1. Complex Preparation
Before the D132N RNase H + 32-B complex was prepared, the two
proteins were always dialyzed separately in the same buffer, containing 25 mM
bis-Tris HCl pH 6.5, 150 mM NH4Cl, 10 mM MgCl2 and 2 mM β-mercaptoethanol.
After dialysis, the concentrations of the two proteins were checked, and the proteins were re-concentrated if needed. The complex was then prepared by mixing D132N RNase H and 32-B protein in an equimolar ratio rather than at equal mass concentrations, since the two proteins have different molecular weights. For instance, a 300 µM concentration of the complex is roughly equivalent to 15 mg/mL.
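As a rough check on this kind of conversion, a concentration in mM multiplied by a molecular weight in kDa gives mg/mL directly. The minimal sketch below assumes the calculated molecular weights quoted in Figure 5.11 (35.6 kDa for D132N RNase H and 31.8 kDa for 32-B); the function and variable names are illustrative only.

```python
# Calculated molecular weights in kDa, as quoted in Figure 5.11 (assumed here).
MW_KDA = {"D132N RNase H": 35.6, "32-B": 31.8}


def mg_per_ml(conc_mM: float, mw_kDa: float) -> float:
    """Convert a molar concentration to a mass concentration.

    1 mM x 1 kDa = (1e-3 mol/L) x (1e3 g/mol) = 1 g/L = 1 mg/mL.
    """
    return conc_mM * mw_kDa


# Mass concentration of each protein in a 0.3 mM (300 uM) equimolar complex.
per_protein = {name: mg_per_ml(0.3, mw) for name, mw in MW_KDA.items()}
total = sum(per_protein.values())

for name, value in per_protein.items():
    print(f"{name}: {value:.1f} mg/mL")
print(f"total protein: {total:.1f} mg/mL")
```

With these assumed masses, each protein comes to about 10 mg/mL and the total to about 20 mg/mL, so the value of 15 mg/mL quoted above is best read as an approximation.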
5.3.2. Structural Studies
Crystal Screening and Optimization
The RNase H + 32-B complex was screened against six different
commercial and lab-made screens. This is summarized in Table 5.3. The first two
screens that were used (Crystal Screen I/II and Index) are indicated in italic,
because they were set up differently from the other ones. Native RNase H was
used, instead of the D132N mutant, and the complex was prepared at 10 mg/mL
of each protein, which could be a problem as this is not an equimolar complex.
All the screens were done in three-well Greiner plates, with the complex in the second well and, as controls, RNase H alone in the first well and 32-B alone in the third.
Table 5.3 – RNase H + 32-B Crystal Screens
Crystal Screen            Concentration   Temperature   Drop (µL)
Crystal Screen I and II   10 mg/mL        Room Temp.    0.5 + 0.5
Index                     10 mg/mL        Room Temp.    0.5 + 0.5
PEG Ion Screen            0.3 mM          Room Temp.    0.5 + 0.5
Natrix                    0.3 mM          Room Temp.    0.5 + 0.5
Wizard I and II           0.3 mM          Room Temp.    0.5 + 0.5
Additive Screen           0.3 mM          Room Temp.    0.5 + 0.5
PEG Ion Screen            0.3 mM          4 °C          0.5 + 0.5
Natrix                    0.3 mM          4 °C          0.5 + 0.5
Wizard I and II           0.3 mM          4 °C          0.5 + 0.5
Additive Screen           0.3 mM          4 °C          0.5 + 0.5
After the initial screening with the Crystal I/II and Index screens, a few hits were obtained. Some of them are shown below in Figure 5.5. Unfortunately, as can be seen in the pictures, the crystal morphology of the complex strongly resembles that of the RNase H crystals, meaning the complex crystals most likely contain only RNase H. Moreover, the Crystal Screen I condition 18 (first pictures) is similar to the condition used to grow Native RNase H crystals.
Figure 5.5 – Native RNase H + 32-B Initial Crystal Hits
[Crystal pictures, shown side by side for the complex and for Native RNase H.]
Crystal Screen I – condition 18: 20 % PEG 8000, 0.1 M Na Cacodylate pH 6.5, 0.2 M Magnesium Acetate
Index – condition 66: 25 % PEG 3350, 0.1 M bis-Tris pH 5.5, 0.2 M Ammonium Sulfate
After this first rather unsuccessful attempt, the complex was rescreened against other screens (Wizard I and II, Natrix, PEG Ion Screen and Additive
Screen). This time, the complex was prepared in an equimolar fashion, with each protein at 0.3 mM. The D132N RNase H mutant was used instead of the native
RNase H, and the screening was done at both 4 °C and room temperature. The best hits obtained from these screens are shown in Figure 5.6, with a picture of the crystals for each condition at both temperatures. For three of the four conditions presented, crystals were obtained at both temperatures. However, the crystal morphologies differ somewhat, and the crystals grown at 4 °C resemble the RNase H crystals more closely than the room temperature ones do.
Figure 5.6 – D132N RNase H + 32-B Crystal Hits after Screening
[Crystal pictures, shown for each condition at room temperature and at 4 °C.]
Wizard I – condition 12: 20 % PEG 1000, 0.1 M Imidazole pH 8.0, 0.2 M Calcium Acetate
Wizard II – condition 18: 20 % PEG 3000, 0.1 M Tris HCl pH 7.0, 0.2 M Calcium Acetate (one of the two panels is labeled “clear drop”)
Wizard II – condition 28: 20 % PEG 8000, 0.1 M Na MES pH 6.0, 0.2 M Calcium Acetate
PEG Ion Screen – condition 25: 20 % PEG 4000, 0.1 M Na HEPES pH 7.5, 0.2 M Ammonium Chloride
Expansion trays were set up for all conditions, but only the PEG 3000 (Wizard II, condition 18) and the PEG 4000 (PEG Ion Screen, condition 25) crystals could be obtained reproducibly. Wizard II, condition 28 did not yield any crystals in the expansions, and Wizard I, condition 12 only gave showers of crystals, even upon optimization. The other two conditions, however, could be optimized to grow large, single crystals, as shown in Figure 5.7.
Figure 5.7 – D132N RNase H + 32-B Crystals after Optimization
10.9 % PEG 3350, 0.1 M Tris HCl pH 7.5, 0.2 M Calcium Acetate, 3 % Glycerol; complex at 0.5 mM, room temperature, 2 µL + 2 µL hanging drop
16.4 % PEG 3350, 0.1 M Tris HCl pH 7.5, 0.2 M Calcium Acetate, 3 % Glycerol; complex at 0.3 mM, room temperature, 2 µL + 2 µL hanging drop
7.7 % PEG 4000, 0.1 M Na HEPES pH 7.5, 0.2 M Ammonium Chloride; complex at 0.3 mM, room temperature, 4 µL + 4 µL sitting drop
8.6 % PEG 4000, 0.1 M Na HEPES pH 7.5, 0.2 M Ammonium Chloride; complex at 0.3 mM, room temperature, 4 µL + 4 µL sitting drop
9.5 % PEG 4000, 0.1 M Na HEPES pH 7.5, 0.2 M Ammonium Chloride; complex at 0.3 mM, room temperature, 4 µL + 4 µL sitting drop
Crystal Handling and Freezing
Once single crystals were grown, they had to be cryoprotected and flash-frozen in liquid nitrogen before being screened for diffraction. Table 5.4 below summarizes the different cryoprotectants that were tried first. The crystals were soaked in a mixture of the substitute mother liquor and the indicated percentage of the cryoprotectant.
Table 5.4 – D132N RNase H + 32-B Crystals Cryoprotection
Cryoprotectant                         Result
25 % Glucose                           Crystal turns brown immediately
25 % MPD                               Crystal turns brown after 1 min
25 % Ethylene Glycol                   Crystal melts immediately
25 % PEG 400                           Crystal looks good
25 % Glycerol                          Crystal melts after 1 min
35 % PEG 400                           Crystal melts immediately
15 % Ethylene Glycol + 25 % PEG 400    Crystal turns brown immediately
15 % Ethylene Glycol + 25 % PEG 400    Crystal looks nice, but turns slightly brown after 2 min
PEG 400 seems to be the only cryoprotectant that does not degrade the crystals, but its cryoprotecting power is rather weak, and PEG 400-cryoprotected crystals tend to show ice rings upon X-Ray diffraction. A combination of PEG 400 and ethylene glycol was then used, at different concentrations, but the crystals were not stable in any of the cryoprotectants tried at that point. The crystals that could be flash-frozen all showed very weak or no diffraction. It should also be noted that the biggest crystals had a tendency to crack.
Since the crystals could not be cryoprotected by soaking, as seen before, it was attempted to grow them in the presence of cryoprotectant. Concentrations of 5 to 15 % of glycerol, MPD, ethylene glycol, PEG 400, glucose or xylitol were added to the crystallization conditions. However, the crystals did not grow in the presence of any of these cryoprotectants.
Finally, soaking the crystals in the substitute mother liquor + cryoprotectant was attempted again, but this time by slowly increasing the cryoprotectant concentration. Soaks of 5 minutes were done, starting at 0 %, then 2 %, then 5 %, and then increasing in 5 % increments until reaching the concentration of cryoprotectant needed to obtain an amorphous freeze. This strategy was tested with MPD and ethylene glycol. MPD, similarly to what was seen before, turned the crystals brown and degraded the proteins. Ethylene glycol, on the other hand, could be used up to 20 % without showing any sign of degradation of the crystals. Cryoprotection in increasing concentrations of ethylene glycol therefore became the method of choice for freezing the D132N RNase H + 32-B crystals.
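The stepwise soak can be written out as a short schedule. The sketch below is only an illustration of the protocol as described (0 %, 2 %, 5 %, then 5 % increments, 5 minutes per soak); the function name and its parameters are hypothetical, and the 20 % end point is the ethylene glycol limit quoted above.

```python
def soak_schedule(final_percent: float, step: float = 5.0,
                  soak_minutes: float = 5.0) -> list[tuple[float, float]]:
    """Return (cryoprotectant %, cumulative soak time in minutes) for each soak.

    The series starts at 0 %, 2 % and 5 %, then rises in `step` increments
    until `final_percent` is reached.
    """
    concentrations = [c for c in (0.0, 2.0, 5.0) if c <= final_percent]
    next_conc = 5.0 + step
    while next_conc <= final_percent:
        concentrations.append(next_conc)
        next_conc += step
    return [(conc, (index + 1) * soak_minutes)
            for index, conc in enumerate(concentrations)]


# Ethylene glycol series up to 20 %.
for conc, elapsed in soak_schedule(20.0):
    print(f"{conc:4.0f} % ethylene glycol, total soak time {elapsed:.0f} min")
```

For a 20 % end point this gives six soaks and a total soaking time of 30 minutes.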
Initial Data Collection and Processing
A number of crystals were flash-frozen in liquid nitrogen using the
cryoprotection protocol described above. They were then screened for diffraction
using the high brilliance FR-E X-Ray diffractometer in the Ohio Crystallography
Consortium, located in the Instrumentation Center. The crystals that showed
good enough diffraction were then used for data collection.
A first dataset was collected in-house on the crystal shown below, in
Figure 5.8. An example of a diffraction frame is also presented.
Figure 5.8 – D132N RNase H + 32-B Crystal Data Collection 1
[Crystal picture and diffraction frame (Image 1).]
9.5 % PEG 4000, 0.1 M Na HEPES pH 7.5, 0.2 M Ammonium Chloride; complex at 0.3 mM, room temperature, 2 µL + 2 µL hanging drop
The data collection and processing statistics are shown in Table 5.5. The crystal diffracted to 3.5 Å, but the data were scaled to only 4 Å. Even though the resolution was cut off, the Rmerge value is still extremely high, 22.1 %, indicating either that the space group is incorrect, or that the quality of the data is very low. The latter is likely, as the mosaicity is also very high, 1.81 °. Also, one cell edge is rather long, around 230 Å, meaning that the diffraction spots along that particular direction are very close to one another. This can be seen on the diffraction frame shown above. Another issue with this dataset is that there are only around 7,000 unique reflections, while the RNase H + 32-B complex contains roughly 4,700 atoms (not counting the hydrogen atoms). An estimated four unique reflections per atom are needed in order to get an unbiased model upon building and refinement. This is clearly not the case here.
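The shortfall can be made explicit with a small calculation. This is a minimal sketch using the numbers quoted for dataset 1 (6,996 unique reflections, roughly 4,700 non-hydrogen atoms) and the four-reflections-per-atom rule of thumb stated above; the names are illustrative.

```python
def reflections_per_atom(unique_reflections: int, n_atoms: int) -> float:
    """Ratio of unique reflections to non-hydrogen atoms in the model."""
    return unique_reflections / n_atoms


N_ATOMS = 4700        # approximate atom count of the complex, from the text
TARGET_RATIO = 4.0    # rule of thumb quoted in the text

ratio = reflections_per_atom(6996, N_ATOMS)       # dataset 1
needed = int(TARGET_RATIO * N_ATOMS)              # reflections for the target

print(f"dataset 1: {ratio:.2f} unique reflections per atom")
print(f"reflections needed for {TARGET_RATIO:.0f} per atom: {needed}")
```

The ratio is about 1.5, and reaching the target would take close to 19,000 unique reflections.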
Table 5.5 – Crystallographic Data for the D132N RNase H + 32-B Dataset 1
Data Collection
X-ray Wavelength                1.54 Å
Detector                        CCD
Crystal to Detector Distance    100 mm
Exposure Time                   20 s
Oscillation                     1 °
Maximum Resolution              3.5 Å
ϕ Range                         132 °
Images                          1 to 132
Kappa Offset                    30 °

Data Processing
Space Group                     P222
Cell Dimensions                 52.11 Å, 65.68 Å, 232.83 Å; 90 °, 90 °, 90 °
Resolution after Scaling        20 to 4 Å
Rmerge *                        22.1 %
I/σ                             3.7 (2.0)
Observed Reflections            31,606 (6,996 unique)
Completeness                    97.3 %
Redundancy                      4.52 (4.93)
Mosaicity                       1.81 °
Matthews Coefficient            1 mol./ASU : 3.0 (58.4 %); 2 mol./ASU : 1.5 (16.8 %)
The dataset was processed using the HKL2000 software (Otwinowski and Minor, 1997).
* $$R_{merge} = 100 \times \frac{\sum_{hkl} \sum_{i=1}^{n} \left| F^2_{hkl,i} - \langle F^2_{hkl} \rangle \right|}{\sum_{hkl} \sum_{i=1}^{n} F^2_{hkl,i}}$$ (Equation 5.1)
where $F^2_{hkl,i}$ is the intensity of each measurement of the hkl reflection and $\langle F^2_{hkl} \rangle$ is the mean value over the i = 1 … n measurements of equivalent reflections.
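To make the definition in Equation 5.1 concrete, the sketch below computes Rmerge from a list of intensity measurements. It is not the HKL2000 implementation: it assumes that symmetry-equivalent reflections have already been mapped to a common hkl index, it keeps singly measured reflections for simplicity, and the toy intensities are invented for illustration only.

```python
from collections import defaultdict


def r_merge(observations):
    """Rmerge (in %) following Equation 5.1.

    `observations` is an iterable of ((h, k, l), intensity) pairs in which
    equivalent reflections share the same (h, k, l) key.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0    # sum of |I_i - <I>| over all measurements
    denominator = 0.0  # sum of I_i over all measurements
    for intensities in groups.values():
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(value - mean) for value in intensities)
        denominator += sum(intensities)
    return 100.0 * numerator / denominator


# Toy data (hypothetical): two reflections, each measured twice.
toy = [((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
       ((0, 1, 0), 50.0), ((0, 1, 0), 50.0)]
print(f"Rmerge = {r_merge(toy):.2f} %")
```

In the toy example only the first reflection contributes to the numerator (5 + 5), while all four measurements contribute to the denominator (310).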
After scaling, the Molecular Replacement program MolRep was used to attempt to phase the data. One molecule of RNase H (PDB 1TFR) and one molecule of 32 core (PDB 1GPC) were sought. The data were scaled in space group P222 as well as in all the related primitive orthorhombic space groups (P222₁, P2₁2₁2 and P2₁2₁2₁), including inversion of the hand on the P222₁ and P2₁2₁2 space groups. However, all the attempts were unsuccessful. This is probably due to the poor quality of the data, indicated by an Rmerge value of 22 %.
Since the crystals seem to diffract poorly, it was proposed that a
synchrotron data collection could help, as a synchrotron X-Ray beam is more
intense than an in-house one. A new dataset was collected on the 22-ID
beamline at the Advanced Photon Source at Argonne National Laboratory. The
data were collected by Dr. Alexander Pavlovsky from Dr. Viola’s group. The
crystal used for this data collection as well as a frame from the dataset are
shown in Figure 5.9.
Figure 5.9 – D132N RNase H + 32-B Crystal Data Collection 2
[Crystal picture and diffraction frame.]
11.9 % PEG 4000, 0.1 M Na HEPES pH 7.5, 0.2 M Ammonium Chloride; complex at 0.3 mM, 20 °C, 2 µL + 2 µL hanging drop
The data collection and processing statistics are tabulated below in Table 5.6. The quality of this dataset is better than that of the previous one: the Rmerge is only 6.0 %, compared to 22 % before, and the mosaicity decreased to 0.62 °. Despite these improvements, the number of unique reflections is still too low, with only 1.6 unique reflections per atom.
Table 5.6 – Crystallographic Data for the D132N RNase H + 32-B Dataset 2
Data Collection
X-ray Wavelength                1.0332 Å
Detector                        CCD
Crystal to Detector Distance    485 mm
Exposure Time                   5 s
Oscillation                     0.3 °
Maximum Resolution              3.26 Å
ϕ Range                         122 °
Images                          1 to 340
Kappa Offset                    0 °

Data Processing
Space Group                     P222
Cell Dimensions                 52.88 Å, 65.33 Å, 234.03 Å; 90 °, 90 °, 90 °
Resolution after Scaling        30 to 4 Å
Rmerge *                        6.0 % (10.9 %)
I/σ                             18.5 (14.1)
Observed Reflections            24,163 (7,467 unique)
Completeness                    74.2 % (47.2 %)
Redundancy                      2.5
Mosaicity                       0.62 °
The dataset was processed using the HKL2000 software (Otwinowski and Minor, 1997).
* $$R_{merge} = 100 \times \frac{\sum_{hkl} \sum_{i=1}^{n} \left| F^2_{hkl,i} - \langle F^2_{hkl} \rangle \right|}{\sum_{hkl} \sum_{i=1}^{n} F^2_{hkl,i}}$$ (Equation 5.1)
where $F^2_{hkl,i}$ is the intensity of each measurement of the hkl reflection and $\langle F^2_{hkl} \rangle$ is the mean value over the i = 1 … n measurements of equivalent reflections.
Molecular Replacement was done using one molecule of RNase H and one molecule of the 32 core domain as search models, as previously described. MolRep and AMoRe were used, but no solution was found.
Crystal Analysis
Since the diffraction quality of the complex crystals was so poor that it
made phasing of the data impossible, it was decided to carry out some
biophysical experiments on the crystals themselves, in order to determine if both
proteins were present.
The first experiment that was done was to run the crystals, as well as the
crystallization drops, on an SDS-PAGE gel. A large number of crystals were
harvested using crystal freezing loops, and washed using the substitute mother
liquor (crystallization condition + dialysis buffer). Once enough crystals were
collected, they were dissolved in a solution of 100 mM Ammonium Bicarbonate
and run on an SDS-PAGE gel, shown in Figure 5.10.c, along with the separate
proteins. The same crystal sample was run with 1 mM TCEP, in case some
cross-linking was occurring. The bands for both RNase H and 32-B can be seen
in the crystal sample, indicating that both proteins are indeed present in the
crystals. However, the band corresponding to D132N RNase H is stronger than the 32-B one. In order to further investigate that fact, more crystallization drops were collected, similar to the previous ones, but this time the crystals were spun down and the supernatant was run on an SDS-PAGE gel, shown in Figure
5.10.d. Both proteins can be seen in the supernatant, and again the RNase H band is stronger than the 32-B one. If the lower concentration of 32-B seen in the crystal (Figure 5.10.c, lane 2) was due to missing 32-B molecules within the crystal lattice, an excess of 32-B would be found in the supernatant, as the crystal drops were setup in an equimolar ratio of D132N RNase H : 32-B. Since
this is not the case, the other possible explanation is that 32-B binds to the
Coomassie Blue dye more weakly than RNase H, and therefore shows as a
weaker band on the SDS-PAGE gel.
Another experiment that was done to further investigate this question was
to set up crystallization of the complex in several RNase H : 32-B ratios, ranging from 1:1 to 1:3. The crystals mostly grew only for the 1:1 ratio. This again indicates that the reason more RNase H can be seen on the gel is probably due to a staining problem.
At any rate, the important result from this experiment is that D132N
RNase H and 32-B can be seen on the crystal sample, indicating that the crystals do indeed contain both proteins.
Figure 5.10 – SDS-PAGE Gel of the D132N RNase H + 32-B Crystals
c – Lanes: 1- Molecular Weight Marker; 2- 32-B; 3- D132N RNase H + 32-B crystals; 4- D132N RNase H + 32-B crystals + TCEP; 5- D132N RNase H
d – Lanes: 6- Molecular Weight Marker; 7- Crystallization drop supernatant
Molecular weight marker bands (both gels): 66.3 kDa, 55.4 kDa, 36.5 kDa, 31.0 kDa, 21.5 kDa, 14.4 kDa
Mass Spectrometry was the other biophysical experiment carried out on the crystals. Similarly to what was described before, a number of crystals were looped out of the crystallization drop, rinsed with the substitute mother liquor
(minus the PEG that could interfere with the MS experiment), and dissolved in
100 mM Ammonium Bicarbonate. The sample was then sent to the Proteome
Consortium at the University of Michigan Medical School, for intact mass analysis as well as trypsin digestion and protein identification. The intact mass spectrum
is shown in Figure 5.11.
Figure 5.11 – Intact Mass Spectrum of the D132N RNase H + 32-B Crystals
[Spectrum; peaks labeled with the calculated molecular weights: 32-B, 31.8 kDa; D132N RNase H, 35.6 kDa.]
The error on the experimental molecular weights determined by intact mass spectrometry in this experiment is ± 22 Da.
Two intense peaks are present, with apparent masses of 32.076 kDa and 35.833 kDa, respectively. The calculated molecular weights for the two proteins are also shown on the spectrum. The experimental masses are close enough to the theoretical ones, and this experiment confirms that both proteins are present in the crystals. Moreover, the sample was also subjected to trypsin digestion and MALDI-TOF analysis, which further confirmed the identity of the two proteins in the sample, as shown in Figure 5.12.
Figure 5.12 – RNase H + 32-B Crystals MALDI-TOF Results
a – D132N RNase H, Protein Score: 159, 100 %
1 MDLEMMLDED YKEGICLIDF SQIALSTALV NFPDKEKINL SMVRHLILNS IKFNVKKAKT
61 LGYTKIVLCI DNAKSGYWRR DFAYYYKKNR GKAREESTWD WEGYFESSHK VIDELKAYMP
121 YIVMDIDKYE ANDHIAVLVK KFSLEGHKIL IISSDGDFTQ LHKYPNVKQW SPMHKKWVKI
181 KSGSAEIDCM TKILKGDKKD NVASVKVRSD FWFTRVEGER TPSMKTSIVE AIANDREQAK
241 VLLTESEYNR YKENLVLIDF DYIPDNIASN IVNYYNSYKL PPRGKIYSYF VKAGLSKLTN
301 SINEF
b – 32-B protein, Protein Score: 67, 99.99 %
1 MGFSSEDKGE WKLKLDNAGN GQAVIRFLPS KNDEQAPFAI LVNHGFKKNG KWYIETCSST
61 HGDYDSCPVC QYISKNDLYN TDNKEYSLVK RKTSYWANIL VVKDPAAPEN EGKVFKYRFG
121 KKIWDKINAM IAVDVEMGET PVDVTCPWEG ANFVLKVKQV SGFSNYDESK FLNQSAIPNI
181 DDESFQKELF EQMVDLSEMT SKDKFKSFEE LNTKFGQVMG TAVMGGAAAT AAKKADKVAD
241 DLDAFNVDDF NTKTEDDFMS SSSGSSSSAD DTDLDDLLND L
The sets of peptides corresponding to T4 D132N RNase H and the 32-B protein truncation with the highest protein score were chosen. Cysteine residues modified by carbamidomethylation are shown in blue, and oxidized Methionine residues are in green. The peptides identified by the MS experiment are shown in red. The database used for peptide matching was NCBInr.
Further Crystal Optimization
Since the two previous cryogenic datasets were not of good enough quality to allow phasing of the data, a room temperature diffraction experiment was done in order to determine whether the problem came from freezing the crystals or from the crystals themselves. The room temperature diffraction pattern, similarly to the cryogenic ones, was very anisotropic, with high mosaicity, a lot of diffuse scattering and low resolution. All of these indicate that the crystal lattice itself is somewhat disordered and is therefore responsible for the poor quality of the data. In order to improve the diffraction quality of the D132N RNase H + 32-B crystals, several optimization strategies were tested.
The first one was to slightly modify the crystallization condition, by changing concentrations (other than the precipitating agent concentration), changing the pH or adding an extra component.
• Depending on whether the interaction between the two proteins is hydrophobic or electrostatic, the ionic strength of the crystallization condition is likely to play an important role, so a salt gradient of 0 to 1 M of Ammonium Chloride was used.
The crystals only grew at a salt concentration ranging between 0.1 and 0.2 M.
This indicates that the interaction might be electrostatic, since higher salt concentrations seem to inhibit it, but still a minimum amount of salt in solution is needed to keep the proteins stable.
• The pH of the solution is also important, so the crystals were grown at a pH varying from 6.5 to 9.5, using Tris based buffers as well as the Na HEPES from the initial condition. Only a few crystals grew in bis-Tris HCl pH 6.5, and they could not be reproducibly obtained. Going from HEPES pH 7.5 to Tris HCl pH 7.5 did not make a big difference, but the crystals seemed to grow in clusters instead of as separate entities. At pH 8.5 and higher, only showers of smaller crystals were obtained.
• As some cross-linking was observed when the crystals were run on an
SDS-PAGE gel, the crystals were grown in the presence of a reducing agent, such as 1 mM TCEP or 10 mM β-mercaptoethanol. The crystal quality improved somewhat, and one dataset could be collected on a TCEP grown crystal (see the following section). However, the improvement was not dramatic.
• It was also proposed that bacterial growth might be part of the problem. Indeed, the crystals grown in the PEG 3350 + 0.1 M Tris HCl pH 7.5 + 0.2 M Ca Acetate condition repeatedly showed contamination of the crystallization drop. A final concentration of 0.5 mM Na Azide was added to the crystallization condition, but this did not improve the crystal quality.
Another strategy was to try to affect the nucleation and crystal growth
rates using several methods.
• The rate of nucleation seems rather high, as showers of crystals can be obtained very easily. To tackle that issue, the protein concentration was lowered, but it could not be lowered too much, as bigger crystals are needed to obtain sufficient resolution. 3 % glycerol was also added to the crystallization condition. Glycerol has two effects: it increases the viscosity of the solution, slowing down nucleation, and it is also a precipitating agent. Fewer and bigger single crystals were indeed obtained in the presence of 3 % glycerol, but the diffraction quality did not improve.
• Addition of water to the crystallization drop is also known to slow down
nucleation, as it takes longer for the drop to reach super-saturation. In the case
of the D132N RNase H + 32-B crystals, no improvement was seen.
• The crystals were also grown at 4 °C, for the same reason. Interestingly,
only showers of small crystals were obtained in the cold, compared to single crystals at room temperature.
• Hanging drop versus sitting drop crystallization has been shown to make a
difference in crystal growth. Sitting drops were used, a method that allows for a
larger drop size. Larger crystals could be obtained, but the disorder of the crystal
lattice was only amplified, resulting in even more diffuse scattering and higher
mosaicity upon X-Ray diffraction. Polypropylene micro-bridges were used instead
of the regular polystyrene micro-bridges, on the surface of which protein crystals tend to grow and get stuck.
Post-crystallization crystal improvement methods were attempted, such as desiccation or cross-linking.
• The desiccation method (Haebel et al., 2001) is supposed to rid the crystal
of the extra solvent that it contains, therefore shrinking the crystal lattice and
improving diffraction. It was carried out on the D132N RNase H + 32-B crystals, but all diffraction was lost after the desiccation of the crystals.
• Cross-linking using glutaraldehyde was also proposed to strengthen the
crystal lattice. This experiment was unsuccessful as well, as the cross-linking
actually destroyed the lattice.
Finally, the flash-freezing and data collection steps were optimized, so as to minimize the loss of resolution during these two steps.
• The crystals were flash-frozen in a helium stream at 20 K, instead of the
regularly used liquid nitrogen freeze. This is supposed to limit damage to the
crystal lattice upon freezing.
• It has been seen that using a substitute mother liquor with a different pH
from the crystallization condition can sometimes dramatically increase the
diffraction quality of the crystals. When this experiment was tried, it resulted in a
total loss of diffraction.
• Annealing of the crystal in the cryo-stream was attempted as well,
resulting also in the total loss of diffraction.
Challenges Posed by the D132N RNase H + 32-B Crystals
To summarize the issues that were encountered so far with the complex
crystals, several conflicting problems were found. All the crystals showed poor diffraction, diffuse scattering, high mosaicity and anisotropy of the diffraction pattern. One of the cell edges being large, the spots in that direction are very
close and can overlap easily. The larger crystals do diffract to better resolution,
but tend to crack upon flash-freezing, giving rise to twinned diffraction patterns,
and are also more disordered. On the other hand, smaller crystals show better
quality data, but to a lower resolution. All the optimization methods described in the previous section were largely unsuccessful, as they did not result in any dramatic improvement of the diffraction quality of the crystals.
The key to obtaining good enough data is to find a good compromise between all these issues, in other words, to find a crystal small enough that the disorder of the crystal lattice is limited, but still big enough to give decent resolution. Moreover, the easiest way to tackle the spot overlapping problem would be to collect data on a more intense synchrotron X-Ray source, which could provide better resolution and better separation.
A large number of crystals meeting these criteria were therefore flash-frozen and tested for diffraction, and the more promising candidates were saved so that a dataset could be collected at the synchrotron at APS.
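The spot overlap problem can be illustrated with a simple estimate. In the small-angle approximation, adjacent reflections along a cell edge of length a are separated on the detector by roughly Dλ/a, where D is the crystal to detector distance and λ the wavelength. The sketch below applies this approximation to the long cell edge using the parameters listed in Tables 5.5 and 5.6; it ignores spot size, mosaicity and beam divergence, and the names are illustrative.

```python
def spot_separation_mm(distance_mm: float, wavelength_A: float,
                       cell_edge_A: float) -> float:
    """Approximate spacing (mm) between adjacent spots along one cell edge.

    Small-angle approximation: separation ~ D * lambda / a.
    """
    return distance_mm * wavelength_A / cell_edge_A


# (crystal to detector distance in mm, wavelength in A, long cell edge in A)
DATASETS = {
    "dataset 1 (in-house)": (100.0, 1.54, 232.83),
    "dataset 2 (APS)": (485.0, 1.0332, 234.03),
}

separations = {name: spot_separation_mm(*params)
               for name, params in DATASETS.items()}
for name, value in separations.items():
    print(f"{name}: about {value:.2f} mm between adjacent spots")
```

Within this approximation, the gain in separation comes from the longer crystal to detector distance that the brighter synchrotron beam makes practical, rather than from the beam intensity itself.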
Data Collection (APS) and Processing
A number of crystals were taken to the Advanced Photon Source synchrotron, and four datasets were collected. The best one, with a resolution of
3.2 Å, is presented below and was used for phasing and model building. The dataset was collected on one of the crystals shown in Figure 5.13. These crystals were grown in the presence of 1 mM TCEP, which seems to improve the diffraction quality. Tris HCl pH 7.5 was also used instead of Na HEPES. The crystals were flash frozen after the stepwise ethylene glycol cryoprotection method described above.
Figure 5.13 - D132N RNase H + 32-B Crystal Used in Data Collection 3
12.8 % PEG 4000, 0.1 M Tris HCl pH 7.5, 0.2 M Ammonium Chloride, 1 mM TCEP; complex at 0.3 mM, room temperature, 2 µL + 2 µL hanging drop
The dataset was collected according to the parameters described in Table
5.7. A set of diffraction images from the dataset is shown in Figure 5.14. One can already see that the resolution is higher and the mosaicity lower than in the previous datasets.
Figure 5.14 - D132N RNase H + 32-B Crystal Data Collection 3 Images
a – Image 1 (0 to 0.5°); b – Image 90 (44.5 to 45°); c – Image 180 (89.5 to 90°)
The dataset was processed using MOSFLM (Leslie, 1992) and scaled using SCALA, which is part of the CCP4 Suite (Bailey, 1994). The data were first indexed and integrated as P222; the space group was then changed to P2₁2₁2₁ using SORTMTZ before scaling was done. A summary of the processing and scaling statistics is available in Table 5.7.
Table 5.7 - D132N RNase H + 32-B Crystal Data Collection and Processing
Data Collection
X-ray Wavelength: 0.90020 Å
Detector: CCD
Crystal to Detector Distance: 400 mm
Exposure Time: 5 s
Oscillation: 0.5°
Maximum Resolution: 3.2 Å
ϕ Range: 180°
Images: 1 to 360
Kappa Offset: 0°

Data Processing
Space Group: P2₁2₁2₁
Cell Dimensions: 51.22 Å, 64.78 Å, 233.81 Å; 90°, 90°, 90°
Resolution after Scaling: 117.0 to 3.4 Å
Rmerge *: 15.2 % (62.1 %)
I/σ: 7.4 (2.3)
Observed Reflections: 46,208 (10,365 unique)
Completeness: 99.4 % (99.8 %)
Redundancy: 4.5 (4.8)
Mosaicity: 0.68°
Matthews Coefficient: 1 mol./ASU: 2.87 (57.2 %); 2 mol./ASU: 1.44 (14.4 %)
The dataset was processed using the MOSFLM software (Leslie, 1992).

* $$R_{merge} = 100 \times \frac{\sum_{hkl}\sum_{i=1}^{n}\left|\langle F^{2}_{hkl}\rangle - F^{2}_{hkl,i}\right|}{\sum_{hkl}\sum_{i=1}^{n}\langle F^{2}_{hkl}\rangle}$$ (Equation 5.1)

where $F^{2}_{hkl,i}$ is the intensity of each hkl reflection and $\langle F^{2}_{hkl}\rangle$ is the mean value of the n measurements of equivalent reflections.
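As an illustration of how Equation 5.1 is evaluated, the following minimal Python sketch computes Rmerge for a handful of reflections. The reflection indices and intensities are made-up values chosen only to show the arithmetic; they are not taken from the dataset, and the function name is hypothetical.

```python
def r_merge(observations):
    """Return Rmerge in percent.

    observations maps each hkl index to the list of intensities
    measured for its symmetry-equivalent reflections.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        n = len(intensities)
        mean_intensity = sum(intensities) / n
        # Sum of absolute deviations from the mean for this hkl.
        numerator += sum(abs(mean_intensity - i) for i in intensities)
        # Sum over i of the mean intensity (equal to the sum of the intensities).
        denominator += n * mean_intensity
    return 100.0 * numerator / denominator


# Hypothetical toy data: two unique reflections with 3 and 2 measurements.
toy_data = {
    (1, 0, 0): [100.0, 110.0, 90.0],
    (0, 2, 1): [50.0, 40.0],
}
toy_r_merge = r_merge(toy_data)
print(f"Rmerge = {toy_r_merge:.2f} %")
```

A lower value means that repeated measurements of the same reflection agree more closely with one another.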
The Matthews coefficient that was calculated indicates that only one
molecule is present in the asymmetric unit (Matthews, 1968). Molecular
replacement was performed using the maximum likelihood program PHASER
(McCoy, 2007). The search models were the 1TFR (RNase H) and 1GPC (32
core domain) coordinate files. The results are shown in Table 5.8.
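The Matthews coefficient quoted in Table 5.7 can be checked from the cell dimensions. The sketch below is an illustration only and rests on three assumptions: the molecular weight is the theoretical 67,402 Da of the complex (Table 5.11), the unit cell of P2₁2₁2₁ contains four asymmetric units, and the solvent fraction is estimated with the common approximation 1 − 1.23/V_M.

```python
# Orthorhombic cell lengths in Å (Table 5.7).
CELL = (51.22, 64.78, 233.81)
# P2(1)2(1)2(1) has four asymmetric units per unit cell.
ASU_PER_CELL = 4
# Theoretical molecular weight of the complex in Da (Table 5.11); an assumption here.
MW_COMPLEX = 67402.0


def matthews(cell, asu_per_cell, mw_da, n_mol):
    """Return (V_M in Å^3/Da, estimated solvent fraction)."""
    a, b, c = cell
    volume = a * b * c  # valid because all cell angles are 90°
    v_m = volume / (asu_per_cell * n_mol * mw_da)
    solvent = 1.0 - 1.23 / v_m  # common approximation for proteins
    return v_m, solvent


for n_mol in (1, 2):
    v_m, solvent = matthews(CELL, ASU_PER_CELL, MW_COMPLEX, n_mol)
    print(f"{n_mol} mol./ASU: V_M = {v_m:.2f} Å^3/Da, solvent = {100 * solvent:.1f} %")
```

With these inputs, the one-molecule case comes out close to the tabulated 2.87 Å³/Da and about 57 % solvent, while two molecules would leave roughly 14 % solvent, which is too low for a protein crystal.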
Table 5.8 - D132N RNase H + 32-B Molecular Replacement Results
Score                         Search Model 1TFR    Search Model 1GPC
Rotation Function Score       11.5                 6.1
Translation Function Score    26.9                 27.6
Packing                       0                    0
Log-Likelihood Gain           597                  1030
The log-likelihood gain (LLG) indicates how well the data agree with the model; a good molecular replacement solution will therefore have a high LLG score. The rotation (RFZ) and translation (TFZ) Z scores are then calculated from the LLG. An RFZ score higher than 5 and a TFZ score higher than 8 indicate that a solution was found. The packing score indicates any clashes found by the program.
Model Building
The solution found by PHASER was first refined using rigid body refinement, then restrained refinement in REFMAC (Bailey, 1994; Murshudov,
1997). The refined model was then imported in COOT (Emsley, 2004), where major loop movements and clashes were fixed. The reflection file used to calculate the electron density was the one output after molecular replacement.
The electron density was calculated as follows:
$$\rho(x,y,z) = \frac{1}{V}\sum_{hkl} F_{hkl}\cos\left[2\pi\left(hx + ky + lz - \varphi_{hkl}\right)\right]$$ (Equation 5.2)
Figure 5.15 – D132N RNase H + 32-B Model Building and Refinement
1. Molecular Replacement (PHASER Model): RFZ = 6.1, TFZ = 27.6
2. Rigid Body Refinement: R = 39.9 %, Rfree = 40.5 %
3. Restrained Refinement (Weight = 0.01): R = 28.6 %, Rfree = 34.5 %
4. Model Building: fixed the main loop movements and clashes
5. Rigid Body Refinement: R = 28.1 %, Rfree = 33.5 %
6. Restrained Refinement (Weight = 0.01): R = 27.6 %, Rfree = 33.7 %
7. Procheck: Most favored = 83.7 %, Allowed = 15.2 %, Generously Allowed = 0.9 %, Disallowed = 0.2 %
After model building, the modified model was then refined once again using REFMAC. At that point, the R value was around 27 % and the Rfree 33 %. A
schematic summary of the model building and refinement for the D132N RNase
H + 32-B crystal structure is available in Figure 5.15.
Any additional modification made to the protein chains, even a small one, resulted either in an increase of the R value or, more often, in an increase of the Rfree without any significant decrease of the R value. This is usually an indication that the model is being built incorrectly. Since the electron density map was of rather poor quality and difficult to interpret correctly, the model output by REFMAC after the first round of model building was declared final. For the same reason, the automatic addition of water molecules with the ARP/wARP command was not done, and the missing A domain of the 32-B protein could not be built. The final statistics for refinement and validation are presented in Table 5.9.
Table 5.9 – Final Refinement and Validation Summary
Refinement
Resolution: 117.0 to 3.4 Å
R: 27.6 %
Rfree: 33.7 %
RMS Deviation, Bond Lengths: 0.014 Å
RMS Deviation, Bond Angles: 2.001°

Ramachandran Plot
Most Favored: 83.7 %
Allowed: 15.2 %
Generously Allowed: 0.9 %
Disallowed: 0.2 %
The final model for the interaction between RNase H and the 32 core domain is presented in Figure 5.16.
Figure 5.16 – Final D132N RNase H + 32-B Model
Proteins shown: RNase H and the 32 core domain.
Surface Rendering: The two proteins are color-coded similarly to the previous model.
Jones’ Rainbow Ribbons: The N-termini of the two proteins are colored in red and the C-termini in blue.
B-Factor Ribbons: The regions with the highest B-factors are colored in red, the lowest B-factors are shown in blue.
The C-terminus of RNase H, as predicted, is largely involved in the interaction with the 32 protein. The part of 32 that interacts with RNase H is the helix C region at the back of subdomain II. According to Shamoo’s description of the 32 protein structure (Shamoo et al., 1995), subdomain II forms most of the binding cleft for the single-stranded DNA, while subdomain I is involved in Zn2+ binding. The surface rendering shows how closely the two proteins fit together. The B-factor ribbon structure shows that subdomain I of the 32 protein has very high B-factors, which is consistent with the very poor electron density in this region. Since subdomain I is the Zn2+-binding domain of the 32 protein, it is possible that the disorder in that region resulted from a lack of Zn2+ in the crystals.
The separate structures for RNase H and the 32 core domain were superposed onto the complex structure, so that the domain movement is more visible. This is shown in Figure 5.17. The complex is shown in green and cyan as before, and the separate RNase H and 32 core structures in yellow. The main domain movement occurs at the helix C loop in the 32 protein, which swings down to lock onto RNase H (c on the figure). There is a more subtle movement in RNase H as well, denoted by d on the figure: the binding of the 32 protein seems to push the helices of the large subdomain towards the active site, which would strengthen the binding of the DNA substrate. Finally, shown as e, a small loop in the Zn2+ binding region of the 32 protein moves out of the way to accommodate RNase H.
Figure 5.17 – Domain Movement Observed upon Binding
Panels: Front View and Back View, with the movements labeled c, d and e.
The final models of RNase H and the 32 protein are shown in green and blue respectively, consistent with the previous figure. In yellow is the superposition of the crystal structures of the two proteins by themselves (pdb files 1TFR and 1GPC). The main domain movements are indicated with a black star, and the clashes that were fixed after model building with a red star.
Figure 5.18 below shows the electrostatic surfaces of RNase H and the 32 core domain at the site of the interaction. It can be seen that the two proteins interact through hydrophobic areas, in gray on the figures.
Figure 5.18 – Electrostatic Surfaces
RNase H is colored in green and the 32 protein is cyan. The two proteins were pulled apart so that the surfaces are easier to see. The electrostatic surfaces are colored as follows: the negatively charged areas are in red, and the positively charged ones in blue. The areas that are shown in gray are hydrophobic.
The helix 12 from RNase H, which plays a critical role in the binding, is
almost entirely non-polar: all the residues carrying a charge are oriented towards
the active site, and the non-polar residues are oriented towards the outside and the 32 protein. A number of hydrophobic residues are present on the exposed surface of the 32 core domain as well, notably two isoleucines (I60 and I151) and
a tryptophan (W144).
Moreover, it was seen on the crystal trials that at high salt concentration,
the crystals did not grow anymore. This can now easily be explained by the fact
that the interaction of the two proteins is hydrophobic.
Finally, a fork DNA substrate was modeled in the D132N RNase H + 32
core protein structure. The coordinates for the DNA were taken from the
RNase H + DNA crystal structure (Devos et al., 2007). The final model is
presented in Figure 5.19.
Figure 5.19 – Superposition of a Fork DNA Substrate
The fork DNA model was obtained from the D132N RNase H + fork DNA coordinate file (Devos et al., 2007) and superposed on the RNase H + 32 core domain model.
The RNase H + 32 protein structure complements the RNase H + fork DNA structure very well. The way the two proteins were found to interact positions the groove region of the 32 protein to bind and protect the 3’-arm of single-stranded DNA coming from RNase H.
Discussion
The X-Ray diffraction data collected from the D132N RNase H + 32-B
crystals, even at 3.4 Å resolution, allowed us to dock the two proteins together
and observe some movements occurring upon binding. A large loop that is part
of the 32 protein subdomain II locks down on RNase H for instance. The surfaces
taking part in the binding were found to be largely non-polar, therefore meaning
that the interaction of the two proteins occurs through hydrophobic contact. This
is interesting as preliminary results during the crystallization trials indicated that
the interaction was electrostatic, and not hydrophobic. Also, the modeling of a
fork DNA substrate onto the RNase H + 32 protein structure showed how well the
32 protein is positioned to bind the single-stranded DNA in its cleft between the subdomains I and II. This is a very good confirmation that the low resolution model of the interaction between RNase H and 32 protein is physiologically relevant.
Since the interaction appears to be hydrophobic, a nice way of testing the model would be to mutate some of the non-polar residues involved in the binding into charged residues having roughly the same size. The hydrophobic residues at the 32 protein surface previously mentioned, I60, I151 and W144, were chosen.
They are all directly pointing at RNase H and span the entire length of the binding interface. Figure 5.20 shows where these residues are located on the model. It was decided to mutate the isoleucine residues into aspartate groups, and the tryptophan into a glutamate. These mutations were chosen as they only change the charge at the interface, but the overall shape is roughly conserved.
Once the three 32-B mutants have been cloned and expressed, their interaction with RNase H will be tested using the same techniques that were used for the RNase H + 32-B complex. Non-denaturing electrophoresis will be performed first to qualitatively estimate the binding or lack thereof between
RNase H and the mutants. Quantitative techniques like ITC or Fluorescence
Anisotropy titrations will then be used to estimate how strongly the mutations affect
the interaction. If it appears that the single-mutants do not have a strong effect on the binding, then double-mutants may have to be cloned as well.
Figure 5.20 – 32-B Mutated Residues
Labeled residues: I60, W144 and I151.
5.3.3. Scattering Studies
Since it proved to be difficult for a long time to obtain any information from the crystals, solution-based experiments like Dynamic Light Scattering and Small
Angle X-Ray Scattering were carried out to gather more information on the state of the D132N RNase H + 32-B complex in solution.
Dynamic Light Scattering
The D132N RNase H + 32-B complex was prepared as described in
Section 5.3.1, at 30 µM (~ 1 mg/mL). The sample was filtered using a 0.1 µm pore size Ultrafree-MC filter, and centrifuged for 20 minutes at 4 °C and 18,000 rcf.
The DLS readings were then taken at 4 °C and 20 °C. Unfortunately, the complex aggregates at 20 °C and no satisfactory measurements could be made at that temperature. The results for the 4 °C experiment are shown in Figure 5.21 and
Table 5.10. The control measurements on the separate proteins were also done and presented in their respective chapter (Section 3.5.11 for 32-B and Section
4.2.5 for D132N RNase H).
Figure 5.21 – Dynamic Light Scattering Results for the D132N
RNase H + 32-B Complex at 4 °C
Table 5.10 – Dynamic Light Scattering Results for the D132N
RNase H + 32-B Complex at 4 °C
Rh (nm) % Pd MW (kDa) % Intensity % Mass
3.3 15.4 54 100.0 100.0
The polydispersity of the sample is around 15 %, indicating that slight
aggregation is occurring. The molecular weight of 54 kDa was calculated from
the hydrodynamic radius of 3.3 nm. That value was a little small but close to the
one expected for the D132N RNase H + 32-B complex, which has a theoretical
molecular weight of 67.4 kDa.
The fact that the complex is aggregating at 20 °C might be an indication
as to why the crystals were not diffracting to high resolution. However, when the
crystallization experiment was repeated at 4 °C, only showers of small crystals
were obtained.
Small-Angle X-Ray Scattering
The SAXS sample was prepared in the same way the DLS sample was, at
100 µM concentration. The data was collected at the APS 15-ID beamline, at
room temperature. Similarly to what was done with the separate proteins (see
Section 3.5.12 and 4.2.5), images were collected with 5 s, 20 s and 40 s
exposure, for the buffer and the protein sample, and only the 40 s images were
averaged for the data processing. The data used in processing is shown in
Figure 5.22.c, it was superposed with the D132N RNase H and the 32-B data, in
Figure 5.22.d. The scattering curves for the two separate proteins look very different from the complex one, indicating that it is indeed the complex scattering the X-Rays and not D132N RNase H or 32-B by themselves.
Figure 5.22 – D132N RNase H + 32-B SAXS Data Collection c d
Complex D132N RNase H 32-B
The data were processed with GNOM to two different resolutions: a higher resolution (171 to 24 Å) dataset was made as well as a lower resolution one (171 to 45 Å). The estimated maximum dimension of the particle Dmax was 100 Å. The experimental vs. calculated data and size distribution plots are shown in Figure
5.23. The radii of gyration that were calculated at that point were as follows:
30.66 ± 0.09 Å for the high resolution data, and 31.3 ± 0.2 Å for the low resolution data. These values are in good agreement with each other, and are also consistent with the hydrodynamic radius of 33 Å obtained from the DLS experiment.
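One simple way to compare the two radii is the ratio Rg/Rh. The sketch below is illustrative only: it uses the values quoted above and compares the ratio with the textbook value for a uniform solid sphere, sqrt(3/5). Reading a ratio above the sphere value as a sign of an extended particle is a rule of thumb, not a result of this work.

```python
import math

# Values quoted in the text, in Å.
RG_HIGH_RES = 30.66   # GNOM, high resolution data
RG_LOW_RES = 31.3     # GNOM, low resolution data
RH_DLS = 33.0         # hydrodynamic radius from DLS (3.3 nm)

# Rg/Rh for a uniform solid sphere.
SPHERE_RATIO = math.sqrt(3.0 / 5.0)

ratio_high = RG_HIGH_RES / RH_DLS
ratio_low = RG_LOW_RES / RH_DLS

print(f"Rg/Rh (high resolution) = {ratio_high:.3f}")
print(f"Rg/Rh (low resolution)  = {ratio_low:.3f}")
print(f"Rg/Rh (solid sphere)    = {SPHERE_RATIO:.3f}")
```

Both ratios lie above the sphere value, which would be in line with a particle that is not spherical.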
Figure 5.23 – D132N RNase H + 32-B SAXS Data Processing (GNOM)
High Resolution (172 to 24 Å) Low Resolution (172 to 45 Å)
experimental data
model data
The GNOM datasets were then input in GASBOR or DAMMIN, two ab initio modeling programs that calculate the molecular envelope of the protein in solution. DAMMIN was used in the “keep” mode (models averaged with DAMAVER) with both the high resolution and the low resolution data, while GASBOR was only used with the high resolution data. The models output by the two programs are presented in Figure 5.24. The ribbon structures of RNase H and the 32 core were placed in the envelopes using the program COOT (Emsley, 2004). The proteins were positioned in a relative orientation matching the one observed in the crystal structure of the D132N RNase H + 32-B complex. For comparison, the surface rendering of the crystal structure is also shown in Figure 5.24.c.
Figure 5.24 – D132N RNase H + 32-B SAXS 3D Molecular Envelopes
Figure 5.24.a – DAMMIN Models
Lower resolution model (172 to 45 Å): average of seven models, χ2 = 0.71, dimensions 110 Å × 60 Å × 50 Å
Higher resolution model (172 to 24 Å): average of five models, χ2 = 0.76, dimensions 110 Å × 65 Å × 50 Å
Figure 5.24.b – GASBOR Model
χ2 = 1.0
Dimensions: 105 Å × 55 Å × 30 Å
Figure 5.24.c – Crystal Structure Surface Rendering
Dimensions: 95 Å × 50 Å × 40 Å
In all the figures above, RNase H is shown in green and the 32 protein in blue.
The GASBOR model seems to fit the crystal structure the best, even
though the χ2 values for the DAMMIN models are lower. In all cases, the
molecular envelope has an elongated shape, consistent with the surface
obtained from the crystal structure.
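A rough cross-check of the envelope dimensions against the radius of gyration can be sketched as follows. This is a crude approximation and not part of the original analysis: each envelope is treated as a uniform solid ellipsoid whose full axis lengths are the quoted dimensions, for which Rg² = (a² + b² + c²)/5 with a, b and c the semi-axes.

```python
import math


def ellipsoid_rg(length, width, height):
    """Radius of gyration (Å) of a uniform solid ellipsoid.

    The arguments are the full axis lengths in Å; the formula
    uses the semi-axes a, b, c: Rg^2 = (a^2 + b^2 + c^2) / 5.
    """
    a, b, c = length / 2.0, width / 2.0, height / 2.0
    return math.sqrt((a * a + b * b + c * c) / 5.0)


# Dimensions quoted in Figure 5.24, in Å.
envelopes = {
    "DAMMIN, lower resolution": (110.0, 60.0, 50.0),
    "DAMMIN, higher resolution": (110.0, 65.0, 50.0),
    "GASBOR": (105.0, 55.0, 30.0),
    "Crystal structure surface": (95.0, 50.0, 40.0),
}

estimates = {name: ellipsoid_rg(*dims) for name, dims in envelopes.items()}
for name, rg in estimates.items():
    print(f"{name}: Rg ~ {rg:.1f} Å")
```

Under this approximation the DAMMIN envelopes give values near 30 Å, close to the GNOM radii of gyration, while the more compact GASBOR and crystal dimensions give somewhat smaller values.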
In the case of the DAMMIN and GASBOR models, the protein atomic
models were manually built inside the molecular envelopes, which is clearly not
the best way of obtaining a good model. The program SASREF in the ATSAS
suite is a rigid body modeling program that fits known structures into SAXS molecular envelopes using simulated annealing. It was used with the 1TFR
(RNase H) and 1GPC (32 core domain), against the GNOM dataset. This program should be run a large number of times, as a different model is obtained from every run. The models were compared, and only the best one is presented in this section. The χ2 values for these models ranged from 2.1 to 3 and above.
The fit obtained from the best SASREF run is shown in Figure 5.25.
Figure 5.25 – Best SASREF fit for the D132N RNase H + 32-B calculated
vs. experimental data
χ2 = 2.1
experimental data
calculated data
SASREF outputs a coordinate file for each model, corresponding to the rigid body modeling of the two proteins. The model corresponding to the fit shown previously is presented in Figure 5.26.
Figure 5.26 – Best SASREF Model for the D132N RNase H + 32-B Complex
Left: SASREF model (RNase H and 32 core).
Right: superposition of the SASREF model and the crystal structure, showing the 32 core as positioned by SASREF and the 32 core domain from the crystal.
In this model, the 32 protein interacts with helix 4 on the side of the bridge
region of RNase H (see (Mueser et al., 1996) for the description of the regions of
RNase H). It is known from other studies and the crystal structure that the
RNase H – 32 protein interaction is made through the C-terminus of RNase H.
Indeed, when the SASREF model was superposed on the crystal structure, as
seen on the right-hand side of Figure 5.26, the position of the 32 protein
predicted by SASREF is very different from the actual one.
In parallel, the scattering data from the D132N RNase H + 32-B complex crystal structure was calculated using the program CRYSOL, and superimposed onto the experimental data. The curves are shown in Figure 5.27.
Figure 5.27 –CRYSOL fit for the D132N RNase H + 32-B theoretical vs. experimental data
χ2 = 1.59
experimental data
model data
The two datasets superpose very well and the χ2 value for the fit is 1.59, which is lower than for any of the SASREF models. This is a very good indication that the crystal structure is also relevant in solution, and that the two proteins actually interact in the way that was seen in the structural studies.
Discussion
Dynamic Light Scattering and Small-Angle X-Ray Scattering studies were
carried out on the D132N RNase H + 32-B complex, in order to better
characterize the interaction between these two proteins.
The DLS results showed that the complex is more stable at 4 °C than at
room temperature, which was an interesting result as the complex crystals could only be grown as large single crystals at room temperature. It was also confirmed that the molecular weight of the complex is consistent with a 1:1 ratio of the two proteins.
The SAXS experiment provided us with some molecular envelopes that are reasonably consistent with the crystal structure. The experimental and the
calculated scattering data were, on the other hand, remarkably consistent with
one another (CRYSOL), which was a very good confirmation of the status of the
D132N RNase H + 32-B complex in solution, as compared to the crystal. When
rigid body modeling was attempted to fit the two proteins together, no model
produced by SASREF was close to the one seen from the structural studies, and
all had a larger error than the CRYSOL fit.
In the case of the RNase H + 32 protein complex, structural studies were
successfully done so SAXS was mostly used to confirm that the solution 3D
envelope and the crystal structure were in good agreement with each other.
However, it would be much more difficult to produce a good model of the
interaction between two proteins if the SAXS data was to be used alone (as was
the case for the 59 protein and 32 protein complex, (Dwlgosh, 2008)): The
SASREF modeling program, based only on shape fitting, had difficulties coming
up with a decent model.
5.3.4. Size Exclusion Chromatography
D132N RNase H, 32-B and the complex were analyzed using size
exclusion chromatography. The Superdex 200 column was chosen, since it has a void volume of 200 kDa, as compared to the Superdex 75 which has a void
volume of 75 kDa and therefore was not appropriate for the molecular weights
that were dealt with in this particular study. These molecular weights are listed in
Table 5.11.
Table 5.11 – D132N and 32-B Molecular Weights
Protein            Molecular Weight
D132N RNase H      35,558 Da
32-B               31,844 Da (monomer), 63,688 Da (dimer)
Complex            67,402 Da
The buffer used was composed of 25 mM Bis-Tris HCl pH 6.5, 150 mM
NH4Cl, 10 mM MgCl2 and 2 mM β-mercaptoethanol. The proteins were also
dialyzed in that buffer prior to the experiment, and concentrated to around
0.2 mM. Each protein separately, then the complex was loaded on the column in three separate runs. The three chromatograms are shown in Figure 5.28. D132N was eluted in fractions 57 to 66 (elution time = 77 minutes), 32-B in fractions 52 to 61 (69.5 minutes). The complex run showed two peaks, one in fractions 52 to
59 (68 minutes) and the other in fractions 59 to 65 (76 minutes). The fractions
highlighted with a red star were run on an SDS-PAGE gel, presented in Figure
5.29.
Several problems appeared with this experiment. First, it was seen in
Chapter 3 that 32-B can exist as a dimer in solution. The molecular weight of a
32-B dimer is very close to the molecular weight of the complex (see Table 5.11),
which makes the separation via gel filtration difficult. Indeed, it is really hard to
tell if the first peak observed in the complex run corresponds to the D132N
RNase H + 32-B complex or a dimer of 32-B proteins. The Superdex 75 could
have provided a more efficient separation, which may nonetheless not have been
good enough, but it was not used as its void volume of 75 kDa was too low. The
other problem appeared with the SDS-PAGE gel shown in Figure 5.29, where
D132N RNase H and 32-B run at the same level, and are indistinguishable.
Figure 5.28 – D132N RNase H + 32-B Complex Gel Filtration Assay
Chromatograms shown: D132N RNase H (0.23 mM); 32-B Protein (0.20 mM); D132N RNase H + 32-B Complex (0.15 mM).
On the chromatogram, the OD260 is shown in purple, the OD280 in green, the conductivity in red and the % B in black. The red stars indicate which fractions were run on an SDS-PAGE gel.
Figure 5.29 – D132N RNase H + 32-B Complex Gel Filtration SDS-PAGE Gel
Lanes 1 to 15:
1- D132N RNase H – Load
2- D132N RNase H – F. 62
3- D132N RNase H – F. 77
4- Molecular Weight Marker
5- 32-B – Load
6- 32-B – F. 56
7- Molecular Weight Marker
8- Complex – Load
9- Complex – F. 53
10- Complex – F. 55
11- Complex – F. 57
12- Complex – F. 59
13- Complex – F. 61
14- Complex – F. 63
15- Complex – F. 65
To solve that problem, the fractions 54 and 62 from the D132N RNase H +
32-B run were analyzed by mass spectrometry. They were chosen because they
correspond to the beginning of the first peak and the end of the second peak,
therefore minimizing overlaps and cross-contamination of the samples. About
500 µL of each fraction was dialyzed several times in 50 mM Ammonium Bicarbonate using a Microcon™ concentrator, and then filtered. The samples were then run on the ESI-Ion Trap Mass Spectrometer available at the Instrumentation Center. Despite the multiple dialyses, the concentration of NH4+ and Mg2+ ions in solution was still high, which made the interpretation of the spectra difficult. However, it was seen that fraction 54 did contain two different proteins, but the molecular weights could not be determined precisely because of the high salt concentration. Fraction 62 only contained one protein with a molecular weight of 34,465 Da, which has to be RNase H since 32-B is only 31.8 kDa. From these results we can say that the first peak observed contained both D132N RNase H and 32-B. That peak then most likely corresponds to a mixture of the complex and of 32-B dimers, as it was twice as intense as the second peak. The identity of RNase H in the second peak was confirmed. It should be pointed out that the fractions were kept at 4 °C for a few weeks before being dialyzed. The proteins at that point were probably partly aggregated, trapping salt ions, which made the dialysis step more difficult. This could also explain why the molecular weight of RNase H came out to be about 1 kDa lighter than it should have been, the protein having been degraded over time.
5.3.5. Isothermal Titration Calorimetry
All the experiments that have been described so far show that D132N
RNase H does bind to 32-B, but these are all qualitative results. In order to better quantify the binding of the two proteins, ITC was performed for the D132N
RNase H + 32-B complex formation.
The two proteins were first dialyzed in the usual dialysis buffer
(25 mM bis-Tris HCl pH 6.5, 150 mM NH4Cl, 10 mM MgCl2 and 2 mM
β-mercaptoethanol) and then concentrated. The concentration of the protein in the syringe has to be roughly 20 times higher than the one in the cell. D132N
RNase H was the titrated protein, placed in the cell, and 32-B the titrant placed in the syringe, the reason being that 32-B can be obtained in larger quantities and
at a higher concentration, making it the obvious candidate for the titrant protein.
The concentration of the two proteins for each run is given in Table 5.12.
Table 5.12 – Protein Concentrations Used in the ITC Experiment
Titrant / Titrated Protein      32-B         D132N RNase H
Buffer / Buffer                 –            –
Buffer / D132N RNase H          –            0.0516 mM
32-B / Buffer                   0.980 mM     –
32-B / D132N RNase H            0.917 mM     0.0509 mM
All the runs were set up in a similar fashion: 40 injections of 5 µL and 10 s were made, except for the first injection that only contained 1 µL. The rotation speed of the syringe was 270 rpm, and the temperature of the cell 20 °C. Several control runs had to be done first. The buffer-buffer run, or injection of buffer into buffer, provides the heat of injection. The 32-B-buffer and the buffer-D132N
RNase H runs give the heat of dilution of each protein into the buffer. The different heats (injection, dilution) obtained from the control runs then have to be subtracted from the 32-B-D132N RNase H data, in order to get the actual heat of binding. Figure 5.30 shows the 32-B into D132N RNase H titration curve. The shape of the curve was then fitted assuming a one-site binding, and the fit was used to calculate the thermodynamic parameters of the binding of 32-B to D132N
RNase H, which are summarized in Table 5.13.
Figure 5.30 – D132N RNase H + 32-B Isothermal Titration
Table 5.13 – Thermodynamic Parameters of the D132N RNase H + 32-B
Complex Formation
Parameter             Result
N                     0.92 ± 0.03
Kd (µM)               3.8 ± 0.7
∆H (kJ·mol⁻¹)         8.1 ± 0.3
∆S (J·K⁻¹·mol⁻¹)      31.4
T∆S (kJ·mol⁻¹)        9.2
∆G (kJ·mol⁻¹)         -1.1
The dissociation constant calculated for the D132N RNase H + 32-B
complex is in the low micromolar range, 3.8 µM, indicating that the affinity of the
two proteins for each other is not very strong, but strong enough so that the complex can be observed in in vitro assays, like on the non-denaturing gel. Also, the binding reaction appears to be endothermic (∆H > 0), and therefore entropy
driven. This is interesting, as it implies that the complex would form more readily
when the temperature increases, i.e. at room temperature versus 4 °C. This
might be the reason why showers of small crystals were observed at 4 °C, and larger single crystals could only be grown at room temperature. However, the proteins are less stable at room temperature than at 4 °C, as was seen with the
DLS experiments, and have a tendency to aggregate. These two opposing phenomena probably explain why the D132N RNase H + 32-B crystals could be grown at room temperature, but had some intrinsic disorder due to the lower stability of the proteins.
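The relation between the dissociation constant and the free energy of binding can be made explicit with a short calculation. The sketch below is only an illustration: it assumes the standard relation ∆G° = RT ln(Kd) with a 1 M standard state, the cell temperature of 20 °C, and the Kd and ∆H values of Table 5.13.

```python
import math

R_GAS = 8.314462618      # gas constant, J/(mol K)
TEMPERATURE = 293.15     # 20 °C cell temperature, in K
KD_MOLAR = 3.8e-6        # dissociation constant from Table 5.13, in M
DELTA_H = 8.1e3          # enthalpy from Table 5.13, in J/mol

# Standard free energy of binding (1 M standard state assumed).
delta_g = R_GAS * TEMPERATURE * math.log(KD_MOLAR)

# Entropy that would be consistent with this delta_g and the measured delta_h.
delta_s = (DELTA_H - delta_g) / TEMPERATURE

print(f"dG = {delta_g / 1000:.1f} kJ/mol")
print(f"T*dS = {TEMPERATURE * delta_s / 1000:.1f} kJ/mol")
print(f"dS = {delta_s:.1f} J/(K mol)")
```

With these inputs the relation gives a ∆G° of about −30 kJ/mol and an entropy of about 131 J·K⁻¹·mol⁻¹. These figures differ from the ∆S, T∆S and ∆G entries of Table 5.13, so the tabulated values should be read with that in mind; the conclusion that the binding is endothermic and entropy driven is unaffected.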
5.3.6. Fluorescence Anisotropy Titration
Fluorescence Anisotropy is another experiment that allows for quantitative characterization of the binding of a protein to another. One way of carrying out
this experiment, in the case of the D132N RNase H + 32-B complex, would be to
label one of the proteins with the fluorescent label, and titrate it with the other
protein. However, it has been shown that labeling of some proteins can affect
their interaction with other proteins or substrates. It is the case for RNase H:
once labeled, it doesn’t bind to its DNA substrate anymore, but the labeled DNA
binds to the unlabeled RNase H with a low nanomolar affinity (Juliette M. Devos,
personal communication). To circumvent these issues, it was chosen to use a
labeled DNA substrate loaded with RNase H as the starting point of the titration,
and then use 32-B as the titrant. That strategy also has the advantage of
bringing DNA back in the experiment, since the two proteins only interact in
physiologically relevant conditions through DNA. The labeled DNA substrate that
was chosen is shown in Figure 5.31. It is a 20/30 fork, labeled at its 5’-end with
HEX-Fluorescein. HEX absorbs at 535 nm and emits at 556 nm.
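Figure 5.31 – Fluorescence Anisotropy Titration Fork DNA Substrate
(Schematic of the 20/30 fork DNA: the segment lengths shown are 5 and 15 nucleotides on one strand and 15 and 15 nucleotides on the other; the asterisk marks the HEX label.)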
The proteins were dialyzed in the dialysis buffer composed of 25 mM bis-Tris HCl pH 6.5, 150 mM NH4Cl, 10 mM MgCl2 and 2 mM β-mercaptoethanol.
They were then concentrated to around 300 µM for D132N RNase H and 700 µM
for 32-B, so as to limit the dilution effect during the titration. The DNA substrate
was annealed according to the protocol described in Section 2.14.2. The initial
concentration for the starting material in the cuvette (fork DNA + D132N RNase
H) was 10 µM, where only 5 % of the DNA was labeled and the rest was the
same 20/30 fork DNA without the 5’-HEX label. The reason behind that trace
experiment is that micromolar concentrations of protein and DNA are needed for titrations having a micromolar Kd, but a micromolar concentration of labeled DNA would saturate the detector, as well as being extremely expensive. This particular concentration was chosen according to the results from the ITC experiment, which gave a dissociation constant of 3.8 µM. 32-B at 400 µM was then titrated until equilibrium was reached.
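The anisotropy A is defined from the emission intensities polarized parallel ($I_{\parallel}$) and perpendicular ($I_{\perp}$) to the excitation light:
$$A = \frac{I_{\parallel} - I_{\perp}}{I_{\parallel} + 2I_{\perp}}$$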
A total of ten anisotropy readings were taken after each addition and averaged before being used in the titration curve. The initial fork DNA anisotropy readings averaged around 0.07, and increased to 0.16 to 0.17 when D132N RNase H was loaded on the fork. It should be noted that the dissociation constant of RNase H for fork DNA is around 20 nM, so if the concentration is 10 µM, 98 % or more of RNase H is bound to the DNA. The titration curve for the 32-B + D132N RNase H + 20/30 fork DNA and the calculated fit are shown in Figure 5.32.
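The extent to which RNase H is loaded on the fork DNA can be estimated with the exact 1:1 binding equation. The sketch below is illustrative and assumes equimolar RNase H and fork DNA at 10 µM each, which the text does not state explicitly; with an excess of DNA the bound fraction would be higher.

```python
import math


def fraction_bound(protein_total, ligand_total, kd):
    """Fraction of protein in complex for 1:1 binding.

    All concentrations share the same unit (here µM). The complex
    concentration is the physical root of the quadratic
    [PL]^2 - (P + L + Kd)[PL] + P*L = 0.
    """
    s = protein_total + ligand_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * protein_total * ligand_total)) / 2.0
    return complex_conc / protein_total


KD_UM = 0.020  # 20 nM expressed in µM

# Exact treatment, equimolar at 10 µM (assumption).
exact = fraction_bound(10.0, 10.0, KD_UM)

# Simplified treatment that ignores ligand depletion.
approximate = 10.0 / (10.0 + KD_UM)

print(f"exact quadratic: {100 * exact:.1f} % bound")
print(f"no-depletion approximation: {100 * approximate:.1f} % bound")
```

Under the equimolar assumption the exact equation gives roughly 96 % bound, while the approximation that ignores depletion gives more than 99 %. Either way, nearly all of the RNase H is on the DNA at the start of the titration.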
Figure 5.32 – Fluorescence Anisotropy Titration of the 32-B + D132N
RNase H + fork DNA Complex
The anisotropy shown on the Y axis has been normalized and really is the change in anisotropy while titrating 32-B.
Kd = 3.65 ± 0.01 µM
The dissociation constant obtained from the titration is 3.65 ± 0.01 µM, indicating that the binding in the presence of DNA is comparable to the D132N
RNase H + 32-B binding obtained from the ITC experiment (3.8 µM).
To complete the study on the binding of D132N RNase H to 32 protein,
the same experiment was repeated with the other truncations of 32 protein
available to us, namely 32-A, the core domain and the full length 32 protein. This
was made possible by the fact that Fluorescence Anisotropy titrations require relatively little time and material. The same cannot be said of Isothermal Titration Calorimetry, which is why, even though the binding of the 32 protein and 32 core domain to D132N RNase H was witnessed on the native gels shown in Figure 5.1, it was too weak to justify ITC titrations. On the other hand, the gel-shift assays (Figure 5.4.b) showed that 32 protein and the 32-A truncation do bind quite strongly to D132N RNase H in the presence of a fork DNA, justifying the FA titrations.
The titration curves and fits for the four 32 protein truncations being
titrated into D132N RNase H + fork DNA are shown in Figure 5.33. A summary of the dissociation constants for each truncation is also available in Table 5.14.
From this, it can be seen that the strongest binding is observed with the 32-B truncation (3.65 µM), followed by the 32 core domain (13.14 µM), the 32-A
truncation (31.98 µM) and finally the full length 32 protein (105.8 µM). These
results are consistent with the relative strength of binding observed from the gel
shift assays (see Section 5.2.2), with the exception of the 32 core domain, which
was not included in those assays.
Table 5.14 – Summary of the Dissociation Constants for the 32 Truncations + D132N RNase H + Fork DNA Complex
32 Truncation Kd (µM)
32 Protein 105.8 ± 0.1
32-B 3.65 ± 0.01
32-A 31.98 ± 0.07
32 core 13.14 ± 0.04
[Bar chart of Kd (µM) for 32, 32-B, 32-A and 32 core]
Figure 5.33 – Fluorescence Anisotropy Titrations of the 32 Truncations +
D132N RNase H + Fork DNA Complex
32-B 32
32 core 32-A
The anisotropy shown on the Y axis has been normalized and brought back to zero, so that the titration curve only shows the change in anisotropy induced by the binding of the 32 protein.
The 32-A titration did show some additional interesting features. While for the other three titrations the anisotropy only increased by 0.04 at the most, it increased by 0.12 with 32-A, a three-fold increase compared to the other 32 truncations. Such a jump in anisotropy can only be explained by the formation of a much larger complex in this particular case. Referring to Sections 1.1.2 and 3.1, the B domain that is present in the 32-A truncation is responsible for the cooperative binding of the 32 protein to itself, resulting in the formation of filaments (Giedroc et al., 1991; Casas-Finet et al., 1992). There is a good chance that 32-A also displays this cooperative binding. Therefore, when 32-A is titrated into the RNase H + fork DNA complex, it binds not only to RNase H but also to itself when added in excess. This would not have been observed on the gel shift assays, which were run at low nanomolar concentrations.
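The link between anisotropy and complex size can be made explicit with the Perrin equation, given here as a general textbook relation rather than a result of this work. It assumes an idealized, roughly spherical rotor and a fluorescence lifetime of the HEX label that does not change upon binding. The symbols are introduced for this illustration only.

```latex
A = \frac{A_0}{1 + \tau / \theta},
\qquad
\theta = \frac{\eta V}{k_B T}
```

Here $A_0$ is the limiting anisotropy in the absence of rotation, $\tau$ the fluorescence lifetime, $\theta$ the rotational correlation time, $\eta$ the solvent viscosity, $V$ the hydrodynamic volume of the rotating unit, $k_B$ the Boltzmann constant and $T$ the temperature. A larger complex has a larger $V$, hence a longer $\theta$ and an anisotropy closer to $A_0$, which is the qualitative behavior expected if 32-A assembles into larger species on the RNase H + fork DNA complex.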
5.3.7. Protein-Protein-DNA Crystallization
An important step in studying the interaction between D132N RNase H and the 32-B protein truncation was to screen the ternary complex for crystals. A crystal structure of RNase H and 32 protein in the presence of DNA would provide a lot more information about the way these proteins come together and organize the lagging strand replication.
The same DNA substrates, the 3’-overhang and the fork DNA, that have previously been described were used in the screens. As a reminder they are shown again in Figure 5.34.
Figure 5.34 – DNA Substrates Used in the Ternary Complex Screens
3’-overhang DNA: 12/24, 12/27 and 12/30 3’-overhang
Fork DNA: 16/24, 16/27 and 16/30 fork DNA
[Schematics of the substrates, with segment lengths of 4, 12 and 12/15/18 nucleotides indicated]
The ternary complex was always prepared in a 1:1:1 ratio, by incubating
D132N RNase H with the DNA substrate first, and then adding the 32 protein.
When this part of the project was initiated, the D132N RNase H + 32-B crystals had already been obtained, but not optimized, and the two crystallization conditions described in Figure 5.7 were known. A first crystallization attempt was made using these conditions. The ternary complex (D132N RNase H + 32-B + 12/24, 12/27 or 12/30 3’-overhang) was screened at 0.2 mM and room temperature, in 2 µL + 2 µL hanging drops. The crystallization conditions were:
• 10-15 % PEG 4000 + 0.1 M HEPES pH 7.5 + 0.2 M Ammonium Chloride
• 10-20 % PEG 3350 + 0.1 M Tris HCl pH 7.5 + 0.2 M Calcium Acetate
Only microcrystals or precipitate were obtained with both conditions.
The D132N RNase H + 32-B + fork DNA was then screened using the
Crystal Screen I/II and the Wizard I/II commercial screens. A summary of the
different screens is given in Table 5.15. The screens were set up in Greiner trays, by hand, with 0.2 mM of the D132N RNase H + 32-B + DNA substrate complex.
Table 5.15 – D132N RNase H + 32-B + DNA Crystal Screens
Crystal Screen DNA Substrate Temperature Drop (µL)
Crystal Screen I and II 16/30 fork DNA Room Temp. 1 + 1
Wizard I and II 16/30 fork DNA Room Temp. 1 + 1
Crystal Screen I and II 16/27 fork DNA 4 °C 1 + 1
Crystal Screen I and II 16/30 fork DNA 4 °C 1 + 1
Wizard I and II 16/27 fork DNA 4 °C 1 + 1
Wizard I and II 16/30 fork DNA 4 °C 1 + 1
First of all, a trend was observed, where the ternary complex made with the shorter DNA substrate seemed to be more stable and to crystallize more easily than that made with the longer DNA substrate, which had a tendency to precipitate more readily. It was initially thought that the longer DNA substrate would be better, as the binding from the gel shift assays was stronger (see Figure 5.4.a). However, as mentioned before, these gels were run at nanomolar concentrations and the crystallization experiments at 0.2 mM, about a 10,000-fold increase in concentration. This is probably enough to drive the binding of the 32 protein on a shorter DNA strand. Moreover, a shorter 3’-arm DNA would also mean a more homogeneous ternary complex, resulting in better crystallization results.
The best crystal hits were obtained from the 4 °C screens, even though the most promising hits were all microcrystals. They are shown in Figure 5.35 below. The first one, however, is a salt crystal as it did not stain with the Izit blue
dye.
These hits were used as the starting point for crystal optimization
experiments. Unfortunately, these crystals could not be reproduced.
Figure 5.35 – D132N RNase H + 32-B + Fork DNA Crystals after Screening
16/27 fork Crystal Screen I – Condition 6
30 % PEG 4000 0.1 M Tris HCl pH 8.5 0.2 M Magnesium Chloride
Wizard I – Condition 3
15 % ethanol 0.1 M Na CHES pH 9.5 16/30 fork 16/27 fork
Crystal Screen II – Condition 11
1,6-hexanediol 0.1 M Na Acetate pH 4.6 0.01 M Cobalt Chloride
Wizard II – Condition 28
20 % PEG 8000 0.1 M Na MES pH 6.0 0.2 M Calcium Acetate 16/30 fork
5.3.8. 32-B Mutants Studies
As previously discussed in Section 5.3.2, three mutants were designed in order to further characterize the interaction between D132N RNase H and 32-B.
The cloning, expression and purification of these mutants are described in Section 3.6. Only two of the three designed mutants could be successfully cloned; the W144E mutant never got past the PCR stage. As a reminder, the location of the other two mutated residues, I151 and I60, is shown below in Figure 5.36.
Figure 5.36 – Location of the 32-B Mutated Residues at the Interface between D132N RNase H and 32-B
I60
I151
32 protein RNase H
32 protein is represented in blue and RNase H in green. The two mutated residues are colored in pink. The RNase H electrostatic map is also shown in mesh.
A non-denaturing gel was run first, to assess if the 32-B mutants can still bind to D132N RNase H. This gel is shown in Figure 5.37. It was run in a bis-Tris
HCl pH 6.5 buffer, which is why it looks different from the gel in Figure 5.1, which was run in a Tris HCl pH 6.5 buffer. The wild type 32-B + D132N RNase H complex is in lane 2 and did not move from the well, and no extra bands can be observed for the separate 32-B and RNase H proteins either. For the two mutants on the other hand (lanes 5 and 8), a band can be seen close to the well, especially for I151D 32-B, but bands for the 32-B mutant and D132N RNase H are also observed. This indicates that binding between RNase H and the 32-B mutant is still occurring, but more weakly than with the wild type 32-B. Also, comparing lanes 5 and 8, the I60D mutation seems to disrupt the interaction with RNase H more than the I151D mutation does.
Figure 5.37 – D132N RNase H + 32-B Mutants Native Gels
Lanes 1 to 3: WT; lanes 4 to 6: I151D; lanes 7 to 9: I60D
1- 32-B Wild Type Protein
2- 32-B + D132N RNase H
3- D132N RNase H
4- I151D 32-B Protein
5- I151D 32-B + D132N RNase H
6- D132N RNase H
7- I60D 32-B Protein
8- I60D 32-B + D132N RNase H
9- D132N RNase H
In order to quantify the effect of the mutants on the 32-B + D132N
RNase H interaction, Fluorescence Anisotropy titrations were performed. Another parameter that was tested was the influence of the N-terminal His-Tag of the
32-B mutant. Titrations were carried out similarly to the 32-B titration into 20/30 fork DNA + D132N RNase H. Since the binding of the 32-B mutants to D132N RNase H is expected to be weaker, the initial concentration of D132N RNase H + fork DNA was increased to 15 µM. The titration curves and fits for all four titrations are presented in Figure 5.38.
Figure 5.38 – Fluorescence Anisotropy Titration of the 32-B Mutants +
D132N RNase H + fork DNA Complex
I151D 32-B I151D 32-B (N-term His-Tag)
I60D 32-B I60D 32-B (N-term His-Tag)
The anisotropy shown on the Y axis has been normalized and brought back to zero, so that the titration curve only shows the change in anisotropy induced by the binding of the 32 protein.
The dissociation constants calculated from each titration are tabulated in
Table 5.16 below.
Table 5.16 – Summary of the Dissociation Constants for the 32-B Mutants +
D132N RNase H + fork DNA Complex
32-B Kd (µM)
Wild Type 3.65 ± 0.01
I151D 32-B 30.99 ± 0.02
I151D 32-B + His-Tag 31.92 ± 0.06
I60D 32-B 34.21 ± 0.05
I60D 32-B + His-Tag 39.99 ± 0.05
The first important point is that for both mutants, the 32-B + D132N RNase H interaction is about ten times weaker than with the wild type 32-B protein. Moreover, confirming what was seen on the native gel (Figure 5.37), the I60D mutation does seem to have a slightly stronger effect on the interaction with RNase H. Looking at the 32-B / RNase H crystal structure, I60 appears to be more buried in the hydrophobic interface than I151, which could explain why its mutation has a more pronounced effect. Overall, both mutants showed a decrease in affinity for D132N RNase H, and this result confirms that the crystal structure is indeed physiologically relevant, and that the two proteins bind via hydrophobic
interactions.
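To put the roughly ten-fold change in Kd on an energy scale, the ratio of dissociation constants can be converted into a change in binding free energy, ΔΔG = RT ln(Kd,mutant / Kd,wild type). The Python sketch below applies this standard relation to the values of Table 5.16. The temperature of 298.15 K is an assumption, since the temperature of the titrations is not given in this section, and the function name `ddg_kcal_per_mol` is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_kcal_per_mol(kd_mutant, kd_wild_type, temperature_k=298.15):
    """Change in binding free energy from a ratio of dissociation constants.

    Both Kd values must be in the same unit. The default temperature is an
    assumption, not a value taken from the experiments.
    """
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_mutant / kd_wild_type)


if __name__ == "__main__":
    kd_wild_type = 3.65  # uM, from Table 5.16
    mutants = {"I151D 32-B": 30.99, "I60D 32-B": 34.21}  # uM, from Table 5.16
    for name, kd in mutants.items():
        print(f"{name}: {ddg_kcal_per_mol(kd, kd_wild_type):.2f} kcal/mol")
```

Under these assumptions both mutations correspond to a loss of roughly 1.3 kcal/mol of binding free energy, with the I60D value marginally larger than the I151D value.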
The other result from these titrations is that the N-terminal His-Tag has little to no effect on the binding to RNase H. From this, two things follow. First, it confirms that the N-terminus of 32-B is completely out of the way of the interaction with RNase H, as was seen in the crystal structure. Second, and more practically, the two versions of each 32-B mutant, with or without the His-Tag, can be used interchangeably, which will be done later on in this study.
5.3.9. Conclusion
The D132N RNase H + 32-B complex was extensively characterized and described in this chapter.
Biophysical experiments were done, to qualitatively and quantitatively characterize the interaction between the two proteins: non-denaturing gel electrophoresis showed that the interaction is strong enough to be witnessed in a gel, a result that was seen again with size exclusion chromatography. The SEC however revealed that the complex is in equilibrium with the two separate proteins. The interaction was quantitatively characterized with Isothermal
Titration Calorimetry and Fluorescence Anisotropy Titrations. The two experiments were in very good agreement with one another and gave a dissociation constant of around 3.5 µM.
The complex was further characterized by structural studies. A crystal structure was obtained, where the C-terminus of RNase H interacts with the 32 core domain, and from which a number of 32-B mutants were designed in order to test the X-Ray data model. It was then shown that the two 32-B mutants did have an effect on the RNase H + 32-B complex, and through FA titrations that the interaction was about an order of magnitude weaker.
Solution-based studies were also performed, such as Dynamic Light
Scattering and Small-Angle X-Ray Scattering. The SAXS data agreed rather well
with the crystal structure, which was a good indication that the interaction that is
seen in the crystal is also relevant in solution.
Crystallization experiments were done with the D132N RNase H + 32-B +
DNA ternary complex, but no diffraction quality crystals could be grown.
Finally, the interaction between RNase H, the different 32 protein
truncations and a fork DNA substrate was characterized with Fluorescence
Anisotropy titrations. It appears that the strongest complex is obtained with the
32-B truncation and the 32 core domain, the weakest being observed with the 32
full length protein and 32-A to some extent. Even though 32-B interacts with
RNase H through the core domain, the A domain must be playing an important
role as its truncation weakens the complex quite a lot. There must also be some
domain movement associated with the B domain upon DNA binding or influence
of the cooperative binding of 32, since the full length 32 protein appears to bind
more weakly to RNase H than the 32-B truncation. Unfortunately, the crystal
structure did not show where the A domain is located when the 32 protein is
bound to RNase H.
5.4. D132N ∆N RNase H + 32-B Protein Interaction
5.4.1. Complex Preparation
Preparing the D132N ∆N RNase H + 32 protein complexes is not as
straightforward as preparing the D132N RNase H + 32 protein complexes. As described in
Section 4.3, D132N ∆N RNase H had solubility issues and could not be obtained
at high concentration. However, in order to make the protein complexes, the separate proteins are needed at a high enough concentration, so that upon
mixing and dilution the complex can be made at a reasonable concentration too.
Since D132N ∆N RNase H can only be concentrated in the presence of glycerol
(see Section 4.3.5), the separate proteins were concentrated and the complex
prepared before it was dialyzed in its optimum buffer (25 mM Tris HCl pH 7.5,
150 mM NH4Cl, 10 mM MgCl2 and 2 mM β-mercaptoethanol). After dialysis, the
complex was re-concentrated if need be, before it was used.
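The dilution that occurs upon mixing can be estimated with a short calculation. The Python sketch below gives the volumes of two stocks needed for a 1:1 molar mixture and the resulting concentration of each protein, which is also the upper limit for the complex concentration. The function name `equimolar_mix` and the stock concentrations in the example are hypothetical and are not values from this work.

```python
def equimolar_mix(stock_a_um, stock_b_um, final_volume_ul):
    """Volumes (uL) of two stocks for a 1:1 molar mixture, and the
    concentration (uM) of each component after mixing.

    Equal moles require c_a * v_a = c_b * v_b with v_a + v_b = final volume.
    """
    if stock_a_um <= 0 or stock_b_um <= 0:
        raise ValueError("stock concentrations must be positive")
    total = stock_a_um + stock_b_um
    volume_a = final_volume_ul * stock_b_um / total
    volume_b = final_volume_ul - volume_a
    each_component_um = stock_a_um * stock_b_um / total
    return volume_a, volume_b, each_component_um


if __name__ == "__main__":
    # Hypothetical stocks: 200 uM and 600 uM, mixed to a final volume of 100 uL.
    va, vb, conc = equimolar_mix(200.0, 600.0, 100.0)
    print(f"{va:.1f} uL + {vb:.1f} uL -> {conc:.0f} uM of each protein")
```

The mixed concentration is always lower than that of the more dilute stock, which is why both proteins need to be concentrated before the complex is prepared.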
5.4.2. Protein-Protein Crystallization
The first crystallization trial for the D132N ∆N RNase H + 32-B complex
was done using the crystallization conditions obtained from the D132N RNase H
+ 32-B complex (Section 5.3.2). As a reminder, here are the two crystallization
conditions again:
• 5-15 % PEG 4000 + 0.1 M HEPES pH 7.5 + 0.2 M Ammonium Chloride
• 10-20 % PEG 3350 + 0.1 M Tris HCl pH 7.5 + 0.2 M Calcium Acetate
Crystals were obtained only with the first condition, the other one yielding
only precipitate. Some of these crystals are shown in Figure 5.39.
Figure 5.39 – D132N ∆N RNase H + 32-B Initial Crystals
Both panels: 7.6 % PEG 4000, 0.1 M HEPES pH 7.5, 0.2 M Ammonium Chloride; complex at 0.15 mM, room temperature, 2 µL + 2 µL hanging drop
The crystals were cryoprotected similarly to the D132N RNase H + 32-B crystals, and flash-frozen in liquid nitrogen. They were then screened for X-Ray
diffraction but none of them diffracted.
The D132N ∆N RNase H + 32-B complex was therefore screened for
crystal hits, as shown in Table 5.17 below.
Table 5.17 – D132N ∆N RNase H + 32-B Crystal Screens
Crystal Screen Concentration Temperature Drop (µL)
Crystal Screen I and II 0.11 mM Room Temp. 1 + 1
Wizard I and II 0.11 mM Room Temp. 1 + 1
Additive Screen 0.1 mM Room Temp. 1 + 1
Index 0.085 mM Room Temp. 1 + 1
PEG Ion Screen 0.09 mM Room Temp. 1 + 1
Natrix 0.09 mM Room Temp. 1 + 1
A number of crystal hits were obtained from the screens, but most of them were microcrystals. Some of the best ones are presented in Figure 5.40.
Figure 5.40 – D132N ∆N RNase H + 32-B Crystal Hits after Screening
Crystal Screen I – Condition 46
18 % PEG 8000 0.1 M Na Cacodylate pH 6.5 0.2 M Calcium Acetate
Crystal Screen I – Condition 36
8 % PEG 8000 0.1 M Tris HCl pH 8.5
Wizard II – Condition 18
20 % PEG 3000 0.1 M Tris HCl pH 7.0 0.2 M Calcium Acetate
It was noticed that a lot of the hits were obtained in PEG 4000 or 8000,
and the identity of the buffer or the salt (if present) did not have an influence on
the crystal morphology. A first expansion screen, shown in Figure 5.41.c, was
designed, with only PEG 4000 or 8000 and different buffers. It appeared that
microcrystals formed better at higher pH, but the absence of salt in that screen did seem to have an influence on crystal growth, as a lot of the drops were
precipitated. Another expansion screen was set up, as shown in Figure 5.41.d.
Four different salts were tested at a 0.1 M concentration. These two screens
were done at room temperature and 0.3 mM protein concentration, in a 2 µL + 2 µL
hanging drop setup.
Figure 5.41 – D132N ∆N RNase H + 32-B Crystal Optimization
c: PEG 4000 or 8000, from 5 % to 25 %, with one of the following buffers:
0.1 M Na Cacodylate pH 6.5
0.1 M Na HEPES pH 7.5
0.1 M Tris HCl pH 8.5
0.1 M Na CHES pH 9.5
d: PEG 3400, 4000 or 8000, from 0 % to 10 %, + 0.1 M Na CHES pH 9.5, with one of the following salts:
0.1 M Magnesium Acetate
0.1 M Calcium Acetate
0.1 M Sodium Formate
0.1 M Lithium Sulfate
The best crystals that were grown with this strategy are presented in
Figure 5.42.
Figure 5.42 – D132N ∆N RNase H + 32-B Crystal Hits after Optimization
4.0 % PEG 4000, 0.1 M Na CHES pH 9.5, 0.2 M Magnesium Acetate; complex at 0.3 mM, room temperature, 2 µL + 2 µL hanging drop
Since these crystals could not be grown bigger than what is shown on the
picture, the macroseeding technique was used: some of the small crystals were
used as seeds and placed in a new crystallization drop. The attempt was
however unsuccessful and no large single crystals were obtained for the D132N
∆N RNase H + 32-B complex.
5.4.3. Protein-Protein-DNA Complex
The D132N ∆N RNase H + 32-B + DNA complex had been identified from
the gel shift assays as one of the strongest ternary complexes.
The complex was prepared at around 0.1 mM with the 16/30 fork DNA shown in Figure 5.34. D132N ∆N RNase H was loaded onto the DNA substrate first and left to incubate for fifteen minutes on ice before the 32-B protein was added. The addition of D132N ∆N RNase H to the DNA caused precipitation, but this phenomenon had been observed before with D132N RNase H, where the complex redissolved when the 32 protein was added. Here however, the addition of 32-B to the D132N ∆N RNase H + fork DNA did not re-solubilize the
complex.
5.4.4. 32-B Mutants Studies
Two biophysical techniques, non-denaturing electrophoresis and
fluorescence anisotropy titration, were applied to study the interaction between
the ∆N truncation of RNase H and the 32-B protein.
The native gel in Figure 5.43 was run at pH 6.5 and 100 µM protein concentration, similarly to the D132N RNase H + 32-B mutants native gel presented in Figure 5.37. The D132N ∆N RNase H + 32-B complex, in lane 2, looks rather strong, and the two complexes with the 32-B mutants, in lanes 5 and 8, appear to be weaker.
Figure 5.43 – D132N ∆N RNase H + 32-B Mutants Native Gels
Lanes 1 to 3: WT; lanes 4 to 6: I151D; lanes 7 to 9: I60D
1- 32-B Wild Type Protein
2- 32-B + D132N ∆N RNase H
3- D132N ∆N RNase H
4- I151D 32-B Protein
5- I151D 32-B + D132N ∆N RNase H
6- D132N ∆N RNase H
7- I60D 32-B Protein
8- I60D 32-B + D132N ∆N RNase H
9- D132N ∆N RNase H
Next, fluorescence anisotropy titrations were run under the same conditions as the D132N RNase H + 32-B mutants titrations. The WT 32-B titration was done at 10 µM D132N ∆N RNase H + 20/30 fork DNA, while the I151D and I60D 32-B titrations were done at 15 µM.
The dissociation constant for the D132N ∆N RNase H + WT 32-B + DNA complex (6.70 µM) is comparable to that of the D132N RNase H complex (3.65 µM), although slightly larger. The D132N ∆N RNase H complex however did look stronger on the native gels (see Figure 5.1 and Figure 5.2). As was the case with the full length RNase H, the two mutants bind D132N ∆N RNase H more weakly than the wild-type 32-B does, but the difference in affinity between the wild-type 32-B and the mutants is not as dramatic here with the truncated RNase H. The N-terminus of RNase H, even though it was disordered in the crystal structure, is located near the binding site of the 32 protein. Therefore, when the N-terminus is missing, as in the case of D132N ∆N RNase H, it is possible that the binding of 32 protein is less restricted to a specific area, and the presence of the mutated residues can be accommodated by sliding along RNase H towards where the N-terminus would be. Another point is that the I151D 32-B mutant appears to bind quite strongly to D132N ∆N RNase H and the 20/30 fork DNA. However, the His-Tag version of that protein was used in this titration and the His-Tag might be playing a role in the increase of binding affinity that is observed.
Figure 5.44 – Fluorescence Anisotropy Titrations of the D132N ∆N RNase H + 32-B Mutants + Fork DNA Complex
Panels: WT 32-B, I151D 32-B, I60D 32-B
32-B Kd (µM)
Wild Type 6.70 ± 0.05
I151D 32-B 10.51 ± 0.03
I60D 32-B 21.0 ± 0.1
5.5. D132N ∆N RNase H + 32 Protein Interaction
5.5.1. Complex Preparation
Similarly to the D132N ∆N RNase H + 32-B complex, the D132N ∆N
RNase H + 32 protein complex had to be prepared with concentrated stocks of
D132N ∆N RNase H and 32 protein, then dialyzed and finally concentrated (see
Section 5.4.1).
5.5.2. Protein-Protein Crystallization
The D132N ∆N RNase H + 32 Protein complex was screened for crystal
hits, according to Table 5.18 below. The screens were set up at 4 °C in order to
help keep the proteins in solution, as D132N ∆N RNase H has a tendency to
precipitate.
Table 5.18 – D132N ∆N RNase H + 32 Protein Crystal Screens
Crystal Screen Concentration Temperature Drop (µL)
Crystal Screen I and II 0.1 mM 4 °C 1 + 1
Wizard I and II 0.1 mM 4 °C 1 + 1
A few crystal hits were obtained, all of them microcrystalline; two of them are shown in Figure 5.45. Expansion trays were then set up using these two conditions, but crystals large enough for diffraction studies could not be obtained.
Figure 5.45 – D132N ∆N RNase H + 32 Protein Crystals after Screening
Crystal Screen I – Condition 27
20 % 2-propanol 0.1 M Na HEPES pH 7.5 0.2 M Sodium Citrate
Crystal Screen II – Condition 43
50 % MPD 0.1 M Tris HCl pH 8.5 0.2 M Ammonium Phosphate monobasic
5.5.3. Protein-Protein-DNA Crystallization
The ternary complex (D132N ∆N RNase H + 32 Protein + fork DNA) was also screened for crystals, as it was one of the DNA complexes identified by native gels (Section 5.2.2). The screens are summarized in Table 5.19.
Table 5.19 – D132N ∆N RNase H + 32 Protein + DNA Crystal Screens
Crystal Screen Concentration Temperature Drop (µL)
Crystal Screen I and II 0.1 mM 4 °C 1 + 1
Wizard I and II 0.1 mM 4 °C 1 + 1
Unfortunately, no crystal hits were obtained with these screens, and all the
wells were precipitated.
5.5.4. Fluorescence Anisotropy
Finally, fluorescence anisotropy titrations were performed. In a similar way
to what was done with D132N RNase H + the 32 truncations, all four 32 protein
truncations were used with the D132N ∆N RNase H. These experiments were
run identically to the previous ones, with the same labeled DNA substrate.
The results are shown in Figure 5.46 and Table 5.20. The strongest binding is observed with the 32-B truncation, with a Kd of 6.70 µM. Next are the 32 full length and the 32 core proteins, with Kd values of 12.3 and 12.4 µM, respectively. Finally, the 32-A truncation shows the weakest binding, with a Kd of 23 µM. It is interesting to see that the 32 full length protein is able to bind more strongly to RNase H when RNase H is missing its N-terminal domain. This is consistent with what was said earlier in Section 5.2.3, about domains moving upon DNA binding in order to allow the binding of one protein to the other. In this particular case, the absence of the RNase H N-terminus allows the full length 32 protein to bind, just as the 32-B truncation could bind quite strongly to the full length RNase H, while the full length 32 protein could not.
Figure 5.46 – Fluorescence Anisotropy Titrations of the 32 Truncations +
D132N ∆N RNase H + Fork DNA Complex
32-B 32
32 core 32-A
Table 5.20 – Summary of the Dissociation Constants from the FA Titrations
32 Truncation Kd (µM)
32 Protein 12.31 ± 0.03
32-B 6.70 ± 0.05
32-A 22.98 ± 0.04
32 core 12.42 ± 0.05
[Bar chart of Kd (µM) for 32, 32-B, 32-A and 32 core]
5.6. D132N ∆N RNase H + 32 Core Protein Interaction
5.6.1. Complex Preparation
Similarly to the D132N ∆N RNase H + 32-B complex, the D132N ∆N
RNase H + 32 core protein complex had to be prepared with concentrated stocks of D132N ∆N RNase H and 32 core protein, then dialyzed and finally
concentrated (see Section 5.4.1).
5.6.2. Protein-Protein Crystallization
The D132N ∆N RNase H + 32 core domain complex was screened for
crystal hits, as described in Table 5.21 below.
Table 5.21 – D132N ∆N RNase H + 32 Core Crystal Screens
Crystal Screen Concentration Temperature Drop (µL)
Crystal Screen I and II 0.1 mM 4 °C 1 + 1
Wizard I and II 0.1 mM 4 °C 1 + 1
A few promising crystal hits were obtained from the screens. Some of them are shown in Figure 5.47. These crystallization conditions were then expanded on, but only the 30 - 40 % MPD + 0.1 M Tris HCl pH 8.5 + 0.2 M Ammonium Phosphate Monobasic condition grew single crystals that were big enough to be flash-frozen and screened for X-Ray diffraction. These crystals are shown in Figure 5.48. However, only faint diffraction was observed, so further optimization of these crystals was needed.
Figure 5.47 – D132N ∆N RNase H + 32 Core Crystals after Screening
Crystal Screen I – Condition 27
20 % 2-propanol 0.1 M Na HEPES pH 7.5 0.2 M Sodium Citrate
Crystal Screen II – Condition 43
50 % MPD 0.1 M Tris HCl pH 8.5 0.2 M Ammonium Phosphate monobasic
Wizard I – Condition 12
20 % PEG 1000 0.1 M Imidazole pH 8.0 0.2 M Calcium Acetate
Wizard I – Condition 46
10 % PEG 8000 0.1 M Imidazole pH 8.0 0.2 M Calcium Acetate
Figure 5.48 – D132N ∆N RNase H + 32 Core Crystals after Optimization
35.4 % MPD, 0.1 M Tris HCl pH 8.5, 0.2 M Ammonium Phosphate; complex at 0.47 mM, 4 °C, 2 µL + 2 µL hanging drop
In order to increase the size of the crystals, larger drops were set up using the sitting drop method. First, round micro-bridges from Nextal (now Qiagen) were used, but this resulted in the precipitation of the protein. Polypropylene micro-bridges from Hampton Research were used next, but only microcrystals mixed with precipitate could be grown with these. The sitting drop method or the plastic used in the micro-bridges might not have been appropriate for use with this particular complex, even though the same phenomenon was not observed with the D132N RNase H + 32-B protein complex. The hanging drop method was then used by default, but no large crystals could be obtained.
5.6.3. Fluorescence Anisotropy
The results from the D132N ∆N RNase H + 32 core + fork DNA
fluorescence anisotropy titration were presented in Section 5.5.4. The
dissociation constant was calculated to be 12.4 µM. As a comparison, the D132N
RNase H + 32 core + fork DNA dissociation constant was around 13.1 µM, so the
two bindings are fairly similar.
5.7. Conclusion
The work done to characterize bacteriophage T4 RNase H, the 32 protein
and their interaction was described in the previous three chapters.
The 32 protein, along with three truncations, was expressed and purified.
The 32-B truncation, missing the N-terminal cooperativity domain, was
characterized in further detail. Even though a crystal structure could not be
obtained, scattering studies showed that the 32-B protein in solution is in
equilibrium between a monomeric and a dimeric form. The shape of the A
domain was also obtained from the SAXS experiment. Two 32-B mutants were
cloned, expressed and purified, and were used in the RNase H + 32 protein
study.
The native RNase H, the D132N RNase H mutant and the D132N ∆N
RNase H N-terminal truncation were all expressed and purified.
The D132N RNase H + 32 protein complex was characterized using different techniques. The different 32 protein truncations were tested in complex with RNase H, and it appeared that the 32-B truncation formed the strongest
RNase H + 32 complex. Dissociation constants were obtained by fluorescence anisotropy titrations for RNase H complexed with each 32 truncation, in the presence of the fork DNA substrate. The 32-B + RNase H dissociation constant in the absence of DNA was also obtained by ITC, and was consistent with the one from the fluorescence anisotropy titration. The 32-B + RNase H complex being the most robust, it was extensively characterized. A crystal structure at
3.4 Å resolution was obtained, which showed that the two proteins interact with
each other through the C-terminus of RNase H and the core domain of 32.
However, the A and B domains of the 32 protein seem to have an influence on the
interaction, but this could not be seen from the crystal structure as the B domain
was missing and the A domain disordered. The crystal structure was validated by
the SAXS experiment. Modeling of a fork DNA substrate in the RNase H + 32
core model further validated the crystal structure. Finally, site-directed
mutagenesis studies provided another confirmation of the RNase H + 32 model.
An N-terminal truncation of RNase H was then used in complex with the
32 truncations, to further probe the role of the 32 domains. Complexes of the
D132N ∆N RNase H with the 32-B, 32 core domain and 32 full length were
obtained, but working with these was made difficult by the low solubility of the
RNase H double mutant. It appeared that the N-terminus of RNase H, even
though it is not directly involved in binding to 32 but rather to the 45 clamp protein, still played a role in the RNase H / 32 protein interaction. The A domain
and the cooperative binding of 32 again seemed to play a role, however it could
not be determined which one from the biophysical or structural studies.
The N-terminal domain of RNase H and the A domain of the 32 protein were both disordered in the T4 D132N RNase H + 32-B crystal structure. It would therefore be interesting to see if more information can be obtained from homologous proteins from a related bacteriophage. This new project was initiated and the work accomplished so far is described in the following chapter.
CHAPTER 6 - Bacteriophage Rb69
6.1. Introduction
Bacteriophage Rb69 is a T4-like phage, meaning it is a close relative of bacteriophage T4. The genomic map of Rb69 is shown in Figure 6.1 and, as can be seen, most of the Rb69 genomic DNA is orthologous to that of T4.
Highlighted in yellow are the genes that differ between Rb69 and T4. They do not include the genes for RNase H and 32 protein (helix destabilizing protein), which are highlighted in red on the figure. If compared to the genomic map of bacteriophage T4, both genes for the two organisms are located in the same 150 region.
A common strategy in X-Ray crystallography is to study the same protein from closely related organisms. The protein is expected to have the same overall fold and structure, and the residues important for substrate binding and catalysis are most commonly conserved, but surface residues can vary. These surface residues are the ones involved in the lattice contacts that form the crystal. Therefore, one protein might crystallize better than the other and give better diffracting crystals.
Since the bacteriophage T4 RNase H + 32-B protein complex did not produce crystals diffracting to high resolution, switching to the Rb69 RNase H + 32-B complex might yield better results in terms of X-Ray diffraction resolution and provide a better picture of how the complex comes together. Moreover, it could be interesting to study the Rb69 RNase H + 32-B complex and compare it to the results that were obtained from the T4 complex.
Figure 6.1 – Bacteriophage Rb69 Genomic Map
The map was obtained from http://www.phage.org/images/rb69small.png
The following sections describe the work that was done on bacteriophage
Rb69 RNase H, both the native protein and the D132N mutant. Additional work on the Rb69 32-B protein that was done by other lab members is also detailed.
6.2. Bacteriophage Rb69 Native RNase H
6.2.1. Introduction
The bacteriophage Rb69 RNase H is a 5’ to 3’ exonuclease. For more information on the RNase H protein in general, see Section 1.1.3. The amino-acid sequence was aligned with that of its T4 homologue, and the sequence alignment is presented in Figure 6.2. The two proteins are 75% identical, with an 85% positive score. This shows how closely related the Rb69 and T4 proteins are, which is why studying the Rb69 RNase H + 32-B complex could shed more light on some of the results obtained with the T4 proteins.
Figure 6.2 – Sequence Alignment of T4 RNase H and Rb69 RNase H
1 MDLEMMLDEDYKEGIALADFSNIALAAALNNFEDGDKITVPMVRHVVLNSIRKNVVMFRK 60 MDLEMMLDEDYKEGI L DFS IAL+ AL NF D +KI + MVRH++LNSI+ NV + 1 MDLEMMLDEDYKEGICLIDFSQIALSTALVNFPDKEKINLSMVRHLILNSIKFNVKKAKT 60
61 QGYTKFVLCMDNATSGYWRRDFAYYYKKNRKTDREASKWDWEGYFTALHQVVDEIKKYMP 120 GYTK VLC+DNA SGYWRRDFAYYYKKNR RE S WDWEGYF + H+V+DE+K YMP 61 LGYTKIVLCIDNAKSGYWRRDFAYYYKKNRGKAREESTWDWEGYFESSHKVIDELKAYMP 120
121 YVVMDIDKYEADDHIGVLTKYLSLAGHKVCIVASDGDFTQLHKYPNVKQWSPPQKKWVKI 180 Y+VMDIDKYEADDHI VL K SL GHK+ I++SDGDFTQLHKYPNVKQWSP KKWVKI 121 YIVMDIDKYEADDHIAVLVKKFSLEGHKILIISSDGDFTQLHKYPNVKQWSPMHKKWVKI 180
181 KNGSAEIDCMTKILKGDRKDGVASVRVRGDFWFTRVEGERTPSMKTTIIEALANDRSQAE 240 K+GSAEIDCMTKILKGD+KD VASV+VR DFWFTRVEGERTPSMKT+I+EA+ANDR QA+ 181 KSGSAEIDCMTKILKGDKKDNVASVKVRSDFWFTRVEGERTPSMKTSIVEAIANDREQAK 240
241 VLLSAEEYKRYQENLVLIDFDYIPDNIASTIIEYYNSYQPQPKGKIYSYFVKSGLSKLTS 300 VLL+ EY RY+ENLVLIDFDYIPDNIAS I+ YYNSY+ P+GKIYSYFVK+GLSKLT+ 241 VLLTESEYNRYKENLVLIDFDYIPDNIASNIVNYYNSYKLPPRGKIYSYFVKAGLSKLTN 300
301 VINEF 305 INEF 301 SINEF 305
The Rb69 RNase H sequence is shown in blue, the T4 RNase H is in green. Identities = 218/289 (75%), Positives = 246/289 (85%)
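The identity score quoted above is simply the number of aligned positions carrying the same residue, divided by the alignment length. As a minimal sketch of that calculation, the snippet below applies it to the first 60-residue block of Figure 6.2 only (this block contains no gaps). The function names are illustrative and are not part of BLAST; the "positives" score is not reproduced because it depends on a substitution matrix.

```python
# First 60 aligned residues of each protein, copied from Figure 6.2.
RB69_BLOCK = "MDLEMMLDEDYKEGIALADFSNIALAAALNNFEDGDKITVPMVRHVVLNSIRKNVVMFRK"
T4_BLOCK = "MDLEMMLDEDYKEGICLIDFSQIALSTALVNFPDKEKINLSMVRHLILNSIKFNVKKAKT"


def count_identities(seq_a: str, seq_b: str) -> int:
    """Count aligned positions where both sequences carry the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    # A gap character paired with a gap is not counted as an identity.
    return sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Identities divided by alignment length, expressed as a percentage."""
    return 100.0 * count_identities(seq_a, seq_b) / len(seq_a)


identical = count_identities(RB69_BLOCK, T4_BLOCK)
print(f"{identical}/{len(RB69_BLOCK)} identical "
      f"({percent_identity(RB69_BLOCK, T4_BLOCK):.0f}%)")
```

For this N-terminal block alone the count is 39 of 60 positions, which is lower than the 75% reported for the alignment as a whole; the remaining blocks are more conserved.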
Rb69 RNase H is a 305 amino-acid protein, with a molecular weight of
35.4 kDa. Its calculated pI from the Expasy website is 6.97 and the extinction coefficient at 280 nm is 1.78 (Gill and von Hippel, 1989; Gasteiger et al., 2003). A summary of the Rb69 RNase H characteristics is given in Table 6.1.
Table 6.1 – Rb69 RNase H Characteristics
                 Amino-Acids    Molecular Weight                     Calculated pI    ε
Rb69 RNase H     305            35.4 kDa (40.0 kDa with HisTag)      6.97             1.78
6.2.2. Initial Cloning and Expression
An initial construct of Rb69 native RNase H in pET101 was previously
made by Hillary H. Voss, an undergraduate student in the Mueser lab. The
primers were designed according to the Rb69 rnh gene sequence found in the
NCBI GenBank database, with a CACC overhang on the forward primer to allow
insertion in the pET101 vector using the TOPO-assisted directional cloning
Gateway technology. The Rb69 rnh gene in the database encodes a 290 amino-acid protein, which was surprising as the T4 RNase H contains 305 amino-acids. Since T4 and Rb69 are closely related bacteriophages, it was expected that the two proteins would contain the same number of amino-acids and have high sequence similarity. The BLAST sequence alignment between the Rb69 RNase H and the T4 RNase H, in Figure 6.3, shows that this Rb69 RNase H is missing 15 amino-acids at the C-terminus. The characteristics of that truncated Rb69 RNase H are shown in Table 6.2, along with the T4 Native RNase H characteristics for comparison.
Figure 6.3 – Sequence Alignment of T4 Native and Rb69 Truncated RNase H
1 MDLEMMLDEDYKEGIALADFSNIALAAALNNFEDGDKITVPMVRHVVLNSIRKNVVMFRK 60 MDLEMMLDEDYKEGI L DFS IAL+ AL NF D +KI + MVRH++LNSI+ NV + 1 MDLEMMLDEDYKEGICLIDFSQIALSTALVNFPDKEKINLSMVRHLILNSIKFNVKKAKT 60
61 QGYTKFVLCMDNATSGYWRRDFAYYYKKNRKTDREASKWDWEGYFTALHQVVDEIKKYMP 120 GYTK VLC+DNA SGYWRRDFAYYYKKNR RE S WDWEGYF + H+V+DE+K YMP 61 LGYTKIVLCIDNAKSGYWRRDFAYYYKKNRGKAREESTWDWEGYFESSHKVIDELKAYMP 120
121 YVVMDIDKYEADDHIGVLTKYLSLAGHKVCIVASDGDFTQLHKYPNVKQWSPPQKKWVKI 180 Y+VMDIDKYEADDHI VL K SL GHK+ I++SDGDFTQLHKYPNVKQWSP KKWVKI 121 YIVMDIDKYEADDHIAVLVKKFSLEGHKILIISSDGDFTQLHKYPNVKQWSPMHKKWVKI 180
181 KNGSAEIDCMTKILKGDRKDGVASVRVRGDFWFTRVEGERTPSMKTTIIEALANDRSQAE 240 K+GSAEIDCMTKILKGD+KD VASV+VR DFWFTRVEGERTPSMKT+I+EA+ANDR QA+ 181 KSGSAEIDCMTKILKGDKKDNVASVKVRSDFWFTRVEGERTPSMKTSIVEAIANDREQAK 240
241 VLLSAEEYKRYQENLVLIDFDYIPDNIASTIIEYYNSYQPQPKGKIYSYL 290 VLL+ EY RY+ENLVLIDFDYIPDNIAS I+ YYNSY+ P+GKIYSY ------241 VLLTESEYNRYKENLVLIDFDYIPDNIASNIVNYYNSYKLPPRGKIYSYFVKAGLSKLTN 300
----- 301 SINEF 305
The Rb69 RNase H sequence is shown in blue, the T4 RNase H is in green. Identities = 218/289 (75%), Positives = 246/289 (85%)
Table 6.2 – Truncated Rb 69 RNase H vs. T4 RNase H Characteristics
Amino-Acids Molecular Weight pI ε
Rb69 RNase H 290 33.7 kDa 6.54 1.87
T4 RNase H 305 35.6 kDa 8.61 1.65
Nonetheless, the gene was amplified using PCR and was then inserted in the pET101 vector, as shown in Figure 6.4. The correct insertion was verified by
DNA sequencing at the Plant-Microbe Genomics Facility at Ohio State University.
Figure 6.4 – pET 101 Insert of the Rb69 rnh Gene
1- Rb69 rnh in pET101
2- Supercoiled DNA ladder (5 kb and 2 kb bands marked)
The correct sizes are as follows:
• rnh 873 bp
• pET101 5753 bp
• Total 6623 bp
The pET101 + rnh construct runs between 6 and 7 kb, which indicates that rnh is inserted in the vector.
The pET101 + rnh vector was transformed in several expression cell lines, namely BL21 (DE3) pLysS, Rosetta Blue (DE3) and T7 Express lac Iq. Protein expression was induced with 1 mM IPTG and checked on an SDS-PAGE gel that is shown in Figure 6.5. Rb69 RNase H is expressed in both Rosetta Blue (DE3) and T7 express cell lines.
Figure 6.5 – Rb69 RNase H Expression
1- BL21 (DE3) pLysS expression – 0h sample
2- BL21 (DE3) pLysS expression – 3h sample
3- Rosetta Blue (DE3) expression – 0h sample
4- Rosetta Blue (DE3) expression – 3h sample
5- T7 Express expression – 0h sample
6- T7 Express expression – 3h sample
7- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0 and 21.5 kDa)
A 1 L culture was grown for both cell lines, and the cells were lysed in the
presence of a low salt or high salt buffer. The low salt buffer was composed of
50 mM Tris HCl pH 8.0, 200 mM NH4Cl, 10 mM MgCl2, 5 % glycerol, 2 mM DTT
and 0.03% PEI. The high salt buffer had the same composition but contained 1 M
NH4Cl. The solubility of Rb69 RNase H was assessed by SDS-PAGE gel, which is shown in Figure 6.6. The protein was present in the pellet and therefore insoluble in all cases.
Figure 6.6 – Rb69 RNase H Cell Lysis
1- Rosetta Blue (DE3) low salt cell lysis – pellet
2- Rosetta Blue (DE3) low salt cell lysis – supernatant
3- Rosetta Blue (DE3) high salt cell lysis – pellet
4- Rosetta Blue (DE3) high salt cell lysis – supernatant
5- T7 Express low salt cell lysis – pellet
6- T7 Express low salt cell lysis – supernatant
7- T7 Express high salt cell lysis – pellet
8- T7 Express high salt cell lysis – supernatant
9- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa)
The fact that the truncated Rb69 RNase H is insoluble is not very surprising, since the C-terminus of the T4 RNase H is known to play an important role in solubility. C-terminal deletions of the T4 RNase H were also found to have solubility issues (Gangisetty et al., 2005). Thus, a closer look was taken at the nucleotide sequence of the C-terminus of Rb69 RNase H, which was aligned with the T4 nucleotide sequence. This is shown in Figure 6.7.a.
Figure 6.7 – Nucleotide and Amino-Acid C-terminus Sequence Alignment of
T4 and Rb69 RNase H
6.7.a – Truncated Rb69 RNase H Sequence Alignment
Rb69  GAG  TAT  TAT  AAC  TCA  TAT  CAA  CCA  CAA  CCT  AAA  GGC  AAG  ATT  TAT  TCA  TAC  283
      E    Y    Y    N    S    Y    Q    P    Q    P    K    G    K    I    Y    S    Y
T4    AAT  TAC  TAT  AAT  TCA  TAT  AAA  TTA  CCA  CCG  CGT  GGC  AAA  ATT  TAT  TCA  TAT  283
      N    Y    Y    N    S    Y    K    L    P    P    R    G    K    I    Y    S    Y

Rb69  TTG  TAA  AAT  CCG  GTC  TTT  CTA  AAT  TAA  CAA  GTG  TAA  TTA  ATG  AAT  TCT  GAG  290
      L    STOP N    P    V    F    L    N    stop Q    V    stop L    M    N    S    E
T4    TTT  GTA  AAA  GCG  GGT  CTT  TCT  AAA  TTA  ACT  AAT  AGC  ATT  AAT  GAA  TTT  TGA  290
      F    V    K    A    G    L    S    K    L    T    N    S    I    N    E    F    stop

6.7.b – Full Length Rb69 RNase H Sequence Alignment
Rb69  GAG  TAT  TAT  AAC  TCA  TAT  CAA  CCA  CAA  CCT  AAA  GGC  AAG  ATT  TAT  TCA  TAC  283
      E    Y    Y    N    S    Y    Q    P    Q    P    K    G    K    I    Y    S    Y
T4    AAT  TAC  TAT  AAT  TCA  TAT  AAA  TTA  CCA  CCG  CGT  GGC  AAA  ATT  TAT  TCA  TAT  283
      N    Y    Y    N    S    Y    K    L    P    P    R    G    K    I    Y    S    Y

Rb69  TTT  GTA  AAA  TCC  GGT  CTT  TCT  AAA  TTA  ACA  AGT  GTA  ATT  AAT  GAA  TTC  TGA  290
      F    V    K    S    G    L    S    K    L    T    S    V    I    N    E    F    stop
T4    TTT  GTA  AAA  GCG  GGT  CTT  TCT  AAA  TTA  ACT  AAT  AGC  ATT  AAT  GAA  TTT  TGA  290
      F    V    K    A    G    L    S    K    L    T    N    S    I    N    E    F    stop
The Rb69 RNase H amino-acid sequence is shown in blue, the T4 RNase H is in green. The extra Thymidine base that was added in the full length Rb69 RNase H nucleotide sequence is shown in red.
It can be seen that the codons in the Rb69 RNase H sequence appear to be out of frame with the ones in the T4 RNase H sequence right after amino-acid 290 (Leucine 290). If a Thymidine base is added between the Thymidine and Guanosine bases in the codon coding for Leucine 290, as shown in Figure 6.7.b, the sequences fall back in frame with one another. In addition, the new C-terminal amino-acid sequence of Rb69 RNase H aligns much better with that of T4, and a stop codon is created after Phenylalanine 305. This indicates that the sequence deposited in NCBI GenBank contains a sequencing error. The new sequence alignment for the full length Rb69 RNase H and the T4 RNase H is shown in Figure 6.7.b.
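The effect of the single inserted base can be checked by translating the 3' end of the gene in both forms. The following is a minimal sketch: the codon table is the standard genetic code, the nucleotide string is copied from Figure 6.7.a starting at the codon for residue 290, and the function and variable names are illustrative only.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate in the first reading frame, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# GenBank reading of the Rb69 rnh 3' end (Figure 6.7.a), from codon 290 onward.
GENBANK_3PRIME = (
    "TTG TAA AAT CCG GTC TTT CTA AAT TAA CAA GTG TAA TTA ATG AAT TCT GAG"
).replace(" ", "")

# Insert one T between the second T and the G of the first codon (TTG -> TTT G...).
CORRECTED_3PRIME = GENBANK_3PRIME[:2] + "T" + GENBANK_3PRIME[2:]

print(translate(GENBANK_3PRIME))    # a single leucine, then an immediate stop
print(translate(CORRECTED_3PRIME))  # the 16-residue C-terminus of Figure 6.7.b
```

With the extra base, residues 290 to 305 read FVKSGLSKLTSVINEF, which restores a 305 amino-acid protein ending in the same INEF motif as T4 RNase H.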
6.2.3. Molecular Cloning
New primers had to be designed in order to clone the full length RNase H from Rb69. The reverse primer was designed according to the new C-terminal sequence described in the previous section. Also, the calculated pI of Rb69 full length RNase H (see Table 6.1) is very close to 7, so it might be challenging to purify the protein using regular ion-exchange chromatography.
Therefore, a new forward primer was designed so that the rnh gene could be inserted in the pDEST-C1 expression vector, which incorporates a 6xHis-Tag on the N-terminus of the protein. The forward primer contains a CACC overhang necessary for TOPO-assisted directional insertion in the entry vector, and a TEV protease cleavage site which will allow the His-Tag to be cleaved off after purification. The new set of primers is shown in Figure 6.8.
Figure 6.8 – Rb69 Full Length RNase H PCR Primers
Forward Primer
rnh     5’-                                   ATG GAT TTA GAA ATG ATG TTG GAT GAA GAT TA… -3’
primer  5’- C ACC GAG AAC CTC TAC TTC CAA GGA ATG GAT TTA GAA ATG ATG TTG GAT GAA GAT TA -3’

5’- C ACC GAG AAC CTC TAC TTC CAA GGA ATG GAT TTA GAA ATG ATG TTG GAT GAA GAT TA -3’
32 bp from rnh, 57 bp total; 28% GC content; Tm = 55°C

Reverse Primer (Inverse Complement)
rnh     5’- …CTT TCT AAA TTA ACA AGT GTA ATT AAT GAA TTC TGA -3’
primer  3’-  GAA AGA TTT AAT TGT TCA CAT TAA TTA CTT AAG ACT -5’

5’- TCA GAA TTC ATT AAT TAC ACT TGT TAA TTT AGA AAG -3’
36 bp total; 22% GC content; Tm = 55°C
The forward primer was aligned with the nucleotide sequence of the rnh gene. The start codon is highlighted in green. In red is the CACC overhang necessary for insertion in the pDEST-C1 vector, and the sequence coding for the TEV protease cleavage site is shown in blue. The reverse primer is shown as the inverse complement of rnh, where the stop codon is highlighted in red.
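The primer properties listed in Figure 6.8 can be re-derived from the sequences themselves. The sketch below is illustrative only (the helper names are assumptions, not taken from any primer design software): it checks that the reverse primer is the inverse complement of the corrected 3' end of rnh and computes the GC content. The melting temperature is not recomputed, since the formula used to obtain 55°C is not stated.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the inverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def gc_percent(dna: str) -> float:
    """Percentage of G and C bases in the sequence."""
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


# Sequences copied from Figure 6.8 (spaces removed).
RNH_3PRIME = "CTT TCT AAA TTA ACA AGT GTA ATT AAT GAA TTC TGA".replace(" ", "")
REVERSE_PRIMER = "TCA GAA TTC ATT AAT TAC ACT TGT TAA TTT AGA AAG".replace(" ", "")
FORWARD_PRIMER = (
    "C ACC GAG AAC CTC TAC TTC CAA GGA "
    "ATG GAT TTA GAA ATG ATG TTG GAT GAA GAT TA"
).replace(" ", "")

# The last 32 nucleotides of the forward primer anneal to rnh (start codon onward).
FORWARD_ANNEALING = FORWARD_PRIMER[-32:]

print(reverse_complement(RNH_3PRIME) == REVERSE_PRIMER)
print(len(FORWARD_PRIMER), round(gc_percent(FORWARD_ANNEALING)))
print(len(REVERSE_PRIMER), round(gc_percent(REVERSE_PRIMER)))
```

Note that the 28% GC content quoted for the forward primer corresponds to its 32-nucleotide rnh-annealing portion rather than to the full 57-nucleotide primer, which also carries the CACC overhang and the TEV site.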
The PCR reaction was performed as described in Table 6.3. The annealing temperature was chosen as the melting temperature + 2°C, so 57°C. It is recommended to use 1 minute/kb for the extension time, therefore 1 minute was used as the rnh gene is 918 bp long.
Table 6.3 – Rb69 RNase H PCR Reaction
PCR Reaction PCR Program
Proof Start Buffer (10X) 5 µL Activation 95 °C, 4 minutes
Primer solution (10 µM) 10 µL Denaturation 95 °C, 1 minute
dNTPs (10 mM) 2 µL Annealing 57 °C, 1 minute
Proof Start Polymerase (2.5 U/µL) 2 µL Extension 68 °C, 1 minute
Rb69 genome 1 µL 32 cycles
Autoclaved water 35 µL
The PCR product was then run on a 1.5% agarose gel that is shown in Figure 6.9.a. Since the band corresponding to the rnh gene amplified by PCR runs at the correct size, it was excised from the gel and the gene was purified using the MinElute® gel purification kit from Qiagen.
The rnh PCR product was first inserted in the pET101 expression vector.
The advantage of pET101 is that it allows direct insertion of the PCR product in the expression vector, therefore making the cloning process faster. However, pET101 constructs seldom yield expression of the protein of interest. The reaction setup is shown in Table 6.4, and it was left to incubate at Room
Temperature for 30 minutes.
Table 6.4 – Rb69 RNase H Insertion in pET101 Reaction
pET101 Insertion Reaction
Purified PCR product 2 µL
Salt Solution 1 µL
pET101 vector 1 µL
Autoclaved water 2 µL
The salt solution is provided with the pET101 cloning kit, and is composed of 1.2 M NaCl + 0.06 M MgCl2. The pET101 vector is already linearized and covalently linked to Topoisomerase I.
A 2 µL sample of the reaction was transformed into 50 µL of competent Top10 cells. After the transformation reaction, 50 µL and 100 µL of the cells were plated on two separate Petri dishes containing LB Agar + Ampicillin, as pET101 contains the AmpR Ampicillin resistance gene. The plates were then incubated at 37 °C until good-sized colonies could be seen. Several colonies were picked from each plate and grown overnight in liquid medium containing LB and Ampicillin. The next day, the cells were pelleted and the plasmid DNA was isolated using the MiniPrep® kit from Qiagen. Glycerol stocks of all the colonies were also made at OD600 = 0.4. The plasmids obtained from the different colonies were run on a 1% agarose gel that is shown in Figure 6.9.b. The plasmids from colonies 3, 4 and 6 seem to have the right insert, despite the poor quality of the gel. All three plasmids were transformed into a competent expression host to check for protein expression.
Because of the limited success of pET101 constructs in terms of protein
expression, the PCR purified gene was also inserted in the pENTR-D entry
vector. This work was done in parallel with the cloning in pET101. The pENTR-D
reaction setup is shown in Table 6.5. Once all the components were added to the
reaction, it was incubated at Room Temperature for 30 minutes.
Table 6.5 – Rb69 RNase H Insertion in pENTR-D Reaction
pENTR-D Insertion Reaction
Purified PCR product 2 µL
Salt Solution 1 µL
pENTR-D vector 1 µL
Autoclaved water 2 µL
The salt solution is provided with the pENTR-D cloning kit, and is composed of 1.2 M NaCl + 0.06 M MgCl2. The pENTR-D vector is already linearized and covalently linked to Topoisomerase I.
A 2 µL sample of the reaction was transformed into 50 µL of competent
Top10 cells. Then, 50 µL and 100 µL of the transformed cells were respectively
plated on two Petri dishes containing LB Agar + Kanamycin, as the pENTR-D
vector carries the Kanamycin resistance gene. The plates were incubated at
37 °C overnight. Several colonies were picked on each plate and grown
overnight in liquid medium containing LB and Kanamycin. The next day, the cells
were pelleted down and the plasmid DNA was isolated using the MiniPrep kit.
Glycerol stocks of all the colonies were also made at OD600 = 0.4. The plasmids
obtained from the different colonies were run on a 1% agarose gel that is shown
in Figure 6.9.c. All four plasmids have the correct size, indicating that the gene
was correctly inserted in the pENTR-D entry vector. However, four bands can be
seen on the gel for each plasmid, when only three are expected (supercoiled
DNA, linear DNA and nicked DNA). The presence of a fourth band is an
indication that another DNA species is present and might hinder any chance of
success in the following step, which is the insertion of the gene in the expression
vector. To avoid any difficulty, the supercoiled plasmid from colony 2,
corresponding to the strongest, lowest band around 3500 bp, was gel purified with
the MinElute kit.
Next, the LR Clonase reaction was performed, in order to swap the rnh
gene from the entry vector into the pDEST-C1 expression vector. The reaction setup is described in Table 6.6. It should be noted that in this case, the LR
Clonase enzyme is not present in the destination vector and has to be added
separately. The 1X TE buffer is composed of 20 mM Tris HCl pH 8.0 + 1 mM
EDTA.
Table 6.6 – Rb69 RNase H Insertion in pDEST-C1 Reaction
LR Clonase Reaction
Gel Purified rnh in pENTR-D 1 µL
pDEST-C1 1 µL
LR Clonase Mix 2 µL
TE buffer (1X) 6 µL
The reaction was incubated at room temperature for 2 hours and
terminated by adding 1 µL of Proteinase K at 2 µg/µL, followed by incubation for
10 minutes at 37°C. Proteinase K is a protease that degrades the LR
Clonase enzyme and leaves the DNA untouched, therefore ending the recombination reaction. A 2 µL sample of the reaction was transformed into 100 µL of
competent DH5α cells. Top10 cells could not be used with the pDEST-C1 vector:
that cloning host is itself resistant to Streptomycin, which is also the resistance marker carried by
pDEST-C1, so colonies containing pDEST-C1 could not be selected. Again, 50 µL and 100 µL of the
transformed DH5α cells were plated on two plates made with LB Agar +
Streptomycin. Colonies were picked on each plate, grown overnight in LB +
Streptomycin, and the plasmid was isolated from the cells with the MiniPrep kit.
Glycerol stocks were taken for all colonies at OD600 = 0.6. The different plasmids
were run on a 1% agarose gel, shown in Figure 6.9.d.
Figure 6.9 – Agarose Gels for Rb69 RNase H Cloning
a
1- Rb69 rnh PCR product
2- Rb69 rnh PCR product
3- Rb69 rnh PCR product
4- 100 bp ladder (1000 bp, 500 bp and 100 bp bands marked)
The rnh gene is 918 bp long, the PCR product has the correct size.
b
1- Supercoiled DNA ladder (5 kb and 2 kb bands marked)
2- Rb69 rnh insert in pET101 – colony 1
3- Rb69 rnh insert in pET101 – colony 2
4- Rb69 rnh insert in pET101 – colony 3
5- Rb69 rnh insert in pET101 – colony 4
6- Rb69 rnh insert in pET101 – colony 5
7- Rb69 rnh insert in pET101 – colony 6
8- Rb69 rnh insert in pET101 – colony 7
The correct sizes are as follows:
• rnh 918 bp
• pET101 5753 bp
• Total 6671 bp
The pET101 + rnh construct should run between 6 and 7 kb. Colonies 3, 4 and 6 look like they have the right insert, but the quality of the gel makes it difficult to assess the actual size of the different plasmids.
c
1- Rb69 rnh insert in pENTR-D – colony 1
2- Rb69 rnh insert in pENTR-D – colony 2
3- Rb69 rnh insert in pENTR-D – colony 3
4- Rb69 rnh insert in pENTR-D – colony 4
5- Supercoiled DNA ladder (5 kb and 2 kb bands marked)
The correct sizes are as follows:
• rnh 918 bp
• pENTR-D 2580 bp
• Total 3498 bp
The pENTR-D + rnh constructs run between 3 and 4 kb, which indicates that rnh is inserted in the vector.
d
1- Supercoiled DNA ladder (5 kb and 2 kb bands marked)
2- Rb69 rnh insert in pDEST-C1 – colony 1
3- Rb69 rnh insert in pDEST-C1 – colony 2
4- Rb69 rnh insert in pDEST-C1 – colony 3
5- Rb69 rnh insert in pDEST-C1 – colony 4
6- Rb69 rnh insert in pDEST-C1 – colony 5
7- Rb69 rnh insert in pDEST-C1 – colony 6
8- Rb69 rnh insert in pDEST-C1 – colony 7
The correct sizes are as follows:
• rnh 918 bp
• pDEST-C1 + 5334 bp
• ccdB - 1600 bp
• Total 4652 bp
The pDEST-C1 + rnh construct should run between 4 and 5 kb. Colonies 1 and 3 are contaminated with the pENTR-D + rnh construct (3500 bp), but all the other colonies show the right size.
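The expected plasmid sizes quoted in the legend of Figure 6.9 follow from simple arithmetic on the vector and insert lengths. As a minimal sketch (the vector sizes are the ones given in the legend; the variable and function names are illustrative), the calculation can be written as:

```python
def gene_length_bp(n_residues: int, include_stop: bool = True) -> int:
    """Length of a coding sequence in base pairs: three bases per codon."""
    return 3 * (n_residues + (1 if include_stop else 0))


# Full length Rb69 RNase H: 305 codons plus the stop codon.
RNH_BP = gene_length_bp(305)

# Vector sizes as listed in the legend of Figure 6.9.
PET101_BP = 5753
PENTR_D_BP = 2580
PDEST_C1_BP = 5334
CCDB_CASSETTE_BP = 1600  # removed from pDEST-C1 during the LR Clonase reaction

expected_sizes = {
    "pET101 + rnh": PET101_BP + RNH_BP,
    "pENTR-D + rnh": PENTR_D_BP + RNH_BP,
    "pDEST-C1 + rnh": PDEST_C1_BP - CCDB_CASSETTE_BP + RNH_BP,
}

for construct, size in expected_sizes.items():
    print(f"{construct}: {size} bp")
```

The subtraction for pDEST-C1 reflects the fact that the LR reaction exchanges the ccdB cassette of the destination vector for the gene carried by the entry vector, so the final construct is smaller than the empty destination vector plus the insert.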
Some contamination with the pENTR-D rnh insert can be seen in the plasmids isolated from colonies 1 and 3. These colonies were picked from the same plate (the one inoculated with 50 µL of transformed cloning host). The plasmid isolated from colony 2 has the correct size and shows the strongest band; it was therefore chosen to be transformed into an expression host.
6.2.4. Protein Expression and Solubility
The three pET101 + rnh plasmids isolated from colonies 3, 4 and 6 were
all transformed in competent T7 express cells (1 µL of plasmid / 25 µL of cells).
That specific expression cell line was chosen because protein expression is more
tightly regulated due to the presence of the lacIq promoter. Leaky expression of
the toxic RNase H is therefore greatly reduced, and better yields can be achieved. T7 express cells are resistant to Tetracyclin (see Table 2.1).
After the transformation, 50 µL and 100 µL of the reaction were separately
incubated overnight in 10 mL LB + Ampicillin + Tetracyclin. The next day, 500
µL of the overnight culture were again incubated in fresh LB + Ampicillin +
Tetracyclin, and the cells were allowed to grow until they reached OD600 = 0.6. At
that point, glycerol stocks were taken as well as SDS-PAGE samples, and
protein expression was induced by adding 1 mM IPTG. After three hours of
expression, samples were again taken. The SDS-PAGE for RNase H expression
is shown in Figure 6.10.a. It can be seen that a 55 kDa protein is strongly expressed, whereas Rb69 RNase H is a 35.4 kDa protein. It was concluded that the competent T7 express cells used for the transformation reaction were contaminated with a plasmid containing a gene coding for a 55 kDa protein. This particular 55 kDa protein was not identified.
A new transformation reaction was performed with a fresh batch of competent T7 express cells. The SDS-PAGE gel for RNase H expression is shown in Figure 6.10.b. This time, weak expression around 28 kDa can be seen, but again no RNase H was produced.
Figure 6.10 – Rb69 RNase H (pET101) Expression
a
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa)
2- Rb69 RNase H expression (plasmid from colony 3) – 0h sample
3- Rb69 RNase H expression (plasmid from colony 3) – 3h sample
4- Rb69 RNase H expression (plasmid from colony 4) – 0h sample
5- Rb69 RNase H expression (plasmid from colony 4) – 3h sample
6- Rb69 RNase H expression (plasmid from colony 6) – 0h sample
7- Rb69 RNase H expression (plasmid from colony 6) – 3h sample
Three hours after IPTG induction, a 55 kDa protein is expressed. Rb69 RNase H should run around 35 kDa, so it is not RNase H that is being produced.
b
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa)
2- Rb69 RNase H expression (plasmid from colony 3) – 0h sample
3- Rb69 RNase H expression (plasmid from colony 3) – 3h sample
4- Rb69 RNase H expression (plasmid from colony 4) – 0h sample
5- Rb69 RNase H expression (plasmid from colony 4) – 3h sample
6- Rb69 RNase H expression (plasmid from colony 6) – 0h sample
7- Rb69 RNase H expression (plasmid from colony 6) – 3h sample
Weak expression of a 28 kDa protein is occurring, but this is too small a protein to be RNase H. No band is seen at 35 kDa.
Since expression of Rb69 RNase H from the pET101 plasmid was
unsuccessful, it was decided to focus on the pDEST-C1 plasmid instead and test
it for protein expression.
As previously mentioned, the pDEST-C1 + rnh plasmid isolated from colony 2 (Figure 6.9.d) was selected, and 0.7 µL of it was transformed into competent T7 express cells. A 100 µL sample of the transformed T7 express cells was directly incubated in 10 mL of LB + Streptomycin + Tetracyclin and allowed to grow overnight. The overnight cell culture was then used the next day to inoculate fresh LB + Streptomycin + Tetracyclin. The cells were grown until OD600 = 0.6, when a glycerol stock was taken and protein expression was induced by adding 1 mM IPTG. Samples taken at 0 h and 3 h were run on an SDS-PAGE gel, shown in Figure 6.11.a.
Once protein expression was assessed, a 1 L culture was grown using the same protocol so that a decently sized cell pellet could be obtained for cell lysis.
A 1 g pellet was obtained and lysed in the following buffer: 50 mM Tris HCl pH 7.5, 200 mM NH4Cl, 10 mM MgCl2, 5% glycerol, 2 mM DTT and 0.03% PEI.
The cell lysis protocol is described in Section 2.3. The pellet and supernatant samples were run on an SDS-PAGE gel shown in Figure 6.11.b. Some Rb69
RNase H is found insoluble in the pellet, but a good fraction of the protein is also soluble and found in the supernatant. The same phenomenon was observed for
T4 RNase H. The protein was declared soluble, and a 6 L culture was prepared for further purification studies.
Figure 6.11 – Rb69 RNase H (pDEST-C1) Expression and Cell Lysis
a
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa)
2- Rb69 RNase H expression – 0h sample
3- Rb69 RNase H expression – 3h sample
b
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa)
2- Rb69 RNase H cell lysis – pellet
3- Rb69 RNase H cell lysis – supernatant
Once protein expression and solubility were assessed, the pDEST-C1 +
rnh plasmid chosen for the expression studies was sent to the Plant-Microbe
Genomics Facility at Ohio State University for DNA sequencing. The results are presented in Appendix 3, and show that the Rb69 rnh gene was correctly amplified and inserted in the pDEST-C1 vector.
6.2.5. Protein Purification
To purify Rb69 RNase H, it was decided to take advantage of the N-terminal 6xHisTag and to purify the protein using a cobalt affinity (Talon) column. However, the protein first had to be roughly isolated from the lysate by ion-exchange chromatography.
The first purification scheme that was designed was the following:
RNase H was first loaded on the low-resolution cation-exchange column SP
Sepharose. The elution from the SP Sepharose was then further purified with the
Talon column. Since the pI of Rb69 RNase H calculated from Expasy (Gill and
von Hippel, 1989) is very close to 7, as is shown in Table 6.1, the pH of the lysis
buffer had to be lowered below 7 so that the protein would be in a cationic form.
Thus, the cell pellet was first lysed with the buffer shown in Table 6.7.
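The reasoning behind lowering the pH can be illustrated with a simple Henderson-Hasselbalch estimate of the net charge of the protein as a function of pH. The sketch below is only an approximation under stated assumptions: the side-chain and terminal pKa values are generic textbook values (not the set used by Expasy), the sequence is the untagged full length Rb69 RNase H from Figure 6.2, and the function name is illustrative. The numbers it returns will therefore not reproduce the calculated pI of 6.97 exactly; the trend with pH is the point.

```python
# Untagged full length Rb69 RNase H, copied from Figure 6.2.
RB69_RNASE_H = (
    "MDLEMMLDEDYKEGIALADFSNIALAAALNNFEDGDKITVPMVRHVVLNSIRKNVVMFRK"
    "QGYTKFVLCMDNATSGYWRRDFAYYYKKNRKTDREASKWDWEGYFTALHQVVDEIKKYMP"
    "YVVMDIDKYEADDHIGVLTKYLSLAGHKVCIVASDGDFTQLHKYPNVKQWSPPQKKWVKI"
    "KNGSAEIDCMTKILKGDRKDGVASVRVRGDFWFTRVEGERTPSMKTTIIEALANDRSQAE"
    "VLLSAEEYKRYQENLVLIDFDYIPDNIASTIIEYYNSYQPQPKGKIYSYFVKSGLSKLTS"
    "VINEF"
)

# Assumed, approximate pKa values (generic textbook numbers).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERMINUS_PKA = 9.0
C_TERMINUS_PKA = 2.3


def net_charge(sequence: str, ph: float) -> float:
    """Approximate net charge of a protein at a given pH."""
    # Basic groups carry +1 when protonated.
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERMINUS_PKA))
    for residue, pka in POSITIVE_PKA.items():
        charge += sequence.count(residue) / (1.0 + 10 ** (ph - pka))
    # Acidic groups carry -1 when deprotonated.
    charge -= 1.0 / (1.0 + 10 ** (C_TERMINUS_PKA - ph))
    for residue, pka in NEGATIVE_PKA.items():
        charge -= sequence.count(residue) / (1.0 + 10 ** (pka - ph))
    return charge


for ph in (6.5, 7.5, 8.0):
    print(f"pH {ph}: net charge {net_charge(RB69_RNASE_H, ph):+.1f}")
```

The net charge decreases monotonically as the pH rises, so a buffer below the pI favors the cationic form needed for binding to a cation exchanger, while a buffer above the pI favors the anionic form. The His-Tag and TEV site present on the expressed construct are not included here and would shift the values somewhat.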
Table 6.7 – Lysis and HPLC buffers for Rb69 RNase H purification scheme 1
                 Lysis                        Ion Exchange (SP Sepharose)    Metal Affinity (Talon)
Buffers          50 mM bis-Tris HCl pH 6.5    25 mM bis-Tris HCl pH 6.5      50 mM Na Phosphate pH 8.0
                 200 mM NH4Cl                 100 mM NH4Cl                   300 mM NaCl
                 10 mM MgCl2                  10 mM MgCl2                    10 mM MgCl2
                 5% glycerol                  0 - 1 M NaCl                   + 7.5 mM Imidazole (wash)
                 2 mM DTT                                                    or 150 mM Imidazole (elution)
                 0.03% PEI
Conductivity     ~ 18 mS/cm                   Buffer A: ~ 16 mS/cm           ~ 35 mS/cm
                                              Buffer B: ~ 96 mS/cm
Elution          /                            100% A → 100% B                100% A → 50% B
The results of the protein expression and cell lysis with that particular
buffer are presented in Figure 6.12.a. The majority of the protein was found in the
lysis pellet and very little in the lysate, which indicates that the protein is much
less soluble at pH 6.5, as compared to the cell lysis that was done at pH 7.5
(Figure 6.11.b) where a good fraction of the protein was found in the lysate.
Nonetheless, the soluble fraction was loaded on the SP Sepharose. The buffers used for that step are also shown in Table 6.7; the chromatogram and SDS-PAGE gel from the run are presented in Figure 6.12.b. RNase H bound to the SP resin and eluted in reasonably pure form for a first purification step. The fractions containing the protein were pooled to be run on the metal affinity Talon column.
They were first dialyzed in the Talon equilibration buffer that contains no
imidazole. The different buffers used with the Talon column are shown in Table
6.7. It should be noted that magnesium chloride was added to the buffers to
increase the solubility of RNase H. As a result, magnesium phosphate tribasic
Mg3(PO4)2 precipitated out of solution and made that purification step difficult.
The protein eluted from the SP Sepharose was loaded in several batches on the
Talon column and eluted with 75 mM imidazole. The chromatogram and SDS-
PAGE gel from the first batch are shown in Figure 6.12.c. The fractions containing the protein were again pooled and concentrated. The pH of the solution also had to be lowered to 6.0 to avoid precipitation of magnesium phosphate.
Figure 6.12 – Rb69 RNase H purification scheme 1
Figure 6.12.a – Cell lysis
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa)
2- Rb69 RNase H expression – 0h sample
3- Rb69 RNase H expression – 3h sample
4- Rb69 RNase H cell lysis – pellet
5- Rb69 RNase H cell lysis – supernatant
Figure 6.12.b – SP Sepharose
1- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa)
2- SP Sepharose – F. 7
3- SP Sepharose – F. 10
4- SP Sepharose – F. 13
5- SP Sepharose – Flow Through
On the chromatogram, the OD260 is shown in purple, the OD280 in green, the conductivity in red and the % B in black. The red stars indicate which fractions were run on a SDS-PAGE gel. Here, fractions 11 to 18 contained RNase H and were pooled to be run on the Talon column. The pooled fractions are indicated with a red line on the chromatogram.
Figure 6.12.c – Talon
1- Talon – Load
2- Talon – F. 18
3- Talon – F. 32
4- Talon – Flow Through
5- Molecular Weight Marker (66.3, 55.4, 36.5, 31.0, 21.5 and 14.4 kDa)
Here, fractions 14 to 40 contained RNase H and were pooled for concentration. A large portion of the protein didn’t bind to the resin and was found in the Flow Through fraction.
The first purification scheme yielded less than a milligram of pure protein
(from 11.6 g of cells), the low yield being mostly due to the incapacity to extract
RNase H in the lysis buffer at pH 6.5.
As more protein needed to be purified, a second purification scheme was
designed. This time, the lysis buffer had a higher pH of 8.0, to improve the
extraction efficiency, and the lysate was first loaded on a low resolution
anion-exchange column, the Q Sepharose, before being run on the Talon
column. The new set of buffers is described in Table 6.8.
Table 6.8 – Lysis and HPLC buffers for Rb69 RNase H purification scheme 2
                 Lysis                     Ion Exchange (Q Sepharose)     Metal Affinity (Talon)
Buffers          50 mM Tris HCl pH 8.0     25 mM bis-Tris HCl pH 8.0      50 mM Na Phosphate pH 8.0
                 200 mM NH4Cl              100 mM NH4Cl                   300 mM NaCl
                 10 mM MgCl2               10 mM MgCl2                    7.5 mM Imidazole (wash)
                 5% glycerol               0 - 1 M NaCl                   or 150 mM Imidazole (elution)
                 2 mM DTT
                 0.03% PEI
Conductivity     ~ 18 mS/cm                Buffer A: ~ 16 mS/cm           ~ 30 mS/cm
                                           Buffer B: ~ 98 mS/cm
Elution          /                         100% A → 100% B                100% A → 50% B
After cell lysis, a larger amount of RNase H was found in the supernatant,
compared to Scheme 1. The SDS-PAGE gel for the lysis is shown in Figure
6.13.a. The soluble fraction was then loaded on the low resolution anion
exchange Q Sepharose column. The buffers for this step are presented in Table 6.8, and the chromatogram and SDS-PAGE gel in Figure 6.13.b. RNase H did not bind very tightly to the resin, and most of it was eluted while the column was being rinsed with buffer A. The rest of the protein was eluted with a salt gradient at the
beginning of the run. All the fractions containing RNase H were pooled and
dialyzed overnight in the Talon equilibration buffer. After dialysis, precipitate was
found in the protein solution. This could again have been due to magnesium phosphate precipitation, as the Q buffer contains Mg2+ and the Talon buffer contains phosphate ions, but the precipitate did not dissolve when the pH of the solution was decreased. The precipitate was then run on an SDS-PAGE gel and identified as RNase H (see Figure 6.13.c, lane 1). It was pelleted and resuspended in the Q buffer A, then kept at 4°C to allow the protein to go back into solution, but none of the precipitate redissolved. The supernatant from the dialysis step was further purified with the Talon column, using the buffers described in Table 6.8. As in scheme 1, RNase H was loaded onto the column in several batches. The chromatogram and SDS-PAGE gel for one of the Talon runs are shown in Figure 6.13.c. It should be noted that some of the protein is eluted when washing the column with the rinse buffer containing 7.5 mM imidazole, as can be seen on the rinse fraction in lane 5. The fractions containing RNase H were pooled and concentrated.
Figure 6.13 – Rb69 Full Length RNase H purification scheme 2
Figure 6.13.a – Cell lysis
1- Molecular Weight Marker
2- Rb69 RNase H cell lysis – pellet
3- Rb69 RNase H cell lysis – supernatant
Molecular weight marker bands: 66.3 kDa, 55.4 kDa, 36.5 kDa, 31.0 kDa, 21.5 kDa
Figure 6.13.b – Q Sepharose
1- Q Sepharose – Load
2- Q Sepharose – F. 11
3- Q Sepharose – F. 13
4- Q Sepharose – F. 15
5- Q Sepharose – F. 18
6- Q Sepharose – F. 21
7- Q Sepharose – F. 23
8- Q Sepharose – F. 25
9- Q Sepharose – F. 27
10- Q Sepharose – Rinse fraction
11- Q Sepharose – Flow Through
12- Molecular Weight Marker
Molecular weight marker bands: 66.3 kDa, 55.4 kDa, 36.5 kDa, 31.0 kDa, 21.5 kDa
On the chromatogram, the OD260 is shown in purple, the OD280 in green, the conductivity in red and the % B in black. The red stars indicate which fractions were run on an SDS-PAGE gel. Here, fractions 1 to 16 as well as the rinse fraction contained RNase H and were pooled to be run on the Talon column. The pooled fractions are indicated with a red line on the chromatogram.
Figure 6.13.c – Talon
1- Precipitate from the dialysis
2- Talon – Load
3- Talon – F. 21
4- Talon – F. 28
5- Talon – Flow Through
6- Talon – Rinse fraction (concentrated)
7- Talon – 150 mM Imidazole Elution
8- Molecular Weight Marker
Molecular weight marker bands: 66.3 kDa, 55.4 kDa, 36.5 kDa, 31.0 kDa, 21.5 kDa, 14.4 kDa
Here, fractions 18 to 40 contained RNase H and were pooled for concentration. A large portion of the protein did not bind to the resin and was found in the Rinse fraction. This particular fraction was concentrated before being loaded on the gel, which is why it looks more concentrated.
The final yield for this purification scheme was around 2 mg of pure protein from 6.8 g of cells. Most of the protein was lost to precipitation and could not be brought back into solution.
A different approach was used for the third purification scheme. Since RNase H does not seem to bind to the metal affinity column, possibly because the His-Tag is not accessible for binding to the Co2+, the protein would be purified using ion-exchange chromatography only. The different buffers for this scheme are detailed in Table 6.9.
Table 6.9 – Lysis and HPLC buffers for Rb69 RNase H purification scheme 3
Lysis
Buffer: 50 mM Tris HCl pH 7.5, 200 mM NH4Cl, 10 mM MgCl2, 5% glycerol, 2 mM DTT, 0.03% PEI
Conductivity: ~ 18 mS/cm
Elution: /
Ion Exchange (SP Sepharose and POROS HS)
Buffers: 25 mM bis-Tris HCl pH 6.5, 100 mM NH4Cl, 10 mM MgCl2, 0 - 1 M NaCl
Conductivity: Buffer A: ~ 16 mS/cm, Buffer B: ~ 96 mS/cm
Elution: 100% A → 100% B
Size Exclusion (Superdex 75)
Buffer: 25 mM bis-Tris HCl pH 6.5, 150 mM NH4Cl, 10 mM MgCl2
Conductivity: /
Elution: /
The cell pellet was first lysed at pH 7.5 to yield soluble protein, using the lysis buffer described in Table 6.9. The soluble portion, containing RNase H, was run on an SDS-PAGE gel, which is shown in Figure 6.14.a, lane 2. This supernatant was then loaded on the same low resolution cation exchange column that was used in Scheme 1, the SP Sepharose, because RNase H seemed to bind more strongly to the cation exchange resin than to the anion exchange one. The chromatogram and SDS-PAGE gel corresponding to the SP Sepharose run are presented in Figure 6.14.a. As in Scheme 1, the protein was eluted fairly pure from the SP Sepharose, and the fractions containing RNase H were pooled and loaded on the next column, the high resolution cation exchange Poros HS. The protein eluted from the Poros HS was still contaminated with higher molecular weight impurities, as can be seen in Figure 6.14.b. Therefore, the protein was further purified on a size exclusion column. The Superdex 75 was chosen because proteins of about 75 kDa and larger elute in its void fraction, which is well above the size of RNase H (around 40 kDa with the His-Tag). Thus, RNase H should be eluted during the run and the impurities either earlier in the run or in the void fraction. Indeed, RNase H was purified away from the impurities (see Figure 6.14.c) and was then concentrated.
Figure 6.14 – Rb69 Full Length RNase H purification scheme 3
Figure 6.14.a – SP Sepharose
1- Molecular Weight Marker
2- SP Sepharose – Load
3- SP Sepharose – F. 10
4- SP Sepharose – F. 14
5- SP Sepharose – Flow Through
Molecular weight marker bands: 66.3 kDa, 55.4 kDa, 36.5 kDa, 31.0 kDa, 21.5 kDa
On the chromatogram, the OD260 is shown in purple, the OD280 in green, the conductivity in red and the % B in black. The red stars indicate which fractions were run on an SDS-PAGE gel. Here, fractions 13 to 19 contained RNase H and were pooled to be run on the Poros HS. The pooled fractions are indicated with a red line on the chromatogram.
Figure 6.14.b – Poros HS
1- Poros HS – Load
2- Poros HS – F. 12
3- Poros HS – Flow Through
4- Molecular Weight Marker
Molecular weight marker bands: 66.3 kDa, 55.4 kDa, 36.5 kDa, 31.0 kDa, 21.5 kDa
Here, RNase H is found in the main peak from fraction 6 to 16, and is almost pure. These fractions were pooled and run on a size exclusion column.
Figure 6.14.c – Superdex 75
1- Molecular Weight Marker
2- Superdex 75 – Load
3- Superdex 75 – F. 2
4- Superdex 75 – F. 10
5- Superdex 75 – F. 12
6- Superdex 75 – F. 20
Molecular weight marker bands: 66.3 kDa, 55.4 kDa, 36.5 kDa, 31.0 kDa, 21.5 kDa
The high molecular weight impurities are found in fraction 10 while RNase H is in fraction 12. A low molecular weight impurity was also eluted from the column in fraction 20. Fractions 11 to 14 were pooled and concentrated.
The final yield for this purification scheme was close to 5 mg for a starting
mass of 7 g of cells. A large amount of protein was lost, due to problems with the
fraction collector, and the expected yield should be around 15 to 20 mg. This is a
considerable improvement compared to the previous purification schemes, and
Scheme 3 was chosen as the preferred purification protocol for Rb69 RNase H.
However, these yields are still rather low, and this is most likely due to the
toxicity of native RNase H once it is overexpressed. Similarly to the strategy that
was adopted for the Bacteriophage T4 RNase H, an inactive mutant, D132N
RNase H, was cloned. This mutant has greatly reduced nuclease activity and
larger amounts of protein can be obtained. Moreover, an inactive RNase H would
have to be used for DNA binding studies, providing another reason for cloning
this mutant.
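The yields quoted above for Schemes 2 and 3 are easier to compare once they are normalized to the mass of cells used. The short Python sketch below does only that arithmetic with the values stated in the text; the dictionary labels and the function name are illustrative and are not part of the original protocol.

```python
# Protein yield (mg) and starting cell mass (g), as quoted in the text.
schemes = {
    "Scheme 2 (obtained)": (2.0, 6.8),
    "Scheme 3 (obtained)": (5.0, 7.0),
    "Scheme 3 (expected, lower bound)": (15.0, 7.0),
    "Scheme 3 (expected, upper bound)": (20.0, 7.0),
}


def yield_per_gram(protein_mg, cells_g):
    """Return the yield in mg of pure protein per g of cells."""
    return protein_mg / cells_g


specific_yields = {
    name: yield_per_gram(protein_mg, cells_g)
    for name, (protein_mg, cells_g) in schemes.items()
}

for name, value in specific_yields.items():
    print(f"{name}: {value:.2f} mg protein per g of cells")
```

On this basis Scheme 3 gave roughly 0.7 mg per g of cells against roughly 0.3 mg per g for Scheme 2, and the expected yield without the fraction collector problem would be about 2 to 3 mg per g.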
6.3. Bacteriophage Rb69 D132N RNase H
6.3.1. Introduction
The D132N mutant for the bacteriophage T4 RNase H is an inactive
mutant, as Asp 132 is involved in Mg2+ binding in the active site. Mutating this
aspartate into an asparagine prevents metal binding in site 1 (see Section 1.1.3)
and therefore removes nuclease activity. The same mutant can be made for the
Rb69 RNase H. A sequence alignment (see Figure 6.15) shows that residue 132
in Rb69 RNase H is the same aspartate that was involved in metal binding in T4
RNase H, thus the same D132N mutant in Rb69 RNase H would also be inactive.
Figure 6.15 – Sequence Alignment of T4 RNase H and Rb69 RNase H
Rb69    1 MDLEMMLDEDYKEGIALADFSNIALAAALNNFEDGDKITVPMVRHVVLNSIRKNVVMFRK  60
          MDLEMMLDEDYKEGI L DFS IAL+ AL NF D +KI + MVRH++LNSI+ NV   +
T4      1 MDLEMMLDEDYKEGICLIDFSQIALSTALVNFPDKEKINLSMVRHLILNSIKFNVKKAKT  60

Rb69   61 QGYTKFVLCMDNATSGYWRRDFAYYYKKNRKTDREASKWDWEGYFTALHQVVDEIKKYMP 120
           GYTK VLC+DNA SGYWRRDFAYYYKKNR   RE S WDWEGYF + H+V+DE+K YMP
T4     61 LGYTKIVLCIDNAKSGYWRRDFAYYYKKNRGKAREESTWDWEGYFESSHKVIDELKAYMP 120

Rb69  121 YVVMDIDKYEADDHIGVLTKYLSLAGHKVCIVASDGDFTQLHKYPNVKQWSPPQKKWVKI 180
          Y+VMDIDKYEADDHI VL K  SL GHK+ I++SDGDFTQLHKYPNVKQWSP  KKWVKI
T4    121 YIVMDIDKYEADDHIAVLVKKFSLEGHKILIISSDGDFTQLHKYPNVKQWSPMHKKWVKI 180

Rb69  181 KNGSAEIDCMTKILKGDRKDGVASVRVRGDFWFTRVEGERTPSMKTTIIEALANDRSQAE 240
          K+GSAEIDCMTKILKGD+KD VASV+VR DFWFTRVEGERTPSMKT+I+EA+ANDR QA+
T4    181 KSGSAEIDCMTKILKGDKKDNVASVKVRSDFWFTRVEGERTPSMKTSIVEAIANDREQAK 240

Rb69  241 VLLSAEEYKRYQENLVLIDFDYIPDNIASTIIEYYNSYQPQPKGKIYSYFVKSGLSKLTS 300
          VLL+  EY RY+ENLVLIDFDYIPDNIAS I+ YYNSY+  P+GKIYSYFVK+GLSKLT+
T4    241 VLLTESEYNRYKENLVLIDFDYIPDNIASNIVNYYNSYKLPPRGKIYSYFVKAGLSKLTN 300

Rb69  301 VINEF 305
           INEF
T4    301 SINEF 305

The Rb69 RNase H sequence (top rows) is shown in blue, the T4 RNase H sequence (bottom rows) in green. Identities = 218/289 (75%), Positives = 246/289 (85%). The Aspartate mutated to an Asparagine in the D132N mutant is highlighted in red.
The characteristics of Rb69 D132N RNase H are shown in Table 6.10.
They are similar to the native RNase H, except for the calculated pI which is
significantly higher.
Table 6.10 – Rb69 D132N RNase H characteristics
Amino-acids 305
Molecular Weight 35.4 kDa (40.0 kDa with HisTag)
Calculated pI 7.62
ε 1.78
6.3.2. Molecular Cloning
The forward and reverse primers for the site-directed mutagenesis reaction were designed according to the QuikChange® protocol (see Section 2.1.3) and are shown in Figure 6.16.
Figure 6.16 – Site-Directed Mutagenesis PCR primers for Rb69 D132N
RNase H Cloning
Forward Primer
rnh     5’– ATT GAC AAA TAC GAA GCG GAT GAC CAT ATC GGC GTA TTA –3’
             I   D   K   Y   E   A   D   D   H   I   G   V   L
Primer  5’–     GAC AAA TAC GAA GCG AAT GAC CAT ATC GGC G –3’
                 D   K   Y   E   A   N   D   H   I   G
5’– GAC AAA TAC GAA GCG AAT GAC CAT ATC GGC G –3’
Reverse Primer
rnh     3’– TAA CTG TTT ATG CTT CGC CTA CTG GTA TAG CCG CAT AAT –5’
Primer  3’–     CTG TTT ATG CTT CGC TTA CTG GTA TAG CCG C –5’
5’– C GCC GAT ATG GTC ATT CGC TTC GTA TTT GTC –3’
31 bp total, Tm = 76.3 °C
The forward primer for the site-directed mutagenesis PCR reaction is shown at the top, aligned with the nucleotide sequence of Rb69 RNase H. The guanosine in the GAT codon coding for D132 was mutated into an adenosine, the new AAT codon now coding for N132. The mutated nucleotide is shown in red. The primer that anneals with the original rnh gene is highlighted in yellow. The reverse primer is shown at the bottom, highlighted in the same manner.
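The primer pair in Figure 6.16 can be checked with a few lines of Python. The sketch below translates the wild-type and mutated sequences, confirms that the reverse primer is the reverse complement of the forward primer, and estimates the melting temperature. The formula used, Tm = 81.5 + 0.41(%GC) - 675/N - %mismatch, is the one commonly given in the QuikChange manual; that it is the formula behind the value quoted in the figure is an assumption, although it does reproduce that value. All function and variable names are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons in frame 1; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def quikchange_tm(primer, template):
    """Tm = 81.5 + 0.41(%GC) - 675/N - %mismatch (assumed QuikChange formula)."""
    n = len(primer)
    gc_percent = 100.0 * sum(base in "GC" for base in primer) / n
    mismatch_percent = 100.0 * sum(p != t for p, t in zip(primer, template)) / n
    return 81.5 + 0.41 * gc_percent - 675.0 / n - mismatch_percent


# Sequences copied from Figure 6.16 (spaces removed).
wild_type = "ATTGACAAATACGAAGCGGATGACCATATCGGCGTATTA"
forward = "GACAAATACGAAGCGAATGACCATATCGGCG"
reverse = "CGCCGATATGGTCATTCGCTTCGTATTTGTC"

# Region of the wild-type gene covered by the forward primer.
template_window = wild_type[3:3 + len(forward)]

print(translate(template_window))  # wild type:  DKYEADDHIG
print(translate(forward))          # mutant:     DKYEANDHIG
print(reverse_complement(forward) == reverse)
print(round(quikchange_tm(forward, template_window), 1))
```

With a 31-nucleotide primer, 15 G or C bases and a single mismatch, this estimate comes out at 76.3 °C, in agreement with the value given in the figure.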
The PCR reaction was performed as described in Table 6.11. The template used in the reaction was the rnh + pDEST-C1 plasmid from colony 2 (see Section
6.2.3). The annealing temperature of 55 °C was chosen according to the
QuikChange® protocol.
Table 6.11 – Site-Directed Mutagenesis PCR Reaction for D132N RNase H
PCR Reaction
KOD buffer (10X) 5 µL
Forward primer (2.5 µM) 6 µL
Reverse primer (2.5 µM) 6 µL
dNTPs (2 mM) 5 µL
MgSO4 (25 mM) 5 µL
KOD Polymerase (1 U/µL) 1 µL
Template 1 µL
Autoclaved water 21 µL
PCR Program
Activation 95 °C, 2 minutes
Denaturation 95 °C, 20 seconds
Annealing 55 °C, 10 seconds
Extension 70 °C, 2 minutes
20 cycles
Final extension 70 °C, 20 minutes
Upon reception, the primers were first dissolved and diluted to 250 µM with 1X TE buffer (20 mM Tris HCl pH 8.0 + 1 mM EDTA), then a 2.5 µM stock was made for each primer to be used in the PCR reaction.
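As a quick consistency check on Table 6.11, the volumes can be summed and the final concentration of each component worked out with the usual dilution relation C1 x V1 = C2 x V2. The Python sketch below uses only the stock concentrations and volumes listed in the table; the variable and function names are illustrative.

```python
# Volumes (in µL) of each component listed in Table 6.11.
reaction_ul = {
    "KOD buffer (10X)": 5.0,
    "Forward primer (2.5 µM)": 6.0,
    "Reverse primer (2.5 µM)": 6.0,
    "dNTPs (2 mM)": 5.0,
    "MgSO4 (25 mM)": 5.0,
    "KOD Polymerase (1 U/µL)": 1.0,
    "Template": 1.0,
    "Autoclaved water": 21.0,
}

total_ul = sum(reaction_ul.values())


def final_concentration(stock, volume_ul, total_volume_ul):
    """Dilution relation C1 * V1 = C2 * V2, solved for C2 (same unit as stock)."""
    return stock * volume_ul / total_volume_ul


primer_um = final_concentration(2.5, 6.0, total_ul)   # each primer, in µM
dntp_mm = final_concentration(2.0, 5.0, total_ul)     # dNTPs, in mM
mgso4_mm = final_concentration(25.0, 5.0, total_ul)   # MgSO4, in mM
buffer_x = final_concentration(10.0, 5.0, total_ul)   # KOD buffer, in X

print(f"Total volume: {total_ul:.0f} µL")
print(f"Each primer: {primer_um:.2f} µM")
print(f"dNTPs: {dntp_mm:.2f} mM")
print(f"MgSO4: {mgso4_mm:.2f} mM")
print(f"KOD buffer: {buffer_x:.0f}X")
```

The components add up to a 50 µL reaction, in which each primer is at 0.3 µM and the buffer is at 1X.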
After the reaction, 1 µL of the restriction enzyme DpnI at 20 U/µL was
added to the PCR product, in order to digest the original pDEST-C1 + rnh
plasmid that is methylated, and leave only the pDEST-C1 + D132N rnh plasmid
in solution. The mixture was incubated at 37 °C for an hour. Next, the digested
PCR product was run on a 1% agarose gel that is presented in Figure 6.17.a. It
should be noted that after the PCR reaction, the plasmid that has been amplified
has not been ligated yet and is double-nicked. Its actual size can therefore not be
directly assessed by comparison with the supercoiled DNA ladder or the 1 kb linear ladder, as the nicked plasmid will run slower than either one: on the gel, the pDEST-C1 + D132N rnh plasmid runs next to the 8 kb supercoiled DNA and 7 kb linear DNA, although its actual size is around 4600 bp. However, the purpose of this gel was
to make sure the PCR reaction yielded a clean product before moving on to the
transformation step. First, 2 µL of the PCR product were transformed into 25 µL
of DH5α competent cells, and after the transformation 50 µL and 100 µL
respectively of the cells were plated on LB Agar + Streptomycin. The next day,
colonies were picked from both plates and after an overnight growth, the
plasmids were isolated from the cells using the Miniprep kit. These plasmids
were run on a 1% agarose gel that is shown in Figure 6.17.b. The one isolated
from colony 2 showed the strongest band and was chosen for transformation into the expression host.
Figure 6.17 – Agarose Gels for Rb69 D132N RNase H Cloning
a
1- Supercoiled DNA ladder
2- Rb69 D132N rnh in pDEST-C1 - Mutagenesis PCR product
3- 1 kb DNA ladder
Ladder bands labeled on the gel: 10 kb, 5 kb, 2 kb, 5 kb, 1 kb
b
1- Supercoiled DNA ladder
2- Rb69 D132N RNase H in pDEST-C1 – colony 1
3- Rb69 D132N RNase H in pDEST-C1 – colony 2
4- Rb69 D132N RNase H in pDEST-C1 – colony 3
5- Rb69 D132N RNase H in pDEST-C1 – colony 4
6- Rb69 D132N RNase H in pDEST-C1 – colony 5
7- Rb69 D132N RNase H in pDEST-C1 – colony 6
Ladder bands labeled on the gel: 5 kb, 2 kb
6.3.3. Protein Expression and Solubility
A 1 µL sample of the pDEST-C1 + D132N rnh plasmid from colony 2 was transformed into 50 µL of competent T7 express cells. The cells were first plated on LB Agar + Streptomycin + Tetracycline, and one colony from each plate was picked and grown overnight in LB + Streptomycin + Tetracycline. The next day, protein expression was induced with 1 mM IPTG when the cells reached OD600 = 0.6. Glycerol stocks were also taken at that point. Samples for the 0 h and 3 h expression were run on an SDS-PAGE gel, presented in Figure 6.18.a. A protein is expressed around 40 kDa, which is consistent with Rb69 D132N RNase H with the N-terminal His-Tag.
Next, the cells were lysed to check for protein solubility. The lysis buffer
was composed of 50 mM Tris HCl pH 7.5, 200 mM NH4Cl, 10 mM MgCl2, 5 % glycerol, 0.03% PEI and 2 mM DTT. This is the same lysis buffer that was used for the Rb69 Native RNase H and yielded soluble protein. The pellet and supernatant samples after cell lysis are shown in Figure 6.18.b. As is always the case for RNase H, some protein was found insoluble in the pellet, but a decent amount was also soluble.
Figure 6.18 – Rb69 D132N RNase H Expression and Cell Lysis
a
1- Rb69 D132N RNase H expression (colony 1) – 0h sample
2- Rb69 D132N RNase H expression (colony 1) – 3h sample
3- Rb69 D132N RNase H expression (colony 2) – 0h sample
4- Rb69 D132N RNase H expression (colony 2) – 3h sample
5- Molecular Weight Marker
Molecular weight marker bands: 66.3 kDa, 55.4 kDa, 36.5 kDa, 31.0 kDa, 21.5 kDa
Rb69 D132N RNase H is expressed around 40 kDa, which is consistent with the calculated molecular weight of RNase H including the His-Tag.
b
1- Molecular Weight Marker
2- Rb69 D132N RNase H cell lysis – pellet
3- Rb69 D132N RNase H cell lysis – supernatant
Molecular weight marker bands: 66.3 kDa, 55.4 kDa, 36.5 kDa, 31.0 kDa, 21.5 kDa
Rb69 D132N RNase H is distributed between the pellet and the supernatant after the cell lysis, but a reasonable amount of protein is found soluble.
After protein expression and solubility were assessed, the pDEST-C1 +
D132N rnh plasmid from colony 2 that was used in the expression studies was
sent to the Plant-Microbe Genomics Facility at Ohio State University for DNA
sequencing. The results are detailed in Appendix 3. The plasmid was correctly
sequenced and the mutation in codon 132 was confirmed. Both the forward and
reverse primer reactions are shown, for the same reason that was explained in
Section 6.2.4.
6.3.4. Protein Purification
Rb69 D132N RNase H was purified according to the final purification
protocol that was designed for the native protein (see Section 6.2.5). The
different sets of buffers are presented in Table 6.12.
Table 6.12 – Lysis and HPLC buffers for Rb69 D132N RNase H purification
Lysis
Buffer: 50 mM Tris HCl pH 7.5, 200 mM NH4Cl, 10 mM MgCl2, 5% glycerol, 2 mM DTT, 0.03% PEI
Conductivity: ~ 18 mS/cm
Elution: /
Ion Exchange (SP Sepharose and POROS HS)
Buffers: 25 mM bis-Tris HCl pH 6.5, 100 mM NH4Cl, 10 mM MgCl2, 0 - 1 M NaCl
Conductivity: Buffer A: ~ 16 mS/cm, Buffer B: ~ 96 mS/cm
Elution: 100% A → 100% B
Size Exclusion (Superdex 75)
Buffer: 25 mM bis-Tris HCl pH 6.5, 150 mM NH4Cl, 10 mM MgCl2
Conductivity: /
Elution: /
The soluble portion after cell lysis was first loaded on the low resolution cation-exchange SP Sepharose. The majority of D132N RNase H was eluted during the salt gradient; however, a small amount of protein was also found in the flow through fraction. That flow through fraction was loaded onto the SP Sepharose column again, but no more RNase H bound to the resin, and the same amount was found in the flow through fraction once again. The chromatogram from the first run, as well as the SDS-PAGE gels for both runs, labeled SP Sepharose (1) and (2), are presented in Figure 6.19.a. Next, the eluted fractions from run 1 were pooled and loaded on the high resolution cation exchange Poros HS. D132N RNase H bound more strongly to the Poros resin, but the eluted protein was still contaminated with higher molecular weight proteins. The results from the Poros HS step are shown in Figure 6.19.b. Finally, the protein was further purified by size exclusion chromatography using Superdex 75 (see Figure 6.19.c). After this final purification step, D132N RNase H was sufficiently pure. The total amount of pure protein was 3 mg, which was concentrated to 6 mg/mL. However, the protein started precipitating upon concentration.
Figure 6.19 – Rb69 D132N RNase H purification
Figure 6.19.a – SP Sepharose
1- Molecular Weight Marker
2- SP Sepharose (1) – Load
3- SP Sepharose (1) – F. 10
4- SP Sepharose (1) – F. 14
5- SP Sepharose (1) – Flow Through
6- SP Sepharose (2) – Load
7- SP Sepharose (2) – F. 10
8- SP Sepharose (2) – Flow Through
9- Molecular Weight Marker
Molecular weight marker bands (both gels): 66.3 kDa, 55.4 kDa, 36.5 kDa, 31.0 kDa, 21.5 kDa
On the chromatogram, the OD260 is shown in purple, the OD280 in green, the conductivity in red and the % B in black. The red stars indicate which fractions were run on an SDS-PAGE gel. Here, fractions 12 to 20 contained RNase H and were pooled to be run on the Poros HS column. The pooled fractions are indicated with a red line on the chromatogram. Some of the RNase H did not bind to the resin and was found in the Flow Through fraction, which was reloaded onto the column. However, RNase H again did not bind to the resin.
Figure 6.19.b – Poros HS
1- Poros HS – Load
2- Poros HS – F. 12
3- Molecular Weight Marker
Molecular weight marker bands: 66.3 kDa, 55.4 kDa, 36.5 kDa, 31.0 kDa, 21.5 kDa, 14.4 kDa
For this step, D132N RNase H was found in fractions 8 to 17 after elution from the Poros HS. Some higher molecular weight contaminants were still present in the protein sample.
Figure 6.19.c – Superdex 75
1- Molecular Weight Marker
2- Superdex 75 – Load
3- Superdex 75 – F. 11
4- Superdex 75 – F. 13
5- Superdex 75 – F. 21
Molecular weight marker bands: 66.3 kDa, 55.4 kDa, 36.5 kDa, 31.0 kDa, 21.5 kDa
After the Superdex 75 run, the impurities were eluted in fraction 11 while D132N RNase H was present in fraction 13. Fractions 12 to 15 were pooled for concentration.
6.3.5. Cleaving of the His-Tag
The structure of the bacteriophage T4 RNase H shows that the N-terminus and the C-terminus of the protein are in close proximity to each other. Even though the structure of bacteriophage Rb69 RNase H is still unknown, it is expected to be closely related to that of T4 RNase H. The C-terminus of
RNase H is involved in the interaction with 32 protein, and the presence of an
N-terminal His-Tag might interfere with that interaction. Therefore, the N-terminal
His-Tag had to be cleaved off before the interaction with Rb69 32-B could be studied.
D132N RNase H and the TEV protease were dialyzed together in a buffer containing 50 mM Tris HCl pH 8.5 and 0.5 mM EDTA. It should be noted that as
RNase H was precipitating during the previous concentration step, the precipitate was spun down and only the supernatant was used in dialysis. The proteolysis reaction was then set up as described in Table 6.13, and left to incubate at room
temperature overnight, since the TEV protease is only active at room
temperature or higher.
Table 6.13 – TEV Protease Reaction Setup
TEV Protease Reaction
Component            Concentration   Volume    Amount
Rb69 D132N RNase H   0.27 mg/mL      2.2 mL    0.6 mg
TEV Protease         0.5 mg/mL       50 µL     0.05 mg
β-mercaptoethanol    7 M             2.5 µL    final concentration 1 mM
After the reaction, more precipitate was formed. It was pelleted down and
both the pellet and supernatant were run on an SDS-PAGE gel. The results from
the gel are presented in Figure 6.20. It appears that most of the D132N RNase H
was lost due to precipitation, as only 0.6 mg out of 3 mg were left. Two bands
can be seen around 35 kDa for the supernatant sample where the cleaved
D132N RNase H is expected, indicating that the His-Tag might actually have been partially cleaved. Another strong band is seen around 70 kDa, which could be an RNase H dimer. However, the amount of RNase H left in solution at that point was too low to allow for any further experiment with that batch of protein.
Figure 6.20 – TEV Protease Reaction Results
1- Molecular Weight Marker
2- D132N RNase H after Superdex 75
3- TEV Protease reaction pellet
4- TEV Protease reaction supernatant
Molecular weight marker bands: 66.3 kDa, 55.4 kDa, 36.5 kDa, 31.0 kDa, 21.5 kDa, 14.4 kDa
6.4. Bacteriophage Rb69 32-B and Future Work
The objective in cloning, expressing and purifying Rb69 RNase H was to study the interaction with Rb69 32-B protein. The 32-B protein was cloned by Dr.
Juliette M. Devos in the Mueser lab. The truncated 32 gene was inserted in the
pDEST-C1 vector, but no TEV protease cleavage site was inserted between the
N-terminal HisTag and the protein. The plasmid was then transformed into the T7 express cell line, and soluble protein was expressed under the same conditions that were used for Rb69 RNase H. However, the protein purification proved to be considerably more challenging than for the T4 32-B protein, and despite several attempts, pure Rb69 32-B could not be obtained in large enough quantities for further studies.
Concerning Rb69 RNase H, the yields were rather low, even while using
the D132N mutant, which also had solubility issues. In the future, one should try
to resolve this solubility problem by using the solubility screen in order to find an
optimal buffer that would enhance the solubility of the protein.
When all the issues of purification and solubility for both Rb69 RNase H
and 32-B have been solved, it would be interesting to apply the strategy that was used for the T4 proteins to the Rb69 complex. If crystals can be obtained, they
might diffract better due to different surface residues that lead to better packing
of the proteins, and provide a clearer picture of the interaction between RNase H
and the 32 protein at the T4-like phage replication fork.
CHAPTER 7 - Escherichia coli DNA-Binding
Protein from Starved Cells
7.1. Introduction
The E. coli Dps protein (DNA-binding protein from starved cells) is found in E. coli cells entering stationary phase. Even though the mechanism by which this protein binds DNA is not clear, it appears to protect DNA from oxidative damage and to induce compaction of the genome (see Section 1.2 for more details).
This chapter is divided in two main sections, the previous work that was accomplished by other members of the lab, and the current work that follows up on their results.
7.2. Previous Work
The Dps project was initiated by Brandon K. Collins and Stephen J. Tomanicek,
former students in the lab. The work they accomplished is described in this
particular section and can be found in Stephen Tomanicek’s Ph.D. dissertation
(Tomanicek, 2005).
7.2.1. Expression and Purification
Dps was found as an overexpressed impurity in preparations of recombinant archaeal FEN-1 proteins. FEN-1 was expressed in BL21(DE3) cells and protein production was induced with IPTG at 37 °C. The molecular weight of
Dps was found to be around 19 kDa on SDS-PAGE gel.
Dps could be purified away from FEN-1 by size exclusion chromatography on a Superdex 75 column, and was found in the void fraction, meaning its actual size is much larger than 19 kDa.
It should be noted that Dps was obtained from a few batches of FEN-1 protein, notably several Aeropyrum pernix (Ape) FEN-1, Archaeoglobus fulgidus (Afu) FEN-1 and Thermococcus zilligii (Tzi) FEN-1 samples. All the work described in this section was done with early batches of Dps purified from
Ape/Tzi FEN-1 by Brandon Collins and will later on be referred to as (BKC
Ape/Tzi FEN-1) Dps. Dps purified by Stephen Tomanicek will be referred to as
(SJT Ape/Ave FEN-1) Dps.
7.2.2. Characterization
Dps was identified as such using N-terminal sequencing (Midwest
Analytical Inc., St Louis, MO) and MALDI-TOF-TOF Mass Spectrometry (The
Michigan Proteome Consortium, U. of M. Medical School, Ann Arbor, MI), with a
100 % confidence score.
It is important to note that the N-terminal sequencing showed that the
endogenous Dps expressed from E. coli is truncated, missing the first nine
amino-acids, as shown in Figure 7.1.
Figure 7.1 – Amino-Acid Sequence of E. coli Dps
1 MSTAKLVKSK ATNLLYTRND VSDSEKKATV ELLNRQVIQF IDLSLITKQA HWNMRGANFI
61 AVHEMLDGFR TALIDHLDTM AERAVQLGGV ALGTTQVINS KTPLKSYPLD IHNVQDHLKE
121 LADRYAIVAN DVRKAIGEAK DDDTADILTA ASRDLDKFLW FIESNIE
The nine missing residues are highlighted in yellow.
The protein characteristics were calculated using that sequence, with the
ExPASy website (Gill and von Hippel, 1989; Gasteiger et al., 2003). They are
summarized in Table 7.1.
Table 7.1 – Dps Characteristics
Dps
Amino-acids 158
Molecular Weight 17.7 kDa
pI 5.33
ε 0.87
The molecular weight of the truncated Dps was calculated to be 17.7 kDa.
However, Dynamic Light Scattering (DLS) measurements show that the molecular weight is 141.6 kDa, suggesting that Dps is in fact a heptamer or an octamer in solution. This result is consistent with the fact that Dps was found in the void fraction during size exclusion chromatography.
7.2.3. X-Ray Diffraction Studies
Dps crystals were obtained at 21 °C, using the hanging drop vapor diffusion method, in drops containing 2 µL of the reservoir solution and 2 µL of the protein solution at 19 mg/mL. Examples of the crystals obtained and the conditions they were grown in are shown in Figure 7.2.
Figure 7.2 – Crystals of (BKC Tzi FEN-1) Dps and (BKC Ape FEN-1) Dps
a b
a - (BKC Tzi FEN-1) Dps crystal grown in ~ 17 % PEG 400, 100 mM Na HEPES pH 7.5 and 200 mM MgCl2.
b - (BKC Ape FEN-1) Dps crystals grown in ~ 14 % PEG 1000, 100 mM Na Cacodylate pH 6.5 and 200 mM MgCl2.
The crystals shown in Figure 7.2.a were flash frozen in liquid Nitrogen with
30 % PEG 400 as a cryoprotectant, while the crystals shown in Figure 7.2.b were cryoprotected using either 30 % PEG 400 or 25 % Ethylene Glycol and then flash frozen in liquid Nitrogen.
The crystals were screened for X-Ray diffraction and four datasets were
collected on (BKC Tzi FEN-1) Dps crystals. No dataset was collected on the
(BKC Ape FEN-1) Dps crystals as they were not diffraction quality crystals.
The datasets were processed, and the space groups I222 or I212121 were found to have the highest symmetry and the lowest Rmerge values, around 6.1 %.
However, phasing with Molecular Replacement (PDB entry of the search
model: 1DPS, (Grant et al., 1998)) was unsuccessful and the datasets had to be
reprocessed as P1. Eight Dps molecules were then found in the asymmetric unit,
but the solution could not be refined.
7.2.4. Discussion and Future Work
The fact that Molecular Replacement, using the full-length Dps structure
as the search model, did not yield any solution could mean that the truncated
Dps may have a different and unique structure. More crystals of this protein
should be grown and more data collected, in order to solve the structure and see how different it is from the full-length Dps structure. Moreover, it could be interesting to see if the Dps truncation also forms a spherical dodecamer.
7.3. Project Follow-up
Different samples of purified Dps were obtained from Stephen Tomanicek, namely (BKC Ave FEN-1) Dps, (BKC Ape FEN-1) Dps and (SJT Ape FEN-1) Dps. That last protein was used initially, as it was present in greater quantity.
The protein was dialyzed in 50 mM bis-Tris HCl pH 6.5, 100 mM NH4Cl
and 10 mM MgCl2. A hanging drop vapor diffusion expansion was set up at room
temperature using a coarse gradient in a 4x6 format, with the condition previously stated in Section 7.2.3 that gave the best crystals, that is 10-30 % PEG 400, 100 mM Na HEPES pH 7.5 and 200 mM MgCl2. Various amounts of
ethylene glycol (0, 5, 15 and 25 %) were added to each row for cryoprotection.
Two drops were set up on the cover slide, containing 2 µL of the well solution
and 2 µL of the protein at either 15 mg/mL or 18.5 mg/mL. Surprisingly, this
experiment was unsuccessful and no crystals were obtained.
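The layout of such a 4x6 expansion can be sketched in a few lines of Python. The text gives only the PEG 400 range (10-30 %) and the four ethylene glycol concentrations; the even spacing of PEG 400 across the six columns and the assignment of ethylene glycol to rows are assumptions made here for illustration, as are the names used.

```python
def linear_gradient(start, stop, steps):
    """Return `steps` evenly spaced values from start to stop, inclusive."""
    step = (stop - start) / (steps - 1)
    return [start + i * step for i in range(steps)]


# Assumption: PEG 400 varies evenly across the six columns.
peg400_percent = linear_gradient(10.0, 30.0, 6)

# Ethylene glycol concentrations stated in the text, one per row.
ethylene_glycol_percent = [0, 5, 15, 25]

# Each well is a (PEG 400 %, ethylene glycol %) pair; the buffer
# (100 mM Na HEPES pH 7.5) and salt (200 mM MgCl2) are constant.
tray = [
    [(peg, eg) for peg in peg400_percent]
    for eg in ethylene_glycol_percent
]

for row in tray:
    print("  ".join(f"PEG {peg:4.1f}% / EG {eg:2d}%" for peg, eg in row))
```

Under these assumptions the PEG 400 concentrations are 10, 14, 18, 22, 26 and 30 %, giving 24 distinct wells.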
Since the protein used in that crystallization experiment had been kept at
4 °C for a long time, there was some concern that it might have degraded over
time. It was run, along with the other two samples ((BKC Ave FEN-1) Dps, (BKC Ape FEN-1) Dps), on an SDS-PAGE gel, which is shown in Figure 7.3.
Figure 7.3 – SDS-PAGE Gel of the Truncated Dps Samples
1- Molecular Weight Marker
2- (SJT Ape FEN-1) Dps
3- (BKC Ape FEN-1) Dps
4- (BKC Ave FEN-1) Dps
Molecular weight marker bands: 66.3 kDa, 55.4 kDa, 36.5 kDa, 31.0 kDa, 21.5 kDa, 14.4 kDa
As can be seen on the SDS-PAGE gel, (SJT Ape FEN-1) Dps, which was used for the crystallization trials, has a molecular weight of around 20 kDa, whereas both BKC Dps samples have a molecular weight of about 18 kDa. This difference in size might explain why no crystals were obtained, even though the crystallization condition that was used was known to yield crystals.
The Dps that was sequenced as missing nine amino-acids at the N-terminus was one of the BKC proteins, and its molecular weight on the SDS-PAGE gel is consistent with the calculated value of 17.7 kDa. It is possible that the (SJT Ape FEN-1) Dps, which was purified from a different batch of FEN-1 protein, is the full-length Dps (the molecular weight calculated with the ExPASy ProtParam tool is 18.7 kDa) or a truncation missing fewer than nine amino-acids.
The next section will deal with the characterization of (SJT Ape FEN-1)
Dps. However, the focus should remain on the known truncated Dps proteins for the crystallization studies, since the structure of the full-length Dps is already known.
7.3.1. Further Characterization of (SJT Ape FEN-1) Dps
Dynamic Light Scattering
In order to determine if (SJT Ape FEN-1) Dps in solution behaves like the truncated Dps, a DLS experiment was performed at Room Temperature. The protein was dialyzed in 50 mM Bis-Tris HCl pH 6.5, 100 mM NH4Cl and 10 mM
MgCl2, and filtered using a 0.45 µm pore size Ultrafree-MC filter unit from
Millipore. The Dps sample used in the experiment had a concentration of 1.08 mg/mL.
The results of the experiment are shown in Figure 7.4 and Table 7.2. The
protein sample is homogeneous, with a polydispersity of 16.5%. Some trace
amounts of aggregates can be seen, but they do not account for any of the mass.
The molecular weight was calculated to be about 134 kDa, which is again
consistent with Dps being a heptamer or an octamer in solution.
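The oligomeric state implied by the DLS measurements follows from dividing the measured molecular weight by the monomer molecular weight. The Python sketch below applies this to the values quoted in this chapter (141.6 kDa and 134 kDa from DLS, 17.7 kDa for the truncated monomer, 18.7 kDa for the full-length monomer). It is a back-of-the-envelope estimate only, since DLS molecular weights are derived from the hydrodynamic radius under a globular-protein assumption; the names used are illustrative.

```python
def subunit_count(complex_kda, monomer_kda):
    """Estimate the number of subunits from complex and monomer masses."""
    return complex_kda / monomer_kda


estimates = {
    "previous DLS, truncated monomer": subunit_count(141.6, 17.7),
    "SJT sample, truncated monomer": subunit_count(134.0, 17.7),
    "SJT sample, full-length monomer": subunit_count(134.0, 18.7),
}

for label, n in estimates.items():
    print(f"{label}: {n:.1f} subunits")

# For comparison, the expected mass of a full-length dodecamer.
dodecamer_kda = 12 * 18.7
print(f"Full-length dodecamer: {dodecamer_kda:.1f} kDa")
```

The ratios fall between 7 and 8 in all three cases, which is why a heptamer or an octamer is proposed, and well below the value of 12 expected for a dodecamer.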
Figure 7.4 – DLS Results for (SJT Ape FEN-1) Dps
DLS size distribution plot, with Peaks 1, 2 and 3 labeled.
Table 7.2 – DLS Results for (SJT Ape FEN-1) Dps
Peak   Rh (nm)   % Pd   MW (kDa)    % Intensity   % Mass
1      4.828     16.5   134         84.9          100
2      43.75     0      23,270      3.7           0
3      43817     0      2.44 E 11   11.4          0
Mass Spectrometry
The SDS-PAGE gel shown in Figure 7.3 was sent to the Michigan
Proteome Consortium at the University of Michigan Medical School (Ann Arbor,
MI) for trypsin digestion and MALDI-TOF Mass Spectrometry analysis. The
objective of this experiment was to determine if the (SJT Ape FEN-1) Dps protein is also an N-terminal truncation missing nine residues, as was previously determined for the (BKC Ape FEN-1) Dps. The results are shown in Figure 7.5.
Figure 7.5 – MALDI-TOF Mass Spectrometry Results
a – (SJT Ape FEN-1) Dps, Protein Score: 121, 100 %
1 MSTAKLVKSK ATNLLYTRND VSDSEKKATV ELLNRQVIQF IDLSLITKQA HWNMRGANFI
61 AVHEMLDGFR TALIDHLDTM AERAVQLGGV ALGTTQVINS KTPLKSYPLD IHNVQDHLKE
121 LADRYAIVAN DVRKAIGEAK DDDTADILTA ASRDLDKFLW FIESNIE
b – (BKC Ape FEN-1) Dps, Protein Score: 121, 100 %
1 MSTAKLVKSK ATNLLYTRND VSDSEKKATV ELLNRQVIQF IDLSLITKQA HWNMRGANFI
61 AVHEMLDGFR TALIDHLDTM AERAVQLGGV ALGTTQVINS KTPLKSYPLD IHNVQDHLKE
121 LADRYAIVAN DVRKAIGEAK DDDTADILTA ASRDLDKFLW FIESNIE
c – (BKC Ave FEN-1) Dps, Protein Score: 125, 100 %
1 MSTAKLVKSK ATNLLYTRND VSDSEKKATV ELLNRQVIQF IDLSLITKQA HWNMRGANFI
61 AVHEMLDGFR TALIDHLDTM AERAVQLGGV ALGTTQVINS KTPLKSYPLD IHNVQDHLKE
121 LADRYAIVAN DVRKAIGEAK DDDTADILTA ASRDLDKFLW FIESNIE
The sets of peptides corresponding to E. coli Dps with the highest protein score were chosen. a and b were labeled as “DNA protection during starvation condition (E. coli CFT073)”, and c as “PexB”. The missing amino acids at the N-terminus are highlighted in gray. Glutamine residues cyclized into pyroglutamate are shown in blue, and oxidized methionine residues are in green. The peptides identified by the MS experiment are shown in red. The database used for peptide matching was NCBInr.
The MALDI-TOF experiment identified (SJT Ape FEN-1) Dps as E. coli
Dps. The first ten amino acids are missing from the peptide list, which suggests that (SJT Ape FEN-1) Dps is missing its N-terminus in the same way as (BKC Ape
FEN-1) Dps. However, trypsin, which was used in this experiment, cleaves after each lysine and arginine residue unless that residue is followed by a proline. The
N-terminus of E. coli Dps contains three lysines, at positions 5, 8 and 10. Trypsin
would therefore generate the following three peptides from the N-terminus: MSTAK,
LVK and SK, with molecular weights of about 537 Da, 358 Da and 233 Da,
respectively. Peptides this small would not have been detected by the
experiment, so their absence from the peptide list does not by itself demonstrate a truncation. The solution would be to treat the (SJT Ape FEN-1) Dps
sample with a different protease and run the MS experiment again.
Unfortunately, that sample was no longer available and the experiment could
not be done.
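The trypsin argument above can be sketched as a short in silico digest. The cleavage rule is the one stated in the text; the residue masses are standard average values, each peptide mass is computed as the sum of its residue masses plus one water, and the function names are hypothetical. The input is the first 18 residues of the Dps sequence listed in Figure 7.5.

```python
# Average residue masses in Da (standard values, rounded to two decimals).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water per free peptide (N-terminal H plus C-terminal OH)


def trypsin_digest(sequence):
    """Cleave after every K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder, if any
    return peptides


def peptide_mass(peptide):
    """Average mass of the neutral peptide in Da."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


n_terminus = "MSTAKLVKSKATNLLYTR"  # residues 1 to 18 from Figure 7.5
peptides = trypsin_digest(n_terminus)

for pep in peptides:
    print(f"{pep:10s} {peptide_mass(pep):7.1f} Da")
```

With these assumed residue masses the three N-terminal peptides come out at roughly 537, 358 and 233 Da, while the next peptide (ATNLLYTR) is close to 951 Da. The singly protonated ions observed by MALDI would be about 1 Da heavier than these neutral masses.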
7.3.2. X-Ray Diffraction Studies
The three samples previously mentioned ((SJT Ape FEN-1) Dps, (BKC
Ape FEN-1) Dps and (BKC Ave FEN-1) Dps) were dialyzed in the following
buffer: 50 mM bis-Tris HCl pH 6.5, 100 mM NH4Cl, 150 mM NaCl and 10 mM
MgCl2. Each sample was then concentrated and filtered with a Millipore 0.45 µm
pore size Ultrafree-MC filter unit. A coarse gradient expansion, using the 2 + 2
hanging drop vapor diffusion method, was set up in 24-well Costar trays at room temperature. This time, the two conditions that previously gave crystals were
used in a 2x12 setup (see Figure 8.2). The best crystals were obtained for (BKC
Ape FEN-1) Dps at 21.2 mg/mL, in 11.8-15.5 % PEG 1000, 100 mM Na
Cacodylate pH 6.5 and 200 mM MgCl2, and are shown in Figure 7.6.
Figure 7.6 – (BKC Ape FEN-1) Dps Crystals
a – 11.8 % PEG 1000, 100 mM Na Cacodylate pH 6.5 and 200 mM MgCl2
b – 13.6 % PEG 1000, 100 mM Na Cacodylate pH 6.5 and 200 mM MgCl2
c – 15.5 % PEG 1000, 100 mM Na Cacodylate pH 6.5 and 200 mM MgCl2
Other crystals were obtained for the other two proteins at around
20 mg/mL, but they were too small for diffraction.
The crystals shown in Figure 7.6.b and 7.6.c were soaked in a substitute
mother liquor containing 50 mM bis-Tris HCl pH 6.5, 100 mM NH4Cl, 150 mM
NaCl, 13.6 or 15.5 % PEG 1000 respectively, 100 mM Na Cacodylate pH 6.5 and
210 mM MgCl2, before 25 % Ethylene Glycol was added for cryoprotection. The
crystals were then flash frozen in liquid nitrogen and screened for X-Ray
diffraction.
One dataset was collected on the in-house Rigaku FR-E diffractometer
(the Ohio Crystallography Consortium at the University of Toledo, Toledo, OH,
USA), using a wavelength of 1.54 Å. The crystal diffracted to a resolution of
3.0 Å. A total of 220 images were collected at 0.5 degree intervals, with an
oscillation of 0.5 degree and an exposure time of 15 s per image. The distance to the
detector was set at 80 mm. Some of the images collected are shown in Figure
7.7.
Figure 7.7 – X-Ray Diffraction Images of the (BKC Ape FEN-1) Dps Crystals
a – Image 1 at 0°, b – Image 90 at 45°, c – Image 180 at 90°
The dataset was indexed, integrated and scaled using the HKL2000 software (Minor et al., 2002). A summary of the data processing with the different space groups that were used is shown in Table 7.3.
Table 7.3 – (BKC Ape FEN-1) Dps Data Processing Summary
Space Group                       P1 (1)          C2 (5)          F222 (22)       I222 (23)       I4 (79)
Resolution after scaling          20 to 3.5 Å     20 to 3.5 Å     20 to 3.5 Å     20 to 3.5 Å     20 to 3.5 Å
Unit cell a (Å), α (°)            85.04, 116.9    145.0, 90.0     114.19, 90.0    88.73, 90.0     88.78, 90.0
Unit cell b (Å), β (°)            85.00, 95.32    88.76, 127.8    125.48, 90.0    88.97, 90.0     88.78, 90.0
Unit cell c (Å), γ (°)            85.09, 117.1    89.00, 90.0     125.53, 90.0    114.5, 90.0     114.3, 90.0
Rmerge *                          8.0 %           10.0 %          48.2 %          10.7 %          46.3 %
Mosaicity                         0.945           0.913           0.968           0.964           0.955
Number of reflections             16,474          9,753           5,775           6,052           5,618
                                  (25,896)        (26,159)        (25,248)        (26,212)        (25,763)
                                  13 / atom       8 / atom        4.6 / atom      5 / atom        4.5 / atom
Completeness                      74.3 %          84.2 %          99.9 %          98.2 %          99.9 %
# molecules / asymmetric unit     8 to 12         5 to 6          2 to 3          2 to 3          2 to 3
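* $R_{merge} = 100 \times \left[ \sum_{hkl} \sum_{i=1}^{n} \left| F^{2}_{hkl} - \langle F^{2}_{hkl} \rangle_{i} \right| \right] / \sum_{hkl} \sum_{i=1}^{n} F^{2}_{hkl}$    (Equation 7.1)
where $F^{2}_{hkl}$ is the intensity of each hkl reflection and $\langle F^{2}_{hkl} \rangle_{i}$ is the mean value of i measurements of n equivalent reflections.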
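The Rmerge statistic of Equation 7.1 can be sketched numerically as follows. This is a minimal illustration, not the HKL2000 implementation: the function name is hypothetical and the intensities are invented toy values, not data from this work.

```python
from statistics import mean


def r_merge(observations):
    """Return Rmerge in percent.

    `observations` maps each unique reflection (h, k, l) to the list of
    intensities measured for its symmetry-equivalent reflections.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        average = mean(intensities)
        # Sum of absolute deviations from the mean of the equivalents.
        numerator += sum(abs(value - average) for value in intensities)
        # Sum of all individual intensities.
        denominator += sum(intensities)
    return 100.0 * numerator / denominator


# HYPOTHETICAL intensities, for illustration only.
toy_data = {
    (1, 0, 0): [100.0, 110.0, 90.0],
    (0, 2, 0): [50.0, 50.0],
    (1, 1, 2): [200.0, 180.0, 220.0, 200.0],
}

toy_r_merge = r_merge(toy_data)
print(f"Rmerge = {toy_r_merge:.1f} %")
```

When a space group of too high symmetry is imposed, reflections that are not truly equivalent are averaged together, the deviations in the numerator grow, and Rmerge rises sharply. This is the behavior the space group comparison in Table 7.3 relies on.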
The Rmerge values indicate that the actual space group is likely I222: it is the highest-symmetry choice that retains a low Rmerge (10.7 %), whereas F222 and I4 give values above 45 %. That space group choice was confirmed using Pointless in the CCP4 program suite.
The Matthews cell content analysis program used next predicted three molecules in the asymmetric unit (Matthews, 1968), with a Matthews coefficient of 2.13 Å3/Da and a solvent content of 42.24 %. Phasing using Molecular Replacement was then done on the I222 data, as well as on the P1 data. The program used for
Molecular Replacement was MolRep, which is part of the CCP4 program suite
(Bailey, 1994), and the search model was the E. coli full length Dps crystal structure (pdb: 1DPS, (Grant et al., 1998)). A summary of the Molecular
Replacement results can be found in Table 7.4.
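The Matthews cell content estimate quoted above can be reproduced approximately with the sketch below. The unit cell is the I222 cell from Table 7.3. The monomer mass of 17.7 kDa is an assumed, illustrative value that is not quoted in this section, the factor of 1.23 is the usual approximation for proteins, and the function names are hypothetical.

```python
# I222 unit cell from Table 7.3 (orthorhombic, all angles 90 degrees).
a, b, c = 88.73, 88.97, 114.5       # cell edges in angstroms
cell_volume = a * b * c             # valid only for orthogonal cells

symmetry_operators = 8              # general positions in I222 (4 rotations x 2 for I-centering)
molecules_per_asu = 3               # number of molecules predicted in the text
monomer_mass_da = 17_700.0          # ASSUMED monomer mass in Da, for illustration


def matthews_coefficient(volume, n_sym, n_mol, mass_da):
    """V_M in cubic angstroms per dalton."""
    return volume / (n_sym * n_mol * mass_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction from V_M, using the 1.23 constant."""
    return 1.0 - 1.23 / v_m


v_m = matthews_coefficient(
    cell_volume, symmetry_operators, molecules_per_asu, monomer_mass_da
)
solvent = solvent_fraction(v_m)
print(f"V_M = {v_m:.2f} A^3/Da, solvent content = {100 * solvent:.1f} %")
```

With the assumed mass, the sketch gives a Matthews coefficient close to 2.13 Å3/Da and a solvent content close to 42 %, in line with the values reported by the program.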
Table 7.4 – (BKC Ape FEN-1) Dps MolRep Molecular Replacement Summary
P1 (1)
Search                          Result
1 dodecamer                     Rfactor = 56 %, CC = 31.4 %
12 monomers                     Rfactor = 59 %, CC = 26 % (11 monomers found)
6 monomers                      Rfactor = 59 %, CC = 26 % (11 monomers found?)
dimer                           Rfactor = 59.3 %, CC = 22 %

I222 (23)
Search                          Result
12 monomers                     No solution found
3 monomers                      Rfactor = 42.8 %, CC = 59.7 %
3 monomers with I212121 (24)    Rfactor ~ 60 %, CC ~ 40 %
P1 Molecular Replacement: The high R and low correlation values suggest that P1 is not the right space group, and no satisfactory solution was found after Molecular Replacement.
I222 Molecular Replacement: No solution was found when 12 Dps monomers were searched for. However, when 3 Dps monomers were searched for, the lower R and higher correlation values indicate that a possible solution was found. The same search with the dataset processed as I212121 did not yield any acceptable solution.
The I222 Molecular Replacement solution was refined using the CCP4 refinement program REFMAC in the restrained mode (Bailey, 1994; Murshudov,
1997). The R value was calculated to be 29.6 % and the Rfree 41.8 %.
Since the refinement results after molecular replacement with MolRep
were not very good, another molecular replacement program, namely Phaser,
was used, in order to compare the two molecular replacement solutions. Three
molecules of 1DPS were searched for. The results from Phaser are shown in
Table 7.5.
Table 7.5 - (BKC Ape FEN-1) Dps Phaser Molecular Replacement Summary
Score                         Search Model (1DPS × 3)
Rotation Function Score       7.1      8.1      9.0
Translation Function Score    11.6     23.0     31.1
Packing                       0        0        0
Log-Likelihood Gain           127      490      1222
The log-likelihood gain (LLG) indicates how well the data agree with the model; a good molecular replacement solution will therefore have a high LLG score. The rotation (RFZ) and translation (TFZ) Z scores are then calculated from the LLG. An RFZ score higher than 5 and a TFZ score higher than 8 indicate that a solution was found. The packing value is an indication of clashes that may have been found by the program.
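The acceptance criteria described above can be applied to the scores of Table 7.5 with a short check. This sketch assumes that the three columns of the table correspond to the three copies of the search model placed in turn; the thresholds are the rules of thumb quoted in the text, and the function name is hypothetical.

```python
# Scores from Table 7.5, assumed to be one entry per placed copy of 1DPS.
rotation_z = [7.1, 8.1, 9.0]
translation_z = [11.6, 23.0, 31.1]
packing_clashes = [0, 0, 0]
llg = [127, 490, 1222]

RFZ_THRESHOLD = 5.0  # rule of thumb quoted in the text
TFZ_THRESHOLD = 8.0  # rule of thumb quoted in the text


def looks_like_solution(rfz, tfz, clashes):
    """True when both Z scores exceed their thresholds and no clash is reported."""
    return rfz > RFZ_THRESHOLD and tfz > TFZ_THRESHOLD and clashes == 0


verdicts = [
    looks_like_solution(rfz, tfz, clashes)
    for rfz, tfz, clashes in zip(rotation_z, translation_z, packing_clashes)
]

# The LLG should keep rising as each additional copy is placed.
llg_increases = all(later > earlier for earlier, later in zip(llg, llg[1:]))

print(verdicts, llg_increases)
```

All three entries pass both thresholds without clashes, and the LLG increases at every step, which is what would be expected when each added copy improves the agreement between the model and the data.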
The Phaser scores looked good, and after a number of restrained
refinement cycles, the R value was calculated to be 18.3 % and the Rfree 31.1 %.
The R value is a little low relative to the Rfree, but this could most likely be fixed with a few cycles of building and refinement.
The two solutions, from MolRep and from Phaser, are shown below in
Figure 7.8. Even though these solutions look different, the space group after processing was I222; once symmetry molecules are generated with the I222 symmetry operators, the two models are actually the same.
Figure 7.8 – MolRep vs. Phaser Solutions
MolRep Phaser
The Phaser solution will be used for the rest of the model description, as the Molecular Replacement and refinement statistics were better.
After generation of the symmetry molecules, a spherical dodecamer was
obtained. It is presented in Figure 7.9. Surface rendering shows a hollow
sphere (Figure 7.9.c). The dimensions of the sphere
are as follows: 85 to 90 Å for the outside diameter and 45 Å for the hollow core
diameter. These dimensions are very close to the ones reported by Grant et al.
(1998), which were 90 and 45 Å for the outside and the hollow
core, respectively.
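The way three monomers in the asymmetric unit expand into a dodecamer can be sketched by applying the I222 symmetry operators to fractional coordinates. The sketch assumes that the dodecamer is centred on a 222 site at the origin, which is an inference from the count of 3 monomers × 4 rotations = 12 rather than something stated in the text. The monomer positions are hypothetical, and the function names are illustrative.

```python
# Rotational part of the I222 general positions (point group 222), acting on
# fractional coordinates. The I-centering adds a (1/2, 1/2, 1/2) translation.
ROTATIONS_222 = [
    lambda x, y, z: (x, y, z),
    lambda x, y, z: (-x, -y, z),
    lambda x, y, z: (-x, y, -z),
    lambda x, y, z: (x, -y, -z),
]


def expand_around_222_site(points):
    """Apply the four 222 operators to every point (no centering translation)."""
    return [op(*p) for p in points for op in ROTATIONS_222]


def expand_full_cell(points):
    """Apply all eight I222 operators and wrap the images into the unit cell."""
    images = []
    for x, y, z in expand_around_222_site(points):
        for shift in (0.0, 0.5):
            images.append(tuple((coord + shift) % 1.0 for coord in (x, y, z)))
    return images


# HYPOTHETICAL centres of mass for the three monomers of the asymmetric unit.
asu_monomers = [(0.10, 0.05, 0.12), (0.04, 0.13, 0.09), (0.12, 0.10, 0.03)]

cluster = expand_around_222_site(asu_monomers)   # one particle around the origin
cell_copies = expand_full_cell(asu_monomers)     # every copy in the unit cell

print(len(set(cluster)), len(set(cell_copies)))
```

In this sketch the four rotations alone generate twelve positions around the origin, and the body-centering translation doubles that to twenty-four positions per unit cell, that is, two particles of twelve subunits each.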
Figure 7.9 – Final Dps Model from the (BKC Ape FEN-1) Dps Crystals
a – Three monomers of Dps found after Molecular Replacement
b – Spherical dodecameric structure of Dps (dimension labels in the figure: 85 Å and 45 Å)
c – Surface rendering of a half-sphere of Dps dodecamers
The (BKC Ape FEN-1) Dps dodecameric structure was superimposed onto the Dps dodecamer reported earlier (Grant et al., 1998). This is shown in
Figure 7.10 below. It can be seen easily that the two structures superimpose perfectly.
Figure 7.10 – Final Dps Model from the (BKC Ape FEN-1) Dps Crystals
The Dps dodecamer model (Grant et al., 1998) is shown in gray and the (BKC Ape FEN-1) Dps model in orange.
It appears that the truncation of the N-terminus of Dps does not affect the
interaction of the proteins forming the dodecameric structure. The reason the
earlier dataset could not be phased must therefore be the poor quality of the
X-Ray diffraction data. This is another piece of evidence that the N-terminus of Dps
is not involved in hollow sphere formation and aggregation leading to genome
compaction, but more likely in DNA binding, as has been suggested before
(Ceci et al., 2004).
7.4. Conclusion
An N-terminal truncation of the E. coli Dps protein was characterized.
Biophysical studies, such as Dynamic Light Scattering and Mass Spectrometry, were done but did not provide conclusive results. On the other hand, a
crystal structure of the protein was obtained. The Dps truncation adopts the same overall dodecameric form as the full length Dps does. This result indicates that the N-terminus of Dps, which is lysine rich, is not involved in cooperative binding but more likely in DNA binding, a result that is consistent with some of the evidence that has been published by other groups.
BIBLIOGRAPHY
Alberts, B. M., Frey, L. (1970) T4 bacteriophage gene 32: a structural protein in the replication and recombination of DNA. Nature, 227 1313-18.
Azam, T. A. and Ishihama, A. (1999) Twelve species of the nucleoid-associated protein from Escherichia coli. Sequence recognition specificity and DNA binding affinity. J Biol Chem, 274 (46), 33105-13.
Bailey, S. (1994) The CCP4 suite: programs for protein crystallography. Acta Crystallogr D Biol Crystallogr, D50 (5), 760-3.
Bhagwat, M., Hobbs, L. J. and Nossal, N. G. (1997) The 5'-exonuclease activity of bacteriophage T4 RNase H is stimulated by the T4 gene 32 single-stranded DNA-binding protein, but its flap endonuclease is inhibited. J Biol Chem, 272 (45), 28523-30.
Bhagwat, M., Meara, D. and Nossal, N. G. (1997) Identification of residues of T4 RNase H required for catalysis and DNA binding. J Biol Chem, 272 (45), 28531-8.
Bhagwat, M. and Nossal, N. G. (2001) Bacteriophage T4 RNase H removes both RNA primers and adjacent DNA from the 5' end of lagging strand fragments. J Biol Chem, 276 (30), 28516-24.
Bradford, M. M. (1976) A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal Biochem, 72 248-54.
Casas-Finet, J. R., Fischer, K. R. and Karpel, R. L. (1992) Structural basis for the nucleic acid binding cooperativity of bacteriophage T4 gene 32 protein: the (Lys/Arg)3(Ser/Thr)2 (LAST) motif. Proc Natl Acad Sci U S A, 89 (3), 1050-4.
Ceci, P., Cellai, S., Falvo, E., Rivetti, C., Rossi, G. L. and Chiancone, E. (2004) DNA condensation and self-aggregation of Escherichia coli Dps are coupled phenomena related to the properties of the N-terminus. Nucleic Acids Res, 32 (19), 5935-44.
Chastain, P., Makhov, A. M., Nossal, N. G., Griffith, J. D. (2003) Architecture of the replication complex and DNA loops at the fork generated by the bacteriophage T4 proteins. J Biol Chem, 278 21276-21285.
Collins, B. K., Tomanicek, S. J., Lyamicheva, N., Kaiser, M. W. and Mueser, T. C. (2004) A preliminary solubility screen used to improve crystallization trials: crystallization and preliminary X-ray structure determination of Aeropyrum pernix flap endonuclease-1. Acta Crystallogr D Biol Crystallogr, 60 (Pt 9), 1674-8.
DeLano, W. L. and Lam, J. W. (2005) PyMOL: A communications tool for computational models. 230th ACS National Meeting, Washington, DC, United States.
Devos, J. M., Tomanicek, S. J., Jones, C. E., Nossal, N. G. and Mueser, T. C. (2007) Crystal structure of bacteriophage T4 5' nuclease in complex with a branched DNA reveals how flap endonuclease-1 family nucleases bind their substrates. J Biol Chem, 282 (43), 31713-24.
Dwlgosh, J. M. (2008). The study of protein-protein interactions involved in lagging strand DNA replication and repair, the University of Toledo. Ph.D. Dissertation.
Emsley, P. and Cowtan, K. (2004) Coot: Model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr, D60 2126-32.
Gangisetty, O., Jones, C. E., Bhagwat, M. and Nossal, N. G. (2005) Maturation of bacteriophage T4 lagging strand fragments depends on interaction of T4 RNase H with T4 32 protein rather than the T4 gene 45 clamp. J Biol Chem, 280 (13), 12876-87.
Gasteiger, E., Gattiker, A., Hoogland, C., Ivanyi, I., Appel, R. D. and Bairoch, A. (2003) ExPASy: The proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res, 31 (13), 3784-8.
Genschel, J., Curth, U. and Urbanke, C. (2000) Interaction of E. coli single-stranded DNA binding protein (SSB) with exonuclease I. The carboxy-terminus of SSB is the recognition site for the nuclease. Biol Chem, 381 (3), 183-92.
Giedroc, D. P., Keating, K. M., Williams, K. R. and Coleman, J. E. (1987) The function of zinc in gene 32 protein from T4. Biochemistry, 26 (17), 5251-9.
Giedroc, D. P., Keating, K. M., Williams, K. R., Konigsberg, W. H. and Coleman, J. E. (1986) Gene 32 protein, the single-stranded DNA binding protein from bacteriophage T4, is a zinc metalloprotein. Proc Natl Acad Sci U S A, 83 (22), 8452-6.
Giedroc, D. P., Khan, R. and Barnhart, K. (1991) Site-specific 1,N6-ethenoadenylated single-stranded oligonucleotides as structural probes for the T4 gene 32 protein-ssDNA complex. Biochemistry, 30 (33), 8230-42.
Gill, S. C. and von Hippel, P. H. (1989) Calculation of protein extinction coefficients from amino acid sequence data. Anal Biochem, 182 (2), 319-26.
Grant, R. A., Filman, D. J., Finkel, S. E., Kolter, R. and Hogle, J. M. (1998) The crystal structure of Dps, a ferritin homolog that binds and protects DNA. Nat Struct Biol, 5 (4), 294-303.
Haebel, P. W., Wichman, S., Goldstone, D. and Metcalf, P. (2001) Crystallization and initial crystallographic analysis of the disulfide bond isomerase DsbC in complex with the alpha domain of the electron transporter DsbD. J Struct Biol, 136 (2), 162-6.
Han, E. S., Cooper, D. L., Persky, N. S., Sutera, V. A., Jr., Whitaker, R. D., Montello, M. L. and Lovett, S. T. (2006) RecJ exonuclease: substrates, products and interaction with SSB. Nucleic Acids Res, 34 (4), 1084-91.
Hollingsworth, H. C. and Nossal, N. G. (1991) Bacteriophage T4 encodes an RNase H which removes RNA primers made by the T4 DNA replication system in vitro. J Biol Chem, 266 (3), 1888-97.
Horanyi, P. S., Griffith, J., Wang, B. C. and Jenney, F. E. (2006) Vectors for high throughput expression of ccdB polypeptides. U.S. Pat. Appl. Publ.
Huber, C. G. (2000) Biopolymer Chromatography. Encyclopedia of Analytical Chemistry, R. A. M. Eds, 11250-78.
Hurley, J. M., Chervitz, S. A., Jarvis, T. C., Singer, B. S., Gold L. (1993) Assembly of the bacteriophage T4 replication machine requires the acidic carboxy terminus of gene 32 protein. J Mol Biol, 229 398-418.
Ilari, A., Ceci, P., Ferrari, D., Rossi, G. L. and Chiancone, E. (2002) Iron incorporation into Escherichia coli Dps gives rise to a ferritin-like microcrystalline core. J Biol Chem, 277 (40), 37619-23.
Izaac, A., Schall, C. A. and Mueser, T. C. (2006) Assessment of a preliminary solubility screen to improve crystallization trials: uncoupling crystal condition searches. Acta Crystallogr D Biol Crystallogr, 62 (Pt 7), 833-42.
Jensen, D. E., Kelly, R. C. and von Hippel, P. H. (1976) DNA "melting" proteins. II. Effects of bacteriophage T4 gene 32-protein binding on the conformation and stability of nucleic acid structures. J Biol Chem, 251 (22), 7215-28.
Jones, C. E., Mueser, T. C. and Nossal, N. G. (2004) Bacteriophage T4 32 protein is required for helicase-dependent leading strand synthesis when the helicase is loaded by the T4 59 helicase-loading protein. J Biol Chem, 279 (13), 12067-75.
Karam, J. D. (1994) Molecular Biology of Bacteriophage T4. Washington D.C., American Society of Microbiology.
Karpel, R. L. (1990). The biology of non-specific DNA-protein interactions, 103-130.
Koch, M. H., Vachette, P. and Svergun, D. I. (2003) Small-angle scattering: a view on the properties, structures and structural changes of biological macromolecules in solution. Q Rev Biophys, 36 (2), 147-227.
Kornberg, A., Baker, T.A. (1992) DNA Replication, second edition. New York, W.H. Freeman and Company.
Kuzmic, P. (1996) Program DYNAFIT for the analysis of enzyme kinetic data: application to HIV proteinase. Anal Biochem, 237 (2), 260-73.
Laskowski, R. A., MacArthur, M. W., Moss, D. S. and Thornton, J. M. (1993) PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Cryst, 26 283-91.
Leslie, A. G. W. (1992) Recent changes to the MOSFLM package for processing film and image plate data. Joint CCP4 + ESF-EAMCB Newsletter on Protein Crystallography, (26).
Little, J. W. and Mount, D. W. (1982) The SOS regulatory system of Escherichia coli. Cell, 29 (1), 11-22.
Liu, Y., Kao, H. I. and Bambara, R. A. (2004) Flap endonuclease 1: a central component of DNA metabolism. Annu Rev Biochem, 73 589-615.
Martinez, A. and Kolter, R. (1997) Protection of DNA during oxidative stress by the nonspecific DNA-binding protein Dps. J Bacteriol, 179 (16), 5188-94.
Matthews, B. W. (1968) Solvent content of protein crystals. J Mol Biol, 33 (2), 491-7.
McCoy, A. J., Grosse-Kunstleve R. W., Adams P. D., Winn M. D., Storoni L. C. and Read R. J. (2007) Phaser crystallographic software. J. Appl. Cryst., 40 658-74.
Minor, W., Cymborowski, M. and Otwinowski, Z. (2002) Automatic system for crystallographic data collection and analysis. Acta Physica Polonica, A, 101 (5), 613-19.
Molineux, I. J. and Gefter, M. L. (1975) Properties of the Escherichia coli DNA-binding (unwinding) protein interaction with nucleolytic enzymes and DNA. J Mol Biol, 98 (4), 811-25.
Mueser, T. C., Nossal, N. G. and Hyde, C. C. (1996) Structure of bacteriophage T4 RNase H, a 5' to 3' RNA-DNA and DNA-DNA exonuclease with sequence similarity to the RAD2 family of eukaryotic proteins. Cell, 85 (7), 1101-12.
Murshudov, G., Vagin A. and Dodson E. (1997) Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr D Biol Crystallogr, D53 240-55.
Nossal, N. G. (1992) Protein-protein interactions at a DNA replication fork: bacteriophage T4 as a model. Faseb J, 6 (3), 871-8.
Nossal, N. G. (1994) The Bacteriophage T4 DNA Replication Fork. Molecular Biology of Bacteriophage T4, American Society of Microbiology, 43-53.
Otwinowski, Z. and Minor, W. (1997) Processing of X-ray diffraction data collecting in oscillation mode. In Carter, C.W. and Sweet, R.M. (eds.). Methods Enzymol, 276 307-326.
Pan, T., Giedroc, D. P. and Coleman, J. E. (1989) 1H NMR studies of T4 gene 32 protein: effects of zinc removal and reconstitution. Biochemistry, 28 (22), 8828-32.
Petoukhov, M. V., Konarev, P. V., Kikhney, A. G. and Svergun, D. I. (2007) ATSAS 2.1 - Towards Automated and Web-Supported Small Angle Scattering Data Analysis. J Appl Cryst, 40 223-28.
Pflugrath, J. W. (1999) The finer things in X-Ray diffraction data collection. Acta Crystallogr D Biol Crystallogr, D55 1718-25.
Pierce, M. M., Raman, C. S. and Nall, B. T. (1999) Isothermal titration calorimetry of protein-protein interactions. Methods, 19 (2), 213-21.
Rao, R. N. (1984) Construction and properties of plasmid pKC30, a pBR322 derivative containing the pL-N region of phage lambda. Gene, 31 (1-3), 247-50.
Reuven, N. B., Staire, A. E., Myers, R. S. and Weller, S. K. (2003) The herpes simplex virus type 1 alkaline nuclease and single-stranded DNA binding protein mediate strand exchange in vitro. J Virol, 77 (13), 7425-33.
Rodgers, D. W. (1994) Cryocrystallography. Structure, 2 (12), 1135-40.
Sandigursky, M., Franklin, W. A. (1993) E. coli single-stranded DNA-binding protein stimulates the DNA deoxyribophosphodiesterase activity of Exonuclease I. Nucleic Acids Res, 22 (2), 247-50.
Savvides, S. N., Raghunathan, S., Futterer, K., Kozlov, A. G., Lohman, T. M., Waksman G. (2004) The C-terminal domain of full-length E. coli SSB is disordered even when bound to DNA. Protein Sci, 13 1942-47.
Senger, A. B. and Mueser, T. C. (2005) Rapid preparation of custom grid screens for crystal growth optimization. J. Appl. Cryst., 38 847-50.
Shamoo, Y., Friedman, A. M., Parsons, M. R., Konigsberg, W. H. and Steitz, T. A. (1995) Crystal structure of a replication fork single-stranded DNA binding protein (T4 gp32) complexed to DNA. Nature, 376 (6538), 362-6.
Shatzman, A. R. and Rosenberg, M. (1987) Expression, identification, and characterization of recombinant gene products in Escherichia coli. Methods Enzymol, 152 661-73.
Tanford, C. Light Scattering. Physical Chemistry of Macromolecules.
Tomanicek, S. J. (2005). Crystallographic Studies of DNA Replication and Repair Proteins, the University of Toledo. Ph.D. dissertation.
Tomanicek, S. J., Devos, J. M., Mueser, T. C. Metal-free crystal structure of bacteriophage T4 RNase H. in preparation.
Waidner, L. A., Flynn, E. K., Wu, M., Li, X. and Karpel, R. L. (2001) Domain effects on the DNA-interactive properties of bacteriophage T4 gene 32 protein. J Biol Chem, 276 (4), 2509-16.
Williams, K. R., Spicer, E. K., LoPresti, M. B., Guggenheimer, R. A., Chase, J. W. (1983) Limited proteolysis studies on the E. coli single-stranded DNA binding protein. J Biol Chem, 258 (5), 3346-3355.
Yoakum, G. H. (1983) Amplification of DNA repair genes using plasmid pKc30. Methods Enzymol, 101 138-55.
APPENDICES
Appendix 1 – Maps of the pENTR-D, pET 101 and pDEST-C1 Vectors
Appendix 2 – HPLC Columns
Appendix 3 – DNA Sequencing Results
Appendix 1 – Maps of the pENTR-D, pET 101 and pDEST-C1 Vectors
A – pENTR™ / D-TOPO® Gateway® entry vector (Invitrogen)
B – pET101 / D-TOPO® Gateway® expression vector (Invitrogen)
C – pDEST-C1 expression vector (Horanyi et al., 2006)
Appendix 2 – HPLC Columns
Each entry lists the HPLC column, followed by the chemistry of the resin.

Low Resolution Cation Exchange: SP Sepharose (Amersham Biosciences)
The cross-linked agarose-dextran matrix is coated with sulfopropyl groups that bind to the positively charged residues on the protein. The protein is then eluted with a NaCl gradient.

Low Resolution Anion Exchange: Q Sepharose (Amersham Biosciences)
The cross-linked agarose-dextran matrix is coated with quaternary amine groups that bind to the negatively charged residues on the protein. The protein is then eluted with a NaCl gradient.

High Resolution Cation Exchange: POROS HS (Applied Biosystems)
The poly(styrene-divinylbenzene) polymer matrix (PS/DVB) is coated with sulfopropyl groups that bind to the positively charged residues on the protein. The protein is then eluted with a NaCl gradient.

High Resolution Anion Exchange: POROS HQ (Applied Biosystems)
The PS/DVB matrix is coated with quaternary amine groups that bind to the negatively charged residues on the protein. The protein is then eluted with a NaCl gradient.

Ion Exchange: Hydroxyapatite (Bio-Rad)
The [Ca5(PO4)3OH]2 resin of the hydroxyapatite column binds proteins and competes with the phosphate backbone of the DNA present in the sample. The protein is eluted with a salt gradient of Ammonium Sulfate.

Hydrophobic Interaction: POROS PE (Applied Biosystems)
The PS/DVB matrix is coated with phenyl-ether hydrophobic groups that bind to nucleases present in the sample. The protein is eluted in the flow through.

Size Exclusion: Superdex 75 (Amersham Biosciences)
The macroporous gel matrix of agarose and dextran forms porous beads. Proteins larger than 75 kDa cannot diffuse in the pores and are eluted in the void fraction. The smaller proteins are eluted according to their hydrodynamic radius (the smaller ones being eluted last).

Size Exclusion: Superdex 200 (Amersham Biosciences)
The principle is the same as the Superdex 75, but here the pores are larger and allow proteins up to 200 kDa to diffuse.

Metal Affinity: Talon (Clontech)
Cobalt ions are immobilized on a Superflow resin via a tetradentate ligand (carboxyl and amine groups). The His-Tags located on the protein can then chelate the cobalt ions. The protein is eluted with an Imidazole gradient.
Appendix 3 – DNA Sequencing Results
a - T4 32-B Protein (pEKF2 plasmid)
Forward Primer Sequencing Reaction
ATG CTG ATG TTT AAA CGT AAA TCT ACT GCT GAA CTC GCT GCA CAA ATG GCT AAA CTG 1 M F K R K S T A E L A A Q M A K L
AN GGN NN AAA GGT TTT TCT TCT GAA NAT ANA GGC NAG TGG AAA CTG AAA AAT GGC AAT AAA GGT TTT TCT TCT GAA GAT AAA GGC GAG TGG AAA CTG AAA 18 N G N K G F S S E D K G E W K L K
CTC NAT AAT GNG GGT AAN GGT CAA GCN NTA ATT CNT TTT CTT CCG TCN AAA CTC GAT AAT GCG GGT AAC GGT CAA GCA GTA ATT CGT TTT CTT CCG TCT AAA 35 L D N A G N G Q A V I R F L P S K
AAT GAT GAA CAA GCA CCA TTC NCA ATT CTT GTA AAT CAC GGT TTC NAG AAA AAT GAT GAA CAA GCA CCA TTC GCA ATT CTT GTA AAT CAC GGT TTC AAG AAA 52 N D E Q A P F A I L V N H G F K K
AAT GGT AAA TGG TAT ATT GAA NCA TGT TCA TCN ACC CAT GGT GAT TAC GAT AAT GGT AAA TGG TAT ATT GAA ACA TGT TCA TCT ACC CAT GGT GAT TAC GAT 69 N G K W Y I E T C S S T H G D Y D
TCT TGC CCA GTA TGT CAA TAC ATC NGT AAA AAT GAT CTA TAC AAC ACT GAC TCT TGC CCA GTA TGT CAA TAC ATC AGT AAA AAT GAT CTA TAC AAC ACT GAC 86 S C P V C Q Y I S K N D L Y N T D
AAT AAA GAG TAN AGT CTT GTT AAA CGT AAA ACT TCT TAC TGG GCC AAC NTT AAT AAA GAG TAC AGT CTT GTT AAA CGT AAA ACT TCT TAC TGG GCT AAC ATT 103 N K E Y S L V K R K T S Y W A N I
CTT GTA GTA AAA GAC CCA GCT GCT CCA NAA AAC GAA NGT NAA GTA TTN AAA CTT GTA GTA AAA GAC CCA GCT GCT CCA GAA AAC GAA GGT AAA GTA TTT AAA 120 L V V K D P A A P E N E G K V F K
TAC CGW TNC GGN AAN AAA ATC TGG GAT AAA ATC AAT GCA ATG ATT GCG GTT TAC CGC TTT GGT AAG AAA ATC TGG GAT AAA ATC AAT GCA ATG ATT GCG GTT 137 Y R F G K K I W D K I N A M I A V
GAT GTT GAA ATG GGT GAA CAN CCA NTT GAT GTA ACT TGT NCG NGG GAA GGT GAT GTT GAA ATG GGT GAA ACT CCA GTT GAT GTA ACT TGT CCG TGG GAA GGT 154 D V E M G E T P V D V T C P W E G
GCT AAC TTT GNA CTG AAA GTT ANA CAA GTT TCT GGA TTT AGT AAC TAC NAT GCT AAC TTT GTA CTG AAA GTT AAA CAA GTT TCT GGA TTT AGT AAC TAC GAT 171 A N F V L K V K Q V S G F S N Y D
GAA TCT NAA TTC CTG AAT CAA NCT GCN ATT CCA AAC ATT GAC GAT GAA TCT GAA TCT AAA TTC CTG AAT CAA TCT GCG ATT CCA AAC ATT GAC GAT GAA TCT 188 E S K F L N Q S A I P N I D D E S
TTC CAN AAA GAA CTG TTC NAA CAA ATG GTT GAC CTT TCT GAN ATG ACT TCT TTC CAG AAA GAA CTG TTC GAA CAA ATG GTT GAC CTT TCT GAA ATG ACT TCT 205 F Q K E L F E Q M V D L S E M T S
AAA GAT AAA TTC AAA TCG TTT GAA GAA CNT AAT ACT AAA TTC GGT CAA GTT AAA GAT AAA TTC AAA TCG TTT GAA GAA CTT AAT ACT AAA TTC GGT CAA GTT 222 K D K F K S F E E L N T K F G Q V
ATG NGA ACT GCT GTG ATG GGC GGT GCT GCT GCA ACT GCT GCT AAG AAA GCT ATG GGA ACT GCT GTG ATG GGC GGT GCT GCT GCA ACT GCT GCT AAG AAA GCT 239 M G T A V M G G A A A T A A K K A
GAT AAA GTT GCT GAT GAT TTG NAT GCA TTC ANT GTT GAT GAC TTC AAT ACA GAT AAA GTG GCT GAT GAT TTG GAT GCA TTC AAT GTT GAT GAC TTC AAT ACA 256 D K V A D D L D A F N V D D F N T
NAA ACT GAA NAT GAT TTT ATG AGC TCA AGC TCT GGT AGT TCA TCT AGT GCN AAA ACT GAA GAT GAT TTT ATG AGC TCA AGC TCT GGT AGT TCA TCT AGT GCT 273 K T E D D F M S S S S G S S S S A
GAT GAC ACG GAC CTG GAT GAC CTT TTG AAT GAC CTT TAA GAT GAC ACG GAC CTG GAT GAC CTT TTG AAT GAC CTT TAA 290 D D T D L D D L L N D L stop
Reverse Primer Sequencing Reaction
NNG CTG ATG TTT AAA CGT AAA TCT ACT GCT GAA CTC GCT GCA CAA ATG GCT AAA CTG 1 M F K R K S T A E L A A Q M A K L
AAT GGC AAT AAA NGG TTT TTCT TNT GAA GAT AAA GGN GAG TGG AAA CTG AAA AAT GGC AAT AAA GGT TTT TCT TCT GAA GAT AAA GGC GAG TGG AAA CTG AAA 18 N G N K G F S S E D K G E W K L K
CTC GAT AAT GCG GGT AAC GGT CAA GCA GTN ATT NGT TTT CTT CNG TNT AAA CTC GAT AAT GCG GGT AAC GGT CAA GCA GTA ATT CGT TTT CTT CCG TCT AAA 35 L D N A G N G Q A V I R F L P S K
AAT GAT GAA CAA GCA CCA TTN GCA ATT CTT NGTA AAT CAC GGT TTC AAG AAA AAT GAT GAA CAA GCA CCA TTC GCA ATT CTT GTA AAT CAC GGT TTC AAG AAA 52 N D E Q A P F A I L V N H G F K K
AAT GGT AAA TGG TAT ANN GAA ACA TGT TCA TCT ACC CAT GGT GAT TAC GAT AAT GGT AAA TGG TAT ATT GAA ACA TGT TCA TCT ACC CAT GGT GAT TAC GAT 69 N G K W Y I E T C S S T H G D Y D
TCT TGC CCA GTA TGT CAA TAC ATC AGT AAA AAT GAT CTA TAC AAC ACT GAC TCT TGC CCA GTA TGT CAA TAC ATC AGT AAA AAT GAT CTA TAC AAC ACT GAC 86 S C P V C Q Y I S K N D L Y N T D
AAT AAA GAG TAC AGT CTT GTT AAA CGT AAA ACT TCT TAC TGG GCC AAC ATT AAT AAA GAG TAC AGT CTT GTT AAA CGT AAA ACT TCT TAC TGG GCT AAC ATT 103 N K E Y S L V K R K T S Y W A N I
CTT GTA GTA AAA GAC CCA GCT GCT CCA GAA AAC GAA GGT AAA GTA TTT AAA CTT GTA GTA AAA GAC CCA GCT GCT CCA GAA AAC GAA GGT AAA GTA TTT AAA 120 L V V K D P A A P E N E G K V F K
TAC CGT TTC GGT AAG AAA ATC TGG GAT AAA ATC AAT GCA ATG ATT GCG GTT TAC CGC TTT GGT AAG AAA ATC TGG GAT AAA ATC AAT GCA ATG ATT GCG GTT 137 Y R F G K K I W D K I N A M I A V
GAT GTT GAA ATG GGT GAA ACT CCA GTT GAT GTA ACT TGT CCG TGG GAA GGT GAT GTT GAA ATG GGT GAA ACT CCA GTT GAT GTA ACT TGT CCG TGG GAA GGT 154 D V E M G E T P V D V T C P W E G
GCT AAC TTT GTA CTG AAA GTT AAA CAA GTT TCT GGA TTT AGT AAC TAC GAT GCT AAC TTT GTA CTG AAA GTT AAA CAA GTT TCT GGA TTT AGT AAC TAC GAT 171 A N F V L K V K Q V S G F S N Y D
GAA TCT AAA TTC CTG AAT CAA TCT GCG ATT CCA AAC ATT GAC GAT GAA TCT GAA TCT AAA TTC CTG AAT CAA TCT GCG ATT CCA AAC ATT GAC GAT GAA TCT 188 E S K F L N Q S A I P N I D D E S
TTC CAG AAA GAA CTG TTC GAA CAA ATG GTT GAC CTT TCT GAA ATG ACT TCT TTC CAG AAA GAA CTG TTC GAA CAA ATG GTT GAC CTT TCT GAA ATG ACT TCT 205 F Q K E L F E Q M V D L S E M T S
AAA GAT AAA TTC AAA TCG TTT GAA GAA CTT AAT ACT AAA TTC GGT CAA GTT AAA GAT AAA TTC AAA TCG TTT GAA GAA CTT AAT ACT AAA TTC GGT CAA GTT 222 K D K F K S F E E L N T K F G Q V
ATG GGA ACT GCT GTG ATG GGC GGT GCT GCT GCA ACT GCT GCT AAG AAA GCT ATG GGA ACT GCT GTG ATG GGC GGT GCT GCT GCA ACT GCT GCT AAG AAA GCT 239 M G T A V M G G A A A T A A K K A
GAT AAA GTT GCT GAT GAT TTG GAT GCA TTC AAT GTT GAT GAC TTC AAT ACA GAT AAA GTG GCT GAT GAT TTG GAT GCA TTC AAT GTT GAT GAC TTC AAT ACA 256 D K V A D D L D A F N V D D F N T
AAA ACT GAA GAT GAT TTT ATG AGC TCA AGC TCT GGT AGT TCA TCT AGT GCT AAA ACT GAA GAT GAT TTT ATG AGC TCA AGC TCT GGT AGT TCA TCT AGT GCT 273 K T E D D F M S S S S G S S S S A
GAT GAC ACG GAC CTG GAT GAC CTT TTG AAT GAC CTT TAA GAT GAC ACG GAC CTG GAT GAC CTT TTG AAT GAC CTT TAA 290 D D T D L D D L L N D L stop
b - T4 32-B Protein (pDEST-C1 plasmid)
Forward Primer Sequencing Reaction
ATG GCA CAT CAC CAC CAC CAT CAC GTG GGT ACC GGT TCG AAT GAT GAC NAC NAC
AAA TCA ACA AGT TTG TAC AAA AAA GCA GGC TCC GCG GCC GCC CCC TTC ACC GAG
AAC CTC TAC TTC CAA GGA CTG AAT GGC AAT AAA GGT TTT TCT TCT GAA GAT AAA ATG CTG AAT GGC AAT AAA GGT TTT TCT TCT GAA GAT AAA
GGC GAG TGG AAA CTG AAA CTC GAT AAT GCG GGT AAC GGT CAA GCA GTA ATT CGT GGC GAG TGG AAA CTG AAA CTC GAT AAT GCG GGT AAC GGT CAA GCA GTA ATT CGT
TTT CTT CCG TCT AAA AAT GAT GAA CAA GCA CCA TTC GCA ATT CTT GTA AAT CAC TTT CTT CCG TCT AAA AAT GAT GAA CAA GCA CCA TTC GCA ATT CTT GTA AAT CAC
GGT TTC AAG AAA AAT GGT AAA TGG TAT ATT GAA ACA TGT TCA TCT ACC CAT GGT GGT TTC AAG AAA AAT GGT AAA TGG TAT ATT GAA ACA TGT TCA TCT ACC CAT GGT
GAT TAC GAT TCT TGC CCA GTA TGT CAA TAC ATC AGT AAA AAT GAT CTA TAC AAC GAT TAC GAT TCT TGC CCA GTA TGT CAA TAC ATC AGT AAA AAT GAT CTA TAC AAC
ACT GAC AAT AAA GAG TAC AGT CTT GTT AAA CGT AAA ACT TCT TAC TGG GCC AAC ACT GAC AAT AAA GAG TAC AGT CTT GTT AAA CGT AAA ACT TCT TAC TGG GCT AAC
ATT CTT GTA GTA AAA GAC CCA GCT GCT CCA GAA AAC GAA GGT AAA GTA TTT AAA ATT CTT GTA GTA AAA GAC CCA GCT GCT CCA GAA AAC GAA GGT AAA GTA TTT AAA
TAC CGT TTC GGT AAG AAA ATC TGG GAT AAA ATC AAT GCA ATG ATT GCG GTT GAT TAC CGC TTT GGT AAG AAA ATC TGG GAT AAA ATC AAT GCA ATG ATT GCG GTT GAT
GTT GAA ATG GGT GAA ACT CCA GTT GAT GTA ACT TGT CCG TGG GAA GGT GCT AAC GTT GAA ATG GGT GAA ACT CCA GTT GAT GTA ACT TGT CCG TGG GAA GGT GCT AAC
TTT GTA CTG AAA GTT AAA CAA GTT TCT GGA TTT AGT AAC TAC GAT GAA TCT AAA TTT GTA CTG AAA GTT AAA CAA GTT TCT GGA TTT AGT AAC TAC GAT GAA TCT AAA
TTC CTG AAT CAA TCT GCG ATT CCA AAC ATT GAC GAT GAA TCT TTC CAG AAA GAA TTC CTG AAT CAA TCT GCG ATT CCA AAC ATT GAC GAT GAA TCT TTC CAG AAA GAA
CTG TTC GAA CAA ATG GTT GAC CTT TCT GAA ATG ACT TCT AAA GAT AAA TTC AAA CTG TTC GAA CAA ATG GTT GAC CTT TCT GAA ATG ACT TCT AAA GAT AAA TTC AAA
TCG TTT GAA GAA CTT AAT ACT AAA TTC GGT CAA GTT ATG GGA ACT GCT GTG ATG TCG TTT GAA GAA CTT AAT ACT AAA TTC GGT CAA GTT ATG GGA ACT GCT GTG ATG
GGC GGT GCT GCT GCA ACT GCT GCT AAG AAA GCT GAT AAA GTT GCT GAT GAT TTG GGC GGT GCT GCT GCA ACT GCT GCT AAG AAA GCT GAT AAA GTG GCT GAT GAT TTG
GAT GCA TTC AAT GTT GAT GAC TTC AAT ACA AAA ACT GAA GAT GAT TTT ATG AGC GAT GCA TTC AAT GTT GAT GAC TTC AAT ACA AAA ACT GAA GAT GAT TTT ATG AGC
TCA AGC TCT GGT AGT TCA TCT AGT GCT GAT GAC ACG GAC CTG GNT GAC CTT TTG TCA AGC TCT GGT AGT TCA TCT AGT GCT GAT GAC ACG GAC CTG GAT GAC CTT TTG
AAT GAC CTT TAA AAT GAC CTT TAA
Reverse Primer Sequencing Reaction
CTG AAN NNC NNT AAA NNT TTT TCT TCT GAA GAT AAA NGN GAG TGG AAA CTG ATG CTG AAT GGC AAT AAA GGT TTT TCT TCT GAA GAT AAA GGC GAG TGG AAA CTG
AAA CTC GAT AAT GCG GGT AA GGT CAA GCA GTA ATT CGT TTT CTT CCG TCT AAA AAA CTC GAT AAT GCG GGT AAC GGT CAA GCA GTA ATT CGT TTT CTT CCG TCT AAA
AAT GAT GAA CAA GCA CCA TTC GCA ATT CTT GTA AAT CAG GTT TCA AGA AAA A T AAT GAT GAA CAA GCA CCA TTC GCA ATT CTT GTA AAT CAC GGT TTC AAG AAA AAT
GGT AAA TGG TAT ATT GAA ACA TGT TCA TCT ACC CAT GGT GAT TAC GAT TCT TGC GGT AAA TGG TAT ATT GAA ACA TGT TCA TCT ACC CAT GGT GAT TAC GAT TCT TGC
CC GTA TGT CAA TAC ATC AGT AAA AAT GAT CTA TAC AAC ACT GAC AAT AAA GAG CCA GTA TGT CAA TAC ATC AGT AAA AAT GAT CTA TAC AAC ACT GAC AAT AAA GAG
TAC AGT CTT GTT AAA CGT AAA CT TCT TAC TGG GCC AAC ATT CTT GTA GTA AAA TAC AGT CTT GTT AAA CGT AAA ACT TCT TAC TGG GCT AAC ATT CTT GTA GTA AAA
GAC CCA GCT GCT CCA GAA AAC GAA GGT AAA GTA TTT AA TAC CG TTTCGGT AAG GAC CCA GCT GCT CCA GAA AAC GAA GGT AAA GTA TTT AAA TAC CGC TTT GGT AAG
AAA ATC TGG GAT AAA ATC AAT GCA ATG ATT GCG GTT GAT*TTT GTA CTG AAA GTT AAA ATC TGG GAT AAA ATC AAT GCA ATG ATT GCG GTT GAT TTT GTA CTG AAA GTT
AAA CAA GTT TCT GGA TTT AG AAC TAC GAT GAA TCT AAA TTC CTG AAT CAA TCT AAA CAA GTT TCT GGA TTT AGT AAC TAC GAT GAA TCT AAA TTC CTG AAT CAA TCT
GCG ATT CCA AAC ATT GAC GAT GAA TCT TTC CAG AAA GA CTG TTC GAA CAA ATG GCG ATT CCA AAC ATT GAC GAT GAA TCT TTC CAG AAA GAA CTG TTC GAA CAA ATG
GTT GAC CTT TCT GAA ATG ACT TCT AAA GAT AAA TTC AAA TCG TTT GAA GAA CTT GTT GAC CTT TCT GAA ATG ACT TCT AAA GAT AAA TTC AAA TCG TTT GAA GAA CTT
AA ACT AAA TTC GGT CAA GTT ATG GGA ACT GCT GTG ATG GGC GGT GCT GCT GCA AAT ACT AAA TTC GGT CAA GTT ATG GGA ACT GCT GTG ATG GGC GGT GCT GCT GCA
ACT GCT GCT AAG AAA GCT GA AAA GTT GCT GAT GAT TTG GAT GCA TTC AAT GTT ACT GCT GCT AAG AAA GCT GAT AAA GTG GCT GAT GAT TTG GAT GCA TTC AAT GTT
GAT GAC TTC AAT ACA AAA ACT GAA GAT GAT TTT ATG AG TCA AGC TCT GGT AGT GAT GAC TTC AAT ACA AAA ACT GAA GAT GAT TTT ATG AGC TCA AGC TCT GGT AGT
TCA TCT AGT GCT GAT GAC ACG GAC CTG GAT GAC CTT TTG AAT GAC CTT TAA TCA TCT AGT GCT GAT GAC ACG GAC CTG GAT GAC CTT TTG AAT GAC CTT TAA
* GTTGAAATGGGTGAAACCCAGTTGATGTAACTTGTCCGTGGGAAGGTGCTAACTTTGTACTGAAAGTT
c - T4 I151D 32-B Protein (pDEST-C1 plasmid)
Forward Primer Sequencing Reaction
ATG GCA CAT CAC CAC CAC CAT CAC GTG GGT ACC GGT TCG AAT GAT GAC GAC NAC
AAA TCA ACA AGT TTG TAC AAA AAA GCA GGC TCC GCG GCC GCC CCC TTC ACC GAG
AAC CTC TAC TTC CAA GGA CTG AAT GGC AAT AAA GGT TTT TCT TCT GAA GAT AAA CTG AAT GGC AAT AAA GGT TTT TCT TCT GAA GAT AAA
GGC GAG TGG AAA CTG AAA CTC GAT AAT GCG GGT AAC GGT CAA GCA GTA ATT CGT GGC GAG TGG AAA CTG AAA CTC GAT AAT GCG GGT AAC GGT CAA GCA GTA ATT CGT
TTT CTT CCG TCT AAA AAT GAT GAA CAA GCA CCA TTC GCA ATT CTT GTA AAT CAC TTT CTT CCG TCT AAA AAT GAT GAA CAA GCA CCA TTC GCA ATT CTT GTA AAT CAC
GGT TTC AAG AAA AAT GGT AAA TGG TAT ATT GAA ACA TGT TCA TCT ACC CAT GGT GGT TTC AAG AAA AAT GGT AAA TGG TAT ATT GAA ACA TGT TCA TCT ACC CAT GGT
GAT TAC GAT TCT TGC CCA GTA TGT CAA TAC ATC AGT AAA AAT GAT CTA TAC AAC GAT TAC GAT TCT TGC CCA GTA TGT CAA TAC ATC AGT AAA AAT GAT CTA TAC AAC
ACT GAC AAT AAA GAG TAC AGT CTT GTT AAA CGT AAA ACT TCT TAC TGG GCC AAC ACT GAC AAT AAA GAG TAC AGT CTT GTT AAA CGT AAA ACT TCT TAC TGG GCT AAC
ATT CTT GTA GTA AAA GAC CCA GCT GCT CCA GAA AAC GAA GGT AAA GTA TTT AAA ATT CTT GTA GTA AAA GAC CCA GCT GCT CCA GAA AAC GAA GGT AAA GTA TTT AAA
TAC CGT TTC GGT AAG AAA ATC TGG GAT AAA ATC AAT GCA ATG GAT GCG GTT GAT TAC CGC TTT GGT AAG AAA ATC TGG GAT AAA ATC AAT GCA ATG ATT GCG GTT GAT
GTT GAA ATG GGT GAA ACT CCA GTT GAT GTA ACT TGT CCG TGG GAA GGT GCT AAC GTT GAA ATG GGT GAA ACT CCA GTT GAT GTA ACT TGT CCG TGG GAA GGT GCT AAC
TTT GTA CTG AAA GTT AAA CAA GTT TCT GGA TTT AGT AAC TAC GAT GAA TCT AAA TTT GTA CTG AAA GTT AAA CAA GTT TCT GGA TTT AGT AAC TAC GAT GAA TCT AAA
TTC CTG AAT CAT TCT GCG ATT CCA AAC ATT GAC GAT GAA TCT TTC CAG AAA GAA TTC CTG AAT CAA TCT GCG ATT CCA AAC ATT GAC GAT GAA TCT TTC CAG AAA GAA
CTG TTC GAA CAA ATG GTT GAC CTT TCT GAA ATG ACT TCT AAA GAT AAA TTC AAA CTG TTC GAA CAA ATG GTT GAC CTT TCT GAA ATG ACT TCT AAA GAT AAA TTC AAA
TCG TTT GAA GAA[CTT AAT ACT AAA TTC GGT CAA GTT ATG GGA ACT GCT GTG ATG TCG TTT GAA GAA AGC TGA TAA AGT GGC TGA TGA TTT GGA TGC ATT CAA TGT TGA
GGC GGT GCT GCT GCA ACT GCT GCT AAG AAA GCT GAT AAA GTT GCT GAT GAT TTG TGA CTT CAA TAC AAA AAC TGA AGA TGA TTT TAT GAG CTC AAG CTC TGG TAG TTC
GAT GCA TTC AAT GTT GAT GAC TTC NAT ACA AAA CTG AAG ATG] AT TTT ATG AGC ATC TAG TGC TGA TGA CAC GGA CCT GGA TGA CCT TTT GAA TGA CAT TTT ATG AGC
TCA AGC TCT GGT AGT TCA TCT AGT GCT GAT GAC ACG GAC CTG GAT GAC CTT TTG TCA AGC TCT GGT AGT TCA TCT AGT GCT GAT GAC ACG GAC CTG GAT GAC CTT TTG
AAT GAC CTT TAA AAT GAC CTT TAA
Reverse Primer Sequencing Reaction
CTG AAT GNC AAT AAA NGT TTT TCT TCT GAA GAT AAA GGC GAG TGG AAA CTG ATG CTG AAT GGC AAT AAA GGT TTT TCT TCT GAA GAT AAA GGC GAG TGG AAA CTG
AAA CTC GAT AAT GCG GGT AAC GGT CAA GCA GTA ATT CGT TTT CTT CCG TCT AAA AAA CTC GAT AAT GCG GGT AAC GGT CAA GCA GTA ATT CGT TTT CTT CCG TCT AAA
AAT GAT GAA CAA GCA CCA TTC GCA ATT CTT GTA AAT CAC GGT TTC AAG AAA AAT AAT GAT GAA CAA GCA CCA TTC GCA ATT CTT GTA AAT CAC GGT TTC AAG AAA AAT
GGT AAA TGG TAT ATT GAA ACA TGT TCA TCT ACC CAT GGT GAT TAC GAT TCT TGC GGT AAA TGG TAT ATT GAA ACA TGT TCA TCT ACC CAT GGT GAT TAC GAT TCT TGC
CCA GTA TGT CAA TAC ATC AGT AAA AAT GAT CTA TAC AAC ACT GAC AAT AAA GAG CCA GTA TGT CAA TAC ATC AGT AAA AAT GAT CTA TAC AAC ACT GAC AAT AAA GAG
TAC AGT CTT GTT AAA CGT AAA ACT TCT TAC TGG GCC AAC ATT CTT GTA GTA AAA TAC AGT CTT GTT AAA CGT AAA ACT TCT TAC TGG GCT AAC ATT CTT GTA GTA AAA
GAC CCA GCT GCT CCA GAA AAC GAA GGT AAA GTA TTT AAA TAC CGT TTC GGT AAG GAC CCA GCT GCT CCA GAA AAC GAA GGT AAA GTA TTT AAA TAC CGC TTT GGT AAG
AAA ATC TGG GAT AAA ATC AAT GCA ATG GAT GCG GTT GAT GTT GAA ATG GGT GAA AAA ATC TGG GAT AAA ATC AAT GCA ATG ATT GCG GTT GAT GTT GAA ATG GGT GAA
ACT CCA GTT GAT GTA ACT TGT CCG TGG GAA GGT GCT AAC TTT GTA CTG AAA GTT ACT CCA GTT GAT GTA ACT TGT CCG TGG GAA GGT GCT AAC TTT GTA CTG AAA GTT
AAA CAA GTT TCT GGA TTT AGT AAC TAC GAT GAA TCT AAA TTC CTG AAT CAT TCT AAA CAA GTT TCT GGA TTT AGT AAC TAC GAT GAA TCT AAA TTC CTG AAT CAA TCT
GCG ATT CCA AAC ATT GAC GAT GAA TCT TTC CAG AAA GAA CTG TTC GAA CAA ATG GCG ATT CCA AAC ATT GAC GAT GAA TCT TTC CAG AAA GAA CTG TTC GAA CAA ATG
GTT GAC CTT TCT GAA ATG ACT TCT AAA GAT AAA TTC AAA TCG TTT GAA GAA CTT GTT GAC CTT TCT GAA ATG ACT TCT AAA GAT AAA TTC AAA TCG TTT GAA GAA CTT
AAT ACT AAA TTC GGT CAA GTT ATG GGA ACT GCT GTG ATG GGC GGT GCT GCT GCA AAT ACT AAA TTC GGT CAA GTT ATG GGA ACT GCT GTG ATG GGC GGT GCT GCT GCA
ACT GCT GCT AAG AAA GCT GAT AAA GTT GCT GAT GAT TTG GAT GCA TTC AAT GTT ACT GCT GCT AAG AAA GCT GAT AAA GTG GCT GAT GAT TTG GAT GCA TTC AAT GTT
GAT GAC TTC AAT ACA AAA ACT GAA GAT GAT TTT ATG AGC TCA AGC TCT GGT AGT GAT GAC TTC AAT ACA AAA ACT GAA GAT GAT TTT ATG AGC TCA AGC TCT GGT AGT
TCA TCT AGT GCT GAT GAC ACG GAC CTG GAT GAC CTT TTG AAT GAC CTT TAA TCA TCT AGT GCT GAT GAC ACG GAC CTG GAT GAC CTT TTG AAT GAC CTT TAA
d – T4 I60D 32-B Protein (pDEST-C1 plasmid)
Forward Primer Sequencing Reaction
ATG GCA CAT CAC CAC CAC CAT CAC GTG GGT ACC GGT TCG AAT GAT GAC GAC GAC AAA TCA ACA AGT TTG TAC AAA AAA GCA GGC TCC GCG GCC GCC CCC TTC ACC GAG
AAC CTC TAC TTC CAA GGA CTG AAT GGC AAT AAA GGT TTT TCT TCT GAA GAT AAA ATG CTG AAT GGC AAT AAA GGT TTT TCT TCT GAA GAT AAA
GGC GAG TGG AAA CTG AAA CTC GAT AAT GCG GGT AAC GGT CAA GCA GTA ATT CGT GGC GAG TGG AAA CTG AAA CTC GAT AAT GCG GGT AAC GGT CAA GCA GTA ATT CGT
TTT CTT CCG TCT AAA AAT GAT GAA CAA GCA CCA TTC GCA GAT CTT GTA AAT CAC TTT CTT CCG TCT AAA AAT GAT GAA CAA GCA CCA TTC GCA ATT CTT GTA AAT CAC
GGT TTC ANN AAA AAN GGT AAA TGG TAT ATT GAA ACA TGT TCA TCT ACC CAN GGT GGT TTC AAG AAA AAT GGT AAA TGG TAT ATT GAA ACA TGT TCA TCT ACC CAT GGT
GAT TAC NAT TCT TGC CCA NTA TGT CNN TAC NTC NGN AAN AAT GAT CTA TAC AAC GAT TAC GAT TCT TGC CCA GTA TGT CAA TAC ATC AGT AAA AAT GAT CTA TAC AAC
ACT GAC NAT AAA GAG TAN NGT CTT GTT AAA CGT AAA ACT TCN TAC TGG GCC NNC ACT GAC AAT AAA GAG TAC AGT CTT GTT AAA CGT AAA ACT TCT TAC TGG GCT AAC
ATT CTT GTA NTA AAA GAC CCN GCT GCT CCA NAA AAC GAA NGT AAA GTA TTT AAA ATT CTT GTA GTA AAA GAC CCA GCT GCT CCA GAA AAC GAA GGT AAA GTA TTT AAA
TAC CGN TTC GGT AAG AAA ATC TGG GAT AAA ATC AAT GCA ATG ATT GCG GTT GAN TAC CGC TTT GGT AAG AAA ATC TGG GAT AAA ATC AAT GCA ATG ATT GCG GTT GAT
GTT GAA ATG GGT GAA ACT CAN GTT GAN GTA ACT TGT NCG TGN GAA GGT GCT AAN GTT GAA ATG GGT GAA ACT CCA GTT GAT GTA ACT TGT CCG TGG GAA GGT GCT AAC
TTT GTA CTG AAA GTT AAA CAA GTT TCT GGA TTT ANT AAC TAC GAT GAA TCT AAN TTT GTA CTG AAA GTT AAA CAA GTT TCT GGA TTT AGT AAC TAC GAT GAA TCT AAA
TTC CTG AAT CAA TCT GCN ATT CCN AAC ATT GAC GAT GAA TCT TTC CAG ANA GAA TTC CTG AAT CAA TCT GCG ATT CCA AAC ATT GAC GAT GAA TCT TTC CAG AAA GAA
CTG TTC GAA CAN ATG GTT GAC CTT TCT GAN NTG ACT TCT AAA GAT ANA TTC NNA CTG TTC GAA CAA ATG GTT GAC CTT TCT GAA ATG ACT TCT AAA GAT AAA TTC AAA
TCG NTT GAA NAA CTT AAT ACT ANA TTC NGT CAA GTT ATN GGA ACT GCT GTG ATG TCG TTT GAA GAA CTT AAT ACT AAA TTC GGT CAA GTT ATG GGA ACT GCT GTG ATG
GGC GGT GCT GCT GCA ACT GCT GCT AAN AAA GCT GAT ANA GNT GCT GAT GAT TTG GGC GGT GCT GCT GCA ACT GCT GCT AAG AAA GCT GAT AAA GTG GCT GAT GAT TTG
GAT GCA TTN NAT GTT GAT GAC TTC NAT ACA AAA CT GAN NT GAT TNT ATG AGC GAT GCA TTC AAT GTT GAT GAC TTC AAT ACA AAA ACT GAA GAT GAT TTT ATG AGC
TCA AGC TCT GGT AGN TCA TCT AGT GNT GAT GAC NCG GAC CTG NNT GAC CNT TTG TCA AGC TCT GGT AGT TCA TCT AGT GCT GAT GAC ACG GAC CTG GAT GAC CTT TTG
AAT GAC CTT TAA AAT GAC CTT TAA
Reverse Primer Sequencing Reaction
CTG AAN NNC NAN AAN GNT TTT TCT TCN NAA GAT AAAAGGC GAG TNG AAN CTG ATG CTG AAT GGC AAT AAA GGT TTT TCT TCT GAA GAT AAA GGC GAG TGG AAA CTG
NAN CTC GAT AAT GNG GGN ANN GGT CAA GCA GTA ATN NGT NTT CTT CCG NNN ANA AAA CTC GAT AAT GCG GGT AAC GGT CAA GCA GTA ATT CGT TTT CTT CCG TCT AAA
AAT GAT NAA CAA GCA CCA TTC GCA GAT CTT GTA AAT CAC GGT TTC AAG AAA AAT AAT GAT GAA CAA GCA CCA TTC GCA ATT CTT GTA AAT CAC GGT TTC AAG AAA AAT
GGT AAA TGG TAT ATT GAA ACA TGT TCA TCT ANN CAT GGT GAT TAC GAT TCT TGC GGT AAA TGG TAT ATT GAA ACA TGT TCA TCT ACC CAT GGT GAT TAC GAT TCT TGC
CCA GTA TGT CAA TAC ATC AGT AAA AAT GAT CTA TAC AAC ACT GAC AAT AAA GAG CCA GTA TGT CAA TAC ATC AGT AAA AAT GAT CTA TAC AAC ACT GAC AAT AAA GAG
TAC AGT CTT GTT AAA CGT AAA ACT TCT TAC TGG GCC AAC ATT CTT GTA GTA AAA TAC AGT CTT GTT AAA CGT AAA ACT TCT TAC TGG GCT AAC ATT CTT GTA GTA AAA
GAC CCA GCT GCT CCA GAA AAC GAA GGT AAA GTA TTT AAA TAC CGT TTC GGT AAG GAC CCA GCT GCT CCA GAA AAC GAA GGT AAA GTA TTT AAA TAC CGC TTT GGT AAG
AAA ATC TGG GAT AAA ATC AAT GCA ATG ATT GCG GTT GAT GTT GAA ATG GGT GAA AAA ATC TGG GAT AAA ATC AAT GCA ATG ATT GCG GTT GAT GTT GAA ATG GGT GAA
ACT CCA GTT GAT GTA ACT TGT CCG TGG GAA GGT GCT AAC TTT GTA CTG AAA GTT ACT CCA GTT GAT GTA ACT TGT CCG TGG GAA GGT GCT AAC TTT GTA CTG AAA GTT
AAA CAA GTT TCT GGA TTT AGT AAC TAC GAT GAA TCT AAA TTC CTG AAT CAA TCT AAA CAA GTT TCT GGA TTT AGT AAC TAC GAT GAA TCT AAA TTC CTG AAT CAA TCT
GCG ATT CCA AAC ATT GAC GAT GAA TCT TTC CAG AAA GAA CTG TTC GAA CAA ATG GCG ATT CCA AAC ATT GAC GAT GAA TCT TTC CAG AAA GAA CTG TTC GAA CAA ATG
GTT GAC CTT TCT GAA ATG ACT TCT AAA GAT AAA TTC AAA TCG TTT GAA GAA CTT GTT GAC CTT TCT GAA ATG ACT TCT AAA GAT AAA TTC AAA TCG TTT GAA GAA CTT
AAT ACT AAA TTC GGT CAA GTT ATG GGA ACT GCT GTG ATG GGC GGT GCT GCT GCA AAT ACT AAA TTC GGT CAA GTT ATG GGA ACT GCT GTG ATG GGC GGT GCT GCT GCA
ACT GCT GCT AAG AAA GCT GAT AAA GTT GCT GAT GAT TTG GAT GCA TTC AAT GTT ACT GCT GCT AAG AAA GCT GAT AAA GTG GCT GAT GAT TTG GAT GCA TTC AAT GTT
GAT GAC TTC AAT ACA AAA ACT GAA GAT GAT TTT ATG AGC TCA AGC TCT GGT AGT GAT GAC TTC AAT ACA AAA ACT GAA GAT GAT TTT ATG AGC TCA AGC TCT GGT AGT
TCA TCT AGT GCT GAT GAC ACG GAC CTG GAT GAC CTT TTG AAT GAC CTT TAA TCA TCT AGT GCT GAT GAC ACG GAC CTG GAT GAC CTT TTG AAT GAC CTT TAA
e – Rb69 RNase H (pDEST-C1 plasmid)
Forward Primer Sequencing Reaction
ATG GAT TTA GAA ATG ATG TTG GAT GAA GAT TAC AAA GAA GGT ATT GCG CTT GCA ATG GAT TTA GAA ATG ATG TTG GAT GAA GAT TAC AAA GAA GGT ATT GCG CTT GCA
GAC TTT AGT AAC ATT GCA TTG GCA GCT GCA TTA AAC AAC TTT GAA GAT GGT GAT GAC TTT AGT AAC ATT GCA TTG GCA GCT GCA TTA AAC AAC TTT GAA GAT GGT GAT
AAA ATT ACC GTT CCG ATG GTT CGT CAT GTA GTC TTG AAT TCA ATT CGT AAA AAC AAA ATT ACC GTT CCG ATG GTT CGT CAT GTA GTC TTG AAT TCA ATT CGT AAA AAC
GTA GTG ATG TTC CGT AAG CAA GGT TAT ACA AAA TTT GTA TTG TGC ATG GAT AAC GTA GTG ATG TTC CGT AAG CAA GGT TAT ACA AAA TTT GTA TTG TGC ATG GAT AAC
GCT ACT TCT GGG TAT TGG CGA CGC GAC TTT GCT TAC TAC TAC AAG AAA AAT CGT GCT ACT TCT GGG TAT TGG CGA CGC GAC TTT GCT TAC TAC TAC AAG AAA AAT CGT
AAA ACT GAT CGT GAA GCT TCA AAG TGG GAT TGG GAA GGA TAT TTT ACT GCA CTT AAA ACT GAT CGT GAA GCT TCA AAG TGG GAT TGG GAA GGA TAT TTT ACT GCA CTT
CAT CAA GTC GTT GAT GAG ATT AAG AAA TAT ATG CCA TAC GTT GTA ATG GAT ATT CAT CAA GTC GTT GAT GAG ATT AAG AAA TAT ATG CCA TAC GTT GTA ATG GAT ATT
GAC AAA TAC GAA GCG GAT GAC CAT ATC GGC GTA TTA ACT AAA TAT TTG TCA TTA GAC AAA TAC GAA GCG GAT GAC CAT ATC GGC GTA TTA ACT AAA TAT TTG TCA TTA
GCT GGT CAT AAG GTG TGT ATT GTT GCA TCA GAT GGT GAC TTT ACA CAA TTA CAC GCT GGT CAT AAG GTG TGT ATT GTT GCA TCA GAT GGT GAC TTT ACA CAA TTA CAC
AAA TAC CCT AAC GTT AAA CAG TGG TCG CCA CCG CAG AAA AAA TGG GTT AAA ATT AAA TAC CCT AAC GTT AAA CAG TGG TCG CCA CCG CAG AAA AAA TGG GTT AAA ATT
AAG AAT GGT TCT GCC GAA ATT GAT TGC ATG ACT AAA ATT CTT AAA GGC GAC CGT AAG AAT GGT TCT GCC GAA ATT GAT TGC ATG ACT AAA ATT CTT AAA GGC GAC CGT
AAA GAT GGT GTT GCG TCT GTT CGA GTT CGT GGT GAT TTC TGG TTT ACT CGA GTC AAA GAT GGT GTT GCG TCT GTT CGA GTT CGT GGT GAT TTC TGG TTT ACT CGA GTC
GAA NGC GAA CGA ACT CCA AGC ATG AAA ACA ACG ATC ATT GAA GCA CTT GCC AAT GAA GGC GAA CGA ACT CCA AGC ATG AAA ACA ACG ATC ATT GAA GCA CTT GCC AAT
GAT CGT TCT CAA GCT GAA GTA TTA TTA AGT GCA GAA NAA TAT AAA CGG TAC CAA GAT CGT TCT CAA GCT GAA GTA TTA TTA AGT GCA GAA GAA TAT AAA CGG TAC CAA
GAA AAT TTG GTT CTC ATT GAT TTT GAT TAT ATC CCT GAT AAT ATT GCT TCA ACC GAA AAT TTG GTT CTC ATT GAT TTT GAT TAT ATC CCT GAT AAT ATT GCT TCA ACC
ATT ATA GAG TAT TAT AAC TCA TAT CAN CCA CAA CCT AAA GGC AAG ATT TAT TCN ATT ATA GAG TAT TAT AAC TCA TAT CAA CCA CAA CCT AAA GGC AAG ATT TAT TCA
TAC TTT GTA AAA TCC GGT CTT TCT AAA TTA ACN AGT GTA ATT AAT GAA TTC TGA TAC TTT GTA AAA TCC GGT CTT TCT AAA TTA ACA AGT GTA ATT AAT GAA TTC TGA
Reverse Primer Sequencing Reaction
ATG GNT TTA GAA ATG ATN TG GAT GAN GAT TNC AAA GAA GGT ATT GCG CTT GCA ATG GAT TTA GAA ATG ATG TTG GAT GAA GAT TAC AAA GAA GGT ATT GCG CTT GCA
GAC TTT AGT ANC ATT GCA TNG NCA GCT GCA TTA AAC AAC TT GAA GAT GGT GAT GAC TTT AGT AAC ATT GCA TTG GCA GCT GCA TTA AAC AAC TTT GAA GAT GGT GAT
AAA ATT ACC GTT CCG ATG GTT CGT CAT GTA GTC TTG AAT TCA ATT NGT AAA AAC AAA ATT ACC GTT CCG ATG GTT CGT CAT GTA GTC TTG AAT TCA ATT CGT AAA AAC
GTA GTG ATG TTC CGT AAG CAA GGT TAT ACA AAA TTT GTA TTG TGC ATG GAT AAC GTA GTG ATG TTC CGT AAG CAA GGT TAT ACA AAA TTT GTA TTG TGC ATG GAT AAC
GCT ACT TCT GGG TAT TGG CGA CGC GAC TTT GCT TAC TAC TAC AAG AAA AAT CGT GCT ACT TCT GGG TAT TGG CGA CGC GAC TTT GCT TAC TAC TAC AAG AAA AAT CGT
AAA ACT GAT CGT GAA GCT TCA AAG TGG GAT TGG GAA GGA TAT TTT ACT GCA CTT AAA ACT GAT CGT GAA GCT TCA AAG TGG GAT TGG GAA GGA TAT TTT ACT GCA CTT
CAT CAA GTC GTT GAT GAG ATT AAG AAA TAT ATG CCA TAC GTT GTA ATG GAT ATT CAT CAA GTC GTT GAT GAG ATT AAG AAA TAT ATG CCA TAC GTT GTA ATG GAT ATT
GAC AAA TAC GAA GCG GAT GAC CAT ATC GGC GTA TTA ACT AAA TAT TTG TCA TTA GAC AAA TAC GAA GCG GAT GAC CAT ATC GGC GTA TTA ACT AAA TAT TTG TCA TTA
GCT GGT CAT AAG GTG TGT ATT GTT GCA TCA GAT GGT GAC TTT ACA CAA TTA CAC GCT GGT CAT AAG GTG TGT ATT GTT GCA TCA GAT GGT GAC TTT ACA CAA TTA CAC
AAA TAC CCT AAC GTT AAA CAG TGG TCG CCA CCG CAG AAA AAA TGG GTT AAA ATT AAA TAC CCT AAC GTT AAA CAG TGG TCG CCA CCG CAG AAA AAA TGG GTT AAA ATT
AAG AAT GGT TCT GCC GAA ATT GAT TGC ATG ACT AAA ATT CTT AAA GGC GAC CGT AAG AAT GGT TCT GCC GAA ATT GAT TGC ATG ACT AAA ATT CTT AAA GGC GAC CGT
AAA GAT GGT GTT GCG TCT GTT CGA GTT CGT GGT GAT TTC TGG TTT ACT CGA GTC AAA GAT GGT GTT GCG TCT GTT CGA GTT CGT GGT GAT TTC TGG TTT ACT CGA GTC
GAA GGC GAA CGA ACT CCA AGC ATG AAA ACA ACG ATC ATT GAA GCA CTT GCC AAT GAA GGC GAA CGA ACT CCA AGC ATG AAA ACA ACG ATC ATT GAA GCA CTT GCC AAT
GAT CGT TCT CAA GCT GAA GTA TTA TTA AGT GCA GAA GAA TAT AAA CGG TAC CAA GAT CGT TCT CAA GCT GAA GTA TTA TTA AGT GCA GAA GAA TAT AAA CGG TAC CAA
GAA AAT TTG GTT CTC ATT GAT TTT GAT TAT ATC CCT GAT AAT ATT GCT TCA ACC GAA AAT TTG GTT CTC ATT GAT TTT GAT TAT ATC CCT GAT AAT ATT GCT TCA ACC
ATT ATA GAG TAT TAT AAC TCA TAT CAA CCA CAA CCT AAA GGC AAG ATT TAT TCA ATT ATA GAG TAT TAT AAC TCA TAT CAA CCA CAA CCT AAA GGC AAG ATT TAT TCA
TAC TTT GTA AAA TCC GGT CTT TCT AAA TTA ACA AGT GTA ATT AAT GAA TTC TGA TAC TTT GTA AAA TCC GGT CTT TCT AAA TTA ACA AGT GTA ATT AAT GAA TTC TGA
f – Rb69 D132N RNase H (pDEST-C1 plasmid)
Forward Primer Sequencing Reaction
ATG GAT TTA GAA ATG ATG TTG GAT GAA GAT TAC AAA GAA GGT ATT GCG CTT GCA ATG GAT TTA GAA ATG ATG TTG GAT GAA GAT TAC AAA GAA GGT ATT GCG CTT GCA
GAC TTT AGT AAC ATT GCA TTG GCA GCT GCA TTA AAC AAC TTT GAA GAT GGT GAT GAC TTT AGT AAC ATT GCA TTG GCA GCT GCA TTA AAC AAC TTT GAA GAT GGT GAT
AAA ATT ACC GTT CCG ATG GTT CGT CAT GTA GTC TTG AAT TCA ATT CGT AAA AAC AAA ATT ACC GTT CCG ATG GTT CGT CAT GTA GTC TTG AAT TCA ATT CGT AAA AAC
GTA GTG ATG TTC CGT AAG CAA GGT TAT ACA AAA TTT GTA TTG TGC ATG GAT AAC GTA GTG ATG TTC CGT AAG CAA GGT TAT ACA AAA TTT GTA TTG TGC ATG GAT AAC
GCT ACT TCT GGG TAT TGG CGA CGC GAC TTT GCT TAC TAC TAC AAG AAA AAT CGT GCT ACT TCT GGG TAT TGG CGA CGC GAC TTT GCT TAC TAC TAC AAG AAA AAT CGT
AAA ACT GAT CGT GAA GCT TCA AAG TGG GAT TGG GAA GGA TAT TTT ACT GCA CTT AAA ACT GAT CGT GAA GCT TCA AAG TGG GAT TGG GAA GGA TAT TTT ACT GCA CTT
CAT CAA GTC GTT GAT GAG ATT AAG AAA TAT ATG CCA TAC GTT GTA ATG GAT ATT CAT CAA GTC GTT GAT GAG ATT AAG AAA TAT ATG CCA TAC GTT GTA ATG GAT ATT
GAC AAA TAC GAA GCG AAT GAC CAT ATC GGC GTA TTA ACT AAA TAT TTG TCA TTA GAC AAA TAC GAA GCG GAT GAC CAT ATC GGC GTA TTA ACT AAA TAT TTG TCA TTA
GCT GGT CAT AAG GTG TGT ATT GTT GCA TCA GAT GGT GAC TTT ACA CAA TTA CAC GCT GGT CAT AAG GTG TGT ATT GTT GCA TCA GAT GGT GAC TTT ACA CAA TTA CAC
AAA TAC CCT AAC GTT AAA CAG TGG TCG CCA CCG CAG AAA AAA TGG GTT AAA ATT AAA TAC CCT AAC GTT AAA CAG TGG TCG CCA CCG CAG AAA AAA TGG GTT AAA ATT
AAG AAT GGT TCT GCC GAA ATT GAT TGC ATG ACT AAA ATT CTT AAA GGC GAC CGT AAG AAT GGT TCT GCC GAA ATT GAT TGC ATG ACT AAA ATT CTT AAA GGC GAC CGT
AAA GAT GGT GTT GCG TCT GTT CGA GTT CGT GGT GAT TTC TGG TTT ACT CGA GTC AAA GAT GGT GTT GCG TCT GTT CGA GTT CGT GGT GAT TTC TGG TTT ACT CGA GTC
GAA NGC GAA CGA ACT CCA AGC ATG AAA ACA ACG ATC ATT GAA GCA CTT GCC AAT GAA GGC GAA CGA ACT CCA AGC ATG AAA ACA ACG ATC ATT GAA GCA CTT GCC AAT
GAT CGT TCT CAA GCT GAA GTA TTA TTA AGT GCA GAA GAA TAT AAA CGG TAC CAA GAT CGT TCT CAA GCT GAA GTA TTA TTA AGT GCA GAA GAA TAT AAA CGG TAC CAA
GAA AAT TTG GTT CTC ATT GAT TTT GAT TAT ATC CCT GAT AAT ATT GCT TCA ACC GAA AAT TTG GTT CTC ATT GAT TTT GAT TAT ATC CCT GAT AAT ATT GCT TCA ACC
ATT ATA GAG TAT TAT AAC TCA TAT CAA CCA CAA CCT AAN GGC AAG ATT TAT TCA ATT ATA GAG TAT TAT AAC TCA TAT CAA CCA CAA CCT AAA GGC AAG ATT TAT TCA
TAC TTT GTA AAA NCC CGG TCT TTC TAN TTA ANN AGT GTA ATT NAT GAA TTC TGA TAC TTT GTA AAA TCC GGT CTT TCT AAA TTA ACA AGT GTA ATT AAT GAA TTC TGA
Reverse Primer Sequencing Reaction
ATG NN TT AGA AAT GAT GTN GAT GAA GAT TAC AAA GAA GGT ATT GCG CTT GCA ATG GAT TTA GAA ATG ATG TTG GAT GAA GAT TAC AAA GAA GGT ATT GCG CTT GCA
GAC TTT AGT ANC AT GCA TNG GCA GCT GCA TTA AAC AAC TT GAA GAT GGT GAT GAC TTT AGT AAC ATT GCA TTG GCA GCT GCA TTA AAC AAC TTT GAA GAT GGT GAT
AAA ATT ACC GTT CCG ATG GTT CGT CAT GTA GTC TTG AAT TCA ATT CGT AAA AAC AAA ATT ACC GTT CCG ATG GTT CGT CAT GTA GTC TTG AAT TCA ATT CGT AAA AAC
GTA GTG ATG TTC CGT AAG CAA GGT TAT ACA AAA TTT GTA TTG TGC ATG GAA AAC GTA GTG ATG TTC CGT AAG CAA GGT TAT ACA AAA TTT GTA TTG TGC ATG GAT AAC
GCT ACT TCT GGG TAT TGG CGA CGC GAC TTT GCT TAC TAC TAC AAG AAA AAT CGT GCT ACT TCT GGG TAT TGG CGA CGC GAC TTT GCT TAC TAC TAC AAG AAA AAT CGT
AAA ACT GAT CGT GAA GCT TCA AAG TGG GAT TGG GAA GGA TAT TTT ACT GCA CTT AAA ACT GAT CGT GAA GCT TCA AAG TGG GAT TGG GAA GGA TAT TTT ACT GCA CTT
CAT CAA GTC GTT GAT GAG ATT AAG AAA TAT ATG CCA TAC GTT GTA ATG GAT ATT CAT CAA GTC GTT GAT GAG ATT AAG AAA TAT ATG CCA TAC GTT GTA ATG GAT ATT
GAC AAA TAC GAA GCG AAT GAC CAT ATC GGC GTA TTA ACT AAA TAT TTG TCA TTA GAC AAA TAC GAA GCG GAT GAC CAT ATC GGC GTA TTA ACT AAA TAT TTG TCA TTA
GCT GGT CAT AAG GTG TGT ATT GTT GCA TCA GAT GGT GAC TTT ACA CAA TTA CAC GCT GGT CAT AAG GTG TGT ATT GTT GCA TCA GAT GGT GAC TTT ACA CAA TTA CAC
AAA TAC CCT AAC GTT AAA CAG TGG TCG CCA CCG CAG AAA AAA TGG GTT AAA ATT AAA TAC CCT AAC GTT AAA CAG TGG TCG CCA CCG CAG AAA AAA TGG GTT AAA ATT
AAG AAT GGT TCT GCC GAA ATT GAT TGC ATG ACT AAA ATT CTT AAA GGC GAC CGT AAG AAT GGT TCT GCC GAA ATT GAT TGC ATG ACT AAA ATT CTT AAA GGC GAC CGT
AAA GAT GGT GTT GCG TCT GTT CGA GTT CGT GGT GAT TTC TGG TTT ACT CGA GTC AAA GAT GGT GTT GCG TCT GTT CGA GTT CGT GGT GAT TTC TGG TTT ACT CGA GTC
GAA GGC GAA CGA ACT CCA AGC ANG AAA ACA ACG ATC ATT GAA GCA CTT GCC AAT GAA GGC GAA CGA ACT CCA AGC ATG AAA ACA ACG ATC ATT GAA GCA CTT GCC AAT
GAT CGT TCT CAA GCT GAA GTA TTA TTA AGT GCA GAA GAA TAT AAA CGG TAC CAA GAT CGT TCT CAA GCT GAA GTA TTA TTA AGT GCA GAA GAA TAT AAA CGG TAC CAA
GAA AAT TTG GTT CTC ATT GAT TTT GAT TAT ATC CCT GAT AAT ATT GCT TCA ACC GAA AAT TTG GTT CTC ATT GAT TTT GAT TAT ATC CCT GAT AAT ATT GCT TCA ACC
ATT ATA GAG TAT TAT AAC TCA TAT CAA CCA CAA CCT AAA GGC AAG ATT TAT TCA ATT ATA GAG TAT TAT AAC TCA TAT CAA CCA CAA CCT AAA GGC AAG ATT TAT TCA
TAC TTT GTA AAA TCC GGT CTT TCT AAA TTA ACA AGN GTA ATT AAT GAA TTC TGA TAC TTT GTA AAA TCC GGT CTT TCT AAA TTA ACA AGT GTA ATT AAT GAA TTC TGA
The results of the sequencing reactions are highlighted in yellow, with the GOI nucleotide sequence below. Mismatches are not highlighted. In the forward primer reactions, the start codon present in the pDEST-C1 plasmid is highlighted in green and the stop codon is highlighted in red. The sequence coding for the His-Tag is in pink and the sequence coding for the TEV protease site is in blue; the linker sequence lies between the two. None of these elements are present in the reverse primer reactions. The mutated nucleotides are shown in red.
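The codon-by-codon comparison between a sequencing read and the GOI sequence can also be sketched programmatically. The following is a minimal illustration only, not the procedure used in this work: the function name `compare_codons`, its labels, and the choice to treat `N` as an uncalled base are assumptions made for the example. It expects the read and the reference as two separate, already aligned, space-separated codon strings, and it labels each differing codon as silent, as an amino acid change, as ambiguous (only `N` differs), or as incomplete (a codon with a missing base).

```python
from itertools import product

# Standard genetic code, built in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def compare_codons(read, reference):
    """Compare two aligned, space-separated codon strings (hypothetical helper).

    Returns a list of (index, read_codon, reference_codon, label) tuples, one
    for every position where the read differs from the reference.
    """
    differences = []
    for index, (r, g) in enumerate(zip(read.split(), reference.split())):
        if r == g:
            continue
        if len(r) != 3 or len(g) != 3:
            # A base is missing (or extra) in one of the two codons.
            label = "incomplete"
        elif any(a != b and a != "N" for a, b in zip(r, g)):
            # At least one called base disagrees with the reference.
            if "N" in r:
                label = "mismatch"
            else:
                aa_read = CODON_TABLE.get(r, "?")
                aa_ref = CODON_TABLE.get(g, "?")
                label = "silent" if aa_read == aa_ref else f"{aa_ref}->{aa_read}"
        else:
            # The only differences are uncalled bases (N) in the read.
            label = "ambiguous"
        differences.append((index, r, g, label))
    return differences


if __name__ == "__main__":
    # Excerpts taken from the I151D 32-B forward read and the GOI sequence.
    print(compare_codons("AAT GCA ATG GAT GCG GTT GAT",
                         "AAT GCA ATG ATT GCG GTT GAT"))
    print(compare_codons("TAC CGT TTC GGT", "TAC CGC TTT GGT"))
    print(compare_codons("GAA NGC GAA", "GAA GGC GAA"))
```

With the excerpts above, the first call reports the ATT to GAT change as I->D, the second reports the CGT/CGC and TTC/TTT differences as silent, and the third reports NGC as ambiguous. Because the comparison is purely positional, a read that has lost a base must be re-aligned before it is passed in; otherwise the affected codon is reported as incomplete.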