
Florida State University Libraries

Electronic Theses, Treatises and Dissertations The Graduate School

2007 Lombricine Kinase Structure and Substrate Specificity: A Paradigm for Elucidation of Substrate Specificity in Phosphagen Kinases D. Jeffrey Bush


THE FLORIDA STATE UNIVERSITY

COLLEGE OF ARTS AND SCIENCES

LOMBRICINE KINASE STRUCTURE AND SUBSTRATE SPECIFICITY: A PARADIGM FOR ELUCIDATION OF SUBSTRATE SPECIFICITY IN PHOSPHAGEN KINASES

By

D. JEFFREY BUSH

A Dissertation submitted to the Department of Chemistry and Biochemistry in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Degree Awarded: Spring Semester, 2007

The members of the Committee approve the Dissertation of D. Jeffrey Bush defended on February 20, 2007.

Michael S. Chapman Professor Co-Directing Dissertation

John Dorsey Professor Co-Directing Dissertation

W. Ross Ellington Outside Committee Member

Michael Blaber Committee Member

Approved:

______Joseph Schlenoff, Department Chair, Department of Chemistry & Biochemistry

______Joseph Travis, Dean, College of Arts & Sciences

The Office of Graduate Studies has verified and approved the above named committee members.


To the late Clifford M. Bush, who with statements such as “A heterogeneous compound of two or more substances whose ray through certain limits is confined to a specific area…” fostered a strong interest of the author in science at a very young age, if only just to know more about what he spoke.


ACKNOWLEDGEMENTS

I wish to first convey my sincere gratitude to my parents, Donald and Roberta, for raising me in the nurture and admonition of the Almighty God. With loving support and encouragement, all have fostered a strong belief that religion and science are by no means mutually exclusive, contrary to popular myth. I certainly would not have been able to achieve the accomplishments that I have without the discipline, perseverance, support and encouragement instilled at an early age.

My mentor, Dr. Michael Chapman, has also been a tremendous source of inspiration, advice and encouragement. I am forever grateful for his support, depth of knowledge in so many areas, and for his continued patience and kind demeanor while directing and focusing this research.

I would also like to thank my supervisory committee, particularly collaborating Professor Dr. W. Ross Ellington, whose helpful advice, suggestions and encouragement were crucial to the success of the work described herein. Many laboratory colleagues were also instrumental in the success of this research. Among those providing much helpful advice were Dr. Shawn Clark, Dr. Qing Xie, Dr. Arezki Azzi, Dr. James Gattis, Ms. Eliza Ruben, Mr. Omar Davulcu, Dr. Pamela Pruett, Dr. Mohammad Yousef, Dr. Andrei Korostelev, Dr. Felcy Gabriel, Ms. Nancy Meyer, Dr. Gregg Hoffman, Ms. Deanne Compaan, Ms. Irina Carbone, Dr. Wei Yang, Ms. Sarah Murray, Dr. Michael Zawrotny, and Mr. Travis Smith.

Two important influences who fostered my interest in science at an early age and to whom I am also grateful are my high school chemistry teacher, Mary Solarczyk, and Mr. Patrick Weaver, who urged me to take Ms. Solarczyk’s AP chemistry class, which I found remarkably fascinating.


My sister, Laura, also encouraged me and motivated me to put tremendous effort toward my studies from sheer amicable sibling rivalry if not simply out of spite to try to outperform her own considerable academic proclivities.

Last but certainly not least, I would like to thank the director of the FSU X-ray Facility, Dr. Thayumanasamy Somasundaram. Without the support and expert advice of ‘Soma’ in areas from crystal mounting, cryoprotection, data collection and processing, synchrotron data collection procedures, as well as his familiarity with unix-based crystallographic software packages, this work would certainly not have been possible.

Substrate-free dimeric (sf-d) x-ray data for lombricine kinase was collected at the Cornell High Energy Synchrotron Source (CHESS) which is supported by the National Science Foundation and the National Institutes of Health/National Institute of General Medical Sciences under award DMR-0225180.

Substrate-free monomeric (sf-m) x-ray data was collected at the Advanced Photon Source which was supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, under contract No. W-31-109-ENG-38.

Funding of this research by the National Institutes of Health (NIH) grant R01 GM55837 to Michael S. Chapman, the American Heart Association (AHA) grant 0315101B to D. Jeffrey Bush, and in part by the National Science Foundation (NSF) research training grant to Florida State University’s Institute of Molecular Biophysics is gratefully acknowledged.


TABLE OF CONTENTS

List of Tables
List of Figures
Abbreviations
Abstract

1. Introduction

Phosphagen (guanidino) kinase structure and function
Properties of the phosphagen substrates
Clinical significance of phosphagen kinases
Enzyme mechanism
Preorganization, substrate-substrate alignment and proximity
Substrate specificity

2. Materials and Methods

Biochemical reagents and instrumentation
RNA isolation and cDNA construction
PCR amplification, cloning and sequencing
Expression and purification of lombricine kinase
Multiple sequence alignments
Crystallization and cryoprotection
X-ray data collection, processing and reciprocal-space refinement
Calculation of structure factors and R-free test set generation
Homology modeling
Molecular replacement
Determination of the non-crystallographic symmetry operator
Structural refinement
Model building and real-space refinement
Comparative structural alignment of phosphagen kinases
Targeted molecular dynamics simulation of active-site closure

3. Results and Discussion

Expression and purification of lombricine kinase
Multiple sequence alignment of phosphagen kinases
Substrate Specificity
Crystallization and cryoprotection optimization
Structural validation
Evaluation of the reliability of predicted transition-state models of LK
Structural rationalization of potential specificity determinants

5. Conclusions

APPENDICES

A Commercially-available potential enzyme inhibitors
B Protocol for real-space refinement in “O”
C Biological buffers and solutions

REFERENCES

BIOGRAPHICAL SKETCH


LIST OF TABLES

Table 1: Comparative space group, unit cell and crystallization conditions

Table 2: Refinement strategy and R-factor statistics for sf-m data

Table 3: Refinement strategy and R-factor statistics for sf-d data

Table 4: Kinetic parameters of Annelid phosphagen kinases


LIST OF FIGURES

Figure 1: Structures of the phosphagen kinase substrates
Figure 2: The enzymatic mechanistic pathways of phosphoryl transfer
Figure 3: Presumed schematic kinetic mechanism of LK
Figure 4: The transition-state structure of phosphagen kinases
Figure 5: Lombricine kinase ADP-agarose chromatogram
Figure 6: Typical SDS PAGE gel after ADP chromatography
Figure 7: Purity verification by superdex-200HR chromatography
Figure 8: Superdex-200HR chromatogram of pre-ADP-column dialyzate
Figure 9: Lombricine kinase crystal photomicroscopy
Figure 10: Substrate-free sf-d crystal photomicroscopy
Figure 11: Substrate-free sf-m crystal photomicroscopy
Figure 12: Diffraction image of the LK-sf-d crystal form
Figure 13: Comparative hybrid diffraction image of sf-m and sf-d
Figure 14: The two-fold non-crystallographic LK dimer
Figure 15: Fit of high resolution sf-d LK model to electron density map
Figure 16: A multiple sequence alignment of phosphagen kinases
Figure 17A-C: Multiple sequence alignment of TK and LK with AK
Figure 18: Ramachandran plot of the sf-d form of lombricine kinase
Figure 19: Global comparison of LK sf-m and sf-d structures
Figure 20: Theoretical closed TSAC homology model of LK
Figure 21: Arginine in the TSAC structure of arginine kinase
Figure 22: Ribbon diagram of superimposed LK sf-d and sf-m structures
Figure 23: Global theoretical closed TSAC homology model of LK


ABBREVIATIONS

ADP: Adenosine diphosphate
AK: Arginine kinase
APS: Advanced Photon Source
ATP: Adenosine triphosphate
BLAST: Basic Local Alignment Search Tool
CCD: Charge-coupled device
CCP4: Collaborative Computational Project Number 4
CHARMM: Chemistry at Harvard Molecular Mechanics
CHESS: Cornell High Energy Synchrotron Source
CK: Creatine kinase
CNS: Crystallography and NMR System
DEAE: Diethylaminoethyl
DNA: Deoxyribonucleic acid
dT: Deoxythymine
DTT: Dithiothreitol
E.C.: Enzyme Class
E. coli: Escherichia coli
EDTA: Ethylenediaminetetra-acetic acid
H.R.: High Resolution
IPTG: Isopropyl-beta-D-thiogalactopyranoside
kcat: Catalytic enzyme turnover rate
KM: Michaelis constant
LB: Luria-Bertani media
LD: Lock-dock
LK: Lombricine kinase
MgADP-: Magnesium adenosine diphosphate
Mme: Mono methyl ether
NCS: Non-crystallographic symmetry
NMR: Nuclear Magnetic Resonance
PAGE: Polyacrylamide gel electrophoresis
PCR: Polymerase chain reaction
PDB: Protein databank
PEG: Polyethylene glycol
pH: -log of hydrogen ion concentration
PLom: Phospholombricine
RNA: Ribonucleic acid
SDS: Sodium dodecyl sulfate
SER-CAT: Southeast Regional Collaborative Access Team
sf-d: Substrate-free dimeric crystal form
sf-m: Substrate-free monomeric crystal form
SN1: Substitution, nucleophilic, unimolecular
SN2: Substitution, nucleophilic, bimolecular
TLS: Translation, Libration, Screw
TSAC: Transition State Analogue Complex


ABSTRACT

Despite the existence of structures of substrate-bound and substrate-free forms of two highly specific members of the family of phosphagen (guanidino) kinases, namely arginine and creatine kinase, the full determination of the structural correlates of substrate specificity within this family of enzymes has remained elusive. This has prompted the expansion of research into the structural correlates of specificity of each phosphagen kinase for its preferred substrate to include lombricine kinase (LK), the largest and least specific of these enzymes. The culmination of this work has firmly established a paradigm for substrate specificity in multi-substrate enzyme reactions. As the least specific of the phosphagen (guanidino) kinases, lombricine kinase complements the highly specific arginine and creatine kinases as a subject of study. The primary goals of this research were to reach a more advanced structural understanding of the basis for substrate specificity within this family of enzymes, to elucidate how lombricine kinase catalyzes reactions with a wider variety of substrates, and specifically to test the hypothesis that its activity on this wider variety of substrates is attributable to its ability to accurately pre-align these substrates in a way that arginine and creatine kinases cannot. Here we report the largely identical structures of two substrate-free forms of lombricine kinase crystallized at two different pHs and analyze them within the context of their differing space groups, non-crystallographic symmetry, and structural differences upon superimposition with other phosphagen kinase structures. A predicted structural homology model for the closed lombricine kinase transition-state analogue conformation has then been generated from a sequence alignment and quasi-rigid sub-domain motions known to exist in AK, as described by Yousef, 2003.
The robustness of the predicted model of the closed LK TSAC structure was supported by CHARMM-based targeted molecular dynamics (TMD) methods, using the open substrate-free lombricine kinase structure as the starting model and the AK TSAC structure as the target. Interpretation of the model, along with evidence from multiple sequence alignments, comparisons of substrate structures and structural superimposition of the predicted LK model with the closed arginine kinase transition-state structure, indicates that the structural correlates of lombricine specificity are likely to include regions of sequence that are conserved in both known LKs and physically close to the beta carbon of the substrate-analog arginine in the hybrid model. Lysine 83, which is distinct from AKs and strictly conserved in both known LK sequences, appears to position its side chain toward the beta carbon of arginine. Further analysis of this model shows that a pair of histidines (H187 and H313) and a pair of cysteines (C58 and C270), one on each side of the substrate, are in the general vicinity and could mediate lombricine specificity through salt bridges that would stabilize the negative charge on the phosphate moiety between the beta and gamma carbons of the substrate lombricine, the only position where lombricine differs from the structure of the substrate arginine (Figure 1). The results show conclusive evidence that Lysine 83 of lombricine kinase is clearly involved in mediating the specificity of LK toward the substrate lombricine.


INTRODUCTION

Phosphagen (guanidino) kinase structure and function

Lombricine, arginine, and creatine kinases (E.C. 2.7.3) are homologous cellular ATP-buffering enzymes of high sequence conservation and structural similarity that catalyze the reversible transphosphorylation of their respective substrates, or guanidino acceptor compounds (phosphagens), and belong to the enzyme family of phosphagen (guanidino) kinases. This ATP-dependent phosphorylation of energy-storing substrates is central to the short-term temporal buffering of cellular energy [3, 4] and, more controversially, to the spatial buffering of sites of energy production from sites of energy consumption by a phosphagen shuttle [5-7] capable of more rapid diffusion than the high-energy nucleotide. A wide array of endergonic cellular processes is driven by nucleotide hydrolysis, from powering molecular motors and transmembrane active transport to muscle contraction, ribosomal peptide synthesis, and signal transduction [8-17]. Furthermore, as evident from the following reaction scheme and free energy-equilibrium constant relationship, the free energy of hydrolysis of ATP is not fixed, but rather is highly dependent upon the concentration ratio [ADP]/[ATP]; hence the physiological benefit of cellular maintenance of this ratio at an energetically favorable value [18].
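In its standard textbook form, the reaction scheme and free energy relationship can be written as follows. This is a reconstruction under the usual biochemical standard-state conventions and may not match the author's exact notation; the symbols $\Delta G^{\circ\prime}$ (standard transformed free energy change), $K'_{\mathrm{eq}}$ (apparent equilibrium constant), $R$ (gas constant) and $T$ (absolute temperature) are introduced here for illustration.

```latex
\begin{align*}
\mathrm{ATP} + \mathrm{H_2O} &\rightleftharpoons \mathrm{ADP} + \mathrm{P_i} \\
\Delta G &= \Delta G^{\circ\prime} + RT \ln\frac{[\mathrm{ADP}][\mathrm{P_i}]}{[\mathrm{ATP}]},
\qquad \Delta G^{\circ\prime} = -RT \ln K'_{\mathrm{eq}}
\end{align*}
```

At a fixed phosphate concentration, lowering [ADP]/[ATP] makes the logarithmic term, and therefore the free energy of hydrolysis, more negative.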

Despite this low ratio, or rather high proportion of ATP relative to ADP, the entire ATP pool would be depleted within one second during a burst of muscle activity, and the maximal rate of respiratory ATP regeneration is only about 1/16th of that required to maintain the ATP concentration [19], which further reveals the importance of creatine kinase and other phosphagen kinases for ATP regeneration. Under normal physiological conditions, the ATP hydrolysis reaction is maintained by oxidative phosphorylation far displaced from equilibrium, such that the free energy of hydrolysis exceeds -60 kJ/mol in certain cells [20].

Common structural hallmarks of this enzyme family include a small N-terminal alpha-helical domain connected by a flexible linker sequence to a much larger C-terminal beta-sheet domain flanked by alpha-helices [21-25]. Two flexible loops, one from each domain, fold over the active site upon substrate binding to align catalytic groups and enable guanidyl deprotonation, thereby enhancing its nucleophilicity, partially mediating substrate specificity, and preventing wasteful hydrolysis of the bound high-energy nucleotide [1, 26-28]. Other notable common features are an active-site cysteine residue that has been implicated in substrate binding synergy [29] and a strictly conserved NEEDH region, exploited for primer design [30].

Properties of the phosphagen substrates

Different organisms utilize a great diversity of phosphagen substrates that differ significantly in size, shape, electrostatic potential, and thermodynamic poise (as shown in Figure 1) [20, 31, 32]. Even within certain invertebrates, there is a wide diversity of phosphagen kinases and phosphagens. However, in tissues such as spermatozoa that are critical to their survival, these invertebrates preferentially express creatine kinase and carry and store the majority of their energy reserves as creatine phosphate [7], likely due to the enhanced selectivity, higher diffusivity and thermodynamic poise of this phosphagen, which is intimately related to the better ability of creatine kinase to buffer ATP at much higher ATP/ADP ratios than other phosphagen/phosphagen kinase systems [7, 33]. In addition to buffering ATP concentrations, which allows for high energy output from ATP hydrolysis during periods of disequilibrium between ATP supply and demand, the phosphagens are also involved in regulation of glycogenolysis, proton buffering, and intracellular energy transport [20].


Figure 1: Structures of the phosphagen kinase substrates. This enzyme family (E.C. 2.7.3) consists of phosphotransferases with a nitrogenous acceptor. Arginine is shown in the phosphagen form. LK likely distinguishes lombricine from the highly structurally similar arginine by enzyme interactions with the phosphodiester moiety of lombricine, the only region where the substrates differ.


Clinical significance of phosphagen kinases

Highly expressed in human cardiac musculature, creatine kinase catalyzes a reaction critical to meeting the energy burst requirements of muscle and neuronal cells and transport epithelia when cytoplasmic glycolysis and mitochondrial oxidative phosphorylation are insufficient to meet such elevated and fluctuating energy demands. The equilibrium of this reaction is pH sensitive, with lower pH situations such as intracellular acidosis or elevated serum bicarbonate levels, as would be the case following exertion, favoring ATP synthesis. The current study focuses less on the physiological role of these enzymes than on their suitability as models for understanding the fundamentals of substrate specificity in multi-substrate enzyme reactions. Nevertheless, this enzyme family is clearly linked through off-shunts of the urea cycle to nitric oxide production, a deficiency of which “has been implicated in the pathogenesis of increased pulse wave reflection associated with systolic hypertension”, and L-arginine is known to decrease systolic hypertension [34], likely because it is a precursor of the vasodilator nitric oxide [35]. Nitric oxide, originally known as endothelium derived relaxing factor, is a biological messenger that plays a role in vasodilation, neurotransmission, modulation of the hair cycle and penile erections [36-38].

Different phosphagen kinase isoforms exist in different proportions in various tissues. Upon muscle injury, these enzymes are released into the serum, thus forming the clinical basis for differential diagnoses of certain disorders, including acute myocardial infarction (heart attack) [39] and muscular dystrophy [40]. Elevated serum levels of creatinine, a cyclic condensation product of the substrate creatine, are also diagnostic for renal insufficiency [41].

Enzyme mechanism

Enzymes in general have been shown to achieve catalysis through a number of different mechanisms, including acid-base catalysis, covalent catalysis, transition-state stabilization, ground-state destabilization, and pre-organization or proximity effects [42]. Past contentious debates in the literature regarding the relative importance of entropic contributions, active-site pre-organization, alignment, and proximity to catalytic rate enhancements underscore the need for further evidence on how these mechanisms are partitioned, because for many enzymes the “known” mechanism is merely a plausible interpretation of limited data. Such conjecture manifests in the countering viewpoints of Koshland and Jencks. William Jencks contends that translational and overall rotational motions supply the entropic driving force for enzymatic catalytic rate enhancements but that substrate alignment is less important [43]. Conversely, Daniel Koshland’s calculations indicate that 10^3 to 10^5 fold rate enhancements can be attained for an optimally oriented bimolecular reaction relative to one that is randomly oriented [44].

Enzymatic phosphoryl transfer reactions in particular have only been shown to follow three distinct mechanisms (Figure 2): a dissociative SN1-like mechanism that proceeds via a resonance-stabilized metaphosphate intermediate and requires a solvent with a relatively high dielectric constant to stabilize the leaving group; a completely associative addition-elimination mechanism that proceeds through a trigonal bipyramidal pentacoordinate phosphorane intermediate; or a concerted partially associative SN2-like mechanism that proceeds via a loose transition-state with minimal bond formation to the nucleophile and extensive bond cleavage to the leaving group (the sum of the bond orders is less than 1) [45].

Figure 2: The three observed enzymatic mechanistic pathways of phosphoryl transfer proceed through one of the transition states or intermediates shown. The fully associative pathway proceeds via the phosphorane intermediate at left. The fully dissociative pathway proceeds via the metaphosphate intermediate at right. The partially associative concerted pathway proceeds via the loose transition state at center. The transition-state and intermediates shown are indicated by empirical evidence from multiple kinetic isotope effects from Hengge (2002). Isotopic labeling of the gamma phosphate oxygens of ATP during the phosphagen kinase catalyzed reaction reveals inversion of stereochemistry at phosphorus, indicating that this enzyme family proceeds via the loose transition state at center [2] (from Hansen and Knowles, 1981).

Past NMR and kinetic studies have shown that creatine and arginine kinases share the same rapid equilibrium, random-order, bimolecular-bimolecular mechanism [46, 47] with substrate-binding synergy (Figure 3) that is mediated by an active-site reactive cysteine, the catalytic necessity of which has been debated [29]. Catalytic rates of ~135 s^-1 [48, 49] are ostensibly consistent with large conformational changes within the enzyme upon substrate binding [50-53]. Isotopic labeling of the ATP gamma phosphoryl oxygens revealed inversion of stereochemistry at phosphorus that was consistent with a direct partially associative in-line phosphoryl transfer [2]. Recent computational data has suggested that the catalysis of these enzymes and possibly their specificity might be mediated partially through an (n→σ*) anomeric or stereoelectronic effect between lone pairs of the double-bonded gamma oxygen and an antibonding orbital of the high energy bond on either the phosphagen or the nucleotide [54]. This anomeric effect weakens the high energy N-P bond on the substrate guanidinium by lengthening it by 0.1 to 0.2 Å, thereby facilitating the reaction [54].

Figure 3: Presumed schematic kinetic mechanism for the phosphorylation reaction catalyzed by lombricine kinase. E denotes the enzyme, Lom denotes the substrate lombricine and PLom denotes the phosphorylated substrate (phosphagen).


Concomitantly, guanidinyl deprotonation occurs by way of one of the active site catalytic bases (Glu225 and Glu314 in recombinant Limulus polyphemus arginine kinase) that also serve to hold and align the phosphorylated substrate guanidinium for nucleophilic attack by the β-phosphorus of ADP. This base catalysis, however, appears not to be of primary importance for catalysis, as structurally-directed mutagenesis and kinetics have shown that prior candidates for acid-base catalysis play accessory, but not critical, roles in catalysis [55]. Despite such evidence, guanidinyl deprotonation allows the nitrogen lone pairs to participate in resonance, which likely facilitates the anomeric effect and increases the lability of the N-P bond on the phosphagen substrate, thereby enabling catalysis.

Preorganization, substrate-substrate alignment and proximity

Refinement of an arginine kinase transition-state structure at unusually high resolution (1.2 Å) indicated that nucleotide and phosphagen substrates were aligned within 3° of optimal (Figure 4), implicating pre-ordering as a potentially important part of the catalytic effect [28] and confirming past evidence from other enzymes in vitro and in silico [44, 56, 57]. Comparison of a TLS (translation-libration-screw) tensor analysis of the anisotropic thermal factors of this high resolution transition-state structure with the structural differences between substrate-bound and substrate-free arginine kinase has shown that there are four rigid fragments and two flexible loops that move upon substrate binding to close the active site [27]. At the end of these motions the substrates are put in extraordinarily precise reactive pre-alignment to achieve catalysis. In arginine kinase, structures of ternary complexes of the enzyme co-crystallized with MgADP-, nitrate, and each of four non-cognate substrate analogues show that the analogues all bind and induce transition-state-like conformational changes within the enzyme, yet exhibit no activity. Further analysis of these non-cognate complexes along with transition-state mutant structures of arginine kinase reveals that only a modest displacement of the reactive guanidinium, with no significant displacement of the nitrate or nucleotide, is sufficient to completely inactivate the enzyme and render it unable to pre-order the reactive moieties along optimal orbital reaction trajectories.
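To make the notion of angular alignment concrete, the following minimal sketch computes how far a nucleophile, phosphorus and leaving-group arrangement deviates from a perfectly in-line geometry. It assumes, for illustration only, that "optimal" means a 180° in-line arrangement; the function names and the coordinates are hypothetical and are not taken from any deposited structure.

```python
import math

def angle_deg(a, b, c):
    """Return the angle a-b-c in degrees for three 3D points."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to guard against rounding just outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_theta))

def inline_deviation_deg(nucleophile, phosphorus, leaving_group):
    """Deviation of the nucleophile-P-leaving group angle from 180 degrees."""
    return 180.0 - angle_deg(nucleophile, phosphorus, leaving_group)

# Hypothetical coordinates in angstroms, built so the nucleophile is
# tilted 3 degrees away from the in-line axis (assumption for illustration).
phosphorus = (0.0, 0.0, 0.0)
leaving_group = (0.0, 0.0, -1.7)
tilt = math.radians(3.0)
nucleophile = (3.0 * math.sin(tilt), 0.0, 3.0 * math.cos(tilt))

deviation = inline_deviation_deg(nucleophile, phosphorus, leaving_group)
print(f"deviation from in-line geometry: {deviation:.2f} degrees")
```

With real coordinates, the same calculation could be applied to the reacting atoms of a refined structure, although the published analysis may define optimal alignment differently.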


Figure 4: The reversible partially associative transition-state of phosphagen kinases showing extraordinarily precise substrate alignment. The nitrate mimics the planar phosphoryl during catalysis. The grey atoms indicate the theoretical transition state from small molecule calculations and the colored atoms represent the experimental analogue structure. The dotted line represents the reaction trajectory. The experimental omit map density frames the atoms [1].


It is also evident that the enzyme attempts to strain these substrates into conformations of higher energy than in free solution that, if attainable, would allow them to react. Lombricine kinase represents a more convenient system in which to study pre-alignment because with lombricine kinase, unlike arginine kinase, the alternate substrates are partially active, and so it is possible to test whether catalytic activity is correlated to the precision with which the induced-fit changes align the substrates. Substrate-substrate alignment is an extension of the ideas of specificity by induced fit, an extension of broad potential applicability to the many two-substrate reactions in metabolism, and a fundamental advance.

The relative importance of entropic contributions to catalytic rate-enhancements has been the source of contentious debate in the literature [57-62] since the 1970s, when Koshland suggested that substrate alignment and “orbital steering” effects might play a significant role [44, 61, 63]. Jencks, however, pointed out that unimolecular enzymes have very small entropic contributions to catalytic rate-enhancement [62, 64, 65] and that proximity effects were largely due to translational entropy, in close agreement with the computational evidence generated by Warshel [57, 58]. Such entropic effects for multi-substrate enzymes, however, have not been measured in any great detail, thus underpinning the need for further study in this area. Computational evidence from both unimolecular and bimolecular systems suggests that enthalpy is more important than entropy [66, 67] for the adoption of the catalytic near-attack conformation on the path to the transition state.

Correspondingly, research from the Chapman laboratory, such as the previously described mismatched transition-state-like structures of arginine kinase with four non-cognate substrates, suggests that preorganization, along with substrate alignment and enthalpic constraining of the substrates into reactive proximity, may play a larger role in the catalytic rate enhancements of phosphagen kinases than previously had been thought.
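A rough sense of scale for the rate enhancements debated in this section can be obtained from transition-state theory, in which a given rate enhancement corresponds to a lowering of the activation free energy by RT ln(enhancement). The sketch below is illustrative only: the function name and the temperature of 298.15 K are assumptions and are not taken from the cited studies.

```python
import math

# Molar gas constant in kJ mol^-1 K^-1.
R_KJ = 8.314462618e-3

def barrier_reduction_kj(rate_enhancement, temperature_k=298.15):
    """Activation free energy lowering (kJ/mol) equivalent to a rate enhancement.

    Uses ddG = R * T * ln(enhancement), assuming the pre-exponential
    factor is unchanged between the two reactions being compared.
    """
    if rate_enhancement <= 0:
        raise ValueError("rate_enhancement must be positive")
    return R_KJ * temperature_k * math.log(rate_enhancement)

# The 10^3 to 10^5 fold range attributed to optimal orientation [44].
low = barrier_reduction_kj(1e3)
high = barrier_reduction_kj(1e5)
print(f"{low:.1f} to {high:.1f} kJ/mol")
```

Under these assumptions, enhancements of 10^3 and 10^5 fold correspond to barrier reductions of roughly 17 and 29 kJ/mol, respectively.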


Substrate specificity

Historically, the consensus as to how enzymes achieved specificity for catalysis was largely based upon Fischer’s lock-and-key theory, in which only substrates with the correct geometry and chemical complementarity were capable of fitting into the active site [68]. Koshland’s theory of induced fit addressed several inadequacies of lock-and-key, including specificity against smaller substrates that could still fit in the active site [69]. Koshland proposed that only cognate substrates would elicit the conformational changes required to bring catalytic residues into the correct reactive proximity with the substrates [69]. Experimental paradigms used to develop these ideas, however, were predominantly single-substrate enzyme reactions that were technically easier to visualize and in which specificity was critical to survival. Such enzymes, including serine proteases and amino-acyl tRNA synthetases, have evolved exceptionally precise specificity, considering the deleterious consequences of indiscriminate proteolysis and stochastic translation processes [70-74], and therefore are likely to be atypical of the bisubstrate enzymes like phosphagen kinases that dominate metabolism. Rigorous studies of more representative metabolic reactions, especially the more common multi-substrate enzyme reactions, might reveal other mechanisms of specificity. The arginine kinase structure from the Chapman laboratory was the first of a multi-substrate unmodified enzyme in transition-state form, where the substrates were not covalently linked, but bound independently, thus offering an unprecedented visualization of a multi-substrate reaction at work [21, 28], and perhaps offering a further paradigm for substrate alignment and specificity.
Research building upon a prior 1.2 Å resolution structure of a transition-state analogue complex of arginine kinase, along with kinetic data and structures of mutants, chimera, and complexes with non-cognate substrates, has ruled out two lock-and-key hypotheses and implicated induced-fit conformational changes in the enzyme’s specificity. Potentially new is the role of substrate pre-alignment, which appears to be precise enough only with cognate substrates. As it has not previously been possible to study multi-substrate reactions in such detail, this may be a common mechanism of specificity that has remained unrecognized.

Although determination of the full complement of the structural correlates of substrate specificity for phosphagen kinases remains elusive, two variable loop regions within these enzymes have consistently been implicated [75-77]. One of these loops (residues 59-64 of arginine kinase) is located in the small N-terminal alpha-helical domain and facilitates binding of the phosphagen substrate through backbone hydrogen bonds to the carboxylate of the substrate. The other loop, in the large domain (residues 312-319 of arginine kinase), serves to align and position the guanidinium of the substrate for optimal nucleophilic attack on ATP at the distal region of the active site [27, 28, 78]. Interestingly, this same loop in creatine kinase mediates specificity by forming a hydrophobic mini-pocket which accommodates the methyl group of creatine [26]. Kinetic studies of structurally-directed chimeric constructs in which creatine kinase small-domain loop residues were mutagenically inserted into arginine kinase revealed measurable activity for arginine, but not creatine. Phosphocreatine binding was also not enhanced by these insertions, as determined by the specificity index, kcat/KM, clearly indicating that although important for specificity, this loop represents only part of the story [79]. Structures of the arginine kinase mutants E225Q and E314D showed subtle distortions of the precise phosphagen-nucleotide alignment, which kinetic studies revealed to have little impact on substrate binding constants but to cause considerable reductions in enzymatic activity, thus providing further evidence of the catalytic importance of substrate-substrate alignment.
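As a worked illustration of the specificity index, the sketch below recovers kcat from kcat/KM and KM and compares two substrates of the same enzyme. The numerical values are the lombricine kinase parameters reported later in this section [81]; the function and variable names are hypothetical and chosen only for this example.

```python
def kcat_from_index(specificity_index, km_mM):
    """Recover kcat (s^-1) from kcat/KM (s^-1 mM^-1) and KM (mM)."""
    return specificity_index * km_mM

def discrimination(index_preferred, index_alternate):
    """Ratio of specificity indices: how strongly one substrate is preferred."""
    return index_preferred / index_alternate

# Reported apparent parameters for lombricine kinase [81].
LOMBRICINE = {"km_mM": 5.33, "kcat_over_km": 3.37}
TAUROCYAMINE = {"km_mM": 15.31, "kcat_over_km": 0.48}

kcat_lom = kcat_from_index(LOMBRICINE["kcat_over_km"], LOMBRICINE["km_mM"])
kcat_tau = kcat_from_index(TAUROCYAMINE["kcat_over_km"], TAUROCYAMINE["km_mM"])
preference = discrimination(LOMBRICINE["kcat_over_km"],
                            TAUROCYAMINE["kcat_over_km"])

print(f"kcat (lombricine)   = {kcat_lom:.2f} s^-1")
print(f"kcat (taurocyamine) = {kcat_tau:.2f} s^-1")
print(f"preference for lombricine = {preference:.1f}-fold")
```

On these figures, kcat is about 18.0 s^-1 for lombricine and 7.3 s^-1 for taurocyamine, and the specificity index favors lombricine by roughly seven-fold, which is consistent with the description of this enzyme as having relaxed rather than absolute specificity.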
The role of alignment in catalysis has been further probed by the structure determination of arginine kinase with four arginine homologues: imino-ethyl-ornithine, L-citrulline, L-ornithine, and D-arginine, with no mutation of the enzyme (Clark, S.A., Bush, J., et al., unpublished). All were co-crystallized with MgADP⁻ and nitrate in ternary transition state analog complexes (TSAC). These non-cognate substrates bind and induce the transition-state enzyme conformations but remain inactive. These structures strikingly reveal that only a modest deviation in the position of the guanidinium with little concomitant disposition of nitrate

or ADP coordination is sufficient to render the enzyme completely inactive, lending further credibility to Koshland’s theory of orbital steering. Past substrate inhibition kinetic studies [80] with arginine kinase used various arginine analogues in an attempt to correlate the relative apparent binding constants of these substrate analogues with their stereochemical, electrostatic, hydrogen bonding potential, hydrophobicity, and other features. Such an analysis would facilitate the determination of which corresponding features of the enzyme enabled it to discriminate between the preferred substrate and a closely related analogous substrate. A similar analysis can be performed for cognate substrates, in which the specificity index (kcat/KM) is used to directly make this determination. The specificity index is the ratio of the catalytic rate constant (kcat) for a substrate to its apparent binding constant (KM). The results of this study [80] indicated that the strength of binding of each of these substrates decreased with decreasing length of substrate carbon chain away from the ideal substrate length of arginine. Further, the substrate carboxylate was important for binding but the guanidinium was not. Although the guanidinium contributes little to substrate binding, positive charge surrounding the substrate δ-nitrogen is important, as evidenced by the improved binding of ornithine relative to citrulline. Ornithine is arginine with the guanidinium truncated at the δ-nitrogen, whereas citrulline is arginine with one of the terminal nitrogens of the guanidinium replaced with an oxygen atom.

Lombricine kinase is a biological homodimer with somewhat relaxed substrate specificity relative to the highly specific arginine and creatine kinases, as it has been shown to achieve catalysis with either lombricine or taurocyamine. The reported apparent binding constant and specificity index of LK for lombricine are KM = 5.33 mM and kcat/KM = 3.37 s⁻¹ mM⁻¹, whereas those for taurocyamine are KM = 15.31 mM and kcat/KM = 0.48 s⁻¹ mM⁻¹, respectively [81]. The non-crystallographic multimeric state, as described herein, appears to have a dependence upon pH as well as the presence of substrates as shown in Figure 14. The substrate lombricine (guanidinoethyl phosphoserine) is interesting in that it possesses a

D-serine moiety in annelids and an L-serine moiety in echiuroids [20], suggesting that the enzyme can tolerate more than one rotameric state of the substrate side chain. The research presented herein extends the efforts to elucidate the structural determinants of substrate specificity in phosphagen kinases to another member of this enzyme family, lombricine kinase. The 1.8 Å substrate-free structure of lombricine kinase is described along with its predicted transition-state-like homology-modeled structure, and the potential structural determinants of substrate specificity that are implicated by comparative analysis with the transition-state structure of arginine kinase.
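The kinetic constants quoted above for LK [81] can be compared directly with a few lines of arithmetic. The following is a minimal sketch, assuming only the relationship kcat = (kcat/KM) × KM; the dictionary layout and function names are illustrative and do not come from the cited study.

```python
# Reported apparent constants for LK, as quoted in the text [81].
# KM in mM, specificity index kcat/KM in s^-1 mM^-1.
REPORTED = {
    "lombricine": {"km_mM": 5.33, "kcat_over_km": 3.37},
    "taurocyamine": {"km_mM": 15.31, "kcat_over_km": 0.48},
}


def implied_kcat(substrate):
    """Turnover number (s^-1) implied by kcat = (kcat/KM) * KM."""
    entry = REPORTED[substrate]
    return entry["kcat_over_km"] * entry["km_mM"]


def preference(preferred, alternative):
    """Ratio of specificity indices for two substrates."""
    return REPORTED[preferred]["kcat_over_km"] / REPORTED[alternative]["kcat_over_km"]


if __name__ == "__main__":
    for name in REPORTED:
        print(f"{name}: implied kcat = {implied_kcat(name):.2f} s^-1")
    ratio = preference("lombricine", "taurocyamine")
    print(f"lombricine is preferred over taurocyamine by a factor of {ratio:.1f}")
```

On these reported values the specificity index favors lombricine by roughly sevenfold, which is modest compared with the near-absolute discrimination described for arginine and creatine kinases.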

MATERIALS AND METHODS

Biochemical reagents and instrumentation

All biochemicals were obtained from Sigma Chemical Company (St. Louis, MO) and Fisher Scientific (Pittsburgh, PA) with the following exceptions. Electrophoresis reagents were obtained from Bio-Rad (Richmond, CA). Molecular biological reagents were obtained from Roche Molecular Biochemicals (Indianapolis, IN), Promega (Madison, WI), EMD Biosciences (San Diego, CA) and Life Technologies (now Invitrogen, Madison, WI). Chromophore coupling enzymes for activity assays were obtained from Roche Molecular Biochemicals (Indianapolis, IN). Chromatographic media were obtained from Amersham Pharmacia Biotech (now G.E. Healthcare, Piscataway, NJ). Crystallization reagents were obtained from Hampton Research (Aliso Viejo, CA) and Nextal (now Qiagen, Valencia, CA). Live specimens of the marine echiuroid worm Urechis caupo were obtained from Sea Life Supply (Sand City, CA). Chromatography was performed on an Akta FPLC system (G.E. Healthcare, Piscataway, NJ).

RNA Isolation, cDNA construction, PCR amplification, cloning and sequencing

Total RNA was isolated from the body wall musculature of fresh Urechis caupo using Trizol Reagent (Life Technologies/Invitrogen). The cDNA was generated from a small amount of total RNA with a Ready-to-Go-You-Prime kit (Roche Molecular Biochemicals) using 0.5 μg of a lock-dock oligo(dT) (LD/dT) primer [82]. The cDNA was then PCR amplified using the LD/dT primer and a forward “universal” primer (5’-AA(C/T)GA(A/G)GA(A/G)GA(C/T)CA(C/T)-3’), corresponding to the NEEDH peptide that is strictly conserved in all phosphagen kinases. The PCR amplification was performed in a Hybaid OmnE thermocycler using ExTaq DNA polymerase (Takara) with a previously described touchdown protocol [83]. The resulting approximately 1100

base pair product was subcloned using a TA cloning kit (Life Technologies/Invitrogen) and the plasmids were isolated and sequenced using an ABI Model 3100 DNA sequencer. The subsequently deduced amino acid sequence of this PCR product was largely similar to that of Eisenia LK. A 5’-RACE kit (Invitrogen) [84] was used for rapid amplification of the cDNA ends following the design of two gene-specific reverse primers: LK5RACE1 (GTAGCCCAGACGCTGGTTGGACATGTA) and LK5RACE2 (CTCGCGGCGTTCTTCTTCTTCATCTGTTCA). The subsequent 800 base pair product was then subcloned and sequenced as previously described. Start and stop codons and restriction sites were added to the full-length Urechis LK cDNA using two gene-specific primers by PCR amplification of the original reverse transcript [30]. The resulting cDNA transcript was purified using Wizard PCR preps (Promega) and ligated into the pETblue-1 AccepTor vector (Novagen). NovaBlue competent cells (Novagen) were then transformed with this plasmid vector, plated, and blue/white screened for transformed clones. Plasmids from selected clones were isolated, restriction digested and sequenced to verify the orientation and sequence of inserts. Sequence- and orientation-verified plasmid constructs were then used to transform Tuner (DE3)pLacI competent E. coli cells (Novagen), followed by plating and selection of transformants. Transformed colonies were grown in liquid media, cryoprotected in 30% glycerol, and stored at -80 °C to be used for protein expression.
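The degenerate “universal” forward primer described above can be checked computationally. The sketch below expands the primer, as written in the text, into every concrete sequence it represents and confirms that each one encodes the NEEDH peptide. The helper names and the reduced codon table (only the codons this primer can produce) are illustrative choices, not part of the original protocol.

```python
import itertools
import re

# Degenerate primer exactly as written in the text (5' to 3').
PRIMER = "AA(C/T)GA(A/G)GA(A/G)GA(C/T)CA(C/T)"

# Only the standard-code codons that this primer can generate.
CODONS = {
    "AAC": "N", "AAT": "N",
    "GAA": "E", "GAG": "E",
    "GAC": "D", "GAT": "D",
    "CAC": "H", "CAT": "H",
}


def expand(primer):
    """Return every concrete sequence encoded by a primer with (X/Y) choices."""
    tokens = re.findall(r"\(([ACGT/]+)\)|([ACGT])", primer)
    choices = [group.split("/") if group else [base] for group, base in tokens]
    return ["".join(combo) for combo in itertools.product(*choices)]


def translate(dna):
    """Translate an in-frame DNA string with the reduced codon table."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    variants = expand(PRIMER)
    peptides = {translate(v) for v in variants}
    print(len(variants), "primer variants encode", sorted(peptides))
```

Five two-fold degenerate positions give 32 distinct 15-mers, all of which translate to the same five-residue peptide.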

Expression and purification of lombricine kinase

A number of techniques were attempted in order to increase the amount of solubly expressed LK, including cloning and expression of an N-terminal fusion construct known as Nus-Tag (with 6X His), which has been shown to dramatically improve the solubility of many bacterially expressed proteins [85]. Nickel-chelation chromatography bracketing enterokinase-mediated cleavage of this N-terminal fusion protein, however, was not conducive to scale-up. The key to large-scale soluble expression proved to be using the original wild-type inclusion-body construct combined with low-temperature expression. Liquid LB media containing 1% glucose and 50 μg/mL

carbenicillin was inoculated and cells were grown in a 1.8 L Fernbach flask at 37 °C to an optical density at 600 nm of ~0.5. The incubator temperature was then dropped to 15 °C for 1 hour and the culture was then induced with 1 mM IPTG. Cells were grown for 24 hours post-induction at 15 °C and 270 rpm in a refrigerated incubator-shaker and then harvested. The cells were pelleted at 10,000 × g and resuspended in 50 mL of lysis buffer (50 mM Tris/HCl, 100 mM NaCl, 10 mM EDTA, 15% w/v sucrose, 14 mM beta-mercaptoethanol, pH 8.2 at ambient temperature). The resuspended cells were lysed by exhaustive processing of the suspension through a model 110L Microfluidizer (Microfluidics Corporation, Newton, MA) at an operating pressure of 13,000 psi. Cell debris was pelleted by centrifugation at 14,000 × g and the resulting raw lysate supernatant was filtered through glass wool and dialyzed exhaustively against DEAE running buffer (10 mM Tris, 1 mM EDTA, 5 mM KCl, 0.02% w/v NaN3, 1 mM DTT, pH 8.6 at ambient temperature). Precipitated protein was pelleted by centrifugation at 14,000 × g and the centrifuged dialyzate was loaded onto a DEAE anion-exchange column (GE Healthcare) that had been previously equilibrated with ten column volumes of running buffer. A salt gradient of 0-120 mM KCl over ten column volumes eluted and resolved the LK peak at ~60 mM KCl. Subsequent SDS PAGE revealed that the most abundant band in the gel of these DEAE peak fractions was consistent with LK subunits, at ~41,000 Da. The juxtaposition of an activity assay with this chromatogram confirmed a high propensity of this chromatographic peak for phosphorylation of lombricine and taurocyamine. The DEAE peak fractions were then pooled and exhaustively dialyzed against ADP-agarose running buffer (50 mM Bis-Tris, 1 mM MgCl2, 5 mM NaNO3, 1 mM DTT, pH 6.5 at ambient temperature). The adenosine-3’,5’diphosphate-agarose column (Sigma) was equilibrated with ten column volumes of running buffer, and the dialyzate was loaded onto the column. The column was then washed until a baseline absorbance was established (~3 column volumes). A shallow salt gradient of 0-80 mM NaCl over ten column volumes resolved the chromatographic critical pair (Figure 5), namely the active properly folded dimeric LK and a putative misfolded conformer. Subsequent SDS

PAGE produced a single band consistent with the relative molecular weight of LK subunits (Figure 6).

Figure 5: A typical lombricine kinase ADP-agarose transition-state affinity chromatogram. The plot is milli-absorbance units versus minutes. The yellow line is a trace of percent eluent and red is detector response (conductivity in mS/cm). LK elutes at ~96 min at a flow rate of 0.3 mL/min and a gradient of 0-100 mM KCl over 10 column volumes.

Figure 6: A typical SDS-PAGE gel showing over-expression and purity of LK after ADP-agarose transition-state affinity chromatography. The numbers are the corresponding fractions from the chromatogram in Figure 5.

The peak fractions from ADP-agarose transition-state affinity chromatography [86] (15 μL each) were then sequentially injected onto an analytical Superdex 200HR column (G.E. Healthcare) that had been pre-equilibrated with ten column volumes of Superdex running buffer (25 mM Tris/HCl, 150 mM KCl, 1 mM EDTA, 1 mM DTT, pH 8.1 at ambient temperature). This chromatogram (Figure 7) yielded a single highly symmetric peak with a retention time of ~36 minutes at a flow rate of 0.4 mL/min. A subsequent activity assay of this peak sample revealed a high propensity for phosphorylation of taurocyamine. In addition to the 36 minute peak, a Superdex 200HR chromatogram of the pre-ADP column dialyzate sample revealed a second peak with a retention time of ~38 minutes at the same flow rate that had no propensity for phosphorylation of taurocyamine and is presumably the putative misfolded conformer (Figure 8).
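Because retention times depend on flow rate, it can help to restate the two gel-filtration peaks as retention volumes. The following is a small sketch using only the retention times and flow rate quoted above; the function name is illustrative.

```python
def retention_volume_ml(retention_time_min, flow_rate_ml_per_min):
    """Retention volume (mL) = retention time (min) x flow rate (mL/min)."""
    return retention_time_min * flow_rate_ml_per_min


FLOW_RATE = 0.4  # mL/min, Superdex 200HR runs described in the text

# Approximate retention times reported in the text.
PEAKS_MIN = {"active dimer": 36.0, "putative misfolded conformer": 38.0}

if __name__ == "__main__":
    for label, minutes in PEAKS_MIN.items():
        volume = retention_volume_ml(minutes, FLOW_RATE)
        print(f"{label}: {volume:.1f} mL")
```

The two species are separated by less than 1 mL of eluent, which is consistent with the text's description of them as a chromatographic critical pair.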

Figure 7: Purity verification via a high resolution analytical Superdex-200HR chromatogram following ADP-agarose chromatography. The high peak symmetry and absence of the shoulder peak that was present in a sample of pre-ADP column dialyzate (Fig. 8) confirm purification.

Figure 8: High resolution Superdex-200HR chromatogram showing the chromatographic critical pair: active dimeric LK and inactive improperly folded monomer in the pre-ADP column dialyzate.

Lombricine kinase was initially partially purified by a combination of affinity and ion-exchange chromatographies with columns of Cibacron Blue sepharose (Amersham), and DEAE sepharose respectively. Even with very long shallow salt gradients during the Cibacron Blue chromatographic step, subsequent analytical Superdex 200HR chromatography still contained both of the unresolved peaks, one of which (Tr = 36 min) was consistent with the interpolated molecular weight of native dimeric lombricine kinase. The other (Tr = 38 min) was presumably the aforementioned putative misfolded conformer. Juxtaposition of eluent fractions from analytical Superdex 200HR chromatographic peaks with an activity assay indeed shows that the peak for the active dimer corresponds very reproducibly with the Tr = 36 min peak. Swapping the Cibacron

Blue sepharose chromatographic step for the previously described 3’5’-ADP transition-state affinity chromatography and running a long shallow gradient removed the Tr = 38 min misfolded peak from a subsequent analytical Superdex 200HR chromatography step. The resulting pure active dimeric LK was then exhaustively dialyzed against concentration buffer (10 mM Tris, 5 mM KCl, 2 mM DTT, pH 8.1 at ambient temperature), concentrated to 30 mg/mL with an Amicon pressurized stirred cell, and used for crystallization trials.

Multiple sequence alignments

Multiple sequence alignments were constructed with the program CLUSTAL-X [87] and visualized with the program GENEDOC (http://www.psc.edu/biomed/genedoc).

Crystallization and cryoprotection

Initial crystallization attempts for both the transition-state and substrate-free LK crystals used two strategies: sampling conditions similar to those successful with arginine and creatine kinases, and screening conditions previously developed against a larger panel of proteins [88]. The initial crystal forms so obtained were then rapidly and efficiently screened further and tested for reproducibility through a crystallization service [89], which conserved precious sample. Photographs were taken of these crystallization trials immediately after setup, and at days 1, 3, and 7 to show the progression of crystal growth and to characterize protein solubility curves based on the relative rates of crystallization, precipitation, or remaining solubility (Figure 9). Subsequently, chaotropic and kosmotropic additives, precipitant concentrations, and buffer pH were iteratively and empirically optimized, with results judged by crystal size, morphology, and diffraction quality. The effectiveness of cryoprotection was assessed as described elsewhere [90, 91] by flash freezing and irradiating mother

Figure 9: Lombricine kinase crystal photomicroscopy. The images in the top row show the progression of crystal growth immediately after setup, and at days 1, 3, and 7 respectively. The images in the bottom row show subsequent improvements in crystal morphology upon optimization of crystallization conditions, from small needle-like crystals at left to twinned crystals and finally to large well-defined single crystals at right.

liquor with progressively increasing percentages of various cryoprotectants, such as glycerol, until sharp ice-ring diffraction became diffuse (~35% glycerol). To prevent harsh changes in ionic strength that might crack crystals, the crystals were transferred gradually to higher percentages of glycerol in mother liquor in 5% increments, soaking the crystals at each interval for 1-2 minutes to a final concentration of 35% glycerol in mother liquor.

Figure 10: This is an image of a typical initial substrate-free (sf-d) crystal of lombricine kinase, a non-crystallographic dimer. The crystals grow fully and robustly in approximately one week. The crystal is in a 50 nL drop with a longest dimension of approximately 0.6 mm.

Data were collected for two distinct substrate-free crystal forms in two different crystallographic multimeric states at two different pHs. The first and best diffracting LK crystal form, sf-d (~1.75 Å) (Figure 10), is a crystallographic dimer possessing two copies of the ~42 kDa, 366 residue protein chain per asymmetric unit and was set up at an initial protein concentration of 28 mg/mL as assessed by a Bradford assay and ultraviolet absorbance at 280 nm with a theoretical extinction coefficient of 0.77. The sf-d crystal form was grown by the hanging drop vapor diffusion method against a mother liquor containing 15 mM BisTris, 0.2 M NaNO3, 1 mM DTT, 20% PEG 3350 mme, pH 6.8 and had approximate dimensions of 0.3 × 0.4 × 0.15 mm. For the other LK crystal form, later determined to also be a substrate-free form and designated sf-m (Figure 11), TSAC

components with taurocyamine had to be made up in a 5X stock so they could be diluted in situ to 1X (5 mM MgCl2, 40 mM ADP, 50 mM taurocyamine, 15 mM BisTris, 20% w/v PEG 3350 mme, 1 mM DTT, pH 5.8). This presented a minor challenge in solubilizing the taurocyamine, which was overcome by adding warm buffer to the taurocyamine and slightly raising the pH. The supersaturated solution was then allowed to cool to ambient temperature before adding the temperature-sensitive DTT and ADP, followed by final pH adjustment. Crystals were set up and cryoprotected as described for the sf-d crystal, but with the addition of TSAC components. The best diffracting crystal of this form had approximate dimensions of 0.15 × 0.35 × 0.1 mm and diffracted to 2.5 Å resolution. Both crystal forms were grown at 4 °C.

Figure 11: A typical crystallographic monomeric lombricine kinase crystal (sf-m) after optimization of crystallization conditions. The crystal dimensions are approximately 150 by 350 microns and the crystal diffracts to 2.5 Å resolution.
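The concentrations required in the 5X TSAC stock follow directly from the 1X values. The sketch below assumes that the concentrations listed in the text are the final 1X values (an interpretation consistent with Table 1) and covers only the three substrate components; the names are illustrative.

```python
# Assumed final (1X) concentrations of the TSAC substrate components, in mM.
ONE_X_MM = {"MgCl2": 5.0, "ADP": 40.0, "taurocyamine": 50.0}


def stock_concentrations(final_mm, fold):
    """Concentrations (mM) needed in a stock that is diluted `fold` times."""
    return {name: conc * fold for name, conc in final_mm.items()}


def stock_volume_ul(total_volume_ul, fold):
    """Volume of stock (uL) to add so that it ends up at 1X in the total volume."""
    return total_volume_ul / fold


if __name__ == "__main__":
    for name, conc in stock_concentrations(ONE_X_MM, fold=5).items():
        print(f"{name}: {conc:.0f} mM in the 5X stock")
```

On this reading the stock must hold 250 mM taurocyamine, which gives some context for the solubility difficulty described above.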

X-ray data collection, processing, and reciprocal-space refinement

Many x-ray crystallographic datasets have been collected both at the x-ray facility of Florida State University’s Institute of Molecular Biophysics (IMB) as well as at synchrotron facilities (Cornell High Energy Synchrotron Source - CHESS and The Advanced Photon Source - APS). Crystals were primarily screened at the IMB x-ray facility, with the crystallization conditions producing the most well-diffracting crystals being reproduced for synchrotron data collection. The IMB x-ray facility houses two

Rigaku RU-H2R water-cooled rotating copper anode x-ray generators capable of producing 1.541 Å x-rays with maximum tube potential and current of 60 kV and 200 mA, respectively. Both generators are equipped with Oxford Cryosystems cryostats for low-temperature data collection as well as Osmic confocal mirror monochromators with a 0.3 × 3.0 mm² collimator focus. One of these generators is equipped with an R-Axis IIc image-plate detector comprised of two 19 × 19 cm² units, while the other is equipped with a Peltier-cooled Mar-165 CCD (charge-coupled device) detector (Mar-USA). Prior to crystal screening data collection, the cryostat was allowed to cool to 100 K, as the rotating anode was gradually brought up to operating potential, current and power of 40 kV, 90 mA, and 1.2 kW, respectively. Following cryoprotection, the size of each diffraction-quality crystal was estimated by comparing mounted cryoloops (Hampton Research) of various sizes with the crystal under magnification. The appropriate cryoloop was selected accordingly and the crystal was looped out of the cryoprotectant (~35% glycerol in mother liquor), mounted on the goniometer and immediately flash frozen by quickly blocking and releasing the cryostream. The crystal was then centered in the x-ray beam by aligning the crystal at ϕ=0° and ϕ=90°. Initial diffraction images were taken with an oscillation sweep width of 1.0°, at ϕ=0° and ϕ=90°, with a crystal-detector distance of 200 mm and an exposure of 8 minutes. The best of the sf-d (substrate-free) datasets, as judged by high resolution, low Rmerge, low mosaicity, high completeness and high data redundancy, was collected on February 21, 2004 at the A1 beam line of the Cornell High Energy Synchrotron Source (CHESS) (Ithaca, NY) (Figure 12).
The radiation was generated by accelerating electrons through a potential of 13.6 keV, producing monochromatic x-rays of wavelength 0.9764 Å with a flux density on the order of 10¹² photons per 0.01 mm² of collimator area. The data were collected at 100 K using an Oxford Cryosystems cryostat supplying a dry liquid nitrogen stream and acquired on an ADSC Quantum 210 CCD detector composed of four 2048 × 2048 pixel modules [92]. The data were processed, indexed, integrated, merged, refined and scaled with the programs Denzo and Scalepack as part of the HKL crystallographic software package [93]. Data for the

Figure 12: A typical diffraction image of the sf-d (substrate-free) dimeric form of lombricine kinase. Data were present to 1.7 Angstrom resolution and indexed to space group P2(1). The data were collected at the A1 beam line of the Cornell High Energy Synchrotron Source (CHESS), Ithaca, NY

sf-m form of lombricine kinase were collected at the SER-CAT bending magnet beamline 22-BM at the Advanced Photon Source at Argonne National Laboratory, Argonne, IL. A comparison of crystallization conditions, space group, unit cell dimensions and multimeric form is shown in Table 1 below.
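The two x-ray wavelengths quoted in this section can be related to photon energy through E = hc/λ. The following is a minimal sketch; the constant hc ≈ 12.398 keV·Å is a standard physical value, while the function and variable names are illustrative.

```python
HC_KEV_ANGSTROM = 12.398419843  # Planck constant x speed of light, in keV*Angstrom


def photon_energy_kev(wavelength_angstrom):
    """Photon energy (keV) for a given x-ray wavelength (Angstrom)."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelengths quoted in the text.
SOURCES = {
    "rotating copper anode (IMB)": 1.541,
    "CHESS A1 beam line": 0.9764,
}

if __name__ == "__main__":
    for label, wavelength in SOURCES.items():
        energy = photon_energy_kev(wavelength)
        print(f"{label}: {wavelength} A corresponds to {energy:.2f} keV")
```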

Table 1: Comparative space group, unit cell and crystallization conditions for two distinct multimeric forms of lombricine kinase, a biological dimer. The most striking difference in setup conditions is that the sf-m form crystallized at a full pH unit lower and had transition-state analogue components present in the mother liquor. There appears to be a pH or substrate dependence of the space group and non-crystallographic symmetry

Crystal form                       LK-sf-d                           LK-sf-m
Crystallographic multimeric form   Dimer                             Monomer
Space group                        P2₁                               C222₁
Unit cell dimension a (Å)          74.34                             67.69
Unit cell dimension b (Å)          59.55                             77.99
Unit cell dimension c (Å)          85.85                             141.12
Unit cell angle α (°)              90                                90
Unit cell angle β (°)              105                               90
Unit cell angle γ (°)              90                                90
Buffer and pH                      15 mM Bis Tris, pH 6.8            15 mM Bis Tris, pH 5.8
Precipitant                        20% w/v PEG 3350 mme,             20% w/v PEG 3350 mme,
                                   0.2 M NaNO3                       0.2 M NaNO3
Substrates                         None                              5 mM MgCl2, 40 mM ADP,
                                                                     50 mM taurocyamine
Divalent metal                     None                              Mg2+
Reducing agent                     1 mM DTT                          1 mM DTT
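As a plausibility check on the multimeric states in Table 1, the unit cell volumes and crystal packing densities can be estimated from the cell parameters. The sketch below is an illustration rather than part of the original analysis: it uses the general unit cell volume formula and the Matthews coefficient (cell volume per dalton of protein), and it assumes the ~42 kDa subunit mass quoted earlier, two copies per asymmetric unit for sf-d and one for sf-m, and the standard number of general positions for P2₁ (2) and C222₁ (8).

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume (cubic Angstrom) from cell lengths (Angstrom) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def matthews_coefficient(volume, n_symops, copies_per_asu, mass_da):
    """Cell volume per dalton of protein (cubic Angstrom per Da)."""
    return volume / (n_symops * copies_per_asu * mass_da)


SUBUNIT_MASS_DA = 42000.0  # approximate subunit mass quoted in the text

# Cell parameters from Table 1; symmetry counts and copy numbers are assumptions
# stated in the lead-in above.
FORMS = {
    "sf-d": {"cell": (74.34, 59.55, 85.85, 90.0, 105.0, 90.0), "n_symops": 2, "copies": 2},
    "sf-m": {"cell": (67.69, 77.99, 141.12, 90.0, 90.0, 90.0), "n_symops": 8, "copies": 1},
}


def summarize(form):
    """Return (cell volume, Matthews coefficient) for a named crystal form."""
    entry = FORMS[form]
    volume = cell_volume(*entry["cell"])
    vm = matthews_coefficient(volume, entry["n_symops"], entry["copies"], SUBUNIT_MASS_DA)
    return volume, vm


if __name__ == "__main__":
    for name in FORMS:
        volume, vm = summarize(name)
        print(f"{name}: V = {volume:.0f} A^3, Vm = {vm:.2f} A^3/Da")
```

Under these assumptions both forms give a packing density of about 2.2 Å³/Da, within the range commonly seen for protein crystals, which is consistent with a dimer in the sf-d asymmetric unit and a monomer in the sf-m asymmetric unit.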

The relative unit cell dimensions and resolution are perhaps better represented by the hybrid diffraction image shown in Figure 13, where the left hand side of the diffraction pattern shows the lower resolution sf-m crystallographic monomer and the right hand side of the diffraction pattern shows the much higher resolution crystallographic dimer from (sf-d). The images were scaled similarly and the radius of the diffuse water ring was crudely matched for this general comparison.

Figure 13: Comparative hybrid diffraction image showing a side-by-side comparison of reflections from the 2.5 Angstrom resolution sf-m form of lombricine kinase (left-hand side) versus those from the 1.7 Angstrom resolution sf-d form of the enzyme (right-hand side), revealing the relative unit cell dimensions and resolution of the similarly scaled images.

Calculation of Structure Factors and R-free Test Set Generation

Following data reduction, the merged scaled intensities from the scalepack.sca file were converted to structure factor amplitudes with the CNS utility program to_cns [94], and converted to CCP4 format [95] with the program MTZDUMP. A test set of random reflections for R-free unbiased cross-validation was generated from 4% of the reflection data for each dataset in CNS with the module make_cv.inp. These reflections were not used in subsequent refinement so as to provide a metric to monitor bias and overfitting of the model.
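The idea behind the R-free test set can be illustrated with a short, generic sketch. It is not the CNS make_cv.inp implementation; it simply flags a random 4% of reflection indices as a test set that would be withheld from refinement. The function name, the seed and the reflection count in the usage example are hypothetical.

```python
import random


def assign_free_flags(n_reflections, fraction=0.04, seed=0):
    """Return a list of booleans; True marks a reflection held out for R-free."""
    rng = random.Random(seed)  # fixed seed so the same test set is reproduced
    n_test = round(n_reflections * fraction)
    test_indices = set(rng.sample(range(n_reflections), n_test))
    return [index in test_indices for index in range(n_reflections)]


if __name__ == "__main__":
    flags = assign_free_flags(50000)  # hypothetical number of unique reflections
    print(sum(flags), "of", len(flags), "reflections reserved for cross-validation")
```

Keeping the selection reproducible matters because the same reflections must stay excluded through every refinement cycle for R-free to remain an unbiased indicator.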

Homology Modeling

Due to the initial extensive unsuccessful effort aimed at locating a robust molecular replacement probe, a homology modeling protocol was established to generate a molecular replacement probe that would be more likely representative of lombricine kinase. In order to maximize the chance of success, a search of the protein data bank for existing phosphagen kinase crystal structures was further narrowed by close examination of multiple sequence alignments of these structures with the lombricine kinase sequence. This strategy combined with BLAST searches revealed a high identity, highly similar sequence for which a crystal structure existed. The most appropriate model thus determined, appeared to be human ubiquitous mitochondrial creatine kinase, with the PDB code 1QK1, with approximately 56% sequence identity to LK. Each residue of the LK-aligned sequence of this creatine kinase was then computationally mutated to each of its lombricine kinase homologues with the program SEAMAN [96] to lay an LK sequence on this creatine kinase structural scaffold. Gaps were ignored and the corresponding residue numbering remained of the creatine kinase type. In order to account for side-chain structural variations, lengths, and atom numbers, and to maintain spatial restraints, the model was then energy-optimized with the program Modeller [97, 98] and the resulting homology model was used as the molecular replacement probe (phasing model) to solve the sf-m rotation and translation

functions. The program Modeller was also used to generate a predicted closed structure of LK. The program takes as input a homologous structure (in this case the closed transition-state structure of arginine kinase) and a sequence alignment of LK with this arginine kinase, and produces a homology model that maintains the pre-existing structural restraints.
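The sequence identity used above to choose a template (approximately 56% between LK and the creatine kinase 1QK1) is computed from a pairwise alignment. The following is a minimal sketch of one common convention, counting identical residues over aligned columns that contain no gap; the two short sequences are hypothetical toy strings, not the real LK or creatine kinase sequences, and other denominators are also in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Hypothetical toy alignment for illustration only.
    target = "NEEDH-LKA"
    template = "NEEDHGLRA"
    print(f"{percent_identity(target, template):.1f}% identity")
```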

Molecular Replacement

Molecular replacement involves the use of a pre-existing highly similar structure as an initial estimate of the phases for the unknown structure. The pre-existing structure, known as the phasing model (here the previously described homology model), is placed into the unit cell of the unknown structure (LK in this case) and rotated and translated until the discrepancy between the Patterson function calculated from the phasing model and that calculated from the observed data for the unknown structure is minimized, indicating that the phasing model is placed correctly in the unknown unit cell [99]. A hybrid electron density map can then be calculated using structure factor amplitudes derived from reciprocal-space x-ray intensity data from the unknown structure and initial phases calculated from the phasing model. None of the existing phosphagen kinase crystal structures apparently possessed sufficient structural similarity to provide a robust molecular replacement solution for LK; the homology model, generated as previously described by laying the LK sequence on the structural scaffold of human ubiquitous mitochondrial creatine kinase, was the first probe to do so. This homology model was used to obtain the molecular replacement solution for the 2.5 Å substrate-free monomer structure (sf-m) of lombricine kinase. The probe was placed into the LK unit cell and the six-dimensional orientation and positional search was separated into two three-dimensional searches, with rotation and translation functions calculated independently. A robust solution was

generated for the rotation and translation searches upon Patterson correlation refinement in CNS and was confirmed using the independent programs Phaser [100] and AMoRe [101], both of which are available as part of the CCP4 software package [102]. The programs have complementary advantages. AMoRe uses a relatively crude but quick algorithm and can therefore follow up many more potential solutions than is possible with the other programs [103]. CNS is more computationally intensive and able to follow up fewer possibilities, but its refinements at each step reduce the chance that error will cause the correct solution to be overlooked [104, 105]. Phaser is a new implementation that uses a maximum likelihood objective function, and has met with high success rates in finding the real solution among the very top-scoring options [106]. The refined substrate-free monomeric (sf-m) structure of LK was then used as a molecular replacement probe to solve the rotation and translation functions for the substrate-free dimer (sf-d) structure.

Determination of the Non-Crystallographic Symmetry Operator

Characterization of non-crystallographic symmetry from the diffraction pattern can allow the dramatic improvement of electron density maps through averaging of the equivalent parts [107]. In order to determine the non-crystallographic symmetry operator for use as a restraint in further refinement of the crystallographic dimer (sf-d), the refined sf-m model was placed into the sf-d unit cell and the rotation and translation functions were solved for the first subunit using the program CNS. Two large, clear and robust solutions emerged for the cross rotation function, and a single clear and robust solution emerged for the translation function. This first solution was subsequently fixed in position, another copy of the sf-m model was inserted into the sf-d crystal unit cell and a rotation search was performed for the second copy. Patterson correlation refinement was then applied to improve upon the initial orientations and positions obtained by grid search. Rigid groups of residues were defined by sequence alignment to the known small and large domains of arginine kinase. The sequence-aligned LK residues so defined were residues 5-95 for the small N-terminal alpha-helical domain and residues 122-366 for the large C-terminal beta-sheet

domain. To verify the molecular replacement solution, the cross-rotation function solutions were checked for consistency with self-rotation functions. In order to achieve this, a self rotation function was performed with the previously fixed first solution to define a rotation origin, which, as expected, produced a very large self rotation peak at the origin (0,0,0). Finally, a simple cross rotation search was performed using this fixed solution at the origin to again find the other subunit. The non-crystallographic symmetry operator thus determined was:

\[
\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix}
 0.23567 & -0.97149 & -0.02591 \\
-0.97130 & -0.23634 &  0.02678 \\
-0.03214 &  0.01886 & -0.99931
\end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
\]

The application of this operator to the first subunit generated the 2-fold non-crystallographic symmetry defining the asymmetric unit as shown in Figure 14:

Figure 14: Two-fold non-crystallographic LK dimer. This sf-d crystal form structure of lombricine kinase representing the asymmetric unit shows the 2-fold non-crystallographic symmetry between the two subunits (blue and grey, respectively). The N- and C-termini of each subunit are labeled. The image was visualized and rendered in VMD.
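The rotational part of the non-crystallographic symmetry operator reported above can be checked numerically. The sketch below reads the nine coefficients as a 3 × 3 rotation matrix (an interpretation of the printed operator) and tests whether it is a proper rotation and what angle it corresponds to; no translation component is given in the text, so none is considered. Function names are illustrative.

```python
import math

# Coefficients of the reported operator, read row by row as a 3x3 matrix.
NCS = [
    [0.23567, -0.97149, -0.02591],
    [-0.97130, -0.23634, 0.02678],
    [-0.03214, 0.01886, -0.99931],
]


def determinant(m):
    """Determinant of a 3x3 matrix; +1 indicates a proper rotation."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def max_orthogonality_error(m):
    """Largest absolute deviation of M * M^T from the identity matrix."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            dot = sum(m[i][k] * m[j][k] for k in range(3))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(dot - target))
    return worst


def rotation_angle_deg(m):
    """Rotation angle from the trace: trace = 1 + 2*cos(angle)."""
    trace = m[0][0] + m[1][1] + m[2][2]
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


if __name__ == "__main__":
    print(f"determinant = {determinant(NCS):.4f}")
    print(f"max orthogonality error = {max_orthogonality_error(NCS):.1e}")
    print(f"rotation angle = {rotation_angle_deg(NCS):.1f} degrees")
```

To within the rounding of the printed coefficients, the matrix is orthonormal with a determinant of +1 and a rotation angle within a fraction of a degree of 180°, as expected for a 2-fold axis.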

Structural Refinement

The sf-m structure was solved by iterative cycles of refinement with the programs CNS, Refmac-5 [108] from the CCP4 suite and Phenix [109] combined with model building and real-space refinement and validation in “O” [110] and Coot [111]. The steps in the refinement strategy along with the corresponding R-factor statistics for the sf-m form are listed in Table 2 below. Once the crystallographic monomer (sf-m) reached an Rfree of 29.06, this improved and less biased model was used as a molecular replacement probe and also to refine the crystallographic dimer form (sf-d), which was considerably more tractable due to the 2-fold non-crystallographic symmetry.

Table 2: Refinement strategy and crystallographic R-factor statistics for the sf-m form of lombricine kinase

Step  Refinement Step                                 Initial Rwork  Final Rwork  Initial Rfree  Final Rfree
1     Simulated Annealing (Wa=35)                     39.93          30.40        45.53          40.85
2     B-Group Refinement                              30.28          28.99        40.79          38.08
3     Manual Model Building                           ↓              ↓            ↓              ↓
4     Conjugate Gradient (Wa=3)                       33.96          29.13        36.31          34.45
5     B-Group Refinement                              29.04          25.68        34.08          30.95
6     Simulated Annealing (Wa=4)                      25.58          25.06        30.98          30.24
7     Water-picking                                   25.17          23.94        30.84          29.65
8     Simulated Annealing (Wa=5)                      24.05          23.64        29.35          29.06
9     Manual Model Building                           ↓              ↓            ↓              ↓
10    Maximum Likelihood (Phenix_Refine Macrocycle)   19.35          18.09        25.95          25.95

The refinement strategy and corresponding R-factor statistics for the sf-d form are shown in Table 3.


Table 3: Refinement strategy and crystallographic R-factor statistics for the sf-d form of lombricine kinase with 2-fold NCS along the b-axis

Step  Refinement Step                 Initial Rwork  Final Rwork     Initial Rfree  Final Rfree
1     Simulated Annealing (Wa=1)      37.38          35.35           37.03          36.28
2     Conjugate Gradient (Wa=9)       35.09          34.02           35.84          35.79
3     B-Group Refinement              34.00          32.65           35.77          34.61
4     B-Individual Refinement         ↓              No Improvement  ↓              No Improvement
5     Simulated Annealing (Wa=3)      32.62          32.54           34.56          33.96
6     Water-picking                   32.75          31.71           34.26          33.05
7     Simulated Annealing (Wa=15)     31.37          30.19           32.94          32.55
8     B-Individual Refinement         30.17          30.88           32.45          32.33
      ↓ Relax NCS ↓
9     Conjugate Gradient (Wa=21)      29.91          28.11           32.14          31.12
10    Simulated Annealing (Wa=21)     28.15          28.14           31.15          30.95
11    B-Group Refinement              28.13          27.50           30.91          30.61
12    Simulated Annealing (Wa=27)     27.50          26.98           30.59          30.27
13    Water-picking                   27.06          26.92           30.44          29.84
14    Conjugate Gradient (Wa=21)      26.43          26.25           29.52          29.43
15    Simulated Annealing (Wa=27)     26.21          25.91           29.35          29.27
16    Manual Model Building           ↓              ↓               ↓              ↓
17    Refmac-5 Macrocycle             25.20          23.20           26.97          25.60


The refined high-resolution sf-d model fits the electron density map very well, as illustrated by the toroid-like density through ring-containing side chains in Figure 15.

Figure 15: High resolution (1.8 Å) images of the fit of the substrate-free model of LK to a 2Fo-Fc sigma-A weighted composite omit electron density map, showing toroid-like density through Tyrosine 266, Proline 50, and Tryptophan 260, counter-clockwise, respectively, from top right.


Model Building and Real-Space Refinement

Models were rebuilt between cycles of refinement in CNS, PHENIX, and Refmac-5 to improve the fit of each model to its associated electron density map. Such computer-assisted “manual” rebuilding started with use of the programs OOPS [112], model-stats.inp in CNS [113], and PROCHECK [114] to rapidly locate problem areas of the models. Prior to model rebuilding, a 2Fo-Fc sigma-A weighted composite-omit map and an Fo-Fc map were calculated in CNS for each model and dataset pair. Combinations of approaches were adopted depending on the nature of the error in the model. In cases where the backbone trace was problematic, the first step was to load both the 2Fo-Fc and Fo-Fc maps into “O” [110, 115] and use the command lego_ca to scroll through the “virtual slider” of best-fit backbone traces consistent with similar backbone fragments from databases of accepted, well-defined high-resolution structures. The command refi_z was then applied across regions where the existing structure and the new backbone trace joined in order to regularize the model and restore correct geometry, bond lengths and torsion angles. Peaks in the Fo-Fc map were used to determine where regions of the model should not be and the direction in which they should move to be more consistent with the map. The backbone of the model was then further improved and validated in Coot [111] by close examination of the Ramachandran plot. Outlying backbone torsion angles were adjusted by rotation of the carbonyl or peptide plane until they moved into the energy minima of the local secondary structure. The models were then further improved by using the Model/Fit/Refine module of Coot in conjunction with other structure validation modules. Initially, poor stereochemical geometry was corrected residue-by-residue by careful examination of outliers in the geometry analysis module. 
Density fit analysis and rotamer analysis were then each investigated in sequence by examination of a residue-by-residue histogram of outliers. Each of these outliers was corrected, real-space refined and the geometry was regularized. Each of these validation modules was then revisited to ensure that all improvements were mutually consistent. The models were then placed through further
cycles of refinement in PHENIX [109], CNS [105], or Refmac-5 [108]. Each program uses slightly different refinement algorithms and is often appropriate for different stages of refinement. CNS seems more appropriate for initial stages of refinement because it can perform simulated annealing, which can correct gross structural errors and dramatically improve poor initial models [113]. Refmac-5 is more appropriate for structures possessing non-crystallographic symmetry because it readily generates automatic geometric restraints to maintain this symmetry [108]. The merits of PHENIX lie in its automated map interpretation, easy scripting, and application of maximum-likelihood density modification algorithms, which result in minimal bias in electron density maps [109].

Comparative Structural Alignment of Phosphagen Kinases

In the absence of an actual closed transition-state structure of lombricine kinase, the closed transition-state conformation was modeled by applying the dynamic domain operators that are known to exist between the open substrate-free and closed transition-state conformations of arginine kinase. Dynamic domain analysis of the open and closed conformations of arginine kinase revealed the presence of three dynamic domains that moved relative to a fourth fixed domain as the active site closes over the substrates [27]. A closed transition-state conformation of lombricine kinase was generated with assistance from a sequence alignment of arginine kinase with lombricine kinase. The sequence-aligned homologous backbone atoms of lombricine kinase were superimposed on the corresponding atoms of the arginine kinase fixed domain with the program Superpose in CCP4 [116]. This fixed-domain region is located in the C-terminal beta-sheet region of the protein. The non-sequential residues comprising the fixed domain in AK [27] were 98-111; 124-127; 156-163; 220-222; 227-261; 276-285; and 329-352 (AK numbering), and the corresponding LK residues were 92-105; 117-120; 149-156; 214-216; 221-255; 275-284; and 327-350 (LK numbering). Only well-defined regions of secondary structure were used for structural alignment; flexible loop regions of the structures were omitted. Following alignment
of the fixed domain, the resulting homologous LK dynamic domains were sequentially superimposed on each of the corresponding dynamic domains from the closed AK transition-state analogue complex, and a rotation operator was calculated for each of the three superimpositions. These rotation operators were then applied to the respective dynamic domains of LK to generate the theoretical closed transition-state structure of lombricine kinase.
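Before superimposing the fixed domains, the AK and LK residue ranges quoted above have to be turned into a one-to-one list of equivalent residues. The sketch below shows one way to do that bookkeeping. It assumes that residues correspond position by position within each segment, with no gaps (an assumption, not something the text states), and the function name is hypothetical.

```python
# Fixed-domain residue ranges quoted in the text (inclusive ends).
AK_FIXED = [(98, 111), (124, 127), (156, 163), (220, 222),
            (227, 261), (276, 285), (329, 352)]  # AK numbering
LK_FIXED = [(92, 105), (117, 120), (149, 156), (214, 216),
            (221, 255), (275, 284), (327, 350)]  # LK numbering

def pair_residues(ak_ranges, lk_ranges):
    """Pair AK and LK residues segment by segment.

    Assumes a gapless, position-by-position correspondence inside each
    segment and raises an error if two matched segments differ in length.
    """
    if len(ak_ranges) != len(lk_ranges):
        raise ValueError("AK and LK must have the same number of segments")
    pairs = []
    for (a0, a1), (l0, l1) in zip(ak_ranges, lk_ranges):
        if a1 - a0 != l1 - l0:
            raise ValueError(f"length mismatch: AK {a0}-{a1} vs LK {l0}-{l1}")
        pairs.extend(zip(range(a0, a1 + 1), range(l0, l1 + 1)))
    return pairs

pairs = pair_residues(AK_FIXED, LK_FIXED)
print(len(pairs), "equivalent residue pairs; first:", pairs[0], "last:", pairs[-1])
```

The resulting pair list is the kind of input a least-squares superposition program needs. The length check catches transcription errors in the ranges before any fitting is attempted.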

Targeted molecular dynamics simulation of active-site closure

Computational simulation was performed (with assistance from Dr. Wei Yang) as an alternative approach to the predicted closed structure of LK generated above. Initially, the lombricine kinase sequence was adjusted such that there was direct correspondence of numbered residues with the target. Insertions in the arginine kinase sequence relative to the lombricine kinase sequence were deleted from the closed arginine kinase target structure. Insertions in the lombricine kinase sequence relative to the arginine kinase sequence were added to the closed arginine kinase target structure with artificial coordinates, such that the initial open lombricine kinase structure and the closed arginine kinase target structure had identical corresponding residues. A subset of lombricine kinase residues was selected for the simulation from each of the non-sequential dynamic domains that are well characterized in arginine kinase, as described by Yousef [27]. If this subset of residues has sufficient representation from each of the dynamic domains, then upon movement toward the target structure, the rest of the initial structure will follow and thus represent a good model of the target. Multiple sequence alignments and visual inspection of the local secondary structure assisted in the selection. Substrates were omitted from the simulation because there were no corresponding coordinates for substrates in the starting LK model. In the targeted molecular dynamics formalism [117, 118], the atomic coordinates of a particular initial conformation, a, with N atoms are represented by a vector, $x^a = (x_1^a, x_2^a, \ldots, x_{3N}^a)$, that is optimized toward its target using a constant temperature dynamics
simulation with a standard molecular mechanics potential [119] augmented with a time-dependent holonomic constraint of the form [120]:

\[ \Phi(x) = |x - x_T|^2 - \eta = 0 \]

where x is an instantaneous configuration, xT is the target structure and η is the desired root mean square deviation. The holonomic structural trajectory of lombricine kinase between the solved open structure and the theoretical closed conformation was interpolated in this fashion by targeted molecular dynamics (TMD) methods [117, 121] with the program VMD [122] as an alternate method to evaluate the theoretical LK closed conformation generated by superimposition and application of the dynamic domain rotation operators.
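The role of the constraint can be illustrated with a toy calculation. The sketch below only evaluates the constraint and a shrinking target value; it is not a molecular dynamics engine and is not the protocol used in this work. It uses η exactly as it enters the constraint quoted above. How η maps onto an rmsd in Å (for example by squaring or by scaling with the number of atoms) is left as an assumption, and the linear schedule and all coordinates are hypothetical.

```python
def squared_distance(x, x_target):
    """|x - x_T|^2 for two flat coordinate vectors of equal length (3N values)."""
    if len(x) != len(x_target):
        raise ValueError("coordinate vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_target))

def constraint(x, x_target, eta):
    """Phi(x) = |x - x_T|^2 - eta, following the form quoted in the text."""
    return squared_distance(x, x_target) - eta

def eta_schedule(eta_start, n_steps):
    """Hypothetical linear schedule taking eta from eta_start down to zero."""
    return [eta_start * (1.0 - step / n_steps) for step in range(n_steps + 1)]

# Toy one-atom system (3 coordinates); values are illustrative only.
x_open = [0.0, 0.0, 0.0]
x_closed = [3.0, 4.0, 0.0]

# Choose the starting eta so that the constraint is satisfied by the open state.
eta0 = squared_distance(x_open, x_closed)
schedule = eta_schedule(eta0, n_steps=5)

print("Phi at start:", constraint(x_open, x_closed, schedule[0]))
print("eta schedule:", schedule)
```

In an actual TMD run the integrator would adjust the coordinates at every step so that Φ stays at zero while η decreases, which drives the structure toward the target.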


RESULTS AND DISCUSSION

Expression and purification of lombricine kinase

Overexpression of recombinant lombricine kinase has never been problematic, as early partially pure preparations routinely yielded 40 to 50 mg of protein per liter of bacterial culture media, as determined by Bradford assays [123] as well as ultraviolet absorption. The two most challenging aspects of this research have been the solubilization of active lombricine kinase and the resolution of the chromatographic critical pair (active soluble LK and an inactive, putatively misfolded LK conformer) during purification. The misfolded LK conformer has been present in every preparation and exhibits very similar electrostatic behavior to active dimeric LK, as both co-elute upon DEAE anion exchange chromatography even with long and shallow salt gradients during elution. Additionally, active dimeric LK and its putative misfolded conformer apparently have very similar hydrodynamic radii, as the two are not completely baseline resolved upon analytical Superdex 200 HR high-resolution size-exclusion chromatography. With potentially higher purity attainable by purification from insoluble inclusion body expression, as exploited with arginine kinase [124], initial efforts focused on a refolding strategy for LK. Despite extensive efforts, this was not successful. Another attempt was to increase the amount of soluble LK through expression of an N-terminal Nus-Tag fusion protein (with 6X Histidine). Nickel-chelation chromatography and enterokinase-mediated cleavage of this N-terminal fusion protein, however, were not conducive to scale-up for crystallization quantities of protein. The key to obtaining large amounts of soluble LK was low-temperature expression using the original construct. The key to purifying LK and resolving the previously described chromatographic critical pair was switching from Cibacron-blue chromatography to the more specific ADP-agarose chromatography combined with long shallow salt gradients during elution. 
Sacrificing yield for purity in the chromatographic peak fractionation resulted in approximately 7 to 10 mg of pure protein per liter of bacterial culture media.


Multiple sequence alignment of phosphagen kinases

Multiple sequence alignments provided tremendous insight, allowing potential residues that might be responsible for the lombricine specificity of LK to be inferred both deductively and inductively. By careful examination of a diverse sequence alignment of phosphagen kinases, as in Figure 16, consisting of two lombricine kinases, three creatine kinases and one arginine kinase, it was possible to deduce which residues and regions of sequence were strictly conserved within this enzyme family and therefore might be catalytically important to all such enzymes, and, from differences at strategic positions, those perhaps involved in substrate specificity. The sequences in this alignment are shaded from low sequence conservation (light grey) to strict sequence conservation (black) and consist of KLOM_URECA – lombricine kinase from the innkeeper worm Urechis caupo, the enzyme of interest in the current study; KLOM_EISFO – lombricine kinase from the brandling worm Eisenia fetida; 1VRP – creatine kinase from the Pacific electric ray Torpedo californica (having a solved TSAC crystal structure); 1QH4 – creatine kinase from chicken brain (having a substrate-free crystal structure); 1QK1 – human ubiquitous mitochondrial creatine kinase (having a solved substrate-free crystal structure); 1M80 – arginine kinase from the horseshoe crab Limulus polyphemus (having a solved substrate-free crystal structure). The combination of structural information from the transition-state analogue structure of arginine kinase and sequence alignments such as these is crucial to discerning the roles of various residues in catalysis or substrate discrimination in LK in the absence of an actual closed TSAC structure of LK. In the TSAC structure of arginine kinase, for example, there are four arginine residues (R124, R126, R309, R280) that are strictly conserved within this enzyme family and have been shown to form salt bridges with the ADP -phosphate, electrostatically stabilizing it during catalysis. 
Furthermore, an additional arginine residue (R229) along with R126 and R309 also serves to stabilize the nitrate that mimics the ADP -phosphate in the transition state.

Figure 16: A multiple sequence alignment of phosphagen kinases: LK from the brandling worm, Eisenia fetida (KLOM_EISFO), along with CK and AK sequences from the PDB (listed by code) of previously solved structures, potential molecular replacement models, aligned with lombricine kinase from Urechis caupo. The alignment is shaded by the relative sequence conservation from low (light grey) to high (black).

These four strictly
conserved arginine residues align respectively with R117, R119, R307, and R279 of LK and it is thought that they would have an identical function in lombricine kinase. Although a broadly diverse multiple sequence alignment of many different members of the family of phosphagen kinases as in Figure 16 provides much information as to sequence similarity and conservation and therefore insight into which residues might play a role in catalysis for any of these enzymes, it provides little evidence to explain the possible structural and physicochemical sequence differences between lombricine kinase and arginine kinase that might be responsible for the preference of each enzyme for lombricine and arginine, respectively. A more appropriate sequence alignment for this purpose is represented in Figure 17 (A-C), where taurocyamine kinases (TK_RIFPA) from Riftia pachyptila (tube worm) and (TK_AREBRA) from Arenicola brasiliensis (lugworm) along with lombricine kinases (LK_URCA) from Urechis caupo (innkeeper worm) and (LK_EISFO) from Eisenia fetida (brandling worm) are aligned with five randomly selected arginine kinase sequences: (AK_TRYCR) from Trypanosoma cruzi (Chagas disease parasite), (AK_DROME) from Drosophila melanogaster (fruit fly), (AK_HOMGA) from Homarus gammarus (European lobster), (AK_BOMORI) from Bombyx mori (domestic silkworm), and (AK_LIMPO) from Limulus polyphemus (Atlantic horseshoe crab). This alignment is colored by physicochemical properties to allow for a more effective comparison, where the charged residues aspartate, glutamate, histidine, lysine, and arginine are colored red, the amphoteric residues serine, threonine, asparagine, and glutamine are colored orange, the hydrophobic or aromatic residues leucine, isoleucine, valine, tryptophan, tyrosine, and phenylalanine are colored light green and the small or sulfur-containing residues alanine, glycine, methionine, and cysteine are colored dark green. 
Additionally, regions of high sequence conservation that are known to line the phosphagen of the enzyme from the transition-state structure of arginine kinase are colored black. The rationale for the hypothesis that an alignment such as this might reveal more insight into the structural determinants of lombricine specificity is that lombricine kinase is able to achieve catalysis with either taurocyamine or lombricine, but not arginine, meaning that this enzyme can recognize substrates
possessing the internal phosphodiester chemistry of the substrate lombricine or the very similar sulfonic group chemistry of the substrate taurocyamine, as shown in Figure 1 (page 3). Aligning taurocyamine and lombricine kinases, which recognize phosphodiester or sulfonic groups, and contrasting them with arginine kinases, which do not, helps in discerning the mediators of substrate specificity. There are only a few regions with both high sequence similarity between lombricine and taurocyamine kinases and low similarity with arginine kinases, and these candidate mediators of specificity are colored purple in the alignment.
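The selection criterion described above (similar among LKs and TKs, different in AKs) can be expressed as a simple column filter over an alignment. The sketch below is only an illustration of that idea, not the procedure used to shade Figure 17. The residue classes follow the coloring scheme described for the figure, while the rule "one shared class in the LK/TK group that no AK sequence has" and the short sequences are hypothetical.

```python
# Residue classes taken from the colouring scheme described for Figure 17.
CLASSES = {
    "charged": "DEHKR",
    "amphoteric": "STNQ",
    "hydrophobic_or_aromatic": "LIVWYF",
    "small_or_sulfur": "AGMC",
}

def residue_class(residue):
    """Return the class of a one-letter residue code, or 'other' (e.g. P or a gap)."""
    for name, letters in CLASSES.items():
        if residue in letters:
            return name
    return "other"

def candidate_columns(group_a, group_b):
    """Columns where group A shares a single class that no group B sequence has."""
    length = len(group_a[0])
    if any(len(seq) != length for seq in group_a + group_b):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for col in range(length):
        classes_a = {residue_class(seq[col]) for seq in group_a}
        classes_b = {residue_class(seq[col]) for seq in group_b}
        if len(classes_a) == 1 and "other" not in classes_a and not (classes_a & classes_b):
            hits.append(col)
    return hits

# Hypothetical three-column alignment fragments, NOT taken from Figure 17.
lk_tk = ["KDS", "RDT"]
ak = ["YDS", "FDA"]
print(candidate_columns(lk_tk, ak))
```

With these toy fragments only the first column qualifies: both LK/TK residues are charged while both AK residues are hydrophobic or aromatic. A real analysis would also weigh structural context, as the text does by referring to the AK transition-state structure.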

Substrate Specificity

Despite the existence of transition state structures of arginine and creatine kinases, the full complement of the structural determinants of substrate specificity within this enzyme family has remained elusive. Nonetheless, existing structures and multiple sequence alignments have implicated two flexible loops in the specificity of phosphagen kinases. The first loop is in the small N-terminal alpha-helical domain, dubbed the GS region [125] and is comprised of residues 50-62 in Urechis caupo lombricine kinase. Comparatively, in an arginine kinase TSAC structure, residues serine 63, glycine 64, and valine 65 of this GS region form hydrogen bonds from backbone nitrogens to the carboxylate of the substrate arginine. Additionally, a side-chain hydrogen bond is donated from tyrosine 68 to the alpha amine of the substrate arginine. All of these interactions partially mediate the specificity of AK toward arginine [1]. The substrate arginine is also tethered to the enzyme on the substrate’s guanidinium end through salt bridges to two glutamates, but due to high sequence conservation of these glutamates across a number of phosphagen kinases as well as the precise catalytic substrate alignment seen in the AK transition state in Figure 4, these seem to be necessary for catalysis and seem to function more in preorganization and substrate alignment than specificity.


Conjecture, however, abounds, as transition-state structures of wild-type AK with each of four non-cognate substrates reveal that these non-cognate substrates bind and induce transition-state-like enzyme conformations, but remain inactive (Clark, S.A., Bush, J., unpublished). These structures strikingly suggest that only a modest deviation in the position of the guanidinium, with little concomitant disposition of nitrate or ADP coordination, is sufficient to render the enzyme completely inactive, indicating that substrate-substrate alignment may also play a role in the specificity of these enzymes. The second flexible loop involved in substrate specificity is in the large C-terminal beta-sheet domain and is comprised of residues 308-315 in Urechis caupo lombricine kinase. Similarly, in AK, the inhibition constants measured for L-arginine analogues suggest that substrate specificity may depend on the length of the carbon chain between the guanidine and amino acid moieties of the phosphagen substrate (L-isoleucine>L-valine>L-α-aminobutyrate>L-alanine) as well as the bonding interactions of the α-carbon atom (L-arginine>agmatine and L-ornithine>putrescine) [126]. A few varieties of AK were also able to phosphorylate canavanine (KM=6.7 mM, Vmax=0.3 times that of L-arginine), which is arginine with the delta carbon replaced by oxygen [127]. The D-arginine enantiomer is not a substrate for AK [127, 128], but it is a good competitive inhibitor of AK (Ki=0.31 mM) [128], indicating that the enzyme can bind D-arginine but is likely unable to align its reactive guanidinium for catalysis. A transition-state structure of AK with D-arginine confirms this prediction (Clark, S.A., Bush, J., unpublished). Creatine kinase also exploits the two aforementioned loops in order to mediate its specificity toward creatine. Two hydrophobic residues, one from each loop, isoleucine 69 and valine 325, respectively, form a hydrophobic mini-pocket to accommodate the methyl group of creatine [26]. 
Mutation of the CK valine 325 to a glutamate increases the preference for glycocyamine (creatine without the methyl group) by 100-fold [129] and this is likely for the same reason as in arginine kinase, that in the absence of a substrate methyl group, this glutamate forms a salt bridge with the substrate to align and position it for catalysis [1, 77].


Lombricine kinase (LK) and taurocyamine kinase (TK), the phosphagen kinases unique to annelids prefer substrates with dianionic phosphodiester and sulfonic groups respectively. Further, these enzymes also show activity on each other’s substrates, indicating within the context of substrate structure, that the stereochemical and electrostatic similarity of these anionic substituents could partially mediate specificity in these enzymes. In cytoplasmic Arenicola brasiliensis TK for instance, the enzyme shows lombricine activity of 9% of that of taurocyamine, but the mitochondrial form of the enzyme exhibits a lombricine activity of 30% of taurocyamine [130]. Similarly in Riftia pachyptila TK, the cytoplasmic form exhibited a lombricine activity of 21% of that of taurocyamine, whereas the mitochondrial form showed stronger and broader substrate activity relative to A. brasiliensis TK on glycocyamine, lombricine, and arginine, (35%, 31%, and 3%) of that of taurocyamine, respectively. The maximum activity of cytoplasmic Riftia pachyptila TK with taurocyamine was 4.04 ± 0.33 µmol/(min·mg protein) and the maximum activity of the mitochondrial form was 10.4 ± 0.590 µmol/(min·mg protein). Neither enzyme showed any creatine activity [131]. Kinetic data for these enzymes are summarized in Table 4.

Table 4: Kinetic parameters of annelid phosphagen kinases in the direction of phosphagen synthesis (forward reaction) from [131]. The mitochondrial enzyme forms have higher substrate affinities than the cytoplasmic forms, suggestive of different physiological roles. Interestingly, a single K95Y mutant in LK changes the substrate preference of this enzyme from lombricine to taurocyamine. This residue aligns with lysine 83 of Urechis LK in the current study (figure 20) that appears to mediate lombricine specificity by stabilizing the phosphodiester of lombricine.

Genus / Enzyme     Reference    Preferred substrate  KM [mM]         kcat/KM [s⁻¹·mM⁻¹]
Arenicola TK       [130, 131]   Taurocyamine         4.01 ± 0.418    2.35
Arenicola MiTK     [130, 131]   Taurocyamine         0.881 ± 0.085   16.23
Riftia MiTK        [131]        Taurocyamine         2.12 ± 0.459    5.9
Eisenia LK         [81]         Lombricine           5.33 ± 0.67     3.33
Eisenia LK K95Y    [81]         Taurocyamine         1.93 ± 0.07     6.4


The Michaelis constant, KM, represents the substrate concentration at half-maximal enzyme velocity and describes the enzyme affinity for substrate when product formation is rate-limiting, with a lower KM indicating stronger affinity. The catalytic rate constant, kcat, indicates the rate at which the enzyme-substrate complex is converted to product. A faster rate of conversion, indicated by a larger kcat, and/or tighter substrate binding, indicated by a lower KM, therefore yields a larger specificity index, kcat/KM, which determines the relative substrate specificities. In Table 4, Arenicola MiTK has the strongest substrate binding and the highest substrate specificity. The data also show that mitochondrial enzyme forms have higher substrate affinities than cytoplasmic forms, suggestive of different physiological roles [131]. The residue at position 95 in Eisenia LK strongly influences the substrate preference of LK toward either lombricine or taurocyamine. This residue is strictly conserved within each class of these enzymes: it is a lysine in LKs, a histidine in TKs, a tyrosine in AKs (Figure 17) and an arginine in CKs. Interestingly, a single K95Y mutation in Eisenia LK changes the substrate preference of this enzyme from lombricine to taurocyamine; the mutant exhibits greatly enhanced affinity for taurocyamine (Table 4) and greatly reduced affinity for lombricine (KM = 14.2 mM and kcat/KM = 0.72 s⁻¹ mM⁻¹) compared to the wild-type Eisenia LK parameters shown in Table 4 [81]. Not surprisingly, since this K95Y mutant is of the AK type, it also showed significant activity for arginine (KM = 33.28 mM and kcat/KM = 0.01 s⁻¹ mM⁻¹) [81]. This residue aligns with lysine 83 of Urechis LK in the current study (Figure 20), which, judging from the modeled TSAC structure of LK with the superimposed substrate arginine, could mediate lombricine specificity by stabilizing the phosphodiester of lombricine, because this lysine points in the direction where the phosphodiester group would be. An Eisenia LK K95R mutant displayed stronger affinity for both lombricine (KM = 0.74 mM and kcat/KM = 19.34 s⁻¹ mM⁻¹) and taurocyamine (KM = 2.67 mM and kcat/KM = 2.81 s⁻¹ mM⁻¹) relative to the wild-type enzyme with lombricine (KM = 5.33 mM and kcat/KM = 3.37 s⁻¹ mM⁻¹) and taurocyamine (KM = 15.31 mM and kcat/KM = 0.48 s⁻¹ mM⁻¹) [81]. This may be because arginine has an additional positively charged arm relative to lysine that might better stabilize the dianionic nature of the phosphodiester and sulfonic groups of lombricine and taurocyamine, respectively.
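The comparison of specificity indices made above can be written as a short calculation. The sketch below uses only the kcat/KM values quoted in this paragraph for Eisenia LK and its mutants [81]. The wild-type lombricine value is taken as quoted in the paragraph (3.37; Table 4 lists 3.33). The "preference ratio" is simply the quotient of two specificity indices, and the function names are hypothetical.

```python
# kcat/KM values (s^-1 mM^-1) quoted in the text for Eisenia LK variants [81].
SPECIFICITY = {
    "wild-type": {"lombricine": 3.37, "taurocyamine": 0.48},
    "K95Y": {"lombricine": 0.72, "taurocyamine": 6.4},
    "K95R": {"lombricine": 19.34, "taurocyamine": 2.81},
}

def preference_ratio(values, substrate_a="lombricine", substrate_b="taurocyamine"):
    """Ratio of specificity indices; a value above 1 means substrate_a is preferred."""
    return values[substrate_a] / values[substrate_b]

def kcat_from(kcat_over_km, km):
    """Recover kcat (s^-1) from kcat/KM (s^-1 mM^-1) and KM (mM)."""
    return kcat_over_km * km

for variant, values in SPECIFICITY.items():
    ratio = preference_ratio(values)
    preferred = "lombricine" if ratio > 1 else "taurocyamine"
    print(f"{variant}: lombricine/taurocyamine = {ratio:.2f} -> prefers {preferred}")

# Example: wild-type kcat for lombricine, using KM = 5.33 mM from the text.
print(f"wild-type kcat (lombricine) ~ {kcat_from(3.37, 5.33):.1f} s^-1")
```

On these numbers the wild-type and K95R enzymes favor lombricine, whereas K95Y favors taurocyamine, which is the switch in preference described in the text.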

Figure 17-A: A multiple sequence alignment of two taurocyamine kinases (TK), and the two existing lombricine kinase (LK) sequences with five randomly selected arginine kinase sequences (AK). Residues shaded black comprise the ‘specificity loop’ that forms backbone hydrogen bonds to the alpha-amine of the substrate arginine in the transition-state structure of arginine kinase [1]. The side chain of Y68 forms a hydrogen bond with the alpha-carboxylate of substrate arginine. Purple shaded residues are potential mediators of substrate specificity. The remaining residues are shaded by property (red=charged; orange=H-bonding; green=hydrophobic).

Figure 17-B: The region shaded in black is strictly conserved in phosphagen kinases. The second E (glutamate) of this NEEDH region is the catalytic base that deprotonates the guanidinyl nitrogen of the substrate and also serves to align the guanidinium for catalysis. Potential structural determinants of substrate specificity as determined from this alignment are shaded purple. These regions are common to LKs and TKs, with substrates exhibiting similar sulfonic, sulfinic, and phosphatidyl chemistry.

Figure 17-C: The strictly conserved cysteine, C (shaded black), has been implicated in substrate binding synergy. The strictly conserved glutamate, E, in the GEHT region (shaded black) serves to align the substrate guanidinium for nucleophilic attack on the ATP gamma phosphate. Potential mediators of substrate specificity are shaded purple.

Crystallization and cryoprotection optimization

Initially, to optimize crystallization conditions, solubility tests were carried out to determine the limits of pH, precipitant concentration, chaotropes, and other mother liquor components. BisTris was chosen as an appropriate buffer with a buffering capacity within this pH range. The pH was then varied in 0.1 unit increments across this range with all other parameters held constant. The resultant impact on crystal morphology, rate of crystallization, number of nuclei generated and, often, diffraction quality was assessed empirically. This process was repeated in turn for each mother liquor additive. The salt concentration (NaNO3) was varied from 10 mM to 250 mM in 0.1 mM increments. The precipitant (PEG 3350mme) was varied from 5% to 29% in 2% m/v increments. Effective cryoprotection was assessed without wasting a crystal, by flash freezing and irradiating mother liquor with increasing amounts of cryoprotectants until sharp ice rings became diffuse. For the substrate-free monomeric crystal form, which was actually set up in the presence of transition-state analogue components, crystallization conditions were determined to be optimum at a full pH unit lower than for the dimeric form. The transition-state analogue components added were magnesium-ADP; nitrate, which mimics the planar phosphoryl group during catalysis; and taurocyamine. Taurocyamine was used simply because it is also a substrate for LK and is commercially available, whereas lombricine is labor intensive to isolate in large yields from very large quantities of earthworms (personal communication from Ross Ellington). Being the non-preferred substrate for this reaction, taurocyamine would generally have a higher binding constant and would therefore likely require higher concentrations in order to produce a transition-state-like crystal; this may be one reason why crystallization of the transition-state structure of lombricine kinase proved unsuccessful. 
Nonetheless, as can be seen from Figures 19 and 22, the loop containing glutamate 312 that would bind and align the substrate is clearly moved in toward the active site in the monomeric crystal form (the one set up under TSAC conditions) relative to the dimeric crystal form, where no TSAC substrates were added to the mother liquor. This might indicate that substrates are beginning to bind, but with
very low occupancy. Since the only differences in crystallization conditions between the dimeric and monomeric crystal forms of LK are pH and transition-state analogue components, it would appear that the space group, unit cell dimensions, Matthews number and therefore non-crystallographic symmetry of crystallization potentially depend on pH and substrate.

Structural validation

Both the substrate-free monomeric (sf-m) form and the substrate-free dimeric (sf-d) form of lombricine kinase were validated residue-by-residue by density-fit analysis, peptide omega analysis of backbone torsion angles, geometry analysis, Ramachandran analysis, and rotamer analysis with the program Coot. After modeling explicit solvent into appropriate regions of unmodeled density from Fo-Fc difference maps, the very few remaining difference map peaks were closely investigated and the models were modestly adjusted and then real-space refined to correct any improper bond lengths, bond angles, or torsion angles and re-establish the correct geometry. The models were then checked with the program Procheck [114] to verify that there were no clashing or geometrically outlying residues; a Ramachandran plot showing that no residues are in disallowed regions for the LK sf-d structure is shown in Figure 18. Further evidence of the reliability and quality of the structures is given by the R and R-free values shown in Tables 2 and 3. For the respective resolutions of the structures, these R and R-free values are completely consistent with a ten-year analysis of 10,888 peer-reviewed structures submitted to the Protein Data Bank between 1991 and 2000, in which the authors plotted R and R-free versus resolution [132]. The substrate-free structures of the two crystal forms are largely identical, with an overall rmsd of 0.36 angstroms, as determined by global superimposition of the substrate-free monomeric crystal form of lombricine kinase (sf-m, blue) and the A subunit of the dimeric crystal form (sf-d, red), shown in Figure 19. This particular projection is looking down the hinge axis with the N-terminal alpha-helical domain on the right hand side and the C-terminal beta-sheet domain on
the left hand side. The active site is located in the cleft in the center. As expected, the largest deviations are seen in flexible loop regions such as at bottom center where glutamate 312 that aligns the substrate guanidinium is located. Other notable deviations are near regions of crystal contacts and non-crystallographic dimer packing.
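The R and R-free values cited in the validation above are residuals between observed and calculated structure factor amplitudes. The sketch below illustrates the conventional definition, assuming the calculated amplitudes have already been scaled to the observed ones; the amplitudes and test-set flags are invented toy numbers, not data from the LK structures.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R: sum| |Fo| - |Fc| | / sum|Fo|."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

def r_work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by the free flag and return (R-work, R-free)."""
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if not free]
    free = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([fo for fo, _ in work], [fc for _, fc in work])
    r_free = r_factor([fo for fo, _ in free], [fc for _, fc in free])
    return r_work, r_free

# Toy amplitudes for illustration only (hypothetical values).
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0]
f_calc = [90.0, 85.0, 55.0, 45.0, 25.0]
free_flags = [False, False, True, False, True]  # True marks the held-out test set
print(r_work_and_free(f_obs, f_calc, free_flags))
```

R-free is computed only from reflections excluded from refinement, which is why it serves as a cross-validation statistic.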

Evaluation of the reliability of predicted transition-state models of LK

Of the predicted transition-state models of LK, the homology model was judged the best. The model generated by superimposition of the open substrate-free structure of lombricine kinase, followed by application of dynamic domain rotation operators, did not take side chains into account; it is essentially an artificially closed structure with side chains in their substrate-free configuration, which inherently limits its usefulness. For the model predicted by targeted molecular dynamics to be more accurate, the substrate lombricine would likely need to be parameterized and its coordinates included in the calculation. This has not been done. The homology modeling algorithm, by contrast, maintains the spatial restraints that exist in the transition-state structure of arginine kinase, so the resulting closed model takes the positions of side chains into account and is more useful for predicting the potential structural determinants of lombricine specificity. When the model is superimposed with the transition-state structure of arginine kinase, the strictly conserved residues that position the guanidinium for catalysis (glutamate 225 and glutamate 314 in the transition-state structure of arginine kinase, Figure 21; glutamate 219 and glutamate 312 in LK, Figure 20) are 2.58 and 4.19 Å, respectively, away from the guanidinium of the substrate arginine. The Modeller-predicted closed LK homology model therefore seems the most appropriate model to use in the analysis.


Figure 18: Ramachandran plot of the backbone torsion angles (Psi vs Phi) of the sf-d structure of lombricine kinase, showing that no residues are in disallowed regions and greater than 91 percent of all residues in the protein have backbone geometries in the most favored regions consistent with their associated secondary structures (alpha-helix, beta-sheet, or left-handed helix), an indication of a good structure.
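The phi and psi values plotted in a Ramachandran diagram are dihedral angles defined by four consecutive backbone atoms. A minimal sketch of the dihedral calculation is given below for illustration; it is not the Procheck or Coot implementation, and the coordinates are arbitrary test points rather than atoms from the LK model. Phi is taken over C(i-1), N(i), CA(i), C(i) and psi over N(i), CA(i), C(i), N(i+1).

```python
import math

def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    s0, s2 = _dot(b0, b1), _dot(b2, b1)
    v = (b0[0] - s0 * b1[0], b0[1] - s0 * b1[1], b0[2] - s0 * b1[2])
    w = (b2[0] - s2 * b1[0], b2[1] - s2 * b1[1], b2[2] - s2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Arbitrary test points (not protein coordinates).
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For a residue i, calling `dihedral` with the four atoms listed above for phi and then for psi gives the pair of angles that locates that residue on the plot.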


Figure 19: Global structural comparison by superimposition of the substrate-free monomeric crystal form of lombricine kinase (sf-m) in blue wire frame and the A subunit of the dimeric crystal form of lombricine kinase (sf-d) in red. The active site is located in the cavity in the center. The structures are largely identical and have an overall rmsd of 0.36 angstrom. The largest deviations are seen in flexible loop regions such as at bottom center where glutamate 312 that aligns the substrate guanidinium is located. Other notable deviations are near regions of crystal contacts and non-crystallographic dimer packing. The image was rendered in VMD.
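The overall rmsd quoted for the superimposed structures is the root-mean-square distance between equivalent atoms. The sketch below shows only that final step, assuming the two coordinate sets are already superimposed and listed in matching order; the least-squares superposition itself is omitted, and the coordinates are invented for illustration.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical, already-superimposed coordinates in angstroms.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
print(round(rmsd(model_a, model_b), 4))
```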


Structural rationalization of potential specificity determinants

The inductive rationalization from the sequence alignment, and the concomitant selection of the residues potentially responsible for the discrimination of lombricine kinase toward lombricine but not arginine, was then extended to each of the theoretical closed transition-state analogue structures of lombricine kinase. These were generated by targeted molecular dynamics, homology modeling, and superimposition of corresponding LK residues with the known fixed and dynamic domains of arginine kinase, as described by Yousef [27]. Lombricine kinase catalyzes the phosphorylation of several of the biological substrates shown in Figure 1. All the LK-active substrates exhibit very similar stereochemistry and electrostatic potentials of the phosphodiester functionality as in lombricine or the sulfonic functionality as in taurocyamine. The only structural difference between lombricine and arginine is that lombricine has a phosphodiester group between the beta and gamma carbons of arginine. Therefore, it seems likely that LK would discriminate between lombricine and arginine by adding (relative to arginine kinase) a hydrogen-bonding or salt bridge partner. The binding site of lombricine in LK was inferred by close comparison of the superimposed transition-state structure of arginine kinase (including the substrate arginine, which is analogous to lombricine in LK) with each of the three theoretical models of the closed transition-state structure of lombricine kinase. This combined structural and sequence analysis has implicated several candidates as potential structural determinants of lombricine specificity, as shown in Figure 20.
This theoretical closed transition-state analogue homology model of lombricine kinase was analyzed following superimposition of the backbone atoms of the fixed and dynamic domains of the model in regions of high identity to the transition-state structure of arginine kinase. The coordinates of arginine from this transition-state structure of arginine kinase were then added to the AK-superimposed theoretical LK model. Only atoms within 8 angstroms of the arginine beta carbon are shown. The theoretical LK model seems reasonable because, as in the arginine kinase structure, it positions two glutamates (E219 and E312) in close proximity on opposite sides of the arginine guanidinium. One prominent arginine-proximal residue in the model is lysine 83 of lombricine kinase, which is distinct from the AKs in sequence alignments, is strictly conserved in both known LK sequences, and is replaced by an electrostatically positive histidine in the TKs. This lysine appears to position its side-chain nitrogen toward the beta carbon of arginine, where lombricine would have the phosphodiester group if juxtaposed with arginine. Further analysis of this model shows a pair of histidines (H187 and H313) and a pair of cysteines (C58 and C270), with one member of each pair on opposing sides of the substrate. These residues appear to be in the general vicinity and could theoretically mediate lombricine specificity through electrostatic interactions or hydrogen bonds that would stabilize the negative charge on the phosphate moiety between the beta and gamma carbons of the substrate lombricine, the only position where lombricine differs from the structure of arginine. Some other notable sequence differences in LKs relative to AKs lie close to the arginine binding site in the AK TSAC structure, shown in Figure 21 for comparison. They include two loop regions in LK: residues G56, R57, C58, and I59 in the first loop (56-59) and residues Q182, K183, P184, T185, G186, H187, L188, M189, V190, N191, S192, and A193 in the second loop (182-193).
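The 8 angstrom selection around the arginine beta carbon described above amounts to a simple distance filter. The following sketch shows one way such a selection could be made; the atom labels and coordinates are invented placeholders and do not come from the LK model or the arginine kinase structure.

```python
import math

def atoms_within(atoms, center, cutoff=8.0):
    """Return labels of atoms whose distance to `center` is at most `cutoff` angstroms.

    `atoms` is a list of (label, (x, y, z)) tuples.
    """
    return [label for label, xyz in atoms if math.dist(xyz, center) <= cutoff]

# Hypothetical coordinates for illustration only.
beta_carbon = (0.0, 0.0, 0.0)
atoms = [
    ("atom_near_1", (3.0, 0.0, 0.0)),
    ("atom_near_2", (0.0, 6.0, 0.0)),
    ("atom_near_3", (0.0, 0.0, 7.9)),
    ("atom_far_1", (9.0, 0.0, 0.0)),
]
print(atoms_within(atoms, beta_carbon))
```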
The first loop sequence aligns near the ‘specificity loop’ of arginine kinase (S63-Y68), which folds closely behind the substrate arginine and forms three hydrogen bonds from backbone nitrogen atoms to the carboxylate of the substrate arginine and a side-chain hydrogen bond from tyrosine 68 to the alpha-amino group of the substrate arginine. Upon superimposition of the closed arginine kinase TSAC structure with the TMD-generated theoretical closed structural model of LK, four residues from the second loop sequence (Q182, K183, T185, and H187) appear to be very close to the substrate arginine and, by inference, are likely also to be close to where lombricine might bind in the TSAC structure of LK. The side chains of all four residues are capable of forming hydrogen bonds or electrostatically stabilizing the internal phosphodiester of lombricine. Furthermore, all four residues are strictly conserved in all lombricine and taurocyamine kinases of the alignment shown in Figure 17, and all corresponding aligned arginine kinase residues in this region of the alignment are distinct from those of LK and TK but strictly conserved within all AKs in the alignment. Interestingly, threonine 185 (Urechis caupo LK numbering), which is present in both LKs and TKs, is absent in all AKs and aligns with a gap in the AK sequence, which could also indicate a function as a specificity determinant in LK. The overall ribbon diagram of the theoretical closed transition-state analogue homology model of lombricine kinase, with the transition-state coordinates of the substrates for arginine kinase, is shown in Figure 23 and clearly shows the relative locations of the loops that would bind the substrates. The backbone atoms of the fixed and dynamic domains of the model in regions of high identity were superimposed with arginine kinase. The resultant substrate coordinates of ADP, nitrate, and arginine (left to right, in red) were then added to the model to create this hybrid model. The results appear to be consistent with the transition-state structure of arginine kinase: globally, the loops are in the right locations relative to the substrates, and locally, as shown in Figure 20, the homologous residues that align the guanidinium of arginine (glutamate 219 and glutamate 312 in LK; glutamate 225 and glutamate 314 in AK) are within 2.6 angstroms and 4 angstroms, respectively. Further, the loop in arginine kinase (S63-V65) that forms three backbone nitrogen hydrogen bonds to the arginine carboxylate is also in approximately the right position for this theoretical hybrid model to do the same.

Figure 20: The theoretical closed transition-state analogue homology model of lombricine kinase. The backbone atoms of the fixed and dynamic domains of the model in regions of high identity were superimposed with the transition-state structure of arginine kinase. The resultant coordinates of arginine (in red) from this transition-state structure of arginine kinase were then added to the theoretical LK model. Only atoms within 8 angstroms of the arginine beta carbon are shown. As in the arginine kinase structure, this model positions two glutamates (E219 and E312) in close proximity on opposite sides of the arginine guanidinium. The arginine-proximal residues of this theoretical LK model that are implicated in the preference of LK for lombricine are a pair of histidines (H187 and H313) and a pair of cysteines (C58 and C270) on opposite sides of the arginine shown. Lysine 83 is also in the general vicinity of where the phosphodiester group of lombricine might be expected to bind. Homology modeling was done with the program Modeller and rendered in DeepView. Sulfurs are yellow, nitrogens are blue and oxygens and the substrate arginine are red.
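The conservation argument used here (a position strictly conserved within LKs and TKs, strictly conserved within AKs, yet different between the two groups) can be expressed as a column-wise test on a multiple sequence alignment. The sketch below is illustrative only: the sequence names and six-residue fragments are made-up toy data, not the alignment of Figure 17, and the function name is hypothetical.

```python
def discriminating_columns(alignment, groups, in_groups, out_groups):
    """Return alignment columns that are invariant within each set of groups
    but differ between the two sets.

    `alignment` maps sequence name to aligned sequence (gaps as '-');
    `groups` maps sequence name to a family label such as 'LK' or 'AK'.
    """
    length = len(next(iter(alignment.values())))
    hits = []
    for col in range(length):
        inside = {seq[col] for name, seq in alignment.items() if groups[name] in in_groups}
        outside = {seq[col] for name, seq in alignment.items() if groups[name] in out_groups}
        if len(inside) == 1 and len(outside) == 1 and inside != outside:
            hits.append(col)
    return hits

# Toy fragments for illustration only (hypothetical, not real kinase sequences).
alignment = {
    "LK_1": "QKPTGH",
    "LK_2": "QKPTGH",
    "TK_1": "QKPTGH",
    "AK_1": "DRA-GR",
    "AK_2": "DRA-GR",
}
groups = {"LK_1": "LK", "LK_2": "LK", "TK_1": "TK", "AK_1": "AK", "AK_2": "AK"}
print(discriminating_columns(alignment, groups, {"LK", "TK"}, {"AK"}))
```

A column in which the second group shows only a gap is also reported, which mirrors the situation described for threonine 185.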
A ribbon diagram of the structural comparison of the substrate-free monomeric (sf-m) and dimeric (sf-d) forms, which perhaps better illustrates the similarity of the structures, is shown in Figure 22. In both Figure 19 and Figure 22, it is evident that the main distinction between the two structures is the position of the large-domain flexible loop containing glutamate 312, which functions in aligning the guanidinium of the substrate for optimal nucleophilic attack on the gamma phosphate of ATP at the distal end of the active site.


Figure 21: Arginine in the TSAC structure of AK: The active site environment of the substrate arginine (red) in the transition state analogue structure of arginine kinase. The residues essential for arginine binding are in a flexible loop at right and include serine 63, glycine 64, and valine 65, which form hydrogen bonds (green dashed lines) from backbone nitrogen atoms to the alpha carboxylate of the substrate arginine. An additional hydrogen bond is contributed by the side chain hydroxyl of tyrosine 68 to the alpha amine of arginine. Comparison of this structure from Zhou, 1998 [1] with the theoretical transition state of lombricine kinase in Figure 20 assists in the formulation of the rationale for the implication of the specificity determinants of LK for lombricine.


Figure 22: Ribbon diagram of the structural comparison by superimposition of the substrate-free monomeric crystal form of lombricine kinase (sf-m) in blue and the A subunit of the dimeric crystal form of lombricine kinase (sf-d) in red. The active site is located in the cavity in the center. The structures are largely identical (overall rmsd of 0.36 angstroms) except for flexible loop regions, regions of crystal contacts and non-crystallographic dimer packing.


Figure 23: The global theoretical closed transition-state analogue homology modeled structure of lombricine kinase (cyan). The backbone atoms of the fixed and dynamic domains of the model in regions of high identity were superimposed with arginine kinase. The resultant substrate coordinates of ADP, nitrate, and arginine (left to right, in red) were then added to the model. Homology modeling was done with the program Modeller and rendered in VMD.


CONCLUSIONS

While the interpretation of the predicted theoretical transition-state structure of lombricine kinase is highly speculative, in conjunction with the multiple sequence alignments it provides the best means available, in the absence of an actual transition-state structure of LK, to qualitatively suggest the residues that might have a role in specificity. More data are certainly needed to confirm or refute the prediction that, in LK, lysine K83, a pair of histidines (H187 and H313), and a pair of cysteines (C58 and C270) bind the phosphodiester group of lombricine and are responsible for the preference of LK for lombricine. Sequencing more LKs from an evolutionarily diverse population would certainly be helpful, as would an actual transition-state structure of LK. The hope is that this analysis will lead to further research to fully determine the structural determinants of substrate specificity of each member of this enzyme family for its preferred substrate. Some of this further research might entail creating mutants at the sites predicted here and kinetically measuring apparent binding constants and relative catalytic turnover rates to see whether the specificity index is altered by these mutations, or whether catalysis is prevented altogether. Specifically, kinetic characterization of the Eisenia LK mutant K95H would be beneficial because, from sequence alignments, this mutation would seemingly convert the enzyme preference from lombricine to taurocyamine in the same fashion as the K95Y mutant. Ideally, an actual transition-state structure of lombricine kinase would be available to evaluate the predictions described herein. To this end, a few good crystals still exist, and substrate soaking efforts shall continue in the near future.
Along with evidence from the subtle differences in substrate structures of arginine, lombricine, and taurocyamine (Figure 1), multiple sequence alignments (Figure 17), and kinetic data of mutants (Table 4 and page 47), the homology-modeled transition state of lombricine kinase with arginine superimposed (Figure 20) presented herein proves conclusively that the residue at position 83 in Urechis LK (95 in Eisenia LK) is involved in the substrate specificity of phosphagen kinases.
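The proposed kinetic characterization of mutants compares how strongly the enzyme prefers one substrate over another. As a minimal sketch, the specificity index is assumed here to be the ratio of catalytic efficiencies (kcat/Km) for two substrates; that definition is an assumption of this example, and all kinetic constants below are hypothetical numbers rather than measured values for LK or any mutant.

```python
def catalytic_efficiency(kcat, km):
    """kcat/Km in units of (kcat unit) per (Km unit)."""
    return kcat / km

def specificity_index(kcat_a, km_a, kcat_b, km_b):
    """Ratio of catalytic efficiencies, substrate A over substrate B."""
    return catalytic_efficiency(kcat_a, km_a) / catalytic_efficiency(kcat_b, km_b)

# Hypothetical constants for illustration only (kcat in 1/s, Km in mM).
wild_type = specificity_index(kcat_a=40.0, km_a=2.0, kcat_b=10.0, km_b=5.0)
mutant = specificity_index(kcat_a=5.0, km_a=5.0, kcat_b=8.0, km_b=2.0)
print(wild_type, mutant)
```

An index above 1 indicates a preference for substrate A (for example lombricine) and an index below 1 a preference for substrate B (for example taurocyamine), so a mutation that moves the index across 1 would indicate a switch in preference.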


APPENDIX A: COMMERCIALLY AVAILABLE POTENTIAL ENZYME INHIBITORS

[Structure column: chemical structure drawings not reproduced]

Name | Description | Vendor | Cat. No.
MTH-DL-Arginine | Arginine with backbone amino and carboxylate groups locked in rigid five-membered ring with other functionalities | Sigma | M3756
NG,NG'-Dimethyl-L-arginine | Arginine with both potential nucleophilic guanidino nitrogens bound to methyl groups. Available as salt with di(p-hydroxyazobenzene-p-sulfonate) | Sigma | D0390
NG-methyl-L-arginine | Arginine with one potential nucleophilic guanidino nitrogen bound to a methyl group. Available as acetate salt | Sigma | M7033
NG-hydroxyl-L-arginine | Arginine with one potential nucleophilic guanidino nitrogen bound to a hydroxyl group | Sigma | H7278
Nω-nitro-L-arginine (LNNA) | Arginine with one potential nucleophilic guanidino nitrogen bound to a nitro group | Sigma | N5501 (Methyl ester = NAME 5g, 38.85 72760)
Nα-carbamyl-L-arginine | Arginine with an amide group bound to alpha amino group. Other variations of this substitution also available | Sigma | C8632
Arginine ethyl ester | Arginine with backbone carboxylate esterified with an ethyl group. Methyl ester also available (11030) | Sigma | S78,289-0
Taurocyamine | Guanidinoethyl-2-sulfite | Wako Chemical 1-877-714-1920 | 200-07843
Lysine | Arginine less half of guanidinium and juxtaposing side chain N and C | Sigma | L5501
Taurine | Taurocyamine less guanidinium. (Other comps e.g. BES, ACES exist with other functionalities off amino group) | Sigma | T8691
α-chloro-δ-guanidino-n-valeric acid | Arginine with alpha amino replaced by nearly isosteric chloride | Sigma | C0519
Sarcosine | Creatine less guanidinium (CK and GK) (n-methyl glycine) | Sigma | S7672
L-Arginine hydroxamate | Arginine with NHOH bound to backbone carboxylate | Sigma | A7380
O-Phospho-L-serine | Lombricine less ethyl guanidinium | Sigma | P0878
S-Methyl L-thiocitrulline | Arginine with one of terminal guanidinium nitrogens replaced with S-CH3 | Sigma | M5171
L-N6-(1-iminoethyl) lysine | Arginine with an extra methylene and terminal guanidinium nitrogen replaced with methyl | Sigma | I8021
L-N5-(1-iminoethyl) ornithine | Arginine with a terminal guanidinium nitrogen replaced with methyl | Sigma | I8768
L-Ornithine | Arginine with only up to the first nitrogen of guanidinium | Sigma | 754-70
4-Guanidinobutyric acid | Arginine without one methylene and without alpha amino | Sigma | 51018
Homoarginine | Arginine with an extra methylene | Sigma | 157112
L-Canavanine | Single methylene replaced with oxygen | Sigma | C1625
Citrulline (L-2-Aminoureidovaleric acid) | Guanidinium nitrogen replaced with oxygen | Sigma | C7629


APPENDIX B: PROTOCOL FOR REAL SPACE REFINEMENT IN “O”

Useful ‘O’ resources, such as a model building tutorial and a list of commands, can be found at http://www.bioxray.dk/~mok/o/o_man/manual.html, and the A to Z of O at http://weizmann.ac.il/Chemical_Research_Support//xraylab/manuals/AtoZofO/A-Z_frameset.html

Suggested user input herein is in bold.

GETTING STARTED WITH O on the local network:
1. Point your web browser to \\Kodiak\shares and copy the contents of the o_nt directory to your working directory, likely on the Z:\ drive. This is necessary because you cannot write to Kodiak and you will need a local area where you can save your files.
2. Create a shortcut on your desktop to the nt_o executable from \\Kodiak\shares. Right click on the shortcut and add -st to the end of the nt_o.exe target (i.e. Z:\your_directory\o_nt.exe -st). This will ensure that you are using the stereo version of O. Later on, you will be able to toggle the stereo vision on and off with the F1 key, assuming you have the correct graphics card and stereo emitter properly installed and wear the correct NuVision 3D glasses that sync with the emitter. Also, when you type stereo (return) into the terminal (text) window and then click in the ‘dial menu’ box within the graphics window, you should see two new sliders, ‘shear’ and ‘separation’. Adjust these parameters with the sliders to get the stereo settings the way you like them. It seems to make the most sense to have the brightest part of the map and model closest to you, slowly fading into the background. I think that a shear and separation of zero is analogous to the stereo being off, because it appears to merge the two stereo images even while wearing your glasses and with F1 toggled on.
3. Double click the shortcut. This will bring up a DOS window running O. After step 4, when you are asked whether you want to use the display and you answer yes, the graphics window will also pop up.
4. >startup.o return
   >menu.o return
   return five more times

Commands are not case sensitive, which makes the program forgiving to use. A command can also be abbreviated as long as the abbreviation is unambiguous among the other commands; for example, instead of sam_atom_in you can type s_a_i.

TO LOAD A MODEL INTO O:
sam_atom_in return
Sam>name of input file: your_model_PDB_filename.pdb (in my case: modweb203_LKTSA_outmod_1Bchain_refmac1.pdb) or 1QK1_lsqon_210 ret
Sam> O associated molecule name:LK ret
File type is PDB
Database compressed
Space for 285900 atoms
Space for 10000 residues
Molecule B contained 360 residues and 2830 atoms
dir LK* ret
Heap>………

5. If you want to load a C-alpha trace that connects adjacent C-alphas with a straight line and shows no side chains:
O > mol ‘(your O-associated model name)’ ca ; end
e.g. O > mol LK ca ; end

TO ADD SIDECHAINS AS AN ITEM IN THE OBJECT MENU SLIDER:
mol ‘(your O-associated model name)’ obj ‘(side_chain)’ (or whatever you would like to call it)
zone ; end
(This command puts your sidechains into a separate field in the objects box on the graphics window so that you can toggle the sidechains on and off if things get too confusing.)

TO CENTER MOLECULE ON SCREEN
O > cen_atom ret
O > LK B100 ca (for instance, will center on the c-alpha of Bchain residue 100)

TO LOAD A MAP INTO O: (.EZD or .DSN6 format)
fm_f
Fm > Filename: your_map_filename.map (in my case model_map_45.map or 2FO-FCcompo45.map)
Fm> Name of this map Q1? ret
Fm> Map type is ?:
Fm> Choose one of these: O X-PLOR TNT EZD CCP4 PHASES
fm_setup
terminal echo -> What map? Q1: ret
The default map color is maroon, but I happen to like blue or dark_green (good contrast); you can use cyan, magenta, or anything you like (and you can also type any color in hexadecimal format).

fm_d
This should draw the electron density map that you just loaded.

TO MANUALLY REBUILD AND MODEL THE STRUCTURE INTO DENSITY
mutate_replace (ret) (LK) ((residue_#)) ((residue_type))
The new residue will be in purple. The whole molecule disappears, so redraw the molecule as in the ‘TO ADD SIDECHAINS…’ section above.

From the Rebuild menu at the top of the graphics window -> Grab -> atom or residue: pull C-alphas to the middle of the density in 3D (with stereo on).

Lego_setup (ret)
Lego> Drawing of ca traces (on/off): (ret)
Lego> Good-bad fit colour ramping 120 40 (ret)
Lego> File of Diagonal Distances dgnl.o (ret)
Lego> Directory containing Protein Database odata/ (ret)
Lego> File of sidechain rotamers rsc.o

Read refi_aa.o
Refi_init ‘(your O-associated model name)’ (return)
Refi_gene (return twice)
You only need to do this step once if you save after doing it.

Lego_Ca (ret): Select a Ca from the first and last residue of the mainchain region to be rebuilt. Usually 3 to 5 residues is a good choice. Click on the dial menu on the graphics screen and you should notice a new slider choice, “Best fit”. Scroll left or right on this slider until you get good agreement between the position of your Ca’s and the chicken-wire density.

Move_z: The program will ask you to click on the beginning and ending residues to define the zone to manipulate. Click on the dial menu and new sliders should appear that allow you to move and rotate the fragment in all directions. Do this until you are pleased with the fit of the model into the map.

Refi_z A21 A26: For instance, if I have been manipulating A chain residues in region 22 to 25 as in 1,2,4 and 5 above, this command should restore appropriate bond lengths, angles and torsions.

Rebuild -> Torsion -> In Residue -> Click on the dial menu and adjust phi/psi/omega/chi angles.

stereo: shear and separation, lightest away.

Iterative cycles of 1-2 and 4-6 above. Note that the ends of the 3 to 5 residue chains will not have good agreement; therefore, only look at the c-alpha positions for the middle three atoms and then overlap the next iterative residue fragment of your chain with the previous. For example, if I am attempting to trace the backbone and look at residues 101-105, my next fragment choice for best fitting analysis should be something like 104-108.
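The overlapping-fragment strategy described above (residues 101-105 followed by 104-108) can be planned ahead of time for a whole region. The following is a small sketch of that bookkeeping, written in Python rather than as an O macro; the function name and the default fragment length and overlap are illustrative choices taken from the example in the text, not part of O.

```python
def overlapping_fragments(first, last, length=5, overlap=2):
    """List (start, end) residue ranges of `length` residues that overlap
    the previous range by `overlap` residues and together cover first..last."""
    if length <= overlap:
        raise ValueError("length must be greater than overlap")
    step = length - overlap
    fragments = []
    start = first
    while start + length - 1 <= last:
        fragments.append((start, start + length - 1))
        start += step
    # Cover any residues left at the end of the region with one final window.
    if not fragments or fragments[-1][1] < last:
        fragments.append((max(first, last - length + 1), last))
    return fragments

print(overlapping_fragments(101, 112))
```

Each range gives the first and last residues to pick for one fitting cycle, so that the poorly fitting fragment ends are always revisited as interior residues of a neighbouring fragment.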

TO MOVE ABOUT WITHIN THE STRUCTURE (type in terminal window)
O > centre_atom
As3> Define molecule A, residue, and atom CA: a12

Centre_next and Centre_previous
O > centre_next residue_type = pro
As4> Centering on A A38 CA
O >
Now hit the Sphere macro to see the residue. Try it again. You should now see proline A56. It is often very useful to move forward one residue and display a sphere. Here is a macro that does that:
centre_next atom_name = ca
object sph
sphere 10
end

MOUSE O COMMANDS:
Hold down Middle Mouse + Move mouse = shrink or enlarge model
Right Mouse + Move Mouse = Rotate model
CTRL + Right Mouse + Move Mouse = Translate Model
Middle Mouse + shift = slab mode
F1 will toggle the stereo vision on and off provided you have the emitter, stereo goggles, stereo version of O and appropriate video card installed.

FOR FURTHER HELP, TRY to GOOGLE “O files”, or check out the reference below or Methods in Enzymology volume 115:
Jones, T.A., Zou, J-Y., Cowan, S.W. & Kjeldgaard, M. (1991) Improved Methods for Building Protein Models in Electron Density Maps and the Location of Errors in these Models. Acta Cryst. A47, 110-119.


APPENDIX C: BIOLOGICAL BUFFERS AND SOLUTIONS

Lysis Buffer | Superdex Buffer
50mM Tris/HCl | 25mM Tris/HCl
100mM NaCl | 150mM KCl
10mM EDTA | 1mM EDTA
15% w/v sucrose | 1mM DTT
14mM Beta-mercaptoethanol | 1mM EDTA
pH 8.2 at 298K | pH 8.1 at 298K

DEAE Running Buffer | Concentration Buffer
10mM Tris/HCl | 10mM Tris/HCl
1mM EDTA | 5mM KCl
5mM KCl | 2mM DTT
0.02% w/v NaN3 | pH = 8.1 at 298K
1mM DTT |
pH 8.6 at 298K |

ADP Running Buffer | Crystallization Buffer
50mM Bis-Tris | 15mM BisTris
1mM MgCl2 | 0.2M NaNO3
5mM NaNO3 | 1mM DTT
1mM DTT | 20% PEG 3350mme
pH 6.5 at 298K | pH = 6.8 or 5.8
 | Cryo: 35% Glycerol
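As a worked example of preparing one of the buffers listed above, the sketch below converts the lysis buffer concentrations of Tris, NaCl and sucrose into masses for a chosen volume. It assumes that Tris/HCl is made from Tris base titrated with HCl and uses standard molar masses (Tris base 121.14 g/mol, NaCl 58.44 g/mol); these assumptions and the 1 L volume are not taken from the original protocol.

```python
# Standard molar masses in g/mol (assumed reagent forms, not specified in the text).
MOLAR_MASS = {"Tris base": 121.14, "NaCl": 58.44}

def grams_for_molarity(conc_mM, volume_l, molar_mass):
    """Mass in grams needed for a millimolar concentration in a given volume."""
    return conc_mM / 1000.0 * volume_l * molar_mass

def grams_for_percent_wv(percent, volume_l):
    """Mass in grams for a % w/v solution (1% w/v = 1 g per 100 mL = 10 g per L)."""
    return percent * 10.0 * volume_l

volume_l = 1.0  # hypothetical batch size
print("Tris base (50 mM):", round(grams_for_molarity(50.0, volume_l, MOLAR_MASS["Tris base"]), 3), "g")
print("NaCl (100 mM):", round(grams_for_molarity(100.0, volume_l, MOLAR_MASS["NaCl"]), 3), "g")
print("Sucrose (15% w/v):", round(grams_for_percent_wv(15.0, volume_l), 1), "g")
```

The pH would still be adjusted by titration at 298 K as stated for each buffer, since the masses alone do not set it.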


REFERENCES

1. Zhou, G., et al., Transition state structure of arginine kinase: Implications for catalysis of bimolecular reactions. Proceedings of the National Academy of Sciences, USA, 1998. 95: p. 8449-54.
2. Hansen, D.E. and J.R. Knowles, The Stereochemical Course of the Reaction Catalyzed by Creatine Kinase. Journal of Biological Chemistry, 1981. 256(12): p. 5967-5969.
3. Ellington, W.R., Evolution and Physiological Roles of Phosphagen Systems. Annual Review of Physiology, 2001. 63: p. 289-325.
4. Schlattner, U., et al., Functional aspects of the X-ray structure of mitochondrial creatine kinase: a molecular physiology approach. Mol Cell Biochem, 1998. 184(1-2): p. 125-40.
5. Tombes, R.M. and B.M. Shapiro, Metabolite Channeling: A Phosphorylcreatine Shuttle to Mediate High Energy Phosphate Transport between Sperm Mitochondrion and Tail. Cell, 1985. 41: p. 325-344.
6. Wallimann, T., et al., Intracellular Compartmentation, Structure and Function of Creatine Kinase Isoenzymes: the "Phospho-Creatine Circuit" for Cellular Energy Homeostasis. Biochemical Journal, 1992. 281: p. 21-40.
7. Ellington, W.R. and S.T. Kinsey, Functional and evolutionary implications of the distribution of phosphagens in primitive type spermatozoa. Biol Bull, 1998. 195: p. 264-272.
8. Abel, K., et al., An alpha to beta conformational switch in EF-Tu. Structure, 1996. 4(10): p. 1153-9.
9. Capaldi, R.A. and R. Aggeler, Mechanism of the F(1)F(0)-type ATP synthase, a biological rotary motor. Trends Biochem Sci, 2002. 27(3): p. 154-60.
10. Sprang, S.R., G protein mechanisms: insights from structural analysis. Annu Rev Biochem, 1997. 66: p. 639-78.
11. Wittinghofer, A., Signal transduction via Ras. Biol Chem, 1998. 379(8-9): p. 933-7.
12. Ramakrishnan, V., Ribosome structure and the mechanism of translation. Cell, 2002. 108(4): p. 557-72.
13. Frank, J., The ribosome--a macromolecular machine par excellence. Chem Biol, 2000. 7(6): p. R133-41.
14. Spudich, J.A., The myosin swinging cross-bridge model. Nat Rev Mol Cell Biol, 2001. 2(5): p. 387-92.
15. Sindelar, C.V., et al., Two conformations in the human kinesin power stroke defined by X-ray crystallography and EPR spectroscopy. Nat Struct Biol, 2002. 9(11): p. 844-8.
16. Kikkawa, M., et al., Switch-based mechanism of kinesin motors. Nature, 2001. 411(6836): p. 439-45.


17. Vale, R.D. and R.A. Milligan, The way things move: looking under the hood of molecular motor proteins. Science, 2000. 288(5463): p. 88-95.
18. Kammermeier, H., Meaning of energetic parameters. Basic Res. Cardiol., 1993. 88: p. 380-4.
19. McGilvery, R.W., Biochemistry, A Functional Approach. 2 ed. 1979, Philadelphia: Saunders.
20. Ellington, W.R., Evolution and physiological roles of phosphagen systems. Annu Rev Physiol, 2001. 63: p. 289-325.
21. Zhou, G., The Transition State Structure of Arginine Kinase, in Molecular Biophysics. 1998, Florida State University: Tallahassee. p. 112.
22. Lahiri, S.D., et al., The 2.1 A Structure of Torpedo californica Creatine Kinase Complexed with the ADP-Mg2+-NO3- Creatine Transition-State Analogue Complex. Biochemistry, 2002. 41(47): p. 13861-13867.
23. Fritzwolf, K., et al. X-ray Structure of Mitochondrial Creatine Kinase. in Biophysical Society. 1996. Baltimore: Biophysical Journal.
24. Rao, J.K., G. Bujacz, and A. Wlodawer, Crystal structure of rabbit muscle creatine kinase. FEBS Lett, 1998. 439(1-2): p. 133-7.
25. Eder, M., et al., Crystal structure of brain-type creatine kinase at 1.41 A resolution. Protein Sci, 1999. 8(11): p. 2258-69.
26. Lahiri, S.D., et al., The 2.1 A Structure of Torpedo californica Creatine Kinase Complexed with the ADP-Mg2+-NO3- Creatine Transition-State Analogue Complex. Biochemistry, 2002. 41: p. 13861-13867.
27. Yousef, M.S., et al., Induced fit in guanidino kinases-comparison of substrate-free and transition state analog structures of arginine kinase. Protein Sci, 2003. 12(1): p. 103-11.
28. Yousef, M.S., et al., Refinement of Arginine Kinase Transition State Analogue Complex at 1.2 Å resolution; mechanistic insights. Acta Crystallographica. Section D: Biological Crystallography, 2002. 58: p. 2009-2017.
29. Gattis, J.L., et al., The active site cysteine of arginine kinase: structural and functional analysis of partially active mutants. Biochemistry, 2004. 43(27): p. 8680-9.
30. Ellington, W.R. and J. Bush, Cloning and expression of a lombricine kinase from an echiuroid worm: insights into structural correlates of substrate specificity. Biochem Biophys Res Commun, 2002. 291(4): p. 939-44.
31. Kuby, S.A., L. Noda, and H.A. Lardy, Adenosinetriphosphate-Creatine Transphosphorylase III. Kinetic Studies. Journal of Biological Chemistry, 1954. 210: p. 65-82.
32. Kenyon, G.L. and G.H. Reed, Creatine Kinase: Structure-activity Relationships. Advances in Enzymology, 1983. 54: p. 367-426.
33. Ellington, W.R., Phosphocreatine represents a thermodynamic and functional improvement over other muscle phosphagens. Journal of Experimental Biology, 1989. 143: p. 177-94.


34. Stokes, G.S., et al., Interactions of L-arginine, isosorbide mononitrate, and angiotensin II inhibitors on arterial pulse wave. Am J Hypertens, 2003. 16(9): p. 719-724.
35. Eremin, O., L-Arginine: Biological Aspects and Clinical Applications. 1997, New York, NY: Chapman and Hall.
36. Murad, F., Shattuck Lecture. Nitric oxide and cyclic GMP in cell signaling and drug development. N. Engl. J. Med, 2006. 355(19): p. 2003-11.
37. Sowden, H.M., K.M. Naseem, and D.J. Tobin, Differential expression of nitric oxide synthases in human scalp epidermal and hair follicle pigmentary units: implications for regulation of melanogenesis. Br. J. Dermatol., 2005. 153(2): p. 301-9.
38. Burnett, A.L., The role of nitric oxide in erectile dysfunction: implications for medical therapy. J Clin Hypertens (Greenwich), 2006. 8(12 Suppl 4): p. 53-62.
39. Pasternak, R.C. and E. Braunwald, Acute Myocardial Infarction, in Harrison's Principles of Internal Medicine, J.D. Wilson, et al., Editors. 1991, McGraw-Hill, Inc.: New York, NY.
40. Griggs, R.C., W.G. Bradley, and B. Shahani, Approach to the Patient with Neuromuscular Disease, in Harrison's Principles of Internal Medicine, J.D. Wilson, et al., Editors. 1991, McGraw-Hill, Inc.: New York, NY. p. 2088-2096.
41. Fried, L.F., et al., Renal insufficiency as a predictor of cardiovascular outcomes and mortality in elderly individuals. J Am Coll Cardiol, 2003. 41(8): p. 1364-72.
42. Devlin, T.M., ed. Textbook of Biochemistry with Clinical Correlations. 5th ed. 2002, Wiley-Liss: New York. 1216.
43. Page, M.I. and W.P. Jencks, Entropic Contributions to Rate Accelerations in Enzymic and Intramolecular Reactions and the Chelate Effect. Proceedings of the National Academy of Sciences, USA, 1971. 68: p. 1678-83.
44. Dafforn, A. and D.E. Koshland Jr., Theoretical Aspects of Orbital Steering. Proceedings of the National Academy of Sciences, USA, 1971. 68: p. 2463-7.
45. Hengge, A.C., Isotope effects in the study of phosphoryl and sulfuryl transfer reactions. Acc Chem Res, 2002. 35(2): p. 105-12.
46. Murali, N., et al., Two-Dimensional Transferred Nuclear Overhauser Effect Spectroscopy (TRNOESY) Studies of Nucleotide Conformations in Creatine Kinase Complexes. Biochemistry, 1993. 21: p. 12941-8.
47. Murali, N., G.K. Jarori, and B.D. Rao, Two-dimensional transferred nuclear Overhauser effect spectroscopy (TRNOESY) studies of nucleotide conformations in arginine kinase complexes. Biochemistry, 1994. 33(47): p. 14227-36.
48. Blethen, S.L., Kinetic Properties of the Arginine Kinase Isoenzymes of Limulus polyphemus. Archives of Biochemistry and Biophysics, 1972. 149: p. 244-251.

49. Rao, B.D.N., D.H. Buttlaire, and M. Cohn, 31P NMR Studies of the Arginine Kinase Reaction. Journal of Biological Chemistry, 1976. 251: p. 6981-6.
50. Rhee, S., et al., Crystal structures of a mutant (betaK87T) tryptophan synthase alpha2beta2 complex with ligands bound to the active sites of the alpha- and beta-subunits reveal ligand-induced conformational changes. Biochemistry, 1997. 36(25): p. 7664-80.
51. Gerstein, M. and C. Chothia, Analysis of protein loop closure. Two types of hinges produce one motion in lactate dehydrogenase. Journal of Molecular Biology, 1991. 220: p. 133-49.
52. Boyer, R., Concepts in Biochemistry. 1998, Pacific Grove, CA: Brooks / Cole.
53. Yousef, M.S., et al., Refinement of the arginine kinase transition-state analogue complex at 1.2 A resolution: mechanistic insights. Acta Crystallogr D Biol Crystallogr, 2002. 58(Pt 12): p. 2009-17.
54. Ruben, E.A., M.S. Chapman, and J.D. Evanseck, Generalized Anomeric Interpretation of the "High-Energy" N-P Bond in N-Methyl-N'-phosphorylguanidine: Importance of Reinforcing Stereoelectronic Effects in "High-Energy" Phosphoester Bonds. J Am Chem Soc, 2005. 127(50): p. 17789-17798.
55. Pruett, P.S., et al., The putative catalytic bases have, at most, an accessory role in the mechanism of arginine kinase. J Biol Chem, 2003. 29: p. 26952-7.
56. Mesecar, A.D., B.L. Stoddard, and D.E. Koshland Jr., Orbital Steering in the Catalytic Power of Enzymes: Small Structural Changes with Large Catalytic Consequences. Science, 1997. 277: p. 202-206.
57. Warshel, A., Electrostatic origin of the catalytic power of enzymes and the role of preorganized active sites. J Biol Chem, 1998. 273(42): p. 27035-8.
58. Villa, J., et al., How important are entropic contributions to enzyme catalysis? Proc Natl Acad Sci U S A, 2000. 97(22): p. 11899-904.
59. Bruice, T.C. and S.J. Benkovic, Chemical basis for enzyme catalysis. Biochemistry, 2000. 39(21): p. 6267-74.
60. Lau, E.Y., et al., The importance of reactant positioning in enzyme catalysis: a hybrid quantum mechanics/molecular mechanics study of a haloalkane dehalogenase. Proc Natl Acad Sci U S A, 2000. 97(18): p. 9937-42.
61. Koshland Jr., D.E., et al., The importance of orientational factors in enzymatic reactions. Cold Spring Harbor Symposia on Quantitative Biology, 1971. 36: p. 13-20.
62. Jencks, W.P. and M.I. Page, "Orbital Steering", Entropy and Rate Accelerations. Biochemical and Biophysical Research Communications, 1974. 57: p. 887-892.
63. Storm, D.R. and D.E. Koshland Jr., A Source for the Special Catalytic Power of Enzymes: Orbital Steering. Proceedings of the National Academy of Sciences, USA, 1970. 66: p. 445-52.

64. Wolfenden, R., et al., The Temperature Dependence of Enzyme Rate Enhancements. J. Am. Chem. Soc., 1999. 121(32): p. 7419-7420.
65. Jencks, W.P., From Chemistry to Biochemistry to Catalysis to Movement. Annual Review of Biochemistry, 1997. 66: p. 1-18.
66. Lightstone, F.C. and T.C. Bruice, Ground State Conformations and Entropic and Enthalpic Factors in the Efficiency of Intramolecular and Enzymatic Reactions. 1. Cyclic Anhydride Formation by Substituted Glutarates, Succinate, and 3,6-Endoxo-4-tetrahydrophthalate Monophenyl Esters. J. Am. Chem. Soc., 1996. 118(11): p. 2595-2605.
67. Lightstone, F.C. and T.C. Bruice, Bioorganic Chemistry, 1998. 26: p. 193-199.
68. Fischer, E., Ber. Dtsch. Chem. Ges., 1894. 27: p. 2985.
69. Koshland Jr., D.E., The key-lock theory and the induced-fit theory. Angew Chem Int Ed Engl, 1994. 33: p. 2375-2378.
70. Perona, J.J. and C.S. Craik, Structural basis of substrate specificity in the serine proteases. Protein Science, 1995. 4: p. 337-360.
71. Winter, G., et al., Redesigning enzyme structure by site-directed mutagenesis: tyrosyl tRNA synthetase and ATP binding. Nature, 1982. 299: p. 756-758.
72. Wells, J.A., et al., Recruitment of Substrate-specificity Properties from One Enzyme into a Related One by Protein Engineering. Proceedings of the National Academy of Sciences, U.S.A., 1987. 84: p. 5167-5171.
73. Hedstrom, L., et al., Converting trypsin to chymotrypsin: Ground state binding does not determine substrate specificity. Biochemistry, 1994. 33: p. 8764-9.
74. Carter Jr., C.W., Cognition, Mechanism, and Evolutionary Relationships in Aminoacyl-tRNA Synthetases. Annual Review of Biochemistry, 1993. 62: p. 715-48.
75. Suzuki, T., et al., Arginine kinase from Nautilus pompilius, a living fossil. Site-directed mutagenesis studies on the role of amino acid residues in the Guanidino specificity region. J Biol Chem, 2000. 275(31): p. 23884-90.
76. Zhou, G., W.R. Ellington, and M.S. Chapman, Induced Fit in Arginine Kinase. Biophys J, 2000. 78(3): p. 1541-1550.
77. Cantwell, J.S., et al., Mutagenesis of two acidic active site residues in human muscle creatine kinase: implications for the catalytic mechanism. Biochemistry, 2001. 40(10): p. 3056-61.
78. Zhou, G., et al. Transition State Structure of Arginine Kinase: Implications for Catalysis of Bimolecular Reactions. in Protein Society Annual Meeting. 1998. San Diego: Protein Society.
79. Azzi, A., et al., The role of phosphagen specificity loops in arginine kinase. Protein Sci, 2004. 13(3): p. 575-85.
80. Watts, D.C., et al., The use of arginine analogues for investigating the functional organization of the arginine-binding site in lobster muscle arginine kinase. Role of the 'essential' thiol group. Biochem J, 1980. 185(3): p. 593-9.
81. Tanaka, K. and T. Suzuki, Role of amino-acid residue 95 in substrate specificity of phosphagen kinases. FEBS Lett, 2004. 573(1-3): p. 78-82.
82. Borson, N.D., W.L. Salo, and L.R. Drewes, A lock-docking oligo(DT) primer for 5' and 3' RACE PCR. PCR Methods Appl, 1992. 2: p. 144-148.
83. Graber, N.A. and W.R. Ellington, Gene duplication events producing muscle (M) and brain (B) isoforms of cytoplasmic creatine kinase: cDNA and deduced amino acid sequences from two lower chordates. Mol. Biol. Evol., 2001. 18: p. 1305-1314.
84. Frohman, M.A., M.K. Dush, and G.R. Martin, Rapid production of full length cDNAs from rare transcripts: amplification using a single gene-specific oligonucleotide primer. Proceedings of the National Academy of Sciences, USA, 1988. 85: p. 8998-9002.
85. Davis, G.D., et al., New fusion protein systems designed to give soluble expression in Escherichia coli. Biotechnol Bioeng., 1999. 65(4): p. 382-388.
86. Brooks, S.P. and K.B. Storey, Purification of using transition-state analogue affinity chromatography. J. Chromatogr., 1988. 455: p. 291-296.
87. Ramu, C., et al., Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res., 2003. 31(13): p. 3497-3500.
88. Jancarik, J. and S.-H. Kim, Sparse Matrix Sampling: a Screening Method for Crystallization of Proteins. Journal of Applied Crystallography, 1991. 24: p. 409-11.
89. Hosfield, D., et al., A fully integrated protein crystallization platform for small-molecule drug discovery. Journal of Structural Biology, 2003. 142: p. 207-217.
90. Garman, E.F. and E.P. Mitchell, Glycerol concentrations required for cryoprotection of 50 typical protein crystallization solutions. Journal of Applied Crystallography, 1996. 29: p. 584-7.
91. Garman, E., Cool data: quantity AND quality. Acta Crystallogr D Biol Crystallogr, 1999. 55(Pt 10): p. 1641-53.
92. Szebenyi, D.M.E., A system for integrated collection and analysis of crystallographic diffraction data. J. Synchrotron Radiation, 1997. 4: p. 128-135.
93. Otwinowski, Z. and W. Minor, Processing of X-Ray Diffraction Data Collected in Oscillation Mode. Methods in Enzymology, 1997. 276: p. 307-326.
94. Brunger, A.T., et al., Crystallography and NMR System: A New Software Suite for Macromolecular Structure Determination. Acta Cryst. D, 1998. D54: p. 905-921.
95. Collaborative Computational Project Number 4, The CCP4 Suite: Programs for Protein Crystallography. Acta Crystallographica, 1994. D50: p. 760-3.

96. Kleywegt, G.J., Making the most of your search model. CCP4/ESF-EACBM Newsletter on Protein Crystallography, 1996. 32: p. 32-36.
97. Sali, A. and T.L. Blundell, Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol., 1993. 234: p. 779-815.
98. Fiser, A., R.K. Do, and A. Sali, Modeling of loops in protein structures. Protein Science, 2000. 9: p. 1753-1773.
99. Rossmann, M.G., The Molecular Replacement Method. 1972, New York: Gordon and Breach.
100. McCoy, A.J., et al., Likelihood-enhanced fast translation functions. Acta Cryst. D, 2005. D61: p. 458-464.
101. Navaza, J. and P. Saludjian, AMoRe: An Automated Molecular Replacement Program Package. Methods in Enzymology, 1997. 277: p. 581-94.
102. Navaza, J.e.a., The CCP4 Suite: Programs for Protein Crystallography. Acta Crystallogr D Biol Crystallogr, 1994. D50: p. 760-763.
103. Navaza, J., AMoRe: an Automated Package for Molecular Replacement. Acta Crystallographica, 1994. A50: p. 157-163.
104. Grosse-Kunstleve, R.W. and P.D. Adams, Patterson correlation methods: a review of molecular replacement with CNS. Acta Crystallogr D Biol Crystallogr, 2001. 57(Pt 10): p. 1390-6.
105. Brunger, A.T., et al., Crystallography and NMR System: A New Software Suite for Macromolecular Structure Determination. Acta Cryst. D, 1998. D54: p. 905-921.
106. McCoy, A.J., Solving structures of protein complexes by molecular replacement with Phaser. Acta Crystallogr D Biol Crystallogr, 2007. 63(Pt 1): p. 32-41.
107. Pai, R., J.C. Sacchettini, and T.R. Ioerger, Identifying non-crystallographic symmetry in protein electron-density maps: a feature-based approach. Acta Cryst. D, 2006. D62(9): p. 1012-1021.
108. Murshudov, G., A. Vagin, and E. Dodson, Refinement of Macromolecular Structures by the Maximum-Likelihood Method. Acta Crystallographica, 1997. D53: p. 240-255.
109. Adams, P.D., et al., PHENIX: Building new software for automated crystallographic structure determination. Acta Cryst. D, 2002. D58: p. 1948-1954.
110. Jones, T.A. and M. Kjeldgaard. Making the first trace with O. in From First Map to Final Model, Proceedings of the CCP4 Study Weekend. 1994. Warrington, UK: Daresbury Laboratory.
111. Emsley, P. and K. Cowtan, Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr, 2004. 60(Pt 12 Pt 1): p. 2126-32.
112. Kleywegt, G.J. and A.T. Jones, Efficient Rebuilding of Protein Structures. Acta Crystallographica, 1996. D52: p. 829-32.

113. Brünger, A.T., et al., Crystallography and NMR system: A new software system for macromolecular structure determination. Acta Crystallographica, 1998. D54: p. 905-921.
114. Laskowski, R.A., et al., PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Cryst., 1993. 26: p. 283-91.
115. Jones, T.A. and M. Kjeldgaard, Electron-Density Map Interpretation. Methods in Enzymology, 1997. 277: p. 173-208.
116. CCP4, The SERC (UK) Collaborative Computing Project No. 4, A Suite of Programs for Protein Crystallography, Daresbury Laboratory: Warrington, UK.
117. Ma, J. and M. Karplus, Molecular switch in signal transduction: Reaction paths of the conformational changes in ras p21. Proc. Natl. Acad. Sci, USA, 1997. 94: p. 11905-11910.
118. van der Vaart, A. and M. Karplus, Simulation of conformational transitions by the restricted perturbation-targeted molecular dynamics method. J Chem. Phys., 2005. 122: p. 114903-1-114903-6.
119. Brooks, B.R., et al., CHARMM: A Program for Macromolecular Energy, Minimization, and Dynamics Calculations. J. Comput. Chem, 1983. 4: p. 187-217.
120. Radermacher, M., et al., Cryoelectron microscopy and three-dimensional reconstruction of the calcium release channel/ryanodine receptor from skeletal muscle. Journal of Cell Biology, 1994. 127: p. 411-23.
121. Karplus, M., Molecular dynamics simulations of biomolecules. Acc. Chem. Res., 2002. 35: p. 321-323.
122. Humphrey, W., A. Dalke, and K. Schulten, VMD: Visual Molecular Dynamics. J. Mol. Graph., 1996. 14(1): p. 33-8, 27-8.
123. Bradford, M.M., A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal Biochem, 1976. 72: p. 248-54.
124. Zhou, G., et al., Expression, Purification from Inclusion Bodies, and Crystal Characterization of Transition State Analog Complex of Arginine Kinase: a Model for Studying Phosphagen Kinases. Protein Science, 1997. 6: p. 444-9.
125. Suzuki, T., et al., Arginine kinase evolved twice: evidence that echinoderm arginine kinase originated from creatine kinase. Biochem J, 1999. 340(Pt 3): p. 671-675.
126. Watts, D.C., et al., The use of arginine analogues for investigating the functional organization of the arginine binding site in Lobster muscle arginine kinase: The role of the essential thiol group. Biochem J, 1980. 185: p. 593-599.
127. Wright-Weber, B., et al., Immunological and physical comparison of monomeric and dimeric phosphagen kinases: Some evolutionary implications. Biochim Biophys Acta, 2006. 1760(3): p. 364-71.

128. Brown, A.E. and S.H. Grossman, The mechanism and modes of inhibition of arginine kinase from the cockroach (Periplaneta americana). Arch Insect Biochem Physiol, 2004. 57(4): p. 166-77.
129. Novak, W.R., et al., Isoleucine 69 and valine 325 form a specificity pocket in human muscle creatine kinase. Biochemistry, 2004. 43(43): p. 13766-74.
130. Uda, K., et al., Origin and properties of cytoplasmic and mitochondrial isoforms of taurocyamine kinase. Febs J, 2005. 272(14): p. 3521-30.
131. Uda, K., et al., Phosphagen kinase of the giant tubeworm Riftia pachyptila. Cloning and expression of cytoplasmic and mitochondrial isoforms of taurocyamine kinase. Int J Biol Macromol, 2005.
132. Kleywegt, G.J. and T.A. Jones, Homo Crystallographicus -Ways and Means - Quo vadis? Structure, 2002. 10: p. 465-472.

BIOGRAPHICAL SKETCH

D. Jeffrey Bush

Education

University of Pittsburgh at Johnstown, Johnstown, PA (1993-1995)

Bachelor of Science in Chemistry, Pennsylvania State University, University Park, PA (1995-1998)

Doctor of Philosophy in Biochemistry, Florida State University, Tallahassee, FL (1999-present)
Graduate Advisor: Dr. Michael S. Chapman
Dissertation Title: “Lombricine Kinase Structure and Substrate Specificity: A Paradigm for Elucidation of Substrate Specificity in Phosphagen Kinases.”

Research Interests

Structural Biology, primarily x-ray crystallography of protein-substrate complexes to probe active-site geometry and interactions involving the structural and functional relationships of enzymes. Steady-State and bioinformatics to assist with rational structure-based drug design. Protein engineering via site-directed mutagenesis and in silico calculation and simulation.

Professional Experience

2007: Adjunct Professor of Chemistry, Tallahassee Community College
2006-2007: Adjunct Professor of Chemistry, Florida State University
2005-2006: Teaching Assistant in Biochemistry, Florida State University
2000-2005: Graduate Research Assistant; PI: Dr. Michael S. Chapman, Florida State University
1999-2003: Teaching Assistant in General and Organic Chemistry, Florida State University
1998-1999: Contract Research Assistant, Johnson and Johnson-Merck Consumer Pharmaceuticals Corporation, Fort Washington, PA. Developed an atomic absorption spectroscopic assay for calcium and magnesium in liquid antacids. Performed active ingredient and degradant assays of famotidine by HPLC.

Fellowships and Honors

2005-2006: Outstanding Teaching Assistant Award Nominee, Florida State University
2004-2005: Dissertation Research Grant, Florida State University
2003-2005: American Heart Association Predoctoral Research Fellow, “Structure and substrate specificity in lombricine kinase”
2003: Joseph M. Schor Fellow in Biochemistry
2001-2002: NSF sponsored Research Training Grant Fellow, Institute of Molecular Biophysics, Florida State University
1999-2000: Hoffman Teaching Award, Florida State University

Publications

1. Cloning and expression of a lombricine kinase from an echiuroid worm: Insights into structural correlates of substrate specificity. Ellington W.R., Bush J., Biochemical and Biophysical Research Communications, 2002 Mar 8;291(4):939-44.
2. A Crystallographic Investigation of Substrate Homologue Conformational Variability and Mechanistic Implications in Arginine Kinase. Clark S.A., Ruben E.R., Yousef M.S., Bush D.J., Fenley M., Ellington W.R., Chapman M.S., 2006, manuscript in preparation.
3. The pH and substrate dependence of space group and non-crystallographic symmetry in the crystallization and x-ray data collection of lombricine kinase. Manuscript in preparation.
4. The Substrate-free crystal structure of lombricine kinase: Structural and targeted molecular dynamics characterization of a paradigm for substrate alignment and specificity in phosphagen kinases. Bush, J., Clark, S.A. and Chapman, M.S., manuscript in preparation.

Professional Affiliations

2006-2007: American Crystallographic Association
2004-2006: American Association for the Advancement of Science
2002-2006: Southeast Regional Collaborative Access Team Beamline User, Advanced Photon Source at Dept. of Energy’s Argonne National Laboratory, Argonne, IL
