Understanding the Relationship Between Enzyme Structure and Catalysis
Total Page:16
File Type:pdf, Size:1020Kb
Understanding the Relationship Between Enzyme Structure and Catalysis Alex Gutteridge∗ Darwin College, Cambridge This dissertation is submitted for the degree of Doctor of Philosophy October 15, 2005 ∗EBI, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SD. Preface This dissertation is the result of my own work, and includes nothing which is the outcome of work done in collaboration, except where specifically indicated in the text. The length of this dissertation does not exceed the word limit specified by the Graduate School of Biological, Medical and Veterinary Sciences. i Abstract The three dimensional structure of an enzyme is of fundamental importance to almost every function it performs. In this thesis, we aim to analyse the structures of many enzymes in order to gain insights into how they perform catalysis, and the role of structure in enzyme function. As well as improving our understanding of these important biological molecules, these insights will help in developing new tools for annotating enzyme structures of unknown function and for designing novel enzymes. Predicting the location of the active site, and the identity of the catalytic residues, is an important first step for annotating an enzyme of unknown function. We have developed a neural network trained to distinguish catalytic and non-catalytic residues based on a mixture of sequence and structural parameters. We find that the correct location of the active site can be predicted in ∼70% of cases. We also find that the most important factor in making a prediction is the conservation score of each residue. However, including structural data does improve the predictions that are made over those made on the basis of conservation alone. Our first analysis of enzyme structure aims to measure the extent of conforma- tional change undergone upon substrate binding. We find that most enzymes do not undergo large scale conformational change, and in many cases the catalytic residues are isolated from changes that do occur. One new theory of enzyme action suggests that certain, very small, conformational changes (deriving from changes in the en- ii Abstract zyme dynamics) are important in catalysis. In a separate study, we look for these changes and find hints that they may occur in some enzymes, although they are at the limit of what can be reliably observed in crystal structures. Having established that catalytic site is generally preformed, we next look at the interactions between catalytic residues. We analyse a large set of interactions and the roles they play in catalysis. We find that many catalytic residues do not require direct interactions with other catalytic residues to be active, although we do predict that many previously unidentified interactions will prove to be functionally important. Those residues that do commonly interact are often important in the pKa modulation of other residues. We examine in greater detail several important interactions, and observe a preference for certain catalytic interactions to adopt different distributions of geometries to non-catalytic interactions. In conclusion, it is clear that structure plays an important, though sometimes subtle, role in catalysis. We have seen that conformational changes in enzymes are often small, but that even small changes could be significant. We have also found that even the smallest chemical differences between groups are often exploited by enzyme mechanisms. We can use this information on how Nature uses these different groups in catalysis to improve our ability to predict and design new enzyme functions. iii Acknowledgements First and foremost, thanks must go to my supervisor Prof Thornton. Without her support, insight and encouragement, this thesis would be a lot less interesting and contain a lot more spelling mistaks. Thanks and acknowledgement must also go to many other past and present members of the Thornton group both at the EBI and UCL. Firstly, thanks to Dr Craig Porter for two summer jobs, lots of pizza and more enzyme annotation than I could ever wish for. Also, thanks to Dr Roman ’Romanager’ Laskowski, who despite his failing powers in recent years, has remained an inspiration both in the lab and on the football pitch. At the EBI, thanks first to Dr Gail Bartlett, for guidance and inspiration in the formative months of my Phd. Her work, and the thesis she wrote from it, has been a template for (all the good parts) of this dissertation. Other special mentions go to Gareth Stockwell and Dr Gordon Whammond (and Dr Kevin Murray - where ever he might be) for giving me somewhere to live (even if it was in the middle of nowhere) in my first year. And to Gareth (again), Gail (again), Prof Barbara Brodsky, James Torrance and Dr Jonathan Barker for putting up with my (occasionally) smelly football kits, and for providing all the office fun and games I’ve (usually) enjoyed over the last three years Outside of work, thanks go to Sheena Patel for all her support and friendship, iv Acknowledgements everyone at Drayton Park for whatever they’ve been doing, my Mum and Dad for trying to understand the various bits of pieces of what I’ve been doing for the last three years and last but not least, Sowmiya Moorthie, for making Cambridge a fun place to live, even for a big city boy. Actually, last thanks to all the hard-working crystallographers and experimen- talists who have done the work to generate the data from which I’ve built this thesis. v Table of Contents 1 Introduction 1 1.1 EnzymesasCatalysts........................... 1 1.1.1 The Importance of Enzymes For Life . 1 1.1.2 Historical Perspective on Enzyme Catalysis . 7 1.1.3 Chemical Catalysis . 9 1.2 TheStructureandEvolutionofEnzymes . 14 1.2.1 StructuralStudiesofEnzymes . 14 1.2.2 Domains.............................. 17 1.2.3 Active Sites and Clefts . 17 1.2.4 Determining the Function of Enzymes . 19 1.2.5 EnzymeEvolution . 21 1.2.6 Structural and Functional Enzyme Classification . 23 1.3 SerineProteases.............................. 26 1.3.1 Evolution and Families . 26 1.3.2 StructuralStudies. 27 1.3.3 Mechanism ............................ 28 1.4 Structural Genomics and Predicting Enzyme Function . ..... 31 1.4.1 Predicting Function From Sequence . 31 1.4.2 PredictingFunctionFromStructure . 33 1.5 Learning the Principles of Catalysis and Unresolved Issues . 37 1.5.1 The Catalytic Site Atlas and MACiE . 37 1.6 OutlineofThesis ............................. 39 2 Predicting Active Sites 41 2.1 Introduction................................ 41 2.2 MaterialsandMethods. 42 2.2.1 ProteinTestSet.......................... 42 vi TABLE OF CONTENTS 2.2.2 Compilation of Data . 43 2.2.3 Encoding and generation of Data Sets . 44 2.2.4 Training the Neural Network . 45 2.2.5 MeasuringPerformance. 45 2.2.6 RankingandClustering . 47 2.3 Results................................... 47 2.3.1 Analysis of Parameters . 47 2.3.2 Depth ............................... 48 2.3.3 An Example: Quinolate Phosphoribosyltransferase . 49 2.3.4 Trainingthenetwork . 50 2.3.5 Network Weights . 52 2.3.6 Clustering ............................. 54 2.3.7 Comparing the Predicted Sites to the Known Sites . 55 2.3.8 Significance of Results . 57 2.3.9 Comparison of the Performance of Different Networks . 58 2.3.10 Predictions ............................ 59 2.4 Discussion................................. 67 2.4.1 Why Did The Failures Fail? . 70 3 Compiling the Data 74 3.1 Introduction................................ 74 3.1.1 CurrentDataSources. 75 3.2 CompilingtheDataset . 78 3.2.1 Description of the Database . 79 3.3 Results................................... 81 3.3.1 PDBandECCoverage. 81 3.3.2 BiasandRedundancy . 83 4 Conformational Change 87 4.1 Introduction................................ 87 4.2 Methods.................................. 91 4.2.1 CompilingtheDataset . 91 4.2.2 SizeoftheDataset ........................ 93 4.2.3 Measuring Conformational Changes . 94 4.3 Results................................... 95 4.3.1 Catalytic Cycles . 95 vii TABLE OF CONTENTS 4.3.2 SubstrateBinding. .106 4.3.3 AbsoluteMotion .........................109 4.3.4 SideChainMotion . .109 4.4 Conclusions ................................110 4.4.1 OverallResults . .110 4.4.2 MotionandEnergy . .112 4.4.3 Implications for Other Applications . 113 4.4.4 Stabilityvsinducedfit . .114 5 Interactions of Catalytic Residues 116 5.1 Introduction................................116 5.2 Method ..................................118 5.2.1 GeneratingData . .118 5.2.2 The Functions of Secondary Residues . 120 5.3 Results...................................120 5.3.1 Common Catalytic Dyads . 120 5.3.2 Common Catalytic Units and their functions . 124 5.3.3 Arginine-Arginine. .124 5.3.4 Carboxylate/Carboxylate . 124 5.3.5 Carboxylate/Arginine . 125 5.3.6 Carboxylate/Lysine . 126 5.3.7 Carboxylate/Histidine . 128 5.3.8 Histidine/Hydroxyl . 128 5.3.9 Histidine/Histidine . .130 5.4 Discussion.................................130 5.5 The Roles of Catalytic Interactions . 130 6 Priming Catalytic Histidine 136 6.1 Introduction................................136 6.2 Methods..................................139 6.2.1 TheDataset............................139 6.3 AnalysisandResults ...........................144 6.3.1 PrimingofHistidine . .144 6.3.2 Histidine Interactions . 149 6.4 Discussion.................................159 6.4.1 HistidinePriming. .159 viii TABLE OF CONTENTS 6.4.2 Interaction Geometries . 162 7 Non-covalent Bonding In Enzymes 165 7.1 Introduction................................165 7.2 Methods..................................167