The MOLECULES of LIFE

Physical and Chemical Principles

John Kuriyan, Boyana Konforti, David Wemmer

The cover illustration shows the bacterial ribosome (purple and gray) in the act of decoding the sequence of a messenger RNA molecule (blue). Three tRNA molecules are bound to the ribosome (red, green, and yellow). The growing protein chain, which is hidden within the ribosome, is attached to the green tRNA. The red tRNA is delivering a new amino acid for incorporation into the protein, and the yellow tRNA is about to depart. (Based on X-ray crystallographic analysis by V. Ramakrishnan and colleagues at the MRC Laboratory of Molecular Biology, Cambridge, UK.)

Vice President: Denise Schanck
Editor: Summers Scholl
Senior Editorial Assistant: Kelly O’Connor
Primary Illustrator: Lore Leighton
Additional Illustration: Laurel Muller, Cohographics, and Tiago Barros
Production Editor and Layout: EJ Publishing Services
Cover and Text Design: Matthew McClements, Blink Studio, Ltd.
Developmental Editors: Sherry Granum Lewis, John Murdzek, and Miranda Robertson
Copyeditor: John Murdzek
Proofreader: Sally Huish
Indexer: Merrall-Ross International Ltd.

JOHN KURIYAN is Professor of Molecular and Cell Biology and of Chemistry at the University of California, Berkeley. His laboratory uses X-ray crystallography to determine the three-dimensional structures of proteins involved in signaling and replication, as well as biochemical, biophysical, and computational analyses to elucidate mechanisms. Kuriyan was elected to the US National Academy of Sciences in 2001.

BOYANA KONFORTI is the launch Editor of Cell Reports, an open-access journal that covers all of biology with a focus on short papers. Over her career, Konforti has researched the mechanisms of DNA recombination and RNA splicing. She has been a professional editor for over 13 years; most recently she was Chief Editor of Structural & Molecular Biology.

DAVID WEMMER is Professor of Chemistry at the University of California, Berkeley and has served as Vice Chair, Assistant Dean, and Executive Associate Dean since joining the faculty in 1985. His research in structural biology uses magnetic resonance methods to investigate the structure of proteins and DNA toward a better understanding of how these molecules function. Wemmer is a Fellow of the AAAS and a member of Phi Kappa Phi and Sigma Xi.

© 2013 by Garland Science, Taylor & Francis Group, LLC

This book contains information obtained from authentic and highly regarded sources. Every effort has been made to trace copyright holders and to obtain their permission for the use of copyright material. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means (graphic, electronic, or mechanical, including photocopying, recording, taping, or information storage and retrieval systems) without permission of the copyright holder.

ISBN 978-0-8153-4188-8

Library of Congress Cataloging-in-Publication Data

Kuriyan, John.
The molecules of life : physical and chemical principles / John Kuriyan, Boyana Konforti, David Wemmer.
p. ; cm.
Includes bibliographical references and index.
ISBN 978-0-8153-4188-8 (alk. paper)
I. Konforti, Boyana. II. Wemmer, David. III. Title.
[DNLM: 1. Molecular Biology--methods. 2. Biochemical Processes--physiology. 3. Genomics--methods. QH 506]
572’.33--dc23 2012008865

Published by Garland Science, Taylor & Francis Group, LLC, an informa business, 711 Third Avenue, New York, NY 10017, USA, and 3 Park Square, Milton Park, Abingdon, OX14 4RN, UK.

Printed in the United States of America

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1

Visit our website at http://www.garlandscience.com

Preface

The field of biochemistry is entering an exciting new era in which genomic information is being integrated into molecular-level descriptions of the physical processes that make life possible. Our understanding of how biological macromolecules work at the level of atoms and interactions is also enabling great strides to be made in molecular medicine, where the path between the identification of a target and the development of therapeutics that modulate its functions is becoming ever shorter. Key to making future advances in these areas is a new generation of molecular biologists and biochemists who are able to harness the tools and insights of physics and chemistry to exploit the emergence of genomic and systems-level information in biology.

This book is the result of a decade-long series of discussions among the three of us, in which we considered how biology students should best prepare themselves to take advantage of the growing depth of information concerning molecular mechanisms in biology. The central theme of this book is that the ways in which proteins, DNA, and RNA work together in a cell are connected intimately to the structures of these biological macromolecules. These structures, in turn, depend on interactions between the atoms in these molecules, and on the interplay between energy and entropy, which results in the remarkable ability of biological systems to self-assemble and control their own replication.

This book is not intended to be a comprehensive reference, nor does it contain the most recent biological breakthroughs and discoveries. Our goal in this textbook is to integrate fundamental concepts in thermodynamics and kinetics with an introduction to biological mechanism at the level of molecular structure. We have done so by choosing biological examples to illustrate the basic physical and chemical principles that underlie how biological molecules function.

We have written this textbook with an undergraduate audience in mind, particularly those students who have chosen biology or the health sciences as their principal area of study. We assume that students have taken introductory courses in physics and chemistry, and have been introduced to differential calculus at a basic level. We anticipate that the book will also be useful for graduate students in biology who have not taken courses in physical chemistry, or who seek to learn more about structural biology. We also hope that the book will be useful for scientists wishing to refresh their knowledge of the elementary principles of biological structure, thermodynamics, and kinetics.

The development of this textbook has been anchored, over the last few years, by the creation of a one-semester undergraduate course at the University of California at Berkeley, offered jointly by faculty in the departments of Chemistry and of Molecular and Cell Biology. This course has merged the first part of a traditional course in biochemistry with a new way of teaching physical chemistry to biology undergraduates. There are two aspects of this course that are a departure from past practice. The first is the integration of structural biology with physical chemistry, as mentioned earlier. The second aspect, and perhaps the more radical one for a course aimed at biology undergraduates, is to develop the laws of thermodynamics and the concept of free energy through statistical analysis of molecular interactions and behavior rather than through the more abstract concepts underlying heat engines. It is our experience that biology students take to the statistical treatment of energy and entropy more readily because this approach allows us to link thermodynamics and structure in an intuitively obvious way. Our initial hesitation concerning the implementation of this approach reflected a concern that the mathematical preparation of typical biology students might leave them ill-prepared to grapple with the statistical approach to thermodynamics. But, to our satisfaction, we have found that students understand these concepts readily, as witnessed by the growing enrollment in this class each year since its inception at Berkeley. The majority of these students are majors in Molecular and Cell Biology, with another large group of them majoring in Bioengineering.

The organization of our textbook follows how a course could be developed over one semester. We begin by introducing the nature of biological macromolecules and the structures that they form, placing these ideas in the broad context of how evolution proceeds while obeying physical laws. The first chapter provides an overview of DNA, RNA, and proteins and also reviews the processes of replication, transcription, and translation. A more detailed discussion of the structures of biological molecules is provided in Chapters 2 through 5, including a discussion of how evolutionary processes have shaped the architecture of proteins. Chapters 6, 7, and 8 provide a quantitative treatment of energy and the statistical basis for the concept of entropy, culminating in the development of the Boltzmann distribution and the idea that the energies of different molecular configurations determine the probabilities of observing them. The concept of free energy is introduced in Chapter 9 and, along with chemical potential, is developed further in Chapter 10, which applies these ideas to acid–base equilibria and to protein folding. Chapter 11 takes the concept of chemical potential one step further, by linking it to voltages through applications in redox chemistry and an analysis of how action potentials are transmitted in nerve cells.

Chapters 12 to 14 are concerned with the principles of molecular recognition, developing the ideas of affinity and specificity, with applications to drug interactions, protein–DNA, protein–RNA and protein–protein interactions, followed by a treatment of allosteric systems. Chapters 15 to 17 introduce kinetic concepts, including an analysis of enzyme mechanisms and transport properties (the material in these chapters could be presented in a course before Chapters 12 to 14 are covered). Finally, Chapters 18 and 19 bring together all of the ideas introduced in the earlier chapters by discussing two particularly interesting aspects of the self-assembly of biological systems: the folding of proteins and RNA, and the fidelity of replication and translation.

We have organized the book in a modular fashion, with each chapter broken into separate parts, some of which could be omitted according to instructor preference. While Chapters 6 to 19 assume that the student is familiar with the structural principles introduced in Chapters 1 to 5, an instructor could begin with Chapter 6, provided that the students have been introduced to proteins, DNA, and RNA in an earlier course (we believe that the earlier chapters could then serve as an excellent refresher). Each chapter has an associated set of problems. As anyone who has taken physical chemistry knows, working through problems is an important aspect of learning the material, and we hope that the problems at the end of each chapter can serve as a nucleus for generating assignments for the students to work through on their own.

There are two topics that might belong in an undergraduate biophysical chemistry course that we have purposely omitted. One is quantum mechanics, and the other concerns methods of instrumental analysis and structure determination in biochemistry. At Berkeley, students are introduced to these topics in a separate course that follows the one based on our book.

ONLINE RESOURCES

Accessible from www.garlandscience.com/TMOL, the Student and Instructor Resource Websites provide learning and teaching tools created for The Molecules of Life. The Student Resource Site is open to everyone, and users have the option to register in order to use book-marking and note-taking tools. The Instructor Resource Site requires registration and access is available to instructors who have assigned the book to their course. To access the Instructor Resource Site, please contact your local sales representative or email [email protected]. Below is an overview of the resources available for this book. On the Website, the resources may be browsed by individual chapters and there is a search engine. You can also access the resources available for other Garland Science titles.

FOR STUDENTS

Animations and Videos
The animations and videos dynamically illustrate important concepts from the book, and make many of the more difficult topics accessible.

Flashcards
Each chapter contains a set of flashcards, built into the Website, that allow students to review key terms from the text.

Glossary
The complete glossary is available on the Website and can be searched and browsed as a whole or sorted by chapter.

FOR INSTRUCTORS

Figures
The images from the book are available in two convenient formats: PowerPoint® and JPEG. Figures are searchable by figure number, figure name, or by keywords used in the figure legend from the book. There is one PowerPoint presentation for each chapter.

Animations and Videos
The animations and videos that are available to students are also available on the Instructor’s Website in two formats. The WMV formatted movies are created for instructors who wish to use the movies in PowerPoint presentations on Windows® computers; the QuickTime formatted movies are for use in PowerPoint for Apple® computers or Keynote® presentations. The movies can easily be downloaded to your computer by using the “download” button on the movie preview page.

Solutions Manual
A complete solutions manual is provided for all problems in the text.

Acknowledgments

This book could not have been developed without essential input from the following people in particular: Stephen K. Burley (with whom John Kuriyan developed the inaugural set of HHMI lectures entitled “Da Vinci and Darwin in the Molecules of Life”) and the late Carl Brändén, both of whom were instrumental in moving very early stages of this project forward; Lore Leighton, who worked in John Kuriyan’s lab and developed the illustrations from the earliest stages of writing this book; Tiago Barros, who helped with figure work and rendered the cover ribosome; James Fraser, who developed the problem sets; Samuel Leachman, who checked the solutions manual; Rachelle Gaudet, who developed a similar course at Harvard University based on early drafts of this book and provided valuable feedback; Krzysztof Kuczera, at the University of Kansas, who carefully read and checked all the chapters; Tom Alber, Jamie Cate, and Bryan Krantz (who also teach the Berkeley undergraduate course); Susan Marqusee (who uses parts of this book for a graduate course at Berkeley); Ken Dill (whose masterly introduction to statistical mechanics in a graduate course at the University of California, San Francisco motivated our own simplified treatment of this material); and a large group of undergraduates at Berkeley who provided constant feedback as the book metamorphosed from a collection of notes into its present form. We hope this book will help many other students to come. Sherry Granum Lewis and John Murdzek provided helpful editorial suggestions. We thank the students who participated in focus groups at Berkeley: Bob Bellerose, Aron Kamajaya, Kotaro Kelly, Melinda Mathur, and Jayasree Sundaram; and at Harvard: Meng Xiao He, Koning Shen, Helen Yang, and Angela Zhang.

The following people also provided valuable commentary as readers, reviewers, class testers, and advisors during the development of the project:

Jochen Autschbach (State University of New York, Buffalo); Philip Bevilacqua (Pennsylvania State University); Phil Biggin (University of Oxford); Mark Braiman (Syracuse University); Charles Brenner (Dartmouth College); Angus Cameron (University of Bristol); Wei-Jen Chang (Hamilton College); Yun-Wei Chiang (National Tsing Hua University); King-Lau Chow (Hong Kong University of Science & Technology); Mads Hartvig Clausen (Technical University of Denmark); James Cole (University of Connecticut); EJ Crane (Pomona College); Ivan Dmochowski (University of Pennsylvania); Martha Fedor (Scripps Research Institute); Ruben Gonzalez (Columbia University); Stephen Harrison (Harvard Medical School); Lars Bo Stegeager Hemmingsen (University of Copenhagen); ChulHee Kang (Washington State University); Katherine Kantardjieff (California State University, Fullerton); Roderick MacKinnon (Rockefeller University); Jeffry Madura (Duquesne University); Dmitrii Makarov (University of Texas, Austin); MK Mathew (National Centre for Biological Sciences, Bangalore); Kimberly Matulef (University of San Diego); Kevin Mayo (University of Minnesota); Ann McDermott (Columbia University); Megan McEvoy (University of Arizona); Stephanie Mel (University of California, San Diego); Daniel Moriarty (Siena College); Donald Nelson (deceased); Hung Kui Ngai (The Chinese University of Hong Kong); Timothy Nilsen (Case Western Reserve University); Patricia Pellicena (Catalyst Biosciences); Jack Preiss (Michigan State University); Margot Quinlan (University of California, Los Angeles); Venkataraman Ramakrishnan (MRC Laboratory of Molecular Biology, Cambridge); Ruth Reed (Juniata College); David Rueda (Wayne State University); Gordon Rule (Carnegie Mellon University); Paul Schettler (Juniata College); Kevin Schug (University of Texas, Arlington); Lawrence Shapiro (Columbia University); Kunchithapadam Swaminathan (National University of Singapore); Martha Teeter (Peace Films); Greg Tucker (University of Nottingham); Hiroshi Ueno (Nara Women’s University); Didem Vardar-Ulu (Wellesley College); Kam Bo Wong (The Chinese University of Hong Kong); Sarah Woodson (Johns Hopkins University); Michael Yaffe (Massachusetts Institute of Technology).

JK: I am deeply grateful to my wife, Devaki Chandra, and my mother, Anna Kuriyan, who made it possible for me to write this book by giving me the supported mental space in which to work. I thank Ruth Reed and Paul Schettler, my teachers at Juniata College, and Greg Petsko and Martin Karplus, my graduate school advisors, for introducing me to the connection between biochemistry and statistical thermodynamics. Miranda Robertson’s guiding hand was instrumental in allowing me to find my own voice. Denise Schanck and Summers Scholl at Garland displayed the patience of saints, keeping this project alive over many years.

BK: I would like to thank my family for their patience and understanding. For my youngest daughter Niki the book has been a part of her life for as long as she can remember. My oldest daughter Sophie has viewed my working on the book with a mixture of pride and incomprehension as she has veered as far away from the biological sciences as possible in her academic pursuits. And my husband Richard has had to put up with a lot, in particular many prolonged absences at book retreats and when I have holed myself up for days at a time struggling to meet a deadline. Now that the textbook is done, if it reflects even a small amount of the time and effort that went into it, then we will have accomplished something to be proud of.

DEW: First I need to thank John and Boyana for inviting me to participate in the writing of this book. If I had known the full scope of what was to be done I might have hesitated, but now see it as having been an adventure of a new kind and feel great satisfaction in seeing it completed. I also need to thank my family and lab members for their understanding in times when work on the book had priority. Help and encouragement from the Garland editors was invaluable in getting this done, as were many other kindnesses such as my sister-in-law Teresa’s loan of the beach house for writing retreats.

Contents

How Do We Understand Life?

PART I: BIOLOGICAL MOLECULES
Chapter 1 From Genes to RNA and Proteins
Chapter 2 Nucleic Acid Structure
Chapter 3 Glycans and Lipids
Chapter 4 Protein Structure
Chapter 5 Evolutionary Variation in Proteins

PART II: ENERGY AND ENTROPY
Chapter 6 Energy and Intermolecular Forces
Chapter 7 Entropy
Chapter 8 Linking Energy and Entropy: The Boltzmann Distribution

PART III: FREE ENERGY
Chapter 9 Free Energy
Chapter 10 Chemical Potential and the Drive to Equilibrium
Chapter 11 Voltages and Free Energy

PART IV: MOLECULAR INTERACTIONS
Chapter 12 Molecular Recognition: The Thermodynamics of Binding
Chapter 13 Specificity of Macromolecular Recognition
Chapter 14 Allostery

PART V: KINETICS AND CATALYSIS
Chapter 15 The Rates of Molecular Processes
Chapter 16 Principles of Enzyme Catalysis
Chapter 17 Diffusion and Transport

PART VI: ASSEMBLY AND ACTIVITY
Chapter 18 Folding
Chapter 19 Fidelity in DNA and Protein Synthesis

Glossary
Index

Detailed Contents

How Do We Understand Life?

PART I: BIOLOGICAL MOLECULES

Chapter 1 From Genes to RNA and Proteins
A. INTERACTIONS BETWEEN MOLECULES
1.1 The energy of interaction between two molecules is determined by noncovalent interactions
1.2 Neutral atoms attract and repel each other at close distances through van der Waals interactions
1.3 Ionic interactions between charged atoms can be very strong, but are attenuated by water
1.4 Hydrogen bonds are very common in biological macromolecules
B. INTRODUCTION TO NUCLEIC ACIDS AND PROTEINS
1.5 Nucleotides have pentose sugars attached to nitrogenous bases and phosphate groups
1.6 The nucleotide bases in RNA and DNA are substituted pyrimidines or purines
1.7 DNA and RNA are formed by sequential reactions that utilize nucleotide triphosphates
1.8 DNA forms a double helix with antiparallel strands
1.9 The double helix is stabilized by the stacking of base pairs
1.10 Proteins are polymers of amino acids
1.11 Proteins are formed by connecting amino acids by peptide bonds
1.12 Amino acids are classified based on the properties of their sidechains
1.13 Proteins appear irregular in shape
1.14 Protein chains fold up to form hydrophobic cores
1.15 α helices and β sheets are the architectural elements of protein structure
C. REPLICATION, TRANSCRIPTION, AND TRANSLATION
1.16 DNA replication is a complex process involving many protein machines
1.17 Transcription generates RNAs whose sequences are dictated by the sequence of nucleotides in genes
1.18 Splicing of RNA in eukaryotic cells can generate a diversity of RNAs from a single gene
1.19 The genetic code relates triplets of nucleotides in a gene sequence to each amino acid in a protein sequence
1.20 Transfer RNAs work with the ribosome to translate mRNA sequences into proteins
1.21 The mechanism for the transfer of genetic information is highly conserved
1.22 The discovery of retroviruses showed that information stored in RNA can be transferred to DNA
Summary
Key Concepts
Problems
Further Reading

Chapter 2 Nucleic Acid Structure
A. DOUBLE-HELICAL STRUCTURES OF RNA AND DNA
2.1 The double helix is the principal secondary structure of DNA and RNA
2.2 Hydrogen bonding between bases is important for the formation of double helices, but its effect is weakened due to interactions with water
2.3 The electronic polarization of the bases contributes to strong stacking interactions between bases
2.4 Metal ions help shield electrostatic repulsions between the phosphate groups
2.5 There are two common relative orientations of the base and the sugar
2.6 The ribose ring has alternate conformations defined by the sugar pucker
2.7 RNA cannot adopt the standard Watson-Crick double-helical structure because of constraints on its sugar pucker
2.8 The standard Watson-Crick model of double-helical DNA is the B-form
2.9 B-form DNA allows sequence-specific recognition of the major groove, which has a greater information content than the minor groove
2.10 RNA adopts the A-form double-helical conformation
2.11 The major groove of A-form double helices is less accessible to proteins than that of B-form DNA

2.12 Z-form DNA is a left-handed double-helical structure
2.13 The DNA double helix is quite deformable
2.14 DNA supercoiling can occur when the ends of double helices are constrained
2.15 Writhe, linking number, and twist are mathematical parameters that describe the supercoiling of DNA
2.16 The writhe, twist, and linking number are related to each other in a simple way
2.17 The DNA in cells is supercoiled
2.18 Local conformational changes in the DNA also affect supercoiling
B. THE FUNCTIONAL VERSATILITY OF RNA
2.19 Wobble base pairs are often seen in RNA
2.20 Nonstandard base-pairing is common in RNA
2.21 Some RNA molecules contain modified nucleotides
2.22 A tetraloop is a common secondary structural motif that caps RNA hairpins
2.23 Interactions with metal ions help RNAs to fold
2.24 RNA tertiary structure involves interactions between secondary structural elements
2.25 Helices in RNA often interact through coaxial base stacking or the formation of pseudoknots
2.26 Various interactions between nucleotides stabilize RNA tertiary structure
Summary
Key Concepts
Problems
Further Reading

Chapter 3 Glycans and Lipids
A. GLYCANS
3.1 Simple sugars are comprised primarily of hydroxylated carbons
3.2 Many cyclic sugar molecules can exist in alternative anomeric forms
3.3 Sugar rings often have many low energy conformations
3.4 Many sugars are structural isomers of identical composition, but with different stereochemistry
3.5 Some sugars have other chemical functionalities in addition to alcohol groups
3.6 Glycans form polymeric structures that can have branched linkages
3.7 Differences in anomeric linkages lead to dramatic differences in polymeric forms of glucose
3.8 Acetylation or other chemical modification leads to diversity in sugar polymer properties
3.9 Glycans may be attached to proteins or lipids
3.10 The decoration of proteins with glycans is not templated
3.11 Glycan modifications alter the properties of proteins
3.12 Protein–glycan interactions are important in cellular recognition
B. LIPIDS AND MEMBRANES
3.13 The most abundant lipids are glycerophospholipids
3.14 Other classes of lipids have different molecular frameworks
3.15 Lipids form organized structures spontaneously
3.16 The shapes of lipid molecules affect the structures they form
3.17 Detergents are amphiphilic molecules that tend to form micelles rather than bilayers
3.18 Lipids in bilayers move freely in two dimensions
3.19 Lipid composition affects the physical properties of membranes
3.20 Proteins can be associated with membranes by attachment to lipid anchors
3.21 Lipid molecules can be sequestered and transported by proteins
3.22 Different kinds of cells and organelles have different membrane compositions
3.23 Cell walls are reinforced membranes
Summary
Key Concepts
Problems
Further Reading

Chapter 4 Protein Structure
A. GENERAL PRINCIPLES
4.1 Protein structures display a hierarchical organization
4.2 Protein domains are the fundamental units of tertiary structure
4.3 Protein folding is driven by the formation of a hydrophobic core
4.4 The formation of α helices and β sheets satisfies the hydrogen-bonding requirements of the protein backbone
B. BACKBONE CONFORMATION
4.5 Protein folding involves conformational changes in the peptide backbone
4.6 Amino acids are chiral and only the L form stereoisomer is found in genetically encoded proteins
4.7 The peptide bond has partial double bond character, so rotations about it are hindered
4.8 Peptide groups can be in cis or trans conformations
4.9 The backbone torsion angles ϕ (phi) and ψ (psi) determine the conformation of the protein chain
4.10 The Ramachandran diagram defines the restrictions on backbone conformation
4.11 α helices and β strands are formed when consecutive residues adopt similar values of ϕ and ψ
4.12 Loop segments have residues with very different values of ϕ and ψ
4.13 α helices and β strands are often amphipathic
4.14 Some amino acids are preferred over others in α helices

C. STRUCTURAL MOTIFS AND DOMAINS IN SOLUBLE PROTEINS
4.15 Secondary structure elements are connected to form simple motifs
4.16 Amphipathic α helices can form dimeric structures called coiled coils
4.17 Hydrophobic sidechains in coiled coils are repeated in a heptad pattern
4.18 α helices that are integrated into complex protein structures do not usually form coiled coils
4.19 The sidechains of α helices form ridges and grooves
4.20 α helices pack against each other with a limited set of crossing angles
4.21 Structures with alternating α helices and β strands are very common
4.22 α/β barrels occur in many different enzymes
4.23 α/β open-sheet structures contain α helices on both sides of the β sheet
4.24 Proteins with antiparallel β sheets often form structures called β barrels
4.25 Up-and-down β barrels have a simple topology
4.26 Up-and-down β sheets can form repetitive structures
4.27 Greek key motifs occur frequently in antiparallel β structures
4.28 Certain structural motifs can be repeated almost endlessly to form elongated structures
4.29 Catalytic sites are usually located within core elements of protein folds
4.30 Binding sites are often located at the interfaces between domains
D. STRUCTURAL PRINCIPLES OF MEMBRANE PROTEINS
4.31 Lipid bilayers form barriers that are nearly impermeable to polar molecules
4.32 Membrane proteins have distinct regions that interact with the lipid bilayer
4.33 The hydrophobicity of the lipid bilayer requires the formation of regular secondary structure within the membrane
4.34 The more polar sidechains are rarely found within membrane-spanning α helices, except when they are required for specific functions
4.35 Transmembrane α helices can be predicted from amino acid sequences
4.36 Hydrophobicity scales are used to identify transmembrane helices
4.37 Integral membrane proteins are stabilized by van der Waals contacts and hydrogen bonds
4.38 Porins contain β barrels that form transmembrane channels
4.39 Pumps and transporters use energy to move molecules across the membrane
4.40 Bacteriorhodopsin uses light energy to pump protons across the membrane
4.41 A hydrogen-bonded chain of water molecules can serve as a proton conducting “wire”
4.42 Conformational changes in retinal impose directionality to proton flow in bacteriorhodopsin
4.43 Active transporters cycle between conformations that are open to the interior or the exterior of the cell
4.44 ATP binding and hydrolysis provides the driving force for the transport of sugars into the cell
Summary
Key Concepts
Problems
Further Reading

Chapter 5 Evolutionary Variation in Proteins
A. THE THERMODYNAMIC HYPOTHESIS
5.1 The structure of a protein is determined by its sequence
5.2 The thermodynamic hypothesis was first established for an enzyme known as ribonuclease-A, which can be unfolded and folded reversibly
5.3 By counting the number of possible rearrangements of disulfide bonds, we can confirm that ribonuclease-A is completely unfolded by urea and reducing agents
B. SEQUENCE COMPARISONS AND THE BLOSUM MATRIX
5.4 Protein structure is conserved during evolution while amino acid sequences vary
5.5 The globin fold is preserved in proteins that share very little sequence similarity
5.6 Similarities in protein sequences can be quantified by considering the frequencies with which amino acids are substituted for each other in related proteins
5.7 The BLOSUM matrix is a commonly used set of amino acid substitution scores
5.8 The first step in deriving substitution scores is to determine the frequencies of amino acid substitutions and correct for amino acid abundances
5.9 The substitution score is defined in terms of the logarithm of the substitution likelihood
5.10 The BLOSUM substitution scores reflect the chemical properties of the amino acids
5.11 Substitution scores are used to align sequences and to detect similarities between proteins
C. STRUCTURAL VARIATION IN PROTEINS
5.12 Small but significant differences in protein structures arise from differences in sequences
5.13 Proteins retain a common structural core as their sequences diverge
5.14 Structural overlap within the common core decreases as the sequences of proteins diverge
5.15 Sequence comparisons alone are insufficient to establish structural similarity between distantly related proteins

5.16 The amino acids have preferences for different environments in folded proteins 213
5.17 Fold-recognition algorithms evaluate the probability that the sequence of a protein is compatible with a known three-dimensional structure 214
5.18 The 3D-1D profile method maps three-dimensional structural information onto a one-dimensional set of environmental descriptors 216
5.19 The database of known protein structures is used to generate a scoring matrix that gives the likelihood of finding each amino acid in a particular environmental class 217
5.20 The 3D-1D profile method matches sequences with structures 218
D. THE EVOLUTION OF MODULAR DOMAINS 220
5.21 Domains are the fundamental unit of protein evolution 220
5.22 Domains can be organized into families with similar folds 220
5.23 The number of distinct fold families is likely to be limited 224
5.24 Protein domains are remarkably tolerant of changes in amino acid sequence, even in the hydrophobic core 225
5.25 Structural plasticity in protein domains increases the tolerance to mutation 227
5.26 The Rossmann fold is found in many nucleotide binding proteins 228
5.27 Thioredoxin reductase and glutathione reductase are enzymes that diverged from a common ancestor, but their active sites arose through convergent evolution 230
Summary 232
Key Concepts 234
Problems 235
Further Reading 237

PART II: ENERGY AND ENTROPY 238

Chapter 6 Energy and Intermolecular Forces 239
A. THERMODYNAMICS OF HEAT TRANSFER 240
6.1 In order to keep track of changes in energy, we define the region of interest as the “system” 240
6.2 Energy released by chemical reactions is converted to heat and work 242
6.3 The first law of thermodynamics states that energy is conserved 243
6.4 For a process occurring under constant pressure conditions, the heat transferred is equal to the change in the enthalpy of the system 246
6.5 Changes in energy do not always indicate the direction of spontaneous change 250
6.6 The isothermal expansion of an ideal gas occurs spontaneously even though the energy of the gas does not change 251
B. HEAT CAPACITIES AND THE BOLTZMANN DISTRIBUTION 253
6.7 The heat capacity of an ideal monatomic gas is constant 253
6.8 The heat capacity of a macromolecular solution increases and then decreases with temperature 257
6.9 The potential energy of a molecular system is the energy stored in molecules and their interactions 259
6.10 The Boltzmann distribution describes the population of molecules in different energy levels 261
6.11 The energy required to break interatomic interactions in folded macromolecules gives rise to the peak in heat capacity 264
C. ENERGETICS OF INTERMOLECULAR INTERACTIONS 265
6.12 Simplified energy functions are used to calculate molecular potential energies 265
6.13 Empirical potential energy functions enable rapid calculation of molecular energies 266
6.14 The energies of covalent bonds are approximated by functions such as the Morse potential 267
6.15 Other terms in the energy function describe torsion angles and the deformations in the angles between covalent bonds 270
6.16 The van der Waals energy term describes weak attractions and strong repulsions between atoms 272
6.17 Atoms in proteins and nucleic acids are partially charged 274
6.18 Electrostatic interactions are governed by Coulomb’s law 275
6.19 Hydrogen bonds are an important class of electrostatic interactions 277
6.20 Empirical energy functions are used in computer programs to calculate molecular energies 279
6.21 Interactions with water weaken the effective strengths of hydrogen bonds in proteins 281
6.22 The presence of hydrogen-bonding groups in a protein is important for solubility and specificity 282
6.23 The water surrounding protein molecules strongly influences electrostatic interactions 283
6.24 The shapes of proteins change the electrostatic fields generated by charges within the protein 285
Summary 287
Key Concepts 288
Problems 289
Further Reading 292

Chapter 7 Entropy 293
A. COUNTING STATISTICS AND MULTIPLICITY 294
7.1 Different sequences of outcomes in a series of coin tosses have equal probabilities 294
7.2 When considering aggregate outcomes, the most likely result is the one that has maximum multiplicity 295
7.3 The multiplicity of an outcome of coin tosses can be calculated using a simple formula involving factorials 297

7.4 The concept of multiplicity is broadly applicable in biology because a series of coin flips is analogous to a collection of molecules in alternative states 300
7.5 The binding of ligands to a receptor can be monitored by fluorescence microscopy 301
7.6 Pascal’s triangle describes the multiplicity of outcomes for a series of binary events 302
7.7 The binomial distribution governs the probability of events with binary outcomes 304
7.8 When the number of events is large, Stirling’s approximation simplifies the calculation of the multiplicity 306
7.9 The relative probability of two outcomes is given by the ratios of their multiplicities 307
7.10 As the number of events increases, the less likely outcomes become increasingly rare 308
7.11 For coin tosses, outcomes with equal numbers of heads and tails have maximal multiplicity 310
7.12 When the number of events is very large, the probability distribution is well approximated by a Gaussian distribution 311
7.13 The Gaussian distribution is centered at the mean value and has a width that is proportional to the standard deviation 312
7.14 Application of the Gaussian distribution enables statistical analysis of a series of binary outcomes 315
B. ENTROPY 317
7.15 The logarithm of the multiplicity (ln W) is related to the entropy 317
7.16 The multiplicity of a molecular system is the number of equivalent configurations of the molecules (microstates) 318
7.17 The multiplicity of a system increases as the volume increases 319
7.18 For a large number of atoms, the state with maximal multiplicity is the state that is observed at equilibrium 322
7.19 The Boltzmann constant, kB, is a proportionality constant linking entropy to the logarithm of the multiplicity (ln W) 325
7.20 The change in entropy is related to the heat transferred during a process 326
7.21 The work done in a near-equilibrium process is greater than for a nonequilibrium process 327
7.22 The work done in a near-equilibrium process is related to the change in entropy 329
7.23 The statistical and thermodynamic definitions of entropy are equivalent 330
7.24 The second law of thermodynamics states that spontaneous change occurs in the direction of increasing entropy 331
7.25 Diffusion across a semipermeable membrane can lead to unequal numbers of molecules on the two sides of the membrane 332
Summary 335
Key Concepts 336
Problems 337
Further Reading 339

Chapter 8 Linking Energy and Entropy: The Boltzmann Distribution 341
A. ENERGY DISTRIBUTIONS AND ENTROPY 341
8.1 The thermodynamic definition of the entropy provides a link to experimental observations 341
8.2 The concept of temperature provides a connection between the statistical and thermodynamic definitions of entropy 343
8.3 Energy distributions describe the populations of molecules with different energies 344
8.4 The multiplicity of an energy distribution is the number of equivalent configurations of molecules that results in the same energy distribution 344
8.5 The multiplicity of a system with different energy levels can be calculated by counting the number of equivalent molecular rearrangements of energy 347
B. THE BOLTZMANN DISTRIBUTION 350
8.6 For large numbers of molecules, a probabilistic expression for the entropy is more convenient 350
8.7 The multiplicity of a system changes when energy is transferred between systems 354
8.8 Systems in thermal contact exchange heat until the combined entropy of the two systems is maximal 356
8.9 Many energy distributions are consistent with the total energy of a system, but some have higher multiplicity than others 359
8.10 The energy distribution at equilibrium must have an exponential form 360
8.11 The partition function indicates the accessibility of the higher energy levels of the system 363
8.12 For large numbers of molecules, non-Boltzmann distributions of the energy are highly unlikely 367
C. ENTROPY AND TEMPERATURE 368
8.13 The rate of change of entropy with respect to energy is related to the temperature 368
8.14 The statistical and thermodynamic definitions of the entropy are equivalent 375
Summary 377
Key Concepts 378
Problems 379
Further Reading 381

PART III: FREE ENERGY 382

Chapter 9 Free Energy 383
A. FREE ENERGY 384
9.1 The combined entropy of the system and the surroundings increases for a spontaneous process 384
9.2 The change in entropy of the surroundings is related to the change in energy and volume of the system 386
9.3 The Gibbs free energy (G) of the system always decreases in a spontaneous process occurring at constant pressure and temperature 387

9.4 The Helmholtz free energy (A) determines the direction of spontaneous change when the volume is constant 389
B. STANDARD FREE-ENERGY CHANGES 390
9.5 Standard free-energy changes are defined with reference to defined standard states 390
9.6 The zero point of the free-energy scale is set by the free energy of the elements in their most stable forms 391
9.7 Thermodynamic cycles allow the determination of the free energies of formation of complex molecules from simpler ones 392
9.8 The free energy of formation of glucose is obtained by considering three combustion reactions 394
9.9 Enthalpies and entropies of formation can be combined to give the free energy of formation 395
9.10 Calorimetric measurements yield the standard enthalpy changes associated with combustion reactions 396
9.11 The entropy of formation of a compound is derived from heat capacity measurements 396
C. FREE ENERGY AND WORK 398
9.12 Expansion work is not the only kind of work that can be done by a system 398
9.13 Chemical work involves changes in the numbers of molecules 400
9.14 The decrease in the Gibbs free energy for a process is the maximum amount of non-expansion work that the system is capable of doing under constant pressure and temperature 400
9.15 The coupling of ATP hydrolysis to work underlies many processes in biology 402
9.16 The synthesis of ATP is coupled to the movement of ions across the membrane, down a concentration gradient 405
Summary 408
Key Concepts 409
Problems 409
Further Reading 411

Chapter 10 Chemical Potential and the Drive to Equilibrium 413
A. CHEMICAL POTENTIAL 413
10.1 The chemical potential of a molecular species is the molar free energy of that species 414
10.2 Molecules move spontaneously from regions of high chemical potential to regions of low chemical potential 414
10.3 Biochemical reactions are assumed to occur in ideal and dilute solutions, which simplifies the calculation of the chemical potential 416
10.4 The chemical potential is proportional to the logarithm of the concentration 417
10.5 Chemical potentials at arbitrary concentrations are calculated with reference to standard concentrations 421
B. EQUILIBRIUM CONSTANTS 422
10.6 The chemical potentials of the reactants and products are balanced at equilibrium 422
10.7 The concentrations of reactants and products at equilibrium define the equilibrium constant (K), which is related to the standard free energy change (ΔG°) for the reaction 424
10.8 Equilibrium constants can be used to calculate the extent of reaction at equilibrium 425
10.9 The free-energy change for the reaction (ΔG), not the standard free-energy change (ΔG°), determines the direction of spontaneous change 426
10.10 The ratio of the reaction quotient (Q) to the equilibrium constant (K) determines the thermodynamic drive of a reaction 427
10.11 ATP concentrations are maintained at high levels in cells, thereby increasing the driving force for ATP hydrolysis 427
C. ACID–BASE EQUILIBRIA 428
10.12 The Henderson–Hasselbalch equation relates the pH of a solution of a weak acid to the concentrations of the acid and its conjugate base 429
10.13 The proton concentration ([H+]) in pure water at room temperature corresponds to a pH value of 7.0 430
10.14 The temperature dependence of the equilibrium constant allows us to determine the values of ΔH° and ΔS° 431
10.15 Weak acids, such as acetic acid, dissociate very little in water 432
10.16 Solutions of weak acids and their conjugate bases act as buffers 433
10.17 The charges on biological macromolecules are affected by the pH 435
10.18 The charge on an amino acid sidechain can be altered by interactions in the folded protein 436
D. FREE-ENERGY CHANGES IN PROTEIN FOLDING 438
10.19 The protein folding reaction is simplified by ignoring intermediate conformations 438
10.20 Protein folding results from a balance between energy and entropy 439
10.21 The entropy of the unfolded protein chain is proportional to the logarithm of the number of conformations of the chain 440
10.22 The number of conformations of the unfolded chain can be estimated by counting the number of low-energy torsional isomers 442
10.23 The free-energy change opposes protein folding if the entropy of water molecules is not considered 443
10.24 Protein folding is driven by an increase in water entropy 444
10.25 Calorimetric measurements allow the experimental determination of the free energy of protein folding 446
10.26 The heat capacity of a protein solution depends on the relative population of folded and unfolded molecules, and on the energy required to unfold the protein 446

10.27 The area under the peak in the melting curve is the enthalpy change for unfolding at the melting temperature 448
10.28 The heat capacities of the folded and unfolded protein allow the determination of ΔH° and ΔS° for unfolding at any temperature 449
10.29 Folded proteins become unstable at very low temperature because of changes in ΔH° and ΔS° 452
Summary 453
Key Concepts 455
Problems 456
Further Reading 457

Chapter 11 Voltages and Free Energy 459
A. OXIDATION–REDUCTION REACTIONS IN BIOLOGY 459
11.1 Reactions involving the transfer of electrons are referred to as oxidation–reduction reactions 459
11.2 Biologically important redox-active metals are bound to proteins 460
11.3 Nicotinamide adenine dinucleotide (NAD+) is an important mediator of redox reactions in biology 460
11.4 Flavins and quinones can undergo oxidation or reduction in two steps of one electron each 461
11.5 The oxidation of glucose is coupled to the generation of NADH and FADH2 463
11.6 Mitochondria are cellular compartments in which NADH and FADH2 are used to generate ATP 465
11.7 Absorption of light creates molecules with high reducing power in photosynthesis 467
B. REDUCTION POTENTIALS AND FREE ENERGY 469
11.8 Electrochemical cells can be constructed by linking two redox couples 470
11.9 The voltage generated by an electrochemical cell with the reactants at standard conditions is known as the standard cell potential 473
11.10 The electric potential difference (voltage) between two points is the work done in moving a unit charge between the two points 474
11.11 Standard reduction potentials are related to the standard free-energy change of the redox reaction underlying the electrochemical cell 475
11.12 Electrode potentials are measured relative to a standard hydrogen electrode 477
11.13 Tabulated values of standard electrode potentials allow ready calculation of the standard potential of an electrochemical cell 478
11.14 The Nernst equation describes how the potential changes with the concentrations of the redox reactants 480
11.15 The standard state for reduction potentials in biochemistry is pH 7 480
C. ION PUMPS AND CHANNELS IN NEURONS 481
11.16 Neuronal cells use electrical signals to transmit information 482
11.17 An electrical potential difference across the membrane is essential for the functioning of all cells 484
11.18 The sodium–potassium pump hydrolyzes ATP to move Na+ ions out of the cell with the coupled movement of K+ ions into the cell 486
11.19 Sodium and potassium channels allow ions to move quickly across the membrane 487
11.20 Sodium and potassium channels contain a conserved tetrameric pore domain 489
11.21 A large vestibule within the channel reduces the distance over which ions have to move without associated water molecules 490
11.22 Carbonyl groups in the selectivity filter provide specificity for K+ ions by substituting for the inner-sphere waters 491
11.23 Rapid transit of K+ ions through the channel is facilitated by hopping between isoenergetic binding sites 492
D. THE TRANSMISSION OF ACTION POTENTIALS IN NEURONS 493
11.24 The asymmetric distribution of ions across the cell membrane generates an equilibrium membrane potential 493
11.25 The Nernst equation relates the equilibrium membrane potential to the concentrations of ions inside and outside the cell 494
11.26 Cell membranes act as electrical capacitors 496
11.27 The depolarization of the membrane is a key step in initiating a neuronal signal 498
11.28 Membrane potentials are altered by the movement of relatively few ions, enabling rapid axonal transmission 499
11.29 The propagation of voltage changes can be understood by treating the axon as an electrical circuit 500
11.30 The propagation of changes in membrane potential in the axon is described by the cable equation 501
11.31 The resting membrane potential is determined by a combination of the basal conductances of potassium and sodium channels 505
11.32 The propagation of a voltage spike without triggering voltage-gated ion channels is known as passive spread 506
11.33 If membrane currents are neglected, then the cable equation is analogous to a diffusion equation 507
11.34 Leakage through open ion channels limits the spread of a voltage perturbation 509
11.35 The time taken to develop a membrane potential is determined by the conductance of the membrane and its capacitance 510
11.36 Myelination of mammalian neurons facilitates the transmission of action potentials 513
11.37 Action potentials are regenerated periodically as they travel down the axon 514
11.38 A positively charged sensor in voltage-gated ion channels moves across the membrane upon depolarization 517

11.39 The structures of voltage-gated K+ channels show that the voltage sensors form paddle-like structures that surround the core of the channel 520
11.40 The crystal structure of a voltage-gated K+ channel suggests how the voltage sensor opens and closes the channel 521
Summary 524
Key Concepts 525
Problems 526
Further Reading 527

PART IV: MOLECULAR INTERACTIONS 530

Chapter 12 Molecular Recognition: The Thermodynamics of Binding 531
A. THERMODYNAMICS OF MOLECULAR INTERACTIONS 531
12.1 The affinity of a protein for a ligand is characterized by the dissociation constant, KD 533
12.2 The value of KD corresponds to the concentration of free ligand at which the protein is half saturated 535
12.3 The dissociation constant is a dimensionless number, but is commonly referred to in concentration units 537
12.4 Dissociation constants are determined experimentally using binding assays 537
12.5 Binding isotherms plotted with logarithmic axes are commonly used to determine the dissociation constant 540
12.6 When the ligand is in great excess over the protein, the free ligand concentration, [L], is essentially equal to the total ligand concentration 542
12.7 Scatchard analysis makes it possible to estimate the value of KD when the concentration of the receptor is unknown 543
12.8 Scatchard analysis can be applied to unpurified proteins 544
12.9 Saturable binding is a hallmark of specific binding interactions 546
12.10 The value of the dissociation constant, KD, defines the ligand concentration range over which the protein switches from unbound to bound 546
12.11 The dissociation constant for a physiological ligand is usually close to the natural concentration of the ligand 548
B. DRUG BINDING BY PROTEINS 549
12.12 Most drugs are developed by optimizing the inhibition of protein targets 549
12.13 Signaling molecules are protein targets in cancer drug development 549
12.14 Most small molecule drugs work by displacing a natural ligand for a protein 552
12.15 The binding of drugs to their target proteins often results in conformational changes in the protein 556
12.16 Induced-fit binding occurs through selection by the ligand of one among many preexisting conformations of the protein 557
12.17 Conformational changes in the protein underlie the specificity of a cancer drug known as imatinib 559
12.18 Conformational changes in the target protein can weaken the affinity of an inhibitor 560
12.19 The strength of noncovalent interactions usually correlates with hydrophobic interactions 562
12.20 Cholesterol-lowering drugs known as statins take advantage of hydrophobic interactions to block their target enzyme 563
12.21 The apparent affinity of a competitive inhibitor for a protein is reduced by the presence of the natural ligand 566
12.22 Entropy lost by drug molecules upon binding is regained through the hydrophobic effect and the release of protein-bound water molecules 569
12.23 Isothermal titration calorimetry allows us to determine the enthalpic and entropic components of the binding free energy 573
Summary 576
Key Concepts 578
Problems 578
Further Reading 580

Chapter 13 Specificity of Macromolecular Recognition 581
A. AFFINITY AND SPECIFICITY 581
13.1 Both affinity and specificity are important in intermolecular interactions 581
13.2 Proteins often have to choose between several closely related targets 582
13.3 Specificity is defined in terms of ratios of dissociation constants 584
13.4 The specificity of binding depends on the concentration of ligand 585
13.5 Fractional occupancy and specificity are important for activities resulting from binding 587
13.6 Most macromolecular interactions are a compromise between affinity and specificity 587
13.7 Fibroblast growth factors vary considerably in their affinities for receptors 588
13.8 The recognition of DNA by transcription factors involves discrimination between a very large number of off-target binding sites 590
13.9 Lowering the affinity of lac repressor for the operator switches on transcription 591
B. PROTEIN–PROTEIN INTERACTIONS 593
13.10 Protein–protein complexes involve interfaces between two folded domains or between a domain and a peptide segment 593
13.11 SH2 domains are specific for peptides containing phosphotyrosine 595
13.12 Individual SH2 domains cannot discriminate sharply between different phosphotyrosine-containing sequences 596
13.13 Combinations of peptide recognition domains have higher specificity than individual domains 597

13.14 Protein–protein interfaces usually have a small hydrophobic core 599
13.15 A typical protein–protein interface buries about 700 to 800 Å² of surface area on each protein 600
13.16 Water molecules form hydrogen-bonded networks at protein–protein interfaces 601
13.17 The interaction between growth hormone and its receptor is a model for understanding protein–protein interactions 602
13.18 The major growth hormone–receptor interface contains many types of interactions 603
13.19 The interface between growth hormone and its receptor contains hot spots of binding affinity, which dominate the interaction 605
13.20 Residues that do not contribute to binding affinity may be important for specificity 606
13.21 The desolvation of polar groups at interfaces makes a large contribution to the free energy of binding 607
C. RECOGNITION OF NUCLEIC ACIDS BY PROTEINS 610
13.22 Complementarity in both electrostatics and shape is an important aspect of the recognition of double-helical DNA and RNA 610
13.23 Proteins distinguish between DNA and RNA double helices by recognizing differences in the geometry of the grooves 612
13.24 Proteins recognize DNA sequences by both direct contacts and induced conformational changes in DNA 613
13.25 Hydrogen bonding is a key determinant of specificity at DNA–protein interfaces 614
13.26 Water molecules can form specific hydrogen-bond bridges between protein and DNA 615
13.27 Arginine interactions with the minor groove can provide sequence specificity through shape recognition 616
13.28 DNA structural changes induced by binding vary widely 617
13.29 Proteins that bind DNA as dimers do so with higher affinity than if they were monomers 618
13.30 Linked DNA binding modules can increase binding affinity and specificity 619
13.31 Cooperative binding of proteins also enhances specificity 620
13.32 Proteins that recognize single-stranded RNA interact extensively with the bases 623
13.33 Stacking interactions between amino acid sidechains and nucleotide bases are an important aspect of RNA recognition 625
Summary 627
Key Concepts 628
Problems 629
Further Reading 630

Chapter 14 Allostery 633
A. ULTRASENSITIVITY OF MOLECULAR RESPONSES 633
14.1 Molecular outputs that depend on independent binding events switch from on to off over a 100-fold range in input strength 633
14.2 The response of many biological systems is ultrasensitive, with the switch from off to on occurring over a less than 100-fold range in concentration 634
14.3 Cooperativity and allostery are features of many ultrasensitive systems 636
14.4 Bacterial movement towards attractants and away from repellants is governed by signaling proteins that bind to the flagellar motor 638
14.5 The flagellar motor switches to clockwise rotation when the concentration of CheY increases over a narrow range 639
14.6 The response of the flagellar motor to concentrations of CheY is ultrasensitive 640
14.7 The MAP kinase pathway involves the sequential activation of a set of three protein kinases 641
14.8 Phosphorylation controls the activity of protein kinases by allosteric modulation of the structure of the active site 642
14.9 The sequential phosphorylation of the MAP kinases leads to an ultrasensitive signaling switch 643
B. ALLOSTERY IN HEMOGLOBIN 645
14.10 Allosteric proteins exhibit positive or negative cooperativity 645
14.11 The heme group in hemoglobin binds oxygen reversibly 646
14.12 Hemoglobin increases the solubility of oxygen in blood and makes its transport to the tissues more efficient 647
14.13 Hemoglobin undergoes conformational changes as it binds to and releases oxygen 649
14.14 The sigmoid binding isotherm for an allosteric protein arises from switching between low- and high-affinity binding isotherms 649
14.15 The degree of cooperativity between binding sites in an allosteric protein is characterized by the Hill coefficient 650
14.16 The tertiary structure of each hemoglobin subunit changes upon oxygen binding 653
14.17 Changes in the tertiary structure of each subunit are coupled to a change in the quaternary structure of hemoglobin 655
14.18 The hemoglobin tetramer is always in equilibrium between R and T states, and oxygen binding biases the equilibrium 658
14.19 Bisphosphoglycerate (BPG) stabilizes the T-state quaternary structure of hemoglobin 660
14.20 The low pH in venous blood stabilizes the T-state quaternary structure of hemoglobin 661
14.21 Hemoglobins across evolution have acquired distinct allosteric mechanisms for achieving ultrasensitivity 662
14.22 Allosteric mechanisms are likely to evolve by the accretion of random mutations in colocalized proteins 663
Summary 667
Key Concepts 668
Problems 668
Further Reading 670

PART V: KINETICS AND CATALYSIS 672

Chapter 15 The Rates of Molecular Processes 673
A. GENERAL KINETIC PRINCIPLES 675
15.1 The rate of reaction describes how fast concentrations change with time 675
15.2 The rates of intermolecular reactions depend on the concentrations of the reactants 676
15.3 Rate laws define the relationship between the reaction rates and concentrations 676
15.4 The dependence of the rate law on the concentrations of reactants defines the order of the reaction 678
15.5 The integration of rate equations predicts the time dependence of concentrations 679
15.6 Reactants disappear linearly with time for a zero-order reaction 680
15.7 The concentration of reactant decreases exponentially with time for a first-order reaction 680
15.8 The reactants decay more slowly in second-order reactions than in first-order reactions, but the details depend on the particular type of reaction and the conditions 681
15.9 The half-life for a reaction provides a measure of the speed of the reaction 682
15.10 For reactions with intermediate steps, the slowest step determines the overall rate 683
B. REVERSIBLE REACTIONS, STEADY STATES, AND EQUILIBRIUM 688
15.11 The forward and reverse rates must both be considered for a reversible reaction 688
15.12 The on and off rates of ligand binding can be measured by monitoring the approach to equilibrium 689
15.13 Steady-state reactions are important in metabolism 691
15.14 For reactions with alternative products, the relative values of rate constants determine the distribution of products 693
15.15 Measuring fluorescence provides an easy way to monitor kinetics 695
15.16 Fluorescence measurements can be carried out under steady-state conditions 696
15.17 Fluorescence quenchers provide a way to detect whether a fluorophore on a protein is accessible to the solvent 697
15.18 The combination of forward and reverse rate constants is related to the equilibrium constant 699
15.19 Relaxation methods provide a way to obtain rate constants for reversible reactions 700
15.20 Temperature jump experiments can be used to determine the association and dissociation rate constants for dimerization 701
15.21 The rate constants for a cyclic set of reactions are coupled 704
C. FACTORS THAT AFFECT THE RATE CONSTANT 705
15.22 Catalysts accelerate the rates of chemical reactions without being consumed in the process 705
15.23 Rate laws for reactions usually must be determined experimentally 706
15.24 The hydrolysis of sucrose provides an example of how a reaction mechanism is analyzed 707
15.25 The fastest possible reaction rate is determined by the diffusion-limited rate of collision 709
15.26 Most reactions occur more slowly than the diffusion-limited rate 710
15.27 The activation energy is the minimum energy required to convert reactants to products during a collision between molecules 711
15.28 The reaction rate depends exponentially on the activation energy 712
15.29 Transition state theory links kinetics to thermodynamic concepts 715
15.30 Catalysts can work by decreasing the activation energy, by increasing the preexponential factor, or by completely altering the mechanism 716
Summary 717
Key Concepts 718
Problems 718
Further Reading 720

Chapter 16 Principles of Enzyme Catalysis 721
A. MICHAELIS–MENTEN KINETICS 721
16.1 Enzyme-catalyzed reactions can be described as a binding step followed by a catalytic step 723
16.2 The Michaelis–Menten equation describes the kinetics of the simplest enzyme-catalyzed reactions 725
16.3 The value of the Michaelis constant, KM, is related to how much enzyme has substrate bound 726
16.4 Enzymes are characterized by their turnover numbers and their catalytic efficiencies 729
16.5 A “perfect” enzyme is one that catalyzes the chemical step of the reaction as fast as the substrate can get to the enzyme 730
16.6 In some cases the release of the product from the enzyme affects the rate of the reaction 732
16.7 The specificity of enzymes arises from both the rate of the chemical step and the value of KM 733
16.8 Graphical analysis of enzyme kinetic data facilitates the estimation of kinetic parameters 735
B. INHIBITORS AND MORE COMPLEX REACTION SCHEMES 736
16.9 Competitive inhibitors block the active site of the enzyme in a reversible way 736
16.10 A competitive inhibitor does not affect the maximum velocity of the reaction, Vmax, but it increases the Michaelis constant, KM 737
16.11 Reversible noncompetitive inhibitors decrease the maximum velocity, Vmax, without affecting the Michaelis constant, KM 740

16.12 Substrate-dependent noncompetitive inhibitors Further Reading 785 only bind to the enzyme when the substrate is present 741 16.13 Some noncompetitive inhibitors are linked Chapter 17 Diffusion and Transport 787 irreversibly to the enzyme 742 A. RANDOM WALKS 787 16.14 In a ping-pong mechanism the enzyme becomes 17.1 Microscopic motion is well described by modifi ed temporarily during the reaction 744 trajectories called random walks 787 16.15 For a reaction with multiple substrates, the order of binding can be random or sequential 744 17.2 The analysis of bacterial movement is simplifi ed by considering one-dimensional random walks 16.16 Enzymes with multiple binding sites can with uniform step lengths and time intervals 788 display allosteric (cooperative) behavior 746 17.3 The probability distribution for the number of 16.17 Product inhibition is a mechanism for moves in one direction is given by a Gaussian regulating metabolite levels in cells 749 function 789 C. PROTEIN ENZYMES 749 17.4 The probability of moving a certain distance 16.18 Enzymes can accelerate reactions by large in a one-dimensional random walk is also given amounts 750 by a Gaussian function 791 16.19 Transition state stabilization is a major 17.5 The width of the distribution of displacements contributor to rate enhancement by enzymes 751 increases with the square root of time for 16.20 Enzymes can act as acids or bases to random walks 794 enhance reaction rates 754 17.6 Random walks in two dimensions can be 16.21 Proximity effects are important for many analyzed by combining two orthogonal one- reactions 756 dimensional random walks 796 16.22 The serine proteases are a large family of 17.7 A two-dimensional random walk is described enzymes that contain a conserved Ser-His-Asp by two one-dimensional walks, but the effective catalytic triad 758 step size for each is smaller by a factor of √2 798 16.23 Sidechain recognition positions the catalytic 17.8 The assumption of uniform step lengths 
along triad next to the peptide bond that is cleaved 758 each axis means that the random walk occurs 16.24 The specifi cities of serine proteases vary on a grid 798 considerably, but the catalytic triad is 17.9 A three-dimensional random walk is described conserved 760 by three orthogonal one-dimensional walks, 16.25 Peptide cleavage in serine proteases proceeds and the effective step size for each is smaller via a ping-pong mechanism 761 by a factor of √3 801 16.26 Angiotensin-converting enzyme is a zinc- 17.10 The movement of bacteria in the presence containing protease that is an important of attractants or repellents is described by drug target 763 biased random walks 801 16.27 Creatine kinase catalyzes phosphate transfer B. MACROSCOPIC DESCRIPTION OF by stabilizing a planar phosphate intermediate 764 DIFFUSION 802 16.28 Some enzymes work by populating disfavored 17.11 Fick’s fi rst law states that the fl ux of molecules conformations 766 is proportional to the concentration gradient 802 16.29 Antibodies that bind transition state analogs 17.12 Fick’s second law describes the rate of change can have catalytic activity 768 in concentration with time 804 D. 
RNA ENZYMES 769 17.13 Integration of the diffusion equation allows us to 16.30 Small self-cleaving ribozymes and ribonuclease calculate the change in concentration with time 805 proteins catalyze the same reaction 769 17.14 The diffusion constant is related to the mean 16.31 Self-cleaving ribozymes use nucleotide bases square displacement of molecules 807 for catalysis, even though these do not have 17.15 Diffusion constants depend on molecular pKa values well suited for proton transfer 769 properties such as size and shape 809 16.32 Hairpin ribozymes optimize hydrogen bonds 17.16 The diffusion constant is inversely related to to the transition state rather than to the initial the friction factor 810 or fi nal states 771 17.17 Viscosity is a measure of the resistance to fl ow 811 16.33 There are at least two possible mechanisms for bond cleavage by the hairpin ribozyme 773 17.18 Liquids with strong interactions between molecules have high viscosity 812 16.34 The splicing reaction catalyzed by group I introns occurs in two steps 774 17.19 The Stokes–Einstein equation allows us to 16.35 Metal ions facilitate catalysis by group I introns 777 calculate the diffusion coeffi cients of molecules 812 16.36 Substitution of oxygen by sulfur in RNA helps 17.20 The diffusion constants for nonspherical molecules identify metals that participate in catalysis 777 are only slightly different from those calculated from the spherical approximation 814 Summary 780 17.21 Diffusion-limited reaction rate constants can Key Concepts 781 be calculated from the diffusion constants Problems 782 of molecules 815 DETAILED CONTENTS xxi

17.22 One-dimensional searches on DNA increase 18.10 Changes in the sequence of a protein at certain the rate at which transcription factors fi nd positions can affect folding rates substantially 850 their targets 817 18.11 The nature of the transition state can be 17.23 Restricting diffusion to two-dimensional identifi ed by mapping the effect of mutations membranes can slow down the rate of on the folding and unfolding rates 852 encounter but still speed up reactions 819 18.12 The process of protein folding can be described 17.24 Concentration gradients determine the as funneled movement on a multidimensional outcomes of many biological processes 822 free-energy landscape 856 17.25 Cells use motor proteins to transport cargo B. CHAPERONES FOR PROTEIN FOLDING 857 over long distances and to specifi c locations 823 18.13 Many proteins tend to aggregate rather 17.26 Vesicles are transported by kinesin motors than fold 857 that move along microtubule tracks 823 18.14 The high concentration of macromolecules 17.27 ATP hydrolysis provides a powerful driving inside the cell makes the problem of force for kinesin movement 825 aggregation particularly acute 858 C. 
EXPERIMENTAL MEASUREMENT OF 18.15 Proteins inside the cell usually fold into a DIFFUSION 826 functional form rapidly 860 18.16 Some proteins form irreversible aggregates 17.28 Diffusion constants can be measured that are toxic to cells 861 experimentally in several ways 826 18.17 Molecular chaperones are proteins that 17.29 Movement of molecules in solution can be prevent protein aggregation 863 driven by centrifugal forces 827 18.18 Hsp70 recognizes short peptides with 17.30 Equilibrium centrifugation can be used to sequences that are characteristic of the determine molecular weights 829 interior segments of proteins 866 17.31 Electrophoresis provides an alternative method 18.19 Hsp70 binds and releases protein chains for driving molecular motion 830 in a cycle that is coupled to ATP binding and 17.32 The electrophoretic mobility of nucleic acids hydrolysis 866 decreases with size 831 18.20 The GroEL chaperonin forms a hollow 17.33 Gel electrophoresis analysis of proteins is double-ring structure within which protein useful for size determination 832 molecules can fold 868 Summary 833 18.21 GroEL works like a two-stroke engine, binding Key Concepts 834 and releasing proteins 870 Problems 835 18.22 GroEL–GroES can accelerate the folding of proteins through passive and active Further Reading 836 mechanisms 872 C. RNA FOLDING 872 PART VI: ASSEMBLY AND ACTIVITIY 838 18.23 The electrostatic fi eld around RNA leads to the diffuse localization of metal ions 873 Chapter 18 Folding 839 18.24 RNA folding can be driven by increasing the concentration of metal ions 874 A. 
HOW PROTEINS FOLD 840 18.25 RNAs form stable secondary structural 18.1 Protein folding is governed by thermodynamics 840 elements, which increases their tendency 18.2 The reversibility of protein folding can also to misfold 875 be demonstrated by manipulating single 18.26 RNA folding is hierarchical with multiple stable molecules 841 intermediates 877 18.3 Unfolded states of proteins correspond to 18.27 Collapse is an early event in the folding of RNA 878 wide distributions of different conformations 842 18.28 RNA folding landscapes are highly rugged 880 18.4 Protein folding cannot be explained by an Summary 882 exhaustive search of conformational space 844 Key Concepts 883 18.5 Many small proteins populate only fully unfolded and fully folded states 844 Problems 884 18.6 The order in which secondary and tertiary Further Reading 886 interactions form can vary in different proteins 845 Chapter 19 Fidelity in DNA and Protein 18.7 Folding rates are faster when residues close Synthesis 887 in sequence end up close together in the folded structure 846 A. MEASURING THE STABILITY OF DNA 18.8 The folding of some proteins involves the DUPLEXES 888 formation of transiently stable intermediates 847 19.1 The difference in free energy between matched 18.9 Folding pathways can have multiple and mismatched base pairs can be determined intermediates 850 by measuring the melting temperature of DNA 888 xxii DETAILED CONTENTS

19.2 DNA melting can be studied by UV absorption 19.19 DNA polymerases recognize DNA using the spectroscopy 889 backbone and minor groove 915 19.3 The changes in enthalpy and entropy associated 19.20 DNA polymerases sense the shapes of with DNA melting can be determined from the correctly paired bases 917 concentration dependence of melting curves 890 19.21 The shape of a nucleotide is more important 19.4 DNA duplexes containing a mismatched base for its being incorporated into DNA than its pair at one end are only marginally less stable ability to form hydrogen bonds 918 than duplexes with matched bases 892 19.22 The growing DNA strand can shuttle between 19.5 The entropy of each DNA chain is reduced the polymerase and exonuclease active sites 919 upon forming a duplex 894 C. HOW RIBOSOMES ACHIEVE FIDELITY 920 19.6 The stability of DNA depends on the pattern 19.23 The ribosome has two subunits, each of which on base stacks in the duplex 895 is a large complex of RNA and proteins 921 19.7 Base stacking is more important than hydrogen 19.24 Protein synthesis on the ribosome occurs bonding in determining the stability of DNA as a repeated series of steps of tRNA and helices 897 protein binding, with conformational changes B. 
FIDELITY IN DNA REPLICATION 898 in the ribosome 921 19.8 The process of DNA replication is very accurate 898 19.25 Selection of the correct A-site tRNA by 19.9 The energy of DNA base-pairing cannot base-pairing alone cannot explain ribosome explain the accuracy of DNA replication 900 fi delity 923 19.10 The overall process of DNA synthesis can be 19.26 A ribosome-induced bend in the EF-Tu•tRNA described as a series of kinetic steps 902 complex plays an important role in generating specifi city 924 19.11 Primer elongation by polymerase is quite rapid 904 19.27 The ribosome undergoes conformational 19.12 The rate-limiting step in the DNA synthesis changes during the process of tRNA selection 925 reaction is a conformational change in DNA 19.28 Tight interactions in the decoding center polymerase 905 can only occur for correct codon–anticodon 19.13 Determination of the values of Vmax and KM pairs 926 for the incorporation of correct and incorrect 19.29 Coupling of the decoding center and the base pairs provides insights into fi delity 907 GTPase active site of EF-Tu involves multiple 19.14 DNA polymerase has a nuclease activity that conformational rearrangements 929 can remove bases from the 3′ end of a DNA 19.30 The active site of EF-Tu needs only a small strand 908 rearrangement to be activated 930 19.15 The structure of DNA polymerase has fi ngers, 19.31 Release of EF-Tu allows the aminoacyl palm, and thumb subdomains 909 group of the A-site tRNA to move to the 19.16 DNA polymerase binds DNA using the “palm” peptidyl transfer center 931 and nearly encircles it 910 19.32 The ribosome catalyzes peptidyl transfer 932 19.17 The active site of polymerase contains two Summary 934 metals ions that catalyze nucleotide addition 911 Key Concepts 935 19.18 A conformational change in DNA polymerase upon binding dNTP contributes to replication Problems 936 fi delity 913 Further Reading 937 How Do We UUnderstandnderstand Life?Life?

The ultimate goal of molecular biology and biochemistry is to understand in molecular terms the processes that make life possible. In his “Lectures on Physics,” Richard Feynman famously remarked that in order to understand life “the most powerful assumption of all ... (is) that everything that living things do can be understood in terms of the jigglings and wigglings of atoms.”¹ Feynman made this statement about 50 years ago, shortly after James Watson and Francis Crick had discovered the double-helical structure of DNA, and Max Perutz and John Kendrew were working out the first structures of proteins.

How do we even begin to make good on Feynman’s assertion that life can be understood in terms of the “jigglings and wigglings” of atoms? Our purpose in this book is to connect fundamental principles concerning the structure and energetics of biologically important molecules to their function. The concepts that we shall introduce provide a start towards establishing a complete understanding of the physical basis for life. As we move through these concepts, we shall assume that you are familiar with the essential principles of chemical structure, reactivity, and bonding, as covered in a typical introductory chemistry course. We shall also assume familiarity with basic concepts in molecular biology, again at the level encountered in introductory courses. If you find some of the material presented in the earlier chapters of this book difficult to follow, you may wish to consult elementary textbooks in chemistry and biology, such as those listed at the end of Chapter 1.

Any living cell is, ultimately, a collection of different kinds of molecules. The molecular structure of a particularly well-studied bacterium, Escherichia coli, is shown in Figure 1. This rendering of the cell, by the scientist and artist David Goodsell, is based on three-dimensional molecular structures that have been determined by many scientists, piece by piece, over the 50 years since Feynman’s assertion about life.
The particular cell shown in Figure 1 is encapsulated by two lipid membranes that are coated by a layer of glycans (carbohydrates). The interior of the cell is densely packed with many different kinds of macromolecules, which are very large molecules consisting of thousands of atoms each. Prominent among these is DNA, which is depicted as long yellow strands. The macromolecules with irregular shapes are various kinds of proteins and RNAs, as well as glycans.

Like all living cells, E. coli takes in nutrients and catalyzes chemical reactions that release energy from the nutrients. The cell is able to harness this energy to grow and to reproduce. By dividing into two cells, the mother cell passes on the blueprint for life, encoded in the DNA, to its two daughter cells, along with all of the other kinds of molecules that are necessary for these cells to live. It is apparent, even from looking at this relatively simple bacterial cell, that it is a formidable challenge to work out how such a molecular system can grow and reproduce, using only information contained within itself.

Figure 1 Molecular structure of a bacterial cell. Shown here is an Escherichia coli cell, illustrated by David Goodsell of The Scripps Research Institute. (A) Cross section of an E. coli cell. The main body of the cell is approximately 1 μm wide and has long whip-like flagella, which power the movement of the cell. (B) Expanded view of the region outlined in white in (A). Many of the macromolecules in the cell are shown here, drawn to scale. Some of the many protein machines in the cell are identified: DNA polymerase makes copies of DNA strands, RNA polymerase generates messenger RNA (mRNA) from DNA, and ATP synthase stores energy in the form of adenosine triphosphate (ATP). Transfer RNA (tRNA) is involved in the translation of the sequence of a messenger RNA to the sequence of a protein, by a particularly large machine called the ribosome. You can appreciate the scale of this drawing by considering that each ribosome is ~300 Å in diameter. (From D.S. Goodsell, Biochem. Mol. Biol. Educ. 37: 325–332, 2009. With permission from John Wiley & Sons, Inc.)
Figure labels: flagellum (green); cell membranes (yellow); ATP synthase (green); glycans (carbohydrates, green); ribosome (purple); messenger RNA (pink); transfer RNA (pink); RNA polymerase (orange); DNA polymerase (orange); DNA (yellow strands).

We shall start by understanding the properties and interactions of the biological macromolecules that are the nanoscale machines of the cell. The four types of macromolecules in the cell (DNA, RNA, proteins, and glycans) are polymers; that is, they are constructed by forming covalent linkages between smaller molecules. RNA and DNA are formed by linking nucleotides together, proteins are formed from amino acids, and glycans are polymers of sugars. Of these, DNA, RNA, and proteins are special because they are the three components of the process by which genetic information is translated into the machinery of the cell.

DNA, RNA, and proteins are linear polymers in which the linkage between the component units extends in only one direction without branching. The order of the specific kinds of nucleotides in DNA or RNA, or of specific amino acids in proteins, is called the sequence of the polymer. All living cells store heritable information in the form of DNA sequences, which are copied through the process of DNA replication and transmitted to progeny cells. The sequences of particular segments of DNA are also copied during the process of transcription to make RNAs with different kinds of functions. Messenger RNAs (mRNAs) are used to synthesize proteins. Other kinds of RNAs carry out diverse functions in the cell.

Most of the molecular machines that carry out the various processes essential for life are proteins. These include enzymes that catalyze chemical reactions, motor proteins that move things inside the cell, architectural proteins that give the cell its dynamic shape, and regulatory proteins that switch cellular processes on and off. Two particularly important kinds of protein enzymes are DNA polymerases, which replicate DNA, and RNA polymerases, which make RNAs based on the sequence of DNA. Another important protein enzyme is ATP synthase, which stores energy by synthesizing adenosine triphosphate (ATP). Some enzymes are made of RNA. The ribosome, which synthesizes proteins based on the sequences of messenger RNAs, is made of both proteins and RNA, with RNA being the functionally more important part. These molecular machines are identified in Figure 1, and we shall study some of them in this book.

There are four kinds of nucleotides in DNA and also in RNA, and 20 kinds of amino acids in proteins. Although this basic set of molecular building blocks is limited, they can generate a vast number of possible sequences. The E. coli bacterium has ~4.5 million (4.5 × 10^6) nucleotides in its DNA. A DNA molecule of this length corresponds to ~10^2,700,000 possible sequences (4 × 4 × 4 × 4 × ... 4.5 × 10^6 times), which is an unimaginably large number. A typical protein molecule is made from ~300 amino acids. The total number of different sequences possible for proteins of this length is 20^300 ≈ 10^390, also an enormously large number. It is from this vast diversity of possible sequences that evolution is able to select the much smaller number of sequences of DNAs, RNAs, and proteins that are used in life.

There are two central themes underlying the concepts in this book.
The first is that the function of a molecule depends on its structure and that biological macromolecules can assemble spontaneously into functional structures. The second theme is that any biological macromolecule must work together with other molecules to carry out its particular functions in the cell, and this depends on the ability of molecules to recognize each other specifically. Clearly, to understand the molecular mechanism of any biological process, we must understand the energy of the physical and chemical interactions that drive the formation of specific structures and promote molecular recognition.

You may be familiar with the concept of entropy, which is a measure of the likelihood of a particular arrangement of molecules. The flow of energy is governed by a very general principle, which is that the entropy of the universe always increases in any process. This statement is known as the second law of thermodynamics, and you have encountered it in some form in introductory chemistry. Another way of stating the second law is that a system always tends towards increased disorder, unless there is an input of energy. The relevance of the second law to living systems should become apparent if you study Figure 1 again. A living cell is a highly organized entity, with the cell membrane surrounding a specific collection of macromolecules that are where they need to be in order to function efficiently. Cells require a constant supply of energy to carry out the processes associated with living. Without energy, they would quickly go into a quiescent state and eventually disintegrate. The increase in entropy (disorder) upon disintegration overcomes the energetically favorable interactions that enable the cell to function.

In the first part of this book (Part I, Biological Molecules), we introduce the important classes of biological macromolecules and discuss the details of their structures.
With the architectural principles of macromolecular structures in hand, we turn our attention to the physical principles that govern the interactions between molecules. As we explain in Part II of this book (Energy and Entropy), considerations of the energetics of interactions must always go hand in hand with consideration of the entropy (taken together, energy and entropy control the “jigglings and wigglings” of the atoms). By combining energy and entropy we arrive at a parameter known as the free energy, which allows us to predict whether a molecular process will occur spontaneously. This concept is developed in Part III of the book (Free Energy), and applied to processes such as the spontaneous adoption of specific structures by proteins and the transmission of electrical signals in nerve cells.

In Part IV (Molecular Interactions), we focus on the idea that molecular interactions in living systems have to be highly specific. By drawing on the descriptions of protein and nucleic acid structure that we developed in Part I and the idea of free energy developed in Part III, we explain how molecules that need to interact find each other in the crowded environment of the cell.

Living systems change with time. Another way of saying this is that living systems are never at equilibrium: they would be dead if they were. In Part V (Kinetics and Catalysis), we turn to a study of kinetics, which describes the time dependence of molecular processes such as chemical reactions and diffusion. This part of the book provides us with several essential ideas about how enzymes work. Finally, in Part VI (Assembly and Activity), we focus on two particularly fascinating aspects of cellular processes: how proteins and RNA fold into specific three-dimensional structures, and how the processes of replication and translation achieve very high fidelity.

1 Feynman RP, Leighton RB, and Sands ML (1963) The Feynman Lectures on Physics. Reading, MA: Addison-Wesley Publishing Co.
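
The sizes of the sequence spaces quoted in this chapter can be checked with logarithms, since the numbers themselves are far too large to write out. The short Python sketch below uses only the figures given in the text (four kinds of nucleotides, about 4.5 million positions in the E. coli DNA, 20 kinds of amino acids, and a protein of about 300 residues). The function name is ours and purely illustrative.

```python
import math

def log10_sequence_count(alphabet_size: int, length: int) -> float:
    """Return log10 of alphabet_size**length, the number of possible sequences."""
    return length * math.log10(alphabet_size)

# E. coli DNA: 4 kinds of nucleotides, ~4.5 million positions (figures from the text)
dna_exponent = log10_sequence_count(4, 4_500_000)

# Typical protein: 20 kinds of amino acids, ~300 residues (figures from the text)
protein_exponent = log10_sequence_count(20, 300)

print(f"DNA:     about 10^{dna_exponent:,.0f} possible sequences")
print(f"Protein: about 10^{protein_exponent:.0f} possible sequences")
```

The two exponents come out close to 2.7 million and 390, in line with the rounded values quoted above. Working with the logarithm avoids ever forming the full number.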
References

2 Chapter 2 Nucleic Acid Structure

A. Double-Helical Structures of RNA and DNA

Cozzarelli NR & Wang JC (1990) DNA Topology and Its Biological

Effects. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.

Dickerson RE, Drew, HR, Conner, BN, Wing, RM, Fratini, AV & Kopka,

ML (1982) The anatomy of A-, B-, and Z-DNA. Science 216, 475–485.

Harrison SC & Aggarwal AK (1990) DNA recognition by proteins with a helix-turn-helix motif. Annu. Rev. Biochem. 59, 933–969.

Richmond TJ & Davey CA (2003) The structure of DNA in the nucleo some core. Nature 423, 145–150.

Saenger, W, Hunter WN & Kennard O (1986) DNA conformation is determined by economics in the hydration of phosphate groups.

Nature 324, 385–388.

Seeman NC, Rosenberg JM & Rich A (1976) Sequence-specifi c rec ognition of double helical nucleic acids by proteins. Proc. Natl. Acad.

Sci. USA 73, 804–808.

Wahl MC & Sundaralingam M (1997) Structures of A-DNA duplexes.

Biopolymers 44, 45–63.

Wang JC (2002) Cellular roles of DNA topoisomerases: a molecular perspective. Nat. Rev. Mol. Cell. Biol. 3, 430–440. Wemmer DE (2000) Designed sequence-specifi c minor groove ligands. Annu. Rev. Biophys. Biomol. Struct. 29, 439–461. Werner MH & Burley SK (1997) Architectural transcription factors: proteins that remodel DNA. Cell 88, 733–736. B. The Functional Versatility of RNA Ban N, Nissen P, Hansen J, Moore PB & Steitz TA (2000) The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 289, 905–920. Batey RT, Rambo RP & Doudna JA (1999) Tertiary motifs in RNA structure and folding. Angew. Chem. Int. Ed. Engl. 38, 2326–2343. Correll CC & Swinger K (2003) Common and distinctive features of GNRA tetraloops based on a GUAA tetraloop structure at 1.4 Å resolution. RNA 9, 355–363. Gold L, Polisky B, Uhlenbeck O & Yarus M (1995) Diversity of oligonucleotide functions. Annu. Rev. Biochem. 64, 763–797. Mathews DH, Moss WN & Turner DH (2010) Folding and fi nding RNA secondary structure. Cold Spring Harb. Perspect. Biol. 2, a003665. Misra VK & Draper DE (1998) On the role of magnesium ions in RNA stability. Biopolymers 48, 113–135. Staple DW & Butcher SE (2005) Pseudoknots: RNA structures with diverse functions. PLoS Biol. 3, e213. Varani G & McClain WH (2000) The G x U wobble base pair. A fundamental building block of RNA structure crucial to RNA function in diverse biological systems. EMBO Rep. 1, 18–23. Wyatt JR, Puglisi JD & Tinoco I (1989) RNA folding: pseudoknots, loops and bulges. BioEssays 11, 100–106. Zuker M (2003) Mfold web server for nucleic acid folding and hybridization prediction. Nuc. Acids Res. 31, 3406–3415.

Further Reading

General 3 Chapter 3 Glycans and Lipids

A. Glycans

Drickamer K & Taylor ME (1998) Evolving views of protein glycosyla tion. Trends Biochem. Sci. 23, 321–324.

Helenius A & Aebi M (2004) Roles of N-linked glycans in the endo plasmic reticulum. Annu. Rev. Biochem. 73, 1019–1049.

Laughlin ST & Bertozzi CR (2009) Imaging the glycome. Proc. Natl.

Acad. Sci. USA 106, 12–17.

Petrescu A, Wormald MR & Dwek RA (2006) Structural aspects of glycomes with a focus on N-glycosylation and glycoprotein folding.

Curr. Op. Struct. Biol. 16, 600–607. B. Lipids and Membranes Grecco HE, Schmick M & Bastiaens PIH (2011) Signaling from the living plasma membrane. Cell 144, 897–909. Mayor S & Rao M (2004) Rafts: Scale-dependent, active lipid organization at the cell surface. Traffi c 5, 231–240. Maxfi eld FR & McGraw TE (2004) Endocytic recycling. Nat. Rev. Mol. Cell Biol. 5, 121–132. Mouritsen OG & Zuckermann MJ (2004) What’s so special about cholesterol? Lipids 39, 1101–1113. Sanyal S & Menon AK (2009) Flipping lipids: Why an’ what’s the reason for? ACS Chem. Biol. 4, 895–909. Schleifer KH & Kandler O (1972) Peptidoglycan types of bacterial cell walls and their taxonomic implications. Bacteriol. Rev. 36, 407–477. Simons K & Toomre D (2000) Lipid rafts and signal transduction. Nat. Rev. Mol. Cell Biol. 1, 31–41. van Meer G, Voelker DR & Feigenson GW (2008) Membrane lipids: where they are and how they behave. Nat. Rev. Mol. Cell Biol. 9, 112–124. Zhang FL & Casey PJ (1996) Protein prenylation: molecular mechanisms and functional consequences. Annu. Rev. Biochem. 65, 241– 269.

Further Reading

General 4 Chapter 4 Protein Structure

Crick FH (1952) Is α-keratin a coiled coil? Nature 170, 882–883.

Woolfson DN (2005) The design of coiled-coil structures and assem blies. Adv. Protein Chem. 70, 79–112.

D. Structural Principles of Membrane Proteins

Khorana HG (1988) Bacteriorhodopsin, a membrane protein that uses light to translocate protons. J. Biol. Chem. 263, 7439–7442.

Kyte J & Doolittle RF (1982) A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157, 105–132.

Lanyi JK (2004) Bacteriorhodopsin. Annu. Rev. Physiol. 66, 665–688.

Subramaniam S & Henderson R (2000) Crystallographic analysis of protein conformational changes in the bacteriorhodopsin photo cycle. Biochim. et Biophys. Acta 1460, 157–165.

White SH & Wimley WC (1999) Membrane protein folding and sta bility: Physical principles. Annu. Rev. Biophys. Biomol. Struct. 28,

319–365. 5 Chapter 5 Evolutionary Variation in Proteins

A. The Thermodynamic Hypothesis

Anfi nsen CB (1972) Nobel Lecture: Studies on the Principles that

Govern the Folding of Protein Chains. T. Frängsmyr, ed. Stockholm:

The Nobel Foundation.

Epstein CJ, Goldberger RF & Anfi nsen CB (1963) The genetic con trol of tertiary protein structure: Studies with model systems. Cold

Spring Harb. Symp. Quant. Biol. 28, 439–449.

Sela M & Lifson S (1959) The reformation of disulfi de bridges in pro teins. Biochim. Biophys. Acta 36, 471–478.

B. Sequence Comparisons and the BLOSUM Matrix

Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W &

Lipman DJ (1997) Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nuc. Acids Res. 25, 3389–3402.

BLAST: Basic Local Alignment Search Tool. http://blast.ncbi.nlm.nih. gov/Blast.cgi

Henikoff S & Henikoff JG (2000) Amino acid substitution matrices.

Adv. Protein Chem. 54, 73–97.

Krishnamurthy N & Sjolander KV (2005) Basic protein sequence anal ysis. Curr. Protoc. Mol. Biol. Chapter 19, Unit 19.15.

C. Structural Variation in Proteins

Bowie JU, Luthy R & Eisenberg D (1991) A method to identify protein sequences that fold into a known three-dimensional structure. Sci ence 253, 164–170. Holm L & Sander C (1996) Mapping the protein universe. Science 273, 595–602. Jones DT, Taylor WR & Thornton JM (1992) A new approach to protein fold recognition. Nature 358, 86–89. Marti-Renom MA, Madhusudhan MS & Sali A (2004) Alignment of protein sequences by their profi les. Prot. Sci. 13, 1071–1087. Richardson JS (1981) The anatomy and taxonomy of protein structure. Adv. Protein Chem. 34, 167–339. D. The Evolution of Modular Domains Andreeva A, Howorth D, Brenner SE, Hubbard TJ, Chothia C & Murzin AG (2004) SCOP database in 2004: Refi nements integrate structure and sequence family data. Nuc. Acids Res. 32, D226–D229. Buehner M, Ford GC, Moras D, Olsen KW & Rossmann MG (1973) D-glyceraldehyde-3-phosphate dehydrogenase: Three-dimensional structure and evolutionary signifi cance. Proc. Natl. Acad. Sci. USA 70, 3052–3054. Chothia C & Gough J (2009) Genomic and structural aspects of protein evolution. Biochem. J. 419, 15–28. Karplus M & McCammon JA (2002) Molecular dynamics simulations of biomolecules. Nat. Struct. Biol. 9, 646–652. Koonin EV, Wolf YI & Karev GP (2002) The structure of the protein universe and genome evolution. Nature 420, 218–223. Laskowski RA & Thornton JM (2008) Understanding the molecular machinery of genetics through 3D structures. Nat. Rev. Genet. 9, 141–151. Levitt M (2009) Nature of the protein universe. Proc. Natl. Acad. Sci. USA 106, 11079–11084. Orengo CA, Michie AD, Jones S, Jones DT, Swindells MB & Thornton JM (1997) CATH—a hierarchic classifi cation of protein domain structures. Structure 5, 1093–1108. Wolf YI, Grishin NV & Koonin EV (2000) Estimating the number of protein folds and families from complete genome data. J. Mol. Biol. 299, 897–905. Xia Y & Levitt M (2004) Simulating protein evolution in sequence and structure space. Curr. Opin. Struct. Biol. 14, 202–207. 
Chapter 6 Energy and Intermolecular Forces

C. Energetics of Intermolecular Interactions

Baker EN & Hubbard RE (1984) Hydrogen-bonding in globular proteins. Prog. Biophys. Mol. Biol. 44, 97–179.

Honig B & Nicholls A (1995) Classical electrostatics in biology and chemistry. Science 268, 1144–1149.

Karplus M & McCammon JA (2002) Molecular dynamics simulations of biomolecules. Nat. Struct. Biol. 9, 646–652.

Leach AR (2001) Molecular Modeling: Principles and Applications. Upper Saddle River, NJ: Prentice Hall.

Matthews BW & Liu L (2009) A review about nothing: Are apolar cavities in proteins really empty? Protein Sci. 18, 494–502.

Schlick T (2002) Molecular Modeling and Simulation: An Interdisciplinary Guide. New York: Springer.

Wang W, Donini O, Reyes CM & Kollman PA (2001) Biomolecular simulations: Recent developments in force fields, simulations of enzyme catalysis, protein-ligand, protein-protein, and protein-nucleic acid noncovalent interactions. Annu. Rev. Biophys. Biomol. Struct. 30, 211–243.

Chapter 13 Specificity of Macromolecular Recognition

A. Affinity and Specificity

Mohammadi M, Olsen SK & Ibrahimi OA (2005) Structural basis for fibroblast growth factor receptor activation. Cyt. & Growth Fact. Rev. 16, 107–137.

B. Protein–Protein Interactions

Clackson T & Wells JA (1995) A hot spot of binding energy in a hormone-receptor interface. Science 267, 383–386.

Lo Conte L, Chothia C & Janin J (1999) The atomic structure of protein–protein recognition sites. J. Mol. Biol. 285, 2177–2198.

Pearce KH, Cunningham BC, Fuh G, Teeri T & Wells JA (1999) Growth hormone binding affinity for its receptor surpasses the requirements for cellular activity. Biochemistry 38, 81–89.

Scott JD & Pawson T (2009) Cell signaling in space and time: where proteins come together and when they’re apart. Science 326, 1220–1224.

Sheinerman FB & Honig B (2002) On the role of electrostatic interactions in the design of protein-protein interfaces. J. Mol. Biol. 318, 161–177.

C. Recognition of Nucleic Acids by Proteins

Clery A, Blatter M & Allain FH (2008) RNA recognition motifs: boring? Not quite. Curr. Opin. Struct. Biol. 18, 290–298.

Draper DE (1995) Protein–RNA recognition. Annu. Rev. Biochem. 64, 593–620.

Harrison SC (1991) A structural taxonomy of DNA-binding proteins. Nature 353, 715–719.

Kao-Huang Y, Revzin A, Butler AP, O’Conner P, Noble DW & von Hippel PH (1977) Nonspecific DNA binding to genome-regulating proteins as a biological control mechanism: Measurement of DNA-bound Escherichia coli lac repressor in vivo. Proc. Natl. Acad. Sci. USA 74, 4228–4232.

16. A zinc finger protein is isolated from a yeast cell. The value of K_D for its binding site is 3 μM. In the presence of glucose, the protein dimerizes and recognizes an inverted repeat binding site. a. What is the expected value of K_D if the binding is additive? b. The dimeric K_D is measured at 5 nM. Why does this value deviate from the expected K_D?

17. A tryptophan residue near the periphery of a protein–protein interface is mutated to alanine and changes the K_D of binding from 1 nM to 40 μM at 300 K. a. How much binding energy was contributed by that residue? b. Explain whether or not the tryptophan residue is likely a hot spot residue.

18. A protein–protein interface has a 10 nM affinity at 300 K. A series of mutants are made in which each residue at the interface is replaced by alanine. A lysine residue at the center of the interface is mutated, and found to contribute 4 kJ•mol⁻¹ to the binding free energy. a. What is the new K_D? b. Explain whether or not the lysine residue is a hot spot residue.

19. The transcription factor FraJ binds a poly-A DNA sequence with a 10 nM K_D and a poly-G DNA sequence with a 27 μM K_D. Mutation of a critical Phe residue to Ala results in a loss of 20 kJ•mol⁻¹ in binding free energy for the poly-A sequence, but only a loss of 4 kJ•mol⁻¹ on binding to the poly-G sequence. What is the change in specificity for the poly-A sequence over the poly-G sequence at 10⁻⁸ M concentration of FraJ at 300 K?

20. A protein–protein interface comprises 22 residues at the contact surface. From structures of the isolated proteins, it is expected that completely burying these residues would cause a surface area reduction of ~2000 Å². However, a structure of the interface reveals that only 1200 Å² of surface area is buried. Why is there a discrepancy between the expected and measured surface area reductions?

21. Consider the dsRB domain and its potential for interacting with DNA and RNA (see Figure 13.32). What is the predicted effect on the specificity and affinity of recognition for the two types of nucleic acids of: a. An Arg to Ala mutation at the binding interface? b. Insertion of loop residues that change the relative spacing of helix A and helix B?

22. A DNA-binding domain binds the sequence GATCGCAATATCGATCGATC with a 25 nM affinity. A mutation of an Arg to Ala in the protein or a mutation of the underlined “T” to “G” in the DNA sequence both result in a 9 kJ•mol⁻¹ loss of binding free energy. Simultaneous mutation of both the protein and the DNA also results in a 9 kJ•mol⁻¹ loss of binding free energy. a. What is the effect on the K_D of any of these mutations? b. What does the double mutant result suggest about the structural basis for the protein–DNA interaction?

23. Each subunit of a homodimeric transcription factor can individually recognize a DNA half-site with a 5 μM K_D. The dimeric form of the transcription factor recognizes the full inverted repeat DNA site with a 50 nM K_D. How much free energy is used to induce the conformational changes of the protein and DNA during the binding of the dimeric transcription factor?

24. A complex of seven transcription factors binds a DNA enhancer element. The binding is cooperative. What are two molecular mechanisms that the transcription factors might use to achieve this cooperativity?

Klug A (2010) The discovery of zinc fingers and their applications in gene regulation and genome manipulation. Annu. Rev. Biochem. 79, 213–231.

Marmorstein R & Fitzgerald MX (2003) Modulation of DNA-binding domains for sequence specific DNA recognition. Gene 304, 1–12.

Murphy IV FV & Churchill MEA (2000) Nonsequence-specific DNA recognition: A structural perspective. Structure 8, R83–R89.

Patikoglou G & Burley SK (1997) Eukaryotic transcription factor–DNA complexes. Annu. Rev. Biophys. Biomol. Struct. 26, 289–325.

Rohs R, Jin X, West SM, Joshi R, Honig B & Mann RS (2010) Origins of specificity in protein-DNA recognition. Annu. Rev. Biochem. 79, 233–269.

Seeman NC, Rosenberg JM & Rich A (1976) Sequence-specific recognition of double helical nucleic acids by proteins. Proc. Natl. Acad. Sci. USA 73, 804–808.

Valverde R, Edwards L & Regen L (2008) Structure and function of KH domains. FEBS J. 275, 2712–2726.

Chapter 14 Allostery

A. Ultrasensitivity of Molecular Responses

Bray D (1995) Protein molecules as computational elements in living cells. Nature 376, 307–312.

Cluzel P, Surette M & Leibler S (2000) An ultrasensitive bacterial motor revealed by monitoring signaling proteins in single cells. Science 287, 1652–1655.

Ferrell Jr JE & Machleder EM (1998) The biochemical basis of an all-or-none cell fate switch in Xenopus oocytes. Science 280, 895–898.

Huang CY & Ferrell Jr JE (1996) Ultrasensitivity in the mitogen-activated protein kinase cascade. Proc. Natl. Acad. Sci. USA 93, 10078–10083.

B. Allostery in Hemoglobin

Eaton WA, Henry ER, Hofrichter J & Mozzarelli A (1999) Is cooperative oxygen binding by hemoglobin really understood? Nat. Struct. Biol. 6, 351–358.

Monod J, Wyman J & Changeux J-P (1965) On the nature of allosteric transitions: a plausible model. J. Mol. Biol. 12, 88–118.

Perutz MF, Wilkinson MJ, Paoli M & Dodson GG (1998) The stereochemistry of the cooperative effects in hemoglobin revisited. Annu. Rev. Biophys. Biomol. Struct. 27, 1–34.

Royer Jr WE, Zhu H, Gorr TA, Flores JF & Knapp JE (2005) Allosteric hemoglobin assembly: Diversity and similarity. J. Biol. Chem. 280, 27477–27480.

Szabo A & Karplus M (1972) A mathematical model for structure-function relationships in hemoglobin. J. Mol. Biol. 72, 163–197.

[NaCl] (mM)      Hsp90 transcription
1                0
2                1
3                1
4                40
5                100
10               100
100              100
1000             100

[Caffeine] (mM)  Hsp90 transcription
1                0
2                0
3                1
4                1.1
5                2
10               10
100              50
1000             100

Chapter 16 Principles of Enzyme Catalysis

A. Michaelis–Menten kinetics

Albery WR & Knowles JR (1977) Perfection in enzyme catalysis: the energetics of triosephosphate isomerase. Acc. Chem. Res. 10, 105–111.

Dowd JE & Riggs DS (1965) A comparison of estimates of Michaelis–Menten kinetic constants from various linear transformations. J. Biol. Chem. 240, 863–869.

Hammes GG (2002) Multiple conformational changes in enzyme catalysis. Biochemistry 41, 8221–8228.

Kirsch JF (1973) Mechanism of enzyme action. Annu. Rev. Biochem. 42, 205–234.

B. Inhibitors and more complex reaction schemes

Cleland WW (1963) The kinetics of enzyme-catalyzed reactions with two or more substrates or products: I. Nomenclature and rate equations. Biochimica et Biophysica Acta 67, 107–137.

Dixon M (1953) The determination of enzyme inhibitor constants. Biochem. J. 55, 170–171.

Wolfenden R (2006) Degrees of difficulty of water-consuming reactions in the absence of enzymes. Chem. Rev. 106, 3379–3396.

C. Protein enzymes

Beck ZQ, Morris GM & Elder JH (2002) Defining HIV-1 protease substrate selectivity. Curr. Drug Targ. Infect. Disorders 2, 37–50.

Kraut J (1977) Serine proteases: structure and mechanism of catalysis. Annu. Rev. Biochem. 46, 331–358.

Pelmenschikov V & Siegbahn PE (2005) Copper-zinc superoxide dismutase: Theoretical insights into the catalytic mechanism. Inorg. Chem. 44, 3311–3320.

Pollack SJ, Jacobs JW & Schultz PG (1986) Selective chemical catalysis by an antibody. Science 234, 1570–1573.

Wang P-F, Flynn AJ, Naor MM, Jensen JH, Cui G, Merz KM, Kenyon GL & McLeish MJ (2006) Exploring the role of the active site cysteine in human muscle creatine kinase. Biochemistry 45, 11464–11472.

Zhang X, Zhang X & Bruice TC (2005) A definitive mechanism for chorismate mutase. Biochemistry 44, 10443–10448.

D. RNA enzymes

Cochrane JC & Strobel SA (2008) Catalytic strategies of self-cleaving ribozymes. Acc. Chem. Res. 41, 1027–1035.

Fedor MJ (2009) Comparative enzymology and structural biology of RNA self-cleavage. Annu. Rev. Biophys. 38, 271–299.

Fedor MJ & Williamson JR (2005) The catalytic diversity of RNAs. Nat. Rev. Mol. Cell. Biol. 6, 399–412.

Lilley DMJ & Eckstein F (2008) Ribozymes and RNA Catalysis. Cambridge, UK: RSC Publishing.

Stahley MR & Strobel SA (2006) RNA splicing: group I intron crystal structures reveal the basis of splice site selection and metal ion catalysis. Curr. Opin. Struct. Biol. 16, 319–326.

Vicens Q & Cech TR (2006) Atomic level architecture of group I introns revealed. Trends Biochem. Sci. 31, 41–51.

Chapter 17 Diffusion and Transport

A. Random walks

Adler J (1966) Chemotaxis in bacteria. Science 153, 708–716.

Berg HC (2004) E. coli in Motion. New York: Springer.

Berg HC (1993) Random Walks in Biology. Princeton, NJ: Princeton University Press.

B. Macroscopic description of diffusion

Collins FC & Kimball GE (1949) Diffusion-controlled reaction rates. J. Colloid Sci. 4, 425–437.

Elf J, Li G-W & Xie XS (2007) Probing transcription factor dynamics at the single-molecule level in a living cell. Science 316, 1191–1194.

Elowitz MB, Surette MG, Wolf PE, Stock JB & Leibler S (1999) Protein mobility in the cytoplasm of Escherichia coli. J. Bacteriol. 181, 197–203.

Halford SE & Marko JF (2004) How do site-specific DNA-binding proteins find their targets? Nucleic Acids Res. 32, 3040–3052.

Kampmann M (2005) Facilitated diffusion in chromatin lattices: Mechanistic diversity and regulatory potential. Mol. Micro. 57, 889–899.

Kholodenko BN, Hoek JB & Westerhoff HV (2000) Why cytoplasmic signalling proteins should be recruited to cell membranes. Trends Cell Biol. 10, 173–178.

Spirov A, Fahmy K, Schneider M, Frei E, Noll M & Baumgartner S (2009) Formation of the bicoid morphogen gradient: An mRNA gradient dictates the protein gradient. Development 136, 605–614.

Vale RD & Milligan RA (2000) The way things move: Looking under the hood of molecular motors. Science 288, 88–95.

von Hippel PH & Berg OG (1989) Facilitated target location in biological systems. J. Biol. Chem. 264, 675–678.

C. Experimental measurement of diffusion

Berne BJ & Pecora R (2000) Dynamic light scattering: with application to chemistry, biology, and physics. Mineola, NY: Dover.

Lebowitz J, Lewis MS & Schuck P (2002) Modern analytical ultracentrifugation in protein science: A tutorial review. Protein Sci. 11, 2067–2079.

19. The same proteins from the previous problem are analyzed by equilibrium ultracentrifugation. The logarithm of the absorbance is plotted versus squared distance from the top of the measurement cell at 10,000 rpm for each protein. a. Explain which sample is the mutant and which is the wild type. b. Is there monomer–dimer exchange in either the wild-type or mutant kinase?

20. Sedimentation coefficients (in Svedberg units) are often non-additive for macromolecular complexes. For example, the assembled ribosome and proteasome each have lower total sedimentation coefficients than one might expect given the constituents. How might a macromolecular assembly have a higher sedimentation coefficient than the sum of its subunits?

21. Why is it necessary to add agarose or polyacrylamide for electrophoresis experiments?

22. Why do proteins, but not nucleic acids, need to be covered in SDS to estimate the mass by gel electrophoresis?

23. Dynein is a cytoplasmic motor similar to kinesin, but it travels along microtubules in the opposite direction. A single dynein transports a vesicle 0.6 μm along an axon in 5 sec. Each dynein step uses one cycle of ATP hydrolysis and moves the motor 80 Å along a microtubule filament. Assuming all steps are forward along one filament, what is the ATP hydrolysis rate of dynein?

24. How much resistive force does a 50-nm vesicle experience if it is transported by dynein at 1 μm•sec⁻¹ in the cytoplasm (η = 0.2 g•cm⁻¹•sec⁻¹)?

[Figure: log(A₂₈₀) plotted against r² (cm²).]

Chapter 18 Folding

A. How proteins fold

Dill KA (1990) Dominant forces in protein folding. Biochemistry 29, 7133–7155.

Dill KA & Chan HS (1997) From Levinthal to pathways to funnels. Nat. Struct. Biol. 4, 10–19.

Dobson CM (2003) Protein folding and misfolding. Nature 426, 884–890.

Fersht AR (2008) From the first protein structures to our current knowledge of protein folding: delights and scepticisms. Nat. Rev. Mol. Cell Biol. 9, 650–654.

Nishimura C, Prytulla S, Dyson HJ & Wright PE (2000) Conservation of folding pathways in evolutionarily distant globin sequences. Nat. Struct. Biol. 7, 679–686.

Plaxco KW, Simons KT, Ruczinski I & Baker D (2000) Topology, stability, sequence, and length: Defining the determinants of two-state protein folding kinetics. Biochemistry 39, 11177–11183.

Udgaonkar JB (2008) Multiple routes and structural heterogeneities in protein folding. Annu. Rev. Biophys. 37, 489–510.

B. Chaperones for protein folding

Buchner J (1996) Supervising the fold: Functional principles of molecular chaperones. FASEB J. 10, 10–19.

Bukau B, Weissman J & Horwich A (2006) Molecular chaperones and protein quality control. Cell 125, 443–444.

Chiti F & Dobson CM (2006) Protein misfolding, functional amyloid, and human disease. Annu. Rev. Biochem. 75, 333–366.

Flynn GC, Pohl J, Flocco MT & Rothman JE (1991) Peptide-binding specificity of the molecular chaperone BiP. Nature 353, 726–730.

Hartl FU (2011) Chaperone assisted protein folding: the path to discovery from a personal perspective. Nat. Med. 17, 1206–1210.

Hartl FU (1996) Molecular chaperones in cellular protein folding. Nature 381, 571–579.

Horwich AL (2011) Protein folding in the cell: an inside story. Nat. Med. 17, 1211–1216.

Prusiner SB (1998) Prions. Proc. Natl. Acad. Sci. USA 95, 13363–13368.

Saibil HR (2008) Chaperone machines in action. Curr. Opin. Struct. Biol. 18, 35–42.

Whitesell L & Lindquist SL (2005) HSP90 and the chaperoning of cancer. Nat. Rev. Cancer 5, 761–772.

C. RNA folding

Baird NJ, Fang XW, Srividya N, Pan T & Sosnick TR (2007) Folding of a universal ribozyme: The ribonuclease P RNA. Q. Rev. Biophys. 40(2), 113–161.

Chu VB & Herschlag D (2008) Unwinding RNA’s secrets: Advances in the biology, physics, and modeling of complex RNAs. Curr. Opin. Struct. Biol. 18, 305–314.

Draper DE (2008) RNA folding: Thermodynamic and molecular descriptions of the roles of ions. Biophys. J. 95, 5489–5495.

Li PTX, Vieregg J & Tinoco I (2008) How RNA unfolds and refolds. Annu. Rev. Biochem. 77, 77–100.

Pyle AM (2002) Metal ions in the structure and function of RNA. J. Biol. Inorg. Chem. 7–8, 679–690.

Thirumalai D & Woodson SA (2000) Maximizing RNA folding rates: A balancing act. RNA 6, 790–794.

Shcherbakova I & Brenowitz M (2008) Monitoring structural changes in nucleic acids with single residue spatial and millisecond time resolution by quantitative hydroxyl radical footprinting. Nat. Protoc. 3(2), 288–302.

Further Reading

General

Chapter 19 Fidelity in DNA and Protein Synthesis

A. Measuring the stability of DNA duplexes

Petruska J, Goodman MF, Boosalis MS, Sowers LC, Cheong C & Tinoco I Jr (1988) Comparison between DNA melting thermodynamics and DNA polymerase fidelity. Proc. Natl. Acad. Sci. USA 85, 6252–6256.

Aboul-ela F, Koh D, Tinoco I Jr & Martin FH (1985) Base-base mismatches. Thermodynamics of double helix formation for dCA₃XA₃G + dCT₃YT₃G (X, Y = A,C,G,T). Nucleic Acids Res. 13, 4811–4824.

Lai JS, Qu J & Kool ET (2003) Fluorinated DNA bases as probes of electrostatic effects in DNA base stacking. Angew. Chem. Int. Ed. Engl. 42, 5973–5977.

B. Fidelity in DNA replication

Doublie S, Tabor S, Long AM, Richardson CC & Ellenberger T (1998) Crystal structure of a bacteriophage T7 DNA replication complex at 2.2 Å resolution. Nature 391, 251–258.

Johnson KA (1993) Conformational coupling in DNA polymerase fidelity. Annu. Rev. Biochem. 62, 685–713.

Joyce CM & Steitz TA (1994) Function and structure relationships in DNA polymerases. Annu. Rev. Biochem. 63, 772–822.

Kunkel TA (2004) DNA replication fidelity. J. Biol. Chem. 279, 16895–16898.

Rothwell PJ & Waksman G (2005) Structure and mechanism of DNA polymerases. Adv. Protein Chem. 71, 401–440.

C. How ribosomes achieve fidelity

Korostelev A & Noller HF (2007) The ribosome in focus: new structures bring new insights. Trends Biochem. Sci. 32, 434–441.

Ramakrishnan V (2010) Unraveling the structure of the ribosome (Nobel Lecture). Angew. Chem. Int. Ed. Engl. 49, 4355–4380.

Rodnina MV, Beringer M & Wintermeyer W (2006) Mechanism of peptide bond formation on the ribosome. Q. Rev. Biophys. 39, 203–225.

Schmeing TM & Ramakrishnan V (2009) What recent ribosome structures have revealed about the mechanism of translation. Nature 461, 1234–1242.

Steitz TA (2010) From the structure and function of the ribosome to new antibiotics (Nobel Lecture). Angew. Chem. Int. Ed. Engl. 49, 4341–4354.

Yonath A (2010) Polar bears, antibiotics and the evolving ribosome (Nobel Lecture). Angew. Chem. Int. Ed. Engl. 49, 4381–4398.

Zaher HS & Green R (2009) Fidelity at the molecular level: Lessons from protein synthesis. Cell 136, 746–762.

and entropy measurements show a large range. A scientist observes that mutations that are favorable with respect to binding enthalpy are entropically disfavored. What effect has the scientist observed and why is this phenomenon common in biological molecules?

23. A DNA polymerase is isolated and found to have an error rate of 1 in 10⁶. a. Suppose that the error rate is determined solely by the relative stabilities of incorrect and correct base pairs. What would the difference in free energy between correct and incorrect nucleotides incorporated by the polymerase have to be in order to explain the error rate? b. Solution studies of isolated oligonucleotides indicate that the energetic difference is actually –1.2 kJ•mol⁻¹. What is the equilibrium constant of correct–incorrect base pair discrimination based on these solution studies? c. What other enzymatic activity, in addition to nucleotide insertion, contributes to the increased fidelity of DNA polymerase?

24. Both thermodynamics and kinetics play an important role in achieving fidelity. How is kinetic control exercised by a) DNA polymerase and b) the ribosome?
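The arithmetic behind Problem 23 rests on the standard relation between a free-energy difference and an equilibrium ratio, ΔG = –RT ln K. The short Python sketch below shows one way to set up that conversion in both directions. It is an illustration under stated assumptions rather than an official solution: the function names are hypothetical, and the temperature of 300 K is assumed here because Problem 23 does not state one (300 K is the value used in several of the other problems).

```python
import math

# Molar gas constant in kJ mol^-1 K^-1.
R_KJ = 8.314e-3

def discrimination_free_energy(ratio, temperature_k=300.0):
    """Magnitude of the free-energy gap (kJ/mol) needed for a given
    correct:incorrect ratio, from |dG| = RT ln(ratio).
    The 300 K default is an assumption, not part of the problem."""
    return R_KJ * temperature_k * math.log(ratio)

def equilibrium_constant(delta_g_kj, temperature_k=300.0):
    """Equilibrium constant for a free-energy difference (kJ/mol),
    from K = exp(-dG / RT)."""
    return math.exp(-delta_g_kj / (R_KJ * temperature_k))

# Part a: an error rate of 1 in 10^6 corresponds to a ratio of 10^6.
gap = discrimination_free_energy(1e6)

# Part b: discrimination implied by a -1.2 kJ/mol difference.
k_solution = equilibrium_constant(-1.2)

print(f"Free-energy gap for 1 in 10^6 errors: {gap:.1f} kJ/mol")  # about 34.5
print(f"Discrimination from -1.2 kJ/mol: {k_solution:.2f}")       # about 1.62
```

Comparing the two numbers makes the point of the problem concrete: the stability difference measured in solution supports a discrimination of less than twofold, far short of the 10⁶ ratio, so base-pair thermodynamics alone cannot account for the observed fidelity.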