An Analysis of Ancestral Sequence Resurrection in the Context of Guanylate Kinase Evolution

AN ANALYSIS OF ANCESTRAL SEQUENCE RESURRECTION IN THE CONTEXT OF GUANYLATE KINASE EVOLUTION by WILLIAM CAMPODONICO-BURNETT A THESIS Presented to the Department of Biochemistry and the Robert D. Clark Honors College in partial fulfillment of the requirements for the degree of Bachelor of Science July 2014 An Abstract of the Thesis of William Campodonico-Burnett for the degree of Bachelor of Arts in the Department of Biochemistry to be taken July 2014 Title: An Analysis of Ancestral Sequence Resurrection in the Context of Guanylate Kinase Evolution Approved: --Lb:S-___C______:. \)_~--· _ _ Kenneth E. Prehoda Ancestral sequence resurrection (ASR) is an important tool for studying evolution on a molecular scale. The process takes a broad range of extant samples and, using sequence alignment and evolutionary prediction algorithms, determines the most likely sequence to have evolved into modern-day proteins. While ever-improving technologies allow for increasingly reliable predictions, it is impossible to prove whether a reconstruction is in fact the true ancestor. This project will analyze the fidelity of the ASR process in the context of the divergence of enzymatically inactive guanylate kinase-like binding domains and enzymatically active guanylate kinases from a common ancestor. A maximum likelihood ancestor has already been predicted, so by comparing relative enzymatic activity of this ancestor, a variety of mutants, Bayesian predictions, and extant enzymes, we will be able to assess the validity of ASR for this billion-year-old evolutionary event. ii Acknowledgements I would like to thank Professors Prehoda and Southworth and Dr. Hetrick for helping me to fully examine the intricacies of protein evolution on a molecular scale, and for providing a diverse set of perspectives for framing my research. I would also like to thank all the professors, faculty, and staff that have given me the foundation of knowledge and skills required to complete an undergraduate thesis, as well as my fellow researchers within the Prehoda Lab for their guidance, wisdom, and encouragement through my research. Finally, I would like to thank my family for their constant support and encouragement, and for providing the ability to attend this wonderful institution. iii Table of Contents Introduction 1 Background 2 General Information 2 Ancestral Sequence Resurrection (ASR) 6 Membrane Associated Guanylate Kinases (MAGUK) 9 Enzyme Kinetics 14 Results 20 Catalytic Turnover (kcat) of Ancestral Guanylate Kinases 20 “Substrate Affinity” (KM) of Ancestral Guanylate Kinases 21 Catalytic efficiency (kcat/KM) of Ancestral Guanylate Kinases 22 Discussion & Conclusion 24 Mean and Variance data show that kcat and KM are statistically comparable 24 Analysis of individual point mutations can provide insight into their negligible impact on overall catalytic efficiency 24 Kinetics 26 Conclusion 26 Materials and Methods 28 Plasmid construction, expression, and purification 28 Enzymatic activity coupled assay 29 Future Directions 31 Appendix A – Protein Concentrations 32 Appendix B: Compiled Enzyme Data 33 Appendix C: Reconstruction Facts and Values 34 Individual Mutations 34 Additional Protein Characteristics: Bayesian 35 Appendix D: Phylogenetic Tree 37 Works Cited 38 iv List of Figures Figure 1: The (simplified) Central Dogma of Molecular Biology 2 Figure 2: Transcription and Translation 4 Figure 3: A General Overview of Ancestral Sequence Resurrection 8 Figure 4: Reaction Catalyzed by Guanylate Kinase Enzymes 10 Figure 5: MAGUK Domain Architecture 11 Figure 6: Characterization of the Likelihood of AncGK0 12 Figure 7: A Characteristic Michaelis-Menten Kinetics Plot 16 Figure 8: A Characteristic Sigmoidal Plot for an Allosteric Enzyme 18 Figure 9: Compiled Enzyme Turnover Data 21 Figure 10: Compiled Substrate Affinity Data 22 Figure 11: Compiled Catalytic Efficiency Data 23 v Introduction Approximately a billion years ago, a chromosomal duplication event occurred in a common ancestor of Choanoflagellates and Metazoa1, creating two copies of the gene coding the guanylate kinase (GK) enzyme. Over the course of the next billion years, these initially identical genes diverged to develop unique functions which still exist today, one branch maintaining its enzymatic ability, the other losing kinase activity but simultaneously gaining a protein binding function. Through the process of ancestral sequence resurrection (ASR), the amino acid sequence of this ancient protein was statistically predicted to help elucidate the process by which this divergence occurred, and this paper follows a rigorous mutagenic analysis of the resurrection to determine its relevance to the guanylate kinase system. 1 de Mendoza, 2010 Background General Information2 This paper will be working with the expectation of a rudimentary knowledge of biological processes. While far from exhaustive, the following explanation will hopefully provide enough information to follow the study. At the core of all cellular processes lies the central dogma of molecular biology, which generally describes the flow of genetic information and for the purpose of this study can be simplified to say ‘DNA leads to RNA leads to proteins’. Figure 1: The (simplified) Central Dogma of Molecular Biology In a simplistic view, this figure describes the central dogma of molecular biology, which generally states, “DNA produces RNA produces proteins”. The figure also indicates that DNA can remake itself. It does not show reverse transcription, another process whereby DNA can be synthesized using and RNA template, nor does it consider a large variety of other RNA function. While in actuality the theory is far more nuanced, this hits the following important points. Deoxyribonucleic acid (or DNA) serves as the information carrier of the cell, coding all a cell needs to survive and reproduce in countless different environments. It is the genetic material that is passed from generation to generation and ensures that life will continue beyond the span of a single organism’s time. During the course of an organism’s life, this genetic information is copied to ribonucleic acid (RNA) through the process of transcription, and this RNA then acts as the messenger, carrying 2 Background information adapted from Voet & Voet 2 information to the ribosome. Once at the ribosome RNA is translated into proteins. As the words imply, transcription (conversion of DNA into RNA) is a process of copying – both DNA and RNA are composed of nucleic acids. DNA cannot leave the nucleus and RNA can, so the genetic information is converted to RNA to reach the cytoplasmic ribosomes. On the other hand, translation (as the name implies) takes the nucleotide sequence and changes it into amino acids, which are structurally significantly different from nucleotides. 3 A B Figure 2: Transcription and Translation3 A. The flow of genetic information begins in the nucleus where DNA is transcribed into mRNA. This mRNA then leaves the nucleus and is transported to the ribosome, where B. it is translated into a protein chain. Three-nucleotide codons indicate which amino acid to add based on the mRNA sequence. These proteins are the functional end of this genetic flow, and will carry out a nearly unimaginable range of functions depending on their makeup. To use a business analogy, DNA is high management, making decisions and passing them along to RNA, 3Voet & Voet 4 the middle management, which then carries orders and oversees the proteins performing the work itself. As mentioned above, DNA and RNA are both chains of nucleic acids made up of sequences of five subtly different nucleotides – adenine, guanine (purines), thymine, cytosine, and uracil (pyrimidines). Proteins are made up of chains of 20 different natural amino acids, which allows these molecules much greater functional diversity than strings of nucleotides. During translation, unique groups of three nucleotides, known as codons, are read by the ribosome to represent individual amino acids – for example, a sequence of the three nucleotides guanine, adenosine, cytosine (GAC for short) would be translated into an aspartic acid amino acid while the subtly different ACC codon translates to a dramatically different alanine. A protein’s function relies heavily on both the linear sequence of amino acids, known as primary structure, and the 3D shape into which it folds, or secondary and tertiary structures, which are largely dictated by the primary structure. At the core, this folding process is all about minimizing energy to create the most energetically stable structure, and this stability is largely determined by the chemical properties of substituent amino acids, including size, rigidity, pKa, and hydrophobicity. Folding also gives rise to functional domains within a protein, such as hydrophobic cores (regions of largely organic, non-polar amino acids that clump together to keep away from water) and enzymatic active sites (groups of amino acids that create chemical microenvironments which help catalyze a chemical reaction), to name a few. Different combinations of these amino acids, and the resulting shapes into which they fold, lead to the nearly unimaginable range of protein functions observed within even the least complex forms of life. 5 Ancestral Sequence Resurrection (ASR) Ancestral Sequence Resurrection is a method of statistically predicting and creating the ancestors of extant proteins. ASR as a process was first hypothesized in the 1960’s, but for several decades technology could not make it a viable reality. In

An Analysis of Ancestral Sequence Resurrection in the Context of Guanylate Kinase Evolution

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support