The Alanine World Model for the Development of the Amino Acid Repertoire in Protein Biosynthesis
Total Page:16
File Type:pdf, Size:1020Kb
International Journal of Molecular Sciences Review The Alanine World Model for the Development of the Amino Acid Repertoire in Protein Biosynthesis Vladimir Kubyshkin 1,* and Nediljko Budisa 1,2,* 1 Department of Chemistry, University of Manitoba, Dysart Rd. 144, Winnipeg, MB R3T 2N2, Canada 2 Department of Chemistry, Technical University of Berlin, Müller-Breslau-Str. 10, 10623 Berlin, Germany * Correspondence: [email protected] (V.K.); [email protected] or [email protected] (N.B.); Tel.: +1-204-474-9321 or +49-30-314-28821 (N.B.) Received: 24 September 2019; Accepted: 3 November 2019; Published: 5 November 2019 Abstract: A central question in the evolution of the modern translation machinery is the origin and chemical ethology of the amino acids prescribed by the genetic code. The RNA World hypothesis postulates that templated protein synthesis has emerged in the transition from RNA to the Protein World. The sequence of these events and principles behind the acquisition of amino acids to this process remain elusive. Here we describe a model for this process by following the scheme previously proposed by Hartman and Smith, which suggests gradual expansion of the coding space as GC–GCA–GCAU genetic code. We point out a correlation of this scheme with the hierarchy of the protein folding. The model follows the sequence of steps in the process of the amino acid recruitment and fits well with the co-evolution and coenzyme handle theories. While the starting set (GC-phase) was responsible for the nucleotide biosynthesis processes, in the second phase alanine-based amino acids (GCA-phase) were recruited from the core metabolism, thereby providing a standard secondary structure, the α-helix. In the final phase (GCAU-phase), the amino acids were appended to the already existing architecture, enabling tertiary fold and membrane interactions. The whole scheme indicates strongly that the choice for the alanine core was done at the GCA-phase, while glycine and proline remained rudiments from the GC-phase. We suggest that the Protein World should rather be considered the Alanine World, as it predominantly relies on the alanine as the core chemical scaffold. Keywords: amino acids; genetic code; RNA World; Protein World; alanine; proline 1. The Amino Acid Code Life on Earth is made with the help of biopolymeric molecules of different kinds: linear polymers such as nucleic acids, proteins, and branched chains of polysaccharides. The process of life is characterized by a diverse set of interactions and strong dependencies between these molecules. For example, the synthesis of a cellulose (polymeric carbohydrate) is made by cellulose synthase (protein), a stretch of DNA (nucleic acid) is made by DNA polymerase (protein), and any full-length protein is made on the ribosome, which is essentially composed of RNA (nucleic acid). The relationships become even more complex considering that some of these syntheses are templated: a DNA is made on another DNA template (this process is called replication), an RNA is made on a DNA template (transcription), and a protein is made on an RNA template (translation). A templated synthesis implies that the sequence of the source polymer corresponds to the sequence of the outcome polymer with a certain rule. As a result, the source polymer can be considered as the one containing information about the outcome polymer’s content and its properties and the latter essentially execute the information. Therefore, a DNA (or RNA) is also called an informational polymer, whereas a protein is called an executive polymer. Int. J. Mol. Sci. 2019, 20, 5507; doi:10.3390/ijms20215507 www.mdpi.com/journal/ijms Int. J. Mol. Sci. 2019, 20, 5507 2 of 17 Int.What J. Mol. are Sci. 2019 the, constraints20, x FOR PEER inREVIEW thechoice of basic building blocks (monomers) for these2 of transient 18 polymeric structures? In other words, to what extent was monomer appearance inevitable for life on Earth andWhat how are likely the constraints is it to expect in the this choice component of basic building to appear blocks in a(monomers) living organism for these elsewhere transient in the universepolymeric [1]? Thisstructures? question In other is especially words, to frequentlywhat extent asked was monomer with regards appearance to amino inevitable acid composition for life on [2]. ThereEarth are and 20 how+ 3 likely amino is acidit to expect residues this thatcomponen are templatedt to appear in a the living ribosomal organism synthesiselsewhere ofin the proteins universe [1]? This question is especially frequently asked with regards to amino acid composition [2]. (coded into proteins). How did these amino acids end up in the protein code is a controversial There are 20 + 3 amino acid residues that are templated in the ribosomal synthesis of proteins (coded matter that has been addressed via the analysis of the codon structures, RNAs, metabolic and into proteins). How did these amino acids end up in the protein code is a controversial matter that prebiotichas been chemistries, addressed physiochemical via the analysis propertiesof the codon of aminostructures, acids, RNAs, and othermetabolic chemical and prebiotic and physical perspectiveschemistries, [3– physiochemical6]. While questioning properties the of amino amino acid acids, repertoire, and other wechemical tend to and overlook physical that perspectives the amino acid identities[3–6]. While represent questioning only one the aspect amino of acid the repertoire, protein architecture: we tend to overlook the primary that the sequence. amino acid There identities is however morerepresent to that: only secondary one aspect structures of the protein (e.g., αarchitecture:-helix and βthe-sheet), primary motifs sequence. (hydrophobic, There is however salt bridges, more etc.), andto tertiary that: secondary folding (globularstructures fold(e.g., andα-helix membrane and β-sheet), proteins) motifs (Figure(hydrophobic,1). Their salt appearancebridges, etc.), in and protein architecturetertiary folding is rarely (globular discussed fold or and questioned. membrane Meanwhile, proteins) (Figure this isexactly 1). Their what appearance the amino in acidprotein sequence is supposedarchitecture to determine.is rarely discussed Why door wequestioned. not turn Meanwhile, our perspective this is and exactly ask what ourselves: the amino is it inevitableacid sequence is supposed to determine. Why do we not turn our perspective and ask ourselves: is it that proteins should span the membrane when making channels, pumps, or receptors? Or another inevitable that proteins should span the membrane when making channels, pumps, or receptors? Or question: does the α-helix/β-sheet pair have any alternatives in forming the main protein constituting another question: does the α-helix/β-sheet pair have any alternatives in forming the main protein element?constituting Any kind element? of answer Any kind will of provide answer an will absolutely provide an new absolutely and unique new and view unique on the view set ofon the the amino acidsset involved of the amino in protein acids involved biogenesis. in protein In this biogenesis. article, we In will this usearticle, this we perspective will use this to perspective propose a viewto on the aminopropose acid a view repertoire on the amino establishment. acid repertoire establishment. FigureFigure 1.1. The hierarchy hierarchy of of protein protein folding. folding. 2. Amino2. Amino Acids Acids Do Do Not Not Equal Equal Proteins Proteins AminoAmino acids acids play play multiple multiple metabolic, metabolic, energetic, energetic, and and structural structural rolesroles inin biochemistry,biochemistry, andand forfor many of themmany participation of them participation in protein in protein translation translation accounts accounts for a for rather a rather small small part part of theirof their cellular cellular activity. Thisactivity. implies This that implies amino that acids amino should acids have should entered have biochemistryentered biochemistry at least at twice: least twice: first, first, as metabolic as entities,metabolic and later,entities, as monomericand later, as constituents monomeric forconstituents ribosomally for producedribosomally proteins. produced In proteins. between In was the between was the RNA World. The hypothesis of the RNA World follows directly from the Central RNA World. The hypothesis of the RNA World follows directly from the Central Dogma of molecular Dogma of molecular biology, and it postulates that there was a time, when biochemistry was driven biology,with andthe catalysis it postulates of RNA that molecules there was [7–9]. a time, Whet whenher there biochemistry were any protein was driven molecules with in thethe catalysisRNA of RNAWorld molecules phase [of7– before,9]. Whether we do therenot know. were We any cannot protein know molecules this for insure, the and RNA will World remain phase agnostic of before, we doregarding not know. the molecular We cannot components know this that for are sure, not tr andackable will in remain the existing agnostic biochemistry. regarding Only the those molecular componentsproteins that that were are not produced trackable by th ine theribosome existing are biochemistry. coded by genes, Only thus those they proteinscan be phylogenetically that were produced by theanalyzed ribosome at the are present coded days. by genes, This suggests thus they that can protein be phylogenetically biogenesis and the analyzed canonical at theamino present acid