De Novo Nucleic Acids: A Review of Synthetic Alternatives to DNA and RNA That Could Act as Bio-Information Storage Molecules †
Life
Review

Kevin G Devine 1 and Sohan Jheeta 2,*

1 School of Human Sciences, London Metropolitan University, 166-220 Holloway Rd, London N7 8BD, UK; [email protected]
2 Network of Researchers on the Chemical Evolution of Life (NoR CEL), Leeds LS7 3RB, UK
* Correspondence: [email protected]
† This paper is dedicated to Professor Colin B Reese, Daniell Professor of Chemistry, King's College London, on the occasion of his 90th Birthday.

Received: 17 November 2020; Accepted: 9 December 2020; Published: 11 December 2020

Abstract: Modern terran life uses several essential biopolymers, such as nucleic acids, proteins and polysaccharides. The nucleic acids, DNA and RNA, are arguably life's most important, acting as the stores and translators of genetic information contained in their base sequences, which ultimately manifest themselves in the amino acid sequences of proteins. But just what is it about their structures, an aromatic heterocyclic base appended to a (five-atom ring) sugar-phosphate backbone, that enables them to carry out these functions with such high fidelity? In the past three decades, leading chemists have created in their laboratories synthetic analogues of nucleic acids which differ from their natural counterparts in three key areas: (a) replacement of the phosphate moiety with an uncharged analogue, (b) replacement of the pentose sugars ribose and deoxyribose with alternative acyclic, pentose and hexose derivatives and, finally, (c) replacement of the two heterocyclic base pairs adenine/thymine and guanine/cytosine with non-standard analogues that obey the Watson–Crick pairing rules. This manuscript will examine in detail the physical and chemical properties of these synthetic nucleic acid analogues, focusing in particular on their abilities to serve as conveyors of genetic information.
If life exists elsewhere in the universe, will it also use DNA and RNA?

Keywords: non-standard nucleic acids; sugar-phosphate backbone; pentose sugars; hexose derivatives; phosphate group replacement; alien life forms

Life 2020, 10, 346; doi:10.3390/life10120346; www.mdpi.com/journal/life

1. Introduction

Life on Earth uses three key biopolymers, namely nucleic acids, proteins and polysaccharides, each of which possesses intrinsic structural features. Nucleic acids are polymers comprising heterocyclic aromatic bases appended to a sugar-phosphate backbone, held together by phosphodiester bonds. Proteins are polymers of amino acids linked via amide bonds, known as peptide bonds [−C(=O)N(H)−], and polysaccharides are polymers of carbohydrates linked via acetal ether bonds. These unique chemical features have long fascinated organic chemists and stimulated the most creative minds among them to question nature's choices and, indeed, to design and test alternatives using the power of laboratory-based synthetic organic chemistry. This paper will focus entirely upon re-designed nucleic acids, which feature three key structural modifications of their natural counterparts: (a) replacement of the phosphate moiety with an uncharged analogue, (b) replacement of the pentose sugars ribose and deoxyribose with alternative acyclic, pentose and hexose derivatives and, finally, (c) replacement of the two heterocyclic base pairs adenine/thymine and guanine/cytosine with non-standard analogues that obey (or disobey) the Watson–Crick pairing rules.
As will be shown, the results are indeed intriguing and have profound consequences for the development of artificial Darwinian chemical systems, and the discovery of life, if it exists, elsewhere in the Universe. This is an example of how the synthesis paradigm can drive discovery and understanding in ways that analysis of the natural world alone cannot. Just as the most skilled mechanical, electrical and software engineers can build modern automobiles, aircraft and super-computers, creations whose intricate inner workings they fully understand, so, it is hoped, the new generation of synthetic biologists will be able to manufacture, and thus fully understand, artificial life forms built from different biopolymers to those found in nature.

1.1. Nucleic Acid Structure

Nucleic acids are biopolymers that are built from nucleotides.
The latter consist of three molecular components: a heterocyclic aromatic base (also known as a nucleobase), a five-atom ring sugar, and a phosphate unit that connects the sugars together and forms the alternating sugar-phosphate backbone. The structures of the nucleobases, the two sugars and the nucleobase-sugar conjugates, which are known as nucleosides, are shown in Figure 1. DNA differs from RNA in two distinct ways: the sugar is 2′-deoxyribose instead of ribose, and the pyrimidine base thymine has a methyl group attached at the 5-position where uracil has hydrogen, so, technically speaking, thymine is 5-methyluracil.

Figure 1. The molecular structures of the nucleobases, sugars and nucleosides found in DNA and RNA.

The numbering systems are distinct for the nucleobases and sugars, with the latter using affixed "primed" numbers 1′-5′ to distinguish them (Figure S1a, in the Supplementary Materials).
Oligonucleotides, or polynucleotides, are polymers made from nucleosides that are linked via their 3′- and 5′-oxygen atoms by phosphate groups (Figure S1b, in the Supplementary Materials). The sequence of bases is read from the 5′-end to the 3′-end (i.e., 5′→3′).

The base sequence of an oligonucleotide is important because genetic information is stored in the sequence of these bases in a DNA (or RNA) molecule. The key storage unit for genetic information in most organisms is not a single-stranded DNA oligomer; instead, it is two complementary strands. These strands are held together by base-pairs on opposite strands which follow two complementary principles: size and hydrogen-bonding complementarity. In size complementarity, a large 9-atom bicyclic purine base (adenine and guanine) pairs with a small, 6-atom ring pyrimidine base (uracil/thymine and cytosine).
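The 5′→3′ reading convention and the purine/pyrimidine size rule can be illustrated with a minimal Python sketch. It is offered only as an illustration and is not taken from the review: the names `PURINES`, `PYRIMIDINES`, `WATSON_CRICK_DNA`, `is_size_complementary` and `complementary_strand` are hypothetical, single-letter codes stand for the bases, and the sketch assumes the standard adenine/thymine and guanine/cytosine pairs together with the well-known antiparallel arrangement of the two strands in a DNA duplex.

```python
# Illustrative sketch only: all names here are hypothetical, not from the review.
# Bases are represented by their usual single-letter codes.
PURINES = {"A", "G"}           # large, 9-atom bicyclic bases
PYRIMIDINES = {"C", "T", "U"}  # small, 6-atom ring bases

# Standard Watson-Crick partners in DNA (adenine/thymine, guanine/cytosine).
WATSON_CRICK_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def is_size_complementary(base_a: str, base_b: str) -> bool:
    """Return True when one base is a purine and the other a pyrimidine."""
    return (base_a in PURINES and base_b in PYRIMIDINES) or (
        base_a in PYRIMIDINES and base_b in PURINES
    )


def complementary_strand(strand_5_to_3: str) -> str:
    """Return the partner DNA strand, also written 5'->3'.

    Assumes the two strands of a duplex run antiparallel, so the input
    is reversed before each base is swapped for its partner.
    """
    return "".join(WATSON_CRICK_DNA[base] for base in reversed(strand_5_to_3))
```

Under these assumptions, `complementary_strand("ATGC")` returns `"GCAT"`, and every pair in `WATSON_CRICK_DNA` passes `is_size_complementary`, whereas a purine/purine combination such as A with G does not.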
In hydrogen-bonding complementarity, hydrogen bond donors (N-H bonds) on one base interact with the hydrogen bond acceptors, generally lone pairs of electrons on N or O atoms, on its partner in the opposite strand. In this way, an adenine on one strand pairs with a thymine (or uracil in RNA) in another,