Functional Proteins from Short Peptides: Dayhoff's Hypothesis Turns 50

Angewandte Chemie, Essays: Origin of Proteins
International Edition: DOI: 10.1002/anie.201609977
German Edition: DOI: 10.1002/ange.201609977
Angew. Chem. Int. Ed. 2016, 55, 15966–15971. © 2016 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim

M. Luisa Romero Romero, Avigayel Rabin, and Dan S. Tawfik*

Keywords: Margaret O. Dayhoff · peptides · proteins · protein evolution · origin of life

[*] Dr. M. L. Romero Romero, Prof. D. S. Tawfik
Department of Biomolecular Sciences, The Weizmann Institute of Science, Rehovot 76100 (Israel)
E-mail: [email protected]
A. Rabin
Current address: Department of Biological Chemistry, the Alexander Silberman Inst. of Life Sciences, the Hebrew University of Jerusalem, Edmond J. Safra Campus, Jerusalem 91904 (Israel)
Supporting information for this article can be found under: http://dx.doi.org/10.1002/anie.201609977.

Maria Luisa Romero Romero studied chemistry at the University of Granada where, in 2012, she obtained her Ph.D. in the group of Prof. Sanchez-Ruiz. She continued as a Postdoctoral Researcher in Dan Tawfik's group at the Weizmann Institute of Science. Her current work is focused on the earliest stage of protein evolution.

Avigayel (Hefter) Rabin obtained her B.Sc. in computer science at Jerusalem College of Technology (Machon Tal), during which period she also carried out bioinformatics research in Dan Tawfik's group. She is currently pursuing a Ph.D. at the Hebrew University of Jerusalem, in Dr. Sebastian Kadener's lab, and her main interest is the evolution of circRNA biogenesis.

Dan S. Tawfik's research is focused on understanding how proteins, and enzymes in particular, evolve. The group studies the structural, biophysical, genetic, and biochemical aspects of protein evolution, examining a variety of case studies ranging from the emergence of pesticide-degrading

1. Introduction

Proteins are unlikely to have emerged as the large, complex globular forms we know today. First, the primordial protein synthesis machinery would inevitably have been inefficient, and the earliest step in protein evolution may even have been the abiotic emergence of short peptides. Second, the de novo emergence of functional proteins through random condensation of amino acids demands an improbably extensive exploration of sequence space. We may assume a minimal set of primordial, abiotic amino acids,[1] even a "binary" system of polar/charged or hydrophobic[2] plus glycine for connecting loops. Nonetheless, the probability of the emergence of a typical protein domain that can carry out enzymatic functions (ca. 100 amino acids) seems dauntingly low ($1/3^{100} \approx 10^{-47}$). Hence, the first functional proteins presumably originated from short polypeptides. However, since short polypeptide segments cannot function in the same way that globular proteins do,[3] supramolecular self-assemblies may have provided peptides with the operative volume and network of interactions that are necessary, particularly for enzymatic functions. The first self-reproducing macromolecules are likely to have been ribonucleic acids (RNAs). Peptides appeared later, probably as cofactors that stabilized RNAs and augmented their catalytic capabilities.[4] Short functional peptides are therefore highly likely to have served as crucial intermediates between a primordial RNA world and the extant protein world.

In fact, the birth of new proteins is not rare. Certain classes of proteins, typically disordered proteins, or ordered all-beta proteins (e.g. β-propellers),[5] are constantly emerging. These new proteins may originate de novo from non-coding sequences or from sequences that encode proteins belonging to a different fold.[6] But how would relatively short peptides give rise to folded, globular, and functional proteins? Fifty years ago, Margaret O. Dayhoff pioneered the key hypothesis regarding the evolution of the first protein forms from short, simple peptides.

2. Margaret Oakley Dayhoff

Born in Philadelphia on 11 March 1925, Dayhoff grew up in New York City (Figure 1). Already in her early academic life, she stood out as an exceptional student, and was awarded a scholarship to New York University. In 1945, she graduated magna cum laude in mathematics and three years later she obtained a Ph.D. in quantum chemistry at Columbia University. Her thesis described pioneering steps in computational chemistry: using punch card machines, she calculated the resonance energies of organic molecules.[7] She further explored problems in theoretical chemistry, first at the Rockefeller Institute for Medical Research and later at the University of Maryland. In the 1960s, she became associate director of the National Biochemical Research Foundation, and in the early 1970s, she became a professor of physiology and biophysics at Georgetown University Medical Center.[8] She was the first woman to hold office in the Biophysical Society, and the first person to become both Secretary and President. Dayhoff was also a member of the American Association for the Advancement of Science, and in 1980, she became a councillor for the International Society for the Study of the Origins of Life. She served on the editorial boards of the journals DNA, Journal of Molecular Evolution, and Computers in Biology and Medicine.

Figure 1. Margaret Oakley Dayhoff (1925–1983). A picture taken in 1980 (owned by her daughter, Ruth E. Dayhoff, and made available by the National Library of Medicine).

Throughout her career, Dayhoff met challenges that working women face to this day, especially in fields largely dominated by men. For example, she had to seek a new position after pregnancy and raising her two daughters.[8a] Nonetheless, her scientific achievements are remarkable and long-lasting. In her honour, the Biophysical Society established the Margaret Oakley Dayhoff Award in 1984, which is given to outstanding women researchers in the early stages of careers in biophysical research.

Dayhoff was fascinated by the origin of life and decided to address it by using computational analysis.[9] She developed thermodynamic models to study the prebiotic planetary atmosphere. Foremost, she established methods to infer evolutionary relationships through the comparison of protein sequences,[10] a ground-breaking contribution that marked the start of bioinformatics as a scientific discipline. In 1965, she initiated the first protein database, collecting the 70 known protein sequences at that time in the Atlas of Protein Sequence and Structure.[11] This was not an easy task because of the possessive attitudes held by many scientists regarding sequence data. Crucially, Dayhoff derived the first probability model of protein evolution that enabled the reconstruction of protein phylogenetic trees, the so-called PAM model (a rearranged acronym for accepted point mutation). She thus described the first reconstruction of a phylogenetic tree, and applied such trees to provide the first support of symbiosis as the origins of eukaryotes.[12] She introduced the concept of protein superfamilies and the hierarchical organization of related sequences,[13] thus initiating all modern protein-categorizing databases such as CATH or SCOP. Anyone analysing protein sequence, structure, or evolution is following in Dayhoff's giant footsteps (not least when using the single-letter amino acid code she began to introduce in 1965).

In 1966, following the publication of the very first sequence of ferredoxin, Dayhoff, together with Richard Eck, published a paper in Science noting the internal sequence symmetry in ferredoxin and the profound implications of this symmetry.[14] Ferredoxin is a 55-residue protein containing key inorganic components in the form of iron–sulphur clusters that mediate its electron-transfer functions. Eck and Dayhoff considered ferredoxin to be a living protein fossil, not only because of its rudimentary function, but also because of its internal sequence symmetry. The symmetry they observed in the ferredoxin sequence known at the time (from Clostridium pasteurianum; Figure 2a) led them to propose that ferredoxin, and possibly other proteins, evolved through tandem duplications of a shorter protein, which itself may have emerged through the duplication of an even shorter and simpler ancestral peptide. By simpler, they meant a composition of amino acids that could largely form spontaneously, that is, through abiotic chemical and physical processes.

Remarkably, there were only a few ferredoxin sequences available, which were obtained through amino acid sequencing; DNA sequencing would emerge more than 10 years later. Furthermore, the 3D structure of ferredoxin was unknown, although the structure, once solved, revealed a strikingly high degree of symmetry in the tertiary structure with respect to the two halves (Figure 2b).[15] However, numerous scientific hypotheses are proven invalid with time, often when the second or third model cases are examined. Still, the hundreds of sequences of ferredoxin that followed the few available to Dayhoff and Eck showed the same pattern of internal symmetry.[16] In fact, using the tools pioneered by Dayhoff, consistently higher internal symmetry can be observed in the inferred ancestral ferredoxin sequences compared to the extant ones (Figure 2c). Dayhoff and Eck's hypothesis of a simple (abiotic) composition for the ancestral peptide is also supported by this analysis (inset, Figure 2c). Finally, functional ferredoxin-like proteins have been experimentally obtained through the self-assembly of peptides as short as 16 amino acids.[17] The small size of these so-called maquettes
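The notion of internal sequence symmetry discussed above can be illustrated with a minimal Python sketch. It assumes the simplest possible measure: split a sequence into two equal halves and count identical residues at matching positions, with no gaps or substitution scoring. The function name `half_identity` and the toy sequence are hypothetical and chosen only for illustration; this is not the procedure used by Eck and Dayhoff or in the analysis behind Figure 2, which would rely on proper alignment.

```python
def half_identity(sequence: str) -> float:
    """Fraction of positions that are identical between the two halves.

    Assumption: an ungapped, position-by-position comparison. If the
    length is odd, the final residue is ignored.
    """
    half = len(sequence) // 2
    if half == 0:
        raise ValueError("sequence must contain at least two residues")
    first, second = sequence[:half], sequence[half:2 * half]
    matches = sum(a == b for a, b in zip(first, second))
    return matches / half


# Toy sequence (hypothetical, NOT ferredoxin): a 6-residue unit duplicated
# in tandem, with one substitution (K -> R) in the second copy, so that
# 5 of the 6 aligned positions match.
toy = "ACDGKV" + "ACDGRV"
print(half_identity(toy))
```

A sequence that arose through tandem duplication and then diverged would be expected to score well above an unrelated pair of segments under such a measure, which is the qualitative signal the text describes.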

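The sequence-space estimate quoted in the Introduction can be checked with a short Python calculation. The sketch assumes that each of the 100 positions is drawn independently and uniformly from a three-class alphabet (polar/charged, hydrophobic, glycine) and that exactly one sequence counts as a success; the helper name `log10_probability` is hypothetical. The 20-letter case is added here only for comparison and does not appear in the essay.

```python
import math


def log10_probability(alphabet_size: int, length: int) -> float:
    """log10 of the chance of drawing one specific sequence at random.

    Assumption: positions are independent and every letter of the
    alphabet is equally likely.
    """
    return -length * math.log10(alphabet_size)


# Three-class alphabet, 100 residues: 1/3**100
print(round(log10_probability(3, 100), 1))   # -47.7

# Comparison (not from the essay): the modern 20-letter alphabet
print(round(log10_probability(20, 100), 1))  # -130.1
```

Working in log space avoids floating-point underflow for long sequences and makes the order of magnitude directly readable.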