Resurrecting Ancient Genes: Experimental Analysis of Extinct Molecules


REVIEWS

Joseph W. Thornton

Center for Ecology and Evolutionary Biology, University of Oregon, Eugene, Oregon 97403-5289, USA.
doi:10.1038/nrg1324
Nature Reviews Genetics, Volume 5, May 2004, p. 366 (www.nature.com/reviews/genetics)

There are few molecular fossils: with the rare exception of DNA fragments preserved in amber, ice or peat, no physical remnants preserve the intermediate forms that existed during the evolution of today’s genes. But ancient genes can now be reconstructed, expressed and functionally characterized, thanks to improved techniques for inferring and synthesizing ancestral sequences. This approach, known as ‘ancestral gene resurrection’, offers a powerful new way to empirically test hypotheses about the function of genes from the deep evolutionary past.

The evolution of gene function is a central issue in molecular evolution: by what mechanisms and dynamics have the diverse functions of modern-day genes emerged? This question is usually addressed by inferring past processes from extant patterns, using statistical methods to detect the traces of ancient evolutionary events in the sequences of modern-day genes (see, for example, REFS 1,2). Thanks to the recent surge in knowledge of structure–function relationships, evolutionists can better interpret indicative patterns (such as biases or changes in evolutionary rates) by focusing on specific parts of gene sequences that are most likely to change gene function (for some examples, see REFS 3,4). Despite the important insights that the statistical approach has made possible, however, this method remains inferential, without empirical tests to refute or corroborate the evolutionary hypotheses that sequence patterns suggest.

Recent advances in phylogenetics and DNA-synthesis techniques have made it possible to experimentally test molecular evolutionary hypotheses by resurrecting ancient genes in the laboratory. To resurrect a gene, the sequence of an ancient protein is inferred using phylogenetic methods, a DNA molecule coding for that protein is synthesized, the extinct protein is expressed in vitro or in cultured cells and its functions are assayed using molecular techniques. Half a dozen recent publications have reported the resurrection of ancestral genes from the distant evolutionary past, including genes from the last common ancestors of bacteria, of BILATERIAN animals and of vertebrates. These studies have shed light on fascinating questions about primordial environmental adaptations and the evolution of crucial gene functions.

Here, I review ancestral gene resurrection, the technical advances that have made it feasible and the studies that have applied it. I discuss the historical development of the technique and its methodological basis, with an emphasis on the previously unattainable insights it has allowed. I also highlight the limitations and pitfalls of this strategy, which should be kept in mind when designing studies and interpreting results, and I conclude with suggestions for future extensions and applications of the technique.

How to raise a gene from the dead

A gene resurrection study, as with all good science, begins with a question: in this case, one that could be answered if we knew the functions of the ancestral or intermediate forms that existed during the evolution of modern-day genes. For example, a question about the physiological or biochemical traits of an extinct species could be answered by resurrecting and studying the ORTHOLOGUE, in that species, of a gene that determines the trait in modern-day species. If the question concerns how a gene family and its functions diversified, then the ancestral gene that gave rise to the family through gene duplications can be resurrected and characterized, as can ancestors from intermediate points during the proliferation of that family.

Resurrecting an ancestral gene involves five steps (FIG. 1). In the first step, sequences that are descended from the ancestral gene are obtained and aligned, along with OUTGROUP SEQUENCES, and the tree of their relationships is inferred (or imposed if it is known a priori). Amino-acid sequences are typically used because they contain less ‘noise’ than DNA sequences, which are more subject to convergence and reversal. Phylogenetic methods, such as maximum parsimony or maximum likelihood (ML; see later sections), are then used to infer the best estimate of the ancestral state for each sequence site given the present-day sequence data. In the second step, a DNA sequence that codes for the ancestral protein is inferred on the basis of the genetic code; CODON BIAS for the system in which the gene will be expressed can be introduced to improve the translation rate. This coding sequence is produced de novo (the third step) by synthesizing overlapping oligonucleotides and assembling them by PCR or by restriction digest/ligation; site-directed mutagenesis can be used if the ancestral sequence can be made by introducing only a few changes into an extant gene. In the fourth step, the ancestral gene is cloned into a plasmid that allows high-level expression, and the plasmid is then transfected into bacterial or mammalian cells in culture. Finally, the ancestral protein can be purified if necessary and its functions characterized using experimental tests such as reporter-gene expression assays, ligand- or substrate-BINDING ASSAYS, or assays that measure enzyme specificity and turnover rate.

Figure 1 | The ancestral gene resurrection strategy. Flow chart of the stages required to resurrect and characterize an ancestral gene. A hypothetical protein from the ancestral vertebrate is shown as an example. See main text for details of each step. mya, million years ago.

a  Infer phylogenetic tree from aligned sequences and determine best-fitting evolutionary model. The node of interest is the vertebrate ancestor (>400 mya).

    Human      CDGCKSFFKRAVEGACMLLILKCKCLIICHHSRRWKTKHCIQGDACEHSKFFKENRAVEMMEVAGEKL…
    Mouse      CDGCKTFFKRAVEGACMLLILKCKCLIICHHSRRYKTCHCIQGRACEHSKFFKENRAVEMMEVAGEKL…
    Chick      CDGCKAFFKKAVEGACMLLILKCKCLIICHHARRYKTCHCVQGRACEKSKFFKENRAVEMMEVAGEKL…
    Frog       CEGCKAFFKKSVEGACMLLILKCKCLIICHHARRYKTCHCIQGRACEKTDFFKENRAVEMMEVAGEKL…
    Zebrafish  CEGCKSFFKKSVDGACMLLILKCKCLIICHHARRYKTCHCIQGRACEKTKFFRENRAVEMMEVAGEKL…
    Shark      CEGCKSFFKKSVDPACMLLILKCKCLIICHHARRYKTCHCIQGRACEKTKFFKDNRSVEMMEVAGEKL…
    Lamprey    CEGCKSFFKKSMDPACMLLILKCKCLIICHHARRYKTCHCIQGRACEKTKFFKDNRALEMMEVAGEKL…
    Lancelet   CEGCKSFLKKSMDPACMLLILKCKCLIICHHTRRYKTCHCIQGRACEKTKFFKDNRAVEMMDVAGEKL…
    Urchin     CDGCKSFLKHTMDPACMFLILKCRCLIVCHHTRRYKTCHCIQGRSCEKTKFFKDNRAVEMMEVAGEKL…

b  Reconstruct protein sequence at ancestral node by maximum likelihood.

    Ancestral sequence  CEGCKSFFKKSMDPACMLLILKCKCLIICHHARRYKTCHCIQGRACEKTKFFKDNRAVEMMEVAGEKL

c  Synthesize oligonucleotides and assemble gene for ancestral protein by stepwise PCR.

d  Subclone assembled gene into vector, transform cultured cells and express ancestral protein.

e  Purify ancestral protein (if necessary) and characterize function using trans-activation, binding or other assay.

BILATERIAN: An animal that shows bilateral symmetry across a body axis. Bilaterians include chordates, arthropods, nematodes, annelids and molluscs, among other groups.

ORTHOLOGUES: The ‘same’ gene in more than one species. Orthologues descend from a speciation event.

OUTGROUP SEQUENCES: In phylogenetics, sequences that are known a priori to be more distantly related to the other sequences in the analysis (the ingroup sequences) than the ingroup sequences are to each other.

CODON BIAS: Preferential use of certain DNA codons over others that code for the same amino acid.

BINDING ASSAYS: A family of biochemical procedures used to determine the affinity and specificity with which a protein binds a specific ligand or substrate.

Modernizing gene resurrection techniques

In 1963, L. Pauling and E. Zuckerkandl prophesied that it would be possible one day to infer the gene sequences of ancestral species, to “synthesize these presumed components of extinct organisms … and study the physico-chemical properties of these molecules” (REF. 5). Not until 1990, however, were methods for ancestral sequence inference and DNA synthesis sufficiently mature to make the first such work possible (BOX 1). Five gene resurrection studies were published during the following seven years; all involved genes from the relatively recent past, such as transposons in the genomes of mice and salmonids, and digestive RNases in ancestral artiodactyls (TABLE 1).

A hiatus of more than half a decade followed, in which no new gene resurrection studies …

… maximum parsimony method, which was developed in the early 1970s but not applied to or adapted for gene sequences until the first phylogenetic computer programs emerged in the mid 1980s (BOX 1). Maximum parsimony was an important advance over consensus methods, in which the most common state among extant sequences is assumed to be the ancestral state, irrespective of phylogenetic relations. The consensus approach is extremely sensitive to the sample of sequences chosen, however, and it will always err if a change from the ancestral state occurred on a deep branch that subtended a highly speciose group (BOX 2). To take a morphological example, the consensus method would lead to the inference that the vertebrate ancestor had jaws, because there are far more species of TELEOSTS and tetrapods than there are of …
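
The contrast between consensus and maximum-parsimony inference described above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the tree shape, the taxon names (outgroup, a1, a2, b1 to b5) and the single-site states K and R are hypothetical, and Fitch's two-pass algorithm is used here simply as one standard maximum-parsimony procedure, not as the method of any study discussed in the review. A change from K to R is placed on the deep branch leading to the species-rich "b" clade, which is the situation in which the consensus approach is expected to err.

```python
from collections import Counter

# Hypothetical rooted, strictly bifurcating tree written as nested tuples.
# "b1".."b5" stand for a species-rich clade, "a1" and "a2" for a species-poor one.
TREE = ("outgroup", (("a1", "a2"), ("b1", ("b2", ("b3", ("b4", "b5"))))))
INGROUP = TREE[1]  # node whose ancestral state we want to infer

# Hypothetical amino-acid states at a single alignment site (illustrative only).
STATES = {"outgroup": "K", "a1": "K", "a2": "K",
          "b1": "R", "b2": "R", "b3": "R", "b4": "R", "b5": "R"}


def leaves(node):
    """Return the taxon names found below a node."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in leaves(child)]


def consensus_state(node, states):
    """Most common state among descendants, ignoring the tree shape."""
    counts = Counter(states[name] for name in leaves(node))
    return counts.most_common(1)[0][0]


def fitch_down(node, states, sets):
    """Bottom-up Fitch pass: intersection of child sets, else their union."""
    if isinstance(node, str):
        sets[node] = {states[node]}
    else:
        left, right = (fitch_down(child, states, sets) for child in node)
        sets[node] = (left & right) or (left | right)
    return sets[node]


def fitch_up(node, sets, parent_state, assigned):
    """Top-down pass: keep the parent's state whenever the node's set allows it."""
    options = sets[node]
    # min() is an arbitrary tie-break used only to keep the sketch deterministic.
    assigned[node] = parent_state if parent_state in options else min(options)
    if not isinstance(node, str):
        for child in node:
            fitch_up(child, sets, assigned[node], assigned)


def parsimony_state(tree, target, states):
    """Return one most-parsimonious state for the target node of the tree."""
    sets, assigned = {}, {}
    fitch_down(tree, states, sets)
    fitch_up(tree, sets, None, assigned)
    return assigned[target]


print("consensus:", consensus_state(INGROUP, STATES))        # consensus: R
print("parsimony:", parsimony_state(TREE, INGROUP, STATES))  # parsimony: K
```

In this toy case the consensus rule returns R only because the "b" clade contributes five of the seven ingroup sequences, whereas the parsimony reconstruction uses the outgroup and the tree shape to place a single K-to-R change on the branch leading to that clade. The sketch returns only one most-parsimonious assignment and handles a single site; a real analysis would treat every alignment column and report ambiguity.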