Predictions on and Analysis of Viral Proteins Encoded by Overlapping Genes

Total Page:16

File Type:pdf, Size:1020Kb

Predictions on and Analysis of Viral Proteins Encoded by Overlapping Genes PREDICTIONS ON AND ANALYSIS OF VIRAL PROTEINS ENCODED BY OVERLAPPING GENES Mahvash Khosravi Submitted to the faculty of the Bioinformatics Graduate Program in partial fulfillment of the requirements for the degree Master of Science in the School of Informatics, Indiana University August 2007 Accepted by the Faculty of Indiana University, in partial fulfillment of the requirements for the degree of Master of Science in Bioinformatics. Master’s Thesis Committee ___________________________ A. Keith Dunker, PhD, Chair ___________________________ Pedro Romero, PhD ___________________________ Vladimir N. Uversky, PhD ii Dedicated to the memory of my mother iii Acknowledgements I wish to express my gratitude to my thesis advisor Dr. A. Keith Dunker for his guidance, support and providing me the opportunity to work on this project. I would like to thank Dr. Pedro Romero for his advice and valuable input through the course of this research, as well as serving on my thesis committee. His guidance and encouragement is greatly appreciated. Also, I would like to thank Dr. Vladimir Uversky for serving on my thesis committee. Finally I would like to express my gratitude and love for my family. I am indebted to my husband Dr. Hossein Hariri for his endless love, dedication, patience and encouragement. Thank you Hossein. My deepest appreciation and love go to our beautiful daughters, our sweet engineer Ghazal and dear Homa who makes us so proud. iv Abstract Overlapping genes are adjacent genes that share a portion of their coding sequence. Such genes are often observed in the compact genomes of viruses, prokaryotes, and mitochondria. Overlapping genes are also seen in human and other mammalian genomes. Gene overlapping is a phenomenon to minimize genomic size and maximize encoding capacity. Overlapping genes produce different proteins. A major task in the post genomic era is the large-scale study of the structures and functions of proteins. Proteins play crucial roles in virtually all biological processes. In general it is assumed that 3-D structure determines the function of proteins, but many proteins or region of proteins may function in the absence of 3-D structure. The term “disordered” is used to describe these proteins. A large number of studies has shown that biological functions depend on both ordered and disordered proteins. Natively disordered regions are common and play essential roles in many proteins, especially, with regard to activities involved in signaling and regulation. The goal of this research was the analysis of the ordered and disordered tendencies of viral proteins encoded by overlapping genes. Our hypothesis is that, in a pair of proteins or protein regions encoded by overlapping genes, at least one of the pair is disordered (or unstructured). Our hypothesis is based on the observation that structural proteins require highly specific amino acid sequences, while unstructured (disordered) sequences are essentially unconstrained. Thus, given a structural protein and its associated mRNA sequence, any sequence derived from an overlapping reading frame seems highly unlikely to have a sequence pattern commensurate with a structural protein; on the other hand, a sequence pattern consistent with a disordered protein seems much v more likely. We performed studies on the protein products of overlapping gene sequences, tested the hypothesis and addressed the following two questions: First do the proteins encoded by overlapping genes have opposite order-disorder content, that is, does the ordered part of one of the overlapping proteins correspond to a disordered part in the other overlapping protein? Second, does the encoded protein in the overlapping regions have more disordered amino acids than the non-overlapping regions? Using our database of overlapping viral genes and the protein predictor PONDR VL3, we predicted the order-disorder of amino acids in the sequence of 97 viral protein samples. An analysis of the results supported our hypothesis and indicated that the ordered amino acids are mostly associated with non-overlapping regions while disordered amino acids are more prevalent in overlapping regions. In the overlapping regions for 52 protein pairs, we showed that most of the amino acid pairs facing each other on the protein sequences had at least one disorder for most cases. Out of 52 pairs, there were 3 protein pairs where there were no disordered amino acids and 22 protein pairs where there were no ordered amino acids on either sequence. The fraction of ordered pairs in the pool of overlapping regions of 52 protein pairs was 0.28. The non-overlapping region of 97 proteins had predominantly ordered proteins. The fraction of ordered amino acids in the pool of non-overlapping regions was determined to be 0.77. vi Table of Contents List of Tables …………………………………………………………………………...viii List of Tables ………………………………………………………………………….....ix I. Introduction………………………………………………………………………. 1 II. Background………………………………………………………………………. 2 Overlapping genes…………………………………………………………… 2 Protein structure-function paradigm………………………………………… 4 Protein folding……………………………………………………………… 6 Secondary structure of protein………………………………………………. 8 Beta bends………………………………………………………………….. 11 Tertiary structure of protein………………………………………………… 11 Natively disordered proteins – the new paradigm………………………….. 12 Characterization of natively disordered proteins…………………………… 14 NMR spectroscopy…………………………………………………………. 14 X-ray crystallography………………………………………………………. 14 Circular dichroism (CD) spectroscopy………………………………………15 Protease sensitivity…………………………………………………………. 15 Frequency of natively disordered proteins…………………………………. 16 Function of disordered proteins…………………………………………….. 17 Molecular recognition……………………………………………………… 18 Protein modification……………………………………………………….. 18 Entropic chain activities…………………………………………………… 18 Amino acid compositions of natively disordered proteins………………… 18 Prediction of natively disordered proteins………………………………… 19 Objective……………………………………………………………………. 20 III. Materials and Methods…………………………………………………………. 22 Software……………………………………………………………………. 22 Data analysis………………………………………………………………. 23 Data management…………………………………………………………. 24 Database design, construction and implementation……………………….. 32 VIRUS data schema……………………………………………………….. 33 VIRUS database construction……………………………………………… 37 Database implementation………………………………………………….. 39 Populating and query the database………………………………………… 40 Database queries…………………………………………………………… 40 Prediction of proteins encoded by overlapping genes…………………….. 43 Extraction of proteins encoded by overlapping genes……………………. 44 IV. Results…………………………………………………………………………. 46 Overlapping regions of protein pairs……………………………………… 46 Non-overlapping regions of proteins……………………………………… 50 Bootsrapping……………………………………………………………….. 51 V. Discussion……………………………………………………………………… 53 VI. Conclusions……………………………………………………………………. 58 VII. Recommendations for future work……………………………………………. 60 VIII. References …………………………………………………………………….. 61 vii List of Tables Table 1. List of software used for this research project………………………………… 23 Table 2. Dataset of Viral Proteins………………………………………………………. 24 Table 3. Description of Information on Viruses in Our Dataset…………………………32 viii List of Figures Figure 1. A Polypeptide chain with four amino acid residues…………………………… 6 Figure 2. Rigid CO-NH bond and rotation of Cα-C and Cα-N bonds……………………. 7 Figure 3. Ramachandran plot for 1000 nonglycine residues in eight proteins ……........... 8 Figure 4. Alpha helix…………………………………………………………………….. 9 Figure 5. Beta sheet………………………………………………………………………10 Figure 6. Beta bend…………………………………………………………………….. 11 Figure 7. Three dimensional protein structural motif ….................................................. 12 Figure 8. 3-D structure of Calcineurin…………………………………………………. 17 Figure 9. Schema for VIRUS database…………………………………………………. 35 Figure 10. Physical Schema Of DNA_VIRUS Database………………………………. 39 Figure 11. Examples of queries…………………………………………………………. 41 Figure 12. Example of fasta format of protein sequences………………………………. 43 Figure 13. A sample output……………………………………………………………… 45 Figure 14. Results of protein prediction in Excel Spreadsheet…………………………. 48 Figure 15. Percentages of order and disorder calculated by Excel Spreadsheet……….. 48 Figure 16. Analysis of Order-Disorder for Overlap Proteins ………………………….. 49 Figure 17. Analysis of Overlap Proteins with Disorder Breakdown…………………… 50 Figure 18. Results of protein prediction for non-overlap region ………………………. 51 Figure 19. Fraction of Ordered Amino Acids in the Non-overlap Region of Proteins… 52 Figure 20. Overlapping protein pairs with more than 50% disorder……………………. 55 Figure 21. Fraction of Ordered Amino Acids in the Non-overlap Region of Proteins… 56 Figure 22. Fraction of Disordered Amino Acids in the Entire Sequence of Proteins…. 57 ix I. Introduction Overlapping genes are adjacent genes that share a portion of their coding sequence. They are often observed in compact genome of viruses, prokaryotes, and mitochondria. Overlapping genes also occur in mammalian genomes. Overlapping genes encode different protein products using the same nucleic acid sequence. Proteins play crucial roles in virtually all biological processes. For many years it was thought that 3-D structure of proteins determine their functions [1,2,3]. But there are proteins or region of proteins which lack 3-D structure, yet such proteins and regions function in the absence of any specific fixed structure [4,5]. These proteins, called natively disordered proteins, have many important
Recommended publications
  • Long Overlapping Genes in Pseudomonas Aeruginosa
    bioRxiv preprint doi: https://doi.org/10.1101/2021.02.09.430400; this version posted February 10, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 1 Shadow ORFs illuminated: long overlapping genes in Pseudomonas 2 aeruginosa are translated and under purifying selection 3 4 Michaela Kreitmeier1, Zachary Ardern1*, Miriam Abele2, Christina Ludwig2, Siegfried Scherer1, 5 Klaus Neuhaus3* 6 7 1Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München 8 Weihenstephaner Berg 3, 85354 Freising, Germany. 9 2Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), TUM School of Life 10 Sciences, Technische Universität München, Gregor-Mendel-Strasse 4, 85354 Freising, 11 Germany. 12 3Core Facility Microbiome, ZIEL – Institute for Food & Health, Technische Universität München 13 Weihenstephaner Berg 3, 85354 Freising, Germany. 14 15 *email: [email protected]; [email protected] 16 17 Abstract 18 The existence of overlapping genes (OLGs) with significant coding overlaps revolutionises our 19 understanding of genomic complexity. We report two exceptionally long (957 nt and 1536 nt), 20 evolutionarily novel, translated antisense open reading frames (ORFs) embedded within 21 annotated genes in the medically important Gram-negative bacterium Pseudomonas 22 aeruginosa. Both OLG pairs show sequence features consistent with being genes and 23 transcriptional signals in RNA sequencing data. Translation of both OLGs was confirmed by 24 ribosome profiling and mass spectrometry. Quantitative proteomics of samples taken during 25 different phases of growth revealed regulation of protein abundances, implying biological 26 functionality.
    [Show full text]
  • Grapevine Virus Diseases: Economic Impact and Current Advances in Viral Prospection and Management1
    1/22 ISSN 0100-2945 http://dx.doi.org/10.1590/0100-29452017411 GRAPEVINE VIRUS DISEASES: ECONOMIC IMPACT AND CURRENT ADVANCES IN VIRAL PROSPECTION AND MANAGEMENT1 MARCOS FERNANDO BASSO2, THOR VINÍCIUS MArtins FAJARDO3, PASQUALE SALDARELLI4 ABSTRACT-Grapevine (Vitis spp.) is a major vegetative propagated fruit crop with high socioeconomic importance worldwide. It is susceptible to several graft-transmitted agents that cause several diseases and substantial crop losses, reducing fruit quality and plant vigor, and shorten the longevity of vines. The vegetative propagation and frequent exchanges of propagative material among countries contribute to spread these pathogens, favoring the emergence of complex diseases. Its perennial life cycle further accelerates the mixing and introduction of several viral agents into a single plant. Currently, approximately 65 viruses belonging to different families have been reported infecting grapevines, but not all cause economically relevant diseases. The grapevine leafroll, rugose wood complex, leaf degeneration and fleck diseases are the four main disorders having worldwide economic importance. In addition, new viral species and strains have been identified and associated with economically important constraints to grape production. In Brazilian vineyards, eighteen viruses, three viroids and two virus-like diseases had already their occurrence reported and were molecularly characterized. Here, we review the current knowledge of these viruses, report advances in their diagnosis and prospection of new species, and give indications about the management of the associated grapevine diseases. Index terms: Vegetative propagation, plant viruses, crop losses, berry quality, next-generation sequencing. VIROSES EM VIDEIRAS: IMPACTO ECONÔMICO E RECENTES AVANÇOS NA PROSPECÇÃO DE VÍRUS E MANEJO DAS DOENÇAS DE ORIGEM VIRAL RESUMO-A videira (Vitis spp.) é propagada vegetativamente e considerada uma das principais culturas frutíferas por sua importância socioeconômica mundial.
    [Show full text]
  • Changes to Virus Taxonomy 2004
    Arch Virol (2005) 150: 189–198 DOI 10.1007/s00705-004-0429-1 Changes to virus taxonomy 2004 M. A. Mayo (ICTV Secretary) Scottish Crop Research Institute, Invergowrie, Dundee, U.K. Received July 30, 2004; accepted September 25, 2004 Published online November 10, 2004 c Springer-Verlag 2004 This note presents a compilation of recent changes to virus taxonomy decided by voting by the ICTV membership following recommendations from the ICTV Executive Committee. The changes are presented in the Table as decisions promoted by the Subcommittees of the EC and are grouped according to the major hosts of the viruses involved. These new taxa will be presented in more detail in the 8th ICTV Report scheduled to be published near the end of 2004 (Fauquet et al., 2004). Fauquet, C.M., Mayo, M.A., Maniloff, J., Desselberger, U., and Ball, L.A. (eds) (2004). Virus Taxonomy, VIIIth Report of the ICTV. Elsevier/Academic Press, London, pp. 1258. Recent changes to virus taxonomy Viruses of vertebrates Family Arenaviridae • Designate Cupixi virus as a species in the genus Arenavirus • Designate Bear Canyon virus as a species in the genus Arenavirus • Designate Allpahuayo virus as a species in the genus Arenavirus Family Birnaviridae • Assign Blotched snakehead virus as an unassigned species in family Birnaviridae Family Circoviridae • Create a new genus (Anellovirus) with Torque teno virus as type species Family Coronaviridae • Recognize a new species Severe acute respiratory syndrome coronavirus in the genus Coro- navirus, family Coronaviridae, order Nidovirales
    [Show full text]
  • A Novel Short L-Arginine Responsive Protein-Coding Gene (Laob)
    Erschienen in: BMC evolutionary biology ; 18 (2018), 1. - 21 https://dx.doi.org/10.1186/s12862-018-1134-0 Hücker et al. BMC Evolutionary Biology (2018) 18:21 https://doi.org/10.1186/s12862-018-1134-0 RESEARCH ARTICLE Open Access A novel short L-arginine responsive protein-coding gene (laoB) antiparallel overlapping to a CadC-like transcriptional regulator in Escherichia coli O157:H7 Sakai originated by overprinting Sarah M. Hücker1,5, Sonja Vanderhaeghen1, Isabel Abellan-Schneyder1,4, Romy Wecko1, Svenja Simon2, Siegfried Scherer1,3 and Klaus Neuhaus1,4* Abstract Background: Due to the DNA triplet code, it is possible that the sequences of two or more protein-coding genes overlap to a large degree. However, such non-trivial overlaps are usually excluded by genome annotation pipelines and, thus, only a few overlapping gene pairs have been described in bacteria. In contrast, transcriptome and translatome sequencing reveals many signals originated from the antisense strand of annotated genes, of which we analyzed an example gene pair in more detail. Results: A small open reading frame of Escherichia coli O157:H7 strain Sakai (EHEC), designated laoB (L-arginine responsive overlapping gene), is embedded in reading frame −2 in the antisense strand of ECs5115, encoding a CadC-like transcriptional regulator. This overlapping gene shows evidence of transcription and translation in Luria-Bertani (LB) and brain-heart infusion (BHI) medium based on RNA sequencing (RNAseq) and ribosomal- footprint sequencing (RIBOseq). The transcriptional start site is 289 base pairs (bp) upstream of the start codon and transcription termination is 155 bp downstream of the stop codon.
    [Show full text]
  • Origin, Adaptation and Evolutionary Pathways of Fungal Viruses
    Virus Genes 16:1, 119±131, 1998 # 1998 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands. Origin, Adaptation and Evolutionary Pathways of Fungal Viruses SAID A. GHABRIAL Department of Plant Pathology, University of Kentucky, Lexington, KY, USA Abstract. Fungal viruses or mycoviruses are widespread in fungi and are believed to be of ancient origin. They have evolved in concert with their hosts and are usually associated with symptomless infections. Mycoviruses are transmitted intracellularly during cell division, sporogenesis and cell fusion, and they lack an extracellular phase to their life cycles. Their natural host ranges are limited to individuals within the same or closely related vegetative compatibility groups. Typically, fungal viruses are isometric particles 25±50 nm in diameter, and possess dsRNA genomes. The best characterized of these belong to the family Totiviridae whose members have simple undivided dsRNA genomes comprised of a coat protein (CP) gene and an RNA dependent RNA polymerase (RDRP) gene. A recently characterized totivirus infecting a ®lamentous fungus was found to be more closely related to protozoan totiviruses than to yeast totiviruses suggesting these viruses existed prior to the divergence of fungi and protozoa. Although the dsRNA viruses at large are polyphyletic, based on RDRP sequence comparisons, the totiviruses are monophyletic. The theory of a cellular self-replicating mRNA as the origin of totiviruses is attractive because of their apparent ancient origin, the close relationships among their RDRPs, genome simplicity and the ability to use host proteins ef®ciently. Mycoviruses with bipartite genomes ( partitiviruses), like the totiviruses, have simple genomes, but the CP and RDRP genes are on separate dsRNA segments.
    [Show full text]
  • Recent Advances on Detection and Characterization of Fruit Tree Viruses Using High-Throughput Sequencing Technologies
    viruses Review Recent Advances on Detection and Characterization of Fruit Tree Viruses Using High-Throughput Sequencing Technologies Varvara I. Maliogka 1,* ID , Angelantonio Minafra 2 ID , Pasquale Saldarelli 2, Ana B. Ruiz-García 3, Miroslav Glasa 4 ID , Nikolaos Katis 1 and Antonio Olmos 3 ID 1 Laboratory of Plant Pathology, School of Agriculture, Faculty of Agriculture, Forestry and Natural Environment, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece; [email protected] 2 Istituto per la Protezione Sostenibile delle Piante, Consiglio Nazionale delle Ricerche, Via G. Amendola 122/D, 70126 Bari, Italy; [email protected] (A.M.); [email protected] (P.S.) 3 Centro de Protección Vegetal y Biotecnología, Instituto Valenciano de Investigaciones Agrarias (IVIA), Ctra. Moncada-Náquera km 4.5, 46113 Moncada, Valencia, Spain; [email protected] (A.B.R.-G.); [email protected] (A.O.) 4 Institute of Virology, Biomedical Research Centre, Slovak Academy of Sciences, Dúbravská cesta 9, 84505 Bratislava, Slovak Republic; [email protected] * Correspondence: [email protected]; Tel.: +30-2310-998716 Received: 23 July 2018; Accepted: 13 August 2018; Published: 17 August 2018 Abstract: Perennial crops, such as fruit trees, are infected by many viruses, which are transmitted through vegetative propagation and grafting of infected plant material. Some of these pathogens cause severe crop losses and often reduce the productive life of the orchards. Detection and characterization of these agents in fruit trees is challenging, however, during the last years, the wide application of high-throughput sequencing (HTS) technologies has significantly facilitated this task. In this review, we present recent advances in the discovery, detection, and characterization of fruit tree viruses and virus-like agents accomplished by HTS approaches.
    [Show full text]
  • Viral Diversity in Tree Species
    Universidade de Brasília Instituto de Ciências Biológicas Departamento de Fitopatologia Programa de Pós-Graduação em Biologia Microbiana Doctoral Thesis Viral diversity in tree species FLÁVIA MILENE BARROS NERY Brasília - DF, 2020 FLÁVIA MILENE BARROS NERY Viral diversity in tree species Thesis presented to the University of Brasília as a partial requirement for obtaining the title of Doctor in Microbiology by the Post - Graduate Program in Microbiology. Advisor Dra. Rita de Cássia Pereira Carvalho Co-advisor Dr. Fernando Lucas Melo BRASÍLIA, DF - BRAZIL FICHA CATALOGRÁFICA NERY, F.M.B Viral diversity in tree species Flávia Milene Barros Nery Brasília, 2025 Pages number: 126 Doctoral Thesis - Programa de Pós-Graduação em Biologia Microbiana, Universidade de Brasília, DF. I - Virus, tree species, metagenomics, High-throughput sequencing II - Universidade de Brasília, PPBM/ IB III - Viral diversity in tree species A minha mãe Ruth Ao meu noivo Neil Dedico Agradecimentos A Deus, gratidão por tudo e por ter me dado uma família e amigos que me amam e me apoiam em todas as minhas escolhas. Minha mãe Ruth e meu noivo Neil por todo o apoio e cuidado durante os momentos mais difíceis que enfrentei durante minha jornada. Aos meus irmãos André, Diego e meu sobrinho Bruno Kawai, gratidão. Aos meus amigos de longa data Rafaelle, Evanessa, Chênia, Tati, Leo, Suzi, Camilets, Ricardito, Jorgito e Diego, saudade da nossa amizade e dos bons tempos. Amo vocês com todo o meu coração! Minha orientadora e grande amiga Profa Rita de Cássia Pereira Carvalho, a quem escolhi e fui escolhida para amar e fazer parte da família.
    [Show full text]
  • In Plant Physiology
    IN PLANT PHYSIOLOGY Did Silencing Suppression Counter-Defensive Strategy Contribute To Origin And Evolution Of The Triple Gene Block Coding For Plant Virus Movement Proteins? Sergey Y. Morozov and Andrey G. Solovyev Journal Name: Frontiers in Plant Science ISSN: 1664-462X Article type: Opinion Article Received on: 30 May 2012 Accepted on: 05 Jun 2012 Provisional PDF published on: 05 Jun 2012 Frontiers website link: www.frontiersin.org Citation: Morozov SY and Solovyev AG(2012) Did Silencing Suppression Counter-Defensive Strategy Contribute To Origin And Evolution Of The Triple Gene Block Coding For Plant Virus Movement Proteins?. Front. Physio. 3:136. doi:10.3389/fpls.2012.00136 Article URL: http://www.frontiersin.org/Journal/FullText.aspx?s=907& name=plant%20physiology&ART_DOI=10.3389/fpls.2012.00136 (If clicking on the link doesn't work, try copying and pasting it into your browser.) Copyright statement: © 2012 Morozov and Solovyev. This is an open‐access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited. This Provisional PDF corresponds to the article as it appeared upon acceptance, after rigorous peer-review. Fully formatted PDF and full text (HTML) versions will be made available soon. 1 OPINION ARTICLE 2 3 Did Silencing Suppression Counter-Defensive Strategy 4 Contribute To Origin And Evolution Of The Triple Gene Block 5 Coding For Plant Virus Movement Proteins? 6 7 Sergey Y. Morozov*, Andrey G. Solovyev 8 9 Belozersky Institute of Physico-Chemical Biology, Moscow State University, Moscow, 10 Russia 11 12 Correspondence: 13 Sergey Y.
    [Show full text]
  • A-Lovisolo.Vp:Corelventura
    Acta zoologica cracoviensia, 46(suppl.– Fossil Insects): 37-50, Kraków, 15 Oct., 2003 Searching for palaeontological evidence of viruses that multiply in Insecta and Acarina Osvaldo LOVISOLO and Oscar RÖSLER Received: 31 March, 2002 Accepted for publication: 17 Oct., 2002 LOVISOLO O., RÖSLER O. 2003. Searching for palaeontological evidence of viruses that multiply in Insecta and Acarina. Acta zoologica cracoviensia, 46(suppl.– Fossil Insects): 37-50. Abstract. Viruses are known to be agents of important diseases of Insecta and Acarina, and many vertebrate and plant viruses have arthropods as propagative vectors. There is fossil evidence of arthropod pathogens for some micro-organisms, but not for viruses. Iso- lated virions would be hard to detect but, in fossil material, it could be easier to find traces of virus infection, mainly virus-induced cellular structures (VICS), easily recognisable by electron microscopy, such as virions encapsulated in protein occlusion bodies, aggregates of membrane-bounded virus particles and crystalline arrays of numerous virus particles. The following main taxa of viruses that multiply in arthropods are discussed both for some of their evolutionary aspects and for the VICS they cause in arthropods: A. dsDNA Poxviridae, Asfarviridae, Baculoviridae, Iridoviridae, Polydnaviridae and Ascoviridae, infecting mainly Lepidoptera, Hymenoptera, Coleoptera, Diptera and Acarina; B. ssDNA Parvoviridae, infecting mainly Diptera and Lepidoptera; C. dsRNA Reoviridae and Bir- naviridae, infecting mainly Diptera, Hymenoptera and Acarina, and plant viruses also multiplying in Hemiptera; D. Amb.-ssRNA Bunyaviridae and Tenuivirus, that multiply in Diptera and Hemiptera (animal viruses) and in Thysanoptera and Hemiptera (plant vi- ruses); E. -ssRNA Rhabdoviridae, multiplying in Diptera and Acarina (vertebrate vi- ruses), and mainly in Hemiptera (plant viruses); F.
    [Show full text]
  • The Design of Maximally Compressed Coding Sequences
    Two Proteins for the Price of One: The Design of Maximally Compressed Coding Sequences Bei Wang1, Dimitris Papamichail2, Steffen Mueller3, and Steven Skiena2 1 Dept. of Computer Science, Duke University, Durham, NC 27708 [email protected] 2 Dept. of Computer Science, State University of New York, Stony Brook, NY 11794 {dimitris, skiena}@cs.sunysb.edu 3 Dept. of Microbiology, State University of New York, Stony Brook, NY 11794 [email protected] Abstract. The emerging field of synthetic biology moves beyond con- ventional genetic manipulation to construct novel life forms which do not originate in nature. We explore the problem of designing the provably shortest genomic sequence to encode a given set of genes by exploit- ing alternate reading frames. We present an algorithm for designing the shortest DNA sequence simultaneously encoding two given amino acid sequences. We show that the coding sequence of naturally occurring pairs of overlapping genes approach maximum compression. We also investi- gate the impact of alternate coding matrices on overlapping sequence de- sign. Finally, we discuss an interesting application for overlapping gene design, namely the interleaving of an antibiotic resistance gene into a target gene inserted into a virus or plasmid for amplification. 1 Introduction The emerging field of synthetic biology moves beyond conventional genetic ma- nipulation to construct novel life forms which do not originate in nature. The synthesis of poliovirus from off-to-shelf components [1] attracted worldwide at- tention when announced in July 2002. Subsequently, the bacteriophage PhiX174 was synthesized using different techniques in only three weeks [2], and Kodu- mal, et al.
    [Show full text]
  • Wo 2007/056463 A2
    (12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) (19) World Intellectual Property Organization International Bureau (10) International Publication Number (43) International Publication Date PCT 18 May 2007 (18.05.2007) WO 2007/056463 A2 (51) International Patent Classification: (74) Agents: WILLIAMS, Kathleen et al.; Edwards Angell C12Q 1/70 (2006.01) C12Q 1/68 (2006.01) Palmer & Dodge LLP, P.O. Box 55874, Boston, MA 02205 (US). (21) International Application Number: (81) Designated States (unless otherwise indicated, for every PCT/US2006/043502 kind of national protection available): AE, AG, AL, AM, AT,AU, AZ, BA, BB, BG, BR, BW, BY, BZ, CA, CH, CN, (22) International Filing Date: CO, CR, CU, CZ, DE, DK, DM, DZ, EC, EE, EG, ES, FI, 9 November 2006 (09.1 1.2006) GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IS, JP, KE, KG, KM, KN, KP, KR, KZ, LA, LC, LK, LR, LS, (25) Filing Language: English LT, LU, LV,LY,MA, MD, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PG, PH, PL, PT, RO, RS, (26) Publication Language: English RU, SC, SD, SE, SG, SK, SL, SM, SV, SY, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW (30) Priority Data: (84) Designated States (unless otherwise indicated, for every 60/735,085 9 November 2005 (09. 11.2005) US kind of regional protection available): ARIPO (BW, GH, GM, KE, LS, MW, MZ, NA, SD, SL, SZ, TZ, UG, ZM, (71) Applicant (for all designated States except US): ZW), Eurasian (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), PRIMERA BIOSYSTEMS, INC.
    [Show full text]
  • Tamada-Text R3 HK-TT
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Okayama University Scientific Achievement Repository Tamada & Kondo - 1 Journal of General Plant Pathology Biological and genetic diversity of plasmodiophorid-transmitted viruses and their vectors Tetsuo Tamada* and Hideki Kondo Institute of Plant Science and Resources (IPSR), Okayama University, Kurashiki 710-0046, Japan. Current address for T. Tamada: Kita 10, Nishi 1, 13-2-606, Sapporo, 001-0010, Japan. *Corresponding author: Tetsuo Tamada; E-mail: [email protected] Total text pages 32 Word counts: 10 words (title); 147 words (Abstract); 6004 words (Main Text) Tables: 3; Figure: 1; Supplementary figure: 1 Tamada & Kondo - 2 Abstract About 20 species of viruses belonging to five genera, Benyvirus, Furovirus, Pecluvirus, Pomovirus and Bymovirus, are known to be transmitted by plasmodiophorids. These viruses have all positive-sense single-stranded RNA genomes that consist of two to five RNA components. Three species of plasmodiophorids are recognized as vectors: Polymyxa graminis, P. betae, and Spongospora subterranea. The viruses can survive in soil within the long-lived resting spores of the vector. There are biological and genetic variations in both virus and vector species. Many of the viruses have become the causal agents of important diseases in major crops, such as rice, wheat, barley, rye, sugar beet, potato, and groundnut. Control measure is dependent on the development of the resistant cultivars. During the last half a century, several virus diseases have been rapidly spread and distributed worldwide. For the six major virus diseases, their geographical distribution, diversity, and genetic resistance are addressed.
    [Show full text]