Mitochondrial Cytochrome Oxidase I Sequence Polymorphisms Reveal Population Genetic Diversity of Wuchereria bancrofti in Papua New Guinea

MITOCHONDRIAL CYTOCHROME OXIDASE I SEQUENCE POLYMORPHISMS REVEAL POPULATION GENETIC DIVERSITY OF WUCHERERIA BANCROFTI IN PAPUA NEW GUINEA

By AKSHAYA RAMESH

Submitted in partial fulfillment of the requirements for the degree of Master of Science

Thesis Advisor: Dr. Peter A Zimmerman

Department of Biology
CASE WESTERN RESERVE UNIVERSITY
August, 2012

CASE WESTERN RESERVE UNIVERSITY
SCHOOL OF GRADUATE STUDIES

We hereby approve the thesis/dissertation of Akshaya Ramesh, candidate for the MS degree*.

(signed)
Dr. Roy Ritzmann (Chair of the committee)
Dr. Peter Zimmerman
Dr. Daniel Tisch
Dr. Michael Benard

(date) 5/22/2012

*We also certify that written approval has been obtained for any proprietary material contained therein.

Dedicated to my beloved Father and Grandmother

Table of Contents

Chapter 1: Lymphatic Filariasis: Global burden and Epidemiology
1.0. Introduction
1.1. Global burden
1.1.2. Lymphatic filariasis in Papua New Guinea
1.2. Global Alliance to eliminate lymphatic filariasis (GAELF/GPELF)
1.3. Life cycle of the filarial nematode: Wuchereria bancrofti
1.4. Vectors of W. bancrofti in Papua New Guinea
1.5. Genetics of W. bancrofti populations
1.6. Objectives

Chapter 2: The complete mitochondrial genome sequence of the filarial nematode, W. bancrofti
2.0. Introduction
2.1. Methods
2.1.1. Genomic DNA extraction and amplification
2.1.2. W. bancrofti mitochondrial genome amplification strategy
2.1.3. Agarose gel electrophoresis and gel extraction
2.1.4. Sequence assembly and gene annotation
2.2. Results and Discussion
2.2.1. General features of the mitochondrial genome of W. bancrofti
2.2.2. Protein coding genes
2.2.3. Codon Usage and amino acid composition
2.2.4. Ribosomal RNA genes
2.2.5. Transfer RNA genes
2.2.6. Non coding regions
2.3. Conclusions and future directions

Chapter 3: Genetic diversity of Wuchereria bancrofti in Dreikikir district, East Sepik province, Papua New Guinea
3.0. Introduction
3.1. Methods
3.1.1. Study sites and sample selection
3.1.2. Genomic DNA extraction and amplification
3.1.3. PCR amplification of the Cytochrome oxidase 1 (cox1) gene and visualization of the products
3.1.4. Sequencing of the cox1 gene
3.1.5. Sequence analysis: Alignment of sequences and Sequence editing
3.1.6. Genetic heterogeneity of W. bancrofti
3.1.7. Test for neutrality
3.1.8. Population structure of W. bancrofti
3.2. Results
3.2.1. Genetic heterogeneity of W. bancrofti
3.2.2. Test for compliance to the neutral model of evolution
3.2.3. Genetic structure of W. bancrofti populations
3.2.4. Isolation by distance
3.3. Discussion
3.3.1. Genetic heterogeneity of W. bancrofti populations
3.3.2. Population structure
3.3.3. What mechanisms are responsible for structuring populations of W. bancrofti in PNG?
3.4. Conclusions and future directions

References

List of Tables

Table 1: Fifteen primer sets and COX1 primers used in the present study to amplify the complete mt genome and study the genetic heterogeneity of W. bancrofti
Table 2: Summary statistics for 14 individuals from 6 villages with the observed number of haplotypes, haplotype diversity and test for neutrality
Table 3: Matrix with the pairwise Fst values for the six villages
Table 4: Fst values for the individuals from five villages
Table 5: AMOVA statistic generated using Arlequin 3.0 and DnaSP 5.0 across the villages (n=6)

List of Figures

Figure 1: Linear representation of the complete mitochondrial genome of W. bancrofti from a Papua New Guinean isolate
Figure 2: Map of the study sites, adapted from Bockarie et al., 1998
Figure 3: Multidimensional Scaling plot for the six villages in the ESP using pairwise Fst
Figure 4: Haplotype network for W. bancrofti hosts (n=14) across six villages in the ESP
Figure 5: Haplotype network for W. bancrofti hosts Peneng
Figure 6: Haplotype network for W. bancrofti hosts Albulum1
Figure 7: Haplotype network for W. bancrofti hosts Albulum2
Figure 8: Haplotype network for W. bancrofti hosts Yautong1
Figure 9: Haplotype network for W. bancrofti hosts Yautong2
Figure 10: Haplotype network for W. bancrofti hosts Moihuak
Figure 11: Histogram of genetic distances among hosts in the 6 villages
Figure 12: Histogram of genetic distances among hosts in the 6 villages
Figure 13: Comparison of Fst between the Peneng and Moihuak with respect to the other study sites
Figure 14: IBD plot for the six villages in ESP, PNG

Acknowledgements

It is a pleasure to thank the many people who made this thesis possible. I consider it a great privilege to have been given the opportunity to work at the Center for Global Health and Diseases at Case Western Reserve University. I would like to sincerely thank Dr. Peter Zimmerman, my advisor and mentor, for his crucial guidance in this project and his constant support and encouragement. I would also like to thank Dr. Daniel Tisch for his help with the analysis of my data and for his motivation throughout my thesis writing period. I would like to take this opportunity to thank Dr. Michael Benard for introducing me to population genetics and lending me his expert views and precious time. I am extremely grateful to Dr. Scott Small, a post-doctoral fellow at the Zimmerman lab, for constantly guiding me through the data; his untiring help, constant advice and support have substantially shaped the findings of my thesis.

I unreservedly acknowledge with gratitude all the help and support provided by all the members of the Zimmerman lab. I deeply value the association with Dr. Rajeev Mehlotra, Krufinta Bun, Tenisha Phipps, Cara Halldin, Kyle Logue, Chad Schaber, Bangan John, Barnie Willie and Melinda Zikursh, who have provided great help, caring and support besides practical advice in the completion of this project. I would also like to thank Zachary Kloos, a great friend and fellow researcher, for introducing me to Wuchereria bancrofti and helping me through my project. I wish to thank my best friend from high school (Soundarya Rangaraj), best friends as an undergraduate (Lakshmi Priya and Amrutha Pattamatta) and best friend in graduate school (Kirsten Eichelman) for their emotional support, entertainment and help through difficult times.

I owe my gratitude to the Department of Biology at Case Western Reserve University and other members of the Center for Global Health and Diseases for extending their continuous support and guidance throughout my project.

Last, and most importantly, I wish to thank my parents Sri Vidya Ramesh and Ramesh Veeraraghavan, my grandmother, Chandra Renganathan, and my Aunt and Uncle for their never-ending love, care and support. To them I dedicate this thesis.
List of Abbreviations

ATP           Annual Transmission Potential
atp6          ATP synthase subunit 6
cob           cytochrome b
cox1-3        cytochrome c oxidase subunits 1-3
DALY          disability-adjusted life year
DEC           Diethylcarbamazine
ESP           East Sepik province
GAELF/GPELF   Global Alliance to eliminate lymphatic filariasis
gDNA          Genomic DNA
LD            Linkage Disequilibrium
LDR-FMA       Ligase detection reaction-fluorescent microsphere assay
LF            Lymphatic filariasis
MDA           Mass drug administration
MFI           Median Fluorescence Intensity
Mt            Mitochondria
nad1-6        NADH dehydrogenase subunits 1-6
nad4L         NADH dehydrogenase subunit 4L
nt            nucleotide
PacELF        Pacific Program to Eliminate Lymphatic Filariasis
PNG           Papua New Guinea
rrn           ribosomal RNA
trn           transfer RNA
WHO           World Health Organization

Mitochondrial cytochrome oxidase I (COXI) sequence polymorphisms reveal population genetic diversity of Wuchereria bancrofti in Papua New Guinea

Abstract

by AKSHAYA RAMESH

Wuchereria bancrofti is the primary causative agent of lymphatic filariasis, estimated to affect 120 million people in 80 countries. Several chemotherapeutic programs to eliminate this parasite have been introduced, and these are likely to change the genetic structure of W. bancrofti populations. Despite constituting a major public health burden, this parasite remains poorly understood with respect to its mitochondrial sequence and population biology. To address this knowledge gap, the complete mitochondrial genome of W. bancrofti was sequenced, following which a portion of the cytochrome oxidase 1 gene was amplified from individuals in the East Sepik Province of Papua New Guinea. The present study suggests that W. bancrofti populations are highly heterogeneous, with a moderate genetic structure across the East Sepik Province. This study has facilitated exploration into W. bancrofti diversity and provides insights into patterns of transmission, an essential component of public health interventions aimed at eliminating lymphatic filariasis.
Chapter 1
Lymphatic Filariasis: Global burden and Epidemiology

1.0. Introduction

1.1. Global burden

Lymphatic filariasis (LF) is a neglected tropical disease, primarily of the poor, affecting around 120 million people worldwide and endemic in 80 countries. LF is endemic in Africa, South America, the Indian subcontinent, South East Asia, the Pacific islands and the eastern Mediterranean. Around 118 million people are estimated to have clinical symptoms of the disease, with 74 million being microfilaraemic, which includes hidden renal and lymphatic pathology; another 27 million have hydrocoele. Additionally, 16 million people are reported to have elephantiasis, a chronic form of the disease. The burden of LF in disability-adjusted life years (DALYs), a measure of overall disease burden expressed as the number of years lost due to ill-health, is 5.5 million (Global programme to eliminate lymphatic filariasis: Annual Report on Lymphatic Filariasis, 2002; Molyneux, Bradley, Hoerauf, Kyelem, & Taylor, 2003).

1.1.2. Lymphatic filariasis in Papua New Guinea

Among the endemic islands, Papua New Guinea (PNG) has the highest estimated population at risk, with almost 50% of the entire population at risk of infection.