Phylogenetic study of 2019-nCoV by using alignment-free method

Yang Gao*1, Tao Li*2, and Liaofu Luo‡3,4

1 Baotou National Rare Earth Hi-Tech Industrial Development Zone, Baotou, China
2 College of Life Sciences, Inner Mongolia Agricultural University, Hohhot, China
3 Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot, China
4 School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, China

*These authors contributed equally to this work.
‡Corresponding author. E-mail: [email protected] (LL)

Abstract

The origin and early spread of 2019-nCoV are studied by phylogenetic analysis using the IC-PIC alignment-free method, which is based on DNA/RNA sequence information correlation (IC) and partial information correlation (PIC). The topology of the phylogenetic tree of Betacoronavirus is remarkably consistent with the biologists' systematics; it classifies 2019-nCoV as a Sarbecovirus of Betacoronavirus and supports the assumption that these novel viruses are of bat origin, with pangolin as one of the possible intermediate hosts. The novel-virus branch of the phylogenetic tree shows location-virus linkage. The placement of the root of the early 2019-nCoV tree is studied carefully with the Neighbor Joining consensus algorithm by introducing different out-groups (bat-related coronaviruses, pangolin coronaviruses, HIV viruses, etc.) and by comparing with UPGMA consensus trees. Several oldest branches (lineages) of the 2019-nCoV tree are deduced, which means that COVID-19 may have begun to spread in several regions of the world before its outbreak in Wuhan.

Introduction

Coronaviruses are single-stranded, positive-sense, enveloped RNA viruses that are distributed broadly among humans, other mammals, and birds and that cause respiratory, enteric, hepatic, and neurologic diseases [1,2].
The family Coronaviridae contains four genera, namely Alphacoronavirus, Betacoronavirus, Deltacoronavirus, and Gammacoronavirus [3]. Six coronavirus species are known to cause human disease [4]. Among them, HCoV-OC43, HCoV-HKU1, SARS-CoV, and Middle East respiratory syndrome coronavirus (MERS-CoV) belong to the genus Betacoronavirus [5-8]. The other two, HCoV-229E and HCoV-NL63, belong to the genus Alphacoronavirus [4,5]. At the end of 2019, a pneumonia named COVID-19, caused by 2019-nCoV (SARS-CoV-2), broke out in Wuhan, China, and then spread rapidly all over the world. The source of transmission and the possible intermediate animal vectors are urgently sought. What kind of coronavirus is SARS-CoV-2? How did the spread of COVID-19 among humans begin? Both problems can be addressed by phylogenetic analysis of the 2019 novel coronaviruses and other related genomes.

On the first problem, alignment-based sequence analyses have yielded many new discoveries [9-11]. Phylogenetic analyses of the RNA-dependent RNA polymerase (RdRp) protein, the spike proteins, and full-length genomes found that SARS-CoV-2 is most closely related to two bat SARS-like coronaviruses, bat-SL-CoVZXC21 and bat-SL-CoVZC45, and SARS-CoV-2 is thought to belong to Sarbecovirus of Betacoronavirus [12-16]. It was also found that SARS-CoV-2 has the highest similarity to the bat coronavirus RaTG13, sampled from a Rhinolophus affinis bat in Yunnan in 2013 [17, 18]. While bats may be the reservoir host for various coronaviruses, recent work found that the genomes of Malayan pangolin coronaviruses show high similarity to 2019-nCoV, and therefore pangolins (Manis javanica) may be a possible intermediate host for coronaviruses [19, 20].

However, little is known about the second problem. Recently, it was addressed by Forster et al. [21] and Yu et al. [22] through phylogenetic network analysis.
Regarding the calculation method used in phylogenetic analysis, alignment-based approaches often give high accuracy and may reveal the relationships among sequences, but they face serious challenges when recombination, shuffling, and rearrangement events occur frequently in genome evolution. In addition, whole-genome multiple alignments are very time-consuming and expensive in memory usage. To address these limitations, alignment-free genome comparison methods are becoming an attractive alternative [23]. Several approaches derived from information theory, with emphasis on the base correlation property, have been proposed for sequence comparison [24-27]. Average mutual information (AMI) profiles were used to group HIV-1 viruses. The base correlation method has been used to infer coronavirus phylogeny and to analyze Hepatitis E virus genotyping and subtyping [25, 26]. Previously, we proposed the IC-PIC method and studied the phylogeny of a wide range of dsDNA viruses, papillomaviruses, and parvoviruses [27]. Recently, Randhawa et al. [28] and Gao et al. [29], using alignment-free methods, confirmed that SARS-CoV-2 belongs to Betacoronavirus, and its genomic similarity to the sub-genus Sarbecovirus suggests a possible bat origin.

With the wide application of genome sequence data, we are eager to know what signature characterizes a genome. We suggested that AMI and the k-departed base correlation can be regarded as the signature of a given genome sequence [29]. The average mutual information is called information correlation (IC) and is defined by

\[ D_{k+2} = -2\sum_{i} p_i \log_2 p_i + \sum_{i,j} p_{i(k)j} \log_2 p_{i(k)j} \]

and the k-departed base correlation is called partial information correlation (PIC) and is defined by

\[ F_{i(k)j} = \left( p_{i(k)j} - p_i p_j \right)^2 \]

where p_i is the probability of base i in the sequence and p_{i(k)j} is the joint probability of the base pair ij separated by distance k (k = 0, 1, 2, …).
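As an illustration of the two signatures defined above, the following is a minimal Python sketch that computes one IC value and the 16 PIC values for a single distance k. It reads IC as the mutual information between bases separated by k (twice the single-base entropy minus the pair entropy) and PIC as the squared deviation of the joint probability from the product of the single-base probabilities. The function name `ic_pic_column`, the convention that k = 0 means adjacent bases, and the handling of ambiguous characters are assumptions made for this example; it is not the code of the IC-PIC web server.

```python
from collections import Counter
from math import log2

BASES = "ACGT"

def ic_pic_column(seq, k):
    """Return (ic, pic) for one distance k.

    ic  : information correlation (mutual information of bases separated by k)
    pic : dict mapping each base pair 'ij' to (p_i(k)j - p_i * p_j) ** 2
    Assumption: k = 0 means adjacent bases, so the offset between them is k + 1.
    """
    seq = seq.upper().replace("U", "T")  # treat RNA and DNA alike

    # Single-base probabilities p_i (characters other than A, C, G, T are ignored).
    counts = Counter(b for b in seq if b in BASES)
    total = sum(counts.values())
    p = {b: counts[b] / total for b in BASES}

    # Joint probabilities p_i(k)j of base i followed by base j at offset k + 1.
    pair_counts = Counter()
    for a, b in zip(seq, seq[k + 1:]):
        if a in BASES and b in BASES:
            pair_counts[a + b] += 1
    n_pairs = sum(pair_counts.values())
    pij = {a + b: pair_counts[a + b] / n_pairs for a in BASES for b in BASES}

    # IC: -2 * sum_i p_i log2 p_i + sum_ij p_i(k)j log2 p_i(k)j (zero terms skipped).
    ic = (-2 * sum(v * log2(v) for v in p.values() if v > 0)
          + sum(v * log2(v) for v in pij.values() if v > 0))

    # PIC: squared deviation of the joint probability from independence.
    pic = {ij: (pij[ij] - p[ij[0]] * p[ij[1]]) ** 2 for ij in pij}
    return ic, pic

if __name__ == "__main__":
    ic, pic = ic_pic_column("ATGCGATACGCTTGAGGCTAATCG", 0)
    print("IC:", ic)
    print("PIC for pair AT:", pic["AT"])
```

Stacking the IC value and the 16 PIC values for successive k gives one column per distance, which is the kind of per-genome signature the method compares.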
In the following we study the SARS-CoV-2 phylogeny by using the IC-PIC algorithm based on the above set of genome sequence signatures.

Materials and Methods

The SARS-CoV-2 genomes used in Figure 1 and in Figures 2 to 6 were downloaded from the GISAID (https://gisaid.org) platform on 2020 February 7 and 29, respectively. Sequences with NNs were discarded. Other sequences were downloaded from the NCBI database (http://www.ncbi.nlm.nih.gov/). Note: although the GISAID database now contains more than five thousand entries, to study the early spread of SARS-CoV-2 we use only the 136 genome sequences collected before Feb 29.

Each genome sequence is converted into an IC-PIC matrix with 17 rows (1 IC for the given k and 16 PICs, one for each base correlation category) and d columns (the distance k between the bases of a pair, k = 0, 1, …, d-1). The only parameter in the algorithm is the range of d, which is denoted as K. K is determined from the best-fit construction of the tree. In general the deduced tree changes with K and becomes stable at some large value. In deducing the consensus tree for Betacoronavirus in Fig 1 and for 2019-nCoV in Figs 2 to 6 we take K=50 and K=100, respectively.

The work is carried out on the IC-PIC web server [29]. After the input data are uploaded in FASTA format, the parameter K is set, and the UPGMA or Neighbor-Joining (NJ) option is chosen, the server runs the program and deduces one phylogenetic tree for each value of d (d = 1 to K in steps of 1). In the calculation, the evolutionary distance between any two genomes is the Euclidean distance between their respective 17 × d IC-PIC matrices. An unrooted UPGMA or NJ tree is then generated. Finally, the K phylogenetic trees are combined to generate a consensus tree. All these trees were constructed by using the NEIGHBOR and CONSENSE programs in the PHYLIP package [30]. The robustness of the tree topology was estimated by branch support.
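The distance step described above can be sketched as follows. The example assumes each genome has already been converted to a 17-row IC-PIC matrix stored as a list of rows, computes the Euclidean distance over the first d columns, and writes the pairwise distances in the square distance-matrix layout that PHYLIP programs such as NEIGHBOR read (taxon count on the first line, then a 10-character name followed by the distances). The function names and the toy matrices are hypothetical, and how the web server itself passes data to PHYLIP is not stated in the text.

```python
from math import sqrt

def euclidean(m1, m2, d):
    """Euclidean distance between two IC-PIC matrices, using the first d columns."""
    return sqrt(sum((a - b) ** 2
                    for row1, row2 in zip(m1, m2)
                    for a, b in zip(row1[:d], row2[:d])))

def phylip_distance_matrix(matrices, d):
    """Format all pairwise distances as a square PHYLIP distance matrix.

    matrices: dict mapping a genome name to its IC-PIC matrix (list of 17 rows).
    """
    names = list(matrices)
    lines = [f"{len(names):5d}"]
    for a in names:
        label = a[:10].ljust(10)  # PHYLIP names occupy the first 10 characters
        dists = [euclidean(matrices[a], matrices[b], d) for b in names]
        lines.append(label + " ".join(f"{x:.6f}" for x in dists))
    return "\n".join(lines)

if __name__ == "__main__":
    # Toy 17 x 2 matrices standing in for real IC-PIC signatures.
    g1 = [[0.10, 0.20] for _ in range(17)]
    g2 = [[0.10, 0.25] for _ in range(17)]
    g3 = [[0.30, 0.20] for _ in range(17)]
    genomes = {"genome_1": g1, "genome_2": g2, "genome_3": g3}
    # One distance matrix per d = 1..K; each would yield one tree for the consensus.
    for d in (1, 2):
        print(phylip_distance_matrix(genomes, d))
        print()
```

Running the loop for every d from 1 to K would yield K distance matrices, one per tree, matching the procedure in which the K trees are later merged into a consensus tree.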
Note that the NJ tree obtained with the NEIGHBOR algorithm is unrooted. The placement of the root is one of the most difficult parts of estimating a tree. However, for gene trees with a known species tree the out-group approach is available, because the out-group is either known for certain or, as in the present case, can be assumed appropriately. In constructing the evolutionary tree for Betacoronavirus we introduce Gammacoronavirus as the out-group. In constructing the evolutionary tree for 2019-nCoV we assume several possible out-groups for comparison. In addition, we construct the UPGMA tree as supplementary evidence for the root found by the out-group approach.

The biggest difference between the NJ tree and the UPGMA tree is that the latter assumes a constant rate of evolution across the lineages, i.e., a molecular clock; because this assumption is often violated in empirical datasets, the approach is usually considered sub-optimal. Another problem concerns the accuracy of UPGMA tree construction. The UPGMA algorithm always merges the pair of sequences with the smallest distance, so when the distances of different pairs differ only slightly and the distances are not measured accurately enough, wrong pairings may occur in some clades.

Results

The whole-genome-based phylogenetic trees for Betacoronavirus and 2019-nCoV are deduced by the IC-PIC method and given in Fig 1 and Figs 2 to 6, respectively. To reconstruct the phylogenetic trees, the sequence data of 40 Betacoronaviruses from five subgenera and of 136 SARS-CoV-2 genomes are used. The consensus tree is derived from 50 (K=50 for Fig 1) or 100 (K=100 for Figs 2 to 6) trees based on the IC-PIC matrix. The robustness of the tree topology was estimated by branch support, but the tree is not drawn to scale.

The phylogenetic tree of the Betacoronavirus genus is shown in Fig 1.
From Fig 1 we found: 1) The tree topology is remarkably consistent with the biologists' systematics, in which Betacoronavirus contains five subgenera, namely Sarbecovirus, Hibecovirus, Nobecovirus, Embecovirus, and Merbecovirus.