Retand: a Novel Family of Gypsy-Like Retrotransposons Harboring an Amplified Tandem Repeat

Mol Gen Genomics
DOI 10.1007/s00438-006-0140-x

ORIGINAL PAPER

Eduard Kejnovsky · Zdenek Kubat · Jiri Macas · Roman Hobza · Jaroslav Mracek · Boris Vyskot

Received: 20 December 2005 / Accepted: 2 June 2006
© Springer-Verlag 2006

Communicated by M.-A. Grandbastien

Nucleotide sequence data reported are available in the DDBJ/EMBL/GenBank databases under the accession numbers DQ023669 (Retand-2) and DQ023670 (Retand-1).

E. Kejnovsky (✉) · Z. Kubat · R. Hobza · J. Mracek · B. Vyskot
Laboratory of Plant Developmental Genetics, Institute of Biophysics, Academy of Sciences of the Czech Republic, Kralovopolska 135, 612 65 Brno, Czech Republic
e-mail: [email protected]

J. Macas
Laboratory of Molecular Cytogenetics, Institute of Plant Molecular Biology, Academy of Sciences of the Czech Republic, 37005 Ceske Budejovice, Czech Republic

Abstract

In this paper we describe a pair of novel Ty3/gypsy retrotransposons isolated from the dioecious plant Silene latifolia, consisting of a non-autonomous element Retand-1 (3.7 kb) and its autonomous partner Retand-2 (11.1 kb). These two elements have highly similar long terminal repeat (LTR) sequences but differ in the presence of the typical retroelement coding regions (gag-pol genes), most of which are missing in Retand-1. Moreover, Retand-2 contains two additional open reading frames in antisense orientation localized between the pol gene and right LTR. Retand transcripts were detected in all organs tested (leaves, flower buds and roots) which, together with the high sequence similarity of LTRs in individual elements, indicates their recent transpositional activity. The autonomous elements are similarly abundant (2,700 copies) as non-autonomous ones (2,100 copies) in S. latifolia genome. Retand elements are also present in other Silene species, mostly in subtelomeric heterochromatin regions of all chromosomes. The only exception is the subtelomere of the short arm of the Y chromosome in S. latifolia which is known to lack the terminal heterochromatin. An interesting feature of the Retand elements is the presence of a tandem repeat sequence, which is more amplified in the non-autonomous Retand-1.

Keywords: Pair of autonomous and non-autonomous retrotransposons · Tandem repeat · Transcriptional activity · Sex chromosomes · Silene latifolia

Introduction

Retrotransposons represent a class of mobile elements with duplicative mode of transposition via RNA intermediate. They are widespread among eukaryotes and often compose a significant fraction of the nuclear genome (Bennetzen 1996; SanMiguel et al. 1996). In plants, retrotransposons are found in all genomes that have been examined, but they seem to be highly abundant only in species with large genomes, in some plants comprising more than 50% of nuclear DNA content (Pearce et al. 1996a; SanMiguel et al. 1996; Suoniemi et al. 1998). The replication strategy of retrotransposons offers the potential for explosive increases in copy numbers and thereby these elements increase the size of the host genome (Kumar and Bennetzen 1999; SanMiguel and Bennetzen 1998). Differential amplification of retrotransposons is largely responsible for significant differences in genome size even between closely related plant species (SanMiguel 1998; Vicient et al. 1999; Kalendar et al. 2000; SanMiguel and Bennetzen 1998).
It has been found that plant genomes host diverse populations of retroelements which are classified into several basic classes (Hull 2001), each of them is represented by various families including elements similar in their nucleotide sequences and thus supposed to be of the same origin. However, even the retroelement sequences belonging to the same family may differ considerably in their length and structure, as they occur in the genome as deletion derivatives of the full-length retroelements (Havecker et al. 2004). A typical example is solo-LTRs derived from LTR-retrotransposons by recombination-based excision and elimination of internal element sequences between the two terminal LTRs. In addition, non-autonomous elements lacking a part or whole of the regions coding for retroelement proteins have been reported from several species. As these elements are not capable of encoding functional proteins required for their replication and/or transposition, they can be propagated in the genome only at the presence of a functional intact element. However, the molecular mechanisms producing these non-autonomous elements, as well as the population dynamics of autonomous and non-autonomous elements in plant genomes are only poorly understood. Moreover, except for a few extensively sequenced model species like rice (Jiang et al. 2002a, b), only a little is known about the occurrence of these elements in other plants.

Here we describe Retand, a novel family of Ty3/gypsy retrotransposons including the full-length and the corresponding non-autonomous elements occurring in the genome of Silene latifolia. These elements are localized at subtelomeric heterochromatin regions of a vast majority of chromosomes and are transcriptionally active. An interesting feature of these retrotransposons is the presence of an array of tandem repeats within their sequences which is more amplified in the non-autonomous element, as well as the presence of two extra ORFs in antisense orientation located upstream of right LTR of the autonomous element.

Materials and methods

Cloning and hybridization

The construction of a sample BAC library of the male S. latifolia and the preparation of a membrane with 960 BAC clones were described in Lengerova et al. (2004). Screening was performed by hybridization with AlkPhos Direct hybridization kit (Amersham Pharmacia Biotech) according to the manufacturer's protocol. Probes were prepared by PCR amplification of LTR region, regions of gag and pol genes, ORF3 region and STR1 tandem repeat region with following primers, respectively: LTR-F1 (5′-TTCCGGGTGTAATTCCAGAG-3′), LTR-R1 (5′-CATATTCTGCACCCGCTGAC-3′), GAG-F1 primer (5′-CCGACTCTCCAAAGTCGAAC-3′) and GAG-R1 (5′-TTCTGCCCTTCATTCTTGCT-3′), POL-F1 (5′-ACGGTCAGTGCCGTAATCATC-3′), POL-R1 (5′-GTCGTCTCGGCTTCTATGTC-3′), STR1-F1 (5′-GAGGTGTCCAAGACCGTTGT-3′), STR1-R1 (5′-GCCGTCGACTTCTTTCTTGT-3′), ORF4-F1 (5′-TAACGTAACCCGGCTCTGTC-3′) and ORF4-R1 (5′-TTTGGAGGATTTCGAGGATG-3′) with BAC4J4 clone containing an autonomous Retand element as a template DNA. After hybridization, detection was performed using CDP-Star detection reagent. BAC DNA was isolated using the Qiagen Plasmid Midi kit. BAC DNA was digested with restriction enzymes and desired fragments containing parts of Retand elements were subcloned into linearized and dephosphorylated plasmid pBluescript II SK+ and transformed into XL1-Blue cells. Subcloned fragments were sequenced using GeneJumper Primer Insertion Kit for Sequencing (Invitrogen) based on transposon-mediated insertions of primer binding sites. All parts of the Retand elements were read 2–4 times (by VBC-Genomics, Vienna) and the clone sequences were assembled with Staden Package software (Staden et al. 1996).

Fluorescence in situ hybridization

In order to synchronize the germinating seeds of S. latifolia and S. viscosa (seeds come from the seed collection of the Institute of Biophysics, Brno, Czech Republic), the DNA polymerase inhibitor aphidicolin was added for 12 h, and mitoses were then accumulated with oryzalin as described in Siroky et al. (2001). Slides were treated as described in Lengerova et al. (2004) with slight modifications. Slide denaturation was performed in 7:3 (v/v) formamide: 2× SSC for 2 min at 72°C. Slides were immediately dehydrated through 50, 70, and 100% ethanol (−20°C), and air dried. The probe was denatured at 70°C for 10 min, and 100 ng of the denatured probe was added at room temperature and hybridized for 18 h at 37°C. Slides were analyzed using Olympus Provis microscope, and image analysis was performed using ISIS software (Metasystems). DNA was labeled with Fluorolink Cy3-dUTP (Amersham Pharmacia Biotech) (red labeling) in combination with the nick translation mix (Roche) or with SpectrumGreen direct-labeled dUTP (green labeling) and the Nick Translation kit (both Vysis).

RNA isolation and RT-PCR

Total RNA was isolated from S. latifolia male plant tissues (leaves, flower buds or roots) by extraction with RNA Blue (TopBio) and chloroform. All RNA samples were treated with RNase free DNase (Gibco-BRL) to remove any contaminant DNA. Reverse transcription was carried out with the SuperScriptIII First-Strand Synthesis System for RT-PCR (Gibco-BRL) by random priming method, with 1 μg of the template RNA. RNase H was added to remove the RNA template from the cDNA:RNA hybrid molecule after first-strand synthesis. The RT-PCR reaction mix (50 μl) consisted of 1× PCR buffer, 0.2 mM dNTP, 0.2 μM primers, 1.5 mM MgCl2, 1 U of Taq polymerase (TopBio) and 10–100 ng of reverse-transcribed RNA or equal amount of reverse transcriptase-untreated RNA as a negative control. The following primers were used in the RT-PCR experiments (sequences of primers see above): LTR-F1, LTR-R1, POL-F1, POL-R1, STR1-F1, STR1-R1, ORF3-F1

... hybridization (FISH) signals, we found a novel non-autonomous retroelement that we named Retand-1 because it contained a tandem repeat array instead of genes (retroelement with tandem repeat). This element was 3,679 bp long and contained almost identical long terminal repeats (LTRs) differing in only one base at 3′ end (99.8% identity) indicating a recent insertion of the element. The LTRs began with TG and ended with CA, similarly as the LTRs of most plant retrotransposons. We propose that the sequence TATAAAT located in the center of left LTR could serve as putative transcription initiation signal for Retand-1. At both ends of the element, we found 5 bp direct repeats CTTTG, which probably represent the target site duplications (TSD) generated during the element integration. Several features necessary for retroelement replication were also identified, including primer binding site (PBS) located downstream of the left LTR and polypurine tract (PPT) located upstream of the right LTR (Fig.
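
The structural hallmarks reported for Retand-1 (LTRs that begin with TG and end with CA, a 5 bp target site duplication flanking the element, and near-identical left and right LTRs) can be checked mechanically once an element's sequence is in hand. The following is only a minimal Python sketch of such a check, not the procedure used by the authors. The function names, the flanking sequences and the toy LTR are all hypothetical; in particular, the 500 bp toy LTR length is chosen purely so that a single mismatch yields 99.8% identity and is not claimed to be the real Retand-1 LTR length.

```python
from typing import Dict, Optional, Union


def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def ltr_hallmarks(
    left_ltr: str,
    right_ltr: str,
    upstream: str,
    downstream: str,
    tsd_len: int = 5,
) -> Dict[str, Union[bool, float, Optional[str]]]:
    """Report simple LTR-retrotransposon hallmarks (hypothetical helper).

    upstream/downstream are the genomic sequences immediately flanking
    the element on its left and right side, respectively.
    """
    left_tsd = upstream[-tsd_len:]
    right_tsd = downstream[:tsd_len]
    return {
        # Both LTRs should start with TG and end with CA.
        "termini_ok": all(
            ltr.startswith("TG") and ltr.endswith("CA")
            for ltr in (left_ltr, right_ltr)
        ),
        # The target site duplication, or None if the flanks disagree.
        "tsd": left_tsd if left_tsd == right_tsd else None,
        "ltr_identity": percent_identity(left_ltr, right_ltr),
    }


# Toy data (made up): a 500 bp LTR and a copy with one substitution near the 3' end.
toy_left = "TG" + "ACGT" * 124 + "CA"
toy_right = toy_left[:497] + "G" + toy_left[498:]

report = ltr_hallmarks(toy_left, toy_right, "GGATCCTTTG", "CTTTGAATTC")
print(report)
```

With real data, the two LTRs would first need to be aligned (they may differ by indels as well as substitutions), and the flanks would be taken from the BAC sequence surrounding the insertion.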