Sourabh JAIN

Aix-Marseille Université
Faculté de Médecine de Marseille
Ecole Doctorale des Sciences de la Vie et de la Santé

DOCTORAL THESIS

Presented by Sourabh JAIN
Date and place of birth: 14 May 1984, India

Comparative genomic study for identifying gene acquisitions in Megavirales

Thesis defended on 6 July 2017
For the degree of Doctor of Aix-Marseille Université

Members of the thesis jury
Dr. Pierre PONTAROTTI, Thesis Director
Prof. Didier RAOULT, Thesis Co-Director
Prof. Patrick FORTERRE, Rapporteur
Dr. Franck PANABIERES, Rapporteur
Prof. Jean Louis MEGE, Examiner

Host laboratories
URMITE, Unité de Recherche sur les Maladies Infectieuses et Tropicales Emergentes, UMR 6236, Faculté de Médecine, 27 Boulevard Jean Moulin, 13385 Marseille, France
I2M UMR-CNRS 7373, Evolution Biologique et Modélisation, Aix-Marseille Université, 3 place V. Hugo, case 19, 13331 Marseille Cedex 3, France

CONTENTS
Abstract
Résumé
Foreword
Chapter 1: Introduction
Chapter 2: Megavirale diversity and evolution
Chapter 3: MimiLook: A Phylogenetic Workflow for Detection of Gene Acquisition in Major Orthologous Groups of Megavirales
Chapter 4: Contribution of horizontal gene transfers in evolution of family specific genome mosaicism of Megavirales
Conclusions & Future Perspective
Acknowledgements

Abstract

The discovery of giant viruses with giant genomes and surprising genomic features raises questions about their origin and evolution. The diversity of Megavirales (MVs) makes it difficult to evaluate their phylogenetic relationships collectively.
Although a small subset of conserved core genes, and the phylogenomic analyses based on them, provide a useful classification of MVs, they give little insight into the remaining non-conserved, variable gene content of the accessory genomes. Many phylogenetic studies have pointed out the decisive role of horizontal gene transfers (HGTs) and genetic exchanges in the evolution of MVs, but the majority of them are based on closely related MV families. Moreover, the reported proportion of horizontally acquired genes varies greatly with the methodologies used for their detection and for interpreting the resulting phylogenies. It is therefore necessary to adopt a systematic search for reticulate evolutionary events such as HGT in MVs, in order to decipher the genomic composition and genome mosaicism of distantly related MV families. To investigate the contribution of HGTs in distantly related MV families, we determined gene distributions and gene phylogenies for the 86 complete MV ORFomes, classified into 6 defined and 4 putative families, in the context of their homologs from other domains of life. First, we prepared an automated phylogenetic workflow, MimiLook, which deduces orthologous groups (OGs) from the ORFomes of MVs and constructs phylogenies by performing alignment generation, alignment editing and BLASTP searches across the NCBI nr protein sequence database. Finally, this tool detects statistically validated gene acquisition events with the help of the T-REX algorithm. We found 4577 clusters of orthologous groups, of which 91% are family specific (i.e. represented by species classified in one MV family only), whereas only 9% are represented by proteins from 2 or more MV families. In step 2 of our analysis, we found 414 OGs with a detected HGT event. 174 were inferred to have been transferred from eukaryotes, 106 from bacteria, and 9 gene families from cellular domains other than eukaryotes or bacteria (archaea, and viruses, including phages).
52 OGs were detected as cases of sympatric transfers (gene transfer by association of MVs with more than one cellular domain). Interestingly, 129 gene families were identified as being involved in gene transfers from MVs to other cellular domains. We applied a similar procedure to the 7,898 non-orthologous proteins to detect transfer events and putative donors, and identified 259 instances of HGT, of which 135 were from eukaryotes, 82 from bacteria, 11 from phages and other viruses, and 31 were cases where MVs transferred proteins to other cellular domains. Instances of HGT showed donor specificity: viruses of vertebrates/invertebrates (Poxviridae, Ascoviridae and Iridoviridae) acquired genes from donors such as Euteleostomi, Eutheria, Baculoviridae and Proteobacteria, while algal viruses (Phycodnaviridae) and protozoan viruses (pandoravirus, Mimiviridae, pithovirus and Marseilleviridae) were found to acquire genes mainly from cellular donors such as Dictyostelium, Mamiellales, Firmicutes, Clostridiales, Klebsormidium, Rozella allomycis, Oomycetes and Phytophthora. In conclusion, a clear distinction can be seen in the genome mosaicism of distantly related Megavirales families, which evolved through genome specificity and family-specific gene acquisitions from their respective ecological niches. The evolution of Megavirales families can be inferred from phylogenetic analysis of a few core genes as well as from similarities in their gene contents; however, given that horizontal gene transfer plays a major role in shaping the gene contents of Megavirales, deciphering the evolution of all Megavirales families by this approach alone may be unreliable.

Keywords: Megavirales; Horizontal gene transfer; MimiLook; Comparative genomics; phylogeny

Résumé

The diversity of Megavirales (MVs) makes it difficult to evaluate their phylogenetic relationships collectively. Although a small subset of conserved core genes, and the phylogenomic analyses based on them, provide a useful classification of MVs, they give little insight into the remaining non-conserved, variable gene content of the accessory genomes. Many phylogenetic studies have underlined the decisive role of HGTs and genetic exchanges in the evolution of MVs, but most of them are based on closely related MV families. Moreover, the reported proportion of horizontally acquired genes varies considerably with the methodologies used for their detection and for interpreting the resulting phylogenies. It is therefore necessary to adopt a systematic search for reticulate evolutionary events such as HGT in MVs, in order to decipher the genomic composition and genome mosaicism of distantly related MV families. To study the contribution of HGTs in distantly related MV families, we determined gene distributions and gene phylogenies for the 86 complete MV ORFomes, classified into 6 defined and 4 putative families, in the context of their homologs from other domains of life. First, we prepared an automated phylogenetic workflow, MimiLook, which determines the orthologous groups (OGs) of the MV ORFomes and constructs phylogenies by generating alignments of their BLASTP homologs. Finally, this tool detects statistically validated gene acquisition events with the help of T-REX. We found 4577 clusters of orthologous groups, of which 91% are family specific, whereas only 9% are represented by proteins from 2 or more MV families. In step 2 of our analysis, we found 414 OGs with a detected HGT event. 174 were inferred to have been transferred from eukaryotes, 106 from bacteria, and 9 gene families from cellular domains other than eukaryotes or bacteria. 52 OGs were detected as cases of sympatric transfers. Notably, 129 gene families were identified as being involved in gene transfers from MVs to other cellular domains. Similarly, we analysed the 7898 non-orthologous proteins to detect transfer events and putative donors, and identified 259 instances of HGT, of which 135 were from eukaryotes, 82 from bacteria, 11 from phages and other viruses, and 31 were cases where MVs transferred proteins to other cellular domains. Instances of HGT showed donor specificity: viruses of vertebrates/invertebrates (Poxviridae, Ascoviridae and Iridoviridae) acquired genes from donors such as Euteleostomi, Eutheria, Baculoviridae and Proteobacteria, while algal viruses (Phycodnaviridae) and protozoan viruses (pandoravirus, Mimiviridae, pithovirus and Marseilleviridae) acquired genes mainly from cellular donors such as Dictyostelium, Mamiellales, Firmicutes, Clostridiales, Klebsormidium, Rozella allomycis, Oomycetes and Phytophthora. In conclusion, a clear distinction can be observed in the genome mosaicism of distantly related Megavirales families, which evolved through genome specificity and family-specific gene acquisitions from their respective ecological niches. The evolution of Megavirales families can be inferred from phylogenetic analysis of a few core genes as well as from similarities in their gene contents; however, given that horizontal gene transfer plays a major role in shaping the gene contents of Megavirales, deciphering the evolution of all Megavirales families by this approach alone may be unreliable.

Keywords: Megavirales; Horizontal