Sequence Composition and Genome Organization of Maize

Joachim Messing*†, Arvind K. Bharti*, Wojciech M. Karlowski‡, Heidrun Gundlach‡, Hye Ran Kim§, Yeisoo Yu§, Fusheng Wei§, Galina Fuks*, Carol A. Soderlund¶, Klaus F. X. Mayer‡, and Rod A. Wing§

*Plant Genome Initiative at Rutgers, Waksman Institute, Rutgers, The State University of New Jersey, Piscataway, NJ 08854; ‡Munich Information Center for Protein Sequences, Institute for Bioinformatics, GSF Research Center for Environment and Health, D-85764 Neuherberg, Germany; and §Arizona Genomics Institute and ¶Arizona Genomics Computational Laboratory, University of Arizona, Tucson, AZ 85721

Communicated by Brian A. Larkins, University of Arizona, Tucson, AZ, August 20, 2004 (received for review July 22, 2004)

Zea mays L. ssp. mays, or corn, one of the most important crops and a model for plant genetics, has a genome ≈80% the size of the human genome. To gain global insight into the organization of its genome, we have sequenced the ends of large insert clones, yielding a cumulative length of one-eighth of the genome with a DNA sequence read every 6.2 kb, thereby describing a large percentage of the genes and transposable elements of maize in an unbiased approach. Based on the cumulative 307 Mb of sequence, repeat sequences occupy 58% and genic regions occupy 7.5%. A conservative estimate predicts ≈59,000 genes, which is higher than in any other organism sequenced so far. Because the sequences are derived from bacterial artificial chromosome clones, which are ordered in overlapping bins, tagged genes are also ordered along continuous chromosomal segments. Based on this positional information, roughly one-third of the genes appear to consist of tandemly arrayed gene families. Although the ancestor of maize arose by tetraploidization, fewer than half of the genes appear to be present in two orthologous copies, indicating that the maize genome has undergone significant gene loss since the duplication event.

maize genome | whole-genome sequence tags | map-based sequence | whole-genome duplication | gene families

Plant genomes differ from mammalian genomes by their enormous range in size. Whereas mammalian genomes range between 2 and 3 gigabases (Gb), the major crop plants vary from 0.4 Gb in rice to 16 Gb in wheat. The model plant Arabidopsis has an even smaller genome of 0.125 Gb and was, therefore, the first plant to be sequenced in its entirety (1). The only other plant genome that has been sequenced is from rice (http://rgp.dna.affrc.go.jp). Arabidopsis and rice belong to the two major divisions of the plant kingdom, the dicotyledonous and monocotyledonous plants, respectively. The grass family (Gramineae) is one of the largest monocotyledonous families; it arose 65 million years ago and comprises >10,000 species adapted to one-third of the arable land on earth (2). Cereals including rice, sorghum, sugarcane, oat, barley, wheat, and maize (corn) are examples of Gramineae species; they provide a major source of food and feed for humans.

A significant feature emerging from the genome sequences of Arabidopsis and rice is the large number of genes compared to mammalian genomes. Although the gene number in Arabidopsis is comparable to that in human and mouse, rice appears to have a much larger gene set, due largely to gene amplification (3). Does gene number in plants correlate at all with genome size (4)? Such a question arises, particularly in light of the fact that many crop plants have undergone whole-genome duplication. Maize is an interesting example of such a duplication event. In contrast to wheat, maize has not maintained homoeologous chromosomes, but rather has undergone reassortment of the homoeologous regions acquired from the two progenitor genomes. These regions were first demonstrated cytologically between nonhomologous chromosomes (5) and then later by genetic linkage mapping of nontandem gene duplicates (6). A comprehensive genetic analysis of homoeologous regions was performed with DNA markers (7) and by comparative mapping to close relatives of maize (8, 9).

Although genetic and cytogenetic analyses provided us with a global view of the organization of the maize genome, a more detailed analysis will come from its DNA sequence. In fact, the maize genome is most likely the next plant genome that will be sequenced after Arabidopsis and rice. However, because of its suspected greater sequence complexity than the human genome, maize is also thought to be the next technical challenge in genome sequencing. When large genomes are dissected into overlapping bacterial artificial chromosome (BAC) clones, these clones can be assembled into megabase (Mb)-sized chromosomal fragments through fingerprinting methods. After the fragments are assigned to their chromosomal locations, sequences from the BAC clones become positioned relative to the genetic map, thereby serving as anchors for other sequences.

Following this concept, we have generated ≈475,000 maize BAC end sequences (BESs) with a cumulative length of 307 Mb, corresponding to about one-eighth of the genome. BES reads averaged 647 bp with an average distribution of one end every 6.2 kb of the genome. Besides constituting a framework for sequencing the maize genome, the BESs provided us with a comprehensive, quantitative data set, which allowed us to assess maize transposable element (TE) and gene content. Moreover, by examining the physical linkage of BESs, we determined that a large proportion of the maize gene set consists of tandemly arrayed gene families, and that a heavy loss of unlinked duplicated genes must have occurred during the transition from a tetraploid to a diploid species.

Materials and Methods

Processing of BAC Clones. Data from three large insert BAC libraries of maize inbred B73 were used to anchor the DNA sequence to the genetic map. The libraries had previously been constructed by partial digestion of genomic DNA with HindIII, EcoRI, and MboI (10, 11), providing a 30-fold physical coverage of the 2.365-Gb genome (12).

BAC clones were retrieved from 384-well storage plates and grown in 96 deep-well plates to saturation. DNAs were extracted with Whatman Unifilters by using a Tomtec liquid handling system. DNA precipitates were resuspended in 40 µl of buffer; samples of 10 µl were used for the forward and reverse sequencing of the BAC ends with universal primers; another 10-µl aliquot was used for DNA fingerprinting. About 75% of the clones were assembled into a physical map of contiguous, overlapping clones (www.genome.

Freely available online through the PNAS open access option.

Abbreviations: Gb, gigabase(s); Mb, megabase(s); BAC, bacterial artificial chromosome; BES, BAC end sequence; TE, transposable element; GSS, genome survey sequence; TC, tentative consensus; HC, high C0t-derived; MF, methyl-filtered; GFS, gene family signature.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. can be found in Table 4, which is published as supporting information on the PNAS web site).

†To whom correspondence should be addressed. E-mail: [email protected].

© 2004 by The National Academy of Sciences of the USA. www.pnas.org/cgi/doi/10.1073/pnas.0406163101. PNAS | October 5, 2004 | vol. 101 | no. 40 | 14349–14354

Table 1. Distribution of repetitive DNA in rice and various GSSs of maize as percent of genome

| Details | Total BES | HindIII BES | EcoRI BES | MboI BES | UF | MF | HC | RescueMu | Rice |
|---|---|---|---|---|---|---|---|---|---|
| Total no. of sequences | 474,604 | 309,560 | 78,313 | 86,731 | 50,876 | 30,000 | 30,000 | 30,000 | 179 |
| Total no. of base pairs | 307,169,410 | 206,221,247 | 46,673,217 | 54,274,946 | 37,621,118 | 21,649,324 | 21,637,862 | 10,074,088 | 35,800,000 |
| Percent of genome, % | 12.99 | 8.72 | 1.97 | 2.29 | 1.59 | 0.92 | 0.91 | 0.43 | 8.33 |
| Class I retroelements, % | 55.60 | 58.39 | 46.77 | 52.58 | 57.73 | 15.70 | 6.55 | 4.91 | 19.28 |
| Ty1/copia-like elements, % | 24.54 | 25.33 | 20.54 | 24.97 | 17.50 | 8.37 | 2.54 | 1.98 | 3.67 |
| Ty3/gypsy-like elements, % | 21.43 | 23.39 | 15.69 | 18.94 | 31.25 | 4.32 | 2.22 | 1.71 | 8.90 |
| LINES + SINES, % | 0.18 | 0.23 | 0.06 | 0.06 | 0.04 | 0.06 | 0.18 | 0.13 | 0.96 |
| Other retroelements, % | 9.45 | 9.44 | 10.48 | 8.61 | 8.94 | 2.95 | 1.62 | 1.09 | 5.74 |
| Class II DNA transposons, % | 0.98 | 0.94 | 1.16 | 0.98 | 0.92 | 1.14 | 1.58 | 4.84 | 10.35 |
| hAT superfamily, % | 0.03 | 0.02 | 0.04 | 0.03 | 0.04 | 0.04 | 0.09 | 0.11 | 0.37 |
| CACTA superfamily, % | 0.32 | 0.31 | 0.38 | 0.30 | 0.21 | 0.21 | 0.17 | 0.76 | 2.83 |
| Mutator, % | 0.01 | 0.01 | 0.02 | 0.02 | 0.01 | 0.05 | 0.05 | 0.36 | 0.51 |
| MITEs, % | 0.17 | 0.16 | 0.20 | 0.17 | 0.19 | 0.39 | 0.62 | 1.08 | 3.96 |
| Other DNA transposons, % | 0.45 | 0.42 | 0.54 | 0.48 | 0.48 | 0.44 | 0.65 | 2.53 | 2.67 |
| Simple repeats, % | 0.40 | 0.27 | 0.47 | 0.83 | 1.66 | 1.27 | 0.19 | 0.17 | 1.03 |
| High-copy-number genes, % | 0.82 | 0.12 | 1.13 | 3.17 | 1.95 | 0.17 | 0.19 | 0.15 | 0.06 |
| Other repeats, % | 0.12 | 0.11 | 0.12 | 0.16 | 0.30 | 0.09 | 0.06 | 0.05 | 0.43 |
| Total repeats, % | 57.91 | 59.82 | 49.65 | 57.72 | 62.55 | 18.38 | 8.57 | 10.13 | 31.14 |

Greater than 70% identity limit.
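The first three rows of Table 1 can be cross-checked with simple arithmetic. The following is a minimal Python sketch, not the authors' pipeline: it assumes the 2.365-Gb genome size quoted in Materials and Methods, takes the per-library read counts and base-pair totals from Table 1, and uses illustrative function and variable names that do not come from the paper.

```python
# Illustrative cross-check of Table 1; all names here are hypothetical.
MAIZE_GENOME_BP = 2_365_000_000  # 2.365-Gb genome size quoted in Materials and Methods

# (number of sequences, total base pairs) per BAC library, copied from Table 1
BES_LIBRARIES = {
    "HindIII": (309_560, 206_221_247),
    "EcoRI": (78_313, 46_673_217),
    "MboI": (86_731, 54_274_946),
}


def percent_of_genome(total_bp, genome_bp=MAIZE_GENOME_BP):
    """Cumulative sequence length as a percentage of the genome size."""
    return 100.0 * total_bp / genome_bp


def mean_read_length(n_reads, total_bp):
    """Average read length in base pairs."""
    return total_bp / n_reads


# Summing the three libraries should reproduce the "Total BES" column.
total_reads = sum(n for n, _ in BES_LIBRARIES.values())
total_bp = sum(bp for _, bp in BES_LIBRARIES.values())

if __name__ == "__main__":
    print(f"total reads: {total_reads:,}")
    print(f"total bp: {total_bp:,}")
    print(f"percent of genome: {percent_of_genome(total_bp):.2f}")
    print(f"mean read length (bp): {mean_read_length(total_reads, total_bp):.0f}")
    for name, (n, bp) in BES_LIBRARIES.items():
        print(f"{name}: {percent_of_genome(bp):.2f}% of genome")
```

Under these assumptions the library sums match the "Total BES" column, the percentage agrees with the 12.99% (about one-eighth) reported in the table, and the mean read length agrees with the 647 bp stated in the text.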