Conserved Sequence-Tagged Sites

Proc. Natl. Acad. Sci. USA
Vol. 89, pp. 3681-3685, May 1992
Genetics

Conserved sequence-tagged sites: A phylogenetic approach to genome mapping
(primates/mouse/sequence conservation/evolution/PCR)

RICHARD MAZZARELLA*, VITTORIO MONTANARO*, JUHA KERE*, ROLLAND REINBOLD*, ALFREDO CICCODICOLA†, MICHELE D'URSO†, AND DAVID SCHLESSINGER*‡

*Department of Molecular Microbiology and the Center for Genetics in Medicine, Washington University School of Medicine, St. Louis, MO 63110; and †International Institute of Genetics and Biophysics, Consiglio Nazionale delle Ricerche, Naples, Italy

Communicated by James D. Watson, February 3, 1992

ABSTRACT Cognate sites in genomes that diverged 100 million years ago can be detected by PCR assays based on primer pairs from unique sequences. The great majority of such syntenically equivalent sequence-tagged sites (STSs) from human DNA can be used to assemble and format corresponding maps for other primates, and some based on gene sequences are shown to be useful for mouse and rat as well. Universal genomic mapping strategies may be possible by using sets of STSs common to many mammalian species.

Abbreviations: STS, sequence-tagged site; Mb, megabase(s); F9, factor IX; HPRT, hypoxanthine-guanine phosphoribosyltransferase; GRP94, 94-kDa glucose-regulated protein; GLA, α-galactosidase A.

‡To whom reprint requests should be addressed.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

The unity of biochemistry is based on conservation of genes during evolution. With the increasing interest in maps of complex genomes (1), the subclass of genomic sequences that are most tightly conserved takes on a special significance. Such sequences are already reagents to detect homologous genes in different organisms (2, 3). But the order as well as the sequence of genes tends to be conserved across extensive blocks of syntenically equivalent DNA in complex organisms. As a result, such sequences may provide a way to assemble overlapping clones and format the resulting genomic maps in many species, even across branches of the phylogenetic tree.

Relevant syntenically equivalent relationships are well established, for example, between mouse and human genomes. Spanning 100 million years of evolution (4), the content and order of genes are conserved across regions of the order of a cytogenetic band (5) [2-10 megabases (Mb) or more]. On the X chromosome, where the content has been largely fixed during evolution (6), only a few regions have shifted their relative positions (7), and, on autosomes, the gene content of large blocks of chromosomal DNA seems to have been conserved during translocation from one chromosome to another (8).

Clones for different species have been correlated by direct hybridization of cDNA probes. Even in cases in which a cDNA from human is only partially homologous, it can be used as a hybridization probe to screen for the corresponding cDNA in mouse (e.g., see ref. 9). Hybridization provides a proven way to organize overlapping clones into long-range maps (10-12), and it is fast and cheap. However, most current syntenic equivalence studies with cDNA probes involve isolation of a species-specific cDNA before location in the genome is determined; few random DNA probes have been shown to cross-hybridize efficiently and specifically with genomic DNA from a variety of species.

An alternative approach to cross-species physical mapping is conceivable based on use of the PCR. As has been discussed (13), their portability, ease, and potential for automation make PCR methods an attractive alternative for genomic mapping studies wherever they are feasible and, for example, often permit the easy discrimination of members of a gene family, or of true genes from pseudogenes (14), that are difficult by hybridization methods.

In the formulation of Olson et al. (13), a genomic map is formatted with sequence-tagged sites (STSs). Each STS is characterized by a unique primer pair and the corresponding PCR product; different types of STSs may be developed for various purposes. Some STS PCR products detect length polymorphisms, usually based on their content of dinucleotide or other simple sequence repeats (15, 16). Such STSs are highly informative probes for genetic linkage mapping and provide markers for physical maps, but they would usually be relatively species specific. STSs for physical mapping can also be developed indifferently from any fragments of unique sequence (17), but such STSs could well be specific to one or a few species (see below). In contrast, STSs derived from evolutionarily conserved gene sequences, including cDNA sequences (18), could provide relatively universal mapping reagents.

Despite the potential advantages, it is not intuitively obvious that syntenically equivalent STSs are feasible mapping tools. Although PCRs can yield greater specificity, the primers are much shorter than hybridization probes and require greater stringency to maintain that specificity. Syntenically equivalent STS content mapping then depends on the extent to which evolution has conserved pockets of sufficiently high sequence identity between disparate species.

Apart from some information about cDNA sequences, little is known about the conservation of genomic sequence, especially intergenic regions. It has therefore been unclear to what extent STSs from random fragments of human DNA can detect corresponding regions even among closely related species. This study has examined the ability of STS primers made from human DNA sequences (i) to function across species at the genomic level of DNA complexity; (ii) to accommodate a mismatch or degenerate sequence to increase their likelihood of success; and (iii) to provide syntenically equivalent products from cDNA sequences and random DNA fragments. We then attempted to determine the degree to which such "conserved STSs" are possible.

MATERIALS AND METHODS

PCR Analysis. One hundred nanograms of total genomic DNA from each mammalian species and chicken was amplified by PCR in a 15-µl reaction mixture containing 6 pmol of the appropriate primer pair as listed in Table 1 and 0.3 unit of Amplitaq (Perkin-Elmer/Cetus). For each primer pair, the optimal TNK buffer was determined

Table 1. Gene-specific and random genomic STSs used for study of sequence conservation

STS name GenBank                 Primer 1 (5'-3')         Primer 2 (5'-3')              Length  Specific PCR product (HUM GOR CHI MAC MOU RAT YEA)
F9       J00137, M23109          CTTCAGTACCTTAGAGTTCC     CCATATTTGCCTTTCATTGC          221     + + +
HPRT     M31642, J00423          AGCTTGCTGGTGAAAAGG       TCATTATAGTCAAGGGCATATC        278
GRP94    X15187, J03297, M14772  CTGAA(G/A)AAGGGCTATGAAGT  AACCTCTT(C/G)CCATCAAA(C/T)TC  89
MIC2     M22556                  GCTCTATGTTTCCAAGAAG      GTTTACAGCCCTCTGAATG           84      + + + + - - -
PLP      M15026                  GAGAAGATGGAGCCCTTA       TCCTCTTCTCCTGCAATGAAA         153     + + + + + + -
HRASP    X00419                  CTGAACCACCAGTGCTTCG      CACACCATCACAGACAGCC           167     + + + + - - -
AMD      M21154                  CTGTATCTGCCTCTATTTC      GTTACTAAAGTTCAGGTTCC          139     + + + + + + -
GF1A     M30601                  AGCCCAGGTTAATCCCCAG      TGTGGAGGACACCAGAGCAG          107     + + + + - - -
POLA     X06745                  CAGGGAGTTTTGTATCTTC      CTTTTTCAGTCTTTCTAGGG          83      + + + + - - -
GLA      M13571                  CTAGAGCACTGGACAATGG      GTCAAGGTTGCACATGAAG           80      + + + + + + -
TBG      M14091                  CAGCGTTTTCATAATGTTGC     TAATATGGACAGGGAGTAG           93      + + + + - - -
L1CAM    M30257                  TGAATACCCTCCCAGGCAC      ATCTTCCCAGGCATTTTAAG          99      + + + - - -
COL4A5   M31115                  CAGGAGAAAAAGGTAGTAAAGG   TTTTGAGCCCAGAAGATTTG          80      + + + + - - -
sWXD93                           CATAGAACAAGCAGAAGG       CAGAAAGAAGATATTGCTGG          102     + + + +
sWXD94                           CAAAACTTTCCTACCTACC      CTGACCATACACATAATCC           130     + + + + - - -
sWXD95                           AATTTAGGCAAGAGCAGC       TTCTCCCCAAATAAATCCC           60      + + + + - - -
sWXD96                           ATCGTGCTGCTGTACTCC       GGCAGATATGAAACTGAGG           128     + + + - - - -
sWXD97                           GGAGGGAAGAAGAGAGGG       CAGCGAGAGTTAGTGAGG            137     + + + - - - -
sWXD98                           CAACTGGGATAAGTCACC       GTGATTGAGAATGAATGGG           106     + + + + - - -
sWXD99                           CCCTTCACTCACCTTCCC       CAGATAGTTCTTTATAGCAGTGCG      102     + + + + - - -
sWXD100                          CGTGCTTAGGCTTAATCCCC     GAACTGACTGTAGAGAAGG           145     + + + + - - -
sWXD101                          GAAATTCTTCACTACCTCC      AACACATCTCAGACATCC            160     + + + + - - -
sWXD102                          CTTTGATAGTTCAGGTTTGC     GAGAATCTTCTGTCTAGG            122     + + +
sWXD115                          GCTGTAGATTCACTTTCG       AAGACCTACCAAAGCTCC            142     + - + - -
sWXD117                          CATTTTGTAGCTGAGAAAGG     GCAATTCAAGGAACATAACTGG        79      + + + + - - -
sWXD118                          CTCTTTTCCTTAATCCAACCC    CCACTGTGCTATACTGCC            100     + + + + - - -
sWXD119                          GATCAACACGGCTCTCGG       CTGGGCTCTTGGCTAAGG            73      + + + + - - -
sWXD121                          TCCTTTTATCCCCATATTTC     TTTCTCTCAGCACATTTATCC         60      + + + + - - -

Human Gene Mapping (HGM) (19) symbols are used for gene-specific STSs. GRP94 (20) and GF1A (21) are not HGM designations and refer to the 94-kDa glucose-regulated protein and to the erythroid DNA-binding protein. STSs generated from randomly isolated genomic DNA are named according to laboratory terminology with s representing STS, W representing Washington University, and XD identifying the X chromosome project followed by an acquisition number. STS length (bp) refers to the expected product size from human DNA. Nucleotides in parentheses indicate a degenerate oligonucleotide position. HUM, human; GOR, gorilla; CHI, chimpanzee; MAC, macaque; MOU, mouse; RAT, rat; YEA, yeast (Saccharomyces cerevisiae). GenBank accession numbers for the human sequence are listed, followed by accession numbers for the mouse and chicken sequence, respectively, when applicable. +, Detection of a band of identical or very nearly identical size; -, no strong products of nearly identical size.
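The Table 1 conventions (parenthesised degenerate positions and an expected product size per primer pair) can be illustrated with a short sketch. It is not part of the study's methods: the function names `expand_degenerate`, `reverse_complement`, and `product_length` are hypothetical, the template is synthetic, and matching is exact, which ignores the mismatch tolerance and annealing conditions that govern a real PCR.

```python
import itertools
import re


def expand_degenerate(primer):
    """Expand 'CTGAA(G/A)AAGG'-style notation into every concrete sequence."""
    # re.split with one capture group puts the alternatives at odd indices.
    parts = re.split(r"\(([ACGT/]+)\)", primer.replace(" ", ""))
    choices = [p.split("/") if i % 2 else [p] for i, p in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def product_length(template, primer1, primer2):
    """Length of the product bounded by both primers, or None if not found.

    Assumption: primer1 matches the given strand exactly and primer2 matches
    the opposite strand exactly, downstream of primer1.
    """
    for fwd in expand_degenerate(primer1):
        start = template.find(fwd)
        if start < 0:
            continue
        for rev in expand_degenerate(primer2):
            site = reverse_complement(rev)
            end = template.find(site, start + len(fwd))
            if end >= 0:
                return end + len(site) - start
    return None


if __name__ == "__main__":
    # GRP94 primer 1 from Table 1 carries one degenerate position.
    print(expand_degenerate("CTGAA(G/A)AAGGGCTATGAAGT"))

    # Synthetic template built around the MIC2 primer pair (illustration only).
    fwd, rev = "GCTCTATGTTTCCAAGAAG", "GTTTACAGCCCTCTGAATG"
    toy = "TT" + fwd + "ACGTACGT" + reverse_complement(rev) + "GG"
    print(product_length(toy, fwd, rev))
```

Exact string matching keeps the sketch simple. A version that scores near matches would be closer to the question the study asks, which is how much primer-site divergence between species an STS assay can tolerate.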