Patterns and Signals of Biology: an Emphasis on the Role of Post


University of Nebraska at Omaha
DigitalCommons@UNO
Student Work, 5-2016

Patterns and Signals of Biology: An Emphasis On The Role of Post Translational Modifications in Proteomes for Function and Evolutionary Progression

Oliver Bonham-Carter

Follow this and additional works at: https://digitalcommons.unomaha.edu/studentwork
Part of the Computer Sciences Commons

A DISSERTATION
Presented to the Faculty of The Graduate College at the University of Nebraska In Partial Fulfillment of Requirements For the Degree of Doctor of Philosophy
Major: Information Technology
Under the Supervision of Dhundy Bastola, PhD and Hesham Ali, PhD
Omaha, Nebraska
May, 2016

SUPERVISORY COMMITTEE
Chair: Dhundy (Kiran) Bastola, PhD
Co-Chair: Hesham Ali, PhD
Lotfollah Najjar, PhD
Steven From, PhD

ProQuest Number: 10124836
Published by ProQuest LLC (2016). Copyright of the Dissertation is held by the Author. All rights reserved. This work is protected against unauthorized copying under Title 17, United States Code.

Abstract

Oliver Bonham-Carter, MS, PhD
University of Nebraska, 2016
Advisors: Dhundy (Kiran) Bastola, PhD and Hesham Ali, PhD

After synthesis, a protein is still immature until it has been customized for a specific task. Post-translational modifications (PTMs) are the steps in biosynthesis that perform this customization, giving proteins their unique functionalities. PTMs are also important to protein survival because they enable proteins to adapt rapidly to environmental stress factors through conformational change. The overarching contribution of this thesis is the construction of a computational profiling framework for the study of biological signals stemming from PTMs associated with stressed proteins. In particular, this work has been developed to predict and detect the biological mechanisms involved in types of stress response with PTMs in mitochondrial (Mt) and non-Mt proteins. Before any mechanism can be studied, there must first be some evidence of its existence. This evidence takes the form of signals such as biases of biological actors and types of protein interaction. Our framework has been developed to locate these signals, distilled from “Big Data” resources such as public databases and the entire PubMed literature corpus. We apply this framework to study these signals and to learn about protein stress responses involving PTMs and modification sites (MSs).
We developed this framework, and its approach to analysis, according to three main facets: (1) statistical evaluation, to determine patterns of signal dominance throughout large volumes of data; (2) signal location, to track down the regions where the mechanisms are to be found, according to the types and numbers of associated actors at relevant regions in proteins; and (3) text mining, to determine how these signals have been previously investigated by researchers. The results gained from our framework enable us to uncover the PTM actors, MSs and protein domains which are the major components of particular stress response mechanisms and may play roles in protein malfunction and disease.

Copyright

Copyright 2016, Oliver Bonham-Carter.

We never work alone in this business.
Oliver Bonham-Carter

Dedication

I dedicate this work to my wife, Janyl, and son, Vincent; my sisters, Kate and Daisy; my brother, Alexander; and my mother and father, Jane and Nicholas. These are the people whom I likely subjected to many long-winded and unwarranted updates about the progression of this document and of my degree in general. This work is also dedicated to my advisors, Dr. Dhundy (Kiran) Bastola and Dr. Hesham Ali, who served on my doctorate committee with Dr. Lotfollah Najjar and Dr. Steven From. By your constant and tireless efforts, mentoring, and coaching, I have been inspired to create an exciting work of scientific investigation. I would also like to thank the following wonderful people for their constant warmth, support and friendship: Ishwor Thapa, Dr. Jasjit Banwait-Kaur, Sean West and Scott McGrath. It is very likely that I also subjected these good people to far too many general updates on this work, and I thank them earnestly for their counsel. This work is also dedicated to my friends from the bioinformatics lab at the University of Nebraska at Omaha: Dr. Sanjukta Bhowmick, Dr.
Kate Cooper, Julia Warnke, Vladimir Ufimtsev, Jay Pedersen, Kritika Karri, Kaitlin Goettsch, Asuda Sharma, Sunandini Sharma, Suyeon Kim, Sahil Sethi and many others too! Without the encouragement of my family and friends, I could not have returned any of this work to our scientific community, from whom I have taken so much.

Contents

1 Introduction
  1.1 Signals After Mechanisms
  1.2 A Focus On PTMs, Proteins And Their Typical Stress Responses
    1.2.1 Protein Regulation And Modification
    1.2.2 Sudden Stresses
    1.2.3 PTM Localization And Study
    1.2.4 Protein Domains
  1.3 Mitochondrial And Non-Mitochondrial Protein
    1.3.1 Mitochondria: Energy Production And Stress Response
      1.3.1.1 An Energy Crisis
    1.3.2 Mechanisms From Signal Detection
2 The Ubiquitous Nature Of Signals In Biology And Living Systems
  2.1 Traffic Signals
  2.2 Signals Of The Heart
  2.3 Natural Signals Of Organismal Biology
    2.3.1 Mobile Organisms
    2.3.2 Immobile Organisms
  2.4 Signals, Semantics And Comprehension
3 Objectives
  3.1 Motivation
  3.2 General Organization
  3.3 Thesis Contributions
    3.3.1 Contribution 1
    3.3.2 Contribution 2
    3.3.3 Contribution 3
    3.3.4 Contribution 4
    3.3.5 Contribution 5
    3.3.6 Contribution 6
    3.3.7 Contribution 7
    3.3.8 Contribution 8
    3.3.9 Contribution 9
    3.3.10 Contribution 10
4 Alignment-Free Genetic Sequence Comparisons: A Review of Recent Approaches By Word Analysis
  4.1 Abstract
  4.2 Introduction
  4.3 Background
  4.4 Factor Frequencies
    4.4.1 BBC By Analysis Of Mutual Information
    4.4.2 Feature Frequency Profiles (FFPs)
      4.4.2.1 A Comparison By Frequencies And The Jensen-Shannon Divergence Test
    4.4.3 Suffix Trees By k-mer Frequencies
    4.4.4 Composition Vectors Based On k-mer Frequencies
      4.4.4.1 Creation Of Composition Vectors, (CVs, CCVs)
      4.4.4.2 Creation Of Improved CCV’s
      4.4.4.3 Distance Measurement
    4.4.5 A Revised String Composition Method
    4.4.6 D2 Statistic
  4.5 Data Compression And Dictionaries
    4.5.1 Text Compression Algorithms
    4.5.2 Average Common Substring (ACS)
  4.6 Applications Of Alignment-Free Methods
    4.6.1 Biological Data And Sequence Assembly
    4.6.2 Chromosomal Data And Phylogeny
    4.6.3 Horizontal Gene Transfer
  4.7 Advantages And Disadvantages Of Methods
  4.8 Conclusion
  4.9 Article Details
5 An Analysis Of Palindromic Content In The Coding And Non-coding DNA Regions Of Bacteria
  5.1 Abstract
  5.2 Introduction
  5.3 Methods
  5.4 Results And Discussion
    5.4.1 Lengths-{4,6,8,10} Palindromes
  5.5 Conclusions
  5.6 Article Details
6 A Base Composition Analysis Of Natural Patterns For The Pre-Processing Of Metagenome Sequences
  6.1 Abstract
  6.2 Introduction And Related Work
  6.3 Methods
    6.3.1 Genome Sequences
    6.3.2 Read And Contig Sequences
    6.3.3 Motifs
      6.3.3.1 Base Compositions And Spectrum Sets
    6.3.4 Proportions
  6.4 Results And Discussion
    6.4.1 Sequence Data