The Substrate Specificity of N-End Rule Components from Arabidopsis Thaliana

Total Page:16

File Type:pdf, Size:1020Kb

The Substrate Specificity of N-End Rule Components from Arabidopsis Thaliana The substrate specificity of N-end rule components from Arabidopsis thaliana Dissertation zur Erlangung des Doktorgrades der Naturwissenschaften (Dr. rer. nat.) der Naturwissenschaftlichen Fakultät I – Biowissenschaften – der Martin-Luther-Universität Halle-Wittenberg, vorgelegt von Frau Maria Katharina Klecker geb. am 03.10.1986 in Marburg Gutachter Prof. Dr. Ingo Heilmann, Martin-Luther-Universität Halle-Wittenberg Prof. Dr. Nico Dissmeyer, Universität Osnabrück Prof. Dr. Frederica Theodoulou, Rothamsted Research, Harpenden Verteidigung 18. Dezember 2019 Table of contents Table of contents Table of contents ...................................................................................................................................... I Abbreviations .......................................................................................................................................... V Summary ................................................................................................................................................ IX Zusammenfassung ................................................................................................................................... X 1. Introduction ..................................................................................................................................... 1 1.1 Protein degradation in plants .................................................................................................. 1 1.1.1 Bulk protein degradation ................................................................................................. 1 1.1.2 Selective protein degradation – the 26S proteasome ..................................................... 1 1.1.3 Ubiquitin .......................................................................................................................... 2 1.1.4 Plant E3 ligases ................................................................................................................ 4 1.2 The N-end rule pathway of protein degradation .................................................................... 7 1.2.1 Functions of the N-end rule pathway .............................................................................. 9 1.2.2 Primary destabilizing amino acids and N-recognins ...................................................... 10 1.2.3 Secondary destabilizing N-termini ................................................................................ 15 1.2.4 Tertiary destabilizing N-termini ..................................................................................... 16 1.3 Male gametogenesis in Arabidopsis ...................................................................................... 18 1.3.1 Regulation of pollen development ................................................................................ 21 1.4 The phytohormone ethylene................................................................................................. 26 1.4.1 Impact of ethylene on plant physiology ........................................................................ 26 1.4.2 Ethylene signaling .......................................................................................................... 28 1.5 The phytohormone cytokinin ................................................................................................ 32 1.5.1 Cytokinin during etiolation ............................................................................................ 33 2. Aims and scope of the thesis ......................................................................................................... 35 3. Results ........................................................................................................................................... 37 3.1 The genetic background in prt1-1ts confers thermosensitivity of pollen development ........ 37 3.1.1 The prt1-1ts line is sterile at an elevated temperature ................................................. 37 3.1.2 The thermosensitivity of prt1-1ts is independent of the PRT1 locus ............................. 38 3.1.3 The heat-induced sterility of prt1-1ts is caused by a defect in male gametogenesis .... 40 3.1.4 Microspore to pollen transition at 28°C is affected in prt1-1ts...................................... 43 3.1.5 Conditional male sterility of prt1-1ts is likely a sporophytic trait .................................. 46 3.1.6 Several genes are differentially expressed in anthers of prt1-1ts .................................. 47 3.2 PRT1 promoter activity .......................................................................................................... 50 3.3 Biochemical characterization of the PRT1 protein ................................................................ 53 I Table of contents 3.3.1 PRT1 acts as an E3 ubiquitin ligase in vitro.................................................................... 54 3.3.2 Both RING domains of PRT1 are required for autoubiquitination ................................ 56 3.3.3 The in-vitro substrate specificity of PRT1 ...................................................................... 59 3.3.4 PRT1 RING domains are not involved in substrate binding in vitro .............................. 62 3.3.5 In-vitro PRT1 binding to Arabidopsis peptide sequences .............................................. 64 3.4 EIN2 is a potential in-vivo substrate for PRT1 ....................................................................... 66 3.4.1 EIN2C-term is destabilized in a PRT1-dependent manner in vivo ..................................... 67 3.4.2 Apparent molecular weight of EIN2C-term ....................................................................... 73 3.4.3 Physical interaction between EIN2 and PRT1 in vivo .................................................... 73 3.4.4 Hypocotyl elongation in the dark is slightly increased in prt1-1ts ................................. 76 3.4.5 Cytokinin-induced inhibition of hypocotyl elongation in the dark is impaired in prt1-1ts and N119 ....................................................................................................................................... 78 3.4.6 Expression of EIN2C-term confers signs of constitutive ethylene response in combination with the prt1-1ts genotype ............................................................................................................ 84 3.5 The substrate specificities of PRT6 and ATE1 ........................................................................ 86 3.5.1 The in-vitro substrate specificity of the UBR domain of PRT6 ...................................... 86 3.5.2 ATE1 arginylates acidic N-termini in vitro ..................................................................... 90 4. Discussion ...................................................................................................................................... 93 4.1 A new temperature-dependent sterility phenotype ............................................................. 93 4.1.1 Potential origin of the sterility phenotype in prt1-1ts ................................................... 93 4.1.2 The thermosensitivity of prt1-1ts reproduction ............................................................. 96 4.2 PRT1 acts as an E3 ligase in-vitro using both RING domains ............................................... 103 4.3 The in-vitro substrate specificity of plant N-recognins ....................................................... 106 4.3.1 Determinants of N-degron recognition by PRT1 ......................................................... 106 4.3.2 BB is an in-vivo substrate of PRT1 ............................................................................... 110 4.3.3 In-vitro substrate interactions of PRT6 ........................................................................ 112 4.3.4 In-vitro arginylation activity of ATE1 ........................................................................... 114 4.4 A potential role for PRT1 in EIN2 signaling .......................................................................... 116 4.4.1 Physical interaction of EIN2 and PRT1 ......................................................................... 116 4.4.2 EIN2C-term has low in-vivo protein stability ................................................................... 117 4.4.3 In-vivo stability of EIN2C-term depends on PRT1 activity ............................................... 118 4.4.4 EIN2C-term degradation in planta .................................................................................. 121 4.5 The genetic background of prt1-1 lines confers cytokinin hyposensitivity in the dark ...... 123 4.5.1 Sugar-dependence of cytokinin hyposensitivity in prt1-1ts/N119 ............................... 124 5. Conclusions .................................................................................................................................. 128 II Table of contents 6. Materials and Methods ............................................................................................................... 130 6.1 Plant growth conditions ...................................................................................................... 130 6.2 Plant material ...................................................................................................................... 130 6.3 Hormone treatments ........................................................................................................... 131
Recommended publications
  • 4-6 Weeks Old Female C57BL/6 Mice Obtained from Jackson Labs Were Used for Cell Isolation
    Methods Mice: 4-6 weeks old female C57BL/6 mice obtained from Jackson labs were used for cell isolation. Female Foxp3-IRES-GFP reporter mice (1), backcrossed to B6/C57 background for 10 generations, were used for the isolation of naïve CD4 and naïve CD8 cells for the RNAseq experiments. The mice were housed in pathogen-free animal facility in the La Jolla Institute for Allergy and Immunology and were used according to protocols approved by the Institutional Animal Care and use Committee. Preparation of cells: Subsets of thymocytes were isolated by cell sorting as previously described (2), after cell surface staining using CD4 (GK1.5), CD8 (53-6.7), CD3ε (145- 2C11), CD24 (M1/69) (all from Biolegend). DP cells: CD4+CD8 int/hi; CD4 SP cells: CD4CD3 hi, CD24 int/lo; CD8 SP cells: CD8 int/hi CD4 CD3 hi, CD24 int/lo (Fig S2). Peripheral subsets were isolated after pooling spleen and lymph nodes. T cells were enriched by negative isolation using Dynabeads (Dynabeads untouched mouse T cells, 11413D, Invitrogen). After surface staining for CD4 (GK1.5), CD8 (53-6.7), CD62L (MEL-14), CD25 (PC61) and CD44 (IM7), naïve CD4+CD62L hiCD25-CD44lo and naïve CD8+CD62L hiCD25-CD44lo were obtained by sorting (BD FACS Aria). Additionally, for the RNAseq experiments, CD4 and CD8 naïve cells were isolated by sorting T cells from the Foxp3- IRES-GFP mice: CD4+CD62LhiCD25–CD44lo GFP(FOXP3)– and CD8+CD62LhiCD25– CD44lo GFP(FOXP3)– (antibodies were from Biolegend). In some cases, naïve CD4 cells were cultured in vitro under Th1 or Th2 polarizing conditions (3, 4).
    [Show full text]
  • A SARS-Cov-2 Protein Interaction Map Reveals Targets for Drug Repurposing
    Article A SARS-CoV-2 protein interaction map reveals targets for drug repurposing https://doi.org/10.1038/s41586-020-2286-9 A list of authors and affiliations appears at the end of the paper Received: 23 March 2020 Accepted: 22 April 2020 A newly described coronavirus named severe acute respiratory syndrome Published online: 30 April 2020 coronavirus 2 (SARS-CoV-2), which is the causative agent of coronavirus disease 2019 (COVID-19), has infected over 2.3 million people, led to the death of more than Check for updates 160,000 individuals and caused worldwide social and economic disruption1,2. There are no antiviral drugs with proven clinical efcacy for the treatment of COVID-19, nor are there any vaccines that prevent infection with SARS-CoV-2, and eforts to develop drugs and vaccines are hampered by the limited knowledge of the molecular details of how SARS-CoV-2 infects cells. Here we cloned, tagged and expressed 26 of the 29 SARS-CoV-2 proteins in human cells and identifed the human proteins that physically associated with each of the SARS-CoV-2 proteins using afnity-purifcation mass spectrometry, identifying 332 high-confdence protein–protein interactions between SARS-CoV-2 and human proteins. Among these, we identify 66 druggable human proteins or host factors targeted by 69 compounds (of which, 29 drugs are approved by the US Food and Drug Administration, 12 are in clinical trials and 28 are preclinical compounds). We screened a subset of these in multiple viral assays and found two sets of pharmacological agents that displayed antiviral activity: inhibitors of mRNA translation and predicted regulators of the sigma-1 and sigma-2 receptors.
    [Show full text]
  • The Changing Chromatome As a Driver of Disease: a Panoramic View from Different Methodologies
    The changing chromatome as a driver of disease: A panoramic view from different methodologies Isabel Espejo1, Luciano Di Croce,1,2,3 and Sergi Aranda1 1. Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain 2. Universitat Pompeu Fabra (UPF), Barcelona, Spain 3. ICREA, Pg. Lluis Companys 23, Barcelona 08010, Spain *Corresponding authors: Luciano Di Croce ([email protected]) Sergi Aranda ([email protected]) 1 GRAPHICAL ABSTRACT Chromatin-bound proteins regulate gene expression, replicate and repair DNA, and transmit epigenetic information. Several human diseases are highly influenced by alterations in the chromatin- bound proteome. Thus, biochemical approaches for the systematic characterization of the chromatome could contribute to identifying new regulators of cellular functionality, including those that are relevant to human disorders. 2 SUMMARY Chromatin-bound proteins underlie several fundamental cellular functions, such as control of gene expression and the faithful transmission of genetic and epigenetic information. Components of the chromatin proteome (the “chromatome”) are essential in human life, and mutations in chromatin-bound proteins are frequently drivers of human diseases, such as cancer. Proteomic characterization of chromatin and de novo identification of chromatin interactors could thus reveal important and perhaps unexpected players implicated in human physiology and disease. Recently, intensive research efforts have focused on developing strategies to characterize the chromatome composition. In this review, we provide an overview of the dynamic composition of the chromatome, highlight the importance of its alterations as a driving force in human disease (and particularly in cancer), and discuss the different approaches to systematically characterize the chromatin-bound proteome in a global manner.
    [Show full text]
  • Content Based Search in Gene Expression Databases and a Meta-Analysis of Host Responses to Infection
    Content Based Search in Gene Expression Databases and a Meta-analysis of Host Responses to Infection A Thesis Submitted to the Faculty of Drexel University by Francis X. Bell in partial fulfillment of the requirements for the degree of Doctor of Philosophy November 2015 c Copyright 2015 Francis X. Bell. All Rights Reserved. ii Acknowledgments I would like to acknowledge and thank my advisor, Dr. Ahmet Sacan. Without his advice, support, and patience I would not have been able to accomplish all that I have. I would also like to thank my committee members and the Biomed Faculty that have guided me. I would like to give a special thanks for the members of the bioinformatics lab, in particular the members of the Sacan lab: Rehman Qureshi, Daisy Heng Yang, April Chunyu Zhao, and Yiqian Zhou. Thank you for creating a pleasant and friendly environment in the lab. I give the members of my family my sincerest gratitude for all that they have done for me. I cannot begin to repay my parents for their sacrifices. I am eternally grateful for everything they have done. The support of my sisters and their encouragement gave me the strength to persevere to the end. iii Table of Contents LIST OF TABLES.......................................................................... vii LIST OF FIGURES ........................................................................ xiv ABSTRACT ................................................................................ xvii 1. A BRIEF INTRODUCTION TO GENE EXPRESSION............................. 1 1.1 Central Dogma of Molecular Biology........................................... 1 1.1.1 Basic Transfers .......................................................... 1 1.1.2 Uncommon Transfers ................................................... 3 1.2 Gene Expression ................................................................. 4 1.2.1 Estimating Gene Expression ............................................ 4 1.2.2 DNA Microarrays ......................................................
    [Show full text]
  • Transcription of Endogenous Retrovirus Group K Members and Their Neighboring Genes in Chicken Skeletal Muscle Myoblasts
    http://www.jstage.jst.go.jp/browse/jpsa doi:10.2141/ jpsa.0200021 Copyright Ⓒ 2021, Japan Poultry Science Association. Transcription of Endogenous Retrovirus Group K Members and Their Neighboring Genes in Chicken Skeletal Muscle Myoblasts Tomohide Takaya1, 2, 3, Yuma Nihashi1, Tamao Ono2 and Hiroshi Kagami2 1 Department of Science and Technology, Graduate School of Medicine, Science and Technology, Shinshu University, Nagano 399-4598, Japan 2 Department of Agricultural and Life Science, Faculty of Agriculture, Shinshu University, Nagano 399-4598, Japan 3 Department of Biomolecular Innovation, Institute for Biomedical Sciences, Shinshu University, Nagano 399-4598, Japan Skeletal muscle myoblasts are myogenic precursor cells that generate myofibers during muscle development and growth. We recently reported that broiler myoblasts, compared to layer myoblasts, proliferate and differentiate more actively and promptly into myocytes, which corresponds well with the muscle phenotype of broilers. Furthermore, RNA sequencing (RNA-seq) revealed that numerous genes are differentially expressed between layer and broiler myoblasts during myogenic differentiation. Based on the RNA-seq data, we herein report that chicken myoblasts transcribe endogenous retrovirus group K member (ERVK) genes. In total, 16 ERVKs were highly expressed in layer myoblasts and two (termed BrK1 and BrK2) were significantly induced in broiler myoblasts. These transcribed ERVKs had a totalof 182 neighboring genes within ±100 kb on the chromosomes, of which 40% were concentrated within ±10 kb of the ERVKs. We further investigated whether the transcription of ERVKs affects the expression of their neighboring genes. BrK1 had two neighboring genes; LOC107052719 was overlapping with BrK1 and down- regulated in the broiler myoblasts, and FAM19A2 was upregulated in the broiler myoblasts as well as BrK1.
    [Show full text]
  • Detection of H3k4me3 Identifies Neurohiv Signatures, Genomic
    viruses Article Detection of H3K4me3 Identifies NeuroHIV Signatures, Genomic Effects of Methamphetamine and Addiction Pathways in Postmortem HIV+ Brain Specimens that Are Not Amenable to Transcriptome Analysis Liana Basova 1, Alexander Lindsey 1, Anne Marie McGovern 1, Ronald J. Ellis 2 and Maria Cecilia Garibaldi Marcondes 1,* 1 San Diego Biomedical Research Institute, San Diego, CA 92121, USA; [email protected] (L.B.); [email protected] (A.L.); [email protected] (A.M.M.) 2 Departments of Neurosciences and Psychiatry, University of California San Diego, San Diego, CA 92103, USA; [email protected] * Correspondence: [email protected] Abstract: Human postmortem specimens are extremely valuable resources for investigating trans- lational hypotheses. Tissue repositories collect clinically assessed specimens from people with and without HIV, including age, viral load, treatments, substance use patterns and cognitive functions. One challenge is the limited number of specimens suitable for transcriptional studies, mainly due to poor RNA quality resulting from long postmortem intervals. We hypothesized that epigenomic Citation: Basova, L.; Lindsey, A.; signatures would be more stable than RNA for assessing global changes associated with outcomes McGovern, A.M.; Ellis, R.J.; of interest. We found that H3K27Ac or RNA Polymerase (Pol) were not consistently detected by Marcondes, M.C.G. Detection of H3K4me3 Identifies NeuroHIV Chromatin Immunoprecipitation (ChIP), while the enhancer H3K4me3 histone modification was Signatures, Genomic Effects of abundant and stable up to the 72 h postmortem. We tested our ability to use H3K4me3 in human Methamphetamine and Addiction prefrontal cortex from HIV+ individuals meeting criteria for methamphetamine use disorder or not Pathways in Postmortem HIV+ Brain (Meth +/−) which exhibited poor RNA quality and were not suitable for transcriptional profiling.
    [Show full text]
  • Thirty Loci Identified for Heart Rate Response to Exercise and Recovery
    ARTICLE DOI: 10.1038/s41467-018-04148-1 OPEN Thirty loci identified for heart rate response to exercise and recovery implicate autonomic nervous system Julia Ramírez 1,2, Stefan van Duijvenboden1,2, Ioanna Ntalla1, Borbala Mifsud1,3, Helen R Warren1,3, Evan Tzanis1,3, Michele Orini 4,5, Andrew Tinker1,3, Pier D. Lambiase2,4 & Patricia B. Munroe 1,3 Δ ex 1234567890():,; Impaired capacity to increase heart rate (HR) during exercise ( HR ), and a reduced rate of recovery post-exercise (ΔHRrec) are associated with higher cardiovascular mortality rates. Currently, the genetic basis of both phenotypes remains to be elucidated. We conduct genome-wide association studies (GWASs) for ΔHRex and ΔHRrec in ~40,000 individuals, followed by replication in ~27,000 independent samples, all from UK Biobank. Six and seven single-nucleotide polymorphisms for ΔHRex and ΔHRrec, respectively, formally replicate. In a full data set GWAS, eight further loci for ΔHRex and nine for ΔHRrec are genome-wide significant (P ≤ 5×10−8). In total, 30 loci are discovered, 8 being common across traits. Processes of neural development and modulation of adrenergic activity by the autonomic nervous system are enriched in these results. Our findings reinforce current understanding of HR response to exercise and recovery and could guide future studies evaluating its con- tribution to cardiovascular risk prediction. 1 Clinical Pharmacology, William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK. 2 Institute of Cardiovascular Science, University College London, London WC1E 6BT, UK. 3 NIHR Barts Cardiovascular Biomedical Research Centre, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK.
    [Show full text]
  • Exome Sequencing Identifies Pathogenic and Modifier Mutations
    ORIGINAL RESEARCH Exome Sequencing Identifies Pathogenic and Modifier Mutations in a Child With Sporadic Dilated Cardiomyopathy Pamela A. Long, BS; Brandon T. Larsen, MD, PhD; Jared M. Evans, MS; Timothy M. Olson, MD Background-—Idiopathic dilated cardiomyopathy (DCM) is typically diagnosed in adulthood, yet familial cases exhibit variable age- dependent penetrance and a subset of patients develop sporadic DCM in childhood. We sought to discover the molecular basis of sporadic DCM in an 11-year-old female with severe heart failure necessitating cardiac transplantation. Methods and Results-—Parental echocardiograms excluded asymptomatic DCM. Whole exome sequencing was performed on the family trio and filtered for rare, deleterious, recessive, and de novo variants. Of the 8 candidate genes identified, only 2 had a role in cardiac physiology. A de novo missense mutation in TNNT2 was identified, previously reported and functionally validated in familial DCM with markedly variable penetrance. Additionally, recessive compound heterozygous truncating mutations were identified in XIRP2, a member of the ancient Xin gene family, which governs intercalated disc (ICD) maturation. Histomorphological analysis of explanted heart tissue revealed misregistration, mislocalization, and shortening of ICDs, findings similar to Xirp2À/À mice. Conclusions-—The synergistic effects of TNNT2 and XIRP2 mutations, resulting in perturbed sarcomeric force generation and transmission, respectively, would account for an early-onset heart failure phenotype. Whereas the importance
    [Show full text]
  • Mouse Zyg11b Conditional Knockout Project (CRISPR/Cas9)
    http://www.alphaknockout.com/ Mouse Zyg11b Conditional Knockout Project (CRISPR/Cas9) Objective: To create a Zyg11b conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering. Strategy summary: The Zyg11b gene ( NCBI Reference Sequence: NM_001033634 ; Ensembl: ENSMUSG00000034636 ) is located on mouse chromosome 4. 14 exons are identified , with the ATG start codon in exon 1 and the TGA stop codon in exon 14 (Transcript Zyg11b- 201: ENSMUST00000043616). Exon 3 will be selected as conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the mouse Zyg11b gene. To engineer the targeting vector, homologous arms and cKO region will be generated by PCR using BAC clone RP24-272K18 as template. Cas9, gRNA and targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Exon 3 starts from about 8.83% of the coding region. The knockout of Exon 3 will result in frameshift of the gene. The size of intron 2 for 5'-loxP site insertion: 5645 bp, and the size of intron 3 for 3'-loxP site insertion: 5727 bp. The size of effective cKO region: ~1570 bp. This strategy is designed based on genetic information in existing databases. Due to the complexity of biological processes, all risk of loxP insertion on gene transcription, RNA splicing and protein translation cannot be predicted at existing technological level. Page 1 of 7 http://www.alphaknockout.com/ Overview of the Targeting Strategy Wildtype allele gRNA region 5' gRNA region 3' 1 3 14 Targeting vector Targeted allele Constitutive KO allele (After Cre recombination) Legends Exon of mouse Zyg11b Homology arm cKO region loxP site Page 2 of 7 http://www.alphaknockout.com/ Overview of the Dot Plot Window size: 10 bp Forward Reverse Complement Sequence 12 Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats.
    [Show full text]
  • Supplemental Data
    Supplemental Table 1: List of the 570 genes with conserved Stat3 binding sites (CBS) in at least four species differentially expressed between C56BL/6 and FVB/N remnant kidneys 60 days after nephron reduction. Gene Symbol Ratio CBS in Name Abhd1 27.86 5 species abhydrolase domain containing 1 Pi4k2b 19.99 4 species phosphatidylinositol 4-kinase type 2 beta Pik3c2g 11.64 4 species phosphoinositide-3-kinase, class 2, gamma polypeptide Snap91 11.31 4 species synaptosomal-associated protein, 91kDa homolog (mouse) Ocel1 10.58 ≥ 6 species occludin/ELL domain containing 1 Slc34a2 8.86 4 species solute carrier family 34 (sodium phosphate), member 2 Psmc3ip 8.68 ≥ 6 species PSMC3 interacting protein Hddc3 7.17 ≥ 6 species HD domain containing 3 Lcn2 6.52 4 species lipocalin 2 Pea15a 5.09 5 species phosphoprotein enriched in astrocytes 15 Gfra1 5.09 4 species GDNF family receptor alpha 1 H2-DMb2 4.94 4 species major histocompatibility complex, class II, DM beta Atf3 4.88 4 species activating transcription factor 3 Wdfy1 4.85 4 species WD repeat and FYVE domain containing 1 Serpina10 4.73 4 species serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 10 Pm20d1 4.71 4 species peptidase M20 domain containing 1 Aldh4a1 4.35 4 species aldehyde dehydrogenase 4 family, member A1 Wasf2 4.10 4 species WAS protein family, member 2 Edn1 4.07 4 species endothelin 1 Zkscan1 3.98 4 species zinc finger with KRAB and SCAN domains 1 Cyr61 3.94 4 species cysteine-rich, angiogenic inducer, 61 Adamts1 3.76 5 species ADAM metallopeptidase
    [Show full text]
  • Structure-Function Relationships of Rna and Protein in Synaptic Plasticity
    University of Pennsylvania ScholarlyCommons Publicly Accessible Penn Dissertations 2017 Structure-Function Relationships Of Rna And Protein In Synaptic Plasticity Sarah Middleton University of Pennsylvania, [email protected] Follow this and additional works at: https://repository.upenn.edu/edissertations Part of the Bioinformatics Commons, Biology Commons, and the Neuroscience and Neurobiology Commons Recommended Citation Middleton, Sarah, "Structure-Function Relationships Of Rna And Protein In Synaptic Plasticity" (2017). Publicly Accessible Penn Dissertations. 2474. https://repository.upenn.edu/edissertations/2474 This paper is posted at ScholarlyCommons. https://repository.upenn.edu/edissertations/2474 For more information, please contact [email protected]. Structure-Function Relationships Of Rna And Protein In Synaptic Plasticity Abstract Structure is widely acknowledged to be important for the function of ribonucleic acids (RNAs) and proteins. However, due to the relative accessibility of sequence information compared to structure information, most large genomics studies currently use only sequence-based annotation tools to analyze the function of expressed molecules. In this thesis, I introduce two novel computational methods for genome-scale structure-function analysis and demonstrate their application to identifying RNA and protein structures involved in synaptic plasticity and potentiation—important neuronal processes that are thought to form the basis of learning and memory. First, I describe a new method for de novo identification of RNA secondary structure motifs enriched in co-regulated transcripts. I show that this method can accurately identify secondary structure motifs that recur across three or more transcripts in the input set with an average recall of 0.80 and precision of 0.98. Second, I describe a tool for predicting protein structural fold from amino acid sequence, which achieves greater than 96% accuracy on benchmarks and can be used to predict protein function and identify new structural folds.
    [Show full text]
  • Modeling Gene Regulation from Paired Expression and Chromatin Accessibility Data
    Modeling gene regulation from paired expression and PNAS PLUS chromatin accessibility data Zhana Durena,b,c, Xi Chenb, Rui Jiangd,1, Yong Wanga,c,1, and Wing Hung Wongb,1 aAcademy of Mathematics and Systems Science, National Center for Mathematics and Interdisciplinary Sciences, Chinese Academy of Sciences, Beijing 100080, China; bDepartment of Statistics, Department of Biomedical Data Science, Bio-X Program, Stanford University, Stanford, CA 94305; cSchool of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; and dMinistry of Education Key Laboratory of Bioinformatics, Bioinformatics Division and Center for Synthetic & Systems Biology, Tsinghua National Laboratory for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China Contributed by Wing Hung Wong, May 8, 2017 (sent for review March 20, 2017; reviewed by Christina Kendziorski and Sheng Zhong) The rapid increase of genome-wide datasets on gene expression, gene expression data, accessibility data are available for a diverse set chromatin states, and transcription factor (TF) binding locations offers of cellular contexts (Fig. 1, blue boxes). In fact, we expect the an exciting opportunity to interpret the information encoded in amount of matched expression and accessibility data (i.e., measured genomes and epigenomes. This task can be challenging as it requires on the same sample) will increase very rapidly in the near future. joint modeling of context-specific activation of cis-regulatory ele- The purpose of the present work is to show that, by using ments (REs) and the effects on transcription of associated regulatory matched expression and accessibility data across diverse cellular factors. To meet this challenge, we propose a statistical approach contexts, it is possible to recover a significant portion of the in- based on paired expression and chromatin accessibility (PECA) data formation in the missing data on binding location and chromatin across diverse cellular contexts.
    [Show full text]