Bioinformatics Tools for the Analysis of Gene-Phenotype Relationships Coupled with a Next Generation Chip-Sequencing Data Processing Pipeline

Total Page:16

File Type:pdf, Size:1020Kb

Bioinformatics Tools for the Analysis of Gene-Phenotype Relationships Coupled with a Next Generation Chip-Sequencing Data Processing Pipeline Bioinformatics Tools for the Analysis of Gene-Phenotype Relationships Coupled with a Next Generation ChIP-Sequencing Data Processing Pipeline Erinija Pranckeviciene Thesis submitted to the Faculty of Graduate and Postdoctoral Studies in partial fulfillment of the requirements for the Doctorate in Philosophy degree in Cellular and Molecular Medicine Department of Cellular and Molecular Medicine Faculty of Medicine University of Ottawa c Erinija Pranckeviciene, Ottawa, Canada, 2015 Abstract The rapidly advancing high-throughput and next generation sequencing technologies facilitate deeper insights into the molecular mechanisms underlying the expression of phenotypes in living organisms. Experimental data and scientific publications following this technological advance- ment have rapidly accumulated in public databases. Meaningful analysis of currently avail- able data in genomic databases requires sophisticated computational tools and algorithms, and presents considerable challenges to molecular biologists without specialized training in bioinfor- matics. To study their phenotype of interest molecular biologists must prioritize large lists of poorly characterized genes generated in high-throughput experiments. To date, prioritization tools have primarily been designed to work with phenotypes of human diseases as defined by the genes known to be associated with those diseases. There is therefore a need for more prioritiza- tion tools for phenotypes which are not related with diseases generally or diseases with which no genes have yet been associated in particular. Chromatin immunoprecipitation followed by next generation sequencing (ChIP-Seq) is a method of choice to study the gene regulation processes responsible for the expression of cellular phenotypes. Among publicly available computational pipelines for the processing of ChIP-Seq data, there is a lack of tools for the downstream analysis of composite motifs and preferred binding distances of the DNA binding proteins. This thesis is aimed to address the gap existing in the tools available to process high-throughput ChIP-Seq data to provide rapid analysis and interpretation of large lists of poorly characterized genes. Additionally, programs for the analysis of preferred binding distances of transcription factors were integrated into the pipeline for expedited results. A gene prioritization algorithm linking genes to non-disease phenotypes described by meaningful keywords was developed. This algo- rithm can be used to process candidate genetic targets of a transcription factor produced by a computational pipeline for ChIP-Seq data analysis. Contents Abstract i List of Figures vi List of Tables viii Abbreviations xi Acknowledgements xiii Preface xv 1 Introduction 1 1.1 ChIP-Seq Data Analysis................................5 1.1.1 Computational Processing of Sequenced Reads...............6 1.1.2 Available Computational Pipelines with Integrated Computational Proce- dures to Analyze ChIP-Seq Data....................... 11 1.2 Analysis of Gene Lists with the Enrichment of Gene Annotations......... 13 1.3 Gene Prioritization................................... 15 1.3.1 Current State of Phenotypic Resources for Phenotype Definitions..... 16 1.3.2 Approaches that are Used by Gene Prioritization Algorithms....... 19 1.3.2.1 Methods Defining the Phenotype by Training Genes....... 20 1.3.2.2 Methods Defining the Phenotype by Keywords.......... 22 1.3.2.3 Methods Prioritizing Genes for Disease Phenotypes....... 22 1.3.3 Inference of Relationships between Genes and Phenotypes......... 22 1.3.4 Concluding Remarks.............................. 26 1.4 Objectives, Contributions and Outline of this Thesis................ 27 2 Optimized ChIP-Sequencing Data Analysis Pipeline 30 2.1 Materials and Methods. Elements of ChIP-Seq Data Processing Pipeline..... 31 2.1.1 Command Line Tools............................. 32 2.1.1.1 Preprocessing Task......................... 32 2.1.1.2 Task of Alignment to a Reference Genome............ 33 2.1.1.3 Task of Peak Calling......................... 34 2.1.1.4 Estimation of Sufficiency of Coverage............... 37 2.1.2 Integrated Tools................................ 38 ii Contents iii 2.1.2.1 Characterization the Genome-Wide Binding of the Transcription Factor................................. 38 2.1.2.2 Task of Association of Peaks to Genes............... 40 2.1.2.3 Task of Motif Analysis in Peaks.................. 41 2.1.2.4 Analysis of Transcription Factor Composite Sequences...... 42 2.2 Conclusions....................................... 44 3 An Algorithm for the Prioritization of Candidate Genes 45 3.1 Materials and Methods: Components of the Prioritization Algorithm....... 47 3.1.1 Description of Input to the Gene Prioritization Algorithm......... 49 3.1.2 Computation of Gene-Phenotype Relationships Based on Literature... 50 3.1.3 Computations of Gene-Phenotype Relationships Based on Training Genes 53 3.1.4 Ranking and Prioritization of Candidate Genes............... 55 3.1.5 Availability of Gene Prioritization Algorithm................ 57 3.2 Mathematical Framework Underlying the Literature-Based Prioritization Method 58 3.2.1 The Fuzzy Set................................. 59 3.2.2 Fuzzy Binary Relation............................. 60 3.2.2.1 General Definition.......................... 61 3.2.2.2 Definition of Similarity and Inclusion FBR............ 62 3.2.3 Modeling the Relationships between Phenotype and Gene Functions by FBR 64 3.2.3.1 Definitions of FBR between MeSH terms and GO annotations. 65 3.2.3.2 Maximum Composition Operation................. 66 3.2.3.3 Inference of Relationships...................... 70 3.2.3.4 Thresholds in α-cuts......................... 71 3.3 Validation of Gene Prioritization Algorithms.................... 72 3.3.1 The benchmark................................. 73 3.3.2 Criteria to estimate performance of the prioritization algorithms..... 73 3.3.3 Method of Prioritization Performance Assessment Used in the Literature- Based Part of the Algorithm......................... 74 3.3.4 Method of Cross-Validation Prioritization Performance Assessment Used in the Algorithm Part Based on Training Genes............... 75 3.3.5 Benchmark Performance in Prioritization Algorithms............ 76 3.4 Application of the Developed Prioritization Algorithm............... 84 3.4.1 Cell Fusion Example.............................. 85 3.4.2 Cancer Related Genes............................. 90 3.5 Discussion. Advantages and Limitations of the Literature-Based Prioritization Algorithm........................................ 94 3.5.1 Factors Affecting the Prioritization...................... 94 3.5.1.1 Most Studied Topics......................... 94 3.5.1.2 Well Studied Phenotypes and Genes................ 95 3.5.1.3 Current State of Knowledge on Genes.............. 96 3.5.1.4 Good Candidates not on the Test List............... 98 3.5.2 Advantages................................... 99 3.5.2.1 Gene-Phenotype Relationships are Inferred without Reading the Full Text of the Articles....................... 99 3.5.2.2 Gene-Phenotype Links Can Be Interpreted in Natural Language 100 3.5.2.3 Specific and Rare Relationships are Promoted.......... 101 Contents iv 3.5.2.4 Physical and Chemical Domains Related to Gene Functions Are Identified............................... 102 3.5.3 Limitations................................... 103 3.5.3.1 Topics Most Studied in MEDLINE................. 103 3.5.3.2 Greater-Studied Genes Being Better Annotated......... 104 3.5.3.3 Heterogeneity of Gene Annotations in Genes with Different Roles 105 3.6 Conclusion....................................... 106 4 ChIP-Seq Data Analysis of BCL11B and TAL1 Transcription Factors 107 4.1 Pipeline Tools Applied to BCL11B ChIP-Seq data................. 109 4.2 Preferred Distances of TF Co-location in TAL1 Genome-Wide Binding Regions in ECF Cells...................................... 116 4.3 Conclusions....................................... 117 5 General Discussion 119 5.1 Overview of Findings.................................. 120 5.2 Examining Evidence of Gene-Phenotype Relationships............... 123 5.2.1 Application Testing.............................. 125 5.3 Evaluation of Performance............................... 126 5.4 Integrative Analysis of ChIP-Seq Data........................ 130 5.5 Future Directions.................................... 135 5.6 Conclusion....................................... 136 References 138 Appendix A. Supplementary Material for Chapter 3 169 A.1 Data sources Used to Create a Local Database for the Algorithm......... 171 A.2 Contents of the Local Database of the Prioritization Algorithm.......... 172 A.3 Parameters of Literature-Based Prioritization Algorithm.............. 173 A.4 Comparison of Literature-Based Prioritization Algorithm with CANDID Tool.. 175 A.5 Details of the \Respiration" Phenotype...................... 177 A.6 Details of the \DNA Repair" Phenotype...................... 178 A.7 Details of the \Autophagy" Phenotype....................... 179 A.8 Details of the \Vasoconstriction" Phenotype................... 181 A.9 User Manual...................................... 183 Appendix B. Supplementary Material for Chapter 2 190 B.10 Dependencies of the ChIP-Seq Data Analysis Tools................. 190 B.11 Characterization Genome Wide Binding of the Transcription Factor....... 192 B.12 Association of Peaks to Genes............................. 195 B.13 Motif Analysis in Peaks................................ 196 B.14 The Fasta from Bed Tool..............................
Recommended publications
  • CISD2 (NM 001008388) Human Tagged ORF Clone Product Data
    OriGene Technologies, Inc. 9620 Medical Center Drive, Ste 200 Rockville, MD 20850, US Phone: +1-888-267-4436 [email protected] EU: [email protected] CN: [email protected] Product datasheet for RC207131L3 CISD2 (NM_001008388) Human Tagged ORF Clone Product data: Product Type: Expression Plasmids Product Name: CISD2 (NM_001008388) Human Tagged ORF Clone Tag: Myc-DDK Symbol: CISD2 Synonyms: ERIS; Miner1; NAF-1; WFS2; ZCD2 Vector: pLenti-C-Myc-DDK-P2A-Puro (PS100092) E. coli Selection: Chloramphenicol (34 ug/mL) Cell Selection: Puromycin ORF Nucleotide The ORF insert of this clone is exactly the same as(RC207131). Sequence: Restriction Sites: SgfI-MluI Cloning Scheme: ACCN: NM_001008388 ORF Size: 405 bp This product is to be used for laboratory only. Not for diagnostic or therapeutic use. View online » ©2021 OriGene Technologies, Inc., 9620 Medical Center Drive, Ste 200, Rockville, MD 20850, US 1 / 2 CISD2 (NM_001008388) Human Tagged ORF Clone – RC207131L3 OTI Disclaimer: The molecular sequence of this clone aligns with the gene accession number as a point of reference only. However, individual transcript sequences of the same gene can differ through naturally occurring variations (e.g. polymorphisms), each with its own valid existence. This clone is substantially in agreement with the reference, but a complete review of all prevailing variants is recommended prior to use. More info OTI Annotation: This clone was engineered to express the complete ORF with an expression tag. Expression varies depending on the nature of the gene. RefSeq: NM_001008388.1 RefSeq Size: 5892 bp RefSeq ORF: 408 bp Locus ID: 493856 UniProt ID: Q8N5K1 Protein Families: Transmembrane MW: 15.3 kDa Gene Summary: The protein encoded by this gene is a zinc finger protein that localizes to the endoplasmic reticulum.
    [Show full text]
  • Analysis of Trans Esnps Infers Regulatory Network Architecture
    Analysis of trans eSNPs infers regulatory network architecture Anat Kreimer Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Graduate School of Arts and Sciences COLUMBIA UNIVERSITY 2014 © 2014 Anat Kreimer All rights reserved ABSTRACT Analysis of trans eSNPs infers regulatory network architecture Anat Kreimer eSNPs are genetic variants associated with transcript expression levels. The characteristics of such variants highlight their importance and present a unique opportunity for studying gene regulation. eSNPs affect most genes and their cell type specificity can shed light on different processes that are activated in each cell. They can identify functional variants by connecting SNPs that are implicated in disease to a molecular mechanism. Examining eSNPs that are associated with distal genes can provide insights regarding the inference of regulatory networks but also presents challenges due to the high statistical burden of multiple testing. Such association studies allow: simultaneous investigation of many gene expression phenotypes without assuming any prior knowledge and identification of unknown regulators of gene expression while uncovering directionality. This thesis will focus on such distal eSNPs to map regulatory interactions between different loci and expose the architecture of the regulatory network defined by such interactions. We develop novel computational approaches and apply them to genetics-genomics data in human. We go beyond pairwise interactions to define network motifs, including regulatory modules and bi-fan structures, showing them to be prevalent in real data and exposing distinct attributes of such arrangements. We project eSNP associations onto a protein-protein interaction network to expose topological properties of eSNPs and their targets and highlight different modes of distal regulation.
    [Show full text]
  • Title a New Centrosomal Protein Regulates Neurogenesis By
    Title A new centrosomal protein regulates neurogenesis by microtubule organization Authors: Germán Camargo Ortega1-3†, Sven Falk1,2†, Pia A. Johansson1,2†, Elise Peyre4, Sanjeeb Kumar Sahu5, Loïc Broic4, Camino De Juan Romero6, Kalina Draganova1,2, Stanislav Vinopal7, Kaviya Chinnappa1‡, Anna Gavranovic1, Tugay Karakaya1, Juliane Merl-Pham8, Arie Geerlof9, Regina Feederle10,11, Wei Shao12,13, Song-Hai Shi12,13, Stefanie M. Hauck8, Frank Bradke7, Victor Borrell6, Vijay K. Tiwari§, Wieland B. Huttner14, Michaela Wilsch- Bräuninger14, Laurent Nguyen4 and Magdalena Götz1,2,11* Affiliations: 1. Institute of Stem Cell Research, Helmholtz Center Munich, German Research Center for Environmental Health, Munich, Germany. 2. Physiological Genomics, Biomedical Center, Ludwig-Maximilian University Munich, Germany. 3. Graduate School of Systemic Neurosciences, Biocenter, Ludwig-Maximilian University Munich, Germany. 4. GIGA-Neurosciences, Molecular regulation of neurogenesis, University of Liège, Belgium 5. Institute of Molecular Biology (IMB), Mainz, Germany. 6. Instituto de Neurociencias, Consejo Superior de Investigaciones Científicas and Universidad Miguel Hernández, Sant Joan d’Alacant, Spain. 7. Laboratory for Axon Growth and Regeneration, German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany. 8. Research Unit Protein Science, Helmholtz Centre Munich, German Research Center for Environmental Health, Munich, Germany. 9. Protein Expression and Purification Facility, Institute of Structural Biology, Helmholtz Center Munich, German Research Center for Environmental Health, Munich, Germany. 10. Institute for Diabetes and Obesity, Monoclonal Antibody Core Facility, Helmholtz Center Munich, German Research Center for Environmental Health, Munich, Germany. 11. SYNERGY, Excellence Cluster of Systems Neurology, Biomedical Center, Ludwig- Maximilian University Munich, Germany. 12. Developmental Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, USA 13.
    [Show full text]
  • Anti-ARL4A Antibody (ARG41291)
    Product datasheet [email protected] ARG41291 Package: 100 μl anti-ARL4A antibody Store at: -20°C Summary Product Description Rabbit Polyclonal antibody recognizes ARL4A Tested Reactivity Hu, Ms, Rat Tested Application ICC/IF, IHC-P Host Rabbit Clonality Polyclonal Isotype IgG Target Name ARL4A Antigen Species Human Immunogen Recombinant fusion protein corresponding to aa. 121-200 of Human ARL4A (NP_001032241.1). Conjugation Un-conjugated Alternate Names ARL4; ADP-ribosylation factor-like protein 4A Application Instructions Application table Application Dilution ICC/IF 1:50 - 1:200 IHC-P 1:50 - 1:200 Application Note * The dilutions indicate recommended starting dilutions and the optimal dilutions or concentrations should be determined by the scientist. Calculated Mw 23 kDa Properties Form Liquid Purification Affinity purified. Buffer PBS (pH 7.3), 0.02% Sodium azide and 50% Glycerol. Preservative 0.02% Sodium azide Stabilizer 50% Glycerol Storage instruction For continuous use, store undiluted antibody at 2-8°C for up to a week. For long-term storage, aliquot and store at -20°C. Storage in frost free freezers is not recommended. Avoid repeated freeze/thaw cycles. Suggest spin the vial prior to opening. The antibody solution should be gently mixed before use. Note For laboratory research only, not for drug, diagnostic or other use. www.arigobio.com 1/2 Bioinformation Gene Symbol ARL4A Gene Full Name ADP-ribosylation factor-like 4A Background ADP-ribosylation factor-like 4A is a member of the ADP-ribosylation factor family of GTP-binding proteins. ARL4A is similar to ARL4C and ARL4D and each has a nuclear localization signal and an unusually high guaninine nucleotide exchange rate.
    [Show full text]
  • The Study of Two Transmembrane Autophagy Proteins and the Autophagy Receptor, P62
    The study of two transmembrane autophagy proteins and the autophagy receptor, p62 Gautam M. Runwal St. John’s College, Cambridge September 2018 This dissertation is submitted for the degree of Doctor of Philosophy Title: The study of two transmembrane autophagy proteins and the autophagy receptor, p62 Submitted by : Gautam M Runwal Abstract Autophagy is an evolutionarily conserved process across eukaryotes that is responsible for degradation of cargo such as aggregate-prone proteins, pathogens, damaged organelles, macromolecules etc. via its delivery to lysosomes. The process is known to involve the formation of a double-membraned structure, called autophagosome, that engulfs the cargo destined for degradation and delivers its contents by fusing with lysosomes. This process involves several proteins at its core which include two transmembrane proteins, ATG9 and VMP1. While ATG9 and VMP1 has been discovered for about a decade and half, the trafficking and function of these proteins remain relatively unclear. My work in this thesis identifies and characterises a novel trafficking route for ATG9 and VMP1 and shows that both these proteins traffic via the dynamin-independent ARF6-associated pathway. Moreover, I also show that these proteins physically interact with each other. In addition, the tools developed during these studies helped me identify a new role for the most common autophagy receptor protein, p62. I show that p62 can specifically associate with and sequester LC3-I in autophagy- impaired cells (ATG9 and ATG16 null cells) leading to formation of LC3-positive structures that can be misinterpreted as mature autophagosomes. Perturbations in the levels of p62 were seen to affect the formation of these LC3-positive structures in cells.
    [Show full text]
  • Supplementary Information Integrative Analyses of Splicing in the Aging Brain: Role in Susceptibility to Alzheimer’S Disease
    Supplementary Information Integrative analyses of splicing in the aging brain: role in susceptibility to Alzheimer’s Disease Contents 1. Supplementary Notes 1.1. Religious Orders Study and Memory and Aging Project 1.2. Mount Sinai Brain Bank Alzheimer’s Disease 1.3. CommonMind Consortium 1.4. Data Availability 2. Supplementary Tables 3. Supplementary Figures Note: Supplementary Tables are provided as separate Excel files. 1. Supplementary Notes 1.1. Religious Orders Study and Memory and Aging Project Gene expression data1. Gene expression data were generated using RNA- sequencing from Dorsolateral Prefrontal Cortex (DLPFC) of 540 individuals, at an average sequence depth of 90M reads. Detailed description of data generation and processing was previously described2 (Mostafavi, Gaiteri et al., under review). Samples were submitted to the Broad Institute’s Genomics Platform for transcriptome analysis following the dUTP protocol with Poly(A) selection developed by Levin and colleagues3. All samples were chosen to pass two initial quality filters: RNA integrity (RIN) score >5 and quantity threshold of 5 ug (and were selected from a larger set of 724 samples). Sequencing was performed on the Illumina HiSeq with 101bp paired-end reads and achieved coverage of 150M reads of the first 12 samples. These 12 samples will serve as a deep coverage reference and included 2 males and 2 females of nonimpaired, mild cognitive impaired, and Alzheimer's cases. The remaining samples were sequenced with target coverage of 50M reads; the mean coverage for the samples passing QC is 95 million reads (median 90 million reads). The libraries were constructed and pooled according to the RIN scores such that similar RIN scores would be pooled together.
    [Show full text]
  • A Computational Approach for Defining a Signature of Β-Cell Golgi Stress in Diabetes Mellitus
    Page 1 of 781 Diabetes A Computational Approach for Defining a Signature of β-Cell Golgi Stress in Diabetes Mellitus Robert N. Bone1,6,7, Olufunmilola Oyebamiji2, Sayali Talware2, Sharmila Selvaraj2, Preethi Krishnan3,6, Farooq Syed1,6,7, Huanmei Wu2, Carmella Evans-Molina 1,3,4,5,6,7,8* Departments of 1Pediatrics, 3Medicine, 4Anatomy, Cell Biology & Physiology, 5Biochemistry & Molecular Biology, the 6Center for Diabetes & Metabolic Diseases, and the 7Herman B. Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202; 2Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, Indianapolis, IN, 46202; 8Roudebush VA Medical Center, Indianapolis, IN 46202. *Corresponding Author(s): Carmella Evans-Molina, MD, PhD ([email protected]) Indiana University School of Medicine, 635 Barnhill Drive, MS 2031A, Indianapolis, IN 46202, Telephone: (317) 274-4145, Fax (317) 274-4107 Running Title: Golgi Stress Response in Diabetes Word Count: 4358 Number of Figures: 6 Keywords: Golgi apparatus stress, Islets, β cell, Type 1 diabetes, Type 2 diabetes 1 Diabetes Publish Ahead of Print, published online August 20, 2020 Diabetes Page 2 of 781 ABSTRACT The Golgi apparatus (GA) is an important site of insulin processing and granule maturation, but whether GA organelle dysfunction and GA stress are present in the diabetic β-cell has not been tested. We utilized an informatics-based approach to develop a transcriptional signature of β-cell GA stress using existing RNA sequencing and microarray datasets generated using human islets from donors with diabetes and islets where type 1(T1D) and type 2 diabetes (T2D) had been modeled ex vivo. To narrow our results to GA-specific genes, we applied a filter set of 1,030 genes accepted as GA associated.
    [Show full text]
  • Cytotoxic Effects and Changes in Gene Expression Profile
    Toxicology in Vitro 34 (2016) 309–320 Contents lists available at ScienceDirect Toxicology in Vitro journal homepage: www.elsevier.com/locate/toxinvit Fusarium mycotoxin enniatin B: Cytotoxic effects and changes in gene expression profile Martina Jonsson a,⁎,MarikaJestoib, Minna Anthoni a, Annikki Welling a, Iida Loivamaa a, Ville Hallikainen c, Matti Kankainen d, Erik Lysøe e, Pertti Koivisto a, Kimmo Peltonen a,f a Chemistry and Toxicology Research Unit, Finnish Food Safety Authority (Evira), Mustialankatu 3, FI-00790 Helsinki, Finland b Product Safety Unit, Finnish Food Safety Authority (Evira), Mustialankatu 3, FI-00790 Helsinki, c The Finnish Forest Research Institute, Rovaniemi Unit, P.O. Box 16, FI-96301 Rovaniemi, Finland d Institute for Molecular Medicine Finland (FIMM), University of Helsinki, P.O. Box 20, FI-00014, Finland e Plant Health and Biotechnology, Norwegian Institute of Bioeconomy, Høyskoleveien 7, NO -1430 Ås, Norway f Finnish Safety and Chemicals Agency (Tukes), Opastinsilta 12 B, FI-00521 Helsinki, Finland article info abstract Article history: The mycotoxin enniatin B, a cyclic hexadepsipeptide produced by the plant pathogen Fusarium,isprevalentin Received 3 December 2015 grains and grain-based products in different geographical areas. Although enniatins have not been associated Received in revised form 5 April 2016 with toxic outbreaks, they have caused toxicity in vitro in several cell lines. In this study, the cytotoxic effects Accepted 28 April 2016 of enniatin B were assessed in relation to cellular energy metabolism, cell proliferation, and the induction of ap- Available online 6 May 2016 optosis in Balb 3T3 and HepG2 cells. The mechanism of toxicity was examined by means of whole genome ex- fi Keywords: pression pro ling of exposed rat primary hepatocytes.
    [Show full text]
  • 9. Atypical Dusps: 19 Phosphatases in Search of a Role
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Digital.CSIC Transworld Research Network 37/661 (2), Fort P.O. Trivandrum-695 023 Kerala, India Emerging Signaling Pathways in Tumor Biology, 2010: 185-208 ISBN: 978-81-7895-477-6 Editor: Pedro A. Lazo 9. Atypical DUSPs: 19 phosphatases in search of a role Yolanda Bayón and Andrés Alonso Instituto de Biología y Genética Molecular, CSIC-Universidad de Valladolid c/ Sanz y Forés s/n, 47003 Valladolid, Spain Abstract. Atypical Dual Specificity Phosphatases (A-DUSPs) are a group of 19 phosphatases poorly characterized. They are included among the Class I Cys-based PTPs and contain the active site motif HCXXGXXR conserved in the Class I PTPs. These enzymes present a phosphatase domain similar to MKPs, but lack any substrate targeting domain similar to the CH2 present in this group. Although most of these phosphatases have no more than 250 amino acids, their size ranges from the 150 residues of the smallest A-DUSP, VHZ/DUSP23, to the 1158 residues of the putative PTP DUSP27. The substrates of this family include MAPK, but, in general terms, it does not look that MAPK are the general substrates for the whole group. In fact, other substrates have been described for some of these phosphatases, like the 5’CAP structure of mRNA, glycogen, or STATs and still the substrates of many A-DUSPs have not been identified. In addition to the PTP domain, most of these enzymes present no additional recognizable domains in their sequence, with the exception of CBM-20 in laforin, GTase in HCE1 and a Zn binding domain in DUSP12.
    [Show full text]
  • Genomic Approaches to Reproductive Disorders
    Genomic Approaches to Reproductive Disorders Aleksandar Rajkovic Dept Obstetrics Gynecology and Reproductive Sciences University of Pittsburgh Magee Womens Research Institute Pittsburgh, PA Preconceptional Care Scope • Half of Pregnancies are Unintended • Medical Conditions • Mental Conditions • Immunization History • Nutritional Issues • Family History/Genetic Risk • Occupational/Environmental Exposures • Tobacco/Drug Abuse • Social Issues Preconceptional genetic screening Ethnic: Sickle cell disease Tay–Sachs disease Pan-ethnic: cystic fibrosis fragile X syndrome Spinal muscular atrophy Mendelian Inheritance • 5593 phenotypes for which molecular basis known • 3452 genes with phenotype causing mutation • Over 15,000 mutations to date known Preconceptional Pan Ethnic Testing • Screens for known mutations in more than 100 genes, easy on genetic counsellors • The screen is pan-ethnic • Useful also for couples undergoing IVF and potentially PGD • 1:5 will be carriers of a Mendelian disorder. • $600 (529 Euros) for the couple Genetic Counselling • Objective of the test • Test Methodology • Type of sample required (parents, siblings) • Possible outcomes (abnormal results, result of unknown clinical significance) ClinVar Stars and their interpretation Number of golden stars No submitter provided an interpretation with assertion criteria (no assertion criteria provided), none or no interpretation was provided (no assertion provided) At least one submitter provided an interpretation with assertion criteria (criteria provided, single submitter)
    [Show full text]
  • Targeting PH Domain Proteins for Cancer Therapy
    The Texas Medical Center Library DigitalCommons@TMC The University of Texas MD Anderson Cancer Center UTHealth Graduate School of The University of Texas MD Anderson Cancer Biomedical Sciences Dissertations and Theses Center UTHealth Graduate School of (Open Access) Biomedical Sciences 12-2018 Targeting PH domain proteins for cancer therapy Zhi Tan Follow this and additional works at: https://digitalcommons.library.tmc.edu/utgsbs_dissertations Part of the Bioinformatics Commons, Medicinal Chemistry and Pharmaceutics Commons, Neoplasms Commons, and the Pharmacology Commons Recommended Citation Tan, Zhi, "Targeting PH domain proteins for cancer therapy" (2018). The University of Texas MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences Dissertations and Theses (Open Access). 910. https://digitalcommons.library.tmc.edu/utgsbs_dissertations/910 This Dissertation (PhD) is brought to you for free and open access by the The University of Texas MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences at DigitalCommons@TMC. It has been accepted for inclusion in The University of Texas MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences Dissertations and Theses (Open Access) by an authorized administrator of DigitalCommons@TMC. For more information, please contact [email protected]. TARGETING PH DOMAIN PROTEINS FOR CANCER THERAPY by Zhi Tan Approval page APPROVED: _____________________________________________ Advisory Professor, Shuxing Zhang, Ph.D. _____________________________________________
    [Show full text]
  • Gfh 2016 Tagungsband Final.Pdf
    27. Jahrestagung der Deutschen Gesellschaft für Humangenetik 27. Jahrestagung der Deutschen Gesellschaft für Humangenetik gemeinsam mit der Österreichischen Gesellschaft für Humangenetik und der Schweizerischen Gesellschaft für Medizinische Genetik 16.–18. 3. 2016, Lübeck Tagungsort Prof. Dr. med. Jörg Epplen, Bochum (Tagungspräsident 2017) Prof. Dr. med. Michael Speicher, Graz (Tagungspräsident 2015) Hotel Hanseatischer Hof Prof. Dr. med. Klaus Zerres, Aachen Wisbystraße 7–9 Prof. Dr. rer. nat. Wolfgang Berger, Zürich (Delegierter der SGMG) 23558 Lübeck Univ.-Prof. DDr. med. univ. Johannes Zschocke, Telefon: (0451) 3 00 20-0 (Innsbruck Delegierter der ÖGH) http://www.hanseatischer- hof.de/ [email protected] Fachgesellschaften Die Jahrestagung fi ndet im Hotel Hanseatischer Hof statt, das sich Deutsche Gesellschaft für Humangenetik (GfH) in in fussläufi ger Entfernung vom Lübecker Hauptbahnhof befi ndet. Vorsitzender: Prof. Dr. med. Klaus Zerres, Aachen Stellvertretende Vorsitzende: Prof. Dr. med. Gabriele Gillessen- Tagungspräsidentin Kaesbach, Lübeck Stellvertretende Vorsitzende: Prof. Dr. biol. hum. Hildegard Prof. Dr. med. Gabriele Gillessen- Kaesbach Kehrer- Sawatzki, Ulm Institut für Humangenetik Schatzmeister: Dr. rer. nat. Wolfram Kress, Würzburg Universität zu Lübeck/UKSH Schrift führerin: Dr. rer. nat. Simone Heidemann, Kiel Ratzeburger Allee 160, Haus 72 23538 Lübeck Österreichische Gesellschaft für Humangenetik (ÖGH) Telefon: 0049-451-500-2620 Univ. Prof. Dr. med. univ. Michael Speicher, Graz (Vorsitzender) Fax: 0049-451-500-4187 Univ. Doz. Dr. med. univ. Hans-Christoph Duba, Linz (Stellvertre- tende Vorsitzende) Tagungsorganisation Univ.-Prof. DDr. med. univ. Johannes Zschocke, Innsbruck (Stellvertretende Vorsitzende) Dr. Christine Scholz (Leitung) Dr. Gerald Webersinke, Linz (Schrift führer) Brigitte Fiedler (Teilnehmerregistrierung) Ass.-Prof. Priv. Doz. Dr. med. univ. Franco Laccone, Wien (Stellver- Deutsche Gesellschaft für Humangenetik e.
    [Show full text]