Temporal Phenome Analysis of a Large Electronic Health Record Cohort

Total Page:16

File Type:pdf, Size:1020Kb

Temporal Phenome Analysis of a Large Electronic Health Record Cohort Downloaded from http://jamia.bmj.com/ on November 29, 2014 - Published by group.bmj.com Research and applications Temporal phenome analysis of a large electronic health record cohort enables identification of hospital-acquired complications Jeremy L Warner,1,2 Amin Zollanvari,3,4 Quan Ding,3 Peijin Zhang,5 Graham M Snyder,6 Gil Alterovitz3,7,8 ▸ Additional material is ABSTRACT data, visual analytics has become of increasing published online only. To view Objective To develop methods for visual analysis of prominence.56 please visit the journal online (http://dx.doi.org/10.1136/ temporal phenotype data available through electronic Earlier work has demonstrated that a clinical amiajnl-2013-001861). health records (EHR). phenome, for example, as defined by the prevalence fi fi Materials and methods 24 580 adults from the of International Classi cation of Disease, Ninth For numbered af liations see fi end of article. multiparameter intelligent monitoring in intensive care Revision, Clinical Modi cation (ICD-9-CM) codes V.6 (MIMIC II) EHR database of critically ill patients were or aggregations of such codes, can be used to calcu- Correspondence to analyzed, with significant temporal associations late a phenome-wide association of disease codes – Dr Jeremy L Warner, visualized as a map of associations between hospital with single-nucleotide polymorphisms (SNP).7 9 Department of Medicine, Division of Hematology and length of stay (LOS) and ICD-9-CM codes. An expanded However, most work in this area has focused on Oncology, Vanderbilt phenotype, using ICD-9-CM, microbiology, and dichotomous variables, such as SNP,which are amen- University, 2220 Pierce Ave, computerized physician order entry data, was defined for able to display by conventional probability plotting Preston Research Building 777, hospital-acquired Clostridium difficile (HA-CDI). LOS, (eg, Manhattan and Q-Q plots). In previous work, Nashville, TN 37232, USA; estimated costs, 30-day post-discharge mortality, and we demonstrated that complex phenomic informa- [email protected] antecedent medication provider order entry were tion can be visualized as a function of a continuous Received 31 March 2013 evaluated for HA-CDI cases compared to randomly laboratory value, such as the white blood cell Revised 15 July 2013 selected controls. count.10 As disease conditions are expected to Accepted 16 July 2013 Results Temporal phenome analysis revealed 191 change over time, capturing the continuous temporal Published Online First fi 1 August 2013 signi cant codes (p value, adjusted for false discovery evolution of a phenome could lead to new and valu- rate, ≤0.05). HA-CDI was identified in 414 cases, and able discoveries. For example, hospital-acquired com- was associated with longer median LOS, 20 versus plications are well known to prolong the length of 9 days, and adjusted HR 0.33 (95% CI 0.28 to 0.39). hospitalizations, but may do so to varying degrees. This prolongation carries an estimated annual Information about the temporal associations of spe- incremental cost increase of US$1.2–2.0 billion in the cific hospital-acquired conditions, optimized for USA alone. data visualization, may be evaluable through compre- Discussion Comprehensive EHR data have made large- hensive EHR resources. Previous work on temporal scale phenome-based analysis feasible. Time-dependent phenotype extraction from EHR has focused primar- pathological disease states have dynamic phenomic ily on individual patient trajectories or specific evolution, which may be captured through visual case scenarios (eg, LifeLines,11 LifeLines2,12 analytical approaches. Although MIMIC II is a single VISITORS);13 we propose to examine a large cohort institutional retrospective database, our approach should for undiscovered temporal associations. be portable to other EHR data sources, including prospective ‘learning healthcare systems’. For example, Objective interventions to prevent HA-CDI could be dynamically In this paper, we describe a new method for: visual evaluated using the same techniques. analysis of phenomic associations as a function of Conclusions The new visual analytical method time; and hypothesis generation based on the described in this paper led directly to the identification resultant temporal phenome map. These two steps of numerous hospital-acquired conditions, which could comprise a substantial portion of the ‘learning be further explored through an expanded phenotype healthcare system,’ as shown in figure 1. For a clin- definition. ical use case, we demonstrate how the application of these visual analytical methods can lead to rec- ognition of specific hospital-acquired complica- tions, as a function of the length of inpatient BACKGROUND AND SIGNIFICANCE hospitalization, in a diverse EHR database of critic- Clinical phenomics, the measurement of the diver- ally ill patients. sity of disease states across human subjects, is of increasing importance in the analysis of large clin- MATERIALS AND METHODS ical datasets.1 Electronic health records (EHR) now Data source make it possible to evaluate disease prevalence, dis- Multiparameter intelligent monitoring in intensive fi To cite: Warner JL, tribution, and correlation across a prede ned com- care V.6 (MIMIC II), an EHR-based database of crit- Zollanvari A, Ding Q, et al. prehensive collection of phenotypes (a ‘phenome’) ically ill patients admitted to Beth Israel Deaconess – J Am Med Inform Assoc to a degree that was previously not achievable.2 4 Medical Center between 2001 and 2007, was the – 2013;20:e281 e287. Due to the complexity of EHR and related medical primary data source.14 MIMIC II contains detailed Warner JL, et al. J Am Med Inform Assoc 2013;20:e281–e287. doi:10.1136/amiajnl-2013-001861 e281 Downloaded from http://jamia.bmj.com/ on November 29, 2014 - Published by group.bmj.com Research and applications patients in each subgroup, we explored how varying the health- care exposure duration intervals affected the results. A two-dimensional ‘temporal phenome map’ was rendered by plotting time as a function of ICD-9-CM codes, displaying line segments for significant adjusted p values, with segment length corresponding to the time interval containing the subgroup, and width proportionate to the negative logarithm of the adjusted p value. In the baseline analysis, an additional temporal phenome map with all adjusted p values less than 0.99 was rendered. Visual analysis of the temporal phenome map For the purposes of this pilot project, we limited the scope of extended evaluation to those ICD-9-CM codes in chapter one: Figure 1 Example schema of a ‘learning healthcare system’. This Infectious and parasitic disease. These codes were systematically example demonstrates the ideal flow of a learning healthcare system examined, and those that appeared to be consistent with environment, which begins with data analysis and visualization. Based hospital-acquired complications were selected for further study, on interpretation of these data, potential problems are recognized and with a final selection of one condition for further evaluation: hypotheses are generated. These lead to the development of hospital-acquired Clostridium difficile infection (HA-CDI). interventions to mitigate or improve the identified problems, which are fi then implemented and evaluated in an iterative fashion. The rst two fi steps of this process, represented by solid lines, are described within Expanded case de nitions the paper. We then developed an expanded phenotypic definition for HA-CDI, using medication and microbiology information avail- able in MIMIC II. Cases of HA-CDI were defined by one or information about vital sign parameters, laboratory data, and pro- more of the following occurring at least 48 h after initial vider order entry (POE), including time stamps of ordering, start, contact: (1) a positive assay for C difficile toxin; (2) POE for and stop times for medications. Our analysis was restricted to oral or rectal vancomycin; (3) POE for oral or intravenous adult patients (≥16 years old). All investigators completed appro- metronidazole and ICD-9-CM code 008.45: C difficile. For the priate human subjects training before accessing the data, which is second criterion, treatment of C difficile is the only common classified as institutional review board exempt. This study complies use for oral or rectal vancomycin so the ICD-9-CM code was with the guidelines of the Declaration of Helsinki. not required; conversely, oral or intravenous metronidazole is used to treat other conditions, so the ICD-9-CM code was required for the third criterion. Non-HA-CDI was defined using Duration of healthcare exposure definition the same criteria but with cutoffs before 48 h for case We d e fined healthcare exposure duration as the interval from definition. initial contact with the healthcare system to hospital discharge. In MIMIC II, hospital admissions and discharges are recorded Matching controls to cases as midnight-timed date stamps; intensive care unit (ICU) admis- For the set of HA-CDI cases, the outlying 1st percentile and sion and discharge, POE orders, and laboratory tests are 99th percentile of healthcare exposure duration were excluded recorded with time stamps. We used the first time stamp for a before selection of a matching control group. Candidate con- laboratory test or POE obtained within ±48 h of the admission trols were randomly selected from the remaining MIMIC II date stamp as a proxy for initial contact. For same-day hospital/ cohort, excluding cases of non-HA-CDI. Candidates were ICU discharges, we used the ICU discharge time as the last excluded if their healthcare exposure duration was less than the contact. Otherwise, we used the last time stamp of a laboratory 1st percentile or greater than the 99th percentile of the case test or POE stop order occurring within 24 h of the hospital dis- hospitalizations. If candidates had at least one laboratory value charge date stamp as a proxy for last contact. If there were no measurement during the first 48 h of hospitalization, they were stop orders or labs within 24 h of the discharge date stamp, the included as a control.
Recommended publications
  • Proteome-By-Phenome Mendelian Randomisation Detects 38 Proteins with Causal
    bioRxiv preprint doi: https://doi.org/10.1101/631747; this version posted May 10, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. 1 Proteome-by-phenome Mendelian Randomisation detects 38 proteins with causal 2 roles in human diseases and traits 3 ANDREW D. BRETHERICK1*, ORIOL CANELA-XANDRI1,2, PETER K. JOSHI3, DAVID W. CLARK3, 4 KONRAD RAWLIK2, THIBAUD S. BOUTIN1, YANNI ZENG1,4,5,6, CARMEN AMADOR1, PAU 5 NAVARRO1, IGOR RUDAN3, ALAN F. WRIGHT1, HARRY CAMPBELL3, VERONIQUE VITART1, 6 CAROLINE HAYWARD1, JAMES F. WILSON1,3, ALBERT TENESA1,2, CHRIS P. PONTING1, J. 7 KENNETH BAILLIE2, AND CHRIS HALEY1,2* 8 9 1 MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University 10 of Edinburgh, Western General Hospital, Crewe Road, EH4 2XU, Scotland, UK. 11 2 The Roslin Institute, University of Edinburgh, Easter Bush, EH25 9RG, Scotland, UK. 12 3 Centre for Global Health Research, Usher Institute of Population Health Sciences and 13 Informatics, University of Edinburgh, Teviot Place, Edinburgh, EH8 9AG, Scotland, UK. 14 4 Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-Sen University, 15 74 Zhongshan 2nd Road, Guangzhou 510080, China. 16 5 Guangdong Province Translational Forensic Medicine Engineering Technology 17 Research Center, Zhongshan School of Medicine, Sun Yat-Sen University, 74 Zhongshan 18 2nd Road, Guangzhou 510080, China. 19 6 Guangdong Province Key Laboratory of Brain Function and Disease, Zhongshan 20 School of Medicine, Sun Yat-Sen University, 74 Zhongshan 2nd Road, Guangzhou 21 510080, China.
    [Show full text]
  • A Phenome-Wide Association Study (Phewas)
    RESEARCH ARTICLE A phenome-wide association study (PheWAS) in the Population Architecture using Genomics and Epidemiology (PAGE) study reveals potential pleiotropy in African Americans 1 2,3 4 5 Sarah A. PendergrassID , Steven BuyskeID , Janina M. Jeff , Alex Frase , 5 5 6 7 8 a1111111111 Scott DudekID , Yuki Bradford , Jose-Luis AmbiteID , Christy L. Avery , Petra Buzkova , 6 9 10 7,11 a1111111111 Ewa Deelman , Megan D. Fesinmeyer , Christopher Haiman , Gerardo HeissID , Lucia 12 13 14 15 16 a1111111111 A. Hindorff , Chun-Nan HsuID , Rebecca D. Jackson , Yi Lin , Loic Le Marchand , 3 10 17 7,11 a1111111111 Tara C. MatiseID , Kristine R. Monroe , Larry Moreland , Kari E. North , Sungshim a1111111111 L. Park10, Alex Reiner18, Robert Wallace19, Lynne R. Wilkens16, Charles Kooperberg15, 5☯ 20,21☯ Marylyn D. Ritchie , Dana C. CrawfordID * 1 Genentech, Inc., South San Francisco, California, United States of America, 2 Department of Statistics, Rutgers University, Piscataway, New Jersey, United States of America, 3 Department of Genetics, Rutgers University, Piscataway, New Jersey, United States of America, 4 Illumina, Inc., San Diego, California, United OPEN ACCESS States of America, 5 Department of Genetics, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America, 6 Information Citation: Pendergrass SA, Buyske S, Jeff JM, Frase Sciences Institute; University of Southern California, Marina del Rey, California, United States of America, A, Dudek S, Bradford
    [Show full text]
  • Genetic Determinants of Skin Color, Aging, and Cancer Genetische Determinanten Van Huidskleur, Huidveroudering En Huidkanker
    Genetic Determinants of Skin Color, Aging, and Cancer Genetische determinanten van huidskleur, huidveroudering en huidkanker Leonie Cornelieke Jacobs Layout and printing: Optima Grafische Communicatie, Rotterdam, The Netherlands Cover design: Annette van Driel - Kluit © Leonie Jacobs, 2015 All rights reserved. No part of this thesis may be reproduced, stored in a retrieval system or transmitted in any form or by any means, without prior written permission of the author or, when appropriate, of the publishers of the publications. ISBN: 978-94-6169-708-0 Genetic Determinants of Skin Color, Aging, and Cancer Genetische determinanten van huidskleur, huidveroudering en huidkanker Proefschrift Ter verkrijging van de graad van doctor aan de Erasmus Universiteit Rotterdam op gezag van rector magnificus Prof. dr. H.A.P. Pols en volgens besluit van het College voor Promoties. De openbare verdediging zal plaatsvinden op vrijdag 11 september 2015 om 11:30 uur door Leonie Cornelieke Jacobs geboren te Rotterdam PROMOTIECOMMISSIE Promotoren: Prof. dr. T.E.C. Nijsten Prof. dr. M. Kayser Overige leden: Prof. dr. H.A.M. Neumann Prof. dr. A.G. Uitterlinden Prof. dr. C.M. van Duijn Copromotor: dr. F. Liu COntents Chapter 1 General introduction 7 PART I SKIn COLOR Chapter 2 Perceived skin colour seems a swift, valid and reliable measurement. 29 Br J Dermatol. 2015 May 4; [Epub ahead of print]. Chapter 3 Comprehensive candidate gene study highlights UGT1A and BNC2 37 as new genes determining continuous skin color variation in Europeans. Hum Genet. 2013 Feb; 132(2): 147-58. Chapter 4 Genetics of skin color variation in Europeans: genome-wide association 59 studies with functional follow-up.
    [Show full text]
  • A Review of Automatic Phenotyping Approaches Using Electronic Health Records
    electronics Article A Review of Automatic Phenotyping Approaches using Electronic Health Records Hadeel Alzoubi 1,*, Raid Alzubi 2, Naeem Ramzan 3, Daune West 3, Tawfik Al-Hadhrami 4,* and Mamoun Alazab 5 1 School of Computer and Information Technology, Jordan University of Science and Technology, Irbid 22110, Jordan 2 Department of Computer Science, Faculty of Information Technology, Middle East University, Amman 11831, Jordan; [email protected] 3 School of Engineering and Computing, University of the West of Scotland, Paisley PA1 2BE, UK; [email protected] (N.R.); [email protected] (D.W.) 4 School of Science and Technology, Nottingham Trent University, Nottingham NG11 8NS, UK 5 College of Engineering, IT and Environment, Charles Darwin University, Darwin 0815, Australia; [email protected] * Correspondences: [email protected] (H.A.); Tawfi[email protected] (T.A.-H.); Tel.: +44-115-848-4818 (T.A.-H.) Received: 27 September 2019; Accepted: 22 October 2019; Published: 29 October 2019 Abstract: Electronic Health Records (EHR) are a rich repository of valuable clinical information that exist in primary and secondary care databases. In order to utilize EHRs for medical observational research a range of algorithms for automatically identifying individuals with a specific phenotype have been developed. This review summarizes and offers a critical evaluation of the literature relating to studies conducted into the development of EHR phenotyping systems. This review describes phenotyping systems and techniques based on structured and unstructured EHR data. Articles published on PubMed and Google scholar between 2013 and 2017 have been reviewed, using search terms derived from Medical Subject Headings (MeSH).
    [Show full text]
  • The Genome and Phenome of the Green Alga Chloroidium Sp
    The genome and phenome of the green alga Chloroidium sp. UTEX 3007 reveal adaptive traits for desert acclimati- zation David R. Nelson1,2,*, Basel Khraiwesh1,2, Weiqi Fu1, Saleh Alseekh3, Ashish Jaiswal1, Amphun Chaiboonchoe1, Khaled M. Hazzouri2, Matthew J. O’Connor4, Glenn L. Butterfoss2, Nizar Drou2, Jil- lian D. Rowe2, Jamil Harb3,5, Alisdair R. Fernie3, Kristin C. Gunsalus2,6, Kourosh Salehi-Ashtiani1,2,* 1Laboratory of Algal, Systems, and Synthetic Biology, Division of Science and Math, New York Uni- versity Abu Dhabi, Abu Dhabi, UAE 2Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, UAE 3Max Planck Institute of Molecular Plant Physiology, D-14476 Potsdam-Golm, Germany 4Core Technology Platform, New York University Abu Dhabi, Abu Dhabi, UAE 5Department of Biology and Biochemistry, Birzeit University, Birzeit, West Bank, Palestine 6Center for Genomics & Systems Biology and Department of Biology, New York University, New York, NY, USA *Correspondence and requests for materials should be addressed to D.R.N. ([email protected]) or K.S.- A ([email protected]) Abstract To investigate the phenomic and genomic traits that allow green algae to survive in deserts, we characterized a ubiquitous species, Chloroidium sp. UTEX 3007, which we isolated from multiple locations in the United Arab Emirates (UAE). Metabolomic analyses of Chloroidium sp. UTEX 3007 indicated that the alga accumulates a broad range of carbon sources, including several desiccation tolerance-promoting sugars and unusually large stores of palmitate. Growth assays re- vealed capacities to grow in salinities from zero to 60 g/L and to grow heterotrophically on >40 distinct carbon sources.
    [Show full text]
  • Phenome-Wide Mendelian Randomization Mapping The
    5 6 Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex 7 diseases 8 9 Jie Zheng*§1, Valeriia Haberland*1, Denis Baird*1, Venexia Walker*1, Philip C. Haycock*1, Mark R. Hurle2, Alex Gutteridge3, Pau Erola1, Yi Liu1, 10 Shan Luo1,4, Jamie Robinson1, Tom G. Richardson1, James R. Staley1,5, Benjamin Elsworth1, Stephen Burgess5, Benjamin B. Sun5, John 11 Danesh5,6,7,8,9,10, Heiko Runz11, Joseph C. Maranville12, Hannah M. Martin13, James Yarmolinsky1, Charles Laurin1, Michael V. Holmes1,14,15,16, 12 Jimmy Z. Liu11, Karol Estrada11, Rita Santos17, Linda McCarthy3, Dawn Waterworth2, Matthew R. Nelson2, George Davey Smith*1,18, Adam S. 13 Butterworth*5,6,7,8,9, Gibran Hemani*1, Robert A. Scott*§3, and Tom R. Gaunt*§1,18 14 15 1MRC Integrative Epidemiology Unit (IEU), Bristol Medical School, University of Bristol, Bristol, UK. 16 2Human Genetics, GlaxoSmithKline, Collegeville, PA, USA. 17 3Human Genetics, GlaxoSmithKline, Stevenage, Hertfordshire, UK. 18 4School of Public Health, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong SAR, China. 19 5BHF Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK. 20 6BHF Centre of Research Excellence, School of Clinical Medicine, Addenbrooke’s Hospital, Cambridge, UK. 21 7NIHR Blood and Transplant Research Unit in Donor Health and Genomics, Department of Public Health and Primary Care, University of Cambridge, 22 Cambridge, UK. 23 8NIHR Cambridge Biomedical Research Centre, School of Clinical Medicine, Addenbrooke’s Hospital, Cambridge, UK. 24 9Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Hinxton, UK.
    [Show full text]
  • Functional Phenomics: an Emerging Field Integrating High-Throughput Phenotyping, 2 Physiology, and Bioinformatics
    bioRxiv preprint doi: https://doi.org/10.1101/370288; this version posted July 16, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 1 Functional phenomics: An emerging field integrating high-throughput phenotyping, 2 physiology, and bioinformatics 3 Larry M. York* 4 *Corresponding Author 5 Noble Research Institute. Ardmore, OK 73401 USA 6 580-224-6720 7 [email protected] 8 Submitted April 10, 2018 9 3 figures, 4 boxes 10 Word count: 3781 11 bioRxiv preprint doi: https://doi.org/10.1101/370288; this version posted July 16, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 12 Functional phenomics: Integrating high-throughput phenotyping, physiology, and 13 bioinformatics 14 Running title: Functional phenomics 15 Highlight: Functional phenomics is an emerging field in plant biology that relies on high- 16 throughput phenotyping, big data analytics, controlled manipulative experiments, and 17 simulation modelling to increase understanding of plant physiology. 18 Abstract 19 The emergence of functional phenomics signifies the rebirth of physiology as a 21st 20 century science through the use of advanced sensing technologies and big data analytics. 21 Functional phenomics highlights the importance of phenotyping beyond only identifying 22 genetic regions because a significant knowledge gap remains in understanding which plant 23 properties will influence ecosystem services beneficial to human welfare.
    [Show full text]
  • Epigenome and Phenome Study Reveals Circulating Markers Pertinent to Brain Health
    medRxiv preprint doi: https://doi.org/10.1101/2021.09.03.21263066; this version posted September 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Epigenome and phenome study reveals circulating markers pertinent to brain health Danni A Gadd1,, Robert F Hillary1, Daniel L McCartney1, Liu Shi2, Robert I McGeachan3, Aleks Stolicyn4, Stewart W Morris1, Rosie M Walker5, Archie Campbell1, Miruna C Barbu4, Mathew A Harris4, Ellen V Backhouse5, Joanna M Wardlaw5,6,7,8, J Douglas Steele9, Diego A Oyarzún10, Graciela Muniz-Terrera12, Craig Ritchie12, Alejo Nevado-Holgado2, Caroline Hayward1,13, Kathryn L Evans1, David J Porteous1, Simon R Cox14,15, Heather C Whalley4, Andrew M McIntosh4, Riccardo E Marioni1,† 1 Centre for Genomic and Experimental Medicine, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU 2 Department of Psychiatry, University of Oxford, UK, OX3 7JX 3 Centre for Discovery Brain Sciences, University of Edinburgh, 1 George Square, Edinburgh EH8 9JZ, UK 4 Division of Psychiatry, University of Edinburgh, Royal Edinburgh Hospital, Edinburgh, EH10 5HF, UK 5 Centre for Clinical Brain Sciences, Chancellor’s Building, 49 Little France Crescent, Edinburgh BioQuarter, Edinburgh, EH16 4SB 6 Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, UK 7 Edinburgh Imaging, University of
    [Show full text]
  • Graph Analytics for Phenome-Genome Associations Inference
    bioRxiv preprint doi: https://doi.org/10.1101/682229; this version posted June 26, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. Graph analytics for phenome-genome associations inference Davide Cirillo1,*, Dario Garcia-Gasulla1,*, Ulises Cortés1,3, Alfonso Valencia1,2 1 Barcelona Supercomputing Center (BSC), C/ Jordi Girona 29, 08034, Barcelona, Spain 2 ICREA, Pg. Lluís Companys 23, 08010, Barcelona, Spain. 3 Universitat Politècnica de Catalunya (UPC), C/ Jordi Girona 31, 08034, Barcelona, Spain * corresponding authors: [email protected]; [email protected] Abstract Motivation: Biological ontologies, such as the Human Phenotype Ontology (HPO) and the Gene Ontology (GO), are extensively used in biomedical research to find enrichment in the annotations of specific gene sets. However, the interpretation of the encoded information would greatly benefit from methods that effectively interoperate between multiple ontologies providing molecular details of disease-related features. Results: In this work, we present a statistical framework based on graph theory to infer direct associations between HPO and GO terms that do not share co-annotated genes. The method enables to map genotypic features to phenotypic features thus providing a valid tool for bridging functional and pathological annotations. We validated the results by (a) supporting evidence of known drug-target associations (PanDrugs), protein-protein physical and functional interactions (BioGRID and STRING), and common pathways (Reactome); (b) comparing relationships inferred from early ontology releases with knowledge contained in the latest versions. Applications: We applied our method to improve the interpretation of molecular processes involved in pathological conditions, illustrating the applicability of our predictions with a number of biological examples.
    [Show full text]
  • Phenome-Wide Association Studies in Diverse Populations
    Genomic Research in Action: Phenome-Wide Association Studies in Diverse Populations Sarah A. Pendergrass PhD, MS Assistant Professor Biomedical and Translational Informatics Geisinger Health System October 14, 2016 How Did I Get Here? MathUndergraduate and analytical thinking in Physics and programming, from Smith never College ending problem setsMasters in Biomedical Engineering from Thayer School of Working on disease based problems with computational solutions Engineering at Dartmouth PhD in Genetics from Dartmouth College Emerging world of bioinformatics and using these tools to study disease Postdoctoral work at Vanderbilt University Genetic Staff epidemiology Scientist atand Penn bioinformatics State – how can we leverage complex multifaceted data for discovery?? Investigator at Geisinger Health System PheWAS What are Phenome Wide Association Studies (PheWAS)? . Genome Wide Association Studies (GWAS) . Single Nucleotide Polymorphisms (SNPs) . One phenotype, or a limited phenotypic domain Phenome Wide Association Studies (PheWAS) . Comprehensively calculating the association between . Small to large number of SNPs . Diverse range of phenotypes from phenotypically rich datasets We can also use other data types too! . Other genetic variants: copy number variants, rare variants, mitochondrial variants . Quantitative lab variables instead of SNPs for biomarker identification PheWAS Phenome Wide Association Studies . Datasets with thousands of collected phenotypes . Unlikely that time would be spent to carefully investigate each phenotype for use . Hypothesis generation . Can we identify SNPs and/or phenotypes for further research ? Also exploring cross-phenotype associations . Exploring the relationship between associations for the same genetic variant and multiple phenotypes MegaPheWAS Performed a comprehensive genome-wide PheWAS . 632,574 genetic variants, MAF > 1% . 38,662 EA . 541 ICD-9 derived case-control diagnoses and a series of clinical lab measures Anurag Verma, Anastasia Lucas MegaPheWAS Performed a comprehensive genome-wide PheWAS .
    [Show full text]
  • Genome to Phenome: Improving Animal Health, Production, and Well-Being – a New USDA Blueprint for Animal Genome Research 2018–2027
    University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Roman L. Hruska U.S. Meat Animal Research U.S. Department of Agriculture: Agricultural Center Research Service, Lincoln, Nebraska 5-16-2019 Genome to Phenome: Improving Animal Health, Production, and Well-Being – A New USDA Blueprint for Animal Genome Research 2018–2027 Caird Rexroad USDA, Agricultural Research Service, [email protected] Jeffrey Vallet United States Department of Agriculture Lakshmi Kumar Matukumalli United States Department of Agriculture James Reecy Iowa State University Derek Bickhart United States Department of Agriculture FSeeollow next this page and for additional additional works authors at: https:/ /digitalcommons.unl.edu/hruskareports Rexroad, Caird; Vallet, Jeffrey; Matukumalli, Lakshmi Kumar; Reecy, James; Bickhart, Derek; Blackburn, Harvey; Boggess, Mark; Cheng, Hans; Clutter, Archie C.; Cockett, Noelle; Ernst, Catherine; Fulton, Janet E.; Liu, John; Lunney, Joan; Neibergs, Holly; Purcell, Catherine; Smith, Timothy P. L.; Sonstegard, Tad; Taylor, Jerry; Telugu, Bhanu; Eenennaam, Alison Van; Van Tassell, Curtis P.; and Wells, Kevin, "Genome to Phenome: Improving Animal Health, Production, and Well-Being – A New USDA Blueprint for Animal Genome Research 2018–2027" (2019). Roman L. Hruska U.S. Meat Animal Research Center. 454. https://digitalcommons.unl.edu/hruskareports/454 This Article is brought to you for free and open access by the U.S. Department of Agriculture: Agricultural Research Service, Lincoln, Nebraska at DigitalCommons@University of Nebraska - Lincoln. It has been accepted for inclusion in Roman L. Hruska U.S. Meat Animal Research Center by an authorized administrator of DigitalCommons@University of Nebraska - Lincoln. Authors Caird Rexroad, Jeffrey Vallet, Lakshmi Kumar Matukumalli, James Reecy, Derek Bickhart, Harvey Blackburn, Mark Boggess, Hans Cheng, Archie C.
    [Show full text]
  • Integrated Omics: Tools, Advances and Future Approaches
    62 1 Journal of Molecular B B Misra et al. Approaches and tools in 62:1 R21–R45 Endocrinology integrated omics REVIEW Integrated omics: tools, advances and future approaches Biswapriya B Misra1, Carl Langefeld1,2, Michael Olivier1 and Laura A Cox1,3 1Center for Precision Medicine, Section on Molecular Medicine, Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, North California, USA 2Department of Biostatistics, Wake Forest School of Medicine, Winston-Salem, North California, USA 3Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, Texas, USA Correspondence should be addressed to L A Cox: [email protected] Abstract With the rapid adoption of high-throughput omic approaches to analyze biological Key Words samples such as genomics, transcriptomics, proteomics and metabolomics, each f integrated analysis can generate tera- to peta-byte sized data files on a daily basis. These data file f omics sizes, together with differences in nomenclature among these data types, make the f genomics integration of these multi-dimensional omics data into biologically meaningful context f transcriptomics challenging. Variously named as integrated omics, multi-omics, poly-omics, trans-omics, f proteomics pan-omics or shortened to just ‘omics’, the challenges include differences in data f metabolomics cleaning, normalization, biomolecule identification, data dimensionality reduction, f network biological contextualization, statistical validation, data storage and handling, sharing and f statistics data archiving. The ultimate goal is toward the holistic realization of a ‘systems biology’ f Bayesian understanding of the biological question. Commonly used approaches are currently f machine learning limited by the 3 i’s – integration, interpretation and insights.
    [Show full text]