ENVIRONMENT-WIDE ASSOCIATIONS TO DISEASE AND DISEASE-RELATED PHENOTYPES

A DISSERTATION SUBMITTED TO THE PROGRAM IN BIOMEDICAL INFORMATICS AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

Chirag Jagdish Patel
August 2011

© 2011 by Chirag Jagdish Patel. All Rights Reserved. Re-distributed by Stanford University under license with the author.

This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License. http://creativecommons.org/licenses/by-nc/3.0/us/

This dissertation is online at: http://purl.stanford.edu/mg775gw7130

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Atul Butte, Primary Adviser

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Jayanta Bhattacharya

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Mark Cullen

Approved for the Stanford University Committee on Graduate Studies.
Patricia J. Gumport, Vice Provost Graduate Education

This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.

ABSTRACT

Common diseases arise out of a combination of both genetic and environmental influences. Advances in genomic technology have enabled investigators to create hypotheses regarding the contribution of genetic factors at a breathtaking pace. However, the assessment of multiple and specific environmental factors, and their interactions with the genome, has not kept pace.
We lack high-throughput analytic methodologies to comprehensively and systematically associate multiple physical and specific environmental factors, or the “envirome”, to disease and human health. We claim that the creation of hypotheses regarding the environmental contribution to disease is practicable through high-throughput analytic methods that have been well established in genomics. In the following dissertation, we develop and apply methods to systematically and comprehensively associate specific factors of the envirome with disease states, prioritizing factors for in-depth future study.

The current disciplines of studying the environmental determinants of health include toxicology and epidemiology, which operate on molecular and population scales, respectively. This dissertation proposes approaches in both of these disciplines. For example, we have developed a framework to conduct the first “Environment-wide Association Study” (EWAS), systematically associating environmental factors to disease on a population scale. We have applied this framework to investigate type 2 diabetes and heart disease on cohorts that are representative of the United States population, finding novel and robust associations in diverse and independent cohorts. Given the lack of explained risk resulting from current-day genome-wide studies, the time is ripe to usher in a more comprehensive study of the environment, or “enviromics”, toward better understanding of multifactorial diseases and their prevention.

ACKNOWLEDGEMENTS

Foremost, I thank my advisor, Dr. Atul Butte, for his undying confidence, inspiration, and guidance. Even just three years ago, it was far from my belief that the scientist whom I admired from afar would eventually take me on as a student and teach me how to compute, see, and enlighten. For Dr. Atul Butte’s supervision I am forever indebted and most fortunate.

I am also indebted to my dissertation committee, Drs.
Jay Bhattacharya, Mark Cullen, John Ioannidis, and Robert Tibshirani. Much of this work has come out of discussions with these individuals and it is inspired by and stands on their fundamental teachings. I thank my academic advisors, Drs. Mark Musen and Betty Cheng, for encouraging me to keep taking courses that enabled this work.

I thank my many friends and colleagues in the Butte Laboratory and in the Biomedical Informatics program whom I continue to look up to and draw inspiration from. I feel honored and privileged to be among you. In particular, I thank Dr. Rong Chen, Alex Morgan, Joel Dudley, and Nick Tatonetti for providing support and encouragement when it was least expected but most needed.

From teaching me how to read and write to gifting me the newest computers, I thank my parents, Neela and Jagdish Patel. I will always be grateful to them for initiating this most rewarding journey of lifelong learning. I thank my brother, Ankur Patel, for his unflagging support and faith through thick and thin. I thank my in-laws, Tapan and Kokila Chaudhuri, for their support and encouragement. I do not have the words to thank my partner in life, Trina Chaudhuri. I hope that I can some day enable her to achieve her aspirations as she has done for me.

I am grateful to the National Library of Medicine and Applied Biosystems, Inc. for financial support. I thank the Centers for Disease Control and Prevention (CDC), the National Center for Health Statistics (NCHS), and the staff and individuals who take part in the National Health and Nutrition Examination Survey (NHANES). In particular, I thank Vijay Gambhir and Peter Meyer of the CDC/NCHS for their support in accessing and processing NHANES restricted genetic data. I am grateful again to Dr. Atul Butte for providing funds to access the NHANES restricted data.

I thank the staff of the Biomedical Informatics Training program and the Butte Laboratory, Mary Jeanne Oliva, Susan Aptekar, Alex Skrenchuk, Dr.
Russ Altman, and Dr. Larry Fagan. Without the support of these institutions and people, this work would not have been possible.

A portion of the work in this dissertation derives from two published articles and two articles currently in review for publication:

Chapter 2:
1. Patel, C.J. and A.J. Butte, Predicting environmental chemical factors associated with disease-related gene expression data. BMC Med Genomics, 2010. 3(1): p. 17.

Chapter 4:
2. Patel, C.J., J. Bhattacharya, and A.J. Butte, An Environment-Wide Association Study (EWAS) on type 2 diabetes mellitus. PLoS ONE, 2010. 5(5): p. e10746.
3. Patel, C.J., M.R. Cullen, J.P.A. Ioannidis, A.J. Butte, Non-genetic associations and correlation globes for determinants of lipid levels: an environment-wide association study. Submitted, 7/2011.

Chapter 5:
4. Patel, C.J., R. Chen, J.P.A. Ioannidis, A.J. Butte, Systematic identification of interaction effects between validated genome- and environment-wide associations on Type 2 Diabetes Mellitus. Submitted, 8/2011.

In the Chapter 2 work, I devised the methodology and wrote the manuscript with my advisor, Atul Butte.

In the Chapter 4 work, I devised the “Environment-wide-Association Study” (EWAS) framework and carried out the analyses. For the EWAS on Type 2 Diabetes, I wrote the manuscripts with Jay Bhattacharya and Atul Butte. For the EWAS on serum lipid levels, I wrote and edited the manuscripts with Mark Cullen, John Ioannidis, and Atul Butte.

Finally, in the Chapter 5 work, I devised the “Gene-Environment-Wide Association Study” (G-EWAS) framework and implemented the software to carry out the analyses. Rong Chen and Atul Butte provided the database of curated genetic information. I interpreted the data and wrote the manuscript with Rong Chen, John Ioannidis, and Atul Butte.

TABLE OF CONTENTS

CHAPTER 1: INTRODUCING MULTI-DIMENSIONAL AND DATA-DRIVEN APPROACHES TO CREATE HYPOTHESES REGARDING ENVIRONMENTAL ASSOCIATIONS TO DISEASE
  What is the “Environment”? What is the “Envirome”?
  Creation of robust hypotheses connecting the environment, genome, and multifactorial disease
  Creating hypotheses comprehensively on a population scale
  Creating hypotheses comprehensively on a molecular or toxicological scale
  Discussion

CHAPTER 2. MAPPING MULTIPLE TOXICOLOGICAL RESPONSES TO COMPLEX DISEASE
  INTRODUCTION
  METHOD TO PREDICT ENVIRONMENTAL ASSOCIATION TO GENE EXPRESSION RESPONSE
  RESULTS
    Verification Phase
    Predicting Environmental Chemicals Associated with Cancer Data Sets
    Clustering Significant Predictions by PubChem-derived Biological Activity
  DISCUSSION

CHAPTER 3. METHODS TO EXECUTE ENVIRONMENT-WIDE ASSOCIATIONS ON DISEASE AND DISEASE-RELATED PHENOTYPES ON POPULATIONS.