CBA HoliRD REPORT: Hirschsprung Disease

Marina Esteban Medina, María Peña-Chilet, Carlos Loucera, Joaquín Dopazo
Clinical Bioinformatics Area - FPS
Sevilla, January 13, 2020

Collaborators: Dra. Salud Borrego's group - U702 CIBERER Research group at IBIS - Instituto de Biomedicina de Sevilla.

Objectives and methodology:

The Holistic Rare Disease project (HoliRD) aims to build Disease Maps for as many rare diseases as possible and to model them in order to systematize research in drug repurposing. To this end, several databases (ORPHANET, OMIM, HPO, PubMed, KEGG, STRING) and the literature are used to collect the up-to-date knowledge on the diseases under study and to define a Disease Map that contains the functional relationships among the known disease genes, as well as the functional consequences of their activity.

Then, a mechanistic model that accounts for the activity of this map is used. The HiPathia algorithm, which has been shown to successfully predict cell activities related to cancer hallmarks (Hidalgo et al., Oncotarget 2017; 8:5160-5178; Hidalgo et al., Biol Direct. 2018;13:16) as well as the effect of protein inhibitions on cell survival (Cubuk et al., Cancer Res. 2018; 78:6059-6072), is used to simulate the activity of the disease map.

Finally, machine learning algorithms are used to find other proteins, already targets of drugs with another indication, that display a potential causal effect on the activity of the previously defined disease map. The drugs that target these proteins are potential candidates for repurposing.

Figure: Schematic representation of the method used.

Examples of the use of this approach can be found in Esteban-Medina et al., BMC Bioinformatics. 2019, 20(1):370.
Report

This report describes the results of the different steps of the HoliRD approach applied to Hirschsprung Disease.

Identification of genes highly related to the rare disease (RD) under study in Orphanet/OMIM

A total of 10 genes annotated as Hirschsprung Disease (HD) were found in the ORPHANET/OMIM database.

HD highly related genes

Disease ID    Entrez ID   Gene symbol    Disease ID     Entrez ID   Gene symbol
ORPHA:2151    8929        PHOX2B         ORPHA:388      4902        NRTN
ORPHA:388     10512       SEMA3C         ORPHA:388      5979        RET
ORPHA:388     1889        ECE1           ORPHA:388      2668        GDNF
ORPHA:388     1908        EDN3           ORPHA:388      223117      SEMA3D
ORPHA:388     1910        EDNRB          OMIM:609136    6663        SOX10

Identification of highly related HPO to the RD under study

A total of 9 HPO codes associated to HD with specificity >=7 were selected.

HD highly related HPOs

HPO ID        HPO term                            Specificity level
HP:0000407    Sensorineural hearing impairment    10
HP:0001181    Adducted thumb                      14
HP:0001249    Intellectual disability             7
HP:0001531    Failure to thrive in infancy        7
HP:0002027    Abdominal pain                      8
HP:0002251    Aganglionic megacolon               14
HP:0005214    Intestinal obstruction              11
HP:0100031    Neoplasm of the thyroid gland       9
HP:0200008    Intestinal polyposis                12

Identification of genes that share a minimum number of RD-HPO codes

Genes with >= 4 HD-HPO codes

Gene symbol   Entrez    Gene symbol   Entrez    Gene symbol   Entrez
APC           324       KRAS          3845      SDHC          6391
ATRX          546       LIMK1         3984      SDHD          6392
BMPR1A        657       MITF          4286      SOX10         6663
CTNNB1        1499      TRNL1         4567      TGFBR2        7048
ECE1          1889      TRNS1         4574      CLIP2         7461
EDN3          1908      NRTN          4902      BAZ1B         9031
EDNRB         1910      PIK3CA        5290      GTF2IRD1      9569
ELN           2006      PTEN          5728      SEMA3C        10512
GDNF          2668      RET           5979      SETBP1        26040
GNAS          2778      RFC2          5982      TBL2          26608
GTF2I         2969      SDHA          6389      BCOR          54880
KIT           3815      SDHB          6390      SEMA4A        64218
                                                SEMA3D        223117

Genes with >= 6 HD-HPO codes

Gene symbol   Entrez    Gene symbol   Entrez
ECE1          1889      NRTN          4902
EDN3          1908      RET           5979
EDNRB         1910      SEMA3C        10512
GDNF          2668      SEMA3D        223117
KRAS          3845

Genes with >= 8 HD-HPO codes

Gene symbol   Entrez    Gene symbol   Entrez
ECE1          1889      NRTN          4902
EDN3          1908      RET           5979
EDNRB         1910      SEMA3C        10512
GDNF          2668      SEMA3D        223117

In order to maintain the specificity and not over-expand the Disease Map of action, only genes with >=6 HD-HPO codes were selected.

Location of the selected disease related genes in KEGG pathways to define the Disease Map of action

After locating the RD associated genes within KEGG pathways, a total of 182 circuits belonging to 36 KEGG pathways were found as part of the disease map.

KEGG pathway                                                KEGG-pathway code
MAPK signaling pathway                                      hsa04010
ErbB signaling pathway                                      hsa04012
Ras signaling pathway                                       hsa04014
Rap1 signaling pathway                                      hsa04015
Calcium signaling pathway                                   hsa04020
cGMP-PKG signaling pathway                                  hsa04022
Chemokine signaling pathway                                 hsa04062
FoxO signaling pathway                                      hsa04068
Sphingolipid signaling pathway                              hsa04071
Phospholipase D signaling pathway                           hsa04072
mTOR signaling pathway                                      hsa04150
PI3K-Akt signaling pathway                                  hsa04151
Apoptosis                                                   hsa04210
Longevity regulating pathway - mammal                       hsa04211
Axon guidance                                               hsa04360
VEGF signaling pathway                                      hsa04370
Tight junction                                              hsa04530
Gap junction                                                hsa04540
Signaling pathways regulating pluripotency of stem cells    hsa04550
Natural killer cell mediated cytotoxicity                   hsa04650
T cell receptor signaling pathway                           hsa04660
B cell receptor signaling pathway                           hsa04662
Fc epsilon RI signaling pathway                             hsa04664
Neurotrophin signaling pathway                              hsa04722
Cholinergic synapse                                         hsa04725
Serotonergic synapse                                        hsa04726
Regulation of actin cytoskeleton                            hsa04810
Insulin signaling pathway                                   hsa04910
GnRH signaling pathway                                      hsa04912
Progesterone-mediated oocyte maturation                     hsa04914
Estrogen signaling pathway                                  hsa04915
Melanogenesis                                               hsa04916
Prolactin signaling pathway                                 hsa04917
Thyroid hormone signaling pathway                           hsa04919
Oxytocin signaling pathway                                  hsa04921
Aldosterone-regulated sodium reabsorption                   hsa04960

HiPathia is a signal propagation algorithm that considers pathways as collections of circuits, defined as sub-pathways or sequences of proteins connecting signal receptor proteins to effector proteins.
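To make the notion of a circuit more concrete, the following minimal Python sketch propagates a signal from a receptor to an effector through a toy circuit. It is an illustration only, not the HiPathia implementation: the node names, the expression values and the function `propagate` are hypothetical, and the update rule (node activity multiplied by the combined activating input and by one minus each inhibiting input) is a simplified form of the rule described in the HiPathia publications cited above, so its exact shape should be treated as an assumption.

```python
from math import prod


def propagate(nodes, edges, expression, input_signal=1.0):
    """Propagate a signal through one circuit (illustrative sketch).

    nodes      -- node names in topological order (receptor first, effector last)
    edges      -- (source, target, kind) tuples, kind is "activation" or "inhibition"
    expression -- normalized expression in [0, 1], a proxy for protein activity
    """
    signal = {}
    for node in nodes:
        activating = [signal[s] for s, t, k in edges if t == node and k == "activation"]
        inhibiting = [signal[s] for s, t, k in edges if t == node and k == "inhibition"]
        # Assumption: a node without activating inputs receives the full input signal.
        if activating:
            activation = 1.0 - prod(1.0 - s for s in activating)
        else:
            activation = input_signal
        # Each inhibiting input scales the signal down; an empty product is 1.
        inhibition = prod(1.0 - s for s in inhibiting)
        signal[node] = expression[node] * activation * inhibition
    return signal


# Hypothetical toy circuit: receptor -> kinase -> effector, with inhibitor -| effector.
nodes = ["receptor", "kinase", "inhibitor", "effector"]
edges = [
    ("receptor", "kinase", "activation"),
    ("kinase", "effector", "activation"),
    ("inhibitor", "effector", "inhibition"),
]
expression = {"receptor": 0.8, "kinase": 0.5, "inhibitor": 0.5, "effector": 1.0}

signals = propagate(nodes, edges, expression)
print({name: round(value, 3) for name, value in signals.items()})
```

In this toy case the signal reaching the effector is about 0.2: the receptor passes 0.8 to the kinase, the kinase (activity 0.5) passes 0.4 on, and the inhibitor halves what arrives at the effector. The effector value plays the role of the circuit activity.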
HiPathia uses gene expression values as proxies for the level of activation of the corresponding protein in the circuit. Taking into account the inferred protein activity and the interactions between the proteins (activation or inhibition) defined in the pathway, the level of activity of a circuit is estimated using a signal propagation algorithm. Ultimately, effector proteins are annotated with a cellular function.

To enable a better visualization of the RD Map, the HiPathia viewer has been used. The circuits that define the RD Map are marked in RED (please ignore the color legend). The pathways that contain these circuits are highlighted in the right window with a red arrow. The only purpose of this report is to represent the components (genes and interactions) and functions of the circuits that compose the RD Map.

Click to access the RD Map Report

HiPathia uses KEGG pathways for the graphical representation of the circuits. The original pathways can also be visualized in the KEGG repository https://www.genome.jp/kegg/pathway.html

Select prefix: hsa (Organism)
Enter keywords: e.g. FoxO signaling pathway (any HiPathia pathway)

Prediction of relevance of gene targets from approved drugs extracted from DRUGBANK database (release 5.1.4)

The HoliRD approach takes the mechanistic model of the disease map as the proxy for the molecular basis of the disease outcome. Then, a Multi-Output Random Forest (MORF) regressor, a machine learning algorithm that predicts the circuit activities across the whole disease map, is trained on GTEx gene expression data to find proteins (which are targets of drugs with indications for other diseases) that correctly predict the behavior of the disease map. The drugs targeting the best predictor proteins are candidates for drug repurposing. The relevance score accounts for the accuracy of the prediction contributed by each individual protein.
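As a rough illustration of this step, the sketch below fits a multi-output random forest with scikit-learn on synthetic data and ranks the input genes by their contribution. Everything in it is an assumption made for illustration: the data are random stand-ins rather than GTEx expression or HiPathia circuit activities, the sizes are arbitrary, and the impurity-based `feature_importances_` is used as a relevance score only because it is readily available; the report does not state how its relevance score is computed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): rows are samples, X columns are expression
# values of candidate drug-target genes, Y columns are circuit activities.
n_samples, n_targets = 200, 5
X = rng.normal(size=(n_samples, n_targets))
Y = np.column_stack([
    2.0 * X[:, 0] + 0.1 * rng.normal(size=n_samples),
    -1.5 * X[:, 0] + 0.1 * rng.normal(size=n_samples),
    X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=n_samples),
])

# RandomForestRegressor handles several outputs at once when Y is 2-D.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, Y)

# One non-negative score per candidate gene; the scores sum to 1.
relevance = model.feature_importances_
ranking = np.argsort(relevance)[::-1]  # most relevant gene first

for idx in ranking:
    print(f"target_{idx}: {relevance[idx]:.4f}")
```

Because the three synthetic circuit activities are driven mostly by the first gene, that gene comes out at the top of the ranking. Importances of this kind are non-negative, so they say how much a gene contributes to the prediction but not whether its effect is activating or inhibiting.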
Relevance scores are absolute values and do not account for the direction of the prediction, that is, whether the interaction is an activation or an inhibition.

From a total of 683 targets for approved drugs (AT) in the DRUGBANK database (release 5.1.4), the machine learning algorithm selected the 62 most relevant ones (top AT).

Entrez   Gene symbol   Relevance score    Entrez   Gene symbol   Relevance score
4130     MAP1A         0.1554035565       657      BMPR1A        0.003619209
3561     IL2RG         0.0853983045       3570     IL6R          0.0036092627
2554     GABRA1        0.0591264789       6616     SNAP25        0.0035877724
3767     KCNJ11        0.0372669743       7068     THRB          0.0035538714
1441     CSF3R         0.0301809157       941      CD80          0.0035463019
6093     ROCK1         0.0257452691       1043     CD52          0.003537257
5916     RARG          0.0256825012       7134     TNNC1         0.0034864037
4214     MAP3K1        0.0197904239       4134     MAP4          0.0034379318
57468    SLC12A5       0.0197133066       3563     IL3RA         0.0033721244
9475     ROCK2         0.0151823485       6323     SCN1A         0.0031798254
6261     RYR1          0.0142311972       3688     ITGB1         0.0031312124
645      BLVRB         0.0108550062       3351     HTR1B         0.0031262512
3683     ITGAL         0.0099676683       847      CAT           0.0029671267
2904     GRIN2B        0.0093780568       3849     KRT2          0.0029401554
3725     JUN           0.0081707732       3716     JAK1          0.0029390546
3039     HBA1          0.0073066217       782      CACNB1        0.0027859541
2902     GRIN1         0.0071713071       1583     CYP11A1       0.0027795856
64127    NOD2          0.0052587777       1080     CFTR          0.0027726055
302      ANXA2         0.0052463705       2247     FGF2          0.0025992222
774      CACNA1B       0.0051372524       5698     PSMB9         0.0025055776
55800    SCN3B         0.0050310471       4128     MAOA          0.0024790078
2444     FRK           0.00495984         338442   HCAR2         0.0024740118
7097     TLR2          0.0047835314       6715     SRD5A1        0.002470577
2559     GABRA6        0.0047791421       3737     KCNA2         0.0024400973
695      BTK           0.0046609928       1361     CPB2          0.0024025715
5423     POLB          0.0045240031       2690     GHR           0.0023936546
301      ANXA1         0.0044558809       7048     TGFBR2        0.0023879617
6752     SSTR2         0.0043553951       5468     PPARG         0.0023317997
5139     PDE3A         0.0042249933       5743     PTGS2         0.002313308
786      CACNG1        0.0039811131       7422     VEGFA         0.0023012003
3768     KCNJ12        0.003918914        1956     EGFR          0.0022066809

Figure: Relevance plot depicting the 62 most relevant gene targets (top AT).

Drugs from DRUGBANK database (release 5.1.4) that target top AT
The list of drugs that target the 62 most relevant genes follows. Click the hyperlink of a Drug ID to see more detailed information about the drug in the DrugBank database.