Modeling Biological Systems and Analyzing Large-Scale Data Sets
Ilya Shmulevich

TCGA Data Types
TCGA Research Network

Heterogeneous data

Clinical variables contributing to tumor aggressiveness

Variable                                   Less aggressive   More aggressive
Distant metastasis                         M0 = No           M1 = Yes
Tumor stage                                Early (I-II)      Late (III-IV)
Fraction of lymph nodes positive by H&E    0%                100%
Lymphatic invasion present                 No                Yes
Vascular invasion present                  No                Yes
Histological type                          Mucinous          Non-mucinous

Vesteinn Thorsson. Nature, 487, 330-337, 2012.

[Figure: FBXW7. Credit: Vesteinn Thorsson. Nature, 487, 330-337, 2012.]

[Figure. Credit: Vesteinn Thorsson, Dick Kreisberg. Nature, 487, 330-337, 2012.]

Web-based Apps
http://explorer.cancerregulome.org
The Regulome Explorer is an interactive web application that allows the user to explore multivariate relationships in data.
Richard Kreisberg, Jake Lin, Timo Erkkilä, Sheila Reynolds

RF-ACE
RF-ACE is a multivariate statistical inference method based on ensembles of decision trees, which seeks to uncover significant associations between features in the input data matrix.
RF-ACE has high predictive power and is resistant to overfitting.
http://code.google.com/p/rf-ace/
Timo Erkkilä, Sheila Reynolds, Kari Torkkola

Computational challenges:
• mixed data types: continuous, discrete, and categorical
• tens of thousands of features × tens or hundreds of samples
• non-linear, noisy, and multivariate relationships
• correlated features
• missing data

RF-ACE features:
• handles mixed variable types
• does not require imputation of missing values
• random subsampling rather than combinatorial search
• statistical testing removes redundant features
• p-value for each candidate predictor
• fast, portable implementation in C++
Timo Erkkilä

Google I/O keynote presentation, June 27, 2012
600,000 cores

A multilevel pan-cancer view: from genes to hallmarks
Theo Knijnenburg

Mutational investment

Billions of Associations!
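
The RF-ACE slides mention two properties that are easy to illustrate in isolation: missing values need no imputation, and each candidate predictor receives a p-value. The sketch below is not the RF-ACE implementation and does not use a random forest. It swaps in a single decision stump as the importance score and a plain permutation test against shuffled copies of the feature. All function names (sum_sq_dev, stump_score, contrast_p_value) and the toy data are assumptions made for illustration.

```python
import random


def sum_sq_dev(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def stump_score(x, y):
    """Fraction of the variance in y explained by the best single threshold on x.

    Samples where x is missing (None) are skipped rather than imputed.
    """
    pairs = sorted((xi, yi) for xi, yi in zip(x, y) if xi is not None)
    n = len(pairs)
    if n < 4:
        return 0.0
    ys = [yi for _, yi in pairs]
    total = sum_sq_dev(ys)
    if total == 0:
        return 0.0
    best = 0.0
    for i in range(2, n - 1):  # keep at least two samples on each side
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot place a threshold between equal x values
        within = sum_sq_dev(ys[:i]) + sum_sq_dev(ys[i:])
        best = max(best, (total - within) / total)
    return best


def contrast_p_value(x, y, n_contrasts=200, seed=0):
    """Empirical p-value: how often a shuffled copy of x scores as well as x."""
    rng = random.Random(seed)
    observed = stump_score(x, y)
    hits = 0
    for _ in range(n_contrasts):
        contrast = list(x)
        rng.shuffle(contrast)  # breaks any real link between x and y
        if stump_score(contrast, y) >= observed:
            hits += 1
    return (hits + 1) / (n_contrasts + 1)


# Hypothetical toy data: one informative feature with two missing values,
# and one pure-noise feature.
target = [0.0] * 10 + [1.0] * 10
informative = [float(i) for i in range(20)]
informative[3] = None
informative[15] = None
noise_rng = random.Random(1)
noise = [noise_rng.random() for _ in range(20)]

print("informative feature p-value:", contrast_p_value(informative, target))
print("noise feature p-value:", contrast_p_value(noise, target))
```

A forest-based method would replace stump_score with an importance aggregated over many randomized trees. The permutation step is only one common way to turn an importance score into a p-value, used here as a stand-in.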

Motivating questions
• Repurposing
  – Which existing cancer drugs may be therapeutic in which other cancers?
  – Which inhibitors with no current cancer indications may be therapeutic in certain cancers?
• Opportunity
  – TCGA primary tumor data may serve as the basis for guided investigation of these open questions

Guiding principle
• The direct protein target for most inhibitors is not the sensitizing aberrated protein itself
  – e.g., AKT1 inhibitors are most effective against cell lines with PTEN mutations
Song et al. (2012)

Proof of concept: Associations between drug targets (e.g., AKT1) and sensitizing aberrations (e.g., PTEN) also evident in TCGA
[Figure: PTEN mutations in UCEC. AKT1 protein expression related to PTEN mutation in UCEC; axis: PTEN mutation status.]

Proof of concept (continued)
[Figure: Association of drug target : sensitizing aberration pairs.]

Approach
• Create large heterogeneous graph of associations from TCGA data, literature, databases, …
  – [Billions of edges, Terabytes of data]
• Query on Cray YarcData uRiKA graph analytics appliance
  – No locality of reference, graphs hard to partition
  – [Minutes rather than hours per query]
• Identify aberrated gene → target → drug relationships for drugs with and without known efficacy in cancer
[Figure: Genomic aberration (TP53 mutation); synthetic lethal protein targets (ATR, CHEK1, PAK3, PLK1, …, SGK2, …, WEE1); candidate compounds.]

Integrating multiple data sources into a (big) graph
[Figure: Genomic aberrations; Therapeutic targets; Candidate inhibitors; RNAi.]

Graph Data Model: Resource Description Framework (RDF)
[Figure: example RDF graph. Labels recoverable from the figure:
  Blank nodes: _:blankGeneNMD, _:blankDrugGeneNMD, _:blankPairwise
  Resources: <http://www.systemsbiology.net/brca2>, <http://www.systemsbiology.net/tp53>, <http://www.systemsbiology.net/tp53y_n_somatic>, <http://www.systemsbiology.net/biotin>, <http://www.systemsbiology.net/Drug>, <http://www.systemsbiology.net/gata3gexp>
  Predicates: <http://www.systemsbiology.net/nmd#combocount>, <http://www.systemsbiology.net/nmd#nmd>, <http://www.systemsbiology.net/nmd#term1>, <http://www.systemsbiology.net/nmd#term2>, <http://www.systemsbiology.net/feature#label>, <http://www.systemsbiology.net/feature#source>, <http://www.systemsbiology.net/association#feature1>, <http://www.systemsbiology.net/association#feature2>, <http://www.systemsbiology.net/association#dataset>, <http://www.systemsbiology.net/association#correlation>, <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
  Literals: 1, 0.2515, 1.1628, -0.511, gnab, gexp, brca_05nov]

Example SPARQL Query
[Figure: query diagram. Labels: Literature; Seed Gene List; TCGA Associated Genes; Small Molecules; Cancer Type; cancer.gov approved drugs; Database.]

Example Result: PTEN associations in UCEC
Genomic aberrations: PTEN

Candidate targets: ASRGL1, ESR1, GLYATL2, PLIN3, HADH, NT5E, PIK3R3, GABRE, PGR, FBP1, SMPD3, GRIN1, PIK3R1, RARG, AADAT, CACNA2D2, SST, SRD5A1, B4GALT1, ADRA1B, KCNJ12, RYR1, SLC6A14, RETSAT, FAAH, SRR, NQO1, CEACAM1, KCNK6, ACADS, CRAT, ELOVL4, FOLH1, ALDH1A3, SORD, ASS1, NADSYN1, PRNP, NDUFA11, KCNH2, CPS1, SLC22A5, HMGCR, ALDH18A1, PARS2, GLS, B4GALT4, ACACB, SLC38A3, GSR, OAZ3, TCN1, SLC1A1, SMPD4, BHMT2, HSD17B4, GRIK5, GLDC, PPIB, PIPOX, ADA, SCN3B, S100A1, PLG, SLC1A4, CBS, GLRB, ACVR1B, SLC6A2

Candidate inhibitors: Acepromazine, Acitretin, Adapalene, Adenine, Adenosine monophosphate, Adenosine triphosphate, Adinazolam, Alfuzosin, Alitretinoin, Allylestrenol, Alpha-Linolenic Acid, Alprazolam, Alteplase, Aminocaproic Acid, Amiodarone, Amitriptyline, Amoxapine, Amsacrine, Anistreplase, Aprotinin, Arcitumomab, Aripiprazole, Astemizole, Atomoxetine, Atorvastatin, Bepridil, Biotin, Bromazepam, Bromocriptine, Bupropion, Caffeine, Capromab, Carglumic acid, Carmustine, Carvedilol, Chlordiazepoxide, Chlorotrianisene, Chlorpheniramine, Chlorpromazine, Cinolazepam, Cisapride, Clobazam, Clomifene, Clomipramine, Clonazepam, Clorazepate, Clotiazepam, Clozapine, Cocaine, Conjugated Estrogens, Cysteamine, Danazol, Dantrolene, Dapiprazole, Debrisoquin, Desipramine, Desogestrel, Desvenlafaxine, Dexmethylphenidate, Dextroamphetamine, Diazepam, Dicumarol, Dienestrol, Diethylpropion, Diethylstilbestrol, Dipyridamole, Dofetilide, Doxazosin, Doxepin, Drospirenone, Droxidopa, Duloxetine, Dutasteride, Dydrogesterone, Ephedra, Ephedrine, Epinephrine, Ergotamine, Escitalopram, Estazolam, Estradiol, Estramustine, Estriol, Estrone, Estropipate, Ethinyl Estradiol, Ethynodiol Diacetate, Etonogestrel, Felodipine, Finasteride, Fludiazepam, Fluoxymesterone, Flurazepam, Fluticasone Propionate, Fluvastatin, Fulvestrant, Gabapentin, Galsulfase, Ginkgo biloba, Glutathione, Glycine, Guanadrel Sulfate, Guanethidine, Halazepam, Halofantrine, Hydroxocobalamin, Ibutilide, Idursulfase, Imipramine, Isoproterenol, Isradipine, Ketazolam, Labetalol, L-Alanine, L-Arginine, L-Asparagine, L-Aspartic Acid, L-Carnitine, L-Citrulline, L-Cysteine, Levonordefrin, Levonorgestrel, L-Glutamic Acid, L-Histidine, Lindane, Lisdexamfetamine, L-Methionine, Lorazepam, L-Ornithine, Lovastatin, L-Proline, L-Serine, Maprotiline, Mazindol, Medroxyprogesterone, Megestrol, Melatonin, Menadione, Meperidine, Mestranol, Methamphetamine, Methotrimeprazine, Methoxamine, Methylphenidate, Mianserin, Miconazole, Midazolam, Midodrine, Mifepristone, Milnacipran, Modafinil, N-Acetyl-D-glucosamine, NADH, Naloxone, Nefazodone, Nicardipine, Nitrazepam, Nitrendipine, Norelgestromin, Norepinephrine, Norethindrone, Norgestimate, Nortriptyline, Olanzapine, Olopatadine, Orphenadrine, Oxazepam, Paliperidone, Paroxetine, Pentostatin, Pentoxifylline, Pergolide, Phendimetrazine, Phenmetrazine, Phentermine, Phenylephrine, Phosphatidylserine, Pimozide, Pravastatin, Prazepam, Prazosin, Progesterone, Promazine, Propafenone, Propericiazine, Propiomazine, Protriptyline, Pseudoephedrine, Pyridoxine, Quazepam, Quetiapine, Quinestrol, Quinidine, Raloxifene, Reboxetine, Reteplase, Risperidone, Rosuvastatin, S-Adenosylmethionine, Sertindole, Sibutramine, Simvastatin, Sotalol, Streptokinase, Suramin, Tamoxifen, Tamsulosin, Tazarotene, Temazepam, Tenecteplase, Terazosin, Terfenadine, Tetracycline, Thiopental, Thioproperazine, Thioridazine, Toremifene, Tramadol, Tranexamic Acid, Tretinoin, Triazolam, Trilostane, Trimipramine, Urokinase, Venlafaxine, Verapamil, Vitamin A, Xylometazoline, Ziprasidone, Zonisamide

Example Result: PTEN associations in UCEC
Genomic aberrations: PTEN
Candidate targets: PIK3R1/PIK3CA
Candidate inhibitors: Wortmannin
[Figure: PTEN mutation status; PDB id 3hhm.]

Repurposing existing cancer drugs in other cancers
[Figure: columns Genomic aberrations, Candidate targets, Candidate inhibitors. Labels: Existing cancer indication; Target; Cancer Drug A; New cancer indication.]

Example Result
• TP53 is frequently mutated in most tumor types
• ABCG2, also known as Breast Cancer Resistance Protein (BCRP), is associated with TP53 mutation in TCGA breast cancer data
• Nelfinavir, an HIV protease inhibitor, also binds ABCG2 and many other proteins
• High-throughput cell line screening of breast cancer cells recently identified Nelfinavir as a selective inhibitor. "It can be brought to HER2-breast cancer treatment trials with the same dosage regimen as that used among HIV patients." [Shim et al. JNCI 2012]

Understanding behavior of massive multicellular systems: BioCellion
[Figures. Source: http://www.sjrcd.org/soilhealth/soilagg.html. Source: http://www.theregister.co.uk. Source: EMBO Rep. 2004 May; 5(5): 470–476.]
Ductal Carcinoma model: Nicholas Flann, Utah State Univ. Source: http://www.webmd.com

Acknowledgments
Brady Bernard, Ryan Bressler, Andrea Eakin, Timo Erkkilä, Lisa Iype, Seunghwa Kang, Theo Knijnenburg, Roger Kramer, Richard Kreisberg, Kalle Leinonen, Jake Lin, Yuexin Liu, Michael Miller, Sheila Reynolds, Hector Rovira, Vesteinn Thorsson, Da Yang, Wei Zhang.
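
The aberrated gene → target → drug lookup described on the Approach slide can be sketched as a two-pattern join over subject-predicate-object triples, which is the shape of query that SPARQL expresses over RDF. The talk ran SPARQL on a uRiKA appliance over billions of edges; this sketch swaps in a naive in-memory matcher over five toy triples. The predicate names ("associated_with", "binds"), the node labels, and the function names are hypothetical; the edges only restate the PTEN/Wortmannin and TP53/ABCG2/Nelfinavir examples from the slides.

```python
# Toy triples (subject, predicate, object). Predicate names are hypothetical;
# the edges restate examples shown in the slides, not a real data export.
TRIPLES = [
    ("PTEN_mutation", "associated_with", "PIK3CA"),
    ("PTEN_mutation", "associated_with", "PIK3R1"),
    ("TP53_mutation", "associated_with", "ABCG2"),
    ("Wortmannin", "binds", "PIK3CA"),
    ("Nelfinavir", "binds", "ABCG2"),
]


def extend(binding, pattern, triple):
    """Return binding extended so that pattern matches triple, else None.

    Pattern terms starting with "?" are variables; all others are constants.
    """
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None  # variable already bound to something else
        elif term != value:
            return None  # constant does not match
    return new


def query(patterns, triples):
    """Naive graph-pattern match: join the patterns one at a time."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = extend(binding, pattern, triple)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


def candidate_drugs(aberration, triples=TRIPLES):
    """Aberration -> associated target -> drug binding that target."""
    patterns = [
        (aberration, "associated_with", "?target"),
        ("?drug", "binds", "?target"),
    ]
    return sorted((b["?target"], b["?drug"]) for b in query(patterns, triples))


for aberration in ("PTEN_mutation", "TP53_mutation"):
    print(aberration, candidate_drugs(aberration))
```

The nested loops scan every triple for every partial binding, which is fine for a toy list but is exactly the access pattern that becomes expensive on a large graph with no locality of reference, as the Approach slide notes.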