The Curing AI for Precision Medicine
Hoifung Poon

Medicine Today Is Imprecise
  Top 20 drugs: 80% non-responders
  Wasted: 1/3 of health spending, $750 billion / year

Disruption 1: Big Data
  2009 to 2013: 40% to 93%

Disruption 2: Pay-for-Performance
  Goal: 75% by 2020

Vemurafenib on BRAF-V600 Melanoma
  [Images: Before Treatment; 15 Weeks; 23 Weeks]

Why Haven't We Solved Precision Medicine?
  … ATTCGGATATTTAAGGC …
  … ATTCGGGTATTTAAGCC …
  High-Throughput Data
  Discovery
  Bottleneck #1: Knowledge
  Bottleneck #2: Reasoning
  AI is the key to overcoming these bottlenecks

Use Case: Molecular Tumor Board
  Source: www.ucsf.edu/news/2014/11/120451/bridging-gap-precision-medicine
  Problem: Hard to scale
  U.S. 2015: 1.6 million new cases, 600K deaths
  902 cancer hospitals
  Memorial Sloan Kettering 2016:
    Sequence: Tens of thousands
    Board can review: A few hundred
  Wanted: Decision support for cancer precision medicine

First-Generation Molecular Tumor Board
  Knowledge bottleneck
  E.g., given a tumor sequence, determine:
    What genes and mutations are important
    What drugs might be applicable
  Can do manually but hard to scale

Next-Generation Molecular Tumor Board
  Reasoning bottleneck
  E.g., personalize drug combinations
  Can't do manually, ever

[Diagram labels: Big Medical Data; Machine Reading; Predict Drug Combo; Decision Support; Precision Medicine]

PubMed
  26 million abstracts
  Two new abstracts every minute
  Adds over one million every year

Machine Reading
  PMID: 123  … VDR+ binds to SMAD3 to form …
  PMID: 456  … JUN expression is induced by SMAD3/4 …
  ……
  Knowledge Base

Machine Reading
  Involvement of p70(S6)-kinase activation in IL-10 up-regulation in human monocytes by gp41 envelope protein of human immunodeficiency virus type 1 ...
  Entities in the sentence above: gp41 (GENE), p70(S6)-kinase (GENE), IL-10 (GENE), human monocyte (CELL)
  Event triggers in the sentence above: Involvement, up-regulation, activation (each REGULATION), linked by Theme, Cause, and Site arguments

Long Tail of Variations
  TP53 inhibits BCL2.
  Tumor suppressor P53 down-regulates the activity of BCL-2 proteins.
  BCL2 transcription is suppressed by P53 expression.
  The inhibition of B-cell CLL/Lymphoma 2 expression by TP53 …
  ……
  negative regulation:
    532 inhibited, 252 inhibition, 218 inhibit, 207 blocked, 175 inhibits,
    157 decreased, 156 reduced, 112 suppressed, 108 decrease, 86 inhibitor,
    81 Inhibition, 68 inhibitors, 67 abolished, 66 suppress, 65 block,
    63 prevented, 48 suppression, 47 blocks, 44 inhibiting, 42 loss,
    39 impaired, 38 reduction, 32 down-regulated, 29 abrogated, 27 prevents,
    27 attenuated, 26 repression, 26 decreases, 26 down-regulation,
    25 diminished, 25 downregulated, 25 suppresses, 22 interfere, 21 absence,
    21 repress ……

Machine Reading
  Prior work
    Focused on Newswire / Web
    Popular entities and facts
    Redundancy: Simple methods often suffice
  High-value verticals
    Healthcare, finance, law, etc.
    Little redundancy: Rare entities and facts
    Novel challenges require sophisticated NLP

Precision Medicine: Machine Reading
  Challenges                  Advances
  Annotation Bottleneck       Distant Supervision
  Complex Knowledge           Grounded Semantic Parsing
  Reasoning                   Neural Embedding
  Beyond Sentence Boundary    Distant Supervision with Discourse Modeling

Free Lunch: Existing KB
  NCI Pathway KB
    Regulation   Theme   Cause
    Positive     A2M     FOXO1
    Positive     ABCB1   TP53
    Negative     BCL2    TP53
    …            …       …
  Distant Supervision:
    TP53 inhibits BCL2.
    Tumor suppressor P53 down-regulates the activity of BCL-2 proteins.
    BCL2 transcription is suppressed by P53 expression.
    The inhibition of B-cell CLL/Lymphoma 2 expression by TP53 …
    ……

Genetic Pathways
  PubMed-scale extraction
    15,000 genes, 1.5 million unique regulations
    Compare w. NCI: 10,000 unique regulations
  Applications
    UCSC Genome Browser, MSR Interactions Track
    U. Wisconsin breast cancer study
    Etc.
  Poon, Toutanova, Quirk, "Distant Supervision for Cancer Pathway Extraction from Text", PSB-15.
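
The distant-supervision idea on the slides above can be sketched in a few lines: any sentence that mentions both genes of a known KB entry is treated as a (noisy) training example for that entry's regulation type. This is a minimal illustration, not the pipeline from the cited paper. The `SYNONYMS` table, the function names, and the regex-based gene matcher are assumptions made for the example; a real system would use a proper gene-name normalizer.

```python
import re

# Toy KB in the (Theme, Cause) -> Regulation layout of the NCI Pathway KB slide.
KB = {
    ("A2M", "FOXO1"): "Positive",
    ("ABCB1", "TP53"): "Positive",
    ("BCL2", "TP53"): "Negative",
}

# Hypothetical alias table (an assumption for this sketch).
SYNONYMS = {
    "TP53": "TP53",
    "P53": "TP53",
    "BCL2": "BCL2",
    "BCL-2": "BCL2",
    "B-cell CLL/Lymphoma 2": "BCL2",
}


def find_genes(sentence):
    """Return the canonical gene names whose aliases appear in the sentence."""
    found = set()
    for alias, canonical in SYNONYMS.items():
        # Require non-alphanumeric boundaries so "P53" does not match inside "TP53".
        pattern = r"(?<![A-Za-z0-9])" + re.escape(alias) + r"(?![A-Za-z0-9])"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            found.add(canonical)
    return found


def distant_labels(sentences):
    """Label each sentence with every KB entry whose two genes both occur in it."""
    examples = []
    for sentence in sentences:
        genes = find_genes(sentence)
        for (theme, cause), regulation in KB.items():
            if theme in genes and cause in genes:
                examples.append((sentence, theme, cause, regulation))
    return examples


SENTENCES = [
    "TP53 inhibits BCL2.",
    "Tumor suppressor P53 down-regulates the activity of BCL-2 proteins.",
    "BCL2 transcription is suppressed by P53 expression.",
    "The inhibition of B-cell CLL/Lymphoma 2 expression by TP53 is well known.",
    "VDR binds to SMAD3.",  # no KB pair mentioned, so no example is produced
]

if __name__ == "__main__":
    for sentence, theme, cause, regulation in distant_labels(SENTENCES):
        print(f"{regulation:8s} theme={theme} cause={cause} :: {sentence}")
```

Co-occurrence alone does not guarantee that a sentence actually expresses the KB relation, so labels produced this way are noisy; the classifier trained on them has to tolerate that noise.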

Complex Knowledge
  Outperforms 19 out of 24 supervised participants in GENIA Shared Task
  Parikh, Poon, Toutanova, "Grounded Semantic Parsing for Complex Knowledge Extraction", NAACL-15.

Cross-Sentence, N-ary Relations
  The deletion mutation on exon-19 of EGFR gene was present in 16 patients, while the L858E point mutation on exon-21 was noted in 10. All patients were treated with gefitinib and showed a partial response.

Drug-Gene Interactions
  Distant supervision w. discourse modeling
  Orders of magnitude increase: 162 to 79,952
  No need for annotated examples
  Quirk & Poon, "Distant Supervision for Relation Extraction beyond the Sentence Boundary", arxiv.org/abs/1609.04873.

Reasoning: Neural Embedding
  Embed gene network
    Entity / Relation: (v1, v2, …, vk)
    Relation triple <subj, rel, obj>: Score
  Distant supervision: Known relations score higher
  Increased recall by 20 points on held-out
  Toutanova et al., "Representing Text for Joint Embedding of Text and Knowledge Bases", EMNLP-15.
  Toutanova et al., "Compositional Learning of Embeddings for Relation Paths in Knowledge Bases and Text", ACL-16.

[Diagram labels: Big Medical Data; Predict Drug Combo; Decision Support; Precision Medicine]

Drug Combination
  Problem: What combos to try?
  Cancer drug: 250+ approved, 1200+ developing
  Pairwise: 719,400; three-way: 287,280,400
  Wanted: Prioritize drug combos in silico
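
The pairwise and three-way figures on the Drug Combination slide are consistent with counting unordered combinations drawn from 1,200 drugs. The sketch below reproduces them with binomial coefficients; the function name `combo_counts` is an assumption made for illustration, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def combo_counts(n_drugs, max_size=3):
    """Number of unordered drug combinations of each size from 2 up to max_size."""
    return {k: comb(n_drugs, k) for k in range(2, max_size + 1)}


if __name__ == "__main__":
    counts = combo_counts(1200)
    print(f"pairwise:  {counts[2]:,}")   # 719,400
    print(f"three-way: {counts[3]:,}")   # 287,280,400
```

The count grows roughly by a factor of n/k with each extra drug in the combination, which is why exhaustive lab screening stops being feasible almost immediately and in silico prioritization is needed.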

Drug Combination
  [Figure axes: Drug 1, Drug 2]

BeatAML
  Targeted drugs: 149
  Pairs: 11,026
  Tested: 102; Unknown: 10,924

Machine Learning
  Evaluation: Cell kill + Synergy
  Learning: Ranking loss
  Features
    Panomics: Gene expression, …
    Pharmacology: Drug targets
    Network knowledge: TP53 inhibits BCL2, …

Interpretable Model
  Composite metric                        AA metric
  Feature                        Weight   Feature                        Weight
  BCL2 and MAPK3 (moderate)      0.0442   BCL2 or MAPK3 (moderate)       0.041
  MAP2K1 or MAPK10 (moderate)    0.0402   AKT2 and BCL2 (high)           0.033
  BCL2 or MAPK3 (moderate)       0.0325   MAP2K1 or MAPK10 (moderate)    0.033
  CSNK1E or PLK4 (high)          0.0311   BCL2 and MAPK3 (moderate)      0.031
  MAP2K7 and MAPK7 (moderate)    0.0301   MAP2K5 and MAPK4 (high)        0.031
  AKT3 and MAP2K1 (high)         0.0293   BCL2 and MAPK1 (high)          0.030
  NEK2 or PLK1 (moderate)        0.0286   MAPK4 and SRC (high)           0.029
  PSMB1 or PSMB2 (moderate)      0.0267   MAP2K2 or MAPK10 (moderate)    0.028
  MAPK9 and STK11 (high)         0.0263   AKT3 and MAP2K1 (high)         0.027
  MAPK1 and MAPK13 (moderate)    0.0263   BCL2 and MAPK3 (high)          0.027
  …                              …        …                              …
  BIRC5 or PLK4 (moderate)      -0.0321   CSNK1D or PLK4 (moderate)     -0.026
  MAP2K2 or MAPK14 (high)       -0.0324   MAP2K1 or MAPK13 (moderate)   -0.026
  AKT3 and MAPK8 (moderate)     -0.0336   STK10 and STK33 (high)        -0.027
  STK10 and STK33 (high)        -0.0337   AKT3 and MAPK8 (moderate)     -0.028
  BCL2 or MAPK8 (moderate)      -0.0343   MAPK3 and SRC (high)          -0.028
  EGFR and MAPK3 (moderate)     -0.036    BIRC5 or PLK4 (moderate)      -0.029
  MAPK10 and MAPK3 (moderate)   -0.0381   MAP2K1 and MAPK10 (moderate)  -0.031
  MAP2K1 and MAPK10 (moderate)  -0.0395   PLK1 and TAOK1 (moderate)     -0.032
  BCL2 or MAPK1 (high)          -0.0442   MAP2K2 or MAPK14 (high)       -0.034
  BCL2 or MAPK8 (high)          -0.0507   AURKB or PLK4 (moderate)      -0.034

  Hanover: BCL2 + MEK
  Impending trial on Venetoclax / Trametinib