
Big data, artificial intelligence, and HPC in health and biomedicine

Alfonso Valencia, Ph.D.
ICREA Professor
Director, Life Sciences Dept., BSC
Director, Spanish Institute INB-ISCIII, ELIXIR-ES

Plan TL InfoDay, 21 October 19

• DATA

Weather data

Biological data: genomics, transcriptomics, epigenomics, proteomics, metabolomics, lipidomics, …

Ten billion cell phones

ELIXIR: European infrastructure for biological information

Spanish National Bioinformatics Institute (INB) Spanish Node of ELIXIR (ELIXIR-ES)

TransBioNet network of Bioinformaticians in Research Institutes of Spanish Hospitals (INB hosted)

• DATA • ARTIFICIAL INTELLIGENCE

10/12/2019

Advances in AI and HPC go hand in hand

Since GPUs were first used in AI (2012), the computing power available to generate AI models has increased exponentially, and these improvements in computing power have been key to AI progress.

[Figure: petaflops/day used to train neural networks]

BSC works on Medical Imaging

● Detecting retina pathologies
  ○ Trained models competitive with ophthalmologists
  ○ With Lenovo & Hospital Vall Hebron
  [Image labels: Glaucoma, Epiretinal Membrane, Nevus, Macular Degeneration]
● Learning from liver conditions
  ○ Learning about rare diseases
  ○ With Hospital Clinic
● Predicting and guiding in-vitro success
  ○ Finding the best embryo ASAP
  ○ With Hospital Clinic
● Supporting medical doctors on Rx review
  ○ Aid for doctors in rural areas
  ○ With Asepeyo and ICS

By Ulises Cortes and Dario Garcia, UPF & BSC

Predicting structure from the sequence is one of the fundamental problems in .

It is the key to predicting the consequences of mutations in human diseases and to drug design.

• DATA • ARTIFICIAL INTELLIGENCE • HPC

The of the Research Paradigm

Numerical Simulation and Big Data Analysis
• Reduce expense
• Avoid suffering
• Help to build knowledge where experiments are impossible or not affordable

Digital Twin for Future Medicine

Is this scenario possible? When?

Simulations of biological systems at different levels

Slide from M. Ponce de L, BSC

By Mariano Vazquez, CASE - BSC

~48 h simulation time, 30 min wall time, ~2500 cells
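As a rough reading of the figures quoted above, and assuming that "48 h simulation time" means the time span covered by the model while "30 min wall time" is the elapsed real time of the run (the slide does not spell this out), the ratio between the two can be worked out as follows. The variable names are illustrative only.

```python
from datetime import timedelta

# Approximate figures quoted on the slide.
simulated_time = timedelta(hours=48)  # assumed: time span covered by the model
wall_time = timedelta(minutes=30)     # assumed: elapsed real time of the run

# Dividing two timedeltas gives a plain float (a dimensionless ratio).
speed_ratio = simulated_time / wall_time

print(f"Simulated time advances about {speed_ratio:.0f}x faster than wall-clock time")
```

Under these assumptions the simulation runs roughly two orders of magnitude faster than the process it models, for a population of about 2500 cells.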

Drug target

Model’s readouts

By Victor Guallar, ICREA & BSC

Large Scale Simulations

AI / ML systems

• Exploration of the parameter space
• Monitoring and tailoring simulations during execution time
• Analysing the results of the simulations

Event Recognition System

  


From Jordi Torres &

Cognitive computing for melanoma research

[Diagram: BSC provides specialized information (16M abstracts corpus, text mining, bio-dictionaries); IBM Watson provides general information (Text Analytics Catalog, Healthcare Knowledge). Both feed WATSON cognitive computing.]

[Diagram: corpus/entities extraction, Content Analytics, network of bio-entities associations, inference of relationships graph, functional interpretation, translational applications]

[Figure: network of associated diseases, procedures, and genes, including lung cancer, pancreatic cancer, colon cancer, vitiligo, psoriasis, uveitis, TP53, BRCA1, BRCA2, PTEN, TGFB1, NOTCH1, IL6, IL10]

In collaboration with IBM

Word embeddings

Word vectors (|V| rows); Word2Vec, fastText; 100M tokens

v1   0.78   0.65   0.98
v2   0.23   0.12   0.32
v3   0.90   0.32   0.56
v4   0.08   0.43   0.65
v5   0.77   0.88   0.77

[Figure: the vectors v1 to v4 plotted as points in the embedding space]

Intrinsic evaluation: computing the similarity between terms (synonyms in SNOMED)

85M tokens

Extrinsic evaluation: checking their usefulness in other NLP tasks (neuroNER)
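The intrinsic evaluation above rests on measuring how similar two term vectors are. The slides do not say which measure was used; the minimal sketch below assumes cosine similarity, a common choice for word embeddings, and applies it to the five example vectors from the word-vector table. The function names `cosine_similarity` and `most_similar` are hypothetical helpers written for this illustration, not part of Word2Vec or fastText.

```python
import math

# Example vectors copied from the word-vector table on the slide.
vectors = {
    "v1": [0.78, 0.65, 0.98],
    "v2": [0.23, 0.12, 0.32],
    "v3": [0.90, 0.32, 0.56],
    "v4": [0.08, 0.43, 0.65],
    "v5": [0.77, 0.88, 0.77],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(term, table):
    """Rank every other term by cosine similarity to `term`, highest first."""
    scored = [
        (other, cosine_similarity(table[term], vec))
        for other, vec in table.items()
        if other != term
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for other, score in most_similar("v1", vectors):
        print(f"v1 vs {other}: {score:.3f}")
```

In an evaluation of the kind described, pairs of terms known to be synonyms (for example from SNOMED) would be expected to score higher than unrelated pairs; the three-dimensional vectors here only illustrate the computation.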

[Figure: word embeddings for example terms: MAPK3, H3K9 methylation, cerebellar granule cells]

individualized Paediatric Cure (iPC)
Cloud-based virtual-patient models for precision paediatric oncology

Gender and other biases …

Explainable Artificial Intelligence