for Personalised Healthcare: Advances and Challenges

Danielle Belgrave Researcher Healthcare Machine Learning

@DaniCMBelg

My Research

[Figure: panels for Patient 1 to Patient 4; x-axis: Time (hours), 0 to 24]

Latent Variable Modelling

Longitudinal Data Analysis

Missing Data

[Figure: graph over variables X, Y and Z]

Patient-Centric Approach

Causality

Multidisciplinary

Top 8 Challenges: DS in Healthcare

1. Address some of the technical challenges facing the machine learning community in doing impactful healthcare research

2. Presentation of current solutions

3. Steps for future research

Challenge #1: Estimating Treatment Effects

[Figure: randomised controlled trial. A Patient Group is split into an Intervention arm and a Control arm; legend: Cured / Still Diseased]

Population is split into 2 groups by random allocation

Outcomes for both groups are measured

$p_1$ = proportion in the intervention group who are cured

$p_2$ = proportion in the control group who are cured

$H_0: p_1 - p_2 = 0$ vs $H_1: p_1 - p_2 \neq 0$

Mean: $p_1 - p_2$

Variance: $\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}$

Test Statistic: $Z = \frac{p_1 - p_2}{\sqrt{p(1-p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $p$ is the pooled proportion cured across both groups

Supervised Learning

[Figure: Intervention and Control groups; legend: Cured / Not cured]

Assume well-labelled groups

Machine recognises a new example

Classification, regression

Challenge #2: Heterogeneous Populations
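Looking back at the test statistic from Challenge #1, the following is a minimal Python sketch of the pooled two-proportion z-test. The function name `two_proportion_z_test` and the trial counts are hypothetical and chosen only for illustration; the pooled proportion is assumed to be the overall cured fraction across both arms.

```python
import math


def two_proportion_z_test(cured_1, n_1, cured_2, n_2):
    """Two-sided z-test of H0: p1 - p2 = 0 using the pooled proportion."""
    p1 = cured_1 / n_1                          # cured proportion, intervention arm
    p2 = cured_2 / n_2                          # cured proportion, control arm
    pooled = (cured_1 + cured_2) / (n_1 + n_2)  # pooled proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p1 - p2) / se
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical trial: 60 of 100 cured on the intervention, 45 of 100 on control.
z, p_value = two_proportion_z_test(cured_1=60, n_1=100, cured_2=45, n_2=100)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```

With these made-up counts the statistic is a little above 2, so the two-sided test would reject H0 at the 5% level. This only addresses the average treatment effect; it says nothing about which patients benefit.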

[Figure: a Patient Group with the same diagnosis receiving the same prescription]

Understanding Heterogeneity

[Figure: a Patient Group with the same diagnosis and the same prescription, split by response to the drug]

Drug NOT toxic and beneficial

Drug toxic but NOT beneficial

Drug toxic but beneficial

Drug NOT toxic and NOT beneficial

Supervised Learning Unsupervised Learning is not Enough

[Figure: a Patient Group with labelled outcomes (Not cured / Cured) next to an unlabelled Patient Group]

Supervised learning: Assume well-labelled groups. Machine recognises a new example. Classification, regression.

Unsupervised learning: Infer patterns/discover underlying data structure from a dataset without reference to labelled outcomes. Clustering, latent variable modelling.

Discovery to Understand Heterogeneity

To identify subgroups of complex disease risk

Treatment outcome explained by distinctive underlying mechanism

Foundation of Stratified Medicine

Seeking better-targeted interventions

Accounting for Heterogeneity

Individualised Healthcare Outcomes

• Probabilistic Graphical Models

• Reinforcement Learning

• Deep Learning

Understanding Heterogeneity Through Probabilistic Graphical Modelling

Identifying Heterogeneous Patient Groups

Parsimonious description of the data inferred from what is observed

Probabilistic Programming

[Figure: a probabilistic reasoning system, in which a probabilistic model, evidence and queries feed an inference algorithm that returns an answer]

The probabilistic model expresses general knowledge about a situation

The evidence contains specific information about a situation

The queries express the things that will help you make a decision

The inference uses the model to answer queries given evidence

The answers to queries are framed as probabilities of different outcomes

The basic components of a probabilistic reasoning system. Adapted from Pfeffer, Avi. "Practical probabilistic programming." International Conference on Inductive Logic Programming. Springer Berlin Heidelberg, 2010.

Eco-systems for Probabilistic Programming
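Before turning to specific frameworks, the four components of a probabilistic reasoning system (model, evidence, query, answer) can be illustrated without any library. The sketch below is a toy under stated assumptions: the prevalence and test accuracy numbers are hypothetical, and inference is done by direct enumeration with Bayes' rule rather than by any of the engines named on this slide.

```python
# Probabilistic model: general knowledge about the situation.
# All numbers are hypothetical and chosen only for illustration.
PRIOR_DISEASE = 0.10          # P(disease)
P_POS_GIVEN_DISEASE = 0.90    # P(test positive | disease)
P_POS_GIVEN_HEALTHY = 0.20    # P(test positive | no disease)


def posterior_disease(test_positive):
    """Inference by enumeration: P(disease | test result) via Bayes' rule."""
    like_disease = P_POS_GIVEN_DISEASE if test_positive else 1 - P_POS_GIVEN_DISEASE
    like_healthy = P_POS_GIVEN_HEALTHY if test_positive else 1 - P_POS_GIVEN_HEALTHY
    joint_disease = PRIOR_DISEASE * like_disease
    joint_healthy = (1 - PRIOR_DISEASE) * like_healthy
    return joint_disease / (joint_disease + joint_healthy)


# Evidence: this patient's test came back positive.
# Query: how likely is the disease given that evidence?
answer = posterior_disease(test_positive=True)
print(f"P(disease | positive test) = {answer:.3f}")
```

Probabilistic programming systems automate the inference step so that the model can be far richer than two states and one observation.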

Infer.NET

Edward

Pyro

Stan

[Figure: Infer.NET pipeline. A probabilistic program is passed to the Infer.NET compiler, which produces a C# algorithm; the C# compiler builds it for algorithm execution in the Infer.NET inference engine. Observed values (data, priors) go in and probability distributions come out]

Understanding Heterogeneity Through Probabilistic Graphical Modelling

Motivating Examples

Example 1: Heterogeneity in Asthma

Phenotypes: Observable Manifestations of Disease

[Figure: observable features surrounding a central Asthma node. Labels: Medication, Exacerbations, Allergy, Poor Lung Function, Wheeze, Symptoms, Severity, Late, Respond to treatment, Don’t Respond to treatment, Grow out of Asthma in Childhood]

Endotypes? Different Diseases With Different Causes

Causal Mechanisms of Asthma and Allergy

Manchester: Longitudinal birth cohort, ~2000 children

Bristol: Longitudinal birth cohort, ~10000 children

Progression of allergy: Eczema -> Asthma -> Rhinitis

Symptoms Causally Linked

Prevention strategy: Target children with eczema to reduce progression to asthma and rhinitis

Hidden Markov Model: “Allergic March”

[Figure: three parallel hidden Markov chains, for eczema, wheeze and rhinitis. Each chain has a class node (Eczema Class, Wheeze Class, Rhinitis Class) and a state node at Age 1, Age 3, Age 5, Age 8 and Age 11. Plate: Children (n=9801). Output: Latent Class Disease Profile]

Longitudinal Latent Disease Profile

[Figure: a single chain of latent states at Age 1, Age 3, Age 5, Age 8 and Age 11, with Class = 1, ..., k. Eczema, wheeze and rhinitis are observed at each age. Plate: Children (n=9801). Output: Latent Class Disease Profile]

Disaggregating Symptom Heterogeneity
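As a rough illustration of the hidden Markov structure in the diagrams above, the sketch below runs the forward algorithm for a single binary symptom observed at five ages. It is a plain two-state HMM, not the model used in the cohort analysis: the latent class layer is left out, and every probability is a hypothetical value chosen for illustration.

```python
def forward_likelihood(obs, init, trans, emit):
    """Forward algorithm: P(obs) under a discrete hidden Markov model.

    obs   : observed symbols, 0 = symptom absent, 1 = symptom present
    init  : init[k]     = P(state at first age = k)
    trans : trans[j][k] = P(state at next age = k | state at current age = j)
    emit  : emit[k][o]  = P(observation = o | state = k)
    """
    n_states = len(init)
    alpha = [init[k] * emit[k][obs[0]] for k in range(n_states)]
    for o in obs[1:]:
        alpha = [
            emit[k][o] * sum(alpha[j] * trans[j][k] for j in range(n_states))
            for k in range(n_states)
        ]
    return sum(alpha)


# Hypothetical parameters; state 0 = "well", state 1 = "symptomatic".
init = [0.8, 0.2]
trans = [[0.9, 0.1], [0.3, 0.7]]
emit = [[0.95, 0.05], [0.2, 0.8]]

# Two symptom histories at ages 1, 3, 5, 8 and 11.
persistent = [1, 1, 1, 1, 1]
transient = [1, 1, 0, 0, 0]
print("P(persistent history) =", forward_likelihood(persistent, init, trans, emit))
print("P(transient history)  =", forward_likelihood(transient, init, trans, emit))
```

In a latent class extension, each class would carry its own transition and emission parameters, and a child's class membership would be inferred from how well each class explains the observed history.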

The “Allergic March” reflects patterns at the population level, rather than the natural covariance of symptoms within individuals’ life courses

Developmental profiles of eczema, wheeze and rhinitis are heterogeneous

Danielle CM Belgrave, Raquel Granell, Angela Simpson, John Guiver, , Iain Buchan, A. John Henderson, and Adnan Custovic. Developmental Profiles of Eczema, Wheeze, and Rhinitis: Two Population-Based Birth Cohort Studies. PLoS Medicine 2014.

Example 2: Heterogeneity in Complex Chronic Diseases

Scleroderma

Aim: To predict a function of time that models the future trajectory of a single target clinical marker tracking a disease process of interest.

Heterogeneity in Lung Function

Schulam, Peter, and Suchi Saria. "Integrative analysis using coupled latent variable models for individualizing prognoses." The Journal of Machine Learning Research 17.1 (2016): 8244-8278.

Probabilistic Models for Individualised Care

Schulam, Peter, and Suchi Saria. "Integrative analysis using coupled latent variable models for individualizing prognoses." The Journal of Machine Learning Research 17.1 (2016): 8244-8278.

Reinforcement Learning in Healthcare

Motivating Examples

Individualised Treatment Effects

Task of Optimising Sequential Decision-Making

[Figure: branching patient trajectories labelled "Observed decisions and response" and "Unobserved responses"; candidate actions: Mechanical Ventilation? Sedate? Vasopressors]

Gottesman, O., Johansson, F., Komorowski, M., Faisal, A., Sontag, D., Doshi-Velez, F., & Celi, L. A. (2019). Guidelines for reinforcement learning in healthcare. Nature Medicine, 25(1), 16-18.

Reinforcement Learning in Healthcare

What action maximises the reward?

Action (A) - State (S) - Reward (R) - Policy (π) - Value (V)

Value (V): expected long-term return of the current state $s$ under policy $\pi$, written $V^{\pi}(s)$

Q-value or action-value (Q): long-term return of the current state $s$, taking action $a$ under policy $\pi$, written $Q^{\pi}(s, a)$

$v(s) = E[R_{t+1} + \lambda v(S_{t+1}) \mid S_t = s]$

$Q^{\pi}(s, a) = E[r_{t+1} + \lambda r_{t+2} + \lambda^{2} r_{t+3} + \dots \mid s, a]$

$= E[r + \lambda Q^{\pi}(s', a') \mid s, a]$

$Q^{*}(s, a) = E[r + \lambda \max_{a'} Q^{*}(s', a') \mid s, a]$

Successful Applications of RL in Healthcare
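The optimal Bellman equation for Q* above is the fixed point that tabular Q-learning approximates from sampled transitions. The following is a minimal sketch on a hypothetical four-state chain, not a clinical model: the environment, the reward, the step size and the purely random behaviour policy are all assumptions made for illustration.

```python
import random

N_STATES, TERMINAL = 4, 3
ACTIONS = (0, 1)   # 0 = move left, 1 = move right
DISCOUNT = 0.9     # the discount factor written as lambda above
STEP_SIZE = 0.5    # learning rate (an assumed value)


def step(state, action):
    """Deterministic toy dynamics: reward 1 only on reaching the terminal state."""
    nxt = max(state - 1, 0) if action == 0 else state + 1
    reward = 1.0 if nxt == TERMINAL else 0.0
    return nxt, reward


def q_learning(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]; terminal row stays 0
    for _ in range(episodes):
        state = 0
        while state != TERMINAL:
            action = rng.choice(ACTIONS)  # random behaviour policy (Q-learning is off-policy)
            nxt, reward = step(state, action)
            # Sampled version of r + lambda * max_a' Q(s', a')
            target = reward + DISCOUNT * max(q[nxt])
            q[state][action] += STEP_SIZE * (target - q[state][action])
            state = nxt
    return q


q = q_learning()
greedy_policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print("Q-values:", q)
print("Greedy action per non-terminal state:", greedy_policy)
```

In this toy chain the greedy policy learns to always move right. In observational healthcare data the behaviour policy is the clinicians' policy rather than a random one, which is where the difficulties discussed later arise.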

1. RL applied to optimizing antiretroviral therapy in HIV
Parbhoo, S., Bogojeska, J., Zazzi, M., Roth, V. & Doshi-Velez, F. AMIA Summits on Translational Science Proceedings 2017, 239 (2017).

2. RL applied to tailoring antiepilepsy drugs for seizure control
Guez, A., Vincent, R. D., Avoli, M. & Pineau, J. Treatment of epilepsy via batch-mode reinforcement learning. In Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence 1671–1678 (AAAI, 2008).

3. RL applied to interventions in ICU
Komorowski, M., Celi, L. A., Badawi, O., Gordon, A. & Faisal, A. Nat. Med. 24, 1716–1720 (2018).

Confounding factors: Estimated effects of medication may be diminished if severely ill or high-risk patients are given a particular drug

Effective sample size larger if learned policies are close to the clinician policies – RL better for refining existing practices rather than discovering new treatment approaches

If the rewards we are trying to maximise are too simplistic, will the system behave as expected in real life?
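The point about effective sample size can be made concrete with importance weights. The sketch below is an illustration under strong assumptions: policies are state-independent coin flips over two actions, trajectories are simulated rather than real, and the effective sample size is the Kish estimate (sum of weights squared over sum of squared weights). All probabilities are hypothetical.

```python
import random


def action_prob(p_action_1, action):
    """Probability that a coin-flip policy assigns to the given binary action."""
    return p_action_1 if action == 1 else 1.0 - p_action_1


def importance_weights(trajectories, p_target, p_behaviour):
    """Per-trajectory weight: product over steps of pi_target(a) / pi_behaviour(a)."""
    weights = []
    for actions in trajectories:
        w = 1.0
        for a in actions:
            w *= action_prob(p_target, a) / action_prob(p_behaviour, a)
        weights.append(w)
    return weights


def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum w^2."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


rng = random.Random(0)
P_CLINICIAN = 0.7  # hypothetical: clinicians choose action 1 with probability 0.7
trajectories = [
    [1 if rng.random() < P_CLINICIAN else 0 for _ in range(5)]
    for _ in range(200)
]

ess_close = effective_sample_size(importance_weights(trajectories, 0.75, P_CLINICIAN))
ess_far = effective_sample_size(importance_weights(trajectories, 0.05, P_CLINICIAN))
print(f"ESS, learned policy close to clinicians: {ess_close:.1f} of {len(trajectories)}")
print(f"ESS, learned policy far from clinicians: {ess_far:.1f} of {len(trajectories)}")
```

A policy that rarely agrees with what clinicians actually did is evaluated on only a handful of effectively useful trajectories, which is why off-policy evaluation favours refinements of existing practice.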

Downfalls of RL in Healthcare

Deep Learning

Exceptionally effective at learning patterns of data

Uses learning algorithms that derive meaning from data through a hierarchy of multiple layers, loosely mimicking the neural networks of our brain

If you provide a system with large amounts of data, it begins to pick up the structure in that data and respond in useful ways

Challenge #3: Heterogeneous Data Types

Data “In-the-Wild”

The Richer the Data Types, the more we need other methods

Meta-challenges to Estimating Treatment Effect

Heterogeneous Patient Populations

Generalisability

Causal Discovery

Data “In-the-Wild”

Missing Data

Interpretability

Challenge #4: Causality

Causal Reasoning

The questions that motivate most studies in the health, social and behavioral sciences are not associational but causal in nature.

Before an association is assessed for the possibility that it is causal, other explanations such as chance, bias and confounding have to be excluded

Causal conclusions require some knowledge of the data-generating process; they cannot be computed from the data alone, nor from the distributions governing the data

Aim: to infer dynamics of beliefs under changing conditions, for example, changes induced by treatments or external interventions.

Pearl, Judea. "Causal inference in statistics: An overview." Statistics Surveys 3 (2009): 96-146.

Efficacy and mechanism evaluation: Causal Framework for Personalising Health

[Figure: causal diagram with nodes Random Allocation, Mediator, Outcomes, Predictive biomarker (moderator), Prognostic biomarker (risk factor) and U]

Example: Personalisation of Cancer Treatment

[Figure: causal diagram with nodes Treatment, Tumor Size, Genetic Marker, Outcome (Survival), Prognostic biomarker (risk factor) and U]

Bradford-Hill Principles of Causality

Plausibility: Does causation make sense

Consistency: Cause associated with disease in different population and studies

Temporality: Cause precedes disease

Strength: Cause strongly associated with disease

Specificity: Does the cause lead to a specific effect

Dose-Response: Greater exposure to cause, higher the risk of disease

What can Machine Learning Offer

The Deconfounder

• Capture dependency structure using a latent variable model

• Assumption 1: the fitted latent-variable model is a good model of the assigned causes

• Testable

• Assumption 2: there are no unobserved single-cause confounders, that is, variables that affect one cause and the potential outcome function

• Not testable, but weaker than an assumption of no unobserved confounders

Challenge #5: Generalizability

Sparsity in Health Data

A major challenge for truly generalizable and scalable AI in healthcare is maximizing information utility for public health impact when that information (observational or clinical-context data) is sparse

• Missing data

• Inadequately sampled data

• Data that does not represent the diversity of a population

Generalisability: Training datasets that are representative of the diversity of the population as well as the heterogeneity of health conditions.

Transfer learning: potential to

• Maximise utility of available data

• Improve model’s ability to generalise

Transfer Learning for Data Sparsity

Good quality healthcare data is expensive and very often sparse

Aim: Maximizing information by using multiple data sources

Challenge: Feature mismatch: features in different datasets may vary

Challenge: Distribution mismatch: differing patient populations across different hospitals

GAN architectures to efficiently enlarge the dataset

Better predictive models than if we simply used the target dataset

Jinsung Yoon, James Jordon and Mihaela van der Schaar. “RadialGAN: Leveraging multiple datasets to improve target-specific predictive models using Generative Adversarial Networks.” arXiv preprint arXiv:1802.06403 (2018).

RadialGAN

Transfer Learning for Data Sparsity

$Z$: Latent space

$X^{(i)} \times Y$: $i$th domain

$G_i, F_i, D_i$: Decoder, Encoder and Discriminator of the $i$th domain

The $i$th domain is translated to the $j$th domain via $Z$ using $F_i$ and $G_j$

Jinsung Yoon, James Jordon and Mihaela van der Schaar. “RadialGAN: Leveraging multiple datasets to improve target-specific predictive models using Generative Adversarial Networks.” arXiv preprint arXiv:1802.06403 (2018).

Challenge #6: Missing Data

Missing Data

Missing Completely At Random (MCAR): The probability of data being missing does not depend on the observed or unobserved data
e.g. $\mathrm{logit}(p_{it}) = \theta_0$

Missing At Random (MAR): The probability of data being missing does not depend on the unobserved data, conditional on the observed data
e.g. Children with missing wheeze data have better lung function
e.g. $\mathrm{logit}(p_{it}) = \theta_0 + \theta_1 t_i$ or $\mathrm{logit}(p_{it}) = \theta_0 + \theta_2 y_0$

Missing Not At Random (MNAR): The probability of data being missing does depend on the unobserved data, conditional on the observed data
e.g. Children with missing lung function have better lung function
e.g. $\mathrm{logit}(p_{it}) = \theta_0 + \theta_3 y_{it}$

Mason, Alexina, Nicky Best, Sylvia Richardson, and Ian Plewis. "Strategy for modelling non-random missing data mechanisms in observational studies using Bayesian methods." Journal of Official Statistics (2010).

Alexina Mason. “Bayesian methods for modelling non-random missing data mechanisms in longitudinal studies.” PhD Thesis (2009).

Missing Completely at Random
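The three missingness mechanisms defined above can be compared in a small simulation. This is a sketch under stated assumptions, not the Bayesian models of the cited work: the data-generating process, the coefficient values and the function name `simulate` are hypothetical, and the only analysis shown is the naive mean of the observed outcomes.

```python
import math
import random


def expit(x):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate(mechanism, n=20000, seed=0):
    """Return (mean of all outcomes, mean of observed outcomes) under a mechanism."""
    rng = random.Random(seed)
    everyone, observed = [], []
    for _ in range(n):
        y0 = rng.gauss(0.0, 1.0)            # baseline measurement, always observed
        y = 0.8 * y0 + rng.gauss(0.0, 0.6)  # outcome that may go missing
        if mechanism == "MCAR":
            logit_p = -1.0                  # theta_0 only
        elif mechanism == "MAR":
            logit_p = -1.0 + 1.5 * y0       # depends on the observed baseline
        elif mechanism == "MNAR":
            logit_p = -1.0 + 1.5 * y        # depends on the possibly missing value itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        missing = rng.random() < expit(logit_p)  # p is the probability of being missing
        everyone.append(y)
        if not missing:
            observed.append(y)
    return sum(everyone) / len(everyone), sum(observed) / len(observed)


for mechanism in ("MCAR", "MAR", "MNAR"):
    true_mean, observed_mean = simulate(mechanism)
    print(f"{mechanism}: true mean = {true_mean:+.3f}, observed-only mean = {observed_mean:+.3f}")
```

Under MCAR the observed-only mean tracks the true mean. Under MAR and MNAR it is biased downwards in this setup because higher values are more likely to be missing; under MAR the bias could in principle be corrected using the observed baseline, whereas under MNAR it requires a model of the missingness mechanism.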

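[Figure: graphical model for Individual i with nodes β, θ, µ_i, x_i, σ², y_i, p_i and m_i, divided into a Model of Interest and a Model of Missingness]

Missing at Random

[Figure: graphical model for Individual i with nodes β, θ, µ_i, x_i, σ², y_i, p_i and m_i, divided into a Model of Interest and a Model of Missingness]

Missing Not at Random

[Figure: graphical model for Individual i with nodes β, θ, µ_i, x_i, σ², y_i, p_i and m_i, divided into a Model of Interest and a Model of Missingness]

Challenge #7: Interpretability

Challenge #8: Fairness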

“People will only use technology they trust.”
Brad Smith, President & Chief Legal Officer, Microsoft Corporation

Problem Specific Challenge

Tutorial: 21 fairness definitions and their politics
Arvind Narayanan
Update: this tutorial was presented at the Conference on Fairness, Accountability, and Transparency, Feb 23 2018.

Computer scientists and statisticians have devised numerous mathematical criteria to define what it means for a classifier or a model to be fair. The proliferation of these definitions represents an attempt to make technical sense of the complex, shifting social understanding of fairness. Thus, these definitions are laden with values and politics, and seemingly technical discussions about mathematical definitions in fact implicate weighty normative questions. A core component of these technical discussions has been the discovery of trade-offs between different (mathematical) notions of fairness; these trade-offs deserve attention beyond the technical community.

https://towardsdatascience.com/a-tutorial-on-fairness-in-machine-learning-3ff8ba1040cb

Making AI work in healthcare

Design AI to Earn Trust

Fairness

Reliability & Safety

Privacy & Security

Inclusiveness

Transparency

Accountability

Reliability & Safety

Evaluate training data

Test extensively (and enable a user feedback loop)

Monitor ongoing performance

Design for unexpected circumstances—including nefarious attacks

Human in the loop

Privacy & Security

Existing privacy laws (e.g. the General Data Protection Regulation) apply

Provide transparency about data collection and use, and good controls so people can make choices about their data

Design systems to protect against bad actors

Use de-identification techniques to promote both privacy and security

Inclusiveness

Inclusive design practices to address potential barriers that could unintentionally exclude people

Enhances opportunities for people with disabilities

Build trust through contextual interaction

EQ in addition to IQ

Transparency

People should understand how decisions were made

Provide contextual explanations

Make it easier to raise awareness of potential bias, errors and unintended outcomes

Ultimate Challenge: Defining the Trade-offs between the Challenges

Heterogeneous Patient Populations

Generalisability

Causal Discovery

Accuracy

Data “In-the-Wild”

Missing Data

Interpretability

Fairness

Healthcare ML at MSR Cambridge

Javier Alvarez-Valle Danielle Belgrave Chris Bishop Laurence Bourn Isabel Chien David Carter Stephanie Hyland Richard Lowe

Pratik Ghosh Hannah Murfet Jay Nanavati Aditya Nori Kenton O’Hara Konstantina Palla Tim Regan Anton Schwaighofer

Jan Steumer Kenji Takeda Ivan Tarapov Anja Thieme Sebastian Tschiatschek Stefan Wijnen Ted Meeds Richard Turner

Thank You!

Danielle Belgrave

@DaniCMBelg