Learning Cost-Effective and Interpretable Treatment Regimes

Himabindu Lakkaraju Cynthia Rudin Duke University

Abstract

Decision makers, such as doctors and judges, make crucial decisions such as recommending treatments to patients and granting bail to defendants on a daily basis. Such decisions typically involve weighing the potential benefits of taking an action against the costs involved. In this work, we aim to automate this task of learning cost-effective, interpretable and actionable treatment regimes. We formulate this as a problem of learning a decision list – a sequence of if-then-else rules – that maps characteristics of subjects (e.g., diagnostic test results of patients) to treatments. This yields an end-to-end individualized policy for tests and treatments. We propose a novel objective to construct a decision list which maximizes outcomes for the population and minimizes overall costs. Since we do not observe the outcomes corresponding to counterfactual scenarios, we use techniques from the causal inference literature to infer them. We model the problem of learning the decision list as a Markov Decision Process (MDP) and employ a variant of the Upper Confidence Bound for Trees (UCT) strategy which leverages customized checks for pruning the search space effectively. Experimental results on real world observational data capturing judicial bail decisions and treatment recommendations for asthma patients demonstrate the effectiveness of our approach.

If Spiro-Test=Pos and Prev-Asthma=Yes and Cough=High then C
Else if Spiro-Test=Pos and Prev-Asthma=No then Q
Else if Short-Breath=Yes and Gender=F and Age≥40 and Prev-Asthma=Yes then C
Else if Peak-Flow=Yes and Prev-RespIssue=No and Wheezing=Yes then Q
Else if Chest-Pain=Yes and Prev-RespIssue=Yes and Methacholine=Pos then C
Else Q

Figure 1: Regime for treatment recommendations for asthma patients output by our framework; Q refers to milder forms of treatment used for quick relief, and C corresponds to more intense treatments such as controller drugs (C is higher cost than Q); attributes in blue are the least expensive.

1 Introduction

Medical and judicial decisions can be complex: they involve careful assessment of the subject's condition, analyzing the costs associated with the possible actions, and the nature of the consequent outcomes. Further, there might be costs associated with the assessment of the subject's condition itself (e.g., physical pain endured and monetary costs of medical tests). For instance, a doctor first diagnoses the patient's condition by studying the patient's medical history and ordering a set of relevant tests that are crucial to the diagnosis. In doing so, she also factors in the physical, mental, and monetary costs incurred due to each of these tests. Based on the test results, she carefully deliberates various treatment options and analyzes the potential side-effects as well as the effectiveness of each of these options. Analogously, a judge deciding if a defendant should be granted bail studies the criminal records of the defendant and enquires for additional information (e.g., the defendant's socio-economic status) if needed. She then recommends a course of action that trades off the risk of granting bail to the defendant (the defendant may commit a new crime when out on bail) with the cost of denying bail (adverse effects on the defendant or the defendant's family, cost of jail to the county).

In practical situations, human decision makers often leverage personal experience to make decisions, without considering data, even if massive amounts of it exist for the problem at hand. There exist domains where machine learning models could potentially help – but they would need to consider all three aspects discussed above: predictions of counterfactuals, costs of gathering information, and costs of treatments. Further, these models must be interpretable in order to create any reasonable chance of a human decision maker actually using them. In this work, we address the problem of learning such cost-effective, interpretable treatment regimes from observational data.

Prior research addresses various aspects of the problem at hand in isolation. For instance, there exists a large body of literature on estimating treatment effects (namely, the causal inference literature), recommending optimal treatments (see, e.g., [1, 33, 7]), and learning intelligible models for prediction (e.g., [18, 15, 21, 4]). However, an effective solution for the problem at hand should ideally incorporate all of the aforementioned aspects. Furthermore, existing solutions for learning treatment regimes account neither for the costs associated with gathering the required information nor for the treatment costs. The goal of this work is to propose a framework which jointly addresses all of the aforementioned aspects.

We address the problem at hand by formulating it as a task of learning a decision list that maps subject characteristics to tests and treatments (such as the one shown in Figure 1) such that it: 1) maximizes the expectation of a pre-specified outcome when used to assign treatments to a population of interest, 2) minimizes the costs associated with assessing subjects' conditions, and 3) minimizes the costs associated with the treatments themselves. Note that for each subject in our data, we observe only the outcome for the treatment assigned to that subject, not the counterfactual. We use the doubly robust estimation technique to infer the counterfactuals. We chose decision lists to express the treatment regime because they are interpretable and allow for tests to be performed sequentially. We propose a novel objective function to learn a decision list optimized with respect to the criteria discussed above. We prove that the proposed objective is NP-hard via a reduction from the weighted exact cover problem. We then optimize this objective by modeling it as a Markov Decision Process (MDP) and employing a variant of the Upper Confidence Bound for Trees (UCT) strategy which leverages customized checks for pruning the search space effectively.

We empirically evaluate the proposed framework on two real world datasets: 1) judicial bail decisions, and 2) treatment recommendations for asthma patients. Our results demonstrate that the regimes output by our framework result in improved outcomes compared to state-of-the-art baselines at much lower costs. The treatment regimes we found are not complicated and require few diagnostic checks to determine the optimal treatment.

2 Related Work

Below, we provide an overview of related research on learning treatment regimes, subgroup analysis, and interpretable models.

Treatment Regimes. The problem of learning treatment regimes has been extensively studied in the context of medicine and health care. Along the lines of [39], the literature on treatment regimes can be categorized into regression-based methods and policy-search-based methods. Regression-based methods [27, 31, 28, 32, 43, 25] model the conditional distribution of the outcomes given the treatment and characteristics of patients and choose the treatment resulting in the best possible outcome for each individual. Policy-search-based methods [28, 42, 41, 38, 39] search for a policy (a function which assigns treatments to individuals) within a pre-specified class of policies such that the resulting expected outcome is maximized across the population of interest. Furthermore, recent research in personalized medicine has also focused on developing dynamic treatment regimes [14, 40, 33, 7], where the goal is to learn treatment regimes that maximize outcomes for patients in a given population by recommending a sequence of appropriate treatments over time, based on the state of the patient. Very few of the aforementioned solutions [39, 24, 40] produce regimes that are intelligible. None of the aforementioned approaches explicitly account for treatment costs and costs associated with gathering information pertaining to patient characteristics.

While most work on learning treatment regimes has been done in the context of medicine, the same ideas apply to policies in other fields. To the best of our knowledge, this work is the first attempt at extending work on treatment regimes to judicial bail decisions.

Subgroup Analysis. The goal of this line of research is to find out whether there exist subgroups of individuals in which a given treatment exhibits heterogeneous effects, and if so, how the treatment effect varies across them. This problem has been well studied [30, 8, 19, 3, 9, 34]. However, identifying subgroups with heterogeneous treatment effects does not readily provide us with regimes.

Interpretable Models. A large body of literature focuses on developing interpretable models for classification [18, 35, 15, 21, 4] and clustering [11, 17, 16]. To this end, various classes of models such as decision lists [18], decision sets [15, 36], prototype (case) based models [4], and generalized additive models [21] were proposed. These classes of models were not conceived to model treatment effects. There has been recent work on leveraging decision lists to describe estimated treatment regimes [34, 24, 13, 39]. These solutions do not account for the treatment costs or the costs involved in gathering patient characteristics. Several of these techniques use greedy methods, which causes issues with the quality of the models. Lastly, there has also been some work on cost-sensitive learning of classification models such as decision trees [20, 10]. This is, however, not applicable to the problem at hand because it does not model treatment effects.

3 Our Framework

First, we formalize the notion of treatment regimes and discuss how to represent them as decision lists. We then propose an objective function for constructing cost-effective and interpretable treatment regimes.

3.1 Input Data and Cost Functions

Consider a dataset D = {(x1, a1, y1), (x2, a2, y2), ···, (xN, aN, yN)} comprised of N independent and identically distributed observations, each of which corresponds to a subject (individual), potentially from an observational study. Let F = {f1, f2, ···, fp} denote the set of all the characteristics in D. Consequently, xi = [xi(1), xi(2), ···, xi(p)] ∈ [V1, V2, ···, Vp] is a vector capturing the values assumed by subject i for each of the characteristics in F. Vq denotes the set of all possible values that the characteristic fq can take. In the medical setting, example characteristics include a patient's age, BMI, gender, glucose level, etc. Let A denote the set of all possible treatments. ai ∈ A and yi ∈ R represent the treatment assigned to subject i and the corresponding outcome respectively. We assume that yi is defined such that higher values indicate better outcomes. For example, the outcome of a patient can be regarded as a wellness improvement score that indicates the effectiveness of the assigned treatment.

It can be much more expensive to determine certain subject characteristics compared to others. For instance, a patient's age can be easily retrieved either from previous records or by asking the patient. On the other hand, determining her glucose level requires more comprehensive testing, and is therefore more expensive in terms of the monetary costs, time, and effort required both from the patient as well as the clinicians. We assume access to a function d : F → R which returns the cost of determining any characteristic in F (assessment cost). The cost associated with a given characteristic f ∈ F is assumed to be the same for all the subjects in the population, though the framework can be extended to have patient-specific costs. Analogously, each treatment a ∈ A incurs a cost (treatment cost), and we assume access to a function d′ : A → R that returns the cost associated with treatment a ∈ A.

3.2 Treatment Regimes

A treatment regime is a function that takes as input the characteristics of any given subject x and maps them to an appropriate treatment a ∈ A. As discussed, prior studies [29, 23] suggest that decision makers such as doctors and judges who make high-stakes decisions are more likely to trust, and therefore employ, models that are interpretable. We thus employ decision lists to express treatment regimes (see the example in Figure 1). A decision list is an ordered list of rules embedded within an if-then-else structure. A treatment regime¹ expressed as a decision list π is a sequence of L + 1 rules [r1, r2, ···, rL+1]. The last one, rL+1, is a default rule which applies to all those subjects who do not satisfy any of the previous L rules.

Each rule rj (except the default rule) is a tuple of the form (cj, aj) where aj ∈ A, and cj represents a pattern, which is a conjunction of one or more predicates. Each predicate takes the form (f, o, v) where f ∈ F, o ∈ {=, ≠, ≤, ≥, <, >}, and v denotes some value that can be assumed by the characteristic f. An example of such a pattern is "Age ≥ 40 ∧ Gender=Female". A subject i is said to satisfy rule j if his/her characteristics xi satisfy all the predicates in cj. Let us formally denote this using an indicator function, satisfy(xi, cj), which returns 1 if xi satisfies cj and 0 otherwise.

The rules in π partition the dataset D into L + 1 groups: {R1, R2, ···, RL, Rdefault}. A group Rj, where j ∈ {1, 2, ···, L}, is comprised of those subjects that satisfy cj but do not satisfy any of c1, c2, ···, cj−1:

    Rj = { x ∈ [V1, ···, Vp] | satisfy(x, cj) ∧ ⋀_{t=1}^{j−1} ¬ satisfy(x, ct) }.    (1)

The treatment assigned to each subject by π is determined by the group that he/she belongs to. For instance, if subject i with characteristics xi belongs to group Rj induced by π, i.e., xi ∈ Rj, then subject i will be assigned the corresponding treatment aj under regime π, i.e., π(xi) = aj.

Similarly, the cost incurred when we assign a treatment to subject i (treatment cost) according to regime π is:

    φ(xi) = d′(π(xi))    (2)

where the function d′, defined in Section 3.1, takes as input a treatment a ∈ A and returns its cost.

We can also define the cost incurred in assessing the condition of a subject i (assessment cost) as per the regime π. Recall that a subject i belongs to the group Rj if and only if the subject does not satisfy the conditions c1, ···, cj−1, but satisfies the condition cj (see Eqn. 1). To reach this conclusion, all the characteristics present in the corresponding antecedents c1, ···, cj must have been measured for subject i and evaluated against the appropriate predicate conditions. This implies that the assessment cost incurred for this subject i is the sum of the costs of all the characteristics that appear in c1, ···, cj. If Nl denotes the set of all the characteristics that appear in c1, ···, cl, the assessment cost of subject i as per the regime π can be written as:

    ψ(xi) = Σ_{l=1}^{L} [ 1(xi ∈ Rl) × ( Σ_{e ∈ Nl} d(e) ) ]    (3)

where 1 denotes an indicator function that returns 1 if the condition within the brackets is true and 0 otherwise.

¹ We use the terms decision list and treatment regime interchangeably from here on.
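To make the decision-list representation and the cost definitions in Eqns. 1–3 concrete, the sketch below encodes a regime as an ordered list of (pattern, treatment) rules and computes, for one subject, the assigned treatment, the treatment cost φ, and the assessment cost ψ. It is a minimal illustration under assumed data structures (subjects as Python dicts, predicates as (characteristic, operator, value) triples); the function and variable names are ours, not from the paper.

```python
import operator

# Operators allowed in predicates (f, o, v), as defined in Section 3.2.
OPS = {"=": operator.eq, "!=": operator.ne, "<=": operator.le,
       ">=": operator.ge, "<": operator.lt, ">": operator.gt}

def satisfies(x, pattern):
    """satisfy(x, c): True iff subject x meets every predicate in pattern c."""
    return all(OPS[o](x[f], v) for (f, o, v) in pattern)

def apply_regime(x, rules, default_treatment, d, d_prime):
    """Assign a treatment to subject x using a decision list.

    rules             : ordered list of (pattern, treatment) tuples [r_1, ..., r_L]
    default_treatment : treatment of the default rule r_{L+1}
    d, d_prime        : assessment- and treatment-cost functions from Section 3.1
    Returns (treatment, treatment_cost phi, assessment_cost psi).
    """
    seen = set()                              # N_l: characteristics examined so far
    for pattern, treatment in rules:
        seen.update(f for (f, _, _) in pattern)
        if satisfies(x, pattern):             # x falls into group R_j (Eqn. 1)
            return treatment, d_prime(treatment), sum(d(f) for f in seen)  # Eqns. 2, 3
    # Default rule: every characteristic appearing in c_1 ... c_L was needed.
    return default_treatment, d_prime(default_treatment), sum(d(f) for f in seen)

# Tiny usage example with hypothetical costs.
d = {"Age": 1, "Gender": 1, "Spiro-Test": 4}.get
d_prime = {"Q": 10, "C": 15}.get
rules = [([("Spiro-Test", "=", "Pos"), ("Age", ">=", 40)], "C")]
print(apply_regime({"Age": 45, "Gender": "F", "Spiro-Test": "Pos"}, rules, "Q", d, d_prime))
```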

3.3 Objective Function

We now formulate the objective function for learning a cost-effective treatment regime. We first formalize the notions of the expected outcome, assessment cost, and treatment cost of a treatment regime π with respect to the dataset D.

Expected Outcome. Recall that the treatment regime π assigns a subject i with characteristics xi to a treatment π(xi). The quality of regime π is partly determined by the expected outcome when all the subjects in D are assigned treatments according to regime π. The higher the value of such an expected outcome, the better the quality of π. There is, however, one caveat to computing the value of this expected outcome: we only observe the outcome yi resulting from assigning xi to ai in the data D, and not any of the counterfactuals. If the regime π assigns a different treatment a′ ≠ ai to xi, we cannot readily determine the corresponding outcome from the data.

The solutions proposed to compute expected outcomes in settings such as ours can be categorized as: adjustment by regression modeling, adjustment by inverse propensity score weighting, and doubly robust estimation. A detailed treatment of each of these approaches is presented by Lunceford et al. [22]. The success of regression-based modeling and inverse weighting depends heavily on the postulated regression model and the postulated propensity score model respectively. In either case, if the postulated models are not identical to the true models, we have biased (inconsistent) estimates of the expected outcome. On the other hand, doubly robust estimation combines the above approaches in such a way that the estimated value of the expected outcome is unbiased as long as one of the postulated models is identical to the true model and there are no unmeasured confounders. The doubly robust estimator for the expected outcome of regime π, denoted by g1(π), can be written as:

    g1(π) = (1/N) Σ_{i=1}^{N} Σ_{a ∈ A} o(i, a),    (4)

where

    o(i, a) = [ (1(ai = a) / ω̂(xi, a)) (yi − ŷ(xi, a)) + ŷ(xi, a) ] · 1(π(xi) = a).

ω̂(xi, a) denotes the probability that subject i with characteristics xi is assigned to treatment a in the data D; ω̂ represents the propensity score model. In practice, we fit a multinomial logistic regression model on D to learn this function. Similarly, ŷ(xi, a) denotes the predicted outcome when a subject characterized by xi is assigned to a treatment a. ŷ is learned in our experiments by fitting a linear regression model on D prior to optimizing for the treatment regime. Note that our framework does not impose any constraints on the functional forms of ŷ and ω̂, i.e., ŷ and ω̂ could be modeled using any suitable technique.

Expected Assessment Cost. Recall that there are assessment costs associated with each subject. These costs are governed by the characteristics that will be used in assessing the subject's condition and recommending a treatment. The assessment cost of a subject i treated using the regime π is given in Eqn. 3. The expected assessment cost across the entire population can be computed as:

    g2(π) = (1/N) Σ_{i=1}^{N} ψ(xi).    (5)

It is important to ensure that our learning process favors regimes with smaller values of expected assessment cost. Keeping this cost low also ensures that the learned decision list is sparse, which assists with interpretability.

Expected Treatment Cost. The treatment cost for a subject i who is assigned treatment using a regime π is given in Eqn. 2. The expected treatment cost across the entire population can be computed as:

    g3(π) = (1/N) Σ_{i=1}^{N} φ(xi).    (6)

The smaller the expected treatment cost of the regime, the more desirable it is in practice. We present the complete objective function below.

Complete Objective. We assume access to the following inputs: 1) the observational data D; 2) a set FP of frequently occurring patterns in D – recall that each pattern corresponds to a conjunction of one or more predicates, an example being "Age ≥ 40 ∧ Gender=Female"; in practice, such patterns can be obtained by running a frequent pattern mining algorithm such as Apriori [2] on the set D; and 3) the set of all possible treatments A.

We define the set of all possible (pattern, treatment) tuples as L = {(c, a) | c ∈ FP, a ∈ A}, and C(L) as the set of the permutations of all possible subsets (excluding the null set) of L. An element in L can be thought of as a rule in a decision list, and an element in C(L) can be thought of as a list of rules in a decision list (without the default rule). We then search over all elements in the set C(L) × A to find a regime that maximizes the expected outcome (Eqn. 4) while minimizing the expected assessment (Eqn. 5) and treatment (Eqn. 6) costs, all of which are computed over D. Our objective function can be formally written as:

    arg max_{π ∈ C(L) × A}  λ1 g1(π) − λ2 g2(π) − λ3 g3(π)    (7)

where g1, g2, g3 are defined in Eqns. 4, 5, 6 respectively, and λ1, λ2, and λ3 are non-negative weights that scale the relative influence of the terms in the objective.

Theorem 1. The objective function in Eqn. 7 is NP-hard. (Please see the appendix for details.)
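As a concrete reading of Eqns. 4–7, the sketch below computes the doubly robust value estimate g1 and the cost terms g2 and g3 for a candidate regime, given already-fitted outcome and propensity models. This is a schematic illustration, not the authors' implementation: y_hat and w_hat stand in for the fitted ŷ and ω̂ described above, psi and phi for the per-subject costs of Eqns. 3 and 2, and all names are hypothetical.

```python
import numpy as np

def doubly_robust_value(pi, X, a_obs, y_obs, y_hat, w_hat):
    """g_1(pi): doubly robust estimate of the expected outcome under regime pi
    (Eqn. 4). Because o(i, a) is multiplied by 1(pi(x_i) = a), only the
    treatment actually recommended by pi contributes for each subject."""
    total = 0.0
    for x, a_i, y_i in zip(X, a_obs, y_obs):
        a_pi = pi(x)                                 # treatment assigned by the regime
        ipw = (a_i == a_pi) / w_hat(x, a_pi)         # 1(a_i = a) / propensity
        total += ipw * (y_i - y_hat(x, a_pi)) + y_hat(x, a_pi)   # o(i, a_pi)
    return total / len(X)

def objective(pi, X, a_obs, y_obs, y_hat, w_hat, psi, phi, lam=(1.0, 1.0, 1.0)):
    """lambda_1 g_1(pi) - lambda_2 g_2(pi) - lambda_3 g_3(pi)  (Eqn. 7)."""
    g1 = doubly_robust_value(pi, X, a_obs, y_obs, y_hat, w_hat)
    g2 = np.mean([psi(x) for x in X])   # expected assessment cost (Eqn. 5)
    g3 = np.mean([phi(x) for x in X])   # expected treatment cost  (Eqn. 6)
    return lam[0] * g1 - lam[1] * g2 - lam[2] * g3
```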

Note that NP-hardness is a worst-case categorization only; with an efficient search procedure, it is practical to obtain a good approximation on most reasonably-sized datasets.

3.4 Optimizing the Objective

We optimize our objective by modeling it as a Markov Decision Process (MDP) and then employing the Upper Confidence Bound on Trees (UCT) algorithm to find a treatment regime which maximizes Eqn. 7. We also propose and leverage customized checks for guiding the exploration of the UCT algorithm and pruning the search space effectively.

Markov Decision Process Formulation. Our goal is to find a sequence of rules that maximizes the objective function in Eqn. 7. To this end, we formulate a fully observable MDP such that the optimal policy of the posited formulation provides a solution to our optimization problem.

A fully observable MDP is characterized by a tuple (S, A, T, R) where S denotes the set of all possible states, A denotes the set of all possible actions, and T and R represent the transition and reward functions respectively. Below we define each of these in the context of our problem. Figure 2 shows a snapshot of the state space and transitions for a small dataset.

Figure 2: Sample observational data and the corresponding Markov Decision Process representation.

State Space. Conceptually, each state in our state space captures the effect of some partial or fully constructed decision list. To illustrate, let us consider a partial decision list with just one rule: "if Age ≥ 40 ∧ Gender = Female, then T1". This partial list enforces that (i) all those subjects that satisfy the condition of the rule are assigned treatment T1, and (ii) the age and gender characteristics will be required in determining treatments for all the subjects in the population.

To capture such information, we represent a state s̃ ∈ S by a list of tuples [(τ1(s̃), σ1(s̃)), ···, (τN(s̃), σN(s̃))] where each tuple corresponds to a subject in D. τi(s̃) is a binary vector of length p defined such that τi(j)(s̃) = 1 if the characteristic j will be required for determining subject i's treatment in state s̃, and 0 otherwise. Further, σi(s̃) captures the treatment assigned to subject i in state s̃. If no treatment has been assigned to i, then σi(s̃) = 0.

Note that we have a single start state s̃0 which corresponds to an empty decision list. τi(s̃0) is a vector of 0s and σi(s̃0) = 0 for all subjects i in D, indicating that no treatments have been assigned to any subject and no characteristics were deemed as requirements for assigning treatments. Furthermore, a state s̃ is regarded as a terminal state if σi(s̃) is non-zero for all i, indicating that treatments have been assigned to all the subjects.

Actions. Each action can take one of the following forms: 1) a rule r ∈ L, which is a tuple of the form (pattern, treatment), e.g., (Age ≥ 40 ∧ Gender=Female, T1); this specifies that subjects who obey the conditions in the pattern are prescribed the corresponding treatment, and such an action leads to a non-terminal state; 2) a treatment a ∈ A, which corresponds to the default rule leading to a terminal state.

Transition and Reward Functions. We have a deterministic transition function which ensures that taking an action ã = (c̃, t̃) from state s̃ will always lead deterministically to a state s̃′. Let U denote the set of all those subjects i for which treatments have already been assigned in state s̃, i.e., σi(s̃) ≠ 0, and let U^c denote the set of all those subjects who have not been assigned a treatment in the state s̃. Let U′ denote the set of all those subjects i that do not belong to the set U and that satisfy the condition c̃ of action ã. Let Q denote the set of all those characteristics in F that are present in the condition c̃ of action ã. If action ã corresponds to a default rule, then Q = ∅ and U′ = U^c. With this notation in place, the new state s̃′ can be characterized as follows: 1) τi(j)(s̃′) = τi(j)(s̃) and σi(s̃′) = σi(s̃) for all i ∈ U, j ∈ F; 2) τi(j)(s̃′) = 1 for all i ∈ U^c, j ∈ Q; 3) σi(s̃′) = t̃ for all i ∈ U′.

The immediate reward obtained when we reach s̃′ by taking action ã = (c̃, t̃) from the state s̃ can be written as:

    (λ1/N) Σ_{i ∈ U′} o(i, t̃) − (λ2/N) Σ_{i ∈ U^c, j ∈ Q} d(j) − (λ3/N) Σ_{i ∈ U′} d′(t̃)

where o is defined in Eqn. 4, and d and d′ are the cost functions for characteristics and treatments respectively (see Section 3.1).

UCT with Customized Pruning. The basic idea behind the Upper Confidence Bound on Trees (UCT) [12] algorithm is to iteratively construct a search tree for some pre-determined number of iterations. At the end of this procedure, the best performing policy or sequence of actions is returned as the output. Each node in the search tree corresponds to a state in the MDP state space and the links in the tree correspond to the actions. UCT employs the UCB-1 metric [6] for navigating through the search space.

We employ a UCT-based algorithm for finding the optimal policy of our MDP formulation, though we leverage customized checks to further guide the exploration process and prune the search space. Recall that each non-terminal state in our state space corresponds to a partial decision list. We exploit the fact that we can upper bound the value of the objective for any given partial decision list. The upper bound on the objective for any given non-terminal state s̃ can be computed by approximating the reward as follows: 1) all the subjects who have not been assigned treatments will get the best possible treatments without incurring any treatment cost, and 2) no additional assessments are required by any subject (and hence no additional assessment costs are levied) in the population. The upper bound on the incremental reward is thus:

    upper_bound(U^c) = (λ1/N) Σ_{i ∈ U^c} max_t o(i, t).

During the execution of the UCT procedure, whenever there is a choice to be made about which action needs to be taken, we employ checks based on the upper bound of the objective value of the resulting state. Consider a scenario in which the UCT procedure is currently in state s̃ and needs to choose an action. For each possible action ã (that does not correspond to a default rule) from state s̃, we determine the upper bound on the objective value of the resulting state s̃′. If this value is less than either the highest value encountered previously for a complete rule list, or the objective value corresponding to the best default action from the state s̃, then we block the action ã from the state s̃. This state is provably suboptimal. Note that we can compute exact values of the objective function if the action is a default rule, because the corresponding decision list is fully constructed.
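The pruning rule above can be made concrete with a small sketch. The following is an illustrative Python rendering of the upper bound and the blocking check, not the authors' implementation: it assumes the per-subject doubly robust scores o(i, t), the treatment-cost function d′, and the accumulated objective value of a state are available from the definitions above, and it omits the UCB-1 selection and simulation steps of the full UCT procedure.

```python
def upper_bound(unassigned, treatments, o, lam1, N):
    """Optimistic completion bound for a partial list: every still-unassigned
    subject i gets its best possible treatment, and no additional assessment
    or treatment costs are charged (Section 3.4)."""
    return (lam1 / N) * sum(max(o(i, t) for t in treatments) for i in unassigned)

def value_of_best_default(state_value, unassigned, treatments, o, d_prime,
                          lam1, lam3, N):
    """Exact objective obtained by closing the current partial list with the
    best default rule: all remaining subjects receive a single treatment t,
    and no further characteristics are assessed."""
    return max(state_value
               + (lam1 / N) * sum(o(i, t) for i in unassigned)
               - (lam3 / N) * len(unassigned) * d_prime(t)
               for t in treatments)

def should_block(next_state_value, next_unassigned,   # after taking action a~
                 state_value, unassigned,             # current state s~
                 treatments, o, d_prime, lam1, lam3, N, best_complete):
    """Prune a non-default action if an optimistic completion of the resulting
    state cannot beat the best complete list seen so far, nor the best default
    termination of the current state."""
    optimistic = next_state_value + upper_bound(next_unassigned, treatments, o, lam1, N)
    fallback = value_of_best_default(state_value, unassigned, treatments,
                                     o, d_prime, lam1, lam3, N)
    return optimistic < max(best_complete, fallback)
```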

# of Data Points
  Bail Dataset: 86152
  Asthma Dataset: 60048

Characteristics & Costs
  Bail Dataset: age, gender, previous offenses, prior arrests, current charge, SSN (cost = 1); marital status, kids, owns house, pays rent, addresses in past years (cost = 2); mental illness, drug tests (cost = 6)
  Asthma Dataset: age, gender, BMI, BP, short breath, temperature, cough, chest pain, wheezing, past allergies, asthma history, family history, has insurance (cost = 1); peak flow test (cost = 2); spirometry test (cost = 4); methacholine test (cost = 6)

Treatments & Costs
  Bail Dataset: release on personal recognizance (cost = 20); release on conditions/bond (cost = 40)
  Asthma Dataset: quick relief (cost = 10); controller drugs (cost = 15)

Outcomes & Scores
  Bail Dataset: no risk (score = 100); failure to appear (score = 66); non-violent crime (score = 33); violent crime (score = 0)
  Asthma Dataset: no asthma attack for ≥ 4 months (score = 100); no asthma attack for 2 months (score = 66); no asthma attack for 1 month (score = 33); asthma attack in less than 2 weeks (score = 0)

Table 1: Summary of datasets.
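For concreteness, the asthma-side costs listed in Table 1 could be encoded as the cost functions d and d′ introduced in Section 3.1. The sketch below is only an illustrative encoding of the published table; the dictionary names and key strings are ours, not from the paper.

```python
# Assessment costs d(f) and treatment costs d'(a) for the asthma dataset,
# transcribed from Table 1 (illustrative encoding).
ASTHMA_ASSESSMENT_COST = {
    # cost = 1: demographics, symptoms, history, insurance status
    **{f: 1 for f in ["age", "gender", "BMI", "BP", "short breath", "temperature",
                      "cough", "chest pain", "wheezing", "past allergies",
                      "asthma history", "family history", "has insurance"]},
    "peak flow test": 2,
    "spirometry test": 4,
    "methacholine test": 6,
}
ASTHMA_TREATMENT_COST = {"quick relief": 10, "controller drugs": 15}

d = ASTHMA_ASSESSMENT_COST.get        # d : F -> R  (assessment cost)
d_prime = ASTHMA_TREATMENT_COST.get   # d': A -> R  (treatment cost)
```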

4 Experimental Evaluation

Here, we discuss the detailed experimental evaluation of our framework. First, we analyze the outcomes obtained and the costs incurred when recommending treatments using our approach. Then, we present an ablation study which explores the contributions of each of the terms in our objective, followed by an analysis on real world data.

Dataset Description. Our first dataset consists of information pertaining to the bail decisions [37, 15] of about 86K defendants collected from various state courts in the U.S. between 1990 and 2009 (Table 1). It captures information about various characteristics for each of the 86K defendants. The decisions made by judges in each of these cases (release without/with conditions) and the corresponding outcomes (e.g., if a defendant committed another crime when out on bail) are also available. We assigned costs to characteristics and treatments based on discussions with subject matter experts. The characteristics that were harder to obtain were assigned higher costs compared to the ones that were readily available. Similarly, the treatment that placed a higher burden on the defendant (release on condition) was assigned a higher cost. We assigned each outcome a numerical score; a higher score indicates a better outcome. Thus, undesirable scenarios (e.g., violent crime when released on bail) received lower scores.

Our second dataset (Table 1) captures details of about 60K asthma patients collected by a web-based electronic health record company [15]. For each of these 60K patients, various attributes such as demographics, symptoms, past health history, and test results have been recorded. Each patient was prescribed either quick relief medications or long term controller drugs. Further, the outcomes, in the form of time to the next asthma attack (after the treatment began), were recorded. The longer this interval, the better the outcome, and the higher the outcome score. We assigned costs to characteristics and treatments based on the inconvenience (physical/emotional/monetary) they caused patients.

Baselines. We compared our framework to the following state-of-the-art treatment recommendation approaches: 1) Outcome Weighted Learning (OWL) [42], 2) Modified Covariate Approach (MCA) [31], and 3) Interpretable and Parsimonious Treatment Regime Learning (IPTL) [39]. OWL addresses the problem of treatment recommendation by formulating it as a weighted classification problem where each subject is weighted proportional to his/her outcome value. MCA generates modified covariates which capture the interactions between the characteristics of subjects and treatments and then uses these modified covariates to fit a model for predicting the outcomes. The treatments of subjects are then determined based on the values of the predicted outcomes. The IPTL framework produces interpretable decision lists which map subject characteristics to treatments such that the resulting outcomes are maximized. IPTL, however, does not account for assessment or treatment costs. While MCA and IPTL minimize the number of characteristics/covariates required for deciding the treatment of any given subject, OWL utilizes all the characteristics available in the data when assigning treatments.

                     Bail Dataset                                              Asthma Dataset
                     Avg.     Avg.         Avg.        Avg. # of   List        Avg.     Avg.         Avg.        Avg. # of   List
                     Outcome  Assess Cost  Treat Cost  Characs.    Len         Outcome  Assess Cost  Treat Cost  Characs.    Len
CITR                 79.2     8.88         31.09       6.38        7           74.38    13.87        11.81       7.23        6
IPTL                 77.6     14.53        35.23       8.57        9           71.88    18.58        11.83       7.87        8
MCA                  73.4     19.03        35.48       12.03       -           70.32    19.53        12.01       10.23       -
OWL (Gaussian)       72.9     28           35.18       13          -           71.02    25           12.38       16          -
OWL (Linear)         71.3     28           34.23       13          -           71.02    25           12.38       16          -
CITR - No Treat      80.5     8.93         34.48       7.57        7           77.39    14.02        12.87       7.38        7
CITR - No Assess     81.3     13.83        32.02       9.86        10          78.32    18.28        12.02       8.97        9
CITR - Outcome       81.7     13.98        34.49       10.38       10          79.37    18.28        12.88       9.21        9
Human                69.37    -            33.39       -           -           68.32    -            12.28       -           -

Table 2: Results for treatment regimes. Our approach: CITR; baselines: IPTL, MCA, OWL; ablations of our approach: CITR - No Treat, CITR - No Assess, CITR - Outcome; Human refers to the setting where judges and doctors assigned treatments.

Experimental Setting. The objective function that we proposed in Eqn. 7 has three parameters: λ1, λ2, and λ3. These parameters could either be specified by an end user or learned using a validation set. We set aside 5% of each of our datasets as a validation set to estimate these parameters. We automatically searched the parameter space to find a set of parameters that produced a decision list with the maximum average outcome on the validation set (discussed in detail later) and satisfied some simple constraints, namely: 1) average assessment cost ≤ 4 on both datasets, and 2) average treatment cost ≤ 30 for the bail data and ≤ 12 for the asthma data. We then used a coordinate ascent strategy to search the parameter space, updating each parameter λj while holding the other two parameters constant. The values of each of these parameters were chosen via a binary search on the interval (0, 1000). We ran the UCT procedure for 50K iterations to generate our decision list. We used both Gaussian and linear kernels for OWL and employed the tuning strategy discussed in Zhao et al. [42]. In the case of IPTL, we set the parameter that limits the number of rules in the treatment regime to 20. We evaluated the performance of our model and the other baselines using 10-fold cross validation.
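For illustration, the parameter search just described could be sketched as below. This is one plausible reading of the tuning loop, not the authors' code: fit_regime, evaluate, and constraints_ok are hypothetical helpers standing in for learning a list on the training split, computing the validation metrics, and checking the cost constraints, and the interval-shrinking step is an assumption about how the binary search over (0, 1000) might be carried out.

```python
def tune_lambdas(fit_regime, evaluate, constraints_ok, lo=1e-3, hi=1000.0, depth=8):
    """Coordinate ascent over (lambda_1, lambda_2, lambda_3): each coordinate is
    refined by repeatedly shrinking a search interval inside (0, 1000) while the
    other two weights are held fixed."""

    def score(lams):
        # Average validation outcome, or -inf if the learned regime violates
        # the assessment/treatment cost constraints.
        avg_outcome, avg_assess, avg_treat = evaluate(fit_regime(lams))
        return avg_outcome if constraints_ok(avg_assess, avg_treat) else float("-inf")

    lams = [1.0, 1.0, 1.0]
    for j in range(3):                      # update one lambda_j at a time
        a, b = lo, hi
        for _ in range(depth):              # shrink the interval each iteration
            m1, m2 = a + (b - a) / 3, b - (b - a) / 3
            left = score(lams[:j] + [m1] + lams[j + 1:])
            right = score(lams[:j] + [m2] + lams[j + 1:])
            a, b = (a, m2) if left >= right else (m1, b)
        lams[j] = (a + b) / 2
    return lams
```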

4.1 Quantitative Evaluation

We analyzed the performance of our approach, CITR (Cost-effective, Interpretable Treatment Regimes), on various aspects such as the outcomes obtained, the costs incurred, and intelligibility. We computed the following metrics:

Avg. Outcome. Recall that a treatment regime assigns a treatment to every subject in the population. We used the prediction model ŷ (defined in Section 3.3) to obtain an outcome score given the characteristics of the subject and the treatment assigned (we used ground truth outcome scores whenever they were available in the data). We computed the average outcome score of all the subjects in the population.

Avg. Assess Cost. We determined the assessment costs incurred by each subject based on which characteristics were used to determine their treatment. We then averaged all such per-subject assessment costs to obtain the average assessment cost.

Avg. # of Characs. We determined the number of characteristics that are used when assigning a treatment to each subject in the population and then computed the average of these numbers.

Avg. Treat Cost. We computed the average of the treatment costs incurred by all the subjects in the population.

List Len. Our approach CITR and the baseline IPTL express treatment regimes as decision lists. In order to compare the complexity of the resulting decision lists, we computed the number of rules in each of these lists.

While higher values of average outcome are preferred, lower values on all of the other metrics are desirable.

Results. Table 2 (top panel) presents the values of the metrics computed for our approach as well as the baselines. It can be seen that the treatment regimes produced by our approach result in better average outcomes with lower costs across both datasets. While IPTL and MCA do not explicitly reduce costs, they do minimize the number of characteristics required for determining the treatment of any given subject. Yet, our approach produces regimes with the lowest values of the average number of characteristics (Avg. # of Characs). Our approach also produces more concise lists with fewer rules compared to the baselines. While the treatment costs of all the baselines are similar, there is some variation in the average assessment costs and the outcomes. IPTL turns out to be the best performing baseline in terms of the average outcome, average assessment cost, and average number of characteristics. The last line of Table 2 shows the average outcomes and the average treatment costs computed empirically on the observational data. Both of our datasets are comprised of decisions made by human experts. It is interesting to note that the regimes learned by algorithmic approaches possibly result in better outcomes compared to the decisions made by human experts on both datasets.

If Gender=F and Current-Charge=Minor and Prev-Offense=None then RP
Else if Prev-Offense=Yes and Prior-Arrest=Yes then RC
Else if Current-Charge=Misdemeanor and Age≤30 then RC
Else if Age≥50 and Prior-Arrest=No then RP
Else if Marital-Status=Single and Pays-Rent=No and Current-Charge=Misd. then RC
Else if Addresses-Past-Yr≥5 then RC
Else RP

Figure 3: Treatment regime for bail data; RP refers to the milder form of treatment, release on personal recognizance, and RC is release on condition, which is comparatively harsher.

4.1.1 Ablation Study

We also analyzed the effect of the various terms of our objective function on the outcomes and the costs incurred. To this end, we experimented with three different ablations of our approach: 1) CITR - No Treat, obtained by excluding the term corresponding to the expected treatment cost in our objective (g3(π) in Eqn. 7); 2) CITR - No Assess, obtained by excluding the expected assessment cost term in our objective (g2(π) in Eqn. 7); and 3) CITR - Outcome, obtained by excluding both the assessment and treatment cost terms from our objective.

Table 2 (second panel) shows the values of the metrics discussed earlier in this section for the various ablations of our model. Naturally, removing the treatment cost term increases the average treatment cost on both datasets. Furthermore, removing the assessment cost term of the objective results in regimes with much higher assessment costs (8.88 vs. 13.83 on bail data; 13.87 vs. 18.28 on asthma data). The length of the list also increases for both datasets when we exclude the assessment cost term. These results demonstrate that each term in our objective function is crucial to producing cost-effective, interpretable regimes.

4.2 Qualitative Analysis

The treatment regimes produced by our approach on the asthma and bail datasets are shown in Figures 1 and 3 respectively.

It can be seen in Figure 1 that the methacholine test, which is more expensive, appears at the end of the regime. This ensures that only a small fraction of the population (8.23%) is burdened by its cost. Furthermore, it turns out that though the spirometry test is slightly expensive compared to patient demographics and symptoms, it would be harder to determine the treatment for a patient without this test. This aligns with research on asthma treatment recommendations [26, 5].

It is interesting to note that the regime not only accounts for test results on spirometry and peak flow but also evaluates whether the patient has a previous history of asthma or respiratory issues. If the test results are positive and the patient has no previous history of asthma or respiratory disorders, then the patient is recommended quick relief drugs. On the other hand, if the test results are positive and the patient suffered previous asthma or respiratory issues, then controller drugs are recommended.

In the case of the bail dataset, the constructed regime is able to achieve good outcomes without even using the most expensive characteristics such as mental illness tests and drug tests. Personal information characteristics, which are slightly more expensive than defendant demographics and prior criminal history, appear only towards the end of the list, and these checks apply only to 21.23% of the population. Note that the regime uses defendants' criminal history as well as personal and demographic information to make treatment recommendations. For instance, females with minor current charges (such as driving offenses) and no prior criminal record are typically released on bail without conditions such as bonds or checking in with the police. On the other hand, defendants who have committed crimes earlier are only granted conditional bail.

5 Conclusions

In this work, we proposed a framework for learning cost-effective, interpretable treatment regimes from observational data. To the best of our knowledge, this is the first solution to the problem at hand that addresses all of the following aspects: 1) maximizing the outcomes, 2) minimizing the treatment costs and the costs associated with gathering the information required to determine the treatment, and 3) expressing regimes using an interpretable model. We modeled the problem of learning a treatment regime as an MDP and employed a variant of UCT which prunes the search space using customized checks. We demonstrated the effectiveness of our framework on real world data from the criminal justice and health care domains.

6 Acknowledgments

H. Lakkaraju is funded by a Robert Bosch Stanford Graduate Fellowship. C. Rudin was partially supported by Adobe. The authors would like to thank Aditya Grover and the anonymous reviewers for their insightful comments and feedback.

References

[1] Eva-Maria Abulesz and Gerasimos Lyberatos. Novel approach for determining optimal treatment regimen for cancer chemotherapy. International Journal of Systems Science, 19(8):1483–1497, 1988.

[2] Rakesh Agrawal, Ramakrishnan Srikant, et al. Fast algorithms for mining association rules.

[3] James O Berger, Xiaojing Wang, and Lei Shen. A Bayesian approach to subgroup identification. Journal of Biopharmaceutical Statistics, 24(1):110–129, 2014.

[4] Jacob Bien and Robert Tibshirani. Classification by set cover: The prototype vector machine. arXiv preprint arXiv:0908.2284, 2009.

[5] Louis-Philippe Boulet, Marie-Ève Boulay, Guylaine Gauthier, Livia Battisti, Valérie Chabot, Marie-France Beauchesne, Denis Villeneuve, and Patricia Côté. Benefits of an asthma education program provided at primary care sites on asthma outcomes. Respiratory Medicine, 109(8):991–1000, 2015.

[6] Cameron B Browne, Edward Powley, Daniel Whitehouse, Simon M Lucas, Peter I Cowling, Philipp Rohlfshagen, Stephen Tavener, Diego Perez, Spyridon Samothrakis, and Simon Colton. A survey of Monte Carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in Games, 4(1):1–43, 2012.

[7] Ailin Fan, Wenbin Lu, Rui Song, et al. Sequential advantage selection for optimal treatment regime. The Annals of Applied Statistics, 10(1):32–53, 2016.

[8] Jared C Foster, Jeremy MG Taylor, and Stephen J Ruberg. Subgroup identification from randomized clinical trial data. Statistics in Medicine, 30(24):2867–2880, 2011.

[9] Kosuke Imai, Marc Ratkovic, et al. Estimating treatment effect heterogeneity in randomized program evaluation. The Annals of Applied Statistics, 7(1):443–470, 2013.

[10] Shihao Ji and Lawrence Carin. Cost-sensitive feature acquisition and classification. Pattern Recognition, 40(5):1474–1485, 2007.

[11] Been Kim, Cynthia Rudin, and Julie A Shah. The Bayesian case model: A generative approach for case-based reasoning and prototype classification. In Advances in Neural Information Processing Systems, pages 1952–1960, 2014.

[12] Levente Kocsis and Csaba Szepesvári. Bandit based Monte-Carlo planning. In European Conference on Machine Learning, pages 282–293. Springer, 2006.

[13] EB Laber and YQ Zhao. Tree-based methods for individualized treatment regimes. Biometrika, 102(3):501–514, 2015.

[14] Eric B Laber, Daniel J Lizotte, Min Qian, William E Pelham, and Susan A Murphy. Dynamic treatment regimes: Technical challenges and applications. Electronic Journal of Statistics, 8(1):1225, 2014.

[15] Himabindu Lakkaraju, Stephen H Bach, and Jure Leskovec. Interpretable decision sets: A joint framework for description and prediction. In KDD, 2016.

[16] Himabindu Lakkaraju and Jure Leskovec. Confusions over time: An interpretable Bayesian model to characterize trends in decision making. In Advances in Neural Information Processing Systems (NIPS), 2016.

[17] Himabindu Lakkaraju, Jure Leskovec, Jon Kleinberg, and Sendhil Mullainathan. A Bayesian framework for modeling human evaluations. In SIAM SDM, 2015.

[18] Benjamin Letham, Cynthia Rudin, Tyler H McCormick, David Madigan, et al. Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model. The Annals of Applied Statistics, 9(3):1350–1371, 2015.

[19] Wei-Yin Loh, Xu He, and Michael Man. A regression tree approach to identifying subgroups with differential treatment effects. Statistics in Medicine, 34(11):1818–1833, 2015.

[20] Susan Lomax and Sunil Vadera. A survey of cost-sensitive decision tree induction algorithms. ACM Computing Surveys (CSUR), 45(2):16, 2013.

[21] Yin Lou, Rich Caruana, and Johannes Gehrke. Intelligible models for classification and regression. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 150–158. ACM, 2012.

[22] Jared K Lunceford and Marie Davidian. Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study. Statistics in Medicine, 23(19):2937–2960, 2004.

[23] Douglas B Marlowe, David S Festinger, Karen L Dugosh, Kathleen M Benasutti, Gloria Fox, and Jason R Croft. Adaptive programming improves outcomes in drug court: an experimental trial. Criminal Justice and Behavior, 39(4):514–532, 2012.

[24] Erica EM Moodie, Bibhas Chakraborty, and Michael S Kramer. Q-learning for estimating optimal dynamic treatment rules from observational data. Canadian Journal of Statistics, 40(4):629–645, 2012.

[25] Erica EM Moodie, Nema Dean, and Yue Ru Sun. Q-learning: Flexible learning about useful utilities. Statistics in Biosciences, 6(2):223–243, 2014.

[26] Jorge Pereira, Priscilla Porto-Figueira, Carina Cavaco, Khushman Taunk, Srikanth Rapole, Rahul Dhakne, Hampapathalu Nagarajaram, and José S Câmara. Breath analysis as a potential and non-invasive frontier in disease diagnosis: an overview. Metabolites, 5(1):3–55, 2015.

[27] Min Qian and Susan A Murphy. Performance guarantees for individualized treatment rules. Annals of Statistics, 39(2):1180, 2011.

[28] James M Robins. Correcting for non-compliance in randomized trials using structural nested mean models. Communications in Statistics – Theory and Methods, 23(8):2379–2412, 1994.

[29] Richard N Shiffman. Representation of clinical practice guidelines in conventional and augmented decision tables. Journal of the American Medical Informatics Association, 4(5):382–393, 1997.

[30] Xiaogang Su, Chih-Ling Tsai, Hansheng Wang, David M Nickerson, and Bogong Li. Subgroup analysis via recursive partitioning. Journal of Machine Learning Research, 10(Feb):141–158, 2009.

[31] Lu Tian, Ash A Alizadeh, Andrew J Gentles, and Robert Tibshirani. A simple method for estimating interactions between a treatment and a large number of covariates. Journal of the American Statistical Association, 109(508):1517–1532, 2014.

[32] Stijn Vansteelandt, Marshall Joffe, et al. Structural nested models and g-estimation: The partially realized promise. Statistical Science, 29(4):707–731, 2014.

[33] Michael P Wallace and Erica EM Moodie. Personalizing medicine: a review of adaptive treatment strategies. Pharmacoepidemiology and Drug Safety, 23(6):580–585, 2014.

[34] Fulton Wang and Cynthia Rudin. Causal falling rule lists. CoRR, abs/1510.05189, 2015.

[35] Fulton Wang and Cynthia Rudin. Falling rule lists. In Proceedings of Artificial Intelligence and Statistics (AISTATS), 2015.

[36] Tong Wang, Cynthia Rudin, Finale Doshi, Yimin Liu, Erica Klampfl, and Perry MacNeille. Bayesian or's of and's for interpretable classification with application to context aware recommender systems. In ICDM, 2016.

[37] Marian R Williams. From bail to jail: The effect of jail capacity on bail decisions. American Journal of Criminal Justice, 41(3):484–497, 2016.

[38] Baqun Zhang, Anastasios A Tsiatis, Eric B Laber, and Marie Davidian. A robust method for estimating optimal treatment regimes. Biometrics, 68(4):1010–1018, 2012.

[39] Yichi Zhang, Eric B Laber, Anastasios Tsiatis, and Marie Davidian. Using decision lists to construct interpretable and parsimonious treatment regimes. Biometrics, 71(4):895–904, 2015.

[40] Yichi Zhang, Eric B Laber, Anastasios Tsiatis, and Marie Davidian. Interpretable dynamic treatment regimes. arXiv preprint arXiv:1606.01472, 2016.

[41] Ying-Qi Zhao, Donglin Zeng, Eric B Laber, Rui Song, Ming Yuan, and Michael Rene Kosorok. Doubly robust learning for estimating individualized treatment with censored data. Biometrika, 102(1):151–168, 2015.

[42] Yingqi Zhao, Donglin Zeng, A John Rush, and Michael R Kosorok. Estimating individualized treatment rules using outcome weighted learning. Journal of the American Statistical Association, 107(499):1106–1118, 2012.

[43] Yufan Zhao, Michael R Kosorok, and Donglin Zeng. Reinforcement learning design for cancer clinical trials. Statistics in Medicine, 28(26):3294–3315, 2009.