Basic Difference-In-Differences Models In

How D-I-D you do that? Basic Difference-in-Differences Models in SAS®

E Margaret Warton, Kaiser Permanente Division of Research, Oakland, CA
Melissa M Parker, Kaiser Permanente Division of Research, Oakland, CA
Andrew J Karter, Kaiser Permanente Division of Research, Oakland, CA

ABSTRACT

Long a mainstay in econometrics research, difference-in-differences (D-I-D) models have only recently become more commonly used in health services and epidemiologic research. D-I-D study designs are quasi-experimental, can be used with retrospective observational data, and do not require exposure randomization. This study design estimates the difference in pre-post changes in an outcome comparing an exposed group to an unexposed (reference) group. The outcome change in the unexposed group estimates the expected change in the exposed group had the group been, counterfactually, unexposed. By subtracting this change from the change in the exposed group (the “difference in differences”), the effects of background secular trends are removed. In the basic D-I-D model, each subject serves as his or her own control, removing confounding by known and unknown individual factors associated with the outcome of interest. Thus, the D-I-D generates a causal estimate of the change in an outcome associated with the initiation of the exposure of interest while controlling for biases due to secular trends and confounding. A basic repeated-measures generalized linear model provides estimates of population-average slopes between two time points for the exposed and unexposed groups and tests whether the slopes differ by including an interaction term between the time and exposure variables. In this paper, we illustrate the concepts behind the basic D-I-D model and present SAS® code for running these models.
We include a brief discussion of more advanced D-I-D methods and present an example of a real-world analysis using data from a study on the impact of introducing a value-based insurance design (VBID) medication plan at Kaiser Permanente Northern California on change in medication adherence.

INTRODUCTION

Difference-in-differences (D-I-D) methods have been used in the field of econometrics for several decades but have only recently become more widely used in the fields of epidemiology and health research. D-I-D analysis is a quasi-experimental design used in the study of longitudinal cohort data with pre- and post-exposure repeated measures. It allows the comparison of changes over time in an outcome between exposed and control groups while accounting for changes in secular trends and controlling for both measured and unmeasured confounding. Because the design forces adherence to time-ordering in exposure and outcome measures, estimates from D-I-D models can be interpreted causally.

Simple D-I-D models can be used effectively when data are available from a longitudinal pre/post cohort design. Either prospective or retrospective data collection is possible, so long as the timing of measurements is known. In this paper, we explain the fundamental D-I-D study design and illustrate a basic analysis using SAS®, specifically the GLM and MIXED procedures that allow accounting for repeated measures.

THE D-I-D DESIGN OVERVIEW

The D-I-D design is conceptually simple: measure the change in an outcome between the pre and post periods for an exposed group and a control group, then subtract one from the other to see the “difference in the differences” between the groups. In other words, the most basic D-I-D study is a “pre-post” design that compares the changes between two groups over two time points.
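
The subtraction described above can be written out as a short formula. This is only a sketch in notation introduced here for illustration (the bar denotes a group mean of the outcome, and the subscripts name the group and period); the numbers in the worked line are hypothetical and are not taken from the study.

```latex
% D-I-D estimate: change in the exposed group minus change in the control group
\[
\text{D-I-D} =
  \left(\bar{Y}_{\text{exposed,post}} - \bar{Y}_{\text{exposed,pre}}\right)
  - \left(\bar{Y}_{\text{control,post}} - \bar{Y}_{\text{control,pre}}\right)
\]

% Hypothetical illustration: exposed mean rises from 70 to 78,
% control mean rises from 72 to 74
\[
(78 - 70) - (74 - 72) = 8 - 2 = 6
\]
```

In this made-up case, 2 of the 8 points of change in the exposed group would be attributed to the background trend and the remaining 6 to the exposure.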
EXPOSED VERSUS UNEXPOSED

To use the D-I-D analytic approach, a longitudinal cohort is divided into at least two groups: subjects exposed and subjects unexposed to the condition or treatment of interest. Outcome measures must be available for members of both groups before and after the time point at which exposure occurs for the exposed group. Although the time points need not be fixed calendar dates, or even the same for each subject, a common calendar date is the simplest way to define the pre-post longitudinal study. For this reason, this study design is very useful for measuring the results of programs, policies or protocols that are implemented at a specific time and are applied to a subgroup within a population.

As with any study design requiring an unexposed comparison group, the identification of an appropriate control group is key. The control group should be as similar as possible to the exposed group, observed over the same period of time, and ideally differing only in the exposure.

PRE AND POST MEASURES

In its simplest form, the only data required for a D-I-D analysis are the exposure flag, the outcome measures identified as pre or post, and an identifier variable for each individual. Where the predicted outcomes should take account of population characteristics (age and sex, for example), these variables can be included in the model and then used to adjust predicted values.

The simplest D-I-D models are used with continuous outcomes, because changes in continuous outcomes are more easily interpreted. D-I-D models can also be used with binary outcomes, although the interpretation is a little more complicated. This paper will focus on continuous outcomes.

CAUSAL INFERENCE

The pre-post design maintains the time-ordering of events, an important aspect of the design that meets the basic requirement of any model that can be interpreted causally.
In addition, in a D-I-D analysis, each subject serves as its own control, so characteristics of a subject that remain the same in both periods cannot be confounders. This is true whether or not those characteristics are measured, so results from the D-I-D model account for both measured and unmeasured confounders. Including the unexposed control group in the model adjusts for underlying temporal trends in the outcome, so differences between changes in the exposed group and the unexposed group represent changes specifically due to the exposure. For these reasons, the results from a D-I-D analysis can be interpreted causally, making it an ideal design for pre-post analyses in observational studies.

EXAMPLE STUDY BACKGROUND

Healthcare costs have been rising rapidly in the United States for many years. In the past, nearly all health care plans provided by Kaiser Permanente of Northern California (KPNC) had no deductible and low co-pays. However, during the past decade, employers and individuals have begun to purchase health insurance with a deductible and higher co-pays to reduce premium costs. To counter increasing out-of-pocket costs, new benefit programs have been introduced to reduce costs for effective treatments that have been shown to improve patient health. The theory is that encouraging use of these preventive treatments will lower the cost of providing care over time. Plans like these that are built into the insurance policy are known as Value-Based Insurance Design (VBID) benefits.

One important healthcare area experiencing rising costs is prescription medications. As medication costs increase, patients may take lower doses than recommended or stop taking a medication altogether. These behaviors are often measured by medication adherence: the percentage of time over a given period during which a patient has adequate medication available, usually based on prescription refill data.
In 2013, KPNC began offering a VBID pharmacy benefit option (VBID Rx) to provide certain prescription medications for free, including drugs to treat high cholesterol, diabetes, and hypertension. Using this potential natural experiment, we wanted to know whether the VBID medication benefit improved adherence to medications for chronic conditions among patients with a deductible plan (Reed, Mary. 2016). However, we did not have a large enough sample of patients who were on a deductible plan and had the VBID medication plan added later. Instead, we identified a cohort of patients on a non-deductible plan in 2013 whose employers switched to a deductible plan with the VBID benefit at the beginning of 2014. Our comparison cohort consisted of patients on a non-deductible plan in 2013 who switched to a deductible plan with no VBID benefit in 2014. We then compared medication adherence among those with and without the VBID benefit in 2013 and 2014. With the D-I-D design, we could identify changes in medication adherence due to the VBID plan while simultaneously removing the influence of the underlying change to a deductible plan and adjusting for confounding, measured or not, at the subject level.

Figure 1 illustrates the study timing, cohort composition and sample sizes.

Figure 1. Cohort Description for Difference-in-Differences Study of VBID Medication Plan Implementation and Medication Adherence

THE D-I-D STUDY DESIGN IN DETAIL

A graphical illustration can be helpful in understanding the D-I-D study design. In Figure 2, A1 and A2 indicate the mean medication adherence at the pre and post time periods, respectively, in the unexposed group. Similarly, B1 and B2 represent the mean pre and post adherence for the exposed group. The change in the unexposed group over time is represented by the difference in height between the pre and post mean outcomes, which are connected by the dashed line.
Because the measurement points are a year apart, the slope of this line represents the annual rate of change and is interpreted as the background secular trend in the outcome over time in a group not affected by the exposure. Similarly, the slope of the solid line indicates the change in medication adherence between the pre and post periods among those who experienced the exposure.
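
The comparison of the two slopes can be sketched in SAS as a repeated-measures model with a time-by-exposure interaction. This is a minimal illustration under stated assumptions, not the authors' code: the data set name `did_long`, the variable names `id`, `exposed`, `post` and `adherence`, and all data values are hypothetical, and the compound-symmetry covariance structure is simply one reasonable choice for two measurements per subject. With `exposed` and `post` coded 0/1, the coefficient on `exposed*post` corresponds to (B2 - B1) - (A2 - A1) in the notation of Figure 2.

```sas
/* Hypothetical toy data in long format: one row per subject per period. */
/* exposed: 1 = exposed group, 0 = unexposed group                        */
/* post:    1 = post period,   0 = pre period                             */
/* All values below are made up for illustration only.                    */
data did_long;
  input id exposed post adherence;
  datalines;
1 0 0 70
1 0 1 66
2 0 0 82
2 0 1 79
3 0 0 75
3 0 1 70
4 1 0 72
4 1 1 78
5 1 0 80
5 1 1 83
6 1 0 68
6 1 1 75
;
run;

/* Repeated-measures linear model.                                        */
/* exposed and post are left out of the CLASS statement so they enter as  */
/* 0/1 indicators; the exposed*post coefficient is the D-I-D estimate.    */
proc mixed data=did_long;
  class id;
  model adherence = exposed post exposed*post / solution cl;
  /* Assumed covariance structure: compound symmetry within subject */
  repeated / subject=id type=cs;
run;
```

In the Solution for Fixed Effects table, the `post` row estimates the pre-post change in the unexposed group (the dashed line), and the `exposed*post` row estimates how much the change in the exposed group differs from it.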
