Thinking Beyond the Mean: a Practical Guide for Using Quantile Regression Methods for Health Services Research

Citation: Lê Cook, Benjamin, and Willard G. Manning. 2013. “Thinking beyond the mean: a practical guide for using quantile regression methods for health services research.” Shanghai Archives of Psychiatry 25 (1): 55-59. doi:10.3969/j.issn.1002-0829.2013.01.011. http://dx.doi.org/10.3969/j.issn.1002-0829.2013.01.011

Citable link: http://nrs.harvard.edu/urn-3:HUL.InstRepos:12406692

Biostatistics in psychiatry (13)

Benjamin Lê COOK (1)*, Willard G. MANNING (2)

(1) Center for Multicultural Mental Health Research, Cambridge Health Alliance/Harvard Medical School, Boston, MA, United States
(2) Harris School of Public Policy Studies, University of Chicago, Chicago, IL, United States
*Correspondence: [email protected]

1. Introduction

Health services and health economics research articles commonly use multivariate regression techniques to measure the relationship of health service utilization and health outcomes (the outcomes of interest) with clinical characteristics, sociodemographic factors, and policy changes (usually treated as explanatory variables). Common regression methods measure differences in outcome variables between populations at the mean (i.e., ordinary least squares regression), or a population average effect (i.e., logistic regression models), after adjustment for other explanatory variables of interest. These analyses often assume that the regression coefficients are constant across the population; in other words, the relationships between the outcomes of interest and the explanatory variables remain the same across different values of the variables. There are times, however, when researchers, policymakers, and clinicians may be interested in group differences across the distribution of a given dependent variable rather than only at the mean.

Taking a more concrete example from the literature, research on individuals’ consumption of alcohol consistently reported that higher alcohol prices were associated with lower alcohol consumption.[1] This led to a call for increases in taxes as a policy lever to reduce alcohol consumption and the subsequent social costs of alcoholism and alcohol abuse. However, these studies did not provide any information about whether increased price decreased alcohol use similarly for light drinkers, moderate drinkers, and heavy drinkers. Because there are positive social benefits for light drinkers and negative health and social consequences for heavy drinkers, analyzing the demand response of different types of drinkers was important to understanding who was most likely to modify their behavior due to increasing alcohol taxes. A subsequent study[2] found that light and heavy drinkers were much less price elastic than moderate drinkers; that is, higher taxes did not reduce consumption nearly as much for light and heavy drinkers as they did for moderate drinkers. The policy implication is that increasing alcohol taxes might bring in revenue (and reduce alcohol-related accidents among moderate drinkers) but will have limited success in reducing the prevalence of heavy drinking and its sequelae.

Another example is that the associations of interest in explaining health care and health outcomes may be very different among the highest utilizers of health care, compared to individuals at the bottom or middle of the distribution of health care utilization. As a simple illustration, Figure 1 plots the relationship between the number of hours attended of a hypothetical psychotherapy intervention (x-axis) and a fictitious scale of post-intervention mental health (y-axis; a higher score indicates better mental health) for a group of 400 individuals. In this example, the regression line from an ordinary least squares (OLS) regression model is essentially flat, suggesting that there is no relationship between number of psychotherapy session-hours and mental health at follow-up. To describe the association between number of session-hours and mental health for individuals with low and high post-treatment scores on the mental health scale using OLS, the analyst extends the line up or down to the 90th and 10th quantiles in a parallel fashion, as the OLS model assumes the association between hours of psychotherapy and mental health outcome remains the same at different levels of the mental health scale.

In contrast, in Figure 2, we use quantile regression to allow slopes of the regression line to vary across quantiles of the mental health scale. Although the median line is flat as before, the 90th quantile prediction line is significantly increasing whereas the 10th quantile prediction line is significantly decreasing. This suggests that the association between the hypothetical intervention and post-intervention mental health is positive for those with better post-intervention mental health but there is a negative association among those with poorer post-intervention mental health. Quantile regression provides greater flexibility than other regression methods to identify differing relationships at different parts of the distribution of the dependent variable.

[Figure 1. Prediction lines at 10th quantile, mean, and 90th quantile using ordinary least squares (OLS) regression. Same slopes: parallel quantiles. x-axis: number of hours of psychotherapy; y-axis: mental health scale score.]

[Figure 2. Prediction lines at 10th quantile, mean, and 90th quantile using quantile regression. Differing slopes: quantiles not parallel. x-axis: number of hours of psychotherapy; y-axis: mental health scale score.]

Health care expenditures are another area important to policy that is amenable to an analytical strategy that measures differences across the distribution. The average user of health care is obviously very different from the heavy user in terms of health status, but what about other factors of policy interest such as race/ethnicity, gender, employment, and insurance status? Quantile regression allows for analysis of these other differences that exist among heavy health care users in a way that is not possible with commonly used regression methods.

In previous applications, we have used quantile regression methods to assess racial and ethnic disparities in health care expenditures and in mental health care expenditures across different quantiles of expenditures, adjusting for covariates.[3] In the United States, disparities in the distribution of health care expenditures between Blacks and Whites, and between Hispanics and Whites, diminish in the upper quantiles of expenditure but remain significant throughout the distribution. This same pattern of persistent disparities was still evident in the highest education and income categories.

2. What is quantile regression?

Quantile regression provides an alternative to ordinary least squares (OLS) regression and related methods, which typically assume that associations between independent and dependent variables are the same at all levels. Quantile regression is not a regression estimated on a quantile, or subsample, of the data, as the name may suggest. Quantile methods allow the analyst to relax the common regression slope assumption. In OLS regression, the goal is to minimize the sum of squared distances between the values predicted by the regression line and the observed values. In contrast, quantile regression differentially weights the absolute distances between the values predicted by the regression line and the observed values, then tries to minimize the weighted distances.[4-6] Referring to Figure 2 above, estimating a 75th quantile regression fits a regression line through the data so that 75 percent of the observations are below the regression line and 25 percent are above. Alternatively, this can be viewed as weighting the distances between the values predicted by the regression line and the observed values below the line (negative residuals) by 0.5, and weighting the distances between the values predicted by the regression line and the observed values above the line (positive residuals) by 1.5. Doing so ensures that minimization occurs when 75 percent of the residuals are negative.

2.1 Describing differences across the distribution of health care expenditures

To demonstrate basic SAS software package code for implementing descriptive statistics and quantile regression, we use a sample of Black and White adults of at least 18 years of age taken from the United States 2009 Medical Expenditure Panel Survey (MEPS). This survey is conducted in a nationally representative sample of

2.2 Example of quantile regression to measure racial and ethnic differences across the distribution of health care expenditures
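The asymmetric weighting described in Section 2 can be sketched in a few lines. This is a minimal illustration in Python (not the SAS code the article refers to), restricted to the simplest case where the "regression line" is a single constant. The function names and the outcome values 1 to 100 are hypothetical and chosen only for illustration. For a quantile level tau, residuals above the fit are weighted by 2 × tau and residuals below it by 2 × (1 − tau), which gives 1.5 and 0.5 at the 75th quantile.

```python
def weighted_abs_loss(c, ys, tau):
    """Weighted sum of absolute residuals (y - c) around a constant fit c.

    Positive residuals (observations above c) get weight 2 * tau;
    negative residuals (observations below c) get weight 2 * (1 - tau).
    """
    total = 0.0
    for y in ys:
        residual = y - c
        if residual >= 0:
            total += 2 * tau * residual
        else:
            total += 2 * (1 - tau) * (-residual)
    return total


def fit_constant_quantile(ys, tau):
    """Return an observed value that minimizes the weighted loss.

    The loss is piecewise linear and convex in c, so a minimizer can
    always be found among the observed values. Ties are possible; min()
    returns the smallest tied value because candidates are sorted.
    """
    return min(sorted(ys), key=lambda c: weighted_abs_loss(c, ys, tau))


# Hypothetical outcome values, for illustration only.
outcomes = list(range(1, 101))

q75 = fit_constant_quantile(outcomes, 0.75)  # 75th quantile fit
q50 = fit_constant_quantile(outcomes, 0.50)  # median fit (equal weights)
share_at_or_below_q75 = sum(y <= q75 for y in outcomes) / len(outcomes)

print(q75, q50, share_at_or_below_q75)
```

With tau = 0.75, roughly three quarters of the hypothetical values fall at or below the fitted constant. A full quantile regression applies the same loss but replaces the constant with a line whose intercept and slopes are chosen to minimize it.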
