Best–Worst Scaling Vs. Discrete Choice Experiments: an Empirical Comparison Using Social Care Data Article (Accepted Version) (Refereed)

Original citation: Potoglou, Dimitris and Burge, Peter and Flynn, Terry and Netten, Ann and Malley, Juliette and Forder, Julien and Brazier, John E. (2011) Best–worst scaling vs. discrete choice experiments: an empirical comparison using social care data. Social science & medicine, 72 (10). pp. 1717-1727. ISSN 0277-9536

DOI: 10.1016/j.socscimed.2011.03.027

© 2011 Elsevier Ltd.

This version available at: http://eprints.lse.ac.uk/42278/
Available in LSE Research Online: April 2012

This document is the author’s final manuscript accepted version of the journal article, incorporating any revisions agreed during the peer review process. Some differences between this version and the published version may remain. You are advised to consult the publisher’s version if you wish to cite from it.

Authors: Dimitris Potoglou1, Peter Burge1, Terry Flynn2, Ann Netten3, Juliette Malley3, Julien Forder3, John E Brazier4

Author affiliations:
1 RAND Europe, Cambridge
2 Centre for the Study of Choice, University of Technology Sydney
3 Personal Social Services Research Unit, University of Kent
4 Health Economics and Decision Science, School of Health and Related Research, University of Sheffield

Corresponding author: Dimitris Potoglou

Accepted for publication by Social Science and Medicine

Key messages

• This study illustrates key issues that are important in choosing between profile-case best-worst scaling and discrete choice experiment studies
• Empirical research on the value of outcomes of social care reveals similar patterns in the preference weights obtained from the two approaches
• In the majority of cases examined, preference weights are not significantly different once the weights have been appropriately normalised/rescaled

Abstract

This paper presents empirical findings from the comparison between two principal preference elicitation techniques: discrete choice experiments and profile-based best-worst scaling. Best-worst scaling involves less cognitive burden for respondents and provides more information than traditional "pick-one" tasks asked in discrete choice experiments. However, there is a lack of empirical evidence on how best-worst scaling compares to discrete choice experiments. This empirical comparison between discrete choice experiments and best-worst scaling was undertaken as part of the Outcomes of Social Care for Adults project, which aims to develop a weighted measure of social care outcomes. The findings show that preference weights from best-worst scaling and discrete choice experiments reveal similar patterns in preferences and that, in the majority of cases, preference weights, when normalised/rescaled, are not significantly different.
Keywords: UK; Best-worst scaling; discrete choice experiments; stated choice; discrete choice models; social care; social care outcomes; quality of life

Introduction

Priority-setting in many areas of public policy is informed through the use of public preferences. Within the ‘non-welfarist’ or ‘extra-welfarist’ paradigm, public preferences are elicited in nationally representative valuation exercises, typically using the standard gamble (SG) or time trade-off (TTO) (Brazier, Ratcliffe, Salomon, & Tsuchiya, 2007). These tools require respondents to manipulate probabilities or lengths of life and so rely on an assumption of cardinality in responses. Theoretical and empirical problems with these methods (Bleichrodt, 2002) have led to interest in tasks that require only ordinality in responses, such as discrete choice experiments (DCEs) and ranking studies. DCEs have been used extensively to facilitate analyses in the fields of transport and environmental policy. However, they can also be used to value different instruments, and work is underway to do so for, at least, the EQ-5D-5L measure of health outcomes (as a supplement to a TTO valuation) and the ICECAP capability indices.

Best-worst scaling (BWS) is an alternative preference elicitation method that also requires only an assumption of ordinality. It was developed by Louviere and Woodworth (1990), and its first application, illustrating Case 1 (the ‘object’ case), was published by Finn and Louviere (1992). The method gained popularity in health and social care when the properties of Case 2 (the ‘profile’ – previously called ‘attribute’ – case) were proved (Marley, Flynn, & Louviere, 2008) and a guide to its use was published (Flynn, Louviere, Peters, & Coast, 2007). Flynn (2010a) provides an overview and theoretical discussion of the different cases of BWS.
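To make the Case 2 task concrete, the following is a minimal Python sketch of one commonly used formulation, the maxdiff model, in which the probability of choosing a given (best, worst) pair of attribute levels within a single profile is proportional to the exponentiated difference in their utilities. It is offered as an illustration only and is not necessarily the specification estimated in this study; the function name `best_worst_probabilities` and the utility values are hypothetical, and the three domain names are simply borrowed from the ASCOT domains for readability.

```python
import math
from itertools import permutations


def best_worst_probabilities(utilities):
    """Maxdiff probabilities for every ordered (best, worst) pair in one profile.

    `utilities` maps each attribute level shown in the profile to an
    assumed utility value (hypothetical numbers, for illustration only).
    """
    # All ordered pairs of distinct attribute levels: (best, worst)
    pairs = list(permutations(utilities, 2))
    # Weight of a pair is exp(utility of best minus utility of worst)
    weights = {(b, w): math.exp(utilities[b] - utilities[w]) for b, w in pairs}
    total = sum(weights.values())
    return {pair: weight / total for pair, weight in weights.items()}


# Hypothetical utilities for the attribute levels shown in a single profile
profile = {"food and drink": 1.2, "safety": 0.4, "occupation": -0.6}
probs = best_worst_probabilities(profile)

# The pair with the largest utility difference is the most likely response
most_likely = max(probs, key=probs.get)
print(most_likely, round(probs[most_likely], 3))
```

With three attribute levels there are six possible best-worst pairs in the profile, against a single pick among the alternatives in a traditional DCE choice set, which is one way of seeing why a best-worst response carries more information per task.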
Case 2 has particular advantages in valuation studies that seek to elicit general population preferences for important attributes of quality of life (or whatever maximand is of relevance to policymakers). In particular, it presents profiles one at a time, rather than in choice sets of size two or more as in a traditional DCE. This is important when respondents do not have experience of making choices in the particular area of application: keeping two or more profiles in mind at once is likely to be a harder task, leading to an increase in the size of the random utility component and a reduction in the statistical efficiency of the preference elicitation.

This paper reports on the empirical comparison between the discrete choice and profile-case BWS experiments using data from a pilot study seeking to elicit values for different dimensions of social care related quality of life. The specific objective is to determine the extent to which valuations of quality of life states obtained through a best-worst scaling experiment are comparable to those obtained through discrete choice experiments. To our knowledge, this paper is the first to empirically test the comparability of the profile-case BWS and DCE estimates.

The following section provides a brief background to the general research framework. The methods employed in this study, including the design of the discrete choice and best-worst scaling experiments, data collection and econometric analysis, are then presented. Finally, model estimations from the discrete choice and best-worst experiments and the comparison of values between DCE and BWS are discussed.
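The remark above about the random utility component, and the need to normalise weights before comparing them, can be illustrated with a short Python sketch. In a multinomial logit model the estimated coefficients are confounded with a scale factor that is inversely related to the standard deviation of the random component, so a noisier task yields flatter choice probabilities and smaller raw coefficients. The utilities, the two scale values and the min-max `rescale` helper below are assumptions made for illustration; they are not the data or the normalisation procedure used in this study.

```python
import math


def logit_probabilities(utilities, scale):
    """Multinomial logit choice probabilities with scale factor `scale`."""
    exps = [math.exp(scale * v) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def rescale(weights):
    """Map weights linearly so that the lowest is 0 and the highest is 1."""
    lo, hi = min(weights), max(weights)
    return [(w - lo) / (hi - lo) for w in weights]


# Hypothetical underlying utilities of three alternatives
true_utilities = [0.0, 0.5, 2.0]

# Hypothetical scale factors: a larger random component means a smaller scale
scale_low_noise = 1.5
scale_high_noise = 0.6

p_sharp = logit_probabilities(true_utilities, scale_low_noise)
p_flat = logit_probabilities(true_utilities, scale_high_noise)

# Each task identifies only scale * utility, so raw weights differ in size
weights_a = [scale_low_noise * v for v in true_utilities]
weights_b = [scale_high_noise * v for v in true_utilities]

print([round(p, 3) for p in p_sharp])
print([round(p, 3) for p in p_flat])
print(rescale(weights_a), rescale(weights_b))
```

In this stylised case the two sets of raw weights differ by a constant factor, and placing them on a common 0 to 1 range removes that difference. Real estimates also carry sampling error, which is why a formal statistical comparison is still needed after rescaling.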
Background to ASCOT measure

This research is part of the Outcomes of Social Care for Adults (OSCA) project (Netten, Malley, Forder, Burge, Potoglou, Brazier et al., 2009), which is building on work that has been undertaken on social-care outcome measurement over a number of years, including the Individual Budget pilot evaluation (Glendinning, Challis, Fernández, Jacobs, Jones, Manthorpe et al., 2008). The measure being developed is part of the Adult Social Care Outcome Toolkit (ASCOT) (see Netten et al., 2009 for full details). The toolkit is being developed as part of the Measuring Outcomes for Public Service Users (MOPSU) project, which was led by the Office for National Statistics (ONS) in the UK (ONS, 2010). The work on adult social care focuses primarily on outcomes for residents of care homes (Netten, Beadle-Brown, Trukeschitz, Towers, Welch, Forder et al., 2010) and low-level interventions, that is, low-cost services usually targeted at people with low-level needs, such as many day centres for older people (Caiels, 2010).

The ASCOT measure is designed to capture information about an individual’s social care-related quality of life (SCRQOL). The aim is for the measure to be applicable across as wide a range of user groups and care and support settings as possible. In identifying and defining the domains, the aim was to ensure that the measure is sensitive to outcomes of social care activities. Evidence from consultation with service users, experts and policy-makers, as well as focus group work and interviews with service users, indicated that the measure captures aspects of SCRQOL that are valued by service users (and policy-makers) (Bamford, Qureshi, Nicholas, & Vernon, 1999; Malley, Sandhu, & Netten, 2006; Miller, Cooper, Cook, & Petch, 2008; Netten, McDaid, Fernández, Forder, Knapp, Matosevic et al., 2005; Netten, Ryan, Smith, Skatun, Healey, Knapp et al., 2002; Qureshi, Patmore, Nicholas, & Bamford, 1998).
Methods

Social care domains and levels

Evidence from previous analyses (Bamford et al., 1999; Malley et al., 2006; Miller et al., 2008; Netten et al., 2005; Netten et al., 2002; Qureshi et al., 1998), conceptual development and the results of the consultation with stakeholders, service users and carers (Netten et al., 2009) fed into the selection of domains and their levels. Nine domains were finally selected to describe social care-related quality-of-life situations: food and drink, personal cleanliness, accommodation, safety, social participation, occupation,
