<<

The impact of cognitive biases on information searching and decision making

Annie Ying Shan LAU

Thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy

Centre for Health Informatics

University of New South Wales

December 2006

I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or hereafter known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350-word abstract of my thesis in Dissertation Abstracts International. I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation.

Annie Ying Shan Lau

November 2007

I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.

Annie Ying Shan Lau

December 2006

Dedicated to my parents

Abstract

This research is possibly the first study investigating the impact of cognitive biases on information searching and decision making. Set in the context of making health-related decisions, this research tests the hypotheses that (i) people experience cognitive biases during information searching; (ii) cognitive biases can be corrected during information searching; and (iii) correcting for biases during information searching improves decision making. Using a retrospective data analysis, a Bayesian model and a series of prospective empirical experiments, this research investigates four cognitive biases: the anchoring effect, order effects, the exposure effect and the reinforcement effect.

People may experience anchoring effect, exposure effect and order effects while searching for information. A person’s prior belief (anchoring effect) has a significant impact on decision outcome (P < 0.001). Documents accessed at different positions in a search journey (order effects) and documents processed for different lengths of time (exposure effect) have different degrees of influence on decision making (order: P = 0.026; exposure: P = 0.0081).

To remedy the impact of cognitive biases, a series of interventions were designed and trialled to test their ability to modify the impact of biases during search. A search engine interface was modified to allow for a document-level intervention, which attempts to debias order effects, the exposure effect and the reinforcement effect; a decision-focussed intervention for the anchoring effect; and an education-based intervention to inform users about the biases investigated in this research.

Evaluation of these alterations to the search interface showed that some of the interventions can reduce or exacerbate cognitive biases during search. Order effects are no longer apparent amongst subjects using a “keep document tool” (i.e. the order debiasing intervention) (P = 0.34); however, the intervention is not associated with any significant improvement in decision accuracy (P = 0.23). Although the anchoring effect remains robust amongst subjects using a “for/against document tool” (i.e. the anchor debiasing intervention) (P < 0.001), the intervention is marginally associated with a 10.3% increase in the proportion of subjects who answered incorrectly pre-search and correctly post-search (P = 0.10). Overall, this research concludes with evidence that using a debiasing intervention can alter search behaviour and influence the accuracy of and confidence in decision making.

Acknowledgement

This thesis would not have been produced without the invaluable opportunity given by Professor Enrico Coiera and Professor Nigel Lovell. I am indebted to Enrico for guiding me into the world of research with great wisdom, support and inspiration. His continuous pursuit of innovation and excellence and his respect for every individual are qualities that inspire me deeply in my training as a researcher. His insight into the core of a problem and his ingenious approaches to research questions are attributes that I believe every researcher should aspire to develop.

Special thanks go to current and past members of the decision support team at the Centre for Health Informatics for their technical assistance and their intellectual stimulation in the research (in alphabetical order): Dr Farah Magrabi, Mr Ken Nguyen, Dr Victor Vickland and Mr Martin Walter. Also, special thanks go to Professor Johanna Westbrook for providing the data that enabled this research to commence and for her assistance in research matters over the years. In addition, special thanks go to Dr David Thomas and Dr Isle Blignault for their assistance in the study recruitment and the case scenario design, and the hundreds of participants who took part in the study.

Thanks go to the following people who have given me assistance during my candidature, from statistical matters, study design, usability and pilot studies to thesis draft revisions (in alphabetical order): Associate Professor Deborah Black, Miss Michelle Brear, Dr Grace Chung, Miss Nerida Creswick, Dr Sally Galbraith, Mr Andrew Georgiou, Dr Bob Jansen, Dr Geoff McDonnell, Dr Conrad Newton, Dr Marilyn Rob, Ms Sam Sheridan, Dr Vitali Sintchenko and Ms Margaret Williamson. I also want to thank Dr Yusuf Pisan for encouraging and facilitating my pursuit of a postgraduate research degree, and my colleagues at the Centre for Health Informatics for their friendship and encouragement throughout the years. Acknowledgement goes to the Australian Research Council for its financial support of this research.

Finally, I want to thank my parents, my siblings and Trevor for being my pillar of strength, and for walking this entire journey with me with unconditional love, patience and support.

Contents

PART I: INTRODUCTION

1 BACKGROUND INFORMATION
1.1 PROBLEM STATEMENT
1.2 RESEARCH GAP
1.3 AIM

2 GUIDE TO THESIS

PART II: WHAT DO WE UNDERSTAND ABOUT INFORMATION SEARCHING BEHAVIOUR?

3 EVIDENCE JOURNEY: DATA EXPLORATION
3.1 INTRODUCTION
3.2 DATA DESCRIPTION
3.3 DATA EXPLORATION
3.4 DISCUSSION
3.5 CONCLUSION

PART III: DO COGNITIVE BIASES INFLUENCE INFORMATION SEARCHING AND DECISION MAKING?

4 COGNITIVE BIASES: A LITERATURE REVIEW
4.1 INTRODUCTION
4.2 COGNITIVE BIASES
4.3 STUDIES OF COGNITIVE BIASES IN DECISION MAKING
4.4 HEURISTICS AND BIASES IN THE CONTEXT OF INFORMATION SEARCHING
4.5 CONCLUSION

5 DO COGNITIVE BIASES OCCUR IN SEARCH BEHAVIOURS? A PRELIMINARY ANALYSIS
5.1 METHOD
5.2 RESULTS
5.3 DISCUSSION
5.4 CONCLUSION

6 HYPOTHESES
6.1 COGNITIVE BIASES DURING INFORMATION SEARCHING
6.2 IMPACT ON INFORMATION SEARCHING AND DECISION MAKING
6.3 SUMMARY

7 MODELLING THE IMPACT OF BIASES ON SEARCH DECISION OUTCOMES: BAYESIAN MODEL
7.1 INTRODUCTION
7.2 METHOD
7.3 RESULTS
7.4 DISCUSSION
7.5 CONCLUSION

PART IV: CAN WE OPTIMISE INFORMATION SEARCHING TO MAKE BETTER DECISIONS?

8 DEBIASING STRATEGIES: A LITERATURE REVIEW
8.1 OVERVIEW OF DEBIASING STRATEGIES
8.2 DEBIASING STRATEGIES (WITHOUT USING INFORMATION TECHNOLOGY)
8.3 DEBIASING INTERVENTIONS (USING INFORMATION TECHNOLOGY)
8.4 DISCUSSION
8.5 CONCLUSION

9 SEARCH USER INTERFACE: A LITERATURE REVIEW
9.1 SELECTING INFORMATION
9.2 PROCESSING INFORMATION
9.3 INTEGRATING INFORMATION
9.4 DISCUSSION
9.5 CONCLUSION

10 DESIGN OF DEBIASING INTERVENTIONS
10.1 DESIGN GOALS AND ASSUMPTIONS
10.2 METHODS
10.3 KEEP DOCUMENT TOOL
10.4 FOR/AGAINST DOCUMENT TOOL
10.5 EDUCATION-BASED INTERVENTION
10.6 CONCLUSION

PART V: WHAT IS THE IMPACT OF INFORMATION SEARCHING ON DECISION MAKING?

11 STUDY DESIGN
11.1 INTRODUCTION
11.2 STUDY DESIGN
11.3 CONCLUSION

12 IMPACT OF INFORMATION SEARCHING
12.1 DATA EXCLUSION CRITERIA
12.2 METHODS
12.3 COMPATIBILITY BETWEEN PHASES 1 AND 2
12.4 RESULTS
12.5 DISCUSSION
12.6 CONCLUSION

13 IMPACT OF DEBIASING: BIAS BEHAVIOUR
13.1 DATA EXCLUSION CRITERIA
13.2 METHOD
13.3 RESULTS FOR DEBIASING ORDER EFFECTS
13.4 RESULTS ON DEBIASING THE ANCHORING EFFECT
13.5 RESULTS ON EXPOSURE EFFECT
13.6 RESULTS ON REINFORCEMENT EFFECT
13.7 DISCUSSION
13.8 CONCLUSION

14 IMPACT OF DEBIASING: DECISION MAKING
14.1 METHOD
14.2 ABILITY TO ANSWER QUESTIONS BEFORE SEARCHING
14.3 RESULTS (BASELINE VERSUS ANCHOR DEBIASING INTERVENTION)
14.4 RESULTS (BASELINE VERSUS ORDER DEBIASING INTERVENTION)
14.5 RESULTS (USER PREFERENCE)
14.6 DISCUSSION
14.7 CONCLUSION

15 IMPACT OF INFORMATION SEARCHING ON DECISION MAKING: CONCLUSION
15.1 ORDER EFFECTS
15.2 ANCHORING EFFECT
15.3 EXPOSURE EFFECT AND REINFORCEMENT EFFECT
15.4 OTHER FINDINGS
15.5 RESEARCH SUMMARY
15.6 FUTURE RESEARCH
15.7 CONCLUSION

REFERENCES

APPENDICES
APPENDIX A: SEQUENTIAL, NON-SEQUENTIAL AND ODDS FORM OF THE BAYES’ THEOREM
APPENDIX B: RECRUITMENT ANNOUNCEMENTS
APPENDIX C: PARTICIPANT INFORMATION STATEMENT
APPENDIX D: TEN-PAGE TUTORIAL
APPENDIX E: LOCATION OF EVIDENCE FOR EACH CASE SCENARIO QUESTION

Figures

Figure 3.1 Data excluded from retrospective analysis
Figure 3.2 Frequency of subjects answering a question correctly or incorrectly for each document accessed by at least five subjects
Figure 3.3 Variation in influence of accessed documents in obtaining a correct post-search answer, as measured by likelihood ratio
Figure 5.1 Example of a serial position curve
Figure 5.2 Relationship between pre-search answer and post-search correctness
Figure 5.3 Frequency of subjects with different confidence levels in the pre-search answer
Figure 5.4 Relationship between confidence in pre-search answer and the pre-search answer retention rate (separate graphs for correct and incorrect post-search answers)
Figure 5.5 Serial anchor curve: relationship between confidence in pre-search answer and retention of pre-search answer after searching (combined graph for correct and incorrect post-search answers)
Figure 5.6 Frequency of subjects accessing documents at different positions
Figure 5.7 Relationship between document access position and concurrence rate between post-search answer and document-suggested answer (separate graphs for correct and incorrect document)
Figure 5.8 Relationship between document access position and concurrence rate between post-search answer and document-suggested answer (correct and incorrect documents combined)
Figure 5.9 Serial position curve: normalised view of the relationship between document access position and concurrence rate between post-search answer and document-suggested answer
Figure 5.10 Frequency of subjects being exposed to documents at different levels
Figure 5.11 Relationship between document exposure level and concurrence between post-search answer and document-suggested answer (separate graphs for correct and incorrect document)
Figure 5.12 Relationship between document exposure level and concurrence between post-search answer and document-suggested answer (correct and incorrect documents combined)
Figure 5.13 Serial exposure curve: normalised view of the relationship between document exposure level and concurrence between post-search answer and document-suggested answer
Figure 5.14 Frequency of subjects accessing documents at different number of repetitions
Figure 5.15 Relationship between document access frequency and concurrence between post-search answer and document-suggested answer (separate graphs for correct and incorrect document)
Figure 5.16 Relationship between document access frequency and concurrence between post-search answer and document-suggested answer (correct and incorrect documents combined)
Figure 5.17 Serial reinforcement curve: normalised view of the relationship between document access frequency and concurrence between post-search answer and document-suggested answer
Figure 6.1 Anchoring effect: the null hypothesis (H0) and the alternative hypothesis (H1)
Figure 6.2 Confidence in anchoring effect: the null hypothesis (H0) and the alternative hypothesis (H1)
Figure 6.3 Order effects: the null hypothesis (H0) and the alternative hypothesis (H1)
Figure 6.4 Exposure effect: the null hypothesis (H0) and the alternative hypothesis (H1)
Figure 6.5 Reinforcement effect: the null hypothesis (H0) and the alternative hypothesis (H1)
Figure 8.1 Information in independent format (Klayman & Brown, 1993)
Figure 8.2 Information in contrastive format (Klayman & Brown, 1993)
Figure 8.3 Information expressed in graphical representation (Roy & Lerch, 1996)
Figure 8.4 Problem expressed in causal order (Roy & Lerch, 1996)
Figure 8.5 Problem expressed in non-causal order (Roy & Lerch, 1996)
Figure 8.6 Information distracter screens (Chandra & Krovi, 1999)
Figure 8.7 Baseball team ranking task supplied with two pieces of information on the right (Marett & Adams, 2006)
Figure 8.8 Example of information allocated to confirming, neutral or disconfirming to a decision (Arnott, 2006)
Figure 9.1 Enhanced thumbnails (Woodruff et al., 2002)
Figure 9.2 TileBars (Hearst, 1995)
Figure 9.3 InfoCrystal (Spoerri, 1993)
Figure 9.4 Scatter and Gather (Cutting et al., 1992)
Figure 9.5 Envision (Heath et al., 1995)
Figure 9.6 Perspective wall (Mackinlay et al., 1991)
Figure 9.7 SketchTrieve (Hendry & Harper, 1997)
Figure 9.8 PadPrints (Hightower et al., 1998)
Figure 9.9 InfoGrid (Rao et al., 1992)
Figure 9.10 Data Mountain (Robertson et al., 1998)
Figure 9.11 DLITE (Cousins, 1997)
Figure 9.12 Cat-a-cone (Hearst & Karadi, 1997)
Figure 10.1 The original Search page of Quick Clinical, an online evidence retrieval system developed by Centre for Health Informatics, University of New South Wales (Coiera et al., 2005)
Figure 10.2 The original Results page of Quick Clinical
Figure 10.3 The modified Results page of Quick Clinical – users simply click on any of the displayed thumbnails to access a document
Figure 10.4 Keep document tool – a tool for users to write and keep notes with each document
Figure 10.5 Keep document tool – enlargement of the Keep notes tool
Figure 10.6 Keep document tool – document and its notes automatically move to another section of the screen once user selects ‘Keep’ document
Figure 10.7 Keep document tool – collected documents and their notes are accessible across multiple searches
Figure 10.8 Keep document tool – documents and their notes are rearranged according to the document rearrangement strategy that minimises the impact of the targeted cognitive bias. Users review this newly arranged set of documents before making a decision
Figure 10.9 For/against document tool – a tool for users to write and keep notes with each document, as well as to classify the utility of the document
Figure 10.10 For/against document tool – enlargement of the For/Against document tool
Figure 10.11 For/against document tool – document and its notes automatically move to relevant section of the screen once user classifies the utility of the document
Figure 10.12 For/against document tool – classified documents and their notes are accessible across multiple searches. Users can reclassify a document during the evidence journey
Figure 10.13 For/against document tool – documents and notes are presented to the user for review before making a decision
Figure 10.14 Education-based intervention
Figure 11.1 Subject information sheet about study
Figure 11.2 Consent form of study: subjects can press “I Agree” to proceed with the study or “I Disagree” to discontinue
Figure 11.3 First page of the online tutorial
Figure 11.4 Pre-study questionnaire for collecting subjects’ demographics
Figure 11.5 Login screen into experiment
Figure 11.6 Subjects answer case scenario question before searching
Figure 11.7 Search interface: subjects enter keywords and press “Go” to conduct a search, or press “Finish Searching” to proceed to answer the case scenario question again
Figure 11.8 Retrieving documents from different information sources
Figure 11.9 Subjects answer case scenario question after searching
Figure 11.10 After answering post-search, subjects are presented with other subjects’ post-search answers and have the opportunity to answer the question again
Figure 11.11 At the completion of six case scenarios, subjects complete the post-study questionnaire
Figure 12.1 Data exclusion in Phases 1 and 2 of the study
Figure 13.1 Data exclusion criteria for comparing the effectiveness of baseline search interface and debiasing intervention
Figure 13.2 Order effects: the null hypothesis (H0) and the alternative hypothesis (H1)
Figure 13.3 Relationship between document access position and concurrence rate between post-search answer and document-suggested answer (baseline vs. order debiasing intervention)
Figure 13.4 Anchoring effect: the null hypothesis (H0) and the alternative hypothesis (H1)
Figure 13.5 Confidence in anchoring effect: the null hypothesis (H0) and the alternative hypothesis (H1)
Figure 13.6 Relationship between pre-search answer and post-search correctness (baseline vs. anchor debiasing intervention)
Figure 13.7 Relationship between confidence in pre-search answer and pre-search answer retention rate
Figure 13.8 Exposure effect: the null hypothesis (H0) and the alternative hypothesis (H1)
Figure 13.9 Relationship between document exposure level and concurrence rate between post-search answer and document-suggested answer (using baseline search interface)
Figure 13.10 Reinforcement effect: the null hypothesis (H0) and the alternative hypothesis (H1)
Figure 13.11 Relationship between document access frequency and concurrence rate between post-search answer and document-suggested answer (using baseline search interface)
Figure 15.1 Relationship between document access position and rate of concurrence between post-search answer and document-suggested answer (using baseline search interface) (Chapter 13, Figure 13.3)
Figure 15.2 Relationship between document access position and rate of concurrence between post-search answer and document-suggested answer (using order debias search interface) (Chapter 13, Figure 13.3)
Figure 15.3 Relationship between pre-search answer and post-search correctness (using baseline search interface) (Chapter 13, Figure 13.6)
Figure 15.4 Relationship between confidence in pre-search answer and pre-search answer retention rate (using baseline search interface) (Chapter 13, Figure 13.7)
Figure 15.5 Relationship between pre-search answer and post-search correctness (using anchor debiasing intervention) (Chapter 13, Figure 13.6)
Figure 15.6 Relationship between confidence in pre-search answer and pre-search answer retention rate (using anchor debiasing intervention) (Chapter 13, Figure 13.7)
Figure 15.7 Relationship between document exposure level and concurrence rate between post-search answer and document-suggested answer (using baseline search interface) (Chapter 13, Figure 13.9)
Figure 15.8 Relationship between document access frequency and concurrence rate between post-search answer and document-suggested answer (using baseline search interface) (Chapter 13, Figure 13.11)

Tables

Table 3.1 Clinical questions in the scenarios presented to subjects (Westbrook et al., 2005)
Table 3.2 Total number of subjects, accessed documents, search sessions and searches across the six scenarios (before and after data exclusion)
Table 3.3 Changes in answers pre- and post- search after data exclusion (n=400) (after Westbrook et al., 2005)
Table 3.4 Average number of accessed documents, searches and time taken in a search session (after data exclusion)
Table 3.5 Access frequency in IVF scenario for document “Cancer incidence in a cohort of infertile women who underwent in vitro fertilization”
Table 3.6 Symbols used in graphically depicting evidence journey
Table 5.1 Relationship between pre-search answer and post-search answer
Table 5.2 Pre-search answer and confidence with percentages of post-search answers
Table 5.3 Relationship between confidence in pre-search answer and retention of pre-search answer after searching
Table 5.4 Correct/incorrect document accessed at different positions and the corresponding percentage of a correct/incorrect post-search answer
Table 5.5 Relationship between document access position and concurrence between post-search answer and document-suggested answer
Table 5.6 Concurrence for documents exposed for correct/incorrect document accessed with different levels of exposure and the corresponding percentage of a correct/incorrect post-search answer
Table 5.7 Relationship between document exposure level and concurrence between post-search answer and document-suggested answer
Table 5.8 Documents accessed with different number of repetition and the corresponding percentage of a correct/incorrect post-search answer
Table 5.9 Relationship between document access frequency and concurrence between post-search answer and document-suggested answer
Table 7.1 Population prior (prevalence of a correct answer amongst all subjects) for each scenario
Table 7.2 Personal prior belief calculated from pre-search answer and confidence used to calculate
Table 7.3 Personal prior calculated based on willingness to switch belief using data from all six scenarios (n=400 search sessions)
Table 7.4 Personal prior* for each scenario calculated based on willingness to switch belief (n=400 search sessions)
Table 7.5 Prediction accuracy (%) for each version of Bayesian model with different decision biases and each form of prior belief (with 95% confidence interval)
Table 8.1 Summary of reviewed debiasing interventions
Table 9.1 Examples of search user interfaces that assist users to select documents
Table 9.2 Examples of sensemaking tools (i.e. tools that allow users to interact and re-represent existing data to construct new information patterns)
Table 9.3 Examples of search workspaces (i.e. tools for users to organise and maintain search results)
Table 10.1 Summary of intervention types
Table 11.1 Case scenarios presented to subjects
Table 11.2 Percentage of correct answers before and after searching in Phase 1 of the study
Table 12.1 Characteristics of subjects
Table 12.2 Subjects’ ability to answer questions pre-search
Table 12.3 Allocation of case scenarios to interventions
Table 12.4 Number of search sessions, searches and accessed documents in each phase of the study (before and after data exclusion)
Table 12.5 Average time taken, no. of searches conducted and no. of documents accessed to answer each scenario question in Phases 1 & 2 of the study (after data exclusion)
Table 12.6 Percentage of correct answers before searching, after searching and after knowing other subjects’ answers for each case scenario of the study
Table 12.7 Changes in answer before and after system use (n=928) (after Westbrook et al., 2005)
Table 12.8 Confidence in post-search answer for subjects who did not know scenario answer before searching (n=147) (after Westbrook et al., 2005a)
Table 12.9 Changes in confidence in original answer following online searches (n=905) (after Westbrook et al., 2005a)
Table 12.10 Comparison of confidence between pre- and post-search right and wrong answers
Table 12.11 Percentage of subjects who change post-search answer after knowing other subjects’ post-search answers
Table 13.1 Relationship between document access position and concurrence between post-search answer and document-suggested answer (baseline vs. order debiasing intervention)
Table 13.2 Relationship between pre-search answer and post-search answer (baseline vs. anchor debiasing intervention)
Table 13.3 Relationship between confidence in pre-search answer and retention of pre-search answer after searching (baseline vs. anchor debiasing intervention)
Table 13.4 Comparative pre-search answer retention rates by search interface at different pre-search confidence levels (baseline vs. anchor debiasing intervention)
Table 13.5 Relationship between document exposure level and concurrence between post-search answer and document-suggested answer (using baseline search interface)
Table 13.6 Relationship between document access frequency and concurrence between post-search answer and document-suggested answer (using baseline search interface)
Table 14.1 Comparison of post-search correctness in baseline search interface and anchor debiasing intervention
Table 14.2 Comparison in proportions of RR, RW, WW and WR responses between baseline search interface and anchor debiasing intervention
Table 14.3 Confidence in post-search answers amongst subjects who did not know answer before searching (after Westbrook et al., 2005b)
Table 14.4 Self-reported changes in confidence pre-/post-search (after Westbrook et al., 2005b)
Table 14.5 Comparison of confidence in right and wrong post-search answers between baseline and anchor debiasing intervention
Table 14.6 Comparison of search sessions using baseline and anchor debiasing intervention
Table 14.7 Comparison of post-search correctness in baseline search interface and order debiasing intervention
Table 14.8 Comparison in proportions of RR, RW, WW and WR responses between baseline search interface and order debiasing intervention
Table 14.9 Confidence in post-search answers amongst subjects who did not know answer before searching (after Westbrook et al., 2005b)
Table 14.10 Self-reported changes in confidence pre-/post-search (after Westbrook et al., 2005b)
Table 14.11 Comparison of confidence in right and wrong post-search answers between baseline and order debiasing intervention
Table 14.12 Comparison of search sessions using baseline and order debiasing intervention
Table 14.13 System preferences reported by subjects (n=183)



Part I: Introduction

1 Background information

1.1 Problem statement

Increasingly, information searching plays an important part in healthcare consumers’ decision making (Eysenbach, 2001) and clinicians’ practice of evidence-based medicine (Hersh, 1996). Decisions are improved by better access to relevant information, and searching for documents on the Web is increasingly an important source of that information (Morrison et al., 2001). While much research focuses on the design of retrieval methods that identify potentially relevant documents, there has been little examination of the way retrieved documents then shape real-life decision making (Hersh, 2005). Decision making research has for a long time identified that people experience cognitive biases and that these biases can have adverse impacts on their decision outcomes (Kahneman et al., 1982; Elstein, 1999; Croskerry, 2002; Kaptchuk, 2003). Little or no research seems to have examined whether people also experience cognitive biases while searching for information, whether there are negative consequences from any such biases, and whether these biases can be corrected for to improve the quality of decision making. Yet, to develop information retrieval systems that actively support decision making, it will be necessary to understand the complex process of how people search for and review information when making decisions (Spink & Cole, 2005), and design appropriate search user interfaces for these needs.

1.2 Research gap

Understanding the way people seek and search for knowledge in their everyday context is related to the area of cognitive information retrieval (CIR) (Spink & Cole, 2005a). Processes and techniques in CIR and human information behaviour research have provided a better understanding of the way people multitask during information searching (Spink et al., 2005b), the way people conduct successive searches (Spink et al., 2002), the way people interact in a user-intermediary session (Ellis et al., 2002), and the potential benefit of eliciting user feedback to infer document relevance (Kelly, 2005). Theoretical frameworks and models have been developed to provide a better conceptual understanding of the impact of uncertainty on information retrieval (Wilson et al., 2002), the way people interact in an information retrieval task (Cole et al., 2005) and the intricate relationship between information and knowledge (Ford, 2005). Of particular relevance to this research, studies have identified that different cognitive styles and cognitive states influence the way people conduct their searches. For example, individuals’ cognitive processes, such as field dependency/independency capability, influence the use of a local or global approach to search (Ford et al. 1994); and individuals’ bias for verbaliser/imager cognitive styles influences the effectiveness of an information retrieval task (Ford et al., 2001).

However, no studies seem to have considered whether people experience cognitive biases while searching for online information. Examining the current literature in the fields of information science, health informatics, and decision making shows that little or no research has been conducted that (i) considers whether people experience cognitive biases while searching for information; (ii) assesses the impact of these biases on decision making (specifically in the area of health); and (iii) investigates the possibility of debiasing the process of information searching and the subsequent impact on decision making. This research is designed to address these three research gaps in a series of retrospective data analyses, qualitative and quantitative explorations, a Bayesian model and prospective empirical experiments.

In information science, there have been studies that investigate whether search systems bias the way information is indexed and retrieved, such as whether information retrieved from search engines originates from a particular country or organisation (Mowshowitz & Kawaguchi, 2002). There are also studies that demonstrated that the order of information presentation influences the way people evaluate the relevance of a document (Eisenberg & Barry, 1988; Tubbs et al., 1993; Purgailis Parker & Johnson, 1990; Huang & Wang, 2004), and that the order of information presentation affects the way people make decisions when their confidence is low but not when their confidence is high (Wang et al., 2000). However, no studies have evaluated whether the order in which people access information, the way information is processed and interpreted, and the way people perceive the problem and revise their belief in a search session, influence the way they make decisions.

In health-related decision making, studies have confirmed that physicians do not always achieve optimal results when using information retrieval systems (Hersh & Hickman, 1999), and that physicians display cognitive biases and use heuristics when making clinical decisions and interpreting research evidence (Elstein, 1999; Croskerry, 2002; Kaptchuk, 2003). In fact, studies have demonstrated that medical practitioners arrive at different diagnoses when the same information is presented in a different order (Bergus et al., 1995; Bergus et al., 1998; Chapman et al., 1996; Cunnington et al., 1997), and that different order presentation influences patients’ interpretation of treatment options and information (Bergus et al., 2002). However, no studies have been conducted to evaluate the extent to which these biases influence health-related decision making in an information retrieval task.

In decision support, different strategies and interventions have been proposed to debias different decision making tasks. Examples of these tasks include appraisal of real estate prices (George et al., 2000), appraisal of department head performance (Lim et al., 2000), development of software (Rai et al., 1994; Shore, 1996), conducting audit reviews (Arnold et al., 2000; Ashton & Kennedy, 2002), assisting managing directors and boards of directors to make corporate decisions (Arnott, 2006), estimating sport team performance (Marett & Adams, 2006), correcting expert assessments (Clemen et al., 2002), assisting students to learn to diagnose diseases (Klayman & Brown, 1993), making clinical decisions (Croskerry, 2003, 2003a), making legal decisions (Jolls & Sunstein, 2006), and using information in various problem solving tasks (Roy & Lerch, 1996; Chandra & Krovi, 1999). Studies have also examined the conditions in an environment in which cognitive biases can be minimised (e.g. Hogarth & Einhorn, 1992 for order effects; and Murphy & Winkler, 1984 for overconfidence). Although many search user interfaces have been designed to assist people to select, process and integrate information (Hearst, 1999), no intervention has been designed specifically for debiasing the process of information searching and there have been no attempts to integrate debiasing strategies into the user interface of a search system.


1.3 Aim

The overall objective of this research is to investigate how cognitive biases impact on health- related information searching and decision making. The hypotheses in this research are (i) people experience cognitive biases during information searching; (ii) cognitive biases can be corrected during information searching; and (iii) correcting for biases during information searching improves decision making. The underlying assumption for these hypotheses is that information searching influences decision making.

2 Guide to thesis

This thesis is divided into five parts. The first part presents background information and a guide to this thesis (Chapters 1–2). The second part presents a data analysis of how clinicians search for online information to answer clinical questions (Chapter 3), which leads to establishing the hypotheses in the third part.

The third part establishes and tests the hypotheses (Chapters 4–7). It starts with an overview of the literature related to cognitive biases and their impact on decision making and information searching (Chapter 4); followed by a retrospective data analysis that preliminarily tests the working hypotheses (Chapter 5). The hypotheses are then formalised in Chapter 6; and tested in a Bayesian model in Chapter 7.

The fourth part describes interventions designed to debias information searching (Chapters 8–10). Chapter 8 presents a review of the literature on strategies to debias decision making from cognitive science, decision support and information science. This is followed by a review of the literature on search user interfaces that support the analysis and synthesis of search results (Chapter 9). Three classes of interventions are then proposed to debias information searching; a description, the design process and the rationale behind these interventions are outlined in Chapter 10.

The fifth part discusses the results of a study undertaken to test the effectiveness of these debiasing interventions (Chapters 11–14). Chapter 11 describes the pre-/post-search study design undertaken with healthcare consumers to test the debiasing interventions. Chapter 12 reports the impact of information searching on subjects’ accuracy and confidence in answering health-related questions. Chapter 13 compares subjects’ cognitive bias behaviour during information searching between using a debiasing intervention and a baseline search interface. Chapter 14 analyses how subjects’ decision outcome, confidence in answers, search behaviour and user preference differ between using the debiasing intervention and baseline search interface.

The thesis concludes with a summary in Chapter 15.

Part II: What do we understand about information searching behaviour?

3 Evidence journey: data exploration

The overall objective of this research is to investigate the impact of information searching on decision making, and to examine if the user interface of a search engine can be improved to support better decisions. To undertake such a research program and develop an initial set of hypotheses, a retrospective analysis was independently designed and conducted on a dataset of search and decision behaviours. That dataset was collected by other researchers prior to the commencement of this research to test the effectiveness of a new search engine (Coiera et al., 2005), and was first reported in Westbrook et al. (2005). This chapter presents the results of a retrospective analysis conducted on that dataset, and the investigations that led to the hypotheses that underpin this research.

3.1 Introduction

The aim of this analysis is to understand how people search for and use information to make a decision. To carry out this analysis, it is important to understand the “journey” that people undertake to arrive at a decision. Taking the definition of evidence to be “the body of observed, reported or research-derived data that has the potential to inform decision making” (Ford et al., 1999, p. 385), an “evidence journey” is the process that describes an individual accessing different pieces of information retrieved from an online evidence retrieval system to make a decision. The notion of the evidence journey is similar to Bates’ berrypicking metaphor, or bit-at-a-time retrieval, where “a query is satisfied not by a single final retrieved set, but by a series of selections of individual references and bits of information at each stage of the ever-modifying search” (Bates, 1989).

There is a body of literature looking at how people use information retrieved from search engines to inform their decision making. An example from the information science literature is the model of document use proposed by Wang and colleagues (1998, 1999).

Based on a longitudinal study of academic researchers’ use of documents retrieved from online databases during a research project, they proposed that document use is a decision making process and people do not necessarily use the same criteria to select, read or cite documents (Wang & Soergel, 1998; Wang & White, 1999).

Gruppen et al. (1991), reporting in the medical decision making literature, suggested that information gathering and selection are more problematic than information integration and use. Based on a study examining how first year house officers select information to make a diagnosis, they found that subjects selected the optimal information in only 23% of the cases but were able to use the selected information to make a diagnosis in over 60% of cases. They suggested that physicians appear to have difficulties recognising the diagnostic value of information, which results in making decisions based on diagnostically weak information.

3.2 Data description

To help develop a model of the evidence journey, a retrospective analysis was constructed on a dataset of the search and decision behaviours of 75 clinicians (44 doctors and 31 clinical nurse consultants), who answered questions for 8 real-life clinical scenarios within 80 minutes in a controlled setting at a university computer laboratory (Westbrook et al., 2005). Scenarios were presented in a random order, and subjects were asked to record their answers and level of confidence for each scenario. Subjects were then asked to use an Internet search engine to locate documentary evidence and answer the scenario questions again.

Table 3.1 Clinical questions in the scenarios presented to subjects (Westbrook et al., 2005)

Question (scenario name) | Expected correct answer
Does current evidence support the insertion of tympanostomy tubes in a child with normal hearing? (Glue ear) | No, not indicated
What is the best delivery device for inhaled medication to a child during a moderate asthma attack? (Asthma) | Spacer (holding chamber)
Is there evidence for the use of nicotine replacement therapy after myocardial infarction? (MI) | No, use is contraindicated
Is there evidence for increased breast and cervical cancer risk after IVF treatment? (IVF) | No evidence of increased risk
Is there evidence for increased risk of SIDS in siblings of a baby who died of SIDS? (SIDS) | Yes, there is an increased risk
What is the anaerobic organism(s) associated with osteomyelitis in diabetes? (Diabetes) | Peptostreptococcus, Bacteroides

Figure 3.1 Data excluded from retrospective analysis

Subjects recorded their pre- and post-search answers to each question, their confidence in these answers and their confidence in the evidence they had found using the search engine. There were four options to answer each question: yes, no, conflicting evidence and don’t know. Confidence was measured by a 4-point Likert scale from “very confident” to “not confident”. In addition, subjects recorded any change in answer or confidence from their pre-search response and identified which documents influenced their decision. They were asked to work through the scenarios as they would within a clinical situation and not spend more than 10 minutes on any one question.

Data from the six scenarios for which a correct answer could be identified were analysed (scenario questions are described in Table 3.1). Figure 3.1 describes how data was excluded from this study. Out of 450 responses (75 × 6 = 450) in the dataset for the 6 scenarios, 11 responses were excluded because either subjects did not perform any searches, subjects did not access documents or data was unavailable. Of the remaining 439 (450 – 11 = 439) search sessions, 39 were eliminated because subjects did not provide a post-search answer or their post-search answer was “don’t know”, leaving a dataset of 400 search sessions.

Table 3.2 Total number of subjects, accessed documents, search sessions and searches across the six scenarios (before and after data exclusion)

Scenario | No. of search sessions (before / after exclusion) | No. of searches (before / after exclusion) | No. of accessed documents (before / after exclusion)
Asthma | 75 / 64 | 230 / 188 | 302 / 266
Glue ear | 75 / 68 | 351 / 307 | 351 / 323
SIDS | 75 / 66 | 248 / 202 | 363 / 327
MI | 75 / 73 | 352 / 350 | 284 / 273
IVF | 75 / 72 | 317 / 311 | 235 / 235
Diabetes | 75 / 57 | 484 / 403 | 522 / 449
Total | 450 / 400 | 1982 / 1761 | 2057 / 1873

The unit of measure used in the analysis that follows is a search session, which is “the entire series of queries by a user” (Jansen et al., 2000, p. 252) to answer one question.

Overall, subjects made 1761 searches and accessed 1873 documents across the 400 search sessions for six scenarios (Table 3.2). In a search session, subjects took on average 405 seconds (standard deviation (SD): 590.8), made 4.32 (SD: 4.002) searches and accessed 4.65 documents (SD: 3.670) to complete a question (Table 3.4).
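As an aside, the per-scenario summaries in Tables 3.2 and 3.4 are straightforward aggregations over a session log. The following Python sketch illustrates one way such a summary could be computed; the column names (scenario, duration_sec, n_searches, n_documents) are hypothetical and do not reflect the actual data schema used in this research.

# Illustrative sketch only: aggregating per-session records into a per-scenario
# summary of the kind shown in Tables 3.2 and 3.4. Column names are hypothetical.
import pandas as pd

sessions = pd.DataFrame([
    # one row per search session (after data exclusion)
    {"scenario": "Asthma", "duration_sec": 289, "n_searches": 3, "n_documents": 4},
    {"scenario": "Glue ear", "duration_sec": 356, "n_searches": 4, "n_documents": 5},
    # ... remaining sessions (400 in total)
])

summary = sessions.groupby("scenario").agg(
    n_sessions=("scenario", "size"),
    mean_time=("duration_sec", "mean"), sd_time=("duration_sec", "std"),
    mean_searches=("n_searches", "mean"),
    mean_documents=("n_documents", "mean"),
)
print(summary.round(2))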

Across the six scenarios, most subjects answered correctly after searching. The largest group, as reported in Westbrook et al. (2005), comprised subjects who had a wrong answer pre-search and a right answer post-search, wrong-right (WR: 37%), followed by those who never answered correctly, wrong-wrong (WW: 30%), then those who answered correctly pre- and post-search (RR: 26%), and those who went from right to wrong (RW: 8%) (Table 3.3).

Table 3.3 Changes in answers pre- and post- search after data exclusion (n=400) (after Westbrook et al., 2005)

Before search | After search | Total no. | % (95% CI)
Right | Right | 104 | 26% (21.9 to 30.5)
Right | Wrong | 30 | 8% (5.3 to 10.5)
Wrong | Wrong | 120 | 30% (25.7 to 34.7)
Wrong | Right | 146 | 37% (31.9 to 41.3)
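To make the tallying in Table 3.3 concrete, the sketch below (in Python) shows how pre-/post-search responses could be classified into the RR, RW, WR and WW categories and summarised as proportions. It is illustrative only: the record fields are hypothetical, and the 95% confidence intervals here use a simple normal approximation, which is not necessarily the method used to produce Table 3.3.

# Illustrative sketch: classifying each search session by pre-/post-search
# correctness (RR, RW, WR, WW) and summarising the proportions, as in Table 3.3.
# Field names are hypothetical; the CI uses a normal approximation.
import math
from collections import Counter

sessions = [
    {"pre_correct": False, "post_correct": True},   # a WR session
    {"pre_correct": True,  "post_correct": True},   # an RR session
    # ... one record per search session (n = 400 after exclusion)
]

def category(session):
    pre = "R" if session["pre_correct"] else "W"
    post = "R" if session["post_correct"] else "W"
    return pre + post

counts = Counter(category(s) for s in sessions)
n = len(sessions)
for cat in ("RR", "RW", "WW", "WR"):
    k = counts.get(cat, 0)
    p = k / n
    half_width = 1.96 * math.sqrt(p * (1 - p) / n)   # approximate 95% CI
    print(f"{cat}: {k} ({100*p:.0f}%, "
          f"95% CI {100*(p-half_width):.1f} to {100*(p+half_width):.1f})")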

Table 3.4 Average number of accessed documents, searches and time taken in a search session (after data exclusion)

Scenario (n) | Time taken (seconds)*: average (SD), range | No. of searches: average (SD), range | No. of accessed documents: average (SD), range
Asthma (64) | 289 (588.9), 0 to 4442 | 2.88 (2.347), 1 to 13 | 4.09 (2.518), 1 to 12
Glue ear (68) | 356 (341.6), 0 to 1210 | 4.46 (3.911), 1 to 19 | 4.74 (3.240), 1 to 14
SIDS (66) | 345 (336.1), 0 to 1372 | 3.03 (2.075), 1 to 10 | 4.91 (3.794), 1 to 22
MI (73) | 360 (505.4), 0 to 3646 | 4.77 (3.802), 1 to 21 | 3.71 (2.264), 1 to 11
IVF (72) | 344 (515.9), 0 to 4016 | 4.31 (3.368), 1 to 16 | 3.25 (2.372), 1 to 11
Diabetes (57) | 795 (987.8), 0 to 5789 | 6.95 (6.323), 1 to 36 | 7.86 (5.588), 1 to 28
All scenarios (400) | 405 (590.8), 0 to 5789 | 4.35 (4.002), 1 to 36 | 4.65 (3.670), 1 to 28

* Measured as time elapsed between commencement of first and last searches. Time taken is therefore zero where only one search was conducted.


Figure 3.2 Frequency of subjects answering a question correctly or incorrectly for each document accessed by at least five subjects (y-axis: no. of clinicians accessing the document; x-axis: documents in frequency order of access by different clinicians; series: correct answer, incorrect answer)

3.3 Data exploration

3.3.1 Did people who accessed the same document arrive at the same answer?

Analysis of the different evidence journeys taken by study participants reveals that subjects did not necessarily arrive at the same answer after having accessed the same document. In one scenario (MI), around 25% of WW subjects (i.e. 10 out of 39 subjects) cited the same source as WR subjects to support the opposite answer. Across the six scenarios, subjects often produced different answers to questions despite having accessed the same document.

The majority of frequently accessed documents were seen both by subjects who answered a question correctly after searching and by those who answered incorrectly (Figure 3.2).

3.3.2 Can we measure the influence of a document on a decision?

Since a document may be influencing some subjects to answer in different ways, a quantitative measure is needed to model the impact a document may have on a decision. One simple method is to associate a document with the frequency of correct and incorrect decisions made after having accessed the document. This leads to the idea of using a likelihood ratio (LR) to calculate a ratio of the frequency that accessing a document is associated with a correct answer rather than with an incorrect answer, P(AccessedDoc|Correct) / P(AccessedDoc|Incorrect) (shown in Equation 1). The LR measures the impact a document has in influencing a subject towards a specific answer. Documents with a LR > 1 are most likely to be associated with a correct answer, and LR < 1 with an incorrect answer.

The likelihood ratio is calculated from the sensitivity and specificity of a document with respect to an answer. The sensitivity, or true positive rate, of a document is the frequency with which the document being accessed correlated with a correct answer being provided post-search (shown in Equation 2). The false positive rate, the frequency with which access of a document correlated with an incorrect answer, was also calculated. The specificity, or true negative rate, is one minus the false positive rate (shown in Equation 3).

The sensitivity and specificity values were calculated based upon the frequency with which a document was accessed for each scenario:

Likelihood ratio = P(AccessedDoc|Correct) / P(AccessedDoc|Incorrect) = Sensitivity / (1 − Specificity)   (1)

Sensitivity = (No. of correct post-search answers where the document was accessed) / (Total no. of correct post-search answers)   (2)

1 − Specificity = (No. of incorrect post-search answers where the document was accessed) / (Total no. of incorrect post-search answers)   (3)

For example, the sensitivity measure for the document in Table 3.5 is 38/55 and the one minus specificity measure is 7/20. The LR of the document is (38/55) / (7/20) = 1.97, which means that a person having accessed this document is 1.97 times more likely to be associated with a correct post-search answer than with an incorrect post-search answer.

Since some documents were accessed on only a few occasions, it was not possible to calculate meaningful sensitivity and specificity measures for all documents. Thus, LR was only calculated for the subset of documents that had been accessed by at least five subjects (each document was accessed by 4.7 subjects on average).

Table 3.5 Access frequency in IVF scenario for document “Cancer incidence in a cohort of infertile women who underwent in vitro fertilization”

Accessed document? | After search: Right | After search: Wrong | Total (no. of subjects)
Yes | 38 | 7 | 45
No | 17 | 13 | 30
Total | 55 | 20 | 75
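The worked example above can be expressed directly in code. The following Python sketch applies Equations 1–3 to the counts in Table 3.5; it is an illustration of the calculation rather than the analysis code used in this research.

# Illustrative sketch of Equations 1-3, applied to the access counts in Table 3.5.
def likelihood_ratio(accessed_correct, total_correct, accessed_incorrect, total_incorrect):
    sensitivity = accessed_correct / total_correct                 # Equation 2
    one_minus_specificity = accessed_incorrect / total_incorrect   # Equation 3
    return sensitivity / one_minus_specificity                     # Equation 1

# Table 3.5: the document was accessed in 38 of 55 correct post-search answers
# and in 7 of 20 incorrect post-search answers.
lr = likelihood_ratio(38, 55, 7, 20)
print(f"LR = {lr:.2f}")   # 1.97: access is associated with a correct answer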


Figure 3.3 Variation in influence of accessed documents in obtaining a correct post-search answer, as measured by likelihood ratio (y-axis: likelihood ratio, P(AccessedDoc|Correct) / P(AccessedDoc|Incorrect); x-axis: documents in ascending order of likelihood ratio). Note: some documents have a likelihood ratio > 10

A total of 725 documents were accessed across the 6 scenarios, with a range from 78 to 138 documents per scenario. After culling, 88 documents were kept in the pool of influential documents (i.e. accessed by at least five subjects), with a range from 10 to 19 documents per scenario.

Figure 3.3 shows that documents accessed were almost equally distributed between those more likely to be associated with a correct answer (51% of accessed documents had a LR > 1) or an incorrect answer (49% of accessed documents had a LR < 1). There was a clear variation in the likelihood that accessing different documents was associated with a subject providing a correct or incorrect post-search answer.

3.3.3 Are there typical patterns in the evidence journey?

To better understand the way accessing a sequence of documents might have influenced an individual in making a decision, a qualitative graphical analysis was conducted to look for typical patterns amongst the dataset. In a search session, a positive document is a document with LR > 1 and is represented by a closed circle; a negative document is a document with LR ≤ 1, represented by an open circle; an indeterminate document is a document with a LR that cannot be established and it is represented with a strip-patterned circle; and each query submitted to the search engine is represented with a line (Table 3.6).


Table 3.6 Symbols used in graphically depicting evidence journey

Symbol | Name | Meaning
Closed circle | Positive document | Document with LR > 1
Open circle | Negative document | Document with LR ≤ 1
Strip-patterned circle | Indeterminate document | Document for which the LR cannot be established
Line | Search | Represents a single search (i.e. one query submitted to the search engine)
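For readers who prefer a non-graphical representation, the sketch below shows one possible way to encode an evidence journey for this kind of pattern analysis: each search holds the documents accessed during it, and each document is classified using the likelihood ratio thresholds in Table 3.6. The grouping of documents into searches and the likelihood ratio values are hypothetical, chosen only to loosely mirror Case 3 below.

# Illustrative sketch of one possible encoding of an evidence journey.
# Document classes follow Table 3.6; the grouping into searches and the
# likelihood ratio values are hypothetical (loosely mirroring Case 3 below).
def classify(lr):
    if lr is None:
        return "indeterminate"            # LR cannot be established
    return "positive" if lr > 1 else "negative"

journey = [
    [("a", 1.8), ("b", 0.6)],             # documents accessed during search 1
    [("c", 0.4), ("d", 2.5)],             # documents accessed during search 2
]

for i, search in enumerate(journey, start=1):
    labels = ", ".join(f"{doc} ({classify(lr)})" for doc, lr in search)
    print(f"Search {i}: {labels}")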

The following examples demonstrate the evidence journeys of subjects in four different categories: RR, RW, WR and WW.

Case 1: RR

The subject in this example was correct before and after searching. This subject indicated being very confident in the pre-search answer. The subject made only one search and accessed one document, which is a positive document (a). One possible interpretation of this evidence journey is that the first document confirmed the subject’s pre-search answer; hence, the subject stopped searching and provided the correct answer.

Case 2: RW

This subject gave a correct answer before searching but changed to an incorrect answer after searching. The subject made four searches. On the first search, the subject accessed two documents; the first document was a positive document (a), followed by a negative document (b). The subject then performed more searches, viewed the titles and summaries of documents retrieved on the results pages but did not access any document until the last search, which was a document with an indeterminate likelihood ratio (c). One possible interpretation is that as the subject spent more time on the negative document (b: 5 minutes) than the other two documents (a: 2 minutes, c: 30 seconds), the extra time spent on the negative document may have contributed to the subject giving an incorrect answer after searching. (Note: Time spent on a document was measured as time elapsed between the commencement of accessing the document and the subject’s next action.)

Case 3: WR

The subject in this example gave an incorrect answer before searching, made two searches, accessed four documents and answered the question correctly after searching. The first document was a positive document (a), followed by two negative documents (b and c), and then a positive document (d). One possible interpretation is that the first and the last documents, which are positive documents, influenced the subject to change opinion and give a correct answer.

Case 4: WW

This subject answered the question incorrectly before and after searching. Although the subject only accessed two documents, a positive document (a) and a negative document (b), the subject accessed the negative document (b) twice. Perhaps revisiting the negative document led the subject to retain an incorrect pre-search opinion and provide an incorrect answer after searching.


3.4 Discussion

The previous section illustrates that people take different journeys to arrive at an answer to a question and that documents may have different influences on a decision. From this preliminary analysis, the following components of an evidence journey seem to shape a decision outcome:

• an individual’s belief and confidence before searching, e.g. Case 1: RR

• the order of accessing documents, especially the first and last accessed documents, e.g. Case 3: WR

• the amount of time spent on the documents, e.g. Case 2: RW

• the number of visits the searcher made to the same document, e.g. Case 4: WW

Each of these phenomena has been identified in the general decision making and information retrieval literature in different guises. Specifically, the literature identifies the existence of cognitive biases in the way people use information to make decisions. The concept of decision biases may provide us with a theoretical model with which to understand search behaviours. For example, decision makers are biased by their prior beliefs, and the order of information presentation in making decisions and forming impressions (Tversky and Kahneman, 1974; Asch, 1946). Different subjects have different knowledge and levels of confidence before searching; their evidence journeys are different in search length, number and types of documents accessed; and the way documents were accessed and how these documents may have interacted are possible elements that influence the way people process and use information to make decisions. This exploration suggests that to understand how people use retrieved information to make decisions, it is important to understand the journey people took to arrive at a conclusion.

3.4.1 Analysis limitations

The assumption that subjects read a document simply because they accessed it is a potential limitation of this analysis. Subjects may not have read documents they accessed, or may have read them only partially, which modifies the likelihood that the document influenced them.

Similarly, subjects may have been influenced by documents without accessing them, for example, looking at the title or the abstract of the document only on the search engine results page, but not accessing the document itself.


3.5 Conclusion

The exploration of how health professionals searched for and accessed documents suggests that people take different journeys to answer questions, and that the way documents were accessed could contribute to different interaction effects between pieces of information, influencing a person's evaluation of the evidence and subsequently the decision making process. In the next part of the thesis, cognitive biases will be examined further through a literature review, followed by a quantitative analysis of bias impact based upon a Bayesian belief revision framework.

Part III: Do cognitive biases influence information searching and decision making?

4 Cognitive biases: a literature review

This chapter presents a literature review on cognitive biases and their impact on decision making. It starts with a brief overview of the classic literature that defines and grounds the cognitive biases investigated in this research: anchoring, order, exposure and reinforcement effects. Studies of cognitive biases in health-related decision making and information science are also presented. The review concludes with illustrations of how people may experience cognitive biases while searching for information.

4.1 Literature search strategy

The work in this thesis lies at the intersection of cognitive science, information searching and decision making. Sources of relevant literature were scattered across disciplines ranging from cognitive psychology and health informatics to computer science, information science and decision science. Consequently, a formal "search strategy" that might work well for a review of a narrow topic area was not sufficient to identify relevant literature, which was often loosely connected, used different terminologies and appeared under very different research topics.

The strategy taken to identify relevant work in these literatures began with searches on the Web of Science to identify key journals in each of the five disciplines. This was followed by a series of journal runs in each discipline, attempting to retrieve literature that reports on the intersection of the three research topics. At the time the literature reviews were conducted, little or no research seemed to have examined this intersection.

Searching was then re-strategised to identify key papers and key researchers from each representative discipline. Once key papers and key researchers were identified from each discipline, bibliographic runs and author citation searches were conducted to complete the literature search. From cognitive psychology, searches were conducted to identify the origin and development of cognitive biases (reported in this Chapter). From health informatics, studies were collated to examine whether clinicians and patients experience biases while making health-related decisions (also reported in this Chapter). From decision sciences, literature was reviewed to analyse the effectiveness of proposed interventions for debiasing (reported in Chapter 8). From computer and information sciences, studies describing features of user interfaces for search engines were assessed for their effectiveness in assisting people to search and process information (reported in Chapter 9).

4.2 Cognitive biases

Human beings seldom follow a purely rational model in decision making and are prone to a series of decision biases (Kahneman et al., 1982). Medical practitioners, for example, display cognitive biases and the use of heuristics when making clinical decisions (Elstein, 1999; Croskerry, 2002) and in interpreting research evidence (Kaptchuk, 2003). Of particular relevance to document retrieval, several laboratory studies have shown that medical practitioners arrive at different diagnoses when the same information is presented in a different order (Bergus et al., 1995; Bergus et al., 1998; Chapman et al., 1996; Cunnington et al., 1997), and that order effects influence patients' interpretation of treatment options and information (Bergus et al., 2002).

Biased decisions occur when an individual’s cognition is affected by “contextual factors, information structures, previously held attitudes, preferences and moods” (de Mello, 2004, p. 4). These biases are generally classified into two main categories: cognitive biases, which arise because of limitations in human cognitive ability to properly attend to and process all the information that is available (Kruglanski & Ajzen, 1983, p. 7); and motivational biases, which are characterised by a tendency to form and hold beliefs that serve the individual’s needs and desires (Kruglanski & Ajzen, 1983, p. 4), and are not necessarily related to an individual’s limitations to process information. This chapter will focus on cognitive biases in information searching and the impact they have on decision making.

4.2.1 The anchoring effect

The anchoring effect refers to the phenomenon of one's prior belief exerting a persistent influence on the way new information is processed and subsequently affecting the way beliefs are formed (Wickens & Hollands, 2000). It was first discussed by Tversky and Kahneman (1974) to describe situations when judgements and decisions are "biased toward the initial values" and when "different starting points yield different estimates" (Tversky & Kahneman, 1974, p. 1128). In their classic demonstration of the anchoring effect, groups of subjects were instructed to spin a "wheel of fortune" with sectors numbered from 0 to 100, were then asked to indicate whether the percentage of African countries in the United Nations was higher or lower than that number, and were finally asked to estimate the percentage by moving upward or downward from it. The percentage estimates were noticeably influenced by these arbitrary numbers. For example, groups who received 10 as the starting number had a median estimate of 25, whereas groups who received 65 had a median estimate of 45. Following this experiment, many researchers confirmed the phenomenon and investigated the different conditions in which the anchoring effect occurs. In particular, Wilson et al. (1996) conducted two studies and proposed the basic anchoring effect, a phenomenon whereby people experience an anchoring effect simply by being exposed to the anchor, without any explicit instruction to compare the anchor with the target value.

4.2.2 Order effects

Order effects refer to the way the temporal order in which information is presented affects the final judgement of an event (Wang et al., 2000). In the psychology literature, order effects are further characterised as primacy and recency effects. In memory recall experiments, where subjects are read a list of terms and asked to recall as many terms as possible, the primacy effect refers to the phenomenon where initial items in the list are more readily recalled than other terms in the list (Murdock, 1961), and the recency effect refers to when latter items are better recalled (Murdock, 1961). In impression forming studies, where subjects are presented with serial information on a topic or a person and asked to form an impression, the primacy effect, or the "Law of Primacy in persuasion", describes the phenomenon where individuals' impressions are more influenced by earlier information than by information later in the sequence (first reported by Lund, 1925), and the recency effect refers to the situation when impressions are more influenced by later information (Cromwell, 1950; Luchins, 1957; Hovland, 1957).

The primacy and recency effects as defined in the impression forming literature are most relevant to the effect of evidence on decision making. They are best illustrated in the following classic experiment conducted by Asch in 1946. Two groups of subjects were read the same list of characteristics of a person and asked to describe the impression they formed. The list was designed such that favourable adjectives were earlier in the list and unfavourable adjectives were later in the list. For one group of subjects, the following series was read:

Intelligent-industrious-impulsive-critical-stubborn-envious

And for the other group of subjects, the following reverse order was read:

Envious-stubborn-critical-impulsive-industrious-intelligent

Asch found that subjects who were exposed to favourable adjectives earlier and unfavourable adjectives later in the list formed a better impression of the person than those who were exposed to the same list of adjectives in reverse order.

In the more recent psychology literature, Hogarth and Einhorn (1992) developed an analytical model to predict when primacy and recency effects would occur and the magnitude of these effects. Upon reviewing numerous studies on order effects and conducting multiple experiments, they identified that order effects arise from the interaction of information processing strategies and task characteristics. They proposed a belief-adjustment model that predicts when a primacy or recency effect would occur under the interaction of the following factors: task complexity, the length of the evidence series, and whether one responds Step-by-Step (SbS) or at the End-of-Sequence (EoS) of evidence.

They concluded that (i) primacy effect is more likely to occur in end-of-sequence processing tasks or in tasks that are simple but long; (ii) recency effect occurs more often in step-by-step processing tasks, in complex tasks, or in situations where mixed (positive and negative) evidence is presented; and (iii) the model predicts that people do not experience order effects when the evidence being processed is consistent (all positive or all negative).
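To make the Step-by-Step updating idea concrete, the following minimal sketch implements the evaluation form of the belief-adjustment model as it is commonly stated (an anchoring-and-adjustment rule in which negative evidence is weighted by the current belief and positive evidence by its complement). The starting belief and evidence values are illustrative assumptions, not data from the studies reviewed here.

```python
def update_belief(belief, evidence):
    """One Step-by-Step update of the belief-adjustment model (evaluation form):
    S_k = S_{k-1} + w_k * s(x_k), with w_k = S_{k-1} for negative evidence and
    w_k = 1 - S_{k-1} for positive evidence. `belief` lies in [0, 1];
    `evidence` is the subjective evaluation s(x_k) in [-1, 1]."""
    weight = belief if evidence <= 0 else 1.0 - belief
    return belief + weight * evidence


def final_belief(initial, sequence):
    """Apply the Step-by-Step update over a whole evidence sequence."""
    belief = initial
    for s in sequence:
        belief = update_belief(belief, s)
    return belief


# Illustrative strong, mixed evidence processed Step-by-Step: reversing the order
# changes the final belief, i.e. the later item dominates (a recency effect),
# which is what the model predicts for inconsistent evidence.
print(final_belief(0.5, [+0.8, -0.8]))   # positive then negative -> 0.18
print(final_belief(0.5, [-0.8, +0.8]))   # negative then positive -> 0.82
```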

4.2.3 Exposure effect

Exposure effect refers to the phenomenon where the level of exposure to information affects the final judgement of an event. Exposure (e.g. the length of time spent looking at a given document) has been reported to influence a person's evaluation of an experience. Most research that has studied the impact of time has examined a person's reaction to pain or the recall of pleasant or unpleasant personal memories over time. Features of a temporal sequence, such as the spread of experiences, the partitioning of episodes, the peak-and-end events in an incident and the degradation or improvement in experience over time, have been reported to influence a person's overall impression of the experience. For example, Varey and Kahneman (1992) found that ratings of pain experiences were primarily based on the maximum and final intensities of the experiences: participants rated the overall pain in the sequence {2, 5, 8} as worse than the overall pain in the sequence {2, 5, 8, 4} (larger numbers in the sequence denote greater pain than smaller numbers).

4.2.4 Reinforcement effect

Reinforcement effect refers to the phenomenon where the level of repeated exposure to information affects the final judgement of an event. Reinforcement (e.g. the number of times a document has been looked at) is perhaps best related to the "mere exposure" effect discussed by Zajonc (1968), where the "mere repeated exposure of the individual to a stimulus object is a sufficient condition for the enhancement of his attitude toward it" (Zajonc, 1968, p. 1); mere exposure simply refers to the condition of having the given stimulus accessible to the individual's perception. Reinforcement in this research is different from the notion discussed in Premack's principle of positive reinforcement (Premack, 1959), where positive feedback and rewards are used to encourage certain desirable behaviour. Through reviewing and conducting experimental studies, Zajonc found a log-linear relationship between the frequency with which a subject was exposed to a stimulus and the subject's enhanced attitude towards the stimulus, regardless of whether the stimulus was a nonsense word, a symbol or a photograph of people. He observed that, independent of word content, subjects consistently gave a higher rating to a word if they had seen it more often.

4.3 Studies of cognitive bias in decision making

Studies of cognitive biases from health-related decision making and information science are presented in this section. Research on cognitive biases from these fields mainly consists of empirical experiments that study whether subjects experienced the anchoring effect or order effects.

4.3.1 Health-related decision making

Physicians' and patients' judgements have been found to tend towards anchors based on information irrelevant to the scenario. Brewer et al. (in press) conducted two empirical experiments with patients and physicians to investigate whether they experience the anchoring effect when given irrelevant information with which to make treatment choices or diagnoses. The first experiment asked 99 HIV-positive patients to estimate the chance that a sexual partner would become infected with HIV after sex with a defective condom and to indicate their treatment choices; the second study involved 191 physicians rating the chance that a hypothetical patient had pulmonary embolism and making a choice of treatment plan. In these two experiments, physicians and patients were given anchor numbers (i.e. 1% or 90%), asked to judge whether the chance of infection/disease was more or less than this anchor number, and asked to give an actual estimate of the percentage. Overall, Brewer et al. found that patients in the lower and upper anchor groups differed on average by 20% in estimating the probability of HIV infection (average estimate from lower anchor: 43%, average estimate from upper anchor: 63%, P = 0.01), and that physicians in the two anchor groups differed on average by 30% in estimating the probability of disease (average estimate from lower anchor: 23%, average estimate from upper anchor: 53%, P < 0.001). Interestingly, patients in the lower anchor group selected more aggressive treatment than patients in the higher anchor group (P < 0.05); similarly, physicians in the lower anchor group selected more aggressive treatment options (P < 0.001), and physicians in the higher anchor group selected less aggressive treatment options, such as watchful waiting (P < 0.05). Overall, patients and physicians from different anchor groups formed significantly different disease/infection estimates and made significantly different choices.

Order effects have been reported to influence diagnosis-making amongst family physicians. Bergus et al. (1995) is possibly one of the earliest studies that examined the impact of order effects on medical decision making. The study involved 55 full-time family physicians who had to provide the probable diagnosis for a patient as each of four pieces of clinical evidence was presented to them. These physicians were divided into two groups, with the same information presented in a different order. Out of the 45 participants who completed the exercise, Bergus et al. found that the diagnoses were significantly different between the two groups (P < 0.0001). Chapman et al. (1996) conducted a similar study with 300 family physicians. They found that information order, i.e. whether the previous disease history was presented first or last, was the only statistically significant factor associated with the difference in the final judged percentage of disease between groups of subjects (P < 0.0001).


Order effects have also been found to influence diagnosis-making, with a mixture of recency and primacy effects. The study conducted by Cunnington et al. (1997), which involved 67 medical students, residents and academic internists, showed that there was a significant overall effect of information sequence on diagnosis (P = 0.05). They also found that there was a clear tendency for subjects to favour the diagnosis for which information was presented first. In contrast, the study conducted by Bergus et al. (1998) on 400 family physicians found that the group that received the history earlier gave it less weight than the group that received it last (P = 0.04).

Order effects have also been studied amongst healthcare consumers. Bergus et al. (2002) recruited 685 people in the waiting rooms of primary care physicians, and each subject was randomised to receive an information booklet on one of three specific treatment options for symptomatic carotid artery disease and asked to rate the favourability of the treatment.

There were two information booklets for each treatment option, which carried the same information but with the paragraphs in a different order: risk then benefit, or benefit then risk. The three treatment options were: low-risk/low-benefit, high-risk/high-benefit and high-risk/unknown-benefit. Overall, Bergus et al. found that for low-risk/low-benefit treatments, subjects were influenced by the order of risk/benefit information; those who learned about risks after benefits gave the treatment a lower favourability rating than those who learned about benefits after risks (P = 0.02) and were less likely to consent (P = 0.04). They also found that the order of risk/benefit information did not affect the favourability rating of high-risk treatments, regardless of potential benefits (P > 0.05).

4.3.2 Information searching and processing

In information science, studies of order effects have mainly focused on whether the order of information presentation influences subjects' evaluation of document relevance. Eisenberg and Barry (1988) conducted two experiments with 42 graduate and undergraduate students to examine whether subjects judged the relevance of 15 documents differently if the documents were presented in increasing order of relevance compared with decreasing order of relevance. They found that when documents were presented in decreasing order of relevance, subjects consistently underestimated the significance of documents at the higher end of relevance; and when documents were presented in increasing order of relevance, subjects overestimated the relevance of documents, especially those in the low to middle range of relevance. They concluded that there was a significant difference in ratings between the two orders (P < 0.002). In another study, Tubbs et al. (1993) examined whether order effects influence the way people update their beliefs when they encounter consistent and inconsistent evidence. Using between-subject and within-subject experiments, Tubbs et al. confirmed Hogarth and Einhorn's prediction that a recency effect will be found when two pieces of evidence are inconsistent and strong (P < 0.01), and suggested that interventions to minimise biases (i.e. "debiasing methods") are best applied at the evidence integration stage of the belief-updating process rather than at the evaluation of individual pieces of evidence.

Order effects have also been studied in information searching, where the focus is mainly on document relevance. Purgailis Parker and Johnson (1990) conducted a study with 40 graduate and senior undergraduate students, where subjects had to use an information retrieval system to search for documents on a topic of their choice and rate the relevance of each document returned by the system. Over 47 search queries, they concluded that subjects are not influenced by order effects when the set of retrieved document citations has fewer than 15 documents, and that order effects are marginally statistically significant when the number of documents is equal to or greater than 15 (P = 0.052). Building on this work, Huang and Wang (2004) conducted a two-phase study with a convenience sample of 48 undergraduate and postgraduate students. They concluded that significant order effects take place when there are 15 or 30 documents (P < 0.05); that sets with 45 and 60 documents reveal order effects, although the results are not statistically significant; and that subjects are not influenced by the order of presentation when the set of documents has 5 or 75 members.

Order effects may occur when confidence is low and disappear when confidence increases (Wang et al., 2000). Using an empirical and computational study, Wang et al. (2006) concluded that order effects result from one's coherently and dynamically adaptive expectations of the statistical properties of the environment. They hypothesised that over-reacting and under-reacting are major causes of order effects. They concluded that the existence and disappearance of order effects are rational because beliefs are gradually tuned to the statistical structure of the environment, i.e. as people gain more experience and increase confidence, they avoid over-reacting or under-reacting and hence prevent order effects.


Other investigations that study biased processes, cognitive processes or cognitive styles in information searching include: whether information retrieved from search engines originates from a particular country or organisation (Mowshowitz & Kawaguchi, 2002); whether searchers' cognitive processes, such as field dependency/independency capability, influence the use of a local or global approach to search (Ford et al., 1994); and whether individuals' bias towards verbaliser/imager cognitive styles influences their information retrieval effectiveness (Ford et al., 2001).

4.4 Heuristics and cognitive bias in the context of information searching

People appear to use simple procedures or rules (called 'heuristics') to reduce mental effort in their everyday tasks (Kahneman et al., 1982). The three main heuristics commonly described are the representativeness heuristic, the availability heuristic, and the anchoring and adjustment heuristic.

The representativeness heuristic describes the situation when "probabilities are evaluated by the degree to which A is representative of B, that is, by the degree to which A resembles B" (Kahneman et al., 1982, p. 4). Biases emanating from the representativeness heuristic include "insensitivity to sample size", which describes the situation when individuals frequently fail to appreciate the role of sample size in assessing the reliability of sample information (Kahneman & Tversky, 1972). For example, people may judge characteristics in a small sample to be highly representative of the population (Kahneman et al., 1982). In the context of information searching, a study conducted by Jansen and Spink (2003) shows that, in the use of a commercial web search system, more than half the users (54%) viewed only one page of search results. This could be attributed to the "insensitivity to sample size" bias, where people look through the results on the first page and evaluate the success of the search based on the first page as a representative of the overall retrieval results.

The availability heuristic describes the "situation in which people assess the frequency of a class or the probability of an event by the ease with which instances or occurrences can be brought to mind" (Kahneman et al., 1982, p. 11). The dependence on availability leads to biases such as ease of recall, which describes the phenomenon in which "individuals judge events that are more easily recalled from memory, based upon vividness or recency, to be more numerous than events of equal frequency whose instances are less easily recalled" (Kahneman et al., 1982, p. 11). In the context of information searching, people affected by order, exposure and reinforcement effects may experience the ease of recall bias when they use the availability heuristic to recall what has been read in their evidence journey to make a decision. Documents that have been read in the first and last positions (order effects), documents on which people spent more time (exposure effect) or which they visited more than once (reinforcement effect) may dominate the development of the decision outcome simply because they are more easily recalled than other documents, rather than because of the utility contribution of these documents.

The anchoring and adjustment heuristic describes the situation "when different starting points yield different estimates, which are biased toward the initial values" (Kahneman et al., 1982, p. 14). Biases resulting from the anchoring and adjustment heuristic include insufficient anchor adjustment and overconfidence (Bazerman, 1994). The insufficient anchor adjustment bias (or the conservatism bias) describes the phenomenon in which individuals make estimates for values based upon an initial value and typically make insufficient adjustments from that anchor when establishing a final value (Bazerman, 1994, p. 46). In the context of information searching, people may experience the insufficient adjustment bias if they already hold a particular point of view on a matter and fail to seek conflicting information, or fail to adjust accordingly when they encounter information that challenges their point of view. Studies have shown that people do not look for, or often ignore or disregard, information that conflicts with their point of view (Klayman & Ha, 1987; Chinn & Brewer, 1993). A study conducted by Russo et al. (1996) found that, in the process of reviewing different pieces of information, people with or without a pre-existing preference tended to develop a preference for one point of view before finishing reviewing the rest of the information; the development of such a preference could distort people's perception of the remaining pieces of information and lead them to prematurely favour and adopt a particular viewpoint.

Another bias that can be caused by the use of the anchoring and adjustment heuristic is the overconfidence bias, which describes the phenomenon whereby individuals who are overconfident in their hypothesis are less likely to seek further information (Wickens & Hollands, 2000). Overconfidence has been reported to affect decisions that are of moderate to extreme difficulty (Bazerman, 1994, pp. 38–39), and has been reported to be one of the most commonly observed findings (Fischhoff, 1982). In the context of information searching, the overconfidence bias may occur when people encounter information that supports their point of view, prematurely terminate their searches and fail to seek further information to confirm or refute that particular point of view.

4.5 Conclusion

This review began with the historical development of the anchoring, order, exposure and reinforcement effects in the psychology literature. Research on the impact of cognitive biases was then presented from the areas of health-related decision making and information science; most studies reported mixed findings for the anchoring effect and order effects. Finally, suggestions of how people may experience cognitive biases in the context of information searching were presented.

5 Do cognitive biases occur in search behaviours? A preliminary analysis

This chapter presents a retrospective data analysis to test whether there is any evidence to support the conclusion that people experience cognitive biases during information searching, using the dataset introduced in Chapter 3. As identified in Chapters 3 and 4, people undertake different evidence journeys as they search for information to answer similar questions. It seems plausible that a subject’s prior belief and their confidence (i.e. anchor and confidence in anchor) and the way a document was processed in the evidence journey may influence the impact it has on the decision outcome. Specifically, the order of document access (i.e. order effects), amount of time spent on the document (i.e. exposure effect) and the number of times the document was accessed (i.e. reinforcement effect) may all influence the decision made after retrieving information to answer a question.

This preliminary data analysis will be used to develop hypotheses about the existence and impact of cognitive biases on decisions made using search engines.

5.1 Method

To study the impact of cognitive biases, a standard way of measuring and reporting deviation from expected decisions is needed. The serial position curve is a technique initially developed to study how the order of information impacts on memory recall. According to Murdock (2001), it is one of the oldest techniques in the experimental study of human memory; it has been reported to antedate Ebbinghaus (1885) and was first discussed by Nipher (1876). The x-axis of the curve indicates the serial position of the to-be-remembered items in the list and the y-axis indicates the percentage of recall for each item, which is typically obtained by averaging the outcome response across a number of subjects. Figure 5.1 is an example of a serial position curve.




Figure 5.1 Example of a serial position curve
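For readers who wish to reproduce a curve of this shape, the short sketch below plots a serial position curve with matplotlib. The recall percentages are invented purely to show the characteristic U-shape and are not data from this research.

```python
import matplotlib.pyplot as plt

# Hypothetical % recall at each serial position, chosen only to illustrate the
# typical U-shape (primacy at the start, recency at the end of the list).
positions = list(range(1, 10))
recall = [72, 55, 44, 38, 35, 37, 45, 60, 78]

plt.plot(positions, recall, marker="o")
plt.xlabel("Serial position")
plt.ylabel("% recall")
plt.ylim(0, 100)
plt.title("Serial position curve (illustrative data)")
plt.show()
```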

Serial position curves are often used in memory recall experiments. Most memory recall experiments involve exposing a list of items to subjects and then asking them to recall as many items as possible (not necessarily in exact order). A common finding is the U-shaped curve, where items at the beginning and at the end of a list are easier to recall than items in the middle of the list, resembling a combination of primacy and recency effects. Serial position curves have been used to study order effects in different areas over the years, such as attitude and persuasion studies since the 1960s, personality impression studies (Andersen et al., 1973), memory testing on patients with Alzheimer disease (Capitani et al., 1992), family physicians' clinical judgement (Bergus et al., 1995; Chapman et al., 1996), patients' preferred treatment (Bergus et al., 2002), evaluation of document relevance (Huang et al., 2004), and commanding a simulation of a naval ship (Wang et al., 2006).

The serial position curve seems well suited to studying whether documents accessed at different positions in the evidence journey have a different influence on the decision outcome. Similarly, a serial exposure curve and a serial reinforcement curve could be used to examine whether documents processed for different lengths of time, and documents accessed at different frequencies, have different influences on the decision outcome. In addition, a serial anchor curve may be used to investigate whether different confidence levels in the pre-search answer influence a person's tendency to retain the answer post-search. All serial curves are normalised because evidence journeys vary in length.

5.1.1 Plotting a serial position curve

Plotting a serial position curve is a three-step procedure. The first step involves collating a pool of candidate documents. The second step involves calculating the concurrence rate for each point in the evidence journey, which measures the level of agreement, or concurrence, between subjects' post-search answers and the answer suggested by the documents accessed at that point. The third step involves plotting these concurrence rates along the y-axis against the dimension under investigation in the evidence journey along the x-axis. Apart from plotting a serial position curve for the order of document access, similar analyses allow curves to be generated for exposure (i.e. a serial exposure curve), reinforcement (i.e. a serial reinforcement curve) and confidence in the anchoring effect (i.e. a serial anchor curve).

For example, to plot a serial position curve for order effects, the x-axis shows the access position of documents in the evidence journey in normalised form, where first refers to the first document accessed in an evidence journey, last to the last-accessed document and middle to documents accessed in any position other than first or last. The reason for normalising the order of document access is so that evidence journeys of different lengths can be compared and plotted onto the same graph. For example, a person who accessed 5 documents in the evidence journey would have the 1st and 5th accessed documents as the first and last documents, and the 2nd, 3rd and 4th documents as middle documents. An alternative normalising strategy is also used, which attempts to allocate documents into three bins of similar size. Applying this alternative strategy to the same example, the 1st and 2nd accessed documents would be allocated to first, the 3rd document to middle, and the 4th and 5th documents to last. The y-axis describes the concurrence rate between subjects' post-search answers and the answer suggested by documents accessed at that position.
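The two normalising strategies can be expressed as a small helper function. This is a sketch only: the tie-breaking rule used when a journey's length is not a multiple of three is an assumption, chosen so that the five-document example above is reproduced.

```python
def label_positions(n_docs, three_bins=False):
    """Label access positions 1..n_docs of an evidence journey as 'first',
    'middle' or 'last'.

    Default strategy: only the single first and single last accessed documents
    are labelled 'first' and 'last'; every other document is 'middle'.
    three_bins=True: allocate documents into three bins of similar size
    (tie-breaking for lengths not divisible by three is assumed here)."""
    if n_docs == 1:
        return ["first"]
    if not three_bins:
        return ["first"] + ["middle"] * (n_docs - 2) + ["last"]
    first_size = -(-n_docs // 3)                    # ceiling of n/3
    last_size = -(-(n_docs - first_size) // 2)      # ceiling of the remainder / 2
    middle_size = n_docs - first_size - last_size
    return ["first"] * first_size + ["middle"] * middle_size + ["last"] * last_size


print(label_positions(5))                   # ['first', 'middle', 'middle', 'middle', 'last']
print(label_positions(5, three_bins=True))  # ['first', 'first', 'middle', 'last', 'last']
```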

The first step of constructing the document pool for a serial position curve involves: (i) gathering all documents accessed by subjects in journeys that involved more than one document; and (ii) removing documents that were not accessed by subjects at more than one position. The next step involves calculating concurrence rates at each access position in the evidence journey in normalised form, i.e. first, middle and last. The concurrence rate for documents accessed at the first position in the evidence journey is illustrated in Equation 5.1.

Concurrence rate_first = Concurrence_first / Access_first        (Equation 5.1)

where

Concurrence_first = no. of post-search correct subjects who accessed a positive document¹ at position first
                    + no. of post-search incorrect subjects who accessed a negative document at position first

Access_first = total no. of subjects who accessed a document at position first
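As a minimal sketch of the calculation in Equation 5.1 (the variable names below are illustrative, not from the thesis), the concurrence rate at a position can be computed from one record per subject who accessed a document at that position:

```python
def concurrence_rate(accesses):
    """Concurrence rate for one normalised position (e.g. 'first').

    `accesses` holds one (document_is_positive, post_search_correct) pair per
    subject who accessed a document at that position. A subject concurs with
    the document when a positive document (likelihood ratio > 1) is followed by
    a correct post-search answer, or a negative document (likelihood ratio <= 1)
    is followed by an incorrect post-search answer."""
    if not accesses:
        return None
    concurring = sum(1 for positive, correct in accesses if positive == correct)
    return concurring / len(accesses)


# Hypothetical example: four subjects accessed a document at the 'first' position.
print(concurrence_rate([(True, True), (True, False), (False, False), (False, True)]))  # 0.5
```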

5.1.2 Quantitative analyses

A chi-square analysis was conducted for each cognitive bias. It was used to test whether there was a statistically significant relationship between documents accessed or processed at different points in the evidence journey and the concurrence between subjects' post-search answers and the answer suggested by the document. It was also used to test whether there was a statistically significant relationship between subjects' pre-search answers and their post-search answers, as well as between subjects' confidence in the pre-search answer and their tendency to retain the pre-search answer after searching. Two assumptions about independence were made in using the chi-square analysis: the first was that subjects' prior knowledge and information-processing abilities to answer one question did not influence their ability to answer other questions; the second was that subjects' interpretation of a document did not influence their interpretation of other documents in the evidence journey. Relative risks, which compare the likelihood of influencing a decision outcome between documents read at a given position and documents read at other positions, were also calculated to measure the impact of each bias.

1 Positive document refers to a document having a likelihood ratio > 1; and a negative document refers to a document having a likelihood ratio ≤ 1.
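As a sketch of the quantitative analyses described in Section 5.1.2, the fragment below runs the chi-square test on the pre-search/post-search counts of Table 5.1 and computes a relative risk in the "one group versus all other groups" form used for the relative risks reported later in this chapter. Whether a continuity correction was applied in the original analysis is not stated; without one, the test reproduces the reported chi-square of 19.63.

```python
from scipy.stats import chi2_contingency

# Counts from Table 5.1: pre-search answer (rows) vs post-search answer (columns).
table = [[104, 30],    # pre-search right: 104 right, 30 wrong after search
         [146, 120]]   # pre-search wrong: 146 right, 120 wrong after search

chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(round(chi2, 2), dof, p)   # 19.63, 1, P < 0.001


def relative_risk(events_in_group, n_in_group, events_elsewhere, n_elsewhere):
    """Relative risk of the outcome in one group versus all other groups combined."""
    return (events_in_group / n_in_group) / (events_elsewhere / n_elsewhere)


# Example: retention of 'very confident' pre-search answers (Table 5.3):
# 30 of 36 retained, versus 132 of 198 across the other confidence levels.
print(round(relative_risk(30, 36, 132, 198), 2))   # 1.25
```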


Table 5.1 Relationship between pre-search answer and post-search answer

Before search        After search: Right    After search: Wrong
Right (n=134)        104 (77.6%)            30 (22.4%)
Wrong (n=266)        146 (54.9%)            120 (45.1%)

5.2 Results

5.2.1 Anchoring effect

Anchoring effect, the tendency to be influenced by anchors (i.e. prior belief or pre-search answer), was looked for using two investigations. The first investigation, namely the anchoring effect investigation, assesses whether there is a statistically significant relationship between subjects' pre-search answers and their post-search answers. The second analysis, namely the confidence in anchoring effect investigation, examines whether there is a statistically significant relationship between subjects' confidence in their pre-search answer and their tendency to retain the pre-search answer after searching.

The anchoring effect investigation is illustrated in Table 5.1 and Figure 5.2. Subjects who were correct pre-search were more likely to answer correctly post-search than those who were incorrect pre-search (Figure 5.2). Chi-square analysis conducted on Table 5.1 shows that there was a statistically significant relationship between pre-search answers and post-search answers (χ2 = 19.63, df = 1, P < 0.001).


Figure 5.2 Relationship between pre-search answer and post-search correctness


Table 5.2 Pre-search answer and confidence with percentages of post-search answers

Pre-search answer and confidence        Post-search correct    Post-search incorrect
Incorrect
  Very confident (n=9)                  0 (0%)                 9 (100%)
  Confident (n=23)                      10 (43%)               13 (57%)
  Somewhat confident (n=44)             22 (50%)               22 (50%)
  Not confident (n=24)                  10 (42%)               14 (58%)
  Don't know (n=166)                    104 (63%)              62 (37%)
Correct
  Not confident (n=9)                   8 (89%)                1 (11%)
  Somewhat confident (n=50)             35 (70%)               15 (30%)
  Confident (n=48)                      40 (83%)               8 (17%)
  Very confident (n=27)                 21 (78%)               6 (22%)

The confidence in anchoring effect investigation is described in Table 5.2, Table 5.3, Figure 5.3, Figure 5.4 and Figure 5.5. The frequencies of subjects' pre-search answers, pre-search confidence and post-search answers are described in Table 5.2, Table 5.3 and Figure 5.3.

Figure 5.4 illustrates the development of the serial anchor curve and Figure 5.5 reports the serial anchor curve. The x-axis in the serial anchor curve reports subjects’ level of confidence in their pre-search answers in ascending order: not confident, somewhat confident, confident and very confident; and the y-axis denotes the pre-search answer retention rate, which is the percentage of pre-search answers that were retained by subjects after searching. The hypothesis is that the more confident subjects are in their pre-search answers, the more likely they are to retain their pre-search answer after searching.

Chi-square analysis conducted on Table 5.3 shows that there was a marginally significant relationship between subjects' confidence in their pre-search answers and their retention of the pre-search answer after searching (χ2 = 7.70, df = 3, P = 0.053). However, the 95% confidence interval (95% CI) of relative risk for each confidence level of the pre-search answer either crosses one or is very near one, which means that there was no difference in the pre-search answer retention rate at any level of pre-search confidence (Table 5.3).

Table 5.3 Relationship between confidence in pre-search answer and retention of pre-search answer after searching

Pre-search confidence          Retained: Yes    Retained: No    Relative risk (95% CI)
Not confident (n=33)           22 (66.7%)       11 (33.3%)      0.96 (0.740 to 1.239)
Somewhat confident (n=94)      57 (60.6%)       37 (39.4%)      0.81 (0.669 to 0.977)
Confident (n=71)               53 (74.6%)       18 (25.4%)      1.12 (0.939 to 1.328)
Very confident (n=36)          30 (83.3%)       6 (16.7%)       1.25 (1.048 to 1.491)



Figure 5.3 Frequency of subjects with different confidence levels in the pre-search answer


Figure 5.4 Relationship between confidence in pre-search answer and the pre-search answer retention rate (separate graphs for correct and incorrect post-search answers)



Figure 5.5 Serial anchor curve: relationship between confidence in pre-search answer and retention of pre-search answer after searching (combined graph for correct and incorrect post-search answers)

5.2.2 Order

Order effects are covered by Table 5.4, Table 5.5, Figure 5.6, Figure 5.7, Figure 5.8 and Figure 5.9. In particular, the frequency of documents accessed at different positions during a search journey is described in Table 5.4, Table 5.5 and Figure 5.6. Figure 5.7 and Figure 5.8 illustrate the development of the serial position curve. The serial position curve is described in Figure 5.9, which shows a slight V-shape between the position at which a document was accessed and the concurrence rate. The shape of the curve qualitatively matches our expectations for order effects. However, the effect seems slight, and chi-square analysis conducted on Table 5.5 did not show a statistically significant relationship between documents accessed at different positions and the concurrence rate between subjects' post-search answers and the document-suggested answer (χ2 = 0.55, df = 2, P = 0.76). In addition, each 95% CI of relative risk for documents accessed at the first, middle and last positions crosses one, which means that there was no difference in the concurrence rate between subjects' post-search answers and the document-suggested answer regardless of whether documents were accessed at the first, middle or last position in the evidence journey (Table 5.5).


Table 5.4 Correct/incorrect document accessed at different positions and the corresponding percentage of a correct/incorrect post-search answer

Access      Correct documents                Incorrect documents                 All documents
position    Total    Correct after search    Total    Incorrect after search     Total    Concurrence [95% CI]
1           127      103 (81%)               106      70 (66%)                   233      173 (74%) [68 to 79%]
2           96       68 (71%)                88       54 (61%)                   184      122 (66%) [59 to 73%]
3           69       47 (68%)                66       45 (68%)                   135      92 (68%) [60 to 75%]
4           56       37 (66%)                48       30 (63%)                   104      67 (64%) [55 to 73%]
5           39       29 (74%)                31       21 (68%)                   70       50 (71%) [60 to 81%]
6           30       22 (73%)                18       17 (94%)                   48       39 (81%) [68 to 90%]
7           21       18 (85%)                25       16 (64%)                   46       34 (74%) [60 to 84%]
8           19       12 (63%)                18       14 (78%)                   37       26 (70%) [54 to 83%]
9           7        3 (43%)                 11       6 (55%)                    18       9 (50%) [29 to 71%]
10          9        5 (56%)                 8        6 (75%)                    17       11 (65%) [41 to 83%]
>10         24       18 (75%)                28       18 (64%)                   52       36 (69%) [56 to 80%]

Table 5.5 Relationship between document access position and concurrence between post-search answer and document-suggested answer

Access position     Concurrence: Yes    Concurrence: No    Relative risk (95% CI)
First (n=255)       192 (75.3%)         63 (24.7%)         1.03 (0.949 to 1.121)
Middle (n=470)      342 (72.8%)         128 (27.2%)        0.98 (0.906 to 1.055)
Last (n=219)        161 (73.5%)         58 (26.5%)         1.00 (0.912 to 1.093)

Note: The normalising strategy used in this table treats the single first and single last accessed documents as first and last; results produced from the alternative three-bin normalising strategy are not statistically significant either.



Figure 5.6 Frequency of subjects accessing documents at different positions


Figure 5.7 Relationship between document access position and concurrence rate between post-search answer and document-suggested answer (separate graphs for correct and incorrect document)



Figure 5.8 Relationship between document access position and concurrence rate between post-search answer and document-suggested answer (correct and incorrect documents combined)


Figure 5.9 Serial position curve: normalised view of the relationship between document access position and concurrence rate between post-search answer and document-suggested answer


5.2.3 Exposure

In the serial exposure curve for the exposure effect, the x-axis shows the amount of time subjects were exposed to a document, normalised to the evidence journey, where least refers to the document on which the least time was spent in a subject's journey, most to the document on which the most time was spent, and medium to all other documents. The y-axis describes the concurrence rate between subjects' post-search answers and the document-suggested answer.

Exposure effect is covered by Table 5.6, Table 5.7, Figure 5.10, Figure 5.11, Figure 5.12 and Figure 5.13. The frequency of documents exposed for different lengths of time in a search journey is described in Table 5.6, Table 5.7 and Figure 5.10. Figure 5.11 and Figure 5.12 illustrate the development of the serial exposure curve, which is described in Figure 5.13. Chi-square analysis conducted on Table 5.7 did not show a statistically significant relationship between document exposure duration and the concurrence rate between subjects' post-search answers and the document-suggested answer (χ2 = 2.61, df = 2, P = 0.27). In addition, each 95% CI of relative risk for documents with the least, medium and most exposure crosses one, which means that there was no difference in the concurrence rate between subjects' post-search answers and the document-suggested answer regardless of whether documents received the least, medium or most exposure (Table 5.7).

Table 5.6 Correct/incorrect document accessed with different levels of exposure and the corresponding percentage of a correct/incorrect post-search answer

Time spent on          Correct documents                Incorrect documents                 All documents
document (minutes)     Total    Correct after search    Total    Incorrect after search     Total    Concurrence [95% CI]
<1                     99       69 (70%)                110      74 (67%)                   209      143 (68%) [62 to 74%]
1–2                    115      80 (70%)                119      71 (60%)                   234      151 (65%) [58 to 70%]
2–3                    49       37 (76%)                67       49 (73%)                   116      86 (74%) [65 to 81%]
3–4                    27       18 (67%)                24       19 (79%)                   51       37 (73%) [59 to 83%]
4–5                    16       12 (75%)                10       9 (90%)                    26       21 (81%) [62 to 91%]
>5                     27       19 (70%)                17       7 (41%)                    44       26 (59%) [44 to 72%]


Table 5.7 Relationship between document exposure level and concurrence between post-search answer and document-suggested answer

Level of exposure     Concurrence: Yes    Concurrence: No    Relative risk (95% CI)
Least (n=88)          58 (65.9%)          30 (34.1%)         0.91 (0.780 to 1.069)
Medium (n=417)        307 (73.6%)         110 (26.4%)        1.07 (0.976 to 1.174)
Most (n=245)          171 (69.8%)         74 (30.2%)         0.97 (0.875 to 1.066)

Note: The normalising strategy used in this table treats the single least and single most exposed documents as least and most; results produced from the alternative three-bin normalising strategy are not statistically significant either.


Figure 5.10 Frequency of subjects being exposed to documents at different levels



Figure 5.11 Relationship between document exposure level and concurrence between post-search answer and document-suggested answer (separate graphs for correct and incorrect document)


Figure 5.12 Relationship between document exposure level and concurrence between post-search answer and document-suggested answer (correct and incorrect documents combined)



Figure 5.13 Serial exposure curve: normalised view of the relationship between document exposure level and concurrence between post-search answer and document-suggested answer

5.2.4 Reinforcement

In the serial reinforcement curve for the reinforcement effect, the x-axis shows the access frequency of documents in normalised form, which is either once only or more than once; and the y-axis describes the concurrence rate between subjects' post-search answers and the document-suggested answer.

Reinforcement effect is covered by Table 5.8, Table 5.9, Figure 5.14, Figure 5.15, Figure 5.16 and Figure 5.17. Table 5.8, Table 5.9 and Figure 5.14 describe the distribution of documents accessed at different frequencies during a search journey. Figure 5.15 and Figure 5.16 illustrate the development of the serial reinforcement curve, which is described in Figure 5.17. Chi-square analysis conducted on Table 5.9 did not show a statistically significant relationship between document access frequency and the concurrence rate between subjects' post-search answers and the document-suggested answer (χ2 = 1.03, df = 1, P = 0.31). In addition, each 95% CI of relative risk for documents accessed once only or more than once crosses one, which means that there was no difference in the concurrence rate between subjects' post-search answers and the document-suggested answer regardless of whether documents were accessed once or more than once (Table 5.9).


Table 5.8 Documents accessed with different numbers of repetitions and the corresponding percentage of a correct/incorrect post-search answer

No. of           Correct documents                Incorrect documents                 All documents
repetitions      Total    Correct after search    Total    Incorrect after search     Total    Concurrence [95% CI]
0                387      283 (73%)               290      186 (64%)                  677      469 (69%) [66 to 73%]
1                41       32 (78%)                47       33 (70%)                   88       65 (74%) [64 to 82%]
2                8        4 (50%)                 20       15 (75%)                   28       19 (68%) [49 to 82%]

Table 5.9 Relationship between document access frequency and concurrence between post-search answer and document-suggested answer

Access frequency         Concurrence: Yes    Concurrence: No    Relative risk (95% CI)
Once only (n=441)        306 (69.4%)         135 (30.6%)        0.94 (0.841 to 1.053)
More than once (n=156)   115 (73.7%)         41 (26.3%)         1.06 (0.950 to 1.189)


Figure 5.14 Frequency of subjects accessing documents at different numbers of repetitions



Figure 5.15 Relationship between document access frequency and concurrence between post-search answer and document-suggested answer (separate graphs for correct and incorrect document)


Figure 5.16 Relationship between document access frequency and concurrence between post-search answer and document-suggested answer (correct and incorrect documents combined)



Figure 5.17 Serial reinforcement curve: normalised view of the relationship between document access frequency and concurrence between post-search answer and document-suggested answer

5.3 Discussion

There was a statistically significant relationship between subjects' pre-search answers and post-search answers (i.e. the anchoring effect). Subjects who were correct pre-search were more likely to answer correctly post-search than those who were incorrect pre-search; similarly, subjects who were incorrect pre-search were more likely to answer incorrectly post-search than those who were correct pre-search. There was a marginally significant relationship between subjects' confidence in their pre-search answers and their tendency to retain the pre-search answer after searching. However, chi-square analyses and relative risk calculations did not show statistically significant relationships between documents processed at different points in the evidence journey and the concurrence rate between subjects' post-search answers and the answer suggested by the document processed at that point. It may be that each of the order, exposure and reinforcement effects has a small effect on decision outcome, or that the sample size was too small to detect a statistically significant difference.

Nevertheless, the shape of each serial curve is not flat: a positive linear relationship for confidence in anchors, a slight V-shape for order effects, an inverted V-shape for exposure effect and an uneven column graph for reinforcement effect. These shapes suggest that different confidence levels in anchors, and documents processed under different conditions, might possibly be found to influence the post-search answer significantly if larger samples of searches were available for analysis.

There are several limitations in using serial curves to study cognitive biases during information searching:

• Assumptions in chi-square analysis: For the independence assumptions to hold in the chi-square analysis, two assumptions about the way people process information are made. First, it is assumed that subjects' ability to answer one question does not influence their ability to answer another question (Elstein et al., 1978, p. x). Second, it is assumed that subjects' interpretation of a document did not influence their interpretation of another document. However, just as interpreting a sequence of medical tests may fail a test of independence, so one document may alter the impact of a subsequent one in a search (Hunink et al., 2001, pp. 198–202). Methods are available for dealing with such effects (Hunink et al., 2001, pp. 202–204; Cooper, 1991).

• Small subject numbers and limited number of documents: There were only 75 subjects and a limited pool of influential documents in this preliminary data analysis; documents that were used by only a few subjects were not included (only 88 of 725 retrieved documents were sufficiently frequently accessed to be included in this analysis). With larger subject numbers, more of the discarded documents may have been included, to further elucidate whether subjects experience cognitive biases while searching and processing information to make a decision.

5.4 Conclusion

This preliminary data analysis concludes that there was a statistically significant relationship between subjects' pre-search answers and their post-search answers (i.e. anchoring effect), and that there was a marginally significant relationship between subjects' pre-search confidence and their tendency to retain the pre-search answer after searching (i.e. confidence in anchoring effect). However, there were no statistically significant differences between subjects' post-search answers and the documents processed at different access positions, or at different exposure or reinforcement levels. Nevertheless, there are reasons to continue with this investigation. Firstly, this data analysis is post-hoc, i.e. it is not an experiment specifically designed to test for the effects of cognitive biases on information searching and decision making. Secondly, it is possible that each of these biases exerts only a small effect on the decision outcome and that the sample size in this study was too small to detect the influence of these cognitive biases. Thirdly, the qualitative analysis reported in Chapter 3 remains suggestive that the way documents are accessed and processed influences the way people make decisions. Lastly, the literature review reported in Chapter 4 demonstrates that there is strong evidence for these biases in decision making.

Further experimentation and analyses are reported in later chapters to investigate the relationship between cognitive biases, information searching and decision making.

6 Hypotheses

This chapter outlines and formalises the three hypotheses in this research. Although the preliminary post-hoc analysis of the first dataset was inconclusive, it has assisted in developing a testable set of hypotheses about the impact of cognitive biases on post-search decision outcomes. The first hypothesis is that people experience cognitive biases during information searching. The second hypothesis is that information searching can be “debiased”. The third hypothesis is that attempts to debias information searching improve decision making. The biases investigated in this research are: anchoring effect, order effects, exposure effect and reinforcement effect. The fundamental assumption behind each of these hypotheses is that information searching does influence decision making.

6.1 Cognitive biases during information searching

This section describes the first and second hypotheses. Specific descriptive terms are used throughout this section to outline the hypotheses: H0 refers to the null hypothesis and H1 to the alternative hypothesis. The term “most probable answer” refers to the answer that the majority of users provide to a question after having accessed a given document. In serial curves, the y-axis “Concurrence rate” is the percentage concurrence between the answer provided by users and the most probable answer suggested by the document.
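As an illustration of how these quantities could be derived from logged search sessions, the following sketch (Python) computes a document’s most probable answer and the corresponding concurrence rate. The field names and records are hypothetical and are not taken from the thesis dataset.

# Hedged sketch: the "most probable answer" for a document and the concurrence
# rate between users' post-search answers and that answer.
# Records are hypothetical illustrations only.
from collections import Counter

# Each record: (document_id, post_search_answer) for one user who accessed the document.
log = [
    ("doc_12", "yes"), ("doc_12", "yes"), ("doc_12", "no"),
    ("doc_12", "yes"), ("doc_12", "don't know"),
]

answers = [answer for doc, answer in log if doc == "doc_12"]
most_probable_answer, _ = Counter(answers).most_common(1)[0]

concurrence_rate = 100.0 * sum(a == most_probable_answer for a in answers) / len(answers)
print(most_probable_answer, round(concurrence_rate, 1))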

6.1.1 Hypothesis for anchoring effect

The anchoring effect refers to the phenomenon where one’s prior belief exerts a persistent influence on the way information is processed and subsequently affects the way beliefs are formed (Wickens & Hollands, 2000). Figure 6.1 describes the null and the alternative hypotheses for the relationship between subjects’ pre-search answers and their post-search answers. The null hypothesis describes a situation where people are equally likely to answer correctly post-search regardless of their pre-search answers. The alternative hypothesis describes the scenario where people who are right pre-search are more likely to be right post-search than subjects who are wrong pre-search.

Figure 6.1 Anchoring effect: the null hypothesis (H0) and the alternative hypothesis (H1) [figure: percentage right after search by pre-search answer (right vs wrong), under H0 (no anchoring effect) and H1 (anchoring effect)]

To express H0 and H1 for anchoring effect in equation form, let PpreSearch_X = the proportion of subjects with pre-search answer X who answered the question correctly post-search, where X is either correct or incorrect:

H0 (no anchoring effect): PpreSearch_correct = PpreSearch_incorrect, i.e. subjects who answer correctly pre-search are equally likely to answer correctly post-search as those who answer incorrectly pre-search;

H1 (anchoring effect): PpreSearch_correct > PpreSearch_incorrect, i.e. subjects who answer correctly pre-search are more likely to answer correctly post-search than those who answer incorrectly pre-search.

Figure 6.2 illustrates the null and the alternative hypotheses for the relationship between subjects’ confidence in their pre-search answers and their tendency to retain the pre-search answer after searching. The null hypothesis (H0) is represented by a flat linear relationship; it describes the scenario where subjects’ tendency to change their answer after searching is independent of their confidence in the pre-search answer. The alternative hypothesis (H1) is represented by a monotonically increasing relationship; it describes the scenario where the more confident subjects are in their pre-search answer, the more likely it is that they retain the pre-search answer to a given question after searching.

Figure 6.2 Confidence in anchoring effect: the null hypothesis (H0) and the alternative hypothesis (H1) [figure: pre-search answer retention rate (%) by confidence in pre-search answer (not confident, somewhat confident, confident, very confident), under H0 and H1]

To express H0 and H1 for confidence in anchoring effect in equation form, let PpreSearchConfidence_X = the proportion of subjects with pre-search confidence X who retained their pre-search answer after searching, where X ranges from not confident, somewhat confident and confident to very confident:

H0 (no confidence in anchoring effect): PpreSearchConfidence_notConf = PpreSearchConfidence_somewhatConf = PpreSearchConfidence_conf = PpreSearchConfidence_veryConf, i.e. the proportion of subjects who retain their pre-search answer after searching is independent of their confidence in the pre-search answer;

H1 (confidence in anchoring effect): PpreSearchConfidence_notConf < PpreSearchConfidence_somewhatConf < PpreSearchConfidence_conf < PpreSearchConfidence_veryConf, i.e. the proportion of subjects who retain their pre-search answer after searching depends on their confidence in the pre-search answer: the more confident a person is in the pre-search answer, the more likely the person is to retain the pre-search answer after searching.

6.1.2 Hypothesis for order effects

Order effects refer to the phenomenon where the temporal order in which information is presented affects the final judgement of an event (Wang et al., 2000). More specifically, this research investigates whether people experience primacy and recency effects during information searching. In this research, primacy refers to the phenomenon where individuals’ impressions are based more on information presented earlier in the sequence than later (first reported by Lund, 1925); and the recency effect to the situation where impressions are based more on later information (Luchins, 1942; Cromwell, 1950; Hovland, 1957).

Figure 6.3 describes the null and the alternative hypotheses for order effects in the evidence journey. The graph describes the relationship between the access position of a document and the concurrence rate between subjects’ post-search answer and the answer suggested by the document. The null hypothesis (H0) describes the scenario where the concurrence between subjects’ post-search answer and the answer suggested by a document is not influenced by the position at which the document was accessed. The alternative hypothesis (H1) describes the scenario where there is greater concurrence between subjects’ post-search answer and the answer suggested by documents accessed at the first and last positions.

Figure 6.3 Order effects: the null hypothesis (H0) and the alternative hypothesis (H1) [figure: concurrence rate (%) by access position (first, middle, last), under H0 (no order effects) and H1 (order effects)]

To express H0 and H1 for order effects in equation form, let PX = the proportion of subjects whose post-search answer concurs with the most probable answer suggested by a document accessed at position X (where X can be first, middle or last, reflecting the position at which a document was accessed in an evidence journey):

H0 (no order effects): Pfirst = Pmiddle = Plast, i.e. the concurrence rate between post-search answers and document-suggested answers is uniform across access positions in the evidence journey;

H1 (primacy effect): Pfirst > Pnon-first, i.e. the concurrence rate between post-search answers and document-suggested answers is greater for documents accessed at the first position than for documents accessed at other positions;

H1 (recency effect): Plast > Pnon-last, i.e. the concurrence rate between post-search answers and document-suggested answers is greater for documents accessed at the last position than for documents accessed at other positions.

6.1.3 Hypothesis for exposure effect

Exposure effect refers to the phenomenon where the level of exposure to information affects the final judgement of an event. The null and alternative hypotheses for exposure effect are described in Figure 6.4. The graph describes the relationship between the level of document exposure and the concurrence rate between subjects’ post-search answer and the answer suggested by the document. The null hypothesis (H0) describes the scenario where the concurrence between subjects’ post-search answer and the answer suggested by a document is not influenced by the amount of time spent on the document. The alternative hypothesis (H1) describes the scenario where the concurrence between subjects’ post-search answer and the answer suggested by the document depends on the amount of time spent on the document. At this stage, there is uncertainty about the shape of the relationship, but it is believed to be curvilinear, with documents accessed at different levels of exposure having different impacts on the post-search answer.

Figure 6.4 Exposure effect: the null hypothesis (H0) and the alternative hypothesis (H1) [figure: concurrence rate (%) by time spent on document (least, medium, most), under H0 (no exposure effect) and H1 (exposure effect)]

To express H0 and H1 for exposure effect in equation form, let PX = the proportion of subjects whose post-search answer concurs with the most probable answer suggested by a document accessed for X amount of time (where X can be least, medium or most, reflecting the amount of time subjects spend on a document in an evidence journey):

H0 (no exposure effect): Pleast = Pmedium = Pmost i.e. the concurrence rate between post-search answers and document-suggested answers is uniform across documents processed at different levels of exposure.

H1 (exposure effect): not all of Pleast, Pmedium and Pmost are equal, i.e. the concurrence rate between post-search answers and document-suggested answers differs across documents processed at different levels of exposure.

6.1.4 Hypothesis for reinforcement effect

Reinforcement effect refers to the phenomenon where the level of repeated exposure to information affects the final judgement of an event. The null and alternative hypotheses for reinforcement effect are described in Figure 6.5. The graph describes the relationship between the frequency of accessing a document and the concurrence rate between subjects’ post-search answer and the answer suggested by the document. The null hypothesis (H0) describes the scenario where the concurrence rate between subjects’ post-search answer and the answer suggested by a document is not influenced by the number of times the document was accessed. The alternative hypothesis (H1) describes the scenario where there is greater concurrence between subjects’ post-search answer and the answer suggested by documents accessed more than once.

Figure 6.5 Reinforcement effect: the null hypothesis (H0) and the alternative hypothesis (H1) [figure: concurrence rate (%) by number of visits (once only vs more than once), under H0 and H1]

To express H0 and H1 for reinforcement effect in equation form, let PX = the proportion of subjects whose post-search answer concurs with the most probable answer suggested by a document accessed X number of times (where X is either once only or more than once):

H0 (no reinforcement effect): PonceOnly = PmoreThanOnce, i.e. the concurrence rate between post-search answers and document-suggested answers is uniform across documents accessed with different frequencies;

H1 (reinforcement effect): PmoreThanOnce > PonceOnly, i.e. the concurrence rate between post-search answers and document-suggested answers is greater for documents accessed more than once than for documents accessed only once.
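To make the quantities in Sections 6.1.2–6.1.4 concrete, the following sketch (Python) shows one way concurrence rates could be grouped by access position, exposure level and number of visits. The field names and records are hypothetical illustrations; the thesis’s actual data structures are not reproduced here.

# Hedged sketch: grouping concurrence with the document-suggested answer by the
# conditions used in Sections 6.1.2-6.1.4. Records are hypothetical.
from statistics import mean

# Each record describes one document-access event within a search session.
events = [
    {"position": "first",  "exposure": "most",   "visits": 2, "concurs": True},
    {"position": "middle", "exposure": "least",  "visits": 1, "concurs": False},
    {"position": "last",   "exposure": "medium", "visits": 1, "concurs": True},
]

def concurrence_rate(events, key, value):
    # Percentage of matching events whose post-search answer concurs with the
    # document-suggested answer.
    matched = [e["concurs"] for e in events if e[key] == value]
    return 100.0 * mean(matched) if matched else float("nan")

p_first = concurrence_rate(events, "position", "first")     # order effects
p_most = concurrence_rate(events, "exposure", "most")        # exposure effect
repeat_visits = [e["concurs"] for e in events if e["visits"] > 1]
p_more_than_once = 100.0 * mean(repeat_visits) if repeat_visits else float("nan")  # reinforcement
print(p_first, p_most, p_more_than_once)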

6.2 Impact on information searching and decision making

This section develops the third hypothesis, which considers whether using an intervention (such as an alternative search engine user interface) intended to debias information searching can influence decision making. Four parameters are compared between baseline (i.e. no intervention) and each debiasing intervention: accuracy of the decision outcome, confidence in the decision made, efficiency in searching and subjects’ preference for a given search interface.


6.2.1 Impact on decision accuracy

H0: The distribution of pre- and post-search outcomes is the same between using baseline and using a debiasing intervention, i.e. the proportion of subjects who were right before and after searching (RR) is the same for the baseline and a debiasing intervention, and similarly for the subpopulations wrong-right (WR), wrong-wrong (WW) and right-wrong (RW):

RRbaseline = RRintervention

WRbaseline = WRintervention

WWbaseline = WWintervention

RWbaseline = RWintervention

H1: Using a debiasing intervention improves decision outcomes, i.e. the distribution of pre- and post-search outcomes is different between using baseline and using a debiasing intervention:

RRintervention > RRbaseline

WRintervention > WRbaseline

WWintervention < WWbaseline

RWintervention < RWbaseline
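As an illustration only, the following sketch (Python with NumPy and SciPy) shows how the distribution of RR/WR/WW/RW outcomes could be compared between baseline and a debiasing intervention using a chi-square test; the counts are hypothetical and the thesis does not prescribe this exact procedure.

# Hedged sketch: comparing pre-/post-search outcome distributions between
# baseline and a debiasing intervention. Counts are hypothetical.
import numpy as np
from scipy.stats import chi2_contingency

#                    RR   WR  WW  RW
counts = np.array([[ 90,  40, 50, 20],   # baseline
                   [100,  55, 35, 10]])  # intervention

chi2, p, dof, _ = chi2_contingency(counts)
print(f"chi2={chi2:.2f}, dof={dof}, p={p:.4f}")  # small p suggests the distributions differ (H1)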

6.2.2 Impact on decision confidence

H0: There is no difference in the confidence associated with post-search answers between using baseline and using a debiasing intervention.

H1: There is a difference in confidence associated with answers post-search between using baseline and using a debiasing intervention.

6.2.3 Impact on search attributes

(i) Time taken to answer a question

H0: Amount of time taken to answer a question is the same for baseline and a debiasing intervention.

H1: Amount of time taken to answer a question is shorter using a debiasing intervention than using baseline.


(ii) Number of searches

H0: Number of searches conducted to answer a question is the same for baseline and a debiasing intervention.

H1: Number of searches conducted to answer a question is lower using a debiasing intervention than using baseline.

(iii) Number of documents accessed

H0: Number of documents accessed to answer a question is the same for baseline and a debiasing intervention.

H1: Number of documents accessed to answer a question is lower using a debiasing intervention than using baseline.
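As an illustration, a one-sided non-parametric comparison of a search attribute between baseline and a debiasing intervention could be sketched as follows (Python with SciPy). The timings are hypothetical, and the thesis does not prescribe this particular test.

# Hedged sketch: comparing time taken to answer a question (in seconds) between
# baseline and a debiasing intervention. Values are hypothetical.
from scipy.stats import mannwhitneyu

time_baseline = [310, 420, 290, 515, 380]
time_intervention = [260, 300, 280, 350, 240]

# H1 predicts shorter times under the intervention, hence a one-sided test.
stat, p = mannwhitneyu(time_intervention, time_baseline, alternative="less")
print(f"U={stat:.1f}, p={p:.4f}")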

6.2.4 Impact on user preference for search engine

H0: The proportion of subjects who prefer using a debiasing intervention is the same as the proportion of subjects who prefer using baseline.

H1: The proportion of subjects who prefer using a debiasing intervention is greater than the proportion of subjects who prefer using baseline.

6.3 Summary

Overall, the research questions in this thesis are:

1. Do people experience cognitive biases while searching for information?

2. Can we debias information searching?

3. Would correcting for these biases improve decision making?

The hypotheses for these research questions are that (i) people experience cognitive biases during information searching; (ii) cognitive biases can be corrected during information searching; and (iii) correcting for biases during information searching improves decision making. The underlying assumption for these hypotheses, that information searching influences decision making, will be tested in Chapter 7.

Interventions designed to debias information searching are reported in Chapter 10. A study conducted with healthcare consumers to investigate whether people experience anchoring effect, order effects, exposure effect and reinforcement effect during information searching is described in Chapter 11. The comparisons between using the baseline search interface and a debiasing intervention are detailed in Chapters 13–14.

7 Modelling the impact of biases on search decision outcomes: Bayesian model

This chapter models the impact of the hypothesised biases on post-search decision outcomes. Given the preliminary analysis and the hypotheses developed in previous chapters, a more sophisticated model is constructed to illustrate how these biases interact in a complex and non-independent way to influence decision outcomes. Specifically, a Bayesian framework is developed to model the impact of online information retrieval upon decisions¹; and an augmented Bayesian model that incorporates well-known decision biases such as anchoring, primacy, recency, exposure and reinforcement effects is also developed to model how cognitive biases influence search decision outcomes.

¹ The idea of using a Bayesian framework to analyse search data was suggested by Professor Enrico Coiera.

7.1 Introduction

Human beings seldom follow a purely rational model and are prone to a series of decision biases (Kahneman et al., 1982). Bayes’ theorem seems well suited to modelling how beliefs change in response to exposure to a sequence of documents retrieved and viewed from a Web search. In the Bayesian view, an individual’s prior beliefs about the correctness of a decision would be modified by the evidential weight of documents that are retrieved to influence that decision. The Bayesian model can be described by analogy with its application in medical decision making: the likelihood of an individual having a disease is analogous to the likelihood of an individual providing a specific answer; the evidence for the presence of a disease in a patient provided by a diagnostic test is analogous to the evidence for a specific answer to a question provided by a document.
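To make the analogy concrete, a worked numerical example in the odds form of Bayes’ theorem (anticipating Section 7.2) is given below. All of the numbers – the prior probability, the document’s sensitivity and its specificity – are purely hypothetical illustrations.

\[
\text{posterior odds} = \text{prior odds} \times LR, \qquad
LR = \frac{\text{sensitivity}}{1 - \text{specificity}}
\]

\[
\text{prior odds} = \frac{0.6}{1 - 0.6} = 1.5, \quad
LR = \frac{0.8}{1 - 0.7} \approx 2.67, \quad
\text{posterior odds} = 1.5 \times 2.67 \approx 4.0
\;\Rightarrow\;
P(\text{correct} \mid \text{document}) = \frac{4.0}{1 + 4.0} = 0.8
\]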

However, this is not the first application of Bayes’ theorem to modelling user behaviour. In the information retrieval domain, Bayes’ theorem has traditionally been used to probabilistically model how beliefs change in response to a sequence of evidence, such as the impact of diagnostic test results on beliefs about the presence of disease (Hunink et al., 2001); it has also been used for text categorisation, document ranking and inference under uncertainty, such as calculating relevance or determining the semantic relationship between a document and a query (Crestani et al., 1998). In the study of human information searching behaviour, the process of searching has been modelled as a process of rational decision analysis, where the decision to continue to search is seen as a cost–benefit trade-off (Blackshaw and Fischhoff, 1988). Bayes’ theorem has also been used to predict when a user will stop searching, based on the accumulated number of positive and negative pieces of evidence read in a search journey (Kantor, 1987). In the field of decision making, Bayes’ theorem has typically been used to model a normative view of human belief revision. In fact, Wang et al. (2006) have noted that “normative theories, such as Bayes’ theorem, have no room for the order effects since it violates the fundamental law of commutativity” (Wang et al., 2006, p. 39). However, to our knowledge, Bayes’ theorem has yet to be applied to modelling the overall process of document retrieval and decision making.

7.2 Method

The Bayesian prediction models in this study were constructed from the dataset described in Chapter 3. There are three stages to developing a Bayesian model: Stage One incorporates the anchoring effect, Stage Two calculates the likelihood ratio for each document, and Stage Three integrates order, exposure and reinforcement decision biases into the model:

1. Establish the strength of each subject’s prior belief in a scenario answer

Five separate methods were used to model the strength of a subject’s prior belief in an answer, before searching commenced:

• A base prior was calculated for the four possible answers that a subject can provide to a question in the dataset: yes, no, conflicting evidence or don’t know. The baseline probability for getting the correct outcome is 0.25, and 0.75 for getting one of the other three (incorrect) outcomes.


Table 7.1 Population prior (prevalence of a correct answer amongst all subjects) for each scenario

Scenario      Population prior (n=75)
Asthma        0.77 (58)
Glue ear      0.45 (34)
SIDS          0.41 (31)
MI            0.21 (16)
IVF           0.17 (13)
Diabetes      0.07 (5)

• A population prior was calculated from the pre-search data for each scenario, providing the likelihood of any one subject being correct before searching, given the answers provided by all subjects (Table 7.1).

• A personal prior, modelling the anchoring effect, was calculated for each subject based upon his or her pre-search answer and confidence in the answer (Table 7.2).

• Another personal prior was calculated based upon each individual’s willingness to switch belief – a tendency to change one’s belief because of underestimation or overestimation of the likelihood of a belief being true – after searching (Table 7.3). The strength of the anchoring effect is estimated according to a subject’s confidence in the pre-search answer. There was an inverse relationship between the likelihood that a subject will switch belief and their confidence in the pre-search answer: the more confident subjects were in their pre-search answer, the less likely they were to change their answer after searching. Therefore, the strength of personal prior belief is estimated to be 1 minus the probability of switching an answer, i.e. 1 – P(switch), where P(switch) is the probability of a subject’s post-search answer not being the same as the pre-search answer.

• A willingness to switch belief after searching was calculated for each scenario in the same way, based on switch frequencies calculated for each individual scenario (Table 7.4).

Table 7.2 Personal prior belief calculated from pre-search answer and confidence in that answer

Pre-search answer and confidence       Correct after search (n)    Incorrect after search (n)
Incorrect
  Very confident (n=9)                 0 (0)                       1.00 (9)
  Confident (n=23)                     0.43 (10)                   0.57 (13)
  Somewhat confident (n=44)            0.50 (22)                   0.50 (22)
  Not confident (n=24)                 0.42 (10)                   0.58 (14)
Don’t know (n=166)                     0.63 (104)                  0.37 (62)
Correct
  Not confident (n=9)                  0.89 (8)                    0.11 (1)
  Somewhat confident (n=50)            0.70 (35)                   0.30 (15)
  Confident (n=48)                     0.83 (40)                   0.17 (8)
  Very confident (n=27)                0.78 (21)                   0.22 (6)

Table 7.3 Personal prior calculated based on willingness to switch belief, using data from all six scenarios (n=400 search sessions)

Confidence before search (n=400)       P(switch)*     Personal prior†
Not confident (n=37)                   0.41 (15)      0.59 (22)
Somewhat confident (n=104)             0.45 (47)      0.55 (57)
Confident (n=76)                       0.30 (23)      0.70 (53)
Very confident (n=41)                  0.27 (11)      0.73 (30)
NA‡ (n=142)                            –              –

* Subjects’ willingness to switch belief. † Calculated as 1 – P(switch). ‡ Pre-search answer ‘Don’t know’ or confidence in pre-search answer not given.

2. Calculate the likelihood that once read, a document influences opinion towards a specific answer

To ensure that personal and population measures were kept independent, document sensitivity and specificity measures were recalculated for each subject, excluding the subject’s data from those of the remaining population. Full details on the calculation of document sensitivity and specificity measures are outlined in Chapter 3.

3. Model each subject’s changes in belief through exposure to a document using Bayes’ theorem

For each clinician, Bayes’ theorem was used to calculate the probability that the answer would be correct given the documents read, P(Correct|Docs), and the probability that the answer would be incorrect, P(Incorrect|Docs). The predicted response was taken to be the answer associated with the higher of these two probabilities. Both sequential and non-sequential Bayesian models were used (Appendix A). The sequential approach updates subjects’ belief after each document in the sequence of access (Gorry et al., 1968), and the non-sequential approach updates belief once at the end of accessing all documents (Warner et al., 1961).

Table 7.4 Personal prior* for each scenario, calculated based on willingness to switch belief (n=400 search sessions)

Confidence before search†      Asthma           Glue ear         SIDS             MI               IVF              Diabetes
Not confident (n=37)           5; 0.60 (3)      4; 0.75 (3)      6; 0.83 (5)      5; 0.40 (2)      6; 0.33 (2)      11; 0.64 (7)
Somewhat confident (n=104)     17; 0.76 (13)    19; 0.47 (9)     24; 0.63 (15)    25; 0.52 (13)    13; 0.46 (6)     6; 0.17 (1)
Confident (n=76)               19; 0.79 (15)    11; 0.91 (10)    16; 0.69 (11)    15; 0.53 (8)     12; 0.50 (6)     3; 1.00 (3)
Very confident (n=41)          17; 0.76 (13)    9; 0.78 (7)      4; 0.25 (1)      5; 1.00 (5)      2; 0.50 (1)      4; 0.75 (3)

Each cell shows the total number of responses for that scenario and confidence level, followed by 1 – P(switch) (n).
* Calculated as 1 – P(switch). † In 142 responses, the pre-search answer was “Don’t know” or confidence in the pre-search answer was not given.

To test for the possible impact of decision biases, a set of augmented Bayesian models was developed, using the odds form of Bayes’ theorem (as shown in Equation 5; for a full explanation of the odds form of Bayes’ theorem, see Appendix A). The likelihood ratio of a document influencing a decision was raised to the power of a bias factor b, to account for the effect of cognitive biases on the predicted outcome, expressed as posterior odds (Chapman et al., 1996):

Posterior odds = Prior odds × (Document likelihood ratio)^b    (5)

When using the odds formulation, the predicted response for a subject is correct when the odds of being correct are greater than the odds of being incorrect, and incorrect otherwise. The augmented models attempted to capture the potential impact of each of the decision biases, both individually and cumulatively. The individual impact of a bias is the relative risk calculated in Chapter 5. The cumulative impact of multiple biases operating together was estimated by calculating a cumulative bias as the product of each individual bias (Equation 6).

b = ∏ (relative risk of individual bias)    (6)
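A minimal sketch of the augmented sequential update in Equations 5 and 6 is shown below (Python). The prior, the document likelihood ratios and the relative risks are hypothetical values; the construction of these quantities follows the surrounding text and Chapter 3 rather than any code from the thesis.

# Hedged sketch of the augmented odds-form Bayesian update (Equations 5 and 6).
# All numbers are hypothetical illustrations, not values from the thesis.
from math import prod

def augmented_posterior_odds(prior_prob, doc_likelihood_ratios, bias_relative_risks):
    # Sequentially update the prior odds by each document's likelihood ratio,
    # raised to a cumulative bias factor b (the product of individual relative risks).
    b = prod(bias_relative_risks)            # Equation 6
    odds = prior_prob / (1.0 - prior_prob)   # prior odds of being correct
    for lr in doc_likelihood_ratios:
        odds *= lr ** b                      # Equation 5, applied per document
    return odds

# Hypothetical example: personal prior 0.7, three documents, two biases in play.
odds = augmented_posterior_odds(0.7, [2.0, 0.8, 1.5], [1.2, 1.1])
predicted_correct = odds > 1.0               # correct iff odds of correct exceed odds of incorrect
print(round(odds, 2), predicted_correct)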

7.3 Results

Table 7.5 summarises the predictive accuracy of the augmented Bayesian models using different prior estimation methods and decision biases. The optimal Bayes prediction model has a 73.3% (95% CI: 68.71 to 77.35%) predictive accuracy and uses a personal prior that incorporates subjects’ willingness to switch belief after searching for each scenario and the combined impact of all decision biases; its predictive accuracy was statistically different from that of all other models.


Predictive accuracy increases as more information about the subjects is included in the estimate of personal prior belief in an answer. The baseline prior probability of 0.25 for a one-in-four decision choice has a slightly better than chance prediction accuracy of 52.8% (95% CI: 47.85 to 57.59%). This improves to 61.0% (95% CI: 56.14 to 65.65%) when using population priors as an estimate, and to 67.8% (95% CI: 63.02 to 72.14%) when incorporating an anchoring or “willingness to switch” effect. The accuracy improves to 70.4% (95% CI: 65.34 to 74.28%) when the personal prior is used and reaches 72.0% (95% CI: 67.41 to 76.17%) when the anchoring effect is calculated specifically for each test scenario, rather than for the whole dataset.

There is no difference between the sequential and non-sequential Bayes approaches. No statistically significant differences are found in models that incorporate order effects, exposure effect or reinforcement effect, individually or in combination, although the combined bias estimate produces the best predictive result.

7.4 Discussion

This chapter presents a model that predicts the decisions people will give to a question, based on the documents they access from an online evidence retrieval system, independent of information about document structure or specific content, with an accuracy of 73.3%.

The model is based on the Bayesian belief revision framework and incorporates decision biases for anchoring, order, exposure and reinforcement effects.

This study found a statistically significant improvement in the prediction accuracy when the anchoring effect was incorporated into the Bayesian model; however, no significant improvements were found when order, exposure and reinforcement effects were incorporated. One possible reason is that the number of documents available in the pool of influential documents was too small. With only 12% (88/725) of the documents accessed during the evidence journey being included in constructing the Bayesian model, the size of the document pool may have been too small to detect the impact of order, exposure and reinforcement effects on decision outcomes.

Table 7.5 Prediction accuracy (%) for each version of the Bayesian model with different decision biases and each form of prior belief (with 95% confidence interval)

Bayesian model*              Baseline prior         Population prior       Personal prior:         Willingness to switch       Willingness to switch
                                                                           anchoring effect        (all scenarios together)    (all scenarios separately)
Without biases               52.8% (211)            61.0% (244)            70.4% (280)             67.8% (271)                 72.0% (288)
                             (47.85 to 57.59%)      (56.14 to 65.65%)      (65.34 to 74.28%)       (63.02 to 72.14%)           (67.41 to 76.17%)
With order effects           52.5% (210)            61.0% (244)            70.1% (279)             67.8% (271)                 72.3% (289)
                             (47.61 to 57.35%)      (56.14 to 65.65%)      (65.08 to 74.05%)       (63.02 to 72.14%)           (67.67 to 76.41%)
With exposure effect         52.5% (210)            60.8% (243)            70.1% (279)             67.8% (271)                 72.8% (291)
                             (47.61 to 57.35%)      (55.88 to 65.41%)      (65.08 to 74.05%)       (63.02 to 72.14%)           (68.19 to 76.88%)
With reinforcement effect    52.3% (209)            61.0% (244)            70.6% (281)             67.8% (271)                 72.8% (291)
                             (47.36 to 57.10%)      (56.14 to 65.65%)      (65.59 to 74.52%)       (63.02 to 72.14%)           (68.19 to 76.88%)
With all biases              52.5% (210)            60.8% (243)            70.1% (279)             67.8% (271)                 73.3% (293)
                             (47.61 to 57.35%)      (55.88 to 65.41%)      (65.08 to 74.05%)       (63.02 to 72.14%)           (68.71 to 77.35%)

* n=400 for all models.


The results in this study are generally in keeping with other studies that report on the impact of document searching on a user’s ability to answer a question. Hersh et al. (2002) identified factors related to the successful answering of clinical questions when using an information retrieval system. These include the correctness of the answer before searching, user experience with the search system, the question type and the user’s spatial visualisation score. They also highlighted the importance of the user’s search ability and the ability to use the resulting information. Eisenberg and Barry (1988) and Purgailis Parker and Johnson (1990) found that judgements of document relevance were influenced by the order of document presentation. Also, Huang and Wang (2004) showed that there is a relationship between users’ judgement of document relevance, the number of documents being judged and order effects. In addition, Florance and Marchionini (1995) identified that interaction effects between articles affected one’s judgement of the applicability of a document to a question.

Other studies have used sensitivity and specificity measures in different contexts. For example, Nie (1989) used exhaustivity and specificity to evaluate the relevance of a document to a query; and Haynes et al. (2004) measured the sensitivity and specificity of unique search terms to retrieve medical documents.

7.4.1 Limitations of this study

The methods used to establish document sensitivity and specificity were potential limitations of this study:

• The size of the pool of influential documents was limited: Only 88 of 725 retrieved documents were included in the pool used to build the Bayesian models. Hence, many documents that were used by some subjects were not included. With larger subject numbers, more of the discarded documents may have been used for model building.

• Documents not frequently accessed may be influential: Infrequently accessed documents may actually be influential, but they were not used to calculate the influential impact on an individual because one could not establish reliable sensitivity and specificity measures for these documents.

• The assumption that subjects read a document was based on their accessing the document: Subjects may not have read documents they selected, or may have only partially read them, modifying the likelihood that the document influenced them.


• Document effects may not be independent: Just as a sequence of medical tests may fail a test of independence, so one document may alter the impact of a subsequent one in a search (Hunink et al., 2001, pp. 198–202), and methods are available for dealing with such effects (Hunink et al., 2001, pp. 202–204; Cooper, 1991).

Our method does not rely on specific information about a document’s structure or content, and so it is potentially general-purpose and robust. This does not mean that the actual process by which an individual makes a decision is not influenced by document content or structure, but rather that our probabilistic model summarises the effects of all of these possible attributes in the measures of document sensitivity and specificity. The Bayesian model does, however, underline that an individual’s prior beliefs significantly influence the answer they will provide to a question, irrespective of the different documents they might be exposed to.

Finally, it is interesting to note that the best models were generated when they were fitted to individual scenarios, rather than the total scenario dataset, using the willingness to switch estimate. This is perhaps unsurprising given that expertise is likely to be case-based, and performance on one question is unlikely to be a good predictor of performance on others. This has led to what was initially a quite controversial realisation: that knowledge of content is more critical than mastery of a generic problem-solving process (Elstein et al., 1978, p. x).

7.5 Conclusion

In this chapter, the assumption that a person’s decision is predictable based upon the documents accessed during information searching has been tested. It has been shown that a Bayesian model can be used to describe how beliefs change in response to exposure to a sequence of documents retrieved and viewed from an online search. Whilst we were unable to demonstrate the expected statistically significant effect on decisions from order, exposure and reinforcement effects, the anchoring effect was found to have a statistically significant impact on decision outcomes. Larger experiments conducted in later chapters will test whether people experience cognitive biases while searching for information, and investigate whether it is possible to correct for biases using an intervention on the user interface of an online evidence retrieval system.

Part IV: Can we optimise information searching to make better decisions?

8 Debiasing strategies: a literature review

Given that there are biases that appear to affect how people view documents, is it possible to design a search environment which integrates debiasing strategies into the user interface of a search system, without compromising the way people normally conduct their searches? The literature on debiasing strategies (reported in this chapter) and the literature on search user interface design (reported in Chapter 9) are examined to guide the design of a search environment for debiasing information searching. A rationale for the design and experimental results are reported in Chapters 10–14.

8.1 Overview of debiasing strategies

Debiasing is a form of intervention that assists people to eliminate or reduce the impact of cognitive biases on their decisions and to focus their awareness directly on understanding the sources of their cognitive limitations (Wickens & Hollands, 2000). Several approaches have been suggested to classify debiasing methods. The classic review conducted by Fischhoff (1982) categorises the need for debiasing according to three situations: faulty judges, faulty tasks and mismatch between judge and task. Faulty judges refer to people who misapply their cognitive skills as well as people who do not have the required skills or knowledge to perform the task. Faulty tasks refer to tasks that are incompetently designed such that people are more likely to make mistakes or misunderstand what is required of them. Mismatch between judge and task refers to the situation where the task is structured in a way that does not allow people to use their cognitive skills to the best of their ability.

Many reviews of debiasing strategies have been conducted. Turk and Salovey (1985, 1985a) examined clinicians’ cognitive structures and processes and proposed modifications for debiasing their behaviours. Evans (1989) proposed four debiasing approaches: replacing human intuition with a formal procedure, educating and training to improve reasoning ability, improving the design of the human environment, and developing interactive decision aids. Keren (1990) designed a taxonomy for debiasing and classified cognitive aids into procedural or structure-modifying methods. Arkes (1991) proposed a taxonomy for judgement errors and hypothesised the variables for debiasing each group of errors. Bazerman (1994) proposed a three-step procedure – unfreeze, change and refreeze – to debias judgement.

The purpose of this chapter is to identify attributes of a successful debiasing strategy that can be implemented on a search user interface. Debiasing strategies are classified into two sections, namely interventions implemented with or without the use of information technology. Interventions are then further classified according to an adapted version of Fischhoff’s taxonomy: user-related, task-related and information-related. A summary of the papers reviewed about debiasing interventions is shown in Table 8.1.

8.2 Debiasing strategies (without using information technology)

8.2.1 User-related

Thinking strategies, i.e. ways that users can practise modifying their ways of thinking, have been proposed by psychologists as a means for individuals to change their thinking patterns and counter biases. In particular for clinicians, Croskerry (2002) proposed eleven debiasing strategies from the psychology literature to overcome cognitive dispositions to respond that arise from failures in perception or from cognitive biases caused by using heuristics. These techniques include: developing insight/awareness, considering alternatives, decreasing reliance on memory, providing specific training, making the task easier, minimising time pressures, promoting accountability and feedback, adopting a metacognitive approach and using cognitive forcing strategies in clinical decision making. These techniques are based on a metacognitive approach to problem solving, which involves “stepping back from the immediate problem to examine and reflect on the thinking process” (Croskerry, 2003a, p. 779). An example of a cognitive forcing strategy for clinicians when reading radiographs is to continue the search for other findings even when an abnormality has already been found (Croskerry, 2003). Although a metacognitive approach to debiasing is an insightful idea, the effectiveness of these strategies needs to be evaluated in studies and real-life clinical settings (Graber, 2003).

Table 8.1 Summary of reviewed debiasing interventions

Study (author, year) | Bias targeted | Debiasing strategy | Number of subjects and task | Debias successful? | Result summary

Biases resulting from the anchoring and adjustment heuristic
Lim et al., 2000 | First impression bias | Multimedia presentation | 80 university students (appraise the performance of a department head) | Yes | Amongst subjects who received multimedia presentations, there was no significant difference in the appraisal score between groups with and without a biased cue (P = 0.13)
George et al., 2000 | Anchoring and adjustment bias | Warning screens when subjects’ estimates were within a certain range of their anchors | 131 university students (estimate the price value of a house) | No | No statistically significant difference in the house value estimates between warned and un-warned subjects (P > 0.05)

Biases resulting from the representative heuristic
Roy & Lerch, 1996 | Base-rate fallacy | Information displayed in graph format and problem presented in a causal order that matches the solution | Experiment one: 175 undergraduate students; experiment two: 451 undergraduate students (calculate the probabilities of a base-rate problem) | Yes | Presenting narrative textual information in a graph or a table reduces base-rate fallacy (control group: 66%, experimental group: 39%) and improves response accuracy (control group: 1%, experimental group: 39%) (G2 = 22.1, df = 1, P < 0.001)

Biases resulting from the availability heuristic
Chandra & Krovi, 1999 | Recency effect | Distracter screen | NA* | NA | NA
Ashton & Kennedy, 2002 | Recency effect | Self-review | 135 auditors (provide audit assessment) | Yes | Debiasing occurs at the point of the self-review (F = 5.37, P = 0.03)
Marett & Adams, 2006 | Familiarity bias | DSS that provides different amounts of information | 196 undergraduate students (rank the top 25 baseball teams) | Yes | The group with a moderate amount of information were the most accurate rankers and least susceptible to familiarity bias

Other heuristics or biases
Klayman & Brown, 1993 | No specific bias | Different information structures: independent format vs. contrastive format | 48 university students (learn to diagnose two fictitious diseases) | Yes | Subjects in the two conditions formed different disease concepts and the contrastive group formed diagnoses closer to the judgements expected from the presented information
Rai et al., 1994 | No specific bias | Recommend guidance features at different phases of the decision making process in an executive system | NA | NA | NA
Hirt & Markman, 1995 | No specific bias | Consider any alternative outcome for an event, not necessarily the opposite outcome | NA | NA | NA
Weinstein & Klein, 1995 | Optimistic bias | Focus attention on risk factors related to the matter | NA | No | Optimistic biases are exaggerated rather than reduced
Shore, 1996 | No specific bias | Guidelines for the software development cycle to prevent introducing biases into the system | NA | NA | NA
Cohen et al., 1997 | Overconfidence bias | Consider reasons for why forecasts may fail | NA | NA | NA
Croskerry, 2003, 2003a | Cognitive dispositions to respond (no specific bias) | Thinking strategies (e.g. cognitive forcing strategies) | NA | NA | NA

* Not applicable.


Studies have also demonstrated that considering alternative outcomes can reduce biases when processing information. For example, Cohen et al. (1997) found that weather forecasters’ overconfidence bias is reduced when they are asked to consider reasons why their forecasts may fail. In addition, Hirt and Markman (1995) found that judgements can be debiased by asking subjects to consider any alternative outcome for an event, not necessarily the opposite outcome. However, explicitly asking users to consider alternative outcomes may introduce the mere-measurement effect. The mere-measurement effect is the phenomenon where the act of asking a question can lead to a biased response, which in turn leads to a change in subsequent behaviour (Sherman, 1980); the effect has been found to occur automatically rather than through the user deliberately exercising it (Fitzsimons & Williams, 2000). In fact, studies conducted by Weinstein and Klein (1995) show that subjects’ optimistic biases are exaggerated rather than reduced when they are merely asked to focus their attention on risk factors related to the matter. Mere-measurement effects have also been demonstrated to increase both healthy and unhealthy personal behaviours. The study conducted by Williams et al. (2006) found that asking subjects about their intention to exercise increased their exercise rates, and asking about their intention to use illegal drugs actually increased their use of those drugs. Hence, debiasing strategies that require asking subjects questions about their intentions should be used with caution, because they may introduce the mere-measurement effect and unintentionally alter subjects’ behaviour.

8.2.2 Task-related

Ashton and Kennedy (2002) conducted a study with 135 accounting auditors and found that the use of a self-review can successfully debias the recency effect in evaluating the performance of a company. Subjects were given 12 pieces of information about a company (in contrary/mitigating or mitigating/contrary order) and asked to assess the probability that the company would continue to operate for the coming year. They were divided into three groups: the Step-by-Step (SbS) group revised the assessment probability after each piece of information; the End-of-Sequence (EoS) group made the probability assessment after all twelve pieces of information were reviewed; and the Debiaser group followed the protocol of the SbS group, i.e. revised the assessment probability after each piece of information, but then completed a self-review which explicitly asked them to review the contrary and mitigating factors and gave them another opportunity (a 13th assessment) to revise the final probability assessment. They confirmed that debiasing for the Debiaser group occurred at the point of the self-review by showing that order interacts with the timing of the judgement between the 12th and the 13th probability judgements (F = 5.37, P = 0.03), and suggested that the self-review technique can successfully debias the recency effect.

8.2.3 Information-related

Klayman and Brown (1993) demonstrated that debiasing the environment, i.e. altering the structure of information without attempting to modify people’s judgement processes, alters the way people develop concepts and make decisions. Forty-eight university students were recruited to learn to distinguish two fictitious diseases. One group was given information in an “independent” format, where subjects learned about each disease separately (see Figure 8.1); another group was given information in a “contrastive” format, where information about the two diseases was juxtaposed to highlight distinctive features (see Figure 8.2). Subjects in the two conditions formed different disease concepts, and the “contrastive” group formed diagnoses that were closer to the judgements expected from the presented information. The authors suggested that interventions that modify the environment may provide an alternative approach to debiasing when it is difficult to modify people’s processes. In a study conducted by Elting et al. (1999), clinicians had to decide whether to stop a clinical trial based on reading data presented in four different displays. The study found that those who used icon displays and tables made the correct decision more frequently than those who used pie charts or bar graphs. Overall, presenting information in different formats has been shown to have a significant impact on the accuracy of decision making.


Identification: Zymosis is an acute disease of short duration and varying severity. Onset is sudden and characterized by fever, prostration (physical weakness) and anorexia (loss of appetite). Anorexia is commonly associated with diarrhea, although the underlying cause is sometimes nausea. The degree of prostration is more likely to be mild than severe. Zymosis patients experience difficulty in walking; the cause is usually pain in the lower extremities, especially the feet. Sometimes, though, the patient’s difficulty in walking is attributed to a “pins-and-needles” or tingling sensation in the legs and feet. Fever is constantly elevated for 5 to 7 days, and is very likely to be low-grade (i.e., temperatures in the range of 99–101°F). Patients normally develop a dry cough, although they occasionally experience a harsh, barking cough instead. More often than not, the tongue is cracked and fissured. As the disease progresses, a rash appears on the chest and abdomen and then spreads to the body generally. This rash commonly has a scaly consistency, although it sometimes can be moist and “weepy”. Occurrence: Worldwide and relatively frequent. More common in males than females, and somewhat more common in children than adults. There are pronounced racial differences, with White people being much more susceptible than Black or Brown people, for reasons that are not clear. Mode of transmission: From person to person by direct contact. Highly contagious. Incubation period: Highly variable and difficult to ascertain; usually 7 to 21 days. Susceptibility and resistance: Susceptibility is universal among those not …

Figure 8.1 Information in independent format (Klayman & Brown, 1993) Appendix A: summary chapter from the independent training condition.

Identification: Proxititis and zymosis are both acute disease characterized by sudden onset of fever, cough, prostration (physical weakness) and anorexia (loss of appetite), and later development of a rash. Illness is of short duration, with constantly elevated temperature for about 5 to 7 days. The two diseases have similar symptomatology. With zymosis, anorexia is commonly associated with diarrhea, although the underlying cause is sometimes nausea. The degree of prostration is more likely to be mild than severe. Patients experience difficulty in walking; the cause is usually pain in the lower extremities, especially the feet. Sometimes, though, the patient’s difficulty in walking is attributed to a “pins-and-needles” or tingling sensation in the legs and feet. Fever is very likely to be low-grade (i.e., temperatures in the range of 99–101°F). Zymosis patients normally develop a dry cough, although they occasionally experience a harsh, barking cough instead. More often than not, the tongue is cracked and fissured. As the disease progresses, a rash appears on the chest and abdomen and then spreads to the body generally. This rash commonly has a scaly consistency, although it sometimes can be moist and “weepy”. With proxititis, fever is more often low-grade than high. Patients are somewhat more likely to have a dry cough than a harsh, barking one. The tongue is usually cracked and fissured but occasionally will have a smooth appearance. Anorexia is commonly due to nausea. Prostration is normally mild. although proxititis patients occasionally will experience more severe physical weakness. In addition, most patients experience difficulty in walking due to a tingling sensation in the lower extremities, particularly the feet. Later, a rash appears. The rash is usually moist and “weepy”, although proxititis patients sometimes have a scaly rash. Occurrence: Both diseases occur worldwide and are relatively frequent. Zymosis is more common in males than females, and somewhat more common in children than adults. There are pronounced racial differences, with White people being much more susceptible to zymosis than Black or Brown people, for reasons that are not clear. Proxititis is more common in females than in males. Racial differences are not pronounced, but Whites are somewhat more susceptible than Black and Brown people. Proxititis is normally a disease of children and adolescents, but it is sometimes seen in adults.

Figure 8.2 Information in contrastive format (Klayman & Brown, 1993) Appendix B: summary chapter from the contrastive training condition.


Roy and Lerch (1996) conducted two experiments with 175 and 451 undergraduate students and found that presenting information in a graphical format that is appropriate for the mental representation of a task reduces the frequency of the base-rate fallacy and improves the accuracy of responses (see Figure 8.3). The base-rate fallacy describes the phenomenon where “people neglect base rates where prior beliefs in a hypothesis should be taken into account when new evidence is obtained” (Roy & Lerch, 1996, p. 233). The experimental results showed that when the problem is expressed in a causal order that does not fit the causal order of the solution, subjects experienced difficulty in solving the problem (see Figure 8.4 and Figure 8.5 for the problem expressed in causal and non-causal order). In addition, they found that assisting people to construct a more appropriate representation of the problem, such as presenting narrative textual information in a graph or a table, reduces the base-rate fallacy (control group: 66%, experimental group: 39%) and improves response accuracy (control group: 1%, experimental group: 39%) (ΔG2 = 22.1, df = 1, P < 0.001). In the end, they suggested that the design of a decision support system that reduces the incidence of the base-rate fallacy and other judgement biases should encourage its users to retrieve information only from presentation formats that are less prone to produce judgement biases; one example of such a format is to present information visually, in the absence of irrelevant cues and inappropriate causal order.

Figure 8.3 Information expressed in graphical representation (Roy & Lerch, 1996)

Figure 8.4 Problem expressed in causal order (Roy & Lerch, 1996)

Figure 8.5 Problem expressed in non-causal order (Roy & Lerch, 1996) [figure: the cab problem restated as a table of blue and green cabs, with and without intercom]

8.3 Debiasing interventions (using information technology)

8.3.1 Task-related

George et al. (2000) designed a decision support system (DSS) that used warning screens to reduce the effects of the anchoring and adjustment bias in appraising house values, and concluded that the intervention did not mitigate the bias. One hundred and thirty-one university students were recruited to use the decision support system to estimate price values for houses. Subjects in the experimental group received the owner’s asking price as an anchor and subjects in the control group did not receive an anchor. To mitigate the anchoring and adjustment bias, subjects given anchors received warnings during the experiment whenever any of their estimates fell within a certain range of their anchors. Subjects who were not given anchors did not receive any warning. However, there was no statistically significant difference in the house value estimates between warned and un-warned subjects (P > 0.05), i.e. the warning screens did not mitigate the anchoring and adjustment bias.

The study conducted by Chandra and Krovi (1999) used distracter screens to debias recency effects in a psychology study but the effectiveness of the debiasing technique was not evaluated. In the study, subjects had to answer a list of questions after viewing a sequence of stimulus material on the animal kingdom. Subjects received a distracter screen in between successive pieces of stimulus material (Figure 8.6); the distracter screen consists of random text that is unrelated to the study and is presented to the subject to prevent recency effect. However, as stated above, no evaluation was carried out to test the effectiveness of these distracter screens in preventing recency effect.

DISTRACTER 1 DISTRACTER 2 The phrase ‘mathematical folklore’ sounds a Sun Microsystems isn’t the only U.S. company little strange at first. taking advantage of brainpower in former Let’s first start with a time honored story about Soviet republics: Archimedes, the greatest mathematician and HEWLETT-PACKARD sponsored 1992 Russia inventor of the ancient world. While soaking in wide competition on theories of computer a bathtub, it is said, he discovered the principle recognition of speech and printed characters. that a body immersed in a fluid is buoyed up a Employing Russian chess masters in artificial- force equal to the weight of the fluid displaced. intelligence research. Exhilarated by this realization, he ran naked IBM: Ten scientists in Minsk, Belarus, are through the streets shouting “Eureka! Eureka!” developing network-management software for (Greek for “I have found it! I have found it!”). IBM mainframes. How about Nobel? Alfred Nobel, the inventor UNITED TECHNOLOGIES: Working with of dynamite and the founder of the Nobel Moscow’s institute of Structural Macro-kinetics Prize, was reported to have stipulated that on high-strength ceramics. there be no Nobel Prize awarded in AT&T: Bell Laboratories support 120 fiber- Mathematics in order to retaliate against his optics researchers at General Physics Institute wife’s lover, Mittag-Leffler, a likely winner at in Moscow. the time of the prizes’ inception. Kurt Godel, for example, is said to have resisted becoming a U.S. Citizen for several years because he found a logical contradicition in the Constitution.

Figure 8.6 Information distracter screens (Chandra & Krovi, 1999). Distracter 1 adapted from Ruminations of a Numbers Man by John Allen Paulos, Random House, 1991; Distracter 2 from BusinessWeek, June 1993, p. 84.


8.3.2 Information-related

Lim et al. (2000) showed that multimedia presentations reduce the influence of first impression bias in a task that required subjects to evaluate the performance of an authority figure. First impression bias is closely related to the primacy effect; it refers to a "limitation of human information processing in which people are strongly influenced by the first piece of information they are exposed to, and that they are biased in evaluating subsequent information in the direction of the initial influence" (Lim et al., 2000, p. 115). Eighty university students were recruited to appraise the performance of a department head. Each subject was randomly assigned to one of four groups, in which information was presented in textual or multimedia form, with or without a biased cue. Amongst the two groups who received textual information, the appraisal score of the group who received a biased cue was significantly lower than that of the group who did not (t(76) = 3.07, P < 0.01); amongst the two groups who received multimedia information, there was no significant difference in appraisal score between the groups with and without a biased cue (t(76) = 1.54, P = 0.13). The authors suggested that multimedia presentations allow users to better retain and retrieve information because multimedia uses complementary cues (i.e. audio and video) to capture information; this makes evidence that disconfirms the first piece of information harder to ignore, reducing the potential for misinterpretation and decreasing the likelihood of the first impression bias.

Marett and Adams (2006) developed a decision support system (DSS) that provides different amounts of information to alleviate the familiarity bias in a baseball performance ranking task (see Figure 8.7); they concluded that an appropriate amount of information (i.e. not too much and not too little) can improve decision accuracy. Familiarity bias refers to the tendency, when people are uncertain, to prefer a familiar option over an unfamiliar one. One hundred and ninety-six undergraduate students were recruited to use a DSS, accessible through a Web browser, to rank the top 25 baseball teams from a list containing the home team and rival teams.

Subjects were divided into five groups, with each group receiving a different quantity of information. The most accurate rankers, and the group least susceptible to the familiarity bias, were those who received a quantity of information appropriate to the amount of time available to process it (i.e. this group outperformed both the groups that received less information and the groups that received more information). The authors suggested that a DSS should present information that fits the usage requirements and strategy of the task and the intended users.

Figure 8.7 Baseball team ranking task supplied with two pieces of information on the right (Marett & Adams, 2006)

Arnott (2006) described the experience of developing a decision aid for a consulting firm that successfully debiased the confirmation bias. The decision aid is a website that asks users to search for possible disconfirming information and to allocate each piece of information as confirming, disconfirming or neutral to the decision (see Figure 8.8 for an example).

Before the decision aid was introduced, the analyst identified that the managing director and the board might be experiencing the confirmation bias because all of the information they had available confirmed the case for closing down a department in the firm. After using the decision aid, the managing director and the board reached the conclusion not to close down the department. Although the proposed technique successfully debiased the confirmation bias in this setting, it still needs to be tested in a controlled experiment before any conclusion on its effectiveness can be drawn.

Information | Type | Source | Decision impact
Profit and loss statements (YTD and last 2 years) | Quantitative | Office manager | Confirming
Report on the future of Delta Consulting | Qualitative | Consultant's report | Confirming
Revenue and expenditure forecasts (Total company, next 3 years) | Quantitative | Consultant's report | Confirming
Revenue and expenditure forecasts (By divisions, next 3 years) | Quantitative | Consultant's report | Confirming
Course attendance history (last 3 years) | Quantitative | Training manager | Neutral
YTD, year-to-date.
Figure 8.8 Example of information allocated as confirming, neutral or disconfirming to a decision (Arnott, 2006)

8.3.3 Other approaches

Rai et al. (1994) described a set of design implications from their experience in developing executive systems to alleviate judgemental biases. They observed that certain features of an executive system can reinforce the impact of the availability heuristic, regression effects and the overconfidence bias. They recommended providing suggestive and informative guidance at different phases of the decision-making process to prevent the reinforcement of these biases. Their recommendations include: using guiding prompts to suggest the use of specific tools when performing particular tasks, providing warning messages to remind users about information processing biases in specific contexts, and suggesting appropriate sampling techniques to alleviate the use of the availability heuristic and the overconfidence bias that may arise from sampling errors.

Although these suggestions are insightful, the extent to which they can be generalised beyond the design of executive systems is unclear and should be treated with caution.

Shore (1996) reported the development of an expert system for the underwriting process at an insurance products company and demonstrated how roles at different levels of the organisational hierarchy may introduce biases into the system. He suggested strategies, similar to software engineering principles, to be applied during the system development lifecycle to prevent biases from being introduced into the system. These strategies include: using perceptual and behavioural elicitation techniques in the requirements gathering stage, developing systems that learn, maintaining expertise (such as appointing someone to oversee changes in the environment that need to be incorporated into the system), regularly maintaining and validating system requirements throughout the project lifecycle, accommodating different user cognitive styles, and providing documentation and training.

However, these suggestions are not interventions that can be implemented as features in a decision support system to debias decision making.


8.4 Discussion

Overall, the literature on debiasing strategies describes techniques with variable success rates (Klayman & Brown, 1993; Roy & Lerch, 1996; Chandra & Krovi, 1999; George et al., 2000; Lim et al., 2000; Ashton & Kennedy, 2002; Marett & Adams, 2006). Most of these studies have been conducted in artificial settings with university students, and the majority have not produced statistically significant findings in debiasing decision making. A number of papers report real-life experience of delivering decision support systems (Rai et al., 1994; Shore, 1996; Arnott, 2006). However, these local customised solutions have not been tested in controlled experiments. Hence, the extent to which their recommendations can be generalised beyond their settings is unclear.

Regardless of whether information technology was used, there has been mixed success in each category of user-, task- and information-related debiasing approaches.

Within user-related approaches, thinking strategies have been proposed but not evaluated, and strategies that require asking subjects questions about their intent need to be used with caution because they can introduce the mere-measurement effect and unintentionally alter subjects' behaviour. Within task-related approaches, self-review has been found to successfully reduce the recency effect, but warning screens and distracter screens were either not tested or not found to eliminate anchoring and order effects. Interestingly, approaches in the information-related category, namely information provided in independent versus contrastive format; information presented in graphical or multimedia formats; information given in appropriate quantity; and information allocated as confirming, disconfirming or neutral to the decision, have been shown to successfully reduce the impact of the base-rate fallacy, first impression bias, familiarity bias and confirmation bias. Furthermore, studies have shown that presenting information in different formats significantly affects the accuracy of the decision outcome (Roy & Lerch, 1996; Elting et al., 1999).

Although no strong conclusions can be drawn on the success of most of these proposed interventions, there is a sense that the way information is presented, and a task that demands self-evaluation, may be elements of a successful debiasing intervention. These speculations correspond with the approaches hypothesised by Arkes (1991) for debiasing judgement errors, which fall into the categories of strategy-based errors, association-based errors and psychophysically-based errors. According to Arkes, strategy-based errors occur when people do not appreciate the extra effort required to achieve greater accuracy; the proposed debiasing approach is to allow the decision maker to recognise the benefit of increased accuracy. Association-based errors occur when people misuse information and form harmful associations; debiasing requires users to activate different associations with the information. Psychophysically-based errors happen when there is a difference between the external stimuli and subjects' interpretation of those stimuli; debiasing requires subjects to alter their initial position on the matter such that a more balanced interpretation of the stimuli can be achieved.

Combining the findings in this review and Arkes’ debiasing approaches, specific strategies are suggested for debiasing anchoring effect, order effects, exposure effect and reinforcement effect. The anchoring effect is assumed to be associated with the psychophysically-based error because there is a nonlinear relationship between what the documents say and how the subjects’ prior beliefs influence their interpretation of the documents. An approach to debias the anchoring effect would require repositioning the subjects’ anchor. For example, a study conducted by Lopes (1982) found that anchoring effects can be reduced by training subjects to anchor on the most informative sources rather than the initial stimuli that may not be informative. Order effects, exposure effect and reinforcement effect are assumed to be related to association-based error because these effects potentially cause some documents to be more available and thus inappropriately associated when subjects make decisions. An approach to debias these effects would require changing the availability of these documents or activating different associations amongst documents that are more available. These approaches to debiasing cognitive biases may be able to be translated into the user interface of a search engine, perhaps in the form of altering the way documents are presented or asking users to self-evaluate their anchors and belief in the search journey.

8.5 Conclusion

Although there have been attempts to implement debiasing strategies in decision support systems, no intervention was identified for debiasing information searching. The challenge of this research is to design a debiasing intervention for information retrieval systems such that people may better utilise their cognitive resources when using retrieved information to make decisions.

9 Search user interface: a literature review

One purpose of this chapter is to review existing search user interfaces and identify features that can be used to implement the debiasing approaches identified in Chapter 8.

Search can be classified into two main processes, namely query specification and search result analysis (Marchionini & Komlodi, 1999). This chapter focuses on search user interfaces that assist users to interact with their search results, concentrating on the components that support people to select, process and integrate information.

Another purpose of this chapter is to review classic examples of information retrieval interfaces and examine whether they address human cognitive limitations during information searching. These examples are taken from patents and the following reviews on search user interfaces and information visualisations: Rao et al., 1995; Card, 1996; Hearst, 1999; Marchionini & Komlodi, 1999; Börner, 2002. Although most of these search user interfaces date from the 1990s, they are landmark approaches that have influenced the design of present-day search user interfaces.

9.1 Selecting information

One of the major challenges in selecting information is that human attention is selective (i.e. one focuses on a specific component and may intentionally or unintentionally neglect other components in the information space) (Wickens & Hollands, 2000). However, the process of making a selection requires one to attend to as many pieces of information as possible. To assist users to focus on the most appropriate information, search interfaces attempt to increase the salience of information that appears to be relevant to a query (salience referring to the amount of attention a piece of information attracts). The attention and subsequent processing that each piece of information receives are greatly affected by the way the information is physically presented (Wickens & Hollands, 2000). Search systems use visual and structural cues to alter the physical appearance of documents and so assist users to filter and select information (Table 9.1).


Table 9.1 Examples of search user interfaces that assist users to select documents

Visual cues
  Interfaces that present documents in non-textual formats:
    Lucas & Senn, 1996 – Documents in strands
    Sciammarella & Herndon, 1999 – Documents in images
    Woodruff et al., 2002 – Documents in enhanced thumbnails
    Brown et al., 2003 – Documents in thumbnails
  Interfaces that display the match between a document and the keywords in a query:
    Landauer et al., 1993 – Highlighting of query terms
    Kupiec et al., 1995 – KWIC (keyword-in-context)
    Hearst, 1995 – TileBars

Structural cues
  Static structure:
    Spoerri, 1993 – InfoCrystal
    Hearst, 1994 – Cougar
    Plaisant et al., 1996 – Lifelines
    MacPhail, 2004 – Information items on a continuum scale related to a reference piece of information
    Orbanes & Guzman, 2004 – Information displayed using multiple templates
  Dynamic structure:
    Cutting et al., 1992 – Scatter/Gather
    Chen et al., 1999 – Cha-Cha
    Pratt et al., 1999 – DynaCat
    Subramaniam et al., 2004 – Interface that attaches and displays subsequent search results in reference to the first result selected
    Hearst, 2000; Hearst et al., 2002; Yee et al., 2003 – Flamenco

9.1.1 Visual cues

There are two ways search interfaces use visual cues to increase the salience of information. One way is to present documents in non-textual formats, such as strands (e.g. Lucas & Senn, 1996), images (e.g. Sciammarella & Herndon, 1999), thumbnails (e.g. Brown et al., 2003) and enhanced thumbnails (Woodruff et al., 2002). Presenting documents in non-textual formats allows users to use their perceptual system to identify salient information.


Figure 9.1 Enhanced thumbnails (Woodruff et al., 2002)

For example, enhanced thumbnails are thumbnail representations of documents that highlight and magnify the query keywords in the documents. Documents with a high frequency of a keyword are displayed with high salience (as demonstrated in the middle document in Figure 9.1). A user study comparing the efficiency of three types of document presentation (enhanced thumbnails, plain thumbnails and text summaries) for searching Web pages found that participants on average found the answer most quickly using enhanced thumbnails (Woodruff et al., 2002).

Another way to increase the salience of information is to display the match between a document and the keywords in a query, for example, highlighting of query terms (Landauer et al., 1993), KWIC (keyword-in-context) (Kupiec et al., 1995) and TileBars (Hearst, 1995). TileBars is a landmark search user interface that attaches to each retrieved document a graphical display outlining the frequency of query terms in each section of the document (Figure 9.2). The aim is to allow users to see the distribution of query terms in the document, assisting them to select documents and to identify which sections of a document are relevant. In a TileBar, each row represents a specific keyword and each column represents a section of the document; the darker an area is, the more frequently the corresponding keyword occurs in that section of the document.
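Since the TileBar display is essentially a keyword-by-section frequency grid, its core computation is easy to illustrate. The sketch below is a hypothetical reconstruction of that idea, not Hearst's implementation: it splits a document into fixed-size word windows and counts each query term per window, whereas the original work segments documents into motivated passages.

```python
from collections import Counter

def tilebar_grid(document: str, query_terms: list[str], section_size: int = 100):
    """Count occurrences of each query term in fixed-size sections of a document.

    Returns one row per query term; each row holds one count per section.
    Darker TileBar cells would correspond to higher counts. This is an
    illustrative approximation of the TileBars idea only.
    """
    words = document.lower().split()
    sections = [words[i:i + section_size] for i in range(0, len(words), section_size)]
    section_counts = [Counter(section) for section in sections]
    return [[counts[term.lower()] for counts in section_counts] for term in query_terms]

# Example usage with toy data
doc = "treatment of migraine ... aspirin dosage ... migraine aspirin trial results"
grid = tilebar_grid(doc, ["migraine", "aspirin"], section_size=5)
print(grid)  # [[1, 1, 0], [1, 1, 0]]: one row per keyword, one column per section
```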

The core visualisation idea behind TileBars has been evaluated in a study comparing the effectiveness and efficiency of different information retrieval displays in their basic abstract modes (Morse et al., 2002). These basic display modes are text, table, icon, graph and spring (TileBars was classified as belonging to the icon display). The study involved a series of experiments with 72 to 223 subjects, in which subjects were presented with a set of documents in each display mode and asked to identify the number of documents containing a set of terms. Between the icon display and the text display, there were no statistically significant differences in the number of correctly answered questions or in the amount of time taken to answer them. In addition, the study reported that the icon display was considered useful, easy to use and one of the best display modes.

Figure 9.2 TileBars (Hearst, 1995)

9.1.2 Structural cues

Another challenge in selecting information is information overload. Studies have shown that people do not generally make better decisions when they have more information (Oskamp, 1965; Allen, 1982; Dawes, 1979; Dawes & Corrigan, 1974; Malhotra, 1982; Schroeder & Benbassat, 1975). In fact, decision-making performance has been shown to deteriorate when more rather than less information was provided under time stress (Wright, 1974). To assist people to select from a mountain of information, search interfaces organise search results using static or dynamic approaches.


Search systems that statically organise retrieved documents display documents in a predefined structure. There are interfaces that display results in the metaphor of a Venn diagram (Hearst, 1994; Spoerri, 1993); interfaces that present information according to a particular scale, such as in chronological order (e.g. Plaisant et al., 1996); interfaces that classify and display search results in separate categories of folders (e.g. Silverman, 2001); interfaces that allow users to view information using multiple templates (e.g. Orbanes & Guzman, 2004); and interfaces that display information items on a continuum scale related to a reference piece of information (e.g. MacPhail, 2004). More information on the advantages and disadvantages of using categories and clusters to organise search results can be found in Hearst's comprehensive reviews (Hearst, 1999a; Hearst, 2006).

InfoCrystal (Spoerri, 1993) is an innovative example that structures search results in the metaphor of a Venn diagram (Figure 9.3). The number of keywords in a query determines the number of angles. The areas of intersection in the Venn diagram, which represent each possible combination of the Boolean query, are symbolised by different shapes in the graphical display. Each shape is associated with a number displaying the number of documents in that intersection of the Venn diagram. InfoCrystal was also assessed in the same user evaluation study that compared core visualisation ideas (in which InfoCrystal was classified as belonging to the graph display). The study concluded that subjects prefer graphical methods (i.e. icon, graph and spring) over text when there is increased complexity in the task (Morse et al., 2002).

Figure 9.3 InfoCrystal (Spoerri, 1993)

Search systems that dynamically organise documents either re-display documents according to the way people interact with them, or organise documents in a way that reflects the semantic content of the retrieved documents. For the former kind, there are interfaces that cluster documents according to their content and rearrange the clusters according to how the user selects documents, such as Scatter/Gather (Cutting et al., 1992); interfaces that attach and display subsequent search results in reference to the first result in a predefined view, such as Subramaniam et al. (2004); and interfaces that use hierarchical faceted metadata to organise and maintain the structure of search results while users refine and expand the query, such as Flamenco (Hearst, 2000; Hearst et al., 2002; Yee et al., 2003). For the latter kind, there are interfaces, such as Cha-Cha (Chen et al., 1999), that dynamically produce a hierarchical "table of contents" to reflect the underlying structure of the intranet from which the search results were retrieved; and interfaces, such as DynaCat (Pratt et al., 1999), that use a knowledge base to dynamically categorise search results into a hierarchical organisation.

Figure 9.4 Scatter/Gather (Cutting et al., 1992)

Scatter/Gather is a widely cited search system that clusters documents according to the way users select clusters. It applies clustering to scatter the retrieved documents into a small number of document groups, and presents a short summary detailing the number of documents, a set of topical terms and example titles in each cluster (Figure 9.4). The system then gathers the groups that the user selects and applies clustering again to form smaller, more specific document groups. This iterative process of clustering, forming document groups and selecting document groups continues until the user reaches groups that contain only a single document. Several technical and user evaluations have been reported for Scatter/Gather (Hearst et al., 1995a; Pirolli et al., 1996; Hearst et al., 1996; Hearst & Pedersen, 1996). Participants were judged able to successfully interact with the clusters and to select the cluster containing the most relevant documents (Hearst & Pedersen, 1996).
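The interaction loop described above can be illustrated with a toy sketch. This is not the original Scatter/Gather algorithm, which relies on purpose-built fast clustering over large collections; the sketch simply assumes scikit-learn's TfidfVectorizer and KMeans as stand-ins for the clustering step and simulates a user who repeatedly gathers one group.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def scatter(docs, n_clusters=2):
    """One 'scatter' step: cluster the current document set into a few groups."""
    matrix = TfidfVectorizer(stop_words="english").fit_transform(docs)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(matrix)
    # Group documents by cluster label; a real interface would also show each
    # cluster's size, topical terms and example titles as its summary.
    return {c: [d for d, lab in zip(docs, labels) if lab == c] for c in range(n_clusters)}

# Toy interaction loop: scatter, the user gathers one group, re-scatter, and so on.
docs = ["aspirin for migraine", "migraine trigger diary", "aspirin side effects",
        "knee injury recovery", "running and knee pain", "migraine prevention drugs"]
selected = docs
while len(selected) > 2:                 # stop once the gathered group is very small
    clusters = scatter(selected, n_clusters=2)
    selected = clusters[0]               # here we simulate the user selecting group 0
print(selected)
```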

9.1.3 Commentary

The purpose of using visual or structural cues in a search interface is to assist users to select information by changing the physical layout of the information. Evaluations of search user interfaces such as TileBars, InfoCrystal and Scatter/Gather provide preliminary evidence that visual and structural displays are useful. However, most interfaces outlined in this section appear to consider only the frequency of keywords in a document before altering the salience of the document. There is no indication that these interfaces consider the diagnosticity (i.e. the usefulness of the information) or the reliability (i.e. the likelihood that the information can be trusted) of the information before altering its salience.

Documents with a high frequency of keywords are not necessarily high in diagnosticity and reliability. In fact, using information that is of low quality but of high salience can increase the likelihood of making inaccurate decisions (Wickens & Hollands, 2000).

9.2 Processing information

The challenge in processing information lies in the limitations of working memory in understanding ambiguous information and assessing multiple variables under uncertainty. To assist people to process information, there are sensemaking tools that help them explore their information space (Card, 1996). Users apply these tools to construct new information patterns from existing data by re-representing or interacting with the data.

Sensemaking tools are classified according to their functionality in assisting people to process their information space, which includes grouping information, focusing on information and comparing information (Table 9.2).

Table 9.2 Examples of sensemaking tools (i.e. tools that allow users to interact with and re-represent existing data to construct new information patterns)

Group information (e.g. sliders and selection buttons):
  Ahlberg & Shneiderman, 1994 – FilmFinder
  Ahlberg et al., 1994 – Dynamic Query (DQ)
  Heath et al., 1995 – Envision
  Mackinlay et al., 1999 – Butterfly
  Wang Baldonado & Winograd, 1997 – SenseMaker
Focus+context:
  Spence & Apperley, 1982 – Bifocal lens
  Furnas, 1986 – Fisheye views
  Mackinlay et al., 1991 – Perspective wall
  Robertson et al., 1991 – Cone tree
  Bier et al., 1993 – Magic Lens
  Robertson & Mackinlay, 1993 – Document Lens
  Rao & Card, 1994 – TableLens
  Lamping & Rao, 1996 – Hyperbolic browser
  Bederson, 1996 – Pad++
Compare documents:
  Hendry & Harper, 1997 – SketchTrieve

9.2.1 Group information

Sensemaking tools range in sophistication and functionality. Tools such as sliders and selection buttons allow users to group information according to a set of criteria; for example, FilmFinder (Ahlberg & Shneiderman, 1994), Dynamic Query (Ahlberg et al., 1994), Envision (Heath et al., 1995), Butterfly (Mackinlay et al., 1999) and SenseMaker (Wang Baldonado & Winograd, 1997). Envision is a well-known example that displays retrieved information in a two-dimensional table. It allows users to sort and re-display information by selecting attributes, such as author or date, along the X- and Y-axes of the table (Figure 9.5). A formative usability evaluation with five participants showed that participants were able to complete the given tasks faster than the interface designer, and a usability questionnaire showed that participants on average were able to use the system (Nowell et al., 1999).

Figure 9.5 Envision (Heath et al., 1995)

9.2.2 Focus+context

Focus+context tools allow users to select parts of the information space and zoom in and out to examine them in detail; for example, Bifocal lens (Spence & Apperley, 1982), Fisheye views (Furnas, 1986), Perspective wall (Mackinlay et al., 1991), Cone tree (Robertson et al., 1991), Magic Lens (Bier et al., 1993), Document Lens (Robertson & Mackinlay, 1993), TableLens (Rao & Card, 1994), Hyperbolic browser (Lamping & Rao, 1996) and Pad++ (Bederson, 1996). Perspective wall is a classic application of the fisheye technique, which makes all regions of the information space available on the display (Furnas, 1986). The Perspective wall uses a three-dimensional view to achieve the fisheye technique by bringing the focused region to the front and leaving the context area in the background (Figure 9.6). Although the Perspective wall is designed to reduce short-term memory load, use of the fisheye technique has been cautioned in general because evaluation studies show that it can enhance as well as be detrimental to information processing performance (Cockburn et al., 2006).

Figure 9.6 Perspective wall (Mackinlay et al., 1991)

Figure 9.7 SketchTrieve (Hendry & Harper, 1997)

9.2.3 Compare documents

There are tools that allow users to compare documents or groups of documents; for example, SketchTrieve (Hendry & Harper, 1997). SketchTrieve is one of the first search environments to provide users with tools to organise and compare documents side by side (Figure 9.7). It also allows users to make annotations on documents and create linkages between sets of documents. A formative evaluation of SketchTrieve suggested that the environment supports search activity once users have had some practice (Hendry & Harper, 1997).

9.2.4 Commentary

The purpose of sensemaking tools is to assist users to explore and process their information space. Although formative evaluations of SketchTrieve and Envision show that subjects are able to use these interfaces, it is unclear whether these visualisation techniques present documents in a visual layout that resembles the way people process information. People use different techniques to process information in different situations. For example, studies have shown that people tend to underweigh information that is difficult to interpret or integrate (Johnson et al., 1988; Beach & Mitchell, 1990); people often fall prey to the as-if heuristic and treat information of different diagnosticity and reliability equally (Wickens & Hollands, 2000); and there are circumstances in which people interpret the absence of information as supporting or refuting a hypothesis (Wickens & Hollands, 2000). However, the sensemaking tools outlined in this section do not appear to differentiate these situations and assist users accordingly to process information.

9.3 Integrating information

The challenge in integrating information lies in the competition for one's mental capacity between engaging in the decision-making process and juggling the different tasks involved in conducting a search journey. In an information searching task, one requires metacognition (i.e. the awareness of one's ability to plan and evaluate progress) to decide how to multitask and prioritise information needs (Spink et al., 2006), how much and what type of information is needed, whether to use or discard the currently retrieved information, and how to conduct the next search step. Premature termination of the search can prevent the decision maker from obtaining further information to make a decision; yet prolonging the search for more information can consume resources unnecessarily and complicate the decision-making process. These decisions involved in conducting a search journey often divert the user's mental capacity from the actual decision-making task.

To assist people to alternate between searching for information and concentrating on the decision task, there are interfaces that provide tools for users to organise and integrate information during searching, known as workspaces. A workspace is a collection of objects whose access is arranged to make relevant tasks efficient (Card, 1996); it is ideal for decision tasks that involve seeking information to confirm or disprove an initial hypothesis.

Workspaces are classified according to the level of sophistication displayed in the user interface, namely simple tools, workbench approaches and sophisticated metaphors (Table 9.3). The role of a workspace is to support the user's working memory in using information to revise belief.


Table 9.3 Examples of search workspaces (i.e. tools for users to organise and maintain search results)

Simple tools
  Note-taking: A9, 2006 – A9
  Store relevant documents: Melvyl, 2006 – Melvyl (web version)
  Track search history:
    Twidale et al., 1995 – Ariadne
    Roth et al., 1997 – Visage
    Hightower et al., 1998 – PadPrints
    Robertson et al., 2000 – Task Gallery
    Komlodi et al., 2006 – Search history
Workbench approaches
  Structured user interface for querying, selecting and storing documents:
    Rao et al., 1992 – InfoGrid
    Rao et al., 1994a – Protofoil
    Brajnik et al., 1996 – FIRE (Flexible Information Retrieval Environment)
    Kandogan & Shneiderman, 1997 – Elastic windows
  Users directly manipulate and organise documents on the user interface:
    Robertson et al., 1998 – Data Mountain
Sophisticated metaphors
  Metaphors for organising search results:
    Henderson & Card, 1986 – Rooms
    Cousins, 1997 – DLITE
    Hearst & Karadi, 1997 – Cat-a-cone

9.3.1 Simple tools

A simple approach incorporates tools into the search interface for users to organise search results. There are tools that allow users to take and store notes with a document (e.g. A9, 2006); to mark documents in the list of returned results and store them for later access (e.g. Melvyl, 2006); and to save search history or search state, such as Ariadne (Twidale et al., 1995), Visage (Roth et al., 1997), PadPrints (Hightower et al., 1998), Task Gallery (Robertson et al., 2000), and the search history project conducted by Komlodi et al. (2006). PadPrints is a simple search interface that stores and presents, in a tree diagram, all links a user accessed while browsing a website (Figure 9.8). Two usability studies were conducted to test the effectiveness and usability of using PadPrints to navigate the web. The studies found that users of PadPrints were faster to complete navigation tasks that required returning to visited pages, and that they generally accessed fewer pages and were more satisfied than users without PadPrints (Hightower et al., 1998).


Figure 9.8 PadPrints (Hightower et al., 1998)

9.3.2 Workbench approaches

A more sophisticated approach involves using information visualisation techniques in search environments that allow users to manually organise documents in the search interface. These interfaces are often designed around users' tasks and consist of a structured layout with designated areas for querying, selecting documents and storing documents; for example, InfoGrid (Rao et al., 1992), Protofoil (Rao et al., 1994a), FIRE (Brajnik et al., 1996) and Elastic windows (Kandogan & Shneiderman, 1997). InfoGrid is one of the early structured user interfaces for accessing documents: the top panel allows users to insert their queries; the middle panel displays retrieved documents; the right panel displays the selected document; the left panel displays controls for manipulating the information space; and the bottom panel is divided into two sections, with stored documents on the left and the history of accessed documents on the right (Figure 9.9). It has been reported that users are able to use InfoGrid with minimal training and that they can use a second InfoGrid application without further guidance (Rao et al., 1992).


Figure 9.9 InfoGrid (Rao et al., 1992)

There are also interfaces that allow users to directly manipulate and organise documents in different areas of the search interface, such as Data Mountain (Robertson et al., 1998). Data Mountain is a novel three-dimensional space in which users can freely place thumbnail images of web pages (Figure 9.10). A user study was conducted to test the usability and effectiveness of Data Mountain for managing documents (Robertson et al., 1998). The study found that participants using Data Mountain were able to retrieve web pages faster, and were more likely to retrieve a web page within the time limit, than when using the Microsoft Internet Explorer Favourites (IE4) mechanism. It also found that participants using Data Mountain were able to retrieve documents more accurately.

Figure 9.10 Data Mountain (Robertson et al., 1998)


Figure 9.11 DLITE (Cousins, 1997)

9.3.3 Sophisticated metaphors

Other interfaces employ interesting metaphors for organising search results and maintaining search states. One innovative example is Rooms (Henderson & Card, 1986), which takes the metaphor of a room to represent a user's work context. Each room is associated with its relevant sets of information and application programs, and users travel through different rooms to move between work contexts. Another example is DLITE (Cousins, 1997), which could be regarded as a "search factory". It represents queries, sources, documents and groups of retrieved documents as objects; objects can be dragged and dropped onto other objects to create new objects on the user interface (Figure 9.11). An initial pilot of DLITE was conducted with six novice users. All subjects were able to successfully complete a complex task in less than 30 minutes; the task involved finding bibliographic references relevant to citations, replacing citations in a document with automatically generated references, and attaching a bibliography to a document (Cousins, 1997).

An additional example is Cat-a-cone (Hearst & Karadi, 1997), which uses a book metaphor to display the retrieval results and a cone metaphor to display the concept hierarchy related to the results (Figure 9.12). When the user conducts a search, the retrieved results are placed in a virtual book. When the user accesses a document from these retrieved results (i.e. opens a page from the virtual book), the system highlights the path on the cone tree that reveals the concept hierarchy to which the document belongs. As the user accesses another document (i.e. selects another page from the virtual book), the relevant part of the hierarchical tree becomes salient and the irrelevant parts lose focus and shift to the background. Although there were plans to evaluate Cat-a-cone with clinicians and patients, no user evaluation study could be located for Cat-a-cone.

Figure 9.12 Cat-a-cone (Hearst & Karadi, 1997)

9.3.4 Commentary

The purpose of a workspace is to provide a visual space onto which users can off-load the organisation of information from working memory, so that they can concentrate on using the processed information. User evaluations or pilot studies of PadPrints, InfoGrid, Data Mountain and DLITE have demonstrated that users can be trained to use these systems, and that some of these systems help users to complete their tasks in less time, retrieve documents more accurately, and access fewer documents.

There are situations where people fail to seek disconfirmatory evidence or terminate their searches prematurely, but the workspace interfaces outlined in this section do not assist users to overcome these types of cognitive limitation. People are prone to four kinds of cognitive biases when integrating information and refining belief: (i) overconfidence in one's hypothesis, which leads to a tendency not to seek further information (Wickens & Hollands, 2000); (ii) anchoring, where one's prior belief exerts a persistent influence on the way information is processed and subsequently affects the belief revision process (Wickens & Hollands, 2000); (iii) order effects, the phenomenon where information processed at the first and last positions is more influential than information processed at other positions (Wang et al., 2000); and (iv) confirmation bias, where one fails to process information that challenges one's hypothesis (Wickens & Hollands, 2000). At any time during the search journey, people need to make decisions, such as whether to terminate a search or seek further information. Good decision makers display the metacognitive quality of being aware of what they do not know and are able to seek further information before concluding their decision-making process (Orasanu & Fischer, 1997). However, the search interfaces discussed in this section do not appear to assist people at this metacognitive level, i.e. one's awareness of one's ability to make decisions, to overcome these biases during information searching.

9.4 Discussion

Search user interfaces assist people to select information by increasing the salience of query-relevant information through visual or structural cues. Interfaces present documents in non-textual formats or visually highlight the match between the document and the query; and interfaces statically or dynamically organise retrieval results. However, these methods do not necessarily help users to distinguish the diagnosticity and reliability of different types of information unless salience is determined by metrics more semantically grounded than keyword occurrences. People may be subject to the salience bias and so fail to select information that is of high diagnosticity and reliability but of low salience.

Search user interfaces assist people to process information by providing sensemaking tools that allow users to re-construct and interpret their information space. Sensemaking tools range in functionality, such as grouping information, focusing on information and comparing information. However, these tools do not appear to assist users accordingly in different situations that require different information processing strategies. People may use the as-if heuristic to treat all information of different diagnosticity and reliability equally, and so fail to interpret the information correctly and use it appropriately.

Search user interfaces assist people to integrate information by providing a visual workspace to offload the organising of information from the working memory onto the perceptual system. There are simple tools that allow users to take notes and store documents, workbench approaches that allow users to manually organise documents, and sophisticated metaphors that transport users to another environment to manage their tasks and information space. However, these methods do not assist people with their metacognition to overcome biases in the process of refining their belief. People are subject to belief revision vulnerabilities, such as anchoring, overconfidence, order effects and confirmation bias, which prevent them from obtaining further information or fully understanding the situation before making a decision.

9.5 Conclusion

Overall, this review has identified components of a search user interface that are potentially useful for implementing debiasing strategies: (i) for elements of a debiasing strategy that require changing the way people select information, visual and structural cues can be used to alter the physical layout of information; (ii) for elements that require altering the way people process information, sensemaking tools can be tailored to change the way people group, focus on and compare documents; (iii) for elements that require modifying the way people integrate information, workspace solutions ranging from simple tools to sophisticated metaphors can alter the balance between engaging in decision making and conducting information searching.

This review has also found that there seem to be no published attempts to use the search user interface to debias information searching. Search user interfaces share the common goal of amplifying users' spatial cognition to analyse and synthesise information. They share the assumption that components of an information searching task can be off-loaded from users' cognitive system onto their perceptual system. However, user evaluations of these interfaces are mostly unavailable or limited, so it is not clear how effective these visualisation techniques are in assisting people to select, process or integrate information. In addition, people are generally vulnerable to cognitive biases when processing information and revising belief. However, the search user interfaces reviewed in this chapter do not appear to address these biases nor support the role of metacognition during information searching.

10 Design of debiasing interventions

Using the literature on debiasing and search user interfaces reviewed in previous chapters, this chapter reports three classes of interventions, namely keep document tool, for/against document tool and education-based intervention, which have been designed with the aim of debiasing the information search process. Each intervention is designed to counteract a specific bias (i.e. anchoring, order, exposure or reinforcement effect). Interventions that target order, exposure and reinforcement effect belong to the class of keep document tools.

The intervention that targets the anchoring effect is the for/against document tool. The intervention that educates users about the impact of cognitive biases is the education-based intervention. This chapter details the design rationale and an illustration for each intervention.

10.1 Design goals and assumptions

Interventions for debiasing information searching need to address two main design goals: reduce the impact of cognitive biases that users may experience during searching, and assist users to process and organise information during the evidence journey. Apart from these goals, interventions should support users in their transition from searching for information to making a decision without compromising the way people normally conduct searches.

Tubbs et al. (1993) suggested that interventions to minimise biases are best applied at the evidence integration stage of the belief-updating process rather than at the evaluation of individual pieces of evidence. As a result, a keep document tool was designed to assist users to integrate evidence during search by providing tools that allow users to gather, organise and rearrange documents while searching for information; and a for/against document tool was designed to provide an alternative way of thinking about the decision that minimises users' reliance on their prior beliefs, such that users can organise and integrate collected documents in this alternative manner as they search for information to make decisions. Most strategies used here are recommended by Fischhoff for debiasing (Fischhoff, 1982, p. 424). Table 10.1 summarises the design rationale behind each intervention; a more detailed explanation of each class of intervention is presented later.

10.2 Methods

10.2.1 Design procedures

The keep document and for/against document tools were developed in an iterative design process that involved a series of usability studies conducted with four members of the general public and fifteen researchers at the Centre for Health Informatics, University of New South Wales. The education-based intervention was designed in parallel but was not tested in the iterative design process.

Two prototypes were developed before the current design of the keep document and for/against document tools. Participants in the usability studies had professional backgrounds in psychology, human–computer interaction, user interface design, health informatics, software engineering and computer science. Each usability study involved one to three participants over a period of 0.5 to 1 hour. Participants were briefed on the research questions and the design goals, presented with screenshots of the prototypes, and asked two specific questions: "Is the interface user friendly?" and "Would this user interface help you while you search for information?" The final versions of the keep document and for/against document tools were further piloted on two occasions with five people in each; no major alterations were made to the design.

10.2.2 Basic components of the search user interface

Both the keep document and for/against document tools are integrated into the user interface of Quick Clinical (QC), an online evidence retrieval system developed at the Centre for Health Informatics, University of New South Wales (Coiera et al., 2005). On the search query page, users select a Profile to describe the nature of the question (on the left) and enter keywords in any of the four query fields (on the right) (Figure 10.1). After a search is conducted, the online evidence retrieval system returns documents from a variety of information sources and presents them in a list of ten items per page on the results page (Figure 10.2).

Table 10.1 Summary of intervention types

Education-based intervention
  Biases addressed: Anchoring effect; order effects (primacy and recency effects); exposure effect; reinforcement effect
  Debiasing strategies used (from Chapter 8): Educate users on cognitive biases (Fischhoff, 1992); develop awareness of the adverse effects of bounded rational behaviour (Croskerry, 2003)
  Search user interface implemented (inspired by references from Chapter 9): Information screen (Stallard & Worthington, 1998)

Keep document tool
  Biases addressed: Order effects (primacy and recency effects); exposure effect; reinforcement effect
  Debiasing strategies used (from Chapter 8): Alter structure of information (Klayman & Brown, 1993); offer alternative formulations (Fischhoff, 1992)
  Search user interface implemented (inspired by references from Chapter 9): Document re-arrangement strategy (Cousins, 1997; Hendry & Harper, 1997)

For/against document tool
  Biases addressed: Anchoring effect
  Debiasing strategies used (from Chapter 8): Offer alternative formulations (Fischhoff, 1992); consider alternative formulations (Fischhoff, 1992; Hirt & Markman, 1995); provide the environment to support the search for discrepant information (Fischhoff, 1992)
  Search user interface implemented (inspired by references from Chapter 9): Tool to allocate information as confirming, disconfirming or neutral to the decision (Arnott, 2006; Brajnik et al., 1996)



Figure 10.1 The original Search page of Quick Clinical, an online evidence retrieval system developed by Centre for Health Informatics, University of New South Wales (Coiera et al., 2005)

Figure 10.2 The original Results page of Quick Clinical

Two changes are made to the original Quick Clinical results page: documents are represented as thumbnails, and documents are presented in a panel of four thumbnails instead of a list of ten items (Figure 10.3). The modified user interface is designed to minimise the user's cognitive load during searching. The thumbnail representations allow users to preview documents; moreover, a study shows that users are quicker to find answers and visit fewer links when documents are represented as thumbnails rather than as text (Woodruff et al., 2002). The four-document panel is a less busy page that minimises user mouse movement up and down the screen. This modified results page is used across all interventions; it is also the baseline search user interface against which each debiasing intervention is compared in the experiment.

Figure 10.3 The modified Results page of Quick Clinical – users simply click on any of the displayed thumbnails to access a document

10.3 Keep document tool

The aim of the keep document tool is to increase the availability of potentially neglected documents due to order, exposure and reinforcement effects. A different document rearrangement strategy is designed to target each of the order effects, exposure effect and reinforcement effect. These are biases resulting from using the availability heuristic, i.e. people find certain documents more available in their working memory and use them in their decision making. These documents may simply be more available because they were accessed at the first or last position (i.e. order effects), they were more exposed (i.e. exposure effect), or they were visited more than once (i.e. reinforcement effect).


Figure 10.4 Keep document tool – a tool for users to write and keep notes with each document

10.3.1 Design rationale

Two debiasing strategies, altering the structure of information and offering alternative formulations, are used in the keep document tools. These strategies are translated into the search user interface in the form of rearranging and redisplaying documents, allowing users to review the same information in an altered structure in which the impact of the targeted bias is minimised.

The strategy behind all keep document tools is to redisplay the documents that the user accessed during the evidence journey; these documents are rearranged and re-displayed such that all accessed documents are equally available in the user's working memory. When users select a document from the pool of retrieved documents, they can annotate the document (Figure 10.4; enlargement: Figure 10.5). If users decide to keep a document, it automatically moves across the screen to an area where kept documents are stored (Figure 10.6). These documents are kept in a document storage area and are accessible across multiple searches (Figure 10.7). After users finish searching and are about to make a decision, the retained documents are rearranged and presented to the user (Figure 10.8).


Figure 10.5 Keep document tool – enlargement of the Keep notes tool

Each keep document tool uses a different document rearrangement strategy for its specific cognitive bias. The potential bias positions are: for order effects, the first and last documents accessed; for the exposure effect, the documents that were viewed for the greatest amount of time in the evidence journey; and for the reinforcement effect, the documents that were accessed more than once. A document rearrangement strategy reorganises the documents collected by the user in a manner that aims to minimise the impact of the targeted bias: documents that were not accessed at potential bias positions are presented earlier than those that were.

Figure 10.6 Document-based intervention – document and its notes automatically move to another section of the screen once the user selects 'Keep' for the document


Figure 10.7 Document-based intervention – collected documents and their notes are accessible across multiple searches

The set of rearranged documents is presented to the user for self-review before making a decision. The assumption behind keep document tools is that by encouraging searchers to re-examine documents presented in a way which minimises the impact of a bias, all documents would be assessed in the absence of that bias.

Figure 10.8 Document-based intervention – documents and their notes are rearranged according to the document rearrangement strategy that minimises the impact of the targeted cognitive bias. Users review this newly arranged set of documents before making a decision


10.3.2 Order effects

The document rearrangement strategy for debiasing order effects involves redisplaying documents that were read in the middle of the sequence at the first and last positions of the new sequence, and documents that were read at the first and last positions in the middle of the new sequence. The rationale is that if people are influenced by primacy and recency effects, documents that were in the middle of the original sequence will receive more attention when presented at the first and last positions of the new sequence.

For example:

• Subject accessed four documents in the evidence journey: D1, D2, D3, D4
  (Note: for each Dn, n denotes the access order of the document; in an evidence journey of length 4, D1 is the first accessed document and D4 is the last accessed document.)
• Documents rearranged in the new sequence for debiasing order effects: D2, D1, D4, D3
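The worked example above can be reproduced with a short sketch. This is an illustrative reconstruction rather than the thesis software; in particular, how the strategy generalises beyond a four-document journey is an assumption of the sketch (here, the first and last pairs of the access sequence are swapped so that middle documents occupy the end positions).

```python
def rearrange_for_order_effects(accessed):
    """Re-display middle documents at the ends of the sequence.

    `accessed` is the list of documents in the order they were accessed.
    Documents read first and last (the potential bias positions) are moved
    towards the middle; documents read in the middle are moved to the ends.
    """
    docs = list(accessed)
    if len(docs) < 3:
        return docs  # nothing useful to rearrange
    # Swap the first pair and the last pair of the access sequence.
    docs[0], docs[1] = docs[1], docs[0]
    docs[-1], docs[-2] = docs[-2], docs[-1]
    return docs

# Worked example from the text: D1, D2, D3, D4 -> D2, D1, D4, D3
assert rearrange_for_order_effects(["D1", "D2", "D3", "D4"]) == ["D2", "D1", "D4", "D3"]
```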

10.3.3 Exposure effect

The document rearrangement strategy for debiasing the exposure effect involves redisplaying documents in ascending order, from least to most, of the amount of time the searcher spent on each document in the evidence journey. The document that was viewed for the least amount of time is redisplayed at the beginning of the sequence, and the document that was viewed for the greatest amount of time is redisplayed at the end. For example:

• Subject accessed four documents in the evidence journey: D3, D1, D4, D2
  (Note: for each Dn, n denotes the relative amount of time spent on the document; in this journey of length 4, D4 is the document on which the most time was spent and D1 the document on which the least time was spent.)
• Documents rearranged in the new sequence for debiasing exposure effect: D1, D2, D3, D4
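A minimal sketch of this strategy, again illustrative rather than the actual implementation: documents are sorted in ascending order of total viewing time, so the least-exposed documents are reviewed first.

```python
def rearrange_for_exposure_effect(seconds_viewed):
    """Sort documents from least to most viewing time.

    `seconds_viewed` maps each document to the total time (in seconds) the
    searcher spent on it during the evidence journey.
    """
    return sorted(seconds_viewed, key=seconds_viewed.get)

# Worked example from the text: access order D3, D1, D4, D2, where D4 received
# the most viewing time and D1 the least -> re-displayed as D1, D2, D3, D4
times = {"D3": 40.0, "D1": 5.0, "D4": 120.0, "D2": 20.0}
assert rearrange_for_exposure_effect(times) == ["D1", "D2", "D3", "D4"]
```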

10.3.4 Reinforcement effect

The document rearrangement strategy for debiasing the reinforcement effect involves redisplaying documents that have been accessed only once at the beginning of the sequence, and those that were accessed more than once at the end of the sequence. For example:

• Subject accessed four documents in the evidence journey: D3, D2, D1, D2
  (Note: for each Dn, n denotes the unique ID of the document; in this example, D2 was accessed twice, while D1 and D3 were each accessed only once.)

• Documents rearranged for debiasing reinforcement effect: D3, D1, D2, D2
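A minimal sketch of this strategy follows; the function name is an illustrative assumption, and repeated document IDs in the journey represent repeated accesses.

    from collections import Counter

    def debias_reinforcement(journey):
        """Redisplay documents accessed only once before documents accessed
        more than once, otherwise preserving the original access order.
        """
        counts = Counter(journey)
        accessed_once = [doc for doc in journey if counts[doc] == 1]
        accessed_more = [doc for doc in journey if counts[doc] > 1]
        return accessed_once + accessed_more

    print(debias_reinforcement(["D3", "D2", "D1", "D2"]))
    # ['D3', 'D1', 'D2', 'D2']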

10.4 For/against document tool

10.4.1 Design rationale

The aim of the for/against document tool is to debias the anchoring effect by providing an alternative way of thinking about the decision such that users can anchor onto this alternative way of thinking rather than their subjective pre-search belief. Three strategies are used to debias the anchoring effect (the names of the debiasing strategies are in italics). The for/against document tool invites users to consider alternative situations. It outlines the set of possible answers to a decision and implicitly encourages users to search for discrepant information.

Furthermore, users are encouraged to classify documents according to confirming and disconfirming views, which offers alternative formulations of representing the information.

10.4.2 Anchoring effect

The strategy behind the for/against document tool is to provide an alternative way of thinking about the decision such that users can anchor onto this alternative way of thinking rather than their subjective pre-search belief. The tool allows users to determine, upon reading a selected document, whether the document supports the affirmative case or the negative case, is neutral, or is irrelevant to the decision (Figure 10.9). It targets the anchoring bias by asking users to build the body of evidential support for each possible answer to the decision. The assumption is that the process of building and seeing the body of evidential support for each possible answer overcomes any assumption the user may have had about the decision.


Figure 10.9 For/against document tool – a tool for users to write and keep notes with each document, as well as to classify the utility of the document

Similar to the keep document tool, users can also annotate a document (Figure 10.9; enlargement: Figure 10.10). Each document automatically moves to a designated section of the screen depending on whether the user classified it as affirmative, negative or neutral (Figure 10.11). All classified documents are kept and are accessible across multiple searches for the decision (Figure 10.12). The affirmative, negative and neutral documents are displayed to users with their classified groups before users make the decision post-search, with the irrelevant documents being omitted (Figure 10.13).

Figure 10.10 For/against document tool – enlargement of the for/against document tool


Figure 10.11 For/against document tool – document and its notes automatically move to the relevant section of the screen once the user classifies the utility of the document

Figure 10.12 For/against document tool – classified documents and their notes are accessible across multiple searches. Users can reclassify a document during the evidence journey


Figure 10.13 For/against document tool – documents and notes are presented to the user for review before making a decision

10.5 Education-based intervention

The education-based intervention is an information page that educates users about the impact of cognitive biases on decision making. It explains that the order in which a document was accessed, the time spent on it and the number of visits to that document can influence the impact that document has on a decision (Figure 10.14). It asks users to remain mindful of these factors when searching for information and making decisions. Although many education-based interventions are not successful in debiasing (Kahneman et al., 1982), the study conducted by Stallard and Worthington (1998) demonstrated that the impact of potential biases can be reduced by asking participants to be wary of cognitive biases, to avoid making guesses, and to pay attention to relevant information that may have been overlooked.

Figure 10.14 Education-based intervention

10.6 Conclusion

Three classes of interventions were designed with the goal of minimising the impact of cognitive biases during information searching. The aim of all keep document tools is to increase the availability of documents that may be neglected due to order, exposure and reinforcement effects. Each keep document tool provides users with a document collection tool to gather documents during information searching; different document rearrangement strategies are used to minimise the impact of order, exposure and reinforcement effects, and collected documents are later redisplayed to users in a non-biased sequence.

The aim of the for/against document tool is to debias the anchoring effect by providing an alternative way of thinking about the decision such that users can anchor onto this way of thinking rather than their subjective pre-search belief. It targets the anchoring effect by providing a tool that allows users to classify documents according to their utility to a decision; users build the evidential support for each possible answer to overcome the influence of their prior belief.

The aim of the education-based intervention is to educate users about the impact of cognitive biases on decision making. It is not integrated into the task of information searching but is directed at the abovementioned cognitive biases by presenting users with an educational information page before searching. In the next part of the thesis, a study is undertaken to test the effectiveness of these interventions in debiasing search engine users and to evaluate their impact on subjects’ search and decision-making outcomes.

Part V: What is the impact of information searching on decision making?

11 Study design: methodology

This chapter describes the design of the study undertaken to evaluate the effectiveness of the interventions designed to debias information searching. Non-medical university undergraduate students were recruited to use both a vanilla search user interface (i.e. baseline) and each of the proposed debiasing interventions to search for answers to six different health-related questions. The purpose of this study is to evaluate whether the interventions are successful in debiasing decisions, and to compare the baseline search interface with each debiasing intervention to evaluate which system is better at improving decision quality.

The baseline search interface is the Quick Clinical (QC) search system, which has been validated as effective and efficient in searching for and delivering information in various technical, laboratory and real-life evaluation studies (Coiera et al., 2005; Westbrook et al., 2005; Magrabi et al., 2005). QC has proven to be a reliable and suitable baseline search interface upon which to build the debiasing interventions and against which to compare them.

11.1 Introduction

The study design follows the Quick Clinical pre-/post-intervention online experimental design reported in Westbrook et al. (2005), which examined the impact of online information retrieval systems on experienced clinicians answering clinical questions (also described in Chapter 3). The experiment was conducted online, where subjects searched and answered questions in their own space and time. Ethics approval for this study was obtained from the Human Research Ethics Advisory Panel at the University of New South Wales (UNSW).



11.2 Study design

11.2.1 Subjects

A convenience sample of 227 people who had previously used an online search engine was recruited from the undergraduate student population at UNSW. Subjects were recruited in two phases: 64 subjects in Phase 1 over a period of one month in February 2005 and 163 in Phase 2 over a period of three months from May to July 2005, providing 928 pre/post pairs of responses to six health-related questions after data exclusion.

People with Internet access who had used an online search engine were recruited by announcements seeking volunteers, advertised via student email lists, posters, leaflets, weekly student magazines and a research news website on the UNSW campus (see Appendix B).

Upon completion of the study, subjects were remunerated by being entered into a draw for one of 100 movie tickets. Part of the recruitment involved class announcements at a one-week general education course offered by the Faculty of Medicine at UNSW; this course was open only to undergraduate students at the university enrolled in faculties other than Medicine. In this course, there was a non-compulsory component allowing students to gain 2–3 bonus marks in each of the first four days of the course by completing nightly reading summaries. Students were allowed to substitute participation in this study for one of these nightly summaries. All students had the option to participate in the study on a voluntary basis and were remunerated in the same way.

11.2.2 Protocol

Before commencing the study, subjects were presented with an online subject information sheet and a study consent form, and were asked to read them carefully before submitting their consent (see Figure 11.1 for the subject information sheet; Figure 11.2 for the consent form; and Appendix C for the full version of the subject information sheet). Each subject then received an online tutorial detailing the study protocol and instructions on how to use the baseline search interface and the debiasing interventions (see Figure 11.3 for the first page of the tutorial; and Appendix D for the full version of the 10-page tutorial). After completing the tutorial, subjects completed a questionnaire collecting demographic information such as gender, age, self-rated search skills, and self-reported frequency of using a search engine for general purposes and for health-related purposes (Figure 11.4). This questionnaire was adapted from the Quick Clinical pre-study questionnaire that examined the impact of an online evidence retrieval system on clinicians’ performance in answering clinical questions (Westbrook et al., 2005). Subjects were then given a URL and asked to log onto the study website by entering their name and email address (Figure 11.5).

Subjects received a set of six case scenario questions. For each question, they had to give an answer before searching and rate their confidence in that answer on a 4-point Likert scale from “not confident”, “somewhat confident”, “confident” to “very confident” (Figure 11.6). Subjects then conducted searches using the baseline search interface or a debiasing intervention to answer the question (Figure 11.7). The search engine retrieved documents from tested resources that have high diagnosticity and reliability in answering health-related questions; these resources are: PubMed1, MedlinePlus2 and HealthInsite3 (Figure 11.8). After answering the question post-search (Figure 11.9), subjects were presented with a summary of the answers provided by previous subjects post-search and asked to answer the question again (Figure 11.10). Subjects were advised to spend about 10 minutes on each question and to use only the search system provided to answer the questions. To prevent subjects visiting external websites during the experiment, the navigational bar on the subject’s browser was hidden once the subject logged onto the study website.

At the end of the study, subjects completed a 4-item post-study questionnaire that asked them to choose which of the baseline search interface, the keep document tool and the for/against document tool they found most useful, enjoyed using the most and preferred to use in the future (Figure 11.11). The final question asked for general comments on their search experience as free text. This type of questionnaire has been used in online market research to evaluate users’ preferred choice after experiencing multiple options (Grover, 2005). The study and the questionnaires were piloted on two occasions with five subjects each time, with no difficulties reported.

1 PubMed is the United States National Library of Medicine's (NLM®) database of biomedical citations and abstracts that is searchable on the Web. URL: http://www.pubmed.gov
2 MedlinePlus is a website maintained by NLM that contains carefully selected links to Web resources with health information for healthcare consumers and health professionals. URL: http://medlineplus.gov
3 HealthInsite is a website initiated and funded by the Australian Government that provides access to quality health information for healthcare consumers and health providers. URL: http://www.healthinsite.gov.au

Figure 11.1 Subject information sheet about study

Figure 11.2 Consent form of study: subjects can press “I Agree” to proceed with the study or “I Disagree” to discontinue


Figure 11.3 First page of the online tutorial

Figure 11.4 Pre-study questionnaire for collecting subjects’ demographics


Figure 11.5 Login screen into experiment

Figure 11.6 Subjects answer case scenario question before searching


Figure 11.7 Search interface: subjects enter keywords and press “Go” to conduct a search, or press “Finish Searching” to proceed to answer the case scenario question again

Figure 11.8 Retrieving documents from different information sources


Figure 11.9 Subjects answer case scenario question after searching

Figure 11.10 After answering post-search, subjects are presented with other subjects’ post-search answers and have the opportunity to answer the question again


Figure 11.11 At the completion of six case scenarios, subjects complete the post-study questionnaire

11.2.3 Two-phase study design

Subjects in Phase 1 of the study used the baseline search interface and all five debiasing interventions presented in Chapter 10. All subjects received the baseline search interface first, followed by a random allocation of the anchoring, order, exposure and reinforcement debiasing interventions across the next four questions, and the education-based intervention for the last question.

To ensure we obtained a sufficient sample size, subjects in Phase 2 of the study used only the baseline search interface and the anchoring and order debiasing interventions; the exposure, reinforcement and education-based interventions were removed. Subjects received the baseline search interface in the first two questions, followed by a random selection of the anchoring and order debiasing interventions for the remaining four questions; each debiasing intervention was randomly allocated twice. The exposure, reinforcement and education-based interventions were eliminated from Phase 2 of the study because only the intervention with the highest improvement rate in decision accuracy from each class of for/against document and keep document tools was selected for further evaluation, i.e. the anchoring and order debiasing interventions.

Before the start of each question, subjects were informed which search interface they had been assigned to: the “Keep document” tool, the “For/Against document” tool, or “No tool” (i.e. baseline or education-based intervention). Subjects were not briefed on the actual purpose of the study, namely to evaluate the impact of cognitive biases and debiasing effects on information searching and decision making.

Apart from removing the three debiasing interventions, the protocols of Phases 1 and 2 were essentially the same. An analysis carried out later in Chapter 12 demonstrates that there were no statistically significant differences in subjects’ attributes and ability to answer questions, nor differences in case scenario allocation, between the two phases of the study, suggesting that this change in protocol did not introduce significant confounding factors.

The allocation of debiasing interventions was randomised using a random number generator to minimise any learning effect in the use of the search tool over a sequence of scenarios. The allocation of case scenario questions was designed to ensure a balanced distribution of case scenario questions to each intervention type.

11.2.4 Case scenario development

Case scenarios targeted at healthcare consumers were developed in consultation with a general practitioner practising on the UNSW campus and two academic lecturers from the School of Public Health and Community Medicine at UNSW. Scenarios were rejected if there was no definite answer or the evidence was not available via the online evidence retrieval system. Agreement was reached among the team about the “correct” answer and the location of the best evidence sources for each question (see Appendix E). The pool of 23 questions was narrowed to eight, and six of these eight questions were randomly assigned to each subject in the study. These questions ranged in difficulty and topic to ensure that each intervention covered a spectrum of healthcare consumer questions. Each question and the corresponding expected correct answer are shown in Table 11.1.


The response categories for the questions were (i) yes; (ii) no; (iii) conflicting evidence; (iv) don’t know. Before the experiment, a pilot with three members of the general public tested the questions for interest and readability. Later, two additional pilots of five people each used the system to confirm it was possible to locate the documentary evidence required to answer the questions correctly.

Table 11.1 Case scenarios presented to subjects*

1. We hear of people going on low carbohydrate and high protein diets, such as the Atkins diet, to lose weight. Is there evidence to support that low carbohydrate, high protein diets result in greater long-term weight loss than conventional low energy, low fat diets? (Diet) Expected correct answer: No

2. You can catch infectious diseases such as the flu from inhaling the air into which others have sneezed or coughed, sharing a straw or eating off someone else's fork. The reason is because certain germs reside in saliva, as well as in other bodily fluids. Hepatitis B is an infectious disease. Can you catch Hepatitis B from kissing on the cheek? (Hepatitis B) Expected correct answer: No

3. After having a few alcoholic drinks, we depend on our liver to reduce the Blood Alcohol Concentration (BAC). Drinking coffee, eating, vomiting, sleeping or having a shower will not help reduce your BAC. Are there different recommendations regarding safe alcohol consumption for males and females? (Alcohol) Expected correct answer: Yes

4. Sudden infant death syndrome (SIDS), also known as “cot death”, is the unexpected death of a baby where there is no apparent cause of death. Studies have shown that sleeping on the stomach increases a baby's risk of SIDS. Is there an increased risk of a baby dying from SIDS if the mother smokes during pregnancy? (SIDS) Expected correct answer: Yes

5. Breast cancer is one of the most common types of cancer found in women. Is there an increased chance of developing breast cancer for women who have a family history of breast cancer? (Breast cancer) Expected correct answer: Yes

6. Men are encouraged by our culture to be tough. Unfortunately, many men tend to think that asking for help is a sign of weakness. In Australia, do more men die by committing suicide than women? (Suicide) Expected correct answer: Yes

7. Many people use home therapies when they are sick or to keep healthy. Examples of home therapies include drinking chicken soup when sick, drinking milk before bed for a better night's sleep and taking vitamin C to prevent the common cold. Is there evidence to support the taking of vitamin C supplements to help prevent the common cold? (Cold) Expected correct answer: No

8. We know that we can catch AIDS from bodily fluids, such as from needle sharing, having unprotected sex and breast-feeding. We also know that some diseases can be transmitted by mosquito bites. Is it likely that we can get AIDS from a mosquito bite? (AIDS) Expected correct answer: No

* A random selection of 6 cases is presented to each subject in the study.


11.2.5 Sample size

Due to the lack of benchmark data on the effectiveness of debiasing interventions for information searching, the exact sample size was confirmed after analysing the results from Phase 1 of the study. Before Phase 1, the expectation was that using a debiasing intervention would be 10% better than baseline in improving pre-/post-search answer accuracy. This 10% improvement was later confirmed by the Phase 1 results, which showed that 10% is approximately the midpoint of the range of percentage differences in correct answers between using a debiasing intervention and using baseline (Table 11.2).

A sample size of 231 responses for baseline and for each debiasing intervention is required to show a 10% difference at a 5% level of significance with a power of 80% (one-sided test) (Lwanga & Lemeshow, 1991). Assuming that 10% of the data will be excluded due to drop-out or subjects not completing the questions validly, 258 (> 1.1 × 231) pairs of baseline–intervention responses are required.
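As an illustration of this calculation, the sketch below applies a standard two-group sample size formula for comparing proportions with a one-sided test, in the style of Lwanga and Lemeshow (1991). The proportions 0.20 and 0.30 are assumed values chosen only to represent a 10% difference; the thesis does not report the exact proportions entered into the formula.

    from math import ceil, sqrt
    from scipy.stats import norm

    def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
        """Sample size per group for detecting a difference between two
        proportions with a one-sided test at the given significance and power."""
        z_alpha = norm.ppf(1 - alpha)          # one-sided critical value
        z_beta = norm.ppf(power)
        p_bar = (p1 + p2) / 2
        numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                     + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
        return ceil(numerator / (p1 - p2) ** 2)

    # A 10% difference (e.g. 20% vs 30%) at 5% significance and 80% power
    print(sample_size_two_proportions(0.20, 0.30))   # 231 responses per group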

11.2.6 Measurement variables

Subjects’ searches, selected documents, pre-/post-search answers and confidence, time taken from answering the question pre-search to answering post-search, and responses to the pre- and post-study questionnaires were logged during the experiment. Responses to questions were coded “correct”, “don’t know” or “incorrect” according to the pre-determined answers for each question.

Table 11.2 Percentage of correct answers before and after searching in Phase 1 of the study

Phase 1 intervention | Correct before searching | Correct after searching | % Improvement (post – pre) | % Difference to baseline
Baseline (n=57) | 32 (56.1%) | 48 (84.2%) | 28.1% | 0%
For/against document tool:
Anchoring (n=51) | 32 (62.7%) | 47 (92.2%) | 29.5% | 1.4%
Keep document tool:
Order (n=52) | 22 (42.3%) | 44 (84.6%) | 42.3% | 14.2%
Reinforcement (n=48) | 32 (66.7%) | 43 (89.6%) | 22.9% | –5.2%
Exposure (n=51) | 23 (45.1%) | 42 (82.4%) | 37.3% | 9.2%
Education-based:
Education (n=49) | 22 (44.9%) | 41 (83.7%) | 38.8% | 10.7%


11.3 Conclusion

This chapter describes the design of the study undertaken to evaluate the effectiveness of debiasing interventions for information searching. Results on how information searching impacts on subjects’ performance in answering health-related questions are detailed in Chapter 12. Comparisons between the baseline search interface and different debiasing interventions are reported in Chapters 13–14.

12 Impact of information searching

This chapter reports the impact of using an online evidence retrieval system on subjects’ performance in answering health-related questions. All users of the baseline search interface and debiasing interventions were pooled together in this chapter to investigate whether simply using an online evidence retrieval system improves decision making. Comparisons between the performance of the baseline search interface and debiasing interventions are reported in Chapters 13–14.

12.1 Data exclusion criteria

Not all data obtained in the study was retained for analysis. Figure 12.1 describes how data was excluded from the two phases of the study. Data was excluded when the subject provided incomplete responses to the question (e.g. did not conduct any searching before answering the post-search question, provided a “don’t know” post-search answer, or did not complete the study within 60 minutes). Data that originated from debiasing interventions that were discontinued in the investigation was also excluded (more details in Chapter 13).

12.2 Methods

This study investigates the impact of information searching on the accuracy of decision making and on confidence in decision outcomes. The test for difference between proportions was used to compare differences between subjects’ pre-search answers, post-search answers and post-subject answers (i.e. answers given after knowing other subjects’ post-search answers). It was also used to compare changes in subjects’ confidence in their answers pre- and post-search. Chi-square was used to examine whether there was a statistically significant relationship between subjects’ confidence in their answers and the accuracy of these answers.

Figure 12.1 Data exclusion in Phases 1 and 2 of the study

Before conducting any analysis, Phase 1 and Phase 2 of the study were compared to see if there were statistically significant differences in subjects’ characteristics, subjects’ ability to answer questions or the experimental design that would prevent the two phases from being combined for data analysis. Chi-square and the test for difference between proportions were used to compare differences in user demographic variables, the prevalence of correct pre-search answers and the allocation of case scenarios to each intervention between Phases 1 and 2.

12.3 Compatibility between Phases 1 and 2

Phases 1 and 2 of the study are combined for the purpose of data analysis because there were no statistically significant differences in subjects’ demographic attributes, subjects’ ability to answer questions pre-search, or the allocation of case scenarios between the two phases of the study. First, there were no statistically significant differences in subjects’ demographic attributes between the two phases. This is reported in Table 12.1, which compares subjects of both phases in terms of gender (χ2 = 0.46, df = 1, P = 0.50), age (χ2 = 5.70, df = 3, P = 0.13), self-rated search ability (χ2 = 1.24, df = 2, P = 0.54), general search experience (χ2 = 0.10, df = 1, P = 0.75) and health information search experience (χ2 = 5.02, df = 3, P = 0.17).
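For example, the gender comparison above can be recomputed directly from the counts in Table 12.1 with a standard chi-square test of homogeneity. The sketch below is offered only as an illustration (the thesis does not specify the software used); note that Yates’ continuity correction must be disabled to obtain the uncorrected statistic reported above.

    import numpy as np
    from scipy.stats import chi2_contingency

    # Gender counts from Table 12.1: rows = Phase 1, Phase 2; columns = female, male
    gender = np.array([[33, 24],
                       [97, 57]])

    chi2, p, df, expected = chi2_contingency(gender, correction=False)
    print(round(chi2, 2), df, round(p, 2))   # 0.46 1 0.5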

Second, there were no statistically significant differences in subjects’ ability to answer questions pre-search in the two phases of the study. Table 12.2 shows that there were no statistically significant differences in the number of questions answered correctly pre-search between the two phases except for case 7. Even though there was a statistically significant difference in the proportion of correct pre-search answers between Phases 1 and 2 for case 7, it was still included in the data analysis because the significant difference was possibly due to the small number of valid attempts (only 16) for case 7 in Phase 1 of the study.


Table 12.1 Characteristics of subjects

Variable | Phase 1 (n=57) | Phase 2 (n=154) | Total (n=211)
Gender:
Female | 33 (57.9%) | 97 (63.0%) | 130 (61.6%)
Male | 24 (42.1%) | 57 (37.0%) | 81 (38.4%)
Age:
<25 | 40 (70.2%) | 99 (64.3%) | 139 (65.9%)
25 to 34 | 14 (24.6%) | 32 (20.8%) | 46 (21.8%)
35 to 44 | 3 (5.3%) | 9 (5.8%) | 12 (5.7%)
45 or over | 0 (0%) | 14 (9.1%) | 14 (6.6%)
Search skill:
Fair or poor | 15 (26.3%) | 31 (20.1%) | 46 (21.8%)
Good | 27 (47.4%) | 73 (47.4%) | 100 (47.4%)
Very good | 15 (26.3%) | 50 (32.5%) | 65 (30.8%)
Search experience:
Once a week or less | 4 (7.0%) | 9 (5.8%) | 13 (6.2%)
Several times a week | 53 (93.0%) | 145 (94.2%) | 198 (93.8%)
Health search experience:
Never | 3 (5.3%) | 6 (3.9%) | 9 (4.3%)
Less than once a week | 32 (56.1%) | 62 (40.3%) | 94 (44.5%)
Once a week | 10 (17.5%) | 42 (27.3%) | 52 (24.6%)
Several times a week | 12 (21.1%) | 44 (28.6%) | 56 (26.5%)

Third, there were no statistically significant differences in the allocation of case scenarios to each intervention in each phase of the study nor in both phases combined (Phase 1: χ2 = 17.84, df = 14, P = 0.21; Phase 2: χ2 = 4.10, df = 14, P = 1.00; Phases 1 and 2: χ2 = 2.68, df = 14, P = 1.00) (see Table 12.3 for the allocation of case scenarios in Phase 1, Phase 2 and both phases combined).

Table 12.2 Subjects’ ability to answer questions pre-search

Case scenario | Phase 1: n | Phase 1: correct pre-search | Phase 2: n | Phase 2: correct pre-search | Z | P
1. Diet | 18 | 5 (27.8%) | 97 | 33 (34.0%) | 0.54 | 0.58
2. Hepatitis B | 24 | 15 (62.5%) | 99 | 75 (75.8%) | 1.23 | 0.22
3. Alcohol | 23 | 17 (73.9%) | 90 | 76 (84.4%) | 1.06 | 0.29
4. SIDS | 20 | 15 (75.0%) | 91 | 56 (61.5%) | 1.23 | 0.22
5. Breast cancer | 18 | 15 (83.3%) | 103 | 93 (90.3%) | 0.75 | 0.45
6. Suicide | 25 | 10 (40.0%) | 88 | 53 (60.2%) | 1.82 | 0.07
7. Cold | 16 | 1 (6.3%) | 95 | 21 (22.1%) | 2.14 | 0.03
8. AIDS | 16 | 8 (50.0%) | 105 | 75 (71.4%) | 1.62 | 0.11


Table 12.3 Allocation of case scenarios to interventions

Debiasing intervention | Case 1 | Case 2 | Case 3 | Case 4 | Case 5 | Case 6 | Case 7 | Case 8 | Total
Phase 1:
Baseline | 4 | 10 | 8 | 8 | 12 | 6 | 4 | 5 | 57
Anchoring | 7 | 5 | 8 | 9 | 3 | 7 | 7 | 5 | 51
Order | 7 | 9 | 7 | 3 | 3 | 12 | 5 | 6 | 52
Total | 18 | 24 | 23 | 20 | 18 | 25 | 16 | 16 | 160
Phase 2:
Baseline | 36 | 33 | 35 | 33 | 31 | 36 | 37 | 43 | 284
Anchoring | 31 | 32 | 27 | 29 | 36 | 25 | 29 | 31 | 240
Order | 30 | 34 | 28 | 29 | 36 | 27 | 29 | 31 | 244
Total | 97 | 99 | 90 | 91 | 103 | 88 | 95 | 105 | 768
Both phases combined:
Baseline | 40 | 43 | 43 | 41 | 43 | 42 | 41 | 48 | 341
Anchoring | 38 | 37 | 35 | 38 | 39 | 32 | 36 | 36 | 291
Order | 37 | 43 | 35 | 32 | 39 | 39 | 34 | 37 | 296
Total | 115 | 123 | 113 | 111 | 121 | 113 | 111 | 121 | 928

Overall, the two phases of the study included 211 subjects and 928 responses (341 for baseline, 291 for anchoring and 296 for order), with 1606 searches and 3019 document accesses (Figure 12.1). 61.6% of subjects were female and 38.4% male (Table 12.1). Most subjects were under 25 years of age (65.9%), followed by the age groups 25 to 34 (21.8%), over 45 (6.6%) and 35 to 44 (5.7%) (Table 12.1). The majority of subjects searched the Internet several times a week (93.8%) and 6.2% searched once a week or less (Table 12.1). In addition, 44.5% searched for health information less than once a week, 26.5% several times a week, 24.6% once a week and 4.3% never searched for health information on the Internet (Table 12.1). Further, 47.4% rated themselves to be good searchers, 30.8% very good searchers and 21.8% fair or poor searchers (Table 12.1). Overall, subjects took on average 361 seconds (SD: 281.2) to search, made 1.73 (SD: 1.391) searches and accessed 3.25 (SD: 3.067) documents to answer a question (Table 12.5; see Table 12.4 for the number of search sessions, searches and accessed documents in each phase before and after data exclusion).


Table 12.4 Number of search sessions, searches and accessed documents in each phase of the study (before and after data exclusion)

Case number | Search sessions (before / after exclusion) | Searches (before / after) | Accessed documents (before / after)
Phase 1:
1 | 42 / 18 | 51 / 24 | 165 / 74
2 | 43 / 24 | 58 / 30 | 120 / 63
3 | 41 / 23 | 67 / 36 | 164 / 98
4 | 40 / 20 | 48 / 32 | 157 / 97
5 | 40 / 18 | 54 / 31 | 132 / 60
6 | 44 / 25 | 76 / 40 | 191 / 100
7 | 39 / 16 | 48 / 18 | 158 / 82
8 | 43 / 16 | 47 / 20 | 139 / 55
Total | 332 / 160 | 449 / 231 | 1226 / 629
Phase 2:
1 | 107 / 97 | 150 / 148 | 351 / 343
2 | 116 / 99 | 204 / 191 | 271 / 247
3 | 107 / 90 | 204 / 191 | 284 / 268
4 | 102 / 91 | 144 / 140 | 243 / 243
5 | 116 / 103 | 165 / 162 | 317 / 317
6 | 107 / 88 | 289 / 247 | 388 / 353
7 | 108 / 95 | 143 / 142 | 336 / 336
8 | 116 / 105 | 155 / 154 | 287 / 263
Total | 879 / 768 | 1454 / 1375 | 2477 / 2390

Table 12.5 Average time taken, no. of searches conducted and no. of documents accessed to answer each scenario question in Phases 1 & 2 of the study (after data exclusion)

Case number | Time taken (seconds)*: average (SD) | Range | No. of searches: average (SD) | Range | No. of accessed documents: average (SD) | Range
1 | 486 (379.6) | 0 to 1560 | 1.50 (0.912) | 1 to 6 | 3.63 (3.278) | 0 to 17
2 | 346 (244.7) | 0 to 1520 | 1.80 (1.361) | 1 to 10 | 2.52 (2.038) | 0 to 10
3 | 356 (237.7) | 0 to 1102 | 2.01 (1.550) | 1 to 9 | 3.24 (2.650) | 0 to 13
4 | 322 (288.9) | 0 to 1505 | 1.55 (0.882) | 1 to 5 | 3.06 (2.909) | 0 to 16
5 | 340 (270.2) | 0 to 1399 | 1.60 (1.107) | 1 to 8 | 3.12 (2.961) | 0 to 14
6 | 392 (264.2) | 0 to 1106 | 2.54 (2.368) | 1 to 15 | 4.01 (4.325) | 0 to 34
7 | 379 (288.0) | 0 to 1298 | 1.44 (0.969) | 1 to 7 | 3.77 (3.322) | 0 to 14
8 | 286 (222.1) | 0 to 1634 | 1.44 (1.016) | 1 to 7 | 2.79 (2.415) | 0 to 13
Overall | 361 (281.2) | 0 to 1634 | 1.73 (1.391) | 1 to 15 | 3.25 (3.067) | 0 to 34

* Measured as time elapsed between commencement of first and last searches. Time taken is therefore zero where only one search was conducted.


12.4 Results

12.4.1 Impact on decision outcome

The test for difference between proportions shows that there was a statistically significant improvement (21%) in the percentage of correct answers before and after searching (pre-search: 61.2%, 95% CI: 58.03 to 64.29; post-search: 82.0%, 95% CI: 79.40 to 84.34; Z = 10.31, P < 0.001) (Table 12.6). Most subjects answered correctly before and after searching, right-right (RR: 56.5%, 95% CI: 53.26 to 59.63), followed by those who improved their answers after searching, wrong-right (WR: 25.5%, 95% CI: 22.84 to 28.44), then those who never answered correctly, wrong-wrong (WW: 13.3%, 95% CI: 11.22 to 15.58), and those who went from right to wrong (RW: 4.7%, 95% CI: 3.55 to 6.30) (Table 12.7). There were no statistically significant differences in the percentage of correct answers before and after knowing other subjects’ answers (pre: 82.0%, 95% CI: 79.40 to 84.34; post: 85.3%, 95% CI: 82.92 to 87.47; Z = 1.75, P = 0.80) (Table 12.6), i.e. there was no additional benefit to knowing other users’ answers.

Table 12.6 Percentage of correct answers before searching, after searching and after knowing other subjects’ answers for each case scenario of the study

Case scenario (n) | Correct before searching | Correct after searching | Correct after knowing other subjects’ answers
1. Diet (115) | 38 (33.0%) | 72 (62.6%) | 79 (68.7%)
2. Hepatitis B (123) | 90 (73.2%) | 108 (87.8%) | 114 (92.7%)
3. Alcohol (113) | 93 (82.3%) | 94 (83.2%) | 99 (87.6%)
4. SIDS (111) | 71 (64.0%) | 95 (85.6%) | 97 (87.4%)
5. Breast cancer (121) | 108 (89.3%) | 108 (89.3%) | 111 (91.7%)
6. Suicide (113) | 63 (55.8%) | 98 (86.7%) | 104 (92.0%)
7. Cold (111) | 22 (19.8%) | 68 (61.3%) | 71 (64.0%)
8. AIDS (121) | 83 (68.6%) | 118 (97.5%) | 117 (96.7%)
Total (928) | 568 (61.2%) | 761 (82.0%) | 792 (85.3%)

Table 12.7 Changes in answer before and after system use (n=928) (after Westbrook et al., 2005)

Pre-search | Post-search | % (95% CI) | Total no.
Right | Right | 56.5% (53.26 to 59.63) | 524
Wrong | Right | 25.5% (22.84 to 28.44) | 237
Wrong | Wrong | 13.3% (11.22 to 15.58) | 123
Right | Wrong | 4.7% (3.55 to 6.30) | 44

12.4.2 Impact on confidence

Table 12.9 shows that the most frequently self-reported change in confidence before and after searching for all WW, WR, RW and RR responses was “increased confidence” (WW: 51.9%, 95% CI: 42.53 to 61.05; WR: 54.0%, 95% CI: 46.34 to 61.55; RW: 40.4%, 95% CI: 27.64 to 54.66; RR: 71.1%, 95% CI: 67.35 to 74.65). More than half (55.6%, 95% CI: 37.32 to 72.42) of subjects who did not know the answer pre-search and answered incorrectly post-search (DW) reported they were confident or very confident in their post-search answer (Table 12.8). In fact, 82.0% (95% CI: 75.52 to 87.12) of subjects who were incorrect post-search reported they were confident or very confident in their answer (Table 12.10).

Although Table 12.10 shows that the proportion of highly confident correct answers (i.e. confident or very confident) significantly increased after searching (pre-search: 61.6%, 95% CI: 57.56 to 65.53; post-search: 95.5%, 95% CI: 93.82 to 96.79; Z = –15.60, P < 0.0001), the proportion of highly confident incorrect answers also increased after searching (pre-search: 55.3%, 95% CI: 50.12 to 60.33; post-search: 82.0%, 95% CI: 75.52 to 87.12; Z = –6.75, P < 0.0001) (Table 12.10). Perhaps not surprisingly, those who were not confident or somewhat confident in their post-search answers were 28.5% more likely than those who were confident or very confident to change their answer after knowing other subjects’ post-search answers (not confident/somewhat confident: 34.4%, 95% CI: 23.93 to 46.60; confident/very confident: 5.9%, 95% CI: 4.52 to 7.67; χ2 = 66.65, df = 1, P < 0.0001) (Table 12.11).

Table 12.8 Confidence in post-search answer for subjects who did not know scenario answer before searching (n=147) (after Westbrook et al., 2005a)

Did not know answer before search:
Post-search confidence | Wrong after search (DW) (n=27) | Right after search (DR) (n=120)
Not confident/Somewhat confident | 12 (44.4%) | 13 (10.8%)
Confident/Very confident | 15 (55.6%) | 107 (89.2%)


Table 12.9 Changes in confidence in original answer following online searches (n=905) (after Westbrook et al., 2005a)*

Change in confidence | Wrong before, wrong after (WW) (n=108) | Wrong before, right after (WR) (n=161) | Right before, wrong after (RW) (n=47) | Right before, right after (RR) (n=589)
Decreased | 15 (13.9%) | 58 (36.0%) | 14 (29.8%) | 5 (0.8%)
No change | 37 (34.3%) | 16 (9.9%) | 14 (29.8%) | 165 (28.0%)
Increased | 56 (51.9%) | 87 (54.0%) | 19 (40.4%) | 419 (71.1%)

* In 23 responses, subjects did not report a confidence rating.

Table 12.10 Comparison of confidence between pre- and post-search right and wrong answers

Confidence in answer | Pre-search | Post-search | Z | P
Right answer | (n=568) | (n=761) | |
Not confident/Somewhat confident | 208 (36.6%) | 34 (4.5%) | 14.91 | <0.0001
Confident/Very confident | 350 (61.6%) | 727 (95.5%) | –15.60 | <0.0001
Not provided | 10 (1.8%) | – | – | –
Wrong answer | (n=360) | (n=167) | |
Not confident/Somewhat confident | 154 (42.8%) | 30 (18.0%) | 6.28 | <0.0001
Confident/Very confident | 199 (55.3%) | 137 (82.0%) | –6.75 | <0.0001
Not provided | 7 (1.9%) | – | – | –

Table 12.11 Percentage of subjects who change post-search answer after knowing other subjects’ post-search answers

Post-search confidence | Changed answer after knowing other subjects’ answers: Yes | No
Not confident/Somewhat confident (n=64) | 22 (34.4%) | 42 (65.6%)
Confident/Very confident (n=864) | 51 (5.9%) | 813 (94.1%)


12.5 Discussion

These results for non-clinically trained users are in line with other studies reporting that search systems improve the ability of clinically trained users to answer questions (e.g. Hersh et al., 2000; Hersh et al., 2002; Westbrook et al., 2005). The 21% improvement in accuracy between pre- and post-search answers reported in Table 12.6 corresponds with the study conducted by Hersh et al. (2002), which found that 66 medical and nurse practitioner students improved the accuracy of their answers to a set of five clinical questions by up to 20% after using MEDLINE. Our improvement rate also corresponds with the 21% improvement reported for the Quick Clinical engine (upon which this search system was based) when it was used by clinicians (Westbrook et al., 2005: pre-search correct: 29%; post-search correct: 50%, Z = 9.58, P < 0.001).

Results for subjects’ confidence in answers also correspond with the finding that clinicians’ confidence in an answer is not always related to the answer being correct (Westbrook et al., 2005a). The observation that 55.6% (95% CI: 37.32 to 72.42) of subjects who did not know the answer before searching reported they were confident or very confident in their incorrect post-search answers (Table 12.8) concurs with the result of Westbrook et al. (2005a), which found that amongst clinicians who did not know the answer before searching and were incorrect after searching, 60% of doctors and 52% of clinical nurse consultants reported they were confident or very confident in their incorrect post-search answer.

12.5.1 Limitations of this study

There are several limitations in the online pre/post study design adopted here:

• High percentage of invalid responses and drop-out rate in online experiments: Subjects self-administered the experiment in their own space at their own time. To avoid a high percentage of invalid responses, subjects were informed before commencing the study that their answers would be checked and verified against the information they accessed during searching in order to be eligible for remuneration (Reips, 2000). The baseline search interface was made available early in the experiment so that any valid attempts of a debiasing intervention later in the study could be compared to the subject’s baseline response. In addition, the system monitored the number of valid attempts completed for each debiasing intervention before allocating an intervention, to ensure a balanced distribution of valid attempts across all interventions at any time. Furthermore, encouragement messages were given to each user after the completion of each question to remind them of how many questions they had answered and how close they were to the completion of the study.

• Knowing other subjects’ answers discourages information searching: It was only during data analysis and in hindsight that we realised the component that invites subjects to view other subjects’ post-search answers may actually discourage subjects from searching for information. To address this issue, all cases where searching was not conducted have been removed from the data analysis.

• Use of external material to answer questions: Even though subjects were asked to use only the provided search system and not to use any external resources to answer the questions, there was no monitoring to check whether subjects used external resources other than the provided online evidence retrieval system. However, the navigational bar on the browser was disabled throughout the experiment to prevent subjects visiting other websites or going back to completed questions to alter their answers. In addition, subjects were asked to complete each question within a ten-minute time limit, which would likely minimise the possible use of printed material.

• Same subject doing the experiment more than once: It is possible that the same subject could have logged on to do the experiment more than once, or that a subject used multiple identities to do the experiment. However, no suspicions of multiple attempts or logins from the same subject were identified. Also, other studies have shown that the rate of repeated participation is less than 3% in most studies (Reips, 2000).

• Self-selection bias: May occur because the study could possibly be more appealing to people who are interested or literate in computers, the Internet, search engines, information searching or health-related topics. These people may be more enthusiastic about online health information searching than the general Internet search-experienced population. In addition, subjects from the university setting are more likely to be open and positive to new research ideas and willing to participate in research studies, which may potentially bias the research findings. However, the purpose of this study is to compare the effectiveness of using a debiasing intervention and a baseline search interface, with each subject being his/her own control in the experiment.

• Second decision effect: May occur when subjects make a mistake while inputting their pre-search answer and use the post-search attempt to correct their answer; in this case, the difference in pre- and post-search answers may not actually be a result of the searching intervention. To address the possibility of this effect, subjects who did not search to answer the question were removed from the analysis.

• Measurement bias and Hawthorne effect: There is a tendency for people to give preference to interventions that favour what they think the study wants (Spyridakis et al., 2005). However, throughout the study, subjects were never briefed on the main aim of this study, namely to investigate and compare the effectiveness of the interventions and a baseline search interface in minimising the impact of biases on information searching and decision making.

12.6 Conclusion

Similar to clinicians, non-clinically trained subjects in this study improved their accuracy in answering health-related questions after using an online evidence retrieval system, but their confidence in the answer was not always related to the answer being correct. In fact, there was a significant increase in the proportion of highly confident incorrect answers after searching. Nevertheless, this chapter demonstrates that even for non-clinically trained users, providing access to health-related information from tested resources is effective in improving their ability to answer questions.

13 Impact of debiasing: bias behaviour

This chapter tests the hypotheses that people experience cognitive biases during information searching and that information searching can be debiased to correct for this effect. Serial curves are used in this chapter to evaluate whether subjects experienced order effects, the anchoring effect and the confidence in anchoring effect when using the baseline search interface and the corresponding debiasing intervention. Serial curves are also used to evaluate whether subjects experienced exposure effect and reinforcement effect when using the baseline search interface.

13.1 Data exclusion criteria

To ensure that only valid pairwise comparisons were included in this analysis, data were excluded if a subject did not use both the baseline search interface and a debiasing intervention. Figure 13.1 describes how subjects and their responses were removed from Phases 1 and 2 in the baseline–anchoring and baseline–order comparisons. When a subject’s baseline attempt was excluded, the subject’s attempts using debiasing interventions were also excluded. Combining Phases 1 and 2 after data exclusion, there were 183 subjects in the baseline–anchoring comparison, who produced 303 baseline responses, 291 anchoring responses, 1095 searches and 1904 document accesses; there were 182 subjects in the baseline–order comparison, who produced 304 baseline responses, 296 order responses, 1098 searches and 1860 document accesses.



Figure 13.1 Data exclusion criteria for comparing the effectiveness of baseline search interface and debiasing intervention


13.2 Method

Separate serial curves were created for order effects, the anchoring effect and the confidence in anchoring effect using baseline search interface and debiasing intervention responses. Chi-square analysis was conducted to evaluate whether subjects were influenced by each of these effects within the baseline search interface and each debiasing intervention. The test for difference between proportions was used to compare the pre-search answer retention rate between using the baseline search interface and the anchor debiasing intervention. Serial curves were also created for exposure effect and reinforcement effect using only baseline search interface responses.

13.3 Results for debiasing order effects

13.3.1 Order effects hypothesis

Order effects are hypothesised in Figure 13.2. People who do not experience order effects would display behaviour similar to the null hypothesis (H0), and people who experience order effects would display behaviour similar to the alternative hypothesis (H1). The null hypothesis describes the scenario where the concurrence between subjects’ post-search answer and the answer suggested by a document is not influenced by the position at which the document was accessed. The alternative hypothesis describes the scenario where there is greater concurrence between subjects’ post-search answer and the answer suggested by documents accessed at the first and last positions.

Figure 13.2 Order effects: the null hypothesis (H0) and the alternative hypothesis (H1)

The expectation is that people who use the baseline search interface would experience order effects and display behaviour similar to the alternative hypothesis (H1), and people who use the order debiasing intervention would not experience order effects and display behaviour similar to the null hypothesis (H0).

13.3.2 Order effects (baseline versus order debiasing intervention)

For responses completed using the baseline search interface, results from the order effects investigation are reported in Table 13.1 and Figure 13.3. Table 13.1 shows that there was a statistically significant relationship between the access position of a document and the concurrence between the post-search answer and the answer suggested by the document (χ2 = 7.27, df = 2, P = 0.026). The concurrence rate decreased as the document access position proceeded from first, through middle, to last (also illustrated in Figure 13.3). In other words, order effects were evident amongst responses completed using the baseline search interface.

For responses completed using the keep document tool (i.e. order debiasing intervention), results from the order effects investigation are reported in Table 13.1 and Figure 13.3. Table 13.1 shows that there was not a statistically significant relationship between the access position of a document and the concurrence between the post-search answer and the answer suggested by the document (χ2 = 2.14, df = 2, P = 0.34); this is also illustrated in Figure 13.3. In other words, order effects were not evident amongst responses completed using the keep document tool.


Table 13.1 Relationship between document access position and concurrence between post- search answer and document-suggested answer (baseline vs. order debiasing intervention)

Access position | Concurrence between post-search answer and document: Yes | No
Baseline search interface:
First (n=185) | 171 (92.4%) | 14 (7.6%)
Middle (n=342) | 306 (89.5%) | 36 (10.5%)
Last (n=185) | 155 (83.8%) | 30 (16.2%)
Keep document tool (i.e. order debiasing intervention):
First (n=212) | 173 (81.6%) | 39 (18.4%)
Middle (n=573) | 481 (83.9%) | 92 (16.1%)
Last (n=212) | 184 (86.8%) | 28 (13.2%)
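As a check on the figures in Table 13.1, the chi-square statistic for the baseline rows can be recomputed directly from the table’s counts; the short sketch below (written for illustration, not taken from the thesis) reproduces the values reported above.

    import numpy as np
    from scipy.stats import chi2_contingency

    # Baseline counts from Table 13.1: rows = first, middle, last access position;
    # columns = post-search answer concurs with the document, does not concur
    baseline = np.array([[171, 14],
                         [306, 36],
                         [155, 30]])

    chi2, p, df, _ = chi2_contingency(baseline)
    print(round(chi2, 2), df, round(p, 3))   # 7.27 2 0.026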

Figure 13.3 Relationship between document access position and concurrence rate between post-search answer and document-suggested answer (baseline vs. order debiasing intervention)

13.4 Results on debiasing the anchoring effect

13.4.1 Anchoring effect hypothesis

The anchoring effect is hypothesised in Figure 13.4 and the confidence in anchoring effect is hypothesised in Figure 13.5. People who do not experience the anchoring effect or the confidence in anchoring effect would display behaviour similar to the null hypothesis (H0) in both Figure 13.4 and Figure 13.5, whereas people who experience any of these effects would display behaviour similar to the alternative hypothesis (H1).


The null hypothesis describes the scenario where people are equally likely to answer correctly post-search regardless of their pre-search answer or their confidence in the pre- search answer. The alternative hypothesis describes the scenario where people who are right pre-search are more likely to be right post-search than subjects who are wrong pre-search. In addition, people who are more confident in their pre-search answers are more likely to retain their pre-search answers after searching than those who are not as confident.

The expectation is that people who use the baseline search interface would experience the anchoring effect and/or the confidence in anchoring effect, and display behaviour similar to the alternative hypothesis (H1); and people who use the anchor debiasing intervention would experience neither the anchoring effect nor the confidence in anchoring effect, and display behaviour similar to the null hypothesis (H0).

Figure 13.4 Anchoring effect: the null hypothesis (H0) and the alternative hypothesis (H1)


Figure 13.5 Confidence in anchoring effect: the null hypothesis (H0) and the alternative hypothesis (H1)

13.4.2 Anchoring effect (baseline vs. anchor debiasing intervention)

For responses completed using the baseline search interface, results from the anchoring effect investigation are reported in Table 13.2 and Figure 13.6. Table 13.2 shows that there was a statistically significant relationship between subjects’ pre-search answers and their post-search answers (χ2 = 50.25, df = 1, P < 0.001). Subjects who were right pre-search were more likely to be right post-search than those who were wrong pre-search (also illustrated in Figure 13.6); similarly, subjects who were wrong pre-search were more likely to be wrong post-search than those who were right pre-search. In other words, an anchoring effect was evident amongst responses completed using the baseline search interface.

For responses completed using the for/against document tool (i.e. anchor debiasing intervention), results from the anchoring effect investigation are reported in Table 13.2 and Figure 13.6. Table 13.2 shows that there was a statistically significant relationship between subjects’ pre-search answers and their post-search answers (χ2 = 22.48, df = 1, P < 0.001). Subjects who were right pre-search were more likely to be right post-search than those who were wrong pre-search (also illustrated in Figure 13.6); similarly, subjects who were wrong pre-search were more likely to be wrong post-search than those who were right pre-search. In other words, an anchoring effect was evident amongst responses completed using the for/against document tool.


Table 13.2 Relationship between pre-search answer and post-search answer (baseline vs. anchor debiasing intervention)

Before search | Right after search | Wrong after search
Baseline search interface:
Right (n=192) | 181 (94.3%) | 11 (5.7%)
Wrong (n=111) | 69 (62.2%) | 42 (37.8%)
For/against document tool (i.e. anchor debiasing intervention):
Right (n=182) | 169 (92.9%) | 13 (7.1%)
Wrong (n=109) | 79 (72.5%) | 30 (27.5%)

Figure 13.6 Relationship between pre-search answer and post-search correctness (baseline vs. anchor debiasing intervention)


13.4.3 Confidence in anchoring effect (baseline versus anchor debiasing intervention)

Results from the confidence in anchoring effect investigation using the baseline search interface are reported in Table 13.3 and Figure 13.7. Although Figure 13.7 shows an ascending relationship between subjects’ confidence in their pre-search answers and their pre-search answer retention rate, chi-square analysis conducted on Table 13.3 shows that this relationship was not statistically significant (χ2 = 2.67, df = 3, P = 0.45). In other words, a confidence in anchoring effect was not evident amongst responses completed using the baseline search interface.

Results from the confidence in anchoring effect investigation using the for/against document tool (i.e. anchor debiasing intervention) are reported in Table 13.3 and Figure 13.7. Table 13.3 shows that there was a statistically significant relationship between subjects’ confidence in their pre-search answers and their retention of those answers after searching (χ2 = 9.91, df = 3, P = 0.019); this is also illustrated in Figure 13.7. Subjects who lacked confidence in their pre-search answers (i.e. not confident or somewhat confident) were less likely to retain their answers after searching than subjects who were confident or very confident in their pre-search answers (not confident: 51.2%, 95% CI: 36.75 to 65.38; somewhat confident: 64.1%, 95% CI: 53.02 to 73.85; confident: 75.6%, 95% CI: 65.76 to 83.27; and very confident: 74.3%, 95% CI: 63.34 to 82.90). In other words, a confidence in anchoring effect was evident amongst responses completed using the for/against document tool.


Table 13.3 Relationship between confidence in pre-search answer and retention of pre-search answer after searching (baseline vs. anchor debiasing intervention)

Confidence (before search) | Retained pre-search answer after searching: Yes | No
Baseline search interface*:
Not confident (n=33) | 22 (66.7%) | 11 (33.3%)
Somewhat confident (n=71) | 51 (71.8%) | 20 (28.2%)
Confident (n=85) | 62 (72.9%) | 23 (27.1%)
Very confident (n=106) | 84 (79.2%) | 22 (20.8%)
For/against document tool (i.e. anchor debiasing intervention)†:
Not confident (n=43) | 22 (51.2%) | 21 (48.8%)
Somewhat confident (n=78) | 50 (64.1%) | 28 (35.9%)
Confident (n=90) | 68 (75.6%) | 22 (24.4%)
Very confident (n=74) | 55 (74.3%) | 19 (25.7%)

* 8 responses were excluded because they did not provide a pre-search confidence.
† 6 responses were excluded because they did not provide a pre-search confidence.

[Figure omitted in extraction: pre-search answer retention rate (y-axis) by confidence in pre-search answer – not confident, somewhat confident, confident, very confident (x-axis); series: baseline, anchor debiasing intervention]

Figure 13.7 Relationship between confidence in pre-search answer and pre-search answer retention rate


13.4.4 Baseline versus anchor debiasing intervention: comparative analysis

This section compares the pre-search answer retention rate between using the baseline search interface and the for/against document tool (i.e. anchor debiasing intervention).

Table 13.4 shows that there were no statistically significant differences in the proportion of subjects who retained their pre-search answer after searching between those who used the baseline search interface and those who used the anchor debiasing intervention (baseline: 74.2%, 95% CI: 68.96 to 78.89; anchoring: 68.4%, 95% CI: 62.81 to 73.54; Z = 1.55, P = 0.12).

Amongst subjects who lacked confidence in their pre-search answers (i.e. not confident or somewhat confident) (Table 13.4), subjects who used the anchor debiasing intervention were marginally more likely to change their answers than subjects who used the baseline search interface (baseline: 29.8%, 95% CI: 21.86 to 39.19; anchoring: 40.5%, 95% CI: 32.17 to 49.40; Z = –1.69, P = 0.091). However, amongst subjects who were confident or very confident in their pre-search answers (Table 13.4), there were no statistically significant differences in the pre-search answer retention rate between subjects who used the baseline search interface and subjects who used the anchor debiasing intervention (baseline: 76.4%, 95% CI: 69.94 to 81.90; anchoring: 75.0%, 95% CI: 67.85 to 81.00; Z = 0.32, P = 0.75).

Table 13.4 Comparative pre-search answer retention rates by search interface at different pre-search confidence levels (baseline vs. anchor debiasing intervention)

Retained pre-search answer             Baseline search     Anchor debiasing
after searching?                       interface           intervention          Z        P
All pre-search confidence levels       (n=295)             (n=285)
  Yes                                  219 (74.2%)         195 (68.4%)           1.55     0.12
  No                                    76 (25.8%)          90 (31.6%)          –1.55     0.12
Not confident or somewhat confident
in pre-search answer                   (n=104)             (n=121)
  Yes                                   73 (70.2%)          72 (59.5%)           1.69     0.091
  No                                    31 (29.8%)          49 (40.5%)          –1.69     0.091
Confident or very confident
in pre-search answer                   (n=191)             (n=164)
  Yes                                  146 (76.4%)         123 (75.0%)           0.32     0.75
  No                                    45 (23.6%)          41 (25.0%)          –0.32     0.75
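The Z values in Table 13.4 are consistent with the standard two-proportion z-test using a pooled estimate of the proportion. A minimal sketch (plain Python, shown only to make the calculation explicit) reproducing the overall comparison:

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for the difference between two independent proportions,
    using the pooled proportion for the standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Retention of the pre-search answer, all confidence levels (Table 13.4):
# baseline 219/295 vs. anchor debiasing intervention 195/285.
print(round(two_proportion_z(219, 295, 195, 285), 2))  # ~1.55, as reported
```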


13.5 Results on exposure effect

13.5.1 Exposure effect hypothesis

The exposure effect hypotheses are illustrated in Figure 13.8. The null hypothesis (H0) describes the scenario where the concurrence between subjects’ post-search answer and the answer suggested by a document is not influenced by the amount of time subjects spend on the document. The alternative hypothesis (H1) describes the scenario where the amount of time subjects spend on a document influences the concurrence rate. People who experience exposure effect are expected to display behaviour similar to H1; people who do not are expected to display behaviour similar to H0.

13.5.2 Exposure effect

With responses completed using the baseline search interface, results from the exposure effect investigation are reported in Table 13.5 and Figure 13.9. Table 13.5 shows that there was a statistically significant relationship between the exposure level of a document and the concurrence between the post-search answer and the answer suggested by the document (χ2 = 9.64, df = 2, P = 0.0081) (also illustrated in Figure 13.9). In other words, exposure effect was evident amongst responses completed using the baseline search interface.

[Figure omitted in extraction: concurrence rate (%) by time spent on document – least, medium, most; series: H0 (no exposure effect), H1 (exposure effect)]

Figure 13.8 Exposure effect: the null hypothesis (H0) and the alternative hypothesis (H1)


Table 13.5 Relationship between document exposure level and concurrence between post-search answer and document-suggested answer (using baseline search interface)

                              Concurrence between post-search answer and document?
Level of exposure             Yes               No
Baseline search interface
  Least (n=126)               103 (81.7%)       23 (18.3%)
  Medium (n=221)              205 (92.8%)       16 (7.2%)
  Most (n=194)                171 (88.1%)       23 (11.9%)

[Figure omitted in extraction: concurrence rate (%) by time spent on document (min) – least, medium, most]

Figure 13.9 Relationship between document exposure level and concurrence rate between post-search answer and document-suggested answer (using baseline search interface)


13.6 Results on reinforcement effect

13.6.1 Reinforcement effect hypothesis

The reinforcement effect hypotheses are illustrated in Figure 13.10. The null hypothesis (H0) describes the scenario where the concurrence between subjects’ post-search answer and the answer suggested by a document is not influenced by the frequency with which the document was accessed. The alternative hypothesis (H1) describes the scenario where the access frequency of a document influences the concurrence rate. People who experience reinforcement effect while using the baseline search interface are expected to display behaviour similar to H1; people who do not are expected to display behaviour similar to H0.

13.6.2 Reinforcement effect

With responses completed using the baseline search interface, results from the reinforcement effect investigation are reported in Table 13.6 and Figure 13.11. Table 13.6 shows that there was not a statistically significant relationship between the access frequency of a document and the concurrence between the post-search answer and the answer suggested by the document (χ2 = 0.061, df = 1, P = 0.81) (also illustrated in Figure 13.11). In other words, reinforcement effect was not evident amongst responses completed using the baseline search interface.

[Figure omitted in extraction: concurrence rate (%) by number of visits to a document – once only, more than once; series: H0 (no reinforcement effect), H1 (reinforcement effect)]

Figure 13.10 Reinforcement effect: the null hypothesis (H0) and the alternative hypothesis (H1)

Table 13.6 Relationship between document access frequency and concurrence between post-search answer and document-suggested answer (using baseline search interface)

                              Concurrence between post-search answer and document?
Access frequency              Yes               No
Baseline search interface
  Once only (n=583)           504 (86.4%)       79 (13.6%)
  More than once (n=72)        63 (87.5%)        9 (12.5%)

[Figure omitted in extraction: concurrence rate (%) by number of visits to a document – once only, more than once]

Figure 13.11 Relationship between document access frequency and concurrence rate between post-search answer and document-suggested answer (using baseline search interface)

13.7 Discussion

Subjects experienced both the anchoring effect and order effects when using the baseline search interface. There was a statistically significant relationship between subjects’ pre-search answers and their post-search answers. Subjects were equally likely to retain their pre-search answer after searching regardless of how confident they were in the pre-search answer. There was a statistically significant relationship between the access position of a document and the degree of influence the document had on a decision. The observed relationship – that the influence of a document decreases as one gets further away from the start of a search journey – is contrary to the order effects hypothesised in this research.

Using the keep document tool (i.e. order debiasing intervention) successfully eliminated order effects: there were no statistically significant differences in the degree of influence a document had on a decision between different access positions. However, using the for/against document tool (i.e. anchor debiasing intervention) did not eliminate the anchoring effect, but introduced the confidence in anchoring effect. Amongst subjects who used the for/against document tool, there was a statistically significant relationship between subjects’ confidence in their pre-search answer and their retention rate of that answer after searching. In addition, subjects who lacked confidence in their pre-search answers were more likely to change their answers post-search than those who were confident or very confident in their pre-search answers.

One possible explanation for this anchoring phenomenon is that although the for/against document tool did not eliminate the anchoring effect, it reduced the impact of the pre-search answer by helping people who were less confident in their pre-search answers to change their answers after information searching. Compared to using the baseline search interface, using the for/against document tool marginally reduced the pre-search answer retention rate amongst subjects who lacked confidence in their pre-search answers, and did not increase the retention rate amongst subjects who were confident or very confident in their pre-search answers.

Several limitations are associated with the design of the debiasing interventions and the assumption that cognitive biases can be debiased one at a time, which raise questions about the effectiveness of these interventions in debiasing information searching:

• Debiasing relies on subjects using the interventions: One assumption behind the anchoring and order debiasing interventions is that subjects use the keep document and for/against document tools provided to search for information. However, these interventions would not have worked if subjects did not use the tool to collect documents during searching, subjects did not look at the rearranged documents before making a decision, or subjects accessed too few documents to have any benefit from these document collection tools.

• Interaction of biases: Although this research debiases one cognitive bias at a time, it is unclear how biases interact and the collective impact they have on information searching. As demonstrated in this chapter and in other studies (Weinstein & Klein, 1995; Sanna et al., 2002; Turpin & Plooy, 2004), interventions can unintentionally introduce new biases, amplify or reduce the impact of existing biases or eliminate other biases instead of the targeted bias.

13.8 Conclusion

This chapter provides evidence that people experience both order effects and the anchoring effect during information searching, and that interventions designed to debias information searching can eliminate or introduce biases. Using the keep document tool (i.e. order debiasing intervention) successfully mitigated order effects. Using the for/against document tool (i.e. anchor debiasing intervention) did not eliminate the anchoring effect, but introduced the confidence in anchoring effect, which led more subjects who were less confident in their pre-search answers to change their answers after information searching.

Due to the lack of benchmark data on cognitive biases and information searching, findings in this chapter need to be verified with further research before generalisations can be made on how widespread these biases are in information searching or how effective the proposed interventions are in debiasing information searching.

14 Impact of debiasing: decision making

This chapter tests the hypothesis that attempts to debias information searching improve decision making. Statistical tests and descriptive statistics are used to compare differences between using the baseline search interface and a debiasing intervention on four aspects of decision making: response accuracy, user confidence, search efficiency and user preference.

The statistical analyses examine whether using a debiasing intervention increases the number of correct post-search answers, alters users’ confidence in their answers, and requires less time to complete a search to answer a question compared to using the baseline search interface. Descriptive statistics are used to evaluate subjects’ preferences for the different interventions.

14.1 Method

Differences between baseline search interface and a debiasing intervention were compared in pairs of individual pre-/post-search outcomes where each subject was his/her own control. Before these comparison analyses could be conducted, the distributions of pre- search answers between baseline search interface and each debiasing intervention were compared to evaluate whether there were statistically significant differences in subjects’ ability to answer questions pre-search that would influence the distribution of post-search responses.

14.1.1 Accuracy in decisions

To compare differences in response accuracy between using the baseline search interface and a debiasing intervention, three analyses were conducted using the test for difference between proportions. The first analysis compared the proportion of correct post-search answers between using the baseline search interface and a debiasing intervention. The second analysis compared the proportions of pre-/post-search answer pairs, i.e. the proportions of right-right (RR), right-wrong (RW), wrong-wrong (WW) and wrong-right (WR) responses, between the baseline search interface and a debiasing intervention. The third analysis compared the baseline search interface and a debiasing intervention on the proportion of pre-search incorrect subjects who shifted to correct post-search.

14.1.2 Confidence in decisions

User confidence in answers was investigated within the baseline search interface and each debiasing intervention, as well as compared between the two systems. These analyses were similar to the confidence-related analyses reported in Chapter 12. The first analysis examined, within the baseline search interface and each debiasing intervention, the distribution of confidence in an incorrect post-search answer amongst subjects who answered “don’t know” pre-search (i.e. DW). The second analysis examined the distribution of self-reported changes in confidence amongst the RR, RW, WW and WR subgroups within the baseline search interface and each debiasing intervention. The third analysis examined whether using an intervention resulted in more highly confident correct post-search answers and fewer highly confident incorrect post-search answers than using the baseline.

14.1.3 Search performance and user preference

Search session variables were compared to evaluate search efficiency when using the baseline search interface and when using a debiasing intervention. These variables describe the evidence journey taken to answer a question: the time used to search and access documents, the number of searches conducted and the number of documents accessed. The Wilcoxon signed ranks test was used to compare differences in these variables between using the baseline search interface and a debiasing intervention. In addition, descriptive statistics were used to investigate which intervention subjects found most useful, enjoyed using the most and preferred to use for future information searching.
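As an illustration of the paired comparison described here, the Wilcoxon signed ranks test could be applied to per-subject pairs of search-session variables along the following lines (a sketch only; the values below are hypothetical and are not data from this study):

```python
from scipy.stats import wilcoxon

# Hypothetical paired observations: seconds spent searching per subject,
# once with the baseline interface and once with a debiasing intervention.
baseline_times     = [310, 295, 420, 380, 260, 510, 330, 290]
intervention_times = [350, 280, 460, 400, 300, 530, 360, 310]

# Paired, non-parametric comparison of the two conditions.
stat, p = wilcoxon(baseline_times, intervention_times)
print(f"W = {stat}, P = {p:.3f}")
```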

14.2 Ability to answer questions before searching

Overall, subjects were equally capable of answering questions correctly before searching regardless of whether the baseline search interface or a debiasing intervention was used. Table 14.1 shows that there were no statistically significant differences in the proportion of correct pre-search answers between using the baseline search interface and the for/against document tool (i.e. anchor debiasing intervention), nor in the proportion of correct pre-search answers between using the baseline search interface and the keep document tool (i.e. order debiasing intervention).

Table 14.1 Comparison of post-search correctness in baseline search interface and anchor debiasing intervention

                           Baseline          Anchor debiasing
After search                                 intervention          Z        P
All responses              (n=303)           (n=291)
  Right                    250 (82.5%)       248 (85.2%)          –0.90     0.37
  Wrong                     53 (17.5%)        43 (14.8%)           0.90     0.37
Right before search        (n=192)           (n=182)
  Right                    181 (94.3%)       169 (92.9%)           0.56     0.58
  Wrong                     11 (5.7%)         13 (7.1%)           –0.56     0.58
Wrong before search        (n=111)           (n=109)
  Right                     69 (62.2%)        79 (72.5%)          –1.64     0.10
  Wrong                     42 (37.8%)        30 (27.5%)           1.64     0.10

14.3 Results (baseline versus anchor debiasing intervention)

14.3.1 Impact on response accuracy

Comparisons in response accuracy between the baseline search interface and the for/against document tool (i.e. anchor debiasing intervention) are reported in Table 14.1 and Table 14.2. There were no statistically significant differences between using the baseline search interface and the anchor debiasing intervention, either in the overall proportion of post-search answers that were correct (baseline: 82.5%, anchoring: 85.2%, Z = –0.90, P = 0.37) (Table 14.1) or in the relative proportions of right-right (RR), right-wrong (RW), wrong-wrong (WW) and wrong-right (WR) responses (Table 14.2).

Amongst subjects who were correct pre-search (Table 14.1), there were no statistically significant differences in the proportion of correct post-search answers between using the baseline search interface and the anchor debiasing intervention (baseline: 94.3%, anchoring: 92.9%, Z = 0.56, P = 0.58). However, amongst subjects who were incorrect pre-search (Table 14.1), the proportion of correct post-search answers was marginally higher amongst those who used the anchor debiasing intervention than those who used the baseline search interface (baseline: 62.2%, anchoring: 72.5%, Z = –1.64, P = 0.10). In other words, the anchor debiasing intervention facilitated more subjects with an incorrect pre-search answer to answer correctly post-search than the baseline search interface.

Table 14.2 Comparison in proportions of RR, RW, WW and WR responses between baseline search interface and anchor debiasing intervention

                       Baseline          Anchor debiasing
Pre-/post-search       (n=303)           intervention (n=291)     Z        P
Right Right (RR)       181 (59.7%)       169 (58.1%)              0.41     0.68
Right Wrong (RW)        11 (3.6%)         13 (4.5%)              –0.52     0.60
Wrong Wrong (WW)        42 (13.9%)        30 (10.3%)              1.33     0.18
Wrong Right (WR)        69 (22.8%)        79 (27.1%)             –1.23     0.22

Table 14.3 Confidence in post-search answers amongst subjects who did not know answer before searching (after Westbrook et al., 2005b)

                                       Did not know answer before search
After search                           Wrong after search (DW)    Right after search (DR)
Baseline search interface*             (n=2)                      (n=24)
  Not confident/Somewhat confident     2 (100%)                   4 (16.7%)
  Confident/Very confident             0 (0%)                     20 (83.3%)
For/against document tool
(i.e. anchor debiasing intervention)   (n=8)                      (n=31)
  Not confident/Somewhat confident     3 (37.5%)                  0 (0%)
  Confident/Very confident             5 (62.5%)                  31 (100%)

* Using data from baseline–anchoring comparison only.

14.3.2 Impact on confidence

The investigation of confidence in answers within and between the baseline search interface and the for/against document tool (i.e. anchor debiasing intervention) is reported in Table 14.3, Table 14.4 and Table 14.5. The first analysis examines, within the baseline search interface and the anchor debiasing intervention, the distribution of confidence in post-search answers amongst subjects who did not know the answer before searching. However, there was not enough data to conduct this analysis, as there were only 26 responses available from the baseline search interface and 39 responses from the anchor debiasing intervention (Table 14.3).


Table 14.4 Self-reported changes in confidence pre-/post-search (after Westbrook et al., 2005b)

                        Wrong before search                  Right before search
Change in               Wrong after        Right after       Wrong after       Right after
confidence              search (WW)        search (WR)       search (RW)       search (RR)
Baseline search interface*
                        (n=41)             (n=65)            (n=11)            (n=178)
  Decreased              9 (22.0%)         18 (27.7%)        5 (45.5%)         1 (0.6%)
  No change             14 (34.1%)         16 (24.6%)        3 (27.3%)         55 (30.9%)
  Increased             18 (43.9%)         31 (47.7%)        3 (27.3%)         122 (68.5%)
For/against document tool (i.e. anchor debiasing intervention)†
                        (n=30)             (n=77)            (n=13)            (n=165)
  Decreased              4 (13.3%)         13 (16.9%)        4 (30.8%)         1 (0.6%)
  No change             13 (43.3%)         21 (27.3%)        5 (38.5%)         51 (30.9%)
  Increased             13 (43.3%)         43 (55.8%)        4 (30.8%)         113 (68.5%)

* Using data from baseline–anchoring comparison only. 8 baseline responses excluded because they did not report pre-search confidence.
† 6 anchoring responses excluded because they did not report pre-search confidence.

The second analysis examines the distribution of self-reported changes in confidence pre-/post-search within the baseline search interface and the anchor debiasing intervention (Table 14.4). For the baseline search interface, the most frequently self-reported change in confidence for RR, WR and WW responses was “increased confidence” (RR: 68.5%, 95% CI: 61.39 to 74.91; WR: 47.7%, 95% CI: 36.02 to 59.62; WW: 43.9%, 95% CI: 29.89 to 58.96); the most frequent for RW responses was “decreased confidence” (45.5%, 95% CI: 21.27 to 71.99) (Table 14.4). For the anchor debiasing intervention, the most frequently self-reported change in confidence for RR and WR responses was “increased confidence” (RR: 68.5%, 95% CI: 61.04 to 75.08; WR: 55.8%, 95% CI: 44.73 to 66.39); the most frequent for RW responses was “no change in confidence” (38.5%, 95% CI: 17.71 to 64.48); and for WW responses both “no change in confidence” and “increased confidence” were equally most frequent (43.3%, 95% CI: 27.37 to 60.80) (Table 14.4).

The third analysis compares the distribution of highly confident post-search answers (i.e. confident or very confident) between the baseline search interface and the anchor debiasing intervention (Table 14.5). Amongst subjects who were correct post-search, the proportion of highly confident responses was marginally greater amongst those who used the anchor debiasing intervention than those who used the baseline search interface (baseline: 93.6%, 95% CI: 89.86 to 96.02; anchoring: 97.2%, 95% CI: 94.29 to 98.63; Z = –1.92, P = 0.055). Amongst subjects who were incorrect post-search, there were no statistically significant differences in the proportion of highly confident responses between using the baseline search interface and the anchor debiasing intervention (baseline: 81.1%, 95% CI: 68.64 to 89.41; anchoring: 79.1%, 95% CI: 64.79 to 88.58; Z = 0.25, P = 0.80).

Table 14.5 Comparison of confidence in right and wrong post-search answers between baseline and anchor debiasing intervention

                                      Baseline          Anchor debiasing
Post-search confidence                                  intervention          Z        P
Right after search                    (n=250)           (n=248)
  Not confident/Somewhat confident     16 (6.4%)          7 (2.8%)            1.92     0.055
  Confident/Very confident            234 (93.6%)       241 (97.2%)          –1.92     0.055
Wrong after search                    (n=53)            (n=43)
  Not confident/Somewhat confident     10 (18.9%)         9 (20.9%)          –0.25     0.80
  Confident/Very confident             43 (81.1%)        34 (79.1%)           0.25     0.80

14.3.3 Impact on search behaviour

Subjects using the for/against document tool (i.e. anchor debiasing intervention) (i) took longer to search and answer a question (baseline: 322 seconds, SD: 270.3; anchoring: 372 seconds, SD: 294.7; Z = –2.129, P = 0.033), (ii) conducted fewer searches (baseline: 2.14, SD: 1.653; anchoring: 1.53, SD: 1.370; Z = –7.15, P < 0.001), and (iii) accessed more documents (baseline: 2.58, SD: 2.455; anchoring: 3.86, SD: 3.522; Z = –5.53, P < 0.001) than those using the baseline search interface (Table 14.6).

Table 14.6 Comparison of search sessions using baseline and anchor debiasing intervention

                        Time taken (seconds)        No. of searches           No. of accessed documents
Intervention            Average (SD)   Range        Average (SD)   Range      Average (SD)   Range
Baseline                322 (270.3)    0 to 1541    2.14 (1.653)   1 to 13    2.58 (2.455)   0 to 17
Anchor debiasing
intervention            372 (294.7)    0 to 1690    1.53 (1.370)   1 to 15    3.86 (3.522)   0 to 34


Table 14.7 Comparison of post-search correctness in baseline search interface and order debiasing intervention

                           Baseline          Order debiasing
After search                                 intervention          Z        P
All responses              (n=304)           (n=296)
  Right                    249 (81.9%)       239 (80.7%)           0.36     0.71
  Wrong                     55 (18.1%)        57 (19.3%)          –0.36     0.71
Right before search        (n=193)           (n=176)
  Right                    180 (93.3%)       158 (89.8%)           1.20     0.23
  Wrong                     13 (6.7%)         18 (10.2%)          –1.20     0.23
Wrong before search        (n=111)           (n=120)
  Right                     69 (62.2%)        81 (67.5%)          –0.85     0.40
  Wrong                     42 (37.8%)        39 (32.5%)           0.85     0.40

14.4 Results (baseline versus order debiasing intervention)

14.4.1 Impact on response accuracy

Comparisons in response accuracy between the baseline search interface and the keep document tool (i.e. order debiasing intervention) are reported in Table 14.7 and Table 14.8. There were no statistically significant differences between using the baseline search interface and the order debiasing intervention, either in the overall proportion of post-search answers that were correct (baseline: 81.9%, order: 80.7%, Z = 0.36, P = 0.71) (Table 14.7) or in the relative proportions of right-right (RR), right-wrong (RW), wrong-wrong (WW) and wrong-right (WR) responses (Table 14.8). In addition, amongst subjects who were correct pre-search or incorrect pre-search, there were no statistically significant differences in the proportion of correct post-search answers between using the baseline search interface and the order debiasing intervention (Table 14.7).

Table 14.8 Comparison in proportions of RR, RW, WW and WR responses between baseline search interface and order debiasing intervention

                       Baseline          Order debiasing
Pre-/post-search       (n=304)           intervention (n=296)     Z        P
Right Right (RR)       180 (59.2%)       158 (53.4%)              1.44     0.15
Right Wrong (RW)        13 (4.3%)         18 (6.1%)              –1.00     0.32
Wrong Wrong (WW)        42 (13.8%)        39 (13.2%)              0.23     0.82
Wrong Right (WR)        69 (22.7%)        81 (27.4%)             –1.32     0.19


Table 14.9 Confidence in post-search answers amongst subjects who did not know answer before searching (after Westbrook et al., 2005b)

                                       Did not know answer before search
After search                           Wrong after search (DW)    Right after search (DR)
Baseline search interface*             (n=4)                      (n=23)
  Not confident/Somewhat confident     2 (50.0%)                  4 (17.4%)
  Confident/Very confident             2 (50.0%)                  19 (82.6%)
Keep document tool
(i.e. order debiasing intervention)    (n=5)                      (n=37)
  Not confident/Somewhat confident     2 (40.0%)                  4 (10.8%)
  Confident/Very confident             3 (60.0%)                  33 (89.2%)

* Using data from baseline–order comparison only.

14.4.2 Impact on confidence

The investigation of confidence in answers within and between the baseline search interface and the keep document tool (i.e. order debiasing intervention) is reported in Table 14.9, Table 14.10 and Table 14.11. Similar to the investigation between the baseline search interface and the anchor debiasing intervention, there was not enough data to conduct the first analysis, which examines the distribution of confidence in post-search answers amongst subjects who did not know the answer before searching: there were only 27 responses available from the baseline search interface and 42 responses from the order debiasing intervention (Table 14.9).

The second analysis examines the distribution of self-reported changes in confidence pre-/post-search within the baseline search interface and the order debiasing intervention (Table 14.10). For the baseline search interface, the most frequently self-reported change in confidence for RR, WR and WW responses was “increased confidence” (RR: 68.4%, 95% CI: 61.18 to 74.76; WR: 44.6%, 95% CI: 33.17 to 56.67; WW: 47.6%, 95% CI: 33.36 to 62.28); for RW responses, both “decreased confidence” and “increased confidence” were equally most frequent (38.5%, 95% CI: 17.71 to 64.48) (Table 14.10). For the order debiasing intervention, the most frequently self-reported change in confidence was “increased confidence” across all pre-/post-search responses (RR: 72.6%, 95% CI: 65.16 to 78.98; WR: 55.6%, 95% CI: 44.73 to 65.88; WW: 51.3%, 95% CI: 36.20 to 66.13; RW: 41.2%, 95% CI: 21.61 to 64.00) (Table 14.10).


Table 14.10 Self-reported changes in confidence pre-/post-search (after Westbrook et al., 2005b)

                        Wrong before search                  Right before search
Change in               Wrong after        Right after       Wrong after       Right after
confidence              search (WW)        search (WR)       search (RW)       search (RR)
Baseline search interface*
                        (n=42)             (n=65)            (n=13)            (n=177)
  Decreased              9 (21.4%)         19 (29.2%)        5 (38.5%)         1 (0.6%)
  No change             13 (31.0%)         17 (26.2%)        3 (23.1%)         55 (31.1%)
  Increased             20 (47.6%)         29 (44.6%)        5 (38.5%)         121 (68.4%)
Keep document tool (i.e. order debiasing intervention)†
                        (n=39)             (n=81)            (n=17)            (n=157)
  Decreased              2 (5.1%)          13 (16.0%)        5 (29.4%)         2 (1.3%)
  No change             17 (43.6%)         23 (28.4%)        5 (29.4%)         41 (26.1%)
  Increased             20 (51.3%)         45 (55.6%)        7 (41.2%)         114 (72.6%)

* Using data from baseline–order comparison only. 7 baseline responses excluded because they did not report pre-search confidence.
† 2 order responses excluded because they did not report pre-search confidence.

The third analysis compares the distribution of highly confident post-search answers (i.e. confident or very confident) between the baseline search interface and the order debiasing intervention (Table 14.11). Between using the baseline search interface and the order debiasing intervention, there were no statistically significant differences in the proportion of highly confident correct post-search answers (baseline: 94.0%, 95% CI: 90.30 to 96.32; order: 95.4%, 95% CI: 91.95 to 97.41; Z = –0.70, P = 0.48), nor in the proportion of highly confident incorrect post-search answers (baseline: 81.8%, 95% CI: 69.68 to 89.81; order: 84.2%, 95% CI: 72.64 to 91.46; Z = –0.34, P = 0.73).

Table 14.11 Comparison of confidence in right and wrong post-search answers between baseline and order debiasing intervention

                                      Baseline          Order debiasing
Post-search confidence                                  intervention          Z        P
Right after search                    (n=249)           (n=239)
  Not confident/Somewhat confident     15 (6.0%)         11 (4.6%)            0.70     0.48
  Confident/Very confident            234 (94.0%)       228 (95.4%)          –0.70     0.48
Wrong after search                    (n=55)            (n=57)
  Not confident/Somewhat confident     10 (18.2%)         9 (15.8%)           0.34     0.73
  Confident/Very confident             45 (81.8%)        48 (84.2%)          –0.34     0.73

Table 14.12 Comparison of search sessions using baseline and order debiasing intervention

                        Time taken (seconds)        No. of searches           No. of accessed documents
Intervention            Average (SD)   Range        Average (SD)   Range      Average (SD)   Range
Baseline                325 (270.7)    0 to 1541    2.16 (1.665)   1 to 13    2.62 (2.448)   0 to 17
Order debiasing
intervention            332 (291.0)    0 to 1703    1.49 (0.967)   1 to 7     3.59 (3.088)   0 to 16

14.4.3 Impact on search behaviour

In the comparison between subjects using the keep document tool (i.e. order debiasing intervention) and the baseline search interface, there were (i) no statistically significant differences in the amount of time taken to search and answer a question (baseline: 325 seconds, SD: 270.7; order: 332 seconds, SD: 291.0; Z = –0.57, P = 0.57). However, subjects using the order debiasing intervention (ii) conducted fewer searches (baseline: 2.16, SD: 1.665; order: 1.49, SD: 0.967; Z = –6.81, P < 0.001) and (iii) accessed more documents (baseline: 2.62, SD: 2.448; order: 3.59, SD: 3.088; Z = –4.10, P < 0.001) than those using the baseline search interface (Table 14.12).

14.5 Results (user preference)

Table 14.13 shows that the systems subjects found most useful, enjoyed using the most and preferred to use in the future are (in descending order, from most favourable): for/against document tool (i.e. anchor debiasing intervention), keep document tool (i.e. order debiasing intervention) and baseline search interface. Out of 183 subjects who were included in this analysis, one subject did not answer the questionnaire (0.5%).


Table 14.13 System preferences reported by subjects (n=183)

System nominated by respondent                        Frequency (%)    95% Confidence Interval (CI)
System found most useful
  For/against document tool
  (i.e. anchor debiasing intervention)                80 (43.7%)       36.74 to 50.96
  Keep document tool
  (i.e. order debiasing intervention)                 75 (41.0%)       34.11 to 48.22
  Baseline search interface                           27 (14.8%)       10.34 to 20.61
  No response                                          1 (0.5%)        Not applicable
System found most enjoyable
  For/against document tool
  (i.e. anchor debiasing intervention)                83 (45.4%)       38.32 to 52.59
  Keep document tool
  (i.e. order debiasing intervention)                 60 (32.8%)       26.40 to 39.88
  Baseline search interface                           39 (21.3%)       16.00 to 27.80
  No response                                          1 (0.5%)        Not applicable
System preferred for future use
  For/against document tool
  (i.e. anchor debiasing intervention)                77 (42.1%)       35.16 to 49.32
  Keep document tool
  (i.e. order debiasing intervention)                 76 (41.5%)       34.64 to 48.77
  Baseline search interface                           29 (15.8%)       11.27 to 21.84
  No response                                          1 (0.5%)        Not applicable

14.6 Discussion

Several findings are identified from the comparison between using baseline and the for/against document tool (i.e. anchor debiasing intervention). First, amongst subjects who were incorrect before searching, using the anchor debiasing intervention was associated with a higher proportion of subjects getting a correct post-search answer than using the baseline search interface. Second, using the anchor debiasing intervention was associated with a greater proportion of highly confident correct post-search answers than using the baseline search interface. In fact, amongst subjects who used the anchor debiasing intervention, the majority of those who were correct post-search reported their confidence increased and the majority of those who were incorrect post-search reported their confidence decreased or did not change. Third, although using the anchor debiasing intervention was associated with more time spent on searching and accessing documents to answer a question, it was also associated with fewer searches conducted and more documents accessed. Last, the anchor debiasing intervention was reported to be the most enjoyable, useful and preferred system to be used for future information searching compared to the baseline search interface and the order debiasing intervention.

In the comparison between the baseline search interface and the keep document tool (i.e. order debiasing intervention), there were no statistically significant differences in the proportion of pre-search incorrect subjects who shifted to being correct post-search. The most frequently self-reported change in confidence amongst subjects using the order debiasing intervention was increased confidence, regardless of the accuracy of their pre-/post-search answers. Furthermore, using the order debiasing intervention did not require more time to search; it resulted in fewer searches conducted and more documents accessed to answer a question compared to the baseline search interface. Finally, more subjects selected the order debiasing intervention than the baseline search interface as the most enjoyable, useful and preferred system for future information searching.

Several limitations are associated with the assumption that debiasing can improve decision making:

• Role of heuristics in information searching: People may rely on heuristics to select information from a set of retrieved documents to make a “good-enough” decision (Simon, 1982). For example, search engines work on the principle of the primacy effect and present relevant documents early in the list, so that users can differentiate the relevance of documents and pay more attention to earlier items.

• Debiasing could “backfire”: Undesirable behaviours may be exacerbated by attempts intended to correct them (Curley et al., 1989; Tetlock & Boettger, 1989, 1994; Ashton, 1990; Weinstein & Klein, 1995; Sanna et al., 2002). For example, Sanna et al. (2002) tested the strategy of thinking about alternative outcomes to debias hindsight bias and found that the strength of the hindsight bias actually increased. One possible reason suggested is that subjects experienced difficulty with listing many counterfactual thoughts and thus concluded that there are not many alternative outcomes, which enhanced the hindsight bias.


• Other factors that influence the impact of information searching on decision making: A person’s search ability and ability to use the resulting information are found to be important factors that influence the use of an information retrieval system to successfully answer clinical questions (Hersh et al., 2002). In addition, interaction effects between articles can affect a person’s judgement of the applicability of a document to a question (Florance & Marchionini, 1995).

14.7 Conclusion

This chapter provided evidence that attempts to debias information searching can influence decision outcome, affect user confidence and alter information searching behaviour. There is mixed evidence on whether removing biases has a positive or negative impact on decision making, especially given the unexpected finding that failure to remove the anchoring effect was marginally associated with more incorrect pre-search subjects answering correctly post-search, and yet correcting order effects was not associated with any improvement in response accuracy. With the lack of benchmark literature on bias and debiasing studies in information searching, one cannot draw generalisations from these findings. Nevertheless, these interventions allowed subjects to conduct fewer searches and access more documents. In conclusion, this research confirms our lack of understanding of the intersection between cognitive biases, information searching and decision making; more research is needed to explore how people search for information and make decisions from a cognitive perspective.

15 Impact of information searching on decision making: conclusion

This chapter summarises the overall findings of this research and concludes with potential areas for further research. This research has contributed an investigation of the nature and impact of cognitive biases on information searching and decision making, providing possibly the first set of evidence that people may experience cognitive biases while searching for information. It has also provided evidence that search user interfaces can be designed to correct for these biases, and that such attempts to debias information searching can influence decision outcome, affect user confidence and alter information searching behaviour. A retrospective data analysis, qualitative and quantitative explorations, a Bayesian model, a series of interventions and empirical experiments were designed and conducted to carry out this investigation.

This chapter starts by addressing the hypotheses for each of the cognitive biases investigated in this research, presented in the following sequence: order effects, anchoring effect, exposure effect and reinforcement effect. It proceeds with a description of other findings, followed by an overall research summary, and concludes with five potential areas for further research.

15.1 Order effects

15.1.1 Finding 1: People experience order effects during information searching

This research provides evidence that people experience order effects during information searching. Although the retrospective data analysis does not show that subjects experienced order effects (χ2 = 0.55, df = 2, P = 0.76) (Chapter 5, Table 5.5), the empirical experiment shows that there was a statistically significant relationship between the access position of a document and the concurrence between subjects’ post-search answer and the answer suggested by the document (χ2 = 7.27, df = 2, P = 0.026) (Chapter 13, Table 13.1). Concurrence rate decreased as the document access position proceeded from first to middle and last (Figure 15.1). In other words, documents accessed at different positions in a search journey have different degrees of influence on a decision; and the degree of influence a document has on a decision decreases the later the document is accessed in the search journey.

[Figure omitted in extraction: concurrence rate (%) by document access position – first, middle, last]

Figure 15.1 Relationship between document access position and rate of concurrence between post-search answer and document-suggested answer (using baseline search interface) (Chapter 13, Figure 13.3)

15.1.2 Finding 2: Order effects can be eliminated during information searching

Using the keep document tool (i.e. order debiasing intervention) successfully eliminated order effects during information searching. Looking at the concurrence rate across document access positions amongst responses completed using the keep document tool (Figure 15.2), there was not a statistically significant relationship between the access position of a document and the concurrence between subjects’ post-search answer and the answer suggested by the document (χ2 = 2.14, df = 2, P = 0.34) (Chapter 13, Table 13.1). In other words, using the keep document tool removed order effects.

[Figure omitted in extraction: concurrence rate (%) by document access position – first, middle, last]

Figure 15.2 Relationship between document access position and rate of concurrence between post-search answer and document-suggested answer (using the order debiasing search interface) (Chapter 13, Figure 13.3)

15.1.3 Finding 3: Debiasing order effects during information searching does not alter the distribution of post-search answers

This study provides evidence that attempts to debias order effects do not alter the distribution of decision outcomes and do not affect the proportion of confident or very confident answers, but do alter subjects’ information searching behaviour. Although using the keep document tool (i.e. order debiasing intervention) did not increase the proportion of correct post-search answers (baseline: 81.9%, order: 80.7%, Z = 0.36, P = 0.71) (Chapter 14, Table 14.7), it did not increase the proportion of incorrect post-search answers either (baseline: 18.1%, order: 19.3%, Z = –0.36, P = 0.71) (Chapter 14, Table 14.7). In addition, the intervention did not increase the proportion of subjects who answered correctly pre-search and subsequently incorrectly post-search (baseline: 6.7%, order: 10.2%, Z = –1.20, P = 0.23) (Chapter 14, Table 14.7).

Comparing the confidence distribution amongst post-search answers between using the baseline search interface and the keep document tool (i.e. order debiasing intervention), there were no statistically significant differences in the proportions of confident or very confident correct post-search answers (baseline: 94.0%, 95% CI: 90.30 to 96.32; order: 95.4%, 95% CI: 91.95 to 97.41; Z = –0.70, P = 0.48) (Chapter 14, Table 14.11); nor were there differences in the proportions of confident or very confident incorrect post-search answers (baseline: 81.8%, 95% CI: 69.68 to 89.81; order: 84.2%, 95% CI: 72.64 to 91.46; Z = –0.34, P = 0.73) (Chapter 14, Table 14.11).

Furthermore, Table 14.12 in Chapter 14 reports that using the keep document tool (i.e. order debiasing intervention) did not change the amount of time taken to search and answer a question (baseline: 325 seconds, SD: 270.7 versus order: 332 seconds, SD: 291.0; Z = –0.57, P = 0.57). However, it was associated with fewer searches conducted (baseline: 2.16, SD: 1.665; order: 1.49, SD: 0.967; Z = –6.81, P < 0.001) and more documents accessed (baseline: 2.62, SD: 2.448 vs. order: 3.59, SD: 3.088; Z = –4.10, P < 0.001).

15.1.4 Summary

Subjects experienced order effects during information searching. Using the keep document tool (i.e. order debiasing intervention) successfully mitigated order effects. Compared to using the baseline search interface, using the keep document tool did not lead more subjects to answer incorrectly post-search, nor did it shift more subjects from answering correctly pre-search to answering incorrectly post-search. In addition, it did not alter the distribution of confidence in post-search answers. While there were no statistically significant differences in the amount of time taken to search for information and answer a question, subjects were able to conduct fewer searches and access more documents.

Overall, using the keep document tool allowed subjects to search more effectively and efficiently than using the baseline search interface; it also allowed subjects to be corrected for order effects without worsening decision accuracy or altering their confidence in the decision outcome.

15.2 Anchoring effect

15.2.1 Finding 1: People experience the anchoring effect during information searching

This research provides evidence that people experience the anchoring effect during information searching, i.e. subjects’ post-search answers were influenced by their pre-search answers (see Figure 15.3 for the anchoring effect from the empirical experiment). In both the retrospective data analysis and the empirical experiment, subjects who were correct pre-search were more likely to answer correctly post-search than those who were incorrect pre-search (retrospective: χ2 = 19.63, df = 1, P < 0.001 (Chapter 5, Table 5.1); empirical: χ2 = 50.25, df = 1, P < 0.001 (Chapter 13, Table 13.2)). However, subjects’ confidence in the pre-search answer did not alter their tendency to retain that answer after searching (see Figure 15.4 for the confidence in anchoring effect from the empirical experiment). Although the retrospective data analysis shows a marginal statistical relationship, the empirical experiment did not show a statistically significant relationship between subjects’ confidence in their pre-search answers and their tendency to retain that answer post-search (retrospective: χ2 = 7.70, df = 3, P = 0.053 (Chapter 5, Table 5.3); empirical: χ2 = 2.67, df = 3, P = 0.45 (Chapter 13, Table 13.3)). In other words, subjects’ post-search answers were influenced by their pre-search answers regardless of their confidence in the pre-search answer.

[Figure omitted in extraction: proportion right after search by answer before search – right, wrong]

Figure 15.3 Relationship between pre-search answer and post-search correctness (using baseline search interface) (Chapter 13, Figure 13.6)

[Figure omitted in extraction: pre-search answer retention rate by confidence in pre-search answer – not confident, somewhat confident, confident, very confident]

Figure 15.4 Relationship between confidence in pre-search answer and pre-search answer retention rate (using baseline search interface) (Chapter 13, Figure 13.7)


15.2.2 Finding 2: Anchoring effect was unable to be eliminated during information searching in this experiment

This experiment was unable to control the anchoring effect. Subjects still experienced the anchoring effect when using the for/against document tool (i.e. anchor debiasing intervention) (Figure 15.5). There was a statistically significant relationship between subjects’ pre-search answers and their post-search answers (χ2 = 22.48, df = 1, P < 0.001) (Chapter 13, Table 13.2). In addition, subjects experienced the confidence in anchoring effect (Figure 15.6): there was a statistically significant relationship between subjects’ confidence in their pre-search answers and their retention of the answers after searching (χ2 = 9.91, df = 3, P = 0.019) (Chapter 13, Table 13.3). Using the for/against document tool marginally increased the proportion of subjects who lacked confidence in their pre-search answers and changed their answer after searching (baseline: 29.8%, 95% CI: 21.86 to 39.19; anchoring: 40.5%, 95% CI: 32.17 to 49.40; Z = –1.69, P = 0.091) (Chapter 13, Table 13.4), but it did not increase the pre-search answer retention rate amongst subjects who were confident or very confident in their pre-search answers (baseline: 76.4%, 95% CI: 69.94 to 81.90; anchoring: 75.0%, 95% CI: 67.85 to 81.00; Z = 0.32, P = 0.75) (Chapter 13, Table 13.4). Overall, subjects’ post-search answers were influenced by their pre-search answers when using the for/against document tool, but they were more likely to change their answers post-search if they lacked confidence in their pre-search answers.

[Figure omitted in extraction: proportion right after search by answer before search – right, wrong]

Figure 15.5 Relationship between pre-search answer and post-search correctness (using anchor debiasing intervention) (Chapter 13, Figure 13.6)


[Figure omitted in extraction: pre-search answer retention rate by confidence in pre-search answer – not confident, somewhat confident, confident, very confident]

Figure 15.6 Relationship between confidence in pre-search answer and pre-search answer retention rate (using anchor debiasing intervention) (Chapter 13, Figure 13.7)

15.2.3 Finding 3: Influencing the anchoring effect during information searching increases the proportion of subjects who shift from incorrect pre-search to correct post-search answers

This study provides evidence that attempts to debias the anchoring effect influence subjects’ decision outcome, affect their confidence in answers and alter their information searching behaviour. Although using the for/against document tool (i.e. anchor debiasing intervention) did not increase the proportion of correct post-search answers (baseline: 82.5%, anchoring: 85.2%, Z = –0.90, P = 0.37) (Chapter 14, Table 14.1), it facilitated more subjects with an incorrect pre-search answer to answer correctly post-search (baseline: 62.2%, anchoring: 72.5%, Z = –1.64, P = 0.10) (Chapter 14, Table 14.1).

In addition, using the for/against document tool (i.e. anchor debiasing intervention) was marginally associated with more subjects being confident or very confident with their correct post-search answers (baseline: 93.6%, 95% CI: 89.86 to 96.02; anchoring: 97.2%, 95% CI: 94.29 to 98.63; Z = –1.92, P = 0.055) (Chapter 14, Table 14.5); and it did not increase the proportion of subjects being confident or very confident with their incorrect post-search answers (baseline: 81.1%, 95% CI: 68.64 to 89.41; anchoring: 79.1%, 95% CI: 64.79 to 88.58; Z = 0.25, P = 0.80) (Chapter 14, Table 14.5).

Even though Table 14.6 in Chapter 14 reports that subjects using the for/against document tool (i.e. anchor debiasing intervention) took longer to search and answer a question (baseline: 322 seconds, SD: 270.3; anchoring: 372 seconds, SD: 294.7; Z = –2.129, P = 0.033), they conducted fewer searches (baseline: 2.14, SD: 1.653; anchoring: 1.53, SD: 1.370; Z = –7.15, P < 0.001) and accessed more documents than when using the baseline search interface (baseline: 2.58, SD: 2.455; anchoring: 3.86, SD: 3.522; Z = –5.53, P < 0.001).

Overall, more than 40% of subjects found the for/against document tool to be most useful, most enjoyable, and the one they would prefer to use in the future (Chapter 14, Table 14.13).

15.2.4 Summary

Subjects experienced the anchoring effect and it influenced their post-search decision making. Subjects’ pre-search answers had a significant impact on their post-search answers; they were equally likely to retain their pre-search answer after searching regardless of how confident they were in that pre-search answer. Using the for/against document tool (i.e. anchor debiasing intervention) did not mitigate the anchoring effect; instead, it introduced the confidence in anchoring effect, which led more subjects who lacked confidence in their pre-search answer to change their answer after searching. Perhaps this introduction of the confidence in anchoring effect helped reduce the impact of the anchoring effect, in that subjects who were incorrect pre-search were more likely to change their belief and answer correctly post-search. In addition, using the for/against document tool increased the proportion of confident or very confident correct post-search answers without increasing the proportion of confident or very confident incorrect post-search answers. Furthermore, the user interface was well liked and subjects were able to access more documents with fewer searches. Overall, using the for/against document tool had a positive impact on the accuracy of and confidence in the decision outcome, on search effectiveness, and on user favourability.

15.3 Exposure effect and reinforcement effect

This investigation was intended to evaluate the impact of exposure effect and reinforcement effect on information searching and decision making. However, due to resource limitations in this research, empirical experiments were only conducted to investigate whether people experienced the exposure effect and the reinforcement effect while searching for information; no experiments were conducted to investigate whether these effects could be debiased, or their subsequent impact on decision making.

[Figure omitted in extraction: concurrence rate (%) by time spent on document – least, medium, most]

Figure 15.7 Relationship between document exposure level and concurrence rate between post-search answer and document-suggested answer (using baseline search interface) (Chapter 13, Figure 13.9)

This research provides evidence that people experience exposure effect during information searching. Although the retrospective data analysis does not show that subjects experienced exposure effect (χ2 = 2.61, df = 2, P = 0.27) (Chapter 5, Table 5.7), the prospective empirical experiment showed that there was a statistically significant relationship between the exposure level of a document and the concurrence between subjects’ post-search answer and the answer suggested by the document (χ2 = 9.64, df = 2, P = 0.0081) (illustrated in Figure 15.7).

However, neither the retrospective data analysis nor the empirical experiment showed that subjects experienced reinforcement effect during information searching (retrospective: χ2 = 1.03, df = 1, P = 0.31 (Chapter 5, Table 5.9); empirical: χ2 = 0.061, df = 1, P = 0.81 (Chapter 13, Table 13.6)). Results from the empirical experiment on reinforcement effect are illustrated in Figure 15.8.


[Figure omitted in extraction: concurrence rate (%) by number of visits to a document – once only, more than once]

Figure 15.8 Relationship between document access frequency and concurrence rate between post-search answer and document-suggested answer (using baseline search interface) (Chapter 13, Figure 13.11)

15.4 Other findings

15.4.1 Qualitative and quantitative approaches to understanding the evidence journey

A qualitative and a quantitative approach were used in this research to gain a better understanding of the “evidence journey”, which is defined as the process by which an individual accesses different pieces of information retrieved from an online evidence retrieval system to make a decision. This research provides evidence, reported in Chapter 3, that (i) people do not necessarily arrive at the same answer after having accessed the same document; (ii) people access multiple pieces of evidence to form a conclusion to a question; (iii) documents have varying levels of influence on a decision; and (iv) documents do not influence everyone in the same manner.

From a qualitative perspective, searchers’ evidence journeys were depicted graphically, which exposed different components of an evidence journey that could influence the way people process information and make decisions. These elements include: subjects’ levels of knowledge of, and confidence in, the topic area; the length and amount of time invested in searching; the number and types of documents accessed; and the impact of these documents depending on the way they were accessed and interpreted collectively. These elements were investigated in the form of cognitive biases in this research.


From a quantitative perspective, the notions of sensitivity and specificity were borrowed from the medical and engineering disciplines to quantify the influence of a document on a decision in the form of a likelihood ratio. A likelihood ratio can be calculated for each document, which allows one to identify whether a correct or an incorrect answer is more likely after the document has been accessed. In summary, the qualitative and quantitative exploration conducted in this research suggests that, in order to understand how people use retrieved information to make a decision, it is important to understand the journey people take to arrive at their conclusion.
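A minimal sketch of this likelihood-ratio calculation is given below. It assumes the conventional diagnostic-test relationship LR = sensitivity / (1 − specificity); how sensitivity and specificity were operationalised for documents is described in the body chapters, so the figures and definitions here are purely illustrative.

```python
# Minimal sketch: likelihood ratio of a document, assuming the conventional
# diagnostic-test definition. Figures are hypothetical.
def likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """Positive likelihood ratio: how much more likely a correct answer is
    than an incorrect one, given that the document was accessed."""
    return sensitivity / (1.0 - specificity)

lr = likelihood_ratio(sensitivity=0.80, specificity=0.70)
print(f"LR = {lr:.2f}")  # LR > 1 suggests a correct answer is more likely after access
```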

15.4.2 Bayesian model to predict the impact of documents and cognitive biases on decision making

A Bayesian model was developed to predict the answer people will give to a question based on the documents they access from an online evidence retrieval system, with an accuracy of up to 73.3% (Chapter 7). This approach provides a way to predict the impact of a sequence of documents retrieved by a Web search engine on a decision task without reference to the content or structure of the documents, relying solely on a simple Bayesian model of belief revision.

Results from the Bayesian model show that the anchoring effect is more influential than order, exposure and reinforcement effects on decision making. There is a statistically significant improvement in the prediction accuracy when the anchoring effect is incorporated into the Bayesian model. In fact, the predictive accuracy increases as more information about the subject’s personal prior belief is included. However, the study did not find a statistically significant improvement in the prediction accuracy when order, exposure and/or reinforcement effects were incorporated into the Bayesian model, which suggests that these effects may not be as significant as the anchoring effect in influencing decision making.
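The core odds-updating step behind such a belief revision model can be sketched as follows. The prior probability stands in for the searcher’s pre-search belief (the anchor) and each accessed document contributes its likelihood ratio; the numbers and the simplified structure are assumptions for illustration only, not the Chapter 7 model itself.

```python
# Minimal sketch of sequential Bayesian belief revision over a search journey:
# prior odds (the searcher's pre-search belief) are multiplied by the
# likelihood ratio of each accessed document. Numbers are hypothetical.
def posterior_probability(prior_prob: float, document_lrs: list[float]) -> float:
    odds = prior_prob / (1.0 - prior_prob)   # prior odds of a correct answer
    for lr in document_lrs:                  # revise belief document by document
        odds *= lr
    return odds / (1.0 + odds)               # convert back to a probability

# Hypothetical journey: a weak prior followed by three documents.
p = posterior_probability(prior_prob=0.4, document_lrs=[2.5, 0.8, 1.6])
print(f"Predicted probability of a correct post-search answer: {p:.2f}")
```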

15.4.3 Healthcare consumers’ information searching and decision making behaviours

This study found that non-clinically trained users can improve their answers after searching and that their confidence in the post-search answer is not always a good indicator of the accuracy of the answer. These findings resonate with studies conducted by Hersh et al. (2000, 2002) and Westbrook et al. (2005), which found that clinically trained users can improve their answers after using an online evidence retrieval system. They also concur with the finding of Westbrook et al. (2005a) that clinicians reported being confident or very confident in their incorrect post-search answers. This study also demonstrates that, even for non-clinically trained users, providing access to health-related information from tested resources is effective in improving their ability to answer questions.

15.4.4 Methodology to study cognitive biases during information searching

The serial position curve was adapted to study confidence in anchoring effect, order effects, exposure effect and reinforcement effect during information searching. These methods can be customised and used by future researchers to study bias behaviour during information searching.
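As an illustration of how a serial position curve can be adapted to information searching, the sketch below computes a concurrence rate for each position in the search journey; the record format and values are hypothetical, not the thesis data.

```python
# Minimal sketch of an adapted serial position curve for order effects: the
# proportion of cases where the post-search answer concurred with the answer
# suggested by the document accessed at each position. Records are made up.
from collections import defaultdict

# Each record: (position_in_journey, concurred_with_post_search_answer)
records = [(1, True), (1, False), (2, True), (2, True), (3, False), (3, True)]

totals, concurs = defaultdict(int), defaultdict(int)
for position, concurred in records:
    totals[position] += 1
    concurs[position] += int(concurred)

for position in sorted(totals):
    rate = concurs[position] / totals[position]
    print(f"position {position}: concurrence rate = {rate:.0%}")
```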

15.5 Research summary

This research is possibly the first study that examines the impact of cognitive biases on information searching and decision making. It proposes debiasing interventions for information searching and assesses the possibility of debiasing search behaviours to improve decision making.

Empirical evidence is provided to support the assertion that people can experience anchoring, exposure and order effects during information searching. A person’s pre-search answer has a significant effect on the post-search answer, regardless of the level of confidence in the pre-search answer. Documents accessed at different positions in the evidence journey and documents processed for different lengths of time have different degrees of influence on the post-search answer. On the whole, results from the Bayesian belief revision model and the empirical experiments suggest that the anchoring effect is a more influential cognitive bias than order, exposure and reinforcement effects in affecting decision outcomes.

Three classes of interventions are proposed to debias information searching: a for/against document tool for the anchoring effect; a keep document tool for order effects, exposure effect and reinforcement effect; and an education-based intervention for all of the above-mentioned cognitive biases. Both the keep document and for/against document tools integrate elements of successful debiasing strategies into the user interface of an online evidence retrieval system; they are also designed for users to gather and organise documents while searching for information. The education-based intervention is a standalone information page that educates users about the impact of cognitive biases on decision making. These interventions were designed using selected literature from the fields of information science, decision making and cognitive science, involved members of the general public and researchers, and were piloted and evaluated in empirical studies.
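As a rough illustration only, the sketch below models the kind of state a keep document or for/against document tool might maintain while the user gathers and organises documents; all class and field names are hypothetical and are not taken from the thesis implementation.

```python
# Hypothetical sketch of the state a keep / for-against document tool might hold.
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class KeptDocument:
    title: str
    url: str
    stance: Optional[str] = None   # "for" or "against" the searcher's current answer

@dataclass
class SearchWorkspace:
    kept: List[KeptDocument] = field(default_factory=list)

    def keep(self, doc: KeptDocument) -> None:
        self.kept.append(doc)

    def tally(self) -> Tuple[int, int]:
        """Count kept documents marked for and against the current answer."""
        fors = sum(1 for d in self.kept if d.stance == "for")
        againsts = sum(1 for d in self.kept if d.stance == "against")
        return fors, againsts
```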

This research provides evidence that interventions can reduce or exacerbate cognitive biases during information searching, and that attempts to debias information searching can influence subjects’ decision outcomes, affect their confidence in answers and alter their search behaviour. However, one cannot conclude whether correcting biases has a positive or a negative impact on decision making. For example, an intervention intended to debias anchoring did not succeed in removing the anchoring effect, yet it was marginally associated with improved decision accuracy; on the other hand, an intervention that successfully reduced order effects was not associated with any significant improvement in decision accuracy. Overall, this research concludes with evidence that using an intervention allows users to conduct fewer searches and access more documents, and that using an intervention can influence the accuracy of and confidence in decision making.

There are four other findings that can be drawn from this research. Firstly, this research applies both a qualitative and a quantitative approach to understanding the impact of information searching on decision making. From the qualitative perspective, it graphically depicts the way people access and use documents during searching and identifies the components of an evidence journey that may influence the way people process information and make decisions. From the quantitative perspective, it borrows sensitivity and specificity concepts from the medical and engineering disciplines to measure the impact of a document on a decision in the form of a likelihood ratio. Secondly, this research provides a Bayesian model that can be used to predict how different documents influence an individual’s decision outcome, independent of information about document structure or specific content. Thirdly, this research confirms that healthcare consumers are similar to clinicians in that they improve their answers after searching and that their confidence in the post-search answer is not a good indicator of the accuracy of the answer. Fourthly, this research establishes a methodology of using adapted forms of the serial position curve to study cognitive biases during information searching. It provides data on the behaviour of the anchoring, order, exposure and reinforcement effects during information searching, as well as data on the subsequent impact of some of these biases on decision making.

15.6 Future research

15.6.1 Cognitive biases during information searching (extending hypothesis 1)

There is a lack of literature that examines the interactions between cognitive biases, information searching and decision making, and many interesting lines of new investigation could enrich and extend our understanding in this area. In relation to the anchoring effect, this thesis has provided evidence that people are influenced by prior beliefs when searching for information to make decisions. Given that prior beliefs are shaped either by prior knowledge or by irrelevant cues, an interesting investigation would be to consider whether these different origins of the anchoring bias exert a different influence on the way new information is processed. Is anchoring whilst searching essentially due to prior knowledge, is it shaped by cues arising during the search session (the debiasing experiments here support such a conclusion), or, as is most likely, is it some combination of both?

More experiments are also needed to investigate the impact of those cognitive biases that have not been fully investigated in this research, i.e. the exposure and reinforcement effects. This thesis has demonstrated evidence of an association between the degree of exposure to a document and the impact that the document then has on decision making. Further controlled experiments would be needed to investigate the causal direction of this relationship. In addition, although no association was found between the reinforcement of a document and its impact, further experiments are needed to confirm or refute this finding.

The results of this thesis are likely to be generalisable across different domains. Given that biases are innate in human decision making, it is plausible that the impact of cognitive biases on information searching is not restricted to health-related decision making. Moreover, other cognitive biases, such as the confirmation bias and the conservatism bias, could be at work. Further research investigating other biases and other domains would provide more insight into the generalisability of the findings in this thesis.

15.6.2 Debiasing information searching (extending hypothesis 2)

More research is needed to explore different ways of debiasing information searching as well as other approaches to facilitate better information searching to improve decision making.

Overall, there are few successful interventions that use information technology to debias decision making. This research explored one way of debiasing information searching, namely implementing debiasing strategies as features in the user interface of a search engine. Future research may explore other ways of implementing debiasing strategies for information searching.

Aside from debiasing, techniques that have been suggested to improve decision making should be applied in the context of information searching. Examples of these techniques include training, proceduralisation and automation (Wickens & Hollands, 2000). In fact, debiasing is a form of specific training that targets certain flaws in decision making (Wickens & Hollands, 2000, p. 327) and focuses people’s awareness directly on understanding the sources of their cognitive limitations (Wickens & Hollands, 2000, p. 329).

Proceduralisation refers to ways of “outlining prescriptions of techniques that should be followed to improve the quality of decision making” (Wickens & Hollands, 2000, p. 329). In the context of information searching, one example of proceduralisation is to provide a search agent that follows a procedure to search for and accumulate evidence on behalf of the user. In this way the human decision maker is prevented from exhausting cognitive resources on searching for information and can concentrate on interpreting and using the information to make decisions. The decision maker is also shielded from cognitive biases such as the confirmation bias, which may lead to premature termination of a search journey and affect the types of information available to make a decision.
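A toy sketch of such a proceduralised search agent is given below. It assumes a hypothetical search function and a simple stopping rule that requires evidence on both sides of the searcher’s hypothesis before terminating, purely to illustrate the idea of prescribing a procedure; it is not a design proposed or evaluated in this thesis.

```python
# Toy sketch of a proceduralised search agent. 'search' and the document
# attribute 'supports_hypothesis' are hypothetical; the procedure simply
# refuses to stop until both sides of the hypothesis have some evidence.
def gather_evidence(question, search, min_per_side=3):
    """Retrieve documents until at least min_per_side pieces of evidence
    have been collected both for and against the current hypothesis."""
    supporting, opposing = [], []
    for doc in search(question):
        (supporting if doc.supports_hypothesis else opposing).append(doc)
        if len(supporting) >= min_per_side and len(opposing) >= min_per_side:
            break   # stop only once both sides are covered
    return supporting, opposing
```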

Automation refers to providing assistance automatically at major stages of processing in decision making. Existing measures of automation in the search user interface include rendering an appropriate display to assist users in aggregating information, and storing accessed documents to offload aspects of the user’s working memory (Wickens & Hollands, 2000). Other measures of automation that are potentially useful for the search user interface include automatically making inferences between pieces of selected information, aggregating evidence for confirming and opposing views of the decision maker’s hypothesis, and suggesting new hypotheses based on the information retrieved.

15.6.3 Impact of debiasing information searching on decision making (extending hypothesis 3)

More experiments and analyses are needed to explore the relationship between debiasing, information searching and decision making. One of the most surprising and unexpected findings of this research is that attempts to debias the anchoring effect unintentionally introduced the confidence in anchoring effect. Yet using the anchor debiasing intervention is associated with a marginal increase in the percentage of people who were incorrect pre-search but answered correctly post-search. On the other hand, successfully debiasing order effects is not associated with any significant improvement in decision accuracy.

One possible explanation for this anchoring phenomenon is that, although the anchor debiasing intervention did not mitigate the anchoring effect, the confidence in anchoring effect it introduced reduced the impact of the anchoring effect. The anchor debiasing intervention was found to significantly reduce the pre-search answer retention rate amongst subjects who lacked confidence in their pre-search answers, thereby making it easier for these subjects to change their answers post-search.
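The retention-rate comparison described above can be sketched as follows, using hypothetical records stratified by confidence in the pre-search answer; this is illustrative only and does not reproduce the thesis analysis.

```python
# Minimal sketch: pre-search answer retention rate, stratified by whether
# the subject was confident in the pre-search answer. Records are made up.
def retention_rate(records):
    kept = sum(1 for r in records if r["pre_answer"] == r["post_answer"])
    return kept / len(records)

subjects = [
    {"confident": False, "pre_answer": "yes", "post_answer": "no"},
    {"confident": False, "pre_answer": "no",  "post_answer": "no"},
    {"confident": True,  "pre_answer": "yes", "post_answer": "yes"},
    {"confident": True,  "pre_answer": "no",  "post_answer": "no"},
]

low_confidence  = [s for s in subjects if not s["confident"]]
high_confidence = [s for s in subjects if s["confident"]]
print(f"retention (not confident): {retention_rate(low_confidence):.0%}")
print(f"retention (confident):     {retention_rate(high_confidence):.0%}")
```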

Another possible explanation is that the heuristics people use to process information and revise their beliefs are effective, and that people should not be discouraged from using them. For example, the confirmation bias has been found in some circumstances to be a useful and adaptive way of gathering information (Klayman & Ha, 1987). In addition, using the shortcuts offered by heuristics is often a necessity, especially when there are time constraints and a decision maker must work rapidly and cannot afford to invest a large amount of mental effort or time to consider all possible hypotheses and options (Wickens & Hollands, 2000). Furthermore, many heuristics are highly adaptive, and people would not use them if they did not provide a satisfactory outcome most of the time (Payne et al., 1993; Wickens & Hollands, 2000).

On the whole, the findings of this research confirm our lack of understanding of the collective impact of cognitive biases and the effectiveness of using heuristics in information searching and decision making. They also raise the question of whether debiasing information searching has a positive or a negative impact on the accuracy of decision making. These questions and hypotheses need to be tested and verified with further experiments.

15.6.4 Search user interface design from a cognitive perspective

Further study from a cognitive perspective is needed to understand how the user interface of a search engine influences the way people process information and revise beliefs. This research shows that modifying the user interface of a search engine, i.e. the way documents are presented, the way users collect documents and the way documents are rearranged, can improve the accuracy of decision making. However, although the for/against document tool (i.e. the anchor debiasing intervention) and the keep document tool (i.e. the order debiasing intervention) are very similar in their user interface design, they differ markedly in their effectiveness in debiasing and in improving decision making. These findings call for further research into understanding how components of a search user interface and the search task affect the way people process information and make decisions.

In addition, with the increasing role of search engines in facilitating decision making, there needs to be an integrated approach that combines theories, findings from empirical studies and practical experience in designing search user interfaces that support efficient information searching and effective decision making (such as Hearst, 2006). Most user interfaces in existing search systems have not been evaluated. It is unclear which visual interface designs are suited to the various types of information searching tasks, and which components of a search user interface are effective for information processing and decision making. Even though there are theories that discuss how best to design a display (such as Wickens & Carswell’s (1995) proximity compatibility principle, which states that different pieces of information that need to be used together should be displayed in close proximity and made available simultaneously), more research is needed to understand how these theories can be used to optimise the interaction dialogue between the search engine and the user, and how to design visual interfaces that reflect the way people process information and revise beliefs.


15.6.5 Impact of information searching on accuracy and confidence in decision making

More research is needed to explore how search systems can assist users to be more discriminating with their confidence in decision making. Confidence often governs the way people make decisions, in the form of the overconfidence bias (Kahneman et al., 1982; Wickens & Hollands, 2000); however, findings from this research and previous studies have shown that confidence is not always a good indicator of decision accuracy (Pallier et al., 2002; Westbrook et al., 2005a). The challenge in information searching lies in one’s ability to plan actions and evaluate progress in the search journey (i.e. one’s metacognition). In an information searching task, the role of metacognition in planning and deciding on a search strategy, e.g. how much and what type of information is needed to make a decision and when it is appropriate to terminate searching, is often what determines the success of the search journey. Rather than relying on confidence alone to make these decisions, one should be trained or supported to use metacognition to plan and evaluate progress during the search journey. However, the search user interfaces reviewed in this research show no evidence of supporting metacognition during information searching; more research is needed to investigate the design of a search user interface that supports the role of metacognition.
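One simple way to examine how well confidence tracks accuracy is to tabulate accuracy by stated confidence level, as in the hypothetical sketch below; the records and field names are invented for illustration, and no such feature is evaluated in this thesis.

```python
# Minimal sketch: accuracy of post-search answers grouped by stated confidence,
# a crude calibration check. All records are hypothetical.
from collections import defaultdict

answers = [
    {"confidence": "very confident", "correct": False},
    {"confidence": "very confident", "correct": True},
    {"confidence": "not confident",  "correct": True},
    {"confidence": "not confident",  "correct": False},
]

totals, correct = defaultdict(int), defaultdict(int)
for a in answers:
    totals[a["confidence"]] += 1
    correct[a["confidence"]] += int(a["correct"])

for level in totals:
    print(f"{level}: accuracy = {correct[level] / totals[level]:.0%}")
```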

Further investigation is required to explore how search systems can assist users to be more discriminating about the diagnosticity and reliability of retrieved information. People are often subject to the salience bias, which prevents them from selecting information that is of high diagnosticity and reliability but of low salience. People frequently use the as-if heuristic, treating information of differing diagnosticity and reliability equally, which prevents them from correctly interpreting the information and using it appropriately. In addition, people are subject to belief revision vulnerabilities, such as the anchoring effect, overconfidence, order effects and the confirmation bias, which prevent them from obtaining the appropriate information or understanding the situation well enough to arrive at the most suitable solution. However, there is no evidence to suggest that current search engines address these cognitive limitations associated with human attention and working memory at each stage of information searching. More research from a cognitive perspective is needed to explore how people can overcome these cognitive limitations while searching for information.

15.7 Conclusion

This research is possibly the first study that looks at the impact of cognitive biases on information searching and decision making. It provides evidence that (i) people can experience cognitive biases during information searching; (ii) debiasing strategies can be integrated into the user interface of a search system; (iii) search behaviours can be debiased; and that (iv) attempts to debias information searching can influence subjects’ decision outcome, affect their confidence in answers and alter their information searching behaviour.

This research calls for further investigation into understanding the relationship between cognitive biases, debiasing, information searching and decision making. Potential areas of investigation include: the exploration of other cognitive biases that have not been addressed in this research, the collective impact of multiple cognitive biases on search behaviours and decision outcomes, the conditions in which debiasing introduces new biases or reduces the impact of other biases, and the contexts in which debiasing is beneficial or detrimental to information searching and decision making. This research also alludes to other broader research areas, such as the need for users to be more discriminating about the diagnosticity and reliability of information (as well as their confidence in decision making), and a call for an integrated approach to designing search systems that facilitate better decision making.

Findings from this research should contribute to a better understanding of how people search and use information to make a decision. They should also permit further study of the way information retrieval systems influence human decisions, and lead to better designs of search systems that support the entire journey from retrieving information to making better decisions, especially in the context of health-related decision making and evidence-based medicine. However, this research also confirms our lack of understanding at the intersection of cognitive science, information science and decision making. Further research efforts should continue towards understanding the impact of information searching on decision making from a cognitive perspective.


References

A9. Retrieved 29th September, 2006, from www.a9.com

Melvyl. Retrieved 29th September, 2006, from http://melvyl.cdlib.org/

Ahlberg, C., & Shneiderman, B. (1994). In Visual information seeking using the filmfinder (pp. 433-434). Paper presented at the Conference on Human Factors in Computing Systems, Boston, MA. ACM Press.

Ahlberg, C., & Shneiderman, B. (1994). In Visual Information Seeking: Tight Coupling of Dynamic Query Filters with Starfield Displays (pp. 313-317). Paper presented at the SIGCHI conference on Human factors in computing systems: celebrating interdependence, Boston, MA. ACM Press

Allen, G. (1982). In Probability judgment in weather forecasting. Paper presented at the Ninth Conference in Weather Forecasting and Analysis, Boston: American Meteorological Society.

Anderson, N.H., & Farkas, A.J. (1973). New Light on Order Effects in Attitude Change. Journal of Personality and Social Psychology, 28(1), 88-93.

Arkes, H.R. (1991). Costs and Benefits of Judgment Errors: Implications for Debiasing. Psychological Bulletin, 110(3), 486-498.

Arnold, V., Collier, P.A., Leech, S.A., & Sutton, S.G. (2000). The effect of experience and complexity on order and recency bias in decision making by professional accountants. Accounting & Finance, 40(2), 109-134.

Arnott, D. (2006). Cognitive biases and decision support systems development: a design science approach. Information Systems Journal, 16(1), 55-78.

Asch, S.E. (1946). Forming impressions of personality. Journal of Abnormal and Social Psychology, 41, 258-290.

Ashton, R.H. (1990). Pressure and performance in accounting decision settings: paradoxical effects of incentives, feedback and justification. Journal of Accounting Research, 28(Suppl), 148-180.

Ashton, R.H., & Kennedy, J. (2002). Eliminating Recency with Self-Review: The Case of Auditors' 'Going Concern' Judgments. Journal of Behavioral Decision Making, 15(3), 221- 231.



Bates, M.J. (1989). The Design of Browsing and Berrypicking Techniques for the Online Search Interface. Online Review, 13(5), 407-424.

Bazerman, M.H. (1994). Judgment in Managerial Decision Making (3rd ed.). Toronto: John Wiley & Sons, Inc.

Beach, L.R., & Mitchell, T.R. (1990). A contingency model for the selection of decision strategies. Academy of Management Review, 3(3), 439-449.

Bederson, B.B., Hollan, J.D., Perlin, K., Meyer, J., Bacon, D., & Furnas, G. (1996). Pad++: A zoomable graphical sketchpad for exploring alternative interface physics. Journal of Visual Languages and Computing, 7(1), 3-32(30).

Bergus, G.R., Chapman, G.B., Gjerde, C., & Elstein, A.S. (1995). Clinical reasoning about new symptoms despite preexisting disease: sources of error and order effects. Family Medicine, 27(5), 314-320.

Bergus, G.R., Chapman, G.B., Levy, B.T., Ely, J.W., & Oppliger, R.A. (1998). Clinical Diagnosis and the Order of Information. Medical Decision Making, 18(4), 412-417.

Bergus, G.R., Levin, I.P., & Elstein, A.S. (2002). Presenting Risks and Benefits to Patients - the Effect of Information Order on Decision Making. Journal of General Internal Medicine, 17(8), 612-617.

Bier, E.A., Stone, M.C., Pier, K., Buxton, W., & DeRose, T. (1993). In Tool glass and magic lenses: The see-through interface (pp. 73-80). Paper presented at the Twentieth annual conference on Computer graphics and interactive techniques ACM Press.

Blackshaw, L., & Fischhoff, B. (1988). Decision Making in Online Searching. Journal of the American Society for Information Science, 39(6), 369-389.

Börner, K. (2002). Visual Interfaces for Semantic Information Retrieval and Browsing. In V. Geroimenko & C. Chen (Eds.), Visualizing the Semantic Web (1st ed., pp. 99-115): Springer.

Brajnik, G., Mizzaro, S., & Tasso, C. (1996). In Evaluating User Interfaces to Information Retrieval Systems: A Case Study on User Support (pp. 128-136). Paper presented at the Ninteenth annual international ACM SIGIR conference on research and development in information retrieval Zurich, Switzerland. ACM Press.

Brewer, N.T., Chapman, G.B., Schwartz, J.B., & Bergus, G.R. (in press). The influence of irrelevant anchors on the judgments and choices of doctors and patients. Medical Decision Making.

Brown, M.W., Lawrence, K.R., & Paolini, M.A. (2003). Web page thumbnails and user configured complementary information provided from a server. United States.

Capitani, E., Della Sala, S., Logie, R.H., & Spinnler, H. (1992). Recency, Primacy, and Memory: Reappraising and Standardising the Serial Position Curve. Cortex, 28(3), 315-342.


Card, S.K. (1996). Visualizing Retrieved Information: A Survey. IEEE Computer Graphics and Applications, 16(2), 63-67.

Chandra, A., & Krovi, R. (1999). Representational congruence and information retrieval: Towards an extended model of cognitive fit. Decision Support Systems, 25(4), 271-288.

Chapman, G.B., Bergus, G.R., & Elstein, A.S. (1996). Order of Information Affects Clinical Judgment. Journal of Behavioral Decision Making, 9(3), 201-211.

Chen, M., Hearst, M., Hong, J., & Lin, J. (1999). In Cha-Cha: A System for Organizing Intranet Search Results. Paper presented at the Second USENIX Symposium on Internet Technologies and Systems (USITS), Boulder, CO.

Chinn, C., & Brewer, W. (1993). In Factors that influence how people respond to anomalous data (pp. 318-323). Paper presented at the Fifteenth Annual Conference of the Cognitive Science Society, Hillsdale, NJ. Lawrence Erlbaum.

Clemen, R.T., & Lichtendahl, K.C. (2002). In Debiasing Expert Overconfidence: A Bayesian Calibration Model. Paper presented at the Sixth International Conference on Probablistic Safety Assessment and Management (PSAM6), San Juan, Puerto Rico, USA.

Cockburn, A., Karlson, A., & Bederson, B.B. (2006). A Review of Focus and Context Interfaces: Human-Computer Interaction Lab, University of Maryland.

Cohen, M.S., Freeman, J.T., & Thompson, B.B. (1997). Training the naturalistic decision maker. In C.E. Zsambok & G. Klein (Eds.), Naturalistic decision making (pp. 257-268). Mahwah, NJ: Erlbaum.

Coiera, E., Walther, M., Nguyen, K., & Lovell, N.H. (2005). Architecture for Knowledge- Based and Federated Search of Online Clinical Evidence. Journal of Medical Internet Research, 7(5), e52.

Cole, C., Beheshti, J., Leide, J.E., & Large, A. (2005). Interactive information retrieval: bringing the user to a selection state. In A. Spink & C. Cole (Eds.), New directions in cognitive information retrieval (pp. 13-42). London: Springer.

Cooper, W.S. (1991). In A. Bookstein, Y. Chiaramella, G. Salton & V.V. Raghavan (Eds.), Some inconsistencies and Misnomers in Probabilistic Information Retrieval (pp. 57-61). Paper presented at the Fourteenth annual international ACM SIGIR conference on research and development in information retrieval, Chicago, Illinois. ACM.

Cousins, S.B., Paepcke, A., Winograd, T., Bier, E.A., & Pier, K. (1997). In The Digital Library Integrated Task Environment (DLITE) (pp. 142-151). Paper presented at the Second ACM international conference on Digital libraries Philadelphia, Pennsylvania. ACM Press.

Crestani, F., Lalmas, M., Van Rijsbergen, C.J., & Campbell, I. (1998). "Is This Document Relevant? ... Probably": A Survey of Probabilistic Models in Information Retrieval. ACM Computing Surveys, 30(4), 528-552.


Cromwell, H. (1950). The Relative Effect on Audience Attitude of the First Versus the Second Argumentative Speech of a Series. Speech Monographs, 17, 105-122.

Croskerry, P. (2002). Achieving Quality in Clinical Decision Making: Cognitive Strategies and Detection of Bias. Academic Emergency Medicine, 9(11), 1184-1204.

Croskerry, P. (2003). Cognitive Forcing Strategies in Clinical Decision Making. Annals of Emergency Medicine, 41(1), 110-120.

Croskerry, P. (2003a). The Importance of Cognitive Errors in Diagnosis and Strategies to Minimize Them. Academic Medicine, 78(8), 775-780.

Cunnington, J.P., Turnbull, J.M., Regehr, G., Marriott, M., & Norman, G.R. (1997). The Effect of Presentation Order in Clinical Decision Making. Academic Medicine, 72(10 Supplement I), S40-S42.

Curley, S.P., Yates, J.F., & Abrahms, R.A. (1986). Psychological sources of ambiguity avoidance. Organizational Behavior and Human Decision Processes, 38(2), 230-256.

Cutting, D.R., Karger, D.R., Pedersen, J.O., & Tukey, J.W. (1992). In Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections (pp. 318-329). Paper presented at the Fifteenth annual international ACM SIGIR conference on research and development in information retrieval Copenhagen, Denmark ACM Press.

Dawes, R.M. (1979). The robust beauty of improper linear models in decision making. American Psychologist, 34, 571-582.

Dawes, R.M., & Corrigan, B. (1974). Linear models in decision making. Psychological Bulletin, 81, 95-106.

de Mello, G.E. (2004). In Need of a Favorable Conclusion: The Role of Goal- in Consumer Judgments and Evaluations (Summarised version of dissertation): University of Southern California.

Ebbinghaus, H. (1885). Memory: A Contribution to Experimental Psychology: Originally published in New York by Teachers College, Columbia University.

Eisenberg, M., & Barry, C. (1988). Order Effects: A Study of the Possible Influence of Presentation Order on User Judgments of Document Relevance. Journal of the American Society for Information Science, 39(5), 293-300.

Ellis, D., Wilson, T.D., Ford, N., Foster, A., Lam, H.M., Burton, R., et al. (2002). Information Seeking and Mediated Searching. Part 5. User–Intermediary Interaction. Journal of the American Society for Information Science and Technology, 53(11), 883-893.

Elstein, A.S. (1999). Heuristics and Biases: Selected errors in Clinical Reasoning. Academic Medicine, 74(7), 791-794.

Elstein, A.S., Shulman, L.S., & Sprafka, S.A. (1978). Medical Problem Solving: An Analysis of Clinical Reasoning. Cambridge, MA: Harvard University Press.


Elting, L.S., Martin, C.G., Cantor, S.B., & Rubenstein, E.B. (1999). Influence of data display formats on physician investigators' decisions to stop clinical trials: prospective trial with repeated measures. British Medical Journal, 318(7197), 1527-1531.

Evans, J.S.B.T. (1989). Bias in Human Reasoning: Causes and Consequences. Hove, Brighton: Lawrence Erlbaum Associates Ltd.

Eysenbach, G., & Jadad, A.R. (2001). Evidence-based Patient Choice and Consumer health informatics in the Internet age. Journal of Medical Internet Research, 3(2), e19.

Fischhoff, B. (1982). Debiasing. In K. Daniel, S. Paul & T. Amos (Eds.), Judgment under uncertainty: Heuristics and biases (pp. 422-444). New York: Cambridge University Press.

Fitzsimons, G.J., & Williams, P. (2000). Asking Questions Can Change Choice Behavior: Does It Do So Automatically or Effortfully? Journal of Experimental Psychology: Applied, 6(3), 195-206.

Florance, V., & Marchionini, G. (1995). In E.A. Fox, P. Ingwersen & R. Fidel (Eds.), Information processing in the context of medical care (pp. 158 - 163). Paper presented at the Eighteenth annual international ACM SIGIR conference on research and development in information retrieval, Seattle, Washington.

Ford, N. (2005). New Cognitive Directions. In A. Spink & C. Cole (Eds.), New Directions in Cognitive Information Retrieval (pp. 81-98). London: Springer.

Ford, N., Miller, D., Booth, A., O’Rourke, A., Ralph, J., & Turnock, E. (1999). Information retrieval for evidence-based decision making. Journal of Documentation, 55(4), 385-401.

Ford, N., Miller, D., & Moss, N. (2001). The role of individual differences in Internet searching: an empirical study. Journal of the American Society for Information Science and Technology, 52(12), 1049-1066.

Ford, N., Wood, F., & Walsh, C. (1994). Cognitive styles and searching. Online & CDROM Review, 18(2), 79-86.

Furnas, G.W. (1986). In Generalized fisheye views (pp. 16-23). Paper presented at the SIGCHI conference on Human factors in computing systems Boston, MA ACM Press.

George, J.F., Duffy, K., & Ahuja, M. (2000). Countering the anchoring and adjustment bias with decision support systems. Decision Support Systems, 29(2), 195-206.

Gorry, G.A., & Barnett, G.O. (1968). Experience with a model of sequential diagnosis. Computers and Biomedical Research, 1(5), 490-507.

Gruppen, L.D., Wolf, F.M., & Billi, J.E. (1991). Information gathering and integration as sources of error in diagnostic decision making. Medical Decision Making, 11(4), 233-239.

Haynes, R.B., & Wilczynski, N.L. (2004). Optimal search strategies for retrieving scientifically strong studies of diagnosis from Medline: analytical survey. British Medical Journal, 328(7447), 1040.


Hearst, M., Pedersen, J., Pirolli, P., Schuetze, H., Grefenstette, G., & Hull, D. (1996). In D.K. Harman (Ed.), The Fourth TREC-4 Tracks: the Xerox Site Report (pp. 97-119). Paper presented at the Proceedings of the Fourth Text REtrieval Conference (TREC-4), Gaithersburg, Maryland. U. S. Dept. of Commerce, National Institute of Standards and Technology.

Hearst, M.A. (1994). In Using categories to provide context for full-text retrieval results (pp. 115-130). Paper presented at the RIAO '94; Intelligent Multimedia Information Retrieval Systems and Management, New York.

Hearst, M.A. (2006). Clustering versus Faceted Categories for Information Exploration. Communications of the ACM, 49(4), 59-61.

Hearst, M.A. (2006). In Design Recommendations for Hierarchical Faceted Search Interfaces. Paper presented at the ACM SIGIR Workshop on Faceted Search, Seattle, WA.

Hearst, M.A. (2000). Next Generation Web Search: Setting Our Sites. IEEE Data Engineering Bulletin, Special issue on Next Generation Web Search, 23(3), 38-48.

Hearst, M.A. (1995). In TileBars: Visualization of Term Distribution Information in Full Text Information Access (pp. 59-66). Paper presented at the SIGCHI conference on Human factors in computing systems Denver, Colorado. ACM Press.

Hearst, M.A. (1999a). The Use of Categories and Clusters for Organizing Retrieval Results. In T. Strzalkowski (Ed.), Natural Language Information Retrieval (pp. 333-374). Dordrecht: The Netherlands: Kluwer Academic Publishers Group.

Hearst, M.A. (1999). User Interfaces and Visualization. In R. Baeza-Yates & B. Ribeiro-Neto (Eds.), Modern Information Retrieval (1st ed., pp. 257-324). Boston, MA: Addison-Wesley.

Hearst, M.A., Elliot, A., English, J., Sinha, R., Swearingen, K., & Yee, K.-P. (2002). Finding the Flow in Web Site Search. Communications of the ACM, 45(9), 42-49.

Hearst, M.A., & Karadi, C. (1997). In Cat-a-Cone: An Interactive Interface for Specifying Searches and Viewing Retrieval Results using a Large Category Hierarchy (pp. 246-255). Paper presented at the Twentieth annual international ACM SIGIR conference on research and development in information retrieval Philadelphia, Pennsylvania.

Hearst, M.A., Karger, D.R., & Pedersen, J.O. (1995a). In R. Burke (Ed.), Scatter/Gather as a Tool for the Navigation of Retrieval Results (pp. 10-12). Paper presented at the Working Notes of the AAAI Fall Symposium on AI Applications in Knowledge Navigation and Retrieval, Cambridge, MA.

Hearst, M.A., & Pedersen, J.O. (1996). In Reexamining the Cluster Hypothesis: Scatter/Gather on Retrieval Results (pp. 76-84). Paper presented at the Nineteenth International Conference on Research and Development in Information Retrieval, Zurich, Switzerland. ACM Press.

Heath, L.S., Hix, D., Nowell, L.T., Wake, W.C., Averboch, G.A., Labow, E., et al. (1995). Envision: A User-Centered Database of Computer Science Literature. Communications of the ACM, 38(4), 52-53.


Henderson, D.A., & Card, S.K. (1986). Rooms: the use of multiple virtual workspaces to reduce space contention in a window-based graphical user interface. ACM Transactions on Graphics (TOG), 5(3), 211-243.

Hendry, D.G., & Harper, D.J. (1997). An Informal Information-Seeking Environment. Journal of the American Society for Information Science, 48(11), 1036-1048.

Hersh, W.R. (1996). Evidence-based medicine and the Internet. ACP Journal Club, 5(4), A14-A16.

Hersh, W.R. (2005). Ubiquitous but unfinished: on-line information retrieval systems. Medical Decision Making, 25(2), 147-148.

Hersh, W.R., Crabtree, M.K., Hickam, D.H., Sacherek, L., Friedman, C.P., Tidmarsh, P., et al. (2002). Factors Associated with Success in Searching MEDLINE and Applying Evidence to Answer Clinical Questions. Journal of the American Medical Informatics Association, 9(3), 283-293.

Hersh, W.R., Crabtree, M.K., Hickam, D.H., Sacherek, L., Rose, L., & Friedman, C.P. (2000). Factors associated with successful answering of clinical questions using an information retrieval system. Bulletin of the Medical Library Association, 88(4), 323-331.

Hersh, W.R., & Hickman, D.H. (1999). How Well Do Physicians Use Electronic Information Retrieval Systems? A Framework for Investigation and Systematic Review. Journal of the American Medical Association, 280(15), 1347-1352.

Hightower, R.R., Ring, L.T., Helfman, J.I., Bederson, B.B., & Hollan, J.D. (1998). In Graphical Multiscale Web Histories: A Study of PadPrints (pp. 58-65). Paper presented at the Ninth ACM conference on Hypertext and hypermedia: links, objects, time and space---structure in hypermedia systems, Pittsburgh, PA. ACM.

Hirt, E.R., & Markman, K.D. (1995). Multiple Explanation: A Consider-an-Alternative Strategy for Debiasing Judgments. Journal of Personality and Social Psychology, 69(6), 1069-1086.

Hogarth, R.M., & Einhorn, H.J. (1992). Order Effects in Belief Updating: The Belief- Adjusted Model. Cognitive Psychology, 24, 1-55.

Hovland, C.I., Mandell, E., Campbell, E., Brock, T., Luchins, A.S., Cohen, A., et al. (1957). The Order of presentation in persuasion. New Haven, CT: Yale University Press.

Huang, M., & Wang, H. (2004). The influence of Document Presentation Order and Number of Documents Judged on Users' Judgments of Relevance. Journal of the American Society for Information Science and Technology, 55(11), 970-979.

Hunink, M., Glasziou, P., Siegel, J., Weeks, J., Pliskin, J., Elstein, A., et al. (2001). Decision making in health and medicine - integrating evidence and values. Cambridge: UK: Cambridge University Press.


Jansen, B.J., & Spink, A. (2003). In An Analysis of Web Documents Retrieved and Viewed (pp. 65-69). Paper presented at the Fourth International Conference on Internet Computing, Las Vegas, Nevada.

Johnson, E.J., Payne, J.W., & Bettman, J.R. (1988). Information displays and preference reversals. Organizational Behavior and Human Decision Processes, 42, 1-21.

Jolls, C., & Sunstein, C.R. (2006). Debiasing through Law. Journal of Legal Studies, 35, 199- 241.

Kahneman, D., Slovic, P., & Tversky, A. (1982). Judgment under uncertainty: Heuristics and biases. New York: Cambridge University Press.

Kahneman, D., & Tversky, A. (1972). Subjective probability: a judgment of representativeness. Cognitive Psychology, 3, 430-454.

Kandogan, E., & Shneiderman, B. (1997). In Elastic windows: Evaluation of multi-window operations (pp. 250-257). Paper presented at the SIGCHI conference on Human factors in computing systems Atlanta, GA. ACM Press.

Kantor, P.B. (1987). A model for the stopping behavior of users of online systems Journal of the American Society for Information Science, 38(3), 211-214.

Kaptchuk, T.J. (2003). Effect of interpretive bias on research evidence. British Medical Journal, 326(7404), 1453-1455.

Kelly, D. (2005). Implicit feedback: using behavior to infer relevance. In A. Spink & C. Cole (Eds.), New directions in cognitive information retrieval (pp. 169-186). London: Springer.

Keren, G. (1990). Cognitive Aids and Debiasing Methods: Can Cognitive Pills cure Cognitive Ills? In J.-P. Caverni, J.-M. Fabre & M. Gonzalez (Eds.), Cognitive Biases (pp. 523-552). Amsterdam: North-Holland Elsevier Science Publishers B.V.

Klayman, J., & Brown, K. (1993). Debias the environment instead of the judge: an alternative approach to reducing error in diagnostic (and other) judgment. Cognition, 49, 97-122.

Klayman, J., & Ha, Y.W. (1987). Confirmation, disconfirmation, and information in hypothesis testing. Psychological Review, 94(2), 211-228.

Komlodi, A., Soergel, D., & Marchionini, G. (2006). Search histories for user support in user interfaces Journal of the American Society for Information Science and Technology, 57(6), 803-807.

Kruglanski, A.W., & Ajzen, I. (1983). Bias and error in human judgment. European Journal of Social Psychology, 13(1), 1-44.

Kupiec, J., Pedersen, J., & Chen, F. (1995). In A trainable document summarizer (pp. 68- 73). Paper presented at the Eighteenth annual International ACM/SIGIR Conference on research and development in information retrieval Seattle, WA. ACM press.


Lamping, J., & Rao, R. (1996). In Visualizing large trees using the hyperbolic browser (pp. 388-389). Paper presented at the Conference companion on Human factors in computing systems: common ground Vancouver, British Columbia. ACM Press.

Landauer, T.K., Egan, D.E., Remde, J.R., Lesk, M., Lochbaum, C.C., & Ketchum, D. (1993). Enhancing the usability of text through computer delivery and formative evaluation: the Superbook project. In C. McKnight, A. Dillon & J. Richardson (Eds.), Hypertext: A Psychological Perspective (pp. 71-136). New York: Ellis Horwood.

Lim, K.H., Benbasat, I., & Ward, L.M. (2000). The Role of Multimedia in Changing First Impression Bias. Information Systems Research, 11(2), 115-136.

Lopes, L.L. (1982). Procedural debiasing (No. WHIPP15): Madison: Wisconsin Human Information Processing Program.

Lucas, P., & Schneider, L. (1994). In Workscape: a scriptable document management environment (pp. 9-10). Paper presented at the Conference on Human Factors in Computing Systems Boston, MA ACM Press.

Lucas, P., & Senn, J.A. (1996). Document display system for organizing and displaying documents as screen objects organized along strand paths United States.

Luchins, A.S. (1957). Primacy-recency in impression formation. In C.I. Hovland, E. Mandell, E. Campbell, T. Brock, A.S. Luchins, A. Cohen, W. McGuire, I. Janis, R. Feierabend & N.H. Anderson (Eds.), The Order of presentation in persuasion (pp. 33-61). New Haven, CT: Yale University Press.

Lund, F.H. (1925). The Psychology of Belief: IV. The Law of Primacy in Persuasion. Journal of Abnormal and Social Psychology, 20, 183-191.

Lwanga, S.K., & Lemeshow, S. (1991). Sample size determination in health studies : a practical manual Geneva: World Health Organization.

Mackinlay, J.D., Rao, R., & Card, S.K. (1999). In An organic user interface for searching citation links (pp. 67-73). Paper presented at the SIGCHI conference on Human factors in computing systems Denver, Colorado. ACM press.

Mackinlay, J.D., Robertson, G., & Card, S.K. (1991). In The perspective wall: Detail and context smoothly integrated (pp. 173-176). Paper presented at the SIGCHI conference on Human factors in computing systems: Reaching through technology New Orleans, Louisiana.

MacPhail, M.G. (2004). Arrangement of information for display into a continuum ranging from closely related to distantly related to a reference piece of information United States.

Magrabi, F., Coiera, E., Westbrook, J., & Gosling, A.S. (2005). Clinician use of online evidence in primary care consultations. International Journal of Medical Informatics, 74(1), 1-12.

Malhotra, N.K. (1982). Information load and consumer decision making. Journal of Consumer Research, 8(4), 419-430.


Marchionini, G., & Komlodi, A. (1999). Design of Interfaces for Information seeking. Annual Review of Information Science and Technology, 33, 89-130.

Marett, K., & Adams, G. (2006). In The Role of Decision Support in Alleviating the Familiarity Bias (pp. 31b). Paper presented at the Thirty-ninth Annual Hawaii International Conference on System Sciences (HICSS'06) Track 2, Hawaii, USA.

Morrison, J.B., Pirolli, P., & Card, S.K. (2001). In A taxonomic analysis of what world wide web activities significantly impact people's decisions and actions. Paper presented at the Conference on Human Factors in Computing System, New York. ACM Press.

Morse, E., Lewis, M., & Olsen, K.A. (2002). Testing Visual Information Retrieval Case Study: Comparative Analysis of Textual, Icon, Graphical, and “Spring” Displays. Journal of the American Society for Information Science and Technology, 53(1), 28-40.

Mowshowitz, A., & Kawaguchi, A. (2002). Assessing bias in search engines. Information Processing and Management: an International Journal, 38(1), 141-156.

Murdock, B.B. (2001). An analysis of the serial position curve. In H.L.I. Roediger, J.S. Nairne, I. Neath & A.M. Surprenant (Eds.), The Nature of Remembering: Essays in Honor of Robert G. Crowder: American Psychological Association (APA).

Murdock, B.B., & Babick, A.J. (1961). The effect of repetition on the retention of individual words. American Journal of Psychology, 74(4), 596-601.

Murphy, A.H., & Winkler, R.L. (1984). Probability of precipitation forecasts. Journal of the Association Study of Perception, 79, 391-400.

Nie, J.Y. (1989). An Information Retrieval Model Based On Modal Logic. Information Processing and Management: an International Journal, 25(5), 477-491.

Nipher, F. (1876). On the Distribution of Errors in Numbers Written from Memory. Transactions of the Academy of Science of Saint Louis, 3, CCX-CCXI.

Nowell, L.T., France, R.K., Hix, D., Heath, L.S., & Fox, E.A. (1996). In Visualizing Search Results: Some Alternatives to Query-Document Similarity (pp. 67-75). Paper presented at the Nineteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Zurich, Switzerland. ACM Press.

Orasanu, J., & Fischer, U. (1997). Finding decisions in natural environments: The view from the cockpit. In C.E. Zsambok & G. Klein (Eds.), Naturalistic decision making (pp. 343-358). Hillsdale, NJ: Lawrence Erlbaum Associates.

Orbanes, J., & Guzman, A. (2004). Apparatus for viewing information in virtual space using multiple templates United States.

Oskamp, S. (1965). Overconfidence in case-study judgments. Journal of Consulting Psychology, 29, 261-265.


Pallier, G., Wilkinson, R., Danthiir, V., Kleitman, S., Knezevic, G., Stankov, L., et al. (2002). The role of individual differences in the accuracy of confidence judgments Journal of General Psychology, 129(3), 257–299.

Payne, J.W., Bettman, J.R., & Johnson, E.J. (1993). The adaptive decision maker. New York: Cambridge University Press.

Pirolli, P., Schank, P., Hearst, M.A., & Diehl, C. (1996). In Scatter/Gather Browsing Communicates the Topic Structure of a Very large Text Collection (pp. 213-220). Paper presented at the SIGCHI conference on Human factors in computing systems: common ground Vancouver, BC. ACM Press.

Plaisant, C., Milash, B., Rose, A., Widoff, S., & Schneiderman, B. (1996). In Lifelines: Visualizing Personal Histories (pp. 221-227). Paper presented at the SIGCHI conference on Human factors in computing systems: common ground Vancouver, British Columbia. ACM Press.

Pratt, W., Hearst, M.A., & Fagan, L.M. (1999). In A Knowledge-Based Approach to Organizing Retrieved Documents (pp. 80-85). Paper presented at the Sixteenth national conference on artificial intelligence and the eleventh Innovative applications of artificial intelligence conference innovative applications of artificial intelligence Orlando, Florida. American Association for Artificial Intelligence

Premack, D. (1959). Toward empirical behavioral laws: I. Positive reinforcement. Psychological Review, 66, 219-233.

Purgailis Parker, L.M., & Johnson, R.E. (1990). Does Order of Presentation Affect Users' Judgment of Documents? Journal of the American Society for Information Science, 41(7), 493-494.

Rai, A., Stubbart, C., & Paper, D. (1994). Can Executive Information Systems Reinforce Biases? Accounting Management & Information Technology, 4(2), 87-106.

Rao, R., & Card, S.K. (1994). In The Table Lens: Merging graphical and symbolic representations in an interactive focus+context visualization for tabular information (pp. 318-322). Paper presented at the ACM SIGCHI Conference on Human Factors in Computing Systems, Boston, MA.

Rao, R., Card, S.K., Jellinek, H.D., Mackinlay, J.D., & Robertson, G.G. (1992). In The Information Grid: A Framework for Information Retrieval and Retrieval-Centered Applications (pp. 23-32). Paper presented at the Fifth Annual Symposium on User Interface Software and Technology (UIST'92), Monterey, California. ACM Press.

Rao, R., Card, S.K., Johnson, W., Klotz, L., & Trigg, R.H. (1994a). In Protofoil: storing and finding the information worker's paper documents in an electronic file cabinet (pp. 212). Paper presented at the Conference companion on Human factors in computing systems Boston, MA ACM Press.

Rao, R., Pedersen, J.O., Hearst, M.A., Mackinlay, J.D., Card, S.K., Masinter, L., et al. (1995). Rich Interaction in the Digital Library. Communications of the ACM, 38(4), 29-39.


Reips, U.-D. (2002). Standards for Internet-Based Experimenting. Experimental Psychology, 49(4), 243-256.

Robertson, G., Czerwinski, M., Larson, K., Robbins, D.C., Thiel, D., & van Dantzich, M. (1998). In Data Mountain: Using Spatial Memory for Document Management (pp. 153- 162). Paper presented at the Eleventh annual ACM symposium on User interface software and technology San Francisco, CA. ACM Press.

Robertson, G., van Dantzich, M., Robbins, D., Czerwinski, M., Hinckley, K., Risden, K., et al. (2000). In The Task Gallery: a 3D window manager (pp. 494-501). Paper presented at the SIGCHI conference on Human factors in computing systems The Hague, The Netherlands ACM Press.

Robertson, G.G., & Mackinlay, J.D. (1993). In The document lens (pp. 101-108). Paper presented at the Sixth annual ACM symposium on User interface software and technology Atlanta, Georgia. ACM Press.

Robertson, G.G., Mackinlay, J.D., & Card, S.K. (1991). In Cone Trees: animated 3D visualizations of hierarchical information (pp. 189-194). Paper presented at the SIGCHI conference on Human factors in computing systems: Reaching through technology New Orleans, Louisiana. ACM Press.

Roth, S.F., Lucas, P., Senn, J.A., Gomberg, C.C., Burks, M.B., Stroffolino, P.J., et al. (1996). In Visage: A User Interface Environment for Exploring Information (pp. 3-12). Paper presented at the IEEE Symposium on Information Visualization (INFOVIS '96), Washington, DC. IEEE Computer Society.

Roy, M.C., & Lerch, F.J. (1996). Overcoming Ineffective Mental Representations in Base- rate Problems. Information Systems Research, 7(2), 233-247.

Russo, J.E., Medvec, V.H., & Melody, M.G. (1996). The Distortion of Information during Decisions. Organizational Behavior and Human Decision Processes, 66(1), 102-110.

Sanna, L.J., Schwarz, N., & Stocker, S.L. (2002). When Debiasing Backfires: Accessible Content and Accessibililty Experiences in Debiasing Hindsight. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28(3), 497-502.

Schroeder, R.G., & Benbassat, D. (1975). An experimental evaluation of the relationship of uncertainty to information used by decision makers. Decision Sciences, 6, 556-567.

Sciammarella, E., & Herndon, K. (1999). Method for displaying on a screen of a computer system images representing search results United States.

Sherman, S.J. (1980). On the self-erasing nature of errors of prediction. Journal of Personality and Social Psychology, 39(August), 211-221.

Shore, B. (1996). Bias in the development and use of an expert system: implications for life cycle costs. Industrial Management & Data Systems, 96(4), 18-26.

Simon, H.A. (1982). Models of bounded rationality (three volumes). Cambridge, Massachusetts: MIT Press.


Spence, R., & Apperley, M. (1982). Data base Navigation: An Office Environment for the Professional. Behavior and Information Technology, 1(1), 43-54.

Spink, A., & Cole, C. (2005b). A Multitasking Framework for Cognitive Information Retrieval. In A. Spink & C. Cole (Eds.), New directions in cognitive information retrieval (pp. 99-112). London: Springer.

Spink, A., & Cole, C. (2005). New Directions in Cognitive Information Retrieval (1st ed., Vol. 19). London: Springer.

Spink, A., & Cole, C. (2005a). Preface. In A. Spink & C. Cole (Eds.), New directions in cognitive information retrieval (pp. vii). London: Springer.

Spink, A., Park, M., & Koshman, S. (2006). Factors affecting assigned information problem ordering during Web search: An exploratory study. Information Processing and Management: an International Journal, 42(5), 1366-1378.

Spink, A., Wilson, T. D., Ford, N., Foster, A., & Ellis, D. (2002). Information Seeking and Mediated Searching Study. Part 3. Successive Searching. Journal of the American Society for Information Science and Technology, 53(9), 716-727.

Spoerri, A. (1993). In InfoCrystal: A visual tool for information retrieval (pp. 11-20). Paper presented at the Second international conference on Information and knowledge management Washington, D.C. ACM Press.

Spyridakis, J.H., Wei, C., Barrick, J., Uddihy, E.C., & Maust, B. (2005). Internet-based research: providing a foundation for web-design guidelines. IEEE Transactions on Professional Communication, 48(3), 242-260.

Stallard, M.J., & Worthington, D.L. (1998). Reducing the Hindsight Bias Utilizing Attorney Closing Arguments. Law and Human Behavior, 22(6), 671-683.

Subramaniam, P., Zoss, J., Ying, J.-J., & Caltabiano, M. (2004). Method, apparatus, and system for attaching search results. United States.

Tetlock, P.E., & Boettger, R. (1994). Accountability amplifies the status quo effect when change creates victims. Journal of Behavioral Decision Making, 7(1), 1-23.

Tubbs, R.M., Gaeth, G.J., Levin, I.P., & Van Osdol, L.A. (1993). Order Effects in Belief Updating with Consistent and Inconsistent Evidence. Journal of Behavioral Decision Making, 6, 257-269.

Turk, D.C., & Salovey, P. (1985). Cognitive Structures, Cognitive Processes, and Cognitive Behavior Modification: I. Client Issues. Cognitive Therapy and Research, 9(1), 1-17.

Turk, D.C., & Salovey, P. (1985a). Cognitive Structures, Cognitive Processes, and Cognitive Behavior Modification: II. Judgments and Inferences of the Clinician. Cognitive Therapy and Research, 9(1), 19-34.


Turpin, M., & Plooy, N.d. (2004). In Decision-making Biases and Information Systems. Paper presented at the Decision Support in an Uncertain and Complex World: The IFIP TC8/WG8.3 International Conference, Prato, Italy.

Tversky, A., & Kahneman, D. (1974). Judgment under Uncertainty: Heuristics and Biases. Science, 185(4157), 1124-1131.

Twidale, M., Nichols, D., Smith, G., & Trevor, J. (1995). In Ariadne: An Interface to Support Collaborative Database Browsing (pp. 367-374). Paper presented at the First international conference on computer support for collaborative learning, Indiana. Lawrence Erlbaum Associates.

Varey, C.A., & Kahneman, D. (1992). Experiences extended across time: Evaluation of moments and episodes. Journal of Behavioral Decision Making, 5, 169-185.

Wang Baldonado, M.Q., & Winograd, T. (1997). In SenseMaker: an information-exploration interface supporting the contextual evolution of a user's interests (pp. 11-18). Paper presented at the SIGCHI conference on Human factors in computing systems, Atlanta, Georgia. ACM Press.

Wang, H., Johnson, T.R., & Zhang, J. (2006). The order effect in human abductive reasoning: An empirical and computational study. Journal of Experimental and Theoretical Artificial Intelligence, 18(2), 215-247.

Wang, H., Zhang, J., & Johnson, T.R. (2000). In Human belief revision and the order effect. Paper presented at the Twenty-second Annual Conference of the Cognitive Science Society, Hillsdale, NJ.

Wang, P., & Soergel, D. (1998). A cognitive model of document use during a research project. Study I. document selection. Journal of the American Society for Information Science, 49(2), 115-133.

Wang, P., & White, M.D. (1999). A cognitive model of document use during a research project. Study II. Decisions at the reading and citing stages. Journal of the American Society for Information Science, 50(2), 98-114.

Warner, H.R., Toronto, A.F., Veasey, L.G., & Stephenson, R. (1961). A Mathematical Approach to Medical Diagnosis: application to congenital heart disease. Journal of the American Medical Association, 177(3), 177-183.

Weinstein, N.D., & Klein, W.M. (1995). Resistance of Personal Risk Perceptions to Debiasing Interventions. Health Psychology, 14(2), 132-140.

Westbrook, J.I., Coiera, E.W., & Gosling, A.S. (2005). Do Online Information Retrieval Systems Help Experienced Clinicians Answer Clinical Questions? Journal of the American Medical Informatics Association, 12(3), 315-321.

Westbrook, J.I., Gosling, A.S., & Coiera, E.W. (2005a). The Impact of an Online Evidence System on Confidence in Decision Making in a Controlled Setting. Medical Decision Making, 25(2), 178-185.


Wickens, C.D., & Carswell, C.M. (1995). The Proximity Compatibility Principle: Its Psychological Foundation and Its Relevance to Display Design. Human Factors, 37(3), 473-494.

Wickens, C.D., & Hollands, J.G. (2000). Decision Making. In Engineering Psychology and Human Performance (3rd ed., pp. 293-336). Upper Saddle River, NJ: Prentice Hall.

Williams, P., Block, L.G., & Fitzsimons, G.J. (2006). Simply Asking Questions About Health Behaviors Increases Both Healthy and Unhealthy Behaviors. Social Influence, 1(2), 117-127.

Wilson, T.D., Ford, N., Ellis, D., Foster, A., & Spink, A. (2002). Information Seeking and Mediated Searching. Part 2. Uncertainty and Its Correlates. Journal of the American Society for Information Science and Technology, 53(9), 704-715.

Wilson, T.D., Houston, C.E., Etling, K.M., & Brekke, N. (1996). A new look at anchoring effects: basic anchoring and its antecedents. Journal of Experimental Psychology: General, 125(4), 387-402.

Woodruff, A., Rosenholtz, R., Morrison, J.B., Faulring, A., & Pirolli, P. (2002). A comparison of the use of text summaries, plain thumbnails, and enhanced thumbnails for Web search tasks. Journal of the American Society for Information Science and Technology, 53(2), 172-185.

Wright, P. (1974). The harassed decision maker: Time pressures, distractions, and the use of evidence. Journal of Applied Psychology, 59(5), 555-561.

Yee, P., Swearingen, K., Li, K., & Hearst, M.A. (2003). In Faceted Metadata for Image Search and Browsing (pp. 401-408). Paper presented at the SIGCHI conference on Human factors in computing systems, Ft. Lauderdale, Florida. ACM Press.

Zajonc, R.B. (1968). Attitudinal effects of mere exposure. Journal of Personality and Social Psychology (Monograph Supplement), 9(2), 1-27.


Appendices

Appendix A: Sequential, non-sequential and odds form of the Bayes’ theorem

For both the sequential and non-sequential approaches, the probability of being correct after reading the first document (Doc1) is shown in Equation 1.

P(Correct | Doc1) = [P(Doc1 | Correct) × Prior] / [P(Doc1 | Correct) × Prior + P(Doc1 | Incorrect) × (1 − Prior)]    (1)

where

Prior = P(Correct)

After reading the second document (Doc2), the sequential approach updates the subject's belief at the end of each document: the prior for Doc2 is the probability of being correct after reading the first document (Doc1) (Gorry et al., 1968), as shown in Equation 2.

P(Correct | Doc1, Doc2) = [P(Doc2 | Correct) × Prior'] / [P(Doc2 | Correct) × Prior' + P(Doc2 | Incorrect) × (1 − Prior')]    (2)

where

Prior' = P(Correct | Doc1)

In contrast, the non-sequential approach updates the subject's belief only once all documents (Doc1, Doc2) have been read, as shown in Equation 3 (Warner et al., 1961).

P(Correct | Doc1, Doc2) = [P(Doc1 | Correct) × P(Doc2 | Correct) × Prior] / [P(Doc1 | Correct) × P(Doc2 | Correct) × Prior + P(Doc1 | Incorrect) × P(Doc2 | Incorrect) × (1 − Prior)]    (3)

where

Prior = P(Correct)
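
As a minimal numerical sketch of Equations 1–3 (the prior and the per-document likelihoods below are hypothetical values chosen only to illustrate the calculation), the sequential and non-sequential updates yield the same posterior when the documents are treated as conditionally independent:

```python
# Sketch: sequential vs. non-sequential Bayesian updating after two documents,
# following Equations 1-3. All numbers are hypothetical illustrations.

def update(prior, p_doc_given_correct, p_doc_given_incorrect):
    """One Bayesian update with a single document (Equations 1 and 2)."""
    numerator = p_doc_given_correct * prior
    return numerator / (numerator + p_doc_given_incorrect * (1 - prior))

prior = 0.5                 # P(Correct) before searching (hypothetical)
doc1 = (0.8, 0.3)           # (P(Doc1 | Correct), P(Doc1 | Incorrect))
doc2 = (0.6, 0.4)           # (P(Doc2 | Correct), P(Doc2 | Incorrect))

# Sequential approach: the posterior after Doc1 becomes the prior for Doc2.
prior_after_doc1 = update(prior, *doc1)
sequential = update(prior_after_doc1, *doc2)

# Non-sequential approach: both likelihoods are combined in a single update (Equation 3).
num = doc1[0] * doc2[0] * prior
non_sequential = num / (num + doc1[1] * doc2[1] * (1 - prior))

print(sequential, non_sequential)   # both 0.8: the two forms agree
```

With conditionally independent documents the two approaches are algebraically equivalent; they differ only when the likelihood assigned to a document depends on the order or context in which it is read.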


i. Bayes' theorem in odds form:

Posterior odds = Prior odds × Document likelihood ratio

ii. The odds of being correct after reading a document:

Posterior odds = P(Correct | Doc) / P(Incorrect | Doc)

Prior odds = P(Correct) / P(Incorrect)

Document likelihood ratio = P(Doc | Correct) / P(Doc | Incorrect)

iii. The odds of being incorrect after reading a document:

Posterior odds = P(Incorrect | Doc) / P(Correct | Doc)

Prior odds = P(Incorrect) / P(Correct)

Document likelihood ratio = P(Doc | Incorrect) / P(Doc | Correct)
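
The same single-document update can also be computed in odds form; a short sketch follows, again using the hypothetical Doc1 numbers from above:

```python
# Sketch of the odds form: Posterior odds = Prior odds x Document likelihood ratio.
# Numbers are hypothetical, matching the Doc1 example above.

prior = 0.5
p_doc1_correct, p_doc1_incorrect = 0.8, 0.3    # P(Doc1 | Correct), P(Doc1 | Incorrect)

prior_odds = prior / (1 - prior)                        # P(Correct) / P(Incorrect)
likelihood_ratio = p_doc1_correct / p_doc1_incorrect    # P(Doc1 | Correct) / P(Doc1 | Incorrect)
posterior_odds = prior_odds * likelihood_ratio

# Converting the odds back to a probability recovers the Equation 1 posterior.
posterior_probability = posterior_odds / (1 + posterior_odds)
print(posterior_odds, posterior_probability)            # ~2.67 and ~0.73
```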


Appendix B: Recruitment announcements

Concise version

Can you help build the next Google? http://129.94.108.23/health_searching/info.html

People who have used a search engine before are invited to participate in a 30-60 min Web-based experiment to answer 6 interesting health-related questions using a search engine. All participants will enter into a draw to win one of 100 movie tickets.

Contact: Annie Lau, [email protected],

Full version

Can you help build the next Google?

People who have used an Internet search engine before are invited to participate in a Web-based search engine experiment, conducted by the Centre for Health Informatics, UNSW.

All participants will enter into a draw to win one of 100 movie tickets.

To take part, please go to: http://129.94.108.23/health_searching/info_study.html

Following a brief tutorial, you will be asked to use an online search engine to answer six (6) interesting health-related questions (approx: 30–60 min).

Your participation is important to us – the findings of this study will help us to better support people in using information to make health-related decisions. For more information about the study, please contact Annie Lau


Appendix C: Participant information statement

The impact of information searching on decision making

Participant information statement

Purpose of the study

You are invited to participate in a study of online information searching and decision-making. We hope to learn about the impact of searching the web to answer health-related questions. You were selected as a possible participant in this study because you have used an Internet search engine before and may use health-related evidence to inform your learning and decision-making.

Description of study and risks

If you decide to participate, we will ask you to read six health-related questions and discuss your response. Next, we will ask you to use an online search engine to find relevant information pertaining to that scenario. Your information searching will be automatically recorded by the computer for analysis. People familiar with information searching would take around 30-60 min to complete the study.

We cannot and do not guarantee or promise that you will receive any benefits from this study. As our token of appreciation for your time and participation, you will enter into a draw for one of 100 movie tickets (or equivalent if you do not reside in Australia). If you are a student from UNSW GENM0518 Health and Power in the Internet Age, you will automatically gain 3 marks for your participation.

Confidentiality and disclosure of information

The material in this research is Confidential Information. By agreeing to participate in this experiment, you agree to hold the Confidential Information in strict confidence and shall not disclose such information to any third party without the written permission of the Investigator. You also agree to employ all steps necessary to protect the Confidential Information from unauthorised disclosure or use. This Agreement is governed by and construed in accordance with the laws of Australia and any applicable international law.

Any information that is obtained in connection with this study and that can be identified with you will remain confidential and will be disclosed only with your permission, except as required by law. If you give us your permission by accepting this document, we plan to publish the results in conference and journal papers. In any publication, information will be provided in such a way that you cannot be identified.


Complaints may be directed to the Ethics Secretariat, The University of New South Wales, SYDNEY 2052 AUSTRALIA (phone 9385 4234, fax 9385 6648, email [email protected]). Any complaint you make will be treated in confidence and investigated, and you will be informed of the outcome.

Your consent

Your decision whether or not to participate will not prejudice your future relations with The University of New South Wales. If you decide to participate, you are free to withdraw your consent and to discontinue participation at any time without prejudice.

If you have any questions, please feel free to ask us. If you have any additional questions later, investigator Ms Annie Lau (02 9385 9035, [email protected]) will be happy to answer them.

You can print a copy of this page to keep.


Appendix D: Ten-page tutorial


1.1 The search engine

Suppose the question is "Is there evidence to support that smoking causes lung cancer?". After giving your initial answer before searching, you will be asked to find information using the following search engine:

Functionalities:

1. Keywords: Enter your keywords according to these four dimensions; you do *NOT* need to fill in all of them

2. Go: to start the search; Clear All: to clear all keywords

3. Builder: The Builder is for Advanced keywords.

Page:2/10


1.2 The results page

Results will be returned in thumbnails, like the picture below:

Functionalities:

1. Top menu: Search to return to the search page; Results to go to the results page; Feedback to give us any feedback you have during searching; Contact us to contact the investigator of the study

2. Page button: to move to the next or previous search results page (the page you're currently on is displayed in the top left-hand corner, along with the total number of results returned)

3. Finish searching: To quit searching and proceed to answer the question again

Page:3/10


2. Collecting evidence

While you are searching, you will be using a different tool in each scenario to collect documents to support your answer. These tools help you collect relevant information and make notes on the documents:

- No tool (i.e. traditional searching)
- Keep Document tool
- For/against document tool

We will explore each tool in the next few pages.

Page:4/10


2.1 No tool (i.e. traditional searching)

This is how people normally conduct searches these days. There is no special tool to assist you with noting down the search results that you have found useful. You simply have to remember what you have read across the searches.

Page:5/10


2.2 Keep Document tool

This is a tool which helps you to keep relevant documents and make notes about them. Suppose you select the first document:

Step 1: After you have opened a document, the 'Keep document' tool appears at the bottom of the document window


Step 2: Select 'Keep' to retain relevant documents, then cut and paste or write some notes with the document

Step 3: Kept documents and notes will *automatically move* into the 'Selected results' section and stay across searches, like below.

Step 4: Review all collected documents and notes *before* making the final decision

Page:6/10


2.3 For/against document tool

This is a tool which allows you to allocate relevant documents as for or against the decision you are making, and to make notes about them. This is similar to the 'Keep document' tool, except that the scale is 'For/Neutral/Against'.

Suppose you select the first document:

Step 1: After you have opened a document, the 'For/against document' tool appears at the bottom of the document window


Step 2: Allocate relevant documents to For, Neutral or Against, and cut and paste or write some notes with the document

- When you want to keep a document but it doesn't fit the "For" or "Against" categories, you should allocate it to "Neutral".

Step 3: Allocated documents and notes will *automatically move* into the 'For/Neutral/Against' section and stay across searches.

Step 4: Review all collected documents and notes *before* making the final decision

Page:7/10


2.4 Collecting evidence during experiment

At the beginning of each health scenario, one of these searching tools will be selected at random to assist you to collect evidence in that scenario. You will be told which type of tool has been selected before you start each scenario:

- No tool (i.e. traditional searching)
- Keep Document tool
- For/against document tool

Page:8/10


3. A few more things ...

1. Collect and evaluate your evidence as thoroughly as possible

2. If an error occurs and you cannot continue, please log in with the same details again; you will continue from where you left off

3. Stay in this browser window; do NOT open any new browser windows (this means you should not right-click and select Open Link in New Window or New Tab)

Whenever you visit HealthInsite, the site will ask you whether you would like to view the document in the current browser window. Be sure to always select Open the page in this window, like below:

Page:9/10


You may now begin ...

Remember, the purpose of the experiment is to evaluate the effectiveness of different search experiences, not to evaluate your health-related knowledge.

Medicine is full of different viewpoints; be sure you have read and collected as much information as possible to support your answer.

You are now about to begin the experiment. Please log in using your name and a valid email address.

You may proceed to the experiment page. Have fun!

Page:10/10


Appendix E: Location of evidence for each case scenario question

Case scenario 1: Diet

We hear of people going on low carbohydrate and high protein diets, such as the Atkins diet, to lose weight. Is there evidence to support that low carbohydrate, high protein diets result in greater long-term weight loss than conventional low energy, low fat diets?

Expected correct answer: No

Location of evidence:

High protein low carbohydrate diets for weight loss http://www.healthyeatingclub.org/info/articles/body-shape/lowcarbbackground.htm [accessed on 1st November 2006]

Weight Loss and Results of Low-Carbohydrate Diets http://www.annals.org/cgi/content/full/140/10/I-27 [accessed on 1st November 2006]

Low-Carb, High-Protein Diets http://www.intelihealth.com/IH/ihtIH?d=dmtHMSContent&c=368035&p=~br,IHW|~st,24479|~r,WSIHW000|~b,*|#research [accessed on 1st November 2006]

Weight loss diets: low, moderate or high carbohydrate? http://www.healthyeatingclub.org/info/articles/body-shape/lowcarbevidence.htm [accessed on 1st November 2006]

Which weight loss diet works best? 4 popular diets compared http://www.healthyeatingclub.org/info/articles/body-shape/diets.htm [accessed on 1st November 2006]

Weight loss and carbohydrates http://www.betterhealth.vic.gov.au/bhcv2/bhcarticles.nsf/pages/Weight_loss_and_carbohydrates [accessed on 1st November 2006]

Low-carbohydrate diets http://www.aafp.org/afp/20060601/1942.html [accessed on 1st November 2006]


Case scenario 2: Hepatitis B

You can catch infectious diseases such as the flu from inhaling the air into which others have sneezed or coughed, sharing a straw or eating off someone else's fork. This is because certain germs reside in saliva, as well as in other bodily fluids. Hepatitis B is an infectious disease. Can you catch Hepatitis B from kissing on the cheek?

Expected correct answer: No

Location of evidence:

Hepatitis B Frequently Asked Questions http://www.cdc.gov/ncidod/diseases/hepatitis/b/faqb.htm [accessed on 1st November 2006]

General information fact sheet: Hepatitis B http://www.health.qld.gov.au/sexhealth/factsheets/Hepatitis_B.shtml [accessed on 1st November 2006]

Case scenario 3: Alcohol

After having a few alcoholic drinks, we depend on our liver to reduce the Blood Alcohol Concentration (BAC). Drinking coffee, eating, vomiting, sleeping or having a shower will not help reduce your BAC. Are there different recommendations regarding safe alcohol consumption for males and females?

Expected correct answer: Yes

Location of evidence:

FAQ: What is a safe level of drinking? http://www.niaaa.nih.gov/FAQs/General-English/FAQs13.htm [accessed on 1st November 2006]

Dietary Guidelines for Americans 2005: Key Recommendations http://www.ncadd.org/facts/health.html [accessed on 1st November 2006]

A-Z Health Topic: Alcohol http://www.health.nsw.gov.au/topics/alcohol.html [accessed on 1st November 2006]

Drinking alcohol: how much is healthy? http://mhcs.health.nsw.gov.au/mhcs/publication_pdfs/5375/BHC-5375-ENG.pdf [accessed on 1st November 2006]

Case scenario 4: SIDS

Sudden infant death syndrome (SIDS), also known as ‘cot death’, is the unexpected death of a baby where there is no apparent cause of death. Studies have shown that sleeping on the stomach increases a baby's risk of SIDS. Is there an increased risk of a baby dying from SIDS if the mother smokes during pregnancy?

Expected correct answer: Yes

Location of evidence:

SIDS Q & A http://www.firstcandle.org/expectantparents/exp_reduce_qa.html [accessed on 1st November 2006]

Sudden infant death syndrome http://www.nlm.nih.gov/medlineplus/ency/article/001566.htm [accessed on 1st November 2006]

Sudden Infant Death Syndrome and prenatal maternal smoking: rising attributed risk in the Back to Sleep era http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=15644131&query_hl=1 [accessed on 1st November 2006]

Sudden infant death syndrome (SIDS) http://kidshealth.org/parent/general/sleep/sids.html [accessed on 1st November 2006]

Pregnancy and smoking http://www.betterhealth.vic.gov.au/bhcv2/bhcarticles.nsf/pages/Pregnancy_and_smoking [accessed on 1st November 2006]

Sudden infant death syndrome (SIDS) - risks http://www.betterhealth.vic.gov.au/bhcv2/bhcarticles.nsf/pages/Sudden_infant_death_syndrome_(SIDS) [accessed on 1st November 2006]

Case scenario 5: Breast cancer

Breast cancer is one of the most common types of cancer found in women. Is there an increased chance of developing breast cancer for women who have a family history of breast cancer?

Expected correct answer: Yes

Location of evidence:

Genetics of Breast and Ovarian Cancer (PDQ) http://www.cancer.gov/cancertopics/pdq/genetics/breast-and-ovarian#Section_6 [accessed on 1st November 2006]

Risk factors. What causes breast cancer? http://www.breasthealth.com.au/riskfactors/index.html [accessed on 1st November 2006]

Managing healthy women at risk of breast cancer http://www.australianprescriber.com/magazine/25/6/139/41/ [accessed on 1st November 2006]

Q & A about breast cancer http://www.mydr.com.au/default.asp?article=3977 [accessed on 1st November 2006]

Breast health http://www.cancervic.org.au/cancer1/prevent/breasthealth.htm [accessed on 1st November 2006]


Genetic Testing for Breast and Ovarian Cancer Susceptibility http://www.cdc.gov/genomics/training/perspectives/factshts/breastcancer.htm [accessed on 1st November 2006]

Case scenario 6: Suicide

Men are encouraged by our culture to be tough. Unfortunately, many men tend to think that asking for help is a sign of weakness. In Australia, do more men die by committing suicide than women?

Expected correct answer: Yes

Location of evidence:

Men’s health http://www.healthinsite.gov.au/content/external/frame.cfm?ObjID=00018372-8849-10DB-94C780B1536A006D [accessed on 1st November 2006]

Suicide and attempted suicide http://www.virtualpsychcentre.com/diseases.asp?did=83 [accessed on 1st November 2006]

Trends in Deaths: Analysis of Australian Data 1987-1998 with updates to 2000 http://www.aihw.gov.au/publications/phe/td00/td00-c13.pdf [accessed on 1st November 2006]

Case scenario 7: Cold

Many people use home therapies when they are sick or to keep healthy. Examples of home therapies include drinking chicken soup when sick, drinking milk before bed for a better night's sleep and taking vitamin C to prevent the common cold. Is there evidence to support the taking of vitamin C supplements to help prevent the common cold?

Expected correct answer: No

Location of evidence:


Vitamins - common misconceptions (Better Health Channel, HealthInsite) http://www.betterhealth.vic.gov.au/bhcv2/bhcarticles.nsf/pages/Vitamins_common_misconceptions [accessed on 1st November 2006]

Vitamin C for preventing and treating the common cold (The Cochrane Collaboration) http://www.cochrane.org/reviews/english/ab000980.html [accessed on 1st November 2006]

Common cold – prevention http://www3.niaid.nih.gov/healthscience/healthtopics/colds/prevention.htm [accessed on 1st November 2006]

Vitamins http://www.hsph.harvard.edu/nutritionsource/vitamins.html [accessed on 1st November 2006]

Colds http://www.cyh.com/HealthTopics/HealthTopicDetails.aspx?p=114&np=303&id=1767 [accessed on 1st November 2006]

Colds, commonsense not antibiotics http://www.mydr.com.au/default.asp?article=3979 [accessed on 1st November 2006]

The common cold http://www.lungnet.com.au/fact_sheets/common-cold-health.html [accessed on 1st November 2006]

Case scenario 8: AIDS

We know that we can catch AIDS from bodily fluids, such as from needle sharing, having unprotected sex and breast-feeding. We also know that some diseases can be transmitted by mosquito bites. Is it likely that we can get AIDS from a mosquito bite?

Expected correct answer: No


Location of evidence:

Can I Get HIV from Mosquitoes? http://www.cdc.gov/hiv/resources/qa/qa32.htm [accessed on 1st November 2006]

Can we get AIDS from mosquito bites? http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=10554479&query_hl=1 [accessed on 1st November 2006]

Can I get HIV from a bite? http://www.cdc.gov/hiv/resources/qa/qa33.htm [accessed on 1st November 2006]

HIV and AIDS http://www.cyh.com/HealthTopics/HealthTopicDetails.aspx?p=240&np=299&id=2024 [accessed on 1st November 2006]

Human Immunodeficiency Virus infection (HIV) http://www.dhhs.tas.gov.au/healthyliving/factsheet.php?id=725 [accessed on 1st November 2006]

HIV and AIDS: 12 common questions answered http://www.mydr.com.au/default_new.asp?article=4024 [accessed on 1st November 2006]

HIV and AIDS: How to reduce your risk http://familydoctor.org/005.xml [accessed on 1st November 2006]