Thinking About Diagnostic Thinking: a 30-Year Perspective
Total Page:16
File Type:pdf, Size:1020Kb
Adv in Health Sci Educ (2009) 14:7–18 DOI 10.1007/s10459-009-9184-0 ORIGINAL PAPER Thinking about diagnostic thinking: a 30-year perspective Arthur S. Elstein Received: 14 July 2009 / Accepted: 14 July 2009 / Published online: 11 August 2009 Ó Springer Science+Business Media B.V. 2009 Abstract This paper has five objectives: (a) to review the scientific background of, and major findings reported in, Medical Problem Solving, now widely recognized as a classic in the field; (b) to compare these results with some of the findings in a recent best-selling collection of case studies; (c) to summarize criticisms of the hypothesis-testing model and to show how these led to greater emphasis on the role of clinical experience and prior knowledge in diagnostic reasoning; (d) to review some common errors in diagnostic reasoning; (e) to examine strategies to reduce the rate of diagnostic errors, including evidence-based medicine and systematic reviews to augment personal knowledge, guide- lines and clinical algorithms, computer-based diagnostic decision support systems and second opinions to facilitate deliberation, and better feedback. Keywords Clinical judgment Á Problem solving Á Clinical reasoning Á Cognitive processes Á Hypothetico-deductive method Á Expertise Á Intuitive versus analytic reasoning Accounts of clinical reasoning fall into one of two categories. The first comprises narra- tives or reports of interesting or puzzling cases, clinical mysteries that are somehow explicated by the clever reasoning of an expert. The narrative includes both the solution of the mystery and the problem solver’s account of how the puzzle was unraveled. These case reports resemble detective stories, with the expert diagnostician cast in the role of detec- tive.1 A recent example of this genre is ‘‘How Doctors Think’’, a collection of stories of clinical reasoning that has become a best-seller (Groopman 2007). The second broad category of studies falls within the realm of cognitive psychology. Investigators have analyzed clinical reasoning in a variety of domains—internal medicine, surgery, pathology, etc.—using cases or materials that might be mysteries but often are 1 The fictional detective Sherlock Holmes, perhaps the model of this genre, was inspired by one of Sir Arthur Conan Doyle’s teachers in medical school, Dr. Joseph Bell (http://www.siracd.com/work_bell.shtml). A. S. Elstein (&) Department of Medical Education, University of Illinois at Chicago, Chicago, IL, USA e-mail: [email protected] 123 8 A. S. Elstein intended to be representative of the domain. Rather than offering the solution and rea- soning of one clinician, the researchers collect data from a number of clinicians working the case to see if general patterns of performance, success and error can be identified. Some studies use clinicians of varying levels of experience, thereby hoping to study the devel- opment of expert reasoning or to discern patterns of error. Others concentrate on experi- enced physicians and aim to produce an account of what makes expert performance possible. These investigations are based on samples of both subjects and cases, although the samples are often relatively small. Two recent reviews (Ericsson 2007; Norman 2005) of the second category of studies date the beginning of this line of work to the publication of ‘‘Medical Problem Solving’’ (Elstein et al. 1978). It is exceedingly gratifying to be so recognized. When my colleagues and I began this project at Michigan State University 40 years ago, we never dreamed that our book would turn out to be a classic in medical education, cited as the foundation of the field of research on clinical judgment.2 For this reason, this paper begins with a review of the state of the field when the Medical Inquiry Project (as we called it) was conceived and implemented and then summarizes the major finding of that study. Next some of Groopman’s case studies will be examined because they vividly illustrate several important conclusions of the research and the strengths and weaknesses of the clinical narrative approach. The next two sections review some of the criticisms of the hypothetico-deductive model and some of the major errors in diagnostic reasoning that have been formulated by two somewhat divergent approaches within cognitive psychology—the problem solving and judgment/decision making para- digms. The essay concludes with some suggestions to reduce the error rate in clinical diagnosis. Medical Problem Solving: the research program The Medical Inquiry Project was aimed at uncovering and characterizing the underlying fundamental problem solving process of clinical diagnosis. This process was assumed to be ‘‘general’’ and to be the basis of all diagnostic reasoning. Because the process was assumed to be general and universal, extensive sampling of cases was not necessary, and each physician who participated was exposed to only three cases in the form of ‘‘high-fidelity simulations’’, a form known now as ‘‘standardized patients’’ (Swanson and Stillman 1990), plus some additional paper cases. Some of the paper simulations were borrowed from work on clinical simulations known as ‘‘Patient management problems’’ (PMPs; McGuire and Solomon 1971). In the light of our findings and subsequent research, the assumption of a single, uni- versal problem-solving process common to all cases and all physicians may seem unrea- sonable, but from the perspective of the then-current view of problem solving, it was quite plausible. Take language as an example: how many spoken sentences do you need to hear to judge whether or not a speaker is a native or non-native speaker of English? Not very many. Chomsky’s work on a universal grammar that was assumed to underpin all 2 To give credit, a case can be made that the field of research on clinical diagnosis really began earlier with studies of variation in clinical judgment that are nowadays regrettably overlooked (Bakwin 1945; Yer- ushalmy 1953; Lusted 1968). The reasons for neglect are unclear and in any event are beyond the scope of this paper, but it is a plain fact that they are cited neither by Ericsson nor Norman, nor by the vast majority of psychologists, physicians and educators who have done research on clinical reasoning in the past 30 years. 123 Thinking about diagnostic thinking 9 languages was one influence on our strategy (Chomsky 1957). Newell and Simon (1972), Newell et al. (1958) had been working toward a ‘‘general problem solver’’ (GPS). Their research on human (non-medical) problem solving similarly employed small sets of problems and intensive analysis of the thinking aloud of relatively few subjects. Likewise, deGroot’s study of master chess players used just a few middle-game positions as the task (De Groot 1965). Early in our work, Lee Shulman and I visited the Institute of Personality Assessment and Research at the University of California Berkeley. Ken Hammond was in the group to which we presented our plans and some pilot data. As the Brunswikian he is, Hammond warned us to sample tasks more extensively. He argued that tasks differ more than people and that we were not sampling the environment adequately. We did not understand the force of his comments until much later, but his views have influenced much subsequent research: both cases and physicians have to be adequately sampled. The assumption of a general problem-solving process also meant that any physician could be expected to exemplify the process. General internists and family physicians were the subjects, not specialists in the cases selected. Peer ratings of excellence were obtained to categorize the physicians as either experts or non-experts (the terms used in the book are ‘‘criterial’’ and ‘‘non-criterial’’). An earlier review of this project (Elstein et al. 1990) has discussed the problems arising from using peer ratings to identify expertise. Questions about whether the clinical reasoning of specialists and generalists would be similar did not arise until later. The major findings of our studies are quite familiar and can be briefly summarized. First, given the small sample of case materials used, expert and non-expert physicians could not be distinguished. The objective performance of expert diagnosticians was not significantly better than that of non-experts. All the subjects generated diagnostic hypotheses, all collected data to test hypotheses. The ‘‘experts’’ were not demonstrably more accurate or efficient. The groups might differ, but evidence was lacking. Second, diagnostic problems were approached and solved by a hypothetico-deductive method. A small set of findings was used to generate a set of diagnostic hypotheses and these hypotheses guided subsequent data collection. This did indeed appear to be a general method, but since this method was employed whether or not the final diagnosis was accurate, it was difficult to claim much for it. Third, between three and five diagnostic hypotheses were considered simultaneously, although any physician in our sample could easily enumerate more possibilities. This finding, a magic number of 4 ± 1, linked our results to work on the size of short-term memory (Miller 1956) and to Newell and Simon’s notion of limited or bounded rationality (Newell et al. 1958; Newell and Simon 1972). Thus, our analysis was far more grounded in cognitive theory than prior research on clinical reasoning or the usual clinical narrative. Finding #4 was that expertise was content- or case-specific. Physician performance was variable, and the outcome on one case was a poor predictor of performance on another. In medical education, this has been a very influential finding, especially in the assessment of clinical competence. In case-based examinations, test designers now use more cases to get satisfactory generalizability (Swanson and Stillman 1990; Bordage and Page 1987). The PMP method, which asked many questions about a small number of cases, was gradually replaced by approaches that sample cases more broadly and ask fewer questions about each case.