
Enhancing the Quality and Credibility of Qualitative Studies 655


CHAPTER 9. Enhancing the Quality and Credibility of Qualitative Studies

SIDEBAR: The medieval alchemical symbol for fire was a single triangle, also the modern symbol for triangulation in geometry, trigonometry, and surveying: the process of locating an unknown point by measuring angles to it from known points. Triangulation in qualitative inquiry involves gathering and analyzing multiple perspectives, using diverse sources of data, and, during analysis, using alternative frameworks. The double-triangle symbol, shown here, represented strong fire in alchemy. Strong fire was needed to ensure that the transformative process would work. Building and sustaining a strong fire required quality materials, good ventilation, and ongoing monitoring. Using a strong fire required skill, experience, and rigorous implementation of the transformative process to achieve the desired effects. Strong fire produces both intense heat and bright illumination. Alchemists who could properly build, sustain, and appropriately use strong fire were held in high esteem, had great credibility, and produced much-valued products.

Interpreting Truth

A young man traveling through a new country heard that a great Mulla, a Sufi guru with unequaled insight into the mysteries of the world, was also traveling in that region. The young man was determined to become his disciple. He found his way to the wise man and said, "I wish to place my education in your hands that I might learn to interpret what I see as I travel through the world."

After six months of traveling from village to village with the great teacher, the young man was confused and disheartened. He decided to reveal his frustration to the Mulla.

"For six months I have observed the services you provide to the people along our route. In one village you tell the hungry that they must work harder in their fields. In another village you tell the hungry to give up their preoccupation with food. In yet another village you tell the people to pray for a richer harvest. In each village the problem is the same, but always your message is different. I can find no pattern of Truth in your teachings."

The Mulla looked piercingly at the young man. "Truth? When you came here you did not tell me you wanted to learn Truth. Truth is like the Buddha. When met on the road it should be killed. If there were only one Truth to be applied to all villages, there would be no need of Mullahs to travel from village to village.

"When you first came to me you said you wanted to 'learn how to interpret' what you see as you travel through the world. Your confusion is simple. To interpret and to state Truths are two quite different things."

Having finished his story Halcolm smiled at the attentive youths. "Go, my children. Seek what you will, do what you must."

—From Halcolm's Parables

Chapter Preview

This chapter concludes the book by addressing ways to enhance the quality and credibility of qualitative analysis. Module 76 discusses and demonstrates analytical processes for enhancing credibility by systematically engaging and questioning the data. Module 77 presents four triangulation processes for enhancing credibility. Modules 78 and 79 present alternative and competing criteria for judging the quality of qualitative studies. Module 80 discusses how and why the credibility of the inquirer is critical to the overall credibility of qualitative findings. Module 81 examines core issues of generalizability, extrapolations, transferability, generating principles, and harvesting lessons. Module 82 concludes the chapter and the book by addressing philosophy of science issues related to the credibility and utility of qualitative inquiry.

Copyright ©2015 by SAGE Publications, Inc. This work may not be reproduced or distributed in any form or by any means without express written permission of the publisher.

MODULE 76. Analytical Processes for Enhancing Credibility: Systematically Engaging and Questioning the Data

The credibility of qualitative inquiry depends on four distinct but related inquiry elements:

1. Systematic, in-depth fieldwork that yields high-quality data

2. Systematic and conscientious analysis of data with attention to issues of credibility

3. Credibility of the inquirer, which depends on training, experience, track record, status, and presentation of self

4. Readers' and users' philosophical belief in the value of qualitative inquiry—that is, a fundamental appreciation of naturalistic inquiry, qualitative methods, inductive analysis, purposeful sampling, and holistic thinking (indeed, all 12 core qualitative strategies presented in Exhibit 2.1, pp. 46–47)

The first of these elements, systematic, in-depth fieldwork that yields high-quality data, was covered in Chapter 5 (purposeful qualitative designs), Chapter 6 (in-depth fieldwork and rich observational data), and Chapter 7 (high-quality, skillful interviewing).

This module and the next focus on the remaining three elements of quality: systematic and conscientious analysis of data. Module 80 discusses credibility of the inquirer, and Module 82 examines readers' and users' philosophical belief in the value of qualitative inquiry.

Strategies for Enhancing the Credibility of Analysis

Chance favors the prepared mind.

—Louis Pasteur (1822–1895), French microbiologist (known as the "father of microbiology") who discovered the process for pasteurizing milk, named after him

Chapter 8 presented analytical strategies for coding qualitative data, identifying patterns and themes, creating typologies, determining substantive significance, and reporting findings. However, at the heart of much controversy about qualitative findings are doubts about the nature of qualitative analysis because it is so judgment dependent. Statistical analysis follows formulas and rules, while, at the core, qualitative analysis depends on the insights, conceptual capabilities, and integrity of the analyst. Qualitative analysis is driven by the capacity for astute pattern recognition from beginning to end. Staying open to the data, for example, involves aggregating and integrating the data around a particular expected pattern while also watching for unexpected patterns. This process is epitomized in health research by the scientist working on one problem who suddenly notices a pattern related to a quite different problem—and thus discovers Viagra; as Pasteur explained when he was asked how he happened to discover how to stop bacterial contamination of milk, "Chance favors the prepared mind." Here, then, are some techniques that prepare the mind for insight while also enhancing the credibility of the resulting analysis.

Integrity in Analysis: Generating and Assessing Alternative Conclusions and Rival Explanations

One barrier to credible qualitative findings stems from the suspicion that the analyst has shaped findings according to his or her predispositions and biases. Being able to report that you engaged in a systematic and conscientious search for alternative themes, divergent patterns, and rival explanations enhances credibility, not to mention that it is simply good analytical practice and the very essence of being rigorous in analysis. This can be done both inductively and logically. Inductively, it involves looking for other ways of organizing the data that might lead to different findings. Logically, it means thinking about other logical possibilities and then seeing if those possibilities can be supported by the data. When considering rival organizing schemes and competing explanations, your mind-set should not be one of attempting to disprove the alternatives; rather, you look for data that support alternative explanations.

In an evaluation of a training program for chronically unemployed men of color, we conducted case studies of a group of successes. The program model was based on training in both hard skills (e.g., machine tooling, keyboarding, welding, and accounting) and soft skills (showing up to work on time, dressing appropriately, and respecting supervisors and coworkers). The cases studied validated the importance of both kinds of skills, but an additional theme emerged in later cases, namely, that the program experience and peer support led to an identity shift: Successful trainees began to think of themselves as capable of holding a job. They were used to being labeled as "losers." The opportunity to think of themselves as "winners" involved more than acquiring "soft skills." It involved a shift in identity. We went back to earlier cases to find out if that phenomenon was evident there as well. It was, as was evidence for how that shift in identity occurred. Might this change be simply a function of participants being older by the time they entered this particular program (a maturation effect)? No, the change was evident in younger participants as well as older ones. We continued in this fashion, looking for alternative explanations and checking them out against the case data.

Failure to find strong supporting evidence for alternative ways of presenting data or contrary explanations helps increase confidence in the initial, principal explanation you generated. Comparing alternative patterns will not typically lead to clear-cut "yes there is support" versus "no there is no support" kinds of conclusions. You're searching for the best fit, the preponderance of evidence. This requires assessing the weight of evidence and looking for those patterns and conclusions that fit the preponderance of data. Keep track of and report alternative classification systems, themes, and explanations that you considered and "tested" during data analysis. This demonstrates intellectual integrity and lends considerable credibility to the final set of findings and explanations offered. Analysis of rival explanations in case studies is analogous to counterfactual analysis in experimental designs.

Searching for and Analyzing Negative or Disconfirming Evidence and Cases

Closely related to testing alternative constructs is the search for and analysis of negative cases. Where patterns and trends have been identified, our understanding of those patterns and trends is increased by considering the instances and cases that do not fit within the pattern. These may be exceptions that illuminate the boundaries of the pattern. They may also broaden understanding of the pattern, change the conceptualization of the pattern, or cast doubt on the pattern altogether.

In qualitative analysis you need to keep analyzing the data to check any explanations and generalizations that you wish to make, to ensure that you have not missed anything that might lead you to question their applicability. Essentially this means looking for negative or deviant cases—situations and examples that just do not fit the general points you are trying to make. However, the discovery of negative cases or counterevidence to a hunch in qualitative analysis does not mean its immediate rejection. You should investigate the negative cases and try to understand why they occurred and what circumstances produced them. As a result, you might extend the idea behind the code to include the circumstances of the negative case and thus extend the richness of your coding. (Gibbs, 2007, p. 96)

In the Southwest Field Training Project involving wilderness education, virtually all participants reported significant "personal growth" as a result of their participation in the wilderness experiences; however, the two people who reported "no change" provided particularly useful insights into how the program operated and affected participants. These two had crises going on back home that limited their capacity to "get into" the wilderness experiences. The project staff treated the wilderness experiences as fairly self-contained, closed-system experiences. The two negative cases opened up thinking about "baggage carried in from the outside world," "learning-oriented mind-sets," and a "readiness" factor that subsequently affected participant selection and preparation.

Negative cases also provide instructive opportunities for new learning in formative evaluation. For example, in a health education program for teenage mothers where the large majority of participants complete the program and show knowledge gains, an important component of the analysis should include examination of reactions from dropouts, even if the sample is small for the dropout group. While the small proportion of dropouts may not be large enough to make a difference in a statistical analysis, qualitatively the dropout feedback may provide critical information about a niche group or a specific subculture, and/or clues to program improvement.

No specific guidelines can tell you how and how long to search for negative cases or how to find alternative constructs and hypotheses in qualitative data. Your obligation is to make an "assiduous search . . . until no further negative cases are found" (Lincoln & Guba, 1986, p. 77). You then report the basis for the conclusions you reach about the significance of the negative or deviant cases.

Readers of a qualitative study will make their own decisions about the plausibility of alternate explanations and the reasons why deviant cases do not fit within dominant patterns. But I would note that the


SIDEBAR: ADVOCACY–ADVERSARY ANALYSIS

In 1587, the Roman Catholic Church created advocacy–adversary roles to test the validity of evidence in support of the canonization process for elevating someone to sainthood. The Devil's Advocate (Latin: advocatus diaboli) in this process (officially designated the Promoter of the Faith) was a canon lawyer whose job was to argue against the canonization by presenting doubts about or holes in the evidence, for example, to argue that any miracles attributed to the candidate were unsubstantiated or even fraudulent. The Devil's Advocate opposed God's Advocate, whose job was to present supporting evidence and make the case in favor of canonization. This advocacy–adversary process endured until 1983, when it was abolished by Pope John Paul II as overly adversarial and contentious.

Advocacy–Adversary Analysis in Evaluation

A formal and forced approach to engaging rival conclusions draws on the legal system's reliance on opposing perspectives battling it out in the courtroom. The advocacy–adversary model suggested by Wolf (1975), also called the Judicial Model of Evaluation (Datta, 2005), developed in response to concerns that evaluators could be biased in their conclusions. To balance possible evaluator biases, two teams engage in debate. The advocacy team gathers and presents information that supports the proposition that the program is effective; the adversary team gathers information that supports the conclusion that the program ought to be changed or terminated.

Some years ago, I served as the judge for what would constitute admissible evidence in an advocacy–adversary evaluation of an innovative education program in Hawaii. The task of the advocacy team was to gather and present data supporting the proposition that the program was effective and ought to be continued. The adversaries were charged with marshalling all possible evidence demonstrating that the program ought to be terminated. When I arrived on the scene, I immediately felt the exhilaration of the competition. I wrote in my journal,

No longer staid academic scholars, these are athletes in a contest that will reveal who is best; these are lawyers prepared to use whatever means necessary to win their case. The teams have become openly secretive about their respective strategies. These are experienced evaluators engaged in a battle not only of data but also of wits.

As the two teams prepared their final reports, a concern emerged among some about the narrow focus of the evaluation. The summative question concerned whether the program should be continued or terminated. Education officials were asking how to improve the program without terminating it. Was it possible that a great amount of time, effort, and money was directed at answering the wrong question? Was it appropriate to force the data into a simple save-it-or-scrap-it choice? In fact, middle-ground positions were more sensible. But the advocacy–adversary analytical process design obliged opposing teams to do battle on the unembellished question of whether to maintain or terminate a program. A systematic assessment of strengths and weaknesses, with ideas for improvement, gave way to an all-good, all-bad framing, and that's how the results were presented (Patton, 2008, pp. 142–143).

The weakness of the advocacy–adversary approach is that it emphasizes contrasts and opposite conclusions, to the detriment of appreciating and communicating nuances in the data and accepting and acknowledging genuine and meaningful ambiguities. Advocacy–adversary analysis forces data sets into combat with each other. Such oversimplification of complex and multifaceted findings is a primary reason why advocacy–adversary evaluation is rarely used (in addition to being expensive and time-consuming). Still, it highlights the importance of engaging in some systematic analysis of alternative and rival conclusions, and as one approach (but not the only one) to testing conclusions, it can be useful and revealing.

Practical Analytical Variations on a Theme

1. A variation of the overall advocacy–adversary approach would be to arbitrarily create advocacy and adversary teams only during the analysis stage so that both teams work with the same set of data but each team organizes and interprets those data to support different and opposite conclusions, including identifying ambiguous findings.

2. Another variation would be for a lone analyst to organize data systematically into pro and con sets of evidence to see what each yielded.

section of the report that involves exploration of alternative explanations and consideration of why certain cases do not fall into the main pattern can be among the most interesting sections of a report to read. When well written, this section of a report reads something like a detective study in which the analyst (detective) looks for clues that lead in different directions and tries to sort out which direction makes the most sense given the clues (data) that are available. Such writing adds credibility by showing the analyst's authentic search for what makes most sense rather than marshalling all the data toward a single conclusion.

Indeed, the whole tone of a report feels different when the qualitative analyst is willing to openly consider other possibilities than those finally settled on as most reasonable in accordance with the preponderance of evidence. Compare the approach of weighing alternatives with the report where all the data lead in a single-minded fashion, in a rising crescendo, toward an overwhelming presentation of a single point of view. Perfect patterns and omniscient explanations are likely to be greeted skeptically—and for good reason: The human world is not perfectly ordered, and human researchers are not omniscient. Humility can do more than certainty to enhance credibility. Dealing openly with the complexities and dilemmas posed by negative cases is both intellectually honest and politically strategic.

(Cartoon: ©2002 Michael Quinn Patton and Michael Cochran)

SIDEBAR: ANALYTIC INDUCTION: HYPOTHESIS TESTING WITH NEGATIVE CASES

Analytic induction emphasizes giving special attention to negative or deviant cases for testing propositions that should, based on the theory being examined, apply to all cases that have been sampled in the design to manifest the phenomenon of interest. Analytic induction works through one case at a time. If the case data fit the hypothesis, the inductive analyst takes up the next case. If a case isn't consistent with the hypothesis—that is, it is a negative or deviant case—then the hypothesis is revised or the case is rejected as not actually relevant to the phenomenon being studied. The analytical focus is examining the extent to which every case confirms the hypothesis and to either refine the hypothesis or the statement of the problem to account for all cases. No cases can be ignored. All must be accounted for and used in the analysis.

Here's an example of testing a hypothesis about the effect of mother–daughter relationships on anorexia. The proposition being tested was "If mother was critical of daughter's body image and mother–daughter relationship was strained and daughter experiences weight loss, then count that as an example of mother's negative influence on daughter's self-image." Once particular interviews were identified as containing the codes identified in the hypothesis, the qualitative data from those interviews and cases could be examined to determine whether support for this causal interpretation could be justified for each case (Hesse-Biber & Dupuis, cited in Silverman & Marvasti, 2008, p. 252). The rigor of this approach is that finding even a single disconfirming case disconfirms the hypothesis, requiring either refinement or reformulation, for the goal is to identify and confirm a generalizable, universal, causal explanation for the phenomenon of interest (Flick, 2007a, p. 30; Schwandt, 2007, p. 6).

Avoid the Numbers Game

Philosopher of science Thomas H. Kuhn (1970), having studied extensively the value systems of scientists, observed that "the most deeply held values concern predictions" and "quantitative predictions are preferable to qualitative ones" (pp. 184–185). The methodological status hierarchy in science ranks "hard data" above "soft data," where "hardness" refers to the precision of statistics. Qualitative data can carry the stigma of "being soft." This carries over into the public arena, especially in the media and among policymakers, creating what has been called the tyranny of numbers (Eberstadt, 1995).

How can one deal with a lingering prejudice against qualitative methods? A starting point is helping people understand that qualitative methods are not weaker or softer than quantitative approaches. Qualitative methods are different. Making the case for the value of qualitative data involves being able to communicate the particular strengths of qualitative methods (Chapters 1 and 2) and the kinds of evaluation and other applications for which qualitative data are especially appropriate (Chapter 4). But those understandings can only open the door to dialogue. The fact is that numbers have a special allure in modern society. Statistics are seductive—so precise, so clear. Numbers convey that sense of precision and accuracy, even if the measurements that yielded the numbers are relatively unreliable, invalid, and meaningless (e.g., see Hausman, 2000; Silver, 2012).

Quantitizing

Quantitizing, commonly understood to refer to the numerical translation, transformation, or conversion of qualitative data, has become a staple of mixed-methods research (Sandelowski, Voils, &


Knafl, 2009, p. 208). Quantitized qualitative data are analyzed statistically, including using statistical significance tests (Collingridge, 2013).

Typically glossed, however, are the foundational assumptions, judgments, and compromises involved in converting qualitative into quantitative data and whether such conversions advance inquiry. . . . Such conversions "are by no means transparent, uncontentious, or apolitical" (Love, Pritchard, Maguire, McCarthy, & Paddock, 2005, p. 287; Sandelowski, Voils, & Knafl, 2009, p. 28).

There are different techniques by which quantitization may be achieved. Two common strategies are (1) dichotomizing and (2) counting. Dichotomizing refers to assigning a binary value (e.g., 0 and 1) to variables with two mutually exclusive and exhaustive categories, such as assigning "0" to participants who did not express a particular theme and "1" to participants who did express the theme. In contrast, counting involves calculating the number of themes expressed by each participant, as in the case of determining that a participant expressed two out of four themes in a study. Counting also includes calculating the number of qualitative codes assigned to specific themes, as in the case of determining that a participant expressed 10 qualitative codes associated with a theme (Collingridge, 2013, p. 82).

In Chapter 8, I devoted my MQP Rumination to why I consider this kind of quantitizing to be generally a bad idea and advocated keeping qualitative analysis qualitative (see pp. 557–559). I won't repeat that argument here. Still, it strikes me as a worrisome trend. What's driving it? Partly, it's simply the cultural and political allure of numbers. But there's more.

Pragmatic and ecumenical impulses, and the advent of computerized software programs to manage both qualitative and quantitative data, have served to promote a largely technical view of quantitizing. Moreover, the rhetorical appeal of numbers—their cultural association with scientific precision and rigor—has served to reinforce the necessity of converting qualitative into quantitative data.

A systematic literature review of quantitizing studies—that is, studies featuring quantitative analysis of qualitative interviews—shows the widespread nature of the phenomenon and some of the problems that arise, especially applying statistics to small sample sizes. Quantitative analyses of qualitative data are done to disaggregate results by background characteristics of participants (cross-tabs and correlations), to statistically test hypotheses, and to determine the prevalence of themes. But the overall problem is precisely what one would expect: "The conversion of the qualitative information to frequency counts has reduced the rich interpretation of people's experience that was expressed through their interviews" (Fakis, Hilliam, Stoneley, & Townend, 2014, p. 156). That is the crux of the issue, as is replacing a determination of substantive significance with the safe fallback position of relying on statistical significance. Moreover, those engaged in quantitizing seem oblivious to the issues involved.

Substantive Significance Trumps Statistical Significance

The point, however, is not to be anti-numbers. The point is to be pro-meaningfulness.

I'm not numbers phobic. I have used numbers regularly in titling exhibits throughout this book:

Exhibit 8.1 Twelve Tips for Ensuring a Strong Foundation for Qualitative Analysis (pp. 522–523)

Exhibit 8.10 Ten Types of Qualitative Analysis (pp. 551–552)

Exhibit 9.1 Ten Systematic Analysis Strategies to Enhance Credibility and Utility (pp. 659–660)

When there is something meaningful to be counted, then count. As sample sizes increase, especially in mixed-methods studies, quantitizing is likely to become even more pervasive. One study in the systematic review of quantitizing articles had a sample size of 400 (Fakis et al., 2014, p. 146). Such studies will quantitize and do so appropriately. Weaver-Hightower (2014) studied political influence by reviewing public policy documents; from 1,459 transcript pages, he coded 2,294 unique claims and relied heavily on quantitative analysis. That's understandable and appropriate, though reporting the results to two decimal places, "the average agreement score was 5.22%" (p. 125), illustrates the allure of pretentious precision. Or maybe just habit.

So while I advocate keeping qualitative analysis qualitative and focusing on substantive significance when interpreting findings, this is no hard-and-fast rule (my Chapter 8 MQP Rumination notwithstanding). Do what is appropriate. It doesn't make sense to report percentages in a sample of 10 interviewees; it does make sense with a sample of 400. By knowing the strengths and weaknesses of both quantitative and qualitative data, you can help those with whom


you dialogue focus on really important questions rather than, as sometimes happens, focusing primarily on how to generate numbers. The really important questions are about what the findings mean. A single illuminative case or interview may be more substantively meaningful and insightful than 20 routine cases. That 5% level of insight is not a reason to pay more attention to the 95% degree of mediocrity just because there's more of it. Information-rich cases stand out not because there are lots of them but precisely because they are so rare—and rich with revelation (the very definition of being information-rich). Rare, precious gems are valued over widely available (and less expensive) semiprecious stones for the same reason. Qualitative analysis must include the analytical insight to distinguish signal from noise and valuable insights from commonplace ones.

Summary of Strategies for Systematically Analyzing Qualitative Data to Enhance Credibility

Qualitative analysis aims to make sense of qualitative data: detecting patterns, identifying themes, answering the primary questions framing the study, and presenting substantively significant findings. In this chapter, we've been looking at ways of enhancing the credibility of findings by deepening the analysis, reexamining initial findings, and continuously working back and forth between the findings and the data to validate findings against data. Exhibit 9.1 summarizes the analytical techniques we've just covered and looks ahead to the four kinds of triangulation I'll present and discuss in the next module (Items 7–10 in Exhibit 9.1).

SIDEBAR: CONSTANT COMPARISON

A lot of qualitative analysis involves comparisons: comparing cases, comparing quotations, comparing observations, and comparing findings in other studies with your own findings. The point about these comparisons is that they are constant; they continue throughout the period of analysis and are used not just to develop theory and explanations but also to increase the richness of description in your analysis and thus ensure that it closely captures what people have told you and what happened.

There are two aspects to this constant process:

1. Use the comparisons to check the consistency and accuracy of application of your codes, especially as you first develop them. Try to ensure that the passages coded the same way are actually similar. But at the same time, keep your eyes open for ways in which they are different. Filling out the detail of what is coded in this way may lead you to further codes and to ideas about what is associated with any variation. This can be seen as a circular or iterative process. Thus, develop your code, check for other occurrences in your data, compare these with the original, and then revise your coding (and associated memos) if necessary.

2. Look explicitly for differences and variations in the activities, experiences, actions and so on that have been coded. In particular, look for variation across cases, settings and events (Gibbs, 2007, p. 96).

Constant comparison is an ongoing analysis of similarities and differences: What things go together in the data? What things are different? What explains these similarities and differences? What are the implications for your overall inquiry purpose and conclusions?

Design Checks: Keeping Methods and Data in Context

One issue that can arise during analysis is concern about how design decisions affect results. For example, purposeful sampling strategies provide a limited number of cases for examination. When interpreting findings, it becomes important to reconsider how design constraints may have affected the data available for analysis. This means considering the rival methodological hypothesis that the findings are due to methodological idiosyncrasies.

By their nature, qualitative findings are highly context and case dependent. Three kinds of sampling limitations typically arise in qualitative research designs:

1. There are limitations in the situations (critical events or cases) that are sampled for observation (because it is rarely possible to observe all situations even within a single setting).

Copyright ©2015 by SAGE Publications, Inc. This work may not be reproduced or distributed in any form or by any means without express written permission of the publisher.

2. There are limitations from the time periods during which observations took place—that is, constraints of temporal sampling.

3. The findings will be limited based on selectivity in the people who were sampled for observations or interviews, or selectivity in document sampling.

In reporting how purposeful sampling decisions affect findings, the analyst returns to the reasons for having made the initial design decisions. Purposeful sampling involves studying information-rich cases in depth and detail to understand and illuminate important cases rather than generalizing from a sample to a population (see Chapter 5). For instance, sampling and studying highly successful and unsuccessful cases in an intervention yields quite different results from studying a "typical" case or a mix of cases. People unfamiliar with purposeful samples may think of small, purposeful samples as "biased," a perception that undermines credibility in their minds.

In communicating findings, then, it becomes important to emphasize that the issue is not one of dealing with a distorted or biased sample but rather one of clearly delineating the purpose, strengths, and limitations of the sample studied—and therefore being careful about not inappropriately extrapolating the findings to other situations, other time periods, and other people—a caution we'll return to later in this chapter. Reporting both methods and results in their proper contexts will avoid many controversies that result from yielding to the temptation to overgeneralize from purposeful samples. Keeping findings in context is a cardinal principle of qualitative analysis. Design decisions are context for analysis.

The wise fool in Sufi tales, Mulla Nasrudin, was once called on to make this point to his monarch. Although he was supposed to be a wise man, Nasrudin was accused of being illiterate. Nagged to action by skeptics, the monarch decided to test him.

"Write something for me, Nasrudin," said the king.

"I would willingly do so, but I have taken an oath never to write so much as a single letter again," replied Nasrudin.

"Well, write something in the way in which you used to write before you decided not to write, so that I can see what it was like."

"I cannot do that, because every time you write something, your writing changes slightly through practice. If I wrote now, it would be something written for now."

"Then," addressing the crowd, the king commanded: "Bring me an example of Nasrudin's writing, anyone who has something he's written."

Someone brought a terrible scrawl that Nasrudin had once written to him.

"Is this your writing?" asked the monarch.

"No," said Nasrudin. "Not only does writing change with time, but reasons for writing change. You are now showing a piece of writing done by me to demonstrate to someone how he should not write." (Shah, 1973, p. 92)



EXHIBIT 9.1 Ten Systematic Analysis Strategies to Enhance Credibility and Utility

1. Generate and assess alternative conclusions and rival explanations. Don't settle quickly on initial conclusions. Go back to the data. What are other ways of explaining what you've found? Look for the explanation that best fits the preponderance of evidence.

2. Advocacy–adversary analysis uses a debate format for testing the viability of conclusions. What are the evidence and arguments that support your conclusions? What are the contrary evidence and counterarguments? Get another analyst to play the "Devil's Advocate" role, or switch back and forth in advocacy and adversary roles yourself. The aim is to surface doubts and weaknesses as well as build on strengths and confirm solid conclusions.

3. Search for and analyze negative or disconfirming evidence and cases. There are "exceptions that prove the rule" and exceptions that question the rule. In either case, look for and learn from exceptions to the patterns you've identified.

4. Make constant comparison your constant companion. All analysis is ultimately comparative. You compare the data that fit into a category, pattern, or theme with the data that don't fit. You compare alternative explanations, conclusions, and chains of evidence. Compare and contrast. Then compare and contrast some more.

5. Keep analysis connected to purpose and design. When deeply enmeshed in cataloguing, classifying, and comparing the trees in your qualitative data—that is, the depth and details of rich, thick qualitative data—change perspectives now and again to see the forest—that is, reconnect with the big picture. Purpose drives design. Purpose and design drive data collection. Purpose, design, and the data collected, in combination, drive analysis. Make sure that your analysis is serving the purpose of the inquiry. A well-chosen, thoughtful design will have anticipated how analysis would unfold. Keep those linkages in mind so that analysis doesn't become isolated from the inquiry's overall purpose and context.

6. Keep qualitative analysis qualitative. Paraphrasing poet Dylan Thomas, do not go gently into that numerical night. Quantitize thoughtfully, carefully, and even reluctantly. Do so when it's appropriate and enhances understanding, all the while aware of the allure of numbers and the danger of losing the richness of qualitative data in the parsimony of numerical reduction.

7. Integrate and triangulate diverse sources of qualitative data: interviews, observations, document analysis. Any single source of data has strengths and weaknesses. Consistency of findings across types of data increases confidence in the confirmed patterns and themes. Inconsistency across types of data invites questions and reflection about why certain methods produced certain findings.

8. Integrate and triangulate quantitative and qualitative data in mixed-methods studies. The logic of triangulation (see Item 7) applies in mixed-methods designs when the strengths and weaknesses of qualitative and quantitative data are used together to illuminate the inquiry.

9. Triangulate analysts. Having more than one pair of eyes look at and think about the data, identify patterns and themes, and test conclusions and explanations reduces concerns about the potential biases and selective perception of a single analyst.

10. Undertake theory triangulation. Look at the findings and conclusions through the lens of alternative theoretical frameworks. How would a symbolic interactionist interpret the data compared with a phenomenologist or realist? How would a behavioral psychologist interpret the findings compared with a humanistic psychologist? What does a mechanistic display reveal compared with a systems graphic? The point is not to conduct an endless set of such theoretical comparisons but to select only those theoretical frameworks most germane to your inquiry to see what the alternative perspectives yield by way of insight and explanation.

Keeping findings in context is a cardinal principle of qualitative analysis.

MODULE 77

Four Triangulation Processes for Enhancing Credibility

By combining multiple observers, theories, methods and data sources, [researchers] can hope to overcome the intrinsic bias that comes from single-methods, single-observer, and single-theory studies.

—Norman K. Denzin (1989c, p. 307)

Chapter 5 on design discussed the benefits of using multiple data-collection techniques, a form of triangulation, to study the same setting, issue, or program. You may recall from that discussion that the term triangulation is taken from land surveying. Knowing a single landmark only locates you somewhere along a line in a direction from the landmark, whereas with two landmarks you can take bearings in two directions and locate yourself at their intersection. The notion of triangulating also works metaphorically to call to mind the world's strongest geometric shape—the triangle, which in its double alchemical form serves as the symbol for this chapter. The logic of triangulation is based on the premise that no single method ever adequately solves the problem of rival explanations. Because each method reveals different aspects of empirical reality and social perception, multiple methods of data collection and analysis provide more grist for the analytical mill. Combinations of interviewing, observation, and document analysis are expected in most fieldwork. Mixed qualitative–quantitative studies are increasingly valued as more credible than single-method studies. Studies that use only one method are more vulnerable to errors linked to that particular method (e.g., loaded interview questions, biased or untrue responses) than studies that use multiple methods, in which different types of data provide cross-data consistency checks.

Four Kinds of Analytical Triangulation

Four kinds of persons: zeal without knowledge; knowledge without zeal; neither knowledge nor zeal; both zeal and knowledge.

—Pascal, Pensées

It is in data analysis that the strategy of triangulation really pays off, not only in providing diverse ways of looking at the same phenomenon but in adding to credibility by strengthening confidence in whatever conclusions are drawn. Four kinds of triangulation can contribute to the verification and validation of qualitative analysis:

1. Triangulation of qualitative sources: Checking out the consistency of different data sources within the same method (consistency across interviewees)

2. Mixed qualitative–quantitative methods triangulation: Checking out the consistency of findings generated by different data collection methods

3. Analyst triangulation: Using multiple analysts to review findings

4. Theory/perspective triangulation: Using multiple perspectives or theories to interpret data

By triangulating with multiple data sources, methods, analysts, and/or theories, qualitative analysts can make substantial strides in overcoming the skepticism that greets singular methods, lone analysts, and single-perspective interpretations.

Interpreting Triangulation Results: Making Sense of Conflicting and Inconsistent Patterns

A common misconception about triangulation involves thinking that the purpose is to demonstrate that different data sources or inquiry approaches yield essentially the same result. The point is, rather, to test for such consistency. Different kinds of data may yield somewhat different results because different types of inquiry are sensitive to different real-world nuances. Thus, understanding inconsistencies in findings across different kinds of data can be illuminative and important. Finding such inconsistencies ought not to be viewed as weakening the credibility of results but, rather, as offering opportunities for deeper insight into the relationship between inquiry approach and the phenomenon under study. I'll comment briefly on each of the four types of triangulation.

1. Triangulation of Qualitative Data Sources


Four kinds of qualitative triangulation: interviews with observations; interviews with documents; observations with documents; and interviews from multiple sources with observations of diverse events and documents of many kinds.

—Halcolm, Qualitative Pensées

Triangulation of data sources within and across different qualitative methods means comparing and cross-checking the consistency of information derived at different times and by different means from interviews, observations, and documents. It can include

• comparing observations with interviews;

• comparing what people say in public with what they say in private;

• checking for the consistency of what people say about the same thing over time;

• comparing the perspectives of people from different points of view—for example, in an evaluation, triangulating staff views, participants' views, funder views, and views expressed by people outside the program; and

• checking interviews against program documents and other written evidence that can corroborate what interview respondents report.

Quite different kinds of data can be brought together in a case study to illuminate various aspects of a phenomenon. In a classic evaluation of an innovative educational project, historical program documents, in-depth interviews, and ethnographic participant observations were triangulated to illuminate the roles of powerful actors in supporting adoption of the innovation (Smith & Kleine, 1986). The evaluation of the Paris Declaration on development aid triangulated interviews with a variety of key informants, government reports, donor agency reports, and observations of donor–recipient decision-making meetings (Wood et al., 2011).

Maxwell (2012) is especially insightful about the interrelationship of interview and observation data in qualitative inquiry and analysis:

One belief that inhibits triangulation is the widespread (though often implicit) assumption that observation is mainly useful for describing behavior and events, while interviewing is mainly useful for obtaining the perspectives of actors. It is true that the immediate result of observation is description, but this is equally true of interviewing: The latter gives you a description of what the informant said, not a direct understanding of their perspective. Generating an interpretation of someone's perspective is inherently a matter of inference from descriptions of their behavior (including verbal behavior), whether the data are derived from observations, interviews, or some other source such as written documents.

While interviewing is often an efficient and valid way of understanding someone's perspective, observation can enable you to draw inferences about this perspective that you couldn't obtain by relying exclusively on interview data. . . . For example, watching how a teacher responds to boys' and girls' questions in a science class may provide a much better understanding of the teacher's actual views about gender and science than what the teacher says in an interview.

Conversely, although observation often provides a direct and powerful way of learning about people's behavior and the context in which this occurs, interviewing can also be a valuable way of gaining a description of actions and events—often the only way, for events that took place in the past or to which you can't gain observational access. Interviews can provide additional information that was missed in observation and can be used to check the accuracy of the observations. However, in order for interviews to be useful for this purpose, you need to ask about specific events and actions rather than posing questions that elicit only generalizations or abstract opinions. . . . In both of these situations, triangulation of observations and interviews can provide a more complete and accurate account than either could alone. (pp. 106–107)

Triangulation of data sources within qualitative methods may not lead to a single, totally consistent picture. The point is to study and understand when and why differences appear. The fact that observational data produce different results from interview data does not mean that either or both kinds of data are "invalid," although that may be the case. More likely, it means that different kinds of data have captured different things, and so the analyst attempts to understand the reasons for the differences. Either consistency in overall patterns of data from different sources or reasonable explanations for differences in data from divergent sources can contribute significantly to the overall credibility of findings.
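Once excerpts have been coded, the cross-source consistency checks described above can be tallied mechanically before the interpretive work begins. Here is a minimal sketch in Python; the data sources, theme labels, and excerpt tallies are invented for illustration, not drawn from the studies cited:

```python
from collections import defaultdict

# Hypothetical coded excerpts: (data_source, theme) pairs an analyst
# might tally after coding interviews, observations, and documents.
coded_excerpts = [
    ("interview", "safety concerns"), ("interview", "staff turnover"),
    ("observation", "safety concerns"), ("observation", "peer support"),
    ("document", "safety concerns"), ("document", "staff turnover"),
    ("interview", "peer support"),
]

# Build a theme-by-source map to see which themes are corroborated
# across data types and which rest on a single kind of data.
matrix = defaultdict(set)
for source, theme in coded_excerpts:
    matrix[theme].add(source)

for theme, sources in sorted(matrix.items()):
    status = "corroborated" if len(sources) > 1 else "single-source"
    print(f"{theme}: {sorted(sources)} -> {status}")
```

A theme supported by interviews, observations, and documents warrants more confidence than one appearing in a single data type; a single-source theme is a flag for further fieldwork or explanation, not an automatic reject.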


ETHNOGRAPHIC TRIANGULATION

In ethnographic research practice, triangulation of data sorts and methods and of theoretical perspectives leads to extended knowledge potentials, which are fed by the convergences, and even more by the divergences, they produce.

As in other areas of qualitative research, triangulation in ethnography is a way of promoting quality of research. . . . Good ethnographies are characterized by flexible and hybrid use of different ways of collecting data and by a prolonged engagement in the field. As in other areas of qualitative research, triangulation can help reveal different perspectives on one issue in research, such as knowledge about and practices with a specific issue. Thus, triangulation is again a way to promote the quality of qualitative research in ethnography and, more generally, a productive approach to managing quality in qualitative research. (Flick, 2007b, p. 89)

2. Mixed-Methods Triangulation: Integrating Qualitative and Quantitative Data

'Tis not the many oaths that makes the truth, but the plain single vow that is vow'd true.

—William Shakespeare (written 1604–1605), Diana in All's Well That Ends Well

Mixed-methods triangulation often involves comparing and integrating data collected through some kind of qualitative method with data collected through some kind of quantitative method. Such efforts flow from a pragmatic approach to mixed-methods analysis that assumes potential compatibility and seeks to discover the degree and nature of such compatibility (Tashakkori & Teddlie, 1998; Teddlie & Tashakkori, 2003, 2011). This is seldom straightforward, because certain kinds of questions lend themselves to qualitative methods (e.g., developing hypotheses or theory in the early stages of an inquiry, understanding particular cases in depth and detail, getting at meanings in context, and capturing changes in a dynamic environment), while other kinds of analyses lend themselves to quantitative approaches (e.g., generalizing from a sample to a population, testing hypotheses for statistical significance, and making systematic comparisons on standardized criteria). Thus, it is common that quantitative methods and qualitative methods are used in a complementary fashion to answer different questions that do not easily come together to provide a single, well-integrated picture of the situation.

Given the varying strengths and weaknesses of qualitative versus quantitative approaches, the researcher using different methods to investigate the same phenomenon should not expect that the findings generated by those different methods will automatically come together to produce some nicely integrated whole. Indeed, the evidence is that one ought to expect initial conflicts in findings from qualitative and quantitative data and expect those findings to be received with varying degrees of credibility. It is important, then, to consider carefully what each kind of analysis yields, thereby giving different interpretations the chance to arise, with each considered on its merits, before favoring one result over the other based on methodological biases.

Critical Multiplism as an Analytical Strategy

Critical multiplism is a research strategy that advocates designing packages of imperfect methods and theories in a manner that minimizes the respective and inevitable biases of each. Multiplism, applied to analysis, acknowledges that any analysis can usually be conducted in any one of several ways, but in many cases, no single way is known to be uniformly the best. Under such circumstances, a multiplist advocates making heterogeneous those aspects of analysis about which uncertainty exists, so that the task is conducted in several different ways, each of which is subject to different biases.

Critical refers to rational, empirical, and social efforts to identify the assumptions and biases present in the options chosen. Putting the two concepts together, we can say that the central tenet of critical multiplism is this: When it is not clear which of several defensible options for a scientific task is least biased, we should select more than one, so that our options reflect different biases, avoid constant biases, and leave no plausible bias overlooked. (Shadish, 1993, p. 18)

When multiple analytical approaches yield similar results across different analytical biases, confidence in the resulting findings is increased. If different results occur when the analysis is done in different ways, then we have to try to explain the differences.
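Critical multiplism's advice to conduct the same analytic task in several defensible ways, each subject to a different bias, can be illustrated with even a trivial quantitative task such as summarizing outcome scores. A hedged sketch; the scores and the particular choice of estimators are invented for illustration:

```python
import statistics

# Hypothetical outcome scores for program participants, including
# one extreme case. The numbers are invented for illustration.
scores = [2, 3, 3, 4, 4, 4, 5, 21]

# Conduct the same summarizing task in several defensible ways,
# each subject to a different, known bias.
estimates = {
    "mean (pulled by outliers)": statistics.mean(scores),
    "median (ignores magnitude of extremes)": statistics.median(scores),
    "mean without top case (discards real data)": statistics.mean(sorted(scores)[:-1]),
}

for approach, value in estimates.items():
    print(f"{approach}: {value:.2f}")

# Divergence across approaches (here, mean vs. median) is itself a
# finding that has to be explained, not suppressed.
```

When the estimates agree despite their different biases, confidence in the summary increases; when they diverge, as here, the divergence points back to the data (the extreme case) for explanation.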


Different Findings From Different Methods

In a classic article, Shapiro (1973) described in detail her struggle to resolve basic differences between qualitative data and quantitative data in her study of Follow Through Classrooms; she eventually concluded that some of the conflicts between the two kinds of data were the result of measuring different things, although the ways in which different things were measured were not immediately apparent until she worked to sort out the conflicting findings. She began with greater trust in the data derived from quantitative methods and ended by believing that the most useful information came from the qualitative data.

Another pioneering article, by M. G. Trend (1978) of ABT Associates, has become required reading for anyone becoming involved in a team project that will involve collecting and analyzing both qualitative and quantitative data, where different members of the team have responsibilities for different kinds of data. The Trend study involved an analysis of three social experiments designed to test the concept of using direct-cash housing allowance payments to help low-income families obtain decent housing on the open market. The analysis of qualitative data from a participant observation study produced results that were at variance with those generated by analysis of quantitative data. The credibility of the qualitative data became a central issue in the analysis.

The difficulty lay in conflicting explanations or accounts, each based largely upon a different kind of data. The problems we faced involved not only the nature of observational versus statistical inferences, but two sets of preferences and biases within the entire research team. . . . Though qualitative/quantitative tension is not the only problem which may arise in research, I suggest that it is a likely one. Few researchers are equally comfortable with both types of data, and the procedures for using the two together are not well developed. The tendency is to relegate one type of analysis or the other to a secondary role, according to the nature of the research and the predilections of the investigators. . . . Commonly, however, observational data are used for "generating hypotheses," or "describing process." Quantitative data are used to "analyze outcomes," or "verify hypotheses." I feel that this division of labor is rigid and limiting. (Trend, 1978, p. 352)

STRATEGY FOR ACHIEVING QUALITY IN MIXED-METHODS STUDIES

The quantitative researchers work side by side every step of the way as full members of the case study team, bringing the analytic rigor of their quantitative frameworks to bear on case study and observation design, data collection, analysis, integration with other methods, and reporting. The qualitative researchers, in turn, are full members of the quantitative team (analysis of administrative data, survey research, and time series assessments), bringing their own rigor to survey designs, data reduction decisions, and interpretations. As a result, assumptions are more rigorously examined, methodological lacunae more clearly (and early) identified, and the team leaders become sufficiently methodologically multilingual so that they can discuss both qualitatively and quantitatively based findings with equal confidence. (Datta, 2006, p. 427)

[Cartoon: "Qualitative Inquiry / Quantitative Analysis"—"Last year you had 2 home runs all season. This year you have 5 in one month. What's the difference?" ©2002 Michael Quinn Patton and Michael Cochran]

Early Efforts at Quantitative–Qualitative Triangulation

Anthropologists participating in teams in which both quantitative and qualitative data were being collected applied their inquiry skills to examine the nature of the experience in the 1970s. The problems they shared were stark evidence that qualitative methods at that time were typically perceived as exploratory and secondary when used in conjunction with quantitative/experimental approaches. When qualitative data supported quantitative findings, that was the icing on the cake. When qualitative data conflicted with quantitative data, the qualitative data have often been dismissed or ignored (Society of Applied Anthropology, 1980).

A strategy of methods triangulation, then, doesn't magically put everyone on the same page. While valuing and endorsing triangulation, Trend (1978) suggested that


we give different viewpoints the chance to arise, and postpone the immediate rejection of information or hypotheses that seem out of joint with the majority viewpoint. Observationally derived explanations are particularly vulnerable to dismissal without a fair trial. (pp. 352–353)

From Separation to Integration

Qualitative and quantitative data can be fruitfully combined to elucidate complementary aspects of the same phenomenon. For example, a community health indicator (e.g., teenage pregnancy rate) can provide a general and generalizable picture of an issue, while case studies of a few pregnant teenagers can put faces on the numbers and illuminate the stories behind the quantitative data; this becomes even more powerful when the indicator is broken into categories (e.g., those under the age of 15, those 16 and above), with case studies illustrating the implications of and rationale for such categorization.

A STORY OF MIXED-METHODS TRIANGULATION: TESTING CONCLUSIONS WITH MORE FIELDWORK

Economists Lawrence Katz and Jeffrey Liebman of Harvard, and Jeffrey R. Kling of Princeton, were trying to interpret data from a federal housing experiment that involved randomly assigning people to a program that would help them get out of the slums. The evaluation focused on the usual outcomes of improved school and job performance. However, to get beyond the purely statistical data, they decided to conduct interviews with residents in an inner-city poverty community.

Professor Liebman commented to a New York Times reporter,

I thought they were going to say they wanted access to better jobs and schools, and what we came to understand was their consuming fear of random crime; the need the mothers felt to spend every minute of their day making sure their children were safe. (Uchitelle, 2001, p. 4)

By adding qualitative, field-based interview data to their study, Kling, Liebman, and Katz (2001) came to a new and different understanding of the program's impacts and participants' motivations based on interviewing the people directly affected, listening to their perspectives, and including those perspectives in their analysis.

In essence, triangulation of qualitative and quantitative data constitutes a form of comparative analysis. The question is "What does each analysis contribute to our understanding?" Areas of convergence increase confidence in findings. Areas of divergence open windows to better understanding of the multifaceted, complex nature of a phenomenon. Deciding whether results have converged remains a delicate exercise subject to both disciplined and creative interpretation. Focusing on the degree of convergence rather than forcing a dichotomous choice—that the different kinds of data do or do not converge—yields a more balanced overall result.

Mixed-Methods Analysis and Triangulation in the Twenty-First Century

While difficulties still arise in triangulating and integrating qualitative and quantitative data, advances in mixed methods have propelled integrated analyses into the spotlight, especially in applied and interdisciplinary areas like policy analysis, program evaluation, environmental studies, international development, and global health. Where disciplinary barriers have yielded to genuine interdisciplinary engagement, traditional methodological divisions have yielded to collaboration and integration. Exhibit 8.27 (pp. 618–619) presented mixed-methods challenges and solutions. Exhibit 9.2 presents 10 developments that are making mixed-methods triangulation both valued and, increasingly, expected in applied social science.

3. Triangulation With Multiple Analysts

A third kind of triangulation is investigator or analyst triangulation—that is, using multiple as opposed to singular observers or analysts. This is the core of qualitative team research (Guest & MacQueen, 2008). Triangulating observers or using several interviewers helps reduce the potential bias that comes from a single person doing all the data collection and provides means of more directly assessing the consistency of the data obtained. Triangulating observers provides a check on potential bias in data collection.

A related strategy is triangulating analysts—that is, having two or more persons independently analyze the same qualitative data and compare their findings. In the traditional social science approach to qualitative inquiry, engaging multiple analysts and computing the interrater reliability among these different analysts is valued, even expected, as a means of establishing credibility of findings (Silverman & Marvasti, 2008, pp. 238–239).
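Where interrater reliability is computed, Cohen's kappa is a common chance-corrected agreement statistic for two independent coders. A minimal sketch; the passages and theme codes assigned by the two analysts are invented for illustration:

```python
from collections import Counter

# Hypothetical theme codes assigned independently by two analysts
# to the same ten interview passages (codes invented for illustration).
analyst_a = ["fear", "fear", "jobs", "schools", "fear",
             "jobs", "fear", "schools", "fear", "jobs"]
analyst_b = ["fear", "jobs", "jobs", "schools", "fear",
             "jobs", "fear", "fear", "fear", "jobs"]

def cohens_kappa(a, b):
    """Chance-corrected agreement between two coders."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    # Agreement expected if each coder assigned codes at their own base rates.
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / n**2
    return (observed - expected) / (1 - expected)

print(f"kappa = {cohens_kappa(analyst_a, analyst_b):.2f}")
```

Raw percent agreement overstates reliability when a few codes dominate; kappa discounts the agreement two coders would reach by chance alone, which is why it is the statistic conventionally reported.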


EXHIBIT 9.2 Ten Developments Enhancing Mixed-Methods Triangulation

1. Designs that are truly mixed-methods inquiries are demonstrating the value of systematic, planned triangulation. Increased understanding of the strengths and weaknesses of qualitative and quantitative data has led to both the commitment and capacity to build on the strengths of each at the design stage.

2. Asking integrating questions of the data supports triangulation. Triangulation is most powerful when mixed-methods studies are designed for integration, which begins by asking the same questions of both methods and gathering both qualitative and quantitative data on those questions. That is happening at a level unprecedented in applied social science research and evaluation.

3. Mixed-methods sampling strategies anticipate and facilitate triangulation. Sampling with triangulation in mind is a collaborative strategy that anticipates and lays the foundation for mixed-methods analysis.

4. Specific methods are incorporating mixed data intentionally to support triangulation. Surveys ask both closed- and open-ended questions. Case studies collect both quantitative and qualitative data. Strong experimental designs gather both standardized intervention and quantitative effects data plus qualitative process data.

5. Mixed methods are proving especially appropriate for studying complex issues. Mixed-methods researchers are extending our understandings of how to understand complex social phenomena as well as how to use research to develop effective interventions to address complex social problems (Mertens, 2013; Patton, 2011).

6. Team approaches are being created and implemented with mixed-methods skills and capabilities in mind. High-quality mixed-methods designs often require teams because individuals lack the full skill set needed. Knowing how to form and manage such teams has advanced significantly as experience has accumulated about what to do—and what not to do (Guest & MacQueen, 2008; Morgan, 2014).

7. Software supports mixed-methods data analysis and triangulation. As data analysis software has become more sophisticated, flexible, and responsive to analysts' needs, techniques and processes for triangulation are becoming more common and easier to use.

8. Resources available for mixed-methods designs and analysis have burgeoned. The Journal of Mixed Methods Research began publishing in 2007, with an opening editorial by Abbas Tashakkori and John Creswell proclaiming "The New Era of Mixed Methods." This means that there are more outlets for publishing mixed-methods studies. The Handbook of Mixed Methods was published in 2003 (Tashakkori & Teddlie). Excellent mixed-methods texts provide guidance on the full process from designing mixed-methods studies to analyzing and triangulating mixed data (Bamberger, 2013; Bergman, 2008; Greene, 2007; Mertens, 1998; Mertens & Hesse-Biber, 2013; Morgan, 2014).

9. Researchers are developing mixed skills, capabilities, and capacities—and being recognized and valued for their mixed-methods expertise. In 2014, the International Association of Mixed Methods Research was launched and hailed as "a momentous development in mixed-methods research" (Mertens, 2014).

10. Mixed-methods exemplars show what is possible. Early experiences with qualitative–quantitative triangulation were mixed at best—and many were quite negative, as indicated in the cautionary tales reported preceding this exhibit. When I was doing earlier editions of this book, there were more bad examples and negative experiences than good and positive exemplars. That balance has shifted for all the reasons listed here. The momentum is building as funders of research and evaluation are coming to demand mixed-methods studies.

Here, however, is a perfect example of how different criteria for judging quality lead to different practices. In a lead editorial for the journal Qualitative Health Research, Janet Morse (1997) took on "the myth of inter-rater reliability" from a social constructionist perspective. She begins by distinguishing standardized interview formats from more flexible and open interview guide approaches. She acknowledges that interrater reliability may be acceptable when everyone is asked the same question in the same way (the preferred interviewing approach to meet traditional social science concerns about validity and reliability), but in the more adaptive, personalized, individualized, and flexible approach of interview guides and conversational interviewing, what constitutes coherent passages


for coding is more problematic and depends on the analyst's interpretive framework. Multiple analysts might still discuss what they see in the data, share insights, and consider what emerges from their different perspectives, but that's quite different from computing a statistical interrater reliability coefficient. (See the sidebar on "the myth of interrater reliability" for her full argument.)

SIDEBAR: PERFECTLY HEALTHY BUT DEAD: THE MYTH OF INTERRATER RELIABILITY

—Janet M. Morse (1997)

Qualitative researchers seem to have inherited a host of habits from quantitative researchers and have adopted them into the qualitative paradigm without considering the appropriateness of their purpose, rationale, or underlying assumptions. On the surface, these practices seem right, so they are unquestioningly maintained. One of these adopted habits is the practice of obtaining interrater reliability of coding decisions used in qualitative research when coding unstructured, interactive interviews.

The argument goes something like this: To be reliable, coding should be replicable. Replication is checked by duplication; if coding decisions are explicit and communicated to another researcher, that researcher should be able to make the same coding decisions as the first researcher. The result is reliable research. Right?

Wrong. Interrater reliability is appropriate with semistructured interviews, wherein all participants are asked the same questions, in the same order, and data are coded all at once at the end of the data collection period. But this does not hold for unstructured interactive interviews. Recall that unstructured, interactive interviews are used in research because the researcher does not know enough about the topic or its parameters to construct interview questions. With unstructured, interactive interviews, the researcher first assumes a listening stance and learns about the topic as she or he goes along. Thus, once the researcher has learned something about the phenomenon from the first few participants, the substance of the interview then changes and becomes targeted on another aspect of the phenomenon. Importantly, unlike semistructured interviews, all participants are not asked the same questions. Participants are used to verify the information learned in the first interviews and are encouraged both to speak from their own experience and to speak for others. Each interview may overlap with the others but may also have a slightly different focus and different content.

This notion, learning from participants as the study progresses, is crucial to the understanding of the fluid nature of coding unstructured interviews. Initially, coding decisions may be quite superficial—by topic, for instance—but later coding decisions are made with the knowledge of, and in consideration of, information gained from all the previously analyzed interviews. Such coding schemes are not superficial, and in light of all the knowledge gained, small pieces of data may have monumental significance. The process is not necessarily superficially objective: It is conducted in light of comprehensive understanding of the significance of each piece of text. The coding process is highly interpretative.

This comprehensive understanding of data bits cannot be acquired in a few objective definitions of each category. Moreover, it cannot be conveyed quickly and in a few definitions to a new member of the research team who has been elected for the purpose of determining a percentage agreement score. This new coder does not have the same knowledge base as the researcher, has not read all the interviews, and therefore does not have the same potential for insight or depth of knowledge required to code meaningfully. Maintaining a simplified coding schedule for the purposes of defining categories for an interrater reliability check will maintain the coding scheme at a superficial level. It will simplify the research to such an extent that all of the richness attained from insight will be lost. Ironically, it forcibly removes each piece of data from the context in which each coding decision should be made. The study will become respectably reliable with an interrater reliability score, but this will be achieved at the cost of losing all the richness and creativity inherent in analysis, ultimately producing a superficial product.

The cost of such an endeavor is equivalent to Mrs. Frisby, who, when the farmer commented that the poisoned rat looked perfectly healthy, said sadly, "Perfectly healthy, but dead!" Your research will be perfectly reliable, but trivial.

There is often a shocked silence when I discuss this with students. But then I ask two questions: "How many of you have written a literature review lately?" Almost every hand is raised. I then ask, "How many of you took a second person to the library with you to make sure you interpreted each article in a manner that was replicable?" Not a single hand remains raised. "Aren't you concerned?" I ask, "How do you know that your analysis, your interpretation of those articles, was reliable?"

The analysis of unstructured, interactive interviews is exactly the same case. Researchers must learn to trust themselves and their judgments and be prepared to defend their interpretations and analyses. But it is death to one's study to simplify one's insights, coding, and analyses so that another person may place the same piece of datum in the same category.
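Morse's critique targets the reduction of coding agreement to a single statistic. For readers who have not encountered such a statistic, here is a minimal illustrative sketch (not from Morse or this book; the passages and category labels are invented) of the two most common measures she alludes to: the percentage agreement score and the chance-corrected Cohen's kappa coefficient.

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Fraction of items both coders placed in the same category."""
    matches = sum(1 for a, b in zip(coder_a, coder_b) if a == b)
    return matches / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    # Expected agreement if each coder assigned categories independently
    # at their observed marginal rates.
    expected = sum(
        (counts_a[c] / n) * (counts_b[c] / n)
        for c in set(coder_a) | set(coder_b)
    )
    return (observed - expected) / (1 - expected)

# Two coders each assign eight interview passages to a category.
a = ["coping", "coping", "family", "work", "work", "family", "coping", "work"]
b = ["coping", "family", "family", "work", "work", "family", "coping", "coping"]

print(percent_agreement(a, b))        # 6 of 8 passages match: 0.75
print(round(cohens_kappa(a, b), 3))   # 27/43, approximately 0.628
```

The computation makes Morse's point concrete: both numbers require that every passage be forced into exactly one predefined category, which is precisely the simplification she argues destroys context-dependent interpretation in unstructured interviewing.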


Triangulation Through Distinct Evaluation Teams: The Goal-Free Approach

In program evaluation, an interesting form of team triangulation has been used. Michael Scriven (1972b) has advocated and used two separate teams, one that conducts a traditional goals-based evaluation (assessing the stated outcomes of the program) and a second that undertakes a "goal-free evaluation" in which the evaluators assess clients' needs and program outcomes without focusing on stated goals (see Chapter 4, p. 206). Comparing the results of the goals-based team with those of the goal-free team provides a form of analytical triangulation for determining program effectiveness (Youker & Ingraham, 2014).

Review by Inquiry Participants

Having those who were studied review the findings offers another approach to analytical triangulation. Researchers and evaluators can learn a great deal about the accuracy, completeness, fairness, and perceived validity of their data analysis by having the people described in that analysis react to what is described and concluded. To the extent that participants in the study are unable to relate to and confirm the description and analysis in a qualitative report, questions are raised about the credibility of the findings. In what became a classic study of how evaluations were used, key informants in each case study were asked for both verbal and written reactions to the accuracy and comprehensiveness of the cases. The evaluation report then included those written reactions (Alkin et al., 1979). In her study of homeless youth, Murphy (2014) met with each of the 14 youth to go over the details of the case study she created from their transcribed interviews to affirm accuracy, add additional details and reflections if they so desired, and choose a pseudonym that they wanted to be called in the study, if they had not already done so. (See Thmaris's case study example, pp. 511–516.)

Obtaining the reactions of respondents to your working drafts is time-consuming, but respondents may (1) verify that you have reflected their perspectives; (2) inform you of sections that, if published, could be problematic for either personal or political reasons; and (3) help you to develop new ideas and interpretations. (Glesne, 1999, p. 152)

Different Purposes Drive Different Review Procedures

Different kinds of studies have different participant review processes, some none at all. Collaborative and participatory inquiry builds in participants' review of cases, quotations, and findings as a matter of course; that's part of what collaboration and participation mean. However, investigative inquiries (Douglas, 1976) aimed at exposing what goes on beyond the public eye are often antagonistic to those in power, so their responses would not typically be used to revise conclusions but might be used to at least offer them an opportunity to provide context and an alternative interpretation. Some traditional social science researchers and evaluators worry that sharing findings with participants for their reactions will undermine the independence of their analysis. Others view it as an important form of triangulation. In an Internet listserv discussion of this issue, one researcher reported this experience:

I gave both transcripts and a late draft of findings to participants in my study. I wondered what they would object to. I had not promised to alter my conclusions based on their feedback, but I had assured them that my aim was to be sure not to do them harm. My findings included some significant criticisms of their efforts that I feared/expected they might object to. Instead, their review brought forth some new information about initiatives that had not previously been mentioned. And their primary objection was to my not giving the credit for their successes to a wider group in the community. What I learned was not to make assumptions about participants' thinking.

Exhibit 9.3 summarizes three contrasting views of involving those studied in reviewing findings and conclusions.

Critical Friend Review

A critical friend can be defined as a trusted person who asks provocative questions, provides data to be examined through another lens, and offers critiques of a person's work as a friend. A critical friend takes the time to fully understand the context of the work presented and the outcomes that the person or group is working toward. The friend is an advocate for the success of that work. (Costa & Kallick, 1993, p. 49)

Tessie Tzavaras Catsambas is president of EnCompass LLC, an international evaluation research company. She is active in evaluation capacity building around the world, including leadership service with the International Organization for Cooperation in Evaluation. She also plays the role of critical friend with colleagues' projects and within her own organization. Here's an example she shared with me (and kindly gave permission to include here) that


EXHIBIT 9.3 Different Perspectives on Triangulation by Those Who Were Studied

PERSPECTIVES ON CHECKING IN WITH PARTICIPANTS TO REVIEW THEIR CASES AND OVERALL FINDINGS | RATIONALE

1. Against participants' reviews
   a. May raise questions about the independence of findings: risks too much influence by participants on interpretation by the researcher
   b. Not sure what to do if participants and researchers disagree; whose prevails?

2. Weigh pros and cons situationally: It depends
   a. Takes time and resources
   b. Must be carefully planned and done well or could create problems in meeting deadlines; may not be worth the hassle
   c. Could be hard to get back to everyone, so some unfairness arises in who gets to review what
   d. Depends on how important and credible it is to the audiences who will receive findings

3. In favor of participant reviews
   a. It's the ethical thing to do
   b. It's a chance to correct errors and inaccuracies so you end up with better data
   c. It's a chance to update the data

nicely illustrates the critical friend role as a form of analyst triangulation.

My team and I conducted an evaluation of a UN organization's Internet-based system that countries could download to track their own HIV/AIDS activities in any sector and area, nationally down to district level. A previous organizational review of the UN organization recommended discontinuing this program based on resource constraints and rumors about problems, but without looking at it closely. The department supporting this program decided to evaluate it first, because they had invested in it significantly and wanted to make a final decision based on evidence. The evaluation we conducted (country visits, focus groups, interviews, survey, benchmarking) revealed many, many problems. But interestingly, some 20 countries were using it (the tracking system). My colleagues who did the data collection were ready to push the button to kill it, citing all the problems we had found. I got involved at the last stage of the data analysis process.

I grilled my colleagues, asking them to justify every conclusion. Their perspective was clear: "This program has so many operational obstacles in the field, no Internet, low capacity, we should recommend discontinuing it." Then, I asked what turned out to be the "turning point" questions: "If this program has so many problems, why are 20 countries choosing to use it?" and "How are those countries addressing the problems you have documented?"

This kind of question (at its best, how is it working?) is an appreciative analytical question asked as a critical friend from a systems dynamics perspective (change the shoes you are wearing, and from the perspective of a country, what do you see?). In response, my colleagues listed many innovations that countries were undertaking to make this tracking program work, and then they concluded, "It is the only option out there that they can control fully, and it is cheap." So "country controlled" was also important, and so was "low cost." Then, I asked them, "Imagine you hold the button to kill the program, do you push it?" They each said, "No, but we would . . ." and proceeded to give me three fabulous recommendations. Their responses enabled us to present to the client the findings, and engage the client in grappling with a tough decision.

Essentially, we said, "This program is filling a demand, and 20 countries are using it in spite of significant


operational problems. We know you have resource constraints, and this system requires more technical assistance, but if you decide to stop supporting it, consider transferring the system's administration to another funding agency, and also consider certifying independent consultants as technical assistance providers, so countries can contract with them directly for help on the system. And, if you cannot even do that, think about how you will transition in a way that will not hurt countries."

From an evaluation point of view, two things are important: (1) if it were not for these two questions in the analysis, the team would have concluded something very different from the same data, and (2) asking these questions enabled us to facilitate the client to face these challenging findings, have an internal debate about what to do, and own the final decision.

Audience Review as Credibility Triangulation

Reflexive triangulation (Exhibit 2.5, p. 72) includes the audience's reactions to the triangulation mix: (1) the inquirer's reflexive perspective, (2) the perspectives of those studied, and (3) the perspectives of those who received the findings. The opening module of this chapter emphasized that different readers of qualitative reports will apply different criteria to judge quality and credibility. Audience reactions constitute additional data. Whenever possible, I prefer to present draft findings to multiple audiences to learn how they react, what they focus on, what is clear and unclear, and what questions are inadequately answered. In a sense, this is equivalent to theater or movie previews when producers and directors get to gauge audience reaction to a performance or film before it is released. Time and procedures for audience previews and reactions have to be planned in advance, but whenever I've done them, I've been glad I did.

In a study of a community development effort in an inner-city, low-income neighborhood, focus groups were done with diverse groups: African Americans, Native Americans, Hispanics, Hmong residents, and low-income whites. Age-based focus groups were also done: youth under the age of 25, 25- to 55-year-olds, and those over 55 years. A community advisory group reviewed the study design and voiced no objections to focus groups done homogeneously by either ethnicity or age. In fact, they thought such focus groups were a good idea. But when the draft results were reported in a public meeting that included community people and public officials, the focus group results made it appear that there were great divisions and differences of perspectives among neighborhood ethnic and age-groups. Audience members outside the community were especially focused in on conflicts and differences reported in the findings. Similarities and important areas of agreement got lost amid the reports' overemphasis on differences. Moreover, perspectives within ethnic group and age-group appeared much more monolithic and homogeneous than, in fact, they were. As a result of this feedback, we went back to the field and added heterogeneous focus groups to the data and then drafted a more balanced report. This was in no way undermining inquirer independence. It was making sure we got it right.

Evaluation Audiences and Intended Users

Program evaluation constitutes a particular challenge in establishing credibility because the ultimate test of the credibility of an evaluation report is the response of primary intended users and readers of that report. Their reactions often revolve around face validity. On the face of it, is the report believable? Are the data reasonable? Do the results connect to how people understand the world? In seriously soliciting intended users' reactions, the evaluator's perspective is joined to the perspective of the people who must use the findings. Evaluation theorist Ernie House (1977) has suggested that the more "naturalistic" (qualitative) the evaluation, the more it relies on its audiences to reach their own conclusions, draw their own generalizations, and make their own interpretations:

Unless an evaluation provides an explanation for a particular audience, and enhances the understanding of that audience by the content and form of the argument it presents, it is not an adequate evaluation for that audience, even though the [data] on which it is based are verifiable by other procedures. One indicator of the explanatory power of evaluation data is the degree to which the audience is persuaded. Hence, an evaluation may be "true" in the conventional sense but not persuasive to a particular audience for whom it does not serve as an explanation. In the fullest sense, then, an evaluation is dependent both on the person who makes the evaluative statement and on the person who receives it [italics added]. (p. 42)

Understanding the interaction and mutuality between the evaluator and the people who use the evaluation, as well as relationships with participants in the program, is critical to understanding the human side of evaluation. This is part of what gives evaluation—and the evaluator—situational and interpersonal "authenticity" (Lincoln & Guba, 1986). Exhibit 9.16, at the end of this chapter (pp. 736–741), provides an experiential account from an evaluator dealing with issues of credibility while building


relationships with program participants and evaluation users; her reflections provide a personal, in-depth description of what authenticity is like from the perspective of one participant-observer.

[Cartoon: "EXPERT REVIEW" / "What do you think?" (freshspectrum.com)]

Expert Audit Review

A final review alternative involves using experts to assess the quality of analysis or, where the stakes for external credibility are especially high, performing a meta-evaluation or process audit. An external audit by a disinterested expert can render judgment about the quality of data collection and analysis. "That part of the audit that examines the process results in a dependability judgment [italics added], while that part concerned with the product (data and reconstructions) results in a confirmability judgment [italics added]" (Lincoln & Guba, 1986, p. 77). Such an audit would need to be conducted according to appropriate criteria. For example, it would not be fair to audit an aesthetic and evocative qualitative presentation by traditional social science standards or vice versa. But within a particular framework, expert reviews can increase credibility for those who are unsure how to distinguish high-quality work. That, of course, is the role of the doctoral committee for graduate students and peer reviewers for scholarly journals. Problems arise when peer reviewers apply traditional scientific criteria to constructivist studies, and vice versa. In such cases, the review or audit itself lacks credibility. Exhibit 9.4 on the next page presents an example of an expert meta-evaluation (evaluation of the evaluation) to independently judge the quality and establish credibility for a high-stakes international mixed-methods evaluation.

The challenge of getting the right expert, one who can apply an appropriately critical eye, is wittily illustrated by a story about the great Spanish artist Pablo Picasso. Marketing of fakes of his paintings plagued Picasso. His friends became involved in helping check out the authenticity of supposed genuine originals. One friend in particular became obsessed with tracking down frauds and brought several paintings to Picasso, all of which the master identified as fake. A poor artist who had hoped to profit from having obtained a Picasso before the great artist's works had become so valuable sent his painting for inspection via the friend. Again Picasso pronounced it a forgery.

"But I saw you paint this one with my very own eyes," protested the friend.

"I can paint false Picassos as well as anyone," retorted Picasso.

4. Theory Triangulation

Having discussed triangulation of qualitative data sources, mixed-methods triangulation, and multiple analyst triangulation, we turn now to the fourth and final kind of triangulation: using different theoretical perspectives to look at the same data.

PURPOSEFUL PROGRAM THEORY

Greek legend tells of the fearsome hotelier Procrustes who would adjust his guests to match the length of his bed, stretching the short and trimming off the legs of the tall. Guides to program theory that are too prescriptive risk creating such a Procrustean bed. When the same approach to program theory is used for all types of interventions and all types of purposes, the risk is that the interventions will be distorted to fit into a preconceived format. Important aspects may be chopped off and ignored, and other aspects may be stretched to fit into preconceived boxes of a factory model, with inputs, processes, outcomes, and impacts.

Purposeful program theory requires thoughtful assessment of circumstances, asking in particular, "Who is going to use the program theory and for what purposes?" and "What is the nature of the intervention and the situation in which it is implemented?" It requires a wide repertoire, not a one-size-fits-all approach to program theory.

Purposeful program theory also requires attention to the limitations of any one program theory, which must necessarily be a simplification of reality, and a willingness to revise it as needed to address emerging issues.

—Funnell and Rogers (2011, p. xxi)


EXHIBIT 9.4 Metaevaluation: Evaluating the Evaluation of the Paris Declaration on Development Aid

It has become a standard in major high-stakes evaluations to commission an independent review to determine whether the evaluation meets generally accepted standards of quality and, in so doing, to identify strengths, weaknesses, and lessons (Stufflebeam & Shinkfield, 2007, p. 649). The major addition to the Joint Committee Standards for Evaluation, when revised in 2010, was that of "Evaluation Accountability Standards" focused on meta-evaluation.

Evaluation Accountability Standards

E1 Evaluation documentation: Evaluations should fully document their negotiated purposes and implemented designs, procedures, data, and outcomes.

E2 Internal meta-evaluation: Evaluators should use these and other applicable standards to examine the accountability of the evaluation design, procedures employed, information collected, and outcomes.

E3 External meta-evaluation: Program evaluation sponsors, clients, evaluators, and other stakeholders should encourage the conduct of external meta-evaluations using these and other applicable standards (Joint Committee on Standards, 2010; Yarbrough, Shulha, Hopson, & Caruthers, 2010).

Evaluating the Evaluation of the Paris Declaration

Given the historic importance of the Evaluation of the Paris Declaration on Development Aid (Dabelstein & Patton, 2013b), the Management Group overseeing the evaluation commissioned an independent assessment of the evaluation. Prior to undertaking this review, we had no prior relationship with any members of the Management Group or the Core Evaluation Team. We had complete and unfettered access to any and all evaluation documents and data, and to all members of the International Reference Group, the Management Group, the Secretariat, and the Core Evaluation Team. Our evaluation of the evaluation included reviewing data collection instruments, templates, and processes; reviewing the partner country and donor evaluation reports on which the synthesis of findings was based; directly observing two meetings of the International Reference Group where the evidence was examined and the conclusions refined and sharpened accordingly; engaging International Reference Group participants in a reflective practice, lessons-learned session; surveying participants about the evaluation process and partner country evaluations; and interviewing key people involved in and knowledgeable about how the evaluation was conducted. The evaluation of the evaluation included assessing both the evaluation report's findings and the technical appendix that details how the findings were generated. The Development Assistance Committee (DAC) of the Organisation for Economic Co-operation and Development (OECD) established international standards for evaluation in 2010, and those were the standards used for the meta-evaluation (OECD-DAC, 2010). A meta-evaluation audit statement confirming the quality, credibility, and usability of the evaluation was included as a preface to the full evaluation reports. The meta-evaluation report (Patton & Gornick, 2011a) was published and made available online two weeks after the Final Evaluation report was published. This timing was possible because the meta-evaluation began halfway through the Paris Declaration Evaluation and the meta-evaluation team had access to draft versions of the final report at each stage of the report's development. The process for conducting the meta-evaluation and its uses are discussed in detail in Patton (2013).

The Paris Declaration Evaluation received the 2012 American Evaluation Association (AEA) Outstanding Evaluation Award. At the award ceremony, the chair of the AEA Awards Committee, Frances Lawrenz (2013), summarized the merits of the evaluation that led to the award selection and recognition:

The success of the Paris Declaration Phase 2 Evaluation required an unusually skilled, knowledgeable and committed evaluation team; a visionary, well-organized, and well-connected Secretariat to manage the logistics, international stakeholder meetings, and financial accounts; and a highly competent and respected Management Group to provide oversight and ensure the Evaluation's independence and integrity. This was an extraordinary partnership where all involved understood their roles, carried out their responsibilities fully and effectively, and respected the contributions of other members of the collaboration.

Copyright ©2015 by SAGE Publications, Inc. This work may not be reproduced or distributed in any form or by any means without express written permission of the publisher.

Chapter 3 presented a number of general theoretical frameworks derived from diverse intellectual and disciplinary traditions. More concretely, multiple theoretical perspectives can be brought to bear on specialized substantive issues. For example, one might examine interviews with therapy clients from different psychological perspectives: psychotherapy, Gestalt, Adlerian, and behavioral psychology. Observations of a group, community, or organization can be examined from a Marxian or Weberian perspective, a conflict or functionalist point of view. The point of theory triangulation is to understand how differing assumptions affect findings and interpretations.

[Cartoon caption: "I envy your confidence. Even after decades of evaluations, these meta-evaluations still make me feel naked."]

Examples of Theory Triangulation

Let's suppose we are studying famine in a drought-afflicted region of an African country. We have quantitative data on food production (sorghum and millet), nutrition data from household surveys, health data from clinics, rainfall data over many years, interviews with villagers (males and females), key informant interviews (e.g., government officials, agricultural experts, aid agency staff members, and village leaders), and case studies of purposefully sampled villages telling the story of their agricultural and nutritional situations and experiences before and during the famine. Put all of these data together and we have an in-depth description of the extent and nature of the famine, its effects on subsistence agriculture families, food and agricultural assistance provided, and the interventions of government and international agencies. We have (a) mixed-methods triangulation and (b) multiple sources of qualitative data (interviews, observations, case studies, documents), and (c) our team members have analyzed the patterns independently to confirm the findings as well as had the findings externally reviewed by experts. Thus, we can make a credible case for the nature, extent, and impacts of the famine. What does theory triangulation add?

When we move from description to interpretation, we need a framework to make sense of and explain the patterns in the data. Why is the region experiencing famine? Why aren't interventions more effective? Different theoretical frameworks emphasize different explanatory variables.

• Climate change theory would emphasize long-term weather and climate trends.
• Malthusian theory would emphasize overpopulation.
• Marxian theory would emphasize power dynamics (Who controls the means of production? How do the powerful benefit from famine?).
• Weberian theory would emphasize organizational competence and incompetence (How does the functioning and activities of government and international agencies exacerbate or alleviate famine?).
• Ecological systems theory would call for examining the interactions between the ecosystem, farming practices, soil and water conditions, and markets.
• Cultural systems theory would emphasize the way in which cultural beliefs and norms affect the experience of and responses to famine by the people affected.
• Feminist theory would point to the role of women in the system as a factor in how the famine affects families and their responses to the crisis (Podems, 2014b).
• Cognitive theory would focus on how people make decisions in the face of changing conditions.

When designing the famine study, these various theoretical perspectives would inform the kinds of questions to be asked and data to be collected. When analyzing the findings and explaining results, these diverse theoretical perspectives provide competing interpretations for explaining the patterns and observed impacts. Theory triangulation involves examining the data through different theoretical lenses to see what theoretical framework (or combination) aligns most convincingly with the data (best fit).

Theory triangulation for evaluation can involve examining the data from the perspectives of various stakeholder positions. It is common for diverse stakeholders to disagree about program purposes, goals,


and means of attaining goals. These differences represent different "theories of action" (Patton, 2012a) that can cast the same findings in different perspective-based lights. When we were seeking explanations for dropout rates in adult literacy programs in Minnesota, the predominant staff theory was that low-income people led chaotic lives and couldn't manage regular attendance and follow-through in a program. Political explanations included laziness, effects of multigenerational poverty, lack of good jobs to motivate participants to complete programs, and cultural deprivation theories. But the explanation that best fit the data (interviews with dropouts) was that the adult literacy programs were lousy learning experiences: large class sizes; disinterested and disrespectful teachers, poorly paid and exhausted from having already taught all day in their regular jobs; uninteresting and outdated curriculum materials; and an all-around depressing environment. Most traditional explanations blamed the participants or the larger societal problems that affected the participants, but the actual data pointed to ineffective programs, something that was actionable. Changes were made, and dropout rates went down significantly.

Thoughtful, Systematic Triangulation

All four of these different types of triangulation—(1) mixed-methods triangulation, (2) triangulation of qualitative data sources, (3) analyst triangulation, and (4) theory or perspective triangulation—offer strategies for reducing systematic bias and distortion during data analysis, and thereby increasing credibility. In each case, the strategy involves checking findings against other sources and perspectives. Triangulation, in whatever form, increases credibility and quality by countering the concern (or accusation) that a study's findings are simply an artifact of a single method, a single source, or a single investigator's blinders. Exhibit 9.1 (p. 660) reviews and summarizes the four types of triangulation (items 7–10 in Exhibit 9.1).

Exhibit 9.5 presents a model for rigorous analysis that broadens and deepens triangulation processes in high-stakes, high-visibility situations. Eight attributes of a rigorous analysis process were identified by studying experienced intelligence analysts from multiple U.S. federal investigative agencies. The researchers used a cognitive systems approach in which professional intelligence analysts were engaged in going beyond assessment of the quality of an analysis based on product quality to examine the analytic processes necessary to generate a high-quality, credible, and useful product. The understanding of rigor that emerged was that it is not about following a standardized, highly prescribed analytical process (a formula or recipe) but, rather, "assessing the contextual sufficiency of many different aspects of the analytic process" (Zelik et al., 2007). The researchers posited that these dimensions could be relevant to any process where analysts must make sense of complex data, and the rigor and resulting credibility of their analytical process will affect the utility of the findings for decision making. Examine Exhibit 9.5 carefully and thoughtfully. There's a lot pulled together there in a comprehensive, coherent, and integrated triangulation model: the Rigor Attribute Model. What comes across most powerfully from the work that generated the model is that a product (report, findings, or presentation of results) cannot be assessed for quality and credibility without knowing the nature and rigor of the analytical process that generated the findings. That insight is consistent with the focus of my MQP Rumination, avoiding research rigor mortis, in this chapter (see pp. 701–703).

SIDEBAR: THEORY INTEGRATION MEETS THEORY TRIANGULATION

Different criteria for evaluating the quality of qualitative genres remain fluid as qualitative inquirers move back and forth among genres, ignoring the boundaries, much as birds ignore human fences—except to use them occasionally as convenient places to rest. Consider the reflections on working across and integrating multiple genres and theoretical orientations of self-described "critical educators" Patricia Burdell and Beth Blue Swadener (1999). They combine autobiographical narratives with a variety of theoretical perspectives, including critical, dialogic, phenomenological, feminist, and semiotic perspectives. They speculate that "it is perhaps both the intent and effect of many of these texts to broaden the 'acceptable' or give voice to the intellectual contradictions and tensions in everyday lives of scholar-teachers and researchers" (p. 23).

Our research has used narrative inquiry, collaborative ethnography, and applied semiotics. Between us, we share an identity and scholarship in critical and feminist curriculum theory. We are frequent border-crossers. We seek texts that allow us to enter the world of others in ways that have us more present in their experience, while better understanding our own. (p. 23)

They call this border-crossing genre "critical personal narrative and autoethnography." The real world in which inquiry occurs is not a very neat and orderly place. Nor is it likely to become so. Theoretical and methodological border crossers are natural and determined triangulators.


EXHIBIT 9.5 Dimensions of Rigorous Analysis and The Rigor Attribute Model

Eight attributes of a rigorous analysis process were identified by studying experienced intelligence analysts from multiple U.S. federal investigative agencies. The researchers used a cognitive systems approach in which professional intelligence analysts were engaged in going beyond assessment of the quality of an analysis based on product quality to examine the analytic process that generated the product. The understanding of rigor that emerged was that it is not about following a standardized process but, rather, "assessing the contextual sufficiency of many different aspects of the analytic process" (Zelik et al., 2007). The researchers posited that these dimensions could be relevant to any process where analysts must make sense of complex data, and the rigor and resulting credibility of their analytical process will affect the utility of the findings for decision making.

Overview of the Eight Dimensions of Rigorous Analysis

RIGOR ATTRIBUTE | HIGH-RIGOR PROCESS | LOW-RIGOR PROCESS

1. Hypothesis exploration. Extent to which multiple hypotheses were seriously examined.
   High-rigor process: Test multiple hypotheses to identify the best, most probable explanations against the data.
   Low-rigor process: Minimal weighing of alternatives.

2. Information search. Depth and breadth of the search process used in collecting data.
   High-rigor process: Comprehensively explore as much data as relevant to the inquiry (diligent, purposeful sampling).
   Low-rigor process: Data collection limited to routine and readily available data sources (convenience sampling).

3. Information validation. Information sources are corroborated and cross-validated (triangulation).
   High-rigor process: Systematic verification and triangulation of information (use and sampling of information-rich, trustworthy, and knowledgeable sources).
   Low-rigor process: Little effort made to triangulate converging evidence to verify source accuracy.

4. Stance analysis. Evaluation of data to identify and contextualize the perspective of the source.
   High-rigor process: Investigate key informants' backgrounds to assess how their perspective might bias the information they provide.
   Low-rigor process: Nothing is done when clear bias in a source is detected.

5. Sensitivity analysis. The extent to which analysts consider, understand, and make explicit the assumptions, strengths, weaknesses, limitations, and gaps of their analysis.
   High-rigor process: Systematic and strategic assessment of implications for interpretations, conclusions, and explanations if elements of the supporting sources and evidence prove invalid, inadequate, or otherwise problematic.
   Low-rigor process: Explanations accepted and reported if they seem appropriate and valid on a surface level; emphasis on face validity.

6. Specialist collaboration. The degree to which an analyst incorporates the perspectives of experts and key knowledgeables into their assessments.
   High-rigor process: The analyst has talked to, or may be, a leading expert in the key content areas of the analysis; seeks out independent, external expert peer review for high-stakes analysis.
   Low-rigor process: Little or no effort is made to seek out and incorporate independent, external expertise; no peer review before reporting.

7. Information synthesis. Refers to how far beyond simply collecting, listing, and analyzing distinct data elements, sources, and cases an analyst went in the interpretive process.
   High-rigor process: Extracted and integrated information with a thorough consideration of diverse interpretations of relevant data, noting both areas of consistency in findings and areas where different methods and data yield conflicting findings.
   Low-rigor process: Analyst simply compiles and reports the relevant information in a sequential and compartmentalized form; little or no integration and synthesis.

(Continued)


(Continued)

RIGOR ATTRIBUTE | HIGH-RIGOR PROCESS | LOW-RIGOR PROCESS

8. Explanation critique. A form of collaboration that engages different perspectives in examining the preponderance of evidence supporting primary conclusions.
   High-rigor process: Peers and experts are involved in independently examining the interpretive chain of reasoning and inferences made, explicitly distinguishing which are stronger and which weaker.
   Low-rigor process: Little or no use of other analysts to give input on explanation quality.

SOURCE: Adapted and revised from Zelik, Patterson, and Woods (2007).

SIDEBAR: INTERPRETING TRIANGULATION RESULTS: MAKING SENSE OF CONFLICTING AND INCONSISTENT CONCLUSIONS

A common misconception about triangulation involves thinking that the purpose is to demonstrate that different data sources or inquiry approaches yield essentially the same result. The point is to test for such consistency. Different kinds of data may yield somewhat different results because different types of inquiry are sensitive to different real-world nuances. Different theoretical frameworks will likely foster different interpretations of the same findings. Different analysts may well interpret the same patterns in different ways. Thus, understanding inconsistencies in findings across different kinds of triangulation can be illuminative and important. Finding such inconsistencies ought not to be viewed as weakening the credibility of results but, rather, as offering opportunities for deeper insight into the relationship between inquiry approach and the phenomenon under study.


MODULE 78: Alternative and Competing Criteria for Judging the Quality of Qualitative Inquiries, Part 1

Universal Criteria, and Traditional Scientific Research Versus Constructivist Criteria

Every way of seeing is also a way of not seeing.

—David Silverman (2000, p. 825)

Judging Quality: The Necessity of Determining Criteria

It all depends on criteria. Judging quality requires criteria. Credibility flows from those judgments. Quality and credibility are connected in that judgments of quality constitute the foundation for perceptions of credibility. Diverse approaches to qualitative inquiry—phenomenology, ethnomethodology, ethnography, hermeneutics, symbolic interaction, heuristics, critical theory, realism, grounded theory, and feminist inquiry, to name but a few—remind us that issues of quality and credibility intersect with audience and intended inquiry purposes. Research directed to an audience of independent feminist scholars, for example, may be judged by somewhat different criteria from research addressed to an audience of government economic policymakers. Formative research or action inquiry for program improvement involves different purposes and therefore different criteria of quality compared with summative evaluation aimed at making fundamental continuation decisions about a program or policy. Thus, it is important to acknowledge at the outset that particular philosophical underpinnings or theoretical orientations and special purposes for qualitative inquiry will generate different criteria for judging quality and credibility.

Despite this, efforts to generate universal criteria and checklists for quality abound. The results are as follows: multiple possibilities, no consensus, and ongoing debate.

A Review of Quality Assurance Recommendations for Qualitative Research

An interdisciplinary team of health researchers engaged in worldwide malaria prevention and treatment set out to identify quality criteria for qualitative research and evaluation (Reynolds et al., 2011). They found 93 papers published between 1994 and 2010 that offered and discussed quality criteria, 37 of which were sufficiently detailed to merit further analysis. (The 56 papers that were rejected focused only on review criteria for publication or guidance on a specific qualitative method or single stage of the research process, such as data analysis.) They found no consensus about how to ensure the quality of qualitative research. However, they were able to categorize approaches into two "narratives" about quality: (1) an output-oriented approach versus (2) a process-oriented approach:

1. The most dominant narrative detected was that of an output-oriented approach. Within this narrative, quality is conceptualized in relation to theoretical constructs such as validity or rigor, derived from the positivist paradigm, and is demonstrated by the inclusion of certain recommended methodological techniques: the use of triangulation, member (or participant) validation of findings, peer review of findings, deviant or negative case analysis, and multiple coders of data.

Strengths of the output-oriented approach for assuring quality of qualitative studies include the acceptability and credibility of this approach within the dominant positivist environment where decision making is based on "objective" criteria of quality. Checklists equip those unfamiliar with qualitative research with the means to assess its quality.

The weakness of this approach is that "following of check-lists does not equate with understanding of and commitment to the theoretical underpinnings of qualitative paradigms or what constitutes quality within the approach. The privileging of guidelines as a mechanism to demonstrate quality can mislead inexperienced qualitative researchers as to what constitutes good qualitative research. This runs the risk of reducing qualitative research to a limited set of methods, requiring little theoretical expertise and diverting attention away from the analytic content of research unique to the qualitative approach. Ultimately, one can argue that a solely output-oriented approach risks the values of qualitative research becoming skewed towards the demands of the positivist paradigm without retaining quality in the substance of the research process."


2. By contrast, the second, process-oriented narrative presented conceptualizations of quality that were linked to principles or values considered inherent to the qualitative approach, to be understood and enacted throughout the research process. Six common principles were identified across the narrative: (1) reflexivity of the researcher's position, assumptions, and practice; (2) transparency of decisions made and assumptions held; (3) comprehensiveness of approach to the research question; (4) responsibility toward decision making acknowledged by the researcher; (5) upholding good ethical practice throughout the research; and (6) a systematic approach to designing, conducting, and analyzing a study.

Strengths of the process-oriented approach include the ability of the researcher to address the quality of their research in relation to the core principles or values of qualitative research. The core principles identified in this narrative also represent continuous, researcher-led activities rather than externally determined indicators such as validity, or end-points. Reflexivity, for example, is an active, iterative process—an attitude of attending systematically to the context of knowledge construction . . . at every step of the research process. As such, this approach emphasises the need to consider quality throughout the whole course of research, and locates the responsibility for enacting good qualitative research practice firmly in the lap of the researcher(s).

Need for a Flexible Quality Framework

The review team (Reynolds et al., 2011) found that "there is an increasing demand for the qualitative research field to move forward in developing and establishing coherent mechanisms for quality assurance of qualitative research." They concluded with a recommendation for "the development of a flexible framework to help qualitative researchers to define, apply and demonstrate principles of quality in their research." They further recommended that "the strengths of both the output-oriented and process-oriented narratives be brought together to create guidance that reflects core principles of qualitative research but also responds to expectations of the global health field for explicitly assured quality in research."

We recommend the development of a framework that helps researchers identify their core principles, appropriate for their epistemological and methodological approach, and ways to demonstrate that these have been upheld throughout the research process. . . . We propose that this framework be flexible enough to accommodate different qualitative methodologies without dictating essential activities for promoting quality. (Reynolds et al., 2011)

This chapter addresses this recommendation, offering both a generic quality framework as well as specialized quality criteria for specific types of qualitative inquiry.

SIDEBAR: THE PURPOSE OF AND DEBATE ABOUT CRITERIA

Criteria are standards, benchmarks, norms, and, in some cases, regulative ideals that guide judgments about the goodness, quality, validity, truthfulness, and so forth of competing claims (or methodologies, theories, interpretations, etc.). . . .

Criteria that have been proposed for judging the processes and products of social inquiry include truth, relevance, validity, credibility, plausibility, generalizability, social action, and social transformation, among others. Some of these criteria are epistemic (i.e., concerned with justifying knowledge claims as true, accurate, correct), others are political (i.e., concerned with warranting the power, use, and effects of knowledge claims or the inquiry process more generally); still others are moral or ethical standards (i.e., concerned with the right conduct of the inquirer and the inquiry process in general). . . .

Poststructuralist and postmodernist approaches to qualitative inquiry are also shaping the way we conceive of criteria. Given the growing influence of narrative approaches and experimental texts in qualitative inquiry, it is becoming more common to find discussions of rhetorical and aesthetic criteria replacing discussions of epistemic criteria. Other scholars argue that epistemological criteria cannot be neatly decoupled from political and critical agendas and ethical concerns. Some scholars in qualitative inquiry have little patience for discussing criteria within different epistemological frameworks and theoretical perspectives and prefer to focus on the craft of using various methodological procedures for producing "quality" work.

—Schwandt (2007, pp. 49–50)
The Sage Dictionary of Qualitative Inquiry


Judging the Quality of Alternative Approaches to Qualitative Inquiry

There can be no universal, generic, standardized, and all-encompassing criteria for judging the quality of qualitative studies because qualitative inquiry is not monolithic, uniform, or standardized. It's as if someone set out to create a universal checklist for beauty that ignored culture, human variability, variety, and differences in taste, socialization, and values (oh yes, the "Miss Universe" and "Miss World" contests notwithstanding). The common core elements across all kinds of qualitative inquiry are attention to language, words, narrative, description, stories, cases, worldviews, and how people make sense of their worlds. Tracy (2010), for example, identified "eight 'big-tent' criteria for excellent qualitative research": (1) worthy topic, (2) rich rigor, (3) sincerity, (4) credibility, (5) resonance, (6) significant contribution, (7) ethics, and (8) meaningful coherence. But approaches to inquiring into and judging attainment of these general criteria are diverse and multifaceted, serve competing purposes, and are, ultimately, a matter of debate (Gordon & Patterson, 2013).

It is possible to specify quality criteria for research generally. These are not unique to qualitative inquiry but apply to scientific inquiries of all kinds. Exhibit 9.6 presents these general, science-based quality criteria.

EXHIBIT 9.6 General Scientific Research Quality Criteria

QUALITY CRITERIA | ELABORATION/EXAMPLES

1. Clarity of purpose.
   Basic research, applied research, and evaluation research, for example, serve different purposes and are judged by different standards. (See Exhibit 5.1, p. 250.)

2. Epistemological clarity.
   Inquiry traditions like positivism, naturalism, social construction, realism, and phenomenology are based on different criteria about what constitutes knowledge, how it is acquired, and how it should be judged. (See Chapter 3, especially Exhibit 3.3, pp. 97–99.)

3. Questions and hypotheses flow from and are consistent with purpose and inquiry tradition.
   Different purposes and inquiry traditions emphasize different priority questions. (See Exhibit 3.3, pp. 97–99.)

4. Methods, design, and data collection procedures are appropriate for the nature of the inquiry.
   Purpose, epistemology, and research questions, in combination, drive methods, design, and data collection decisions. Matching methods to questions and hypotheses, given constraints of time, resources, and access, is basic.

5. Data collection procedures are systematic and carefully documented.
   The foundation of all science is careful methodological documentation so that those reviewing findings can determine how they were produced.

6. Data analysis is appropriate for the kind of data collected.
   Matching analytical procedures to the nature and type of data collected is a basic standard. There may be disagreements about what is appropriate, but the researcher has the obligation to make the case for appropriateness and justify methodological and analytical decisions made.

7. Strengths and weaknesses are acknowledged and discussed.
   No studies are perfect. All have limitations. These should be acknowledged and their implications for interpreting findings discussed.

8. Findings should flow from the data and analysis.
   The connection between data collected, analysis undertaken, and findings (conclusions, explanations) should be clear and explained.

9. Research should be presented for review.
   A fundamental principle of science is openness to review by those in a position to judge quality.

10. Ethical reflection and disclosure.
   All scientific traditions and disciplines have ethical standards, like avoiding (or at least disclosing) conflicts of interest and treating human subjects with respect. Compliance with ethical standards should be discussed.


EXHIBIT 9.7 Alternative Sets of Criteria for Judging the Quality and Credibility of Qualitative Inquiry

1. Traditional Scientific Research Criteria [graphic motto: "Fight TRUTH Decay"]
1. Objectivity of the inquirer (minimize bias)
2. Hypothesis generation and testing
3. Validity of the data
4. Interrater reliability of codings and pattern analyses
5. Conclusions about the correspondence of findings to reality
6. Generalizability (external validity)
7. Strength of causal explanations (attribution analysis)
8. Contributions to theory
9. Independence of conclusions and judgments
10. Credibility to knowledgeable disciplinary researchers (peer review)
(These criteria are explained and discussed on pp. 683–684.)

2. Social Construction and Constructivist Criteria [graphic motto: "Deconstruct TRUTHS"]
1. Subjectivity acknowledged (discuss and take into account inquirer perspective)
2. Trustworthiness and authenticity
3. Interdependence: relationship based (intersubjectivity)
4. Triangulation (capturing and respecting multiple perspectives)
5. Reflexivity
6. Particularity (doing justice to the integrity of unique cases)
7. Enhanced and deepened understanding (verstehen)
8. Contributions to dialogue
9. Extrapolation and transferability
10. Credible to and deemed accurate by those who have shared their stories and perspectives
(These criteria are explained and discussed on pp. 684–686.)

3. Artistic and Evocative Criteria [graphic motto: "Create TRUTHS"]
1. Emotionally evocative: connects with and moves the audience
2. Integrates science and art to open the world to us
3. Creativity
4. Aesthetic quality, artistic representation
5. Interpretive vitality, sensuous
6. Embedded in lived experience
7. Stimulating and provocative
8. Voice distinct, expressive
9. Feels "true" or "authentic" or "real"
10. Crystallization
(These criteria are explained and discussed on pp. 687–690.)

4. Participatory and Collaborative Criteria
1. Genuine and significant participation from inquiry focus, through design, data collection, analysis, and reporting; participation is real
2. Researchers and participants are co-inquirers, sharing power and decision making
3. Interactive validity and interpersonal competence


[graphic motto: "Group-Sourcing TRUTH"]
4. Builds capacity through learning by doing
5. Mutual respect
6. Group reflexivity
7. Interdependence
8. Sense of group ownership ("We did this.")
9. Group accountability: negotiated trade-offs explicit and transparent
10. Credibility within the group the basis for external credibility
(These criteria are explained and discussed on pp. 690–691.)

5. Critical Change Criteria ("Speak TRUTH to Power")
1. Critical perspective: increases consciousness about injustices
2. Identifies nature and sources of inequalities and injustices
3. Represents the perspective of the less powerful
4. Makes visible the ways in which those with more power exercise and benefit from power
5. Engages those with less power respectfully and collaboratively
6. Builds the capacity of those involved to take action
7. Identifies potential change-making strategies
8. Praxis
9. Clear historical and values context
10. Consequential validity
(These criteria are explained and discussed on pp. 691–693.)

6. Systems Thinking and Complexity Criteria ("Truth Is COMPLEX")
1. Analyze and map systems of interests
2. Attend to interrelationships
3. Capture perspectives
4. Sensitive to and explicit about boundary implications
5. Capture emergence
6. Expect and document nonlinearities
7. Adapt inquiry in the face of uncertainties
8. Describe systems changes and their implications
9. Contribution analysis
10. Credible to systems thinkers
(These criteria are explained and discussed on pp. 693–695.)

7. Pragmatic, Utilization-Focused Criteria ("What Is Useful Is TRUE")
1. Focus inquiry on informing action and decisions
2. Identify intended uses and users
3. Interactive engagement with intended users to enhance use
4. Practical orientation throughout
5. Relevance to real-world issues and concerns
6. Time findings and feedback to support use
7. Understandable methods and findings
8. Actionable findings
9. Credible to primary intended users
10. What is useful is true
11. Extract lessons
(These criteria are explained and discussed on pp. 695–697.)


From the General to the Particular: Seven Sets of Criteria for Judging the Quality of Different Approaches to Qualitative Inquiry

Once we move beyond general criteria for scientific inquiry (Exhibit 9.6) to address specific quality criteria for qualitative inquiry, we must move from the general to the particular and contextual. Exhibit 9.7 lists criteria that are embedded in and flow from distinct qualitative inquiry frameworks. The traditional scientific research criteria are embedded in and derived from what I discussed in Chapter 3 as reality-testing inquiry frameworks that include positivist, postpositivist, empiricist, and foundationalist epistemologies (pp. 105–108). The social construction criteria are derived from the discussion of "constructivism" in Chapter 3 (pp. 121–126). The artistic and evocative criteria are derived from the discussion of autoethnography and evocative forms of inquiry in Chapter 3, especially the criteria suggested by Richardson (2000b) for "creative analytic practice of ethnography." The fourth set of criteria, participatory and collaborative approaches, are based on traditions and approaches reviewed in Chapter 4 (pp. 213–222). The fifth set of criteria, critical change criteria, flow from critical theory, feminist inquiry, activist research, and participatory research processes aimed at empowerment. The sixth set of criteria, systems and complexity criteria, are derived from the discussion in Chapter 3 (pp. 139–151). The seventh and final set of criteria, pragmatic and utilization-focused criteria, are based on discussions in Chapters 3 and 4 (pp. 152–157) as well as program evaluation standards and principles (Joint Committee on Standards, 2010) and "Guiding Principles for Evaluators" (AEA Task Force on Guiding Principles for Evaluators, 1995).

To some extent, all of the theoretical, philosophical, and applied orientations reviewed in Chapters 3 and 4 provide somewhat distinct criteria, or at least priorities and emphases, for what constitutes a quality contribution within those particular perspectives and concerns. I've chosen these seven broader sets of criteria to capture the primary debates that differentiate qualitative approaches and, more specifically, to highlight what seem to me to differentiate reactions to qualitative inquiry.

In this chapter, we are primarily concerned with how others respond to our work. With what perspectives and by what criteria will our work be judged by those who encounter and engage it? Some of the confusion that people have in assessing qualitative research stems from thinking it represents a uniform perspective, especially in contrast to quantitative research. This makes it hard for them to make sense of the competing approaches within qualitative inquiry. By understanding the criteria that others bring to bear on our work, we can anticipate their reactions and help them position our intentions and criteria in relation to their own expectations and criteria. In terms of the Reflexive Triangulated Inquiry model presented in Chapter 2 as Exhibit 2.5 (see p. 72), we're dealing here with the intersection between the inquirer's perspective and the perspective of those receiving the study (the audiences).

SIDEBAR: DIFFERENT AUDIENCES INTERESTED IN AND INVOLVED IN ASSESSING THE QUALITY OF QUALITATIVE RESEARCH AND EVALUATION

Criteria of quality can and often do vary by audience (Flick, 2007b, pp. 3–8). Here are some questions to consider in thinking about the intersection of quality criteria and audience.

1. Your criteria. You, the inquirer, presumably have an interest in doing quality work. How do you decide what standards and criteria of quality you will adhere to?

2. Primary users of your findings. Others will read and potentially use your findings. Who are the intended users of what you generate, and what criteria will they apply in judging the quality and credibility of your work?

3. Funders of your inquiry. If your inquiry has been funded by a grant, an agency, an evaluation contract, or some other funding mechanism, funders will be judging whether what you produced was worth what it cost. How will they make that judgment?

4. Publication reviewers. You may want to publish your findings. How will journals, book editors, and peer reviewers judge your work?

You need not be passive about others' criteria and judgments. Indeed, you ought not to be passive. You should make explicit the quality criteria you have applied in designing and implementing your inquiry and invite readers, funders, and peer reviewers to join you in using your criteria. You may also add the caveat that if they apply different criteria, their judgments of quality may well differ from yours and from those who follow the criteria you're operating under. In all of this, keep in mind that

the question of how to ascertain the quality of qualitative research has been asked since the beginning of qualitative research and attracts continuous and repeated attention. However, answers to this question have not been found—at least not in a way that is generally agreed upon. (Flick, 2007b, p. 11)


Criteria Determine What We See: The Umpires' Perspectives

Different perspectives about things such as truth and the nature of reality constitute paradigms or worldviews based on alternative epistemologies and ontologies. People viewing qualitative findings through different paradigmatic lenses will react differently, just as we, as researchers and evaluators, vary in how we think about what we do when we study the world. These differences are nicely illustrated by the classic story of three baseball umpires who, having retired after a game to a local establishment for the dispensing of reality-distorting but truth-enhancing libations, are discussing how they call balls and strikes.

"I call them as I see them," says the first.

"I call them as they are," says the second.

"They ain't nothing until I call them," says the third.

That's the classic version of the story. Now, thanks to high-speed camera technology, we can update the story.

As chance would have it, two management researchers, Brayden King of Northwestern University and Jerry Kim of Columbia Business School, happened to be in the same bar going over their research on the accuracy of umpires' calls. Overhearing the three umpires, they went up to them and said, "Fourteen percent of the time you call them wrong." Before the umpires could argue, they explained,

We analyzed more than 700,000 pitches thrown during the 2008 and 2009 seasons. In addition to an average error rate of 14%, we found that umpires tended to favor the home team and that umpires were more likely to make mistakes when the game was on the line. (Based on King & Kim, 2014, p. SR12)

The two researchers went on like this for some time, breaking down the error rates by innings, situation, pitcher and batter ethnicity and race, pitcher reputation, and so forth and so on, until finally the umpires together put up their hands and told them to stop.

The first umpire said, "Your criteria are based on a high-speed camera. We get that. You love your numbers. We get that. And your analysis is interesting, even fascinating. We get that. But during a game we don't use a camera. So I still call them as I see them," he reiterated.

"And I call them as they are," repeated the second.

"And they ain't nothing until I call them," concluded the third.

We turn now to discussion and elaboration of the seven alternative sets of criteria for judging the quality of qualitative work summarized in Exhibit 9.7.

1. Traditional Scientific Research Criteria

The saddest aspect of life right now is that science gathers knowledge faster than society gathers wisdom.
—Isaac Asimov (1920–1992), science author and science fiction writer

One way to increase the credibility and legitimacy of qualitative inquiry among those who place priority on traditional scientific research criteria is to emphasize those criteria that have priority within that tradition. Science has traditionally emphasized objectivity, so qualitative inquiry within this tradition emphasizes procedures for minimizing investigator bias. Those working within this tradition will emphasize rigorous and systematic data collection procedures, for example, cross-checking and cross-validating sources during fieldwork. In analysis, it means, whenever possible, using multiple coders and calculating intercoder consistency to establish the validity and reliability of pattern and theme analysis. Qualitative researchers working in this tradition are comfortable using the language of "variables" and "hypothesis testing" and striving for causal explanations and generalizability, especially in combination with quantitative data (e.g., Hammersley, 2008b). Qualitative approaches that manifest some or all of these characteristics include grounded theory (Glaser, 2000), qualitative comparative analysis (Ragin, 1987, 2000), and realism (Miles et al., 2014). Their common aim is to use qualitative methods to describe and explain phenomena as accurately and completely as possible so that their descriptions and explanations correspond as closely as possible to the way the world is and actually operates (Reynolds et al., 2011). Government agencies supporting qualitative research (e.g., the U.S. Government Accountability Office, the National Science Foundation, or the National Institutes of Health) usually operate within this traditional scientific framework.
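The intercoder consistency mentioned above can be quantified. What follows is a minimal sketch, not drawn from the book, of computing percent agreement and Cohen's kappa (a common chance-corrected agreement statistic) for two coders who have each assigned one theme to the same set of passages. The coder labels and theme names are invented for illustration.

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two coders over the same passages."""
    assert len(codes_a) == len(codes_b) and codes_a
    n = len(codes_a)
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Expected agreement if each coder assigned labels at their own base rates.
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical data: two coders each assign one theme to ten passages.
coder1 = ["hope", "fear", "hope", "loss", "hope", "fear", "loss", "hope", "fear", "hope"]
coder2 = ["hope", "fear", "loss", "loss", "hope", "fear", "loss", "hope", "hope", "hope"]

print(f"percent agreement: {sum(a == b for a, b in zip(coder1, coder2)) / len(coder1):.0%}")
print(f"Cohen's kappa:     {cohens_kappa(coder1, coder2):.2f}")
```

Kappa is reported alongside raw agreement because two coders who both use one theme heavily will agree often by chance alone; the chance correction is what makes the statistic defensible to audiences applying traditional scientific criteria.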


SIDEBAR: THE ROOTS OF TRADITIONAL SOCIAL SCIENCE CRITERIA APPLIED TO QUALITATIVE INQUIRY

An emphasis on valid and reliable knowledge, as generated by neutral researchers utilizing the scientific method to discover universal Truth, reflects an epistemology commonly referred to as positivism. Historically, social scientists understood positivism as reflected in a "realist ontology, objective epistemology, and value-free axiology." Few, if any, qualitative researchers currently subscribe to an absolute faith in positivism, however. Many postpositivists, or researchers who believe that achievement of objectivity and value-free inquiry are not possible, nonetheless embrace the goal of production of generalizable knowledge through realist methods and minimization of researcher bias, with objectivity as a "regulatory ideal" rather than an attainable goal. In short, postpositivism does not embrace naive belief in pure scientific truth; rather, qualitative research conducted in a strict postpositivist tradition utilizes precise, prescribed processes and produces social scientific reports that enable researchers to make generalizable claims about the social phenomenon within particular populations under examination.

Postpositivists commonly utilize qualitative methods that bridge quantitative methods, in which researchers conduct an inductive analysis of textual data, form a typology grounded in the data (as contrasted with a preexisting, validated typology applied to new data), use the derived typology to sort data into categories, and then count the frequencies of each theme or category across data. Such research typically emphasizes validity of the coding schema, inter-coder reliability, and careful delineation of procedures, including random or otherwise systematic sampling of texts. Content analyses of media typify this approach. (Ellingson, 2011, pp. 596, 598; within-quote references omitted)

2. Social Construction and Constructivist Criteria

What is perceived as real is real in its consequences.
—The Thomas theorem

Social construction, constructivist, and "interpretivist" perspectives have generated new language and concepts to distinguish quality in qualitative research (e.g., Glesne, 1999, pp. 5–6). Lincoln and Guba (1986) proposed that constructivist inquiry demanded different criteria from those inherited from traditional social science. They suggested "credibility as an analog to internal validity, transferability as an analog to external validity, dependability as an analog to reliability, and confirmability as an analog to objectivity." In combination, they viewed these criteria as addressing "trustworthiness (itself a parallel to the term rigor)" (pp. 76–77). They went on to emphasize that naturalistic inquiry should be judged by dependability (a systematic process systematically followed) and authenticity (reflexive consciousness about one's own perspective, appreciation for the perspectives of others, and fairness in depicting constructions in the values that undergird them). They viewed the social world (as opposed to the physical world) as socially, politically, and psychologically constructed, as are human understandings and explanations of the physical world. They advocated triangulation to capture and report multiple perspectives rather than seek a singular truth.

The team of researchers who reviewed approaches to assessing quality in qualitative research found that "the post-positivist criteria developed by Lincoln and Guba, based around the construct of 'trustworthiness,' were referenced frequently and appeared to be the basis upon which a number of authors made their recommendations for improving quality of qualitative research" (Reynolds et al., 2011).

Constructivists embrace subjectivity as a pathway deeper into understanding the human dimensions of the world in general as well as whatever specific phenomena they are examining (Peshkin, 1985, 1988, 2000a, 2000b). They're more interested in deeply understanding specific cases within a particular context than in hypothesizing about generalizations and causes across time and space. Indeed, they are suspicious of causal explanations and empirical generalizations applied to complex human interactions and cultural systems. They offer perspective and encourage dialogue among perspectives rather than aiming at singular truths and linear predictions. Social constructivists' case studies, findings, and reports are explicitly informed by attention to praxis and reflexivity—that is, understanding how one's own experiences and background affect what one understands and how one acts in the world, including acts of inquiry. For an in-depth discussion of this perspective and its implications, see the Handbook of Constructionist Research (Holstein & Gubrium, 2008). Also see Chapter 3 (pp. 121–126) for a much lengthier discussion of constructionism and constructivism.

Here are three examples of social construction as a framework for program evaluation.
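The typology-and-count procedure Ellingson describes (sort excerpts into inductively derived categories, then count frequencies across the data) is mechanically simple once the coding is done. The sketch below uses invented excerpt labels and category names purely for illustration; in practice, deriving and validating the typology is the hard, human part.

```python
from collections import Counter

# Hypothetical coded data: each excerpt has been sorted into one category
# of a typology derived inductively from the texts themselves.
coded_excerpts = [
    ("excerpt_01", "barriers_to_care"),
    ("excerpt_02", "family_support"),
    ("excerpt_03", "barriers_to_care"),
    ("excerpt_04", "self_reliance"),
    ("excerpt_05", "family_support"),
    ("excerpt_06", "barriers_to_care"),
]

def theme_frequencies(coded):
    """Count how often each category of the typology appears across the data."""
    counts = Counter(category for _, category in coded)
    total = sum(counts.values())
    # Report raw counts alongside proportions, as content analyses typically do.
    return {cat: (n, round(n / total, 2)) for cat, n in counts.most_common()}

print(theme_frequencies(coded_excerpts))
```

This is the "counting" half of the postpositivist bridge between qualitative and quantitative methods; the validity claims rest on the coding schema and intercoder reliability, not on the arithmetic.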


1. The evaluation of a community development project in an ethnically and racially diverse neighborhood collected and reported stories from residents purposefully sampled to present a range of experiences and perspectives. The evaluation did not render judgments but was called a "multivocal evaluation" in which the diverse stories were used for dialogue and to enhance mutual understanding.

2. The evaluation of the international Paris Declaration on Development Aid included case studies that revealed the different perspectives and contexts within which aid is given and received. Donors (wealthier countries) and beneficiaries (poorer countries) experience different "realities." One purpose of the evaluation was to capture those different realities, including diverse experiences with the Paris Declaration principles, to facilitate dialogue on future international development policies and practices.

3. Social constructivism was the foundation of Nora Murphy's (2014) study of homeless youth. The 14 case studies showed diverse experiences and perspectives on homelessness. The study concluded that the unique situation of each homeless youth meant that program responses needed to be socially constructed together with the youth to be meaningful to them and to build trusting adult–youth relationships. (For more on this evaluation, see pp. 194, 626–628.)

SIDEBAR: CONSTRUCTIVIST TRUSTWORTHINESS

The credibility of your findings and interpretations depends on your careful attention to establishing trustworthiness. Lincoln and Guba (1985) describe prolonged engagement (spending sufficient time at your research site) and persistent observation (focusing in detail on those elements that are most relevant to your study) as critical in attending to credibility. "If prolonged engagement provides scope, persistent observation provides depth" (p. 304). With each, time is a major factor in the acquisition of trustworthy data. Time at your research site, time spent interviewing, and time building sound relationships with respondents all contribute to trustworthy data. When a large amount of time is spent with your research participants, they less readily feign behavior or feel the need to do so; moreover, they are more likely to be frank and comprehensive about what they tell you. Lincoln and Guba posited four constructivist criteria as parallel to but distinct from traditional research criteria:

First, credibility (parallel to internal validity) addressed the issue of the inquirer providing assurances of the fit between respondents' views of their life ways and the inquirer's reconstruction and representation of same. Second, transferability (parallel to external validity) dealt with the issue of generalization in terms of case-to-case transfer. It concerned the inquirer's responsibility for providing readers with sufficient information on the case studied such that readers could establish the degree of similarity between the case studied and the case to which findings might be transferred. Third, dependability (parallel to reliability) focused on the process of the inquiry and the inquirer's responsibility for ensuring that the process was logical, traceable, and documented. Fourth, confirmability (parallel to objectivity) was concerned with establishing the fact that the data and interpretations of an inquiry were not merely figments of the inquirer's imagination. It called for linking assertions, findings, interpretations, and so on to the data themselves in readily discernible ways. For each of these criteria, Lincoln and Guba also specified a set of procedures that could be used to meet the criteria. For example, auditing was highlighted as a procedure useful for establishing both dependability and confirmability, and member check and peer debriefing, among other procedures, were defined as most appropriate for credibility.

In Fourth Generation Evaluation (1989), Guba and Lincoln reevaluated this initial set of criteria. They explained that trustworthiness criteria were parallel, quasi-foundational, and clearly intended to be analogs to conventional criteria. Furthermore, they held that trustworthiness criteria were principally methodological criteria and thereby largely ignored aspects of the inquiry concerned with the quality of outcome, product, and negotiation. Hence, they advanced a second set of criteria called authenticity criteria, arguing that this second set was better aligned with the constructivist epistemology that informed their definition of qualitative inquiry. (Schwandt, 2007, pp. 299–300)

Continual alertness to your own biases and subjectivity (reflexivity) also assists in producing more trustworthy interpretations. Consider your subjectivity within the context of the trustworthiness of your findings. Ask yourself a series of questions: Whom do I not see? Whom have I seen less often? Where do I not go? Where have I gone less often? With whom do I have special relationships, and in what light would they interpret phenomena? What data-collecting means have I not used that could provide additional insight? Triangulated findings contribute to credibility. Triangulation may involve the use of multiple data collection methods, sources, investigators, or theoretical perspectives. To improve trustworthiness, you can also consciously search for negative cases.


Alternative Criteria Review

Exhibit 9.7 (pp. 680–681) presented seven different sets of criteria for judging the quality of qualitative studies. This module reviewed the first two sets of criteria: (1) traditional scientific research criteria versus (2) constructivist criteria. Constructivist criteria emerged from the critique that traditional scientific research criteria were based on quantitative and experimental design thinking that, by the very nature of using those criteria for defining quality, led to qualitative studies being judged inferior. The next module makes the issue of judging quality even more complicated by adding five more sets of alternative and competing criteria.

[Cartoon: "A Realist Views a Constructivist Proposal," © 2002 Michael Quinn Patton and Michael Cochran. Caption: "Marry you? There you go, trying to construct reality again."]

MODULE 79: Alternative and Competing Criteria, Part 2

Artistic, Participatory, Critical Change, Systems, Pragmatic, and Mixed Criteria

The moral and social yearnings of fully realized human beings are not reducible to universal laws and cannot be studied like physics.
—Brooks (2010, p. A27)

This module continues the presentation and discussion of seven alternative sets of criteria for judging the quality of qualitative studies. The previous module covered (1) traditional social science research criteria and (2) social construction and constructivist criteria. This module covers (3) artistic and evocative criteria, (4) participatory and collaborative criteria, (5) critical change criteria, (6) systems and complexity criteria, and (7) pragmatic and utilization-focused criteria. We'll then examine mixing criteria.

3. Artistic and Evocative Criteria

TRUTH is visceral, palpable, sensuous, wrenching, hormonal, cognitive, cathartic, lyrical, contextual, awakening, fleeting, universal, and debatable. In other words, truth is art.
—From Halcolm's Ruminations

Researchers and audiences operating from the perspective of traditional scientific research criteria emphasize the scientific nature of qualitative inquiry. Researchers and audiences who view the world through the lens of social construction emphasize qualitative inquiry as a particularly human form of understanding centered on the capacity of people and groups to construct meaning. That brings us to this third alternative, which emphasizes that human beings both think and feel. Traditional social science and constructivist inquiries focus on cognitive, logical, sense-making analyses. The artistic and evocative approaches to qualitative inquiry want to bring forth our emotional selves and do so by integrating art and science. Science makes us think. Great art makes us feel. From the perspective of artistic and evocative qualitative inquirers, great qualitative studies should evoke both understandings (cognition) and feelings (emotions).

Persons are moved by emotion. . . . People are their emotions. To understand who a person is, it is necessary to understand emotion. . . . Emotions cut to the core of people. Within and through emotion people come to define the surface and essential, or core, meanings of who they are. Emotions and moods are ways of disclosing the world for the person. (Denzin, 2009, pp. 1–2)

Artistic and evocative criteria focus on aesthetics, creativity, interpretive vitality, and expressive voice. Case studies become literary works. Poetry or performance art may be used to enhance the audience's direct experience of the essence that emerges from analysis. Artistically oriented qualitative analysts seek to engage those receiving the work, to connect with them, move them, provoke them, and stimulate them. Creative nonfiction and fictional forms of representation blur the boundaries between what is "real" and what has been created to represent the essence of a reality. A literal presentation of reality, real (scientific) or perceived (constructivism), yields to artistically created reality. The results may be called creative syntheses, ideal-typical case constructions, scientific poetics, or any number of phrases that suggest the artistic emphasis. Artistic expressions of qualitative analysis strive to provide an experience with the findings where "truth" or "reality" is understood to have a feeling dimension that is every bit as important as the cognitive dimension. Such qualitative inquiry is explicitly sensuous (Stoller, 2004) and emotional (Denzin, 2009).

The performance art of The Vagina Monologues (Ensler, 2001), based on interviews with women about their experiences of coming of age sexually, but presented as theater, offers a prominent example. The audience feels as much as knows the truth of the presentation because of the essence it reveals. In the artistic tradition, the analyst's interpretive and expressive voice, experience, and perspective may become as central to


the work as depictions of others or the phenomenon of interest. Here are some examples of artistic and evocative approaches used in program evaluation.

• A program for low-income, pregnant, drug-addicted teenagers asked the young women to draw pictures of their hearts and tell what the pictures meant. The initial hearts were portrayed as wounded, knifed, torn, mangled, and tortured. Over a period of four months (four drawings and accompanying stories), the pictures showed some sunshine, flowers, rainbows, and, most striking, connections to other hearts. No perfect valentines. Not even close. But to look at those raw drawings was to see hearts healing.

• Theater for development is being used in Nigeria to engage community members in discussing preliminary results from an evaluation, with actors replaying scenarios relating to a program uncovered through fieldwork. By involving community members in role-play, the early findings can be verified or corrected based on the participants' experiences of the program, and new perspectives can be recorded, which might otherwise have remained dormant (Folorunsho, 2014).

• A team of educational evaluators converted interviews with students and teachers about an international cross-cultural summer experience into a play dramatizing critical events and key learnings. The play was performed for the school board as the project's evaluation report.

• Photographs taken before and after Vietnam implemented a motorbike helmet law showed dramatic changes in compliance. (See Exhibit 8.20, p. 609; for other examples of using visuals created to present evaluation findings, see Exhibits 8.21 through 8.27, pp. 610–619.)

Exhibit 9.8 shows how an interview transcript was converted into a poem when presenting the findings, all the better to give the reader a feel for what was said and the affect it carried.

EXHIBIT 9.8  From Interview Transcript to Poem: An Artistic and Evocative Presentation

In May 1994, Corrine Glesne (1997) interviewed Dona Juana, an 86-year-old professor in the College of Education at the University of Puerto Rico.

That she chose a bird to represent her was no surprise. Standing 5 feet tall, very thin ("a problem all my life"), and with bright dark eyes, she was birdlike in appearance. Her office was a nest of books, papers, and folders in organized piles on her large desk, on the beige metal filing cabinets next to the door opposite her desk, in the wooden cabinet along the wall to the right of her desk, on the shelves below the window to her left, and on the two chairs before her desk. There was no sense of disorder, but rather an impression of an archive that would illuminate Dona Juana's 50 years in research and higher education. (p. 203)

Below is the poem Glesne (1997) created from the interview transcript, followed by a table showing the conversion from transcript to poem.

The Poem: That Rare Feeling

I am a flying bird
moving fast
seeing quickly
looking with the eyes of God,
from the tops of trees.
How hard for country people
picking green worms
from fields of tobacco,
sending their children to school,
not wanting them to suffer
as they suffer.

In the urban zone,
students worked at night
and so they slept in school.
Teaching was the real university.

So I came to study
to find out how I could help.
I am busy here at the university,
there is so much to do.
But the University is not the Island.

I am a flying bird
moving fast, seeing quickly
so I can give strength,
so I can have that rare feeling
of being useful. (pp. 202–203)


Composing the Poem From the Interview Transcript

TRANSCRIPT

C (Corrine): If I asked you to use a metaphor to describe yourself as a professor, what would you say you were like? Someone I asked said that she was a bridge and then she told me why. What metaphor comes to mind for you?

J (Juana): I would be a flying bird.

C: A flying bird. Tell me about it. How are you a flying bird?

J: Because I want to move so fast.

C: Mm-hmmm. Cover a lot of territory.

J: Yes. Yes.

C: Are you any kind of bird or just any bird?

J: Well, any bird because I don't want to mention some birds, some birds here are destructive.

C: Are what?

J: Are destructive. They destroy and I don't want to . . .

C: No, you don't want to be one of them. You're just a bird that moves fast.

J: That moves fast and sees from the tops of trees. So I can see quickly.

C: See quickly, see everything.

J: Everything.

J: So you can see me?

C: I can. I can see you, a flying bird.

J: I wish I could look at the world with the eyes of God.

C: With the eyes of what?

J: Of God, of that spiritual power that can give strength.

C: That can give strength? Strength?

J: Yes, to those that need.

POETIC NARRATIVE CREATED

Version 1: Chronologically and linguistically faithful to the transcript

I would be a flying bird.
I want to move so fast
so I can see quickly, everything.
I wish I could look at the world
with the eyes of God,
to give strength to those that need . . .

Version 2: Draws from other sections of the interviews, takes more license with words

I am a flying bird
moving fast, seeing quickly,
looking with the eyes of God
from the tops of trees:
How hard for country people
picking green worms
from fields of tobacco,
sending their children to school,
not wanting them to suffer
as they suffer.
In the urban zone,
students worked at night
and so they slept in school.
Teaching was the real university.
So I came to study
to find out how I could help.
I am busy here at the university,
there is so much to do.
But the university is not the Island.
I am a flying bird
moving fast, seeing quickly
so I can give strength,
so I could have that rare feeling
of being useful. (Glesne, 1997, p. 207)

Crystallization

Sociologist Laurel Richardson (2000b) introduced crystallization as a criterion of quality in artistic and evocative qualitative inquiry, a replacement for triangulation as a criterion.

The scholar draws freely on his or her productions from literary, artistic, and scientific genres, often breaking the boundaries of each of those as well. In these productions, the scholar might have different "takes" on the same topic, what I think of as a postmodernist deconstruction of triangulation. . . . In postmodernist mixed-genre texts, we do not triangulate, we crystallize. . . . I propose that the central image for "validity" for postmodern texts is not the triangle—a rigid, fixed, two-dimensional object. Rather, the central imaginary is the crystal, which combines symmetry and substance with


an infinite variety of shapes, substances, transmutations, multidimensionalities, and angles of approach. . . . Crystallization provides us with a deepened, complex, thoroughly partial, understanding of the topic. Paradoxically, we know more and doubt what we know. Ingeniously, we know there is always more to know. (p. 934)

Crystallization's roots can be traced to

the creative and courageous work of feminist methodologists who blasphemed the boundaries of art and science. . . .

Art and science do not oppose one another; they anchor ends of a continuum of methodology, and most of us situate ourselves somewhere in the vast middle ground. When scholars argue that we cannot include narratives alongside analysis or poems within grounded theory, they operate under the assumption that art and science negate one another and hence are incompatible, rather than merely differ in some dimensions. . . . My explanation of crystallization assumes a basic understanding of the complexities involved in combining methods and genres from across regions of the continuum. (Ellingson, 2009, pp. 3, 5)

SIDEBAR
RIGOR IN ARTISTIC AND EVOCATIVE CRYSTALLIZATION

One of the most helpful (albeit not foolproof) ways to enhance your account and ward off editorial defensiveness toward creative analytic work, in general, and crystallization, in particular, is to be absolutely clear about what you did (and did not do) in producing your manuscript. This includes data collection, analysis, and especially choices made about representation. . . . By explaining my process, I help alleviate suspicions that I took an "anything goes" sloppy attitude toward constructing my representation.

While some colleagues may not like or approve of what you did no matter how you explain it, concise, explicit details of your process make it more difficult for them to dismiss it as careless or random. Accounting for your process (even in an appendix or endnote) constitutes an important nod toward methodological rigor. As many have posited, engaging in creative analytic work should be no less rigorous, exacting, and subject to strict standards of peer evaluation. . . . Moreover, such a roadmap assists others who may seek to follow your lead. . . .

Some suggestions on issues to own:

• Explain choices you made in composing narratives, poems, or other artistic work; in other words, how did you get from data to text?
• Describe your standpoint vis-à-vis your topic, not just what it is, but (at least some of) how it shapes your interactions with your data (e.g., I am a cancer survivor studying clinics so I tend to be more empathetic with patients than health care providers; I am a feminist so I pay a lot of attention to power dynamics).
• Indicate your awareness of and response to ethical considerations about voice, privacy, and responsibility to others. What steps did you take to ensure participant confidentiality? To privilege participants' voices? Consider how your work might be read in ways that do not reflect your intentions—for example, what quotes from participants could be taken out of context and used as justification for blaming the victim?—and surround vulnerable voices with preemptory statements that make it more difficult for oppositional forces to excerpt and reinterpret their meaning in regressive ways.
• Detail your analytic procedures. . . . Even if you construct a unique, outside-the-box artistic creation, you should explain your methodology and cite some sources to contextualize your work. Again, this need not interfere with your aesthetic goals; details should be concise and can be placed in an appendix, footnote, or even a separate piece altogether. The goal is to reveal crystallized projects as embodied, imperfect, insightful constructions rather than as immaculate end products. (Ellingson, 2009, pp. 199–200)

4. Participatory and Collaborative Criteria

To be human is to engage in
interpersonal dynamics. Inter:
between. Personal: people. Dynamics:
forces that produce activity and
change. Combining these definitions,
interpersonal dynamics are the forces
between people that lead to activity and
change. Whenever and wherever people
interact, these dynamics are at work.

—King and Stevahn (2013, p. 2)
Interactive Evaluation Practice

Participatory and collaborative qualitative inquiries have four purposes and justifications:

1. Values premise: The right way to inquire into a phenomenon of interest is to do it with the people involved and affected. This means doing research and evaluation with as opposed to on people. It means engaging them as fellow inquirers and coresearchers rather than as research subjects.


2. Quality premise: Data will be better when people who are the focus of the inquiry willingly participate, understand the nature of the inquiry, and agree with the importance of the study. Interviews will be richer and more detailed. Observations will be open and unguarded. Documents will be readily available. Data are better.

3. Reciprocity premise: Researchers get data, publications, knowledge, and career advancement from research and evaluation studies. Those who are the focus of inquiry should benefit as well. As coresearchers, through participation in the inquiry, they learn research skills, learn to think more systematically, and gain knowledge that they can use for their own purposes.

4. Utility premise: In program evaluation and action research inquiries, the findings are more likely to be useful—and actually used—when those who must act on the findings collaborate in generating and interpreting them.

SIDEBAR
INTERPERSONAL VALIDITY

Educational evaluator Karen Kirkhart (1995) coined the term interpersonal validity: the extent to which an evaluator is able to relate meaningfully and effectively to individuals in the evaluation setting. The interpersonal factor that undergirds interpersonal validity highlights the competence of a participatory evaluator or researcher to do two things: (1) interact with people constructively throughout the framing and implementation of an inquiry and (2) create activities and conditions conducive to positive interactions among participants. The interpersonal factor is concerned with creating, managing, and ultimately mastering the interpersonal dynamics that make a collaborative inquiry possible and inform its findings. One is concerned with eventual use, the other with establishing buy-in among participants and a valid inquiry process. (King & Stevahn, 2013, p. 6)

From the classic articulation and justification of Participatory Action Research by William Foote Whyte (1989, 1991) to methods and facilitation guides on how to actually do it (Caister et al., 2011; Hacker, 2013; King & Stevahn, 2013; Pyrch, 2012; Taylor, Suarez-Balcazar, Forsyth, & Kielhofner, 2006), participatory and collaborative engagement has been a major approach to qualitative inquiry. When conducting research in a collaborative mode, professionals and nonprofessionals become coresearchers. Participatory action research encourages collaboration within a mutually acceptable inquiry framework to understand and/or solve organizational or community problems. Chapter 4 includes an in-depth discussion of participatory and collaborative approaches (pp. 213–222), including Exhibit 4.13, Principles of Fully Participatory and Genuinely Collaborative Inquiry (p. 222).

Here's an example of a participatory and collaborative qualitative inquiry. Robin Boylorn (2008) studied the experiences of black Southern women. She recruited a group of participants from the community in which she had grown up and invited them to share stories about their experiences and lives growing up and raising families in the rural South. She facilitated their interactions together as co-investigators so that they felt "equally invested and equally involved in the process of collecting, writing, interpreting, and editing the stories they wrote" (p. 600). She shared her experiences with the participants, and together they compared and contrasted their ideas and experiences. Their involvement began during the early stages of recommending other participants and retelling stories in individual and group settings to ensure adequate information was available. As co-investigators their stories were instrumental in establishing and representing a corporate set of themes and experiences. Though the co-researchers in this project were not involved in the writing stages, they did have the opportunity to respond to the stories the author wrote, offering their unique perspectives and feedback as participants in the research and characters in the stories. The resulting research project is a collaboration between the researcher and the researched, including participants as co-researchers. (p. 600)

5. Critical Change Criteria

We are distressed by underprivilege.
We see gaps among privileged
patrons and managers and staff and
underprivileged participants and
communities. . . . We are advocates of
a democratic society.

—Robert Stake (2004, pp. 103–107)
Qualitative evaluation pioneer

How Far Dare an Evaluator Go Toward Saving the World?

Those engaged in qualitative inquiry as a form of critical analysis aimed at social and political change


eschew any pretense of open-mindedness or objectivity; they take an activist stance. Critical change inquiry aims to critique existing conditions and through that critique bring about change. The critical change criterion is derived from critical theory, which frames and engages in qualitative inquiry with an explicit agenda of elucidating power, economic, and social inequalities. The "critical" nature of critical theory flows from a commitment to go beyond just studying society for the sake of increased understanding. Critical change researchers set out to use inquiry to critique society, raise consciousness, and change the balance of power in favor of those less powerful. Influenced by Marxism, informed by the presumption of the centrality of class conflict in understanding community and societal structures, and updated in the radical struggles of the 1960s, critical theory provides both philosophy and methods for approaching research and evaluation as fundamental and explicit manifestations of political praxis (connecting theory and action), and as change-oriented forms of engagement.

Critical social science and critical social theory attempt to understand, analyze, criticize, and alter social, economic, cultural, technological, and psychological structures and phenomena that have features of oppression, domination, exploitation, injustice, and misery. They do so with a view to changing or eliminating these structures and phenomena and expanding the scope of freedom, justice, and happiness. The assumption is that this knowledge will be used in processes of social change by people to whom understanding their situation is crucial in changing it. (Bentz & Shapiro, 1998, p. 146; Kincheloe & McLaren, 2000)

Critical change has three interconnected elements: (1) inquiry into situations of social injustice, (2) interpretation of the findings as a critique of the existing situation, and (3) using the findings and critique to mobilize and inform change.

Critical theory looks at, exposes, and questions hegemony—traditional power assumptions held about relationships, groups, communities, societies, and organizations—to promote social change. Combined with action research, critical theory questions the assumed power that researchers typically hold over the people they typically research. Thus, critical action research is based on the assumption that society is essentially discriminatory but is capable of becoming less so through purposeful human action.

Critical action research also assumes that the dominant forms of professional research are discriminatory and must be challenged. Critical action research takes the concept of knowledge as power and equalizes the generation of, access to, and use of that knowledge. Critical action research is an ethical choice that gives voice to, and shares power with, previously marginalized and muted people. (Davis, 2008, p. 140)

Critical change criteria apply to a number of specialized areas of qualitative inquiry (Given, 2008, pp. 139–179; Schwandt, 2007, pp. 50–55):

Critical ethnography • Critical discourse analysis • Critical realism
Critical education studies • Critical hermeneutics • Critical research
Critical arts-based inquiry • Critical humanism • Critical theory
Critical race theory • Critical pragmatism • Critical action research
Critical pedagogy • Critical social science • Critical systems analysis

In addition, feminist inquiry often includes an explicit agenda of bringing about social change (e.g., Benmayor, 1991; Brisolara, Seigart, & SenGupta, 2014; Hesse-Biber, 2013; Podems, 2014b). Liberation research and empowerment evaluation derive, in part, from Paulo Freire's philosophy of praxis and liberation education, articulated in his classics Pedagogy of the Oppressed (1970) and Education for Critical Consciousness (1973), still sources of influence and debate (e.g., Glass, 2001). Barone (2000) aspires to "emancipatory educational storysharing" (p. 247). Qualitative studies informed by critical change criteria range from largely intellectual and research-oriented approaches that aim to expose injustices to more activist forms of inquiry that actually engage in bringing about social change. Stephen Brookfield (2004) uses critical theory to illuminate adult education issues, trends, and inequities. Plummer (2011) integrates critical theory and queer theory. Caruthers and Friend (2014) bring critical inquiry to online learning and engagement. Crave, Zaleski, and Trent (2014) emphasize the role of critical change in building a more equitable future through participatory program evaluation.

Here are two examples of critical change studies that would expect to be evaluated for quality by critical change criteria (Davis, 2008, p. 141):

1. Martin Diskin worked with policymakers and development agencies in Latin American studies to conduct what they called "power structure research," in which they exposed injustice as a strategy for building coalitions and motivating movements.


2. Christine Davis's ethnography of a children's mental health treatment team was an interdisciplinary research project involving the fields of communication studies, social work, and mental health. Conducted in partnership with community agencies, this research examined issues of power, marginalization, and control within these teams. It suggested a stance toward children and families that rejects the traditional hierarchical medical model of care and instead treats them as unique valuable humans and as equal partners in treatment.

Consequential validity as a critical change criterion for judging a research design or instrument makes the social consequences of its use a value basis for assessing its credibility and utility. Thus, standardized achievement tests are criticized because of the discriminatory consequences for minority groups of educational decisions made with "culturally biased" tests. Consequential validity asks for assessments of who benefits and who is harmed by an inquiry, measurement, or method (Brandon, Lindberg, & Wang, 1993; Messick, 1989; Shepard, 1993).

SIDEBAR
A QUALITATIVE MANIFESTO: A CALL TO ARMS

—Norman K. Denzin (2010)

The social sciences . . . should be used to improve quality of life. . . . For the oppressed, marginalized, stigmatized and ignored . . . and to bring about healing, reconciliation and restoration between the researcher and the researched.

—Stanfield (2006, p. 725)

Mills wanted his sociology to make a difference in the lives that people lead. He challenged persons to take history into their own hands. He wanted to bend the structures of capitalism to the ideologies of radical democracy. . . .

I want a critical methodology that enacts its own version of the sociological imagination. Like Mills, my version of the imagination is moral and methodological. And like Mills, I want a discourse that troubles the world, understanding that all inquiry is moral and political.

This book is an invitation and a call to arms. It is directed to all scholars who believe in the connection between critical inquiry and social justice (Denzin, 2010, p. 10).

Qualitative inquiry can contribute to social justice in the following ways:

1. It can help identify different definitions of a problem and/or a situation that is being evaluated with some agreement that change is required. It can show, for example, how battered wives interpret the shelters, hotlines, and public services that are made available to them by social welfare agencies. Through the use of personal experience narratives, the perspectives of women and workers can be compared and contrasted.

2. The assumptions, often belied by the facts of experience, that are held by various interested parties—policy makers, clients, welfare workers, online professionals—can be located and shown to be correct, or incorrect (Becker, 1967, p. 239).

3. Strategic points of intervention into social situations can be identified. Thus, the services of an agency and a program can be improved and evaluated.

4. It is possible to suggest "alternative moral points of view from which the problem, the policy and the program can be interpreted and assessed" (see Becker, 1967, pp. 239–240). Because of its emphasis on experience and its meanings, the interpretive method suggests that programs must always be judged by and from the point of view of the persons most directly affected.

5. The limits of statistics and statistical evaluations can be exposed with the more qualitative, interpretive materials furnished by this approach. Its emphasis on the uniqueness of each life holds up the individual case as the measure of the effectiveness of all applied programs. (Denzin, 2010, pp. 24–25)

6. Systems Thinking and Complexity Criteria

In a finite game, it is easy to make sense. Everyone agrees on the goal; the rules are known; and the field of play has clear boundaries. Baseball, football, and bridge are examples of finite games. At one time in the not-so-distant past we expected careers, marriages, parenthood, education, and citizenship to be finite games. When everyone agrees on the rules, and the consequences of our actions are undeniable, responsible people plan for what they want, take steps to achieve it, and enjoy the fruits of their labor. We know what it takes to make sense in a finite game.


Most of us realize we are playing in a very different game. We are playing in an infinite game, in which the boundaries are unclear or nonexistent, the scorecard is hidden, and the goal is not to win but to keep the game in play. There are still rules, but the rules can change without notice. There are still plans and playbooks, but many games are going on at the same time, and the winning plans can seem contradictory. There are still partners and opponents, but it is hard to know who is who, and besides that, the "who is who" changes unexpectedly.

—Glenda Eoyang and Royce Holladay (2013, p. 4)
Adaptive Action: Leveraging Uncertainty in Your Organization

Studying "infinite games" in highly dynamic situations characterized by uncertainty and rapid change creates special challenges for qualitative inquiry. Systems thinking and complexity concepts offer a framework for studying such situations, tools for both inquiry and "coping with chaos" (Eoyang, 1997), and criteria for deciding whether such studies are of high quality. To be credible to systems thinkers and complexity scientists, the qualitative inquiry must capture, describe, map, and analyze systems of interest; must attend to interrelationships, capture diverse perspectives, attend to emergence, and be sensitive to and explicit about boundary implications; and must document nonlinearities, adapt the inquiry in the face of uncertainties, and describe systems changes and their implications. In so doing, the explanatory approach moves from attribution to contribution analysis (see pp. 596–597 in Chapter 8).

Thinking Systemically

Chapter 3 discussed systems theory and complexity theory as distinct, though intersecting, theoretical frameworks (see pp. 139–151). Exhibit 3.14 presents complexity theory concepts and qualitative inquiry implications (pp. 147–148). Exhibit 3.16 presents the relationship of systems theory to complexity theory (p. 150).

• Systems theory inquiry questions: How and why does this system function as it does? What are the system's boundaries and interrelationships, and how do these affect perspectives about how and why the system functions as it does?

• Complexity theory inquiry question: How can the emergent and nonlinear dynamics of complex adaptive systems be captured, illuminated, and understood?

For my purpose here, namely, differentiating distinct sets of criteria by which to judge the quality and credibility of various approaches to qualitative inquiry, the core systems and complexity dimensions can be integrated, as they are in Exhibit 9.7. That said, the systems field exemplifies the challenge of settling on some definitive set of quality criteria for judging qualitative inquiry, especially using a systems and complexity framing, because there are multiple approaches within the systems field (e.g., Hieronymi, 2013), each of which would assert and favor particular criteria unique to that perspective.

Systemic inquiry covers a wide range of methodologies, methods, and techniques with a strong focus on the behaviors of complex situations and the meanings we draw from those situations. It spans both the qualitative and quantitative research method domains but also includes approaches that fit neither category nor both categories. . . . Any attempt to summarize a transdiscipline like systemic inquiry is fraught with difficulties. Despite relatively simple origins, the field has sprawled in many directions so that no single, universally accepted theory has emerged, and neither are there universally agreed definitions of basic concepts such as what is and what is not a system. Although we will find many definitions in the systems literature, many authors argue that single fixed definitions promote the kind of reductionist thinking that runs counter to systemic principles. Instead, they argue, the field should promote debates around methodological principles to create learning rather than fixed definitions—what Kurt Richardson calls "critical pluralism." (Williams, 2008, p. 858)

So as not to get lost in or overwhelmed by different approaches to systems, let me close this section with an example grounded in the basics of attending to interrelationships, boundaries, perspectives, and emergence. An exemplar of applying systems thinking to understand an issue is the analysis done by Christopher Wells (2012) of the role and impact of the automobile in the United States. His analysis begins before there were automobiles (what complexity theorists call initial conditions). He examines emergent land use patterns in the nineteenth century, sanitation problems in cities, the development of agricultural markets, the role of horses in transportation, the influence of train routes, the challenges of riding bicycles on rutted and muddy roads, the function of farmers in maintaining roads along their farms, population growth, and many other factors that established the initial conditions that automobiles emerged into. To understand the automobile in American society, culture, politics, and economics, you must look at the systems before the automobile existed (transportation, commerce, public health, political


jurisdictions, land use, and community values as starting places) and continue to examine those systems and their interactions through to the present day. The irony is that engaging and thinking through those complex interactions yields extraordinary clarity.

SIDEBAR
DIVERSE METHODS BASED ON SYSTEMS AND COMPLEXITY CONCEPTS

Systems and complexity concepts manifest nuances of difference under varying application frameworks:

• System dynamics: Focuses on the interrelationships between components of a situation, especially the consequences of feedback and delay
• Viable systems: Explores relationships that support an organization's viability within its environment
• Soft systems methodology: Looks at a situation from multiple viewpoints to understand and anticipate both interactions and unanticipated consequences
• Critical systems heuristics: Focuses on ethical issues, marginalization of people, and ideas of power and coercion
• Activity systems: Draws on cultural-historical activity theory to identify and track roles, tools, past features and dynamics, contradictions, tensions, conflicts, disturbances, innovations, processes, and learning opportunities
• Complex adaptive systems: Independent and interdependent elements or agents adapting to each other, self-organizing and emergent patterns, and nonlinear dynamics
• Network analysis: Examines dynamic interactions, connectivity, processes, and outcomes among a group or system of interconnected people or things (Network Impact and Center for Evaluation Innovation, 2014)

SOURCES: Williams (2005) and Williams and Hummelbrunner (2011).

7. Pragmatic, Utilization-Focused Criteria

Usefulness! It is not a fascinating word, and the quality is not one of which the aspiring spirit can dream o' nights, yet on the stage it is the first thing to aim at.

—Dame Ellen Terry (1847–1928)
Leading Shakespearean actress in Britain

How use doth breed a habit in a man.

—William Shakespeare

It is intriguing to find a great Shakespearean actress lauding usefulness as a matter of prime concern in her performances. Based on her musings about what she aspired to, usefulness concerned using anything and everything at her disposal to bring the play to life and connect with the audience. This is, perhaps, an artistic and evocative view of usefulness, but it also connotes a practical twist that makes for a provocative introduction to our final set of quality criteria: pragmatic, utilization-focused criteria.

• Observations of a high school cafeteria revealed substantial food waste. Interviews showed why the students were so dissatisfied with the food offered. The school had recently experienced an influx of immigrants from Asian countries, where people preferred rice rather than potatoes and bread. The results were used by school officials and the student council to advocate for more culturally appropriate food. Their efforts were successful.

• An early-childhood parent education program was experiencing a high dropout rate. Fewer than half the parents who started the program completed it. Interviews with the dropouts revealed that the program materials being used were academic and difficult for poorly educated parents to understand. Materials were revised to be more accessible and appropriate for parents with lower reading skills.

• The agricultural extension service serving a remote rural area in West Africa had very poor attendance at field trips aimed at helping farmers improve their basic growing practices for the subsistence crops sorghum and millet. Interviews with farmers revealed that they received no advance notice when the field training would occur. A radio news program agreed to announce extension field visits. Attendance increased significantly.

These are examples of simple inquiries aimed at providing practical and useful information to solve immediate problems. The pragmatic, utilization-focused criteria emphasize qualitative data generated to solve problems and inform decisions. This means focusing the inquiry on informing action and decisions. To be useful, specific intended users must be identified and their information needs met. Interactive engagement with intended users enhances relevance and use. Findings and feedback are timed to support use. Findings must be actionable and results understandable. The methods used need to be credible to those who will use the findings. Epistemologically, the orientation of pragmatic qualitative inquiry is that what is useful is true.

Pragmatic, utilization-focused inquiry begins with the premise that studies should be judged by their

Copyright ©2015 by SAGE Publications, Inc. This work may not be reproduced or distributed in any form or by any means without express written permission of the publisher.

utility and actual use; therefore, evaluators and researchers should facilitate the inquiry process and design any study with careful consideration of how everything that is done, from beginning to end, will affect use. Use concerns how real people in the real world apply findings and experience the inquiry process. Therefore, the focus is on intended use by intended users. Since no study can be value-free, utilization-focused inquiry answers the question of whose values will frame the study by working with clearly identified, primary intended users who have the responsibility to apply findings and take action (Patton, 2008, 2012a).

PRAGMATIC EVALUATION STANDARDS

SIDEBAR

The evaluation profession has adopted standards that call for evaluations to be useful, practical, ethical, accurate, and accountable (Joint Committee on Standards, 2010). In the 1970s, as evaluation was just emerging as a field of professional practice, many evaluators took the position of traditional researchers that their responsibility was merely to design studies, collect data, and publish findings; what decision makers did with those findings was not their problem. This stance removed from the evaluator any responsibility for fostering use and placed all the "blame" for nonuse or underutilization on decision makers. Moreover, before the field of evaluation identified and adopted its own standards, criteria for judging evaluations could scarcely be differentiated from criteria for judging research in the traditional social and behavioral sciences, namely, technical quality and methodological rigor. Utility was largely ignored. Methods decisions dominated the evaluation design process. Validity, reliability, measurability, and generalizability were the dimensions that received the greatest attention in judging evaluation research proposals and reports. Indeed, evaluators concerned about increasing a study's usefulness often called for ever more methodologically rigorous evaluations to increase the validity of findings, thereby supposedly compelling decision makers to take findings seriously.

By the late 1970s, however, program staff and funders were becoming openly skeptical about spending scarce funds on evaluations that they couldn't understand and/or found irrelevant. Evaluators were being asked to be "accountable," just as program staff were supposed to be accountable. The questions emerged with uncomfortable directness: Who will evaluate the evaluators? How will evaluation be evaluated? It was in this context that professional evaluators began discussing standards.

The most comprehensive effort at developing standards was hammered out over five years by a 17-member committee appointed by 12 professional organizations with input from hundreds of practicing evaluation professionals. Just prior to publication, Dan Stufflebeam, chair of the committee, summarized the results as follows:

The standards that will be published essentially call for evaluations that have four features. These are utility, feasibility, propriety and accuracy. And I think it is interesting that the Joint Committee decided on that particular order. Their rationale is that an evaluation should not be done at all if there is no prospect for its being useful to some audience. Second, it should not be done if it is not feasible to conduct it in political terms, or practicality terms, or cost effectiveness terms. Third, they do not think it should be done if we cannot demonstrate that it will be conducted fairly and ethically. Finally, if we can demonstrate that an evaluation will have utility, will be feasible and will be proper in its conduct, then they said we could turn to the difficult matters of the technical adequacy of the evaluation. (Stufflebeam, 1980, p. 90)

High-Stakes Debate: What Counts as Credible Evidence, and by What Criteria Shall Credibility Be Judged?

The seven frameworks just reviewed show the range of criteria that can be brought to bear in judging a qualitative study. They can also be viewed as "angles of vision" or "alternative lenses" for expanding the possibilities available, not only for critiquing inquiry but also for undertaking it. What is most important to understand is that researchers and evaluators attending to and operating with any one of the seven different sets of quality criteria will (a) ask different questions, (b) use different methods, (c) follow different analytical processes, (d) report their findings in different ways, and (e) aim their claims of credibility to different audiences. These are not just academic distinctions. The differences are far from trivial. Quite the contrary, the different orientations have far-reaching implications for every aspect of inquiry. These different quality criteria constitute the underpinnings of significantly different ways of engaging in qualitative inquiry. At the heart of all scientific debate throughout history has been this burning question: What counts as credible evidence and by what criteria shall credibility be judged?


Nor is the debate about what counts as credible evidence just a matter of contention among scientists. Policymakers and politicians have gotten involved. It is down-and-dirty politics with millions of dollars in government-funded and philanthropic-sponsored research at stake (Denzin & Giardina, 2006, 2008; Donaldson, Christie, & Mark, 2008; Scriven, 2008). This means that advocates of qualitative inquiry must understand and be prepared to enter the debate about the politics of evidence (e.g., Eyben, 2013; Nutley et al., 2013; Schorr, 2012). In so doing, understanding the variety of approaches to qualitative inquiry, and which approaches are legitimate by what criteria, will become part of the debate.

So make no mistake about it, advocates of one particular set of criteria are likely to be vociferous critics of alternative criteria. Using the research process as an intervention to correct injustices and foment change is anathema to those who advocate traditional scientific research criteria as the only acceptable standards for judging quality. Those traditional criteria insist on a clear line of demarcation between studying a phenomenon (basic research and independent, external evaluation) versus engaging in change through the research process (advocacy). On the other hand, attempts to make traditional scientific research criteria the only legitimate approach to government-funded research are criticized as narrow-minded, self-serving political advocacy that constitutes "a conservative challenge to qualitative inquiry" (Denzin & Giardina, 2006, p. x).

Constructivists generated their criteria of quality as a direct reaction to what they considered the gross inadequacies and methodological distortions of traditional scientific research criteria, which are essentially derived from the experimental/quantitative paradigm (see pp. 87–95). Thus, they systematically set out to replace traditional research criteria like validity and reliability with trustworthiness and authenticity (Lincoln & Guba, 1985, 1986). Advocates of artistic and evocative approaches attack both traditional research and constructivism as emotionally void. Traditional researchers have been disinclined to use participatory and collaborative approaches, sometimes believing that involving nonresearchers in research inevitably leads to poorer quality; in other cases, it's a matter of lacking incentives, capacity, or interest. Pragmatic, utilization-focused inquiries are attacked for being theoretically useless, unscholarly, and so practical as to be worthless for generating explanations or generalizations. Many traditional researchers don't even consider action research worthy of the name "research."

Choosing a Framework Within Which to Work

Which criteria you choose to emphasize in your work will depend on the purpose of your inquiry, the values and perspectives of the audiences for your work, and your own philosophical and methodological orientation. Operating within any particular framework and using any specific set of criteria will invite criticism from those who judge your work from a different framework and with different criteria. (For examples of the vehemence of such criticisms between those using traditional social science criteria and those using artistic narrative criteria, see Bochner, 2001; English, 2000.) Understanding that criticisms (or praise) flow from criteria can help you anticipate how to position your inquiry and make explicit what criteria to apply to your own work as well as what criteria to offer others given the purpose and orientation of your work.

The profession of program evaluation is a microcosm of these larger divisions. Program evaluation is a diverse, multifaceted profession manifesting many different models and approaches (Christie & Alkin, 2013; Fitzpatrick, Sanders, & Worthen, 2010; Funnell & Rogers, 2011; Patton, 2008; Stufflebeam, Madeus, & Kellaghan, 2000). All seven alternative quality criteria are advocated by various evaluation theorists, methodologists, and practitioners.

Any particular evaluation study has tended to be dominated by one set of criteria, with a second set as possibly secondary. For example, a primarily constructivist approach might add some artistic techniques as supporting methods. An evaluation dominated by the traditional scientific research approach might have a section dedicated to dealing with pragmatic issues. Exhibit 9.9 shows how the seven frameworks can be found in the approaches of various evaluation theorists, methodologists, and practitioners.

Clouds and Cotton: Mixing and Changing Perspectives

While each set of criteria manifests a certain coherence, many researchers mix and match approaches. The work of Tom Barone (2000), for example, combines aesthetic, political (critical change), and constructivist elements. Denzin's Performance Ethnography (2003) uses artistic and evocative approaches to foment and contribute to "radical social change, to economic justice, to a culture of politics that extends critical race theory and the principles of a radical democracy to all aspects of society" (p. 3). A team of evaluators collaborated to integrate constructivism, participatory evaluation, critical change, and a utilization focus (evaluations for improvement):


EXHIBIT 9.9 Alternative Quality Criteria Applied to Program Evaluation

Program evaluation is a diverse, multifaceted profession manifesting many different models and approaches (Alkin & Christie, 2013; Fitzpatrick, Sanders, & Worthen, 2010; Funnell & Rogers, 2011; Patton, 2008; Stufflebeam, Madeus, & Kellaghan, 2000). All seven alternative quality criteria are advocated by various evaluation theorists, methodologists, and practitioners.

QUALITY CRITERIA — PROGRAM EVALUATION FOCUS — LEADING CLASSIC TEXTS AND RESOURCES

1. Traditional scientific research criteria
Focus: Apply research methods to attribute documented outcomes to the intervention and generalize the findings.
Texts: Chen and Rossi (1987); Rossi, Lipsey, and Freeman (2004); and Silverman, Ricci, and Gunter (1990)

2. Social construction and constructivist criteria
Focus: Capture and report multiple perspectives on participants' experiences and diverse program outcomes.
Texts: Greene (1998, 2000); Guba and Lincoln (1981, 1989); Lincoln and Guba (1985); and Schwandt and Burgon (2006)

3. Artistic and evocative criteria
Focus: Connoisseurship evaluation: Use artistic representations to evoke participants' program experiences and judge a program's merit and worth.
Texts: Barone (2001, 2008); Eisner (1985, 1991); Knowles and Cole (2008); and Mathison (2009)

4. Participatory and collaborative criteria
Focus: Involve program staff and participants in evaluation to enhance use and build capacity for future evaluations.
Texts: Cousins and Chouinard (2012); Cousins and Earl (1992, 1995); King, Cousins, and Whitmore (2007); Greene (2006); and King and Stevahn (2013)

5. Critical change criteria
Focus: Use evaluation to address social justice, empower participants, and bring about change; support genuine democracy; and reduce power imbalances.
Texts: Fetterman (2000); Fetterman and Wandersman (2005); Fetterman, Kaftarian, and Wandersman (1996); Greene (2006); House and Howe (2000); Kirkhart (1995); and Mertens (1998, 1999, 2013)

6. Systems and complexity criteria
Focus: Understand programs through systems analysis and complexity concepts; support program innovation and adaptation; and evaluate systems change.
Texts: Eoyang and Holladay (2013); Jolley (2014); Mowles (2014); Patton (2011); Sterman (2006); Walton (2014); Williams and Hummelbrunner (2011); and Williams and Iman (2006)

7. Pragmatic, utilization-focused criteria
Focus: Get actionable answers to practical questions to support program improvement, guide problem solving, enhance decision making, and ensure the utility and actual use of findings.
Texts: Alkin, Daillak, and White (1979); Davidson (2012); Patton (2008, 2012a); Rogers and Williams (2006); and Weiss (1977)


Evaluations for improvement, understanding lived experience, or advancing social justice are fundamentally participatory, involving key stakeholders in critical decisions about the evaluation's agenda, direction, and use. Such a principle is rooted epistemologically in the importance of understanding multiple perspectives and experiences in evaluation, and also politically in the importance of democratic inclusion.

—Whitmore et al. (2006, p. 341)

As an evaluator, I have worked with mixed criteria from all seven frameworks to match particular designs to the needs and interests of specific stakeholders and clients (Patton, 2008). But mixing and combining criteria means dealing with the tensions between them. After reviewing the tensions between traditional social science criteria and postmodern constructivist criteria, narrative researchers Lieblich, Tuval-Mashiach, and Zilber (1998) attempted "a middle course," but that middle course reveals the very tensions they were trying to supersede as they worked with one leg in each camp:

We do not advocate total relativism that treats all narratives as texts of fiction. On the other hand, we do not take narratives at face value, as complete and accurate representations of reality. We believe that stories are usually constructed around a core of facts or life events, yet allow a wide periphery for freedom of individuality and creativity in selection, addition to, emphasis on, and interpretation of these "remembered facts." . . . Life stories are subjective, as is one's self or identity. They contain "narrative truth" which may be closely linked, loosely similar, or far removed from "historical truth." Consequently, our stand is that life stories, when properly used, may provide researchers with a key to discovering identity and understanding it—both in its "real" or "historical" core, and as narrative construction. (p. 8)

Traditional scientific research criteria and critical change criteria are polar opposites. The same study cannot aspire to independence, objectivity, and a primary focus on contributing to theory while also being deeply engaged in using the inquiry process to foment change and ameliorate oppression. Mixing methods (qualitative and quantitative) is one thing. Mixing criteria of quality is a bit more challenging, one might even say daunting. Certainly, constructivist, artistic, and participatory criteria can be intermingled. But traditional scientific research criteria are less amenable to comingling.

The remainder of this chapter will elaborate some of the most prominent of these competing criteria that affect judgments about the quality and credibility of qualitative inquiry and analysis. But it's not always easy to tell whether someone is operating from a realist, constructionist, artistic, activist, or evaluative framework. Indeed, the criteria can shift quickly. Consider this example. My six-year-old son, Brandon, was explaining a geography science project he had done for school. He had created an ecological display out of egg cartons, ribbons, cotton, bottle caps, and styrofoam beads. "These are three mountains and these are four valleys," he said, pointing to the egg cup arrangement. "And is that a cloud?" I asked, pointing to the big hunk of cotton. He looked at me, disgusted, as though I'd just said about the dumbest thing he's ever heard. "That's a piece of cotton, Dad."

Foreshadowing research rigor mortis, MQP Rumination # 9 in the next module.

SOURCE: © Chris Lysy—freshspectrum.com

MODULE 80 Credibility of the Inquirer

The previous modules in this chapter have reviewed strategies for enhancing the quality and credibility of qualitative analysis: selecting appropriate criteria for judging quality, searching for rival explanations, explaining negative cases, triangulation, and keeping data in context. Technical rigor in analysis is a major factor in the credibility of qualitative findings. This section now takes up the issue of how the credibility of the inquirer affects the way findings are received.

One barrier to credible qualitative findings stems from the suspicion that the analyst has shaped findings according to her or his predispositions and biases. Whether this may have happened unconsciously, inadvertently, or intentionally (with malice and forethought) is not the issue. The issue is how to counter such a suspicion before it takes root. One strategy involves discussing your predispositions and making biases explicit, to the extent possible. This involves systematic and studious reflexivity (see pp. 70–74). Another approach is engaging in mental cleansing processes (e.g., epoche in phenomenological analysis, p. 575). Or one may simply acknowledge one's orientation as a feminist researcher (Podems, 2014b) or critical theorist and move on from there. The point is that you have to address the issue of your credibility.

The Researcher as the Instrument in Qualitative Inquiry

Because the researcher is the instrument in qualitative inquiry, a qualitative report should include some information about you, the researcher. What experience, training, and perspective do you bring to the study? Who funded the study and under what arrangements with you? How did you gain access to the study site and the people observed and interviewed? What prior knowledge did you bring to the research topic and study site? What personal connections do you have to the people, program, or topic studied? For example, suppose the observer of an Alcoholics Anonymous program is a recovering alcoholic. This can either enhance or reduce credibility depending on how it has enhanced or detracted from data gathering and analysis. Either way, the analyst needs to deal with it in reporting findings. In a similar vein, it is only honest to report that the evaluator of a family counseling program was going through a difficult divorce at the time of fieldwork.

No definitive list of questions exists that must be addressed to establish investigator credibility. The principle is to report any personal and professional information that may have affected data collection, analysis, and interpretation—either negatively or positively—in the minds of users of the findings. For example, health status should be reported if it affected one's stamina in the field. Were you sick part of the time? Let's say that the fieldwork for evaluation of an African health project was conducted over three weeks, during which time the evaluator had severe diarrhea. Did that affect the highly negative tone of the report? The evaluator said it didn't, but I'd want to have the issue out in the open to make my own judgment. Background characteristics of the researcher (e.g., gender, age, race, and/or ethnicity) may be relevant to report in that such characteristics can affect how the researcher was received in the setting under study and what sensitivities the inquirer brings to the issues under study.

In preparing to interview farm families in Minnesota, I began building up my tolerance for strong coffee a month before the fieldwork. Being ordinarily not a coffee drinker, I knew my body would be jolted by 10 to 12 cups of coffee a day doing interviews in farm kitchens. In the Caribbean, I had to increase my tolerance for rum because some farmer interviews took place in rum shops. These are matters of personal preparation—both mental and physical—that affect perceptions about the quality of the study. Preparation and training for fieldwork, discussed at the beginning of Chapter 6, should be reported as part of the study's methodology.

Reflexivity and Intellectual Rigor

But this denoted a foregone conclusion.

—William Shakespeare
(Othello to Iago, interpreting what it means for someone to mutter something while sleeping)

The credibility of qualitative inquiry is so closely connected to the credibility of the person or team conducting the inquiry that the quality of reflexivity and reflectivity offered in a report is a window into the thinking processes that are the bedrock of qualitative analysis. Essentially, reflexivity involves turning qualitative analysis on yourself. Who are you, and


how has who you are affected what you've found and reported in the study? This puts your intellectual rigor on display. The very notion of intellectual rigor connotes that, as important as it is to employ systematic analytical strategies and techniques, the effectiveness and quality of those strategies and techniques depend on the quality of thinking that directs them. Which brings me to this chapter's rumination: Avoiding Research Rigor Mortis.

MQP Rumination # 9

Avoiding Research Rigor Mortis

I am offering one personal rumination per chapter. These are issues that have persistently engaged, sometimes annoyed, occasionally haunted, and often amused me over more than 40 years of research and evaluation practice. Here's where I state my case on the issue and make my peace.

Look for a pattern in what follows. See if you detect a theme.

Rigor (definition). Unyielding or inflexible; the quality of being extremely thorough, exhaustive, or accurate; being strict in conduct, judgment, and decision (Oxford Dictionary); scrupulous or inflexible accuracy or adherence (Random House Dictionary)

Measurement rigor. The underlying psychometric properties of a measure and its ability to fully and meaningfully capture the relevant construct; the fact that data have been collected in essentially the same manner, across time, the program, and jurisdictions, adds methodological rigor; the reliability and validity of instruments (Weitzman & Silver, 2012)

Research design rigor. The true experiment (randomized controlled trials) as the optimal (gold standard) design for developing evidence-based practice (Ross, Barkaoui, & Scott, 2007)

Methodological rigor. Design elements that support strong causal attributions and analytical generalization (Chatterji, 2007; Coryn, Schröter, & Hanssen, 2009)

Evaluation research rigor. Evidence testing the extent to which valid and reliable measures of program outcomes can be directly and confidently attributed to a standardized, high-fidelity, consistently implemented program intervention; the most rigorous evaluation is the randomized controlled trial (Chatterji, 2007; Henry, 2009; Ross et al., 2007; Rossi et al., 2004); "methodological rigor can be assessed from the evaluation plan and the quality of the evaluation's implementation" (Braverman, 2013, p. 101)

Analytical rigor. "Meticulous adherence to standard process . . .; scrupulous adherence to established standards for the conduct of work" (Zelik, Patterson, & Woods, 2007, p. 1)

Rigor mortis. Latin: rigor "stiffness," mortis "of death"—one of the recognizable signs of death, caused by chemical changes in the muscles after death, causing the limbs of the corpse to become stiff and difficult to move or manipulate

Research rigor mortis. Rigid designs, rigidly implemented, then rigidly analyzed through standardized, rigidly prescribed operating procedures and judged hierarchically by standardized, rigid criteria, thereby manifesting rigorism at every stage

Rigorism. Extreme strictness; no course may be followed that is contrary to doctrine (Random House Dictionary)

Research rigorism. Technicism—reducing research to "the application of techniques or the following of rules" (Hammersley, 2008b, p. 31)

Did you find the pattern? Did you detect a theme? Read on for the countertheme. (A countertheme is like a counterfactual: a theme that might be dominant, even should be dominant, in an alternate universe where the dominant theme is not so dominant.)

The Problem

"The Problem of Rigor in Qualitative Research"—that's the title of a classic article (Sandelowski, 1986) and a common refrain in textbooks about research methods. The "problem," it turns out, is that by traditional and dominant definitions of rigor, qualitative methods are inferior. But different criteria for what constitutes methodological quality lead to different judgments about rigor, the central point of this chapter. "The 'problem of multiple standards' describes the inherent difficulties in selecting which, among many viable candidates, is the standard process to which performance should be compared" (Zelik, Patterson, & Woods, 2007, p. 2). Rigor begets credibility. Different criteria for what constitutes methodological quality and rigor will yield different judgments about credibility. That much is straightforward.

The larger problem, it seems to me, is the focus on methods and procedures as the basis for determining quality and rigor. The notion that methods are more or less rigorous decouples methods from context and the thinking process that determined what questions to ask, what methods to use,


what analytical procedures to follow, and what inferences to draw from the findings. Avoiding research rigor mortis requires rigorous thinking.

Rigorous Thinking

No problem can withstand the assault of sustained thinking.

—Voltaire (1694–1778)
French philosopher

Rigorous thinking combines (a) critical thinking, (b) creative thinking, (c) evaluative thinking, (d) inferential thinking, and (e) practical thinking. Critical thinking demands questioning assumptions; acknowledging and dealing with preconceptions, predilections, and biases; diligently looking for negative and disconfirming cases that don't fit the dominant pattern; conscientiously examining rival explanations; relentlessly seeking diverse perspectives; and analyzing what and how you think, why you think that way, and the implications for your inquiry (Kahneman, 2011; Klein, 2011; Loseke, 2013).

Creative thinking invites putting the data together in new ways to see the interactions among separate findings more holistically; synthesizing diverse themes in a search for coherence and essence while simultaneously developing comfort with ambiguity and uncertainty in the messy, complex, and dynamic real world; distinguishing signal from noise while also learning from the noise; asking wicked questions that enter into the intersections and tensions between the search for coherent meaning and persistent uncertainties and ambiguities; bringing artistic, evocative, and visualization techniques to data analysis and presentations; and inviting outside-the-box, off-the-wall, and beyond-the-ken perspectives and interpretations.

Evaluative thinking forces clarity about the inquiry purpose, who it is for, with what intended uses, to be judged by what quality criteria; it involves being explicit about what criteria are being applied in framing inquiry questions, making design decisions, determining what constitutes appropriate methods, and selecting and following analytical processes, and being aware of and articulating values, ethical considerations, contextual implications, strengths and weaknesses of the inquiry, and potential (or actual) misinterpretations, misuses, and misapplications. In contrast with the perspective of rigor as strict adherence to a standardized process, evaluative thinking emphasizes the importance of understanding the sufficiency of rigor relative to context and situational factors (Clarke, 2005; Patton, 2012a).

Inferential thinking involves examining the extent to which the evidence supports the conclusions reached. Inferential thinking can be deductive, inductive, or abductive—and often draws on and creatively integrates all three analytical processes—but at the core, it is a fierce examination of and allegiance to where the evidence leads.

A rigorously conducted evaluation will be convincing as a presentation of evidence in support of an evaluation's conclusions and will presumably be more successful in withstanding scrutiny from critics. Rigor is multifaceted and relates to multiple dimensions of the evaluation. . . . The concept of rigor is understood and interpreted within the larger context of validity, which concerns the "soundness or trustworthiness of the inferences that are made from the results of the information gathering process" (Joint Committee on Standards for Educational Evaluation, 1994, p. 145). . . . There is relatively broad consensus that validity is a property of an inference, knowledge claim, or intended use, rather than a property of a research or evaluation study or of the study's findings. (Braverman, 2013, p. 101)

In reflecting on and writing about "what counts as credible evidence in applied research and evaluation practice," Sharon Rallis (2009), former president of the AEA and experienced qualitative researcher, emphasized rigorous reasoning: "I have come to see a true scientist [italics added], then, as one who puts forward her findings and the reasoning that led her to those findings for others to contest, modify, accept, or reject" (p. 171).

Practical thinking calls for assiduously integrating theory and practice; examining real-world implications of findings; inviting interpretations and applications from nonresearchers (e.g., community members, program staff, and participants) who can and will apply to the data what ordinary people refer to as "common sense"; and applying real-world criteria to interpreting the findings, criteria like understandability, meaningfulness, cost implications, and implications in addressing societal issues and problems.

What's at Stake?

My words fly up, my thoughts remain below:
Words without thoughts, never to heaven go.

—William Shakespeare (1564–1616)
The king in Hamlet

As I noted in Chapter 4, and is worth repeating here, philosopher Hannah Arendt (1968) concluded that to resist efforts by the powerful to deceive and control their thinking, people need to practice thinking: "Experience

Copyright ©2015 by SAGE Publications, Inc. This work may not be reproduced or distributed in any form or by any means without express written permission of the publisher. Enhancing the Quality and Credibility of Qualitative Studies 703

in thinking . . . can be won, like all experience in doing than degree of adherence to an established analytic something, only through practice, through exercises” (p. 4). procedure. (Zelik, Patterson, & Woods, 2007, p. 1) Regardless of what one thinks of the U.S. invasion of Iraq to depose Saddam Hussein in 2003, both those who The phrase “degree of suciency” as a criterion for supported the war and those who opposed it ultimately assessing rigor refers to an evaluation of the extent to which agreed that the intelligence used to justify the invasion was a multidimensional, multiperspectival, and critical thinking deeply awed and systematically distorted (U.S. Senate Select process was followed determinedly to yield conclusions that Committee on Intelligence, 2004). Under intense political pressure to show sucient grounds for military action, those best t the data, and therefore ndings that are credible charged with analyzing and evaluating intelligence data to and inspire con dence among those who must use the began doing what is sometimes called cherry-picking or ndings. stove-piping—selecting and passing on only those data that support preconceived positions and ignoring or repressing all Bottom-Line Conclusion contrary evidence (Hersh, 2003; Tan, 2014; Zelik et al., 2007). The failure of the intelligence community to appropriately Methods do not ensure rigor. A research design does not and accurately assess whether Iraq had weapons of mass ensure rigor. Analytical techniques and procedures do not destruction was not a function of poor data but of weak ensure rigor. Rigor resides in, depends on, and is manifest in analysis, political manipulation of the analysis process, and a rigorous thinking—about everything, including methods and fundamental failure to think critically, creatively, evaluatively, analysis. and practically. 
The generation of the Rigor Attribute Model The thread that runs throughdistribute this rumination is the to support more rigorous intelligence analysis and restore importance of intellectual rigor. There are no simple credibility to the intelligence community focuses on rigorous formulas or clear-cut rules about how to do a credible, thinking (Zelik et al., 2007; see Exhibit 9.5, pp. 675–677). high-quality analysis. The task is to do one’s best to make sense of things.or A qualitative analyst returns to the data Despite the etymological implication that to be over and over again to see if the constructs, categories, rigorous is to “be sti,” expert information analysis interpretations, and explanations make sense—if they processes often are not rigid in their application of suciently reect the nature of the phenomena studied. a standard process, but rather, exible and adaptive Creativity, intellectual rigor, perseverance, insight—these to highly dynamic environments. In information are the intangibles that go beyond the routine application analysis, judgment of rigor reects a relationship in the of scienti c procedures. It is worth quoting again Nobel appropriateness of t between analytic processes and post,prize–winning physicist Percy Bridgman: “There is no contextual requirements. Thus, as supported by this scienti c method as such, but the vital feature of a scientist’s and other research, rigor is more meaningfully viewed procedure has been merely to do his utmost with his mind, as an assessment of degree of suciency, rather no holds barred” (quoted in Waller, 2004, p. 106). copy, Varieties of and Concerns About my net and the fish rush in. I quickly put the many fish Reactivity: How What We See and I’ve rescued on the dry ground, where they dance about in joy. But the dancing soon exhausts them and before Do Affects What Is Seen and Done long they cease to move. Alas, they dance themselves to death.” Nasrudin deniednot that he was a fisherman. 
From a passing tourist he had heard of something called phi- lanthropy and, feeling transformed by what he had “It is sad, but it is also wrong not to honor their strug- learned, he instantly adopted the moniker for himself. gle. So I take the dead fish to market where people HeDo explained to his fellow villagers: “When we see a contribute money to my effort to save more fish in problem that needs solving, it is wrong to just stand by exchange for my gifts to them of those fish who have and observe as scholars are wont to do. We must react. lost the struggle. With the financial tokens of appreci- It is wrong to remain passive and detached in the face ation I receive for my charitable work, I purchase more of need and noble to render help.” nets so I can rescue more fish.”

“I am a philanthropist. Each day I strive to help fish —From Halcolm’s Chronicles that are drowning in the lake. I save them. I throw out of Lessons Learned: Teach a Man to Fish

Copyright ©2015 by SAGE Publications, Inc. This work may not be reproduced or distributed in any form or by any means without express written permission of the publisher. 704 ANALYSIS, INTERPRETATION, AND REPORTING

Considering and Reporting Investigator Effects: Varieties of Reactivity

Reflectivity includes considering and reporting how your presence as an observer or evaluator may have affected what you observed. There are four primary ways in which the presence of an outside observer, or the fact that an evaluation is taking place, can affect, and possibly distort, the findings of a study, namely,

1. reactions of those in the setting (e.g., program participants and staff) to the presence of the qualitative fieldworker;

2. changes in you, the fieldworker (the measuring instrument), during the course of the data collection or analysis—that is, what has traditionally been called instrumentation effects in quantitative measurement;

3. the predispositions, selective perceptions, and/or biases you might bring to the inquiry that become evident to others during data collection; and

4. researcher incompetence (including lack of sufficient training or preparation).

SIDEBAR: IN-DEPTH REFLEXIVITY: GUIDELINES FOR QUALITY IN AUTOBIOGRAPHICAL FORMS OF SELF-STUDY RESEARCH

• Autobiographical self-studies should ring true and enable connection.
• Self-studies should promote insight and interpretation.
• Autobiographical self-study research must engage history forthrightly, and the author must take an honest stand.
• Authentic voice is a necessary but not sufficient condition for the scholarly standing of a biographical self-study.
• The autobiographical self-study researcher has an ineluctable obligation to seek to improve the learning situation not only for the self but also for the other.
• Powerful autobiographical self-studies portray character development and include dramatic action: Something genuine is at stake in the story.
• Quality autobiographical self-studies attend carefully to persons in context or setting.
• Quality autobiographical self-studies offer fresh perspectives on established truths.
• To be scholarship, edited conversation or correspondence must not only have coherence and structure but that coherence and structure should also provide argumentation and convincing evidence.
• Interpretations made of self-study data should not only reveal but also interrogate the relationships, contradictions, and limits of the views presented (adapted from Bullough & Pinnegar, 2001, pp. 13–21).

Reactivity

SIDEBAR: All accounts produced by researchers must be interpreted within the context in which they were generated. Interpretations must examine, as carefully as possible, how the presence of the researcher, the context in which data were obtained, and so on shaped the data. —Schwandt (2007, p. 256)

Problems of reactivity are well documented in the anthropological literature, which is one of the prime reasons why qualitative methodologists advocate long-term observations that permit an initial period during which observers and the people in the setting being observed get a chance to get used to each other. This increases trustworthiness, which supports credibility both within and outside the study setting.

The credibility of your findings and interpretations depend upon your careful attention to establishing trustworthiness. . . . Time is a major factor in the acquisition of trustworthy data. Time at your research site, time spent interviewing, and time building sound relationships with respondents all contribute to trustworthy data. When a large amount of time is spent with your research participants, they less readily feign behavior or feel the need to do so; moreover, they are more likely to be frank and comprehensive about what they tell you. (Glesne, 1999, p. 151)

On the other hand, prolonged engagement may actually increase reactivity as the researcher becomes more a part of the setting and begins to affect what goes on. Thus, whatever the length of inquiry or method of data collection, researchers have an obligation to examine how their presence affects what goes on and what is observed.

It is axiomatic that observers must record what they perceive to be their own reactive effects. They may treat this reactivity as bad and attempt to avoid it (which is impossible), or they may accept the fact that they will have a reactive effect and attempt to use it to advantage. . . . The reactive effect will be measured by daily field notes, perhaps by interviews in which the problem is pointedly inquired about, and also in daily observations. (Denzin, 1978b, p. 200)

Anxieties that surround an evaluation can exacerbate reactivity. The presence of an evaluator can affect how a program operates as well as its outcomes. The evaluator's presence may, for example, create a halo effect so that staff perform in an exemplary fashion and participants are motivated to "show off." On the other hand, the presence of the evaluator may create so much tension and anxiety that performances are below par. Some forms of program evaluation, especially "empowerment evaluation" and "intervention-oriented evaluation" (Patton, 2008, chap. 5), turn this traditional threat to validity into an asset by designing data collection to enhance achievement of the desired program outcomes. For example, at the simplest level, the observation that "what gets measured gets done" suggests the power of data collection to affect outcomes attainment. A leadership program, for example, that includes in-depth interviewing and participant journal writing as ongoing forms of evaluation data collection may find that participating in the interviewing and writing reflectively have effects on participants' learning and program outcomes. Likewise, a community-based AIDS awareness intervention can be enhanced by having community participants actively engaged in identifying and doing case studies of critical community incidents. In short, a variety of reactive responses are possible, some that support program processes, some that interfere, and many that have implications for interpreting findings. Thus, the evaluator has a responsibility to think about the problem, make a decision about how to handle it in the field, attempt to monitor evaluator/observer effects, and reflect on how reactivities may have affected the findings.

Evaluator effects can be overrated, particularly by evaluators. There is more than a slight touch of self-importance in some concerns about reactivity. Lillian Weber, director of the Workshop Center for Open Education, City College School of Education, New York, once set me straight on this issue, and I pass her wisdom on to my colleagues. In doing observations of open classrooms, I was concerned that my presence, particularly the way kids flocked around me as soon as I entered the classroom, was distorting the evaluation to the point where it was impossible to do good observations. Lillian laughed and suggested to me that what I was experiencing was the way those classrooms actually were. She went on to note that this was common among visitors to schools; they were always concerned that the teacher, knowing visitors were coming, whipped the kids into shape for those visitors. She suggested that under the best of circumstances a teacher might get kids to move out of habitual patterns into some model mode of behavior for as much as 10 or 15 minutes but that, habitual patterns being what they are, the kids would rapidly revert to normal behaviors and whatever artificiality might have been introduced by the presence of the visitor would likely become apparent.

Evaluators and researchers should strive to neither overestimate nor underestimate their effects but to take seriously their responsibility to describe and study what those effects are.

Effects on the Inquirer of Being Engaged in the Inquiry

A second form of reactivity arises from the possibility that the researcher or evaluator changes during the course of the inquiry. In Chapter 7, on interviewing, I offered several examples of this, including how in a study of child sexual abuse, those involved were deeply affected by what they heard. One of the ways this sometimes happens in anthropological research is when participant observers "go native" and become absorbed into the local culture. The epitome of this in a short-term observation is the legendary story of the student observers who became converted to Christianity while observing a Billy Graham evangelical crusade (Lang & Lang, 1960). Evaluators sometimes become personally involved with program participants or staff and therefore lose their sensitivity to the full range of events occurring in the setting.

Johnson (1975) and Glazer (1972) have reflected on how they and others have been changed by doing field research. The consensus of advice on how to deal with the problem of changes in observers as a result of involvement in research is similar to advice about how to deal with the reactive effects created by the presence of observers.

It is central to the method of participant observation that changes will occur in the observer; the important point, of course, is to record these changes. Field notes, introspection, and conversations with informants and colleagues provide the major means of measuring this dimension, . . . for to be insensitive to shifts in one's own attitudes opens the way for placing naive interpretations on the complex set of events under analysis. (Denzin, 1978b, p. 200)

Inquirer-Selective Perception and Predispositions

The third concern about inquirer effects related to credibility has to do with the extent to which the predispositions or biases of the inquirer may affect data analysis and interpretations. This issue carries mixed messages because, on the one hand, rigorous data collection and analytical procedures, like triangulation, are aimed at substantiating the credibility of the findings and minimizing inquirer biases and, on the other, the interpretative and constructivist perspectives remind us that data from and about humans inevitably represent some degree of perspective rather than absolute truth. Getting close enough to the situation observed to experience it firsthand means that researchers can learn from their experiences, thereby generating personal insights; but that closeness makes their objectivity suspect. "For social scientists to refuse to treat their own behavior as data from which one can learn is really tragic" (Scriven, 1972a, p. 99). In effect, all of the procedures for validating and verifying analysis that have been presented in this chapter are aimed at reducing distortions introduced by inquirer predisposition. Still, people who use different criteria in determining evidential credibility will come at this issue from different stances and end up with different conclusions.

Consider the interviewing stance of empathic neutrality introduced in Chapter 2 and elaborated in Chapter 7. An empathically neutral inquirer will be perceived as caring about and interested in the people being studied but neutral about the content of what they reveal. House (1977) balances the caring, interested stance against independence and impartiality for evaluators, a stance that also applies to those working according to the standards of traditional science.

The evaluator must be seen as caring, as interested, as responsive to the relevant arguments. He must be impartial rather than simply objective. The impartiality of the evaluator must be seen as that of an actor in events, one who is responsive to the appropriate arguments but in whom the contending forces are balanced rather than non-existent. The evaluator must be seen as not having previously decided in favor of one position or the other. (pp. 45–46)

But neutrality and impartiality are not easy stances to achieve. Denzin (1989b) cites a number of scholars who have concluded, as he does, that every researcher brings preconceptions and interpretations to the problem being studied, regardless of the methods used.

All researchers take sides, or are partisans for one point of view or another. Value-free interpretive research is impossible. This is the case because every researcher brings preconceptions and interpretations to the problem being studied. The term hermeneutical circle or situation refers to this basic fact of research. All scholars are caught in the circle of interpretation. They can never be free of the hermeneutical situation. This means that scholars must state beforehand their prior interpretations of the phenomenon being investigated. Unless these meanings and values are clarified, their effects on subsequent interpretations remain clouded and often misunderstood. (p. 23)

Earlier I presented seven sets of criteria for judging the quality of qualitative inquiry (Exhibit 9.7, pp. 680–681). Those varying and competing frameworks offer different perspectives on how inquirers should deal with concerns about bias. Neutrality and impartiality are expected when qualitative work is being judged by traditional scientific criteria or by evaluation standards, thus the source of House's (1977) admonition quoted above. In contrast, constructivist analysts are expected to deal with these issues through conscious and committed reflexivity—entering the hermeneutical circle of interpretation and therein reflecting on and analyzing how their perspective interacts with the perspectives they encounter. Artistic inquirers often deal with issues of how they personally relate to their work by invoking aesthetic criteria: Judge the work on its artistic merits. Participatory and collaborative inquiries encourage the formation of meaningful and trusting relationships between researchers and those participating in the inquiry. When critical change criteria are applied in judging reactivity, the issue becomes whether, how, and to what extent the inquiry furthered the cause or enhanced the well-being of those involved and studied; neutrality is eschewed in favor of explicitly using the inquiry process to facilitate change, or at least illuminate the conditions needed for change.

Inquirer Competence

Concerns about the extent to which the inquirer's findings can be trusted—that is, trustworthiness—can be understood as one dimension of perceived methodological rigor. But ultimately, for better or worse, the trustworthiness of the data is tied directly to the trustworthiness of those who collect and analyze the data—and their demonstrated competence. Competence is demonstrated by using the verification and validation procedures necessary to establish the quality of analysis and thereby building a "track record" of quality work. As Exhibit 9.10 shows, inquirer competence includes not just systematic inquiry knowledge and skill but also interpersonal competence, reflective practice skills, situational analysis, professional practice competence, and project management. This array of competencies is being acknowledged and certified by professional evaluation associations around the world (King & Podems, 2014; Podems, 2014a). Consistent with the overall message of this chapter, especially my MQP Rumination on avoiding research rigor mortis, thinking skills also need ongoing development. An excellent resource in that regard is the Critical Evaluation Skills Toolkit (Crebert, Patrick, Cragnolini, Smith, Worsfold, & Webb, 2011).

EXHIBIT 9.10 The Multiple Dimensions of Program Evaluator Competence

Essential Competencies for Program Evaluators (figure showing six interlocking dimensions): Professional Practice; Systematic Inquiry; Situational Analysis; Reflective Practice; Interpersonal Competence; Project Management.

SOURCES: Ghere, King, Stevahn, and Minnema (2006) and King, Stevahn, Ghere, and Minnema (2001).

The principle for dealing with inquirer competence is this: Don't wait to be asked. Anticipate competence as an issue. Address the issue of competence proactively, explicitly, and multidimensionally. With quantitative methods, validity and reliability reside in tools, instruments, design parameters, and procedures. In qualitative inquiry, the competency stakes are greater because the inquirer is the instrument. Trustworthiness and authenticity are functions of systematic inquiry procedures, interpersonal (relational) dynamics in the field, and competency to engage in the challenges and deal with the ambiguities of qualitative inquiry.

Review: The Credibility of the Inquirer

Because the researcher is the instrument in qualitative inquiry, the credibility of the inquirer is central to the credibility of the study. Exhibit 9.11 on the next page summarizes the issues that arise in establishing and judging the credibility of the inquirer.


EXHIBIT 9.11 The Credibility of the Inquirer: Issues and Solutions

(Each entry below gives the credibility concern, the issue, ways to address the issue and enhance credibility, and examples from Nora Murphy's 2014 study of homeless youth.)

1. Who did the study? Who is the inquirer?
Issue: Because the researcher is the instrument in qualitative inquiry, who did the work and carried out the analysis matters.
Ways to address: The methodology section of the report should present not only the usual design and data collection details and rationale but also description of the inquirer's relevant experiences, training, perspective, competence, and purpose.
Example: A University of Minnesota doctoral student on the staff of the Minnesota Evaluation Studies Institute, who has completed doctoral studies in evaluation theory, methods, and practice and is being supervised by experienced and knowledgeable researchers and evaluators, did the study.

2. Reflexivity
Issue: How has the inquirer's background and perspective affected the findings?
Ways to address: Reflexivity goes beyond reporting background, experience, and training; it involves reflecting on and reporting your reflexive process and the answers to reflexive questions: How do you know what you know? What shapes and has shaped your perspective? (See Exhibit 2.5, p. 72.)
Example: "I am a constructivist working from a systems perspective. My evaluation approach is utilization focused, and I'm using developmental evaluation because it fits the dynamics, complexities, and developmental nature of the initiative being evaluated. I have worked as a teacher and program staff member with disadvantaged youth."

3. Potential inquirer bias
Issue: How might the findings be a function of the inquirer's selective perception, predispositions, and bias? What steps have been taken to deal with potential bias?
Ways to address (options, not mutually exclusive):
a. Acknowledge potential sources of bias: What brought you to this inquiry? Why do you care about what you're studying? What are the implications of caring? How do you deal with concerns about bias (which you acknowledge as legitimate)?
b. Acknowledge your perspective and present it as a strength: "I am a constructivist and view getting close enough to people to experience empathy as a strength of in-depth fieldwork and interviewing."
c. Describe your process for surfacing and setting aside any preconceptions (e.g., epoche in phenomenological analysis).
d. Subject your analysis to independent review (analyst triangulation, peer review, or external audit).
Example: "I care about homeless youth and believe they deserve an opportunity to move on in their life's journey past their period of homelessness. I believe in and subscribe to the values and principles expressed by the programs participating in the evaluation. I want to help them better elucidate and implement those principles. I also want evaluation to be a vehicle for giving voice to homeless young people, to honor their stories, and help them articulate what they've experienced." The study is being done collaboratively with six youth-serving agencies, which have monitored the appropriateness and integrity of the data collection and analysis. The study is supervised by experienced researchers and evaluators who monitor the integrity of the methods and analysis.


4. Reactivity
Issue: How has the inquiry affected the people in the setting studied?
Ways to address (options, not mutually exclusive; demonstrate awareness of the issue and take it seriously):
a. Keep field notes on your observations about how your presence may have, or actually did appear to, affect things. Describe effects and their implications for your findings.
b. Gather data about reactions; ask key informants how your presence affected the setting observed and people interviewed.
Examples (from Nora Murphy's 2014 study of homeless youth): (a) Youth interviewed were compensated for their time. They reviewed and approved the case studies created from their interview transcripts. They expressed appreciation for the opportunity to tell their stories. (b) The six participating agencies report having strengthened their collaboration with each other as a result of being part of this study (which was the intent). They reported learning from the experience, feeling that their work and approach was validated, and are using the findings for staff development in their agencies.

5. Effects on the inquirer of involvement in the inquiry
Issue: How were you affected or changed by engaging in this inquiry?
Ways to address: Reflect on and report what you've seen and how it has affected you. Acknowledge emotional responses and any actions taken. Acknowledge that as the research instrument, you are also a human being—and report honestly your human responses.
Example: "This study took a personal toll. There were times when I went home and cried. I felt guilt that I could not help them more and fear that I was exploiting them. What helped me was that listening seemed to help them. I still carry some of the sadness that I experienced as I sat with the youth and their telling of their lives, but I also carry the hope that I felt when I experienced their optimism and their strength."

6. Competence
Issue: How can I, the reader and user of your findings, be assured of your competence to undertake this inquiry?
Ways to address: Acknowledge the importance of competence and its multiple dimensions (see Exhibit 9.10), and report on your competence in these areas.
Example: The entire collaboration engaged in reflective practice together. Confidentiality, rapport, and trust were essential in interviewing the youth. Sensitivity to race, gender orientation, and the trauma experienced by homeless youth were monitored by the participating agencies. The methods section of the study addresses these and related issues in depth.
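Exhibit 9.11 recommends subjecting analysis to independent review such as analyst triangulation. One common way to quantify how well two independent coders agree, once each has assigned a categorical code to the same set of passages, is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below is illustrative only: the kappa formula is standard, but the coder data and code labels are invented for the example and come from neither this chapter nor Murphy's study.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders who each assigned
    one categorical code to the same sequence of items."""
    assert len(coder_a) == len(coder_b) and coder_a
    n = len(coder_a)
    # Observed agreement: fraction of items coded identically.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected chance agreement, from each coder's marginal code frequencies.
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two analysts to six interview passages.
analyst_1 = ["hope", "fear", "hope", "trust", "fear", "hope"]
analyst_2 = ["hope", "fear", "trust", "trust", "fear", "hope"]
print(round(cohens_kappa(analyst_1, analyst_2), 2))  # prints 0.75
```

A kappa of 1.0 indicates perfect agreement and values near 0 indicate agreement no better than chance. In the spirit of this chapter, a low kappa is best treated not as a verdict but as a prompt for the analysts to discuss where and why their interpretations diverged.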

MODULE 81

Generalizations, Extrapolations, Transferability, Principles, and Lessons Learned

The trouble with generalizations is that they don't apply to particulars.

—Lincoln and Guba (1985, p. 110)

Credibility and utility are linked. What can one do with qualitative findings? The results illuminate a particular situation or small number of cases. But can qualitative findings be generalized? Here, again, different qualitative frameworks based on different criteria offer different answers. The traditional scientific research criteria include generalizability. Constructivist criteria, in contrast, emphasize particularity; constructivists generally eschew, and are skeptical about, generalizability. They offer extrapolations and transferability instead. So let's see if we can sort out these different perspectives and their implications.

Purposeful Sampling and Generalizability

Chapter 5 discussed the logic and value of purposeful sampling with small but carefully selected information-rich cases. Certain kinds of small samples and qualitative studies are designed for generalizability and broader relevance: a critical case, an index case, a causal pathway sample, a positive deviance case, and a qualitative synthesis review are examples (see Exhibit 5.8, pp. 266–272). Other sampling strategies, for example, outlier cases (exemplars of excellence or failure), a high-impact case, sensitizing concept exemplars, and principles-focused sampling, aim to yield insights about principles that might be adapted for application elsewhere. In short, the conditions for, possibility of, and relative importance attached to generalizability are determined at the design stage. To review: Purpose drives design. Design drives data collection. Data drive analysis. Purpose, design, data, and analysis, in combination, determine generalizability.

Principles of Generalizability

Shadish (1995a) has made the case that certain core principles of generalization apply to both experiments and ethnographies (or qualitative methods generally). Both experiments and case studies share the problem of being highly localized. Findings from a study, whether experimental or naturalistic in design, can be generalized according to five principles:

1. The principle of proximal similarity: We generalize most confidently to applications where treatments, settings, populations, outcomes, and times are most similar to those in the original research. . . .

2. The principle of heterogeneity of irrelevancies: We generalize most confidently when a research finding continues to hold over variations in persons, settings, treatments, outcome measures, and times that are presumed to be conceptually irrelevant. The strategy here is identifying irrelevancies, and where possible including a diverse array of them in the research so as to demonstrate generalization over them. . . .

3. The principle of discriminant validity: We generalize most confidently when we can show that it is the target construct, and not something else, that is necessary to produce a research finding. . . .

4. The principle of empirical interpolation and extrapolation: We generalize most confidently when we can specify the range of persons, settings, treatments, outcomes, and times over which the finding holds more strongly, less strongly, or not at all. The strategy here is empirical exploration of the existing range of instances to discover how that range might generate variability in the finding for instances not studied. . . .

5. The principle of explanation: We generalize most confidently when we can specify completely and exactly (a) which parts of one variable (b) are related to which parts of another variable (c) through which mediating processes (d) with which salient interactions, for then we can transfer only those essential components to the new application to which we wish to generalize. The strategy here is breaking down the finding into component parts and processes so as to identify the essential ones. (pp. 424–426)

Generalizability Versus Contextual Particularity

Deep philosophical and epistemological issues are embedded in concerns about generalizing. What's desirable or hoped for in science (generalizations across

Copyright ©2015 by SAGE Publications, Inc. This work may not be reproduced or distributed in any form or by any means without express written permission of the publisher.

SIDEBAR: CULTURAL LIMITS ON GENERALIZABILITY

Psychological experiments have been used to study how people react to things like negotiating rewards and perceptions of whether two lines are of equal length when phony participants in the experiment say that the shorter line is really longer. The results of most such laboratory research have been interpreted as showing "evolved psychological traits common to all humans" (Watters, 2013, p. 1). However, when such experiments are repeated in other cultures, the ways in which Americans respond can be quite different from how nonliterate peoples respond. What were once thought to be tests of basic perception (how the brain works) have turned out to be culturally determined.

Social scientists had assumed that lab experiments studied

the human mind stripped of culture, [that] the human brain is genetically comparable around the globe, it was agreed, so human hardwiring for much behavior, perception, and cognition should be similarly universal. No need, in that case, to look beyond the convenient population of undergraduates for test subjects. A 2008 survey of the top six psychology journals dramatically shows how common that assumption was: more than 96 percent of the subjects tested in psychological studies from 2003 to 2007 were Westerners—with nearly 70 percent from the United States alone. Put another way: 96 percent of human subjects in these studies came from countries that represent only 12 percent of the world's population. (Watters, 2013, p. 1)

Cross-cultural research is now revealing that the mind's capacity to mold itself to cultural and environmental settings was far greater than had been assumed.

The most interesting thing about cultures may not be in the observable things they do—the rituals, eating preferences, codes of behavior, and the like—but in the way they mold our most fundamental conscious and unconscious thinking and perception. (Watters, 2013, p. 1)

Moreover, the experiments done on American undergraduate students may be especially prone to inappropriate overgeneralizations.

It is not just our Western habits and cultural preferences that are different from the rest of the world, it appears. The very way we think about ourselves and others—and even the way we perceive reality—makes us distinct from other humans on the planet, not to mention from the vast majority of our ancestors. Among Westerners, the data showed that Americans were often the most unusual, leading the researchers to conclude that "American participants are exceptional even within the unusual population of Westerners—outliers among outliers."

Given the data, they concluded that social scientists could not possibly have picked a worse population [American undergraduate students] from which to draw broad generalizations. Researchers had been doing the equivalent of studying penguins while believing that they were learning insights applicable to all birds. (Watters, 2013, p. 1)

time and space) runs into real-world considerations about what's possible. Lee J. Cronbach (1975), one of the major figures in psychometrics and research methodology in the twentieth century, devoted considerable attention to the issue of generalizations. He concluded that social phenomena are too variable and context bound to permit very significant empirical generalizations. He compared generalizations in natural sciences with what was likely to be possible in behavioral and social sciences. His conclusion was that "generalizations decay. At one time a conclusion describes the existing situation well, at a later time it accounts for rather little variance, and ultimately is valid only as history" (p. 122).

Cronbach (1975) offers an alternative to generalizing that constitutes excellent advice for the qualitative analyst:

Instead of making generalization the ruling consideration in our research, I suggest that we reverse our priorities. An observer collecting data in a particular situation is in a position to appraise a practice or proposition in that setting, observing effects in context. In trying to describe and account for what happened, he will give attention to whatever variables were controlled, but he will give equally careful attention to uncontrolled conditions, to personal characteristics, and to events that occurred during treatment and measurement. As he goes from situation to situation, his first task is to describe and interpret the effect anew in each locale, perhaps taking into account factors unique to that locale or series of events. . . . When we give proper weight to local conditions, any generalization is a working hypothesis, not a conclusion. (pp. 124–125)

Robert Stake (1978, 1995, 2000, 2006, 2010), master of the case study, concurs with Cronbach that the first priority is to do justice to the specific case, to do a good job of "particularization" before looking for patterns across cases. He quotes William Blake on the subject:


To generalize is to be an idiot. To particularize is the lone distinction of merit. General knowledges are those that idiots possess.

Stake (1978) continues,

Generalization may not be all that despicable, but particularization does deserve praise. To know particulars fleetingly, of course, is to know next to nothing. What becomes useful understanding is a full and thorough knowledge of the particular, recognizing it also in new and foreign contexts. That knowledge is a form of generalization too, not scientific induction but naturalistic generalization, arrived at by recognizing the similarities of objects and issues in and out of context and by sensing the natural covariations of happenings. To generalize this way is to be both intuitive and empirical, and not idiotic. (p. 6)

Stake (2000) extends naturalistic generalizations to include the kind of learning that readers take from their encounters with specific case studies. The "vicarious experience" that comes from reading a rich case account can contribute to the social construction of knowledge, which, in a cumulative sense, builds general, if not necessarily generalizable, knowledge.

Readers assimilate certain descriptions and assertions into memory. When the researcher's narrative provides opportunity for vicarious experience, readers extend their memories of happenings. Naturalistic, ethnographic case materials, to some extent, parallel actual experience, feeding into the most fundamental processes of awareness and understanding . . . [to permit] naturalistic generalizations. The reader comes to know some things told, as if he or she had experienced it. Enduring meanings come from encounter, and are modified and reinforced by repeated encounter.

In life itself, this occurs seldom to the individual alone but in the presence of others. In a social process, together they bend, spin, consolidate, and enrich their understandings. We come to know what has happened partly in terms of what others reveal as their experience. The case researcher emerges from one social experience, the observation, to choreograph another, the report. Knowledge is socially constructed, so we constructivists believe, and, in their experiential and contextual accounts, case study researchers assist readers in the construction of knowledge. (p. 442)

Guba (1978) considered three alternative positions that might be taken in regard to the generalizability of naturalistic inquiry findings:

1. Generalizability is a chimera; it is impossible to generalize in a scientific sense at all. . . .

2. Generalizability continues to be important, and efforts should be made to meet normal scientific criteria that pertain to it. . . .

3. Generalizability is a fragile concept whose meaning is ambiguous and whose power is variable. (pp. 68–70)

Having reviewed these three positions, Guba (1978) proposed a resolution that recognizes the diminished value and changed meaning of generalizations and echoes Cronbach's emphasis, cited above, on treating conclusions as hypotheses for future applicability and testing rather than as definitive.

The evaluator should do what he can to establish the generalizability of his findings. . . . Often naturalistic inquiry can establish at least the "limiting cases" relevant to a given situation. But in the spirit of naturalistic inquiry he should regard each possible generalization only as a working hypothesis, to be tested again in the next encounter and again in the encounter after that. For the naturalistic inquiry evaluator, premature closure is a cardinal sin, and tolerance of ambiguity a virtue. (p. 70)

Guba and Lincoln (1981) emphasized appreciation of and attention to context as a natural limit to naturalistic generalizations. They ask, "What can a generalization be except an assertion that is context free? [Yet] it is virtually impossible to imagine any human behavior that is not heavily mediated by the context in which it occurs" (p. 62). They proposed substituting the concepts "transferability" and "fittingness" for generalization when dealing with qualitative findings:

The degree of transferability is a direct function of the similarity between the two contexts, what we shall call "fittingness." Fittingness is defined as degree of congruence between sending and receiving contexts. If context A and context B are "sufficiently" congruent, then working hypotheses from the sending originating context may be applicable in the receiving context. (Lincoln & Guba, 1985, p. 124)

Cronbach (1980) offered a middle ground in the debate over generalizability. He found little value in experimental designs that are so focused on carefully controlling cause and effect (internal validity) that the findings are largely irrelevant beyond that highly controlled experimental situation (external validity). On the other hand, he was equally concerned about entirely


idiosyncratic case studies that yield little of use beyond the case study setting. He was also skeptical that highly specific empirical findings would be meaningful under new conditions. He suggested instead that designs balance depth and breadth, realism and control so as to permit reasonable "extrapolation" (pp. 231–235).

Extrapolation

Unlike the usual meaning of the term generalization, an extrapolation clearly connotes that one has gone beyond the narrow confines of the data to think about other applications of the findings. Extrapolations are modest speculations on the likely applicability of findings to other situations under similar, but not identical, conditions. Extrapolations are logical, thoughtful, case derived, and problem oriented rather than statistical and probabilistic.

Distinguished methodologist Thomas D. Cook (2014) has explained the nature and significance of extrapolation.

Informing future policy decisions also requires justified procedures for extrapolating past findings to future periods when the populations of treatment providers and recipients might be different, when adaptations of a previously studied treatment might be required, when a novel outcome is targeted, when the application might be to situations different from earlier, and when other factors affecting the outcome are novel too. We call this the extrapolation function since inferences are required about populations and categories that are now in some ways different from the sampled study particulars. Sampling theory cannot even pretend to deal with the framing of causal generalization as extrapolation since the emphasis is on taking observed causal findings and projecting them beyond the observed sampling specifics.

We argue here that both representation and extrapolation are part of a broad and useful understanding of external validity; that each has been quite neglected in the past relative to internal validity—namely, whether the link between manipulated treatments and observed effects is plausibly causal; that few practical methods exist for validly representing the populations and other constructs sampled in the existing literature; and that even fewer such methods exist for extrapolation. Yet, causal extrapolation is more important for the policy sciences, I argue, than is causal representation. (p. 527)

Extrapolations can be particularly useful when based on information-rich samples and designs—that is, studies that produce relevant information carefully targeted to specific concerns about both the present and the future. Users of evaluation, for example, will usually expect evaluators to thoughtfully extrapolate from their findings in the sense of pointing out lessons learned and potential applications to future efforts. Sampling strategies in qualitative evaluations can be planned with the stakeholders' desire for extrapolation in mind.

SIDEBAR: TESTING THEORY FROM A PURPOSEFUL SAMPLE OF QUALITATIVE CASES TO GENERALIZE: A CLASSIC CASE EXAMPLE

Sociologist Alfred Lindesmith (1905–1991), Indiana University, wanted to test his theory about addiction to opiate drugs. The theory posited that people became addicted to opium, morphine, or heroin when they took the drug often enough and in sufficient quantity to develop physical withdrawal. But Lindesmith had observed that people become habituated to opiates in a hospital when medicated for pain yet do not manifest the junkie behavior of compulsively searching for drugs at almost any cost after hospitalization. He hypothesized that two other things had to happen: Having become habituated, the potential addict now had to (1) stop using drugs and experience the painful withdrawal symptoms that resulted and (2) consciously connect withdrawal distress with ceasing drug use, a connection not everyone made. Junkies, unlike former hospital patients, then had to act on that realization and take more drugs to relieve the symptoms. Those steps, taken together and taken repeatedly, create the compulsive activity that is addiction.

A well-known statistician criticized Lindesmith's sample because he had generalized to a large population (all the addicts in the United States or in the world) from a small, purposefully selected sample rather than studying a random sample. Lindesmith replied that the purpose of random sampling was to ensure that every case had a known probability of being drawn for a sample and that researchers randomize to permit generalizations about distributions of some phenomenon in a population and in subgroups in a population. But, he argued, random sampling was irrelevant to his research on addicts because he was interested not in distributions but in a universal process—how one became and remained an addict. He didn't want to know the probability that any particular case would be chosen for his sample. He wanted to maximize the probability of finding a negative case so as all the better to test the theory. Not finding disconfirming cases strengthened his confidence in generalizing his findings.

—Adapted from Becker (1998, pp. 86–87)


High-Quality Lessons Learned

The notion of identifying and articulating "lessons learned" has become popular as a way of extracting useful and actionable knowledge from cross-case analyses. Rather than being stated in the form of traditional scientific empirical generalizations, lessons learned take the form of principles of practice that must be adapted to particular settings in which the principle is to be applied. For example, a lesson learned from research on evaluation use is that evaluation use will likely be enhanced by designing an evaluation to answer the focused questions of specific primary intended users (Cousins & Bourgeois, 2014; Patton, 2008).

Ricardo Millett, former Director of Evaluation at the W. K. Kellogg Foundation, and I analyzed the lessons-learned sections of grantee evaluation reports. What we found was massive confusion and inconsistency. Listed under the heading "lessons" were findings, ideas, visions, and recommendations—but seldom lessons. Exhibit 9.12 provides examples of what we found.

EXHIBIT 9.12 Confusion About What Constitutes a Lesson Learned

A lesson, in the context of extracting useable knowledge from findings, takes the form of an if . . . then proposition that provides direction for future action in the real world.

Lesson about evaluation use. If you actively involve intended users in designing an evaluation to ensure its relevance, they are more likely to be interested in and actually use the findings.

This lesson meets two criteria: (1) it is based on evidence from studies of evaluation use (Cousins & Bourgeois, 2014; Patton, 2008) and (2) it provides guidance for future action (an extrapolation from past evidentiary patterns to future desired outcomes). A lesson provides guidance, but it is different from a law, a recipe, or a theoretical proposition.

A physical law. If you heat water to 100 degrees Celsius at sea level, it will boil.

A recipe. Place a cup of oats in two cups of water, add a pinch of salt, and boil for five minutes. Remove from heat, and leave covered for two minutes. It is then ready to serve.

A theoretical proposition. It describes how the world works, as with natural selection: If a mutation provides a reproductive advantage that is heritable, over many generations that trait will become dominant in the population.

Using the definition of lesson and these distinctions, here is a sample of statements from evaluation reports illustrating confusion about what constitutes a lesson—and a lesson learned.

STATEMENT REPORTED UNDER THE HEADING "LESSONS" IN EVALUATION REPORTS / WHAT THE STATEMENTS ACTUALLY ARE

1. "Students whose parents helped them with homework got higher grades than those who did not get such help at home."
This is a finding. The lesson remains implicit and unexpressed.

2. "One size doesn't fit all."
This is a conclusion (based on findings that different people in a program wanted and needed different things), but the conditions to which this conclusion applies (the "if" statement) and what will result (the "then" statement) are implicit.

3. "There are no workarounds powerful enough to compensate for a failing educational system."
This is an opinion based on negative findings from an evaluation about a single program. It is a gross overgeneralization born of frustration and skepticism, but it lacks both supporting evidence and guidance about what to do in any applicable and useful manner.

4. "Be sure to provide daycare when you hold community meetings."
This is a recommendation. It prescribes a quite specific action, but both the basis for the recommendation and the outcome that will follow its implementation are implicit.


5. "Be prepared: By failing to prepare, you are preparing to fail."
This is an aphorism, no doubt wise, and certainly oft cited since published by Ben Franklin and adopted by the Boy Scouts, but reporting it as a central lesson in an evaluation report might at least include some acknowledgment that the observation has a long and distinguished history. (The report in which this appeared had nine others of like sentiment and no lessons original to or grounded in the actual evaluation done.)

6. "Lesson learned: Take time to do reflective practice."
This, too, is a recommendation, but it invites introduction of a useful distinction between "lessons" and "lessons learned." A lesson is a cognitive insight or understanding that if you do a certain thing, a certain result is likely to follow. A lesson is not "learned" until it is put into practice (behavioral change).

7. "We will never stop working to make the world a better place."
This is a visionary promise, an organizational commitment, and an inspirational reassurance to actual or potential funders. It is not a lesson.

8. If you want to formulate a meaningful and useful lesson that provides guidance for future action, then learn what a lesson is (as distinct from a finding, conclusion, opinion, recommendation, aphorism, or vision).
That's a lesson. If you put that lesson into action, you will have a lesson learned.

High-Quality Lessons

As we looked at examples of "lessons" listed in a variety of evaluation reports, it became clear that the label was being applied to any kind of insight, evidentially based or not. We began thinking about what would constitute "high-quality lessons" and decided that one's confidence in the transferability or extrapolated relevance of a supposed lesson would increase to the extent to which it was supported by multiple sources (triangulation). Exhibit 9.13 on the next page presents a list of kinds of evidence that could be accumulated to support a proposed lesson, making it more worthy of application and adaptation to new settings if it has triangulated support from a variety of perspectives and data sources. Questions for generating lessons learned are also listed. Thus, for example, the lesson that designing an evaluation to answer the focused questions of specific primary intended users enhances evaluation use is supported by research on use, theories about diffusion of innovation and change, practitioner wisdom, cross-case analyses of use, the profession's articulation of standards, and expert testimony. High-quality lessons, then, constitute guidance extrapolated from multiple sources and independently triangulated to increase transferability as cumulative knowledge and working hypotheses that can be adapted and applied to new situations. This is a form of pragmatic utilitarian generalizability, if you will. The pragmatic bias in this approach reflects the wisdom of Samuel Johnson: "As gold which he cannot spend will make no man rich, so knowledge which he cannot apply will make no man wise."

Principles and Types of Learnings

Principles are lessons expressed more generically, taken to a higher level of generalizability, and stated in a more direct and less contingent manner.

Lesson about evaluation use: If you actively involve intended users in designing an evaluation to ensure its relevance, they are more likely to be interested in and actually use the findings.

Principle to enhance evaluation use: Form and nurture a relationship with primary intended users built around their information needs and intended uses of the evaluation.

Principles are built from lessons that are based on evidence about how to accomplish some desired result. Qualitative inquiry is an especially productive way


EXHIBIT 9.13 High-Quality Lessons Learned

High-quality lessons learned. Knowledge that can be applied to future action and derived from multiple sources of evidence (triangulation):

1. Evaluation findings—patterns across programs
2. Basic and applied research
3. Practice wisdom and experience of practitioners
4. Experiences reported by program participants/clients/intended beneficiaries
5. Expert opinion
6. Cross-disciplinary findings and patterns

The idea is that the greater the number and quality of supporting sources for a "lesson," the more rigorous the supporting evidence, and the greater the triangulation of supporting sources, the more confidence one has in the significance and meaningfulness of the lesson. Lessons promulgated with only one type of supporting evidence would be considered a "lessons" hypothesis. Nested within and cross-referenced to lessons should be the actual cases from which practice wisdom and evaluation findings have been drawn. A critical principle here is to maintain the contextual frame for lessons—that is, to keep lessons grounded in their context.

For ongoing learning, the trick is to follow future supposed applications of lessons to test their wisdom and relevance over time in action in new settings. If implemented and validated, they become high-quality lessons learned.

Questions for Generating High-Quality Lessons Learned

1. What is meant by a "lesson"?
2. What is meant by "learned"?
3. By whom was the lesson learned?
4. What's the evidence supporting each lesson?
5. What's the evidence the lesson was learned?
6. What are the contextual boundaries around the lesson (i.e., under what conditions does it apply)?
7. Is the lesson specific, substantive, and meaningful enough to guide practice in some concrete way?
8. Who else is likely to care about this lesson?
9. What evidence will they want to see?
10. How does this lesson connect with other "lessons"?
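The triangulation logic of Exhibit 9.13 can be read as a simple decision rule, sketched below as illustrative code only (Patton presents no code). The six evidence categories and the rule that a single type of support yields only a "lessons" hypothesis come from the exhibit; the function name `appraise_lesson` and the two-category cutoff are hypothetical simplifications.

```python
# Illustrative sketch of Exhibit 9.13's triangulation rule.
# Categories mirror the exhibit's six kinds of supporting evidence.
EVIDENCE_TYPES = {
    "evaluation_findings",     # 1. patterns across programs
    "research",                # 2. basic and applied research
    "practice_wisdom",         # 3. practitioner experience
    "participant_experience",  # 4. participants/clients/beneficiaries
    "expert_opinion",          # 5. expert opinion
    "cross_disciplinary",      # 6. cross-disciplinary findings and patterns
}

def appraise_lesson(supporting_sources):
    """Classify a proposed lesson by the diversity of its triangulated support."""
    recognized = set(supporting_sources) & EVIDENCE_TYPES
    if len(recognized) <= 1:
        # Exhibit 9.13: only one type of supporting evidence
        # makes the statement a "lessons" hypothesis, not a lesson.
        return "lessons hypothesis"
    # More, and more diverse, sources warrant more confidence
    # (the exact cutoff here is a hypothetical simplification).
    return "high-quality lesson candidate"

print(appraise_lesson({"expert_opinion"}))
print(appraise_lesson({"evaluation_findings", "research", "practice_wisdom"}))
```

This is only a schematic aid for thinking through the exhibit; in Patton's account, judgments about the quality of each source and the contextual boundaries of the lesson matter as much as any count of evidence types.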

to generate lessons and principles precisely because purposeful sampling of information-rich cases, systematically and diligently analyzed, yields rich, contextually sensitive findings. This combination of qualitative elements constitutes the intellectual farming system from which nutritious lessons and principles grow and thrive. I have discussed principles-focused qualitative inquiry throughout this book.

• Chapter 1: Examples of principles as both a focus of inquiry (Paris Declaration Principle for Development Aid, p. 10) and the result of comparative case study analysis (principles that distinguish great from good organizations, Collins, 2001a; adaptive from nonadaptive companies, Collins & Hansen, 2011)
• Chapter 2: Strategic principles for qualitative inquiry (Exhibit 2.1, pp. 46–47)
• Chapter 3: Principles that undergird and guide various theoretical perspectives: constructivism, hermeneutics, pragmatism
• Chapter 4: Practical qualitative inquiry principles to get actionable answers (Exhibit 4.1, pp. 172–173); principles of fully participatory and genuinely collaborative inquiry (p. 222); and principles-focused evaluation (p. 194)
• Chapter 5: Principles-focused purposeful sampling (p. 292)
• Chapter 6: Principles for engaging in qualitative fieldwork (pp. 415–416)
• Chapter 7: Ten interview principles and skills (Exhibit 7.2, p. 428)
• Chapter 8: A principles-focused evaluation report (pp. 627–628)
• Chapter 9: Rigor attribute analysis principles (pp. 675–676)


How to Extract Credible and Useful Principles: A Case Example

Scaling Up Excellence tackles a challenge that confronts every leader and organization—spreading constructive beliefs and behavior from the few to the many. This book shows what it takes to build and uncover pockets of exemplary performance, spread those splendid deeds, and as an organization grows bigger and older—rather than slipping toward mediocrity or worse—recharge it with better ways of doing the work at hand.

—Sutton and Rao (2014, p. 1)

This is how Robert Sutton and Huggy Rao (2014) open their influential book Scaling Up Excellence. Scaling is an applied version of the challenge of generalization. Scholars worry about generalizing findings. Philanthropic foundations, policymakers, and social innovators worry about spreading effective programs. Sutton and Rao identify five principles to guide scaling. How did they do it?

Sutton and Rao (2014) focused on two goals:

Uncovering the most rigorous evidence and theory we could find and generating observations and advice that were relevant to people who were determined to scale up excellence.

This meant bouncing back and forth between the clean, careful, and orderly world of theory and research—that rigor we love so much as academics—and the messy problems, crazy constraints, and daily twists and turns that are relevant to real people as they strive and struggle to spread excellence to those who need it. (p. 298)

Seven Years of Inquiry

Sutton and Rao (2014) report that they began by gathering ideas and evidence, a process that took years.

We did case studies, reviewed theory and research, and huddled to develop insights about scaling challenges and how to overcome them. Little by little, this process changed from a private conversation between the two of us to ongoing conversations about scaling with an array of smart people. We were at the center of this process: making decisions about which leads, stories, and evidence to pursue; choosing which to keep, discard, or save for later; and weaving them together into (we hope) a coherent form. (p. 299)

Sutton and Rao (2014) then analyzed the evidence to reach preliminary conclusions. As conclusions emerged, they presented what they had found to people who had read their prior publications and/or attended their classes and speeches. They recruited knowledgeable and thoughtful people to review, question, and enhance their work.

This book is best described as the product of years of give-and-take between us and many thoughtful people, not as an integrated perspective that we constructed in private and are now unveiling for the first time. Hundreds of people played direct roles in helping us, and thousands more played indirect roles—even if they didn't realize it. (p. 299)

To speak to issues of rigor and credibility, Sutton and Rao (2014) have distilled their inquiry process into seven core methods, each of which they elaborate in the methodological appendix of the book.

1. Combing through research from the behavioral sciences and beyond
2. Conducting and gathering detailed case studies
3. Brief examples from diverse media sources
4. Targeted interviews as unplanned conversations
5. Presenting emerging scaling ideas to diverse audiences
6. Teaching a "Scaling Up Excellence" class to Stanford graduate students
7. Participation in and observation of scaling at the Stanford school (an executive professional development program) (pp. 301–306)

What emerges from their description of their inquiry methods is a portrayal of an ongoing, generative, and iterative process of integrating theory, research, and practice around gathering and making sense of the evidence and, ultimately, distilling what


they found into principles. The principles constitute a form of generalized guidance derived from and based on lessons. Remember, earlier I postulated that lessons lead to principles. The book opens with the four lessons they identified that became the basis for formulating their five scaling principles. Here's how Sutton and Rao (2014) describe that connection and the first lesson, which is the basis for treating the principles as generalizations:

Our first big lesson is that, although the details and daily dramas vary wildly from place to place, the similarities among scaling challenges are more important than the differences. The key choices that leaders face and the principles that help organizations scale up without screwing up are strikingly consistent. (p. xi)

Why Principles?

The seven years of inquiry described by Sutton and Rao (2014) generated five principles. Why principles? Because people engaged in a scaling initiative cannot simply look up some right answers and apply them. There is no recipe.

In the case of scaling, there are so many different aspects of the challenge, and the right answers vary so much across teams, organizations, and industries (and even across challenges faced by a single team or organization), that it is impossible to develop a useful "paint by numbers" approach. Regardless of how many cases, studies, and books (including this one) you read, success at scaling will always depend on making constantly shifting, complex, and not easily codified judgments. (p. 298)

Principles guide judgment. Context informs judgment. Qualitative inquiry generates principles, and then further qualitative inquiry, in a specific context, illuminates that context so that the principles can be interpreted and applied appropriately within that particular context. That process involves both extrapolation and assessing transferability, the qualitative approach to the challenge of generalizing.

SIDEBAR: FROM LESSONS TO PRINCIPLES: SCALING THE TRANSFORMATIVE CHANGE INITIATIVE

Started in 2012, the Transformative Change Initiative (TCI) assists community colleges in scaling up innovation: "evidence-based strategies to improve student outcomes and program, organization, and system performance." The TCI evaluation team reviewed case studies of effective innovations, extracted themes and lessons from those separate evaluations of diverse programs, and generated seven principles to guide the next stage of innovation.

The TCI Framework presents the rationale and guiding principles for scaling innovation in the community college context. It is important to link scaling to guiding principles because principles provide direction rather than prescription. They represent the intentionality of the innovation in ways that often allow for multiple actions (practices) to take place. Principles provide "guidance for action in the face of complexity" so that adaptation can occur in ways that achieve the intended outcome. The theory of change for TCI suggests scaling happens most successfully when practitioners apply guiding principles to their implementation and scaling efforts. In this view, scaling is not so much about replicating what others assert is good practice, which is a classic theory of scaling, but about practitioners and stakeholders becoming instrumental to the scaling process by igniting a chain of actions, reactions, and outcomes that reflect and ultimately reshape the context. To make this happen, practitioners need to

• be aware of the principles that guide the changes they are making to their practice,
• reflect those principles in implementation over time, and
• measure and assess whether the changes are producing the intended improved performance.

—Bragg et al. (2014, p. 6)

Perspectives on Generalizability: A Review

Four core epistemological issues are at the center of debates about the credibility and utility of qualitative inquiry: (1) judging the quality of findings, (2) inferring causality (the challenge of attribution), (3) the validity of generalizations, and (4) determining what is true.

The first part of this chapter dealt with the issue of quality by examining alternative criteria for judging quality (Modules 76 and 77). Chapter 8 included an extensive discussion of causal inference (pp. 582–595). This module has been examining perspectives on making generalizations. The next and final module will take up the issue of determining what is true. This


EXHIBIT 9.14 Twelve Perspectives on and Approaches to Generalization of Qualitative Findings

1. Traditional scientific research approaches (grounded theory, analytic induction, qualitative comparative analysis)
Approach to generalization: Generalizations must be theory based: rigorous and systematic comparisons of observed patterns with theoretical propositions.
Elaboration: Qualitative inquiry can contribute generalizable knowledge by generating, testing, and validating theory.

2. Realism
Approach to generalization: Generalizations depend on purposeful or theoretical sampling.
Elaboration: "By linking decisions about whom or what to sample both empirical and theoretical considerations are combined and claims can be made about how the chosen sample relates to a wider universe or population" (Emmel, 2013, p. 60).

3. Constructivism
Approach to generalization: Transferability of findings from particular cases to others based on similarity of context and conditions—also called inferential generalization (Lewis & Ritchie, 2003).
Elaboration: Eschew "generalization" in favor of assessing transferability based on in-depth knowledge about the cases studied that provides a basis for assessing the relevance of findings to other similar cases (Lincoln & Guba, 1985).

4. In-depth case study particularity
Approach to generalization: Focus first on in-depth particularity. Do justice to the case. The issue here is internal generalization: "generalizing within the setting . . . studied to people, events, and settings that were not directly observed or interviewed . . . ; the extent to which the times and places observed may differ from those that were not observed, either because of sampling or because of the observation itself" (Maxwell, 2012, p. 142).
Elaboration: "Particularization does deserve praise. . . . What becomes useful understanding is a full and thorough knowledge of the particular, recognizing it also in new and foreign contexts. That knowledge is a form of generalization too . . . , arrived at by recognizing the similarities of objects and issues in and out of context and by sensing the natural covariations of happenings. To generalize this way is to be both intuitive and empirical" (Stake, 1978, p. 6).

5. Social construction
Approach to generalization: Reflective, experiential, and socially shared generalizations: People naturally make comparisons, which become a "natural" form of generalizing among a group of people. Qualitative cases can enhance those comparisons through the depth and detail that enhances understanding.
Elaboration: Naturalistic generalizations. The kind of learning that ordinary people take from their encounters with specific case studies: "The 'vicarious experience' that comes from reading a rich case account can contribute to the social construction of knowledge which, in a cumulative sense, builds general, if not necessarily generalizable knowledge" (Stake, 1995, p. 38).




6. Phenomenology
Approach to generalization: Essence. "A unified statement of the essences of the experience of the phenomenon as a whole. . . . Essence . . . means that which is common or universal, the condition or quality without which a thing would not be what it is" (Moustakas, 1994, p. 100).
Elaboration: Essence emerges from a synthesis of meanings, a reduction of variation to what is essential. Essence integrates and supersedes individual experiences. "The essences of any experience are never totally exhausted. The fundamental textural-structural synthesis represents the essences at a particular time and place from the vantage point of an individual researcher following an exhaustive imaginative and reflective study of the phenomenon" (Moustakas, 1994, p. 100).

7. Ethnography
Approach to generalization: Conceptual generalization: describing both the common and variable meanings and manifestations of universal concepts across cultures, e.g., kinship, conflict, religion, coming of age, etc.
Elaboration: Connecting the microscopic or situation-specific findings of a particular ethnography to more general understandings of culture (Geertz, 1988, 2001) constitutes a form of ethnographic generalization. It involves seeking "the pattern that connects" (Bateson, 1977, 1988).

8. Pragmatism
Approach to generalization: Extrapolations. Modest, practical speculations on the likely applicability of findings to future times and other situations under similar, but not identical, conditions; allows for more interpretive flexibility than a direct transferability assessment; as interested in temporal application of findings (applying what was learned to the future) as in applications in other places.
Elaboration: Extrapolations are logical, thoughtful, case derived, problem oriented, and future oriented rather than statistical and probabilistic: what of practical value has been learned that can be extrapolated to guide future actions, whether in the same place or a different one. Extrapolations can be particularly useful when based on information-rich samples targeted to specific concerns (Cronbach, 1980).

9. Program evaluation and policy analysis
Approach to generalization: Lessons. Qualitative evaluation's contribution to general knowledge takes the form of lessons identified in one evaluation (or a cluster of evaluations) that are offered for application to other places and future programs.
Elaboration: Patterns of effectiveness identified from cross-case analysis of different programs are analyzed to extract common lessons. A classic example is Schorr's (1988) "Lessons of Successful Programs," serving high-risk children and families in poverty. Here is a policy example: Learning From Iraq (Special Inspector General for Iraq Reconstruction, 2013).

10. Systems and complexity
Approach to generalization: Principles. Dynamic and complex systems defy simple empirical generalizations. Instead, principles can be identified that inform future systems analyses and guide innovation in complex situations.
Elaboration: Case-based principles provide guidance for adaptive action in the face of complexity. Adaptive action through principles contrasts with high-fidelity replication of standardized models. Principles emphasize contextual sensitivity and situational analysis (Eoyang & Holladay, 2013; Patton, 2011).


11. Artistic, evocative representations
Approach to generalization: Emotional connections and empathy are a form of human generalizability based on shared feelings. Interpretive interactionism (Denzin, 1989b) involves intersubjective understandings and feelings.
Elaboration: Finding shared meaning in stories and artistic works (as representations of qualitative findings) moves people from their isolated, particular experience to a more general experience and understanding of the human condition. Emotional resonance among humans (Denzin, 2009) is a form of empathic generalization.

12. Postmodernism
Approach to generalization: All knowledge is local, specific, and immediate. Generalizations, either empirical or theoretical, are impossible and undesirable.
Elaboration: "Postmodernism is characterized by its distrust of and incredulity toward all 'totalizing' discourses or metanarratives" (Schwandt, 2007, p. 235). This discounts the validity of generalizations, theories, and predictions of any kind, "alerting us to postmodernism's nihilistic tendencies" (Gubrium & Holstein, 2003, p. 5).

SIDEBAR: "ALL GENERALIZATIONS ARE FALSE."

"All generalizations are false, including this one"—doesn't clarify much of anything.

"All generalizations are false, including this one" leads logically to "Some generalizations are true." If you wish to trace this error back, consider "All Cretans lie," uttered by a Cretan. It can't be true, but it can be false. What's interesting is that it not only leads to "Some Cretans tell the truth," but it also leads to the conclusion that the Cretan speaking is not one of them.

—Errol Morris (2014), documentary filmmaker and philosopher

MODULE 82

Enhancing the Credibility and Utility of Qualitative Inquiry by Addressing Philosophy of Science Issues

module concludes with a summary of perspectives on and approaches to generalization in Exhibit 9.14.

We come now to the fourth and final dimension of credibility. Let's review. The first dimension is systematic, in-depth fieldwork that yields high-quality data. The second dimension that informs judgments of credibility is systematic and conscientious analysis. The third concerns judgments about the credibility of the researcher, which depends on training, experience, track record, status, and presentation of self. Now, to conclude, we take up the issue of philosophical belief in the value of qualitative inquiry, that is, a fundamental appreciation of naturalistic inquiry, qualitative methods, inductive analysis, purposeful sampling, and holistic thinking. Exhibit 9.15 graphically depicts these four dimensions of credibility. In the center of the graphic are the alternative criteria for judging quality that opened this chapter: traditional scientific research criteria, constructivist and social construction criteria, artistic and evocative criteria, participatory and collaborative criteria, critical change criteria, systems and complexity criteria, and pragmatic criteria.

EXHIBIT 9.15

[Graphic: four dimensions of credibility surrounding a center label, "Criteria for Judging Quality."(a)]
• Systematic, in-depth fieldwork that yields high-quality data: What are quality data?
• Systematic and conscientious analysis: What is rigorous analysis?
• Credibility of the inquirer: How is competence judged?
• Philosophical belief in the value of qualitative inquiry: What is credible evidence?

a. Traditional research criteria, constructivist and social construction criteria, artistic and evocative criteria, participatory and collaborative criteria, critical change criteria, systems and complexity criteria, and pragmatic criteria.

Philosophical belief in the value of qualitative inquiry is a prime determinant of credibility—and a matter of debate and controversy. Given the often-controversial nature of qualitative findings and the necessity, on occasion, to be able to explain and even defend the value and appropriateness of qualitative inquiry, this module will briefly discuss some of the most contentious issues. The selection of which philosophy of science issues to address in this closing section of the book is based on the workshops I regularly teach on qualitative evaluation methods. In those two- and three-day courses, which typically include participants from around the world, I reserve the final

afternoon for open-ended exchanges about whatever matters of interest and concern participants want to raise. By then, we have covered types and applications of qualitative inquiry, design options, purposeful sampling approaches, fieldwork techniques, observational methods, interviewing skills, how to do systematic and rigorous analysis, and ethical standards. Inevitably, questions come pouring forth about the paradigms debate, political considerations, and fundamental doubts participants encounter about the legitimacy of qualitative inquiry. I'll reproduce the questions that arise and offer my responses.

Paradigms question: Why are qualitative methods so controversial? I just want to interview people, see what they say, analyze the patterns, and report my findings. I don't want to debate paradigms. Do we really have to deal with paradigms stuff?

You have to deal with what constitutes credible evidence. What constitutes credible evidence is a matter of debate among both scientists and nonscientists. While not always framed as a paradigms debate, and there are disagreements about what a paradigm is and whether it's a useful concept, I think framing the controversy as a paradigms debate is both accurate and illuminating. In Chapter 3, I discussed the qualitative/quantitative paradigms debate at some length (see pp. 87–95), including an MQP Rumination against designating randomized controlled trials as the "gold standard." In this module, I'm going to focus specifically on how that debate affects credibility and utility.

Paradigms are a way of distinguishing different perspectives in science about how best to study and understand the world. The debate sometimes takes the form of natural science versus social science, qualitative versus quantitative methods, behavioral psychology versus phenomenology, positivism versus constructivism, or realism versus interpretivism. How the debate is framed depends on the perspectives that people bring to it and the language available to them to talk about it. Whatever the terminology and labels for contrasting points of view, the debate is rooted in philosophical differences about the nature of reality and epistemological differences in what constitutes knowledge and how it is created. The paradigms debate, whatever form it takes, affects credibility and utility when particular worldviews are pitted against one another at the intersection of philosophy and methods to determine what kinds of evidence are acceptable, believable, and useful.

You may be able to carry out a qualitative study without ever addressing the issue of paradigms. But you ought to know enough about the debate and its implications, it seems to me, to address the issue if it comes up. I would alert those new to the debate that it has been and can be intense, divisive, emotional, and rancorous. And to those experienced in and tired of the debate, let me say that I've followed it, and been personally engaged in it, for more than 40 years. I've watched the debate ebb and flow, take on new forms, and attract new advocates and adversaries. But it doesn't go away. The paradigms debate is an epistemological phoenix that emerges anew when fires of dissent mellow into dying embers only to flame again on new winds of contention. I doubt that you can use qualitative methods without encountering and needing to deal with some aspects of the debate.

As I have illustrated throughout this chapter, both scientists and nonscientists hold strong opinions about what constitutes credible evidence. Those opinions are paradigm derived and paradigm dependent because a paradigm constitutes a worldview built on epistemological assumptions, preferred definitions of key concepts, comfortable habits, entrenched values defended as truths, and beliefs offered up as evidence. As such, paradigms are deeply embedded in the socialization of adherents and practitioners, telling them what is important, legitimate, and reasonable.

So be prepared to address controversies and competing perspectives about what constitutes credible evidence even if it doesn't come cloaked in the guise of a paradigms debate. Moreover, these are not simply matters of academic debate. They have entered the public policy arena as matters of political debate.

Politics of evidence question: What makes research methods a matter of concern for politics and politicians?

In the public policy arena, advocates of randomized controlled trials are organized and funded to lobby the U.S. Congress to put their paradigm preferences into legislation (Coalition for Evidence-Based Policy, 2014). They have experts who supply reporters with positive news accounts (e.g., Keating, 2014; Kolata, 2013, 2014). On the other side, there are strong political advocacy statements for alternative paradigms: The Qualitative Manifesto (Denzin, 2010), Qualitative Inquiry and the Conservative Challenge (Denzin & Giardina, 2006), and Qualitative Inquiry and the Politics of Evidence (Denzin & Giardina, 2008). Ray Pawson (2013) has produced A Realist Manifesto. But there is no organized and funded lobbying effort on behalf of qualitative, mixed-methods, and/or realist approaches. So guess which group is successful in getting its paradigm legitimated and funded in legislation? Hint: It's not the qualitative manifesto.

Objectivity question: Doesn't the paradigms debate come down to objectivity versus subjectivity?

French philosopher Jean-Paul Sartre once observed that "words are loaded pistols." The words "objectivity" and "subjectivity" are bullets people arguing fire


at each other. It's true that objectivity is held in high esteem. Science aspires to objectivity, and a primary reason why decision makers commission an evaluation is to get objective data from an independent source external to the program being evaluated. The charge that qualitative methods are inevitably "subjective" casts an aspersion connoting the very antithesis of scientific inquiry. Objectivity is traditionally considered the sine qua non of the scientific method. To be subjective means to be biased, unreliable, and irrational. Subjective data imply opinion rather than fact, intuition rather than logic, impression rather than confirmation. Chapter 2 briefly discussed concerns about objectivity versus subjectivity, but I return to the issue here to address how these concerns affect the credibility and utility of qualitative analysis.

Let's take a closer look at the objective/subjective distinction. The conventional means for controlling subjectivity and maintaining objectivity are the methods of quantitative social science: distance from the setting and people being studied, standardized quantitative measures, formal operational procedures, manipulation of isolated variables, and randomized controlled experimental designs. Yet the ways in which measures are constructed in psychological tests, questionnaires, cost–benefit indicators, and routine management information systems are no less open to the intrusion of biases than making observations in the field or asking questions in interviews. Numbers do not protect against bias; they merely disguise it. All statistical data are based on someone's definition of what to measure and how to measure it. An "objective" statistic like the consumer price index is really made up of very subjective decisions about what consumer items to include in the index. Periodically, government economists change the basis and definition of such indices.

Philosopher of science Michael Scriven (1972a) has insisted that quantitative methods are no more synonymous with objectivity than qualitative methods are synonymous with subjectivity:

Errors like this are too simple to be explicit. They are inferred confusions in the ideological foundations of research, its interpretations, its application. . . . It is increasingly clear that the influence of ideology on methodology and of the latter on the training and behavior of researchers and on the identification and disbursement of support is staggeringly powerful. Ideology is to research what Marx suggested the economic factor was to politics and what Freud took sex to be for psychology. (p. 94)

SIDEBAR: DIFFERENT MEANINGS AND USES OF OBJECTIVITY

1. Objective person. Unbiased, open-minded, and neutral.
2. Objective process. Follow, document, and report procedures that do not predetermine results.
3. Objective statement. Just the facts, unvarnished, put forward by an objective person following an objective process.
4. Objective reality. Belief that there is knowable, absolute reality.
5. Objective scientific claim. Findings subjected to scientific peer review by members of a discipline capable of judging the extent to which a claim has been produced by appropriate scientific methods and analysis.
6. Objective methods. A design, data collection procedures, and analysis that follow accepted inquiry norms of a scientific discipline.
7. Objective measure. The extent to which a given number can be interpreted as indicating the same amount of the thing measured, across persons or things measured, using a validated and reliable instrument.
8. Objective decisions. Fair and balanced judgment based on preponderance of evidence presented and explicit, transparent criteria for weighing the evidence.

[Cartoon: "Daddy, do you like my picture?" "Honey, if you like me to be objective, I'll have to create a rubric." SOURCE: © Chris Lysy—freshspectrum.com]

Scriven's (1972a) lengthy discussion of objectivity and subjectivity in educational research deserves careful reading by students and others concerned by


this distinction. He skillfully detaches the notions of objectivity and subjectivity from their traditionally narrow associations with quantitative and qualitative methodology, respectively. He presents a clear explanation of how objectivity has been confused with consensual validation of something by multiple observers. Yet a little research will yield many instances of "scientific blunders" (Dyson, 2014; Livio, 2013; Youngson, 1998) where the majority of scientists were factually wrong while one dissenting observer described things as they really were (Kuhn, 1970).

Qualitative rigor has to do with the quality of the observations made by an inquirer. Scriven (1972a) emphasizes the importance of being factual about observations rather than being distant from the phenomenon being studied. Distance does not guarantee objectivity; it merely guarantees distance. Nevertheless, in the end, Scriven (1998) still finds the ideal of objectivity worth striving for as a counter to bias, and he continues to find the language of objectivity serviceable.

In contrast, Lincoln and Guba (1986), as noted earlier, have suggested replacing the traditional mandate to be objective with an emphasis on trustworthiness and authenticity by being balanced, fair, and conscientious in taking account of multiple perspectives, multiple interests, multiple experiences, and diverse constructions of realities. Guba (1981) suggested that researchers and evaluators can learn something about these attributes from the stance of investigative journalists.

Journalism in general and investigative journalism in particular are moving away from the criterion of objectivity to an emergent criterion usually labeled "fairness" . . . Objectivity assumes a single reality to which the story or evaluation must be isomorphic; it is in this sense a one-perspective criterion. It assumes that an agent can deal with an object (or another person) in a nonreactive and noninteractive way. It is an absolute criterion.

Journalists are coming to feel that objectivity in that sense is unattainable. . . .

Enter "fairness" as a substitute criterion. In contrast to objectivity, fairness has these features:

• It assumes multiple realities or truths—hence a test of fairness is whether or not "both" sides of the case are presented, and there may even be multiple sides.

• It is adversarial rather than one-perspective in nature. Rather than trying to hew the line with the truth, as the objective reporter does, the fair reporter seeks to present each side of the case in the manner of an advocate—as, for example, attorneys do in making a case in court. The presumption is that the public, like a jury, is more likely to reach an equitable decision after having heard each side presented with as much vigor and commitment as possible.

• It is assumed that the subject's reaction to the reporter and interactions between them heavily determines what the reporter perceives. Hence one test of fairness is the length to which the reporter will go to test his own biases and rule them out.

• It is a relative criterion that is measured by balance rather than by isomorphism to enduring truth. (pp. 76–77)

But times change, and Guba would be unlikely to use the language of "fairness and balance" now that the most politically conservative and deliberately biased American television channel has adopted that phrase as its brand. Fairness and balance has become a euphemism for prejudiced and one-sided. Objectivity has also taken on unfortunate political and cultural connotations in some quarters, meaning uncaring, unfeeling, disengaged, and aloof. What about subjectivity, the constructivist badge of honor?

[Photograph: © 2002 Michael Quinn Patton and Michael Cochran]

Subjectivity Deconstructed

In public discourse, it is not particularly helpful to know that philosophers of science now typically doubt the possibility of anyone or any method being totally "objective." But subjectivity fares even worse. Even if acknowledged as inevitable (Peshkin, 1988), or valuable as a tool to understanding (Soldz & Andersen,

2012), subjectivity carries such negative connotations at such a deep level and for so many people that the very term can be an impediment to mutual understanding. For this and other reasons, as a way of elaborating with any insight the nature of the research process, the notion of subjectivity may have become as useless as the notion of objectivity.

The death of the notion that objective truth is attainable in projects of social inquiry has been generally recognized and widely accepted by scholars who spend time thinking about such matters. . . . I will take this recognition as a starting point in calling attention to a second corpse in our midst, an entity to which many refer as if it were still alive. Instead of exploring the meaning of subjectivity in qualitative educational research, I want to advance the notion that following the failure of the objectivists to maintain the viability of their epistemology, the concept of subjectivity has been likewise drained of its usefulness and therefore no longer has any meaning. Subjectivity, I feel obliged to report, is also dead. (Barone, 2000, p. 161)

But for other qualitative researchers, subjectivity is not so much about philosophy of science as it is about using one's own experience to make sense of the world through reflexivity (Connolly & Reilly, 2007). That perspective, once entertained, can lead from a focus on the researcher's subjectivity as a window into sense making to shared meaning making: intersubjectivity.

Intersubjectivity

"Subjective" versus "objective" no longer makes sense, since everyone involved is a subject. . . . Human Social Research is intersubjective . . . built from encounters among subjects, including researchers who, like it or not, are also subjects. (Agar, 2013, pp. 108–109)

Eschewing both objectivity and subjectivity, intersubjectivity focuses on knowledge as socially constructed in human interactions. Human science research, what anthropologist Michael Agar (2013) calls the Lively Science, requires "human social relationships in order to happen at all. They are intersubjective sciences. They require social relationships with those who support the science, those who do it, those who serve as subjects of it, and those who consume it" (p. 215).

The difficult judgment call for the researcher is this: To some extent he or she should translate his or her own framework and jointly build a framework for communication with subjects of all those different types. . . . The bedrock of intersubjective research isn't to preach or to lecture, but rather to learn and to communicate the results, though not at the price of abandoning the core principles of the science. The pressure always exists to achieve a balance, and a researcher always has to make the call of how much and in what way to handle it.

This fact has to be part of the science, not to mention a central part of training for human social researchers. How to navigate this ambiguous territory with professional integrity and product quality is a neglected topic, a neglect understandable in light of academic traditions where one could assume that whatever the dissertation committee or disciplinary peers would like was the right thing to do. That isolation is no longer possible. In my view, taking human social research out into the world makes it more difficult, more interesting, more intellectually challenging, and of higher moral value than it has ever been. (pp. 215–216)

Empathic Neutrality

No consensus about substitute terminology has emerged. I prefer empathic neutrality, one of the 12 qualitative themes that I presented in Chapter 2. While empathy describes a stance toward the people we encounter in fieldwork, calling on us to communicate interest, caring, and understanding, neutrality suggests a stance toward their thoughts, emotions, and behaviors, a stance of being nonjudgmental. Neutrality can actually facilitate rapport and help build a relationship that supports empathy by disciplining the researcher to be open to the other person and nonjudgmental in that openness. (See pp. 57–62 for the full discussion of empathic neutrality.)

Open-Mindedness and Impartiality

I have evaluation colleagues who simply describe themselves as open-minded, which seems to satisfy most lay people. The political nature of evaluation means that individual evaluators must make their own peace with how they are going to describe what they do. The meaning and connotations of words like objectivity, subjectivity, neutrality, and impartiality will have to be worked out with particular stakeholders in specific evaluation settings. In her leadership role in evaluation in the U.S. federal government, former AEA president Eleanor Chelimsky emphasized her unit's independence and impartiality. The perception of impartiality, she has explained, is at least as important as methodological rigor in highly political environments. Credibility, and therefore utility, are affected by "the steps

Copyright ©2015 by SAGE Publications, Inc. This work may not be reproduced or distributed in any form or by any means without express written permission of the publisher.

we take to make and explain our evaluative decisions, [and] also intellectually, in the effort we put forth to look at all sides and all stakeholders of an evaluation" (Chelimsky, 1995, p. 219; see also Chelimsky, 2006).

I think it is worth noting that the official Program Evaluation Standards (Joint Committee on Standards, 2010) do not call for objectivity. The standards have been guiding evaluation practice for nearly four decades. They were originally formulated by social scientists and evaluators representing all the major disciplinary associations. They have twice gone through major review processes. The language used, therefore, has been thoroughly vetted. The standards call for evaluations to be credible, systematic, accurate, useful, and dependable, but not objective. The term objectivity has become a lightning rod attracting epistemological paradigms debate and is therefore not useful as a standard for evaluation in the American context. In contrast, the international Quality Standards for Development Evaluation define evaluation as "objective assessment" (OECD-DAC, 2010, p. 5). Different context, different language.

Given the seven different sets of criteria for judging the quality of qualitative inquiry I identified at the beginning of this chapter, and the terms associated with each, it seems unlikely that a consensus about terminology is on the horizon. The methodological and scientific Tower of Babel stands tall and casts a long shadow. But the different perspectives on and uses of terms can be liberating because they open up the possibility of getting beyond the meaningless abstractions and heavy-laden connotations of objectivity and subjectivity to move instead toward carefully selecting descriptive methodological language that best describes your own inquiry processes and procedures. That is, don't label those processes as "objective," "subjective," "intersubjective," "trustworthy," or "authentic." Instead, eschew overarching labels. Describe how you approach your inquiry, what you bring to your work, and how you've reflected on what you do, and then let the reader be persuaded, or not, by the intellectual and methodological rigor, meaningfulness, value, and utility of the result. In the meantime, be very careful how you use particular terms in specific contexts. Words are bullets. They are also landmines. I end this diatribe with a cautionary tale about being sensitive to the cultural context within which terms are used.

During a tour of America, former British prime minister Winston Churchill attended a buffet luncheon at which chicken was served. As he returned to the buffet for a second helping he asked, "May I have some more breast?"

His hostess, looking embarrassed, explained that "in this country we ask for white meat or dark meat."

Churchill, taking the white meat he was offered, apologized and returned to his table.

The next morning the hostess received a beautiful orchid from Churchill with the following card: "I would be most obliged if you would wear this on your white meat."

SIDEBAR: A REALIST PERSPECTIVE ON OBJECTIVITY

Evaluation cannot hope for perfect objectivity but neither does this mean it should slump into rampant subjectivity. We cannot hope for absolute cleanliness but this does not require us to enjoy a daily roll in the manure. The alternative to these two termini is for evaluation to embrace the goal of being "validity increasing" . . .

Skepticism . . . , in its English spelling, . . . constitutes the final desideratum of evaluation science.

Organised scepticism means that any scientific claim must be exposed to critical scrutiny before it becomes accepted. . . . What counts is the depth of critical scrutiny applied to the inferences drawn from any inquiry. And this level of attention depends, in turn, on the presence of a collegiate group of stakeholders and their willingness to put each other's work under the microscope.

—Ray Pawson (2013, p. 107), The Science of Evaluation: A Realist Manifesto

Truth and reality question: "I don't understand this talk about multiple realities and different truths for different people. If research is anything, it ought to be about getting at true reality. I know you like quotes, so here's one of my favorite quotes for you, from George Orwell: 'In a time of universal deceit—telling the truth is a revolutionary act.' I think we ought to be research revolutionaries and speak the truth. In fact, the mantra of evaluation is: Speak truth to power. So, truth or not truth?"

It's an important question. Certainly, there are a lot of quotes about truth. This is a thick book, and it could contain nothing but quotes about truth, which would serve to illustrate its evasiveness. Let me offer a quote from the great comedian Lily Tomlin, who, playing the character of a little girl accused by a scolding adult of making things up, responded thus:


Lady, I do not make up things. That is lies. Lies are not true. But the truth could be made up if you know how. And that's the truth.

Or consider this observation by Thomas Schwandt, a philosopher of science and professional evaluator, who has spent much of a distinguished career grappling with this very issue. His conclusion:

TRUTH is one of the most difficult of all philosophical topics, and controversies surrounding the nature of truth lie at the heart of both apologies for and criticisms of varieties of qualitative work. Moreover, truth is intimately related to questions of meaning, and establishing the nature of that relationship is also complicated and contested.

There is general agreement that what is true or what carries truth are statements, propositions, beliefs, and assertions, but how the truth of same is established is widely debated. (Schwandt, 2007, p. 300)

Schwandt presents 10 different philosophical orientations to and theories about truth: (1) correspondence, (2) consensus, (3) coherence, (4) contextualist, (5) pragmatic, (6) hermeneutic, (7) critical theory (Foucault), (8) realist, (9) constructivist, and (10) objectivist theory. Pick your poison—or truth. We won't resolve the debate here. Not even close. Nor will others for, to add yet another quote to the collection, here's cynic Ambrose Bierce's (1999) assessment:

Discovery of truth is the sole purpose of philosophy, which is the most ancient occupation of the human mind and has a fair prospect of existing with increasing activity to the end of time. (p. 201)

Since we can't resolve the nature of truth, indulge me in a story that illustrates why it may be important to have figured out where you, yourself, stand on matters of truth. Following a presentation of evaluation results at a public school board meeting, I was asked by the school district's internal evaluator, "Do you, as a qualitative researcher, swear to tell the truth, the whole truth and nothing but the truth?" The question was meant to embarrass me. The researcher had an article I had written attacking overreliance on standardized tests for school evaluations and another advocating soliciting multiple perspectives from parents, teachers, students, and community members about their experiences with the school district to document diverse perspectives. In that article, and earlier editions of this book, I had expressed doubt about the utility of truth as a criterion of quality, and I suspected that he hoped to lure me into an academic-sounding, arrogant, and philosophical discourse on the question "What is truth?" in the expectation that the public officials present would be alienated and dismiss my presentation. So when he asked, "Do you, as a qualitative researcher, swear to tell the truth, the whole truth and nothing but the truth?" I did not reply, "That depends on what truth means." I said simply, "Certainly I promise to respond honestly." Notice the shift from truth to honesty.

The researcher applying traditional social science criteria might respond, "I can show you truth insofar as it is revealed by the data."

The constructivist might answer, "I can show you multiple truths."

The artistically inclined might suggest that "beauty is truth." And "fiction often reveals truth better than nonfiction."

The critical theorist could explain that "truth depends on one's consciousness."

The participatory qualitative inquirer would say, "We create truth together."

The critical change activist might say, "I offer you praxis. Here is where I take my stand. This is true for me."

The pragmatic evaluator might reply, "I can show you what is useful. What is useful is true."

Indeed, in this vein, Exhibit 9.7, in presenting the seven sets of criteria for judging quality, offers a political campaign button about TRUTH for each (pp. 680–681).

By the way, I noted earlier that the Program Evaluation Standards do not use the language of objectivity, but "the Accuracy Standards are intended to increase the dependability and truthfulness [italics added] of evaluation representations" (Joint Committee on Standards, 2010). Note: Truthfulness is not TRUTH. You could do a little hermeneutic work on that distinction, should you be so inclined.

Ironically, it is sometimes easier to determine what is false than what is true. For insights into how the academic peer review process has been distorted and corrupted to generate invalid and untrustworthy findings, see the widely cited and influential analysis by Professor of Health Research and Policy at Stanford School of Medicine, John P. A. Ioannidis (2005), "Why Most Published Research Findings Are False."

Truth Tests and Utility Tests

Previously I have cited the influential research by Weiss and Bucuvalas (1980) showing that decision makers apply both "truth" tests and "utility" tests to evaluation. "Truth," in this case, however, means reasonably accurate and credible data (the focus of the program


evaluation standards) rather than data that are true in some absolute sense. Savvy policymakers know better than most the context- and perspective-laden nature of competing truths. Qualitative inquiry can present accurate data on various perspectives, including the evaluator's perspective, without the burden of determining that only one perspective must be true.

Evaluation theorist and methodologist Nick Smith (1978), pondering these questions, has noted that to act in the world we often accept either approximations to truth or even untruths:

For example, when one drives from city to city, one acts as if the earth is flat and does not try to calculate the earth's curvature in planning the trip, even though acting as if the earth is flat means acting on an untruth. Therefore, in our study of evaluation methodology, two criteria replace exact truth as paramount: practical utility and level of certainty. The level of certainty required to make an adequate judgment under the law differs depending on whether one is considering an administrative hearing, an inquest, or a criminal case. Although it seems obvious that much greater certainty about the nature of things is required when legislators set national and educational policy than when a district superintendent decides whether to continue a local program, the [. . .] in evaluation implies that the same high level of certainty is required of both cases. If we were to first determine the level of certainty desired in a specific case, we could then more easily choose appropriate methods. Naturalistic descriptions give us greater certainty in our understanding of the nature of an educational process than randomized, controlled experiments do, but less certainty in our knowledge of the strength of a particular effect. . . . Our first concern should be the practical utility of our knowledge, not its ultimate truthfulness. (p. 17)

In studying evaluation use (Patton, 2008), I found that decision makers did not expect evaluation reports to produce "TRUTH" in any fundamental sense. Rather, they viewed evaluation findings as additional information that they could and did combine with other information (political, experiential, other research, colleague opinions, etc.), all of which fed into a slow, evolutionary process of incremental decision making. Kvale (1987) echoed this interactive and contextual approach to truth in emphasizing the "pragmatic validation" of findings, in which the results of qualitative analysis are judged by their relevance to and use by those to whom findings are presented.

This criterion of utility can be applied not only to evaluation but also to qualitative analyses of all kinds, including textual analysis. Barone (2000), having rejected objectivity and subjectivity as meaningless criteria in the postmodern age, makes the case for pragmatic utility:

If all discourse is culturally contextual, how do we decide which deserves our attention and respect? The pragmatists offer the criterion of usefulness for this purpose. . . . An idea, like a tool, has no intrinsic value and is "true" only in its capacity to perform a desired service for its handler within a given situation. When the criterion of usefulness is applied to context-bound, historically situated transactions between itself and a text, it helps us to judge which textual experiences are to be valued. . . . The gates are opened for textual encounters, in any inquiry genre or tradition, that serve to fulfill an important human purpose. (pp. 169–170)

Focusing on the connection between truth tests and utility tests shifts attention back to credibility and quality, not as absolute generalizable judgments but as contextually dependent on the needs and interests of those receiving our analysis. This obliges researchers and evaluators to consider carefully how they present their work to others, with attention to the purpose to be fulfilled. That presentation should include reflections on how your perspective affected the questions you pursued in fieldwork, careful documentation of all procedures used so that others can review your methods for bias, and openness in describing the limitations of the perspective presented. Exhibit 9.16, at the end of this chapter (pp. 736–741), offers an in-depth description of how one qualitative inquirer dealt with these issues in a long-term participant–observer relationship. The exhibit, titled A Documenter's Perspective, is based on her research journal and field notes. It moves the discussion from abstract philosophizing to day-to-day, in-the-trenches fieldwork encounters aimed at sorting out what is true (small t) and useful.

Finding TRUTH can be a heavy burden. I once had a student who was virtually paralyzed in writing an evaluation report because he wasn't sure if the patterns he thought he had uncovered were really true. I suggested that he not try to convince himself or others that his findings were true in any absolute sense but, rather, that he had done the best job he could in describing the patterns that appeared to him to be present in the data and that he present those patterns as his perspective based on his analysis and interpretation of the data he had collected. Even if he believed that what he eventually produced was Truth, any sophisticated person reading the report would know that what he presented was no more than his perspective, and they would judge that perspective by their own commonsense understandings and use the


information according to how it contributed to their own needs.

As one additional source of reflection on these issues, perhaps the following Sufi story will provide some guidance about the difference between truth and perspective. Sagely, in this encounter, Nasrudin gathers data to support his proposition about the nature of truth. Here's the story.

Mulla Nasrudin was on trial for his life. He was accused of no less a crime than treason by the king's ministers, wise men charged with advising on matters of great import. Nasrudin was charged with going from village to village inciting the people by saying, "The king's wise men do not speak truth. They do not even know what truth is. They are confused."

Nasrudin was brought before the king and the court. "How do you plead, guilty or not guilty?"

"I am both guilty and not guilty," replied Nasrudin.

"What, then, is your defense?"

Nasrudin turned and pointed to the nine wise men who were assembled in the court. "Have each sage write an answer to the following question: 'What is water?'"

The king commanded the sages to do as they were asked. The answers were handed to the king, who read to the court what each sage had written.

The first wrote, "Water is to remove thirst."
The second, "It is the essence of life."
The third, "Rain."
The fourth, "A clear, liquid substance."
The fifth, "A compound of hydrogen and oxygen."
The sixth, "Water was given to us by God to use in cleansing and purifying ourselves before prayer."
The seventh, "It is many different things—rivers, wells, ice, lakes, so it depends."
The eighth, "A marvelous mystery that defies definition."
The ninth, "The poor man's wine."

Nasrudin turned to the court and the king: "I am guilty of saying that the wise men are confused. I am not, however, guilty of treason because, as you see, the wise men are confused. How can they know if I have committed treason if they cannot even decide what water is? If the sages cannot agree on the truth about water, something which they consume every day, how can one expect that they can know the truth about other things?"

The king ordered that Nasrudin be set free.

SIDEBAR: TRUTH VERSUS RELATIVISM

Postmodern work is often accused of being relativistic (and evil) since it does not advocate a universal, independent standard of truth. In fact, relativism is only an issue for those who believe there is a foundation, a structure against which other positions can be objectively judged. In effect, this position implies that there is no alternative between objectivism and relativism. Postmodernists dispute the assumptions that produce the objectivism/relativism binary since they think of truth as multiple, historical, contextual, contingent, political, and bound up in power relations. Refusing the binary does not lead to the abandonment of truth, however, as Foucault emphasizes when he says, "I believe too much in truth not to suppose that there are different truths and different ways of speaking the truth."

Furthermore, postmodernism does not imply that one does not discriminate among multiple truths, that "anything goes." . . . If there is no absolute truth to which every instance can be compared for its truth-value, if truth is instead multiple and contextual, then the call for ethical practice shifts from grand, sweeping statements about truth and justice to engagements with specific, complex problems that do not have generalizable solutions. This different state of affairs is not irresponsible, irrational, or nihilistic. . . . As with truth, postmodern critiques argue for multiple and historically specific forms of reason. (St. Pierre, 2000, p. 25)

SIDEBAR: TRUE FACTS VERSUS TRUE THEORIES

Facts and theories are born in different ways and are judged by different standards. Facts are supposed to be true or false. They are discovered by observers or experimenters. A scientist who claims to have discovered a fact that turns out to be wrong is judged harshly. One wrong fact is enough to ruin a career.

Theories have an entirely different status. They are free creations of the human mind, intended to describe our understanding of nature. Since our understanding is incomplete, theories are provisional. Theories are tools of understanding; and a tool does not need to be precisely true in order to be useful. Theories are supposed to be more-or-less true, with plenty of room for disagreement. A scientist who invents a theory that turns out to be wrong is judged leniently. Mistakes are tolerated, so long as the culprit is willing to correct them when nature proves them wrong.

—Physicist Freeman Dyson (2014, p. 4), Institute for Advanced Studies, Princeton


Enhanced Credibility and Increased Legitimacy for Qualitative Methods: Looking Back and Looking Ahead

The distinction between the past, present, and future is only a stubbornly persistent illusion.

—Physicist Albert Einstein, Theory of Relativity

Chapter Summary

This chapter has reviewed ways of enhancing the quality, credibility, and utility of qualitative analysis by dealing with four distinct but related inquiry concerns:

• Rigorous methods for doing fieldwork that yield high-quality data
• Systematic and conscientious analysis with attention to issues of credibility
• The credibility of the researcher, which depends on training, experience, track record, status, and presentation of self
• Philosophical belief in the value of qualitative inquiry—that is, a fundamental appreciation of naturalistic inquiry, qualitative methods, inductive analysis, purposeful sampling, and holistic thinking

Exhibit 9.15 presented a graphic depicting these four dimensions, with criteria for judging quality in the center.

Conclusion: Beyond the Qualitative/Quantitative Debate

Question: What's the status of the qualitative/quantitative debate today? From your perspective, what does the future look like for qualitative inquiry?

The debate between qualitative and quantitative methodologists was often strident historically, but in recent years the debate has mellowed. A consensus has gradually emerged that the important challenge is to appropriately match methods to purposes and inquiry questions, not to universally and unconditionally advocate any single methodological approach for all inquiry situations. Indeed, eminent methodologist Thomas Cook, one of evaluation's luminaries, pronounced in his keynote address to the 1995 International Evaluation Conference in Vancouver that "qualitative researchers have won the qualitative/quantitative debate."

Won in what sense?

Won acceptance. The validity of experimental methods and quantitative measurement, appropriately used, was never in doubt. Now, qualitative methods have ascended to a level of parallel respectability. I have found increased interest in and acceptance of qualitative methods in particular and multiple methods in general. Especially in evaluation, a consensus has emerged that researchers and evaluators need to know and use a variety of methods in order to be responsive to the nuances of particular empirical questions and the idiosyncrasies of specific stakeholder needs. The debate has shifted from quantitative versus qualitative to strong differences of opinion about how to establish causality (the attribution and so-called gold standard debate discussed in Chapters 3 and 8). While related, that's a narrower issue.

The credibility and respectability of qualitative methods varies across disciplines, university departments, professions, time periods, and countries. In the field I know best, program evaluation, the increased legitimacy of qualitative methods is a function of more examples of useful, high-quality evaluations employing qualitative methods and an increased commitment to providing useful and understandable information based on stakeholders' concerns. Other factors that contribute to increased credibility include more and higher-quality training in qualitative methods and the publication of a substantial qualitative literature.

The history of the paradigms debate parallels the history of evaluation. The earliest evaluations focused largely on quantitative measurement of clear, specific goals and objectives. With the widespread social and educational experimentation of the 1960s and early 1970s, evaluation designs were aimed at comparing the effectiveness of different programs and treatments through rigorous controls and experiments. This was the period when the quantitative/experimental paradigm dominated. By the middle 1970s, the paradigms debate had become a major focus of evaluation discussions and writings. By the late 1970s, the alternative qualitative/naturalistic paradigm had been fully articulated (Guba, 1978; Patton, 1978; Stake, 1975, 1978). During this period, concern about finding ways to increase use became predominant in evaluation, and evaluators began discussing standards. A period of pragmatism and dialogue followed, during which calls for and experiences with multiple methods and a synthesis of paradigms became more common. The advice of Cronbach (1980), in his important book on reform of program evaluation, was widely taken to heart: "The evaluator will be wise not to declare allegiance to either a quantitative–scientific–summative


methodology or a qualitative–naturalistic–descriptive methodology" (p. 7).

Signs of detente and pragmatism now abound. Methodological tolerance, flexibility, eclecticism, and concern for appropriateness rather than orthodoxy now characterize the practice, literature, and discussions of evaluation. Several developments seem to me to explain the withering of the methodological paradigms debate.

1. The articulation of professional standards has emphasized methodological appropriateness rather than paradigm orthodoxy (Joint Committee, 2010; OECD-DAC, 2010). Within the standards as context, the focus on conducting evaluations that are useful, practical, ethical, accurate, and accountable has reduced paradigms polarization.

2. The strengths and weaknesses of both quantitative/experimental methods and qualitative/naturalistic methods are now better understood. In the original debate, quantitative methodologists tended to attack some of the worst examples of qualitative evaluations, while the qualitative evaluators tended to hold up for critique the worst examples of quantitative/experimental approaches. With the accumulation of experience and confidence, exemplars of both qualitative and quantitative approaches have emerged, with corresponding analyses of the strengths and weaknesses of each. This has permitted more balance and a better understanding of the situations for which various methods are most appropriate, as well as grounded experience in how to combine methods.

3. A broader conceptualization of evaluation, and of evaluator training, has directed attention to the relation of methods to other aspects of evaluation, like use, and has therefore reduced the intensity of the methods debate as a topic unto itself.

4. Advances in methodological sophistication and diversity within both paradigms have strengthened diverse applications to evaluation problems. The proliferation of books and journals in evaluation, including but not limited to methods contributions, has converted the field into a rich mosaic that cannot be reduced to quantitative versus qualitative in primary orientation. Moreover, the upshot of all the developmental work in qualitative methods is that, as documented in Chapter 3, today there is as much variation among qualitative researchers as there is between qualitatively and quantitatively oriented scholars.

5. Support for methodological eclecticism from major figures and institutions in evaluation increased methodological tolerance. When eminent measurement and methods scholars like Donald Campbell and Lee J. Cronbach began publicly recognizing the contributions that qualitative methods could make, the acceptability of qualitative/naturalistic approaches was greatly enhanced. Another important endorsement of multiple methods came from the Program Evaluation and Methodology Division of the U.S. General Accounting Office (GAO), which arguably did the most important and influential evaluation work at the national level. Under the leadership of Assistant Comptroller General and former AEA president (1995) Eleanor Chelimsky, GAO published a series of methods manuals, including Case Study Evaluations (GAO, 1987), Prospective Evaluation Methods (GAO, 1989), and The Evaluation Synthesis (GAO, 1992). The GAO manual Designing Evaluations put the paradigms debate to rest as it described what constituted a "strong evaluation." Strength is not judged by adherence to a particular paradigm. It is determined by use and technical adequacy, whatever the method, within the context of purpose, time, and resources.

Strong evaluations employ methods of analysis that are appropriate to the question; support the answer with evidence; document the assumptions, procedures, and modes of analysis; and rule out competing evidence. Strong studies pose questions clearly, address them appropriately, and draw inferences commensurate with the power of the design and the availability, validity, and reliability of the data. Strength should not be equated with complexity. Nor should strength be equated with the degree of statistical manipulation of data. Neither infatuation with complexity nor statistical incantation makes an evaluation stronger.

The strength of an evaluation is not defined by a particular method. Longitudinal, experimental, quasi-experimental, before-and-after, and case study evaluations can be either strong or weak. . . . That is, the strength of an evaluation has to be judged within the context of the question, the time and cost constraints, the design, the technical adequacy of the data collection and analysis, and the presentation of the findings. A strong study is technically adequate and useful—in short, it is high in quality. (GAO, 1991, pp. 15–16)

6. Evaluation professional societies have supported exchanges of views and high-quality professional practice in an environment of tolerance and eclecticism. The evaluation professional societies

Copyright ©2015 by SAGE Publications, Inc. This work may not be reproduced or distributed in any form or by any means without express written permission of the publisher.

and journals serve a variety of people from different disciplines who operate in different kinds of organizations at different levels, in and out of the public sector, and in and out of universities. This diversity, and opportunities to exchange views and perspectives, has contributed to the emergent pragmatism, eclecticism, and tolerance in the field. A good example was the appearance two decades ago of a volume of New Directions for Program Evaluation on The Qualitative–Quantitative Debate: New Perspectives (Reichardt & Rallis, 1994). The tone of the eight distinguished contributions in that volume is captured by phrases such as "peaceful coexistence," "each tradition can learn from the other," "compromise solution," "important shared characteristics," and "a call for a new partnership."

7. There is increased advocacy of and experience in combining qualitative and quantitative approaches. The Reichardt and Rallis (1994) volume just cited also included these themes: "blended approaches," "integrating the qualitative and quantitative," "possibilities for integration," "qualitative plus quantitative," and "working together." Exhibit 9.2 presented 10 developments enhancing mixed-methods triangulation (p. 666).

Matching Claims and Criteria

The withering of the methodological paradigms debate holds out the hope that studies of all kinds can be judged on their merits according to the claims they make and the evidence marshaled in support of those claims. The thing that distinguishes the seven sets of criteria for judging quality introduced in this chapter (Exhibit 9.7) is that they support different kinds of claims. Traditional scientific claims, constructivist claims, artistic claims, participatory inquiry claims, critical change claims, systems claims, and pragmatic claims will tend to emphasize different kinds of conclusions with varying implications. In judging claims and conclusions, the validity of the claims made is only partly related to the methods used in the process.

Validity is a property of knowledge, not methods. No matter whether knowledge comes from an ethnography or an experiment, we may still ask the same kind of questions about the ways in which that knowledge is valid. To use an overly simplistic example, if someone claims to have nailed together two boards, we do not ask if their hammer is valid, but rather whether the two boards are now nailed together, and whether the claimant was, in fact, responsible for that result. In fact, this particular claim may be valid whether the nail was set in place by a hammer, an airgun, or the butt of a screwdriver. A hammer does not guarantee successful nailing, successful nailing does not require a hammer, and the validity of the claim is in principle separate from which tool was used. The same is true of methods in the social behavioral sciences. (Shadish, 1995a, p. 421)

This brings us back to a pragmatic focus on the utility of findings as a point of entry for determining what's at stake in the claims made in a study and therefore what criteria to use in assessing those claims. As I noted in opening this chapter, judgments about credibility and quality depend on criteria. And though this chapter has been devoted to ways of enhancing quality and credibility, all such efforts ultimately depend on the willingness of the inquirer to weigh the evidence carefully and be open to the possibility that what has been learned most from a particular inquiry is how to do it better next time.

Canadian-born bacteriologist Oswald Avery, discoverer of DNA as the basic genetic material of the cell, worked for years in a small laboratory at the hospital of the Rockefeller Institute in New York City. Many of his initial hypotheses and research conclusions turned out, on further investigation, to be wrong. His colleagues marveled that he never turned argumentative when findings countered his predictions and never became discouraged. He was committed to learning and was often heard telling his students, "Whenever you fall, pick up something."

A final Halcolm story on the nature of journeys ends this chapter—and this book.

EXHIBIT 9.16 A Documenter’s Perspective

by Beth Alberty

Introduction

This exhibit provides a reflective case study of the struggle experienced by one internal, formative program evaluator of an innovative school art program as she tried to figure out how to provide useful information to program staff from the voluminous qualitative data she collected. Beth begins by describing what she means by "documentation" and then shares her experiences as a novice in analyzing the data, a process of moving from a mass of documentary material to a unified, holistic document.

Documentation

Documentation, as the word is commonly used, may refer to "slice of life" recordings in various media or to the marshalling of evidence in support of a position or point of view. We are familiar with "documentary" films; we require lawyers or journalists to "document" their cases. Both meanings contribute to my view of what documentation is, but they are far from describing it fully. Documentation, to my mind, is the interpretive reconstitution of a focal event, setting, project, or other phenomenon, based on observation and on descriptive records set in the context of guiding purposes and commitments.

I have always been a staff member of the situations I have documented, rather than a consultant or an employee of an evaluation organization. At first this was by accident, but now it is by conviction: My experience urges that the most meaningful evaluation of a program's goals and commitments is one that is planned and carried out by the staff and that such an evaluation contributes to the program as well as to external needs for information. As a staff member, I participate in staff meetings and contribute to decisions. My relationships with other staff members are close and reciprocal. Sometimes I provide services or perform functions that directly fulfill the purposes of the program—for example, working with children or adults, answering visitors' questions, and writing proposals and reports. Most of my time, however, is spent planning, collecting, reporting, and analyzing documentation.

First Perceptions

With this context in mind, let me turn to the beginning plunge. Observing is the heart of documenting, and it was into observing that I plunged, coming up delighted at the apparent ease and swiftness with which I could fish insight and ideas from the ceaseless ocean of activity around me. Indeed, the fact that observing (and record keeping) does generate questions, insight, and matters for discussion is one of many reasons why records for any documentation should be gathered by those who actually work in the setting.

My observing took many forms, each offering a different way of releasing questions and ideas—interactive and noninteractive observations were transcribed or discussed with other staff members and thereby rethought; children's writing was typed out, the attention to every detail involving me in what the child was saying; notes of meetings and other events were rewritten for the record; and so on. Handling such detail with attention, I found, enabled me to see into the incident or piece of work in a way I hadn't on first look. Connections with other things I knew, with other observations I made, or questions I was puzzling over seemed to proliferate during these processes; new perceptions and new questions began to form.

I have heard others describe similarly their delighted discovery of the provocativeness of record-keeping processes. The teacher who begins to collect children's art, without perhaps even having a particular reason for the collecting, will, just by gathering the work together, begin to notice things about them that he or she had not seen before—how one child's work influences another's, how really different (or similar) are the trees they make, and so on. The in-school advisor or resource teacher who reviews all his or her contacts with teachers—as they are recorded or in a special meeting with his or her colleagues—may begin, for example, to see patterns of similar interest in the requests he or she is getting and thus become aware of new possibilities for relationships within the school.

My own delight in this apparently easy access to a first level of insight made me eager to collect more and more, and I also found the sheer bulk of what I could collect satisfying. As I collected more records, however, my enthusiasm gradually changed to alarm and frustration. There were so many things that could be observed and recorded, so many perspectives, such a complicated history! My feelings of wanting more changed to a feeling of needing to get everything. It wasn't enough for me to know how the program worked now—I felt I needed to know how it got started and how the present workings had evolved. It wasn't enough to know how the central part of the program worked—I felt I had to know about all its spinoff activities and from all points of view. I was quickly drawn into a fear of losing something significant,


something I might need later on. Likewise, in my early observations of class sessions, I sought to write down everything I saw. I have had this experience of wanting to get everything in every setting in which I have documented, and I think it is not unique.

I was fortunate enough to be able to indulge these feelings and to learn from where they led me. It did become clear to me after a while that my early ambitions for documenting everything far exceeded my time and, indeed, the needs of the program. Nevertheless, there was a sense to them. Collecting so much was a way of getting to know a new setting, of orienting myself. And, not knowing the setting, I couldn't know what would turn out to be important in "reconstituting" it; also, the purpose of "reconstituting" it was sufficiently broad to include any number of possibilities from which I had not yet selected. In fact, I found that the first insights, the first connections that came from gathering the records were a significant part of the process of determining what would be important and what were the possibilities most suited to the purposes of the documentation. The process of gathering everything at first turned out to be important and, I think, needs to be allowed for at the beginning of any documenting effort. Even though much of the material so gathered may remain apparently unused, as it was in my documenting, in fact it has served its purpose just in being collected. A similar process may be required even when the documenter is already familiar with the setting, since the new role entails a new perspective.

The first connections, the first patterns emerging from the accumulating records were thus a valuable aspect of the documenting process. There came a moment, however, when the data I had collected seemed more massive than was justified by any thought I'd had as a result of the collecting. I was ill at ease because the first patterns were still fairly unformed and were not automatically turning into a documentation in the full sense I gave earlier, even though I recognized them as part of the documentary data. Particularly, they did not function as "evaluation." Some further development was needed, but what? "What do I do with them now?" is a cry I have heard regularly since then from teachers and others who have been collecting records for a while.

I began with the relatively simple procedure of rereading everything I had gathered. Then, I returned to rethink what my purposes were and sought out my original resources on documentation. Rereading qualitative references, talking with the staff of the school and with my staff colleagues, I began to imagine a shape I could give to my records that would make a coherent representation of the program to an outside audience.

At the same time, I began to rethink how I could make what I had collected more useful to the staff. Conceiving an audience was very important at this stage. I will be returning to this moment of transition from initial collecting to rethinking later, to analyze the entry into interpretation that it entails. Descriptively, however, what occurred was that I began to see my observations and records as a body with its own configurations, interrelationships, and possibilities, rather than simply as excerpts of the larger program that related only to the program. Obviously, the observations and records continued to have meaning through their primary relationship to the setting in which they were made; but they also began to have meaning through their secondary relationships to each other.

These secondary relationships also emerge from observation as a process of reflecting. Here, however, the focus of observation is the setting as it appears in and through the observations and records that have accumulated, with all their representation of multiple perspectives and longitudinal dimensions. These observations in and through records—"thickened observations"—are of course confirmed and added to by continuing direct observation of the setting.

Beginning to see the records as a body and the setting through thickened observation is a process of integrating data. The process occurs gradually and requires a broad base of observation about many aspects of the program over some period of time. It then requires concentrated and systematic efforts to find connections within the data and weave them into patterns, to notice changes in what is reported, and find the relationship of changes to what remains constant. This process is supported by juxtaposing the observations and records in various ways as well as by continual return to reobserve the original phenomenon. There is, in my opinion, no way to speed up the process of documenting. Reflectiveness takes time.

In retrospect, I can identify my own approach to an integration of the data as the time when I began to give my opinions on long-range decisions and interpretations of daily events with the ease of any other staff member. Up to the moment of transition, I shared specific observations from the records and talked them over as a way of gathering yet more perspectives on what was happening. I was aware, however, that my opinions or interpretations were still personal. They did not yet represent the material I was collecting.


Thus, it may be that integration of the documentary material becomes apparent when the documenter begins to evince a broad perspective about what is being documented, a perspective that makes what has been gathered available to others without precluding their own perceptions. This perspective is not a fixed-point view of a finished picture, both the view and the picture constructed somehow by the documenter in private and then unveiled with a flourish. It is also not a personal opinion; nor does it arise from placing a predetermined interpretive structure or standard on the observations. The perspective results from the documenter's own current best integration of the many aspects of the phenomenon, of the teachers' or staff's aims, ideas, and current struggles, and of their historical development as these have been conveyed in the actions that have been observed and the records that have been collected.

As documenter, my perspective of a program or a classroom is like my perspective of a landscape. The longer I am in it, the sharper defined become its features, its hills and valleys, forests and fields, and the folds of distance; the more colorful and yet deeply shaded and nuanced in tone it appears; the more my memory of how it looks in other weather, under other skies, and in other seasons, and my knowledge of its living parts, its minute detail, and its history deepen my viewing and valuing of it at any moment. This landscape has constancy in its basic configurations, but is also always changing as circumstances move it and as my perceptions gather. The perspective the documenter offers to others must evoke the constancy, coherence, and integrity of the landscape, and its possibilities for changing its appearance. Without such a perspective—an organization or integration that is both personal and informed by all that has been gathered by myself and by others in the setting—others could not share what I have seen, could not locate familiar landmarks and reflect on them as they exhibit new relationships to one another and to less familiar aspects. All that material, all those observations and records, would be a lifeless and undoubtedly dusty pile.

The process of forming a perspective in which the data gathered are integrated into an organic configuration is obviously a process of interpretation. I had begun documenting, however, without an articulated framework for interpretation or a format for representation of the body of records, like the theoretical framework researchers bring to their data. Of course, there was a framework. Conceptions of artistic process, of learning and development, were inherent in the program; but these were not explicit in its goals as a program to provide certain kinds of service. The plan of the documentation had called for certain results, but there was no specified format for presentation of results. Therefore, my entry into interpretation became a struggle with myself over what I was supposed to be doing. It was a long internal debate about my responsibilities and commitments.

When I began documenting this particular school's art program, for example, I had priorities based on my experience and personal commitments. It seemed to me self-evidently important to provide art activities for children and to try and connect these to other areas of their learning. I knew that art was not something that could be "learned" or even experienced on a once-a-week basis, so I thought it was important to help teachers find various ways of integrating art and other activities into their classrooms. I had already made a personal estimate that what I was documenting was worthwhile and honest. I had found points of congruence between my priorities and the program. I could see how the various structures of the program specified ways of approaching the goals that seemed possible and that also enabled the elaboration of the goals.

This initial commitment was diffuse; I felt a kind of general enthusiasm and interest for the efforts I observed and a desire to explore and be helpful to the teachers. In retrospect, however, the commitment was sufficiently energizing to sustain me through the early phases of collecting observations and records, when I was not sure what these would lead to. Rather than restricting me, the commitment freed me to look openly at everything (as reflected in the early enthusiasm for collecting everything). Obviously, it is possible to begin documenting from many other positions of relative interest and investment, but I suspect that even if there is no particular involvement in program content on the part of the documenter, there must be at least some idea of being helpful to its staff. (Remember, this was a formative evaluation.) Otherwise, for example, the process of gathering data may be circumscribed.

At the point of beginning to "do something" with the observations and records, I was forced to specify the original commitment, to rethink my purposes and goals. Rereading the observations and records as a preliminary step in reworking to address different audiences, I found myself at first reading with an idea of "balancing" success and failure, an idea that constricted and trivialized the work I had observed and recorded. Thankfully, it was immediately evident from the data itself that such balance was not possible. If, during 10 days of observation, a child's experience was intense 1 day and characterized by rowdy socializing the other 9, a simple weigh-off would


not establish the success or failure of the child's experience. The idea was ludicrous. Similarly, the staff might be thorough in its planning and follow-through on one day and disorganized on another day, but organization and planning were clearly not the totality of the experience for children.

Such trade-offs implied an external, stereotyped audience awaiting some kind of quantitative proof, which I was supposed to provide in a disinterested way, like an external, summative evaluator. The "balanced view" phase was also like my early record gathering of everything. What I was documenting was still in fragments for me, and my approach was to the particulars, to every detail.

A second approach to interpreting, also brief, took a slightly broader view of the data, a view that acknowledged my original estimate of program value and attempted to specify it. Perceiving through the data the landscape-like configurations of program strengths, I made assessments that included statements of past mistakes or inadequacies like minor "flaws" in the landscape (e.g., a few odd billboards and a garbage dump in one of Poussin's dreams of classical Italy) rather than debits on a balance sheet. Here again, the implication was of an external audience, expecting some absolute of accomplishment. The "flaws" could be "minor" only by reference to an implied major flaw—that of failing to carry out the program goals altogether.

The formulation of strength subsuming weakness could not withstand the vitality of the records I was reading. The reality the data portrayed became clearer as the inadequacy of my first formulations of how to interpret the documentary material was revealed. Similarly, the implications of external audience expectations were not justified by the actuality of my relationship to the program and staff. My stated goal as documenter had been originally to set up record-keeping procedures that would preserve and make available to staff and to other interested persons aspects of the beginnings and workings of the program, and to collect and analyze some of the material as an assessment of what further possibilities for development actually existed. My goals had not been to evaluate in the sense of an external judgment of success or failure.

Thinking over what other approaches to interpretation were possible, I recalled that I had gathered documentary materials quite straightforwardly as a participant, whose engagement was initially through recognition of shared convictions and points of congruence with the program. Perhaps, I decided, I could share my viewpoint of the observations just as straightforwardly, as a participant with a particular point of view. In examining this possibility, I came to a view of interpreting observational data as a process of "rendering," much as a performer renders a piece of classical music. The interpretation follows a text closely—as a scientist might say, it sticks closely to the facts. But it also reflects the performer, specifically the performer's particular manner of engagement in the enterprise shared by text and performer, the enterprise of music. The same relationship could exist, it seemed to me, between a body of observations and records gathered participatively and documenter. The relationship would allow my personal experience and viewpoint to enhance rather than distort the data. Indeed, I would become their voice.

Through this relationship I could make the observations available to staff and to other audiences in a way that was flexible and responsive to their needs, purposes, and standards. In so doing, of course, the framework of inherent conceptions underlying the work of the program would be incorporated. Thus, to interpret the observational data I had gathered, I had to reaffirm and clarify my relationship, my attachment to and participation in the program.

My initial engagement, with its strong coloring of prior interests and ideas, had never meant that I understood or was sympathetic with every goal or practice of every participant of the program all the time. In any joint enterprise, such as a school or program, there are diverse and multiple goals and practices. Part of the task of documenting is to describe and make these various understandings, points of view, and practices visible so that participants can reflectively consider them as the basis for planning. No participant agrees on all issues and points of practice. Part of being a participant is exploring differences and how these illuminate issues or contribute to practice. My participation allowed me to examine and extend the interests and ideas I came with as well as observing and recording those other people brought. In this process, my engagement was deepened, enabling me to make assessments closer to the data than my first readings brought. These assessments are evaluation in its original sense of "drawing-value-from," an interactive process of valuing, of giving weight and meaning.

In the context of renewed engagement and deepened participation, assessments of mistakes or inadequacies are construed as discrepancies between a particular practice and the intent behind it, between immediate and long-range purposes. The discrepancy is not a flaw in an otherwise perfect surface, but—like the discrepancy in a child's understanding that stimulates new learning—is the


occasion for growth. It is a sign of life and possibility. The burden of the discrepancy can lie either with the practice or with the intent, and that is the point for further examination. Assessment can also occur through the observation of and search for underlying themes of continuity between present and past intent and practice, and the point of change or transformation in continuity. Whereas discrepancy will usually be a more immediate trigger to evaluation, occasions for the consideration of continuity may tend to be longer-range planning for the coming year, contemplating changes in staff and function, or commemorating an anniversary.

I have located the documenter as participant, internal to the program or setting, gathering and shaping data in ways that make them available to participants and potentially to an external audience. Returning to the image of a landscape, let me comment on the different forms availability assumes for these different audiences.

Participant access to the landscape through the documenter's perspective cannot be achieved through ponderous written descriptions and reports on what has been observed but must be concentrated in interaction. Sometimes this may require the development of special or regular structures—a series of short-term meetings on a particular issue or problem; an occasional event that sums up and looks ahead; a regular meeting for another kind of planning. But many times the need is addressed in very slight forms, such as a comment in passing about something a child or adult user is doing, about the appearance of a display, or the recounting of another staff member's observation. I do not mean that injecting documentation into the self-assessment process is a juggling act or some feat of manipulation; merely that the documenter must be aware that his or her role is to keep things open and that, while the observations and records are a resource for doing this, a sense of the whole they create is also essential. The landscape is, of course, changed by the new observations offered by fellow viewers.

The external audience places different requirements on the documenter who seeks to represent to it the documentary perspective. By external audience I refer to funding agencies, supervisors, school boards, institutional hierarchies, and researchers. Proposals, accounts, and reports to these audiences are generally required. They can be burdensome because they may not be organically related to the process of internal self-reflection and because the external audience has its own standards, purposes, and questions; it is unfamiliar with the setting and with the documenter, and it needs the time offered by written accounts to return and review the material. The external audience will need more history and formal description of the broad aspects than the internal audience, with commentary that indicates the significance of recent developments. This need can be met in the overall organization, arrangement, and introduction of documents, which also convey the detail and vividness of daily activity.

To limit the report to conventional format and expectations would probably misrepresent the quality of thought, of relating, of self-assessment that goes into developing the work. If there is intent to use the occasion of a report for reflection—for example, by including staff in the development of the report—the reporting process can become meaningful internally while fulfilling the legitimate external demands for accounting. Naturally, such a comment engages the external audience in its own evaluative reflections by evoking the phenomenon rather than reducing it.

In closing, I return to what I see as the necessary engaged participation of the documenter in the setting being documented, not only for data gathering but for interpretation. Whatever authenticity and power my perspective as documenter has had has come, I believe, from my commitment to the development of the setting I was documenting and from the opportunities in it for me to pursue my own understanding, to assess and reassess my role, and to come to terms with issues as they arose.

We come to new settings with prior knowledge, experience, and ways of understanding, and our new perceptions and understandings build on these. We do not simply look at things as if we had never seen anything like them before. When we look at a cluster of light and dark greens with interstices of blue and some of deeper browns and purples, what we identify is a tree against the sky. Similarly, in a classroom we do not think twice when we see, for example, a child scratching his head, yet the same phenomenon might be more strictly described as a particular combination of forms and movements. Our daily functioning depends on this kind of apparently obvious and mundane interpretation of the world. These interpretations are not simply personal opinions—though they certainly may be unique—nor are they made up. They are instead organizations of our perceptions as "tree" or "child scratching" and they correspond at many points with the phenomena so described.

It is these organizations of perception that convey to someone else what we have seen and that make objects available for discussion and reflection. Such organizations need not exclude our awareness that the tree is also a cluster of colors or that the child scratching his head is

Copyright ©2015 by SAGE Publications, Inc. This work may not be reproduced or distributed in any form or by any means without express written permission of the publisher. Enhancing the Quality and Credibility of Qualitative Studies 741

also a small human form raising its hand in a particular way. Indeed, we know that there could be many other ways to describe the same phenomena, including some that would be completely numerical—but not necessarily more accurate, more truthful, or more useful! After all, we organize our perceptions in the context of immediate purposes and relationships. The organizations must correspond to the context as well as to the phenomenon.

Facts do not organize themselves into concepts and theories just by being looked at; indeed, except within the framework of concepts and theories, there are no scientific facts but only chaos. There is an inescapable a priori element in all scientific work. Questions must be asked before answers can be given. The questions are all expressions of our interest in the world; they are at bottom valuations. Valuations are thus necessarily involved already at the stage when we observe facts and carry on theoretical analysis and not only at the stage when we draw political inferences from facts and valuations. (Myrdal, 1969, p. 9)

My experience suggests that the situation in documenting is essentially the same as what I have been describing with the tree and the child scratching and what Myrdal describes as the process of scientific research. Documentation is based on observation, which is always an individual response both to the phenomena observed and to the broad purposes of observation. In documentation, observation occurs both at the primary level of seeing and recording phenomena and at secondary levels of re-observing the phenomena through a volume of records and directly, at later moments. Since documentation has as its purpose to offer these observations for reflection and evaluation in such a way as to keep alive and open the potential of the setting, it is essential that observations at both primary and secondary levels be interpreted by those who have made them. The usefulness of the observations to others depends on the documenter's rendering them as finely as he or she is able, with as many points of correspondence to both the phenomena and the context of interpretation as possible. Such a rendering will be an interpretation that preserves the phenomena and so does not exclude but rather invites other perspectives.

Of course, there is a role for the experienced observer from outside who can see phenomena freshly; who can suggest ways of obtaining new kinds of information about them; who can, perhaps more important, point to the significance of already existing procedures or data; who can advise on technical problems that have arisen within a documentation; and who can even guide efforts to interpret and integrate documentary information. I am stressing, however, that the outside observer in these instances provides support, not judgment or the criteria for judgment.

The documenter's obligation to interpret his or her observations and those reflected in the records being collected becomes increasingly urgent, and the interpretations become increasingly significant, as all the observers in the setting become more knowledgeable about it and thus more capable of bringing range and depth to the interpretation.

Speaking of the weight of her observations of the Manus over a period of some 40 years of great change, Margaret Mead clarifies the responsibility of the participant–observer to contribute to both the people studied and to a wider audience the rich individual interpretation of his or her own observations:

Uniqueness, now, in a study like this (of people who have come under the continuing influence of contemporary world culture), lies in the relationships between the fieldworker and the material. I still have the responsibility and incentives that come from the fact that because of my long acquaintance with this village I can perceive and record aspects of this people's life that no one else can. But even so, this knowledge has a new edge. This material will be valuable only if I myself can organize it. In traditional fieldwork, another anthropologist familiar with the area can take over one's notes and make them meaningful. But here it is my individual consciousness that provides the ground on which the lives of these people are figures. (Mead, 1977, pp. 282–283)

In documenting, it seems to me, the contribution is all the greater, and all the more demanded, because what is studied is one's own setting and commitment.


APPLICATION EXERCISES

1. Locate a published qualitative study on a subject of interest to you. How does the study address and establish the credibility of qualitative inquiry? Use Exhibits 9.4 and 9.15 to review and critique how credibility is addressed in the study you've chosen. From your perspective, what questions does the study leave unanswered that, if answered, would enhance its credibility?

2. Locate a study that highlights use of mixed methods. What was the nature of the mix? What rationale was used for mixing methods? To what extent was triangulation an explicit justification for mixing methods? How integrated was the analysis of qualitative and quantitative data? Based on your review, what are the strengths and weaknesses of the mixed-methods design and analysis you reviewed?

3. Exhibit 9.11 (pp. 707–709) provides the framework for establishing the credibility of a qualitative inquirer. If you have conducted a qualitative study, or been part of one, complete that table using your own experience (fill in the column for yourself that, in Exhibit 9.11, reports Nora Murphy's experiences, perspectives, reactions, and competence). If you haven't done a qualitative study, imagine one and complete the table for a qualitative scenario that you construct. The purpose is to practice being reflexive and addressing inquirer credibility.

4. The discussion on objectivity considers a number of alternative ways of describing an inquirer's stance and philosophy (pp. 723–728). What is your preferred terminology? Write a statement describing your paradigm stance, a statement that you could give someone who was considering funding you to do a qualitative study. Describe a scenario or situation where you would need to explain your stance—and then do so. (You don't have to be limited to the language options discussed here.)

5. a. As an exercise in distinguishing quality criteria frameworks, try matching the three umpires' perspectives (p. 683) to the frameworks in Exhibit 9.7 (pp. 680–681). Explain your choices.

b. What would a systems-oriented umpire say about umpiring? (Explain.)

c. What would an artistic-evocative-oriented umpire say about umpiring? (Explain.)

d. What would a critical change umpire say?

6. (Advanced application) Below is a description of an edited volume of qualitative inquiries into the nature of family. Use the criteria for autoethnography in Chapter 3 (pp. 102–104) and the sets of criteria in Exhibit 9.7. Create your own set of 10 criteria for judging the methodological quality of this book by selecting criteria that seem especially relevant given the description of the book's approach. Use this example to discuss the nature of quality criteria in judging the quality of qualitative inquiries.

From On (Writing) Families: Autoethnographies of Presence and Absence, Love and Loss (Wyatt & Adams, 2014):


Who are we with—and without—families? How do we relate as children to our parents, as parents to our children? How are parent–child relationships—and familial relationships in general—made and (not) maintained?

Informed by narrative, performance studies, poststructuralism, critical theory, and queer theory, contributors to this collection use autoethnography—a method that uses the personal to examine the cultural—to interrogate these questions. The essays write about/around issues of interpersonal distance and closeness, gratitude and disdain, courage and fear, doubt and certainty, openness and secrecy, remembering and forgetting, accountability and forgiveness, life and death.

Throughout, family relationships are framed as relationships that inspire and inform, bind and scar—relationships replete with presence and absence, love and loss (p. 1).

7. (Advanced application) Martin Rees, astronomer, former Master of Trinity College, and former president of the Royal Astronomical Society, said, "Ultimately, I don't think there's anything special in the scientific method that goes beyond what a detective does" (quoted by Morris, 2014). Imagine that you are using this quotation to support the credibility of qualitative inquiry. Discuss how this quotation applies to each of the four dimensions of credibility discussed in this chapter (see Exhibit 9.15, p. 722).


