
We are in it together

Brian A. Nosek

University of Virginia and Center for Open Science

Sherman and Rivers (2020) summarize conceptual and definitional problems with the term “social priming” and identify potential characterizations that distinguish replicable from more questionable findings in priming research, particularly the use of highly powered within-subjects designs versus poorly powered between-subjects designs. I agree with most of their claims. Here, I extend the discussion of their characterization distinguishing types of priming, and reflect on responsibility for the replicability and credibility of social priming research.

Two research subcultures

I started graduate school in 1996. It was a very exciting time to be in the field. Russ Fazio and colleagues had just published “Variability in automatic activation as an unobtrusive measure of racial attitudes: A bona fide pipeline?”, providing evidence that a sequential priming technique could indirectly assess attitudes (Fazio et al., 1995). And John Bargh and colleagues had just published “Automaticity of social behavior: Direct effects of trait construct and stereotype activation on action,” providing evidence that subtle priming of trait constructs and stereotypes could elicit stereotype-congruent judgments and actions (Bargh et al., 1996). Both built on prior research, and there were many related contemporary contributions, but these papers became essential citations and archetypes for the next 20 years of research.

Despite the strong conceptual and practical connections between these two papers, I perceived the research subcultures around them to be quite different. As a simple illustration, I created a graph of connected papers (http://connectedpapers.com/) for both the Bargh and Fazio papers. The graphs in the Figure are not citation trees. Connected Papers uses the Semantic Scholar database to create similarity scores between papers based on what they cite and what cites them, regardless of whether individual papers cite each other. Of the 41 papers in each graph, only 3 are common to the Bargh and Fazio graphs (Blair & Banaji, 1996; Devine, 1989; Wittenbrink et al., 2001). The Fazio paper appears in the Bargh graph, but not vice versa (the two papers have a similarity score of 10.5 out of 100; the lowest similarity score within the Fazio graph is 15.1). Graphing other seminal pieces from the same time, such as Dijksterhuis and van Knippenberg (1998) and Blair and Banaji (1996), reveals similar clustering, the former like Bargh’s and the latter like Fazio’s. This informal analysis aligns with my experience of two subcultures. Experientially, the subcultures were like young children engaged in parallel play: similar interests, similar activities, occasional interaction, but largely independent work using different methods and moving in different directions.


Figure. Similarity graphs of papers associated with Bargh and colleagues (1996; top) and Fazio and colleagues (1995; bottom). Lighter shade = older papers; larger circle = more citations.


Sherman and Rivers (2020) suggest that a major difference between the replicable and challenged findings from the priming literature is the use of within- versus between-subjects designs. That is concordant with the distinction between the two archetype papers and the graphed subcultures. The subcultures also struck me as different in two other important ways.

First, the development and use of sequential priming measures and their cousins included a substantial emphasis on methodology -- testing, debating, and improving the tools of investigation and developing best practices (Fazio & Olson, 2003; Gawronski & De Houwer, 2014; Wittenbrink & Schwarz, 2007). By contrast, there was comparatively little methodological work on the priming methods used in the other subculture. For example, there is little systematic evidence about scrambled sentence test characteristics, such as the optimal number of primes and foils or the relation between prime frequency and awareness. In my experience, the attentional emphasis in that subculture was on the outcomes influenced by the priming methods. Priming methods like scrambled sentences were tools of convenience rather than methods requiring study. If anything, the opposite occurred in the other subculture. The emphasis could become so highly focused on the methods that it sometimes elicited the critique that the field was “about” the methods and no longer studying psychological phenomena. That critique may have some merit, but I believe that the emphasis on methodology had a salutary effect on increasing the likelihood of observing replicable findings.

Second, the two subcultures diverged in an underlying presumption about the power, breadth, and longevity of priming. The subculture investigating sequential priming procedures identified highly limited priming influences with narrow impact (Wentura & Rothermund, 2014), whereas the other subculture adopted a different view. Bargh and Chartrand’s (2000) perspective is worth quoting in full:

“It is notable that the same priming methods — such as the Scrambled Sentence Test and subliminal prime presentation — produce motivational and behavioral as well as perceptual effects. The inescapable conclusion from this fact is that in a given experiment, a priming manipulation simultaneously produces all of these various effects. Just because the dependent variable of interest in a given study is, say, impressions of a target person, this does not mean that the only effects of the priming manipulation was on the participants’ social perception. If the experimenter had instead placed the participant in a situation in which he or she could behave in line with the primed construct, behavioral effects would have been obtained instead.

Priming effects, along with automaticity effects, occur and operate in parallel. Priming manipulations have more effects on the participants (and on people in real life) than happen to be measured by the experimenter. It is in our view one very important direction for priming and automaticity research in the future to sort out how these various simultaneous processes interact with one another.” (Bargh & Chartrand, 2000, emphasis in original)

This expansive view of the parallel impact of priming on perceptions, motivation, goals, and behaviors well beyond the milliseconds following the prime presented researchers with a whole world of influence to be discovered. The perspective invited a search for the outer limits. If priming the concept “old” slowed people’s walking, then maybe a single exposure to the American flag could increase support for Republicans months later. If that is possible, then maybe it is possible to invoke creativity by placing a person inside or outside a box, because such a metaphor exists. Each new observation extended the amazing power of priming and dulled our sensitivity to the increasing implausibility.

My comments about the two subcultures are more impressionistic than systematic. A detailed historical analysis would be productive to assess the extent to which the two literatures operated independently, their relative attentiveness to methodology, and whether their literatures displayed evidence of narrowing versus ever-expanding influence of priming.

Who is responsible?

Sherman and Rivers (2020) express anger that “Kahneman threw the whole field of psychology under the bus” by inappropriately using terminology that applied to the whole field. They add "tremendous amounts of effort, ink, and bile were spilled fighting about the reproducibility of a small handful of effects that were not and are still not representative of the research (priming or otherwise) being conducted by social psychologists.”

I disagree that this is a “controversy over a small handful of effects.” Even if replicability challenges were limited to just one of these subcultures, they are each huge subcultures containing many thousands of papers and were very strongly represented in social psychology journals, conferences, and public discussion for 20 years. I do understand the irritation of feeling that innocent bystanders are being caught up in a controversy. However, I think that perspective is misplaced.

Even if my characterization of the two subcultures above is accurate, it does not absolve one subculture and blame the other. Differences in subcultures on power, rigor, and methodological approaches might affect replicability and credibility on average, but these challenges exist and persist in all domains, even domains with well-established theories and replicable findings. Skepticism and self-correction are essential components of the scientific process. When they are working well, lots of problems, errors, and shortcomings of research are identified. Research is hard. Fields need not be embarrassed by discovering errors, but they should worry about failing to discover them.

We are individually responsible for the integrity and credibility of the work that we produce. But we do not work in a vacuum. We are also collectively responsible for the integrity and credibility of the work that we produce. That collective responsibility is manifest in our roles as peer reviewers and researchers replicating, extending, or challenging each other’s work; as leaders and editors shaping the policies that govern publication, funding, and hiring decisions; as teachers in what and how we educate and train the next generation of scholars; and as individuals and groups that collectively create and embody the norms of the discipline. By embracing both individual and collective responsibilities, we can cultivate a culture of continuous improvement. Science’s strength and credibility do not come from being right; they come from a relentless commitment to getting it right.


References

Bargh, J. A., & Chartrand, T. L. (2000). Studying the mind in the middle: A practical guide to priming and automaticity research. In Handbook of Research Methods in Social Psychology (p. 311). Cambridge University Press.

Bargh, J. A., Chen, M., & Burrows, L. (1996). Automaticity of social behavior: Direct effects of trait construct and stereotype activation on action. Journal of Personality and Social Psychology, 71(2), 230–244.

Blair, I. V., & Banaji, M. R. (1996). Automatic and controlled processes in stereotype priming. Journal of Personality and Social Psychology, 70(6), 1142–1163.

Devine, P. G. (1989). Stereotypes and prejudice: Their automatic and controlled components. Journal of Personality and Social Psychology, 56(1), 5–18. https://doi.org/10.1037/0022-3514.56.1.5

Dijksterhuis, A., & van Knippenberg, A. (1998). The relation between perception and behavior, or how to win a game of trivial pursuit. Journal of Personality and Social Psychology, 74(4), 865–877.

Fazio, R. H., Jackson, J. R., Dunton, B. C., & Williams, C. J. (1995). Variability in automatic activation as an unobtrusive measure of racial attitudes: A bona fide pipeline? Journal of Personality and Social Psychology, 69(6), 1013.

Fazio, R. H., & Olson, M. A. (2003). Implicit measures in social cognition research: Their meaning and use. Annual Review of Psychology, 54(1), 297–327. https://doi.org/10.1146/annurev.psych.54.101601.145225

Gawronski, B., & De Houwer, J. (2014). Implicit measures in social and personality psychology. In Handbook of research methods in social and personality psychology (2nd ed., pp. 283–310). Cambridge University Press.

Sherman, J. W., & Rivers, A. M. (2020). There’s nothing social about social priming: Derailing the “train wreck.” Psychological Inquiry.

Wentura, D., & Rothermund, K. (2014). Priming is not priming is not priming. Social Cognition, 32(Supplement), 47–67. https://doi.org/10.1521/soco.2014.32.supp.47

Wittenbrink, B., Judd, C. M., & Park, B. (2001). Spontaneous prejudice in context: Variability in automatically activated attitudes. Journal of Personality and Social Psychology, 81(5), 815–827. https://doi.org/10.1037/0022-3514.81.5.815

Wittenbrink, B., & Schwarz, N. (2007). Implicit measures of attitudes. Guilford Press.