Playing Prejudice: The Impact of Game-Play on Attributions of Gender and Racial Bias
Jessica Hammer
Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy under the Executive Committee of the Graduate School of Arts and Sciences
COLUMBIA UNIVERSITY
2014
© 2014 Jessica Hammer. All rights reserved.
ABSTRACT
Playing Prejudice:
The Impact of Game-Play on Attributions of Gender and Racial Bias
Jessica Hammer
This dissertation explores new possibilities for changing Americans' theories about racism and sexism. Popular American rhetorics of discrimination, and learners' naïve models, are focused on individual agents' role in creating bias. These theories do not encompass the systemic and structural aspects of discrimination in American society. When learners can think systemically as well as agentically about bias, they become more likely to support systemic as well as individual remedies. However, shifting from an agentic to a systemic model of discrimination is both cognitively and emotionally challenging. To tackle this difficult task, this dissertation brings together the literature on prejudice reduction and conceptual change to propose using games as an entertainment-based intervention to change players' attribution styles around sexism and racism, as well as their attitudes about the same issues. “Playable model – anomalous data” theory proposes that games can model complex systems of bias, while instantiating learning mechanics that help players confront the limits of their existing models.
The web-based game Advance was designed using playable model – anomalous data
theory, and was used to investigate three questions. First, can a playable model – anomalous data
game change players' likelihood of using systemic explanations for bias, and how does it
compare to the effectiveness of a control text? Second, how does the game change players'
attitudes as compared to a control text? Finally, are there differences between three different
versions of the game that offer players different rewards for investigating the bias in the game
system?
Advance did not outperform the control text at changing players' likelihood of using
systemic attributions for racism and sexism, nor did it outperform the control text in changing
players' attitudes. However, significant differences were found between White and non-White
player populations in their sensitivity to the different game conditions. White players were
unaffected by differences between versions of the game, while non-White players showed differences in play behaviors, in systemic attribution likelihood, and in attitude. Given that White
Americans may have more entrenched ideas about discrimination in America, we consider the impacts of the game on non-White player populations as an indicator of what future development of playable model – anomalous data games may be able to achieve.
Table of Contents
List of Tables ...... iv
List of Figures ...... vi
Chapter 1: Introduction ...... 1
Chapter 2: Literature Review ...... 14
Models of Discrimination ...... 14
Reducing Prejudice ...... 20
Achieving Conceptual Change ...... 23
Game Design for Conceptual Change ...... 28
Chapter 3: Design ...... 38
Playable Models, Anomalous Data ...... 38
PMAD Design Principles ...... 43
Game Design Overview ...... 45
Sample of Gameplay ...... 49
Modeling Race and Gender ...... 53
Modeling Bias ...... 55
Reward System Design ...... 63
Chapter 4: Methods ...... 72
Research Questions ...... 73
Procedures ...... 77
Subjects ...... 81
Instruments ...... 82
Attribution tests ...... 83
Attitude analysis ...... 86
In-game data collection ...... 87
Demographic data ...... 89
Data Processing ...... 89
Attribution data ...... 90
Attitude data ...... 91
In-game data ...... 91
Demographic data ...... 92
Data Analysis ...... 93
Conclusion ...... 98
Chapter 5: Results ...... 100
Player Source Analyses ...... 100
Analysis of Web-Recruited Players ...... 106
Demographics ...... 106
Mortality and priming ...... 106
Attribution type ...... 109
Attitudes ...... 115
Analysis of Mechanical Turk Players ...... 121
Demographics ...... 121
Mortality and priming ...... 121
Attribution type ...... 125
Attitudes ...... 134
Analysis by Player Group ...... 146
Chapter 6: Summary and Discussion ...... 150
Project Summary ...... 150
Literature ...... 151
Design ...... 153
Research Questions and Results ...... 156
Results cluster one: treatment condition effects ...... 160
Results cluster two: bias guess effects for web-recruited players and for White Mechanical Turk players ...... 161
Results cluster three: bias guess effects for non-White Mechanical Turk players ...... 162
Discussion ...... 163
Results cluster one: differences between control and game ...... 163
Results cluster two: bias guess effects for web-recruited players and for White Mechanical Turk players ...... 168
Results cluster three: bias guess effects for non-White Mechanical Turk players ...... 172
Limitations of the Study ...... 177
Implications for the Literature ...... 178
Implications for Future Research and Practice ...... 180
References ...... 187
Appendix A: Name Selection ...... 201
Appendix B: Sexism Attribution Test ...... 202
Appendix C: Racism Attribution Test ...... 208
Appendix D: Attribution Test Validation Data Analysis ...... 214
Appendix E: Modern Sexism Scale (adapted) ...... 221
Appendix F: Symbolic Racism Test (adapted) ...... 224
Appendix G: Control Text ...... 227
Appendix H: Check Questions ...... 230
Appendix I: Demographic Questions ...... 232
Appendix J: Full Data Tables ...... 233
List of Tables
1 Sample differences between League of Legends and Angry Birds ...... 7
2 Subjects by recruitment source and treatment condition ...... 82
3 PMAD design principles ...... 83
4 Attribution test answer categories ...... 85
5 Crosstabulation of player source and player race ...... 101
6 Crosstabulation of player source and living area ...... 102
7 Systemic Sexism pretest means by player source ...... 103
8 Systemic Racism pretest means by player source ...... 103
9 Modern Sexism pretest means by player source ...... 104
10 Symbolic Racism pretest means by player source ...... 104
11 Crosstabulation of player source and games won ...... 105
12 Symbolic Racism pretest means by completion status ...... 107
13 Systemic Sexism pretest means by pretest group ...... 108
14 Systemic Racism posttest means by pretest group ...... 108
15 Systemic Sexism posttest means by treatment condition ...... 110
16 Symbolic Racism posttest means by treatment condition ...... 117
17 Symbolic Racism pretest means by completion status ...... 122
18 Systemic Sexism posttest means by pretest group ...... 123
19 Systemic Racism posttest means by pretest group ...... 123
20 Modern Sexism posttest means by pretest group ...... 124
21 Symbolic Racism posttest means by pretest group ...... 124
22 Number of clients placed by player race ...... 129
23 Systemic Sexism posttest marginal means by race ...... 130
24 Systemic Racism pretest means by player race and gender ...... 132
25 Percentage of score earned from bias group, marginal means by player race and gender ...... 134
26 Modern Sexism posttest marginal means by treatment condition ...... 136
27 Symbolic Racism posttest marginal means by player race ...... 137
28 Modern Sexism posttest marginal means by guess condition and player race ...... 141
29 Modern Sexism posttest marginal means by player race ...... 142
30 Modern Sexism posttest means by guess condition, Black, Hispanic, & Other ...... 142
31 Symbolic Racism posttest marginal means by guess condition and player race ...... 144
32 Symbolic Racism marginal means by player race ...... 145
33 Symbolic Racism posttest means by guess condition, Black, Hispanic, & Other ...... 145
34 Summary of research findings ...... 160
List of Figures
1 Layout of the game Advance ...... 46
2 An empty job in the game Advance ...... 47
3 Job requirements in the game Advance ...... 48
4 Peer reactions to a possible character placement in Advance ...... 58
5 Study overview flowchart ...... 79
Acknowledgments
Thank you to my advisor, Dr. Charles K. Kinzer, who challenged my preconceptions from the first day I walked into his office. I will always be grateful for his wisdom, his guidance, and his support. I would also like to thank Dr. John B. Black, who always pushed me to embrace rigor. Thanks to Dr. Matthew S. Johnson and to Dr. Gary Natriello for their disciplinary insights, and special thanks to Dr. Steven K. Feiner who stepped in to help at a critical moment.
I am grateful for the support of the Mellon Interdisciplinary Graduate Research
Fellowship program and the Breneman-Jaech Foundation. The Mellon Program gave me a community of peers and a support system when I badly needed one; particular thanks to Dr.
William McAllister, who gave me excellent advice on navigating the joys and challenges of interdisciplinarity. The Breneman-Jaech Foundation supported the technical development of
Advance, in particular the development of the data collection and parsing system.
I collaborated with some wonderful people, both during game development and during the dissertation-writing process. Austin Grossman helped me commit to a game design direction.
Alex Kaufmann created evocative, simple art. Tess Snider coded critical game features on a tight schedule, and showed me some elegant development tricks in the process. Giulia Barbano, John
Stavropolous, and John Adamus were invaluable both in recruitment and in keeping my spirits up during the recruitment process. Courtney Hall and Stoops Noh provided emergency statistics advice. Anthony Bamonte is a wizard of data processing. Lillian Cohen-Moore helped me create a consistent format and tone. I am deeply grateful to you all.
I have had extraordinary mentors who guided me in my development both as a scholar and as a game designer. I would like to thank Frank Lantz, Scot Osterweil, Clay Shirky, Dr.
Helen Tager-Flusberg, and Eric Zimmerman. You believed in me even before I believed in myself, and gave me the chance to show what I was capable of. I would not be here today without you.
I have also had extraordinary students who inspired and challenged me. You taught me far more than I taught you. You are too many to list, but you know who you are. Thank you.
It takes a village to write a dissertation. For unfailing support, even in the face of late-night crises, I’d like to thank Alan McAvinney, Meguey Baker, Anthony Bamonte, Giulia
Barbano, Emily Care Boss, Joanna Charambura, Rowan Cota, Jennifer Coy, Dan Edmonds, Julia
B. Ellingboe, Abigail Estes, Isaac Everett, Ajit George, Bret Gillan, James Grimmelmann, Austin
Grossman, Crystal Huff, Steve Huff, Elena Taurke Joseph, Renee Knipe, Diana Kudayarova,
Kimberley Lam, Rachel Elkin Lebwohl, Ben Lehman, Tse Wei Lim, Blair Kamage, Geoffrey
McVey, Nada O’Neal, Malka Older, Brand Robins, Travis Scott, Brianna Sheldon, John Sheldon,
Alexis Siemon, Richard Silvera, John Stavropolous, Danielle Sucher, Chris Thorpe, Moyra
Turkington, Graham Walmsley, and Krista White. I am so grateful to have friends like you.
I would particularly like to thank Kaitlin Heller, who is a triple threat: collaborator,
cheerleader, confidant. I feel better knowing you’ve got my back, Cap.
Special thanks also to Robert Scott and Amy Scott, who gracefully tolerated my
dissertation as an additional housemate, and who never let me forget about fun along the way.
Thanks to my in-laws, J. Christopher Hall and Susan Hall, and to my sister-in-law
Courtney Hall, for many inspiring conversations about teaching, learning, and technology.
I am profoundly grateful for the unfailing support of my family. My mother, Phyllis
Hammer, was the first person to encourage me to consider a research career. My late father,
Michael Hammer, taught me to think broadly and communicate precisely. My sister Alison went
above and beyond in picking up family responsibilities while I was absorbed in this project. My
sister Dana sustained my spirit with timely pep talks and reminders of the value of my work. My
brother David helped me solve some tough problems over excellent coffee, and my sister-in-law
Elizabeth showed me how to handle crunch times with laid-back humor. I am grateful as well for
my grandparents Henry and Helen Hammer, and Leo and Sara Thurm – especially for Pappa
Leo’s reminders to value people as much as ideas, and for Grandma Helen’s commitment to the life of the mind under extraordinarily difficult circumstances.
Above all: my husband and partner, Chris Hall. For everything, always.
In memory of Michael Hammer
My north star
Chapter 1: Introduction
Contemporary American society is haunted by persistent inequality. Health care and housing, pay rates and poverty rates, employment and education all show disparities by race and gender (Hausmann, Tyson, & Zahidi, 2010; Kozol, 1992; Lipsitz, 1995; Valian, 1999; Wenneras
& Wold, 1997). At the same time, overt expressions of prejudice have become increasingly unacceptable (Blanchard, Lilly, & Vaughn, 1991; Klonis, Plant, & Devine, 2005). So what is the true state of discrimination in American society? And what should be done about it? In addressing these questions, Americans rely on their ideas about what discrimination is and how it functions. Similarly, educational programs relating to racism and sexism also operate from assumptions about how these social constructs work.
Prejudice has been extensively theorized and studied over the years (Allport, 1979;
Dovidio, Glick, & Budman, 2005). These theories, in turn, have made their way into the design of diversity and anti-bias curricula (e.g. Adams, Bell, & Griffin, 2007) and the design of prejudice-reduction interventions more broadly. Teachers and curriculum designers, however, do not have a monopoly on what concepts are present in anti-bias work1. Learners are the most important participants – and their ideas about discrimination matter. The constructivist approach to education frames learning as the transformation of the learner's own ideas (Olsen, 1999).
Learners' pre-existing conceptions of sexism and racism are, therefore, critical. To shift learners' ideas about racism and sexism, we must understand what they originally believe2.
The widespread ideology of discrimination among white Americans – or, as I will frame
1 While anti-bias work encompasses many types of discrimination (heterosexism, ageism, religious oppression and more), this paper limits itself to the consideration of racial and gender bias.
2 This is doubly true when learners are not themselves aware that their beliefs and actions are racist or sexist. Whether we explicitly challenge their ideas, reflect them back so that learners can see the true implications of their beliefs, or confront them with anomalous data, we must always work from learners' actual beliefs rather than from their claims about those beliefs.
it in this paper, their pre-existing understanding of discrimination – is highly individualistic
(Bidell, Lee, Bouchie, Ward, & Brass, 1994; Bonilla-Silva & Forman, 2000). Learners, particularly white learners and male learners, often enter the diversity classroom with the notion that racism and sexism can only be the products of individual, intentional actions. As we will see, this model only accounts for some of the challenges that women and minorities face in modern
American society. It excludes biases that operate at the level of aggregates, probabilities, feedback systems, and institutions, as well as being unable to account for unconscious judgments or unintentional harm.
To understand discrimination in American society, learners must incorporate these ideas into their understandings of racism and sexism. However, to achieve this, anti-discrimination education cannot simply provide information about the many types of bias. Instead, programs must work to change learners' underlying models of racism and sexism. This transformation must break learners' attachment to the ideologies of “color-blind racism” and “choice feminism,” with their emphases on the isolated individual. Instead, learners must develop models of discrimination that include systemic and emergent effects in addition to individualistic ones.
This dissertation explores the challenges of changing learners' theories about racism and sexism. Achieving conceptual change of this kind is not easy, particularly when the issue at hand is a sensitive one in which learners may have a personal stake (Chinn & Brewer, 1993).
Additionally, the learners it is most important to reach are the least likely to voluntarily participate in diversity training, or even to realize that they have something to learn about the problems of race and gender in America (Paluck & Green, 2009). How, then, to achieve a
difficult conceptual shift, particularly when reaching learners who may be resistant?
There is existing work on teaching “systems thinking” – showing learners that parts of the
world can be represented as interrelated systems, and helping them build mental models of those
systems (von Bertalanffy, 1950; Forrester, 1992; Wilensky & Resnick, 1999; Jacobson &
Wilensky, 2006). Systems thinking is not about acquiring facts about a target domain. Rather, learning systems thinking means that students build mental models for themselves, ones that accurately represent the system in question.
Broadly speaking, a system is defined by entities and the dynamic relationship between
them (von Bertalanffy, 1950; Forrester, 1961; Richardson, 1999; Meadows, 2008). For example,
one entity might cause the quantity of another entity to increase. Other entities feed back to
themselves, increasing or decreasing as far as the model will allow. Pedagogies for teaching
systems thinking include concept mapping, inquiry-based approaches, participatory simulations,
and computer modeling (Chen & Stroup, 1993; Jacobson & Wilensky, 2006). The common
factor in all these methods is that students engage with models consisting of relationships
between entities, whether by creating such models or analyzing them.
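The kind of entity-and-relationship model students engage with can be made concrete in a few lines of code. The sketch below is purely illustrative; the quantities, rates, and function name are hypothetical, not drawn from any particular curriculum discussed here. It simulates a two-entity system in which a constant inflow raises a stock while the stock also reinforces itself:

```python
def simulate(steps=10, inflow=5.0, growth=0.1, stock=100.0):
    """Simulate a minimal dynamic system: an inflow entity adds a fixed
    amount to the stock each step, and the stock also feeds back on
    itself at rate `growth` (a reinforcing loop)."""
    history = [stock]
    for _ in range(steps):
        stock += inflow + growth * stock  # relationship + reinforcing feedback
        history.append(stock)
    return history
```

Running `simulate()` yields 100, 115, 131.5, and so on: the reinforcing loop makes the stock grow faster each step, the kind of nonlinear behavior that a purely linear mental model of "one thing adds to another" would underestimate.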
The bulk of the work on systems thinking has been done in science, where there are
broadly accepted systemic representations of particular scientific concepts. For example, a
classic example of systems thinking is building a mental model of an ecological system that
connects predators to prey. While there are differences among ecologists about the details of
ecological modeling, and there is much to learn about specific predator-prey networks, the basic representation of how predator and prey populations relate to one another is shared.
When it comes to racism and sexism, however, it is difficult to even agree on a representation of the underlying problem. Unlike systems in the physical world, the systems that reinforce racism and sexism are relatively difficult to isolate and measure. There is still debate in the field about even apparently basic questions, such as what prejudice is and how it operates.
Although some models of bias exist (e.g. Schelling, 1971), it is much easier to construct a simplified example of predation than of, say, persistent income inequality. Additionally, learners are likely to be more attached to their explanations for income inequality than for predation, as issues of social inequity connect to their moral outlook and their basic faith in the world (Jost &
Banaji, 1994; Jost, 2002; Jost, Banaji & Nosek, 2004).
The inability to even conceive of systemic explanations for racism and sexism, however, poisons the public conversation about these issues (Iyengar, 1994). Even if the details of the model are wrong, the notion that a systemic explanation is possible is in itself valuable.
What we attempt to do with this work, therefore, is to introduce the concept of a systemic explanation for racism and sexism in the first place – even if learners do not agree with all the specifics of the model being presented. Recognizing that possibility is a precursor to developing accurate systemic models of sexism and racism. This project, therefore, uses simplified models of actual prejudices to construct its theories of what racism and sexism are, in order to help players understand that there might be systemic factors at play when looking at unequal outcomes along racial or gender lines. The goal is for players to be willing to consider systemic explanations in addition to agentic ones. This is a conceptual change in the type of model of discrimination they are applying, even if some of the details of the model may be contested (Niedderer, Schecker &
Bethge, 1991).
As we will see in chapter two, achieving this sort of conceptual shift is a difficult challenge. However, games may provide one way to meet it. The Serious Games movement proposes the use of game-based interventions to achieve social goods, from education to health to activism and more (Sawyer, 2010). The premise of game-based interventions is that games can achieve things that may be difficult for other forms of media. Games do a number of things
particularly well, from letting players take on new identities to reaching millions of people worldwide (Gershenfeld, 2010; Klopfer, Osterweil, & Salen, 2009). Harnessing these capabilities can give us a powerful new tool to help learners achieve conceptual change around racism and sexism.
Many games can be productively understood as complex systems of entities and their interrelationships (Strange, 2011). In fact, systemic features such as positive and negative feedback systems are core elements of what it means to make a game (Zimmerman & Salen,
2003). For this reason, games can incorporate many of the features of pedagogies for systems thinking, as will be explored later in this dissertation.
Games provide players with clear goals, and rely on player actions within a complex system to achieve those goals (Isbister, Flanagan, & Hash, 2010). Given appropriate game design, players can be motivated to engage with an arbitrary complex system, to explore it, and to figure out how to exploit it in order to succeed. This is the process of players exploring a game's rule-set and strategy space.
Well-designed games provide immediate and meaningful feedback, helping players test theories, then discard or build on them depending on how useful they are (Gee, 2003). This process of directed theory-building, followed by rapid testing, is a powerful way for learners to achieve conceptual change (Chinn & Brewer, 1993).
Games also provide a playful approach to a serious subject – potentially allowing the intervention to reach many people who are otherwise inaccessible to anti-racist and anti-sexist work. Games take place in a “magic circle,” a space in which real-world objects, emotions and commitments do not have precisely the same meaning they do in one's ordinary life (Copier, 2005; Huizinga, 1950). Because players can choose the degree to which their game-play has real-world implications, games may help separate people from their everyday positions and make challenging messages more palatable.
Finally, games have an immense reach. This is particularly true of the short-form, web-based games known as “casual games.” As of 2007, casual games reached 200 million people per month, many of whom play for as many as fifteen hours per week (Casual Games Association,
2007). Compared to many other forms of diversity training, such as in-person classes, a digital game intervention has the potential for rapid deployment to a large population who might otherwise be unreachable by anti-bias work.
Of course, games are not a unitary subject. For example, two of the most popular games in the world, as of this writing, are League of Legends and Angry Birds (Rovio Entertainment,
2012; Riot Games, 2012). We can argue that their popularity indicates good design, but they are radically different, demonstrating that a single theory of game design cannot encompass all games.
League of Legends (Riot Games, 2009) is primarily played on home computers. Teams of five players must collaborate to defeat enemy players, destroy computer-controlled agents, and capture their opponents' home base. Players spend their time completing tasks such as moving around a map, coordinating with teammates, predicting the behavior of opposing players based on limited information, executing swift power combos, and investing limited resources into improving their capacities. The game plays out in real time (Juul, 2005) and losses count against players in the long run; once a player initiates a game, they must follow through and try to win, since losing has a price (League of Legends Wiki, 2013).
Angry Birds (Rovio Entertainment, 2009), on the other hand, is primarily a single-player mobile game; players never interact directly, although they can compare their post-play
performance on leaderboards. In each level, players must “throw” birds at structures made of
different materials such as wood, glass, and stone, in order to destroy all the enemy pigs. Players
spend their time calculating trajectories, observing the results of a bird throw, and making plans
based on their understanding of bird properties, material properties, and pig location. The game
can be paused at any time, and players can continue to replay levels as often as they want – there
is no penalty for failure.
Some examples of differences between these games are summarized in Table 1.
Table 1
Sample differences between League of Legends and Angry Birds
League of Legends Angry Birds
Number of players Up to five per team One
Platform PC Primarily mobile
In-game Time3 Haste & synchronicity Interval control
Point of View Isometric 2D
Player Control Single humanoid character Slingshot
Loss Condition Base destroyed Pigs remaining at end of level
Even these games, however, have some fundamental similarities. For example, they are
both digitally mediated, rather than using cards, blocks, dice, or other physical game
technologies. In this way, both games are quite different from Jenga (Scott, 1983), which relies on its materiality, or from solitaire, which can be played using either physical or virtual cards. In both games, success and failure are primarily mediated by the technology; the role of player
3 Player control of temporal progression in the game, as per Elverdam & Aarseth (2007).
judgment is in strategy development rather than evaluating outcomes, unlike Apples to Apples
(Out of the Box Publishing, 1999). For that matter, players can win or lose these games at all, rather than experiencing an open-ended story as in 1,001 Nights (Baker, 2012).
Rather than rehash the debates about whether there are particular features that make a game a game rather than a puzzle or a toy (e.g. Costikyan, 2002), we raise this issue to point out that it is important to make claims about specific types of games. This paper proposes a theory for understanding a particular game type, and does not argue that all games will function in this way. Instead, we can ask to what extent a particular game fits the theory we develop, and we can use design principles based on this theory to guide the creation of games.
The advantages that games can provide depend on the game's being well-designed.
“Well-designed,” however, is not a neutral term. For example, game mechanics that allow losing players to catch up to the leaders can heighten the sense of tension in a game (LeBlanc, 2005). If that tension is consonant with other choices in the game, and appropriate for the audience for the game, then perhaps it can be described as “good” design. However, the very same choice in a different context might be a poor one, if it undermines the rest of the experience of play.
When we are discussing games for learning, therefore, we must always evaluate the
“goodness” of game design in relation to the game's learning model as well as to playability and enjoyment. For the purposes of this project, good game design means that in addition to being playable and enjoyable, the game challenges players' existing conceptions of bias and encourages them to shift their mental model to accommodate systemic explanations. The research indicates that one effective way to do this is through demonstrating to players that their existing model does not explain their experiences (Chinn & Brewer, 1993).
In order to talk about what we mean by “well-designed” and what we mean by “game,”
therefore, this dissertation proposes a theory of “playable model – anomalous data” games.
PMAD theory treats games as “playable models,” in which the game's rules express a model of
the situation at hand. However, when players engage with the game, they may encounter
“anomalous data” – game experiences that do not fit their preconceptions. Because players know
that game rules may be different from real-life rules, they may be more willing to rethink the
model they are using within the game context in order to explain, manipulate, and win the game.
Using this theory, we develop guiding principles for PMAD-based game design, as follows:
• The game system models the relevant domain.
• Player actions affect, and are affected by, the model.
• Players receive feedback about the impacts of their actions as they relate to the model.
• The game goals point players toward model conflict.
• Players can experiment with the game's model.
• Players must figure out rules and strategies for themselves.
These principles are based on game design theory and educational theory, as detailed in
chapter three. They can serve as an analytic framework for analyzing existing games, or provide
design guidelines for creating games to address specific issues. This dissertation takes the latter
approach, using the PMAD design principles to guide the design and development of Advance4,
a PMAD game which supports the development of systemic models of bias through play.
In chapter four, we outline a method for testing the impact of Advance, as a way of
4 Advance is available online at http://www.replayable.net/advance/.
testing PMAD theory more generally. Can a PMAD game that models systemic bias change the
type of explanation players use to explain incidents of bias? And might it also change players'
attitudes toward bias as part of the process? To investigate these questions, we gather data on
players' attribution style for incidents of racism and sexism, and on their attitudes about race and
gender. By randomly assigning players either to play the game, or to read a text about related
concepts, we can see whether the game or the text has a greater impact on players' attribution
styles and attitudes. This allows us to understand whether the game is effective at helping
American players use systemic models to explain discrimination5.
However, the overall test of effectiveness gives us limited information, because there are
many things about a PMAD game that might make it a more or less effective intervention. If it
succeeds, we have not isolated which factors were most important. Similarly, if it fails, we
cannot know whether PMAD theory has failed, or whether there is something else about the
game that made it less than useful. Work has begun to define game design patterns (Bjork &
Holopainen, 2004) and learning mechanics (Plass, Homer, Kinzer, Frye, & Perlin, 2011), but the
impact of specific game design decisions on a player's experiences and ideas is still an open
field.
The purpose of this dissertation, therefore, is not only to test the effectiveness of this
intervention, but to evaluate design decisions within the context of the PMAD game itself.
Specifically, in this game we test how different design patterns for in-game rewards are more or
less effective at changing players' attribution styles and attitudes about racism and sexism.
In line with the PMAD design principles, racial and gender bias are deeply embedded in
5 The cognitive and ideological biases described in this paper may be specific to American culture (Morris, Menon, & Ames, 2001). A cross-cultural analysis of mental models of discrimination would be a fruitful area for future research.
the game's model, as detailed in chapter three. To surface whether players are developing theories about the game's bias, players are offered the chance to identify the bias present in the game from a set of seven possible options. However, asking this question in a vacuum is not necessarily meaningful within the game's model. Rather, we need to consider how their knowledge of the game's bias is connected to the rest of the game – in this case, through the game's reward system.
The three reward conditions in the game all give the player some type of value in exchange for identifying the bias. By holding all other game factors constant, we can analyze the comparative motivation to engage with this part of the game system under different reward types.
We choose three different reward patterns, based on common game design patterns (Bjork &
Holopainen, 2004). The first type of reward is an informational reward; players gain knowledge about how the game's system works. The second type of reward is a financial reward; players receive a flat sum for demonstrating their knowledge. The final type of reward is a generative reward, in which the reward is valueless on its own but gives the players new capacities by changing the rules of the game.
By randomly assigning players to reward conditions while holding all other game elements constant, we can examine which of these design techniques make Advance more and less effective at conveying complex concepts and challenging player preconceptions. We can also isolate the impact of the changes in the game's reward system on players' experiences in the game. More generally, we may be able to draw conclusions about the way in-game reward systems can shape player goals and activities.
The findings of this dissertation, detailed in chapter five and explored more deeply in chapter six, are not straightforward. Advance did not outperform a control text at changing players' likelihood of using systemic attributions for racism and sexism, nor did it outperform the control at changing players' attitudes about race or gender. This finding is not entirely surprising.
Advance tackled an extremely difficult problem using a novel design theory. If Advance had
outperformed the control text, it would have been a promising indicator in favor of PMAD
theory. However, the failure to find a main effect of the game does not necessarily contradict
PMAD theory. Rather, we must consider whether there were problems in the game's design that
were not based on PMAD theory, or whether PMAD theory itself is flawed. As we will see in
chapter six, it is most probable that elements of both are true. There were flaws in the game's
design, including the short play period, that limited its efficacy. However, in light of the data
gathered from this study, the PMAD design principles will also need to be revised and tested
further.
The study also found a significant difference between White and non-White player
populations6 in how they responded to the different game conditions, including the condition in
which players did not guess at the bias in the system and were not aware they could be rewarded
for it. White players were unaffected by differences between versions of the game. They did not
respond to the different bias guess conditions with differences in their play behaviors, and they
did not differ on any of the outcome measures. However, non-White players did show
differences in play behaviors and on the outcome measures. Given that White Americans benefit
most from racism in America, their beliefs about discrimination may be more entrenched,
making them a harder population to affect (e.g. Kunda, 1990). We therefore consider the impacts
of the game on non-White players as an indicator for what future development of PMAD games for racism and sexism might be able to achieve.
6 As described later in this dissertation, the distinction is in fact between web-recruited players (who were primarily White) and White players from Mechanical Turk taken together, compared to non-White players recruited from the Mechanical Turk site.
Finally, there are further lessons in the nature of the differences found for non-White
players. These players performed differently across the four bias guess conditions on post-test
measures of both racist and sexist attitudes. However, there were no performance differences
between the groups on in-game measures. We therefore conclude that factors other than changes
in play behavior were driving these differences between conditions. This study originally
hypothesized that players would engage differently with the game's anomalous data under
different reward conditions, and that differences in outcome measures would be driven by
changes in player behavior in response to different in-game opportunities and incentives. Instead,
it appears that change can also be driven by other factors, even within the context of a PMAD
design. For example, players may have drawn conclusions about the game world's commitment
to an unjust status quo based on the mechanics available to address bias in the game system.
When the game implied that discrimination was the norm, players retroactively justified it; when
it rewarded players for speaking out, players did not.
Although Advance did not outperform the control text at changing players' attributions or
attitudes, we believe that PMAD design theory can be revised to be more useful to a poorly-
theorized area. Using non-White players' reactions as our guide, we can develop PMAD theory to help create more effective entertainment-based prejudice-reduction interventions. As we will see in the following chapter, this is an important and difficult task.
Chapter 2: Literature Review

Models of Discrimination

This chapter considers theories of how discrimination manifests in society, rather than its roots or origins. What do racism and sexism look like? How can one recognize them? What should one do to intervene? Answers to these questions are based on mental models of how discrimination, as a social process, functions. We propose that the mental models of many
Americans are significantly different from those of scholars, educators and other experts, and that they are different in ways that go far beyond the possession of different information.
This dissertation explores the differences between two models of bias: individual and systemic bias (e.g. Gomez & Wilson, 2006; Feagin, 1972). Individual bias is bias which is rooted in the beliefs, actions and attitudes of individuals, while systemic bias emphasizes the effects of larger systems and processes (Gomez & Wilson, 2006; Adams et al., 2007; Feagin, 2006;
Schmidt, 2005). For example, no one individual is responsible for the Federal Housing
Administration's policies from 1930 to 1970, but taken together, those policies resulted in white
Americans receiving far better housing outcomes than black Americans and other minorities
(Lipsitz, 1995).
The distinction between individual and systemic bias is cognitively significant. The former is an example of a direct process, in which actions are directly tied to outcomes (Chi,
2005). The latter is best represented as a complex system, in which structural factors, remote causes, and the interrelationship between entities cause racial or gender disparities (Meadows,
2008).
In direct processes, the focus is on the actions of individual agents and their intentional work toward a larger goal. The actions of agents directly predict the larger pattern, and are
considered in sequential order rather than simultaneously. Causes lead to effects in a relatively
straightforward fashion. This is a familiar narrative mode. As E. M. Forster might put it, “the
white man fired the black woman, and then discrimination occurred” is a story we can all
understand (Forster, 1956).
Systems, on the other hand, consist of entities and the relationships between them (von
Bertalanffy, 1950; Forrester, 1992; Wilensky & Resnick, 1999; Jacobson & Wilensky, 2006;
Richardson, 1999; Meadows, 2008). These relationships are dynamic, consisting of “flows” from
one entity to another. One entity might cause the quantity of another entity to increase, for
example. Entities can also return upon themselves, creating self-regulatory mechanisms.
The behavior of a system is latent in its structure (Meadows, 2008). This behavior can be conceptualized as emerging from the relationships between the entities of the system. In other words, systems are interdependent. The behavior of a complex system also cannot always be
predicted from the behavior of individual entities; holistic results emerge from the behavior of
the system as a whole. Finally, systems can feed back to themselves; their full range of behaviors
emerges over time, as the patterns of activity accumulate.
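The distinction between a direct process and a feedback system can be made concrete in a few lines of code. The sketch below is purely illustrative (the functions, rates, and numbers are our own, not drawn from any cited model): in a direct process each action contributes a fixed amount to the outcome, while in a feedback process each step depends on the accumulated state, so long-run behavior is latent in the structure rather than in any single action.

```python
def direct_process(steps, increment=1.0):
    """A direct process: the outcome is just the sum of independent actions."""
    value = 0.0
    for _ in range(steps):
        value += increment
    return value

def feedback_process(steps, rate=0.1, start=1.0):
    """A feedback process: each step's change depends on the current state,
    so the entity 'returns upon itself' and behavior emerges over time."""
    value = start
    for _ in range(steps):
        value += rate * value
    return value

# Comparable per-step changes, very different long-run behavior:
print(direct_process(50))              # 50.0: linear, predictable from one step
print(round(feedback_process(50), 1))  # roughly 117.4: emerges from accumulation
```

The same structural point underlies systemic bias: no single step looks remarkable, but the system's accumulated behavior differs qualitatively from the sum of its parts.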
Most Americans begin with a “deterministic” or “centralized” mindset (Wilensky &
Resnick, 1995, 1999; Resnick, 1996) rather than with a systemic way of seeing the world.
American learners primarily understand racism and sexism as direct processes – in other words,
as centering on individual bias (Hughes & Tuch, 2000). To include systemic effects in their
understanding of bias would require an underlying conceptual shift, helping people see sexism and racism as manifestations of a complex system. Many people have trouble making sense of complex systems (Mandinach & Cline, 1994; Tversky & Kahneman, 1974), so we would expect this shift to be difficult.
So what, precisely, do naïve learners believe about the nature of racism and sexism?
What sort of processes do they believe are at work? For this, we turn to popular ideologies of race and gender. Bonilla-Silva has demonstrated the prevalence of “color-blind racism” among
white Americans (Bonilla-Silva & Forman, 2000; Bonilla-Silva, 2006). Hirshman defines a
similar phenomenon regarding gender as “choice feminism” (Hirshman, 2007). As we examine
these ideologies, we will see strong evidence of thinking about discrimination as agentic rather
than systemic.
Color-blind racism, as outlined by Bonilla-Silva (Bonilla-Silva, 2006), relies on
underlying beliefs he refers to as frames. These frames reveal clear evidence of agentic thinking.
The first of these is the minimization of racism frame. This frame suggests that discrimination is
no longer an important part of life for minorities. To maintain this frame in the face of
widespread racial inequality, people using it define discrimination as all-out racist behavior – "old-fashioned racism," as scholars might call it (Sears & Jessor, 1996). Racist incidents with no obvious perpetrator or no clear intent are minimized and dismissed – all the more so when the problem cannot be reduced to a single incident, but rather appears in patterns and probabilities of behavior. This is precisely a definition of racism as a direct, agent-based process, and a corresponding minimization of systems of racism.
The other three frames support this approach by emphasizing the centrality of individual decisions for social behavior, even as they are simultaneously used to diffuse responsibility for those decisions. The frame of abstract liberalism emphasizes an individual's right to make free choices, regardless of larger social implications. The frame of naturalization removes individual responsibility for those choices by framing them as near-biological imperatives. And finally, the frame of cultural racism provides an explanation for racial inequality in which culture, not
individuals, is to blame; since individuals are not at fault, racism could not possibly be at work.
Taken together, these concepts underlying color-blind racism reveal a popular understanding of racism as a direct process – precisely what Chi's research on the prevalence of direct-process theories would lead us to believe (2008). The same is true of choice feminism.
Choice feminism is a cultural frame which valorizes choices made by women – any choices made by women, with no attention to their larger systemic impact. In other words, choice feminism looks only at direct processes of sexism and feminist action. Hirshman (2007) illustrates the way these supposedly individual choices (in her case, the choice of high-powered women to take primary responsibility for housework and opt out of elite careers) cause society-wide effects, such as the absence of women from the corridors of power, and are rooted in the effects of complex systems such as the tax code.
We can see, therefore, that for both racism and sexism, popular discourse reveals ideas of individual choices and direct processes. These discourses ignore systemic effects, or even use the idea of systemic bias as a way of minimizing the experience of discrimination.
Of course, we must also demonstrate that a systemic approach is a valuable addition to agentic understandings of sexism and racism – that is, that bias can be productively understood through systems thinking. Much work has been done that demonstrates systemic bias around race and gender, both theoretically
(e.g. Schelling, 1971) and empirically (e.g. Swim, Hyers, Cohen, & Ferguson, 2001). These theories encompass individual acts of sexism and racism, but also explain situations that a direct- process approach cannot explain.
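Schelling's (1971) theoretical demonstration is easy to sketch computationally. The toy one-dimensional variant below is our own illustration (not Schelling's published model, and not code from this dissertation): each agent merely prefers that a modest majority of its nearby neighbors share its type, yet repeated moves by unhappy agents tend to produce far more segregation than any individual agent wants.

```python
import random

def same_type_fraction(grid, i, radius=2):
    """Fraction of an agent's nearby neighbors (on a ring) sharing its type."""
    n = len(grid)
    neighbors = [grid[(i + d) % n] for d in range(-radius, radius + 1) if d != 0]
    return sum(1 for t in neighbors if t == grid[i]) / len(neighbors)

def segregation(grid):
    """Average same-type neighbor fraction: 0.5 = fully mixed, near 1 = sorted."""
    return sum(same_type_fraction(grid, i) for i in range(len(grid))) / len(grid)

def step(grid, threshold, rng):
    """One round: pair up unhappy agents at random and swap their positions."""
    unhappy = [i for i in range(len(grid))
               if same_type_fraction(grid, i) < threshold]
    rng.shuffle(unhappy)
    for a, b in zip(unhappy[::2], unhappy[1::2]):
        grid[a], grid[b] = grid[b], grid[a]

rng = random.Random(0)
grid = ['A', 'B'] * 20             # start perfectly integrated
before = segregation(grid)         # exactly 0.5: every agent half-surrounded
for _ in range(100):
    step(grid, threshold=0.6, rng=rng)  # agents want a 60% local majority
after = segregation(grid)
print(before, after)               # clustering tends to rise well above 0.5
```

The point is systemic: the segregated pattern is an emergent property of the interaction structure, not the intention of any single agent.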
For example, a direct-process approach excludes racism and sexism that relies on feedback effects. In her discussion of women and ambition, Fels (2004) argues that women learn what is possible from their daily experiences, and then, rationally, adjust their ideas to match
their experiences. That adjustment, in turn, affects the types of experiences they have from day to
day. This type of feedback can occur at more abstract levels as well. Valian (1999) analyzes the
impact of repeated interactions that involve even a small level of bias. She demonstrates that bias
is amplified if these interactions are part of a feedback system rather than considered
individually.
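The amplification Valian describes can be illustrated with a deterministic toy model (our own sketch, with hypothetical numbers; it is not Valian's analysis). At each round of a promotion hierarchy, one group's odds of advancing are multiplied by (1 + bias) relative to the other's; considered individually the bias is nearly invisible, but fed back through repeated rounds it compounds.

```python
def fraction_after_promotions(start_frac, levels, bias):
    """Share of a group remaining after successive promotion rounds in which
    the other group advances at a (1 + bias) relative rate per round."""
    frac = start_frac
    for _ in range(levels):
        frac = frac / (frac + (1 - frac) * (1 + bias))
    return frac

# No bias: the starting 50/50 split is preserved through eight rounds.
print(fraction_after_promotions(0.5, 8, 0.0))            # 0.5
# A 5% per-round advantage, compounded over eight levels of a hierarchy:
print(round(fraction_after_promotions(0.5, 8, 0.05), 3))  # about 0.404
```

A single decision with a 5% tilt looks negligible, but across eight levels it turns a 50/50 population into roughly 60/40 at the top: the feedback amplification Valian (1999) describes.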
A direct-process approach also excludes probabilistic effects. Consider the paucity of
films that pass the Bechdel Test (“Bechdel Test Movie List,” 2010). No individual movie is
necessarily problematic on its own. However, a moviegoer is far more likely to encounter films
which do not represent women as full-fledged characters than to encounter films that portray
women as human beings with their own interests, motivations and goals.
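The probabilistic character of this effect can be made concrete with a toy binomial calculation (the numbers are hypothetical, chosen only for illustration; they are not statistics from the Bechdel Test list).

```python
from math import comb

def prob_at_most(k, n, p):
    """P(at most k successes in n independent trials with success probability p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Suppose (hypothetically) that 40% of films pass the Bechdel Test, and a
# moviegoer sees 20 films in a year. No single failing film is remarkable,
# but the chance that a majority of the year's films fail is high:
print(round(prob_at_most(9, 20, 0.4), 3))   # roughly 0.75
```

Each film, judged in isolation, can be excused; the pattern only becomes visible at the level of the aggregate, which is exactly what a direct-process model cannot represent.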
Of course, there are other types of racism and sexism that are not well accounted for by
Americans' naïve models of discrimination, such as implicit bias (Greenwald, McGhee, & Schwartz, 1998; Stanley et al., 2011). However, the categories above are the ones which most
directly reject the assumptions of direct-process racism and sexism: that individuals are the unit
of analysis, and that each incident should be judged in isolation. For that reason, we emphasize
these categories, both in our analysis and in the design of an anti-bias intervention.
We can see, therefore, that the cultural discourse of Americans around racism and sexism frames them as direct processes. However, sexism and racism must also be understood systemically if we hope to change the social structures that reinforce and perpetuate racial and gender inequality. When facing a problem which has systemic aspects, a systemic approach allows the consideration of appropriate remedies. Having the right model is a crucial factor in problem-solving, as we know from decision-making research (Newell & Simon, 1972).
Remedies that only address direct-process discrimination will leave vast swaths of inequality
untouched – but Americans who use a direct-process model may react negatively to any
intervention that affects the systemic level. For example, an individual-centric, direct-process
approach to racial inequality significantly predicts less-progressive views on a wide range of
race-related policy measures (Iyengar, 1989; Iyengar, 1994; Hughes & Tuch, 2000; Lau & Sears,
1981). If only an individual's choice matters, then it is unsurprising that direct-process thinkers
react negatively to any intervention which constrains that choice because of systemic effects.
Whether these interventions focus on changes in individual behavior, or on reshaping social
structures, people will not support interventions whose rationale does not match their mental
model of the underlying problem.
For example, one result of direct-process thinking is that the word “racist” or “sexist” is
often treated as more problematic than discrimination itself. Consider Bush's claim that the worst
moment of his presidency was being called a racist by Kanye West – not the 9/11 attacks or
Hurricane Katrina or the collapse of Lehman Brothers (Bush, 2010). Rather than addressing the
systemic inequalities which lead to unfair outcomes, the emphasis is on the identification of
racism, which is necessarily framed as an accusation under the direct-process model. A culture in which sexism and racism cannot even be identified without changing the conversation is a culture that cannot solve the problems of racial and gender bias.
The challenge, then, is one of conceptual transformation. Rather than conceiving of discrimination solely as the result of direct processes, learners must also consider it as a complex system. Without the language to describe the systemic aspects of racism and sexism, learners cannot identify and support appropriate remedies, or even identify major aspects of the problem in the first place. So how do we change people's minds about racism and sexism? And what work has already been done toward this goal?
Reducing Prejudice

If we hope to propose methods to expand people's models of discrimination, we must understand what has already been proven effective in this area. This section reviews existing prejudice-reduction interventions. Many of these interventions are based in psychological or sociological theories of racial and gender bias (Oskamp, 2000). However, as we shall see, these interventions largely focus on combating the causes of prejudice, not on changing learners' underlying models of how racism and sexism operate. Even those that include this element do not treat it as the fundamental conceptual shift that it is, but rather as a standard element of the curriculum to be conveyed (Adams, Bell, & Griffin, 2007).
Effective prejudice-reduction interventions must be based in accurate theories of prejudice. Duckitt (1994) proposed a four-level model of the origins of prejudice: genetic predispositions, norms for intergroup relations, mechanisms of social influence, and individual differences in attitudes and behaviors. Oskamp (2000) then connects the development of anti-prejudice interventions to these four levels. He concludes that interventions are unlikely, at present, to occur on the genetic level. However, it should be possible to design interventions around social norms, mass or interpersonal influence, or the modification of individual personality characteristics.
Paluck & Green (2009) review and categorize existing prejudice-reduction interventions, identify which theories the intervention relies on, and evaluate the standard of evidence for the effectiveness of each type of intervention. They define six types of intervention that are relatively well-supported by experimental evidence, both from the laboratory and from the field. These six approaches include explicit cross-cultural training; contact with members of other groups; cooperative learning with members of other groups; value consistency exercises, which show
bias to be inconsistent with other deeply-held values; peer influence; and exposure to books, films and other entertainment media.
These interventions are rooted in a variety of theories: the contact hypothesis, social interdependence theory, social norm theory, social learning theory and more (Khan, 2000; Paluck
& Green, 2009). These theories, however, emphasize the psychological and social causes that produce prejudice – not the models and ideas that learners have about what prejudice is, where it comes from, and how to recognize it in the real world.
In fact, some of these approaches, as effective as they are in reducing prejudice, may simultaneously contribute to the spread of direct-process models of discrimination. Consider, for example, the peer influence approach. Peer behavior can set the social norm for expressions of prejudice and attitudes toward discrimination (Blanchard et al., 1991; Paluck, 2006). Some theories even give peer influence a major role in the nature of prejudice itself (Crandall &
Stangor, 2005). However, without special training, peer influence programs are likely to spread naïve learners' own understandings of how prejudice functions. Direct-process models of racism and sexism are the ones most learners start with, and are easier to understand than emergent-process models, as we will see below. They may, therefore, become part of the social norm.
This is not to say it is impossible to teach an emergent understanding of racism or sexism.
Schmidt (2005) outlines seven concepts tied to the teaching of racism as systemic inequality.
These concepts include the larger notion of individual, institutional and cultural levels of racism
(Feagin, 2001; Feagin, 2006), as well as specifics such as the internalization of racism, which references unconscious prejudice (Fox, 2001), and historical inequality, which functions as a feedback effect (Oliver & Shapiro, 2006). However, the way in which these concepts are taught must be tied to the desired effect, namely conceptual change. Even if learners are able to grasp
these specific concepts, as long as they retain a direct-process model of discrimination, they will not be able to build on them effectively.
A successful intervention aimed at developing a systemic understanding of bias will build on one of the six types found to be effective by Paluck and Green (2009), while addressing the specific learning challenges related to conceptual change. This dissertation explores the theory and practice of creating entertainment-based interventions using games.
Entertainment-based interventions are interventions that use popular forms of media to change people's beliefs and attitudes about sexism, racism, or other forms of prejudice. To date, this type of intervention has been particularly poorly theorized (Paluck & Green, 2006).
Research on entertainment-based interventions thus far has focused on two approaches: first, changing social norms through mass influence, and second, encouraging perspective-taking and empathy at the individual level (Strange, 2002; Zillmann, 1991). Entertainment interventions have been shown to be effective at changing social norms, but not at changing personal beliefs
(Paluck, 2009). However, recent research on perspective-taking in fiction suggests that this, too, may be possible. Kaufman & Libby (2012) found that inducing a reader to identify with a fictional character, then later revealing that the character was a member of an out-group, reduced the tendency to stereotype that character and improved reader attitudes toward their group. We therefore believe that with an appropriate theoretical grounding, entertainment interventions can change individual attitudes as well as social norms.
This dissertation proposes one appropriate theoretical grounding for using games as anti-bias entertainment interventions to help players shift their conceptual models of racism and sexism from the individualistic to the systemic. Games are certainly a popular entertainment medium; for example, 97% of American teens play games (Lenhart et al., 2008). However, in
the games community, a division is often drawn between “serious” and “entertainment” titles,
with games created for the social good falling into the former category (Michael & Chen, 2005).
Under this typology a game would be “serious” if it were constructed with an eye to social
change and successful research. However, that does not necessarily prevent the same game from
being entertaining. Games with serious intentions can still have satisfying, nuanced, entertaining
gameplay. For example, Dog Eat Dog (Burke, 2012) is a role-playing game that addresses issues of colonialism, power, and injustice, which are certainly serious topics; it has also been nominated for multiple game design awards (Diana Jones Award, 2013; Indiecade, 2013).
Additionally, the perceived seriousness of a game can be influenced by the context in which it is deployed. Games that are made mandatory, for example, are less likely to be perceived as entertaining (Heeter et al., 2011). We therefore conclude that games are able to fit within Paluck and Green's category of entertainment intervention (2009), even if within the field of game studies a particular game might be called “serious” instead.
With this understanding in mind, we turn to the problem of identifying an appropriate theory on which to build an entertainment-based intervention using games. To do so, we begin with what we know about how individuals achieve conceptual change. If we hope to build a field of game-based interventions around prejudice and discrimination, we must work from what learning theory tells us about how individuals move from agentic to systemic approaches, then create a theory of game design to match.
Achieving Conceptual Change

When we talk about conceptual change, we ask learners to understand processes, which are cognitively represented as mental models. Mental models are distinct from facts or skills.
They are sets of beliefs that can be treated as an internal simulation (Gentner & Stevens, 1983).
This simulation can be mentally “run” to make predictions about the world (Johnson-Laird,
1994; Jonassen & Henning, 1996). However, presenting information is not enough to cause people to create a new model. Changing one's mental model means acquiring new beliefs, integrating them into a runnable model, and then understanding the implications of those beliefs when one's new model is “executed” (Chi, 2008).
Significant research has been done on achieving conceptual change in science. Students come to school with naïve models of how things move, how heat is generated, and more. These models are inaccurate, but they are sufficient to explain learners' everyday experiences (Perkins
& Simmons, 1988; Roth, 1990). Through the schooling process, students learn the accepted theories of physics, biology, chemistry and mathematics. However, learning a new theory is not the same as achieving conceptual change. Unless the models underlying students' naïve understandings are specifically and explicitly addressed, students will often retain them. For example, students can succeed in high-school and university physics courses, yet still believe that heavier objects fall faster than lighter ones (Champagne, Klopfer, & Anderson, 1980). These inaccurate models are most likely to appear when students are presented with novel problems, for which they do not have well-rehearsed strategies. Unless learners can expect to be presented only with familiar problems, conceptual change is necessary.
Contemporary approaches to conceptual change emphasize the role of anomalous data, or evidence which contradicts students' naïve theories (Chinn & Brewer, 1993; Hewson &
Thorley, 1989). The anomalous data are intended to show learners that their existing models are inadequate to explain their experiences. Instead, learners must adopt a new, more scientifically accurate model which explains the data at hand. For example, many fourth-graders believe that sweaters and hats are warm because they generate heat. Watson and Konicek (1990) observed a
classroom experiment designed to produce anomalous data for this belief. Students wrapped thermometers in sweaters and observed whether the temperature changed. The students' naïve belief that the sweater would make the temperature rise was confronted with an unexpected reality, in which the temperature remained the same.
Anomalous data, however, does not always cause conceptual change to occur. Watson and Konicek's subjects developed multiple alternate hypotheses to avoid changing their minds, such as cold drafts which might have somehow gotten into the sweaters and affected the results
(1990). Similar resistance to anomalous data has been demonstrated in other contexts, such as modeling electrical circuits or deciding what causes colds (Johsua & Dupin, 1987; Kuhn, 1989).
Chinn and Brewer (1993) list the factors that influence how people respond to anomalous data. They argue that there are four areas of influence: the nature of the learner's prior knowledge of the subject, characteristics of the new model the learner is asked to adopt, what the anomalous data themselves are like, and how the learner processes the material. In order to understand the challenges of achieving conceptual change around racism and sexism, we must examine each of these four areas in turn.
First, we take up the issue of prior knowledge. In the case of racism and sexism, prior knowledge does not just include explicit knowledge about racial and gender bias; it also includes the underlying models that learners believe explain disparate outcomes in American society.
Even if learners have never explicitly encountered racism or sexism, they are enculturated with
“common knowledge” explanations for these outcomes. As described earlier in this chapter, popular models of racism and sexism focus on individual rather than systemic explanations, usually in a way that undermines or even opposes anti-bias work. These underlying models should be considered part of learners' prior knowledge about racism and sexism, whether or not
they can articulate them explicitly.
If prior knowledge is “entrenched,” or deeply embedded in the way the learner understands the world, it is more difficult to change (Klahr, Dunbar, & Fay, 1990; Kunda, 1990).
Ontological beliefs, or beliefs about the fundamental categories and properties of the world, are particularly likely to be entrenched and are hard to change (Chi & Roscoe, 2002). However, beliefs may also be entrenched because they satisfy personal or social goals (Chinn & Brewer,
1993).
Beliefs about racism and sexism are likely to be deeply entrenched. Individuals are often resistant to anti-racist and anti-sexist work, which has been theorized in many ways. This resistance may be rooted in group self-interest (Kluegel, 1985; M. C. Taylor, 1998), in fear
(Stephan & Stephan, 2000), or even in conflict between conscious and unconscious beliefs
(Gaertner & Dovidio, 1986). Members of privileged groups resist acknowledging their role as oppressors, while members of disenfranchised groups may have internalized racist or sexist attitudes. Additionally, as we have demonstrated above, changing one's beliefs about prejudice from a direct-process to an emergent-process model is an ontological shift (Chi, 2005). This suggests that even individuals who are neither intellectually nor socially committed to a direct-process model of racism are likely to hold entrenched beliefs about it.
Second, we must consider the nature of the new theory which learners are being asked to adopt. For conceptual change to occur, a new theory must be available, coherent, and intelligible
(Chinn & Brewer, 1993). A bad theory is better than no theory at all, but coherent theories are preferred to less-coherent ones. Learners may try to adopt theories they do not understand, but they will be unable to apply them to novel situations (Linn & Songer, 1991).
Understanding complex systems, unfortunately, is difficult. Our cognitive biases make it
easier to understand individually-based, direct explanations than ones that rely on systemic, probabilistic or emergent thinking. For example, the availability heuristic means that humans use ease of retrieval as a proxy for likelihood (Kahneman, Slovic, & Tversky, 1982). Rather than seeing the big picture, we are biased toward using individual events to make judgments about larger patterns. Systems cannot be reduced to a set of incidents; systemic effects occur precisely at the level of the system. This means that the availability heuristic serves us poorly when thinking about emergent processes.
Because the emergent properties of systems often rely on chance, misconceptions about probability can also contribute to the difficulty of understanding them (Chi, 2005).
Misconceptions about probability persist beyond high-school and into adulthood (Batanero &
Sanchez, 2005). One misconception particularly relevant to emergent thinking is the “outcome approach” bias. Subjects reason backward from the outcome of a particular trial to decide what the probability of its occurrence must have been. For example, subjects told that it rained concluded that there must have been a high probability of rain, regardless of what they had been told beforehand about its likelihood (Batanero & Sanchez, 2005). Using outcomes to reason backward about the model that produced them is a useful process, but not when probabilistic processes are incorrectly treated as deterministic (Prediger & Rolka, 2009).
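The outcome-approach error is easy to demonstrate numerically. The sketch below (my own illustration; the 30% figure is arbitrary) simulates many days with a fixed, low rain probability. Rain still occurs on roughly a third of them, so observing rain on a given day tells us little about whether its forecast probability was high:

```python
import random

random.seed(1)

P_RAIN = 0.3   # stated forecast probability (arbitrary illustration)
DAYS = 10_000

# Each day rains independently with probability P_RAIN.
rainy_days = sum(random.random() < P_RAIN for _ in range(DAYS))
print(rainy_days / DAYS)  # close to 0.3: "it rained" is a common outcome
```

A subject exhibiting the outcome-approach bias would treat any one of those rainy days as evidence that the forecast probability "must have been" high, when in fact the generating probability was 0.3 all along.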
Third, we must examine the anomalous data itself. Chinn & Brewer (1993) argue that to achieve conceptual change, the anomalous data must be credible and unambiguous. Data that is not credible can be easily dismissed; data that is ambiguous can be distorted to fit an observer's existing theories (Chinn & Malhotra, 2002). The learner must also be presented with multiple anomalous experiences; any given experience may be able to be accommodated within the learner's naïve theory, but the full set of experiences cannot (Watson & Konicek, 1990).
Finally, we must consider how the learner processes the anomalous data. Deep
processing means paying careful attention to the material at hand, elaborating relationships between the new material and prior knowledge, and working out the larger implications of the new information (Craik & Tulving, 1975; Nickerson, 1991). This sort of processing has been shown to promote theory change (Tesser & Shaffer, 1990).
It may be difficult to induce deep processing around issues of sexism and racism because of the social desirability effect. It has become increasingly socially unacceptable to express overtly racist or sexist attitudes (Jussim, 1991). Rather than thinking carefully about issues of race and gender, learners may use their knowledge of what is socially acceptable to guide their actions in this sensitive area.
There is also the issue of motivation. Deep processing requires a cognitive commitment on the part of the learner, whether that happens because they are personally engaged with the issue at hand or because they believe they will have to justify their positions (Tesser & Shaffer,
1990). No matter the reason, we must acknowledge that there are many individuals who are not motivated to learn about racism and sexism, let alone to reduce their own prejudice, and are therefore unlikely to make such a cognitive commitment. One might even argue that those least motivated to change are the ones who most need to be reached by prejudice-reduction interventions.
Conceptual change is difficult enough to achieve in science, but expanding learners' models of discrimination may be even harder. The question, then, is how to make it happen most effectively. We propose that games can help.
Game Design for Conceptual Change

The Serious Games movement proposes that games can be effective interventions for
learning, social change, health, business and more (Sawyer, 2010). These interventions can
include approaches as diverse as including pro-social activities in a commercial game, such as
Farmville's fund-raising for Haitian earthquake relief (Morales, 2010); using commercial games
to prepare students for future learning (Hammer & Black, 2009; Squire & Barab, 2004); creating
custom-built games that inform or persuade (Bogost, 2007); or using game mechanics to drive
desired real-world behavior (Lieberman, 2006). What these diverse approaches have in common
is the notion that some problems are difficult to solve by conventional means, but can be
addressed by the unique affordances of games.
To understand how games can support conceptual change, it is important to understand
what games are and how they function7. Games can be described on three different levels: mechanics, player experience, and culture (Juul, 2003; Salen & Zimmerman, 2003, 2005).
Game mechanics refer to explicit rules, but also to the goals, resources, and materials used to play the game (Salen & Zimmerman, 2003). In Tetris (Pájitnov, 1984), for example,
blocks fall from the top of the screen at a pace that increases as the game continues. The blocks
are in-game entities that the player can interact with; the behavior of the blocks, as well as the
player's capacity to affect them, is governed by rules.
Player experience describes the player's emotional engagement with the game, and the physical or cognitive efforts they put forward to achieve the game's goals. Games can provide a wide variety of emotional and aesthetic experiences, which players often participate in constructing for themselves (Lazarro, 2005). For example, Flow (Chen & Clark, 2006) allows players to seek their desired level of challenge. Players can control the emotional experience they
7 While this author is conceptually committed to the fundamental similarity of digital and non-digital games, this paper primarily considers the impact of digital games.
have by choosing what level to play on, and how aggressively to pursue the other creatures in the
game.
Finally, games have a unique cultural position. As Squire and Barab (2004) found, games
can reach learners who are alienated by other forms of media. Students who reject school-based
literacy, for example, spend time and effort on reading and writing in games such as World of
Warcraft (Blizzard Entertainment, 2004) and Lineage (NCsoft, 1998; Steinkuehler, 2008). Not all
aspects of game-based literacy overlap with what is taught in school; players do not learn how to
write a job application letter, while students do not learn how to persuade strangers to join their
guild. However, there are also significant areas of overlap, such as using games as subject matter
to teach technical writing (Vie, 2008) or creating text-based games as creative writing (Kee,
Vaughan, & Graham, 2012). While games do not, cannot, and should not serve as a replacement for school-based literacies, they do provide an alternate source of connection for at least some learners with at least some literacy tasks.
Media interventions can help with prejudice reduction (Paluck & Green, 2009), but demonstrating their effectiveness requires a better theoretical grounding than what currently exists. To develop that grounding, we will consider, in turn, how mechanics, player experience, and game culture can help us design an intervention for cognitive change around racism and sexism. As we will see in the next chapter, games can present anomalous data to players in unusually effective ways. However, games also have challenges, particularly in the areas of credibility and transfer.
Digital games can be framed as playable simulations because they provide complex systems of rules with which the players can experiment. In Civilization IV (Firaxis, 2005), for example, players' cities gain resources based on the values of multiple other factors in the game,
such as the types of land adjacent to the city, the buildings the player has created in the city, and what type of government the player has adopted. Taken together, the rules governing how cities accumulate resources present a model that the player must first comprehend and then manipulate
(Squire & Barab, 2004). The model presents a very specific view of what matters in building a civilization, and of how cities grow (Squire & Durga, 2005).
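The kind of interacting model described here can be sketched in a few lines. The factor names and values below are invented for illustration (they are not Civilization IV's actual rules), but they show how a yield that depends on terrain, buildings, and government together forces the player to reason about the whole system rather than any single factor:

```python
# Illustrative toy model of a city's per-turn resource yield.
# All names and numbers are invented, not taken from Civilization IV.
TERRAIN_YIELD = {"plains": 2, "hills": 1, "river": 3}
BUILDING_BONUS = {"granary": 1.25, "market": 1.5}
GOVERNMENT_BONUS = {"despotism": 0.8, "republic": 1.2}

def city_yield(tiles, buildings, government):
    """Combine base terrain yield with multiplicative building and government bonuses."""
    base = sum(TERRAIN_YIELD[t] for t in tiles)
    for b in buildings:
        base *= BUILDING_BONUS[b]
    return base * GOVERNMENT_BONUS[government]

print(city_yield(["plains", "river"], ["granary"], "republic"))  # 7.5
```

Because the bonuses multiply, no single decision (settle by a river, build a granary, change government) determines the outcome on its own; comprehending the model means comprehending the interactions.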
Computer simulations can provide an alternate route for students to engage with anomalous data and form new theories (Beichner, 1996; White, 1993). Simulations have been shown to be effective at reducing misconceptions and achieving conceptual change (Lipson,
1997; Taylor & Chi, 2006; Zietsman & Hewson, 1986). In some situations, simulations may even be more effective than direct instruction. For example, Taylor & Chi (2006) found that simulations and text instruction both helped learners improve on decontextualized assessments, like tests. However, only the simulation caused an improvement in a contextualized situation, like one that might be found in the real world.
The effectiveness of computer simulations comes from their ability to present learners with otherwise inaccessible data, and to give learners the opportunity to experiment with that data (Windschitl & Andre, 1998). Simulations can, like games, be framed as a type of vicarious experience (Hammer & Black, 2009). If learners develop their naïve models through personal experiences, which serve as data to test their theories in the real world, vicarious experiences can help them develop better models through a similar learning process (Gorsky & Finegold, 1992).
The models in games, unlike the models in simulations, are designed for playability rather than for faithfulness to the real world. However, these models can nonetheless convey powerful social messages. As Bogost (2007) demonstrates, the rules of a game reflect ideologies which may either undermine or reinforce the game's overt message. Consider The Sims 2, for
example (Electronic Arts, 2004). While the game overtly emphasizes tending to one's characters'
needs, the underlying message of the game is consumerist. Characters can only be made happy
by the purchase and use of material goods, from couches to televisions to art and more. This
message is represented only in rules, and it becomes visible to players through experimentation
in play.
Experimentation is one of the central elements of play. To take a simple example, in
Super Mario 64 (Nintendo, 1996) the player must figure out how far the character can jump.
Which jumps can Mario hurdle, and which will send him hurtling to his doom? The only way8 to
find out is to try jumping, and dying, until the player's model of Mario's ability matches the
game's underlying structure (Church, 1999). Players approach the game as scientists, making
hypotheses, conducting tests, and then examining the game world for confirming or
disconfirming data (Gee, 2003).
Players do not conduct experiments in the game world at random. Their experiments are
guided by the game's goals and reward structure. An appropriate choice of reward mechanic can
focus player attention on any desired aspect of the game (Ciavarro, 2008; Klimmt & Hartmann,
2009). This aspect of games addresses one of the major weaknesses of simulations, namely that the unguided use of simulations is far less effective than simulations embedded in an explicitly instructional context. Even when explicit instructional material is included within the simulation itself, learners often do not attend to it (De Jong & Van Joolingen, 1998). This weakness limits the use of most simulations to classrooms and other monitored spaces.
8 While the Super Mario 64 example implies that experimentation is a solitary experience, many players build on the research of their peers by consulting FAQs and walkthroughs. In some games, these walkthroughs can give the player precise instructions, but many games include elements that mean walkthroughs can only guide, not define, the player's choices. For example, the game Desktop Dungeons (2013) generates a new level with randomly chosen monsters for each game. The shared material for the game emphasizes strategies and models for novice players to adopt, not how-to instructions. For more on collaborative model-making, see Steinkuehler, 2008.
The goals of a game serve as a deeply integrated guide for play. An appropriately
designed game can stimulate hypothesis testing around how to accomplish a given goal within
the game system (Gee, 2003; Osborne & Squires, 1987). If the goals relate to common
misconceptions about the game model, then players can be induced to try experiments which are
likely to conflict with their initial models of how the system works. The player can then receive
feedback on the outcome of their play decisions, to motivate them to pursue or abandon a
particular path.
By providing appropriate feedback on how the player is doing in achieving that goal, we
can address another major issue in conceptual change: learners' ability to perceive anomalous
data in the first place. Chinn & Malhotra (2002) identified four possible stages at which
conceptual change can break down: observation of the anomalous results, interpretation of those
results, generalizing the results to construct a theory, and retention of the new theory for future
encounters. They found that observation was the stage at which most learners failed. Fewer than
half the subjects who made incorrect predictions were able to correctly identify what had
happened right in front of their eyes. The more ambiguous the data, the less conceptual change
was achieved. However, once learners were able to correctly perceive the anomalous data, more
than half of them were able to interpret, generalize and retain the material. Though this does not
guarantee conceptual change, it is clear evidence of effective confrontation with anomalous data.
Games can amplify the feedback that players get, and motivate them to attend to it. When
Mario falls into a pit, the player has no doubt that something has gone wrong with their mental model; because Mario's tumble interferes with the player's goals, they care about it9. Games,
unlike simulations, are not ideologically committed to fidelity. Desirable objects can glow, or
9 Presumably.
shimmer, or make one's character invincible; failure to understand the game model can result in
one's army being captured or one's character falling into a pit. This exaggerated feedback may
not be realistic, but it may help players get past the perceptual difficulties found by Chinn and
Malhotra (2002).
Perhaps the most powerful effect of player experimentation in games, though, is precisely
that it is player-driven. Meier describes a good game as a series of interesting choices (Juul,
2005). What makes these choices interesting is that players must assess the options available to
them and decide which approach they will take, based on their understanding of the game's
model, their assessment of their own capabilities, and their in-game and out-of-game goals (Gee,
2003). This requires deep processing of the sort that encourages players to engage with anomalous data. Tasks that require learners to engage in active, constructive and integrative behavior are the most effective at producing conceptual change (Chi, de Leeuw, Chiu, &
LaVancher, 1994). This is exactly what games do.
It is a challenging design problem, however, to create a game that both serves a larger purpose and contains interesting choices (Klopfer, Osterweil, & Salen, 2009). The model underlying game-play must reward deep processing. However, if players can use their knowledge of socially acceptable behavior to play the game, they will be engaging only in superficial thinking. Since player choices and experiments are at the heart of what is interesting about a game, players who use social desirability to play the game are not truly playing, and hence unlikely to respond well to the game's mission. It is a designer's obligation to create challenges that cannot be solved simply by doing what is socially appropriate, even if the game addresses pro-social themes.
However, the same challenge becomes an opportunity when done well. Games can
provide an environment where people do not have to do the socially acceptable thing, but can experiment with a variety of approaches or identities (Klopfer, Osterweil, & Salen, 2009). Games can also reach people who do not care about the issue that the game addresses. If people come to the game for the play experience, they can still be engaged by the game's underlying concepts – particularly if those concepts are embedded in the mechanics, with which the player cannot help engaging deeply (Lindley & Mayra, 2002). Given the immense reach of casual, web-based games, this is an opportunity to influence many individuals who would never seek out a prejudice-reduction intervention or think critically about bias. One might even argue that those are the people who need such an intervention the most.
Retaining a game's sense of playfulness is also important because games invoke an epistemological frame – a set of assumptions about learning – that is particularly useful for achieving conceptual change. Windschitl and Andre (1998) found that simulations are most helpful for epistemologically sophisticated students. These students believed that learning is complicated and happens over time; that knowledge is context-dependent; and that people can learn how to learn.
When game players are engaged with a game, they display evidence of sophisticated epistemological beliefs such as the ones Windschitl and Andre (1998) describe. Failure is not failure; it is an opportunity. Players expect that their job during a game is to learn how to play it, which may take quite some time as they explore the many challenges of the game world. The skills they learn for one challenge may have to be reinvented for another challenge, and may not even be applicable to future challenges. At the same time, the way they must learn to engage in the new challenges of the game is to experiment repeatedly, pay close attention to the results, and identify new courses of action based on what they learn. These are the epistemological beliefs
that make simulations most useful, and they are already present in gamers (Gee, 2003).
As we can see, games provide strong support for some of the elements that support conceptual change. Games can simulate a complex model and allow players to interact with it experimentally. By providing goals, the game gives players directed challenges rather than leaving them to wander in a simulated environment. Games can reach players who are otherwise unmotivated, and evoke an appropriate epistemological frame. Most of all, games require players to take action and make considered choices.
A well-designed game can provide support for these elements over and above what is possible in a simulation. For example, both games and simulations allow players to experiment with complex models. However, in a well-designed game, players’ in-game goals align with both the learning content and with their desire to play, allowing them to guide themselves through the experience. Both games and simulations allow for experimentation and failure, but games also evoke a powerful cultural frame reinforcing that failure is an opportunity. Both games and simulations can be engaging to learners, but learners are far more likely to engage with a game voluntarily.
Games also have challenges, of course. Many of these challenges are closely related to games' most significant advantages. The “magic circle” effect helps draw players in, but it may also make players less likely to transfer knowledge gained in the game to the real world. Games' cultural position makes them accessible to unenthusiastic learners, but it may also make the material in the game less credible. Games require active participation by the player, but providing interesting choices means not all players will have the same in-game experience.
While we will return to these limitations at the conclusion of this dissertation, more research is required to understand them, particularly as they relate to social-issue games.
Overall, however, game-based interventions are a promising approach for achieving
conceptual change around complex, socially sensitive issues. By comparing their efficacy to
interventions of other types, we can understand to what extent they impact players' attribution
styles; by experimenting with the reward systems that guide players through the game, we can
learn something about what makes these games effective. A theoretically-grounded game for conceptual change around discrimination may provide an effective prejudice-reduction intervention, while the empirical results comparing reward systems within the game can provide best practices for future game-based interventions around issues we can only now imagine.
The next chapter translates the theory reviewed in this chapter into specific design
principles for creating games that may help players incorporate systemic understandings of a
problem. It also describes the design and development of Advance, a custom-created web-based game using these principles. In the remainder of this dissertation, we empirically investigate the question of whether Advance can, indeed, increase players' willingness to use systemic explanations for racism and sexism, and whether there are different impacts between different versions of the game; we also examine whether the game affects players' attitudes toward these socially sensitive issues.
Chapter 3: Design
Playable Models, Anomalous Data

This dissertation uses a custom-designed and -developed game, Advance10, to try to shift players' conceptual models of racism and sexism. This chapter defines a theory that provides one way to guide such a design process, based on the research outlined in the previous chapter and on how games function as a medium, and proposes a set of specific design principles based on that theory. It then describes how the game Advance was designed, both as a game and in terms of what aspects of bias it attempts to model. Finally, we explore how Advance uses this chapter's principles to present anomalous data to players while emotionally engaging them in the service of achieving conceptual change around how bias functions. In the remainder of this dissertation, we will see to what extent it has succeeded in doing so.
In order to accomplish this, we must understand how games communicate. Many games function as playable models (Bogost, 2006, 2007). Games, as opposed to simulations, are not intended to have perfect fidelity to a knowledge domain. Instead, games simplify the domains they represent into rule systems that provide meaningful opportunities for play.
Sometimes the rule system directly represents a domain in the real world. For example,
Portal (Valve, 2007) has a physics engine with direct fidelity to real-world physics, which incorporates concepts such as momentum. However, it is not enough for the system to represent physical reality. Portal makes these rules of physics playable by introducing goals to point the player in a particular direction (the exit), new elements about which to reason (the portal gun), and challenges to make the player do so (robot sentries). In Portal the representation of the
10 As previously noted, Advance can be played online at http://www.replayable.net/advance/.
physics system happens through action, and its exploration is done through the actual play of the
game (“Teach with Portals,” n.d.).
Sometimes, however, the rule system functions more abstractly. Consider the game The
Sims 2 (Electronic Arts, 2004). The rules of the game define needs for each character, which are
represented to the player as progress bars. The player can direct a given character to satisfy those
needs by taking various actions in the world, usually in relation to objects. For example, sleeping
on a bed restores a character's energy and comfort levels, but decreases their hygiene and bladder
needs. The way in which these needs are abstracted and manipulated has no direct parallel in the real world. Sleeping on a bed certainly restores our bodies – at least if the bed is a comfortable one! – but not in the same direct way. Also, the player has very little control over
how the abstraction functions. The player can be clever about what actions they command their
character to take, but they have only strategic control. Execution matters less than decision-
making.
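The needs model just described can be sketched as a small state-update function. The specific needs, actions, and numbers below are invented for illustration rather than taken from The Sims 2, but they capture the key property: a single action moves several needs at once, so the player can only steer outcomes indirectly:

```python
# Illustrative toy version of a Sims-like needs model.
# Needs, actions, and deltas are invented, not The Sims 2's actual values.
ACTION_EFFECTS = {
    "sleep_in_bed": {"energy": +40, "comfort": +10, "hygiene": -15, "bladder": -20},
    "shower":       {"hygiene": +35, "comfort": +5},
}

def apply_action(character, action):
    """Return the character's needs after an action, each clamped to 0-100."""
    updated = dict(character)
    for need, delta in ACTION_EFFECTS[action].items():
        updated[need] = max(0, min(100, updated[need] + delta))
    return updated

sim = {"energy": 30, "comfort": 50, "hygiene": 60, "bladder": 70}
print(apply_action(sim, "sleep_in_bed"))
# {'energy': 70, 'comfort': 60, 'hygiene': 45, 'bladder': 50}
```

There is no "make character happy" input; the player can only choose among actions whose bundled side effects the model dictates, which is precisely the strategic (rather than executional) control described above.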
Although these two games are very different, they can both be understood as “playable
model” games – as opposed to games where, for example, player judgment or taste is at stake, as in Apples to Apples (Out of the Box Publishing, 1999). In each of these examples, there is a set
of rules embedded in code. These rules determine what objects exist in the game world (for
example: portals, robots, couches, fishtanks, energy meters), how they function, and what player
inputs are available to interact with them. These rules provide constraints for player behavior –
what they can do in the game, how the game will respond. When players try to achieve their
goals in the game, they are always interacting with the constraints of the ruleset, developing a
strategy based on its affordances.
This is true whether the goals are set by the game or the player. For example, in Portal the player must discover how to move through a puzzle room, from the entry to the exit gate.
Although the player may form other goals along the way, particularly setting sub-goals to achieve the larger goal, accomplishing this goal is the only way to continue playing the game.
The only way to move to the next level is to finish the current level. Portal puts physical obstacles between the player and their goal, whether in the form of deadly acid or laser-shooting robots. The player has affordances that are fundamentally physics-based – they can create portals, jump, run. The process of solving a Portal level is a process of figuring out how to exploit the player's abilities in the context of rule constraints to achieve a goal, which requires understanding of the game system.
The Sims 2 is a slightly more complex case because the game system affords many different goals. Real-world examples of play goals in The Sims 2 include trying to build a house that suits the player's tastes; replicating the lives and activities of the player's friends; setting up screenshots to use in making a comic book; or killing the characters in assorted grotesque ways.
For each of these goals, though, the process of achieving them is the same: constrained by the rules of the game's model, with the tools of limited player interaction possibilities. For example, a player who wants to keep their Sims happy at all times cannot simply press a button to do so.
They must do so indirectly, by giving their Sims a pleasant environment and tending to their daily needs.
What we have just described are game mechanics – the actions and interactions through which the player engages with the game (Salen & Zimmerman, 2003; Jarvinen, 2008). For playable model games, this takes the form of entities and relationships for the player to explore and master (Cook, 2006). The formulation of games as playable models is analogous to the
formulation of complex systems described in the previous chapter (e.g. Forrester, 1961). Game entities and their relationships provide a complex system for the player to explore, with dynamic behavior inherent in the design of the system itself arising through play.
Although playable-model games are about exploring the affordances of a game system, such exploration does not automatically induce learning. If we hope to have players learn about complex systems from playable-model games, we must go beyond game mechanics. We must consider learning mechanics and assessment mechanics as well (Plass, Homer, Kinzer, Frye, &
Perlin, 2011).
Learning mechanics are research-based conceptual approaches to how a game might help players learn (Plass, Homer, Kinzer, Frye, & Perlin, 2011). Learning mechanics are design patterns, which are always instantiated in game-mechanical choices that are specific to a particular game. However, a particular learning mechanic provides a grounded and substantiated organizing theory for how to make those decisions. Designers and scholars agree that learning mechanics must be embedded in the core mechanics of play (Isbister, Flanagan, & Hash, 2010;
Plass, Homer, Kinzer, Frye, & Perlin, 2011). Learning mechanics let us see how to adapt the moment-to-moment process of play into a meaningful learning activity.
Similarly, assessment mechanics are design patterns, instantiated in games, that allow the assessment of player activity in the game (Plass, Homer, Kinzer, Frye, & Perlin, 2011). As with learning mechanics, they are not separate game experiences. Rather, they are theoretical constructs that are expressed in the game mechanics of a particular game. Assessment mechanics allow us to create games that let players express their understanding of a specific concept.
When selecting learning and assessment mechanics, there are several criteria to keep in
mind. They must be theoretically substantive and research-backed (Plass, Homer, Kinzer, Frye,
& Perlin, 2011), and they should be able to integrate deeply with the mechanics of the game itself (Isbister, Flanagan, & Hash, 2010). When instantiating learning and assessment mechanics, the designer must watch for skill confounds and manage cognitive load (Plass, Homer, Kinzer,
Frye, & Perlin, 2011).
The previous chapter reviewed research that suggests a confrontation with anomalous data is a good approach to help players adopt a systemic model for bias (e.g. Chinn & Brewer,
1993). American learners hold to their existing models unless they are forced to revise them based on anomalous data. Even when people have experiences that do not match their mental models, they will often work to reconcile their existing model with the new data. Only when their previous model proves incapable of explaining the new information will they be willing to abandon it and develop a new model instead. In light of this research, we can adopt encountering anomalous data as our learning mechanic.
While this learning mechanic is substantiated in the research literature, the question becomes how to incorporate it in playable-model games. Existing work with simulations shows that people can in fact achieve conceptual change through an anomalous-data inquiry process.
The question is how we can effectively do this with the tools of game design. The need to bring anomalous data into the player's awareness adds additional design constraints to the “playable model” approach to games.
The core element of playable-model games is for players to figure out how to use the rules to achieve the goal – and therefore it is the process that we try to harness for conceptual change using games. That process is mediated by the challenges the game provides, which
provide the motivation for engaging with the game's model, and by the affordances available to
the player, which provide the tools for doing so. Finally, there needs to be clear feedback; if players are to develop effective strategies for using their limited abilities to interact with the game, they must be able to predict the effects of their actions on the game's state.
It is for this reason that we argue that players attend to a game's model when the rules of the game provide an obstacle to achieving in-game goals (Gee, 2003). While this may not be the only reason players ever attend to the model of a game, it leads us to define design principles for
PMAD games. If we expect players to include the target concepts in the models they are building of how the game works, the new ideas conveyed through game design should be deeply integrated into the rules, goals and challenges of the game itself (Isbister, Flanagan, & Hash,
2010).
PMAD Design Principles

Based on this theory, we propose a set of principles for designing playable model – anomalous data games, henceforth referred to as PMAD games11. These principles build on the
way that players engage with game systems in order to encourage a confrontation with
anomalous data and therefore a model shift.
Principle 1: The game system models the relevant domain. The model is built into the
game rules. Text can clarify what the player is supposed to learn or attend to, as can narrative
elements. However, players' primary strategic engagement is with the game's ruleset, which
should therefore encapsulate the desired model.
11 PMAD defines a type of game and a type of learning mechanic that are appropriate for a specific category of learning problem, but does not specify a particular assessment mechanic. The assessment mechanics used in this study are specific to the model developed for Advance, and are discussed in the context of the game rather than generalized. Future work in this area could attempt to identify assessment mechanics that are particularly effective for PMAD games.
Principle 2: Player actions affect, and are affected by, the model. The things that players
can do in the game are inputs to the model in some way, and the outputs that the player cares
about are affected by the model. Players' strategy development is based on the actions they are
able to take in the game; unless they are engaging with the model in question in meaningful
ways, they will find other routes to their goals.
Principle 3: Players receive feedback about the impacts of their actions as they relate to the model. Players cannot develop play strategies effectively if they lack the necessary data about what the model is doing and how their actions affect it.
Principle 4: The game goals point players toward model conflict. Players develop strategies in order to move toward in-game goals. The game goals must create a situation where players' existing mental model conflicts with the model in the game.
Principle 5: Players can experiment with the game's model. Because strategy development in games is an inquiry process, players must be able to test and compare different possibilities, in contrast to games that require players to commit to their decisions without the opportunity to test alternatives.
Principle 6: Players must figure out rules and strategies for themselves. Players need to know enough to figure out what they are supposed to be doing and what the impacts of their behavior are, but they cannot simply be told how to succeed – figuring out how to overcome the game's challenges is the heart of engagement with the playable model of the game.
These principles define what it means to be a PMAD game. However, they are not sufficient to construct a particular PMAD game, nor are they unique to a PMAD-theoretical approach. Instead, they are guiding principles to be used in concert with the constraints of the
particular domain being represented in the game, with decisions about game type and game
mechanics, and with the logistical and practical constraints of game development and
distribution.
Within a particular domain, the PMAD principles do not provide the only way to communicate through play. There are many types of game that could address issues of racism and sexism; for example, Dog Eat Dog (Burke, 2012) interrogates the colonial experience through storytelling and role-taking. Similarly, there are many underlying models that a game could choose for systemic bias, using different elements of racial and gender bias in the world we live in. Even given a PMAD-theoretical approach and a particular set of ideas around systemic bias, Advance is not the only game that could be created.
This chapter, however, describes the design and development of Advance as a particular
example of using the PMAD principles in practice. First, we explain the design of the game,
including how it models racism and sexism, what forms of feedback it presents to players, how
the game's goals encourage players to engage with the model, and more. Next, we examine the differences among the three experimental versions of the game, so that we can begin to break down the efficacy of individual strategies within a larger PMAD context. Finally, we return to
the PMAD principles to demonstrate how the game adheres to these design principles, and how
they helped shape the design of the game during development and play-testing.
Game Design Overview

In this game, our goal is to represent "systemic bias" in a playable way. Doing so means
solving three design challenges. First, how do we model systemic bias? We must choose an
evidence-based model for how systemic bias operates, and develop a way to reduce it to simple
rules. While we lose real-world complexity and nuance, we gain in clarity and focus. Second, we
must choose a context and interface for representing these rules to the player. Finally, we must
define interactions between the player and the model, which allow players to explore elements of
the game's model as they pursue their in-game goals. All these elements must adhere to the
PMAD design principles outlined above.
Advance is a custom-designed web-based game developed in Flash. The player takes the
role of a corporate recruiter, whose goal is to make money and keep their business afloat. During
the game, clients approach the player character for assistance with getting a job. The player
makes money by placing these clients into jobs that suit them, and by helping them get promoted
over time. The player must survive for five minutes - five years of game time - without running
out of money to pay their business expenses. Whatever money the player has left at the end of
the game becomes their final score.
Figure 1. Layout of the game Advance
The game visually represents the player's clients and the company they work with in an isometric and highly stylized way (Figure 1). The player's client base is represented as a row of
abstract figures, any of whom can be clicked for more information. The company is represented as a connected network of jobs, some of which are available for the player to fill and some of which are occupied by non-player characters (NPCs). This abstract representation allows players to focus on the challenges of placing a client and earning money.
There are four key constraints that players must attend to in order to place their clients into jobs.
First, job availability constrains client placement. The game board begins with some empty jobs. Other jobs become empty when player clients are promoted or when NPCs leave a position. When a job is empty (Figure 2), the player can put a client into it. However, an empty job can also be filled by the arrival of an NPC. The player must be vigilant if they are to notice when a job is empty and act on it before it is taken by an NPC.
Figure 2. An empty job in the game Advance
Next, job requirements constrain client placement. Even if a job is open, that does not mean a given character can take it. Each job has a required level of competence, creativity, and charisma. Characters have the same three characteristics; in order to take a job, the character must meet or exceed the job's requirements in all three areas (Figure 3). Characters begin with low levels of each, but can be improved by upgrading them. If the player wants to place a character in a particular job, they can upgrade them until they meet the job's requirements.
However, money constrains upgrades. The player must pay for every upgrade to their characters. Each successive upgrade to a given characteristic costs more money, as the character receives more specialized and advanced training. Jobs with stricter requirements, therefore, require the player either to choose more skilled clients or to invest money in training.
Figure 3. Job requirements in the game Advance
Finally, success constrains advancement. The player begins with access to only the
lowest-level jobs in the company. When a client is placed in a job, their success meter begins to
fill. When a client's success meter completely fills up, they move up to the next level in the
company, where higher-paying jobs are available. That level is then unlocked for the player; the
player can move between levels to place clients and consider jobs. However, clients only appear
at the higher levels after they have been promoted from lower levels. They never appear on their
own, except on the very first level of the game.
Taken together, these constraints make some jobs better than others, and create
challenges for the player around placing clients. Some jobs pay particularly well. Others have
low requirements, making them easy for any client to take without investing money. Yet other
jobs will lead to rapid success, putting the client on track for better-paying jobs at a higher level.
The player must strategically decide which clients to invest in, which jobs to attend to, and which
levels of the game to focus on. Players must also balance time spent gathering information with
time spent taking action. Making a good client-job match is critical, but so is the player's overall approach to managing their time, money, and attention.
Sample of Gameplay

To better understand the flow of the game, this sample of gameplay follows Jane, a player who encounters the game for the first time. Jane is a composite of several players who assisted with play-testing the game during the development and pilot process.
Jane loads the page on which Advance is hosted. She immediately encounters the game
tutorial, which asks her to complete four tasks in sequence12. First, Jane must place a
client into a job. Next, she must upgrade a client. After that, she must have a client
12 These tasks are explained more fully later in this section.
promoted. Finally, she must visit the second level of the game. Each task in the tutorial is presented after the previous task is complete. Jane also has the option to exit the tutorial and begin play using a button at the side of the screen. However, Jane wants to learn how to play the game, so she completes the entire tutorial. Only then does she begin play.
When the game begins, Jane looks at the client list on the right-hand side of her screen.
By clicking on different clients, she can see who is available for placement. When she clicks on each client, she can see their name, their picture, and how good they are at different aspects of their job.
Jane begins the game with three clients in her queue. She clicks on each of them to see who they are, and decides to place “Destinee Benton,” the last of the three available clients.
With Destinee selected, Jane looks at the left side of the screen to see the available jobs.
While there are many jobs on the screen, represented by spots in the office building, some of them are already occupied by NPCs. There are currently four jobs available for Jane to choose from.
Jane clicks on one of the available jobs, and the job's requirements pop up under
Destinee's stats. Two of the stat bars are blue, indicating that Destinee is qualified for the job. However, one of them is red, indicating that she does not have enough skill in that area.
Simultaneously, each NPC adjacent to the selected job displays what their attitude toward Destinee would be, should she be placed in that job. There are three NPCs adjacent to the job. Two are showing hearts, indicating a favorable attitude, but one is showing a red skull, indicating a potentially difficult relationship.
Jane must now choose among three options. She can upgrade Destinee, spending money to train her for this job. She can test a different client in the same job, and see whether they might be better qualified. Finally, she can try Destinee in a different job to see if she can arrange a better fit.
Jane sees that there are three other jobs open, so she clicks on one of them to see if she can find a better fit for Destinee. Indeed, Destinee's stat bars are all blue – she is qualified for the job. The “Place Client” button activates, allowing Jane to confirm that she would like to place Destinee into the job. Jane notices that in this job, there are three hearts and one skull among the adjacent non-player characters, so it seems like Destinee might be treated better here. Jane decides to place Destinee into the job. Jane receives $1500 as her commission.
Jane goes back to the first job where she tried to place Destinee. Maybe she can place someone else into that job! She goes back to her client list and notices that a new client has arrived. She clicks on him. His name is “Emiliano Cruz” and all his stat bars are blue.
He is eligible for the job – and this time, all the adjacent non-player characters are showing hearts. Everyone likes Emiliano! She decides to place him into the job and receives another $1500.
Jane checks her score to see how much money she has. She's doing fine – she's received some money from placing her clients – but she notices that business expenses are being
regularly deducted from her account. She decides that just to be safe, she'll place another
character before she spends money on upgrading anyone.
Meanwhile, a notice pops up telling her that Emiliano has been promoted! He is waiting
for her on the second level of the game. Jane clicks the button to take her to the second
level. When she clicks on the jobs there, she sees that they will earn her more money than
the first-level jobs did. Emiliano is in her client list, so she immediately selects him.
While he has one red stat bar – these jobs have stricter requirements than the lower-level
jobs – Jane decides that she wants to invest in Emiliano. That higher salary is very
tempting! She pays $600 to upgrade his charisma and places him into the job, earning
$3,000.
Jane remembers that she placed Destinee before she placed Emiliano, so she goes back to
the first level to check on Destinee. When she clicks on Destinee, she can see Destinee's
success meter, which determines how quickly she will be promoted. Destinee's success
meter is filling up very slowly. Jane wonders why. She knows that not all characters are
treated fairly – maybe this is an example. She considers pressing the large, flashing
“Blow the Whistle” button to report the unfair treatment of Destinee, but she doesn't feel
confident that she knows enough to do that yet. Instead, she decides to continue playing.
However, she pays closer attention to how characters seem to be doing. Over the course
of the game, she has multiple encounters where she must make decisions about where to
place characters and how much to invest in them. She does so with Destinee's experience
in mind.
As this gameplay snippet illustrates, players of Advance must make decisions about which characters to invest in, both in terms of time and money. They select among different
characters, choose which jobs to place characters into, and invest money into characters to
improve their job chances. While doing so, they must maintain enough money in their bank
account to pay their bills. If they go bankrupt, they lose the game. However, if they keep the
company afloat for five years of game time, they win. Their final score is how much money they
have left at the end of the game.
Modeling Race and Gender

If we hope to model race and gender bias within this game, we must decide how we are representing race and gender. What categories and definitions will we use? And how will those categories be conveyed to the player?
Gender is modeled as two categories: male or female. These categories are used by major public institutions such as the U. S. Census (Howden & Meyer, 2011). This categorization does not address transgender individuals or those who do not identify with either gender. These groups are certainly discriminated against in the workplace, and may be included in future versions of the game.
Gender is represented through a character's name and picture. Research indicates that a woman's name or picture is sufficient to evoke gender bias (Steinpreis, 1999).
Race is a more difficult modeling problem, because it is a poorly defined construct.
Turning to the U. S. Census questionnaire is unhelpful, as it includes more than a dozen racial categories - too many for players to recall and compare. However, the Census Bureau then collates this data into six "major race groups" (Humes, Jones, & Ramirez, 2010). Of these, only four
represent 1% or more of the American population. These groups are white (72%), Hispanic
(16%), black (13%), and Asian (5%). These are the groups we have chosen to use in Advance.
There are two potential concerns with this modeling choice. First, it excludes the categories of American Indian / Alaska Native and Native Hawaiian / Pacific Islander, as well as those who identify as multi-racial. These groups do experience discrimination, and may be included in future versions of the game. Second, the category "Hispanic" is collected and analyzed orthogonally to race, while the game treats "Hispanic" just as it treats "black," "white," and
"Asian" characters. To address the full complexity of Hispanic / Latin@ identity is beyond the scope of this game; however, Hispanic Americans are a large group who suffer ongoing discrimination. For example, Hispanic women experience the largest pay gap in the United
States (Hegewisch & Edwards, 2011). Due to the size of the population affected, we chose to risk oversimplifying Hispanic identity and include Hispanic as a category nonetheless.
As with gender, race is signaled through a character's name and picture. Research indicates that names and pictures are sufficient to invoke racial bias (Bertrand & Mullainathan, 2003).
Early play-testers were able to distinguish between male and female character images, and were able to match images to racial categories correctly. Names were selected from lists of the most common names for each of the four racial groups being modeled. Names that appeared on more than one list were eliminated. See Appendix A for more information.
Race and gender are never under the player's control. At the beginning of the game, the game randomly selects one race, one gender, or one race-gender combination to be discriminated against. The player cannot affect this choice, only react to it.
Similarly, every character in the game is categorized by race and gender, including the clients who seek the player's assistance. The player can never choose which clients approach them, only how they respond to the clients who do. Because every client is affected by the
54 game's bias, the player must always grapple with the bias in the game, but without being able to
affect it directly.
Modeling Bias

How, then, is bias in the game modeled? We know it must exist, and we now understand
the categories on which it operates. But how does the game's bias actually work?
The first model of discrimination used in Advance is the microaggressions model (Pierce,
Carew, Pierce-Gonzalez, & Willis, 1978; Sue, 2010; Sue & Capodilupo, 2008).
Microaggressions refer to daily interactions that contain coded messages of racism and sexism.
Each individual interaction may seem harmless, but the cumulative impact can be significant.
For example, Sue describes how he and a black colleague were asked to move to the back of a
plane in order to improve the plane's load balance, although there were white passengers who
had arrived later and were seated nearby (Sue, 2010). To the white flight attendant making the
request, it seemed perfectly reasonable. To her passengers of color, on the other hand, the request evoked Jim Crow laws. Worse, it was part of a pattern of repeated small aggressions and humiliations.
Microaggressions draw on two critical elements of systemic bias, as defined earlier in this paper. First, many microaggressive events are unintentional. The microaggressor does not realize that they are enacting existing race or gender narratives, believing instead that what they are doing is neutral or harmless (Sue & Capodilupo, 2008). Additionally, microaggressive episodes may be minor in and of themselves. The cumulative impact of microaggressive interactions is what makes them so painful (Sue, 2010).
An agentic analysis of a microaggression, such as moving black passengers to the rear of
the plane, would focus on the microaggressive event in isolation, and would emphasize that the
aggressor did not intend to act hurtfully. Not only does this analysis fail to capture the real effect
of microaggressions, it makes the microaggressive incident worse. Am I overreacting? Should I
respond? These kinds of questions themselves become a source of stress and trauma (Sue, 2010).
A systemic analysis, on the other hand, looks at the impact of the event rather than the
intent of the perpetrator, and sees it as part of a pattern that extends across time and occurs in
many different contexts. For example, there is nothing inherently wrong with moving black
rather than white passengers to the back of the plane. It is only troublesome because of the
pattern it evokes. Trying to analyze each incident in isolation leads to dead-end questions (should
no person of color ever be asked to move their seat?), while understanding and addressing the
pattern can lead both to changes in individual behavior and to challenging the pattern itself. It is
precisely this aggregate and cumulative impact that makes microaggressional analysis so useful
to a systemic understanding of racism and sexism, and it is this that we attempt to model in our
game.
The theory of microaggressions may be a systemic one, but do microaggressions
negatively impact the lives of minority populations and women? The research indicates that they
do. Microaggressive stress injures the health of targeted groups - in our case, women and
people of color (Buser, 2009; Harrell, Hall, & Taliaferro, 2003). It lowers their subjective well-being and may be a risk factor for depression (Brondolo et al., 2008; Cortina & Kubiak, 2006;
Finch, Kolody, & Vega, 2000; Hill & Fischer, 2007). It can invoke stereotype threat, causing impaired performance when people must act against type (Steele, 1997). Finally,
microaggressive stress can directly impair cognitive performance for members of the group
being targeted (Salvatore & Shelton, 2007).
When modeling microaggressions in Advance, we must capture both the systemic nature of microaggressions and their negative impact. To do this, we assume that microaggressive stress affects success in the workplace. Each character in the game has an internal meter representing their progress toward advancement. The more skilled a client, the fuller the meter when they are first placed. The better the work environment, the faster the meter fills; a poor work environment, on the other hand, can cause the meter to decrease over time. Promotion and demotion occur when the meter fills entirely and empties entirely, respectively.
The "goodness" of an environment is determined by how many microaggressions the
client encounters on a regular basis. When a client is placed, relationships are calculated for the
client with all characters who are adjacent to them on the game board. If the character is from a
privileged group, relative to the client, the relationship is judged to contain microaggressive
interactions. For example, if women are privileged in a particular game of Advance, they make
jobs adjacent to them a worse fit for men. On the other hand, members of the same group are
considered to have a supportive effect. Male characters would be helped by their peer
relationships with other male characters. The total number of positive peer relationships and the
total number of negative peer relationships are used to calculate the speed at which a client's
success meter fills - or empties.
A client will be promoted fastest, therefore, if they are placed in a job with the most
positive peer relationships and the fewest negative peer relationships. Clients from the dominant
group never experience negative peer relationships; the player must consider both positive and
negative relationships only when placing clients who are being discriminated against. At the
same time, the impact of these relationships is only felt over time, and is evident only in
comparison to the performance of clients in better situations.
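The peer-relationship arithmetic described above can be sketched as follows. This is an illustrative Python rendering; the rate constants (`support`, `stress`) are assumptions, not values from Advance itself.

```python
# Illustrative sketch of the microaggressions model: adjacent peers from
# the privileged group slow a client's success meter, while same-group
# peers speed it up. The numeric rates are assumptions for illustration.

def meter_rate(client_group, neighbor_groups, privileged_group,
               support=1.0, stress=1.5):
    """Net fill rate of the client's success meter per unit of game time.
    A negative value means the meter is emptying."""
    rate = 0.0
    for g in neighbor_groups:
        if g == client_group:
            rate += support          # supportive same-group relationship
        elif g == privileged_group:
            rate -= stress           # microaggressive relationship
    return rate
```

Note that a client from the privileged group can never accumulate a negative term: privileged neighbors match the first branch (same group, supportive), and non-privileged neighbors match neither branch. Only clients being discriminated against must weigh both positive and negative relationships.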
When a player has both a client and a job selected, the player can see the reactions of the
client's potential peers on the board. Negative relationships are marked by a skull and positive
ones by a heart (Figure 4). The player can therefore experiment with different characters for the
same position, or with the same character in different positions, both for strategic reasons and to
discover which characters are being discriminated against. The visual feedback also allows the
player to see the peer-based nature of microaggressive stress at a glance.
Similarly, each client's success meter is made visible to the player. When a client is
selected, players can see how far that client is toward promotion, whether their meter is filling or
emptying, and how fast it is moving. This information makes the cumulative impact of
microaggressive stress visible to the player, and can help them develop strategies for faster client
promotion.
While the model of bias in the game affects how quickly clients get promoted, players
still retain choice and agency. The model constrains the player's strategy, but does not determine
it. For example, a player might carefully place discriminated-against clients in positions where they have supportive members of their own group nearby, in order to maximize their chances of promotion. Alternately, the player might put clients from this group into jobs with the fewest peer connections, leaving the best jobs for the clients with the most long-term potential. Several other strategies are possible, as are combinations of these strategies in response to specific situations.
Figure 4. Peer reactions to a possible character placement in Advance
No single strategy is obviously virtuous; the player cannot solve the problems associated with this model of bias through social desirability analysis. The bias is located in the game's system, not in player actions, and cannot be changed by anything the player does. Clients from the dominant group will have a promotion advantage over clients from the non-dominant group.
The question is whether the player can turn this to their own advantage, or whether it will interfere with their goal of making money through client promotion.
The second model of discrimination used in Advance is bias in perceptions of competence (Valian, 1999). In our society, people unconsciously discount the contributions of women and people of color. Several classic experiments with resumes demonstrate this effect. A white name on a resume generates 50% more callbacks than a black name on the same resume
(Bertrand & Mullainathan, 2003). Similarly, faculty were more likely to hire male than female job candidates, when the only difference between the two was the name on the resume (Moss-Racusin, Dovidio, Brescoll, Graham, & Handelsman, 2012; Steinpreis, 1999). Even the
perceived value of the resume fluctuates based on gender. Uhlmann and Cohen presented
subjects with two candidates, one male and one female. The candidates were randomly provided
with backgrounds: one had an excellent educational background and one had excellent
experience. Subjects claimed the job required whichever strength was possessed by the male
candidate, while arguing it was simply the right set of skills for the job (Uhlmann & Cohen,
2005).
This type of bias is unconscious, as many types of systemic bias are. Additionally, in hierarchical environments, this type of bias creates cumulative advantage. People who display
competence are given better opportunities and more resources. If women are repeatedly seen as
less competent than their male peers, and therefore receive fewer opportunities to display
competence, they will fall further and further behind.
Does this type of bias have a negative impact on women and minority populations?
Certainly. When evaluations are made race- and gender-blind, women and people of color
perform better. For example, when orchestra auditions were conducted behind a curtain,
women's acceptance rates soared (Goldin & Rouse, 1997). In short, the saying "Work twice as
hard to get half as far" is not far off. For a given level of recognition, women and people of color
must work harder and perform better than men or white people need to.
When modeling bias in perceptions of competence in Advance, we must capture the sense
that the discriminated-against group receives fewer rewards for the same work - or, as we will
see, works harder for the same level of reward. To do this, we must develop a model for client
competence and for the difficulty of a given job. That model can then be adjusted on a per-group
basis to reflect the reality of discrimination.
In Advance, each client has three statistics: competence, creativity, and charisma. Each client has a score between zero and twenty-five in each of these statistics. A new client begins with low scores in all three skills. Each time the client is promoted, they receive a bonus to one or more statistics. Additionally, the player may pay to upgrade any client's statistics. The higher a statistic's current level, the more the next upgrade costs.
Each job in Advance has a minimum required level for each of the three character statistics. The client must meet or exceed the required level in all three in order to be eligible for the job. If the client fails in any of the three, they cannot be placed into that job. If the player attempts to do so, nothing happens.
The game makes this information visually salient. When the player has a client selected, that client's statistics are visible. When the player selects a job, that job's requirements become visible. When both are selected, both the client's statistics and the job's requirements are shown, and any skills the client must improve are highlighted. This makes it easy for the player to see whether a financial investment would be required for that client to take the job.
Given this model of competence in a corporate environment, how is bias modeled? In this game, job requirements are raised for members of the discriminated-against group. For example, if a given job required a creativity score of three, members of the discriminated-against group would need a score of four. By raising the bar for members of this group, rather than artificially lowering their scores, the player can see that clients from this group have the same underlying capacities as their other clients. It is only the expectations of them that have changed.
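The raised-bar mechanic can be sketched as below. The +1 increment mirrors the creativity example in the text; the function names and group labels are illustrative assumptions, and the exact increments in Advance may differ.

```python
# Illustrative sketch of bias in perceptions of competence: the job's
# requirements are raised for the discriminated-against group, rather
# than the client's scores being lowered. The +1 increment follows the
# example in the text; it is not taken from the game's actual code.

def effective_requirement(base_requirement, client_group, target_group):
    """Members of the discriminated-against (target) group face a higher
    bar for the same job; everyone else sees the base requirement."""
    if client_group == target_group:
        return base_requirement + 1
    return base_requirement

def is_eligible(client_stats, job_requirements, client_group, target_group):
    """A client must meet or exceed the (possibly raised) requirement
    in every statistic to take the job."""
    return all(
        client_stats[s] >= effective_requirement(r, client_group, target_group)
        for s, r in job_requirements.items()
    )
```

Because the client's underlying scores are untouched, a player comparing the same job across two clients sees identical statistics but different requirements, which is exactly the comparison the interface is designed to support.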
The visual design of the game makes it easy for players to see the difference between requirements for members of different groups. Early play-testing revealed that players had
difficulty remembering subtle differences in job requirements, which made it impossible for
them to compare requirements between groups. Players can now keep one job selected while
switching from client to client. This allows them to easily compare the job requirements for
different characters.
This model of bias adds a new strategic dimension to the game. Players can still choose
to put any client with high enough skill scores into any job. They must still decide whether to
spend their time searching for the most efficient client-job match, or whether to spend money
liberally to match clients with jobs quickly. The only difference? On average, it will take them
longer to find an existing client-job match for their discriminated-against clients, and it will cost
them more to improve these clients for a quick match. Successful players must incorporate this
reality into their play strategy.
The true impact of bias, however, becomes apparent as it intersects with the game's
hierarchical structure. To reach the higher, more lucrative levels of the game, the client must be
promoted. However, when this promotion occurs, the client does not automatically receive a job
at the higher level. Instead, the player must once again solve the problem of placement - with
bigger risks and better rewards. The financial and temporal challenges of placing a
discriminated-against client are repeated and magnified, allowing players to see biased outcomes
more clearly as each level is unlocked.
The anomaly in this game is that the source of the bias is located in the game system itself
– it cannot be attributed to any single character. The player observes differential outcomes, for example that some characters are harder to place than others, or that some characters are promoted more slowly. However, the game does not allow for agentic explanations. There is simply no individual who is making intentionally biased choices. Although in the long run
players are expected to incorporate both agentic and systemic explanations for bias into their explanatory repertoire, the game does not give players the easy alternative of a familiar approach.
It is important to note that players do not have the opportunity to change the bias in the game. The specific group being discriminated against is randomly selected at the beginning of each playthrough and cannot be changed by the player. The player also cannot change the way that bias operates in the game system. While this runs the risk of suggesting that the game and its designer agree that bias is normal and expected, this is also authentic to the way that systemic bias operates. A biased individual can have a change of heart, or can be removed from their position if need be. Bias located in the system, however, can only be addressed at the systemic level. The inability to directly affect the biased system is part of the game’s anomaly.
The challenge is to confront players with this anomaly as effectively as possible without further reifying discrimination – and to learn something about how different strategies for doing so function.
Reward System Design
The pedagogical goal of Advance is to encourage players to notice the bias in the game system and respond to it with new play strategies, which may expand their attributions of the origins of bias and change their attitudes toward biased outcomes. However clever the game's model of bias, it is not useful if players do not engage with it. For this reason, one of the primary research goals of this project is to investigate the impact of reward systems on the player's engagement with the game's model of bias.
It is important to clarify what reward systems mean in this context. Discussions of rewards often turn to extrinsic and intrinsic motivation, two ways of understanding what motivates people to perform a particular task (e.g. Deci & Ryan, 2000; Deci,
Koestner, & Ryan, 1999; Deci, Koestner, & Ryan, 2001). According to this theory, intrinsic motivation is the motivation to perform a task because of interest or enjoyment, while extrinsic motivation is the desire to perform the task in order to obtain some outcome or object. For example, a child who reads a book because they are interested in the story is intrinsically motivated, while a child who reads a book because they want to get a good grade is extrinsically motivated.
There are extensive debates about the virtues of intrinsic and extrinsic motivation, and particularly about the role of external rewards for getting learners to engage with a task.
Significant evidence suggests that providing desirable rewards lowers intrinsic motivation to engage with tasks (Deci, Koestner, & Ryan, 1999; Deci, Koestner, & Ryan, 2001). On the other hand, positive reinforcement can be effective at shaping human behavior with external rewards.
For example, behavior change has been demonstrated in token economies, where people receive tokens (markers of value which can be redeemed for a variety of concrete rewards, from special food to computer time) in return for good behavior (Martin & Pear, 1999). In fact, one can frame much of American society as behavior that is shaped with external rewards in the form of money.
Motivation, however, is contextual. The same person may engage in a particular activity with different motivations at different times. This is a concept Deterding (2011) explains as
“situated motivational affordances.” Within a given context, we can analyze what activities
people can engage in to address their motivational needs, compared to the other options available to them in that particular situation.
Recognizing that the context of this analysis is a game, we can avoid the intrinsic versus extrinsic comparison. Advance is designed to be played as a leisure activity, where players engage voluntarily with the game on their own computers in their own time. It is true that when games become mandatory, players' response to them changes for the worse (Heeter et al., 2011).
However, this study does not attempt to motivate people to like the game, nor to obligate them to play. One group of subjects was drawn from voluntary web-based recruitment of casual gamers.
Another group was drawn from the microtask service Mechanical Turk, where subjects can choose to complete small jobs in return for payment. Although subjects in this group are being paid to play, they have still chosen to spend their free time using Mechanical Turk, and they have selected the game from many thousands of available tasks. As we will see later in this dissertation, there are differences between these two groups of players, but we can still consider both of them as freely choosing to engage with the game.
Games provide their own contextual frame within which meanings are made, which is often referred to as the “magic circle” (Huizinga, 1950; Goffman, 1974). Although this contextual frame is not inviolable (Copier, 2007; Consalvo, 2009), games do provide interpretive and imaginative frames within which a stick can become a horse, a sword can symbolize power, or green pigs can be a deadly enemy. Within these frames, games present challenges to the player. As Suits (2005) points out, the essence of playful challenges is that they are unnecessary and inefficient. For example, the rules of basketball make it less efficient, not more efficient, to put the ball through the hoop. Players expect games to set goals that are difficult but possible, and to require them to complete those goals in ways that are deliberately unproductive. In return,
players work to develop skills and strategies to overcome the challenges and accomplish their goals.
Given this context, we frame our reward system not as a motivator to play the game, but rather as a motivator for particular choices and strategies within the game. The question to ask is not "What would motivate a player to achieve this goal?" but rather "How does this goal relate to the game's system? What capacities does the player have to achieve it? What strategies does it encourage the player to develop? What behavior does it support or undermine?"
We do not expect to understand the precise motivation for players to approach the game, what alternative activities they have available, or what contextual need they are attempting to meet. For example, players may approach a game like Advance, which addresses the rather serious subject of racial and gender bias, with the goal of engaging with the subject in a safe way. Alternatively, players might approach such a game playfully, hoping for the chance to turn the world as they know it on its head. In the study as designed, we cannot know whether players are approaching the game for a dose of reality or for a dose of escape.
What we can do, however, is to randomly assign players to experimental conditions. In order to accomplish this, three different versions of Advance were created and compared. While we do not know the exact motivations for people in playing the game in the first place, we can hold those motivations constant across the three game conditions, since we are randomly assigning players to groups; similarly, we can hold them constant across all game conditions and a control group. By comparing different types of rewards, therefore, we can look at how specific design patterns for in-game rewards affect the choices players make, the strategies they develop, and the impact on their attention to the game's model and its anomalous data.
In all three reward conditions, the game explicitly challenges players to guess the bias
shown by the company the player is working with. This goal encourages players to try to figure
out the bias by exploring and experimenting with the game's system, which confronts them with
the anomalous data in the game. The assessment mechanic used to determine whether they have
successfully done so is an explicit one. Players are asked to select the group they think is being
discriminated against from a multiple-choice list. We chose an explicit assessment mechanic in part to explore a common assessment design pattern in serious games – being explicitly asked about the content the player is supposed to learn. We can therefore investigate the most effective ways to use (and avoid) this pattern.
Another assessment design issue was salience. To what extent should guessing the bias be a required part of play? The bias guess system could be made mandatory – but requiring participation would change the motivational landscape (Heeter et al., 2011). On the other hand, for research purposes, players should not completely overlook the bias guessing process. As a compromise position, Advance gives players explicit instructions to guess the bias in the tutorial, and uses a flashing button to draw their attention to the bias guess interface during play.
However, there is no penalty for not guessing the bias. In other words, the game offers a reward for a successful guess, but the player loses nothing by ignoring the system entirely; nor are players obligated to make a guess by in-game constraints.
Finally, the assessment design considered how to incentivize the appropriate behavior
(experimentation) rather than inappropriate behavior (guessing). Here we rely on the psychological principle of loss aversion (Tversky & Kahneman, 1991). When players are offered a reward for guessing the bias, they are told what reward they will get for a correct guess.
However, if they guess incorrectly, they will receive a lesser reward. This gives players the
freedom to guess even when they are not sure who is being discriminated against, and does not force them to get things right the first time; however, the game frames the full reward as something the player already possesses and need only lock in by guessing correctly.
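One way to implement this decreasing-reward scheme is sketched below; the base amount and per-miss penalty are assumed values, since the text does not specify the actual amounts.

```python
def guess_reward(base_reward, wrong_guesses, penalty_per_miss=0.25):
    """Reward paid when the player finally guesses the bias correctly.
    Each earlier miss chips away at a reward the interface frames as
    already belonging to the player (the loss-aversion framing).
    The 25%-per-miss schedule is an illustrative assumption."""
    fraction = max(0.0, 1.0 - penalty_per_miss * wrong_guesses)
    return round(base_reward * fraction)
```

Showing the player the shrinking amount before each guess is what turns a foregone gain into a felt loss.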
Within these broader constraints of assessment design, three types of reward were constructed. However, in order to understand them, we must first understand the underlying resources used to generate them.
Money is the most visible resource in Advance. It helps create both positive and negative feedback cycles during play. If the player has a lot of money, they can afford to upgrade their clients liberally. Upgraded clients are easier to place into jobs, earning the player even more money, and affording the player more freedom of choice in their in-game actions. However, players must also regularly pay in-game expenses. Players must earn money faster than they lose it; if a player runs out of money entirely, they immediately lose the game. Money therefore provides players with in-game opportunities, while simultaneously serving as a hedge against disaster. Taken together, these factors make money valuable within the frame of the game.
Money also serves a signaling function. Players receive money for engaging in the target behavior, namely placing clients into jobs. At the end of the game, the player's score is calculated based on how much money they have. Money, therefore, tells the player how well they are doing at the game. A player who has a lot of money can, by definition, be confident that their strategies and techniques are successful. A player who has very little money is either playing poorly, or taking a significant risk for a later payoff.
Another resource in Advance is information. When players understand the way the game system works, they can design strategies to take advantage of it. For example, if a particular
client group is being discriminated against, the player can make an informed decision about how much time and money to invest in members of that group. They may choose to invest more time, because they know it takes more effort to place characters from that group successfully, or they may choose to ignore the character entirely for the same reason. In both cases, however, the information lets the player make better judgments about how to behave in the game, and gives them more insight into how to meet the game's challenges.
The reward systems created for Advance use information and money in different ways to incentivize engagement with the bias guess system, based on common design patterns for rewards in games (Bjork & Holopainen, 2004).
In the informational game condition, when players guess what the bias is, they learn whether they were right or wrong. The benefit of a correct guess is, in this condition, purely informational. As we have seen, characters who are being discriminated against cost more money to place, and are promoted more slowly than their peers. If the player recognizes this, they can develop strategies to compensate, which will earn them more money in the long run.
These strategies may be supportive of these characters, such as seeking placements that are minimally stressful; the player may also develop strategies that involve giving these characters little support and removing them from the game as quickly as possible. In both cases, the bias information can help players play better, because it helps them figure out how to allocate their time and effort.
In the financial game condition, players are rewarded for a correct guess with confirmation, but also with a one-time award of money. This monetary reward is worth many times as much as placing a single character, making it an appealing way for the player to earn money. The player is incentivized to identify the bias rather than simply guess at it, because the
reward amount decreases with each wrong guess. However, the player does not need to do anything with their knowledge of the bias beyond displaying it. The knowledge can remain entirely inert. Of course, the player may choose to apply it to strategy development, as in the first condition. However, there is no particular incentive for the player to do so.
Finally, in the generative game condition, players receive both confirmation of their hypothesis and an opportunity to earn more money. However, if they wish to follow up on the opportunity to earn more, they must develop strategies for managing the bias in the game.
Specifically, the player receives a significant monetary bonus for each member of the discriminated-against group they place. As with the financial reward, guessing is disincentivized by reducing the bonus for each incorrect guess. Unlike the previous case, however, the player receives no reward unless they act on the knowledge they have gained. In order to receive the bonus, the player must work out strategies of some kind for placing discriminated-against characters, which requires them to engage with the game's model in a new way. Characters from the discriminated-against group now provide an opportunity for extra profit, rather than requiring an extra investment of time and money. Players must reconsider their strategies in light of this new information. Additionally, players may now feel invested in placing these individuals in order to receive their "rightful" payoff, which they have put effort into acquiring.
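The three conditions can be contrasted in a small sketch. The amounts, field names, and function structure are illustrative assumptions; what matters is where the money attaches: to the guess itself (financial), to subsequent placements (generative), or nowhere (informational).

```python
def on_correct_guess(condition, state):
    """Apply the reward for a correct bias guess under each condition.
    Amounts and state fields are illustrative assumptions."""
    state["bias_confirmed"] = True           # every condition confirms the guess
    if condition == "financial":
        state["money"] += 500                # one-time lump sum
    elif condition == "generative":
        state["placement_bonus"] = 100       # future per-placement bonus
    # "informational": confirmation alone is the reward
    return state

def on_placement(state, client_discriminated):
    """Placements always pay a base fee; in the generative condition,
    placing a discriminated-against client also pays the earned bonus."""
    state["money"] += 50                     # assumed base placement fee
    if client_discriminated and state.get("placement_bonus"):
        state["money"] += state["placement_bonus"]
    return state
```

In the financial condition the knowledge can stay inert once the lump sum is paid; in the generative condition the reward only materializes through continued placements, which is what forces strategy change.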
Because the game's bias is different every time, players must repeatedly go through the bias-discovery process, no matter which of the three reward systems they are working with. The process of bias-discovery is precisely what PMAD theory suggests players should engage with, because we propose that engagement with this process will in turn drive measurable changes to their ideas about racial and gender bias, and to their attitudes toward it. By looking at the
differences between the reward systems, we can understand how different reward mechanics may drive players to engage with the bias discovery process in different ways.
This chapter has drawn together a wide range of theoretical and empirical work to define
PMAD theory, an approach to creating games that help shift players' conceptions of complex systems. It also described Advance, a game built on PMAD theory to help players change their attributions and attitudes around racial and gender bias. In the next chapter, we describe the research methods used to empirically test the effectiveness of the game and to look at differences between the three versions of the game.
Chapter 4: Methods
This project investigates three questions involving PMAD (playable model – anomalous
data) design theory. First, can a PMAD-based game that models systemic racial and gender bias
change players' likelihood of using systemic explanations for incidents of racial and gender bias?
How does it compare to more traditional methods of education, such as reading a text? Does the
reward structure (informational, financial, or generative) in such a game affect the likelihood of
the player using systemic explanations? Second, can such a game change players' attitudes about
racial and gender bias, and how does it compare to reading? Do differences in reward structure
help players shift their attitudes? Finally, how do differences in game-play behaviors affect
players' attributions and attitudes?
In this study, data is collected from subjects about their likelihood of using systemic
rather than agentic explanations for incidents of racial and gender bias, and about their attitudes
toward racism and sexism. The study uses a Solomon eight-group research design; half the subjects have data collected both before and after the experimental intervention (playing one of three versions of the custom-designed PMAD game Advance, or reading a control text), while half receive only a post-test to control for possible priming effects of the pre-test questions. This design allows us to investigate how players' attributions and attitudes change under the four different conditions.
By looking at pre-test to post-test change in the likelihood of using a systemic attribution, we can begin to understand whether PMAD games can help people adopt a systemic mindset. By understanding the difference in impact between playing a PMAD game and reading a piece of text on the same topic, we can further refine how to use the PMAD design principles to achieve conceptual change on a socially charged issue.
Similarly, we can look at pre-to-post change on measures of racially- and gender-biased
attitudes to understand whether PMAD games can help change players' attitudes toward racism
and sexism; by comparing the impact of the game to the impact of text, we can work toward
making PMAD games more effective.
Finally, we can compare both attitude and attribution measures across the three different
game conditions, in order to understand the comparative effect of different reward systems
within a PMAD game. Finding differences between reward systems tells us how to make the
most effective design choices for conceptual change using PMAD games as an intervention, but
it also indicates new research questions about the impact of these reward types in other kinds of
games for impact. Additionally, designers can use this work to choose appropriate reward
systems to focus player attention within complex game models.
Research Questions
As discussed above, this project investigates three questions. First, can a PMAD
(playable model – anomalous data) game that models systemic racial and gender bias change
players' attribution styles around issues of racism and sexism, and how does it compare to
reading a text? Second, can such a game change players' attitudes about racial and gender bias,
and how does it compare to a text-based intervention? Finally, does the reward type
(informational, financial, or generative) in such a game affect either the likelihood of the player
using systemic explanations or of shifting their attitudes?
We begin by considering the question of attribution. As outlined in earlier chapters, we
believe that games designed using the PMAD principles may be able to induce conceptual
change. Players must engage deeply with anomalous data and therefore with the game's model of the target domain. We therefore hypothesize that if a game presents an alternate model of racism and sexism, players will show conceptual change related to those models after playing the game,
and that it will differ from the change experienced after reading text.
First, we investigate the differences in impact between the three game conditions and the
control activity (reading a piece of text) on the player's likelihood of using systemic explanations
for racism and sexism.
Q1: Controlling for pre-test scores, are there differences in attribution test
scores for race and gender across the four study conditions?
This analysis looks at the overall impact of the game. However, not all players play with equal attention, engagement, or success. We investigate whether successful game play affects players more than simple exposure to game play, and whether there are differences in attribution post-test scores based on other game performance measures.
Q2: Are there associations between in-game measures (such as player score,
number of characters placed, and number of game plays) and attribution test
scores?
In order to make claims about the game's impact, it is important to understand not only whether the game impacts players, but how. As outlined in the game design portion of this paper, we believe that the reward system for bias detection can be manipulated to produce different effects in players. Comparing a “financial” reward system of the sort often found in educational games to a reward system that provides information for improved play and to a reward system which encourages strategy change after bias detection, we expect to find differences in the
effectiveness of the game as an intervention on players' attributions of racial and gender disparities. As above, we investigate whether in-game performance differences correlate with the post-test measures across the reward conditions.
We investigate four different conditions for the bias guess hypotheses, as not all players interacted with the bias guess system. Players who did not interact with the bias guess system never learned which reward condition they were in or even that they would be rewarded for guessing the bias, hence they must be treated as a separate group for purposes of understanding the differences between reward conditions. The groups analyzed, therefore, are as follows: players who did not interact with the bias guess system (no guess), players who interacted with the bias guess system in the informational condition (informational guess), players who interacted with the bias guess system in the financial condition (financial guess), and players who interacted with the bias guess system in the generative condition (generative guess).
Q3: For game players, are there differences in measures of systemic
understanding of racism and sexism across bias guess conditions?
Q4: For game players, are there differences in game performance measures
across bias guess conditions?
Additionally, games have a powerful affective impact on players, even when they are not specifically designed to do so. We therefore hypothesize that game-play will have at least a temporary effect on players' attitudes about race and gender. Because bias serves as an obstacle for the player, we predict that players will be more sympathetic to the struggles of discriminated-
against groups in American society. Specifically, we hypothesize that players will perform differently on existing measures of racism and sexism after playing the game, and that the impact of the game will differ from the impact of reading text. As with the attribution test, we investigate the impact of game-play on players' attitudes about race and gender.
Q5: Controlling for pre-test scores, are there differences in attitude test scores
for race and gender across the four study conditions?
Q6: Are there associations between in-game measures (such as player score,
number of characters placed, and number of game plays) and attitude test
scores?
We also investigate the differences between reward conditions for players' attitudes about racism and sexism. As above, we include players who did not guess the bias as a separate group, and compare them to players who attempted to guess the bias in each of the three reward conditions.
Q7: For game players, are there differences in measures of attitudes toward
racism and sexism across conditions?
As we investigate each of these seven questions, we control for player race and gender.
There are many questions one could ask using the player's race and gender, such as the impact of an in-game racial or gender bias that matches (or does not match) the player's race or gender.
However, for the purposes of this project, we limit our analysis to whether there are differences
across race and/or gender groups.
Procedures
This study compares the effect of three different versions of the Advance PMAD game, and of a control text, on players' conceptions of racism and sexism. It uses a Solomon eight-group design across the three game conditions (informational reward, financial reward, and generative reward) and the control condition. This design allows us to look at pre-test to post-test
differences, while parceling out the potential impact of pre-test priming on post-test scores. It
also allows us to examine differences between the three game conditions, and between the
treatment condition and the control.
Subjects who arrived at the Advance site were presented with IRB-approved study
information and asked whether they consented to participate. Individuals who chose not to
participate could still play the game, but their in-game data was not collected and they did not
receive any pre-test or post-test material associated with the study.
Upon obtaining consent, the study followed the procedure summarized in Figure 5.
Consenting subjects were randomly assigned to a pre-test / no pre-test condition. Subjects who were assigned to the pre-test condition completed the pre-test before playing the game; subjects assigned to the no pre-test condition moved immediately to the next step.
The pre-test consisted of the Attribution Test, the Symbolic Racism Scale and the Modern
Sexism scale. The tests were presented in a random order, and the order of questions within each test was randomized.
When an attribution pre-test was generated, five of the ten test questions were randomly
chosen to use the race version of the question. The other five questions used the gender version of the same question. The order of questions was randomized, and the order of answers was randomized for each question.
The Symbolic Racism and Modern Sexism scales were administered as per the test instructions. However, the Likert scales in the two tests as originally designed used opposite codings; the Symbolic Racism scale uses 1 to indicate Strongly Agree (Henry & Sears, 2002), while the Modern Sexism scale uses 1 to indicate Strongly Disagree (Swim et al., 1995). For consistency, the Modern Sexism scale's Likert scales were flipped when presented to the user.
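A sketch of this test-assembly and recoding logic follows; the question representation and the scale endpoint are illustrative assumptions, not taken from the study materials.

```python
import random

def build_attribution_test(questions, rng=random):
    """Assemble a ten-item attribution test: five randomly chosen items
    use their race version, the other five their gender version, and the
    final order is shuffled. `questions` is assumed to be a list of
    dicts with 'race' and 'gender' variants (a hypothetical structure)."""
    race_ids = set(rng.sample(range(len(questions)), 5))
    test = [q["race"] if i in race_ids else q["gender"]
            for i, q in enumerate(questions)]
    rng.shuffle(test)
    return test

def flip_likert(score, scale_max=5):
    """Reverse a Likert response so the Modern Sexism scale's coding
    matches the Symbolic Racism scale's (1 = Strongly Agree). The
    five-point endpoint is an assumed default."""
    return scale_max + 1 - score
```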
After completing or skipping the pre-test as per their assigned condition, each subject was randomly assigned either to the control group, or to one of three bias detection reward conditions: informational rewards, financial rewards and generative rewards.
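Taken together, the two randomizations above (pre-test presence and study condition) produce the eight Solomon groups. A minimal sketch of the assignment, with condition names taken from the text:

```python
import random

CONDITIONS = ["control", "informational", "financial", "generative"]

def assign(rng=random):
    """Independently randomize a subject into one of the 2 x 4 = 8
    Solomon groups: pre-test presence crossed with study condition."""
    pretest = rng.choice([True, False])
    condition = rng.choice(CONDITIONS)
    return pretest, condition
```

Because the two draws are independent, motivations and other subject characteristics are held constant in expectation across all eight cells.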
In the control condition, players were presented with a brief text about the two core concepts addressed by the game: microaggressions and the cumulative impact of small differences. The former was adapted from the classic text on microaggressions by Sue (2010); the latter was adapted from Valian's work on gender schemas (1999). After reading the text, players were asked a factual question about the text to test their understanding, and asked how much they enjoyed reading the text. (See Appendix G for the control text, and Appendix H for the check question details.) Players were then taken to the post-test.
Figure 5. Study overview flowchart
In the game conditions, the player first played through a short tutorial. The tutorial explained four key game concepts: character placement, promotion, upgrades, and changing levels. The tutorial asked players to practice performing each task, and only began the next tutorial segment when the player succeeded. However, players could end the tutorial at any time and begin the game.
The player then played the game in the appropriate reward mode, either until they won by surviving for five minutes or until they lost the game by running out of money.
At the end of the game, the player was invited to "End Play" or "Play Again." The first
time the player encountered this screen, the "End Play" button was grayed out and inaccessible, since subjects must play twice before beginning the post-test. On their second time through the game, a new randomly chosen in-game bias was selected; the only constraint was that it could not be the same as the bias chosen in the first game. However, the player continued to play the game in the same reward mode. After playing a second time, the player had the option to select either
"Play Again" or "End Play." Players could select "Play Again" as many times as they liked, receiving a new bias each time but always remaining in the same treatment condition. When the player selected "End Play" they were asked a factual question confirming their understanding of game content, and asked about how much they enjoyed the game.
After completing the game comprehension and feedback questions, all players were sent to the post-test. The post-test consisted of the Attribution Test, the Symbolic Racism Scale
(Henry & Sears, 2002), and the Modern Sexism Scale (Swim et. al., 1995). As in the pre-test, the tests and questions were presented in a randomized order.
If a pre-test was administered, the attribution post-test included the ten questions not used in the pre-test (five race questions, five gender questions). If the player did not receive a pre-test, a ten-question attribution test was generated as for the pre-test: five randomly chosen questions used scenarios featuring race, and the other five used scenarios involving gender. The order of questions and the order of answers within each question was randomized. The Symbolic Racism and Modern Sexism scales were presented again in full.
At the end of the post-test, demographic data was collected from all participants.
After completing the post-test, web-recruited subjects had the option to submit an email address if they wanted to be entered in the raffle for an iPad. Mechanical Turk subjects were given a unique, randomly-generated code to submit to the Mechanical Turk site in order to be
paid for their participation.
Finally, all subjects were presented with links to organizations promoting racial and
gender equality, as required by the IRB.
At this point, players in the three game conditions were done with the experiment; as they had not yet had a chance to play, players in the control condition had the option of playing the game once the study was complete.
Subjects
The target population for the study was "adult American players of online casual games."
As such, web-based recruitment took place on sites devoted to gaming, social media
communities related to gaming, and email lists of game-players. Potential subjects were provided
with a link to follow if they wished to participate in the study. Subjects were also informed that if
they completed the study, they would have the opportunity to enter a drawing to win an iPad.
In addition to the open online recruitment process, an additional subject pool was
recruited from Mechanical Turk, Amazon's microtask assignment site. In addition to the
logistical benefits of the service, the Mechanical Turk group serves to control for selection bias.
Recruiting through gaming communities is a snowball sample; the reach of Mechanical Turk is
broader. The former group is composed entirely of self-identified gamers who participate in
online gaming communities, while the latter group may include self-identified gamers as well as
participants who do not primarily see themselves as gamers. Using both data sources allowed us
to compare two different sources of potential players. Mechanical Turk subjects who completed
the entire study were paid $5 for their time.
During the consent process, subjects were asked to provide their age in the demographic
section of the survey. If subjects consented to the study, but later reported an age under eighteen,
their data was discarded.
Four hundred valid responses were required to achieve a power over .8, given a two-way
ANOVA model with game condition (control, informational, financial, and generative) and pre-
test condition (present, absent) as independent factors, and assuming a “small” effect size
(Cohen's d of .2). Recruitment therefore continued until over four hundred valid responses were
recorded.
A total of 703 subjects began the study. Of these, 72 completed the study but provided invalid
data, and 412 provided valid responses. See Table 2 for details.
Table 2
Subjects by recruitment source and treatment condition

                        Mechanical Turk            Web-recruited
Condition               Pretest*    No pretest     Pretest*    No pretest
iReward                 n = 30      n = 20         n = 25      n = 18
fReward                 n = 24      n = 15         n = 28      n = 14
gReward                 n = 19      n = 26         n = 26      n = 24
Text (control)          n = 26      n = 31         n = 43      n = 42

Note. iReward, fReward, and gReward are the experimental (game-task) conditions; Text is the on-screen text-task control.
* Pretests were: Modern Sexism Test, Symbolic Racism Test, Attribution Test
Instruments

The game intervention is a custom-designed Flash-based online game built using the
PMAD design principles (Table 3), which models a systemic (but heavily simplified) approach to racism and sexism. Players take the role of a recruiter trying to place clients into biased organizations, and are encouraged to understand and manipulate the game's bias in order to maximize their score.
The study uses multiple instruments to collect different types of data before, after and
during game-play. Specifically, the study collects data on in-game player behavior, on player attributions of racism and sexism, on player attitudes about race and gender, and on player demographics.
Table 3
PMAD design principles

1. The game system models the relevant domain.
2. Player actions affect, and are affected by, the model.
3. Players receive feedback about the impacts of their actions as they relate to the model.
4. The game goals point players toward model conflict.
5. Players can experiment with the game's model.
6. Players must figure out rules and strategies for themselves.
As described above, consenting subjects were randomly assigned to the pre-test or no pre-test condition. The pre-test consisted of the Racism and Sexism Attribution Tests, the
Modern Sexism Scale, and the Symbolic Racism Scale.
Attribution tests. Because this project investigates conceptual change, we examine
subjects' underlying conceptual models of how racism and sexism function. Specifically, we
hope to uncover whether subjects see racism and sexism as direct processes, isolated episodes
resulting from the actions of individuals, or as systemic processes that are created by multiple,
often simultaneous, factors in the absence of a single agent (Chi & Roscoe, 2002; Chi, 2008;
Johnson, 2002; Meadows, 2008; Forrester, 1961).
Existing measures of conceptual change often analyze concept maps to understand how
ideas are related to each other (Markham, Mintzes, & Jones, 1994; Wallace & Mintzes, 1990).
However, because this project aimed for a large subject population, it required a scalable solution
that can be deployed online. Additionally, we are primarily interested in how likely players are to
attribute discriminatory behavior to individual or systemic causes, not in the details of the
underlying models. It seemed possible to develop a simpler test that is also more amenable to
automated analysis.
We developed a "Racism Attribution Test" and a "Sexism Attribution Test." Each of these
tests measures whether subjects are more likely to use a direct-process model for understanding discrimination, or whether they are more likely to use a systemic model. As discussed later, each test was verified by domain experts to ensure that it assesses the target area accurately.
Each test contains ten questions, each of which presents a brief scenario in which unequal
outcomes occur. Test questions are paired; for each question in the Racism Attribution Test, there
is a parallel scenario in the Sexism Attribution Test, and vice versa. The Sexism Attribution Test
can be found in Appendix B, while the Racism Attribution Test can be found in Appendix C.
For purposes of this study, characters who benefit from this inequality are white or male;
characters who are discriminated against are non-white or female. Subjects are then asked to
choose the most likely explanation of the scenario.
To avoid conflation of attitudes about prejudice with attributions of prejudice, subjects
are provided with four explanatory options. Along one axis, the responses vary based on attribution, while along the other axis, responses vary based on attitude. For each question, therefore, the responses are framed as shown in Table 4.
Table 4

Attribution test answer categories

                            Unfair Outcome                         Fair Outcome
Individual Explanation      "Individual choices resulted in        "Individual choices resulted in
                            an unfair outcome."                    an appropriate outcome."
Systemic Explanation        "Systemic factors resulted in          "Systemic factors resulted in
                            an unfair outcome."                    an appropriate outcome."
A sample question runs as follows:
The anthology "Best Short Stories By New Writers" is compiled by a single editor.
The editor chooses stories for the anthology from obscure small-press literary
magazines.
This year's anthology contains only stories by white writers.
Which of the following explanations would you consider most likely to be true?
1. The editor was racist in selecting the stories.
2. White authors are better at writing stories that appeal to an audience of
sophisticated, literary readers.
3. The editor selected the best stories without considering race.
4. Non-white authors are underrepresented in small-press literary magazines.
Fourteen questions were developed and reviewed by domain experts to ensure they appropriately represented systemic and individual explanations of racism and sexism. Next, a
pool of subjects was recruited to perform a sorting task on the answers using Mechanical Turk.
After training on the definitions of each category, subjects were asked to identify which answer
fell into which category. Subjects were able to assign as many answers to each category as they
wanted, so the ability to correctly assign answers to categories demonstrates that subjects were
indeed able to discriminate between systemic and agentic outcomes.
Binomial analyses were performed to see whether players were able to correctly assign answers to “systemic” and “agentic” categories at a statistically significant level. For each of the
112 answers (four answers to each of fourteen race questions and fourteen gender questions), we conducted a binomial test, using “answer is systemic” and “answer is agentic” as the categories.
In order to pass the validation stage, all four answers to the question had to be correctly assigned to the systemic or agentic categories with p < .05. Additionally, both the race and gender versions of the question had to pass in order to preserve parallelism between the tests.
Ten questions passed the validation stage and were integrated into the final test. See
Appendix D for the data table from the validation study.
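The pass/fail rule above can be sketched with an exact two-sided binomial test. The function below is our illustration, not the study's actual analysis code, and the rater counts in the example are hypothetical.

```python
from math import comb

def binom_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial test: sum the probabilities of every
    outcome no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    observed = pmf[k]
    return min(1.0, sum(q for q in pmf if q <= observed + 1e-12))

def answer_passes(category_votes: int, n_raters: int) -> bool:
    """An answer passes validation when raters sort it into one category
    significantly more often than chance (p < .05)."""
    return binom_two_sided_p(category_votes, n_raters) < 0.05

# Hypothetical rater counts, for illustration only:
# 18 of 20 raters agreeing passes; an 11/20 split does not.
```

Under this rule, a question survives only if all four of its answers, in both the race and gender versions, pass.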
As described in the procedures section, when an attribution pre-test was generated, five of the ten test questions were randomly chosen to use the race version of the question. The other five questions used the gender version of the same question. The order of questions was randomized, and the order of answers was randomized for each question. The answer chosen by the subject was stored for analysis.
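The randomized test generation just described can be sketched as follows; the function and data layout are ours, for illustration only.

```python
import random

def generate_attribution_test(question_ids, rng=random):
    """From ten paired questions, randomly pick five to appear in their
    race version; the remaining five use the gender version. Question
    order is then shuffled (answer order is shuffled separately)."""
    race_ids = set(rng.sample(question_ids, 5))
    test = [(qid, "race" if qid in race_ids else "gender")
            for qid in question_ids]
    rng.shuffle(test)
    return test

test = generate_attribution_test(list(range(10)), random.Random(1))
```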
Attitude analysis. In addition to understanding players' attributions of racism and sexism, we also hope to examine their attitudes about race and gender. While it might be possible to extract data on this from the attribution test, due to its four-cell design, we prefer to use existing measures when they fit the needs of the project.
As it becomes less acceptable to express overt racism and sexism in American society,
subjects become more likely to respond in a socially appropriate way to measures of explicit or
"old-fashioned" racism and sexism (Schuman, Steeh, Bobo, & Krysan, 1998; Swim, Aikin, Hall,
& Hunter, 1995). These measures of explicit racism and sexism no longer correlate meaningfully
with racist and sexist attitudes.
In response, researchers have developed scales that measure implicit or "modern" racism
and sexism. These measures correlate highly with discriminatory attitudes and behavior, and do
not evoke social desirability bias. The Symbolic Racism Scale, despite some debate over what
precisely it measures, has been validated as a measure of racial bias (Henry & Sears, 2007;
Sears, 2010). The Modern Sexism Scale, similarly, has been validated as an appropriate measure
of gender bias (Swim et al., 1995). The Modern Sexism Scale can be found in Appendix E of
this document, and the Symbolic Racism Scale is included as Appendix F.
The Modern Sexism Scale and the Symbolic Racism Scale were administered to all
subjects as part of the pre-test. As described in the procedures section, the Likert scales in the
two tests as originally designed used opposite codings; the Symbolic Racism scale uses 1 to
indicate Strongly Agree (Henry & Sears, 2002), while the Modern Sexism scale uses 1 to
indicate Strongly Disagree (Swim et al., 1995). For consistency, the Modern Sexism Scale's
Likert scales were reverse-coded when presented to the user.
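Reverse-coding a Likert response is a one-line transformation. The sketch below assumes a five-point scale by default, which may not match the scales' actual number of points.

```python
def flip_likert(response: int, points: int = 5) -> int:
    """Reverse-code a Likert response on a `points`-point scale:
    1 <-> points, 2 <-> points - 1, and so on."""
    return points + 1 - response

print([flip_likert(r) for r in [1, 2, 3, 4, 5]])  # [5, 4, 3, 2, 1]
```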
Subjects who were assigned to one of the three game conditions had their in-game data
collected. Subjects in these conditions who received a pre-test began play after completing the
pre-test, while subjects who did not receive a pre-test began play immediately upon consenting to participate in the study.
In-game data collection. Advance tracks consenting players' in-game behavior using a
custom-developed Google App Engine data proxy. Every action the player takes, such as placing
a client or inspecting a job, is recorded and sent to a central database, from which it can be
downloaded and analyzed. This repository of in-game data can be used to follow a particular player's progress through the game, or to inspect player activity in the aggregate.
For the purposes of this project, we analyze the following in-game events, all of which contribute to the player's final score:
• A character enters the player's client list
• The player places a character into a job
• The player pays to upgrade a character
• A character is promoted or demoted
All relevant data is collected for each event. For example, when a character enters the player's client list, the game records the character's race and gender, the level the character is on, the character's name and statistics, and at what time the event occurred. Additionally, we collect data unique to each instance of the game, such as which version of the game the player received, what group was being discriminated against, at what time the player identified the game's bias, and the player's final score.
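An event record of this kind might be assembled and serialized as below; the field names and event types are hypothetical, not the game's actual schema.

```python
import json
import time

def make_event(event_type, game_id, **details):
    """Assemble one in-game event record for the data proxy.
    Every event carries its type, the game instance id, and a
    timestamp, plus event-specific details."""
    return {"event": event_type, "game_id": game_id,
            "timestamp": time.time(), **details}

record = make_event("client_added", game_id="g-001",
                    race="Black", gender="female",
                    level=2, name="A. Client", stats={"skill": 7})
payload = json.dumps(record)  # what would be POSTed to the proxy
```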
Taken together, this data can give us a picture of whether players are successful or unsuccessful in their play, and how effective they are at countering the bias in the system through skillful choices. In turn, we can correlate this data with more explicit instruments to see what play experiences best promote understanding.
After completing play, all subjects received a post-test. The post-test consisted of the
Racism and Sexism Attribution Tests, the Modern Sexism Scale, and the Symbolic Racism Scale.
As described in the procedures section, if a pre-test was administered, the attribution
post-test included the ten questions not used in the pre-test (five race questions, five gender
questions). If the player did not receive a pre-test, a ten-question attribution test was generated as
for the pre-test: five randomly chosen questions used scenarios featuring race, and the other five
used scenarios involving gender. The Modern Sexism and Symbolic Racism scales were re- administered in full.
Demographic data. After completing the post-test, all subjects were asked to provide demographic data. Subjects were asked to provide their race and gender. Additionally, they were asked to provide other demographic information that is known to influence levels of racial and gender bias: their age and what type of region they live in (urban, suburban, rural) (Schuman,
Steeh, Bobo, & Krysan, 1998). We collect this data in order to control for the impact of player race, gender, age, and living situation on attribution and attitude test performance. See Appendix
I for the demographic questions used in the study.
Data Processing

Data was collected through a custom-designed Google App Engine data proxy connected to web forms. Player actions were recorded and transmitted to the proxy, which stored them in a database. Upon completion of the data collection period, all data was downloaded from the site for processing.
At this point, invalid data sets were removed from the analysis pool if any of the following three factors were present: if the subject reported an age under eighteen, if the subject reported a country of origin other than America or a first language other than English, or if the subject did not complete the study.
The subject pool was limited to eighteen and over in order to comply with standards of informed consent. Subjects were informed during the consent process that they must be over the
age of eighteen. If subjects consented and later reported an age under eighteen, their data was
immediately deleted without being processed or analyzed. Data from non-American subjects and subjects for whom English was not their first language was archived for future analysis. Data from non-American subjects was not used for the current study because some evidence suggests there may be cross-cultural differences in attribution styles (Choi, Nisbett, & Norenzayan, 1999;
Norenzayan, Choi, & Nisbett, 2001). A separate cross-cultural study would be required to properly analyze this data. The analysis was limited to English-speakers because both the game tutorial and the control condition required reading English text. Finally, incomplete data sets were used to test comparability between completers and non-completers. Data was sent to the data proxy for storage after the pre-test, after gameplay, and after the post-test. This allowed the collection of pre-test data for subjects who did not complete the study in its entirety. Non- completers' pre-test scores on the attribution and attitude tests were compared to the pre-test data of completers using t-tests to verify basic comparability between the groups.
Valid data sets, including pre-test data from non-completers, were processed for analysis as follows.
Attribution data. As noted in the design of the attribution test, each question on the attribution test has four answers: systemic-positive, systemic-negative, individual-positive and individual-negative. For each question, we record what answer the subject provided. Data is recorded separately for questions involving race and questions involving gender.
Given this basic data, we calculate a systemic-sexism score by combining the number of questions about gender answered with the systemic-positive or systemic-negative categories, taken together. We calculate a systemic-racism score in the same fashion, using the questions regarding race. For both race and gender, a higher number means the subject answered more
questions using systemic explanations.
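The scoring rule reduces to counting systemic answers per domain; a minimal sketch, with a data layout of our own choosing:

```python
def systemic_scores(answers):
    """answers: (domain, category) pairs, where domain is 'race' or
    'gender' and category is one of 'systemic-positive',
    'systemic-negative', 'individual-positive', 'individual-negative'.
    Returns (systemic_racism, systemic_sexism) counts."""
    racism = sum(1 for d, c in answers
                 if d == "race" and c.startswith("systemic"))
    sexism = sum(1 for d, c in answers
                 if d == "gender" and c.startswith("systemic"))
    return racism, sexism

answers = [("race", "systemic-positive"), ("race", "individual-negative"),
           ("gender", "systemic-negative"), ("gender", "systemic-positive"),
           ("gender", "individual-positive")]
```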
Attitude data. The Symbolic Racism scale and Modern Sexism scale both contain
scoring instructions for how to convert subjects' responses into a single number representing
their total score. These instructions were used to calculate subjects' scores on these scales. The
instructions for the Modern Sexism Scale were adjusted to account for the reverse-coding of the
Likert scales. Given this adjustment, for both the Symbolic Racism scale and the Modern Sexism
scale, a higher score means less evidence of biased attitudes around race and gender.
In-game data. We conceptualize in-game data in two different ways. First, we use game- play data to measure player performance. Second, we examine whether in-game performance is affected by the reward condition. Since every action by the player is recorded, we must distill an enormous mass of data to determine appropriate metrics for addressing each of these questions.
In other words, we must operationalize player success so that we can measure it.
For the purpose of measuring the success of players, we are concerned with their ability to understand and manipulate the game's system. Fortunately, the game has a built-in metric for determining which players are best able to manipulate the game's system: the score. A high score reflects skillful play, while a low score reflects a poor understanding of the game or poor skills at manipulating it.
Unfortunately, the score displayed to the player will vary between the three versions of the game. In the financial-reward condition, the player receives a fixed bonus to their score for detecting the game's bias; in the generative-reward condition, players receive a score multiplier for placing certain kinds of characters; and in the informational-reward condition players receive no bonuses at all. This makes the raw score a useful measure for displaying feedback to an individual player, but not useful for comparing the performance of different players across
different versions of the game.
Instead, we calculate a version-independent score, which omits the special cases of the one-time bonus (in the financial-reward condition) and the score multiplier (in the generative- reward condition). The version-independent score is calculated during play and updated every time the player's actual score is updated. For example, it is updated when the player successfully places a character, when the player promotes a character, when they spend money on an upgrade, or when expenses are deducted from their score. Like the score visible to the player, this score is based on the player's success in placing characters and the level at which they place them.
However, because it omits all financial and generative rewards, it is calculated identically across all three versions of the game.
This version-independent score lets us determine whether there are performance differences between versions of the game. Because it distills player success at character placement to a single number, it allows us to see whether players are more successful at placing characters, on average, in a particular game version.
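One way to maintain the two scores in parallel is sketched below, under our own naming rather than the game's source code.

```python
def apply_scoring_event(state, points, reward_extra=0.0):
    """Apply one scoring event. `points` is the base change (placement
    credit, promotion, upgrade cost, expenses); `reward_extra` is any
    financial bonus or generative-multiplier surplus, which counts
    toward the displayed score only, keeping the version-independent
    score comparable across game versions."""
    state["displayed"] += points + reward_extra
    state["version_independent"] += points
    return state

state = {"displayed": 0.0, "version_independent": 0.0}
apply_scoring_event(state, 100.0, reward_extra=50.0)  # placement + bonus
apply_scoring_event(state, -20.0)                     # expenses hit both
```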
Using the version-independent score, we also analyze the impact of game version on a key in-game challenge – placing members of discriminated-against groups into jobs. For that reason, we also calculate the percentage of the version-independent score generated from placing each type of character. We can then calculate the percentage of the player's total score received from members of the discriminated-against group. These scores are normalized based on the total number of characters the player encountered from each group.
The version-independent score is used instead of the raw score in all score-related analyses in this study.
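One plausible reading of that normalization — score per character encountered, then the discriminated-against group's share of the normalized totals — is sketched below. The exact formula used in the study may differ, and the numbers are invented.

```python
def placement_share(score_by_group, encountered_by_group, group):
    """Share of (version-independent) score credited to `group`, after
    normalizing each group's score by the number of characters the
    player encountered from that group."""
    per_char = {g: score_by_group[g] / encountered_by_group[g]
                for g in score_by_group}
    return per_char[group] / sum(per_char.values())

share = placement_share({"advantaged": 900.0, "discriminated": 300.0},
                        {"advantaged": 30, "discriminated": 20},
                        "discriminated")
```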
Demographic data. Players are asked to select one of three gender categories (male,
female, other). These three categories were the only options available; the player must choose
one from a drop-down menu. Players are asked to select one or more racial categories, which
include the four categories used in the game as well as additional categories from the United
States Census (Humes, Jones, & Ramirez, 2010). Subjects could use the check-boxes to select
more than one racial identity. Finally, players are asked to provide their age, country of origin,
community type (urban, suburban, or rural), and first language.
Gender, age, and community type were used as-is for analysis. However, players were
grouped into three broad racial categories. All players who reported their racial identity as White,
and did not select any other racial categories in addition, were grouped into a White category.
Players who reported Black, Hispanic, or both were grouped into a Black and Hispanic category.
Finally, all remaining players were grouped into an Other category. Black and Hispanic Americans are, broadly
speaking, disadvantaged by the racialized power structure of the United States, while White
Americans primarily benefit from it. We therefore choose these analytic categories to investigate
the impact of relative advantage and disadvantage on players' experiences with the game.
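The grouping rule can be sketched as below. How mixed selections that include Black or Hispanic alongside other races are handled is our assumption, since the text does not spell it out.

```python
def race_category(selected):
    """Collapse a multi-select race response into the three analytic
    categories. Assumption: any selection including Black or Hispanic
    falls into the Black and Hispanic category."""
    s = set(selected)
    if s == {"White"}:
        return "White"
    if s & {"Black", "Hispanic"}:
        return "Black and Hispanic"
    return "Other"
```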
Data Analysis

Before investigating the study's questions, three preliminary analyses were conducted.
First, we compared the web-recruited players to players recruited through Mechanical
Turk. In order to determine whether the two populations were equivalent, we compared their demographic characteristics using t-tests for the continuous dependent variables (age) and using chi-square tests for the categorical dependent variables (race, gender, community type). We also compared their scores on the pre-tests using a series of ANOVAs with player source as a fixed
factor and pretest scores as the dependent variables. We investigated whether the two groups won
at similar rates using a chi-square test of player source against whether they won a game.
Finally, we examined whether the two groups scored similarly in the game using an ANOVA with
group membership as a fixed independent factor and game score as the dependent variable.
As described in the results chapter, the web-recruited players and the Mechanical Turk players were not equivalent populations. All subsequent analyses were performed separately for the web-recruited and Mechanical Turk players.
Next, the pre-test scores of completers and non-completers were compared using a series of ANOVAs, with completion status as a fixed independent factor and pre-test scores as the dependent variables. ANOVAs were conducted for the Modern Sexism Scale, the Symbolic Racism
Scale, the Systemic Racism Test, and the Systemic Sexism Test.
Finally, we checked the possible impact of the pre-test on post-test performance, by comparing post-test means between players who received the pre-test and players who did not.
To achieve this, we conducted four ANCOVAs, using pre-test group membership as the independent fixed factor, participant race and gender as the covariates (modeled as independent fixed factors), and test scores as the dependent variables as per above. This analysis demonstrates whether the presence of the pre-test influenced the post-test outcomes.
Next, we investigated the questions laid out at the beginning of this chapter.
Q1: Controlling for pre-test scores, are there differences in attribution test
scores for race and gender across the four study conditions?
For research question 1, we examine within-player change. We test the hypothesis that there are score differences between groups on the attribution test, as opposed to the null hypothesis that no such differences exist. These hypotheses were tested using ANCOVAs with
group membership as an independent fixed factor, participant race and gender as covariates
(modeled as independent fixed factors), pre-test scores as a covariate, and post-test scores as the dependent variable. Contrasts are used to test for an overall difference between the three game conditions and the single control condition. This analysis was conducted for both the Systemic
Sexism and Systemic Racism measures.
Q2: Are there associations between in-game measures (such as player score,
number of characters placed, and number of game plays) and attribution test
scores?
For research question 2, we test the hypothesis that there is an association between in- game measures and attribution post-test scores, as opposed to the null hypothesis that no such associations exist. The hypotheses were tested using Spearman's rho, conducting a partial correlation that controls for pre-test score. This analysis was conducted for both the Systemic
Sexism and the Systemic Racism measures.
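A partial Spearman correlation can be computed by rank-transforming all three variables and applying the first-order partial-correlation formula. The sketch below is our own stdlib implementation, not the software used in the study.

```python
def ranks(xs):
    """Average ranks (1-based); ties share the mean rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy)

def partial_spearman(x, y, z):
    """Spearman correlation of x and y controlling for z: correlate
    the ranks, then partial out z's ranks with the standard formula
    r_xy.z = (r_xy - r_xz*r_yz) / sqrt((1-r_xz^2)(1-r_yz^2)).
    (Undefined if z is perfectly rank-correlated with x or y.)"""
    rx, ry, rz = ranks(x), ranks(y), ranks(z)
    rxy, rxz, ryz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (rxy - rxz * ryz) / (((1 - rxz**2) * (1 - ryz**2)) ** 0.5)
```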
As part of understanding the effect of game performance, we also investigate the effect of player race and gender on in-game measures. We test the hypothesis that there are in-game performance differences between players of different races and/or genders, as opposed to the null hypothesis that no such differences exist. These hypotheses were tested using ANCOVAs with participant race and gender as independent fixed factors and game performance measures as the dependent variables.
Q3: For game players, are there differences in measures of systemic
understanding of racism and sexism across bias guess conditions?
For research question 3, we test the hypothesis that there are mean differences in attribution test scores between bias guess conditions, as opposed to the null hypothesis that no such differences exist. These hypotheses were tested using ANCOVAs with bias guess condition as an independent fixed factor, participant race and gender as covariates (modeled as independent fixed factors), pre-test scores as a covariate, and post-test scores as the dependent variable.
Contrasts were used to determine whether there was a difference between players who did and did not make a guess. This analysis was conducted for both the Systemic Sexism and Systemic
Racism measures.
Q4: For game players, are there differences in game performance measures
across bias guess conditions?
For research question 4, we test the hypothesis that there were in-game performance differences between bias guess conditions, as opposed to the null hypothesis that players performed identically across game conditions. These hypotheses were tested using ANCOVAs with bias guess condition as an independent fixed factor, participant race and gender as covariates (modeled as independent fixed factors), and in-game performance measures as the dependent variable. Contrasts were used to determine whether there was a difference between players who did and did not make a guess.
Q5: Controlling for pre-test scores, are there differences in attitude test scores
for race and gender across the four study conditions?
For research question 5, we test the hypothesis that there are mean differences in attitude
test scores between treatment conditions, as opposed to the null hypothesis that there are no
differences between treatment conditions. These hypotheses were tested using ANCOVAs with
group membership as an independent fixed factor, participant race and gender as covariates
(modeled as independent fixed factors), pre-test scores as a covariate, and post-test scores as the dependent variable. Contrasts were used to test for an overall difference between the three game conditions and the single control condition. This analysis was conducted for both the Modern
Sexism and Symbolic Racism measures.
Q6: Are there associations between in-game measures (such as player score,
number of characters placed, and number of game plays) and attitude test
scores?
For research question 6, we test the hypothesis that there is an association between in- game measures and attitude post-test scores, as opposed to the null hypothesis that there is no such association. The hypotheses were tested using Spearman's rho, conducting a partial correlation that controls for pre-test score. This analysis was conducted for both the Modern
Sexism and the Symbolic Racism measures.
Q7: For game players, are there differences in measures of attitudes toward
racism and sexism across conditions?
For research question 7, we test the hypothesis that there are mean differences on the
attitude tests between bias guess conditions, as opposed to the null hypothesis that no such
differences exist. These hypotheses were tested using ANCOVAs with bias guess condition as an
independent fixed factor, participant race and gender as covariates (modeled as independent
fixed factors), pre-test scores as a covariate, and post-test scores as the dependent variable.
Contrasts were used to determine whether there was a difference between players who did and
did not make a guess. This analysis was conducted for both the Modern Sexism and Symbolic
Racism measures.
Conclusion

This project aims to determine the impact of targeted game-play on player conceptions of racism and sexism, and to explore the effect of different reward systems on such conceptions. It looks both at player attributions of racial and gender disparities, and at evidence of players' racist and sexist attitudes. The study described in this chapter allows us to address these questions.
Taken together, our first four hypotheses let us examine whether playing Advance changes players' likelihood of using systemic explanations for racism and sexism. We compare the overall effect of the game to the effect of reading a piece of text, which allows us to assess the overall effectiveness of the game compared to other possible interventions. Next, we investigate the impact of game-play on player attributions. Because we test for an association between player performance measures and attribution change, we can assess whether player behavior may be responsible for changes in player attributions. Finally, we look for differences in attribution style and in game performance between players who encountered the three reward systems in the game (informational, financial, and generative) and players who did not encounter
the reward system at all.
Our second set of hypotheses examines the same questions, but looks at player attitudes about race and gender rather than at attribution style.
Using data gathered from this study, we can draw conclusions about the overall effectiveness of the game, and about which reward systems function most effectively at achieving cognitive or affective change around issues of racism and sexism. In the next chapter, we turn to the results of the study and what they mean.
Chapter 5: Results
As described in the previous chapter, data was collected from two separate populations.
One population was recruited online, reaching self-identified game players through affinity
groups and social networking. The other population was recruited through Amazon's Mechanical
Turk microtask service, whose workers were able to select this particular task from a broad range of other
tasks and activities. The latter group may have included self-identified game players, but also reached a larger subject pool.
While both groups fit the broad profile of subjects for this study (English-speaking
Americans over 18 who play casual games), there may be differences between them.
Comparability between these groups was determined before conducting further analyses.
Player Source Analyses

Given that players were drawn from different sources, we first investigated whether there were demographic differences between the samples. The two player sources were the web and
Mechanical Turk. Four demographic factors were examined: gender, race, age, and community type.
Possible gender differences were tested using a chi-square analysis of player source by player gender. No significant difference was found between the groups, χ²(1, 403) = 1.798, p = .180 (Table J1).¹³
Possible racial identity differences were tested using a chi-square analysis of player
source by player race, using the three analytic categories detailed in the previous chapter.
Significant differences were found between the two player groups, χ²(2, 412) = 10.939, p = .004.
¹³ For all data tables not interpolated in the text, please see Appendix J. A separate table of contents is provided with the appendix to aid in referencing specific tables.
87% of Web-recruited players reported themselves as White, while only 77% of Mechanical Turk
players did the same (Table 5).
Table 5
Crosstabulation of player source and player race

Player Source      White    Black and Hispanic    Other    χ²        p       Total
Web                194      5                     22       10.94a    .004*   221
Mechanical Turk    148      17                    26                         191
Total              342      22                    48                         412

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 10.20.
* p ≤ .005
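For readers who wish to verify this analysis, the chi-square statistic can be recomputed directly from the cell counts in Table 5. The following is a minimal sketch, assuming SciPy is available; `scipy.stats.chi2_contingency` also returns the expected counts referenced in the table note:

```python
from scipy.stats import chi2_contingency

# Observed counts from Table 5: rows are player source,
# columns are race (White, Black and Hispanic, Other)
observed = [
    [194, 5, 22],   # Web
    [148, 17, 26],  # Mechanical Turk
]

chi2, p, dof, expected = chi2_contingency(observed)
print(round(chi2, 3), dof, round(p, 3))  # χ² ≈ 10.939, df = 2, p ≈ .004
```

The minimum of the `expected` array reproduces the "minimum expected count" of 10.20 reported beneath the table.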
An ANOVA was conducted on player age, with player source treated as a fixed independent variable. There was a significant effect of player source on player age, F(1, 408) = 4.787, p = .029 (Table J4). As shown in Table J3, the average age of web-recruited players was 32.60 years (SD = 8.73), while the average age of Mechanical Turk players was 34.72 years (SD = 10.93). However, the explanatory power of player source is negligible (η² = .012).
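The ANOVA-with-η² pattern used throughout this chapter can be sketched as follows. This is an illustrative example with simulated ages matching the reported group means, standard deviations, and sizes; because the data are simulated rather than the study's raw data, the resulting F will not reproduce the reported F(1, 408) = 4.787:

```python
import numpy as np
from scipy.stats import f_oneway

# Simulated ages (NOT the study's raw data): two groups matching
# the reported means, SDs, and group sizes.
rng = np.random.default_rng(0)
web = rng.normal(32.60, 8.73, 221)
turk = rng.normal(34.72, 10.93, 191)

# One-way ANOVA with player source as the factor
f_stat, p_value = f_oneway(web, turk)

# Eta squared = between-group sum of squares / total sum of squares
pooled = np.concatenate([web, turk])
ss_between = sum(len(g) * (g.mean() - pooled.mean()) ** 2 for g in (web, turk))
ss_total = ((pooled - pooled.mean()) ** 2).sum()
eta_sq = ss_between / ss_total
```

For a two-group design, F and η² are linked by the identity F = [η² / (1 − η²)] × (N − 2), which provides a quick consistency check on reported values.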
Finally, possible differences in community type were tested using a chi-square analysis.
There was a significant difference in community type between the two groups, χ²(2, 412) = 7.540, p = .023. Subjects in the Mechanical Turk group were more likely to live in rural areas, while subjects in the web group were more likely to live in suburban or urban environments (Table 6).
Table 6
Crosstabulation of player source and living area

Player Source      Rural    Suburban    Urban    χ²       p       Total
Web                17       118         86       7.54a    .023*   221
Mechanical Turk    31       97          63                        191
Total              48       215         149                       412

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 22.25.
* p ≤ .05
Taken together, the demographic analysis indicates that web-recruited subjects are younger, more likely to be white, and more likely to live in suburban or urban areas than the Mechanical Turk subjects.
However, while these demographic differences potentially influence the factors we are interested in investigating, the study collected direct data about the pre-existing attitudes and attribution
styles of both groups of players. For this reason, pre-test scores were analyzed to see if there are
differences in performance between player groups.
Two attribution style pre-test scores were calculated: the Systemic Sexism score and the
Systemic Racism score. For each of these scores, an ANOVA was conducted with the score as
the dependent variable and player source as a fixed independent factor.
There was a significant effect of player source on Systemic Sexism pre-test scores, F(1, 219) = 50.803, p < .001, η² = .132 (Table J7). Web-recruited players scored significantly higher than Mechanical Turk players, indicating that they are more likely to use systemic explanations for sexism (Table 7).
Table 7
Systemic Sexism pretest means by player source

Player Source      Mean    SD      N
Web                2.89    1.29    122
Mechanical Turk    1.93    1.16    99
Total              2.46    1.32    221
There was a significant effect of player source on Systemic Racism pre-test scores, F(1, 219) = 47.561, p < .001, η² = .178 (Table J9). Web-recruited players scored higher than Mechanical Turk players, indicating a greater likelihood of using systemic explanations for racial disparities (Table 8).
Table 8
Systemic Racism pretest means by player source

Player Source      Mean    SD      N
Web                3.08    1.28    122
Mechanical Turk    1.90    1.25    99
Total              2.55    1.40    221
Two attitude pre-test scores were calculated: the Modern Sexism score and the Symbolic
Racism score. For each of these scores, an ANOVA was conducted with the score as the
dependent variable and player source as an independent fixed factor.
There was a significant effect of player source on Modern Sexism pre-test scores, F(1, 219) = 45.075, p < .001, η² = .171 (Table J11). Web-recruited subjects scored higher, on average, than the Mechanical Turk subjects (Table 9). This indicates that, prior to the intervention, web-recruited players held less sexist attitudes than Mechanical Turk players did.
Table 9
Modern Sexism pretest means by player source

Player Source      Mean     SD      N
Web                31.82    4.15    122
Mechanical Turk    27.43    5.56    99
Total              29.86    5.29    221
There was a significant effect of player source on Symbolic Racism pre-test scores, F(1, 219) = 95.392, p < .001, η² = .303 (Table J13). Again, web-recruited players scored higher than
Mechanical Turk players, indicating that before the study web-recruited players held less racist
attitudes than Mechanical Turk players did (Table 10).
Table 10
Symbolic Racism pretest means by player source

Player Source      Mean     SD      N
Web                26.28    4.38    122
Mechanical Turk    20.12    4.99    99
Total              23.52    5.57    221
These results indicate that, prior to the study, the web-recruited players were more likely to use systemic attribution styles and held less racist and sexist attitudes than the Mechanical Turk players. The difference between player groups is largest for attitudes about racism, as expressed on the Symbolic Racism test, which shows a substantially larger effect size (η² = .303) than the other measures. Whatever the demographic differences between the groups might lead one to expect, it is empirically evident that the web-based players are a sample with more liberal attitudes and a greater likelihood of using systemic explanations.
Finally, differences in player performance in the game itself were investigated. The web-recruited group self-identified as game players, while the Mechanical Turk group had the chance
to select a game task from among many other tasks. The Mechanical Turk group may have
included self-identified game players, but may also have included a broader range of subjects. It
is therefore important to verify the nature of the gameplay differences between the groups.
Win frequency and player score were investigated as possible axes of player competence.
For win frequency, a chi-square analysis of win frequency by player source was conducted. For
player score, an ANOVA with player score as the dependent variable and player source as a fixed
independent factor was conducted.
Significant differences in win likelihood were found between the two player source groups, χ²(1, 345) = 9.710, p = .002: 94% of web-based players won a game, compared to 83% of Mechanical Turk players (Table 11).
Table 11
Crosstabulation of player source and games won

Player Source      No Wins    Wins    χ²       p       Total
Web                11         171     9.71a    .002*   182
Mechanical Turk    27         136                      163
Total              38         307                      345

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 17.95.
* p ≤ .05
However, no significant difference in average score between the groups was found,
F(1,226) = 2.384, p = .124 (Tables J15 and J16). Because only players who won a game were given a score, this suggests that successful players in both groups performed equally well at the game. However, the Mechanical Turk group contained more players who were unable to master the game well enough to end with a positive score.
In summary, significant differences were found demographically, in players' prior
likelihood of using systemic attribution styles, in their previously-held attitudes about race and gender, and in their ability to win games. Broadly speaking, before beginning the study,
Mechanical Turk players were less likely to attribute racial or gender disparities to systemic rather than agentic causes; they held more racist and sexist attitudes; and they were less likely to win games, although winners scored comparably to web-based players. Given that this study looks at the impact of gameplay on attribution style and attitudes about racism and sexism, we concluded that these groups differ too significantly to be considered as a single sample. We therefore chose to conduct further analyses on these groups separately.
Analysis of Web-Recruited Players
Demographics. There were 221 players in this group, including 106 males and 108 females; 7 players selected neither male nor female as their gender identity (Table J17). A total of 194 players reported themselves as White, 5 as Black or Hispanic, and 22 as another racial identification (Table J17). The average age of players was 32.60 years (SD = 8.73) (Table J17).
Mortality and priming. Before drawing conclusions about the study's research questions, we investigated possible mortality and priming effects.
The mortality analysis compared the pre-test scores of completers and non-completers for each of the four tests: Systemic Sexism, Systemic Racism, Modern Sexism, and Symbolic
Racism. For each test, an ANOVA was conducted with pre-test score as the dependent variable, and completion status as a fixed independent factor. No significant differences were found
between completers and non-completers for the Systemic Sexism test (F(1, 243) = .210, p = .647; Tables J19 and J20), the Systemic Racism test (F(1, 243) = 2.489, p = .116; Tables J21 and J22),
or the Modern Sexism test (F(1, 243) = .223, p = .637; Tables J23 and J24).
Pre-test scores did differ significantly by completion status for the Symbolic Racism test, F(1,243) = 5.060, p = .025 (Table J26). Completers scored higher than non-completers, indicating less racist attitudes (Table 12). However, this effect is responsible for very little of the variance in pre-test score (η² = .02). We therefore conclude that it does not represent a meaningful mortality effect for this group.
Table 12
Symbolic Racism pretest means by completion status

Completed    Mean     SD      N
No           25.17    4.33    88
Yes          26.42    4.08    157
Total        25.97    4.21    245
To investigate possible priming effects, we compared the post-test scores of players who received a pre-test and players who did not. For each of the four tests (Systemic Sexism,
Systemic Racism, Modern Sexism, and Symbolic Racism), an ANOVA was conducted with post-
test score as the dependent variable, and pre-test group as a fixed independent factor.
There was a significant effect of pre-test group on Systemic Sexism post-test scores, F(1,218) = 4.3, p = .039 (Table J28). Subjects who received the pre-test scored lower than subjects who did not, indicating a lower likelihood of using systemic explanations (Table 13). While this might be explained as a fatigue effect, we do not consider it a meaningful impact of the pre-test, given the low η² (.019).
Table 13
Systemic Sexism posttest means by pretest group

Pretest Group    Mean    SD      N
No Pretest       3.50    1.24    98
Pretest          3.12    1.42    122
Total            3.29    1.35    220
There was a significant effect of pre-test group on Systemic Racism post-test scores, F(1,218) = 9.011, p = .003. As with the Systemic Sexism test, subjects who received the pre-test scored lower on the Systemic Racism post-test, indicating a lower likelihood of using systemic explanations (Table 14). This effect shows a more substantial η² (.04).
Table 14
Systemic Racism posttest means by pretest group

Pretest Group    Mean    SD      N
No Pretest       3.56    1.32    98
Pretest          3.00    1.43    122
Total            3.25    1.40    220
No significant effect of pre-test group was found for the Modern Sexism test (F(1,218) = .698, p = .404; Tables J31 and J32) or the Symbolic Racism test (F(1,218) = .217, p = .642; Tables J33 and J34).
Taken together, we conclude that the priming effect of the pre-test has a limited impact on this analysis. While a significant difference was found for the Systemic Sexism test, its impact was very small (η² = .019), and no significant differences were found for the Modern Sexism and Symbolic Racism tests. The one exception is the Systemic Racism test: players who received the pre-test scored lower on the post-test, with an η² of .04.
Attribution type. Advance, the game used in this study, was designed using the PMAD principles described in chapter three, in an attempt to change players' likelihood of using systemic rather than agentic explanations for racial and gender bias. To determine whether Advance successfully affected players, its impact was compared to the impact of reading a text on the same topics modeled by the game. This was done by comparing player performance on the attribution post-tests across the four study conditions (control, informational game, financial game, generative game), controlling for pre-test performance.
Q1: Controlling for pre-test scores, are there differences in attribution test
scores for race and gender across the four study conditions?
To answer this question, an ANCOVA was conducted for each of the two attribution tests, the Systemic Sexism test and the Systemic Racism test, with treatment condition (control, informational, financial, or generative) as a fixed independent factor and the post-test score for each test as the dependent variable. Pre-test score, player race, and player gender were controlled for, with pre-test score analyzed as a covariate and player race and player gender modeled as fixed independent factors.
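This ANCOVA can be expressed as an ordinary least squares model with the categorical factors and the pre-test covariate entered together. The sketch below uses simulated data (statsmodels and pandas are assumed available); the column names are hypothetical stand-ins, not the study's actual variable names:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Simulated scores (NOT the study's data); column names are illustrative.
rng = np.random.default_rng(1)
n = 120
df = pd.DataFrame({
    "condition": rng.choice(
        ["control", "informational", "financial", "generative"], n),
    "race": rng.choice(["White", "Black/Hispanic", "Other"], n),
    "gender": rng.choice(["male", "female"], n),
    "pretest": rng.normal(2.5, 1.3, n),
})
df["posttest"] = 0.6 * df["pretest"] + rng.normal(0, 1.0, n)

# ANCOVA: condition, race, and gender as fixed factors, pretest as covariate
model = smf.ols(
    "posttest ~ C(condition) + C(race) + C(gender) + pretest", data=df).fit()
table = anova_lm(model, typ=2)  # F and p for each term, adjusted for the others
```

Type II sums of squares test each term after adjusting for the others, which matches the "controlling for pre-test score, player race, and player gender" framing used in this chapter.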
For the Systemic Sexism test, there was a significant effect of treatment condition on
post-test score, controlling for pre-test score, player race, and player gender, F(3,101) = 3.695, p
= .014 (Table J37). Players performed best in the control condition and worst in the financial
condition, with the other two game conditions falling in between (Table 15). An ANCOVA was conducted using a contrast to compare the control condition to the game conditions, which found a significant effect, F(1,101) = 5.192, p = .025 (Table J38). Players performed significantly better
in the control condition than when playing the game, no matter which version of the game they
encountered. The control text was more effective than the game at changing players' likelihood
of using systemic explanations for gender bias.
Table 15
Systemic Sexism posttest means by treatment condition

Treatment Condition    Mean    SD      N
Control                3.73    1.18    41
Informational          2.92    1.26    25
Financial              2.30    1.41    27
Generative             3.33    1.37    24
Total                  3.15    1.39    117
To understand the meaning of this difference, we conducted t-tests to determine whether there was positive or negative change from pre-test to post-test. For each of the four conditions, a difference score between the pre-test and post-test was calculated. A t-test was conducted on each difference score to determine whether it differed significantly from zero; a nonsignificant result would indicate no change from pre-test to post-test.
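Each of these difference-score tests is a one-sample t-test of the mean difference against zero. The following is a minimal sketch with simulated pre/post scores, not the study's data; the built-in gain of .58 is chosen only to echo the control condition's reported mean difference:

```python
import numpy as np
from scipy.stats import ttest_1samp

# Simulated pre/post attribution scores for one condition (NOT the study's data)
rng = np.random.default_rng(2)
pre = rng.normal(2.9, 1.3, 43)
post = pre + 0.58 + rng.normal(0, 1.0, 43)  # average gain of ~.58 built in

diff = post - pre
t_stat, p_value = ttest_1samp(diff, 0.0)  # H0: mean difference score is zero
```

A significant positive t here corresponds to the interpretation in the text: subjects scored higher after the intervention than before it.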
In the control condition, the pre-post difference score was significantly different from zero, t(42) = 3.478, p < .001. The mean difference score was .5814, indicating that subjects were more likely to use a systemic attribution style for sexism after reading the control text (Table
J39).
In the informational condition, the pre-post difference scores were not significantly different from zero, indicating no significant effect of the game (t(24) = .156, p = .880; Table
J39). In the financial condition, the pre-post difference scores were not significantly different
from zero, indicating no significant effect of the game (t(27) = -1.070, p = .294; Table J39).
Finally, in the generative condition, the pre-post difference scores were not significantly different from zero, indicating no significant effect of the game (t(25) = 1.397, p = .175; Table J39).
In other words, the text increased players' likelihood of using systemic explanations for sexism, but none of the game conditions did. However, the explanatory power of treatment condition is small (η² = .049).
For the Systemic Racism test, no significant effect of treatment condition was found,
F(3,101) = 1.549, p = .932 (Table J40, J41). A t-test was conducted to determine whether pre- post difference scores on this test were significantly different from zero. No significant effect was found, t(122) = -.649, p = .517 (Table J44). We conclude that there was no impact of either game or text on players' likelihood of using systemic attributions for racism.
In addition to investigating differences between treatment conditions, we want to know whether there is a relationship between in-game decisions and player attributions. Do more
successful players experience more change? Are there specific in-game behaviors that are linked
to the game having an impact? Research question two addresses these questions.
Q2: Are there associations between in-game measures (such as player score,
number of characters placed, and number of game plays) and attribution test
scores?
The following in-game measures were investigated: player score (as a measure of overall player success), total clients placed (as a measure of how often players had to consider character placement issues), total clients placed from the bias group (as a measure of how often players
contended with bias), how many attempts it took them to identify the game's bias (as a measure
of guessing versus investigating), and how many times the player chose to play (to control for
time on task).
For player score, total clients placed, bias-group clients placed, and bias identification attempts, partial correlations were conducted between the game measure and the post-test score on each of the attitude tests, controlling for the influence of the pre-test. For the number of plays, players were separated into two categories: those who played two times (n = 120), and those who played more than twice (n = 2). This was due to the small number of players who played more than twice. ANCOVAs were conducted on the post-test scores with number of plays as an independent fixed factor and the pre-test scores as a covariate.
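A partial correlation of this kind can be computed by regressing each variable on the covariate and correlating the residuals. The sketch below uses only NumPy and simulated data; `game_score`, `posttest`, and `pretest` are illustrative stand-ins for the study's measures:

```python
import numpy as np

def partial_corr(x, y, covariate):
    """Correlation between x and y after removing the linear
    effect of the covariate from both (residual method)."""
    res_x = x - np.polyval(np.polyfit(covariate, x, 1), covariate)
    res_y = y - np.polyval(np.polyfit(covariate, y, 1), covariate)
    return np.corrcoef(res_x, res_y)[0, 1]

# Simulated data (NOT the study's): both measures partly driven by the pre-test
rng = np.random.default_rng(3)
pretest = rng.normal(0, 1, 100)
game_score = 0.5 * pretest + rng.normal(0, 1, 100)
posttest = 0.3 * pretest + rng.normal(0, 1, 100)

r = partial_corr(game_score, posttest, pretest)
```

The residual method is algebraically equivalent to the standard first-order partial correlation formula, r_xy·z = (r_xy − r_xz·r_yz) / √[(1 − r_xz²)(1 − r_yz²)].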
For Systemic Sexism, there were no significant correlations found between any of the in- game measures and post-test scores, controlling for pre-test scores. For game score, r(70) = -
.180, p = .130 (Table J45). For clients placed, r(99) = -.038, p = .706 (Table J45). For bias group clients placed, r(99) = -.054, p = .590 (Table J45). For number of guesses, r(100) = .157, p = .115
(Table J45). For number of plays, F(1,119) = .056, p = .814 (Table J46, J47).
For Systemic Racism, there were no significant correlations found between any of the in- game measures and post-test scores, controlling for pre-test scores. For game score, r(70) = -
.059, p = .621 (Table J48). For clients placed, r(99) = .032, p = .732 (Table J48). For bias group clients placed, r(99) = -.039, p = .697 (Table J48). For number of guesses, r(100) = .165, p = .097
(Table J48). For number of plays, F(1,119) = 2.888, p = .092 (Table J49, J50).
We can conclude that there is no relationship between performance on these in-game measures and player attribution style, indicating that attribution outcomes were not driven by in-game skill.
We also investigated the relationship between player race and gender and the same five in-game measures listed above. ANCOVAs were conducted on each of the first four measures, with player race and gender modeled as independent fixed factors and the game measure as the dependent variable. For number of plays, chi-square analyses were conducted for player race and for player gender.
For game score, no significant effect of player race and gender was found, F(2,114) = .001, p = .999 (Tables J51 and J52). For clients placed, no significant effect of player race and gender was found, F(2,171) = 3.054, p = .055 (Tables J53 and J54). For bias-group clients placed, no significant effect of player race and gender was found, F(2, 170) = 1.058, p = .349 (Tables J55 and J56). For the number of guess attempts, no significant effect of player race and gender was found, F(2, 171) = .944, p = .391 (Tables J57 and J58). No association was found between player race and number of plays, χ²(2, 221) = .433, p = .806 (Table J59). No association was found between player gender and number of plays, χ²(2, 214) = .001, p = .981 (Table J70).
Finally, we investigate the impact of game type on player attribution style. Among players who received the game intervention, some never attempted to identify the bias. We compare those players to players who interacted with the bias system in the informational, financial, and generative conditions, to understand the impact of the bias system specifically on players' attributions around racism and sexism. Formally stated, these comparisons derive from research question three.
Q3: For game players, are there differences in measures of systemic
understanding of racism and sexism across bias guess conditions?
For each of the attribution style tests (Systemic Sexism and Systemic Racism), we conducted an ANCOVA with bias guess condition (no guess, guess in informational condition, guess in financial condition, guess in generative condition) as an independent fixed factor, pre-test score as a covariate, player race and gender as additional controls (modeled as independent fixed factors), and post-test score as the dependent variable.
For the Systemic Sexism test, no significant effect of bias guess group was found, F(3,59) = 1.496, p = .225 (Tables J61 and J63). For the Systemic Racism test, no significant effect of bias guess group was found, F(3,59) = .866, p = .464 (Tables J65 and J66). From our earlier investigation into Systemic Sexism and Systemic Racism difference scores, we know that subjects in the game condition did not change significantly from pre-test to post-test. We therefore conclude not only that there were no differences between bias guess conditions, but also that there was no change from pre- to post-test in the conditions taken together. None of the bias guess conditions made players either more or less likely to use systemic attributions for sexism or racism.
In order to understand this result, the impact of bias guess condition on in-game measures was checked. This investigation shows whether players behaved differently across game conditions. If players acted differently in different bias guess conditions, then we can conclude that those behavioral differences in play had no impact on player attribution style. If, on the other hand, players behaved the same no matter which game condition they encountered – and particularly if they behaved the same when they guessed and when they did not guess – then we can conclude that the bias guess system did not successfully affect in-game activity. If players
did not respond to the bias guess system by changing their in-game behavior, then we cannot know whether the intended strategic shifts would have impacted player attribution styles.
Q4: For game players, are there differences in game performance measures
across bias guess conditions?
ANCOVAs for in-game measures were conducted on: game score, the percentage of the score earned from bias-group clients, the number of clients placed, and the number of clients placed from the bias group. For each measure, we used bias guess condition (no guess, guessed in informational condition, guessed in financial condition, guessed in generative condition) as an independent fixed factor, player race and gender as additional controls (modeled as independent fixed factors), and the game measure as the dependent variable.
No significant differences were found between bias guess conditions for any of the in- game measures. For game score, F(3,106) = .066, p = .978 (Table J67, J69). For the percentage of player score obtained from bias-group members, F(3,106) = .283, p = .838 (Table J71, J73).
For the total number of clients placed, F(3,116) = 1.265, p = .290 (Table J74, J76). For the total clients placed from the bias group, F(3,116) = .594, p = .620 (Table J77, J79).
Since no differences in player behavior or outcomes were found between groups, we conclude that the failure to find an impact of bias guess condition is a failure of the design to drive player action differently between groups. This has implications for the PMAD principles on which the game in this study was designed, as will be further discussed in the following chapter.
Attitudes. In addition to investigating the influence of the game on attribution styles, we
investigated whether the game can influence player attitudes about racism and sexism. To answer this question, we conducted the same comparison as in research question one, comparing the impact of the game to the impact of a control text, but using attitude measures as our dependent variables and covariates.
Q5: Controlling for pre-test scores, are there differences in attitude test scores
for race and gender across the four study conditions?
To answer this question, an ANCOVA was conducted for each of the two attitude tests, the Modern Sexism test and the Symbolic Racism test, with treatment condition (control, informational, financial, or generative) as a fixed independent factor and the post-test score for each test as the dependent variable. Pre-test score, player race, and player gender were controlled for, with pre-test score analyzed as a covariate and player race and player gender modeled as fixed independent factors.
For the Modern Sexism test, no significant effect of treatment condition was found, F(3,101) = .525, p = .666 (Tables J80 and J82). To investigate whether there was an overall effect, a t-test was conducted on the difference score between the Modern Sexism pre-test and post-test. No significant difference was found from pre- to post-test, t(121) = 1.756, p = .082 (Table J83). Neither the game nor the control text changed players' attitudes about sexism.
For the Symbolic Racism test, a significant effect of treatment condition was found, F(3,101) = 2.993, p = .034 (Table J86). Players scored highest in the informational game condition and lowest in the financial game condition, with the control group and the generative game condition falling in between (Table 16). The η² for the effect of treatment condition was small (.082).
An ANCOVA was conducted using a contrast to determine whether there was a difference
between the game conditions, taken together, and the control condition. No significant difference
was found, F(1,101) = .084, p = .773 (Table J87). The differences between conditions do not
reflect an underlying game-versus-control difference.
Table 16
Symbolic Racism posttest means by treatment condition

Treatment Condition    Mean     SD      N
Control                23.12    1.98    41
Informational          23.44    2.48    25
Financial              22.78    2.26    27
Generative             22.88    2.15    24
Total                  23.06    2.18    117
To investigate the impact of each condition, t-tests were conducted for each of the treatment conditions. For each condition, a t-test was conducted on the pre-post difference score for the Symbolic Racism test to determine whether it differed significantly from zero.
In the control condition, the difference score was significantly different from zero, t(42) = -6.849, p < .001; the mean difference score was -3.349 (SD = 3.206). In the informational condition, the difference score was significantly different from zero, t(24) = -4.313, p < .001; the mean difference score was -2.600 (SD = 3.014). In the financial condition, the difference score was significantly different from zero, t(27) = -4.571, p < .001; the mean difference score was -2.857 (SD = 3.308). Finally, in the generative condition, the difference score was significantly different from zero, t(25) = -6.426, p < .001; the mean difference score was -3.731 (SD = 2.961). (See Table J88.)
In all four conditions, the mean difference score was significantly different from zero and negative, indicating a lower score on the Symbolic Racism test after the study. In other words, experiencing the study caused subjects to express more racist attitudes. Because there was no difference between the game conditions, taken together, and the control, we cannot conclude that the game is what caused players to express more racist attitudes; the control condition had an equally large effect. Rather, we must consider explanations that refer to the study as a whole.
When people are asked to socially perform in ways that indicate racial openness, such as talking about their choice to vote for Obama, they later are more willing to express racist attitudes
(Effron, Cameron, & Monin, 2009). This may be because they feel they have “certified” themselves as anti-racist, or because they have exhausted the cognitive resources they use to overcome the racially biased attitudes of our society (Monin & Miller, 2001; Devine, 1989).
Being exposed to the study itself, therefore, in which players are asked to help unravel issues of discrimination, may have caused this effect.
As with attributions, we wanted to know whether there is a relationship between players' in-game behavior and the attitude outcome measures. Do more successful players experience more change? Are there specific in-game behaviors that are linked to the game having an impact?
Q6: Are there associations between in-game measures (such as player score,
number of characters placed, and number of game plays) and attitude test
scores?
The following in-game measures were investigated: player score (as a measure of overall player success), total clients placed (as a measure of how often players had to consider character
placement issues), total clients placed from the bias group (as a measure of how often players
contended with bias), how many attempts it took them to identify the game's bias (as a measure
of guessing versus investigating), and how many times the player chose to play (to control for
time on task).
As with the attribution measures, partial correlations were conducted between each of the first four game measures and the post-test score on each attitude test, controlling for the influence of the pre-test. For the number of plays, players were again separated into those who played twice (n = 120) and those who played more than twice (n = 2), and ANCOVAs were conducted on the post-test scores with number of plays as an independent fixed factor and the pre-test scores as a covariate.
For the Modern Sexism test, no significant correlations between the in-game measures and Modern Sexism post-test scores were found, controlling for pre-test score. For game score, r(70) = -.121, p = .313 (Table J89). For clients placed, r(99) = -.021, p = .834 (Table J89). For bias-group clients placed, r(99) = -.011, p = .916 (Table J89). For the number of attempts at bias identification, r(100) = .042, p = .673 (Table J89). Finally, for the number of plays, F(1,119)
= .793, p = .375 (Table J90, J91).
For the Symbolic Racism test, no significant correlations between the in-game measures and Symbolic Racism post-test scores were found, controlling for pre-test score. For game score, r(70) = .081, p = .497 (Table J92). For clients placed, r(99) = -.085, p = .398 (Table J92). For bias-group clients placed, r(99) = .039, p = .700 (Table J92). For identification attempts, r(100) =
-.106, p = .288 (Table J92). Finally, for number of plays, F(1,119) = .406, p = .525 (Tables J93 and J94).
Taken together, these data suggest that there was no relationship between greater skill at the game and the outcome measures. Nor did more encounters with the core mechanic, namely placing clients successfully, relate to the outcome measures.
Finally, we revisit the impact of game type, this time looking at player attitudes. Among players who received the game intervention, some never attempted to identify the bias. We compare those players to players who interacted with the bias system in the informational, financial, and generative conditions, to understand the impact of the bias system specifically on players' attitudes about racism and sexism.
Q7: For game players, are there differences in measures of attitudes toward racism
and sexism across conditions?
For each of the attitude tests (Modern Sexism and Symbolic Racism), an ANCOVA was conducted with bias guess condition (no guess, guess in informational condition, guess in financial condition, guess in generative condition) as an independent fixed factor, pre-test score as a covariate, player race and gender as additional controls (modeled as independent fixed factors), and post-test score as the dependent variable.
For the Modern Sexism test, no significant differences between bias guess conditions were found, F(3,74) = .026, p = .994 (Table J95, J97). For the Symbolic Racism test, no significant differences between bias guess conditions were found, F(3,74) = 1.525, p = .217
(Table J99, J101). Given the previous finding that the game design failed to drive behavioral differences between players, this result is unsurprising. If players interacted with the game's model the same way across conditions, PMAD theory predicts that there will be no differences in
impact on either attribution or attitude.
We conclude that for web-based players, the game did not drive changes in attribution style, though the control text successfully helped players use more systemic attributions for gender disparities in outcome. Neither the game nor the control text impacted player attitudes around gender, while all intervention conditions caused players to express more negative attitudes about race, possibly due to a fatigue effect.
The most important finding is that players did not react differently to the different game conditions in their play behaviors. The PMAD model is driven by players engaging differently with the game's model under different game-mechanical conditions. We did not find any impact of bias identification condition on either attributions or attitudes, but this is because the game failed to evoke differences in player behavior between bias conditions rather than because differing player behavior across bias conditions failed to affect player attributions or attitudes.
Analysis of Mechanical Turk Players
Demographics. There were 191 players in this group. 81 were male and 108 were female; 2 did not report their gender (Table J103). 148 reported White as their race, while 17 reported Black or Hispanic and 26 reported another racial category (Table J103). The average age of players in this group was 34.72 (SD = 10.927; Table J104).
Mortality and priming. Before drawing conclusions about the study's research questions, we investigated possible mortality or priming effects.
The mortality analysis compared the pre-test scores of completers and non-completers for each of the four tests: Systemic Sexism, Systemic Racism, Modern Sexism, and Symbolic
Racism. For each test, an ANOVA was conducted with pre-test score as the dependent variable, and completion status as a fixed independent factor. No significant differences were found for the
Systemic Sexism test (F(1,130) = .084, p = .804; Table J105, J106), the Systemic Racism test
(F(1,130) = 1.716, p = .193; Table J107, J108), or the Modern Sexism test (F(1,130) = .023, p
= .880; Table J109, J110).
Pre-test score differed significantly by completion status for the Symbolic Racism test, F(1,130) = 4.754, p = .031, η2 = .035 (Table J112). Non-completers scored higher than completers on the Symbolic Racism pre-test, indicating that they held less racist attitudes (Table 17).
Table 17 Symbolic Racism pretest means by completion status
Completed Mean SD N
No 22.28 4.50 32
Yes 20.13 4.96 100
Total 20.65 4.93 132
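The mortality check above pairs a one-way ANOVA with an η² effect size. As a minimal sketch of that computation (the scores below are invented illustrative values, not the study's data):

```python
import numpy as np
from scipy import stats

# Hypothetical pre-test scores (illustrative only) for subjects who
# completed the post-test versus those who dropped out.
completers = np.array([20., 19., 22., 18., 21., 20., 23., 19.])
noncompleters = np.array([23., 22., 24., 21., 25., 23.])

# One-way ANOVA: does mean pre-test score differ by completion status?
F, p = stats.f_oneway(completers, noncompleters)

# Eta-squared: between-group sum of squares over total sum of squares.
scores = np.concatenate([completers, noncompleters])
grand = scores.mean()
ss_between = sum(len(g) * (g.mean() - grand) ** 2
                 for g in (completers, noncompleters))
ss_total = ((scores - grand) ** 2).sum()
eta_sq = ss_between / ss_total
```

η² here is the proportion of total variance in pre-test scores attributable to completion status, the same effect-size measure reported for the Symbolic Racism mortality result.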
Possible priming effects were investigated by comparing the post-test scores of players who received a pre-test and players who did not. For each of the four tests (Systemic Sexism,
Systemic Racism, Modern Sexism, and Symbolic Racism), an ANOVA was conducted with post-test score as the dependent variable, and pre-test group as a fixed independent factor.
A significant effect of pre-test condition on post-test score was found for the Systemic
Sexism test, F(1,189) = 19.724, p < .001, η2 = .094 (Table J114). Subjects who received a pre-
test scored lower on the post-test than subjects who did not (Table 18). In other words, subjects
who received the pre-test were less likely to use a systemic explanation than subjects who did
not.
Table 18 Systemic Sexism posttest means by pretest group
Pretest Group Mean SD N
No Pretest 2.53 1.31 92
Pretest 1.71 1.26 99
Total 2.10 1.35 191
A significant effect of pre-test condition was found for the Systemic Racism test,
F(1,189) = 5.520, p = .020, η2 = .028 (Table J116). Again, subjects who received the pre-test
scored lower than subjects who did not (Table 19), indicating that they were less likely to use
systemic explanations.
Table 19
Systemic Racism posttest means by pretest group
Pretest Group Mean SD N
No Pretest 2.37 1.52 92
Pretest 1.89 1.31 99
Total 2.12 1.43 191
A significant effect of pre-test condition was found for the Modern Sexism test, F(1,189)
= 5.718, p = .018, η2 = .029 (Table J118). Subjects who received the pre-test scored lower than
those who did not, indicating more sexist attitudes (Table 20).
Table 20
Modern Sexism posttest means by pretest group
Pretest Group Mean SD N
No Pretest 29.28 4.46 92
Pretest 27.56 5.43 99
Total 28.39 5.05 191
Finally, a significant effect of pre-test condition was found for the Symbolic Racism test,
F(1,189) = 4.473, p = .036, η2 = .023 (Table J120). Subjects who received the pre-test scored
lower than those who did not, indicating more racist attitudes (Table 21).
Table 21
Symbolic Racism posttest means by pretest group
Pretest Group Mean SD N
No Pretest 20.84 2.59 92
Pretest 20.07 2.42 99
Total 20.44 2.52 191
While significant differences were found for all four tests, the impacts were very small for the Systemic Racism, the Modern Sexism, and the Symbolic Racism tests. We therefore conclude that for these three tests, the priming effect of the pre-test is not a major factor in the analysis. A small effect of pre-test condition was found for the Systemic Sexism test (η2 = .094). Subjects who received a pre-test were less likely to use systemic explanations for sexism at post-test. Since many of the analyses conducted rely on pre-post differences, this finding suggests that the impact of the intervention on Systemic Sexism attribution test scores may be underestimated by this study.
Attribution type. As previously discussed, Advance, the game used in this study, was designed using the PMAD principles discussed in chapter three in an attempt to change players' likelihood of using systemic rather than agentic explanations for racial and gender bias. To determine whether Advance successfully affected players, its impact is compared to the impact of reading text on the same topics modeled by the game. This was done by comparing player post-test performance on the attribution tests across the four study conditions (control, informational, financial, generative), controlling for pre-test performance.
Q1: Controlling for pre-test scores, are there differences in attribution test
scores for race and gender across the four study conditions?
To answer this question, an ANCOVA was conducted for each of the two attribution tests, the Systemic Sexism test and the Systemic Racism test, with treatment condition (control, informational, financial, or generative) as a fixed independent factor and the post-test score for each test as the dependent variable. Pre-test score, player race, and player gender were controlled for, with pre-test score analyzed as a covariate and player race and player gender modeled as fixed independent factors.
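The ANCOVA's test of the treatment factor can be viewed as a comparison between a full regression model (condition dummies plus the pre-test covariate) and a reduced model containing the covariate alone. The sketch below illustrates this with synthetic data; the function name and data are illustrative, not the study's analysis code, and the race and gender factors are omitted for brevity:

```python
import numpy as np
from scipy import stats

def ancova_condition_F(post, pre, condition):
    """F-test for a categorical factor in an ANCOVA: compare a full model
    (condition dummies + pre-test covariate) to a reduced model with the
    covariate only, both fit with an intercept."""
    post = np.asarray(post, dtype=float)
    pre = np.asarray(pre, dtype=float)
    levels = sorted(set(condition))
    n = len(post)
    # Dummy-code the condition factor, dropping the first level as reference.
    dummies = np.column_stack([[1.0 if c == lev else 0.0 for c in condition]
                               for lev in levels[1:]])
    X_full = np.column_stack([np.ones(n), pre, dummies])
    X_reduced = np.column_stack([np.ones(n), pre])

    def rss(X):
        beta, *_ = np.linalg.lstsq(X, post, rcond=None)
        resid = post - X @ beta
        return float(resid @ resid)

    df_num = len(levels) - 1                # extra parameters in full model
    df_den = n - X_full.shape[1]            # residual degrees of freedom
    F = ((rss(X_reduced) - rss(X_full)) / df_num) / (rss(X_full) / df_den)
    p = stats.f.sf(F, df_num, df_den)
    return F, p, (df_num, df_den)
```

A significant F indicates that the condition dummies explain post-test variance beyond what the pre-test covariate already accounts for, which is exactly the question the reported F(3,80) statistics address.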
No significant effect of treatment condition on Systemic Sexism post-test score was found, F(3,80) = 2.279, p = .086 (Table J121, Table J122). An ANCOVA was conducted using a contrast to determine whether there was a difference between the game conditions, taken together, and the control condition. No significant effect was found, F(1,80) = .093, p = .761
(Table J123).
A t-test was used to determine whether the intervention, as a whole, changed players' likelihood of using systemic explanations, by comparing the pre-post difference score on the Systemic Sexism test to 0. No significant effect was found, t(98) = -1.551, p = .124 (Table J124).
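Comparing a pre-post difference score to zero is a one-sample t-test. A minimal sketch with made-up scores (not the study's data):

```python
import numpy as np
from scipy import stats

# Hypothetical pre- and post-test attribution scores (illustrative only).
pre = np.array([2.0, 1.5, 3.0, 2.5, 1.0, 2.0, 3.5, 1.5])
post = np.array([2.5, 2.0, 3.0, 3.0, 1.5, 2.5, 3.5, 2.0])

# Did the intervention as a whole move scores? Test whether the mean
# pre-post difference differs from zero.
diff = post - pre
t, p = stats.ttest_1samp(diff, 0.0)
```

A significant positive t would indicate that post-test scores rose relative to pre-test; a non-significant result, as reported above, means the mean change cannot be distinguished from zero.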
For the Systemic Racism test, no significant difference between conditions was found,
F(3,80) = .291, p = .832 (Table J125, J126). An ANCOVA was conducted using a contrast to determine whether there was a difference between the game conditions, taken together, and the control group. No significant effect was found, F(1,80) = .546, p = .462 (Table J127).
A t-test was used to determine whether the intervention, as a whole, changed players'
likelihood of using systemic attributions for racism. No significant effect was found, t(98) = -.065, p = .949 (Table J128).
For the Mechanical Turk group, therefore, the overall intervention had no effect in either the control condition or any of the game conditions.
As with the web-recruited group, we want to know whether there is a relationship
between in-game decisions and player attributions. Do more successful players experience more
change? Are there specific in-game behaviors that are linked to the game having an impact?
Research question two addresses these differences.
Q2: Are there associations between in-game measures (such as player score,
number of characters placed, and number of game plays) and attribution test
scores?
The following in-game measures were investigated: player score (as a measure of overall player success), total clients placed (as a measure of how often players had to consider character placement issues), total clients placed from the bias group (as a measure of how often players
contended with bias), how many attempts it took them to identify the game's bias (as a measure
of guessing versus investigating), and how many times the player chose to play (to control for
time on task).
For player score, total clients placed, bias-group clients placed, and bias identification attempts, partial correlations were conducted between the game measure and the post-test score on each of the attitude tests, controlling for the influence of the pre-test. For the number of plays, players were separated into two categories: those who played two times (n = 95), and those who played more than twice (n = 4). This was due to the small number of players who played more than twice. ANCOVAs were conducted on the post-test scores with number of plays as an independent fixed factor and the pre-test scores as a covariate.
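A partial correlation controlling for a single covariate can be computed by regressing each variable on the covariate and then correlating the residuals. The helper below is an illustrative sketch of that procedure, not the study's analysis code:

```python
import numpy as np
from scipy import stats

def partial_corr(x, y, z):
    """Correlation of x and y controlling for z: regress each on z
    (with an intercept), then correlate the residuals."""
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    Z = np.column_stack([np.ones(len(z)), z])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    r, _ = stats.pearsonr(rx, ry)
    # Significance test with n - 3 degrees of freedom
    # (two variables plus one covariate).
    n = len(x)
    t = r * np.sqrt((n - 3) / (1.0 - r ** 2))
    p = 2.0 * stats.t.sf(abs(t), n - 3)
    return r, p
```

Here x would be an in-game measure, y the post-test score, and z the pre-test score, so the resulting r reflects the game-measure/post-test association with pre-test differences removed.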
For Systemic Sexism, there were no significant correlations found between any of the in-game measures and post-test scores, controlling for pre-test scores. For game score, r(55) = .162, p = .227 (Table J129). For clients placed, r(88) = -.083, p = .439 (Table J129). For bias-group clients placed, r(88) = -.131, p = .217 (Table J129). For number of guesses, r(88) = -.106, p = .322 (Table J129). For number of plays, F(1, 98) = 1.744, p = .190 (Table J130, J131).
For Systemic Racism, there were no significant correlations found between any of the in-game measures and post-test scores, controlling for pre-test scores. For game score, r(55) = .057, p = .671 (Table J132). For clients placed, r(88) = .159, p = .134 (Table J132). For bias-group clients placed, r(88) = .077, p = .473 (Table J132). For number of identification attempts, r(88) = .079, p = .459 (Table J132). For number of plays, F(1,96) = .239, p = .626 (Table J133, J134).
We can conclude that there is no relationship between engagement with these specific game activities and player change on the outcome measures.
We also investigated the relationship of player race and gender to these same in-game measures. ANCOVAs were conducted on each of the first four measures, with player race and gender modeled as independent fixed factors and the game measure as the dependent variable. For number of plays, a chi-square analysis was conducted for player race and gender.
For game score, no significant effect of player race and gender was found, F(2,98) = .581,
p = .561 (Table J135, J136). For bias-group clients placed, no significant effect of player race
and gender was found, F(2, 169) = .217, p = .805 (Table J140, J141). For number of guesses, no
significant effect of player race and gender was found, F(2,169) = .171, p = .843 (Table J142,
J143). No association was found between player race and number of plays, χ2 (2,191) = 3.169, p
= .205 (Table J144). No association was found between player gender and number of plays, χ2
(2,189) = .010, p = .921 (Table J145).
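The chi-square tests above operate on a contingency table of counts. A minimal sketch with invented counts (not the study's data), crossing player race with number of plays:

```python
import numpy as np
from scipy import stats

# Hypothetical contingency table (illustrative counts only):
# rows = player race (White; Black and Hispanic; Other),
# columns = number of plays (played twice; played more than twice).
table = np.array([[70, 3],
                  [10, 1],
                  [15, 1]])

# Test whether number of plays is independent of player race.
chi2, p, dof, expected = stats.chi2_contingency(table)
```

A non-significant result, as reported for both race and gender, indicates that the observed counts are consistent with the two variables being independent.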
A significant effect of player race was found for the total number of clients placed,
F(2,169) = 4.259, p = .016, η2 = .048 (Table 22, J137, J138). White players placed the most
clients, while Black and Hispanic players placed the fewest. A second ANCOVA was conducted
using a contrast to determine whether White players performed differently from Black, Hispanic,
and Other players taken together. A significant effect was found, F(1,169) = 8.383, p = .004, η2
= .047 (Table J139). White players placed more clients during the game than Black, Hispanic, and Other players did.
Table 22
Number of clients placed by player race
Player Race Mean SD N
White 26.15 22.02 136
Black and Hispanic 14.41 14.10 17
Other 18.55 14.43 22
Total 24.06 20.06 175
Finally, we investigated the impact of game type on player attribution style for
Mechanical Turk players. For players who received the game intervention, some did not attempt
to identify the bias. Those players were compared to players who did interact with the bias
system in the informational condition, in the financial condition, and in the generative condition,
to understand the impact of the bias system specifically on players' attributions around racism
and sexism. These comparisons are investigated with research question three.
Q3: For game players, are there differences in measures of systemic
understanding of racism and sexism across bias guess conditions?
For each of the attribution style tests (Systemic Sexism and Systemic Racism), an
ANCOVA was conducted with bias guess condition (no guess, guess in informational condition,
guess in financial condition, guess in generative condition) as an independent fixed factor, pre-
test score as a covariate, player race and gender as covariates (modeled as independent fixed
factors), and post-test score as the dependent variable.
For Systemic Sexism, no significant difference between bias guess conditions was found,
F(3,54) = 1.542, p = .214 (Table J146, J147). An ANCOVA was conducted using a contrast to
determine whether there was a difference between non-guessers and guessers. No significant effect was found, F(1,54) = .009, p = .925 (Table J148).
Player race was found to be a significant predictive factor, F(2,54) = 3.855, p = .049, η2
= .106 (Table J147). Black and Hispanic players had the highest mean score (Table 23),
indicating the highest likelihood of using systemic explanations for sexism.
Table 23 Systemic Sexism posttest marginal means by race
Player Race Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
White 1.54a 0.15 1.23 1.85
Black and Hispanic 2.73a 0.45 1.83 3.63
Other 1.48a 0.35 0.77 2.19
a. Covariates appearing in the model are evaluated at the following values: Systemic Sexism Pretest Score = 1.9315.
T-tests were conducted on the pre-post difference scores for each group to determine whether they differed from zero. While the groups significantly differed from each other, none of the groups' pre-post difference scores significantly differed from zero. For the White group, t(53) = -1.901, p = .063 (Table J150). For the Black and Hispanic group, t(6) = .918, p = .394 (Table J150). For the Other group, t(11) = -.761, p = .463 (Table J150).
Note that this racial difference is found only when looking at game players; when looking across all four treatment conditions, no differences are found. This suggests that the game may be most effective for Black and Hispanic players in thinking about issues of sexism.
Although there were play differences between White and non-White players, leading us to expect
that there may be differences between these groups in their performance on post-game measures,
it is surprising that this effect appears only for the issue of sexism. The effect may appear for
Systemic Sexism because it was the only test found to have a significant priming effect.
However, other results from this player group suggest that race may also be the dominant
category experienced by players when playing the game. This will be discussed further in the
following chapter.
For the Systemic Racism test, no significant effect of bias guess condition on post-test
score was found, F(3,54) = .139, p = .936 (Table J151, J152). An ANCOVA was conducted using a
contrast to determine whether there was a difference between non-guessers and guessers. No significant effect was found, F(1,54) = .187, p = .667 (Table J153).
The interaction of player race and gender significantly predicted post-test scores, F(2,54)
= 3.480, p = .038, η2 = .114 (Table 24). To understand the results more deeply, t-tests were conducted on the pre-post difference scores of each group to determine whether the pre-post difference was significantly different from 0. The only group whose pre-post difference scores significantly differed from zero was White women, t(31) = -2.370, p = .024. The mean observed difference score was -.5625 (SD = 1.34254), indicating that White women were less likely to use systemic explanations for racism after playing the game (Table J155).
Table 24
Systemic Racism posttest marginal means by player race and gender
Player Gender Player Race Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
Female White 1.62a 0.21 1.20 2.03
Black and Hispanic 3.06a 0.74 1.58 4.55
Other 2.14a 0.61 0.91 3.36
Male White 2.09a 0.25 1.60 2.58
Black and Hispanic 0.52a 0.74 -0.97 2.01
Other 1.64a 0.45 0.74 2.53
a. Covariates appearing in the model are evaluated at the following values: Systemic Racism Pretest Score = 2.0137.
As with the web-recruited group, to understand this result, the impact of bias guess
condition on in-game measures is investigated to see whether there were conditions that
motivated players to behave differently in the game from their peers. If players behave
differently across conditions, then the lack of performance differences would indicate that
players' behavior was not driving attribution style change. If, on the other hand, players behave
identically across conditions, it suggests that the bias guess condition does not affect player
behavior. Because PMAD theory predicts that players must respond to the anomalous data in the
game in order for their attribution style to be affected, if there is no difference in player
behavior between conditions then we do not expect differences in attribution style either.
Q4: For game players, are there differences in game performance measures
across bias guess conditions?
ANCOVAs were conducted for the following in-game measures: game score, the percentage of the score earned from bias-group clients, the number of clients placed, and the number of clients placed from the bias group. For each measure, bias guess condition (no guess, guessed in informational condition, guessed in financial condition, guessed in generative condition) was used as an independent fixed factor, player race and gender were treated as covariates (modeled as independent fixed factors), and the game measure was the dependent variable.
No significant differences were found between bias guess conditions for any of the in- game measures. For game score, F(3,86) = 2.475, p = .067 (Table J156, J157). For percentage of score earned from the bias group, F(3,86) = 1.158, p = .330 (Table J158, J159). For the total number of clients placed, F(3,114) = .714, p = .546 (Table J161, J162). For the total number of bias-group clients placed, F(3,114) = .735, p = .533 (Table J163, J164).
Since no differences in player behavior or outcomes were found between groups, we conclude that the failure to find an impact of bias guess condition is a failure of the design to drive player action differently between groups.
Since we found racial and gender differences on the outcome measures, we might expect to find an effect of player race or gender on play styles. There was one significant effect of player race and gender on the game measures, F(2,86) = 3.821, p = .026, η2 = .082 (Table
J159). Among women, White women earned the highest proportion of their score from the bias
group and Other women earned the smallest, while Other men earned the largest and White men
the smallest proportion (Table 25).
Table 25
Percentage of score earned from bias group, marginal means by player race and gender
Player Race Player Gender Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
White Female 0.22 0.07 0.09 0.36
Male 0.14 0.08 -0.02 0.30
Black and Hispanic Female 0.11 0.35 -0.58 0.81
Male 0.22 0.19 -0.15 0.59
Other Female 0.09 0.18 -0.27 0.45
Male 0.72 0.17 0.39 1.05
This finding suggests that differential levels of success in earning money from bias-group
clients may drive some of the racial and gender differences previously observed. We conclude
that further investigations of the racial and gender differences present in this game should look
more deeply at differences in the bias group placement process.
Attitudes. In addition to investigating the influence of the game on attribution styles, we
investigated whether the game can influence player attitudes about racism and sexism. To answer this question, the same comparison as in research question one was
conducted, comparing the impact of the game to the impact of a control text, but using attitude
measures as our dependent variables and covariates.
Q5: Controlling for pre-test scores, are there differences in attitude test scores
for race and gender across the four study conditions?
To answer this question, an ANCOVA was conducted for each of the two attitude tests,
the Modern Sexism test and the Symbolic Racism test, with treatment condition (control,
informational, financial, or generative) as a fixed independent factor and the post-test score for
each test as the dependent variable. Pre-test score, player race, and player gender were controlled
for, with pre-test score analyzed as a covariate and player race and player gender modeled as
fixed independent factors.
A significant effect of treatment condition was found on Modern Sexism post-test score,
F(3,80) = 3.868, p = .012, η2 = .127 (Table J166; Table 26). An ANCOVA was conducted using a contrast to determine whether there was a difference between the game conditions, taken together, and the control condition. No significant difference was found, F(1,80) = 2.334, p
= .131 (Table J167).
T-tests were conducted on the Modern Sexism pre-post difference score for each of the four treatment conditions, to determine whether the pre-post change was significantly different from zero. A significant difference was found for the control group, t(25) = 2.461, p = .021 (Table J168). The mean was .8077 (SD = 1.67378), indicating that players showed less evidence of sexist attitudes after reading the control text. A near-significant effect was found for the informational group, t(30) = -2.009, p = .054, with a mean of -.3667 (SD = .99943; Table J168). No significant effect was found for the financial condition, t(23) = 1.030, p = .314 (Table J168), or the generative condition, t(19) = -1.455, p = .163 (Table J168).
Table 26 Modern Sexism posttest marginal means by treatment condition
Treatment Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
Control 28.71a 0.69 27.34 30.09
Informational 26.85a 0.34 26.18 27.51
Financial 28.67a 0.49 27.70 29.64
Generative 27.29a 0.48 26.33 28.25
a. Covariates appearing in the model are evaluated at the following values: Modern Sexism Pretest Score = 27.4343.
Given that the game-versus-control contrast found no significant effect, we cannot simply
interpret this finding to mean that the control group benefited, while there was no change in the
other groups. Rather, we note that while the control group was the only group to significantly
benefit from the intervention, differences between the other game conditions are worth investigating further. Specifically, although the informational condition showed only a near-
significant effect on changing player attitudes, its lower score compared to all other groups
suggests that players may have struggled most with the informational condition. The
informational condition provides no explicit way to counteract the bias present in the game;
given that Mechanical Turk players were also less likely to win games than their web-recruited
peers, they may have experienced frustration because of this and reacted against it.
For the Symbolic Racism test, no significant effect of treatment condition was found,
F(3,80) = 2.281, p = .086. An ANCOVA was conducted using a contrast to compare the game conditions, taken together, to the control condition. No significant effect was found, F(1,80)
= .314, p = .577.
To determine whether there was an impact of the intervention as a whole, a t-test was
conducted on the pre-post difference score for the Symbolic Racism test, to determine whether
the difference score was significantly different from zero. No significant difference was found,
t(98) = -.142, p = .887 (Table J169, J170).
A significant effect of player race was found, F(2,80) = 3.676, p = .030, η2 = .084 (Table
J170). White players scored highest, followed by Black and Hispanic players, with Other players
scoring the lowest (Table 27).
Table 27
Symbolic Racism posttest marginal means by player race
Player Race Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
White 20.24a 0.18 19.89 20.59
Black and Hispanic 19.65a 0.65 18.36 20.94
Other 18.78a 0.52 17.75 19.82
a. Covariates appearing in the model are evaluated at the following values: Symbolic Racism Pretest Score = 20.1212.
White subjects showed no significant change from pre-test to post-test, t(77) = .940, p
= .350 (Table J173). Black and Hispanic subjects also showed no change, t(7) = -.759, p = .472
(Table J173). Subjects in the Other racial category, however, showed a significant decrease from pre-test to post-test, t(12) = -2.547, p = .026 (Table J173). Their mean difference score was -1.9231 (SD = 2.72218). In other words, the racial attitudes of White, Black, and Hispanic subjects were not affected by any of the conditions of the study, nor were there differences between conditions. Players reporting a racial identity of Other, on the other hand, reported more racist attitudes after the study. However, this effect may be because of the nature of the Symbolic
Racism test, which treats racism as though it were about Black people only.
As with attributions, we want to know whether there is a relationship between player in-
game behavior and the attitude outcome measures. Do more successful players experience more change? Are there specific in-game behaviors that are linked to the game having an impact?
Research question six addresses this area.
Q6: Are there associations between in-game measures (such as player score,
number of characters placed, and number of game plays) and attitude test
scores?
The following in-game measures were investigated: player score (as a measure of overall player success), total clients placed (as a measure of how often players had to consider character placement issues), total clients placed from the bias group (as a measure of how often players contended with bias), how many attempts it took them to identify the game's bias (as a measure of guessing versus investigating), and how many times the player chose to play (to control for time on task).
For player score, total clients placed, bias-group clients placed, and bias identification attempts, partial correlations were conducted between the game measure and the post-test score on each of the attitude tests, controlling for the influence of the pre-test. For the number of plays, players were separated into two categories: those who played two times (n = 95), and those who played more than twice (n = 4). This was due to the small number of players who played more than twice. ANCOVAs were conducted on the post-test scores with number of plays as an independent fixed factor and the pre-test scores as a covariate.
For the Modern Sexism test, a significant negative correlation was found between player score and post-test score, controlling for pre-test score, r(55) = -.384, p = .003 (Table J174). A
higher player score was associated with a lower post-test score, indicating that players expressed more sexist attitudes.
No significant effect was found for the total number of clients placed (r(88) = .070, p
= .509; Table J174), the number of clients placed from the bias group (r(88) = .007, p = .948;
Table J174), the number of guesses made (r(88) = -.109, p = .308; Table J174), or the number of plays (F(1,96) = .731, p = .500; Table J175, J176).
For the Symbolic Racism test, no significant correlations were found. For game score, r(55) = .089, p = .511 (Table J177). For clients placed, r(88) = -.118, p = .268 (Table J177). For clients placed from the bias group, r(88) = -.092, p = .387 (Table J177). For bias identification attempts, r(88) = -.002, p = .988 (Table J177). For number of plays, F(1,98) = .001, p = .973
(Table J178, J179).
For Symbolic Racism, this data suggests that greater skill at the game did not correlate with the outcome measures. Neither did more encounters with the core mechanic, namely placing clients successfully. However, greater skill at the game was negatively associated with Modern Sexism post-test scores: more successful players expressed more sexist attitudes after play. This suggests that players may have been frustrated with their ability to place female characters; this will be discussed further in the following chapter.
Finally, we revisit the impact of game type, this time looking at player attitudes. For players who received the game intervention, some did not attempt to identify the bias. We compare those players to players who interacted with the bias system in the informational condition, in the financial condition, and in the generative condition, to understand the impact of the bias system specifically on players' attitudes about racism and sexism.
Q7: For game players, are there differences in measures of attitudes toward racism
and sexism across conditions?
For each of the attitude tests (Modern Sexism and Symbolic Racism), an ANCOVA was conducted with bias guess condition (no guess, guess in informational condition, guess in financial condition, guess in generative condition) as an independent fixed factor, pre-test score as a covariate, player race and gender as covariates (modeled as independent fixed factors), and post-test score as the dependent variable.
For the Modern Sexism test, a significant interaction (at alpha of .025 to account for family effects) was found between bias guess condition and player race, F(6,54) = 2.644, p
= .004 (Table J181). Black and Hispanic players performed best in the financial condition, as did
Other players, but White players performed best when not guessing (Table 28).
To understand these results more clearly, an ANCOVA was performed using a contrast to compare the no-guessing condition to the guess conditions, taken together. No significant effect was found, F(1,54) = .257, p = .614 (Table J183). A second ANCOVA was performed using a contrast to compare White players to all other players. A significant difference between White and non-White players was found at alpha of .025, F(1,54) = 10.277, p = .002 (Table J185, Table
29).
Table 28
Modern Sexism posttest marginal means by guess condition and player race

Guessing Condition    Player Race          Mean     Standard Error   95% CI Lower   95% CI Upper
No Guess              White                27.74a   0.24             27.27          28.22
                      Black and Hispanic   27.57a   0.61             26.36          28.79
                      Other                27.66a   0.51             26.64          28.68
Informational Guess   White                27.06a   0.23             26.60          27.51
                      Black and Hispanic   26.42a   0.61             25.20          27.63
                      Other                27.14a   0.44             26.27          28.01
Financial Guess       White                27.14a   0.24             26.66          27.61
                      Black and Hispanic   31.37a   1.01             29.34          33.40
                      Other                28.54a   0.72             27.09          29.98
Generative Guess      White                27.10a   0.25             26.59          27.60
                      Black and Hispanic   27.96a   0.82             26.33          29.60
                      Other                27.75a   0.51             26.74          28.76
a. Covariates appearing in the model are evaluated at the following values: Modern Sexism Pretest Score = 27.5068.
For White players only, an ANCOVA was performed with the Modern Sexism post-test score as the dependent variable, the bias guess condition as a fixed independent factor, and the Modern Sexism pre-test score as a covariate. No significant result was found at an alpha of .025, F(1,49) = 2.181, p = .102 (Table J186, J187). For White players, therefore, all game conditions were treated as identical, and a single t-test was conducted to determine whether there was any pre-post change, comparing the pre-post difference score to zero. No significant effect was found, t(77) = .441, p = .660 (Table J188). White players showed no change in their attitudes about sexism after playing the game, and this was true across all game conditions.
Table 29
Modern Sexism posttest marginal means by player race

Player Race          Mean     Standard Error   95% CI Lower   95% CI Upper
White                27.26a   0.12             27.02          27.50
Black and Hispanic   28.33a   0.35             27.63          29.03
Other                27.77a   0.27             27.22          28.32
a. Covariates appearing in the model are evaluated at the following values: Modern Sexism Pretest Score = 27.5068.
For Black, Hispanic, and Other players, an ANCOVA was performed with the Modern
Sexism post-test score as the dependent variable, the bias guess condition as a fixed independent
factor, and the Modern Sexism pre-test score as a covariate. A significant result was found at
alpha of .025, F(1,14) = 5.887, p = .003 (Table J190). There were differences in post-test score
between conditions for Black, Hispanic, and Other players (Table 30).
Table 30
Modern Sexism posttest means by guess condition, Black, Hispanic, & Other

Guessing Condition    Mean    SD     N
No Guess              30.60   3.58    5
Informational Guess   22.83   6.46    6
Financial Guess       27.33   3.51    3
Generative Guess      29.80   5.22    5
Total                 27.42   5.77   19
T-tests were conducted for each condition to determine whether the Modern Sexism pre-
post difference score was significantly different from zero. For the no-guess condition, no significant effect was found, t(4) = .001, p = 1.000 (Table J191). For the informational-guess condition, no significant effect was found, t(5) = -1.000, p = .363 (Table J191). For the financial-guess condition, no significant effect was found, t(2) = 2.646, p = .118 (Table J191). For the generative-guess condition, no significant effect was found, t(4) = .374, p = .372 (Table J191). In other words, while the differences between conditions were significant, none of the individual conditions caused players to change their attitudes about sexism.
For the Symbolic Racism test, a significant interaction (at an alpha of .025) was found between bias guess condition and player race, F(6,54) = 2.567, p = .029 (Table J193). Black and
Hispanic players performed best in the generative-guess condition, as did Other players, but
White players performed best in the financial-guess condition (Table 31).
To understand these results more clearly, an ANCOVA was performed using a contrast to compare the no-guessing condition to the guess conditions, taken together. No significant effect was found, F(1,54) = .289, p = .593 (Table J195). A second ANCOVA was performed using a contrast to compare White players to all other players. A significant difference between White and non-White players was found at alpha of .025, F(1,54) = 9.572, p = .003 (Table J197; Table
32).
For White players only, an ANCOVA was performed with the Symbolic Racism post-test score as the dependent variable, the bias guess condition as a fixed independent factor, and the
Symbolic Racism pre-test score as a covariate. No significant result was found at an alpha of .025, F(1,49) = .420, p = .739 (Table J198, J199). For White players, therefore, all game conditions were treated as identical, and a single t-test was conducted to determine whether there was any pre-post change, comparing the pre-post difference score to zero. No significant effect was found, t(77) = .940, p = .350 (Table J200). White players showed no change in their attitudes
about racism after playing the game, and this was true across all game conditions.
Table 31
Symbolic Racism posttest marginal means by guess condition and player race

Guessing Condition    Player Race          Mean     Standard Error   95% CI Lower   95% CI Upper
No Guess              White                20.62a   0.39             19.84          21.41
                      Black and Hispanic   20.13a   1.01             18.12          22.15
                      Other                18.34a   0.84             16.66          20.01
Informational Guess   White                19.96a   0.38             19.20          20.71
                      Black and Hispanic   20.16a   1.00             18.16          22.16
                      Other                20.07a   0.70             18.66          21.48
Financial Guess       White                20.48a   0.39             19.70          21.27
                      Black and Hispanic   15.62a   1.66             12.29          18.96
                      Other                17.05a   1.19             14.67          19.44
Generative Guess      White                20.24a   0.42             19.40          21.07
                      Black and Hispanic   20.85a   1.38             18.09          23.61
                      Other                20.17a   0.84             18.49          21.85
a. Covariates appearing in the model are evaluated at the following values: Symbolic Racism Pretest Score = 20.4521.
For Black, Hispanic, and Other players, an ANCOVA was performed with the Symbolic
Racism post-test score as the dependent variable, the bias guess condition as a fixed independent
factor, and the Symbolic Racism pre-test score as a covariate. A significant result was found at
alpha of .025, F(1,14) = 8.164, p = .002 (Table J202). There were differences in post-test score
between conditions for Black, Hispanic, and Other players (Table 33).
Table 32
Symbolic Racism marginal means by player race

Player Race          Mean     Standard Error   95% CI Lower   95% CI Upper
White                20.33a   0.20             19.93          20.72
Black and Hispanic   19.19a   0.58             18.04          20.35
Other                18.91a   0.45             18.00          19.81
a. Covariates appearing in the model are evaluated at the following values: Symbolic Racism Pretest Score = 20.4521.
T-tests were conducted for each condition to determine whether the Symbolic Racism
pre-post difference score was significantly different from zero. For the no-guess condition, a significant effect was found, t(4) = -4.000, p = .016 (Table J203). The mean difference score was
-3.2000 (SD = 1.78885), indicating that Black, Hispanic, and Other subjects expressed more racist attitudes after the game in this condition.
Table 33
Symbolic Racism posttest means by guess condition, Black, Hispanic, & Other

Guessing Condition    Mean    SD     N
No Guess              20.00   1.87    5
Informational Guess   19.83   2.79    6
Financial Guess       15.00   1.00    3
Generative Guess      21.80   1.92    5
Total                 19.63   2.97   19
For the informational-guess condition, no significant effect was found, t(5) = .104, p
= .921 (Table J203). For the financial-guess condition, no significant effect was found, t(2) = -
1.732, p = .225 (Table J203). For the generative-guess condition, no significant effect was found,
t(4) = -2.449, p = .070 (Table J203).
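Each of these tests is a one-sample t-test of the condition's post-minus-pre difference scores against zero. As an illustration, the no-guess Symbolic Racism result reported above (t(4) = -4.000, mean difference -3.20, SD 1.79) can be reproduced with SciPy using hypothetical difference scores constructed only to match those summary statistics; they are not the study's raw data.

```python
# One-sample t-test of pre-post difference scores against zero.
# The five values are hypothetical, chosen to match the reported
# summary statistics (mean = -3.20, SD = 1.79); not the actual data.
import numpy as np
from scipy import stats

diff = np.array([-1.0, -2.0, -3.0, -5.0, -5.0])  # post minus pre, n = 5
t, p = stats.ttest_1samp(diff, popmean=0.0)
print(f"t({len(diff) - 1}) = {t:.3f}, p = {p:.3f}")  # t(4) = -4.000, p = 0.016
```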
In summary, White players were not affected by any of the treatment conditions, and there were no differences between treatment conditions for White players. All differences found on these measures were for Black, Hispanic, and Other players. For the Modern Sexism test, there were differences in attitudes about sexism between conditions, but there was no significant change from the pre-test. For the Symbolic Racism test, Black, Hispanic, and Other players expressed more racist attitudes when they did not guess, while there was no significant change from the pre-test in the other conditions.
We conclude that Black, Hispanic, and Other players were most emotionally sensitive to the differences between guess conditions. However, the only overall change induced by the game was to make players who did not guess express more racist attitudes. Possible interpretations will be further discussed in the next chapter.
Analysis by Player Group
Broadly speaking, the web-recruited group was unaffected by the intervention. There was a small positive effect of the control text on players' likelihood of using systemic attributions for sexism, but no effect of the game. Players expressed more racist attitudes after the intervention, with some differences between conditions, but there was no difference between the impact of the game and that of the control text. Rather, engaging with the issue at all may have made players more likely to express racist attitudes because of a moral justification effect (Monin & Miller,
2001).
We might have expected these self-identified gamers to be sensitive to the differences between the game conditions, but there was no effect for the four bias-group conditions on either attitude or attribution measures. We attribute this to players' not actually playing differently
across game conditions.
For this group of players, the rewards offered by the different game conditions were not effective
at inducing behavior change in play.
What this suggests is that if we hope to have an effect on the web-recruited player group,
and players like them, we will need to go back to the PMAD principles and consider ways to
make the design differences between conditions more salient to the player, as well as ways to
make sure that they drive differences in play behavior. From there, it would be possible to
investigate whether such play differences would change conceptual models about discrimination.
As discussed in the next chapter, this will likely mean modifying the principles themselves.
The paucity of significant results for the web-based player group may also be explained by the small number of non-White subjects; we may not be picking up the race-based effects visible in the Mechanical Turk group. However, it may also be that there is a ceiling effect. The web-recruited group had higher pre-test scores on all four pre-tests, indicating that they were more likely to use systemic attributions for racism and sexism, and that they held less racist and sexist attitudes.
We believe that the web-recruited group, who are self-identified gamers with a significant online presence in the hobby, may be somewhat atypical for their demographic. Gaming culture is quite racist and sexist in both subtle and overt ways; players willing to participate in a study addressing bias may not be particularly representative of gaming culture as a whole. Mechanical
Turk players, on the other hand, may be more typical of their demographic. Although the
Mechanical Turk population may also include self-identified gamers, this population is more representative in its racial makeup than the self-identified gamers recruited through affinity networks.
The results for the Mechanical Turk group are primarily driven by racial differences, and by
interactions of condition or gender with race. Although the effect size was modest, the game may
be most effective for Black and Hispanic players in changing attribution styles. In terms of
changing attitudes, Other players expressed more racist attitudes after the study, though this may
be an artifact of their exclusion from the Symbolic Racism measure. Players in this group also
demonstrated differences in client placement rates by player race, with White players placing
more clients than Black, Hispanic, and Other players. These play differences may drive the
differences in outcomes.
Although they were not recruited as gamers per se, this group also seemed more sensitive
to differences between game conditions. There were differences in the percentage of money
earned from members of the discriminated-against group between the four guess conditions, indicating that players responded to the guess condition with changes in play style. Perhaps as a result of the changed play behaviors, there were differences between bias guess conditions on the attitude tests. For both attitude tests, White players were unaffected by the bias guess condition, while Black, Hispanic, and Other players were sensitive to it. While the patterns of difference between conditions differed for racism and sexism, the fact that non-White players showed sensitivity to the difference between conditions is something we can build on.
This suggests that future design research regarding rewards should focus on working with people of non-White racial backgrounds, including those who are not self-identified gamers. Such a game could serve as an intervention to change their attitudes and attribution styles around sexism, which is an issue for people of all racial backgrounds – but it would also be helpful to make change on the racism front. While in the long run, changing the minds of White players would be useful because of their comparatively dominant position in society, internalized racism is an issue that can be addressed with non-White players; giving these players a systemic perspective on racism can, as discussed in chapters one and two, help them articulate challenges to the dominant narratives of race in American society.
The implications of these results will be further discussed in the final chapter of this dissertation.
Chapter 6: Summary and Discussion
Project Summary
This dissertation proposed a specific theoretical mechanism through which games might be able to expand Americans' models of how racism and sexism function, and to change their attitudes about racism and sexism directly. It defined the theory behind this mechanism, the
“Playable Model – Anomalous Data” (PMAD) approach, and laid out design principles for creating games of this type.
Advance, a web-based PMAD game addressing issues of systemic racial and gender bias, was used to empirically test the PMAD approach. Advance was designed specifically for this project, using the PMAD principles, and developed using Flex and Google Apps. The study was deployed online. Subjects received either a control text or one of three versions of the Advance game, and data was collected about subjects' attribution styles and attitudes around racism and sexism before and after the intervention. In-game performance data was also collected for subjects in the three game conditions. Finally, demographic data was collected for all subjects.
Using this data, we were able to address the following research questions. Can a PMAD game that models systemic bias change the likelihood of players using systemic rather than agentic explanations for sexism and racism? How does the game perform compared to a control group, and are there differences between different versions of the game? Might it also change players' attitudes about bias? And how do attribution changes and attitude changes relate to player performance in the game?
This chapter sums up the findings presented in the previous chapter, discusses them in more depth, and considers the larger implications of the study.
Literature
Popular rhetorics of racism and sexism in American society, such as color-blindness and choice feminism, implicitly rely on an individualistic, agentic approach (Bidell, Lee, Bouchie, Ward, & Brass, 1994; Hughes & Tuch, 2000; Bonilla-Silva & Forman, 2000; Bonilla-Silva, 2006;
Hirshman, 2007). Just as these models propose that the ultimate fault for incidents of racism and sexism lies with the intentional behavior of a single individual, these theories make individuals independently responsible for disparities in outcomes between groups.
This approach inadequately explains the realities of racism and sexism in modern
America. Americans are increasingly unlikely to admit to explicitly sexist or racist attitudes, or to admit to behaving in intentionally racist or sexist ways (Blanchard, Lilly, & Vaughn, 1991;
Klonis, Plant, & Devine, 2005). At the same time, significant racial and gender disparities persist in housing, healthcare, employment, and beyond (e.g. Hausmann, Tyson, & Zahidi, 2010; Kozol,
1992; Lipsitz, 1995; Valian, 1999; Wenneras & Wold, 1997). These disparities are not enacted by independent, intentional actions of individuals, as the agentic rhetorics would have it; they are produced by complex systems of bias and privilege. The impact of these systems comes not from single intentional incidents, but from patterns of behavior over time, feedback systems, and emergent effects (Gomez & Wilson, 2006; Adams et al., 2007; Feagin, 2006; Schmidt, 2005).
It is important to help Americans adopt a systemic understanding of racism and sexism for two related reasons. First, agentic explanations are insufficient to explain the mechanisms of racial and gender discrimination in modern American society. Americans who rely purely on their naïve conceptions or on popular rhetorics, both of which are agentically oriented, will be less likely to support the systemic remedies needed to address structural discrimination, since the proposed remedies do not match their mental models of how discrimination functions (Iyengar,
1989; Iyengar, 1994; Hughes & Tuch, 2000; Lau & Sears, 1981). Conversely, the agentically
oriented remedies they may be more willing to support are unlikely to help with systemic problems. Second, individualistic models of racism and sexism are often used to defend a racist, sexist status quo (Bonilla-Silva & Forman, 2000; Bonilla-Silva, 2006; Hirshman, 2007).
Adopting a systemic approach to racial and gender bias can help dismantle and neutralize common rhetorical moves.
In order to change players' models of racism and sexism, we turn to two different bodies of literature. First, we examine the existing literature on prejudice-reduction interventions.
Entertainment-based anti-bias interventions use popular media forms, such as radio or television shows, to reduce prejudice among media consumers. Such interventions have been demonstrated to be effective in the field, and they have the potential to reach broad populations; however, they are poorly theorized (Paluck & Green, 2009). Building a testable theory for entertainment-based prejudice-reduction interventions would be a significant contribution to the literature.
To build the theory itself, we turn to the literature on conceptual change. One method for helping learners adopt a new mental model for understanding a complex process is to confront them with anomalous data (Chinn & Brewer, 1993; Hewson & Thorley, 1989). Anomalous data interventions have been demonstrated to produce conceptual change in the classroom, but they have primarily been tested with academic subjects (e.g. Watson and Konicek, 1990). The question is how to apply the anomalous data approach to models of racism and sexism in an entertainment context.
For an answer, we turn to the field of “games for change.” Games for change use the medium of games to change players' ideas, attitudes, or behavior about social issues (Sawyer,
2010). Like entertainment-based prejudice-reduction interventions, games for change operate on
two levels at once. For their designers, these games are ways of changing attitudes or behavior;
for players, they are one form of entertainment media. By deeply integrating theories of learning
into the core mechanics of the game, these two approaches can be unified (Isbister, Flanagan, &
Hash, 2010; Plass, Homer, Kinzer, Frye, & Perlin, 2011).
Games, of course, can take many different forms, from leapfrog to League of Legends
(Riot Games, 2009), and not all games support all theories of learning equally well.
Within many games, players explore rulesets and test hypotheses about how to use those rulesets
most effectively to achieve their goals within the game context. Anomalous data – experiences
that indicate that their models of the game are incorrect – become tools for better understanding
the game. We frame games of this type as “playable models.” When the model of a game is
based on the target learning domain, playable model games seem to be a good fit for an
anomalous data approach.
Design Taken together, these theoretical approaches give us “Playable Model – Anomalous
Data,” or PMAD, design theory. This theory provides a testable approach to creating games that help players use systemic rather than agentic explanations.
PMAD theory defines an approach to building games, but it does not entirely specify the type of game that is created. For example, PMAD theory assumes that the model being built in the game is domain-specific; even within a domain-specific model, different aspects of the model can be emphasized or simplified, and different game experiences can be used to provoke the encounter with anomalous data. The PMAD design principles provide a general approach to creating a playable model and maximizing the encounter with anomalous data.
The PMAD design principles are as follows:
• The game system models the relevant domain.
• Player actions affect, and are affected by, the model.
• Players receive feedback about the impacts of their actions as they relate to the model.
• The game goals point players toward model conflict.
• Players can experiment with the game's model.
• Players must figure out rules and strategies for themselves.
For this dissertation, PMAD theory was used to create Advance, a game about systemic bias around race and gender14. Advance models the impact of microaggressions and of perceptual bias in the workplace (Sue, 2010; Sue & Capodilupo, 2008; Valian, 1999). Both rely on systemic rather than individual processes – repeated patterns and feedback systems – to have their most significant impacts (Sue, 2010; Valian, 1999), and both have significant negative outcomes for women and people of color (e.g. Harrell, Hall, & Taliaferro, 2003; Cortina & Kubiak, 2006;
Steele, 1997; Bertrand & Mullainathan, 2003; Steinpreis, 1999).
In Advance, the player takes the role of a corporate recruiter. The player earns money by placing clients into jobs; the player has a slate of clients to select from, and can also select from multiple jobs within the organization they are working with. By clicking on a client, the player can see information about that client, including their race, their gender, and their stats. By clicking on a job, the player can see the stat requirements for that job and its salary. When the player chooses both a client and a job, the game provides further feedback on how successful the client will be in that job, and on whether that client meets the job's requirements. If the client does not meet the job's requirements, the player can either spend money to upgrade the client's stats,
14. Advance can be accessed and played at http://www.replayable.net/advance/.
select another client for the same job, or try another job for the same client.
Not all clients are equally easy to place. Each time the game is played, a discriminated- against group is randomly selected. For this group, some jobs are more likely to expose them to microaggressions, which interfere with their on-the-job success. They also need to be more qualified than a member of the dominant group would be to get the same job. By trying different clients in the same job, or different jobs for the same client, players can compare the situations of clients from dominant and non-dominant groups.
Upon successfully placing a client, the player earns money. The player also earns money when a client is promoted to the next level of the game; more successful clients are promoted faster, while truly unsuccessful clients get fired. Players benefit from money because it allows them to upgrade characters, and money also serves as their final score if they can keep their company afloat for the entire length of the game. However, they also must pay their business expenses using this money; if they do not have enough money to pay their expenses, they go bankrupt and lose the game.
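The placement economy described above can be summarized in a short sketch. This is an illustrative reconstruction in Python, not the game's actual code (Advance was built in Flex): the class names, stat scale, qualification penalty, and placement fee are all assumptions made for the example.

```python
# Hedged sketch of Advance's client-placement economy, as described in
# the text. Names and numeric values are illustrative assumptions.
import random
from dataclasses import dataclass, field

@dataclass
class Client:
    name: str
    group: str                       # demographic group label
    stats: dict = field(default_factory=dict)

@dataclass
class Job:
    title: str
    salary: int
    requirements: dict               # minimum stat levels

BIAS_PENALTY = 2                     # assumed extra qualification required

def qualifies(client: Client, job: Job, biased_group: str) -> bool:
    """A client from the randomly chosen discriminated-against group must
    exceed each requirement by BIAS_PENALTY to get the same job."""
    penalty = BIAS_PENALTY if client.group == biased_group else 0
    return all(client.stats.get(stat, 0) >= need + penalty
               for stat, need in job.requirements.items())

def place(funds: int, client: Client, job: Job, biased_group: str) -> int:
    """A successful placement earns a fee (assumed here: 10% of salary);
    a failed attempt leaves the player's funds unchanged."""
    if qualifies(client, job, biased_group):
        funds += job.salary // 10
    return funds

# A discriminated-against group is selected at random each game.
biased_group = random.choice(["A", "B"])
```

Under these assumptions, a client from the biased group with a skill of 5 fails to qualify for a job requiring 5, while an otherwise identical client from the dominant group succeeds; this is exactly the comparison players can make by trying different clients in the same job.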
Three versions of the game were created based on common design patterns (Bjork &
Holopainen, 2004). In each, players were explicitly asked to identify who was being discriminated against and were offered a different reward for successfully doing so. The first version of the game is informational: with a correct guess, players have their suppositions about the bias in the system confirmed. The second version offers players a financial reward for getting the guess right. Finally, the third version of the game changes the rules to offer players a greater benefit from placing members of the disadvantaged group. In addition to looking at the overall impact of PMAD design, these three conditions allow us to examine whether players respond differently to different design patterns for reward systems in a PMAD context.
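The three reward variants can be sketched as a single dispatch on game version. The condition names follow the text, but the concrete reward values (the bonus amount and the fee multiplier) are assumptions for illustration; the dissertation does not specify them here.

```python
# Hedged sketch of the three reward conditions for a correct bias guess.
# Reward magnitudes are illustrative assumptions, not the game's values.
def apply_guess_reward(condition: str, correct: bool, state: dict) -> dict:
    """Update game state after the player guesses the discriminated-against
    group; only a correct guess triggers the condition's reward."""
    if not correct:
        return state
    if condition == "informational":
        state["bias_confirmed"] = True                # feedback only
    elif condition == "financial":
        state["funds"] = state.get("funds", 0) + 500  # assumed one-time bonus
    elif condition == "generative":
        state["bias_group_fee_multiplier"] = 2        # richer fees for bias-group placements
    return state
```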
Research Questions and Results
Broadly speaking, the study asks whether a PMAD game that models systemic bias can
change the likelihood of players using systemic rather than agentic explanations; whether it can
change players' attitudes about racial and gender bias; and what the relationship is between in-
game behavior and player change.
A total of 412 players were recruited to participate in this study. One group of players
was recruited online based on their participation in online gaming. A second group was recruited
using Amazon's microtask service, Mechanical Turk. As the two groups were not comparable,
results for the two groups are described separately.
In order to examine the overall impact of the game, players were randomly assigned to
one of four treatment conditions: three game conditions or a text-based control group. Pre- and post-test data on player attributions and attitudes around racism and sexism were collected.
Using this data, we were able to answer the research questions noted in chapter four and restated below with a summary of the findings. The findings summarized under each question are further discussed later in this chapter.
Q1: Controlling for pre-test scores, are there differences in attribution test scores for
race and gender across the four study conditions?
This question looks across the four treatment conditions: the control group, who received a text-based intervention; the informational game group, who could confirm their hypotheses about the bias in the game; the financial game group, who received a monetary reward for correctly identifying the bias; and the generative game group, who received more money for placing bias-group clients after a successful bias identification.
For web-based players, subjects in the control condition were more likely to use systemic explanations for sexism after the intervention; their scores were significantly different from the game conditions taken together. No differences were observed between conditions for any of the other measures. For Mechanical Turk players, no significant differences between conditions were found.
Q2: Are there associations between in-game measures (such as player score, number
of characters placed, and number of game plays) and attribution test scores?
No associations were found for either group. However, for the Mechanical Turk players, there was an effect of player race on the number of characters placed.
Q3: For game players, are there differences in measures of systemic understanding
of racism and sexism across bias guess conditions?
This question looks across the four bias guess conditions: players who did not attempt to guess the bias, players who attempted a guess in the informational game condition, players who attempted a guess in the financial game condition, and players who attempted a guess in the generative game condition.
For web-recruited players, no significant differences were found between the bias guess conditions. For Mechanical Turk players, no significant differences were found between the bias guess conditions for White players; however, differences were found between bias guess conditions for non-White players. For detailed results, see chapter five; the meaning of this
finding is discussed later in this chapter.
Q4: For game players, are there differences in game performance measures across
bias guess conditions?
For the web-recruited players, no differences were found in game performance measures
across bias guess conditions. This was also the case for the Mechanical Turk players. However,
there was an effect of player race and gender on the percentage of money earned from clients
who were being discriminated against.
Q5: Controlling for pre-test scores, are there differences in attitude test scores for
race and gender across the four study conditions?
For web-recruited players, no significant differences were found between treatment
conditions. However, for Mechanical Turk players, a significant difference in attitudes toward
sexism was found between the four treatment conditions. Subjects in the control condition
expressed less sexist attitudes after the intervention; although none of the game conditions
caused a significant change in player attitudes, there were still significant differences between
them. See chapter five for more details.
For both web-recruited and Mechanical Turk players, the entire intervention – including
the control group – elicited more racist attitudes at post-test. Although there were no differences in this result between treatment conditions, these findings are discussed later in this chapter.
Q6: Are there associations between in-game measures (such as player score, number
of characters placed, and number of game plays) and attitude test scores?
For the web-recruited players, no significant associations were found between in-game measures and attitude test scores. However, for the Mechanical Turk players, a significant negative correlation was found between player score and the Modern Sexism post-test score; a higher player score was associated with a lower post-test score, indicating that more successful players expressed more sexist attitudes at post-test.
Q7: For game players, are there differences in measures of attitudes toward racism
and sexism across conditions?
For the web-recruited players, no significant differences were found between the bias guess conditions. For the Mechanical Turk players, no significant differences were found between guess conditions for White players. However, significant interactions were found between player race and treatment condition for non-White players on both the racism and sexism attitude measures. Detailed results can be found in chapter five; the meaning of this finding is discussed at greater length later in this chapter.
The results described above are summarized in Table 34. To better understand these results, we cluster them into the following three broad categories: treatment condition effects, guess condition effects for web-recruited and White Mechanical Turk players, and guess condition effects for non-White Mechanical Turk players.
Table 34
Summary of study findings

                          Sexism – Web          Racism – Web         Sexism – MT            Racism – MT
Q1: Attribution           Control increased     No effect            No effect              No effect
differences between       chance of systemic
treatment conditions      attributions
Q2: Attribution           No effect             No effect            No effect (race        No effect (race
correlations with                                                    effect on game         effect on game
game data                                                            data)                  data)
Q3: Attribution           No effect             No effect            Non-White players      Race x gender
differences between                                                  outperform White       interaction
bias guess conditions                                                players
Q4: Game data             No effect (game data)                      Race x gender interaction (game data)
differences between
bias guess conditions
Q5: Attitude              No effect             All conditions       Control decreased      All conditions
differences between                             increased racist     sexist attitudes       increased racist
treatment conditions                            attitudes                                   attitudes
Q6: Attitude              No effect             No effect            Negative correlation   No effect
correlations with                                                    with player score
game data
Q7: Attitude              No effect             No effect            Differences found      Differences found
differences between                                                  for non-White          for non-White
bias guess conditions                                                players                players
Results cluster one: treatment condition effects. This cluster of results deals with the differences between the four treatment conditions – the control text and the three versions of the game.
When dealing with issues of sexism, effects were found for the control text for both the web-recruited and Mechanical Turk players. For web-recruited players, the control text increased their likelihood of using systemic explanations for sexism, while the game, taken as a whole, did not. For Mechanical Turk players, subjects in the control group expressed less sexist attitudes
after the intervention. While subjects in the other three game conditions showed no pre-post
change, the differences between the control and game conditions were still significant. In both cases, the effect appeared only for sexism and not for racism – a difference discussed in the next section of this chapter.
When dealing with issues of racism, there were no significant differences between treatment conditions. However, the study as a whole caused players in both groups to express more racist attitudes at post-test than at pre-test. This effect was found for all web-based players and for Mechanical Turk players reporting a racial identity of Other.
We cannot conclude that this was an effect of the game, because the control text had the same effect on study participants as the game did. Rather, we conclude that when it comes to racism, something about the study itself caused all web-recruited subjects and some Mechanical
Turk subjects to express more racist attitudes. We consider possible explanations for this finding, and discuss its implications, later in this chapter.
Results cluster two: bias guess effects for web-recruited players and for White
Mechanical Turk players. This cluster of results deals with differences between the four bias guess conditions. Among subjects who received a game condition, we look at differences between players who did not make a guess about the bias and players who made such a guess in each of the three game conditions.
Web-recruited players displayed the same patterns as White Mechanical Turk players.
Although there were significant pre-existing differences between the two groups, we consider their results together for clarity.
For these two groups of players, no differences were found between bias guess conditions on any of the in-game or outcome measures. We discuss the implications of this null finding later
in this chapter.
Results cluster three: bias guess effects for non-White Mechanical Turk players. These
results deal with differences between the four bias guess conditions. Among subjects who played
the game, we look at differences between players who did not make a guess about the bias and
players who did make such a guess in each of the three game conditions.
For non-White players from the Mechanical Turk group, differences were found between bias guess conditions for both the attribution and attitude measures. As described in the previous chapter, non-White players were significantly more likely to use systemic explanations for
sexism after the game than White players were, although neither group showed significant
change from their pre-test performance. Non-White players also showed significant differences
between guess conditions for both the Modern Sexism and Symbolic Racism measures. Although
for the majority of guess conditions there was no significant change in attitude from their
performance at pre-test, non-White players scored significantly lower on the Symbolic Racism
test after playing a game in which they did not make a guess. Possible explanations for this
finding are discussed later in this chapter.
A more complex pattern was found for the Systemic Racism test, where there was an
interaction between race and gender. Among women, White women were less likely to use
systemic explanations for racism than their non-White peers at post-test, and significantly decreased their likelihood of using such explanations from their pre-test performance; among men, White men were more likely to use systemic explanations for racism than non-White
players, although none of the groups showed a significant change from their pre-test performance.
In-game performance differences aligned with these results. White women earned a
higher proportion of their score from members of the discriminated-against group than non-White women did, while Black, Hispanic, and Other men all earned a higher proportion of their
score from bias-group members than White men did.
Additionally, non-White players placed fewer characters during the game than White
players did – but with no significant difference between scores. This difference suggests that
non-White players from the Mechanical Turk group relied on quality, rather than quantity, of
placements in order to succeed in the game.
These results taken together suggest that non-White players may be more sensitive to the effects of the game than White players, and that the race-gender interactions found here require further investigation. These findings will be further discussed later in this chapter.
Discussion
The three clusters of findings above have raised several complex issues that need further investigation. We now return to each of the three clusters, looking into the issues they raise in greater depth.
Results cluster one: differences between control and game. The pattern of differences between the control and the game conditions raises several questions. First, why did the control text have an impact on sexism, but not racism? Second, why did the intervention as a whole make players express racist attitudes, and what might that imply? Finally, what do the results, and particularly the differences between racism and sexism, mean for the project as a whole? We deal with each of these questions in turn.
As described above, for both player groups, the control text performed significantly better than the game at getting players to change their attribution styles (for web-based players) and attitudes (for Mechanical Turk players) about sexism. However, there were no differences
between control and game for racism for either group. Why did these differences between sexism and racism appear?
It is possible that the control text conveyed ideas about sexism more effectively than ideas about racism. However, the text was constructed to alternate between examples using race and examples using gender for each of the concepts explained. Given that the two player groups also differed in which measures the control text affected, we propose that a more likely explanation has to do with underlying differences in players' prior conceptions of racism and sexism. As we will see, that is consistent with other results found in this section.
While there were no differences between control and game conditions for racism, the intervention as a whole caused players to express more racist attitudes after completing the study.
We therefore ask why the study made players express more racist attitudes.
First, we dismiss the hypothesis that playing the game caused players to express more racist attitudes, because the control condition produced the same increase at post-test. Reading a text about racism had precisely the same effect as playing the game. The activity itself, regardless of format, affected study participants equally.
The existing literature on racism notes that people's attitudes about racism do not always match what they consider to be socially appropriate. As with other social desirability effects, people want to conform to what they think is appropriate behavior (Holtgraves, 2004; Uziel,
2010). However, social norms about the expression of racism are strong, so having this discrepancy revealed can be unusually stressful (Devine, 1989; Fiske, 1998; Crocker, Major &
Steele, 1998).
One hypothesis is that maintaining an appropriate set of beliefs is a cognitively
exhausting task. It takes effort for a person to avoid expressing or acting on underlying
prejudices that they cannot entirely eliminate (Devine, 1989; Monteith, 1993). When a person's
willpower is exhausted, or under circumstances of high cognitive load, underlying attitudes will
be expressed (Macrae, Bodenhausen, Milne, & Jetten, 1994). However, in this case we might
expect to see a difference between the control text and the game. There is no reason to believe
that reading a text is as hard as learning a new game and playing it multiple times. While the
cognitive exhaustion theory makes sense as stated in the literature, it may not fully explain what
is happening here.
Instead, we turn to the concept of “moral credentials” (Monin & Miller, 2001). When
people have a chance to establish themselves as non-racist, they are more likely to act in ways that could possibly be interpreted as racist, because they feel they have proven that racism could not be the explanation for their actions. For example, Effron, Cameron, and Monin (2009) found that subjects were more likely to describe a job as suited for Whites if they had just endorsed
Obama for president, compared to endorsing a White Democrat for president or identifying
Obama without endorsing him.
We propose that taking part in this study served as a “moral credential” for subjects, who then felt more comfortable expressing ambiguously racist attitudes at post-test. In other words, the study did not cause players to become more racist. Rather, it made them more likely to express their true attitudes about racism. Paradoxically, expressing more negative attitudes about race can actually be a positive thing for anti-bias work. People with racist attitudes will find justifications to validate them, even if they are not willing to explicitly commit to them
(Uhlmann & Cohen, 2005; Gaertner & Dovidio, 1977). When people conceal racist attitudes with rhetoric, it only leads to racist outcomes that use a different set of justifications. Exposing
those attitudes can be the first step to changing them. For example, a value-consistency approach
is one of the validated techniques for reducing prejudice (Paluck & Green, 2009; Rokeach,
1973). When individuals are confronted with a conflict between their prejudice and another value that is important to them, such as equality, they behave in less prejudiced ways (Grube, Mayton,
& Ball-Rokeach, 1994). For these techniques to be effective, subjects must be aware of their own prejudice.
Moral credentialing is also a concern for any anti-bias intervention. By being the kind of person who takes part in an anti-bias intervention, one could see oneself as creating moral credentials that would allow the expression of ambiguously racist beliefs or behavior. This hardly invalidates the importance of anti-bias work itself.
However, there is a question about this interpretation as well. Monin and Miller (2001)
have shown that the moral credentialing effect works for sexism as well as for racism. Why do we only see it for racism here, while for sexism we see a benefit from the control text and no impact from the game? It might be an impact of the Symbolic Racism test itself (Henry & Sears,
2002). Because the Symbolic Racism test discusses only anti-Black racism, subjects who were not Black may have reacted poorly to having their experiences with racism omitted from the study.
The deeper finding here is that the patterns of behavior for sexism and racism were
different. For sexism, the control condition had a positive impact on player attributions (for the
web-recruited players) and attitudes (for Mechanical Turk players). For racism, the study as a
whole caused players to express more racist attitudes. What this suggests challenges one of the
underlying premises of this study. The premise of the game was to model the common structural
underpinnings of sexism and racism. While sexism and racism are not identical in practice, they
could be deconstructed into a common set of patterns that produce both racial and gender
disparities, and that therefore could be addressed by the same set of game mechanics. In this
case, those were the patterns of microaggressive stress and perceptual bias.
The patterns themselves may operate the same way for race and gender – but players do
not. The game simplified racism and sexism to this common model, but it did not successfully
induce the player to reduce racism and sexism to the same model, or at least to do it without
incorporating their previous understandings of the phenomena. For example, Sidanius and
Veniegas (2000) argue that sexism and racism are qualitatively different forms of discrimination,
with racism functioning primarily as an act of aggression and sexism functioning primarily as an
act of control. These differences may have been more salient than the common structural model.
Players bring their own assumptions about race and gender to the table, as well as their
prior experiences with racism and sexism. While the study attempted to control for this by
controlling for player race and gender, this approach did not succeed. The assumption was that
people would respond to the game based on their dominant or non-dominant position in
American society, which would shape their prior experiences. Instead, all players showed the
same pattern of differences between racism and sexism in this results cluster, indicating that it
may be the differences in their conceptions of racism and sexism, rather than their personal experiences with racism and sexism, that are driving these findings.
The broader lesson of these results is that underlying structural commonalities between
real-world phenomena do not override differences in players' prior understandings of those phenomena, and that those conceptual and social understandings may even override differences in personal experience that would lead players to experience the game differently. That said, there were differences between players by race (and, occasionally, gender), which we describe
more fully below.
Results cluster two: bias guess effects for web-recruited players and for White
Mechanical Turk players. For web-recruited players, and for White Mechanical Turk players, no differences were found in either player behavior or outcomes between the bias guess groups.
In other words, players who never encountered the bias guess system behaved the same way as players who did encounter it; among players who encountered the bias guess system, it did not matter which version of the bias guess system they used.
This result could be interpreted to suggest that playable model – anomalous data (PMAD) theory is not useful for helping players achieve conceptual change. PMAD theory predicts that the design of the bias system will affect player learning. If the type of bias system made no difference to players' attitudes or attribution styles, and if there was no difference between players who encountered the bias system and players who did not, then perhaps PMAD theory is simply wrong.
While this explanation sounds plausible, we reject it as a conclusion. First, there were indeed differences found for some players, as will be discussed later in this chapter. More importantly, this explanation does not account for the mechanism by which the theory operates.
PMAD theory predicts that the differences between game versions will drive differences on the outcome measures because of players' response to differences in the game rules. Change in attribution style and attitudes, therefore, would be the result of differences in play behavior.
Because we found no differences in play behavior between bias guess conditions for these groups, we cannot draw conclusions about whether this aspect of PMAD theory is effective.
Instead, we can draw the more limited conclusion that the game's design did not successfully drive players to change their behavior in the game. Because subjects played the same way across all game versions, their experience in confronting the anomalous data presented by the game's
model was effectively identical as well.
The question, therefore, is why Advance, which was designed according to PMAD
principles, failed to evoke differences in player behavior between game conditions for web-
recruited players and for White Mechanical Turk players. We consider possible explanations for
this, and suggest ways in which the PMAD principles can be changed to make model encounters
more salient to the player. From there, it will be possible to conduct further studies to investigate
how these changes would drive differences in player behavior, and to test whether such play
differences would in fact change players' conceptual models about discrimination.
First, we consider that the game may have failed to drive differences in behavior between
groups because it was insufficiently fun. After playing the game, players were asked about their
enjoyment of the game. Fewer than 25% of players reported more than moderate liking for the
experience, or reported that they would be likely to recommend it to a friend. The game replay
data are similarly suggestive; although all players had the option to continue with the game after
the required two playthroughs, only 3.6% of players did so. Players who did not enjoy the game
may have been less motivated to attend to its underlying rules and processes for their own sake.
Instead, they may have focused on trying to complete the experience and move on. While fun is
difficult to reduce to a single design principle, this possibility suggests that future projects based on PMAD theory should use better metrics for player enjoyment during play-testing and pilot-testing.
Another possible explanation for the lack of differences between player groups is time on task. Time on task is a significant factor in learning (Arlin & Roth, 1978; Bell & Davidson, 1976;
Cobb, 1972). In the intervention as currently designed, the vast majority of players engaged in ten minutes of gameplay over two play instances. The time spent playing the game may simply
have been insufficient to engage players with the game's model, let alone have them figure out strategies with which to respond. If time on task were the critical factor, however, we might expect to see an impact for players who chose to play a third time, since they would actually spend more time on task. This is not, in fact, the case. As described in the previous chapter, there were no significant differences between players who played twice and players who played more than twice on any in-game or outcome measures. However, as with fun, we note time on task as a possible issue to investigate in future PMAD-based designs.
A third possible explanation is problems with player feedback. During play-testing, players described difficulty understanding the in-game reasons for their play experiences. In response, more of the game's model was made directly visible to players. For example, players described confusion about why some jobs were better than others for a given client. In response, during the client placement process, the reactions of adjacent NPCs to the given client were represented visually by hearts (for positive relationships) and skulls (for negative relationships).
This allowed players to see the pattern of relationships for a given client in a given job. However, this access to the game's underlying model may not have gone far enough. Players were able to see the pattern of who they were placing, but they did not have an easy way to see who they had placed. In order to see the patterns in their own placement behavior, players would have to hunt through multiple levels, clicking on each of their successfully-placed clients individually and mentally tracking what they'd found. At best, they might be able to infer which types of clients they were successfully placing based on the “leftover” clients in their queue.
This information was critical to players developing models of bias, because the bias in the system emerges in patterns of placement. The difficulty of understanding these patterns was meant to be an engaging central mechanic of the game, but it may simply have been too complex
a task in too short a period of time. Giving players explicit access to their own behavioral
patterns – for example, by allowing them to bring up a control panel showing who they'd placed
on which level – might have helped players quickly see what was happening, in time to develop strategies to respond to it for a higher score. This suggests that, for example, the principle of feedback may need revision. Making patterns more accessible to players more quickly may help
players get to the model confrontation faster, and may help them develop goals and strategies to
accommodate the model shift.
The issue of feedback also relates to the fourth possibility: reflection. Reflection can help
players learn more from their experience in games (Moreno & Mayer, 2005). Advance does not
explicitly ask players to reflect on their in-game behavior or experiences. The game was intended
to use players' difficulty in client placement to induce them to reflect on possible strategies for
improved performance. However, players may not have done this on their own.
In some ways, the game actively discouraged players from taking time for reflection.
While finding a good placement for a client earned more money than a poor placement, there
was a significant opportunity cost of spending time hunting for a better placement, with no
guarantee of success. This tradeoff was meant to cause players to develop more efficient
strategies for placement, so that they could earn more money with less time invested. However,
players may have felt uncertain that they would be able to benefit from spending more time on
the task, and therefore disproportionately chose to place characters as quickly as possible instead
of thinking carefully about their options (McGuire & Kable, 2013).
To address this issue, a new PMAD design principle could be added that deals with
reflection. Reflection, in this context, is meant to occur when the player has a decision to make
about the best way to engage with the game's model in pursuit of their in-game goals. PMAD
theory predicts that players' encounters with the game model, and subsequent engagement with it through strategy development, help them integrate the anomalous data into their worldview.
Instead of focusing on getting the player to reflect on the whole game experience after playing, the new principle would emphasize repeated opportunities to reflect on their strategy, performance, and goals during play.
New principles would also need to address a fifth issue, namely the issue of centrality.
The current version of Advance was designed to make interactions with the bias system optional.
Players would always be trying to place clients within a biased system, but they did not have to attempt to identify the bias or take advantage of the rewards for doing so. This design decision was meant to reduce the players' sense of mandatory engagement with the bias system (Heeter et. al., 2011), so that players would be freely choosing whether or not to engage with that mechanical subsystem during play. Unfortunately, over 50% of players chose not to interact with the bias system at all. The bias system was not central enough to motivate the majority of players to interact with it, even though it was affecting their play. Without making the bias system mandatory, there are ways to make it more central to the design. Developing a new principle, and testing the impact of centrality, would be a good way to investigate this possibility.
Finally, a PMAD design principle would need to be added to account for the issue raised by the first results cluster, namely that player preconceptions need to be taken into account when designing the game's model. For this study, extensive research was conducted about players' models of racism and sexism, but things that are tangential to the model may actually be driving significant differences – such as implications about how the society described by the game handles issues of bias – as we will see in the next cluster of results.
Results cluster three: bias guess effects for non-White Mechanical Turk players. There
are several race-based effects and interactions in this cluster that require deeper investigation.
First, we consider the issue of why non-White players outperformed White players on systemic
attributions of sexism. Second, we look more deeply at the interaction between race and gender
for the systemic racism test. Finally, we consider why different game conditions may have
affected attitudes for this particular group, both in terms of differences between conditions and in
terms of some conditions displaying pre-post differences.
First, we examine the finding that Black, Hispanic, and Other Mechanical Turk players were more likely to use systemic attributions for sexism after playing the game than White players were. This effect may be driven by play differences between the groups. Black, Hispanic, and Other players placed fewer characters than White players, but without any difference in overall game score. Since the game ran for the same length of time for all players, this finding suggests that this player group spent more time on placing each character, and made better per-
character investments. This difference in engagement with the game may explain why the non-
White players were more willing to use systemic explanations at post-test.
That said, this finding requires further explanation. Even given differences in play behavior between White and non-White players, why would this effect appear only for systemic explanations for sexism? This may be a population effect. Gaming culture is quite sexist; for example, over 85% of characters in videogames are male (Williams, Martins, Consalvo, & Ivory,
2009), identifiably female players are disproportionately harassed (Kuznekoff & Rose, 2012), and prominent women designers and critics face enormous, gendered hostility (Lewis, 2012).
Women who are more sensitive to sexism may have been driven out of gaming culture entirely, leaving a population of game-playing women who are less sensitive to the issue. Gaming culture is also racist, in that Black and Hispanic characters are underrepresented in games (Williams et al., 2009) and racial insults are commonly deployed among gamers (Gray, 2012). Women, however, may find it harder to "pass" in gaming culture while still participating fully. For example, many online games require voice chat, where it is easier to identify female voices than the voices of non-White players, and where female voices have been shown to provoke hostility and gender-based harassment (Kuznekoff & Rose, 2012). This effect appears only for the
Mechanical Turk group, not for the web-recruited group. However, as noted above, web-recruited players may not be particularly representative of gaming culture, since self-identified gamers willing to participate in a study addressing bias are bucking a gaming culture that is racist and sexist in both subtle and overt ways.
Second, there was one area where an interaction between race and gender was found: on the Systemic Racism test. Women and men showed opposite patterns. White women scored lowest among women, while White men scored highest among men; Black and Hispanic women scored highest among women, while Black and Hispanic men scored lowest for men; both male and female Other players scored in between the other two groups. In other words, White men and
Black and Hispanic women were most likely to use systemic explanations for racism at post-test.
A similar, but not identical, interaction was found between race and gender for how much the player earned from discriminated-against clients. Again, the patterns for women and men were reversed. Among women, White women earned the highest proportion of their score from discriminated-against clients, while White men earned the lowest proportion among men.
However, instead of Black and Hispanic players situated at the other extreme, Other men earned the highest proportion of their score from bias-group members among men, while Other women earned the lowest proportion among women. Black and Hispanic players scored in between.
Although we have no direct evidence that play differences drove the differences in
likelihood of using systemic explanations for racism at post-test, the patterns are suggestively similar, and it is certainly consistent with PMAD theory. If this play difference is indeed responsible for the differences on the Systemic Racism test, then it implies that players who were more successful at earning money from bias-group clients were less likely to use systemic explanations for racism after playing the game. To explain this, we must recall that there were no differences between groups for the number of bias-group clients placed, or for the total number of clients placed. The extra income being earned from the bias group, therefore, must be coming from promotion bonuses and from the higher salaries available at higher levels. In other words, the players who were more successful at promoting discriminated-against clients were less likely to use systemic explanations for racism after the study. These players may have concluded that if they were able to place these clients, the problem couldn't be all that serious. Frustration with the system, on the other hand, may have been comparatively more productive at getting players to consider systemic explanations for bias. The frustration hypothesis may also explain the finding that for Mechanical Turk players, a higher game score was correlated with more sexist attitudes at post-test.
Finally, we consider the differences between game conditions that were found for non-
White Mechanical Turk players. These differences appeared only for the attitude tests, which gives us a clue as to what might underlie these differences.
Although there were differences between Modern Sexism and Symbolic Racism, in both cases players performed best on conditions where they received money in some form for reporting the bias – the financial-guess condition for sexism, and the generative-guess condition for racism. Conversely, the only case in which there was a significant pre-post difference was on the Symbolic Racism test in the no-guess condition. Players held more racist attitudes at post-test
when they did not realize they had the option of doing anything about the bias in the game
system, even if only reporting it.
We propose that this has to do with non-White players being sensitive to the vision of the
larger world presented by the game. What happens in a world where bias exists? Will someone
do something about it? Or is the unjust state of affairs simply presented as normalized and
inevitable? When players did not make a guess, they encountered bias helplessly; when they did,
even if they did not succeed in guessing the bias, they discovered that someone cared about
whether bias existed. Players expressing less racist attitudes when they thought they could do
something about the game's bias, and more racist attitudes when they had no such option,
suggests a retroactive justification for their game experiences (Jost & Banaji, 1994; Jost, 2002;
Jost, Banaji & Nosek, 2004).
The original hypothesis of this study was that players would be sensitive to the variation
in the mechanics of the reward conditions and respond by making different play choices.
However, these results suggest that players were more sensitive to the social implications of differences between conditions. Instead of responding with game strategies, they responded by developing a model of the game world's implicit attitudes toward justice and justifying it. This question is worth investigating more deeply. It suggests that players perform narrative extrapolation even in simple games, which challenges earlier findings that players pay more attention to game mechanics than story (Lindley, 2002). A better understanding of this extrapolation process can help designers use narrative framing more effectively to convey their game concepts, and, conversely, to avoid investing effort in narrative elements that are not necessary to make their point.
We must, however, consider why these effects did not appear for the web-recruited subjects. This may be due to the small number of non-White web-recruited subjects, which prevents us from detecting race-based effects. However, it may also be that there is a ceiling
effect due to differences between groups.
effect due to differences between groups. Players from the web-recruited group were more likely to use systemic attributions for racism and sexism, and they held less racist and sexist attitudes.
The race-based effects found here may only be visible among a more biased population.
Limitations of the Study

There are significant limitations to the study, which need to be kept in mind as we interpret the results and determine its implications.
First, the study is limited because of its player population. Two groups of subjects were recruited, one from web-based communities of online gamers and one from Amazon's
Mechanical Turk. We can question to what extent these subjects are representative of the population as a whole. Both were recruited in ways that suggest there are limits to their generalizability – whether because they are self-identified game-players or because they spend
their free time on Mechanical Turk. (Though of course these categories are not mutually
exclusive.)
Second, there were demographic issues with the study. It failed to recruit enough non-
White players in the web group to know whether there are differences in play based on racial
identity; even in the Mechanical Turk condition, where more non-White subjects were recruited,
the demographics of the subject pool do not reflect the larger demographics of the United States
(Howden & Meyer, 2011; Humes, Jones, & Ramirez, 2010).
Third, the study was deliberately limited to American participants, because race as a social construct is understood differently in different countries. Getting at attitudes and attributions surrounding race in other national contexts would require instruments and models appropriate to those groups of subjects. Additionally, we have reason to expect that attribution patterns are culturally specific (Choi, Nisbett, & Norenzayan, 1999; Norenzayan, Choi, & Nisbett, 2001). Because this study specifically looked at attribution styles, we limited our study to American subjects to avoid cultural confounds.
Finally, the major limitation of this study is the design of the game itself. While this dissertation attempted to create a game that both conveyed the model and functioned successfully as entertainment, it did not entirely succeed. Although many problems were caught in play-testing, the pilot study found that many players did not understand the game interface. A tutorial was added before the full study was performed, which helped players understand the game, but player satisfaction rates remained low.
Implications for the Literature

In light of the above-noted weaknesses and limitations, what are the implications of this work for the body of literature used to develop this dissertation?
Based on this study, there is no reason to doubt that it is important to change Americans' models of sexism and racism. Although this study did not find that Advance was more effective than a control text at changing players' attribution styles, the original problem posed by the dissertation still stands. Dominant, individualistic models of racism and sexism inadequately explain systemic racial and gender bias, and switching from individualistic to systemic models is a difficult problem. Without a systemic model of bias, people are less likely to support systemic interventions to reduce racial and gender inequity (Iyengar, 1989; Iyengar, 1994; Hughes & Tuch, 2000; Lau & Sears, 1981), and hence less likely to advocate effectively for social justice. Similarly, changing people's attitudes about racism and sexism matters because attitudes shape social norms (Oskamp, 2000). The findings of this study in no way negate the need for social change.
Although it is likely that PMAD theory will require significant modification and testing before it can be said to be effective, this approach can still aid in the development of
entertainment interventions for prejudice reduction. There are projects which have been
successful in this area (e.g. Paluck, 2009), but the literature is inadequately theorized (Paluck &
Green, 2009). While PMAD theory, as currently constructed, did not change outcomes for players in this study, it does provide a testable theoretical base for future work. Testing the changes and additions proposed in this chapter, for example, may yet show PMAD to be effective.
In other words, it is too soon to say that PMAD theory does not provide anything useful to the
prejudice-reduction enterprise. Advance was only a first attempt at building games based on this
theory, and the fact that there is a testable and modifiable theory in the first place is a step
forward for this sub-field. The PMAD design principles, even in their currently limited form,
give us a way into the problem of designing games as entertainment-based prejudice-reduction interventions.
This study may also have implications for the field of games for impact. It demonstrates that even when a game provides a simplified model, players may bring their prior knowledge with them across the border of the magic circle (Huizinga, 1950). For example, players appeared to be more influenced by differences in their prior knowledge of racism and sexism than by the shared model of racism and sexism constructed by the game. This is consistent with existing work in the field (e.g. Copier, 2007; Consalvo, 2009), but it demonstrates that this effect can be present in casual games as well as in more in-depth gaming experiences.
The finding that non-White players are the most sensitive to the game challenges the usual recruitment practices of game-based research, which often focus on White players or on games that are primarily played by White players. While we do not yet know if this finding holds true for all PMAD games, only for PMAD games about racism and sexism, or only for Advance in particular, further investigation can answer this question and challenge the field to widen its search for research subjects and potential audiences.
Finally, this study suggests some of the complexities of interventions within a game context. Reading the text had no impact for the majority of measures for each group; it is possible that reading text within the context of a game activity made players take it less seriously. As suggested earlier in this dissertation, games' cultural position makes them accessible to unenthusiastic learners, but it may also make the material in the game less credible
– even if that material is not itself part of the game. This suggests that the positioning of game interventions is something worth investigating further, not just for this study but for game research of all sorts.
Implications for Future Research and Practice

As discussed earlier in this chapter, there are many research questions that emerge from the findings of this dissertation.
First, there are many questions that could be answered with further analysis of the data collected in this study. For example, full gameplay logs were collected for every player.
Analyzing these logs would allow us to look more deeply at the question of how player behavior in the game drives conceptual change. We could work to identify play patterns associated with conceptual change around issues of race and gender, and then validate the predictive value of those play patterns in a second study.
There are also research questions to be answered about whether the Advance game would
prove to be effective in other contexts. Specifically, we could investigate the possibility of preparation for future learning (PFL) effects using Advance. Preparation for future learning, in this context, means that games can function as virtual experiences that players can build on when making sense of new concepts (Hammer & Black, 2009). Would playing the game change the extent to which players are later affected by more formal learning experiences around discrimination and bias? For example, how would players perform if they first played the game and then read the control text, or if they first read the control text and then played the game?
Looking at PFL effects with Advance would not just help us investigate ways of changing players' models of racism and sexism. It would also help us understand PFL effects in games better. While some games appear to prepare learners for future encounters with formal learning experiences such as reading from a textbook, others do not (Hammer & Black, 2009). Knowing whether PFL effects appear in Advance would help us learn more about the game features and knowledge domains where game-based PFL can be the most effective.
We could also study the impact of situating the game within a more explicitly educational context. This might change what players attend to within the game, especially if framed in a
“productive failure” context (Kapur, 2008). Gameplay could be understood as a challenging and potentially unsolvable problem to which a subsequent text intervention could provide a solution.
Given that the only difference in game-play effects found between game conditions was that groups who struggled more to earn money from discriminated-against clients were also more likely to use systemic explanations for sexism, productive failure might be a fruitful approach.
Earlier in this chapter, we proposed possible additions and alterations to the PMAD principles. Each of these principles would need to be investigated more thoroughly. In future studies of the PMAD principles, the research would be constructed differently. First, we would iterate on designs using a given principle to determine whether it provokes the desired gameplay differences. Only then would we investigate whether in-game behaviors affect any outcome measures. Additionally, further research on the PMAD principles can investigate the importance of the principles already defined in this dissertation. While they are grounded in both game design and educational theory, they can be individually tested to determine to what extent each is important to PMAD theory. For example, the first principle (model-building) will need to be modified to account for player preconceptions that are not model-based, and such modifications should be empirically tested.
Additionally, we found that a central challenge of Advance was that the differences between conditions did not drive differences in play behaviors. Elements of the game beyond the PMAD principles could be modified and tested in order to get players to react differently.
For example, the study’s findings suggest that players may have connected to the game narratively and emotionally. We could investigate whether creating more of a personal connection for the player, such as by allowing them to name their company, would increase the impact of the game. Similarly, we could investigate the effect of creating a deeper personal connection between the player and the characters they are helping, such as by allowing the player to see a particular character’s backstory or by having characters express gratitude to the player.
We could also investigate the impact of allowing players to change the bias in the system, as opposed to simply reacting to it. Bias in Advance is presented as an inescapable and unchangeable fact; players who fail to navigate it successfully may feel helpless, while players who successfully place discriminated-against characters may conclude that their success means systemic bias does not exist. Allowing players to modify the bias system can help address both of these issues: for the former group, it demonstrates that discrimination can be fought; for the latter group, it shows that their success could be even greater in a fairer world.
Finally, engaging with the game before and after the change in the bias system could help players
compare and contrast the two rulesets in order to understand the systemic aspects of
discrimination more deeply.
Other findings of this dissertation suggest that it is worth investigating further to what extent player preconceptions cross the line into the game, and to what extent the game's model
can challenge or undermine those preconceptions. Although the game's model was built on the
similarities between racism and sexism, differences appeared between the two. The control text
helped players address issues of sexism, while the study as a whole caused players to express
more racist attitudes. The attempted simplification to a common model did not succeed, possibly
because of players' prior experiences with the concepts. It is worth trying to understand which
features of racism and sexism, as social phenomena, remain attached to the game even when
using simplified models, and which are successfully abstracted away.
Similarly, the original hypothesis of this study prioritized the mechanical differences between game reward conditions as a way of creating differences in player behavior. However, the results of the study suggest that players are at least equally sensitive to the social implications of game rules, and that they can be affected by these implications even without changing their play behavior. This suggests that future designs of games for impact need to account for the affective and narrative environments suggested by gameplay, and for which features of games are most effective at conveying such elements.
Finally, there is the issue of the impact of player race on the results of the study. By looking at a larger sample of web-recruited non-White subjects, we could see whether they show the same patterns of difference as the Mechanical Turk players, indicating that this study's differences in findings were the result of small sample size, or whether there are indeed underlying population differences. It would be highly informative to know whether the same pattern of results applies across the two groups.
Whether or not it does, our future research and practice involving PMAD design needs to address the fact that the impacts of the study were greatest for non-White players who were not recruited through gaming communities. These players played differently from web-recruited and
White players, and they were most sensitive both to the study’s overall effects and to differences
between conditions. If this is the population for whom this type of game is most effective for
addressing bias, we need to consider what this means for the possibility of making social change
through PMAD game design.
Do non-White Americans even need help changing their models of how racism and
sexism operate? And do they need interventions aimed at their attitudes? We argue that they do.
Although it is certainly valuable to create anti-bias interventions for White Americans, given their comparatively dominant position in society, they are hardly the only people worth reaching.
Sexism is an issue for people of all racial backgrounds. While some specific
manifestations of sexism differ by class and race (Williams, 2012), among other factors, non-
White Americans hold and express sexist attitudes and adopt individualistic rhetorical frames
around issues of sexism (Hunt, 1996; Hughes & Tuch, 2000). Changing these attributions and
attitudes would help in terms of personal change, in terms of supporting appropriate remedies for
systemic sexism, and in terms of changing the kinds of conversations we have about sexism in
society. Instead of talking about how individual women can “lean in,” for example, the conversations can begin to address the structural factors that make the rhetoric of leaning in necessary in the first place (Losse, 2013).
However, it is also worth designing interventions around racism for non-White
Americans. This population is not somehow exempt from the racist attitudes and rhetorics that pervade our society. For example, internalized racism and intraethnic othering mean that individuals can hold negative stereotypes about groups to which they belong, police their own performance of ethnic identity, and acquiesce to the structures of racial oppression that disadvantage them (Steele & Aronson, 1995; Pyke & Dang, 2003; Speight, 2007). Additionally, different non-White groups can hold negative views about each other. If we can change the attitudes of non-White Americans, that is worthwhile both for reducing internalized racism and for reducing racism between non-dominant groups.
Additionally, changing the attribution styles of non-White Americans can be a powerful way to expand the reach of existing groups that work for social change. As described earlier in this dissertation, people with an individualistic model are less likely to support interventions that address the systemic elements of bias (Iyengar, 1989; Iyengar, 1994; Hughes & Tuch, 2000; Lau
& Sears, 1981). Changing attributions can help anti-racist groups recruit people who share their vision about the structural changes necessary for a just world. It can also give people a tool to strike back against the dominant rhetoric of color-blindness, not by arguing with its details but by deconstructing the basic premises of individualism on which it relies. Changing the attribution styles of non-White players can be seen as giving them more ways to articulate challenges to the dominant narratives of race in American society.
Finally, we can think about how to generalize from the characteristics of this particular group of players. By understanding better why non-White players in the Mechanical Turk group were affected by the game, we may be able to make testable inferences about who else might be affected. There may be underlying commonalities, beyond racial identification, that some White players share with the affected group and that would allow the game to affect them as well.
In conclusion, while this study’s limitations, as delineated earlier, may have affected its outcomes, it provides impetus and direction for future work in an important area that can be addressed through a “games for impact” perspective. The study addressed the difficult and socially relevant problem of prejudice by proposing and testing a new approach to creating entertainment-based interventions, namely PMAD theory. Although PMAD theory has not been demonstrated to be fully effective in its current form, it gives the field of entertainment-based prejudice-reduction interventions a more theoretically grounded approach on which to build. The modifications to the PMAD design principles suggested in this dissertation could refine the theory in ways that result in effective designs for addressing social problems that stem from individuals’ perceptions and beliefs. Additionally, this study found that some players, namely non-White players in the Mechanical Turk group, were sensitive to the differences between versions of the game. This suggests that further research into whether PMAD theory differentially persuades non-White players about the nature of prejudice could be productive, and may ultimately yield knowledge about designing the most effective game-based interventions for various populations when addressing beliefs about prejudice.
186 References
Casual Games Association (2007). 2007 casual games report. Retrieved from https://dl.dropboxusercontent.com/u/3698805/Reports/CasualGamesMarketReport- 2007.pdf Adams, M., Bell, L. A., & Griffin, P. (Eds.). (2007). Teaching for diversity and social justice. London, UK: Routledge. Allport, G. W. (1979). The nature of prejudice: 25th anniversary edition. New York, NY: Basic Books. Arlin, M., & Roth, G. (1978). Pupils’ use of time while reading comics and books. American Educational Research Journal, 15(2), 201-216. Banaji, M. R., & Greenwald, A. G. (1995). Implicit gender stereotyping in judgments of fame. Journal of Personality and Social Psychology, 68(2), 181–98. Batanero, C., & Sanchez, E. (2005). What is the nature of high school students’ conceptions and misconceptions about probability? In G. Jones (Ed.), Exploring probability in school (pp. 241–266). New York, NY: Springer. Bechdel Test Movie List. (2010). Retrieved November 8, 2010, from http://bechdeltest.com/ Beichner, R. J. (1996). The impact of video motion analysis on kinematics graph interpretation skills. American Journal of Physics, 64, 1272–1278. Bell, M., & Davidson, C. (1976). Relationships between pupil-on-task-performance and pupil achievement. The Journal of Educational Research, 69(5), 172-176. Bertrand, M., & Mullainathan, S. (2003). Are Emily and Greg more employable than Lakisha and Jamal? A field experiment on labor market discrimination. (NBER Working Paper No 9873). Cambridge, MA: National Bureau of Economic Research. Retrieved from http://www.nber.org/papers/w9873 Bidell, T. R., Lee, E. M., Bouchie, N., Ward, C., & Brass, D. (1994). Developing conceptions of racism among young white adults in the context of cultural diversity coursework. Journal of Adult Development, 1(3), 185–200. Bjork, S., & Holopainen, J. (2004). Patterns in game design. Cambridge, MA: Charles River Media. Blanchard, F. A., Lilly, T., & Vaughn, L. A. (1991). Reducing the expression of racial prejudice. 
Psychological Science, 2(2), 101–105. Blizzard Entertainment. (2004). World of warcraft. [PC game]. Irvine, CA: Blizzard Entertainment. Bogost, I. (2006). Unit operations: An approach to videogame criticism. Cambridge, MA: The MIT Press.
187 Bogost, I. (2007). Persuasive games: The expressive power of videogames. Cambridge, MA: The MIT Press. Bonilla-Silva, E., & Forman, T. A. (2000). “I am not a racist but...”: Mapping White college students’ racial ideology in the USA. Discourse & Society, 11(1), 50–85. Bonilla-Silva, E. (2006). Racism without racists: Color-blind racism and the persistence of racial inequality in the United States (2nd ed.). Lanham, MD: Rowman & Littlefield. Brondolo, E., Brady, N., Thompson, S., Tobin, J. N., Cassells, A., Sweeney, M., McFarlane, D., et al. (2008). Perceived racism and negative affect: Analyses of trait and state measures of affect in a community sample. Journal of Social and Clinical Psychology, 27(2), 150– 173. Burke, L. L. (2012). Dog eat dog. [Tabletop roleplaying game]. Oakland, CA: Liwanag Press. Buser, J. K. (2009). Treatment-seeking disparity between African Americans and Whites: Attitudes toward treatment, coping resources, and racism. Journal of Multicultural Counseling and Development, 37(2), 94–104. Bush, G. W. (2010). Decision Points. New York, NY: Crown. Champagne, A. B., Klopfer, L. E., & Anderson, J. H. (1980). Factors influencing the learning of classical mechanics. American Journal of Physics, 48(12), 1074-1079. Chen, J. & Clark, N. (2006). Flow. [Flash game]. Los Angeles, CA: thatgamecompany. Chi, M. T. H., de Leeuw, N., Chiu, M. H., & LaVancher, C. (1994). Eliciting self-explanation improves understanding. Cognitive Science, 18, 439–477. Chi, M. T. H., & Roscoe, R. (2002). The processes and challenges of conceptual change. In M. Limon & L. Mason (Eds.), Reconsidering conceptual change: Issues in theory and practice (3–27). New York, NY: Springer. Chi, M. T. H. (2008). Three types of conceptual change: Belief revision, mental model transformation, and categorical shift. In S. Vosniadou (Ed.), Handbook of research on conceptual change (pp. 61-82). Hillsdale, NJ: Erlbaum. Chi, M. T. H. (2005). 
Commonsense conceptions of emergent processes: Why some misconceptions are robust. Journal of the Learning Sciences, 14(2), 161–199. Chinn, C. A., & Malhotra, B. A. (2002). Children’s responses to anomalous scientific data : How is conceptual change impeded ? Journal of Educational Psychology, 94(2), 327–343. Chinn, C. A., & Brewer, W. F. (1993). The role of anomalous data in knowledge acquisition: A theoretical framework and implications for science instruction. Review of Educational Research, 63(1), 1–49. Choi, I., Nisbett, R. E., & Norenzayan, A. (1999). Causal attribution across cultures: Variation and universality. Psychological Bulletin, 125(1), 47-63. Church, D. (1999). Formal abstract design tools. Game Developer, 6(8), 44–50.
188 Ciavarro, C. (2008). Implicit learning as a design strategy for learning games: Alert Hockey. Computers in Human Behavior, 24(6), 2862–2872. Cobb, J. (1972). Relationship of discrete classroom behaviors to fourth-grade academic achievement. Journal of Educational Psychology, 63(1), 74-80. Consalvo, M. (2009). There is no magic circle. Games & Culture, 4(4), 408-417. Cook, D. (2006). What are game mechanics? [Web log message]. Retrieved from http://www.lostgarden.com/2006/10/what-are-game-mechanics.html Copier, M. (2005). Connecting worlds: Fantasy role-playing games, ritual acts and the magic circle. Proceedings of the DIGRA 2005 Conference: Changing Views - Worlds in Play. Copier, M. (2007). Beyond the magic circle : A network perspective on role-play in online games. (Doctoral dissertation). Retrieved from http://dspace.library.uu.nl/handle/1874/21958 Cortina, L. M., & Kubiak, S. P. (2006). Gender and posttraumatic stress: sexual violence as an explanation for women’s increased risk. Journal of abnormal psychology, 115(4), 753–9. Costikyan, G. (2002). I have no words & I must design: Toward a critical vocabulary for games. Proceedings from Computer Games and Digital Cultures Conference. Tampere, Finland: Tampere University Press. Retrieved from http://www.digra.org/digital- library/publications/i-have-no-words-i-must-design-toward-a-critical-vocabulary-for- games/ Craik, F. I. M., & Tulving, E. (1975). Depth of processing and the retention of words in episodic memory. Journal of Experimental Psychology, 104, 268–294. Crandall, C. S., & Stangor, C. (2005). Conformity and prejudice. In J. F. Dovidio, P. Glick, & L. A. Rudman (Eds.), On the nature of prejudice: Fifty years after Allport (pp. 295–309). Malden, MA: Blackwell. Diana Jones Awards, The (2013). The shortlist for the 2013 Diana Jones award for excellence in gaming. (2013). Retrieved from http://www.dianajonesaward.org/13nominees.html Deci, E., Koestner, R., & Ryan, R. (1999). 
A meta-analytic review of experiments examining the effects of extrinsic rewards on intrinsic motivation. Psychological Bulletin, 125(6), 627- 668. Deci, E., Koestner, R., & Ryan, R. (2001). Extrinsic rewards and intrinsic motivation in education: Reconsidered once again. Review of Educational Research, 71(1), 1-27. De Jong, T., & Van Joolingen, W. R. (1998). Scientific discovery learning with computer simulations of conceptual domains. Review of Educational Research, 68(2), 179–201. Deterding, S. (2011). Situated motivational affordances of game elements: A conceptual model. Proceedings from CHI 2011. Retrieved from http://gamification-research.org/wp- content/uploads/2011/04/09-Deterding.pdf Devine, P. (1989). Stereotypes and prejudice: Their automatic and controlled components. Journal of Personality and Social Psychology, 56(1), 5-18.
189 Doerr, H. (1996). STELLA ten years later: A review of the literature. International Journal of Computers for Mathematical Learning, 1(2), 201-224. Dovidio, J. F., Glick, P., & Budman, L. A. (Eds.). (2005). On the nature of prejudice: Fifty years after Allport. Hoboken, NJ: Wiley-Blackwell. Duckitt, J. (1994). The social psychology of prejudice. Westport, CT: Praeger. Elverdam, C., & Aarseth, E. (2007). Game classification and game design: Construction through critical analysis. Games and Culture, 2(3), 3-22. Feagin, J. R. (2001). Racist America: Roots, current realities and future reparations. London, UK: Routledge. Feagin, J. R. (2006). Systemic racism: A theory of oppression. London, UK: Routledge. Fels, A. (2004). Necessary dreams: Ambition in women’s changing lives. New York, NY: Pantheon. Finch, B. K., Kolody, B., & Vega, W. A. (2000). Perceived discrimination and depression among Mexican-origin adults in California. Journal of Health and Social Behavior, 41(3), 295– 313. Firaxis Games Inc. (2005). Sid Meier's Civilization IV [PC game]. New York, NY: Take-Two Interactive Software. Forster, E. M. (1956). Aspects of the novel. Boston, MA: Mariner Books Forrester, J. (1961). Industrial dynamics. New York, NY: Productivity Press. Forrester , J. (1992). System dynamics and learner-centered education in kindergarten through 12th grade education. Retrieved from http://www.mitocw.espol.edu.ec/courses/sloan- school-of-management/15-988-system-dynamics-self-study-fall-1998-spring- 1999/readings/learning.pdf Fox, H. (2001). “When race breaks out:” Conversations about race and racism in college classrooms. New York, NY: Peter Lang Publishing. Gaertner, S. L., & Dovidio, J. F. (1986). The aversive form of racism. In J. F. Dovidio & S. L. Gaertner (Eds.), Prejudice, discrimination and racism: Theory and research (pp. 61–89). Orlando, FL: Academic Press. Gaertner, S., & Dovidio, J. (1977). The subtlety of white racism, arousal, and helping behavior. 
Journal of Personality and Social Psychology, 35(10), 691-707. Gee, J. P. (2003). What video games have to teach us about learning and literacy. New York, NY: Palgrave Macmillan. Gentner, D., & Stevens, A. L. (1983). Mental models. London, UK: Routledge. Gershenfeld, A. (2010). Computer and Video Games That Engage, Educate and Empower. Retrieved September 29, 2010, from http://elineventures.com/static/GameOnTexas-Alan Gershenfeld.pdf
190 Goffman, I. (1974). Frame analysis: An essay on the organization of experience. Cambridge, MA: Harvard University Press. Goldin, C., & Rouse, C. (1997). Orchestrating Impartiality: The Impact of “Blind” Auditions on Female Musicians (NBER Working Paper No 5903). Cambridge, MA: National Bureau of Economic Research. Retrieved from http://www.nber.org/papers/w5903 Gomez, B., & Wilson, J. (2006). Rethinking symbolic racism: Evidence of attribution bias. The Journal of Politics, 68(3), 611-625. Gorsky, P., & Finegold, M. (1992). Using computer simulations to restructure students’ conceptions of force. Journal of Computers in Mathematics and Science Teaching, 11, 163–178. Gray, K. L. (2012). Deviant bodies, stigmatized identities, and racist acts: examining the experiences of African-American gamers in XBox Live. New Review of Hypermedia and Multimedia, 18(4), 261-276. Greenwald, A., McGhee, D., & Schwartz, J. (1998). Measuring individual differences in implicit cognition: The implicit association test. Journal of Personality and Social Psychology, 74(6), 1464-80. Grube, J., Mayton , D., & Ball-Rokeach, S. (1994). Inducing change in values, attitudes, and behaviors: Belief system theory and the method of value self-confrontation. Journal of Social Issues, 50(4), 153–173. Hammer, J., & Black, J. B. (2009). Games and (preparation for future) learning. Educational Technology, 49(2), 29–34. Harrell, J. P., Hall, S., & Taliaferro, J. (2003). Physiological responses to racism and discrimination: An assessment of the evidence. American Journal of Public Health, 93(2), 243–8. Hausmann, R., Tyson, L. D., & Zahidi, S. (2010). Global Gender Gap Report. World Economic Forum. Retrieved from http://www.weforum.org/pdf/gendergap/report2010.pdf Heeter, C., Lee, Y., Magerko, B., & Medler, B. (2011). Impacts of forced serious game play on vulnerable subgroups. International Journal of Gaming and Computer-Mediated Simulations, 3(3), 34-53. Hegewisch, A., & Edwards, A. (2011). 
The gender wage gap: 2011. Retrieved from http://www.iwpr.org/publications/pubs/the-gender-wage-gap-2011 Henry, P. J., & Sears, D. O. (2007). The Symbolic Racism 2000 Scale. Political Psychology, 23(2), 253–283. Hewson, P. W., & Thorley, R. N. (1989). The conditions of conceptual change in the class-room. International Journal of Science Education, 11, 541–553. Hill, M. S., & Fischer, A. R. (2007). Examining objectification theory: Lesbian and heterosexual women’s experiences with sexual- and self-objectification. The Counseling Psychologist, 36(5), 745–776.
Hirshman, L. R. (2007). Get to work: . . . And get a life, before it's too late. New York, NY: Penguin.
Holtgraves, T. (2004). Social desirability and self-reports: Testing models of socially desirable responding. Personality and Social Psychology Bulletin, 30(2), 161–172.
Howden, L. M., & Meyer, J. A. (2011). Age and sex composition: 2010. Retrieved from http://www.census.gov/prod/cen2010/briefs/c2010br-03.pdf
Hughes, M., & Tuch, S. (2000). How beliefs about poverty influence racial policy. In D. O. Sears, J. Sidanius, & L. Bobo (Eds.), Racialized politics: The debate about racism in America (pp. 165–190). Chicago, IL: University of Chicago Press.
Huizinga, J. (1950). Homo ludens: A study of the play element in culture. Boston, MA: The Beacon Press.
Humes, K., Jones, N., & Ramirez, R. (2010). U.S. Census Bureau overview of race and Hispanic origin: 2010. Retrieved from http://www.census.gov/prod/cen2010/briefs/c2010br-02.pdf
Hunicke, R., LeBlanc, M., & Zubek, R. (2004). MDA: A formal approach to game design and game research. Proceedings of the Challenges in Game AI Workshop, Nineteenth National Conference on Artificial Intelligence. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.79.4561
IndieCade. (2013). 2013 festival games nominees. Retrieved from http://www.indiecade.com/2013/nominees/
Isbister, K., Flanagan, M., & Hash, C. (2010). Designing games for learning: Insights from conversations with designers. Proceedings of the 28th International Conference on Human Factors in Computing Systems (pp. 2041–2044). Atlanta, GA.
Iyengar, S. (1989). How citizens think about national issues: A matter of responsibility. American Journal of Political Science, 33(4), 878–900.
Iyengar, S. (1994). Is anyone responsible?: How television frames political issues. Chicago, IL: University of Chicago Press.
Jacobson, M., & Wilensky, U. (2006). Complex systems in education: Scientific and educational importance and implications for the learning sciences. Journal of the Learning Sciences, 15(1), 11–34.
Järvinen, A. (2009). Games without frontiers: Methods for game studies & design. Saarbrücken, Germany: VDM Verlag.
Johnson, S. (2002). Emergence: The connected lives of ants, brains, cities, and software. New York, NY: Scribner.
Johnson-Laird, P. N. (1994). Mental models and probabilistic thinking. Cognition, 50(1–3), 189–209.
Johsua, S., & Dupin, J. J. (1987). Taking into account student conceptions in instructional strategy: An example in physics. Cognition and Instruction, 4, 117–135.
Jonassen, D. H., & Henning, P. (1996). Mental models: Knowledge in the head and knowledge in the world. Educational Technology, 39(3), 37–42.
Jost, J. T., & Banaji, M. R. (1994). The role of stereotyping in system-justification and the production of false consciousness. British Journal of Social Psychology, 33(1), 1–27.
Jost, J. T., & Hunyady, O. (2002). The psychology of system justification and the palliative function of ideology. European Review of Social Psychology, 13, 111–153.
Jost, J. T., Banaji, M. R., & Nosek, B. A. (2004). A decade of system justification theory: Accumulated evidence of conscious and unconscious bolstering of the status quo. Political Psychology, 25, 881–919.
Jussim, L. (1991). Social perception and social reality: A reflection-construction model. Psychological Review, 98, 54–73.
Juul, J. (2003). The game, the player, the world: Looking for a heart of gameness. Proceedings from Level Up: Digital Games Research Conference. Utrecht: Utrecht University.
Juul, J. (2005). Half-real: Video games between real rules and fictional worlds. Cambridge, MA: MIT Press.
Kahneman, D., Slovic, P., & Tversky, A. (1982). Judgment under uncertainty: Heuristics and biases. Cambridge, UK: Cambridge University Press.
Kapur, M. (2008). Productive failure. Cognition and Instruction, 26(3), 379–424.
Karniol, R., & Ross, M. (1977). The effect of performance-relevant and performance-irrelevant rewards on children's intrinsic motivation. Child Development, 48(2), 482–487.
Kaufman, G. F., & Libby, L. K. (2012). Changing beliefs and behavior through experience-taking. Journal of Personality and Social Psychology, 103(1), 1–19.
Kee, K., Vaughan, T., & Graham, S. (2010). The haunted school on horror hill: A case study of interactive fiction in an elementary classroom. In Y. Baek (Ed.), Gaming for classroom-based learning: Digital role playing as a motivator of study (pp. 113–124). Cheongwon, North Chungcheong, South Korea: Korea National University of Education.
Khan, S. R. (2000). Teaching an undergraduate course on the psychology of racism. Teaching of Psychology, 28(1), 28–33.
Kirby, M., & Osterhaus, M. A. (1999). Apples to apples [Card game]. Madison, WI: Out of the Box Publishing.
Klahr, D., Dunbar, K., & Fay, A. L. (1990). Designing good experiments to test bad hypotheses. In J. Shrager & P. Langley (Eds.), Computational models of scientific discovery and theory formation (pp. 355–402). San Mateo, CA: Morgan Kaufmann.
Klimmt, C., & Hartmann, T. (2009). Effectance, self-efficacy and the motivation to play video games. In P. Vorderer & J. Bryant (Eds.), Playing video games: Motives, responses and consequences (pp. 133–145). Mahwah, NJ: Lawrence Erlbaum.
Klonis, S. C., Plant, E. A., & Devine, P. G. (2005). Internal and external motivation to respond without sexism. Personality and Social Psychology Bulletin, 31(9), 1237–1249.
Klopfer, E., Osterweil, S., & Salen, K. (2009). Moving learning games forward. Cambridge, MA. Retrieved from http://education.mit.edu/papers/MovingLearningGamesForward_EdArcade.pdf
Kluegel, J. R. (1985). If there isn't a problem, you don't need a solution. American Behavioral Scientist, 28, 761–784.
Kozol, J. (1992). Savage inequalities: Children in America's schools. New York, NY: Harper Perennial.
Kuhn, D. (1989). Children and adults as intuitive scientists. Psychological Review, 96, 674–689.
Kunda, Z. (1990). The case for motivated reasoning. Psychological Bulletin, 108, 480–498.
Kuznekoff, J., & Rose, L. (2013). Communication in multiplayer gaming: Examining player responses to gender cues. New Media & Society, 15(4), 541–556.
Lau, R., & Sears, D. (1981). Cognitive links between economic grievances and political responses. Political Behavior, 3(4), 279–302.
Lazzaro, N. (2005). Why we play games: Four keys to more emotion without story. Retrieved from http://www.xeodesign.com/xeodesign_whyweplaygames.pdf
League system. (2013). Retrieved November 27, 2013, from http://leagueoflegends.wikia.com/wiki/League_system
LeBlanc, M. (2005). Tools for creating dramatic game dynamics. In K. Salen & E. Zimmerman (Eds.), The game design reader: A rules of play anthology. Cambridge, MA: MIT Press.
Lenhart, A., Kahne, J., Middaugh, E., Macgill, A., Evans, C., & Vitak, J. (2008). Teens, video games and civics. Retrieved from http://www.pewinternet.org/Reports/2008/Teens-Video-Games-and-Civics.aspx
Lieberman, D. A. (2006). Dance games and other exergames: What the research says. Retrieved from http://www.comm.ucsb.edu/faculty/lieberman/exergames.htm
Lindley, C. A., & Mayra, F. (2002). The gameplay gestalt, narrative, and interactive storytelling. Proceedings of Computer Games and Digital Cultures Conference. Tampere, Finland: University of Tampere Press.
Linn, M. C., & Songer, N. B. (1991). Teaching thermodynamics to middle school students: What are the appropriate cognitive demands? Journal of Research in Science Teaching, 28, 885–918.
Lipsitz, G. (1995). The possessive investment in Whiteness: Racialized social democracy and the "White" problem in American Studies. American Quarterly, 47(3), 369–387.
Lipson, K. (1997). What do students gain from computer simulation exercises? In J. B. Garfield & G. Burrill (Eds.), Research on the role of technology in teaching and learning statistics (pp. 137–150). Voorburg: International Statistical Institute.
Losse, K. (2013). Feminism's tipping point. Retrieved from http://www.dissentmagazine.org/online_articles/feminisms-tipping-point-who-wins-from-leaning-in
Macrae, C., Bodenhausen, G., Milne, A., & Jetten, J. (1994). Out of mind but back in sight: Stereotypes on the rebound. Journal of Personality and Social Psychology, 67(5), 808–817.
Mandinach, E. B., & Cline, H. F. (1994). Classroom dynamics: Implementing a technology-based learning environment. Mahwah, NJ: Lawrence Erlbaum Associates.
Markham, K. M., Mintzes, J. J., & Jones, M. G. (1994). The concept map as a research and evaluation tool: Further evidence of validity. Journal of Research in Science Teaching, 31(1), 91–101.
Maxis. (2004). The sims 2 [PC game]. Redwood City, CA: Electronic Arts.
McGuire, J., & Kable, J. (2013). Rational temporal predictions can underlie apparent failures to delay gratification. Psychological Review, 120(2), 395–410.
Meadows, D. (2008). Thinking in systems. White River Junction, VT: Chelsea Green Publishing.
Michael, D., & Chen, S. (2005). Serious games: Games that educate, train & inform. Cincinnati, OH: Muska & Lipman.
Monteith, M. (1993). Self-regulation of prejudiced responses: Implications for progress in prejudice-reduction efforts. Journal of Personality and Social Psychology, 65(3), 469–485.
Morales, A. (2010). Farmville meets 1 billion Haiti charity goal, gives away free Hot Air Balloon. Retrieved from http://blog.games.com/2010/10/03/farmville-meets-1-billion-haiti-charity-goal-gives-away-free-ho/
Moreno, R., & Mayer, R. (2005). Role of guidance, reflection, and interactivity in an agent-based multimedia game. Journal of Educational Psychology, 97(1), 117–128.
Morris, M. W., Menon, T., & Ames, D. R. (2001). Culturally conferred conceptions of agency: A key to social perception of persons, groups and other actors. Personality and Social Psychology Review, 5(2), 169–182.
Moss-Racusin, C. A., Dovidio, J. F., Brescoll, V. L., Graham, M. J., & Handelsman, J. (2012). Science faculty's subtle gender biases favor male students. Proceedings of the National Academy of Sciences of the United States of America, 109(41), 16474–16479.
Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice Hall.
Nickerson, R. S. (1991). Modes and models of informal reasoning: A commentary. In J. F. Voss, D. N. Perkins, & J. W. Segal (Eds.), Informal reasoning and education (pp. 291–309). Hillsdale, NJ: Lawrence Erlbaum.
Niedderer, H., Schecker, H., & Bethge, T. (1991). The role of computer-aided modelling in learning physics. Journal of Computer Assisted Learning, 7(2), 84–95.
Nintendo EAD. (1996). Super Mario 64 [Nintendo 64 game]. Kyoto, Japan: Nintendo.
Norenzayan, A., Choi, I., & Nisbett, R. (2002). Cultural similarities and differences in social inference: Evidence from behavioral predictions and lay theories of behavior. Personality and Social Psychology Bulletin, 28(1), 109–120.
Oliver, M. L., & Shapiro, T. M. (2006). Black wealth, white wealth: A new perspective on racial inequality. New York, NY: CRC Press.
Olsen, D. G. (1999). Constructivist principles of learning and teaching. Education, 120(2).
Osborne, J., & Squires, D. (1987). Learning science through experiential software. In J. Novak (Ed.), Proceedings of the Second International Seminar on Misconceptions and Educational Strategies in Science and Mathematics (pp. 373–380). Ithaca, NY: Cornell University.
Oskamp, S. (Ed.). (2000). Reducing prejudice and discrimination. New York, NY: Taylor & Francis.
Overview of race and Hispanic origin: 2010. (2010). Retrieved November 13, 2012, from http://www.census.gov/prod/cen2010/briefs/c2010br-02.pdf
Pajitnov, A. (1984). Tetris [PC game]. Alameda, CA: Spectrum HoloByte, Inc.
Paluck, E. L. (2009). Reducing intergroup prejudice and conflict using the media: A field experiment in Rwanda. Journal of Personality and Social Psychology, 96(3), 574–587.
Paluck, E. L., & Green, D. P. (2009). Prejudice reduction: What works? A review and assessment of research and practice. Annual Review of Psychology, 60, 339–367.
Perkins, D. N., & Simmons, R. (1988). Patterns of misunderstanding: An integrative model for science, math, and programming. Review of Educational Research, 58(3), 303–326.
Pierce, C., Carew, J., Pierce-Gonzalez, D., & Willis, D. (1978). An experiment in racism: TV commercials. In C. Pierce (Ed.), Television and education (pp. 62–88). Beverly Hills, CA: Sage.
Plass, J. L., Homer, B. D., Kinzer, C., Frye, J., & Perlin, K. (2011). Learning mechanics and assessment mechanics for games for learning (G4LI White Paper No. 01/2011, Version 0.1). Retrieved from g4li.org
Valve. (2007). Portal [PC game]. Bellevue, WA: Valve Corporation.
QCF Design. (2013). Desktop dungeons [PC game]. Cape Town, South Africa: QCF Design.
Resnick, M. (1996). Beyond the centralized mindset. Journal of the Learning Sciences, 5, 1–22.
Richardson, G. (1999). Feedback thought in social science and systems theory. Westford, MA: Pegasus Communications.
Riot Games. (2009). League of legends [PC game]. Santa Monica, CA: Riot Games.
Riot Games Blog. (2012). League of legends' growth spells bad news for Teemo. Retrieved from http://www.riotgames.com/articles/20121015/138/league-legends-growth-spells-bad-news-teemo
Rokeach, M. (1973). The nature of human values. New York, NY: Free Press.
Roth, K. J. (1990). Developing meaningful conceptual understanding in science. In B. F. Jones & L. Idol (Eds.), Dimensions of thinking and cognitive instruction. London, UK: Routledge.
Rovio Entertainment. (2009). Angry birds [iOS mobile game]. Espoo, Finland: Rovio Entertainment.
Rovio Entertainment reports 2012 financial results. (2013). Retrieved November 27, 2013, from http://www.rovio.com/en/news/press-releases/284/rovio-entertainment-reports-2012-financial-results/
Ryan, R., & Deci, E. (2000). Intrinsic and extrinsic motivations: Classic definitions and new directions. Contemporary Educational Psychology, 25, 54–67.
Salen, K., & Zimmerman, E. (2003). Rules of play: Game design fundamentals. Cambridge, MA: The MIT Press.
Salen, K., & Zimmerman, E. (2005). The game design reader: A rules of play anthology. Cambridge, MA: The MIT Press.
Salvatore, J., & Shelton, J. N. (2007). Cognitive costs of exposure to racial prejudice. Psychological Science, 18(9), 810–815.
Sawyer, B. (2010). Serious Games Initiative. Retrieved November 10, 2010, from http://www.seriousgames.org/
Sawyer, R. K. (2005). Social emergence: Societies as complex systems. Cambridge, UK: Cambridge University Press.
Schelling, T. C. (1971). Dynamic models of segregation. Journal of Mathematical Sociology, 1, 143–186.
Schmidt, S. L. (2005). More than men in white sheets: Seven concepts critical to the teaching of racism as systemic inequality. Equity & Excellence in Education, 38(2), 110–122.
Schuman, H., Steeh, C., Bobo, L. D., & Krysan, M. (1998). Racial attitudes in America: Trends and interpretations (Rev. ed.). Cambridge, MA: Harvard University Press.
Sears, D. O. (2004). A perspective on implicit prejudice from survey research. Psychological Inquiry, 15(4), 293–297.
Sears, D. O., & Jessor, T. (1996). Whites' racial policy attitudes: The role of White racism. Social Science Quarterly, 77(4).
Sibley, C. G., & Duckitt, J. (2008). Personality and prejudice: A meta-analysis and theoretical review. Personality and Social Psychology Review, 12(3), 248–279.
Sidanius, J., & Veniegas, R. (2000). Gender and race discrimination: The interactive nature of disadvantage. In S. Oskamp (Ed.), Reducing prejudice and discrimination (pp. 47–69). Mahwah, NJ: Lawrence Erlbaum Associates.
Song, J. (1998). Lineage [PC game]. Seoul, South Korea: NCsoft.
Speight, S. (2007). Internalized racism: One more piece of the puzzle. The Counseling Psychologist, 35(1), 126–134.
Squire, K., & Barab, S. (2004). Replaying history: Engaging urban underserved students in learning world history through computer simulation games. Proceedings of the 6th International Conference on Learning Sciences. International Society of the Learning Sciences. Retrieved from http://portal.acm.org/citation.cfm?id=1149126.1149188
Squire, K., & Durga, S. (2005). Productive gaming: The case for historiographic game play. In R. Ferdig (Ed.), The handbook of research on effective electronic gaming. Hershey, PA: IGI Global.
Stanley, D., Sokol-Hessner, P., Banaji, M., & Phelps, E. (2011). Implicit race attitudes predict trustworthiness judgments and economic trust decisions. Proceedings of the National Academy of Sciences of the United States of America, 108(19), 7710–7715.
Steele, C. M. (1997). A threat in the air: How stereotypes shape intellectual identity and performance. American Psychologist, 52(6), 613–629.
Steele, C., & Aronson, J. (1995). Stereotype threat and the intellectual test performance of African Americans. Journal of Personality and Social Psychology, 69(5), 797–811.
Steinkuehler, C. (2008). Cognition and literacy in massively multiplayer online games. In J. Coiro, M. Knobel, C. Lankshear, & D. Leu (Eds.), Handbook of research on new literacies (pp. 611–634). Mahwah, NJ: Erlbaum.
Steinpreis, R., Anders, K. A., & Ritzke, D. (1999). The impact of gender on the review of the curricula vitae of job applicants and tenure candidates: A national empirical study. Sex Roles, 41(7–8), 509–528.
Stephan, W. G., & Stephan, C. W. (2000). An integrated threat theory of prejudice. In S. Oskamp (Ed.), Reducing prejudice and discrimination (pp. 23–46). Mahwah, NJ: Lawrence Erlbaum.
Strange, J. J. (2002). How fictional tales wag real-world beliefs. In Narrative impact: Social and cognitive foundations (pp. 263–286). London, UK: Routledge.
Strange, S. (2011, March). Tension maps. Game Developer Magazine, 18(3), 7–11.
Sue, D. W. (2010). Microaggressions in everyday life: Race, gender, and sexual orientation. Hoboken, NJ: Wiley.
Sue, D. W., & Capodilupo, C. M. (2008). Racial, gender, and sexual orientation microaggressions: Implications for counseling and psychotherapy. In D. W. Sue & D. Sue (Eds.), Counseling the culturally diverse: Theory and practice. Hoboken, NJ: Wiley.
Suits, B. (2005). The grasshopper: Games, life and utopia. Peterborough, Ontario: Broadview Press.
Swim, J. K., Aikin, K. J., Hall, W. S., & Hunter, B. A. (1995). Sexism and racism: Old-fashioned and modern prejudices. Journal of Personality and Social Psychology, 68(2), 199–214.
Swim, J. K., Hyers, L. L., Cohen, L. L., & Ferguson, M. J. (2001). Everyday sexism: Evidence for its incidence, nature, and psychological impact from three daily diary studies. Journal of Social Issues, 57(1), 31–53.
Taylor, M. C. (1998). How White attitudes vary with the racial composition of local populations: Numbers count. American Sociological Review, 63, 512–535.
Taylor, R., & Chi, M. (2006). Simulation versus text: Acquisition of implicit and explicit information. Journal of Research in Science Teaching, 35(3), 289–313.
Teach with Portals. (n.d.). Retrieved November 13, 2012, from http://www.teachwithportals.com/
Tesser, A., & Shaffer, D. R. (1990). Attitudes and attitude change. Annual Review of Psychology, 41, 479–523.
Tversky, A., & Kahneman, D. (1991). Loss aversion in riskless choice: A reference-dependent model. The Quarterly Journal of Economics, 106(4), 1039–1061.
U.S. Census Bureau. (n.d.). Genealogy data: Frequently occurring surnames from Census 1990 – Names files. Retrieved from http://www.census.gov/genealogy/www/data/1990surnames/names_files.html
Uhlmann, E. L., & Cohen, G. L. (2005). Constructed criteria: Redefining merit to justify discrimination. Psychological Science, 16(6), 474–480.
Uziel, L. (2010). Rethinking social desirability scales: From impression management to interpersonally oriented self-control. Perspectives on Psychological Science, 5(3), 243–262.
Valian, V. (1999). The cognitive bases of gender bias. Brooklyn Law Review, 65(4), 1037–1061.
Vie, S. (2008). Tech writing, meet Tomb Raider: Video and computer games in the technical communication classroom. E-Learning and Digital Media, 5(2), 157–166.
Von Bertalanffy, L. (1950). An outline of general system theory. British Journal for the Philosophy of Science, 1, 134–165.
Wallace, J. D., & Mintzes, J. J. (1990). The concept map as a research tool: Exploring conceptual change in biology. Journal of Research in Science Teaching, 27(10), 1033–1052.
Watson, B., & Konicek, R. (1990). Teaching for conceptual change: Confronting children's experience. Phi Delta Kappan, 71, 680–685.
Wenneras, C., & Wold, A. (1997). Nepotism and sexism in peer review. Nature, 387, 341–343.
White, B. Y. (1993). ThinkerTools: Causal models, conceptual change, and science education. Cognition and Instruction, 10, 1–100.
Wilensky, U., & Resnick, M. (1999). Thinking in levels: A dynamic systems approach to making sense of the world. Journal of Science Education and Technology, 8(1), 3–19.
Williams, D., Martins, N., Consalvo, M., & Ivory, J. (2009). The virtual census: Representations of gender, race and age in video games. New Media & Society, 11(5), 815–834.
Williams, J. (2012). Reshaping the work-family debate: Why men and class matter. Cambridge, MA: Harvard University Press.
Windschitl, M., & Andre, T. (1998). Using computer simulations to enhance conceptual change: The roles of constructivist instruction and student epistemological beliefs. Journal of Research in Science Teaching, 35(2), 145–160.
Zietsman, A. I., & Hewson, P. W. (1986). Effect of instructing using microcomputer simulations and conceptual change strategies on science learning. Journal of Research in Science Teaching, 23, 27–39.
Zillmann, D. (1991). Affect from bearing witness to the emotions of others. In Responding to the screen: Reception and reaction processes. Hillsdale, NJ: Lawrence Erlbaum.
Appendix A: Name Selection
The hundred most common last names for each of the four racial/ethnic groups used in the game were selected using US Census data, as provided by Mongabay
(http://names.mongabay.com). To remove ambiguity, any last name that appeared on more than one list was removed from all lists. For groups where this brought the total under one hundred last names, the next fifty most common last names were added. Again, any last name appearing on more than one list was removed from all lists. This process was repeated until every list contained at least one hundred last names.
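This iterative deduplicate-and-replenish procedure can be expressed compactly. The sketch below is illustrative only; the function and variable names are mine, not from the study materials, and it assumes each group's ranked name list contains no internal duplicates.

```python
def build_disjoint_lists(ranked, target, step):
    """Build per-group name lists with no name shared across groups.

    ranked: dict mapping group -> names in descending frequency order
    target: minimum number of names each final list must contain
    step:   how many additional names to draw when a list falls short
            (fifty for last names, ten for first names in the study)
    """
    take = {group: target for group in ranked}
    while True:
        # Draw the current top slice of each group's ranked names.
        lists = {g: names[:take[g]] for g, names in ranked.items()}
        # Identify names that appear on more than one list.
        seen, ambiguous = set(), set()
        for names in lists.values():
            for name in names:
                (ambiguous if name in seen else seen).add(name)
        # Remove ambiguous names from every list.
        lists = {g: [n for n in names if n not in ambiguous]
                 for g, names in lists.items()}
        short = [g for g, names in lists.items() if len(names) < target]
        if not short:
            return lists
        for g in short:
            # Extend the short lists and repeat the whole process.
            if take[g] >= len(ranked[g]):
                raise ValueError(f"not enough unambiguous names for {g}")
            take[g] += step
```

Because removed names are re-checked on every pass, a name shared across groups stays excluded no matter how far down the ranked lists the procedure has to reach.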
Frequency data on first names by ethnicity were not available from the US Census, so we turned to alternative sources. Lists were generated as follows:
• Black: the site http://www.babynamesworld.com reports the most popular names for
African American children in 2011.
• Hispanic: the site http://www.babycenter.com reports the most popular names for
Hispanic children in 2011.
• Asian and White: lists of popular Asian and White names were generated using the
randomNames package for R, which draws on the 2010 US Census data. (Package
available at http://cran.r-project.org/web/packages/randomNames/).
Fifty male and fifty female names were selected for each group. To remove ambiguity, any first name that appeared on more than one list was removed from all lists. For lists with fewer than fifty names remaining, ten names were added and the process repeated until all lists had at least fifty names.
Appendix B: Sexism Attribution Test
For the following questions, answers 1 and 2 are individual answers, while answers 3 and 4
are systemic. Questions are matched to questions with the same number on the race questions. To
score the test, sum the number of systemic answers; a higher score indicates more systemic
thinking about sexism.
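The scoring rule above reduces to counting answers 3 and 4. A minimal sketch (the function name is illustrative, not from the study materials):

```python
# Answers 1 and 2 are individual attributions; 3 and 4 are systemic.
SYSTEMIC_ANSWERS = {3, 4}

def systemic_score(answers):
    """Count systemic attributions among the chosen answers (each 1-4).

    With ten questions, the score ranges from 0 (fully individual)
    to 10 (fully systemic).
    """
    return sum(1 for a in answers if a in SYSTEMIC_ANSWERS)
```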
Question 1
The anthology "Best Short Stories By New Writers" is compiled by a single editor. The
editor chooses stories for the anthology from obscure small-press literary magazines.
This year's anthology contains only stories by male writers.
Which of the following explanations would you consider most likely to be true?
1. The editor gave special preference to stories by male authors.
2. The editor selected the best stories without considering gender.
3. Female authors are underrepresented in small-press literary magazines.
4. Male authors are better at writing stories that appeal to an audience of sophisticated,
literary readers.
Question 2
Professor Jones teaches an advanced mathematics seminar at a public university. The seminar requires a lower-level prerequisite, as well as permission from Professor Jones, to enroll.
Only half the students who apply for the seminar are admitted.
This year, there are no female students in Professor Jones' course.
Which of the following explanations would you consider most likely to be true?
1. Professor Jones discouraged female students from taking his classes.
2. Professor Jones selected the best students for his seminar, regardless of gender.
3. Female students are underrepresented in the lower-level math class required to study with
Professor Jones.
4. The female brain is not wired for mathematical excellence, so fewer women are good
enough to study with Professor Jones.
Question 3
Every year, the president of a small country nominates several high-level army officers to become generals.
This year, all the nominees are male.
Which of the following explanations would you consider most likely to be true?
1. The president gave preferential treatment to the male candidates.
2. The president chose the best candidates available, regardless of gender.
3. Women have fewer opportunities to reach the top ranks of the military.
4. Since most of the lower-level officers are male, they would be most effectively led by
male generals.
Question 4
Every year, the hockey coach at Middlebrook College offers scholarships to promising young hockey players. The coach chooses from college-bound high-school students who have demonstrated skill at hockey.
This year, all the scholarships go to male hockey players.
Which of the following explanations would you consider most likely to be true?
1. The coach gave special treatment to the male players.
2. The coach picked the best players for the scholarship, regardless of gender.
3. Many more high schools have a hockey team for men than for women.
4. Students are more interested in attending the men's hockey games, so the coach
prioritized putting together a great men's team.
Question 5
A male acquaintance refers to Nancy as "babe." Nancy becomes angry.
Which of the following explanations would you consider most likely to be true?
1. The acquaintance intended to disparage Nancy for her gender.
2. Nancy overreacted to a good-humored comment.
3. Many people use the word "babe" to disparage women, even if Nancy's acquaintance did
not mean to.
4. The term "babe" may once have been used in a sexist way, but its use in other contexts is
more important.
Question 6
Alison, a female college student, rarely participates in her literature seminar. At the end of the semester, the professor gives Alison a poor grade.
Which of the following explanations would you consider most likely to be true?
1. The professor gave Alison a poor grade because of her gender.
2. Alison chose not to participate in class, so she deserved a poor grade.
3. In the past, when Alison tried to participate in class discussions, she was often ignored by
her teachers in favor of her male classmates.
4. Women are not aggressive enough to have their opinions heard in the classroom.
Question 7
Susan is a computer programmer. An acquaintance compliments Susan on how good she is with computers. Susan becomes upset.
Which of the following explanations would you consider most likely to be true?
1. The acquaintance meant that it was unusual for a woman to be good at programming.
2. Susan overreacted to a well-meaning compliment.
3. Many people assume women are bad with technology, even if Susan's acquaintance did
not.
4. Women on average are worse with technology than men, so it is reasonable to assume
that Susan would not be good at programming.
Question 8
Amelia, a fourth-grader, is tested for mathematical disabilities. The results are
ambiguous, but the school counselor assigns Amelia to a special education program.
Which of the following explanations would you consider most likely to be true?
1. The school counselor deliberately referred Amelia for special education because of her
gender.
2. The school counselor used excellent professional judgment in making a difficult call.
3. Popular stereotypes portray female students as mathematically inferior, which influenced
the counselor's recommendation.
4. Women are less mathematically adept than men, so it is reasonable to assume that
Amelia's ambiguous results mean that she has a problem.
Question 9
Mary applies for a promotion to manager. Later, she learns that her boss recommended a
male candidate with six months' less experience. Mary does not get the promotion.
Which of the following explanations would you consider most likely to be true?
1. Mary's boss discriminated against her because of her gender.
2. Mary's boss recommended the most qualified candidate for the job.
3. Women are rarely portrayed as effective leaders, which influenced Mary's boss's
judgment about who could lead the team.
4. Men do not like to be led by a woman, so it is better to promote male candidates to
manager.
Question 10
Hannah attends a costume party with a male friend. Both of them wear skimpy costumes.
As they leave the party, a police officer warns Hannah about dressing so scandalously, while leaving her male friend alone.
Which of the following explanations would you consider most likely to be true?
1. The police officer only warned Hannah about her appearance because of her gender.
2. The police officer used excellent professional judgment in deciding who to caution.
3. Women's bodies are assumed to be on display for men, leading the officer to believe the
warning was appropriate.
4. Dressing provocatively increases the chances a woman will be sexually assaulted, so the
officer behaved correctly in warning Hannah.
Appendix C: Racism Attribution Test
For the following questions, answers 1 and 2 are individual answers, while answers 3 and 4 are systemic. Questions are matched to questions with the same number on the gender questions.
To score the test, sum the number of systemic answers; a higher score indicates more systemic thinking about racism.
Question 1
The anthology "Best Short Stories By New Writers" is compiled by a single editor. The editor chooses stories for the anthology from obscure small-press literary magazines.
This year's anthology contains only stories by white writers.
Which of the following explanations would you consider most likely to be true?
1. The editor gave special preference to stories by white authors.
2. The editor selected the best stories without considering race.
3. Non-white authors are underrepresented in small-press literary magazines.
4. White authors are better at writing stories that appeal to an audience of sophisticated,
literary readers.
Question 2
Professor Jones teaches an advanced mathematics seminar at a public university. The seminar requires a lower-level prerequisite, as well as permission from Professor Jones, to enroll.
Only half the students who apply for the seminar are admitted.
This year, there are no black or Hispanic students in Professor Jones' course.
Which of the following explanations would you consider most likely to be true?
1. Professor Jones discouraged black and Hispanic students from taking his classes.
2. Professor Jones selected the best students for his seminar, regardless of race.
3. Black and Hispanic students are underrepresented in the lower-level math class required
to study with Professor Jones.
4. Black and Hispanic students prioritized fun over schoolwork in the past, leaving them
unprepared for Professor Jones' advanced classes.
Question 3
Every year, the president of a small country nominates several high-level army officers to become generals.
This year, all the nominees are white.
Which of the following explanations would you consider most likely to be true?
1. The president gave preferential treatment to the white candidates.
2. The president chose the best candidates available, regardless of race.
3. Non-whites have fewer opportunities to reach the top ranks of the military.
4. Since most of the lower-level officers are white, they would be most effectively led by
white generals.
Question 4
Every year, the hockey coach at Middlebrook College offers scholarships to promising young hockey players. The coach chooses from college-bound high-school students who have demonstrated skill at hockey.
This year, all the scholarships go to white hockey players.
Which of the following explanations would you consider most likely to be true?
1. The coach gave special treatment to the white players.
2. The coach picked the best players for the scholarship, regardless of race.
3. Because hockey is an expensive sport, white students have more opportunities to try
hockey at the high-school level than non-white students do.
4. White athletes value education more than their non-white peers, so more of them apply to
college.
Question 5
A white acquaintance refers to Jamal, a black man, and his friends as "you people." Jamal
becomes angry.
Which of the following explanations would you consider most likely to be true?
1. The acquaintance intended to disparage Jamal for his race.
2. Jamal overreacted to a good-humored comment.
3. In the past, Jamal has heard others use the term "you people" in ways that carry racist
overtones.
4. The term "you people" may once have been used in a racist way, but its use in other
contexts is more important.
Question 6
Ramon, a Hispanic college student, rarely participates in his literature seminar. At the end of the semester, the professor gives Ramon a poor grade.
Which of the following explanations would you consider most likely to be true?
1. The professor gave Ramon a poor grade because of his race.
2. Ramon chose not to participate in class, so he deserved a poor grade.
3. In the past, when Ramon tried to participate in class discussions, he was often ignored by
his teachers in favor of his white classmates.
4. Ramon grew up in a community that taught him not to value academic achievement or
class participation.
Question 7
Jin, an Asian man, speaks English as his first language. An acquaintance compliments Jin on how well he speaks English. Jin becomes upset.
Which of the following explanations would you consider most likely to be true?
1. The acquaintance meant that it was unusual for an Asian person to speak English well.
2. Jin overreacted to a well-meaning compliment.
3. In the past, Jin has encountered many people who assume he is not American-born
because of his ethnic background.
4. Many Asians are not fluent in English, so it is only reasonable to assume that Jin would
not be either.
Question 8
Tyrone, a black fourth-grader, is tested for reading disabilities. The results are ambiguous, but the school counselor assigns Tyrone to a special education program.
Which of the following explanations would you consider most likely to be true?
1. The school counselor deliberately referred Tyrone for special education because of his
race.
2. The school counselor used excellent professional judgment in making a difficult call.
3. Popular stereotypes portray black students as academically inferior, which influenced the
counselor's recommendation.
4. Tyrone's family and peers are negative role models for him, leading him to behave poorly
in the classroom and require special attention.
Question 9
Zhou, an Asian man, applies for a promotion to manager. Later, he learns that his boss recommended a white candidate with six months' less experience. Zhou does not get the promotion.
Which of the following explanations would you consider most likely to be true?
1. Zhou's boss discriminated against him because of his race.
2. Zhou's boss recommended the most qualified candidate for the job.
3. Asian men are rarely portrayed as effective leaders, which influenced Zhou's boss's
judgment about who could lead the team.
4. Zhou grew up in a culture that encourages deference and obedience, making him unsuited
for a leadership position.
Question 10
Deshaun, a black man, is hanging out with a white friend in the park. A police officer searches Deshaun for drugs but leaves his friend alone.
Which of the following explanations would you consider most likely to be true?
1. The police officer only searched Deshaun because of his race.
2. The police officer used his best professional judgment in deciding who to search.
3. Law enforcement is harsher on black men than white men for the same offense.
4. Most drug dealers are young, black and male, so police officers should focus their efforts
on young black men.
Appendix D: Attribution Test Validation Data Analysis
Question 1
Individual answers
Type | Answer | Individual | Systemic | Sig.
Gender | The editor gave special preference to stories by male authors. | N = 38 | N = 3 | .000
Gender | The editor selected the best stories without considering gender. | N = 28 | N = 14 | .050
Race | The editor gave special preference to stories by white authors. | N = 36 | N = 5 | .000
Race | The editor selected the best stories without considering race. | N = 36 | N = 5 | .000

Systemic answers

Type | Answer | Individual | Systemic | Sig.
Gender | Female authors are underrepresented in small-press literary magazines. | N = 3 | N = 38 | .000
Gender | Male authors are better at writing stories that appeal to an audience of sophisticated, literary readers. | N = 4 | N = 37 | .000
Race | Non-white authors are underrepresented in small-press literary magazines. | N = 0 | N = 41 | .000
Race | White authors are better at writing stories that appeal to an audience of sophisticated, literary readers. | N = 4 | N = 37 | .000
Question 2
Individual answers
Type | Answer | Individual | Systemic | Sig.
Gender | Professor Jones discouraged female students from taking his classes. | N = 37 | N = 4 | .000
Gender | Professor Jones selected the best students for his seminar, regardless of gender. | N = 30 | N = 11 | .004
Race | Professor Jones discouraged black and Hispanic students from taking his classes. | N = 37 | N = 4 | .000
Race | Professor Jones selected the best students for his seminar, regardless of race. | N = 33 | N = 8 | .000

Systemic answers

Type | Answer | Individual | Systemic | Sig.
Gender | Female students are underrepresented in the lower-level math class required to study with Professor Jones. | N = 4 | N = 37 | .000
Gender | The female brain is not wired for mathematical excellence, so fewer women are good enough to study with Professor Jones. | N = 8 | N = 33 | .000
Race | Black and Hispanic students are underrepresented in the lower-level math class required to study with Professor Jones. | N = 0 | N = 41 | .000
Race | Black and Hispanic students prioritized fun over schoolwork in the past, leaving them unprepared for Professor Jones' advanced classes. | N = 11 | N = 30 | .004
Question 3
Individual answers
Type | Answer | Individual | Systemic | Sig.
Gender | The president gave preferential treatment to the male candidates. | N = 39 | N = 2 | .000
Gender | The president chose the best candidates available, regardless of gender. | N = 29 | N = 12 | .012
Race | The president gave preferential treatment to the white candidates. | N = 35 | N = 6 | .000
Race | The president chose the best candidates available, regardless of race. | N = 33 | N = 8 | .000

Systemic answers

Type | Answer | Individual | Systemic | Sig.
Gender | Women have fewer opportunities to reach the top ranks of the military. | N = 4 | N = 37 | .000
Gender | Since most of the lower-level officers are male, they would be most effectively led by male generals. | N = 7 | N = 34 | .000
Race | Non-whites have fewer opportunities to reach the top ranks of the military. | N = 1 | N = 40 | .000
Race | Since most of the lower-level officers are white, they would be most effectively led by white generals. | N = 1 | N = 40 | .000
Question 4
Individual answers
Type | Answer | Individual | Systemic | Sig.
Gender | The coach gave special treatment to the male players. | N = 36 | N = 5 | .000
Gender | The coach picked the best players for the scholarship, regardless of gender. | N = 28 | N = 13 | .028
Race | The coach gave special treatment to the white players. | N = 36 | N = 5 | .000
Race | The coach picked the best players for the scholarship, regardless of race. | N = 36 | N = 5 | .000

Systemic answers

Type | Answer | Individual | Systemic | Sig.
Gender | Many more high schools have a hockey team for men than for women. | N = 1 | N = 40 | .000
Gender | Students are more interested in attending the men's hockey games, so the coach prioritized putting together a great men's team. | N = 8 | N = 33 | .000
Race | Because hockey is an expensive sport, white students have more opportunities to try hockey at the high-school level than non-white students do. | N = 1 | N = 40 | .000
Race | White athletes value education more than their non-white peers, so more of them apply to college. | N = 2 | N = 39 | .000
Question 5
Individual answers
Type | Answer | Individual | Systemic | Sig.
Gender | The acquaintance intended to disparage Nancy for her gender. | N = 36 | N = 5 | .000
Gender | Nancy overreacted to a good-humored comment. | N = 39 | N = 2 | .000
Race | The acquaintance intended to disparage Jamal for his race. | N = 32 | N = 9 | .000
Race | Jamal overreacted to a good-humored comment. | N = 35 | N = 8 | .000

Systemic answers

Type | Answer | Individual | Systemic | Sig.
Gender | Many people use the word "babe" to disparage women, even if Nancy's acquaintance did not mean to. | N = 8 | N = 31 | .000
Gender | The term "babe" may once have been used in a sexist way, but its use in other contexts is more important. | N = 9 | N = 32 | .000
Race | In the past, Jamal has heard others use the term "boy" in ways that carry racist overtones. | N = 9 | N = 32 | .000
Race | The term "boy" may once have been used in a racist way, but its use in other contexts is more important. | N = 6 | N = 35 | .000
Question 6
Individual answers
Type | Answer | Individual | Systemic | Sig.
Gender | The professor gave Alison a poor grade because of her gender. | N = 39 | N = 2 | .000
Gender | Alison chose not to participate in class, so she deserved a poor grade. | N = 40 | N = 1 | .000
Race | The professor gave Ramon a poor grade because of his race. | N = 34 | N = 7 | .000
Race | Ramon chose not to participate in class, so he deserved a poor grade. | N = 41 | N = 0 | .000

Systemic answers

Type | Answer | Individual | Systemic | Sig.
Gender | In the past, when Alison tried to participate in class discussions, she was often ignored by her teachers in favor of her male classmates. | N = 9 | N = 32 | .000
Gender | Women are not aggressive enough to have their opinions heard in the classroom. | N = 4 | N = 37 | .000
Race | In the past, when Ramon tried to participate in class discussions, he was often ignored by his teachers in favor of his white classmates. | N = 11 | N = 30 | .004
Race | Ramon grew up in a community that taught him not to value academic achievement or class participation. | N = 2 | N = 39 | .000
Question 7
Individual answers
Type | Answer | Individual | Systemic | Sig.
Gender | The acquaintance meant that it was unusual for a woman to be good at programming. | N = 29 | N = 12 | .012
Gender | Susan overreacted to a well-meaning compliment. | N = 38 | N = 3 | .000
Race | The acquaintance meant that it was unusual for an Asian person to speak English well. | N = 31 | N = 10 | .001
Race | Jin overreacted to a well-meaning compliment. | N = 37 | N = 4 | .000

Systemic answers

Type | Answer | Individual | Systemic | Sig.
Gender | Many people assume that women are bad with technology, even if Susan's acquaintance did not. | N = 6 | N = 35 | .000
Gender | Women on average are worse with technology than men, so Susan's skill at programming is surprising. | N = 5 | N = 36 | .000
Race | In the past, Jin has encountered many people who assume he is not American-born because of his ethnic background. | N = 9 | N = 32 | .000
Race | Many Asians are not fluent in English, so it is only reasonable to assume that Jin would not be either. | N = 8 | N = 33 | .000
Question 8
Individual answers
Type | Answer | Individual | Systemic | Sig.
Gender | The school counselor deliberately referred Amelia for special education because of her gender. | N = 37 | N = 4 | .000
Gender | The school counselor used excellent professional judgment in making a difficult call. | N = 35 | N = 6 | .000
Race | The school counselor deliberately referred Tyrone for special education because of his race. | N = 34 | N = 7 | .000
Race | The school counselor used excellent professional judgment in making a difficult call. | N = 39 | N = 2 | .000

Systemic answers

Type | Answer | Individual | Systemic | Sig.
Gender | Popular stereotypes portray female students as mathematically inferior, which influenced the counselor's recommendation. | N = 7 | N = 34 | .000
Gender | Women are less mathematically adept than men, so it is reasonable to assume that Amelia's ambiguous results mean that she has a problem. | N = 2 | N = 39 | .000
Race | Popular stereotypes portray black students as academically inferior, which influenced the counselor's recommendation. | N = 9 | N = 32 | .000
Race | Tyrone's family and peers are negative role models for him, leading him to behave poorly in the classroom and require special attention. | N = 7 | N = 34 | .000
Question 9
Individual answers
Type | Answer | Individual | Systemic | Sig.
Gender | Mary's boss discriminated against her because of her gender. | N = 36 | N = 5 | .000
Gender | Mary's boss recommended the most qualified candidate for the job. | N = 34 | N = 7 | .000
Race | Zhou's boss discriminated against him because of his race. | N = 37 | N = 4 | .000
Race | Zhou's boss recommended the most qualified candidate for the job. | N = 38 | N = 3 | .000

Systemic answers

Type | Answer | Individual | Systemic | Sig.
Gender | Women are rarely portrayed as effective leaders, which influenced Mary's boss's judgment about who could lead the team. | N = 8 | N = 33 | .000
Gender | Men do not like to be led by a woman, so it is better to promote male candidates to manager. | N = 5 | N = 36 | .000
Race | Asian men are rarely portrayed as effective leaders, which influenced Zhou's boss's judgment about who could lead the team. | N = 6 | N = 35 | .000
Race | Zhou grew up in a culture that encourages deference and obedience, making him unsuited for a leadership position. | N = 3 | N = 38 | .000
Question 10
Individual answers
Type | Answer | Individual | Systemic | Sig.
Gender | The police officer only warned Hannah about her appearance because of her gender. | N = 34 | N = 7 | .000
Gender | The police officer used excellent professional judgment in deciding who to caution. | N = 33 | N = 8 | .000
Race | The police officer only searched Deshaun because of his race. | N = 31 | N = 10 | .001
Race | The police officer used his best professional judgment in deciding who to search. | N = 34 | N = 7 | .000

Systemic answers

Type | Answer | Individual | Systemic | Sig.
Gender | Women's bodies are assumed to be on display for men, leading the officer to believe the warning was appropriate. | N = 8 | N = 33 | .000
Gender | Dressing provocatively increases the chances a woman will be sexually assaulted, so the officer behaved correctly in warning Hannah. | N = 5 | N = 36 | .000
Race | Law enforcement is harsher on black men than white men for the same offense. | N = 2 | N = 39 | .000
Race | Most drug dealers are young, black and male, so police officers should focus their efforts on young black men. | N = 0 | N = 41 | .000
Appendix E: Modern Sexism Scale (adapted)

Questions 1, 3, 4, 5, 6, and 8 should be coded from “Strongly Disagree” = 5 to “Strongly Agree” = 1. Questions 2 and 7 should be reverse-coded, from “Strongly Disagree” = 1 to “Strongly Agree” = 5.
Sum the scores for all questions to get the overall score. A higher score indicates less evidence of bias.
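The coding scheme above can be expressed as a short Python sketch (illustrative only; the function name and the dictionary-based input format are assumptions on my part). Responses are assumed to be recorded as the displayed option number, 1 = “Strongly agree” through 5 = “Strongly disagree”:

```python
REVERSE_CODED = {2, 7}  # items where agreement reflects awareness of sexism

def score_modern_sexism(responses):
    """Score the 8-item adapted Modern Sexism Scale.

    `responses` maps question number (1-8) to the chosen option
    (1 = "Strongly agree" ... 5 = "Strongly disagree").
    Returns the total score (8-40); higher = less evidence of bias.
    """
    total = 0
    for question in range(1, 9):
        option = responses[question]
        if not 1 <= option <= 5:
            raise ValueError(f"question {question}: option {option} out of range")
        # Questions 2 and 7 are reverse-coded: "Strongly agree" scores 5.
        total += (6 - option) if question in REVERSE_CODED else option
    return total
```

For example, answering “Strongly disagree” to every item yields 5 on each of the six directly coded items and 1 on the two reverse-coded items, for a total of 32.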
1. Discrimination against women is no longer a problem in the United States.
1 – Strongly agree
2 – Somewhat agree
3 – Neither agree nor disagree
4 – Somewhat disagree
5 – Strongly disagree
2. Women often miss out on good jobs due to sexual discrimination.
1 – Strongly agree
2 – Somewhat agree
3 – Neither agree nor disagree
4 – Somewhat disagree
5 – Strongly disagree
3. It is rare to see women treated in a sexist manner on television.
1 – Strongly agree
2 – Somewhat agree
3 – Neither agree nor disagree
4 – Somewhat disagree
5 – Strongly disagree
4. On average, people in our society treat husbands and wives equally.
1 – Strongly agree
2 – Somewhat agree
3 – Neither agree nor disagree
4 – Somewhat disagree
5 – Strongly disagree
5. Society has reached the point where women and men have equal opportunities for achievement.
1 – Strongly agree
2 – Somewhat agree
3 – Neither agree nor disagree
4 – Somewhat disagree
5 – Strongly disagree
6. It is easy to understand the anger of women's groups in America.
1 – Strongly agree
2 – Somewhat agree
3 – Neither agree nor disagree
4 – Somewhat disagree
5 – Strongly disagree
7. It is easy to understand why women's groups are still concerned about societal limitations of women's opportunities.
1 – Strongly agree
2 – Somewhat agree
3 – Neither agree nor disagree
4 – Somewhat disagree
5 – Strongly disagree
8. Over the past few years, the government and news media have been showing more concern about the treatment of women than is warranted by women's actual experiences.
1 – Strongly agree
2 – Somewhat agree
3 – Neither agree nor disagree
4 – Somewhat disagree
5 – Strongly disagree
Appendix F: Symbolic Racism Test (adapted)

Questions 1, 2, 4, and 8 should be coded from “Strongly Disagree” = 4 to “Strongly Agree” = 1. Questions 5, 6, and 7 should be coded from “Strongly Disagree” = 1 to “Strongly Agree” = 4. Question 3 should be coded as follows:
– “Very much too fast” = 1
– “Moving at about the right speed” = 2
– “Going too slowly” = 3
Sum the scores for each question to get an overall score. A higher score indicates less evidence of bias.
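As with the sexism scale, the mixed coding rules can be sketched in Python (an illustration with assumed names and input format, not part of the original materials). Note that question 3's coded values differ from its displayed option order; the mapping below is my reading of the coding instructions above:

```python
DIRECT = {1, 2, 4, 8}    # "Strongly agree" = 1 ... "Strongly disagree" = 4
REVERSED = {5, 6, 7}     # "Strongly agree" = 4 ... "Strongly disagree" = 1
# Question 3 options as displayed: 1 = "too fast", 2 = "too slowly",
# 3 = "right speed"; coded as too fast = 1, right speed = 2, too slowly = 3.
Q3_SCORES = {1: 1, 2: 3, 3: 2}

def score_symbolic_racism(responses):
    """Score the 8-item adapted Symbolic Racism test.

    `responses` maps question number (1-8) to the displayed option number.
    Returns the total score; higher = less evidence of bias.
    """
    total = 0
    for question in range(1, 9):
        option = responses[question]
        if question == 3:
            total += Q3_SCORES[option]
        elif question in DIRECT:
            total += option
        else:  # REVERSED items
            total += 5 - option
    return total
```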
1. It's really a matter of some people not trying hard enough; if blacks would only try harder they could be just as well off as whites.
1 – Strongly agree
2 – Somewhat agree
3 – Somewhat disagree
4 – Strongly disagree
2. Irish, Italian, Jewish and many other minorities overcame prejudice and worked their way up.
Blacks should do the same.
1 – Strongly agree
2 – Somewhat agree
3 – Somewhat disagree
4 – Strongly disagree
3. Some say that black leaders have been trying to push too fast. Others feel that they haven’t
pushed fast enough. What do you think?
1 – Trying to push very much too fast
2 – Going too slowly
3 – Moving at about the right speed
4. How much of the racial tension that exists in the United States today do you think blacks are responsible for creating?
1 – Strongly agree
2 – Somewhat agree
3 – Somewhat disagree
4 – Strongly disagree
5. How much discrimination against blacks do you feel there is in the United States today,
limiting their chances to get ahead?
1 – Strongly agree
2 – Somewhat agree
3 – Somewhat disagree
4 – Strongly disagree
6. Generations of slavery and discrimination have created conditions that make it difficult for blacks to work their way out of the lower class.
1 – Strongly agree
2 – Somewhat agree
3 – Somewhat disagree
4 – Strongly disagree
7. Over the past few years, blacks have gotten less than they deserve.
1 – Strongly agree
2 – Somewhat agree
3 – Somewhat disagree
4 – Strongly disagree
8. Over the past few years, blacks have gotten more economically than they deserve.
1 – Strongly agree
2 – Somewhat agree
3 – Somewhat disagree
4 – Strongly disagree
Appendix G: Control Text
Microaggressions are brief, everyday interactions that convey subtle negative messages
about race or gender. Microaggressions can be intentional or unintentional; in fact, often the
perpetrator of a microaggression does not even realize they have done something demeaning.
During his primary campaign against Barack Obama, Joe Biden was asked about Obama's appeal
to voters. He responded, “I mean, you got the first mainstream African-American candidate who
is articulate and bright and clean and a nice-looking guy.” On the surface, this comment sounds like praise. However, the implication is that Obama is an exception. If being articulate, bright, clean, and nice-looking is exceptional for a black man, then most blacks must be unintelligent, inarticulate, dirty, and unattractive. This is a classic example of a microaggression.
Each individual microaggression may have little impact on the person who experiences it,
but when they happen repeatedly over an entire lifetime, microaggressions can cumulatively do
great harm. Ongoing exposure to microaggressions hurts people physically, mentally,
emotionally, and socially. They produce health problems and reduce life expectancy; deplete
mental focus and energy; create feelings of anger while reducing feelings of self-worth; and deny
recipients equal access and opportunity in education, employment, and health care.
Schemas are hypotheses that we use to understand the world around us. They give us
clues about what to expect and how to interpret other people's behavior. For example, the schema
for “politician” often includes dishonesty, leading us to interpret anything a politician says as
self-serving.
American culture has strong schemas about gender. For example, women are thought of
as more nurturing and men as more aggressive. Because schemas tell us how to interpret
behavior, it is easier for us to notice men being aggressive and women being nurturing than the
other way around. Similarly, we have ideas about how white people and people of color behave.
In ambiguous situations, we interpret people's behavior in a way that matches our racial
expectations.
Our culture's model of professional competence contains many qualities from our
schemas of “man” and “white person,” and contains fewer qualities from our schemas of
“woman” and “person of color.” This makes it easier for us to see professional competence in
men and white people than in women and people of color. For example, we think of leadership as part of the male role, making it easier to notice when men are leaders and to interpret ambiguous actions in that light. This difference means we unconsciously overrate the professional ability of men and whites, while underrating the work of women and people of color.
Our everyday evaluations have a cumulative effect on the advancement of the people we judge, even when each individual effect is minor. The importance of the accumulation of advantage and disadvantage is that even small imbalances add up.
Every time we judge someone positively, it casts a slight halo over whatever they do next. They are also in a slightly better position to be thought of positively when the next opportunity to excel arises and to obtain, in turn, the next organizational reward. Further, they will be perceived as having earned their opportunities. Each small advantage generates opportunities for further advantage, even if every evaluation after the first is perfectly fair.
Naturally, a disadvantaged person can still go on to do stunning, brilliant work that, combined with superior interpersonal skills and an in-depth understanding of how institutions work, will guarantee their success. Only a tiny percentage of people, however, turn out stunningly brilliant work, have extensive interpersonal skills, and understand thoroughly how to exploit institutional procedures. Most advancement comes from having a small to medium edge over other employees. Our way of evaluating women and people of color puts them at a disadvantage, compared to men and white people, in acquiring that edge.
Appendix H: Check Questions
Control group questions
Schemas are . . .
1) hypotheses we use to understand the world.
2) trends in American behavior.
3) strategies women use to improve their professional competence.
4) dishonest techniques used by politicians.
As you were reading the text, how much did you want to continue reading?
1) Extremely.
2) A lot.
3) Somewhat.
4) A little bit.
5) Not at all.
How likely would you be to recommend this reading to a friend?
1) Very likely.
2) Likely.
3) Somewhat likely.
4) Unlikely.
5) Not at all likely.
Game condition questions
In the game you just played, money can be used to . . .
1) upgrade your clients' skills.
2) bribe your clients' co-workers.
3) send your clients on vacation.
4) buy office supplies for your clients.
As you were playing the game, how much did you want to continue playing?
1) Extremely.
2) A lot.
3) Somewhat.
4) A little bit.
5) Not at all.
How likely would you be to recommend this game to a friend?
1) Very likely.
2) Likely.
3) Somewhat likely.
4) Unlikely.
5) Not at all likely.
Appendix I: Demographic Questions
• Age? [numeric response]
• Gender? [male, female, other]
• Race? [American Indian, Asian, Black, Hispanic, Native Hawaiian / Pacific Islander,
White, Other]
• First language? [English, Other]
• Nationality? [USA, Other with dropdown of ISO country codes]
• In which type of community do you live? [Urban, Rural, Suburban]
Appendix J: Full Data Tables
Table J1. Crosstabulation of Player Source and Player Gender
Table J2. Crosstabulation of Player Source and Player Race
Table J3. Mean Player Age by Player Source
Table J4. ANOVA, Player Age by Player Source
Table J5. Crosstabulation of Player Source and Living Area
Table J6. Systemic Sexism Pretest Means by Player Source
Table J7. ANOVA, Systemic Sexism Pretest Scores by Player Source
Table J8. Systemic Racism Pretest Means by Player Source
Table J9. ANOVA, Systemic Racism Pretest Means by Player Source
Table J10. Modern Sexism Means by Player Source
Table J11. ANOVA, Modern Sexism Pretest Score by Player Source
Table J12. Symbolic Racism Pretest Means by Player Source
Table J13. ANOVA, Symbolic Racism Pretest Score by Player Source
Table J14. Crosstabulation of Player Source and Games Won
Table J15. Mean Player Score by Player Source
Table J16. ANOVA, Player Score by Player Source
Table J17. Demographics, Web Players
Table J18. Age, Web Players
Table J19. Systemic Sexism Pretest Means by Completion Status (web)
Table J20. ANOVA, Systemic Sexism Pretest Score by Completion Status (web)
Table J21. Systemic Racism Pretest Means by Completion Status (web)
Table J22. ANOVA, Systemic Racism Pretest Score by Completion Status (web)
Table J23. Modern Sexism Pretest Means by Completion Status (web)
Table J24. ANOVA, Modern Sexism Pretest Score by Completion Status (web)
Table J25. Symbolic Racism Pretest Means by Completion Status (web)
Table J26. ANOVA, Symbolic Racism Pretest Score by Completion Status (web)
Table J27. Systemic Sexism Posttest Means by Pretest Group (web)
Table J28. ANOVA, Systemic Sexism Posttest Score by Pretest Group (web)
Table J29. Systemic Racism Posttest Means by Pretest Group (web)
Table J30. ANOVA, Systemic Racism Posttest Score by Pretest Group (web)
Table J31. Modern Sexism Posttest Means by Pretest Group (web)
Table J32. ANOVA, Modern Sexism Posttest Score by Pretest Group (web)
Table J33. Symbolic Racism Posttest Means by Pretest Group (web)
Table J34. ANOVA, Symbolic Racism Posttest Score by Pretest Group (web)
Table J35. Systemic Sexism Posttest Means by Treatment Condition (web)
Table J36. Systemic Sexism Posttest Marginal Means by Treatment Condition (web)
Table J37. ANCOVA, Systemic Sexism Posttest Score by Treatment Condition (web)
Table J38. Treatment Condition Control vs. Game Contrast, Systemic Sexism Posttest Score (web)
Table J39. Systemic Sexism Difference Score T-Tests by Treatment Condition (web)
Table J40. Systemic Racism Posttest Means by Treatment Condition (web)
Table J41. Systemic Racism Posttest Marginal Means by Treatment Condition (web)
Table J42. ANCOVA, Systemic Racism Posttest Score by Treatment Condition (web)
Table J43. Treatment Condition Control vs. Game Contrast, Systemic Racism Posttest Score (web)
Table J44. Systemic Racism Difference Score Overall T-Test (web)
Table J45. Game Data Correlations with Systemic Sexism Posttest Score* (web)
Table J46. Systemic Sexism Posttest Means by Number of Plays (web)
Table J47. ANCOVA, Systemic Sexism Posttest Score by Number of Plays (web)
Table J48. Game Data Correlations with Systemic Racism Posttest Score* (web)
Table J49. Systemic Racism Posttest Means by Number of Plays (web)
Table J50. ANCOVA, Systemic Racism Posttest Score by Number of Plays (web)
Table J51. Mean Score by Player Race and Gender (web)
Table J52. ANCOVA, Mean Score by Player Race and Gender (web)
Table J53. Mean Clients Placed by Player Race and Gender (web)
Table J54. ANCOVA, Mean Clients Placed by Player Race and Gender (web)
Table J55. Mean Bias-Group Clients Placed by Player Race and Gender (web)
Table J56. ANCOVA, Mean Bias-Group Clients Placed by Player Race and Gender (web)
Table J57. Mean Guesses by Player Race and Gender (web)
Table J58. ANCOVA, Mean Guesses by Player Race and Gender (web)
Table J59. Crosstabulation of Player Race and Games Played (web)
Table J60. Crosstabulation of Player Gender and Games Played (web)
Table J61. Systemic Sexism Posttest Means by Bias Guess Condition (web)
Table J62. Systemic Sexism Posttest Marginal Means by Bias Guess Condition (web)
Table J63. ANCOVA, Systemic Sexism Posttest Score by Bias Guess Condition (web)
Table J64. Systemic Racism Posttest Means by Bias Guess Condition (web)
Table J65. Systemic Racism Posttest Marginal Means by Bias Guess Condition (web)
Table J66. ANCOVA, Systemic Racism Posttest Score by Bias Guess Condition (web)
Table J67. Game Score Means by Bias Guess Condition (web)
Table J68. Game Score Marginal Means by Bias Guess Condition (web)
Table J69. ANCOVA, Game Score by Bias Guess Condition (web)
Table J70. Bias Guess Condition No Guess vs. Guess Contrast, Game Score (web)
Table J71. Mean Score Percentage Earned from Bias Group by Bias Guess Condition (web)
Table J72. Marginal Means, Score Percentage Earned from Bias Group by Bias Guess Condition (web)
Table J73. ANCOVA, Score Percentage Earned from Bias Group by Bias Guess Condition (web)
Table J74. Clients Placed Means by Bias Guess Condition (web)
Table J75. Clients Placed Marginal Means by Bias Guess Condition (web)
Table J76. ANCOVA, Clients Placed by Bias Guess Condition (web)
Table J77. Bias Group Clients Placed by Bias Guess Condition (web)
Table J78. Bias Group Clients Placed Marginal Means by Bias Guess Condition (web)
Table J79. ANCOVA, Bias Group Clients Placed by Bias Guess Condition (web)
Table J80. Modern Sexism Posttest Means by Treatment Condition (web)
Table J81. Modern Sexism Posttest Marginal Means by Treatment Condition (web)
Table J82. ANCOVA, Modern Sexism Posttest Score by Treatment Condition (web)
Table J83. Modern Sexism Difference Score Overall T-Test (web)
Table J84. Symbolic Racism Posttest Means by Treatment Condition (web)
Table J85. Symbolic Racism Posttest Marginal Means by Treatment Condition (web)
Table J86. ANCOVA, Symbolic Racism Posttest Score by Treatment Condition (web)
Table J87. Treatment Condition Control vs. Game Contrast, Symbolic Racism Posttest Score (web)
Table J88. Symbolic Racism Posttest Score T-Tests by Treatment Condition (web)
Table J89. Game Data Correlations with Modern Sexism Posttest Score* (web)
Table J90. Modern Sexism Posttest Means by Number of Plays (web)
Table J91. ANCOVA, Modern Sexism Posttest Score by Number of Plays (web)
Table J92. Game Data Correlations with Symbolic Racism Posttest Score* (web)
Table J93. Symbolic Racism Posttest Means by Number of Plays (web)
Table J94. ANCOVA, Symbolic Racism Posttest Score by Number of Plays (web)
Table J95. Modern Sexism Posttest Means by Bias Guess Condition (web)
Table J96. Modern Sexism Posttest Marginal Means by Bias Guess Condition (web)
Table J97. ANCOVA, Modern Sexism Posttest Score by Bias Guess Condition (web)
Table J98. Bias Guess Condition No Guess vs. Guess Contrast, Modern Sexism Posttest Score (web)
Table J99. Symbolic Racism Posttest Means by Bias Guess Condition (web)
Table J100. Symbolic Racism Posttest Marginal Means by Bias Guess Condition (web)
Table J101. ANCOVA, Symbolic Racism Posttest Score by Bias Guess Condition (web)
Table J102. Bias Guess Condition No Guess vs. Guess Contrast, Symbolic Racism Posttest Score (web)
Table J103. Demographics, Mechanical Turk Players
Table J104. Age, Mechanical Turk Players
Table J105. Systemic Sexism Pretest Means by Completion Status (MT)
Table J106. ANOVA, Systemic Sexism Pretest Score by Completion Status (MT)
Table J107. Systemic Racism Pretest Means by Completion Status (MT)
Table J108. ANOVA, Systemic Racism Pretest Score by Completion Status (MT)
Table J109. Modern Sexism Pretest Means by Completion Status (MT)
Table J110. ANOVA, Modern Sexism Pretest Score by Completion Status (MT)
Table J111. Symbolic Racism Pretest Means by Completion Status (MT)
Table J112. ANOVA, Symbolic Racism Pretest Score by Completion Status (MT)
Table J113. Systemic Sexism Posttest Means by Pretest Group (MT)
Table J114. ANOVA, Systemic Sexism Posttest Score by Pretest Group (MT)
Table J115. Systemic Racism Posttest Means by Pretest Group (MT)
Table J116. ANOVA, Systemic Racism Posttest Score by Pretest Group (MT)
Table J117. Modern Sexism Posttest Means by Pretest Group (MT)
Table J118. ANOVA, Modern Sexism Posttest Score by Pretest Group (MT)
Table J119. Symbolic Racism Posttest Means by Pretest Group (MT)
Table J120. ANOVA, Symbolic Racism Posttest Score by Pretest Group (MT)
Table J121. Systemic Sexism Posttest Marginal Means by Treatment Condition (MT)
Table J122. ANCOVA, Systemic Sexism Posttest Score by Treatment Condition (MT)
Table J123. Treatment Condition Control vs. Game Contrast, Systemic Sexism Posttest Score (MT)
Table J124. Systemic Sexism Difference Score Overall T-Test (MT)
Table J125. Systemic Racism Posttest Marginal Means by Treatment Condition (MT)
Table J126. ANCOVA, Systemic Racism Posttest Score by Treatment Condition (MT)
Table J127. Treatment Condition Control vs. Game Contrast, Systemic Racism Posttest Score (MT)
240 Table J128 ...... 289 Systemic Racism Difference Score Overall T-Test (MT) Table J129 ...... 289 Game Data Correlations with Systemic Sexism Posttest Score* (MT) Table J130 ...... 289 Systemic Sexism Posttest Means by Number of Plays (MT) Table J131 ...... 289 ANCOVA, Systemic Sexism Posttest Score by Number of Plays (MT) Table J132 ...... 290 Game Data Correlations with Systemic Racism Posttest Score* (MT) Table J133 ...... 290 Systemic Racism Posttest Means by Number of Plays (MT) Table J134 ...... 290 ANCOVA, Systemic Racism Posttest Score by Number of Plays (MT) Table J135 ...... 291 Mean Score by Player Race and Gender (MT) Table J136 ...... 291 ANCOVA, Mean Score by Player Race and Gender (MT) Table J137 ...... 291 Mean Clients Placed by Player Race and Gender (MT) Table J138 ...... 292 ANCOVA, Mean Clients Placed by Player Race and Gender (MT) Table J139 ...... 292 White vs. non-White Contrast, Clients Placed (MT) Table J140 ...... 292 Mean Bias-Group Clients Placed by Player Race and Gender (MT) Table J141 ...... 293 ANCOVA, Mean Bias-Group Clients Placed by Player Race and Gender (MT) Table J142 ...... 293 Mean Guesses by Player Race and Gender (MT) Table J143 ...... 293 ANCOVA, Mean Guesses by Player Race and Gender (MT)
241 Table J144 ...... 294 Crosstabulation of Player Race and Games Played (MT) Table J145 ...... 294 Crosstabulation of Player Gender and Games Played (MT) Table J146 ...... 295 Systemic Racism Posttest Marginal Means by Bias Guess Condition (MT) Table J147 ...... 295 ANCOVA, Systemic Sexism Posttest Score by Bias Guess Condition (MT) Table J148 ...... 295 Bias Guess Condition No Guess vs Guess Contrast, Systemic Sexism Posttest Score (MT) Table J149 ...... 296 Systemic Sexism Posttest Marginal Means by Player Race (MT) Table J150 ...... 296 Systemic Sexism Difference Score T-Tests by Player Race (MT) Table J151 ...... 296 Systemic Racism Posttest Marginal Means by Bias Guess Condition (MT) Table J152 ...... 297 ANCOVA, Systemic Racism Posttest Score by Bias Guess Condition (MT) Table J153 ...... 297 Bias Guess Condition No Guess vs Guess Contrast, Systemic Racism Posttest Score (MT) Table J154 ...... 298 Systemic Racism Posttest Means by Player Gender and Player Race (MT) Table J155 ...... 298 Systemic Racism Difference Score T-Tests by Player Race and Gender (MT) Table J156 ...... 299 Game Score Marginal Means by Bias Guess Condition (MT) Table J157 ...... 299 ANCOVA, Game Score by Bias Guess Condition (MT) Table J158 ...... 299 Mean Score Percentage Earned from Bias Group by Bias Guess Condition (MT) Table J159 ...... 300 ANCOVA, Score Percentage Earned from Bias Group by Bias Guess Condition (MT)
242 Table J160 ...... 300 Score Percentage Earned from Bias Group Means by Player Race and Gender (MT) Table J161 ...... 300 Total Clients Placed by Bias Guess Condition (MT) Table J162 ...... 301 ANCOVA, Total Clients Placed by Bias Guess Condition (MT) Table J163 ...... 301 Bias Group Clients Placed by Bias Guess Condition (MT) Table J164 ...... 301 ANCOVA, Bias Group Clients Placed by Bias Guess Condition (MT) Table J165 ...... 302 Modern Sexism Posttest Marginal Means by Treatment Condition (MT) Table J166 ...... 302 ANCOVA, Modern Sexism Posttest Score by Treatment Condition (MT) Table J167 ...... 302 Treatment Condition Control vs Game Contrast, Modern Sexism Posttest Score (MT) Table J168 ...... 303 Modern Sexism Difference Score T-Tests by Treatment Condition Table J169 ...... 303 Symbolic Racism Posttest Marginal Means by Treatment Condition (MT) Table J170 ...... 304 ANCOVA, Symbolic Racism Posttest Score by Treatment Condition (MT) Table J171 ...... 304 Treatment Condition Control vs Game Contrast, Symbolic Racism Posttest Score (MT) Table J172 ...... 304 Symbolic Racism Posttest Marginal Means by Player Race (MT) Table J173 ...... 305 Symbolic Racism Difference Score T-Tests by Player Race (MT) Table J174 ...... 305 Game Data Correlations with Modern Sexism Posttest Score* (MT) Table J175 ...... 305 Modern Sexism Posttest Means by Number of Plays (MT)
243 Table J176 ...... 306 ANCOVA, Modern Sexism Posttest Score by Number of Plays (MT) Table J177 ...... 306 Game Data Correlations with Symbolic Racism Posttest Score* (MT) Table J178 ...... 306 Symbolic Racism Posttest Means by Number of Plays (MT) Table J179 ...... 306 ANCOVA, Symbolic Racism Posttest Score by Number of Plays (MT) Table J180 ...... 307 Modern Sexism Posttest Marginal Means by Bias Guess Condition and Player Race (MT) Table J181 ...... 308 ANCOVA, Modern Sexism Posttest Score by Bias Guess Condition (MT) Table J182 ...... 308 Modern Sexism Posttest Marginal Means by Bias Guess Condition (MT) Table J183 ...... 308 Bias Guess Condition No Guess vs. Guess Contrast, Modern Sexism Posttest Score (MT) Table J184 ...... 309 Modern Sexism Posttest Marginal Means by Player Race (MT) Table J185 ...... 309 Player Race Contrast, Modern Sexism Posttest Score (MT) Table J186 ...... 309 Modern Sexism Posttest Means by Guess Condition, White Players Only (MT) Table J187 ...... 310 ANCOVA, Modern Sexism Posttest Score by Bias Guess Condition, White Players Only (MT) Table J188 ...... 310 Modern Sexism Difference Score T-Test, White Players Only Table J189 ...... 310 Modern Sexism Posttest Means by Bias Guess Condition, Black, Hispanic, and Other Players (MT) Table J190 ...... 310 ANCOVA, Modern Sexism Posttest Score by Bias Guess Condition, Black, Hispanic and Other Players (MT) Table J191 ...... 311
244 Modern Sexism Difference Score T-Test, Black, Hispanic, and Other Players (MT) Table J192 ...... 311 Symbolic Racism Posttest Marginal Means by Bias Guess Condition and Player Race (MT) Table J193 ...... 312 ANCOVA, Symbolic Racism Posttest Score by Bias Guess Condition (MT) Table J194 ...... 312 Symbolic Racism Posttest Means by Bias Guess Condition (MT) Table J195 ...... 312 Bias Guess Condition No Guess vs. Guess Contrast, Symbolic Racism Posttest Score (MT) Table J196 ...... 313 Symbolic Racism Posttest Marginal Means by Player Race (MT) Table J197 ...... 313 White vs. non-White Player Race Contrast, Symbolic Racism Posttest Score (MT) Table J198 ...... 313 Symbolic Racism Posttest Means by Bias Guess Condition, White Players (MT) Table J199 ...... 314 ANCOVA, Symbolic Racism Posttest Score by Bias Guess Condition, White Players (MT) Table J200 ...... 314 Symbolic Racism Difference Score T-Test, White Players Table J201 ...... 314 Symbolic Racism Posttest Means by Bias Guess Condition, Black, Hispanic, and Other Players (MT) Table J202 ...... 315 ANCOVA, Symbolic Racism Posttest Score by Bias Guess Condition,, Black, Hispanic, and Other Players (MT) Table J203 ...... 315 Symbolic Racism Difference Score T-Tests by Bias Guess Condition, Black, Hispanic, Other Players (MT)
Table J1
Crosstabulation of Player Source and Player Gender
Player Source Female Male χ² p Total
Web 108 106 1.80a .180 214
Mechanical Turk 108 81 189
Total 216 187 403
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 87.70.
Table J2
Crosstabulation of Player Source and Player Race
Player Source White Black and Hispanic Other χ² p Total
Web 194 5 22 10.94a .004* 221
Mechanical Turk 148 17 26 191
Total 342 22 48 412
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 10.20.
* p≤ .005
Table J3 Mean Player Age by Player Source Player Source Mean SD N
Web 32.60 8.73 220
Mechanical Turk 34.72 10.93 190
Total 33.58 9.855 410
Table J4
ANOVA, Player Age by Player Source Source df F p η2
Player Source 1 4.79 .029* 0.012
* p≤ .05
Table J5
Crosstabulation of Player Source and Living Area
Player Source Rural Suburban Urban χ² p Total
Web 17 118 86 7.54a .023* 221
Mechanical Turk 31 97 63 191
Total 48 215 149 412
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 22.25.
* p≤ .05
Table J6
Systemic Sexism Pretest Means by Player Source Player Source Mean SD N
Web 2.89 1.29 122
Mechanical Turk 1.93 1.16 99
Total 2.46 1.32 221
Table J7 ANOVA, Systemic Sexism Pretest Scores by Player Source Source df F p η2
Player Source 1 33.30 .000* 0.132
* p≤ .001
Table J8
Systemic Racism Pretest Means by Player Source Player Source Mean SD N
Web 3.08 1.28 122
Mechanical Turk 1.90 1.25 99
Total 2.55 1.40 221
Table J9
ANOVA, Systemic Racism Pretest Means by Player Source Source df F p η2
Player Source 1 47.56 .000* 0.178
* p≤ .001
Table J10
Modern Sexism Means by Player Source Player Source Mean SD N
Web 31.82 4.15 122
Mechanical Turk 27.43 5.56 99
Total 29.86 5.29 221
Table J11
ANOVA, Modern Sexism Pretest Score by Player Source Source df F p η2
Player Source 1 45.08 .000* 0.171
* p≤ .001
Table J12
Symbolic Racism Pretest Means by Player Source Player Source Mean SD N
Web 26.28 4.38 122
Mechanical Turk 20.12 4.99 99
Total 23.52 5.57 221
Table J13
ANOVA, Symbolic Racism Pretest Score by Player Source Source df F p η2
Player Source 1 95.39 .000* 0.303
* p≤ .001
Table J14
Crosstabulation of Player Source and Games Won
Player Source No Wins Wins χ² p Total
Web 11 171 9.71a .002* 182
Mechanical Turk 27 136 163
Total 38 307 345
a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 17.95.
* p≤ .05
Table J15
Mean Player Score by Player Source Player Source Mean SD N
Web 254490.33 1004718.87 123
Mechanical Turk 870099.71 4287142.14 105
Total 537994.65 3009601.50 228
Table J16
ANOVA, Player Score by Player Source Source df F p η2
Player Source 1 2.38 .124 0.010
Table J17
Demographics, Web Players N %
Player Gender Female 108 50.50
Male 106 49.50
Player Race White 194 87.80
Black and Hispanic 5 2.30
Other 22 10.00
Living Area Rural 17 7.70
Suburban 118 53.40
Urban 86 38.90
Table J18
Age, Web Players N Min Max Mean SD
Age 220 19 68 32.60 8.73
Table J19
Systemic Sexism Pretest Means by Completion Status (web) Completed Mean SD N
No 2.98 1.34 88
Yes 3.06 1.30 157
Total 3.03 1.31 245
Table J20
ANOVA, Systemic Sexism Pretest Score by Completion Status (web) Source df F p η2
Completed 1 0.21 .647 0.001
Table J21
Systemic Racism Pretest Means by Completion Status (web) Completed Mean SD N
No 2.97 1.36 88
Yes 3.24 1.29 157
Total 3.14 1.32 245
Table J22
ANOVA, Systemic Racism Pretest Score by Completion Status (web) Source df F p η2
Completed 1 2.49 .116 0.010
Table J23
Modern Sexism Pretest Means by Completion Status (web) Completed Mean SD N
No 31.93 4.30 88
Yes 32.18 3.86 157
Total 32.09 4.02 245
Table J24
ANOVA, Modern Sexism Pretest Score by Completion Status (web) Source df F p η2
Completed 1 0.22 .637 0.001
Table J25
Symbolic Racism Pretest Means by Completion Status (web) Completed Mean SD N
No 25.17 4.33 88
Yes 26.42 4.08 157
Total 25.97 4.21 245
Table J26
ANOVA, Symbolic Racism Pretest Score by Completion Status (web) Source df F p η2
Completed 1 5.06 .025* 0.020
* p≤ .05
Table J27
Systemic Sexism Posttest Means by Pretest Group (web) Pretest Group Mean SD N
No Pretest 3.50 1.24 98
Pretest 3.12 1.42 122
Total 3.29 1.35 220
Table J28
ANOVA, Systemic Sexism Posttest Score by Pretest Group (web) Source df F p η2
Pretest Group 1 4.3 .039* 0.019
* p≤ .05
Table J29
Systemic Racism Posttest Means by Pretest Group (web) Pretest Group Mean SD N
No Pretest 3.56 1.32 98
Pretest 3.00 1.43 122
Total 3.25 1.40 220
Table J30
ANOVA, Systemic Racism Posttest Score by Pretest Group (web) Source df F p η2
Pretest Group 1 9.01 .003* 0.040
* p≤ .005
Table J31
Modern Sexism Posttest Means by Pretest Group (web) Pretest Group Mean SD N
No Pretest 32.45 4.06 98
Pretest 31.98 4.14 122
Total 32.19 4.10 220
Table J32
ANOVA, Modern Sexism Posttest Score by Pretest Group (web) Source df F p η2
Pretest Group 1 0.698 .404 0.003
Table J33
Symbolic Racism Posttest Means by Pretest Group (web) Pretest Group Mean SD N
No Pretest 22.99 1.74 98
Pretest 23.11 2.15 122
Total 23.06 1.97 220
Table J34
ANOVA, Symbolic Racism Posttest Score by Pretest Group (web) Source df F p η2
Pretest Group 1 0.217 .642 0.001
Table J35
Systemic Sexism Posttest Means by Treatment Condition (web) Treatment Condition Mean SD N
Control 3.73 1.18 41
Informational 2.92 1.26 25
Financial 2.30 1.41 27
Generative 3.33 1.37 24
Total 3.15 1.39 117
Table J36
Systemic Sexism Posttest Marginal Means by Treatment Condition (web) Treatment Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
Control 3.64a,b 0.38 2.90 4.38
Informational .a,c . . .
Financial .a,c . . .
Generative 3.13a,b 0.34 2.46 3.79
a. Covariates appearing in the model are evaluated at the following values: Systemic Sexism Pretest Score = 2.8974.
b. Based on modified population marginal mean.
c. This modified population marginal mean is not estimable.
Table J37
ANCOVA, Systemic Sexism Posttest Score by Treatment Condition (web) Source df F p η2
Condition 3 3.70 .014* 0.099
Gender 1 1.08 .301 0.011
Race 2 1.04 .357 0.02
Systemic Sexism Pretest 1 54.67 .000** 0.351
Gender * Race 1 0.56 .455 0.006
Condition * Gender 3 2.06 .110 0.058
Condition * Race 3 0.77 .514 0.022
* p≤ .05
** p≤ .001
Table J38
Treatment Condition Control vs. Game Contrast, Systemic Sexism Posttest Score (web) Source df F p η2
Contrast 1 5.19 .025* 0.049
* p≤ .05
Table J39 Systemic Sexism Difference Score T-Tests by Treatment Condition (web) Test Value = 0 N Mean SD t df p
Control Group 43 0.58 1.10 3.48 42 .001*
Informational Group 25 0.04 1.31 0.15 24 .880
Financial Group 28 -0.25 1.24 -1.07 27 .294
Generative Group 26 0.35 1.26 1.40 25 .175
* p≤ .001
Table J40
Systemic Racism Posttest Means by Treatment Condition (web) Treatment Condition Mean SD N
Control 3.24 1.37 41
Informational 2.92 1.44 25
Financial 2.56 1.55 27
Generative 3.00 1.38 24
Total 2.97 1.44 117
Table J41
Systemic Racism Posttest Marginal Means by Treatment Condition (web) Treatment Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
Control 3.013a,b 0.45 2.11 3.92
Informational .a,c . . .
Financial .a,c . . .
Generative 3.389a,b 0.40 2.59 4.19
a. Covariates appearing in the model are evaluated at the following values: Systemic Racism Pretest Score = 3.0940.
b. Based on modified population marginal mean.
c. This modified population marginal mean is not estimable.
Table J42
ANCOVA, Systemic Racism Posttest Score by Treatment Condition (web) Source df F p η2
Condition 3 1.55 .207 0.044
Gender 1 0.17 .680 0.002
Race 2 0.07 .932 0.001
Systemic Racism Pretest 1 27.01 .000* 0.211
Gender * Race 1 0.19 .665 0.002
Condition * Gender 3 0.32 .808 0.010
Condition * Race 3 0.98 .406 0.028
* p≤ .001
Table J43
Treatment Condition Control vs. Game Contrast, Systemic Racism Posttest Score (web) Source df F p η2
Contrast 1 0.001 .972 .000
Table J44 Systemic Racism Difference Score Overall T-Test (web) Test Value = 0 N Mean SD t df p
Systemic Racism Difference Score 122 -0.08 1.39 -0.65 121 .517
Table J45
Game Data Correlations with Systemic Sexism Posttest Score* (web) df r p
Player Score 70 -0.180 .130
Total Clients Placed 99 -0.038 .706
Total Clients Placed (Biased Group) 99 -0.054 .590
Number of Guesses 100 0.157 .115
*Controlling for Systemic Sexism Pretest Score
Table J46
Systemic Sexism Posttest Means by Number of Plays (web) Number of Plays Mean SD N
Two 3.13 1.42 120
More Than Two 3.00 1.41 2
Total 3.12 1.42 122
Table J47
ANCOVA, Systemic Sexism Posttest Score by Number of Plays (web) Source df F p η2
Systemic Sexism Pretest 1 62.08 .000* 0.343
More Than Two 1 0.06 .814 0
* p≤ .001
Table J48
Game Data Correlations with Systemic Racism Posttest Score* (web) df r p
Player Score 70 -0.059 .621
Total Clients Placed 99 0.032 .753
Total Clients Placed (Biased Group) 99 -0.039 .697
Number of Guesses 100 0.165 .097
*Controlling for Systemic Racism Pretest Score
Table J49
Systemic Racism Posttest Means by Number of Plays (web) Number of Plays Mean SD N
Two 3.02 1.43 120
More Than Two 2.00 0 2
Total 3.00 1.43 122
Table J50
ANCOVA, Systemic Racism Posttest Score by Number of Plays (web) Source df F p η2
Systemic Racism Pretest 1 36.98 .000* 0.237
More Than Two 1 2.89 .092 0.024
* p≤ .001
Table J51
Mean Score by Player Race and Gender (web) Player Gender Player Race Mean SD N
Female White 272694.04 1072307.04 54
Black and Hispanic 76740.00 - 1
Other 67562.50 18428.25 4
Male White 283028.52 1079014.54 54
Black and Hispanic 39420.00 - 1
Other 98473.33 48092.55 6
Total 258219.08 1017017.13 120
Table J52
ANCOVA, Mean Score by Player Race and Gender (web) Source df F p η2
Race 2 .19 .824 .003
Gender 1 .00 .998 .000
Race * Gender 2 .00 .999 .000
Table J53
Mean Clients Placed by Player Race and Gender (web) Player Gender Player Race Mean SD N
Female White 32.95 17.76 76
Black and Hispanic 34.00 9.90 2
Other 19.50 16.08 6
Male White 36.90 19.42 81
Black and Hispanic 18.00 - 1
Other 35.56 19.82 10
Total 34.90 18.67 176
Table J54
ANCOVA, Mean Clients Placed by Player Race and Gender (web) Source df F p η2
Race 2 .67 .514 .008
Gender 1 .10 .756 .001
Race * Gender 2 3.05 .055 .036
Table J55
Mean Bias-Group Clients Placed by Player Race and Gender (web) Player Gender Player Race Mean SD N
Female White 12.75 8.00 76
Black and Hispanic 11.00 2.82 2
Other 6.67 6.62 6
Male White 14.39 9.24 81
Black and Hispanic 8.00 - 1
Other 14.70 9.37 10
Total 13.36 8.65 176
Table J56
ANCOVA, Mean Bias-Group Clients Placed by Player Race and Gender (web) Source df F p η2
Race 2 1.03 .361 .012
Gender 1 .34 .564 .002
Race * Gender 2 1.06 .349 .012
Table J57
Mean Guesses by Player Race and Gender (web) Player Gender Player Race Mean SD N
Female White 1.96 2.13 77
Black and Hispanic 3.00 .00 2
Other .67 .82 6
Male White 1.63 2.08 81
Black and Hispanic 3.00 - 1
Other 1.90 2.73 10
Total 1.78 2.10 177
Table J58
ANCOVA, Mean Guesses by Player Race and Gender (web) Source df F p η2
Race 2 .86 .424 .010
Gender 1 .10 .750 .001
Race * Gender 2 .94 .391 .011
Table J59
Crosstabulation of Player Race and Games Played (web)
Player Race 2 >2 χ² p Total
White 189 5 .433a .806 194
Black and Hispanic 5 0 5
Other 21 1 22
Total 215 6 221
a. 3 cells (50.0%) have expected count less than 5. The minimum expected count is .14.
Table J60
Crosstabulation of Player Gender and Games Played (web)
Player Gender 2 >2 χ² p Total
Female 105 3 .001a .981 108
Male 103 3 106
Total 208 6 214
a. 2 cells (50.0%) have expected count less than 5. The minimum expected count is 2.97.
Table J61
Systemic Sexism Posttest Means by Bias Guess Condition (web) Bias Guess Condition Mean SD N
No Guess 2.45 1.37 22
Informational Guess 3.00 1.26 20
Financial Guess 2.29 1.49 17
Generative Guess 3.56 1.15 16
Total 2.80 1.39 75
Table J62
Systemic Sexism Posttest Marginal Means by Bias Guess Condition (web) Guessing Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess 2.37a,b 0.36 1.66 3.09
Informational Guess .a,c . . .
Financial Guess .a,c . . .
Generative Guess 3.24a,b 0.42 2.39 4.09
a. Covariates appearing in the model are evaluated at the following values: Systemic Sexism Pretest Score = 2.7200.
b. Based on modified population marginal mean.
c. This modified population marginal mean is not estimable.
Table J63
ANCOVA, Systemic Sexism Posttest Score by Bias Guess Condition (web) Source df F p η2
Bias Guess Condition 3 1.50 .225 0.071
Gender 1 0.50 .482 0.008
Race 2 0.53 .592 0.018
Systemic Sexism Pretest 1 29.15 .000* 0.331
Gender * Race 1 0.59 .446 0.010
Bias Guess Condition * Gender 3 2.43 .074 0.110
Bias Guess Condition * Race 3 0.59 .622 0.029
* p≤ .001
Table J64
Systemic Racism Posttest Means by Bias Guess Condition (web) Bias Guess Condition Mean SD N
No Guess 2.41 1.44 22
Informational Guess 2.80 1.58 20
Financial Guess 2.76 1.48 17
Generative Guess 3.38 1.26 16
Total 2.80 1.46 75
Table J65
Systemic Racism Posttest Marginal Means by Bias Guess Condition (web) Guessing Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess 2.83a,b 0.43 1.96 3.70
Informational Guess .a,c . . .
Financial Guess .a,c . . .
Generative Guess 3.36a,b 0.51 2.33 4.38
a. Covariates appearing in the model are evaluated at the following values: Systemic Racism Pretest Score = 2.9067.
b. Based on modified population marginal mean.
c. This modified population marginal mean is not estimable.
Table J66
ANCOVA, Systemic Racism Posttest Score by Bias Guess Condition (web) Source df F p η2
Bias Guess Condition 3 0.87 .464 0.042
Gender 1 0.70 .406 0.012
Race 2 0.40 .675 0.013
Systemic Racism Pretest 1 15.16 .000* 0.204
Gender * Race 1 0.06 .803 0.001
Bias Guess Condition * Gender 3 0.12 .945 0.006
Bias Guess Condition * Race 3 0.69 .561 0.034
* p≤ .001
Table J67
Game Score Means by Bias Guess Condition (web) Guessing Condition Mean SD N
No Guess 82763.93 29562.34 28
Informational Guess 298966.45 1251433.62 31
Financial Guess 369154.00 1463081.04 25
Generative Guess 282558.06 818733.96 36
Total 258219.08 1017017.13 120
Table J68
Game Score Marginal Means by Bias Guess Condition (web) Guessing Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess 99792.33a 305353.91 -505601.49 705186.14
Informational Guess .b . . .
Financial Guess .b . . .
Generative Guess 189722.95a 327346.89 -459274.09 838719.99
a. Based on modified population marginal mean.
b. This modified population marginal mean is not estimable.
Table J69
ANCOVA, Game Score by Bias Guess Condition (web) Source df F p η2
Bias Guess Condition 3 0.07 .978 0.002
Gender 1 0.02 .883 0.000
Race 2 0.07 .929 0.001
Bias Guess Condition * Gender 3 0.72 .543 0.020
Bias Guess Condition * Race 2 0.07 .934 0.001
Gender * Race 1 0.04 .853 0.000
Table J70
Bias Guess Condition No Guess vs. Guess Contrast, Game Score (web) Source df F p η2
Contrast 1 0.05 .830 0.000
Table J71
Mean Score Percentage Earned from Bias Group by Bias Guess Condition (web) Guess Condition Mean SD N
No Guess 0.18 0.13 28
Informational Guess 0.18 0.14 31
Financial Guess 0.19 0.15 25
Generative Guess 0.16 0.10 36
Total 0.18 0.13 120
Table J72
Marginal Means, Score Percentage Earned from Bias Group by Bias Guess Condition (web) Guessing Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess .19a 0.037 0.11 0.26
Informational Guess .b . . .
Financial Guess .b . . .
Generative Guess .16a 0.04 0.08 0.24
a. Based on modified population marginal mean.
b. This modified population marginal mean is not estimable.
Table J73
ANCOVA, Score Percentage Earned from Bias Group by Bias Guess Condition (web) Source df F p η2
Bias Guess Condition 3 0.28 .838 0.008
Gender 1 0.23 .632 0.002
Race 2 0.45 .641 0.008
Bias Guess Condition * Gender 3 1.69 .173 0.046
Bias Guess Condition * Race 2 0.73 .484 0.014
Gender * Race 1 0.00 .972 0.000
Table J74
Clients Placed Means by Bias Guess Condition (web) Guess Condition Mean SD N
No Guess 39.38 16.13 29
Informational Guess 33.37 19.43 35
Financial Guess 30.96 19.81 28
Generative Guess 36.87 18.62 39
Total 35.23 18.63 131
Table J75
Clients Placed Marginal Means by Bias Guess Condition (web) Guessing Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess 38.09a 5.32 27.55 48.64
Informational Guess .b . . .
Financial Guess .b . . .
Generative Guess 37.00a 5.72 25.67 48.33
a. Based on modified population marginal mean.
b. This modified population marginal mean is not estimable.
Table J76
ANCOVA, Clients Placed by Bias Guess Condition (web) Source df F p η2
Bias Guess Condition 3 1.27 .290 0.032
Gender 1 1.88 .174 0.016
Race 2 1.24 .292 0.021
Bias Guess Condition * Gender 3 0.02 .996 0.001
Bias Guess Condition * Race 3 0.85 .472 0.021
Gender * Race 1 0.27 .606 0.002
Table J77
Bias Group Clients Placed by Bias Guess Condition (web) Guess Condition Mean SD N
No Guess 15.31 7.75 29
Informational Guess 13.09 8.57 35
Financial Guess 12.39 10.47 28
Generative Guess 13.44 8.01 39
Total 13.53 8.65 131
Table J78
Bias Group Clients Placed Marginal Means by Bias Guess Condition (web) Guessing Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess 13.84a 2.52 8.84 18.83
Informational Guess .b . . .
Financial Guess .b . . .
Generative Guess 13.42a 2.71 8.05 18.79
a. Based on modified population marginal mean.
b. This modified population marginal mean is not estimable.
Table J79
ANCOVA, Bias Group Clients Placed by Bias Guess Condition (web) Source df F p η2
Bias Guess Condition 3 0.59 .620 0.015
Gender 1 1.09 .298 0.009
Race 2 1.42 .247 0.024
Bias Guess Condition * Gender 3 0.03 .992 0.001
Bias Guess Condition * Race 3 0.53 .665 0.013
Gender * Race 1 0.15 .697 0.001
Table J80
Modern Sexism Posttest Means by Treatment Condition (web) Treatment Condition Mean SD N
Control 32.20 3.79 41
Informational 31.56 3.49 25
Financial 31.00 4.91 27
Generative 32.54 4.39 24
Total 31.85 4.16 117
Table J81
Modern Sexism Posttest Marginal Means by Treatment Condition (web) Treatment Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
Control 32.18a,b 0.33 31.52 32.84
Informational .a,c . . .
Financial .a,c . . .
Generative 31.86a,b 0.30 31.28 32.45
a. Covariates appearing in the model are evaluated at the following values: Modern Sexism Pretest Score = 31.7265.
b. Based on modified population marginal mean.
c. This modified population marginal mean is not estimable.
Table J82
ANCOVA, Modern Sexism Posttest Score by Treatment Condition (web) Source df F p η2
Condition 3 0.53 .666 0.015
Gender 1 0.10 .755 0.001
Race 2 2.25 .110 0.043
Modern Sexism Pretest 1 1996.76 .000* 0.952
Gender * Race 1 0.47 .496 0.005
Condition * Gender 3 0.20 .894 0.006
Condition * Race 3 0.37 .777 0.011
* p≤ .001
Table J83 Modern Sexism Difference Score Overall T-Test (web) Test Value = 0 N Mean SD t df p
Modern Sexism Difference Score 122 0.16 1.03 1.76 121 .082
Table J84
Symbolic Racism Posttest Means by Treatment Condition (web) Treatment Condition Mean SD N
Control 23.12 1.98 41
Informational 23.44 2.48 25
Financial 22.78 2.26 27
Generative 22.88 2.15 24
Total 23.06 2.18 117
Table J85
Symbolic Racism Posttest Marginal Means by Treatment Condition (web) Treatment Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
Control 23.11a,b 0.52 22.09 24.14
Informational .a,c . . .
Financial .a,c . . .
Generative 22.43a,b 0.46 21.52 23.34
a. Covariates appearing in the model are evaluated at the following values: Symbolic Racism Pretest Score = 26.2137.
b. Based on modified population marginal mean.
c. This modified population marginal mean is not estimable.
Table J86
ANCOVA, Symbolic Racism Posttest Score by Treatment Condition (web) Source df F p η2
Condition 3 2.99 .034* 0.082
Gender 1 1.65 .202 0.016
Race 2 0.24 .784 0.005
Symbolic Racism Pretest 1 105.00 .000** 0.510
Gender * Race 1 0.18 .675 0.002
Condition * Gender 3 0.29 .832 0.009
Condition * Race 3 2.09 .106 0.058
* p≤ .05
** p≤ .001
Table J87
Treatment Condition Control vs Game Contrast, Symbolic Racism Posttest Score (web) Source df F p η2
Contrast 1 0.08 .773 0.001
Table J88 Symbolic Racism Posttest Score T-Tests by Treatment Condition (web) Test Value = 0
N Mean SD t df p
Control Group 43 -3.35 3.21 -6.85 42 .000*
Informational Group 25 -2.60 3.01 -4.31 24 .000*
Financial Group 28 -2.86 3.31 -4.57 27 .000*
Generative Group 26 -3.73 2.96 -6.43 25 .000*
* p≤ .001
Table J89
Game Data Correlations with Modern Sexism Posttest Score* (web) df r p
Player Score 70 -0.121 .313
Total Clients Placed 99 -0.021 .834
Total Clients Placed (Biased Group) 99 -0.011 .916
Number of Guesses 100 0.042 .673
*Controlling for Modern Sexism Pretest Score
Table J90
Modern Sexism Posttest Means by Number of Plays (web) Number of Plays Mean SD N
Two 31.98 4.17 120
More Than Two 32.00 2.83 2
Total 31.98 4.14 122
Table J91
ANCOVA, Modern Sexism Posttest Score by Number of Plays (web) Source df F p η2
Modern Sexism Pretest 1 1844.43 .000* 0.939
More Than Two 1 0.79 .375 0.007
* p≤ .001
Table J92
Game Data Correlations with Symbolic Racism Posttest Score* (web) df r p
Player Score 70 0.081 .497
Total Clients Placed 99 -0.085 .398
Total Clients Placed (Biased Group) 99 0.039 .700
Number of Guesses 100 -0.106 .288
*Controlling for Symbolic Racism Pretest Score
Table J93
Symbolic Racism Posttest Means by Number of Plays (web) Number of Plays Mean SD N
Two 23.13 2.15 120
More Than Two 22.00 2.83 2
Total 23.11 2.15 122
Table J94
ANCOVA, Symbolic Racism Posttest Score by Number of Plays (web) Source df F p η2
Symbolic Racism Pretest 1 146.50 .000* 0.552
More Than Two 1 0.41 .525 0.003
* p≤ .001
Table J95
Modern Sexism Posttest Means by Bias Guess Condition (web) Guess Condition Mean SD N
No Guess 32.00 3.88 22
Informational Guess 31.35 4.02 20
Financial Guess 30.47 5.15 17
Generative Guess 32.81 4.65 16
Total 31.65 4.38 75
Table J96
Modern Sexism Posttest Marginal Means by Bias Guess Condition (web) Guessing Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess 31.656a,b 0.30 31.06 32.25
Informational Guess .a,c . . .
Financial Guess .a,c . . .
Generative Guess 31.625a,b 0.34 30.94 32.31
a. Covariates appearing in the model are evaluated at the following values: Modern Sexism Pretest Score = 31.6400.
b. Based on modified population marginal mean.
c. This modified population marginal mean is not estimable.
Table J97
ANCOVA, Modern Sexism Posttest Score by Bias Guess Condition (web) Source df F p η2
Bias Guess Condition 3 0.03 .994 0.001
Gender 1 0.05 .816 0.001
Race 2 2.47 .093 0.077
Modern Sexism Pretest 1 1399.11 .000* 0.960
Gender * Race 1 0.18 .669 0.003
Bias Guess Condition * Gender 3 0.50 .685 0.025
Bias Guess Condition * Race 3 0.70 .554 0.034
* p≤ .001
Table J98
Bias Guess Condition No Guess vs Guess Contrast, Modern Sexism Posttest Score (web) Source df F p η2
Contrast 1 0.01 .940 0.000
Table J99
Symbolic Racism Posttest Means by Bias Guess Condition (web) Guess Condition Mean SD N
No Guess 22.45 2.20 22
Informational Guess 23.35 2.72 20
Financial Guess 23.12 2.00 17
Generative Guess 23.25 2.27 16
Total 23.01 2.30 75
Table J100
Symbolic Racism Posttest Marginal Means by Bias Guess Condition (web) Guessing Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess 22.64a,b 0.46 21.73 23.55
Informational Guess .a,c . . .
Financial Guess .a,c . . .
Generative Guess 22.71a,b 0.53 21.65 23.78
a. Covariates appearing in the model are evaluated at the following values: Symbolic Racism Pretest Score = 26.0933.
b. Based on modified population marginal mean.
c. This modified population marginal mean is not estimable.
Table J101
ANCOVA, Symbolic Racism Posttest Score by Bias Guess Condition (web) Source df F p η2
Bias Guess Condition 3 1.53 .217 0.072
Gender 1 0.01 .905 0.000
Race 2 0.39 .680 0.013
Symbolic Racism Pretest 1 88.99 .000* 0.601
Gender * Race 1 0.41 .526 0.007
Bias Guess Condition * Gender 3 0.77 .513 0.038
Bias Guess Condition * Race 3 0.82 .486 0.040
* p≤ .001
Table J102
Bias Guess Condition No Guess vs Guess Contrast, Symbolic Racism Posttest Score (web) Source df F p η2
Contrast 1 1.60 .212 0.026
Table J103
Demographics, Mechanical Turk Players N %
Player Gender Female 108 57.10
Male 81 42.90
Player Race White 148 77.50
Black and Hispanic 17 8.90
Other 26 13.60
Living Area Rural 31 16.20
Suburban 97 50.80
Urban 63 33.00
Table J104
Age, Mechanical Turk Players N Min Max Mean SD
Age 190 18 76 34.72 10.93
Table J105
Systemic Sexism Pretest Means by Completion Status (MT) Completed Mean SD N
No 1.97 1.12 32
Yes 1.91 1.17 100
Total 1.92 1.16 132
Table J106
ANOVA, Systemic Sexism Pretest Score by Completion Status (MT) Source df F p η2
Completed 1 0.06 .804 0.000
Table J107
Systemic Racism Pretest Means by Completion Status (MT) Completed Mean SD N
No 2.25 1.52 32
Yes 1.90 1.24 100
Total 1.98 1.32 132
Table J108
ANOVA, Systemic Racism Pretest Score by Completion Status (MT) Source df F p η2
Completed 1 1.72 .193 0.013
Table J109
Modern Sexism Pretest Means by Completion Status (MT) Completed Mean SD N
No 27.59 4.68 32
Yes 27.43 5.53 100
Total 27.47 5.32 132
Table J110
ANOVA, Modern Sexism Pretest Score by Completion Status (MT) Source df F p η2
Completed 1 0.02 .880 0.000
Table J111
Symbolic Racism Pretest Means by Completion Status (MT) Completed Mean SD N
No 22.28 4.50 32
Yes 20.13 4.96 100
Total 20.65 4.93 132
Table J112
ANOVA, Symbolic Racism Pretest Score by Completion Status (MT) Source df F p η2
Completed 1 4.75 .031* 0.035
* p≤ .05
Table J113
Systemic Sexism Posttest Means by Pretest Group (MT) Pretest Group Mean SD N
No Pretest 2.53 1.31 92
Pretest 1.71 1.26 99
Total 2.10 1.35 191
Table J114
ANOVA, Systemic Sexism Posttest Score by Pretest Group (MT) Source df F p η2
Pretest Group 1 19.724 .000* 0.094
* p≤ .001
Table J115
Systemic Racism Posttest Means by Pretest Group (MT) Pretest Group Mean SD N
No Pretest 2.37 1.52 92
Pretest 1.89 1.31 99
Total 2.12 1.43 191
Table J116
ANOVA, Systemic Racism Posttest Score by Pretest Group (MT) Source df F p η2
Pretest Group 1 5.52 .020* 0.028
* p≤ .05
Table J117
Modern Sexism Posttest Means by Pretest Group (MT) Pretest Group Mean SD N
No Pretest 29.28 4.46 92
Pretest 27.56 5.43 99
Total 28.39 5.05 191
Table J118
ANOVA, Modern Sexism Posttest Score by Pretest Group (MT) Source df F p η2
Pretest Group 1 5.72 .018* 0.029
* p≤ .05
Table J119
Symbolic Racism Posttest Means by Pretest Group (MT) Pretest Group Mean SD N
No Pretest 20.84 2.59 92
Pretest 20.07 2.42 99
Total 20.44 2.52 191
Table J120
ANOVA, Symbolic Racism Posttest Score by Pretest Group (MT) Source df F p η2
Pretest Group 1 4.47 .036* 0.023
* p≤ .05
Table J121
Systemic Sexism Posttest Marginal Means by Treatment Condition (MT) Treatment Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
Control 2.11a 0.66 0.81 3.41
Informational 1.19a 0.32 0.55 1.84
Financial 2.23a 0.46 1.31 3.15
Generative 2.27a 0.45 1.38 3.16
a. Covariates appearing in the model are evaluated at the following values: Systemic Sexism Pretest Score = 1.9293.
Table J122
ANCOVA, Systemic Sexism Posttest Score by Treatment Condition (MT) Source df F p η2
Condition 3 2.28 .086 0.079
Gender 1 0.00 .975 0.000
Race 2 1.05 .353 0.026
Systemic Sexism Pretest 1 12.31 .001* 0.133
Gender * Race 2 0.45 .642 0.011
Condition * Gender 3 0.42 .736 0.016
Condition * Race 6 1.40 .226 0.095
* p≤ .001
Table J123
Treatment Condition Control vs. Game Contrast, Systemic Sexism Posttest Score (MT) Source df F p η2
Contrast 1 0.093 .761 0.001
Table J124
Systemic Sexism Difference Score Overall T-Test (MT) Test Value = 0 N Mean SD t df p
Systemic Sexism Difference Score 99 -0.22 1.43 -1.55 98 .124
Table J125
Systemic Racism Posttest Marginal Means by Treatment Condition (MT) Treatment Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
Control 1.24a 0.71 -0.18 2.65
Informational 1.72a 0.35 1.02 2.42
Financial 1.95a 0.51 0.94 2.96
Generative 1.70a 0.49 0.73 2.68
a. Covariates appearing in the model are evaluated at the following values: Systemic Racism Pretest Score = 1.8990.
Table J126
ANCOVA, Systemic Racism Posttest Score by Treatment Condition (MT) Source df F p η2
Condition 3 0.29 .832 0.011
Gender 1 1.51 .223 0.019
Race 2 1.24 .295 0.030
Systemic Racism Pretest 1 11.64 .001* 0.127
Gender * Race 2 1.87 .161 0.045
Condition * Gender 3 0.17 .919 0.006
Condition * Race 6 1.02 .418 0.071
* p≤ .001
Table J127
Treatment Condition Control vs. Game Contrast, Systemic Racism Posttest Score (MT) Source df F p η2
Contrast 1 0.55 .462 0.007
Table J128
Systemic Racism Difference Score Overall T-Test (MT) Test Value = 0 N Mean SD t df p
Systemic Racism Difference Score 99 -0.01 1.56 -0.07 98 .949
Table J129
Game Data Correlations with Systemic Sexism Posttest Score* (MT) df r p
Player Score 55 0.162 .227
Total Clients Placed 88 -0.083 .439
Total Clients Placed (Biased Group) 88 -0.131 .217
Number of Guesses 88 -0.106 .322
*Controlling for Systemic Sexism Pretest Score
Table J130
Systemic Sexism Posttest Means by Number of Plays (MT) Number of Plays Mean SD N
Two 1.68 1.21 95
More Than Two 2.25 2.22 4
Total 1.71 1.26 99
Table J131
ANCOVA, Systemic Sexism Posttest Score by Number of Plays (MT) Source df F p η2
Systemic Sexism Pretest 1 11.07 .001* 0.103
More Than Two 1 1.74 .190 0.018
* p≤ .001
Table J132
Game Data Correlations with Systemic Racism Posttest Score* (MT) df r p
Player Score 55 0.057 .671
Total Clients Placed 88 0.159 .134
Total Clients Placed (Biased Group) 88 0.077 .473
Number of Guesses 88 0.079 .459
*Controlling for Systemic Racism Pretest Score
Table J133
Systemic Racism Posttest Means by Number of Plays (MT) Number of Plays Mean SD N
Two 1.89 1.28 95
More Than Two 1.75 2.06 4
Total 1.89 1.31 99
Table J134
ANCOVA, Systemic Racism Posttest Score by Number of Plays (MT) Source df F p η2
Systemic Racism Pretest 1 7.25 .008* 0.070
More Than Two 1 0.24 .626 0.002
* p≤ .05
Table J135
Mean Score by Player Race and Gender (MT) Player Gender Player Race Mean SD N
Female White 1033152.39 4713025.69 46
Black and Hispanic 2047543.33 3471347.99 3
Other 40575.71 20201.24 7
Male White 171668.18 488615.17 33
Black and Hispanic 4458084.29 11238569.33 7
Other 49826.25 24847.07 8
Total 877133.85 4307294.34 104
Table J136
ANCOVA, Mean Score by Player Race and Gender (MT) Source df F p η2
Race 2 1.69 .190 .033
Gender 1 .17 .684 .002
Race * Gender 2 .58 .561 .012
Table J137
Mean Clients Placed by Player Race and Gender (MT) Player Gender Player Race Mean SD N
Female White 23.94 20.36 84
Black and Hispanic 9.25 12.51 8
Other 16.00 15.05 11
Male White 29.73 21.76 52
Black and Hispanic 19.00 14.51 9
Other 21.09 14.03 11
Total 24.06 20.06 175
Table J138
ANCOVA, Mean Clients Placed by Player Race and Gender (MT) Source df F p η2
Race 2 4.26 .016* .048
Gender 1 2.44 .120 .014
Race * Gender 2 .08 .920 .001
* p≤ .05
Table J139
White vs. non-White Contrast, Clients Placed (MT) Source df F p η2
Contrast 1 8.38 .004* .047
* p≤ .05
Table J140
Mean Bias-Group Clients Placed by Player Race and Gender (MT) Player Gender Player Race Mean SD N
Female White 9.06 8.24 84
Black and Hispanic 3.63 3.96 8
Other 7.09 8.24 11
Male White 11.08 9.32 52
Black and Hispanic 8.44 7.09 9
Other 9.09 6.20 11
Total 9.26 8.35 175
Table J141
ANCOVA, Mean Bias-Group Clients Placed by Player Race and Gender (MT) Source df F p η2
Race 2 2.08 .128 .024
Gender 1 2.52 .114 .015
Race * Gender 2 .22 .805 .003
Table J142
Mean Guesses by Player Race and Gender (MT) Player Gender Player Race Mean SD N
Female White 2.24 2.24 84
Black and Hispanic 2.63 1.92 8
Other 2.45 2.16 11
Male White 1.62 1.84 52
Black and Hispanic 1.67 2.55 9
Other 2.27 2.06 11
Total 2.06 2.11 175
Table J143
ANCOVA, Mean Guesses by Player Race and Gender (MT) Source df F p η2
Race 2 .44 .643 .005
Gender 1 1.55 .215 .009
Race * Gender 2 .17 .843 .002
Table J144
Crosstabulation of Player Race and Games Played (MT)
Games Played χ² p Total
Player Race 2 >2
White 141 7 3.169a .205 148
Black and Hispanic 15 2 17
Other 26 0 26
Total 182 9 191
a. 2 cells (33.3%) have expected count less than 5. The minimum expected count is .80.
Table J145
Crosstabulation of Player Gender and Games Played (MT)
Games Played χ² p Total
Player Gender 2 >2
Female 103 5 .010a .921 108
Male 77 4 81
Total 180 9 189
a. 1 cell (25.0%) has expected count less than 5. The minimum expected count is 3.86.
Table J146
Systemic Sexism Posttest Marginal Means by Bias Guess Condition (MT) Guessing Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess 1.95a 0.36 1.23 2.66
Informational Guess 1.29a 0.33 0.62 1.96
Financial Guess 2.11a 0.51 1.09 3.14
Generative Guess 2.31a 0.42 1.47 3.15
a. Covariates appearing in the model are evaluated at the following values: Systemic Sexism Pretest Score = 1.9315.
Table J147
ANCOVA, Systemic Sexism Posttest Score by Bias Guess Condition (MT) Source df F p η2
Bias Guess Condition 3 1.54 .214 0.079
Gender 1 0.03 .865 0.001
Race 2 3.19 .049* 0.106
Systemic Sexism Pretest 1 7.91 .007** 0.128
Gender * Race 2 0.90 .414 0.032
Bias Guess Condition * Gender 3 1.92 .137 0.096
Bias Guess Condition * Race 6 1.37 .242 0.132
* p≤ .05
** p≤ .01
Table J148
Bias Guess Condition No Guess vs Guess Contrast, Systemic Sexism Posttest Score (MT) Source df F p η2
Contrast 1 0.01 .925 0.000
Table J149
Systemic Sexism Posttest Marginal Means by Player Race (MT) Player Race Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
White 1.54a 0.15 1.23 1.85
Black and Hispanic 2.73a 0.45 1.83 3.63
Other 1.48a 0.35 0.77 2.19
a. Covariates appearing in the model are evaluated at the following values: Systemic Sexism Pretest Score = 1.9315.
Table J150
Systemic Sexism Difference Score T-Tests by Player Race (MT) Test Value = 0 N Mean SD t df p
White 54 -0.37 1.43 -1.90 53 .063
Black and Hispanic 7 0.71 2.06 0.92 6 .394
Other 12 -0.25 1.14 -0.76 11 .463
Table J151
Systemic Racism Posttest Marginal Means by Bias Guess Condition (MT) Guessing Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess 1.99a 0.38 1.24 2.75
Informational Guess 1.79a 0.34 1.11 2.47
Financial Guess 1.96a 0.52 0.91 3.01
Generative Guess 1.64a 0.44 0.76 2.51
a. Covariates appearing in the model are evaluated at the following values: Systemic Racism Pretest Score = 2.0137.
Table J152
ANCOVA, Systemic Racism Posttest Score by Bias Guess Condition (MT) Source df F p η2
Bias Guess Condition 3 0.14 .936 0.008
Gender 1 3.01 .089 0.053
Race 2 0.01 .987 0.000
Systemic Racism Pretest 1 9.76 .003* 0.153
Gender * Race 2 3.48 .038** 0.114
Bias Guess Condition * Gender 3 0.11 .954 0.006
Bias Guess Condition * Race 6 0.48 .820 0.051
* p≤ .01
** p≤ .05
Table J153
Bias Guess Condition No Guess vs Guess Contrast, Systemic Racism Posttest Score (MT) Source df F p η2
Contrast 1 0.19 .667 0.003
Table J154
Systemic Racism Posttest Marginal Means by Player Gender and Player Race (MT) Player Gender Player Race Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
Female White 1.62a 0.21 1.20 2.03
Black and Hispanic 3.06a 0.74 1.58 4.55
Other 2.14a 0.61 0.91 3.36
Male White 2.09a 0.25 1.60 2.58
Black and Hispanic 0.52a 0.74 -0.97 2.01
Other 1.64a 0.45 0.74 2.53
a. Covariates appearing in the model are evaluated at the following values: Systemic Racism Pretest Score = 2.0137.
Table J155
Systemic Racism Difference Score T-Tests by Player Race and Gender (MT) Test Value = 0
N Mean SD t df p
White Women 32 -0.56 1.34 -2.37 31 .024*
Black and Hispanic Women 3 2.00 1.00 3.46 2 .074
Other Women 5 -0.40 1.14 -0.78 4 .477
White Men 22 0.27 1.24 1.03 21 .315
Black and Hispanic Men 4 -2.00 1.83 -2.19 3 .116
Other Men 7 0.00 1.41 0.00 6 1.000
* p≤ .05
Table J156
Game Score Marginal Means by Bias Guess Condition (MT) Guessing Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess 2939601.00 1026789.83 898410.38 4980791.63
Informational Guess 1524784.55 1106723.51 -675308.97 3724878.07
Financial Guess -1223709.07 1749676.16 -4701950.16 2254532.02
Generative Guess -1095283.95 1433944.59 -3945871.44 1755303.54
Table J157
ANCOVA, Game Score by Bias Guess Condition (MT) Source df F p η2
Bias Guess Condition 3 2.48 .067 0.079
Gender 1 1.77 .187 0.020
Race 2 0.15 .865 0.003
Bias Guess Condition * Gender 3 1.66 .181 0.055
Bias Guess Condition * Race 6 2.03 .071 0.124
Gender * Race 2 2.07 .133 0.046
Table J158
Mean Score Percentage Earned from Bias Group by Bias Guess Condition (MT) Guessing Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess 0.45 0.11 0.22 0.68
Informational Guess 0.23 0.12 -0.01 0.48
Financial Guess 0.11 0.19 -0.28 0.50
Generative Guess 0.21 0.16 -0.11 0.53
Table J159
ANCOVA, Score Percentage Earned from Bias Group by Bias Guess Condition (MT) Source df F p η2
Bias Guess Condition 3 1.16 .330 0.039
Gender 1 1.81 .182 0.021
Race 2 1.34 .266 0.030
Bias Guess Condition * Gender 3 0.78 .508 0.027
Bias Guess Condition * Race 6 2.02 .071 0.124
Gender * Race 2 3.82 .026* 0.082
* p≤ .05
Table J160
Score Percentage Earned from Bias Group Means by Player Race and Gender (MT) Player Race Player Gender Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
White Female 0.22 0.07 0.09 0.36
Male 0.14 0.08 -0.02 0.30
Black and Hispanic Female 0.11 0.35 -0.58 0.81
Male 0.22 0.19 -0.15 0.59
Other Female 0.09 0.18 -0.27 0.45
Male 0.72 0.17 0.39 1.05
Table J161
Total Clients Placed by Bias Guess Condition (MT) Guessing Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess 24.85 4.79 15.36 34.34
Informational Guess 18.78 5.13 8.62 28.94
Financial Guess 26.98 6.25 14.60 39.36
Generative Guess 17.94 4.80 8.43 27.44
Table J162
ANCOVA, Total Clients Placed by Bias Guess Condition (MT) Source df F p η2
Bias Guess Condition 3 0.714 .546 0.018
Gender 1 1.373 .244 0.012
Race 2 2.253 .110 0.038
Bias Guess Condition * Gender 3 1.382 .252 0.035
Bias Guess Condition * Race 6 0.238 .963 0.012
Gender * Race 2 0.020 .980 0.000
Table J163
Bias Group Clients Placed by Bias Guess Condition (MT) Guessing Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess 9.98 2.06 5.90 14.06
Informational Guess 9.14 2.21 4.77 13.51
Financial Guess 10.54 2.69 5.22 15.87
Generative Guess 6.37 2.06 2.28 10.46
Table J164
ANCOVA, Bias Group Clients Placed by Bias Guess Condition (MT) Source df F p η2
Bias Guess Condition 3 0.74 .533 0.019
Gender 1 1.87 .174 0.016
Race 2 0.87 .424 0.015
Bias Guess Condition * Gender 3 0.73 .534 0.019
Bias Guess Condition * Race 6 0.20 .976 0.010
Gender * Race 2 0.47 .626 0.008
Table J165
Modern Sexism Posttest Marginal Means by Treatment Condition (MT) Treatment Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
Control 28.71a 0.69 27.34 30.09
Informational 26.85a 0.34 26.18 27.51
Financial 28.67a 0.49 27.70 29.64
Generative 27.29a 0.48 26.33 28.25
a. Covariates appearing in the model are evaluated at the following values: Modern Sexism Pretest Score = 27.4343.
Table J166
ANCOVA, Modern Sexism Posttest Score by Treatment Condition (MT) Source df F p η2
Condition 3 3.87 .012* 0.127
Gender 1 0.74 .392 0.009
Race 2 1.80 .172 0.043
Modern Sexism Pretest 1 1485.32 .000** 0.949
Gender * Race 2 0.71 .493 0.018
Condition * Gender 3 0.52 .667 0.019
Condition * Race 6 1.23 .298 0.085
* p≤ .05
** p≤ .001
Table J167
Treatment Condition Control vs Game Contrast, Modern Sexism Posttest Score (MT) Source df F p η2
Contrast 1 2.33 .131 0.028
Table J168
Modern Sexism Difference Score T-Tests by Treatment Condition Test Value = 0 N Mean SD t df p
Control Group 26 0.81 1.67 2.46 25 .021*
Informational Group 30 -0.37 1.00 -2.01 29 .054
Financial Group 24 0.25 1.19 1.03 23 .314
Generative Group 19 -0.21 0.63 -1.46 18 .163
* p≤ .05
Table J169
Symbolic Racism Posttest Marginal Means by Treatment Condition (MT) Treatment Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
Control 19.94a 0.85 18.25 21.63
Informational 19.99a 0.42 19.16 20.82
Financial 18.25a 0.61 17.05 19.46
Generative 20.06a 0.61 18.84 21.27
a. Covariates appearing in the model are evaluated at the following values: Symbolic Racism Pretest Score = 20.1212.
Table J170
ANCOVA, Symbolic Racism Posttest Score by Treatment Condition (MT) Source df F p η2
Condition 3 2.28 .086 0.079
Gender 1 0.17 .683 0.002
Race 2 3.68 .030* 0.084
Symbolic Racism Pretest 1 113.58 .000** 0.587
Gender * Race 2 0.25 .779 0.006
Condition * Gender 3 0.66 .579 0.024
Condition * Race 6 1.66 .141 0.111
* p≤ .05
** p≤ .001
Table J171
Treatment Condition Control vs Game Contrast, Symbolic Racism Posttest Score (MT) Source df F p η2
Contrast 1 0.31 .577 0.004
Table J172
Symbolic Racism Posttest Marginal Means by Player Race (MT) Player Race Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
White 20.24a 0.18 19.89 20.59
Black and Hispanic 19.65a 0.65 18.36 20.94
Other 18.78a 0.52 17.75 19.82
a. Covariates appearing in the model are evaluated at the following values: Symbolic Racism Pretest Score = 20.1212.
Table J173
Symbolic Racism Difference Score T-Tests by Player Race (MT) Test Value = 0 N Mean SD t df p
White 78 0.37 3.49 0.94 77 .350
Black and Hispanic 8 -1.13 4.18 -0.76 7 .472
Other 13 -1.92 2.72 -2.55 12 .026*
* p≤ .05
Table J174
Game Data Correlations with Modern Sexism Posttest Score* (MT) df r p
Player Score 55 -0.384 .003**
Total Clients Placed 88 0.070 .509
Total Clients Placed (Biased Group) 88 -0.002 .988
Number of Guesses 88 -0.109 .308
*Controlling for Modern Sexism Pretest Score
** p≤ .005
Table J175
Modern Sexism Posttest Means by Number of Plays (MT) Number of Plays Mean SD N
Two 27.51 5.44 95
More Than Two 28.75 5.85 4
Total 27.56 5.43 99
Table J176
ANCOVA, Modern Sexism Posttest Score by Number of Plays (MT) Source df F p η2
Modern Sexism Pretest 1 1719.24 .000* 0.947
More Than Two 1 0.46 .500 0.005
* p≤ .001
Table J177
Game Data Correlations with Symbolic Racism Posttest Score* (MT) df r p
Player Score 55 0.089 .511
Total Clients Placed 88 -0.118 .268
Total Clients Placed (Biased Group) 88 -0.092 .387
Number of Guesses 88 -0.002 .988
*Controlling for Symbolic Racism Pretest Score
Table J178
Symbolic Racism Posttest Means by Number of Plays (MT) Number of Plays Mean SD N
Two 20.07 2.40 95
More Than Two 20.00 3.16 4
Total 20.07 2.42 99
Table J179
ANCOVA, Symbolic Racism Posttest Score by Number of Plays (MT) Source df F p η2
Symbolic Racism Pretest 1 129.33 .000* 0.574
More Than Two 1 0.00 .973 0.000
* p≤ .001
Table J180
Modern Sexism Posttest Marginal Means by Bias Guess Condition and Player Race (MT) Guessing Condition Player Race Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess White 27.74a 0.24 27.27 28.22
Black and Hispanic 27.57a 0.61 26.36 28.79
Other 27.66a 0.51 26.64 28.68
Informational Guess White 27.06a 0.23 26.60 27.51
Black and Hispanic 26.42a 0.61 25.20 27.63
Other 27.14a 0.44 26.27 28.01
Financial Guess White 27.14a 0.24 26.66 27.61
Black and Hispanic 31.37a 1.01 29.34 33.40
Other 28.54a 0.72 27.09 29.98
Generative Guess White 27.10a 0.25 26.59 27.60
Black and Hispanic 27.96a 0.82 26.33 29.60
Other 27.75a 0.51 26.74 28.76
a. Covariates appearing in the model are evaluated at the following values: Modern Sexism Pretest Score = 27.5068.
Table J181
ANCOVA, Modern Sexism Posttest Score by Bias Guess Condition (MT) Source df F p η2
Bias Guess Condition 3 7.04 .000* 0.281
Gender 1 0.10 .749 0.002
Race 2 5.30 .008** 0.164
Modern Sexism Pretest 1 2549.47 .000* 0.979
Gender * Race 2 0.94 .396 0.034
Bias Guess Condition * Gender 3 0.89 .452 0.047
Bias Guess Condition * Race 6 3.63 .004** 0.288
* p≤ .001
** p≤ .01
Table J182
Modern Sexism Posttest Marginal Means by Bias Guess Condition (MT) Guessing Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess 27.66a 0.28 27.10 28.22
Informational Guess 26.87a 0.26 26.34 27.40
Financial Guess 29.02a 0.40 28.22 29.82
Generative Guess 27.03a 0.33 26.94 28.27
a. Covariates appearing in the model are evaluated at the following values: Modern Sexism Pretest Score = 27.5068.
Table J183
Bias Guess Condition No Guess vs. Guess Contrast, Modern Sexism Posttest Score (MT) Source df F p η2
Contrast 1 0.26 .614 0.005
Table J184
Modern Sexism Posttest Marginal Means by Player Race (MT) Player Race Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
White 27.26a 0.12 27.02 27.50
Black and Hispanic 28.33a 0.35 27.63 29.03
Other 27.77a 0.27 27.22 28.32
a. Covariates appearing in the model are evaluated at the following values: Modern Sexism Pretest Score = 27.5068.
Table J185
Player Race Contrast, Modern Sexism Posttest Score (MT) Source df F p η2
Contrast 1 10.28 .002* 0.160
* p≤ .005
Table J186
Modern Sexism Posttest Means by Guess Condition, White Players Only (MT) Guessing Condition Mean SD N
No Guess 28.69 4.70 13
Informational Guess 27.19 5.41 16
Financial Guess 27.54 7.08 13
Generative Guess 26.00 6.27 12
Total 27.37 5.80 54
Table J187
ANCOVA, Modern Sexism Posttest Score by Bias Guess Condition, White Players Only (MT) Source df F p η2
Bias Guess Condition 3 2.18 .102 0.118
Modern Sexism Pretest 1 2482.75 .000* 0.981
* p≤ .001
Table J188
Modern Sexism Difference Score T-Test, White Players Only Test Value = 0 N Mean SD t df p
Modern Sexism Difference Score 78 0.06 1.28 0.44 77 .660
Table J189
Modern Sexism Posttest Means by Bias Guess Condition, Black, Hispanic, and Other Players (MT) Guessing Condition Mean SD N
No Guess 30.60 3.58 5
Informational Guess 22.83 6.46 6
Financial Guess 27.33 3.51 3
Generative Guess 29.80 5.22 5
Total 27.42 5.77 19
Table J190
ANCOVA, Modern Sexism Posttest Score by Bias Guess Condition, Black, Hispanic and Other Players (MT) Source df F p η2
Bias Guess Condition 3 7.35 .003 0.612
Modern Sexism Pretest 1 477.06 .000* 0.971
* p≤ .001
Table J191
Modern Sexism Difference Score T-Test, Black, Hispanic, and Other Players (MT) Test Value = 0 N Mean SD t df p
No Guess 5 0.00 0.71 0.00 4 1.000
Informative Guess 6 -0.5 1.22 -1.00 5 .363
Financial Guess 3 2.33 1.53 2.65 2 .118
Generative Guess 5 0.2 0.45 1.00 4 .374
Table J192
Symbolic Racism Posttest Marginal Means by Bias Guess Condition and Player Race (MT) Guessing Condition Player Race Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess White 20.62a 0.39 19.84 21.41
Black and Hispanic 20.13a 1.01 18.12 22.15
Other 18.34a 0.84 16.66 20.01
Informational Guess White 19.96a 0.38 19.20 20.71
Black and Hispanic 20.16a 1.00 18.16 22.16
Other 20.07a 0.70 18.66 21.48
Financial Guess White 20.48a 0.39 19.70 21.27
Black and Hispanic 15.62a 1.66 12.29 18.96
Other 17.05a 1.19 14.67 19.44
Generative Guess White 20.24a 0.42 19.40 21.07
Black and Hispanic 20.85a 1.38 18.09 23.61
Other 20.17a 0.84 18.49 21.85
a. Covariates appearing in the model are evaluated at the following values: Symbolic Racism Pretest Score = 20.4521.
Table J193
ANCOVA, Symbolic Racism Posttest Score by Bias Guess Condition (MT) Source df F p η2
Bias Guess Condition 3 3.39 .025* 0.158
Gender 1 0.48 .490 0.009
Race 2 5.39 .007* 0.166
Symbolic Racism Pretest 1 74.97 .000** 0.581
Gender * Race 2 0.12 .888 0.004
Bias Guess Condition * Gender 3 1.14 .340 0.060
Bias Guess Condition * Race 6 2.57 .029* 0.222
* p≤ .05
** p≤ .01
Table J194
Symbolic Racism Posttest Marginal Means by Bias Guess Condition (MT) Guessing Condition Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
No Guess 19.70a 0.46 18.78 20.62
Informational Guess 20.06a 0.43 19.21 20.92
Financial Guess 17.72a 0.66 16.39 19.05
Generative Guess 20.42a 0.56 19.30 21.54
a. Covariates appearing in the model are evaluated at the following values: Symbolic Racism Pretest Score = 20.4521.
Table J195
Bias Guess Condition No Guess vs. Guess Contrast, Symbolic Racism Posttest Score (MT) Source df F p η2
Contrast 1 0.29 .593 0.005
Table J196
Symbolic Racism Posttest Marginal Means by Player Race (MT) Player Race Mean Standard Error 95% Confidence Interval
Lower Bound Upper Bound
White 20.33a 0.20 19.93 20.72
Black and Hispanic 19.19a 0.58 18.04 20.35
Other 18.91a 0.45 18.00 19.81
a. Covariates appearing in the model are evaluated at the following values: Symbolic Racism Pretest Score = 20.4521.
Table J197
White vs. non-White Player Race Contrast, Symbolic Racism Posttest Score (MT) Source df F p η2
Contrast 1 9.57 .003* 0.151
* p≤ .005
Table J198
Symbolic Racism Posttest Means by Bias Guess Condition, White Players (MT) Guessing Condition Mean SD N
No Guess 20.77 1.64 13
Informational Guess 19.88 1.75 16
Financial Guess 20.62 2.72 13
Generative Guess 19.75 2.45 12
Total 20.24 2.15 54
Table J199
ANCOVA, Symbolic Racism Posttest Score by Bias Guess Condition, White Players (MT) Source df F p η2
Bias Guess Condition 3 0.42 .739 0.025
Symbolic Racism Pretest 1 63.50 .000* 0.564
* p≤ .001
Table J200
Symbolic Racism Difference Score T-Test, White Players Test Value = 0 N Mean SD t df p
Symbolic Racism Difference 78 0.37 3.49 0.94 77 .350
Table J201
Symbolic Racism Posttest Means by Bias Guess Condition, Black, Hispanic, and Other Players (MT) Guessing Condition Mean SD N
No Guess 20.00 1.87 5
Informational Guess 19.83 2.79 6
Financial Guess 15.00 1.00 3
Generative Guess 21.80 1.92 5
Total 19.63 2.97 19
Table J202
ANCOVA, Symbolic Racism Posttest Score by Bias Guess Condition, Black, Hispanic, and Other Players (MT) Source df F p η2
Bias Guess Condition 3 8.16 .002* 0.636
Symbolic Racism Pretest 1 41.35 .000** 0.747
* p≤ .005
** p≤ .001
Table J203
Symbolic Racism Difference Score T-Tests by Bias Guess Condition, Black, Hispanic, and Other Players (MT) Test Value = 0
N Mean SD t df p
No Guess 5 -3.20 1.79 -4.00 4 .016
Informative Guess 6 0.17 3.92 0.10 5 .921
Financial Guess 3 -2.00 2.00 -1.73 2 .225
Generative Guess 5 -2.40 2.19 -2.45 4 .070