Investigating Sports Commentator Bias Within a Large Corpus of American Football Broadcasts
Jack Merullo♠*   Luke Yeh♠*   Abram Handler♠   Alvin Grissom II♦   Brendan O'Connor♠   Mohit Iyyer♠
♠University of Massachusetts Amherst   ♦Ursinus College

Abstract

Sports broadcasters inject drama into play-by-play commentary by building team and player narratives through subjective analyses and anecdotes. Prior studies based on small datasets and manual coding show that such theatrics evince commentator bias in sports broadcasts. To examine this phenomenon, we assemble FOOTBALL, which contains 1,455 broadcast transcripts from American football games across six decades that are automatically annotated with 250K player mentions and linked with racial metadata. We identify major confounding factors for researchers examining racial bias in FOOTBALL, and perform a computational analysis that supports conclusions from prior social science studies.

Table 1: Example mentions from FOOTBALL that highlight racial bias in commentator sentiment patterns.

Player            Race       Mention text
Baker Mayfield    white      "Mayfield the ultimate competitor he's tough he's scrappy"
Jesse James       white      "this is a guy ... does nothing but work brings his lunch pail"
Manny Lawson      nonwhite   "good specs for that defensive end freakish athletic ability"
B.J. Daniels      nonwhite   "that otherworldly athleticism he has saw it with Michael Vick"

1 Introduction

Sports broadcasts are major events in contemporary popular culture: televised American football (henceforth "football") games regularly draw tens of millions of viewers (Palotta, 2019). Such broadcasts feature live sports commentators who weave the game's mechanical details into a broader, more subjective narrative. Previous work suggests that this form of storytelling exhibits racial bias: nonwhite players are less frequently praised for good plays (Rainville and McCormick, 1977), while white players are more often credited with "intelligence" (Bruce, 2004; Billings, 2004). However, such prior scholarship forms conclusions from small datasets[1] and subjective manual coding of race-specific language.

We revisit this prior work using large-scale computational analysis. From YouTube, we collect broadcast football transcripts and identify mentions of players, which we link to metadata about each player's race and position. Our resulting FOOTBALL dataset contains over 1,400 games spanning six decades, automatically annotated with ∼250K player mentions (Table 1). Analysis of FOOTBALL identifies two confounding factors for research on racial bias: (1) the racial composition of many positions is very skewed (e.g., only ∼5% of running backs are white), and (2) many mentions of players describe only their actions on the field (not player attributes). We experiment with an additive log-linear model for teasing apart these confounds. We also confirm prior social science studies on racial bias in naming patterns and sentiment. Finally, we publicly release FOOTBALL,[2] the first large-scale sports commentary corpus annotated with player race, to spur further research into characterizing racial bias in mass media.

* Authors contributed equally.
[1] Rainville and McCormick (1977), for example, study only 16 games.
[2] http://github.com/jmerullo/football

2 Collecting the FOOTBALL dataset

We collect transcripts of 1,455 full game broadcasts from the U.S. NFL and National Collegiate Athletic Association (NCAA) recorded between 1960 and 2019. Next, we identify and link mentions of players within these transcripts to information about their race (white or nonwhite) and position (e.g., quarterback). In total, FOOTBALL contains 267,778 mentions of 4,668 unique players, 65.7% of whom are nonwhite.[3] We now describe each stage of our data collection process.

[3] See the Appendix for more detailed statistics.

2.1 Processing broadcast transcripts

We collect broadcast transcripts by downloading YouTube videos posted by nonprofessional, individual users identified by querying YouTube for football archival channels.[4] YouTube automatically captions many videos, allowing us to scrape caption transcripts from 601 NFL games and 854 NCAA games. We next identify the teams playing and the game's year by searching for exact string matches in the video title and manually labeling any videos with underspecified titles.

After downloading videos, we tokenize transcripts using spaCy.[5] As part-of-speech tags predicted by spaCy are unreliable on our transcript text, we tag FOOTBALL using the ARK TweetNLP POS tagger (Owoputi et al., 2013), which is more robust to noisy and fragmented text, including TV subtitles (Jørgensen et al., 2016). Additionally, we use phrasemachine (Handler et al., 2016) to identify all corpus noun phrases. Finally, we identify player mentions in the transcript text using exact string matches of first, last, and full names against roster information from online archives; these rosters also contain the player's position.[6] Although we initially had concerns about the reliability of transcriptions of player names, we noticed minimal errors on more common names. Qualitatively, we noticed that even uncommon names were often correctly transcribed and capitalized; we leave a more systematic study for future work.

[4] We specifically query for full NFL|NCAA|college football games 1960s|1970s|1980s|1990s|2000; the full list of channels is listed in the Appendix.
[5] https://spacy.io/ (version 2.1.3), Honnibal and Montani (2017).
[6] Roster sources are listed in the Appendix. We tag first- and last-name mentions only if they can be disambiguated to a single player in the rosters of the opposing teams.
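To make the mention-linking step concrete, the sketch below shows one way to implement exact-string matching against game rosters together with the single-name disambiguation rule from footnote [6]. It is a minimal illustration rather than the released FOOTBALL code; the roster schema, function name, and player IDs are hypothetical.

```python
# Hypothetical sketch of the mention-linking step in Section 2.1 (not the
# authors' released code). `tokens` is the tokenized transcript of one game;
# `roster` lists both teams' players, e.g.
#   {"id": 12, "first": "Baker", "last": "Mayfield", "position": "QB"}.
def link_mentions(tokens, roster):
    # Full names match as exact token sequences; a bare first or last name
    # counts only if it resolves to a single player on either roster
    # (the disambiguation rule in footnote 6).
    full = {tuple(f"{p['first']} {p['last']}".split()): p["id"] for p in roster}
    single = {}
    for p in roster:
        single.setdefault(p["first"], set()).add(p["id"])
        single.setdefault(p["last"], set()).add(p["id"])

    mentions, i = [], 0
    while i < len(tokens):
        name = next((n for n in full if tuple(tokens[i:i + len(n)]) == n), None)
        if name is not None:
            mentions.append((i, i + len(name), full[name]))
            i += len(name)
            continue
        ids = single.get(tokens[i], set())
        if len(ids) == 1:
            mentions.append((i, i + 1, next(iter(ids))))
        i += 1
    return mentions  # (start, end, player_id) token spans
```

A real pipeline would also normalize capitalization and punctuation in the captions, but the matching and disambiguation logic mirrors the description above.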
2.2 Identifying player race

Racial identity in the United States is a creation of complex, fluid social and historical processes (Omi and Winant, 2014), rather than a reflection of innate differences between fixed groups. Nevertheless, popular perceptions of race in the United States and the prior scholarship on racial bias in sports broadcasts which informs our work (Rainville and McCormick, 1977; Rada, 1996; Billings, 2004; Rada and Wulfemeyer, 2005) typically assume hard distinctions between racial groups, which measurably affect commentary. In this work, we do not reify these racial categories; we use them as commonly understood within the context of the society in which they arise.

To conduct a large-scale re-examination of this prior work, we must identify whether each player in FOOTBALL is perceived as white or nonwhite.[7] Unfortunately, publicly available rosters or player pages do not contain this information, so we resort to crowdsourcing. We present crowd workers on the Figure Eight platform with 2,720 images of professional player headshots from the Associated Press paired with player names. We ask them to "read the player's name and examine their photo" to judge whether the player is white or nonwhite. We collect five judgements per player from crowd workers in the US, whose high inter-annotator agreement (all five workers agree on the race for 93% of players) suggests that their perceptions are very consistent. Because headshots were only available for a subset of players, the authors labeled the race of an additional 1,948 players by performing a Google Image search for the player's name[8] and manually examining the resulting images. Players whose race could not be determined from the search results were excluded from the dataset.

[7] While we use the general term "nonwhite" in this paper, the majority of nonwhite football players are black: in 2013, 67.3% of the NFL was black and most of the remaining players (31%) were white (Lapchick, 2014).
[8] We appended "NFL" to every query to improve precision of results.

3 Analyzing FOOTBALL

We now demonstrate confounds in the data and revisit several established results from racial bias studies in sports broadcasting. For all experiments, we seek to analyze the statistics of contextual terms that describe or have an important association with a mentioned player. Thus, we preprocess the transcripts by collecting contextual terms in windows of five tokens around each player mention, following the approach of Ananya et al. (2019) for gendered mention analysis.[9]

[9] If multiple player mentions fall within the same window, we exclude each term to avoid ambiguity.
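As a concrete illustration of this preprocessing step, the sketch below collects the five-token windows and, following one reading of footnote [9], discards any window that overlaps a second player mention. It is our own minimal sketch, not the authors' code, and it assumes the hypothetical (start, end, player_id) spans from the earlier mention-linking example.

```python
# Hypothetical sketch of the context-window extraction in Section 3 (not the
# authors' released code). `mentions` holds (start, end, player_id) token
# spans, e.g. as produced by the link_mentions() sketch in Section 2.1.
def context_terms(tokens, mentions, window=5):
    terms = []
    for k, (start, end, player_id) in enumerate(mentions):
        lo, hi = max(0, start - window), min(len(tokens), end + window)
        # Footnote 9 (one interpretation): drop the window entirely if any
        # other player mention overlaps it, so no term is ambiguously shared.
        if any(j != k and s < hi and e > lo
               for j, (s, e, _) in enumerate(mentions)):
            continue
        terms.append((player_id, tokens[lo:start] + tokens[end:hi]))
    return terms  # (player_id, contextual terms) pairs
```

Terms gathered this way can then be aggregated by the linked player's race and position for the analyses that follow.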
We emphasize that different term extraction strategies are possible, corresponding to different [...]

Many players' actions ("passes the ball downfield") depend on the position they play, which is often skewed by race (Figure 1). Furthermore, the racial composition of mentions across different decades can differ dramatically (Figure 2 shows these changes for quarterback mentions), which makes the problem even more complex. Modeling biased descriptions of players thus requires disentangling attributes describing shifting, position-dependent player actions on the field (e.g., "Paulsen the tight end with a fast catch") from attributes referring to intrinsic characteristics of individual players ("Paulsen is just so, so fast").

[Figure 1: Almost all of the eight most frequently mentioned positions in FOOTBALL are heavily skewed in favor of one race.]

To demonstrate this challenge, we distinguish between per-position effects and racial