The Construction and Analysis of a Social Media Speech Act Corpus
ILLOCUTION ON TWITTER: THE CONSTRUCTION AND ANALYSIS OF A SOCIAL MEDIA SPEECH ACT CORPUS

A Dissertation submitted to the Faculty of the Graduate School of Arts and Sciences of Georgetown University in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Linguistics

By Laura Beth Bell, M.S.

Washington, DC
December 6, 2019

Copyright 2019 by Laura Beth Bell
All Rights Reserved

Thesis Advisor: Paul Portner, Ph.D.

ABSTRACT

In the years since J.L. Austin (1962) first proposed Speech Act Theory (SAT), the literature has taken it in many directions. One recurring point of discussion is the extent to which the direct illocutionary force of an utterance is overtly encoded (Searle & Vanderveken, 1985; Bach & Harnish, 1979; Kissine, 2009; among others). A related question, which has received notably less attention, is the exact identity of the elements proposed to be responsible for such encoding. This dissertation began with the goal of conducting an empirical investigation into the existence and identity of these illocutionary force indicating devices (IFIDs, Searle & Vanderveken, 1985). This investigation was realized through the construction and analysis of a corpus of posts from the social media platform Twitter. The data was annotated, using Amazon Mechanical Turk, for a variety of features including direct illocutionary act category, indirect illocutionary act presence and category, and hashtag functionality. The annotated corpus was analyzed using logistic regression and random forest methodologies, with direct act category as the response variable, in an attempt to identify potential IFIDs. Additional hypotheses that arose during the course of planning the project were also assessed. For example, the corpus was utilized to investigate stylistic accommodation on Twitter as a secondary goal.
The corpus analysis identified several features as having a significant effect on the model, though no features were identified by the model as individually significant for direct illocutionary act classification. The dissertation proposes that IFIDs may be realized as templates of typical features as opposed to categorical entities. The analysis also identified several stylistic features to which Twitter users appear to accommodate over time. The dissertation proposes that several of these features may be explained by a style that promotes authenticity and is uncertain about its own level of formality, both attributes of Twitter attested in the literature.

ACKNOWLEDGEMENTS

I could not decide where to start. Do you start with the greatest contributors, to let everyone know that, “These guys deserve top billing”? Or do you save them for last, letting the weight and suspense build until the final reveal? I ultimately went for neither, as I do not like the idea of ranking the support of family and friends anyway (what is this, MySpace?). I therefore now acknowledge the contributions of the following parties, in an intentionally randomized order.

My sincerest thanks:

To my cats, for emotional healing and excellent tuition as regards proper stepwise feature dropping.

To my committee members, for your crucial support and guidance.

To Meredith, for always reminding me of what is most important in life and demonstrating true dedication to hygge.

To Sarah, for sending me so many encouraging phone calls, texts, and care packages that it never actually felt like you lived 3,000 miles away.

To my pilot annotators, for your honest and constructive feedback.

To my advisor Paul, for encouraging me to try something new and being patient as I blazed that trail.

To my brother Scott, for pursuing a more stable line of work so that Mom only has to worry about one of us.

To Stacy and Shannon, for escaping with me into the land of André and Hallmark Christmas Movies when I just couldn’t look at Twitter anymore.

To my spouse Scott, for daily pep talks, unwavering faith, and a level of support that can only be classified as monumental.

To the ladies of my D&D group, for the bonus inspiration die and an additional +1 to constitution checks.

To my parents, for the innumerable opportunities that you have made possible with your constant love, support, and generosity.

To Babby, for waiting until after the defense to let me know you were there.

To Russ and Merrilee, for defying every “in-law” stereotype possible and truly loving and supporting me as one of your own.

To my anonymous annotators, for your critical participation and good-faith efforts.

To all members of the Ryals, Kuchta, and Bell crews, for giving me the sort of love and security that most people can only dream of.

And to Dan, for introducing me to Scott and letting us get cats even though you’re allergic.

The completion of this work is dedicated to the memory of Kim. You were a paragon of empathy and groundedness. I feel lucky to have benefited from your mentorship for as long as I did.
TABLE OF CONTENTS

CHAPTER 1: INTRODUCTION AND BACKGROUND
1.0 Introduction
1.0.1 Outline and Key Findings
1.1 Overview: Speech Act Theory
1.1.1 Austin (1962)
1.1.2 The Family Tree
1.1.2.1 Literalist Accounts of Illocutionary Force
1.1.2.1.1 Literalist Mentalist Accounts: Speaker Only
1.1.2.1.2 Literalist Mentalist Accounts: Speaker and Hearer
1.1.2.1.3 Literalist Non-Mentalist Accounts
1.1.2.2 Contextualist Accounts of Illocutionary Force
1.1.2.2.1 Contextualist Accounts: Reduced Role of the Proposition
1.1.2.2.2 Contextualist Accounts: No Role for Proposition
1.1.2.3 Summary Discussion
1.1.3 Direct Acts vs. Indirect Acts
1.1.4 Summary
1.1.4.1 Definitions of Key Terms
1.1.4.2 Motivation for Hypotheses 1 and 2
1.1.4.3 Motivation for Hypothesis 3
1.2 Overview: Technological Affordances of the Medium
1.2.1 Emoticons and Emoji
1.2.1.1 Background
1.2.1.1.1 Affecting Interpretation of Adjacent Text
1.2.1.1.2 Standing in for Words/Concepts/Phrases
1.2.1.2 Speech Act Status
1.2.2 Hashtags and @-tags
1.2.2.1 Background
1.2.2.1.1 Tagging Usages
1.2.2.1.2 Non-tagging Usages
1.2.2.2 Speech Act Status
1.2.3 Quotations
1.2.3.1 Background
1.2.3.1.1 The Name Theory
1.2.3.1.2 The Demonstrative Theory
1.2.3.1.3 The Identity Theory
1.2.3.2 Speech Act Status
1.2.4 Likes and Retweets
1.2.4.1 Background
1.2.4.1.1 Posting Functions of Likes and Retweets
1.2.4.1.2 Linguistic Function of Likes
1.2.4.1.3 Linguistic Function of Retweets
1.2.4.2 Speech Act Status
1.3 Summary: The Annotation Task So Far

CHAPTER 2: ANNOTATION INFORMATION AND GUIDELINES
2.0 Introduction
2.1 Annotation Guidelines
2.1.1 Theoretical Motivation
2.1.2 Comparability Motivation
2.1.2.1 Speech Act Studies in Computer-Mediated Communication
2.1.2.1.1 Twitter
2.1.2.1.2 Facebook
2.1.2.1.3 Online Discussion Forums
2.1.2.1.4 Instant Messaging
2.1.2.1.5 Email
2.1.2.2 Speech Act Studies in Spoken Conversation
2.1.2.3 Summary
2.1.3 Accessibility Motivation
2.1.3.1 Taxonomy
2.1.3.2 Category Definitions
2.1.3.3 Key Concept Definitions
2.1.4 Summary
2.2 Segmentation Guidelines and Final Form of the Task
2.2.1 Speech Act Units in Previous Literature
2.2.2 Segmentation Guidelines
2.2.2.1 Motivating Segmentation Guidelines
2.2.2.1.1 Full Sentences
2.2.2.1.2 Sentential Fragments
2.2.2.1.3 Conjoined and/or Coordinated Elements
2.2.2.1.4 Complement Clauses
2.2.2.1.5 Subordinate Clauses and Restrictive Relative Clauses
2.2.2.1.6 Non-restrictive Relative Clauses and Other Parenthetical Elements
2.2.2.1.7 Speech Act Units Interrupting