Tweeting About Mental Health: Big Data Text Analysis of Twitter For
Total Page:16
File Type:pdf, Size:1020Kb
Dissertation Tweeting About Mental Health Big Data Text Analysis of Twitter for Public Policy Mikhail Zaydman This document was submitted as a dissertation in January 2017 in partial fulfillment of the requirements of the doctoral degree in public policy analysis at the Pardee RAND Graduate School. The faculty committee that supervised and approved the dissertation consisted of Douglas Yeung (Chair), Luke Matthews, and Joie Acosta. PARDEE RAND GRADUATE SCHOOL For more information on this publication, visit http://www.rand.org/pubs/rgs_dissertations/RGSD391.html Published by the RAND Corporation, Santa Monica, Calif. © Copyright 2017 RAND Corporation R® is a registered trademark Limited Print and Electronic Distribution Rights This document and trademark(s) contained herein are protected by law. This representation of RAND intellectual property is provided for noncommercial use only. Unauthorized posting of this publication online is prohibited. Permission is given to duplicate this document for personal use only, as long as it is unaltered and complete. Permission is required from RAND to reproduce, or reuse in another form, any of its research documents for commercial use. For information on reprint and linking permissions, please visit www.rand.org/pubs/permissions.html. The RAND Corporation is a research organization that develops solutions to public policy challenges to help make communities throughout the world safer and more secure, healthier and more prosperous. RAND is nonprofit, nonpartisan, and committed to the public interest. RAND’s publications do not necessarily reflect the opinions of its research clients and sponsors. Support RAND Make a tax-deductible charitable contribution at www.rand.org/giving/contribute www.rand.org Abstract: This dissertation examines conversations and attitudes about mental health in Twitter discourse. The research uses big data collection, machine learning classification, and social network analysis to answer the following questions 1) what mental health topics do people discuss on Twitter? 2) Have patterns of conversation changed over time? Have public messaging campaigns been able to change the conversation? 3) Does Twitter data provide insights that match the results obtained from survey and experimental data? This dissertation finds that Twitter covers a wide range of topics, largely in line with the impact that these conditions have on the population. There is evidence that stigma about mental illness and the appropriation of mental health language is declining in Twitter discourse. Additionally the conversation is heterogeneous across various self‐ forming communities. Finally, I find that public messaging campaigns are small in scale and difficult to evaluate. The findings suggest that policy makers have a broad audience on Twitter, that there are communities engaged with specific topics, and that more campaign activity on Twitter may generate greater awareness and engagement from populations of interest. Ultimately, Twitter data appears to be an effective tool for analysis of mental health attitudes and can be a replacement or a complement for the traditional survey methods depending on the specifics of the research question. iii Table of Contents ABSTRACT: ........................................................................................................................................................ III LIST OF FIGURES ............................................................................................................................................... VII LIST OF TABLES ................................................................................................................................................. VII ACKNOWLEDGMENTS ....................................................................................................................................... IX CHAPTER 1 : INTRODUCTION .............................................................................................................................. 1 POLICY AND RESEARCH QUESTIONS .................................................................................................................................... 4 CHAPTER 2 : BACKGROUND AND MOTIVATION .................................................................................................. 6 INTRODUCTION .............................................................................................................................................................. 6 MENTAL HEALTH AND STIGMA .......................................................................................................................................... 6 STATE OF SOCIAL MEDIA ANALYSIS ..................................................................................................................................... 9 NETWORK AND CASCADE ANALYSIS .................................................................................................................................. 10 MACHINE LEARNING AND SUPERVISED CLASSIFICATION ........................................................................................................ 12 SENTIMENT ANALYSIS .................................................................................................................................................... 12 TOPIC MODELING ......................................................................................................................................................... 14 CHAPTER 3 : METHODS ..................................................................................................................................... 15 DATA ACQUISITION ....................................................................................................................................................... 16 Gathering Tweets about mental health .............................................................................................................. 16 Gathering campaign‐relevant Tweets ................................................................................................................ 19 Search results and working data sets ................................................................................................................. 20 DEVELOPMENT OF CODING SCHEME ................................................................................................................................. 20 Reliability of coders ............................................................................................................................................. 25 AUTOMATED CODING .................................................................................................................................................... 26 MODEL PERFORMANCE AND SELECTION ............................................................................................................................ 28 MODELING APPROACH COMPARISON ............................................................................................................................... 33 NETWORK ANALYSIS ...................................................................................................................................................... 37 CHAPTER 4 : CHARACTERIZING THE MENTAL HEALTH CONVERSATION ............................................................... 40 INTRODUCTION ............................................................................................................................................................ 40 MOST MENTAL HEALTH CONTENT ON TWITTER IS SELF‐FOCUSED AND FEW TWEETS SHARE MENTAL HEALTH RESOURCES ................... 40 APPROXIMATELY 10 PERCENT OF MENTAL HEALTH‐RELEVANT TWEETS ARE STIGMATIZING .......................................................... 41 NETWORK COMMUNITY CONVERSATIONS ABOUT MENTAL HEALTH VARY IN TYPES AND TOPICS OF TWEETS ..................................... 42 AMONG LARGE COMMUNITIES WITH MENTAL HEALTH CONVERSATIONS, 71 PERCENT DEMONSTRATED LOW LEVELS OF STIGMA IN COMMUNITY CONVERSATIONS ........................................................................................................................................ 44 CHAPTER 5 : CHANGES OVER TIME .................................................................................................................... 47 INTRODUCTION ............................................................................................................................................................ 47 LONGITUDINAL TRENDS IN MENTAL HEALTH DISCOURSE ON TWITTER ...................................................................................... 47 CAMPAIGN TWITTER PRESENCE ....................................................................................................................................... 49 The volume of campaign‐related Twitter activity is too low to assess whether the campaign activity is affecting the overall Twitter conversation about mental health ........................................................................ 49 Three of the four campaigns show positive signs of engaging other Twitter users to reTweet messages ......... 51 v Engagement with the campaign is visible in the social network ........................................................................ 54 CHAPTER 6 : COMPARING DISSERTATION FINDINGS WITH THE LITERATURE ....................................................... 58 POPULARITY OF TOPICS IN SOCIAL MEDIA TRACKS IMPACT OF CONDITIONS ON PUBLIC HEALTH .....................................................