Tweeting About Mental Health: Big Data Text Analysis of Twitter For

Total Page:16

File Type:pdf, Size:1020Kb

Tweeting About Mental Health: Big Data Text Analysis of Twitter For Dissertation Tweeting About Mental Health Big Data Text Analysis of Twitter for Public Policy Mikhail Zaydman This document was submitted as a dissertation in January 2017 in partial fulfillment of the requirements of the doctoral degree in public policy analysis at the Pardee RAND Graduate School. The faculty committee that supervised and approved the dissertation consisted of Douglas Yeung (Chair), Luke Matthews, and Joie Acosta. PARDEE RAND GRADUATE SCHOOL For more information on this publication, visit http://www.rand.org/pubs/rgs_dissertations/RGSD391.html Published by the RAND Corporation, Santa Monica, Calif. © Copyright 2017 RAND Corporation R® is a registered trademark Limited Print and Electronic Distribution Rights This document and trademark(s) contained herein are protected by law. This representation of RAND intellectual property is provided for noncommercial use only. Unauthorized posting of this publication online is prohibited. Permission is given to duplicate this document for personal use only, as long as it is unaltered and complete. Permission is required from RAND to reproduce, or reuse in another form, any of its research documents for commercial use. For information on reprint and linking permissions, please visit www.rand.org/pubs/permissions.html. The RAND Corporation is a research organization that develops solutions to public policy challenges to help make communities throughout the world safer and more secure, healthier and more prosperous. RAND is nonprofit, nonpartisan, and committed to the public interest. RAND’s publications do not necessarily reflect the opinions of its research clients and sponsors. Support RAND Make a tax-deductible charitable contribution at www.rand.org/giving/contribute www.rand.org Abstract: This dissertation examines conversations and attitudes about mental health in Twitter discourse. The research uses big data collection, machine learning classification, and social network analysis to answer the following questions 1) what mental health topics do people discuss on Twitter? 2) Have patterns of conversation changed over time? Have public messaging campaigns been able to change the conversation? 3) Does Twitter data provide insights that match the results obtained from survey and experimental data? This dissertation finds that Twitter covers a wide range of topics, largely in line with the impact that these conditions have on the population. There is evidence that stigma about mental illness and the appropriation of mental health language is declining in Twitter discourse. Additionally the conversation is heterogeneous across various self‐ forming communities. Finally, I find that public messaging campaigns are small in scale and difficult to evaluate. The findings suggest that policy makers have a broad audience on Twitter, that there are communities engaged with specific topics, and that more campaign activity on Twitter may generate greater awareness and engagement from populations of interest. Ultimately, Twitter data appears to be an effective tool for analysis of mental health attitudes and can be a replacement or a complement for the traditional survey methods depending on the specifics of the research question. iii Table of Contents ABSTRACT: ........................................................................................................................................................ III LIST OF FIGURES ............................................................................................................................................... VII LIST OF TABLES ................................................................................................................................................. VII ACKNOWLEDGMENTS ....................................................................................................................................... IX CHAPTER 1 : INTRODUCTION .............................................................................................................................. 1 POLICY AND RESEARCH QUESTIONS .................................................................................................................................... 4 CHAPTER 2 : BACKGROUND AND MOTIVATION .................................................................................................. 6 INTRODUCTION .............................................................................................................................................................. 6 MENTAL HEALTH AND STIGMA .......................................................................................................................................... 6 STATE OF SOCIAL MEDIA ANALYSIS ..................................................................................................................................... 9 NETWORK AND CASCADE ANALYSIS .................................................................................................................................. 10 MACHINE LEARNING AND SUPERVISED CLASSIFICATION ........................................................................................................ 12 SENTIMENT ANALYSIS .................................................................................................................................................... 12 TOPIC MODELING ......................................................................................................................................................... 14 CHAPTER 3 : METHODS ..................................................................................................................................... 15 DATA ACQUISITION ....................................................................................................................................................... 16 Gathering Tweets about mental health .............................................................................................................. 16 Gathering campaign‐relevant Tweets ................................................................................................................ 19 Search results and working data sets ................................................................................................................. 20 DEVELOPMENT OF CODING SCHEME ................................................................................................................................. 20 Reliability of coders ............................................................................................................................................. 25 AUTOMATED CODING .................................................................................................................................................... 26 MODEL PERFORMANCE AND SELECTION ............................................................................................................................ 28 MODELING APPROACH COMPARISON ............................................................................................................................... 33 NETWORK ANALYSIS ...................................................................................................................................................... 37 CHAPTER 4 : CHARACTERIZING THE MENTAL HEALTH CONVERSATION ............................................................... 40 INTRODUCTION ............................................................................................................................................................ 40 MOST MENTAL HEALTH CONTENT ON TWITTER IS SELF‐FOCUSED AND FEW TWEETS SHARE MENTAL HEALTH RESOURCES ................... 40 APPROXIMATELY 10 PERCENT OF MENTAL HEALTH‐RELEVANT TWEETS ARE STIGMATIZING .......................................................... 41 NETWORK COMMUNITY CONVERSATIONS ABOUT MENTAL HEALTH VARY IN TYPES AND TOPICS OF TWEETS ..................................... 42 AMONG LARGE COMMUNITIES WITH MENTAL HEALTH CONVERSATIONS, 71 PERCENT DEMONSTRATED LOW LEVELS OF STIGMA IN COMMUNITY CONVERSATIONS ........................................................................................................................................ 44 CHAPTER 5 : CHANGES OVER TIME .................................................................................................................... 47 INTRODUCTION ............................................................................................................................................................ 47 LONGITUDINAL TRENDS IN MENTAL HEALTH DISCOURSE ON TWITTER ...................................................................................... 47 CAMPAIGN TWITTER PRESENCE ....................................................................................................................................... 49 The volume of campaign‐related Twitter activity is too low to assess whether the campaign activity is affecting the overall Twitter conversation about mental health ........................................................................ 49 Three of the four campaigns show positive signs of engaging other Twitter users to reTweet messages ......... 51 v Engagement with the campaign is visible in the social network ........................................................................ 54 CHAPTER 6 : COMPARING DISSERTATION FINDINGS WITH THE LITERATURE ....................................................... 58 POPULARITY OF TOPICS IN SOCIAL MEDIA TRACKS IMPACT OF CONDITIONS ON PUBLIC HEALTH .....................................................
Recommended publications
  • Integrated Public Alert and Warning System Wireless Emergency Alerts Understand and Respond to Public Sentiment
    Integrated Public Alert and Warning System Wireless Emergency Alerts Understand and Respond to Public Sentiment First Responders Group November 30, 2014 Integrated Public Alert and Warning System Wireless Emergency Alerts Understand and Respond to Public Sentiment HS HSSEDI™ Task HSHQDC-13-J-00097 Version 4.0 This document is a product of the Homeland Security Systems Engineering and Development Institute (HS HSSEDI™). ACKNOWLEDGEMENTS The Homeland Security Systems Engineering and Development Institute (hereafter “HS HSSEDI” or “HSSEDI”) is a federally funded research and development center established by the Secretary of Homeland Security under Section 305 of the Homeland Security Act of 2002. The MITRE Corporation operates HSSEDI under the Department of Homeland Security (DHS) contract number HSHQDC- 09-D-00001. HSSEDI’s mission is to assist the Secretary of Homeland Security, the Under Secretary for Science and Technology, and the DHS operating elements in addressing national homeland security system development issues where technical and systems engineering expertise is required. HSSEDI also consults with other government agencies, nongovernmental organizations, institutions of higher education and nonprofit organizations. HSSEDI delivers independent and objective analyses and advice to support systems development, decision making, alternative approaches and new insight into significant acquisition issues. HSSEDI’s research is undertaken by mutual consent with DHS and is organized by tasks in the annual HSSEDI Research Plan. This report presents the results of test planning conducted under HSHQDC-13-J-00097, Science and Technology Directorate (S&T) FEMA Integrated Public Alert and Warning System (IPAWS)/Wireless Emergency Alert (WEA) Test and Evaluation of HSSEDI’s Fiscal Year 2013 Research Plan.
    [Show full text]
  • 24 Hours of Vine: Big Data and Social Performance
    Technological University Dublin ARROW@TU Dublin Conference Papers Fine Arts 2014 24 Hours of Vine: Big Data and Social Performance Conor McGarrigle Technological University Dublin, [email protected] Follow this and additional works at: https://arrow.tudublin.ie/aaschadpcon Part of the Art and Design Commons Recommended Citation McGarrigle, C. (2014) ‘24 Hours of Vine; Big Data and Social Performance’. In Digital Research in the Humanities and Arts . Greenwich University, London. This Conference Paper is brought to you for free and open access by the Fine Arts at ARROW@TU Dublin. It has been accepted for inclusion in Conference Papers by an authorized administrator of ARROW@TU Dublin. For more information, please contact [email protected], [email protected]. This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 4.0 License 24 hours of Vine, big data and social performance Digital Research in the humanities and Arts Conference, 2014. Conor McGarrigle Emergent Digital Practices University of Denver Denver CO USA Abstract—24h Social is a generative a data-driven generative approached through a case study of the video sharing platform video installation that explores the social media phenomenon of Vine and this author's data art project 24h Social that Vine video as performances in data. The project is created from intervened to capture a day of Vine videos [2]. The project a database of appropriated Vine content, extracted from millions simultaneously celebrates Vine as a platform that facilitates of tweets, with each video shown at the time of its original succinct creative expressions whilst acknowledging that these creation.
    [Show full text]
  • Robôs-PP2-Inglês.Pdf
    DIRETORIA DE ANÁLISE DE POLÍTICAS PÚBLICAS DA FUNDAÇÃO GETULIO VARGAS Policy Paper 2 • Bots, social networks and politics in Brazil • Interference of automated profiles and political actors in the Brazilian electoral debate Rio de Janeiro, Brazil FGV DAPP 2018 Bots, social networks and politics in Brazil 1 DIRETORIA DE ANÁLISE DE POLÍTICAS PÚBLICAS DA FUNDAÇÃO GETULIO VARGAS SUMMARY • 1. EXECUTIVE SUMMARY 4 2. GENERAL OVERVIEW 4 2.1 ANALYSIS RESULTS OF THE INFLUENCE OF AUTOMATED PROFILES ON THE 2014 ELECTIONS 6 2.2 ANALYSIS RESULTS OF THE INFLUENCE OF AUTOMATED PROFILES ON THE 2018 ELECTIONS 6 3. ANALYSIS OF THE INFLUENCE OF BOTNETS ON THE 2018 ELECTIONS PUBLIC DEBATE 7 3.1 PKTWEET GENERATOR 10 3.1.1 Analysis of the profiles 12 3.1.2 Pktweet in diffusion chains 16 3.1.2 Gran Polo Patriotico: another suspect generator 18 3.2. GENERATOR SEESMIC FOR BLUEBERRY 19 3.2.1 Analysis of profiles 20 3.2.2 Other Argentine generators 26 3.2.3 Seesmic for Blueberry in the diffusion chains 27 4. COMPARATIVE ANALYSIS 28 5. CONCLUSION 34 REFERENCES 35 EDITORIAL STAFF 36 Bots, social networks and politics in Brazil 2 DIRETORIA DE ANÁLISE DE POLÍTICAS PÚBLICAS DA FUNDAÇÃO GETULIO VARGAS 1. EXECUTIVE SUMMARY ● Bot activity in social networks has affected Latin America political contexts on a regular basis for years, similarly to its already proven interference in other countries such as France, Germany, the United Kingdom, and the United States. ● In Brazil, the activity of automated profiles on Twitter had already been found in post sharing during the electoral campaign of Aécio Neves, Marina Silva, and Dilma Rousseff, the main presidential candidates of 2014.
    [Show full text]
  • Making Sense of Microposts (#Microposts2015) Big Things Come in Small Packages
    Proceedings of the 5th Workshop on Making Sense of Microposts (#Microposts2015) Big things come in small packages at the 24th International Conference on the World Wide Web (WWW’15) Florence, Italy 18th of May 2015 edited by Matthew Rowe, Milan Stankovic, Aba-Sah Dadzie Preface #Microposts2015, the 5th Workshop on Making Sense of Microp- moment, breaking news, local and context-specific information and osts, was held in Florence, Italy, on the 18th of May 2015, during personal stories, resulted in an increased sense of community and (WWW’15), the 24th International Conference on the World Wide solidarity. Interestingly, in response to emergencies, mass demon- Web. The #Microposts journey started at the 8th Extended Se- strations and other social events such as festivals and conferences, mantic Web Conference (ESWC 2011, as #MSM, with the change when regular access to communication services is often interrupted in acronym from 2014), and moved to WWW in 2012, where it and/or unreliable, developers are quick to offer alternatives that end has stayed, for the fourth year now. #Microposts2015 continues to users piggyback on to post information. Line was born to serve highlight the importance of the medium, as we see end users appro- such a need, to provide an alternative communication service and priating Microposts, small chunks of information published online support emergency response during a natural disaster in Japan in with minimal effort, as part of daily communication and to interact 2011. Its popularity continued beyond its initial purpose, and Line with increasingly wider networks and new publishing arenas. has grown into a popular (regional) microblogging service.
    [Show full text]
  • Using Discovertext for Large Scale Twitter Harvesting
    MDR, Vol. 41, pp. 121–125, December 2012 • Copyright © by Walter de Gruyter • Berlin • Boston. DOI 10.1515/mir-2012-0019 Using DiscoverText for Large Scale Twitter Harvesting Yngvil Beyer Yngvil Beyer ([email protected]) is Ph.D. fellow at the National Library of Norway and University of Oslo, Norway. Introduction How can institutions like the National Library of Norway preserve new media content like Twitter for future research and documentation? This question is the topic this short paper, where I am presenting an ongoing research project, focusing on the data collec- discussed thoroughly in my forthcoming article «@ tion and some preliminary findings. jensstoltenberg talte til oss på Twitter»3 (Beyer 2012). The research question builds on two important un- Then, returning to my initial question concerning derlying conditions. Firstly, preserving the nation’s how such preservation might be carried out. In early national memory is explicitly mentioned in the Na- 2010 the US Library of Congress announced that tional Library’s mission statement. they would archive all tweets ever tweeted (Library of Congress 2010). The details about this archive are The National Library of Norway is responsible yet to be revealed. The infrastructure necessary to for administering the Act relating to the legal give access to the collection is yet not in place. Nei- deposit of generally available documents. The ther is it specified to what extent the collection will purpose of this act is to ensure that documents be made available, to whom, in which format, under with generally available information are depos- what conditions or including which data.
    [Show full text]
  • On Two Existing Approaches to Statistical Analysis of Social Media Data
    On two existing approaches to statistical analysis of social media data M. Patone∗1 and L.-C. Zhang y1,2 1Department of Social Statistics and Demography, Univ. of Southampton, UK 2Statistisk sentralbyr˚a,Norway May 3, 2019 Abstract Using social media data for statistical analysis of general population faces com- monly two basic obstacles: firstly, social media data are collected for different objects than the population units of interest; secondly, the relevant measures are typically not available directly but need to be extracted by algorithms or machine learning tech- niques. In this paper we examine and summarise two existing approaches to statistical analysis based on social media data, which can be discerned in the literature. In the first approach, analysis is applied to the social media data that are organised around the objects directly observed in the data; in the second one, a different analysis is applied to a constructed pseudo survey dataset, aimed to transform the observed social media data to a set of units from the target population. We elaborate systematically the relevant data quality frameworks, exemplify their applications, and highlight some typical challenges associated with social media data. arXiv:1905.00635v1 [stat.AP] 2 May 2019 Key words: quality, representation, measurement, test, non-probability sample. 1 Introduction There has been a notable increase of interest from researchers, companies and governments to conduct statistical analysis based on social media data collected from platforms such as ∗[email protected] [email protected] 1 Twitter or Facebook (see e.g. Kinder-Kurlanda and Weller (2014); Braojos-Gomez et al.
    [Show full text]
  • Online Research Tools
    Online Research Tools A White Paper Alphabetical URL DataSet Link Compilation By Marcus P. Zillman, M.S., A.M.H.A. Executive Director – Virtual Private Library [email protected] Online Research Tools is a white paper link compilation of various online tools that will aid your research and searching of the Internet. These tools come in all types and descriptions and many are web applications without the need to download software to your computer. This white paper link compilation is constantly updated and is available online in the Research Tools section of the Virtual Private Library’s Subject Tracer™ Information Blog: http://www.ResearchResources.info/ If you know of other online research tools both free and fee based feel free to contact me so I may place them in this ongoing work as the goal is to make research and searching more efficient and productive both for the professional as well as the lay person. Figure 1: Research Resources – Online Research Tools 1 Online Research Tools – A White Paper Alpabetical URL DataSet Link Compilation [Updated: August 26, 2013] http://www.OnlineResearchTools.info/ [email protected] eVoice: 800-858-1462 © 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 Marcus P. Zillman, M.S., A.M.H.A. Online Research Tools: 12VPN - Unblock Websites and Improve Privacy http://12vpn.com/ 123Do – Simple Task Queues To Help Your Work Flow http://iqdo.com/ 15Five - Know the Pulse of Your Company http://www.15five.com/ 1000 Genomes - A Deep Catalog of Human Genetic Variation http://www.1000genomes.org/
    [Show full text]
  • Social Media Marketing: an Hour a Day, Second Edition
    Advance Praise for Social Media Marketing: An Hour a Day, Second Edition “If you’re looking for the definitive guide on social media, look no more. You are holding it in your hands. Whether you’re a social media novice or veteran, this book will be an invaluable resource in your journey to social media enlightenment and success.” —Kip Knight, President, KnightVision Marketing, and former Vice President of Marketing, eBay “Social Media Marketing: An Hour a Day, Second Edition is an important book not just for marketers but for all business leaders. It focuses on how social technologies are changing the very nature of the ecosystem that businesses operate in—from customers to partners to employees to other stakeholders. Dave covers this wonderfully and focuses on how a com- pany has to change its systems and processes internally to adapt to this ever changing reality. It’s a must read for current and future business leaders of all types!” —Gautam Ghosh, Platform Evangelist and India Marketing Head, BraveNewTalent “The thing I appreciate most about Dave’s book is that it is not only prescriptive but it is also built to fit into the busy schedule of any marketer (or executive). A must-read for anyone interested in putting social media marketing into practice.” —Aaron Strout, Head of Location-Based Marketing, WCG “Dave Evans gets social media! Social Media Marketing: An Hour a Day, Second Edition is a practical guidebook that integrates social strategy with the tools and metrics. I have used it with clients and business students with great results.
    [Show full text]
  • Enterprise Social Media Management Software: a Marketer's Guide
    MARTECH INTELLIGENCE REPORT: Enterprise Social Media Management Software: A Marketer’s Guide SIXTH EDITION A MarTech Today Research Report MARTECH INTELLIGENCE REPORT: Enterprise Social Media Management Software: A Marketer’s Guide Sixth Edition Table of Contents Scope and methodology .................................................................................................................................................... 2 ESMMS market overview ................................................................................................................................................... 3 Table 1: Social media advertising revenue, 2012-2016 ($ billions) .......................................................................... 3 Table 2: B2B vs. B2C social media spending forecasts ............................................................................................ 3 Table 3: Actual vs. predicted social media spending .............................................................................................. 4 Proving ROI remains a key challenge .............................................................................................................................. 4 Table 4: Proving ROI of social media ....................................................................................................................... 5 M&A activity reflects evolving market ............................................................................................................................. 5 Table 5: Selected ESMMS
    [Show full text]
  • Green Tweets
    twitter_praise_page Page i Thursday, March 12, 2009 12:35 PM Praise for Twitter API: Up and Running “Twitter API: Up and Running is a friendly, accessible introduction to the Twitter API. Even beginning web developers can have a working Twitter project before they know it. Sit down with this book for a weekend and you’re on your way to Twitter API mastery.” — Alex Payne, Twitter API lead “This book rocks! I would have loved to have had this kind of support when I initially created TwitDir.” — Laurent Pantanacce, creator of TwitDir “Twitter API: Up and Running is a very comprehensive and useful resource—any developer will feel the urge to code a Twitter-related application right after finishing the book!” — The Lollicode team, creators of Twitscoop “A truly comprehensive resource for anyone who wants to get started with developing applications around the Twitter platform.” — David Troy, developer of Twittervision “An exceptionally detailed look at Twitter from the developer’s perspective, including useful and functional sample code!” — Damon Cortesi, creator of TweetStats, TweepSearch, and TweetSum “This book is more than just a great technical resource for the Twitter API. It also provides a ton of insight into the Twitter culture and the current landscape of apps. It’s perfect for anyone looking to start building web applications that integrate with Twitter.” — Matt Gillooly, lead developer of Twalala “A wonderful account of the rich ecosystem surrounding Twitter’s API. This book gives you the insight and techniques needed to craft your own tools for this rapidly expanding social network.” — Craig Hockenberry, developer of Twitterrific twitter_praise_page Page ii Thursday, March 12, 2009 12:35 PM Twitter API: Up and Running Twitter API: Up and Running Kevin Makice Beijing • Cambridge • Farnham • Köln • Sebastopol • Taipei • Tokyo Twitter API: Up and Running by Kevin Makice Copyright © 2009 Kevin Makice.
    [Show full text]
  • THE SILENCE I CARRY Disclosing Gender-Based Violence in Forced Displacement GUATEMALA & MEXICO • Exploratory Report 2018
    THE SILENCE I CARRY Disclosing gender-based violence in forced displacement GUATEMALA & MEXICO • Exploratory Report 2018 TABLE OF CONTENTS TABLE OF CONTENTS ACRONYMS AND ABBREVIATIONS ...............................................................................1 ACRONYMS AND ABBREVIATIONS ...............................................................................1 EXECUTIVE SUMMARY ...................................................................................................2 EXECUTIVE SUMMARY ...................................................................................................2 INTRODUCTION .............................................................................................................5 INTRODUCTION .............................................................................................................5 METHODS AND ACTIVITIES ...........................................................................................7 METHODS AND ACTIVITIES ...........................................................................................7 FINDINGS .......................................................................................................................8 FINDINGS .......................................................................................................................8 DISCUSSION .................................................................................................................21 DISCUSSION .................................................................................................................21
    [Show full text]
  • Intro to Digital Tech & Emerging Media
    DTEM 1401 Version 1.2 Fall 2019 INTRO TO DIGITAL TECH & EMERGING MEDIA Course Information The course Introduction to Digital Technology and Emerging Media offers a comprehensive overview of the possibilities of Course Schedule: Mon & Thu communication in a digital world. Through a series of readings, 10-11:15 pm lectures and assignments, students study the rhetoric, history, theory, and practice of new media. Location: FMH 301 As the digital media landscape is constantly evolving, this course will Instructor Contact take a specific interest in understanding the evolution of media technologies and investigate the emergence of older forms of “new” E-mail: [email protected] media, from the original internet to big data, from graphical user Twitter: @klangable interfaces to social media platforms. As we do so, we will focus on how we use digital media, and how that use impacts individual Phone: 718-817-4870 identities, connections between people, our knowledge levels, Office: Faculty Memorial Hall, relationships of power, and so on. Room 438 Office Hours: MR 12:30-2pm, Objectives email for appointment. The course will allow students to: How to email your professor • gain an understanding of core concepts of digital content, such as http://klangable.com/blog/? mobility, interactivity, networking, as well its technical components page_id=4746 and how it impacts communication and information. • historicize media technologies we consider(ed) “new” media. • understand and contribute to contemporary debates over changes in identity, sociality, the economy, education, and play associated with the emergence of new media. • recognize how digital media constantly impact and/or structure their everyday social interactions, identities, and seemingly- mundane or rote behaviors.
    [Show full text]