Large-Scale Analysis of Counseling Conversations: an Application of Natural Language Processing to Mental Health
Total Page:16
File Type:pdf, Size:1020Kb
Large-scale Analysis of Counseling Conversations: An Application of Natural Language Processing to Mental Health Tim Althoff,∗ Kevin Clark,∗ Jure Leskovec Stanford University althoff, kevclark, jure @cs.stanford.edu { } Abstract could make a difference in someone’s life. Thus far, quantitative evidence for effective conversation Mental illness is one of the most pressing pub- strategies has been scarce, since most studies on lic health issues of our time. While counsel- counseling have been limited to very small sample ing and psychotherapy can be effective treat- sizes and qualitative observations (e.g., Labov and ments, our knowledge about how to conduct successful counseling conversations has been Fanshel, (1977); Haberstroh et al., (2007)). How- limited due to lack of large-scale data with la- ever, recent advances in technology-mediated coun- beled outcomes of the conversations. In this seling conducted online or through texting (Haber- paper, we present a large-scale, quantitative stroh et al., 2007) have allowed counseling ser- study on the discourse of text-message-based vices to scale with increasing demands and to col- counseling conversations. We develop a set of lect large-scale data on counseling conversations and novel computational discourse analysis meth- their outcomes. ods to measure how various linguistic aspects of conversations are correlated with conver- Here we present the largest study on counseling sation outcomes. Applying techniques such conversation strategies published to date. We use as sequence-based conversation models, lan- data from an SMS texting-based counseling service guage model comparisons, message cluster- where people in crisis (depression, self-harm, sui- ing, and psycholinguistics-inspired word fre- cidal thoughts, anxiety, etc.), engage in therapeutic quency analyses, we discover actionable con- conversations with counselors. The data contains versation strategies that are associated with millions of messages from eighty thousand counsel- better conversation outcomes. ing conversations conducted by hundreds of coun- selors over the course of one year. We develop a set 1 Introduction of computational methods suited for large-scale dis- course analysis to study how various linguistic as- Mental illness is a major global health issue. In the pects of conversations are correlated with conversa- U.S. alone, 43.6 million adults (18.1%) experience tion outcomes (collected via a follow-up survey). mental illness in a given year (National Institute of We focus our analyses on counselors instead of Mental Health, 2015). In addition to the person di- individual conversations because we are interested rectly experiencing a mental illness, family, friends, in general conversation strategies rather than prop- and communities are also affected (Insel, 2008). erties of specific issues. We find that there are sig- In many cases, mental health conditions can be nificant, quantifiable differences between more suc- treated effectively through psychotherapy and coun- cessful and less successful counselors in how they seling (World Health Organization, 2015). However, conduct conversations. it is far from obvious how to best conduct counsel- Our findings suggest actionable strategies that are ing conversations. Such conversations are free-form associated with successful counseling: without strict rules, and involve many choices that *Both authors contributed equally to the paper. 463 Transactions of the Association for Computational Linguistics, vol. 4, pp. 463–476, 2016. Action Editor: Lillian Lee. Submission batch: 12/2015; Revision batch: 3/2016; 7/2016 Published 8/2016. c 2016 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license. 1. Adaptability (Section 5): Measuring the dis- cess can be found at http://snap.stanford. tance between vector representations of the lan- edu/counseling. guage used in conversations going well and go- Although we focus on crisis counseling in this ing badly, we find that successful counselors work, our proposed methods more generally apply are more sensitive to the current trajectory of to other conversational settings and can be used to the conversation and react accordingly. study how language in conversations relates to con- 2. Dealing with Ambiguity (Section 6): We de- versation outcomes. velop a clustering-based method to measure differences in how counselors respond to very 2 Related Work similar ambiguous situations. We learn that Our work relates to two lines of research: successful counselors clarify situations by writ- ing more, reflect back to check understanding, Therapeutic Discourse Analysis & Psycholinguis- The field of conversation analysis was born in and make their conversation partner feel more tics. the 1960s out of a suicide prevention center (Sacks comfortable through affirmation. and Jefferson, 1995; Van Dijk, 1997). Since then 3. Creativity (Section 6.3): We quantify the di- conversation analysis has been applied to various versity in counselor language by measuring clinical settings including psychotherapy (Labov cluster density in the space of counselor re- and Fanshel, 1977). Work in psycholinguistics has sponses and find that successful counselors re- demonstrated that the words people use can re- spond in a more creative way, not copying the veal important aspects of their social and psycho- person in distress exactly and not using too logical worlds (Pennebaker et al., 2003). Previous generic or “templated” responses. work also found that there are linguistic cues as- 4. We develop a Making Progress (Section 7): sociated with depression (Ramirez-Esparza et al., novel sequence-based unsupervised conversa- 2008; Campbell and Pennebaker, 2003) as well as tion model able to discover ordered conversa- with suicude (Pestian et al., 2012). These find- tion stages common to all conversations. Ana- ings are consistent with Beck’s cognitive model of lyzing the progression of stages, we determine depression (1967; cognitive symptoms of depres- that successful counselors are quicker to get to sion precede the affective and mood symptoms) and know the core issue and faster to move on to with Pyszczynski and Greenberg’s self-focus model collaboratively solving the problem. of depression (1987; depressed persons engage in 5. Change in Perspective (Section 8): We de- higher levels of self-focus than non-depressed per- velop novel measures of perspective change us- sons). ing psycholinguistics-inspired word frequency In this work, we propose an operationalized psy- analysis. We find that people in distress are cholinguistic model of perspective change and fur- more likely to be more positive, think about ther provide empirical evidence for these theoretical the future, and consider others, when the coun- models of depression. selors bring up these concepts. We further show that this perspective change is associated Large-scale Computational Linguistics Applied with better conversation outcomes consistent to Conversations. Large-scale studies have re- with psychological theories of depression. vealed subtle dynamics in conversations such as co- ordination or style matching effects (Niederhoffer Further, we demonstrate that counseling success on and Pennebaker, 2002; Danescu-Niculescu-Mizil, the level of individual conversations is predictable 2012) as well as expressions of social power and using features based on our discovered conversation status (Bramsen et al., 2011; Danescu-Niculescu- strategies (Section 9). Such predictive tools could be Mizil et al., 2012). Other studies have connected used to help counselors better progress through the writing to measures of success in the context of re- conversation and could result in better counseling quests (Althoff et al., 2014), user retention (Althoff practices. The dataset used in this work has been re- and Leskovec, 2015), novels (Ashok et al., 2013), leased publicly and more information on dataset ac- and scientific abstracts (Guerini et al., 2012). Prior 464 work has modeled dialogue acts in conversational Dataset statistics speech based on linguistic cues and discourse coher- Conversations 80,885 ence (Stolcke et al., 2000). Unsupervised machine Conversations with survey response 15,555 (19.2%) Messages 3.2 million learning models have also been used to model con- Messages with survey response 663,026 (20.6%) versations and segment them into speech acts, top- Counselors 408 ical clusters, or stages. Most approaches employ Messages per conversation* 42.6 Hidden Markov Model-like models (Barzilay and Words per message* 19.2 Lee, 2004; Ritter et al., 2010; Paul, 2012; Yang et Table 1: Basic dataset statistics. Rows marked with * are al., 2014) which are also used in this work to model computed over conversations with survey responses. progression through conversation stages. selor picks the request from the queue and engages Very recently, technology-mediated counseling with the incoming conversation. We refer to the cri- has allowed the collection of large datasets on coun- sis counselor as the counselor and the person in dis- seling. Howes et al. (2014) find that symptom sever- tress as the texter. After the conversation ends, the ity can be predicted from transcript data with com- texter receives a follow-up question (“How are you parable accuracy to face-to-face data but suggest that feeling now? Better, same, or worse?”) which we insights into style and dialogue structure are needed use as our conversation quality ground-truth (we use to predict measures of patient progress. Counseling