
Using BERT for Qualitative Content Analysis in Psycho-Social Online Counseling

Philipp Grandeit, Carolyn Haberkern, Maximiliane Lang, Jens Albrecht, Robert Lehmann Nuremberg Institute of Technology Georg Simon Ohm, Nuremberg, Germany {grandeitph64509, haberkernca76525, langma76539, albrechtje, lehmannro}@th-nuernberg.de

Abstract

Qualitative content analysis is a systematic method commonly used in the social sciences to analyze textual data from interviews or online discussions. However, this method usually requires high expertise and manual effort because human coders need to read, interpret, and manually annotate text passages. This is especially true if the system of categories used for annotation is fine-grained and semantically rich. Therefore, qualitative content analysis could benefit greatly from automated coding. In this work, we investigate the usage of machine learning-based text classification models for automatic coding in the area of psycho-social online counseling. We developed a system of over 50 categories to analyze counseling conversations, labeled over 10,000 text passages manually, and evaluated the performance of different machine learning-based classifiers against human coders.

1 Introduction

1.1 Psycho-Social Online Counseling

Online counseling has developed into a full-fledged psycho-social counseling service in Germany since the 1990s. Today, people can get advice on a wide variety of psycho-social topics in web forums and dedicated text-based counseling platforms. Online counseling is provided by psycho-social professionals who have received special training in this method. Similar to face-to-face psycho-social counseling, some aspects are known to make up high-quality online counseling, but there is little empirical evidence for specific impact factors (Fukkink et al. 2009, Dowling & Rickwood 2014).

Due to the complexity of the content, quantitative approaches have not been able to analyze the meaning and significance of methodical patterns in large numbers of counseling conversations (Navarro et al. 2019). It is, however, possible to understand and describe the content of online counseling with qualitative approaches (Bambling et al. 2008, Gatti et al. 2016). This allows linking certain interventions of the counselors to the reactions of the clients on a case-by-case basis. But generalized statements on causal relationships are not possible with the small number of cases from qualitative studies (Ersahin & Hanley 2017).

An analysis of large numbers of counseling conversations using qualitative content analysis tools would help to better understand how successful online counseling works. Few related studies on these topics are available. Althoff et al. (2016) defined different models to measure general conversation strategies like adaptability, dealing with ambiguity, creativity, making progress, or change in perspective, and illustrated their applicability on a corpus of data from SMS counseling. Pérez-Rosas et al. (2019) analyzed the quality of counseling communications based on video recordings. Their automatic classifier used linguistic aspects of the content and could predict counseling quality with relatively good accuracy. However, neither of the mentioned approaches had the intention to recognize the meaning of individual phrases, even though this deep understanding is crucial to eliminate weaknesses in the education of online counselors (Luitgaarden et al. 2016, Nieuwboer et al. 2014). In addition, such systems could be developed to provide online advisors with practical suggestions for improving their work.

1.2 Qualitative Content Analysis

Qualitative research is a generic term for various research approaches. It attempts to gain a better understanding of people's social realities and to draw attention to recurring processes, patterns of interpretation, and structural characteristics (Kergel, 2018).

One such research approach deals with the content analysis of texts, the so-called qualitative content analysis according to Mayring (2015). It is a central source of scientific knowledge in qualitative social research. It tries to determine the subjective meaning of contents in texts. For this purpose, categories are formed based on known scientific theories on the topic and the discursive examination of the content. The definitions of those categories, along with representative text passages, are summarized in a codebook.

Then, human coders are coached in using the codebook. The coaching process and the implementation of the coding require high human expertise and manual effort because the coders must read, interpret, and annotate each text passage. Thus, qualitative studies can only be applied to a limited number of texts. Furthermore, it is hardly possible to define the categories so precisely that all coders produce identical results, as human language is inherently ambiguous and its interpretation is always partly subjective.

Machine learning could be a solution to this dilemma: If a trained model were able to categorize parts of the conversations according to a given codebook with similar accuracy as a human, the time-consuming text analysis could be automated.

1.3 Machine Learning for Qualitative Content Analysis

Previous studies have shown that supervised machine learning is generally suitable for qualitative content analysis (Crowston et al. 2010, Scharkow 2013). However, these studies used only a few categories that could be distinguished relatively easily, e.g. news categories like sports and business.

Online counseling, in contrast, is a complex domain. A detailed system of categories is necessary to identify impactful patterns in counseling conversations. Additionally, many categories such as "Empathy" or "Compassion" are quite similar in terms of the words used and can only be distinguished if the model is able to somehow "understand" the meaning of the texts.

Recent neural models have drastically outperformed previous approaches for sophisticated problems like sentiment analysis and emotion detection (Howard & Ruder 2018, Devlin et al. 2018, Chatterjee et al. 2019). We wanted to investigate if these models can be used for qualitative content analysis of online counseling conversations.

1.4 Research Questions / Contribution

Our first research question is whether it is possible to train a model to identify psycho-social codes with human-like precision. It also needs to be clarified whether a certain machine learning approach is particularly well suited for certain topics.

It is assumed that this training does not work equally well with all codes of the codebook. Therefore, the second question is which characteristics codes must have in order to be learned particularly well or particularly poorly.

In social science research, the discussion of different assessments of text passages is an important part of the scientific process. Therefore, the analysis of codes incorrectly assigned by a model is an important part of this work. The third research question is, therefore: What differences can be observed between the machine and human coding of text passages? If the deviations are plausible, they can be perceived as enriching the discursive process.

1.5 Approach and Structure of the Paper

For the experimental evaluation, the social scientists in our interdisciplinary team created a codebook consisting of over 50 fine-grained categories and labeled over 10,000 text sequences of psycho-social counseling conversations (described in Section 2). The computer scientists then trained and evaluated a support-vector machine and different state-of-the-art models (e.g. ULMFit and BERT) on the provided data set (Section 3). Finally, the team investigated how human coders from the social sciences perform in comparison to the BERT model on a subset of the data (Section 4).

2 Creating the Data Set

Online forums for psycho-social counseling provide a good basis for an empirical evaluation because they contain large amounts of publicly accessible data. For our study, we used posts from a German site for parent counseling. Here, parents who have problems in bringing up their children are seeking advice. Possible topics are, for example, drug abuse by the child or inadequate school performance. A user can start a new thread with a problem description. Professional counselors reply and discuss solution approaches with the initial user and others.
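Purely as an illustration of the data described above and labeled in Section 2.2, the following Python sketch shows one possible way to represent threads, posts, and the non-overlapping labeled sequences; the class and field names are our own assumptions, not a format defined in this paper.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LabeledSequence:
    """A contiguous, non-overlapping span of a post, annotated with one codebook category."""
    text: str        # e.g. a greeting or a recommended action
    category: str    # one of the 51 codebook categories, e.g. "Recommendation for action"

@dataclass
class Post:
    author_role: str                                   # "counselor" or "client"; only counselor posts are coded
    sequences: List[LabeledSequence] = field(default_factory=list)

@dataclass
class Thread:
    problem_description: str                           # the parent's initial post
    posts: List[Post] = field(default_factory=list)    # subsequent replies in the thread
```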


Figure 2: Example of three labeled sequences. The original texts are in German.

Thus, each thread contains a series of posts with questions and suggestions about the initially described problem. Since we are especially interested in counseling patterns, we focused on the posts of professional counselors in our analysis.

2.1 Development of the Codebook

Based on existing scientific theories on online counseling (Fukkink et al. 2009, Dowling & Rickwood 2014) and first analyses of the text content, a first version of the codebook was created. The various aspects expected in counseling conversations were mapped to a logical hierarchical structure (see Figure 1). The top level covers general counseling aspects, such as "General attitudes" or "Impact factors". On the intermediate level, these aspects are distinguished more finely, e.g. "Help for problem overcoming". The categories at the lowest level are the ones used for the annotation of the text passages, such as "Recommendation for action" or "Warning / forecast".

Figure 1: Illustration of the codebook with an exemplary breakdown of the categories

The different codes were defined as precisely as possible and provided with typical examples. The team of coders applied this codebook to the counseling texts in several rounds and iteratively improved it. The final version consists of 51 granular categories (see Appendix A).

2.2 Data Labeling

Based on the codebook described in Section 2.1, a team of coding social scientists manually labeled over 10,000 text sequences in 336 threads. Such a sequence can consist of only a few words (e.g. a greeting) or of multiple sentences (e.g. a recommended action). Sequences, however, do not overlap, i.e. each word should be part of only one labeled sequence. See Figure 2 for an example.

In the end, we obtained a heavily imbalanced data set: The average number of samples per category is about 200, but the numbers vary greatly (see Appendix A for more details). For some categories in the area "Impact factors", e.g. "Evaluation / understanding / calming" or "Experience / explanation / example", we obtained over 1,000 samples, whereas other categories, including "Change" or "Suggestion to put oneself in a problem situation physically", are barely represented. Such an unequal distribution of the frequencies of single codes is not unusual in the social sciences. Since qualitative content analysis involves no statistical analysis, this is usually not a problem. There are even some research approaches that consider the analysis of very rare codes, in particular, to be extremely insightful (Glaser 2017).

2.3 Data Preparation and Preprocessing

After labeling, we tested the impact of common preprocessing techniques like lemmatization and the removal of usernames. It turned out that both the support-vector machine classifier and the BERT model work best without any of these techniques. Therefore, we used the labeled data without such modifications.

However, the BERT model can only process fixed-length sequences consisting of at most 512 subword units called WordPiece tokens (Vaswani et al., 2017). Thus, we restricted the sequence length for all training data. We decided to work with a limit of only 256 WordPiece tokens. This value provides a good trade-off between performance and resource consumption in our setting. Longer sequences yield potentially more accurate results but generate a high overhead because all sequences must be padded to the specified length. Since only a little more than 1% of the complete data samples contain more than 256 WordPiece tokens, we did not lose much information (cf. Table 1). Instead, the shorter limit allowed using higher batch sizes and faster training.

Sequence Length (WordPiece Tokens)    Number of Text Sequences in the Data Set
0-64                                   8846
65-128                                  814
129-256                                 310
257-512                                 101
>512                                     16

Table 1: Distribution of text sequence lengths (in WordPiece tokens) in the data set.
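To illustrate the length restriction, the sketch below shows how labeled sequences could be tokenized and truncated to 256 WordPiece tokens with the Hugging Face tokenizer of the German BERT model used in Section 3.2; the exact preprocessing code of our pipeline is not reproduced here, and the variable names are illustrative.

```python
from transformers import AutoTokenizer

# Assumed checkpoint: the uncased German BERT model referenced in Section 3.2 (DBMDZ, 2019).
tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-uncased")

MAX_LEN = 256  # limit chosen as a trade-off between accuracy and resource consumption

def encode_sequences(texts):
    """Tokenize labeled text sequences, truncating and padding to MAX_LEN WordPiece tokens."""
    return tokenizer(
        texts,
        max_length=MAX_LEN,
        truncation=True,        # sequences longer than 256 tokens are cut off
        padding="max_length",   # shorter sequences are padded to the fixed length
        return_tensors="pt",
    )

# Example: count the WordPiece tokens of a (hypothetical) labeled sequence
example = "Hast Du mit den Erzieherinnen gesprochen?"
print(len(tokenizer.tokenize(example)))  # number of WordPiece tokens before truncation
```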


To make the results of the different classifiers comparable and to take the data set imbalance into account, a stratified 70-30 train-test split was performed on the data set. This results in a training data set with 7169 samples and a test data set with 3072 samples in total. See Appendix A for the number of samples in each category.

3 Model-Based Classification of Psycho-Social Text Sequences

As a result of the created codebook and the collected data, our classification task consists of classifying psycho-social text sequences into one of 51 categories. For the training of the classifiers, the data set described in the previous section with 7169 samples is used. The created models are then evaluated against the 3072 samples in our test data set.

3.1 Support-Vector Machine as a Baseline

The support-vector machine (SVM) is a commonly used classifier due to being lightweight, benefitting from fast training times, and still achieving good results in text classification tasks (Aggarwal, 2018, pp. 12). Therefore, the SVM was chosen as a baseline model. The prepared data was transformed into TF-IDF vectors (bag-of-words) for training and evaluation (Aggarwal, 2018, pp. 24-26).

The model was implemented using the scikit-learn library. The hyperparameters were chosen according to the results of our hyperparameter tuning. Apart from the default parameters of the TF-IDF vectorizer, a max_df value of 0.5 and a min_df value of 0 were used. Additionally, inverse document frequency reweighting was enabled, and unigrams as well as bigrams were considered. The support-vector classifier itself used a sigmoid kernel with the gamma value set to "scale", a C value of 10, and enabled probability estimates, which internally use 5-fold cross-validation.

The SVM achieved a total accuracy of 68.8% on the test data (cf. Table 2). Due to the heavily imbalanced data set, however, the total accuracy is not a good indicator of the model's performance. Thus, we also calculated the macro and weighted F1 scores.

Metric              SVM     BERT
Accuracy            68.8%   75.8%
Macro F1 score      39.7%   29.2%
Weighted F1 score   68.0%   74.4%

Table 2: Evaluation metrics on the test data set

The SVM achieves a weighted F1 score of 68.0% (close to the accuracy) and a macro F1 score of 39.7%. The low macro F1 indicates that classes with little support are frequently misclassified.

A detailed analysis of the results shows that the SVM achieves quite good results in categories with a large number of training samples. For instance, an F1 score of 76.2% is achieved in the category "Experience / explanation / example" with 1398 training and 599 test sequences. Furthermore, simple sequences that only contain a few keywords, such as greeting phrases in the category "Start of conversation", can also be identified quite well, even though only a few training samples exist. In particular, the category "General salutation" achieves an F1 score of 75.0% while only having 22 training and 9 test samples. More complex categories, such as the expression of "Empathy for others", however, achieve lower F1 scores of 59.8% even with a relatively high number of 118 training and 51 test samples. Other categories like "Warning / forecast" achieve even lower F1 scores of only 29.3% despite having 71 training and 30 test samples.
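A minimal sketch of this baseline, assuming the labeled sequences are already available as parallel lists `texts` and `labels` (the data-loading step is not shown), could look as follows in scikit-learn. It mirrors the reported settings (TF-IDF with unigrams and bigrams, max_df of 0.5, min_df of 0, sigmoid-kernel SVC with C=10 and probability estimates) and the stratified 70-30 split, but it is not the authors' original script.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# texts: list of labeled counseling sequences; labels: one of the 51 category names.
# Note: stratification requires at least two samples per class, so extremely rare
# codes (cf. Appendix A) need special handling in practice.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.3, stratify=labels, random_state=42
)

svm_baseline = make_pipeline(
    TfidfVectorizer(max_df=0.5, min_df=0.0, ngram_range=(1, 2), use_idf=True),
    SVC(kernel="sigmoid", gamma="scale", C=10, probability=True),
)
svm_baseline.fit(X_train, y_train)

y_pred = svm_baseline.predict(X_test)
print("Accuracy:   ", accuracy_score(y_test, y_pred))
print("Macro F1:   ", f1_score(y_test, y_pred, average="macro"))
print("Weighted F1:", f1_score(y_test, y_pred, average="weighted"))
```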


3.2 BERT as an Advanced Classifier

BERT is a multi-layer bidirectional Transformer encoder based on the original Transformer implementation described in Vaswani et al. (2017). BERT is typically pre-trained on two unsupervised learning tasks. After the pre-training, the model can be fine-tuned for the downstream task (Devlin et al., 2018).

For the classification task in our approach, we used the BertForSequenceClassification implementation from the Hugging Face Transformers library (Wolf et al., 2019), which combines the BERT Transformer model with a sequence classification head on top (Hugging Face, 2020).

In total, we tested thirteen pre-trained BERT models. Among the ten tested German language models, the results varied between a weighted F1 score of 69.3% and 74.4% on the test data set, whereby the best result was achieved with the pre-trained uncased language model of the Bavarian State Library (DBMDZ, 2019). The three multilingual models achieved weighted F1 scores of up to 71.0%, with the best result coming from the pre-trained language model by DeepPavlov (DeepPavlov, n. d.). All of the following analyses are, therefore, based on the best performing DBMDZ BERT model.

The hyperparameters used for the fine-tuning were taken from the original BERT publication (Devlin et al., 2018). Since we are using text sequences with a length of 256 WordPiece tokens, a batch size of no more than 16 was possible due to GPU memory limitations. Larger models, especially the multilingual ones, allowed only a batch size of 8. Further testing has shown that the best results can be achieved with a learning rate of 2e-5 and 4 epochs.
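The fine-tuning configuration described above can be sketched as follows with the Hugging Face Transformers library. Whether a Trainer-based loop like this or a manual training loop was used is not stated in the paper, and the dataset wrapper and variable names are our assumptions.

```python
import torch
from torch.utils.data import Dataset
from transformers import (BertForSequenceClassification, BertTokenizerFast,
                          Trainer, TrainingArguments)

MODEL_NAME = "dbmdz/bert-base-german-uncased"   # best-performing model (DBMDZ, 2019)
NUM_LABELS = 51                                 # one output per codebook category

tokenizer = BertTokenizerFast.from_pretrained(MODEL_NAME)
model = BertForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=NUM_LABELS)

class SequenceDataset(Dataset):
    """Wraps labeled text sequences; label_ids are integer indices of the 51 categories."""
    def __init__(self, texts, label_ids):
        self.encodings = tokenizer(texts, truncation=True, padding="max_length", max_length=256)
        self.label_ids = label_ids

    def __len__(self):
        return len(self.label_ids)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.label_ids[idx])
        return item

# train_texts, train_label_ids, test_texts, test_label_ids are assumed to come
# from the stratified 70-30 split described in Section 2.3.
train_ds = SequenceDataset(train_texts, train_label_ids)
test_ds = SequenceDataset(test_texts, test_label_ids)

args = TrainingArguments(
    output_dir="bert-counseling",
    learning_rate=2e-5,                 # best learning rate found in our tests
    num_train_epochs=4,                 # best number of epochs found in our tests
    per_device_train_batch_size=16,     # limited by GPU memory at 256 tokens
    per_device_eval_batch_size=16,
)

trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=test_ds)
trainer.train()
```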


3.3 Analyzing the Classification Results

Table 2 shows the different evaluation metrics for both the SVM and the best BERT classifier. The low macro F1 score of 29.2% for the BERT classifier, compared to 39.7% for the SVM classifier, shows that the BERT classifier performs significantly worse on classes with few samples than the SVM classifier. The weighted F1 score of 74.4% for the BERT model compared to 68.0% for the SVM model, however, indicates that the BERT classifier outperforms the SVM if the whole data set is considered.

Category                                        SVM F1   BERT F1   Support (Training)
Other introduction                               27.3%    11.8%       37
Activation of resources (professional level)     43.2%    42.1%       49
Wish                                             63.8%    75.9%       80
Empathy for others                               59.8%    49.5%      118
Evaluation / understanding / calming             59.0%    67.0%     1136
Experience / explanation / example               76.2%    83.1%     1398

Table 3: Extract of the classification report

Table 3 shows an extract from the classification report. In general, the BERT classifier improves its performance with the increase in available training samples for each class. In specific categories, such as "Empathy for others", this is not true. Categories with this behavior often contain the previously mentioned category-specific keywords or phrases, which is why the simple bag-of-words approach outperforms the more complex BERT model from a statistical point of view. A detailed analysis of the sequences misclassified by the BERT model, however, has shown that the classification of these sequences is not inherently wrong but rather shows suitable alternative affiliations to categories. This behavior is examined in greater detail in Section 3.6.

3.4 Examining Other Classification Models

In addition to BERT, other classification models, such as DistilBERT (Sanh et al., 2019), XLM-RoBERTa (Conneau et al., 2019), XLM (Lample and Conneau, 2019), and ULMFit (Howard and Ruder, 2018), were examined in our study as well.

Classification Model    Weighted F1 score
SVM (baseline)          68.0%
BERT (best model)       74.4%
DistilBERT              70.4%
XLM-RoBERTa             70.5%
XLM                     65.1%
ULMFit                  71.2%

Table 4: Weighted F1 scores of all evaluated classification models

Table 4 shows the best weighted F1 score of each model. The DistilBERT model performs around 4% worse than the best BERT model on our test data set. This difference is in the range described by the authors of the DistilBERT paper (Sanh et al., 2019). In addition, both the XLM-RoBERTa and XLM models perform worse than the best BERT classifier. Apart from the Transformer approaches, the bidirectional RNN model ULMFit was also analyzed. The results show that the different Transformer models as well as the ULMFit model generally perform quite similarly on our classification task, except for the XLM model, which performs even worse than the simple SVM approach.

3.5 Explaining the Classifiers

Since the predictions of BERT, or of Transformer models in general, are often untransparent and difficult to justify, different approaches, such as LIME (Ribeiro et al., 2016) or Attention Flow (Abnar and Zuidema, 2020), can be used to generate model insights. While LIME takes a retrospective approach that can be applied to any classification model, Attention Flow tries to visualize the actual attention maps of Transformer models. Both approaches provide insights that can be used to explain the classification predictions of the models. Since we want to generate model insights regardless of the approach used to create the model, we decided to use LIME as our analysis tool of choice.

For example, the analysis of the sentence "Have you ever spoken to the kindergarten teachers?" (cf. the original German sentence in Figure 3) helps to further understand the model. Originally, the sequence was coded as "Follow-up question" by the expert coders. The BERT classifier classified this sequence correctly, whereas the SVM classifier assigned it to "Questions about possible support resources". While both assignments might sound reasonable at first, the question arises why each classifier made its prediction. To answer this question, the text heatmaps in Figure 3 were generated with LIME. The percentage values indicate how important the LIME model considers the corresponding word for the classification.

Figure 3: Text heatmaps highlighting the determining words for the classification decision

The BERT heatmap shows that the model mainly focuses on the words that form the question, "Hast", "Du", "mit", "den", "Erzieherinnen" (Engl. "have", "you", "with", "kindergarten teachers"), while the SVM heatmap shows that the SVM classifier considers all words as important for the classification but with a high focus on the word "Erzieherinnen" (Engl. kindergarten teachers), which is a possible support resource.

This strong focus on individual keywords by the SVM can be explained by the operating principle of the bag-of-words approach and verifies the assumption from Section 3.3 that the SVM performs well in classes with distinctive keywords. But examples like this show that this simple approach can also be misled when such distinctive keywords appear in more complex sequences in which the keyword is not decisive for the correct class and the context has to be considered as well.

Since LIME follows a bag-of-words evaluation model, it cannot provide additional insights into how exactly our BERT model handles context. Thus, we can only use LIME to illustrate whether the models' decisions are reasonable or not.
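A hedged sketch of how such LIME explanations can be produced for the fine-tuned BERT model is shown below; `model`, `tokenizer`, and the list `class_names` (the 51 category names) are assumed to come from Section 3.2, and the parameter values are illustrative.

```python
import torch
from lime.lime_text import LimeTextExplainer

model.eval()

def predict_proba(texts):
    """Return class probabilities for a list of raw texts, as LIME expects."""
    enc = tokenizer(list(texts), truncation=True, padding=True,
                    max_length=256, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    return torch.softmax(logits, dim=-1).numpy()

explainer = LimeTextExplainer(class_names=class_names)
explanation = explainer.explain_instance(
    "Hast Du mit den Erzieherinnen gesprochen?",   # example sentence from Figure 3
    predict_proba,
    num_features=10,       # number of words to weight
    num_samples=1000,      # perturbed samples used to fit the local surrogate model
    top_labels=1,          # explain the predicted class
)
label = explanation.available_labels()[0]
print(explanation.as_list(label=label))   # (word, weight) pairs behind a heatmap like Figure 3
```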


3.6 Analyzing Misclassified Sequences

To better understand our model and to identify further potential for improvement, the incorrectly classified test data were analyzed. Out of the 3072 test sequences, the BERT model classified 2325 sequences correctly. Out of the 747 incorrectly classified sequences, our team of social scientists manually examined a sample of 191 sequences. The inspected samples were randomly chosen based on conspicuous categories that were not on the diagonal of the confusion matrix. The summarized results of this examination are shown in Table 5.

Expert assessment                                                     Samples   Percentage
(I) Both actual and predicted label would fit                            62       32.4%
(II) Predicted label fits better than actual label                       49       25.7%
(III) Similar choice of words between actual and predicted classes       29       15.2%
(IV) Sequence contains keywords from other classes                       16        8.4%
(V) Assignment cannot be explained by the experts                        14        7.3%
(VI) Incorrect sequence                                                  12        6.3%
(VII) Special sequence (uncommon words; not enough context)               6        3.1%
(VIII) Multiple sentences with multiple categories                        3        1.6%

Table 5: Expert assessment of incorrectly classified text sequences

The general conclusion of this analysis is that 58.1% (Table 5, I+II) of the incorrectly classified sequences are not inherently wrong; rather, their assigned category depends on the different points of view of the coders. For example, the sequence "Have you ever talked to a pediatrician? Or do you have a family counseling center?" was initially encoded as a "Question about possible support resources" by the human coder, whereas the BERT model associated the sequence with a "Follow-up question". In our analysis, the experts concluded that both categories would fit. Another example, in which the predicted label fits even better than the actual label, is the sequence "This has to be done consequently, even if screaming is annoying. You have to go through it – sometime." This sequence was initially encoded as a "Warning / forecast" by the human experts. The BERT model, however, assigned it to the category "Recommendation for action". Since these different interpretation options are not only a technical issue but can also be observed in human coders, the intercoder reliability between an expert coder, an untrained human coder ("novice"), and BERT is analyzed in Section 4.

For another 23.6% of the analyzed sequences (Table 5, III+IV), we were able to trace the incorrect classification back to the use of keywords or similar terms shared between different categories. For example, the simple sequence "good luck" is considered to be a "Wish" by the human coders, whereas our BERT model mistakes this sequence for a traditional farewell phrase (category "Other farewell"). This behavior of the BERT model can be explained by the fact that some sequences in the training data contain closing phrases such as "Good luck [user]".

In 14 more cases (Table 5, V), the experts were unable to identify any distinctive features that caused the sequences to be classified incorrectly by the BERT model.

Apart from these technical insights, in 12 cases (Table 5, VI) weaknesses in the training data set were identified, such as incorrect assignments of the actual label previously made by the human coder, sequences composed by clients rather than counselors, or sequences that only contain single characters.

Furthermore, in a total of nine sequences (Table 5, VII+VIII), the experts declared the sequences as hard to assign even for humans due to the usage of uncommon words, a lack of context, or because the sequence consists of multiple sentences with multiple categories.

To estimate the impact of these interpretation options on the evaluation metrics, an adjusted accuracy can be estimated. This adjusted accuracy is calculated by transferring the proportion of analyzed incorrectly classified sequences that are not inherently wrong (Table 5, I+II) to the total of the 747 incorrectly classified sequences. This means that 58.1% of the originally incorrectly classified sequences can be considered correct. This leads to an increase of the correctly classified sequences from 2325 to 2759 (2325 + 0.581 × 747 ≈ 2759 of 3072 sequences), which corresponds to a more than satisfying accuracy of about 90%. Since this is only an overall estimation, adjusted F1 scores cannot be calculated.

3.7 Discussion about Improving the Model

To understand the influence of the availability of training samples, we ran multiple tests in which the number of training samples in a specific category was reduced. Hereby, we tested all categories that achieve an F1 score of 70% or higher. For each of these categories, six models were trained with a restricted number (10, 20, 50, 100, 250, and 500) of randomly selected training samples. All models were then evaluated on our test data set. The results show that simple categories, such as "General salutation", "Familiar salutation (without name)", "Welcoming", or "Follow-up question", only require about 50 training samples to achieve F1 scores of 0.71 or higher. However, categories that contain text sequences with more complex structures, such as "Experience / explanation / example" or "Recommendation for action", still show significant improvements when using 250, 500, or all available text sequences for training.

As described in Section 2, our training data set is unevenly distributed. Data set imbalance is a well-known problem in machine learning (He and Garcia, 2009) and in our case is due to the annotation process: available forum posts were annotated without specifically having the category distribution in mind. Typical techniques to reduce data set imbalance, such as random over-sampling or synthetic sampling with data generation (He and Garcia, 2009), cannot easily be applied to textual data, especially not when precise phrasing and wording is important for the classification, as in our case. One technique that might, however, lead to improvements is generating new text sequences by randomly combining sentences from other sequences of the same category. Other possible approaches, such as aggregating categories with few examples to their superset level, were also considered but dismissed since our goal is to predict categories on a detailed level.
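As a sketch of the recombination idea mentioned above (not an implementation evaluated in this paper), new training sequences for a rare category could be generated by re-sampling sentences from existing sequences of that category; the sentence splitter and the example texts are purely illustrative.

```python
import random
import re

def split_sentences(text):
    """Very rough sentence splitter; a proper German sentence tokenizer would be preferable."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def augment_category(sequences, n_new, max_sentences=2, seed=42):
    """Create n_new synthetic sequences by recombining sentences drawn from
    existing sequences of the same category."""
    rng = random.Random(seed)
    pool = [s for seq in sequences for s in split_sentences(seq)]
    synthetic = []
    for _ in range(n_new):
        k = rng.randint(1, max_sentences)
        synthetic.append(" ".join(rng.sample(pool, k=min(k, len(pool)))))
    return synthetic

# Hypothetical example for an underrepresented category such as "Warning / forecast":
rare_sequences = [
    "Das kann auf Dauer nicht gut gehen.",
    "Irgendwann wird sich das raechen, glauben Sie mir.",
]
print(augment_category(rare_sequences, n_new=3))
```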


Given the approximate number of required samples per category, we think that manually creating additional training data, especially for underrepresented classes and edge cases, will therefore help to improve the model in the future.

Another idea to improve the model is to take the model's first and second predictions into account. Human coders can then be supported with suggestions by the model during coding tasks and choose the best fitting label. This feedback can in turn be used to further improve the model.

4 BERT vs. Human Coders

Coding of text passages is to some degree dependent on the subjective interpretation of the coders. Especially for similar categories like "Empathy" and "Compassion", different coders will sometimes assign different labels to the same text. Thus, even human coders who were trained in the usage of the codebook will not reach 100% agreement. To get a better understanding of the applicability of our model for automatic coding, we compared the coding performance of BERT against a trained human coder familiar with the codebook ("expert") and an untrained human coder ("novice").

4.1 Intercoder Reliability between Experts

The degree of consensus among coders, the intercoder reliability, is often measured by Cohen's κ (kappa) coefficient (Cohen 1960, Burla et al. 2008). The maximum value of κ is 1; κ > 0.8 indicates almost perfect and κ > 0.6 substantial agreement.

During the creation of the training data, our experts regularly coded the same texts and aligned their coding style. After coding was finished, we calculated the κ coefficient between the two coders who had coded the most samples. Thereby, we considered only posts coded by both coders and text sequences with at least 75% overlap regarding the first and last word. We determined a κ coefficient of 0.73 between those two experts. This value is relatively high given our complex codebook with over 50 categories.

4.2 Intercoder Reliability between an Expert, a Novice, and BERT

To understand how our BERT model performs compared to human coders, we benchmarked the performance of the following three participants: The expert was one of the coders observed in the intercoder reliability measurement. The novice had only a little experience in text annotation and had just recently familiarized herself with the codebook and typical examples for each category. The third participant was our BERT classification model.

All participants had the task of annotating the same 50 text passages. Each text passage was randomly chosen from the set of previously unlabeled forum posts.

Besides measuring the intercoder reliability among the participants, we also wanted to generate indications about which sequence length is best suited for the application of the BERT model. For typical coding tasks in the social sciences, the length of a sequence to be coded is defined by a change in the occurring category. This contrasts with most machine-based classifiers, which expect a defined sequence of words as input. The choice of start and end for a label in continuous text is usually not part of the classification task.

Therefore, we generated three variants of the 50 text sequences for coding: The first data set consists of single sentences only, the second data set includes, if existing, the following sentence for each sample, and the third data set contains sequences of at most three consecutive sentences. Figure 4 illustrates the breakdown of an exemplary post.

Figure 4: Exemplary structure of the sequences within the different data sets

All three data sets were then coded independently by the participants. As before, the agreement between the different coders was measured using the κ coefficient (see Table 6).
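The κ values reported in Section 4.1 and in Table 6 can be computed with standard tooling; the following minimal sketch assumes the labels assigned by the three coders to the same passages are available as parallel lists (the variable names are ours).

```python
from sklearn.metrics import cohen_kappa_score

# expert_labels, novice_labels, bert_labels: category names assigned by each coder
# to the same 50 text passages (parallel lists, assumed to exist).
print("Expert vs. novice:", cohen_kappa_score(expert_labels, novice_labels))
print("Expert vs. BERT:  ", cohen_kappa_score(expert_labels, bert_labels))
print("Novice vs. BERT:  ", cohen_kappa_score(novice_labels, bert_labels))
```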

Sequence Length        Expert-Novice   Expert-BERT   Novice-BERT
Single sentence             0.55           0.64          0.53
Up to 2 sentences           0.53           0.60          0.43
Up to 3 sentences           0.50           0.50          0.38

Table 6: κ coefficients of the participants

Surprisingly, the intercoder reliability between BERT and the human expert is higher than the intercoder reliability between the expert and the novice, regardless of the sequence length. In its best case, the BERT classifier achieves nearly expert-like intercoder reliability with a value as high as 0.64, compared to the earlier calculated intercoder reliability of 0.73 between the two experts. It seems that the BERT model has learned the experts' style of coding from the training data better than an untrained human coder using the codebook.

While classifying sequences that contain only one sentence was rated as difficult by the human coders due to the missing context, sequences with up to three sentences were rated as too long since they often contained patterns from multiple categories. Therefore, sequences with a length of two sentences were rated as the best fitting length for classification by both the novice and the expert coder. In contrast to the ratings of the coders, the intercoder reliability shows the highest values when coding sequences with a length of only one sentence.

5 Conclusion

It has been shown that machine-based classifiers can reach human-like performance for the annotation of complex categories in psycho-social texts. The results indicate that the models learn to mimic the coding style of the initial creators of the training data. The trained BERT model was even better at coding than a human novice. As in other areas of machine learning, this bears the risk that a model also learns the bias from the training data. Therefore, it is important that human experts understand and regularly check the decisions of the model.

High coding quality could not be achieved for all codes, however. Especially underrepresented categories, which are common in the social sciences, are problematic. Thus, a sufficient number of training samples is an obvious prerequisite for good results.

The typical approach of the social sciences in analyzing text corpora consists of coding one text after the other and ignoring unequal frequencies of the individual codes. Our study shows that when using machine learning methods, it is better to generate training examples for as many categories as possible and to pay less attention to the complete coding of individual texts. This is an important finding for the organization of future studies in this field.

The investigation of misclassified sequences showed that many recorded misclassifications actually were minor mistakes. The model frequently chose not the actual but a very similar category, such that even human experts would regard the assignment as plausible. Thus, codes with very similar meanings must be distinguished more sharply to give the model a chance to learn to differentiate them.

The analysis of the sequences misclassified by BERT opens up new perspectives for the social sciences: More than half of the "incorrectly classified sequences" appeared to the human expert to be plausible or at least worthy of consideration. Since the discussion of the understanding of individual text passages is an important element of social science research, such plausible misinterpretations can enrich the research process. They offer an alternative way of looking at reality and force the human coder to either rethink his assessments or to justify them better.

Currently, we are working on improving the classification performance. One approach is the generation of additional training data for underrepresented categories. Another idea is using an ensemble of SVM and BERT as a classifier to better utilize the individual strengths of the different models. In any case, the findings on how the models work and perform help to consider such technical aspects in future social science research.

With regard to the application domain, we can conclude that it is definitely possible to analyze online counseling conversations with the help of machine learning. We intend to use machine learning in future research projects to investigate correlations between the different techniques used by counselors and the characteristics and reactions of clients. In addition to the question of whether successful counselors use certain techniques significantly more often than others, it can now be clarified whether certain approaches are particularly promising for certain target groups or specific problems. These findings can be integrated into the education of online counselors. Furthermore, assistance systems are conceivable that support online counselors in real time with information generated from this data.

In any case, the results of this study have shown that it is possible to merge the advantages of qualitative and quantitative approaches in social science with the help of machine learning. Automated data annotation for qualitative analysis is the cornerstone for future insights on an unprecedented level.


References

Abnar, S. and Zuidema, W. (2020) Quantifying Attention Flow in Transformers [Online]. http://dx.doi.org/10.18653/v1/2020.acl-main.385.

Aggarwal, C. C. (2018) Machine Learning for Text, Springer [Online]. https://doi.org/10.1007/978-3-319-73531-3.

Althoff, T., Clark, K. and Leskovec, J. (2016) Large-scale Analysis of Counseling Conversations: An Application of Natural Language Processing to Mental Health [Online]. http://dx.doi.org/10.1162/tacl_a_00111.

Bambling, M., King, R., Reid, W. and Wegner, K. (2008) Online counselling: The experience of counsellors providing synchronous single-session counselling to young people, Counselling and Psychotherapy Research, vol. 8, no. 2, pp. 110–116 [Online]. https://doi.org/10.1080/14733140802055011.

Burla, L., Knierim, B., Barth, J., Liewald, K., Duetz, M. and Abel, T. (2008) From Text to Codings: Intercoder Reliability Assessment in Qualitative Content Analysis, Nursing Research, vol. 57, no. 2, pp. 113–117 [Online]. https://doi.org/10.1097/01.nnr.0000313482.33917.7d.

Cameron, G., Cameron, D. M., Megaw, G., Bond, R. B., Mulvenna, M., O'Neill, S. B. et al. (2017) Towards a chatbot for digital counselling, Proceedings of the 31st International BCS Human Computer Interaction Conference (HCI 2017), BCS Learning & Development [Online]. https://doi.org/10.14236/ewic/HCI2017.24.

Chardon, L., Bagraith, K. S. and King, R. J. (2011) Counseling activity in single-session online counseling with adolescents: An adherence study, Psychotherapy Research, vol. 21, no. 5, pp. 583–592 [Online]. https://doi.org/10.1080/10503307.2011.592550.

Chatterjee, A., Narahari, K. N., Joshi, M. and Agrawal, P. (2019) SemEval-2019 Task 3: EmoContext – Contextual Emotion Detection in Text, Proceedings of the 13th International Workshop on Semantic Evaluation (SemEval-2019), pp. 39–48 [Online]. http://dx.doi.org/10.18653/v1/S19-2005.

Cohen, J. (1960) A Coefficient of Agreement for Nominal Scales, Educational and Psychological Measurement, vol. 20, pp. 37–46 [Online]. https://doi.org/10.1177/001316446002000104.

Conneau, A., Khandelwal, K., Goyal, N., Chaudhary, V., Wenzek, G., Guzmán, F., Grave, E., Ott, M., Zettlemoyer, L. and Stoyanov, V. (2019) Unsupervised Cross-lingual Representation Learning at Scale [Online]. http://dx.doi.org/10.18653/v1/2020.acl-main.747.

Crowston, K., Liu, X. and Allen, E. E. (2010) Machine learning and rule-based automated coding of qualitative data, Proc. Am. Soc. Info. Sci. Tech., vol. 47, pp. 1–2 [Online]. https://doi.org/10.1002/meet.14504701328.

DBMDZ (2019) German BERT [Online]. https://huggingface.co/dbmdz/bert-base-german-uncased.

DeepPavlov (n. d.) Sentence Multilingual BERT [Online]. https://huggingface.co/DeepPavlov/bert-base-multilingual-cased-sentence.

Devlin, J., Chang, M.-W., Lee, K. and Toutanova, K. (2018) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding [Online]. http://dx.doi.org/10.18653/v1/N19-1423.

Dowling, M. J. and Rickwood, D. J. (2014) Experiences of counsellors providing online chat counselling to young people, Journal of Psychologists and Counsellors in Schools, vol. 24, no. 2, pp. 183–196, Cambridge University Press, Cambridge, UK [Online]. https://doi.org/10.1017/jgc.2013.28.

Dowling, M. and Rickwood, D. (2015) Investigating individual online synchronous chat counselling processes and treatment outcomes for young people, Advances in Mental Health, vol. 12, no. 3, pp. 216–224 [Online]. https://doi.org/10.1080/18374905.2014.11081899.

Ersahin, Z. and Hanley, T. (2017) Using text-based synchronous chat to offer therapeutic support to students: A review of the research literature, Health Education Journal, vol. 76, no. 5, pp. 531–543 [Online]. https://doi.org/10.1177/0017896917704675.

Fukkink, R. G. and Hermanns, J. M. A. (2009) Children's experiences with chat support and telephone support, Journal of Child Psychology and Psychiatry, vol. 50, no. 6, pp. 759–766 [Online]. https://doi.org/10.1111/j.1469-7610.2008.02024.x.

Gatti, F. M., Brivio, E. and Calciano, S. (2016) "Hello! I know you help people here, right?": A qualitative study of young people's acted motivations in text-based counseling, Children and Youth Services Review, vol. 71, pp. 27–35 [Online]. https://doi.org/10.1016/j.childyouth.2016.10.029.

Glaser, B. G. and Strauss, A. L. (2017) Discovery of Grounded Theory: Strategies for Qualitative Research, Routledge.

Hanley, T. (2012) Understanding the online therapeutic alliance through the eyes of adolescent service users, Counselling and Psychotherapy Research, vol. 12, no. 1, pp. 35–43 [Online]. https://doi.org/10.1080/14733145.2011.560273.

He, H. and Garcia, E. A. (2009) Learning from Imbalanced Data, IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 9, pp. 1263–1284 [Online]. https://doi.org/10.1109/TKDE.2008.239.

Howard, J. and Ruder, S. (2018) Universal Language Model Fine-tuning for Text Classification [Online]. http://dx.doi.org/10.18653/v1/P18-1031.

Hugging Face (2020) BertForSequenceClassification [Online]. https://huggingface.co/transformers/model_doc/bert.html#transformers.BertForSequenceClassification.

Kergel, D. (2018) Qualitative Bildungsforschung – Ein integrativer Ansatz, Wiesbaden, Springer VS [Online]. https://doi.org/10.1007/978-3-658-18587-9.

King, R., Bambling, M., Reid, W. and Thomas, I. (2006) Telephone and online counselling for young people: A naturalistic comparison of session outcome, session impact and therapeutic alliance, Counselling and Psychotherapy Research, vol. 6, no. 3, pp. 175–181 [Online]. https://doi.org/10.1080/14733140600874084.

Lample, G. and Conneau, A. (2019) Cross-lingual Language Model Pretraining [Online]. http://arxiv.org/pdf/1901.07291v1.

van de Luitgaarden, G. and van der Tier, M. (2018) Establishing working relationships in online social work, Journal of Social Work, vol. 18, no. 3, pp. 307–325 [Online]. https://doi.org/10.1177/1468017316654347.

Mayring, P. (2015) Qualitative Content Analysis: Theoretical Background and Procedures, in A. Bikner-Ahsbahs, C. Knipping and N. Presmeg (eds), Approaches to Qualitative Research in Mathematics Education: Examples of Methodology and Methods (Advances in Mathematics Education), Dordrecht, Springer Netherlands, pp. 365–380 [Online]. https://doi.org/10.1007/978-94-017-9181-6_13.

Navarro, P., Bambling, M., Sheffield, J. and Edirippulige, S. (2019) Exploring Young People's Perceptions of the Effectiveness of Text-Based Online Counseling: Mixed Methods Pilot Study, JMIR Mental Health, vol. 6, no. 7, e13152 [Online]. https://doi.org/10.2196/13152.

Nieuwboer, C. C., Fukkink, R. G. and Hermanns, J. M. A. (2015) Single session email consultation for parents: An evaluation of its effect on empowerment, British Journal of Guidance & Counselling, vol. 43, no. 1, pp. 131–143 [Online]. https://doi.org/10.1080/03069885.2014.929636.

Pérez-Rosas, V., Wu, X., Resnicow, K. and Mihalcea, R. (2019) What Makes a Good Counselor? Learning to Distinguish between High-quality and Low-quality Counseling Conversations, Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, Association for Computational Linguistics, pp. 926–935 [Online]. https://doi.org/10.18653/v1/P19-1088.

Ribeiro, M. T., Singh, S. and Guestrin, C. (2016) "Why Should I Trust You?": Explaining the Predictions of Any Classifier [Online]. http://dx.doi.org/10.18653/v1/N16-3020.

Rodda, S. N., Lubman, D. I., Cheetham, A., Dowling, N. A. and Jackson, A. C. (2015) Single session web-based counselling: A thematic analysis of content from the perspective of the client, British Journal of Guidance & Counselling, vol. 43, no. 1, pp. 117–130 [Online]. https://doi.org/10.1080/03069885.2014.938609.

Sanh, V., Debut, L., Chaumond, J. and Wolf, T. (2019) DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter [Online]. http://arxiv.org/pdf/1910.01108v4.

Sefi, A. and Hanley, T. (2012) Examining the complexities of measuring effectiveness of online counselling for young people using routine evaluation data, Pastoral Care in Education, vol. 30, no. 1, pp. 49–64 [Online]. https://doi.org/10.1080/02643944.2011.651224.

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L. and Polosukhin, I. (2017) Attention Is All You Need [Online]. http://arxiv.org/pdf/1706.03762v5.

Wolf, T. et al. (2019) HuggingFace's Transformers: State-of-the-art Natural Language Processing [Online]. http://arxiv.org/pdf/1910.03771v5.


Appendix A. Codebook Including Number of Samples and Classification Results

Category (grouped by top level and superset level)                Training Support   Test Support   SVM F1 Score   BERT F1 Score

Start of conversation
  Salutation
    General salutation                                                   22              9             75.0%          82.4%
    Formal salutation (without name)                                      2              1              0.0%           0.0%
    Formal salutation (with name)                                         4              1            100.0%           0.0%
    Familiar salutation (without name)                                  139             59             71.7%          98.3%
    Familiar salutation (with name)                                     579            248             89.5%          98.6%
    Other salutation                                                      8              3              0.0%           0.0%
  Welcoming
    Welcoming                                                           102             44             91.3%          93.5%
    Introduction institution / consultant                                 5              2              0.0%           0.0%
    Other introduction                                                   37             16             27.3%          11.8%
  Conversation management
    Conversation management                                               1              0              0.0%           0.0%
    Organizational issues                                                14              6              0.0%           0.0%
    Technical issues                                                      6              2             66.7%           0.0%
    Reference to post                                                   190             81             53.2%          52.5%
    Explanatory modalities                                                4              1              0.0%           0.0%

General attitudes
  Empathy
    Empathy                                                               1              0              0.0%           0.0%
    Empathy for others                                                  118             51             59.8%          49.5%
    Compassion                                                           17              7             72.7%           0.0%
    Concern for others                                                    3              2              0.0%           0.0%
  Appreciation
    Congratulations                                                       1              1              0.0%           0.0%
    Praise / acknowledgement                                             43             18             38.5%          30.8%
    Gratitude / appreciation                                              8              3              0.0%           0.0%
  Congruence
    Wish                                                                 80             34             63.8%          75.9%

Impact factors (1 of 2)
  Analysis and clarification of …
    Analysis and clarification                                            3              2             66.7%           0.0%
    Follow-up question                                                  491            211             65.3%          85.1%
    Grasp                                                               176             75             42.5%          49.3%
    Evaluation / understanding / calming                               1136            487             59.0%          67.0%
  Agreement on consulting goals
    Demand for concern                                                    1              1              0.0%           0.0%
    Encouragement to think about the concern                              2              1              0.0%           0.0%
    Definition of the objective                                           1              1              0.0%           0.0%
  Creation of …
    Encouragement                                                       134             57             36.6%          34.2%
  Update on problem
    Change                                                                1              0              0.0%           0.0%
    Request for detailed description                                     10              5             25.0%           0.0%
    Suggestion to put oneself in a problem situation physically           1              1            100.0%           0.0%
    Suggestion to put oneself in a problem situation mentally             2              1              0.0%           0.0%

Impact factors (2 of 2)
  Activation of resources
    Question about possible support resources                            15              6             16.7%           0.0%
    Activation of resources (family)                                      4              2              0.0%           0.0%
    Activation of resources (friends)                                     2              1            100.0%           0.0%
    Activation of resources (professional level)                         49             21             43.2%          42.1%
    Activation of resources (uncertain level)                             3              2              0.0%           0.0%
  Help for problem overcoming
    Experience / explanation / example                                 1398            599             76.2%          83.1%
    Recommendation for action                                          1372            588             68.1%          75.5%
    Warning / forecast                                                   71             30             29.3%          12.1%

End of conversation
  Suggestion for private exchange
    Suggestion for private exchange                                       8              4              0.0%           0.0%
  Suggestion for further forum exchange
    Suggestion for further forum exchange                                51             22             52.9%          50.0%
  Farewell
    Formal farewell (with a name)                                         3              2             50.0%           0.0%
    Familiar farewell (without a name)                                   21              9             57.1%          71.4%
    Familiar farewell (with a name)                                     629            269             90.2%          92.7%
    Other farewell                                                      135             58             52.7%          53.2%

Other
  Typographical error
    Typographical error                                                   1              1              0.0%           0.0%
  Inappropriate comment
    Inappropriate comment                                                10              4              0.0%           0.0%
  Emotional clarification
    Emotional clarification                                              55             23             64.6%          90.9%
