Using Topic Models to Investigate Depression on Social Media

William Armstrong
University of Maryland
[email protected]

Abstract

In this paper we explore the utility of topic-modeling techniques (LDA, sLDA, SNLDA) in computational linguistic analysis of text for the purpose of psychological evaluation. Specifically, we hope to be able to identify and provide insight regarding clinical depression.

1 Introduction

Topic modeling is a well-known technique in the field of computational linguistics for reducing the dimensionality of a feature space. In addition to being an effective machine-learning tool, reducing a dataset to a relatively small number of “topics” makes topic models highly interpretable by humans. This positions the technique ideally for problems in which technology and domain experts might work together to achieve superior results.

[...] monitored during the time between sessions to identify periods of higher depression and subjects that may be associated with such periods. For example, if schoolwork is identified as a troublesome subject for the patient, the clinician can investigate whether homework or grades may be triggering depressive episodes and propose appropriate actions. Furthermore, these tools can scale to larger populations to both identify individuals in need of treatment and discover general trends in depressive behaviors.

This paper will explore the utility of three topic modeling techniques in these objectives: latent Dirichlet allocation (LDA) in its base form, with supervision (sLDA), and with supervision and a nested hierarchy (SNLDA). We will examine the results of each topic modeling technique qualitatively by the potential usefulness of their posterior topics to