Beyond Binary Labels: Political Ideology Prediction of Twitter Users

Daniel Preoţiuc-Pietro, Positive Psychology Center, University of Pennsylvania
Ye Liu∗, School of Computing, National University of Singapore
[email protected] [email protected]

Daniel J. Hopkins, Political Science Department, University of Pennsylvania
Lyle Ungar, Computing & Information Science, University of Pennsylvania
[email protected] [email protected]

Abstract

Automatic political preference prediction from social media posts has to date proven successful only in distinguishing between publicly declared liberals and conservatives in the US. This study examines users' political ideology using a seven-point scale which enables us to identify politically moderate and neutral users – groups which are of particular interest to political scientists and pollsters. Using a novel data set with political ideology labels self-reported through surveys, our goal is two-fold: a) to characterize the political groups of users through language

User trait prediction from text is based on the assumption that language use reflects a user's demographics, psychological states or preferences. Applications include prediction of age (Rao et al., 2010; Flekova et al., 2016b), gender (Burger et al., 2011; Sap et al., 2014), personality (Schwartz et al., 2013; Preoţiuc-Pietro et al., 2016), socio-economic status (Preoţiuc-Pietro et al., 2015a,b; Liu et al., 2016c), popularity (Lampos et al., 2014) or location (Cheng et al., 2010).

Research on predicting political orientation has focused on methodological improvements (Pennacchiotti and Popescu, 2011) and used data sets with publicly stated dichotomous political orientation labels due to their easy accessibility (Sylwester and Purver, 2015).