
Quantifying Community Characteristics of Maternal Mortality Using Social Media

Rediet Abebe∗, Salvatore Giorgi∗, Anna Tedijanto, Anneke Buffone, and H. Andrew Schwartz
University of Pennsylvania; Stony Brook University
∗ Both authors contributed equally to this research.

ABSTRACT
While most mortality rates have decreased in the US, maternal mortality has increased and is among the highest of any OECD nation. Extensive public health research is ongoing to better understand the characteristics of communities with relatively high or low rates. In this work, we explore the role that social media language can play in providing insights into such community characteristics. Analyzing pregnancy-related tweets generated in US counties, we reveal a diverse set of latent topics including Morning Sickness, Celebrity Pregnancies, and Abortion Rights. We find that rates of mentioning these topics on Twitter predict maternal mortality rates with higher accuracy than standard socioeconomic and risk variables such as income, race, and access to health-care, holding even after reducing the analysis to six topics chosen for their interpretability and connections to known risk factors. We then investigate psychological dimensions of community language, finding that the use of less trustful, more stressed, and more negative affective language is significantly associated with higher mortality rates, while trust and negative affect also explain a significant portion of racial disparities in maternal mortality. We discuss the potential for these insights to inform actionable health interventions at the community level.

KEYWORDS
maternal mortality, health disparities, language, topic modeling, community characteristics

ACM Reference Format:
Rediet Abebe, Salvatore Giorgi, Anna Tedijanto, Anneke Buffone, and H. Andrew Schwartz. 2020. Quantifying Community Characteristics of Maternal Mortality Using Social Media. In Proceedings of The Web Conference 2020 (WWW '20), April 20–24, 2020, Taipei, Taiwan. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3366423.3380066

1 INTRODUCTION
The United States has one of the highest maternal mortality rates of any country in the Organization for Economic Cooperation and Development group [11, 23]. Approximately 700 individuals die from pregnancy-related causes [3, 15, 68], and an estimated 60% of these deaths are suspected to be preventable [68]. While the international trend has seen a reduction in maternal mortality, despite increased budgets, rates in the US have more than doubled in the past 25 years [3].¹ Black and Latina mothers bear a disproportionate brunt of this burden: Black women are three to four times more likely to die during childbirth, even after controlling for numerous socioeconomic and risk factors [68]. These rates vary by geography: e.g., in New York City, Black women are 12 times more likely to die during childbirth than white women [64, 73].²

Public health research has examined potential causes for maternal mortality and disparities, pointing to issues such as access to insurance, bias in health-care, segregated hospitals, and inadequate post-delivery care [8, 27, 49, 50, 54]. While it is understood that community, health facility and system, patient, and provider factors all play a part, there is an overall pervasive concern that the specific causes and mechanisms for maternal mortality and disparities are not adequately understood [68]. The WHO cites a "general lack of good data – and related analysis – on maternal health outcomes" as a bottleneck for gaining insights into this issue [3].

In this work, we seek to partially address this gap, focusing on community-level factors that characterize maternal mortality as revealed through social media language. We examine whether community variables derived from social media language data can predict community maternal mortality rates and their racial disparity. While emotions and language analyzed using social media data have shown high efficacy in tasks ranging from predicting allergies or life satisfaction to depression or heart disease mortality [26, 30, 33, 66, 76], the potential of social media has yet to be examined in this manner to help shed light on maternal mortality at the community level.

Our contributions in this work are threefold:
• We show that there is a diverse set of pregnancy-related topics ranging from Morning Sickness, to Abortion Rights, to Maternal Studies. We demonstrate that these topics predict maternal mortality rates with higher accuracy than standard socioeconomic (SES) variables, risk factors, and race.

¹ Note, on the other hand, US infant mortality is at a historic low [13].
² This issue has garnered increased attention in part due to concentrated efforts by policy-makers, advocacy groups, and celebrities, in addition to long-standing work by community organizations [35, 36, 40, 59, 81]; e.g., see collaborations between the Atlanta-based Black Mamas Matter Alliance and the Black Maternal Health Caucus.

• We show that a select set of six topics, chosen for their interpretability and relations to known maternal health factors, holds as much predictive power as all pregnancy-related topics. Specifically, four of these topics – Maternal Studies, Teen Pregnancy, Abortion Rights, and Congratulatory Remarks – have negative associations with mortality rates.
• We examine variables associated with racial disparities in maternal mortality (i.e., the difference between rates for Black women and other races), finding that language-based scores for trust and affect hold explanatory power for the county-level relationship between race and maternal mortality, even after controlling for standard SES and risk factors.

2 BACKGROUND AND RELATED WORK
Maternal Mortality Background. Public health research has sought better measurements of maternal mortality rates and their causes and consequences [3, 23, 68]. There is a long line of work exploring what community, patient, hospital, provider, or systemic-level factors may contribute to high rates of mortality and disparities in the US [34, 48, 56, 57]. At the patient level, cardiovascular conditions, which are related to stress, cause about one third of all pregnancy-related deaths [68]. At the community and systemic level, studies have shown that delivery site, segregation, and discrimination in maternity care during visits all play a role [8, 27, 49, 50]. At the systemic level, sociological and economic research has shown racial disparities in mortality and life expectancy [18, 55]. In line with such studies, there are numerous calls to use a data-driven approach to better grasp the role and causes of maternal mortality related to each of the above main categories [68].

Social Media Data for Health. Twitter data, and more generally social media data, has been a popular source for exploring community-level health measurements [67]. Examples include excessive alcohol consumption [26], depression [29, 62], heart disease [33], and, more generally, population health and well-being [25, 38, 76]. In addition to measuring community-level insights, these data sources have been used to study health information seeking and sharing [31] and individual-level predictions [30]. In recent years, there has also been interest in understanding the societal and ethical implications and limitations around the use of social media data for health studies and roles for computing as a diagnostic of social problems [1, 4, 16, 17, 20].

Maternal Health. An emerging topic of interest has been the use of language-driven analysis to understand pregnancy and maternal experiences. For instance, De Choudhury et al. [28] studied Twitter posts to understand changes in emotions for mothers; Antoniak et al. [5] looked at narrative paths in individuals sharing childbirth stories on an online forum. Focusing on support, Costa Figueiredo et al. [22], Gui et al. [42], and Vydiswaran et al. [80] looked at online peer support and information exchange for pregnant individuals, their caregivers, and individuals experiencing fertility issues. Abebe et al. [2] looked at information seeking for pregnancy and breastfeeding related to HIV. To our knowledge, ours is the first work to employ a language-driven study to understand maternal mortality in the US.
3 DATA
We used three sets of data for this study, described below.

3.1 Twitter Data and Seed-Words
To generate our pregnancy data set, we started with a random 10% sample of the entire Twitter stream collected between 2009 and 2015 [70]. We then used this data set to build two subsets: (1) pregnancy-related tweets and (2) tweets geo-located to US counties.

Pregnancy-Related Tweets. The first data set consisted of tweets related to pregnancy and birth. Tweets were pulled from the main data set if they contained the following seed-words: pregnancy, pregnant, infant, fetus, miscarriage, prenatal, trimester, complications, birth, childbirth, pregnancies, baby, children, mother, newborn, child, as well as their plural forms, hashtags such as #pregnancy, and capitalizations such as Pregnancy. These seed-words were selected by examining nearest neighbors from word2vec for words related to 'pregnancy' and 'pregnant.'

We then manually examined a random sample of 1,000 tweets from the data set to test for relevance to pregnancy. Tweets that were deemed off-topic, such as those containing phrases like "miscarriage of justice," were used to generate phrases for further data cleaning. We also randomly sampled tweets for specific seed-words, and if a substantial fraction (i.e., more than 20%) of those tweets were unrelated to pregnancy, all tweets for that seed-word were removed from the data set, reducing the seed-set. After these cleaning steps, we kept 74.40% of the data set and validated on a fresh sample of 1,000 tweets that over 95% of them were related to pregnancy.

U.S. County Tweets. The second data set consisted of tweets geo-located to U.S. counties. For this we used the County Tweet Lexical Bank [39]. This data set was geo-located using self-reported location information (from the user description field) and latitude/longitude coordinates [76]. The data were then filtered to contain only English tweets [58]. We then limited our data set to Twitter users with at least 30 posts and U.S. counties with at least 100 such users. The final Twitter data set consisted of 2,041 U.S. counties.
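For concreteness, a minimal sketch of the kind of seed-word filter described above is shown below. The seed list is taken from this section, while the regular expression, the exclusion-phrase list, and the example tweets are illustrative assumptions rather than the exact pipeline used to build the data set.

    # Illustrative seed-word filter for pulling pregnancy-related tweets (Section 3.1).
    # The seed list is from the paper; the cleaning phrase and examples are assumptions.
    import re

    SEED_WORDS = [
        "pregnancy", "pregnant", "infant", "fetus", "miscarriage", "prenatal",
        "trimester", "complications", "birth", "childbirth", "pregnancies",
        "baby", "children", "mother", "newborn", "child",
    ]
    # Word-boundary match, case-insensitive, optional leading '#', optional plural 's'.
    seed_pattern = re.compile(r"(?i)(?<!\w)#?(" + "|".join(SEED_WORDS) + r")s?\b")

    # Off-topic phrases found during manual inspection (e.g., "miscarriage of justice").
    EXCLUDE_PHRASES = ["miscarriage of justice"]

    def keep_tweet(text: str) -> bool:
        """Keep tweets that mention a seed word and none of the off-topic phrases."""
        lowered = text.lower()
        if any(phrase in lowered for phrase in EXCLUDE_PHRASES):
            return False
        return bool(seed_pattern.search(text))

    tweets = [
        "Morning sickness is the worst part of this pregnancy",
        "That verdict was a miscarriage of justice",
    ]
    print([t for t in tweets if keep_tweet(t)])  # keeps only the first tweet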

3.2 Mortality Rates
The World Health Organization (WHO) defines maternal mortality as "the death of a woman while pregnant or within 42 days of termination of pregnancy, irrespective of the duration and site of the pregnancy, from any cause related to or aggravated by the pregnancy or its management but not from accidental or incidental causes," with the Centers for Disease Control and Prevention (CDC) expanding this time period to 1 year [24, 82]. Data for maternal mortality was collected from the CDC WONDER online database [14]. We collected rates from 2009-2017, so as to match the time-span of our Twitter sample in addition to more recent years (2016 and 2017), since these rates are on the rise [68]. Mortality rates are listed under the following International Classification of Diseases, Tenth Revision (ICD-10) categories: O00-O07 (pregnancy with abortive outcome) and O10-O99 (other complications of pregnancy, childbirth and the puerperium). The CDC suppresses data if a county experiences fewer than 10 deaths in a given time period for privacy reasons. Of the 2,041 counties in our Twitter set, only 197 also had mortality rates (i.e., counties experiencing 10 or more deaths).

Since the CDC does not report age-adjusted rates for counties with low mortality numbers, we took the crude rate as reported and created our own age-adjusted rate. To do this, we built a model using median age of females (American Community Survey, 2014; 5-year estimates) and predicted maternal mortality, taking the residuals as our new "age-adjusted maternal mortality rate." This age-adjusted value is used throughout the paper.

3.3 Socioeconomic Measures and Risk Factors
In addition to mortality, we collected additional county-level variables on socioeconomics, risk factors, and race. Socioeconomics included unemployment rate, median income, and education (percentage of people with Bachelor's degrees and percentage of High School graduates). For risk factors, we included insurance rates and access to health-care (the ratio of population to the number of primary care providers). Finally, we also explored the relationship between language and maternal mortality with respect to the percentage of Black individuals in each county. As discussed previously, the disparity in mortality rates for Black women is large, and providing evidence toward the factors at play for such a disparity is a key application for our analyses. Additionally, to account for overall rates of birth, all analyses included a birth rate covariate (the rate per 1,000 women, aged 15-50, with births in the past 12 months).

The birth rate, race, SES variables, and insurance rates were collected from the 2014 American Community Survey (5-year estimates), whereas the primary care provider data were collected from the 2017 County Health Rankings (as reported by the Area Health Resource File/American Medical Association, 2014). We were able to obtain these values for each of the counties which met the Twitter and mortality inclusion criteria above. Overall, we obtained data for 197 U.S. counties and county equivalents that met each of the data requirements above and conducted our study on these counties. The full list of these counties is included in the project page.⁵
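The age adjustment described in Section 3.2 amounts to regressing the crude county rate on median female age and keeping the residuals. A minimal sketch of that step, assuming hypothetical file and column names:

    # Sketch of the age-adjustment step in Section 3.2: regress the crude county
    # rate on median female age and keep the residuals as the adjusted rate.
    # The file and column names ("crude_rate", "median_female_age") are hypothetical.
    import pandas as pd
    from sklearn.linear_model import LinearRegression

    counties = pd.read_csv("county_mortality.csv")  # hypothetical input file

    X = counties[["median_female_age"]].values
    y = counties["crude_rate"].values

    model = LinearRegression().fit(X, y)
    counties["age_adjusted_rate"] = y - model.predict(X)  # residuals = adjusted rate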
4 TOPICS AND THEORETICAL LINGUISTIC FEATURES
We used three sets of features to characterize maternal mortality through language. First, we created a set of automatically-derived topics built over the pregnancy-related tweets. These topics reveal a diversity of themes in discussions around pregnancy on the platform. Next, we used a small set of theoretically-driven language features (affect, depression, stress, and trust) in order to assess psychological traits of a community and their relations to maternal mortality. Finally, we use a large, general set of topics (non-pregnancy related) to identify broader language patterns.

4.1 Pregnancy-Related Topics
We start with our data set of over 5 million pregnancy-related tweets described in Section 3. We automatically extracted topics using Latent Dirichlet Allocation (LDA) [12]. LDA is a generative statistical model which assumes that each document (in our case, a tweet) contains a distribution of topics, which, in turn, are distributions of words. We use the Mallet software package [60], which estimates the latent topic variables using Gibbs sampling [37]. All default Mallet settings were used, except α, which is a prior on the expected topics per document. We set α = 2 since tweets are shorter than the typical length of documents. The number of topics is a free parameter and we chose 50 topics.³

Table 1: Sample pregnancy topics with representative words

Teen Pregnancy (1.34%): teen, rate, rates, teenage, highest, mortality, low, states, teens, higher, number, 20, country, american, united, education, lowest, population
Morning Sickness (0.54%): morning, sickness, purpose, symptoms, lives, wanted, williamson, tv, experience, cure, bra, marianne, thinking, signs, oral, teenagers, simon
Celebrity Pregnancies (1.42%): kim, kardashian, kayne, amber, rose, beyonce, years, west, harry, finish, swear, north, who's, kayne's, taylor, sets, louis, wiz
Abortion Rights (2.15%): women, abortion, care, health, abortions, bill, mortality, #prolife, rights, law, gift, support, circumstances, crisis, irrelevant, #prochoice, forced
Maternal Studies (2.56%): risk, defects, study, health, weight, linked, flu, cancer, early, diet, drinking, smoking, blood, safe, alcohol, diabetes, autism, acid, disease, drug
Congratulatory Remarks (3.06%): congrats, congratulations, :), boy, happy, love, daughter, son, <3, wait, sister, late, healthy, cousin, xx, amazing, :d, meet, proud

We find that our data reveal a rich set of themes related to pregnancy and birth. In Table 1, we show a sample of six topics, which are hand-selected to demonstrate the breadth of topics in the data set. The topic labels were hand-generated by the authors and are shown with the frequency with which each topic occurs in the data set;⁴ the word lists correspond to the most representative words for each topic.

These topics show that pregnancy-related discussions on Twitter can range from personal-health disclosure such as in Morning Sickness, to political conversations related to Abortion Rights, and light topics such as Congratulatory Remarks. Topics that were not included in the manuscript due to length constraints include Royal Baby, Food Cravings, and Pregnancy Timeline. Each of these topics shows varying levels of popularity across the counties.

³ Before running the rest of our analysis, we ran LDA using 10, 20, 50, 100, and 200 topics. We selected 50 topics based on manual inspection of coherence and interpretability of the topics.
⁴ Note, since there are 50 topics, the average value is 2%. Furthermore, since some themes, such as celebrity pregnancy, occur in more than one topic, the overall frequency of such a theme in the data set is higher than the corresponding value in this table.
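For readers who wish to build a comparable topic model, the sketch below fits a 50-topic LDA over tokenized tweets. It is only an illustration: the paper uses Mallet's Gibbs-sampling implementation with α = 2, whereas scikit-learn's implementation uses variational inference and its priors are not directly equivalent; the input file and preprocessing here are hypothetical.

    # Illustrative stand-in for the Mallet LDA run described in Section 4.1 (50 topics).
    # scikit-learn uses variational inference rather than Gibbs sampling, so results
    # will differ; the file name and preprocessing choices are hypothetical.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    with open("pregnancy_tweets.txt") as f:           # one tweet per line (hypothetical)
        tweets = [line.strip() for line in f]

    vectorizer = CountVectorizer(min_df=5, stop_words="english")
    doc_term = vectorizer.fit_transform(tweets)

    lda = LatentDirichletAllocation(
        n_components=50,          # number of topics, as in the paper
        doc_topic_prior=2.0,      # plays the role of Mallet's alpha (not a one-to-one mapping)
        learning_method="online",
        random_state=0,
    )
    doc_topics = lda.fit_transform(doc_term)          # per-tweet topic loadings

    # Top 10 words per topic, mirroring Table 1's representative-word lists.
    vocab = vectorizer.get_feature_names_out()
    for k, weights in enumerate(lda.components_):
        top = [vocab[i] for i in weights.argsort()[::-1][:10]]
        print(k, ", ".join(top))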

4.2 Theoretical Features
We also explore a set of theoretically-driven language features: affect, depression, trust, and stress. We downloaded pre-existing models to derive county-level language features, including:
• affect – positive and negative emotional valence, trained over Facebook posts [71].
• depression – degree of depressive personality (a facet of the big five personality test), fit over social media users' language [75].
• trust – degree of trustfulness (how much one tends to trust persons or entities that they do not personally know), fit over social media users' language [83].
• stress – amount of stress, fit over social media users' language and Cohen's Stress scale [19, 43].

4.3 General Topics
Finally, we use a larger set of LDA topics built over a more general data set. By doing this in tandem with the pregnancy-related topics, we can zoom in on pregnancy-related themes while also exploring a larger set of language correlates, which might help in characterizing communities suffering from higher or lower rates of mortality. To this end, we downloaded a set of 2,000 topic posteriors that were automatically derived over the MyPersonality data set [77]. These topics have been used over a large class of problems and have been found to be robust both in terms of interpretability and predictive power [33, 51, 65, 69], so they form a point of comparison for our domain-specific topics.

5 METHODS
To understand the relationship between community-level language and maternal mortality, we perform three types of statistical analyses: (1) prediction – can language be used to predict mortality rates in a cross-sectional, cross-validation setup? (2) differential language analysis – can we gain insights into communities which suffer from higher or lower maternal mortality through language? and (3) mediating language analysis – can language be used to understand the mechanisms through which Black communities experience increased rates of maternal mortality? All data processing, feature extraction, and statistical analysis are performed using the open-source Python package DLATK [78].

5.1 Prediction
We use two types of predictive models, depending on the type of independent variables. All non-language variables (i.e., SES and risk factors) are modeled with an ordinary least squares (OLS) regression, whereas language features use an ℓ2-regularized (Ridge) regression [47]. In addition to regularization, we also use a feature selection pipeline in all language-based models, since the number of features can be larger than the number of observations (N = 197 counties). The pipeline first removes all low-variance features and then removes features that were not correlated with our outcome. Finally, we apply Principal Component Analysis (PCA) to further reduce the number of features. All models are evaluated in a 10-fold cross-validation setup, with the Ridge regularization parameter α tuned on the training set within each fold. Predictive accuracy is measured in terms of a single Pearson correlation between the actual values and the predicted values, whereas standard errors are calculated across all 10 folds.
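A minimal sketch of the cross-validated pipeline described in Section 5.1 (low-variance filter, univariate feature selection, PCA, then ridge regression with the penalty tuned on training data). The thresholds, component counts, and placeholder data are illustrative assumptions rather than the DLATK settings used in the paper.

    # Illustrative version of the Section 5.1 prediction setup: feature selection,
    # PCA, and ridge regression evaluated with 10-fold cross-validation.
    import numpy as np
    from sklearn.pipeline import Pipeline
    from sklearn.feature_selection import VarianceThreshold, SelectKBest, f_regression
    from sklearn.decomposition import PCA
    from sklearn.linear_model import RidgeCV
    from sklearn.model_selection import cross_val_predict
    from scipy.stats import pearsonr

    # X: county-by-feature matrix of topic usage; y: age-adjusted mortality rates.
    # Random placeholders stand in for the real 197-county data.
    rng = np.random.default_rng(0)
    X = rng.random((197, 2000))
    y = rng.random(197)

    pipeline = Pipeline([
        ("variance", VarianceThreshold(threshold=1e-5)),     # drop low-variance features
        ("select", SelectKBest(f_regression, k=100)),        # keep outcome-correlated features
        ("pca", PCA(n_components=30)),                       # reduce dimensionality
        ("ridge", RidgeCV(alphas=np.logspace(-3, 3, 13))),   # alpha tuned within training data
    ])

    y_hat = cross_val_predict(pipeline, X, y, cv=10)
    r, _ = pearsonr(y, y_hat)   # single Pearson r between actual and predicted values
    print(f"cross-validated r = {r:.2f}")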
5.2 Differential Language Analysis
Differential Language Analysis (DLA) is used to identify language characterizing maternal mortality [52, 77]. Here we individually regress each of our language variables (i.e., pregnancy-related topics and theoretical features) using an OLS regression, adding in access to health-care, birth rates, socioeconomics, and risk factors as covariates. We adjust for multiple comparisons by applying a Benjamini–Hochberg false discovery rate correction to the significance threshold (p < .05) [10]. For LDA topics, we visualize significant correlations as word clouds; the word clouds display the top 15 most prevalent words within a topic, sized according to their posterior likelihood.

5.3 Mediating Language Analysis
We explore the relationship between maternal mortality and the percentage of Black individuals within a county, as expressed through the county's language. Language-based mediation analysis has been used in the past to explore the relationship between socioeconomics and excessive drinking [26]. For this analysis, we residualize the crude maternal mortality rate, as reported by the CDC, on median age of females, birth rates, all socioeconomic variables (income, education, and unemployment), insurance rates, and rates of primary care providers.

For each language variable, both the pregnancy-related LDA topics and the theoretical language features, we consider the mediating relationship between the topic (mediator), percentage Black (independent variable), and residualized maternal mortality rates (dependent variable). We follow the standard three-step Baron and Kenny approach [9]. Step 1: we regress the dependent variable (y) on the independent variable (x; path c) in a standard OLS regression. Step 2: we regress the mediator (m) on the independent variable (x; path α). Finally, in Step 3 we fit a multivariate model, regressing maternal mortality (y) on both the mediator (m; topic) and the independent variable (x; percentage Black; path c′). The three models are as follows:

    y = cx + β1 + ϵ1,          (1)
    m = αx + β2 + ϵ2,          (2)
    y = c′x + βm + β3 + ϵ3.    (3)

The mediation effect size (c − c′) is taken as the reduction in the effect size between the direct relationship (i.e., percentage Black and maternal mortality) and the mediated relationship. To test for significance, we use a Sobel p [79] and correct all p values for false discoveries via a Benjamini–Hochberg procedure.
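A compact sketch of the three Baron and Kenny regressions and the Sobel test described above, using placeholder data. The variable names and the normal-approximation Sobel statistic are illustrative and do not reproduce DLATK's implementation.

    # Illustrative Baron & Kenny mediation with a Sobel test (Section 5.3).
    # x: percent Black, m: a language feature (mediator), y: residualized mortality.
    import numpy as np
    import statsmodels.api as sm
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    x = rng.standard_normal(197)
    m = 0.5 * x + rng.standard_normal(197)             # placeholder mediator
    y = 0.4 * x - 0.3 * m + rng.standard_normal(197)   # placeholder outcome

    # Step 1: total effect c (y ~ x). Step 2: path a (m ~ x). Step 3: y ~ x + m.
    fit_c = sm.OLS(y, sm.add_constant(x)).fit()
    fit_a = sm.OLS(m, sm.add_constant(x)).fit()
    fit_3 = sm.OLS(y, sm.add_constant(np.column_stack([x, m]))).fit()

    c, a = fit_c.params[1], fit_a.params[1]
    c_prime, b = fit_3.params[1], fit_3.params[2]
    se_a, se_b = fit_a.bse[1], fit_3.bse[2]

    mediation_effect = c - c_prime                     # reduction in the direct effect
    sobel_z = (a * b) / np.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    sobel_p = 2 * (1 - norm.cdf(abs(sobel_z)))
    print(mediation_effect, sobel_z, sobel_p)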

6 RESULTS
We begin by looking at correlations between maternal mortality and various socioeconomic and risk factors. Table 2 shows the set of correlation coefficients. These results show that the percentage of the population that is Black and the unemployment rate were positively correlated with the maternal mortality rate, while insurance access, income, and education were negatively correlated with the maternal mortality rate. Additionally, birth rates were not significantly correlated with maternal mortality. Note that, in this paper, we only consider 197 counties in the US due to constraints around Twitter and county-mapped data, as discussed in Section 3. While the correlation values do not exactly match correlations for all US counties, the general direction of the relationship between maternal mortality rates and these SES and risk factors was the same, with the strongest associations – such as percent Black – also matching.

Table 2: Correlations with risk factors, socioeconomics, and race. All non-birth-rate correlations are controlled for birth rates. Reported standardized β with 95% confidence intervals in square brackets; ***p < 0.001, **p < 0.01, *p < .05, after Benjamini–Hochberg correction.

Birth Rates
  Rate per 1,000 women:            .10 [-.04, .24]
Race
  Black (percent):                 .49 [.36, .61]***
Risk Factors
  Primary Care Providers:         -.23 [-.38, -.09]**
  Uninsured (percent):             .27 [.12, .41]***
Socioeconomics
  Income (log median):            -.42 [-.55, -.29]***
  High School or more (percent):  -.14 [-.28, .01]
  Bachelor's Degree (percent):    -.38 [-.52, -.23]***
  Unemployment (percent):          .26 [.12, .39]***

We next look at the predictive accuracy of our 50 topics, the 2,000 general topics, and the above SES and risk factors as well as percent Black values. For this, note that we used linear regression with maternal mortality values as the outcome variable and the aforementioned language variables as the explanatory variables. Figure 1 shows that the 2,000 general Facebook topics had the highest predictive power, with a Pearson r = .72 [.65,.79]***, while risk factors (PCP access and insurance rate) were the lowest, with a Pearson r = .21 [.05,.38]**. Overall, SES factors, risk factors, and race had significantly less predictive accuracy (using a paired t-test) than the 50 pregnancy-related topics from the Twitter data (t = −4.63, p < .001) and the 2,000 general topics (t = −4.74, p < .001).

Figure 1: Prediction accuracy for non-language variables (red), pregnancy-related LDA topics (purple), and the general set of LDA topics (blue). Bars: 2000 General Topics, 50 Pregnancy Topics, 6 Pregnancy Topics, SES + Risk Factors + Race, Race, Socio-economics, Risk Factors. Reported Pearson r from 10-fold cross validation; error bars are 95% CIs.

For the Differential Language Analysis (DLA), we selected the 6 topics of interest. We ran a multi-linear regression, treating the maternal mortality rate as the outcome variable and the prevalence of these topics in the counties as the explanatory variables, with birth rates, race, risk factors, and socioeconomics as covariates. We found that five of the 6 topics, shown in Figure 2, had significant associations with maternal mortality rates. Maternal Studies had the most negative association – i.e., counties with relatively more tweets related to this topic had lower rates of mortality. Each of the four topics in the figure – Maternal Studies, Teen Pregnancies, Congratulatory Remarks, and Abortion Rights – shows a negative association with maternal mortality rates. Celebrity Pregnancies, not shown, is positively associated (.20 [.07,.33]*) with higher mortality.

Figure 2: Differential Language Analysis using 6 pregnancy-related LDA topics, controlled for race, risk factors, and socioeconomics. Coefficients shown in the figure: -.38 [-.49,-.25]***, -.35 [-.47,-.22]***, -.28 [-.40,-.14]***, -.19 [-.32,-.05]*. Reported standardized β with 95% confidence intervals in square brackets; ***p < 0.001, **p < 0.01, *p < .05, after Benjamini–Hochberg correction.

We also used the 4 theoretical features within the DLA framework: affect, depression, stress, and trust. Results are presented in Table 3. We see higher rates of maternal mortality associated with higher distrust, higher stress, higher depression, and with less affect.

Table 3: Differential Language Analysis of theoretically relevant features. Reported standardized β with 95% confidence intervals in square brackets; ***p < 0.001, **p < 0.01, *p < .05, after Benjamini–Hochberg correction.

  Affect:      -.30 [-.43, -.17]***
  Depression:   .23 [.10, .36]**
  Stress:       .24 [.10, .37]**
  Trust:       -.38 [-.49, -.25]***

Finally, we explore disparities by race at the population level. The county-level health disparity itself can be seen simply from the strong correlation between the two variables: communities that are more Black have greater maternal mortality. We turn to Twitter-based community characteristics as mediators (i.e., explainers) of this race-mortality relationship. The idea behind mediation analysis is that if including a third variable (i.e., a Twitter measurement) in the linear analysis reduces the relationship between the first two (i.e., race and maternal mortality), then this third variable is accounting for some of the covariance between the first two.

We considered each of the 4 theoretical dimensions as potential mediators. To zero in on explaining what is novel about the race-mortality relationship, we controlled for all previously mentioned socioeconomic and risk factor variables by producing a residual of the variance left over. The correlation between percent Black and maternal mortality was then c = .36. Without this step, it could be that any mediators were simply accounting for socioeconomic or risk factor effects. As seen in Table 4, we found that two of the theoretical dimensions and 3 of the topics had a significant mediation effect, in part explaining the disparity. For example, trust mediated the relationship – the fact that communities expressing lower trust had greater maternal mortality partially explained why Black percentage related to greater mortality.

Table 4: Mediating Language Analysis. The analysis seeks to explain the correlation, c = .36, between percent Black and residualized maternal mortality through differences in language. α: correlation between the theoretical factor and percent Black; β: correlation between the theoretical factor and residualized maternal mortality. Reported Pearson r with 95% confidence intervals in square brackets; ***p < 0.001, **p < 0.01, *p < .05, after Benjamini–Hochberg correction. The c − c′ column uses a Sobel p for significance [79].

               c − c′    α                      β
  Affect       .11**     -.40 [-.53,-.27]***    -.27 [-.41,-.13]***
  Depression   -.04      -.26 [-.39,-.12]***     .14 [.01,.28]*
  Stress       -.01      -.06 [-.20,.08]         .15 [-.02,.28]*
  Trust        .14**     -.51 [-.63,-.39]***    -.27 [-.42,-.12]***

7 DISCUSSION
The results shown in this work demonstrate the efficacy of social media language to shed some light on community characteristics of maternal mortality. While social media data, by itself, is not able to reliably identify causes for high maternal mortality rates and disparities, it can provide supporting evidence for existing conjectures and generate hypotheses for further investigation.

The observation that pregnancy-related topics, as well as the general 2,000 topics, both hold more predictive power than SES, risk factors, and race combined shows that such language-based data sets may contain characteristics of communities beyond those captured in the standard variables used to study maternal mortality. Furthermore, the diversity of discussion themes in the pregnancy-related data set presents an opportunity to consider how different topics relate to maternal mortality rates and patterns of topic popularity across US counties.

The novel mediation results presented in this work allow us to gain further insights into how affect, depression, stress, and trust relate to mortality rates and disparities. The results that trust and affect relate significantly to mortality rates mirror discussions from public health research: for instance, failure by hospitals, providers, and facilities to provide unbiased and nondiscriminatory care has already been shown to result in lower follow-up visits by Black and Latina women, which is believed to drive higher mortality rates. Trust in physicians and medical institutions has been extensively studied [44–46, 61], with multiple studies focusing on racial and ethnic differences in levels of trust [6, 7, 32, 41]. Findings repeatedly show ethnic and racial differences in trust towards health-care systems, in addition to showing that distrust is associated with racial disparities in the use of preventive services [63]. The affect result is also related to the Congratulatory Remarks topic, indicating that communities with both more positive language and more positive discussions around pregnancy and birth may also be experiencing lower maternal mortality rates and disparities. These observations, along with existing discussions, provide potential actionable insights for policies at the community level.
The results here are not without limitations: as with other studies heavily relying on social media data, there are inherent issues of selection bias in who is on the platform and which users meet the inclusion thresholds we set for the pregnancy-related and county-mapping data sets. There is also selection bias in which tweets are geo-located, as well as in language use by individuals on Twitter compared to other platforms. It is imperative not to take these data sets as being representative of the U.S., the counties we study, or even the individuals that may be included in the data sets.

Furthermore, we do not control for linguistic differences across different parts of the U.S., and some topics, as a result, may show significant spatial and geographic associations. Likewise, we set the seed-words for constructing the pregnancy-related data set using word2vec, which may also suffer from bias issues: e.g., certain words which may be commonly used to discuss pregnancy and birth by certain groups of under-represented individuals may not pass this analysis. While we attempt to control for this by having a relatively large number of seed-words and instead relying on data cleaning, this remains a notable limitation.

We were also hindered by the availability of outcome data: much of the relevant data is available only at the county level, and crucial data like disparities by race were entirely unavailable. While we believe that studies like ours will provide additional data sources, models, and measurements to further our understanding of maternal mortality and disparities, the availability of ground truth data presents a significant bottleneck. Ground truth data about mortality and disparities, including mortality rates for groups of individuals belonging to marginalized communities, as well as data disaggregated by demographics such as race, age, education, and income, would allow for more fine-grained analysis.

8 ETHICS STATEMENT
This study was reviewed by the University of Pennsylvania institutional review board and, due to the lack of individual human subjects, found to be exempt. All data used in this study are publicly available. While the county-level language estimates are publicly available and will be posted on the project page,⁵ the original tweets, which are also publicly available, cannot be redistributed by the authors due to Twitter's Terms of Service. For additional privacy protection, we automatically replace any Twitter user names in our analysis and presentation in this paper.

⁵ All data available at: https://github.com/wwbp/maternal_mortality

REFERENCES 2018. Can Twitter be used to predict county excessive alcohol consumption [1] Rediet Abebe, Solon Barocas, , Karen Levy, Manish Raghavan, and rates? PloS one 13, 4 (2018), e0194290. David G Robinson. 2020. Roles for computing in social change. In Proceedings of [27] Heike Thiel de Bocanegra, Monica Braughton, Mary Bradsberry, Mike Howell, the 2020 Conference on Fairness, Accountability, and Transparency. 252–260. Julia Logan, and Eleanor Bimla Schwarz. 2017. Racial and ethnic disparities [2] Rediet Abebe, Shawndra Hill, Jennifer Wortman Vaughan, Peter M Small, and in postpartum care and contraception in CaliforniaâĂŹs Medicaid program. H Andrew Schwartz. 2019. Using search queries to understand health information American journal of obstetrics and gynecology 217, 1 (2017), 47–e1. needs in africa. In Proceedings of the International AAAI Conference on Web and [28] Munmun De Choudhury, Scott Counts, and Eric Horvitz. 2013. Major life changes Social Media, Vol. 13. 3–14. and behavioral markers in social media: case of childbirth. In Proceedings of the [3] Priya Agrawal. 2015. Maternal mortality and morbidity in the United States of 2013 conference on Computer supported cooperative work. ACM, 1431–1442. America. [29] Munmun De Choudhury, Scott Counts, and Eric Horvitz. 2013. Social media as a [4] Tim Althoff. 2017. Population-scale pervasive health. IEEE pervasive computing measurement tool of depression in populations. In Proceedings of the 5th Annual 16, 4 (2017), 75–79. ACM Web Science Conference. ACM, 47–56. [5] Maria Antoniak, David Mimno, and Karen Levy. 2019. Narrative Paths and [30] Munmun De Choudhury, Michael Gamon, Scott Counts, and Eric Horvitz. 2013. Negotiation of Power in Birth Stories. In Proc. ACM Human Computer Interaction. Predicting depression via social media. In Seventh international AAAI conference CSCW. on weblogs and social media. [6] Katrina Armstrong, Mary Putt, Chanita Hughes Halbert, David Grande, J Sanford [31] Munmun De Choudhury, Meredith Ringel Morris, and Ryen W White. 2014. Schwartz, Kaijun Liao, Noora Marcus, Mirar Bristol Demeter, and Judy A Shea. Seeking and sharing health information online: comparing search engines and 2013. Prior experiences of racial discrimination and racial differences in health social media. In Proceedings of the 32nd annual ACM conference on Human factors care system distrust. Medical Care 51, 2 (2013), 144. in computing systems. ACM, 1365–1376. [7] Katrina Armstrong, Karima L Ravenell, Suzanne McMurphy, and Mary Putt. 2007. [32] Mark P Doescher, Barry G Saver, Peter Franks, and Kevin Fiscella. 2000. Racial Racial/ethnic differences in physician distrust in the United States. American and ethnic disparities in perceptions of physician style and trust. (2000). journal of public health 97, 7 (2007), 1283–1289. [33] Johannes C Eichstaedt, H Andrew Schwartz, Margaret L Kern, Gregory Park, [8] Laura Attanasio and Katy B Kozhimannil. 2017. Health care engagement and Darwin R Labarthe, Raina M Merchant, Sneha Jha, Megha Agrawal, Lukasz A follow-up after perceived discrimination in maternity care. Medical care 55, 9 Dziurzynski, Maarten Sap, Christopher Weeg, Emily E Larson, Lyle H Ungar, and (2017), 830–833. Martin EP Seligman. 2015. Psychological language on Twitter predicts county- [9] Reuben M Baron and David A Kenny. 1986. The moderator–mediator variable level heart disease mortality. Psychological Science 26 (2015), 159–169. Issue distinction in social psychological research: Conceptual, strategic, and statistical 2. 
considerations. Journal of personality and social psychology 51, 6 (1986), 1173. [34] Centers for Disease Control, Prevention, et al. 2019. Building US capacity to [10] Yoav Benjamini and Yosef Hochberg. 1995. Controlling the false discovery rate: a review and prevent maternal deaths. Report from nine maternal mortality review practical and powerful approach to multiple testing. Journal of the Royal statistical committees. society: series B (Methodological) 57, 1 (1995), 289–300. [35] Kirsten Gillibrand: United States Senator for New York. [n.d.]. With Maternal [11] Cynthia J Berg, William M Callaghan, Carla Syverson, and Zsakeba Henderson. Mortality Rates On The Rise In The United States, Gillibrand Announces New 2010. Pregnancy-related mortality in the United States, 1998 to 2005. Obstetrics Legislation To Help Reduce Maternal Deaths, Help Hospitals Implement Best & Gynecology 116, 6 (2010), 1302–1309. Practices To Prevent Women From Dying Before, During And After Child- [12] David M Blei, Andrew Y Ng, and Michael I Jordan. 2003. Latent dirichlet allocation. birth. https://www.gillibrand.senate.gov/news/press/release/with-maternal- Journal of machine Learning research 3, Jan (2003), 993–1022. mortality-rates-on-the-rise-in-the-united-states-gillibrand-announces-new- [13] CDC. 2019. Center for Disease Control and Prevention. https://www.cdc.gov/ legislation-to-help-reduce-maternal-deaths-help-hospitals-implement-best- nchs/data/databriefs/db229. practices-to-prevent-women-from-dying-before-during-and-after-childbirth. [14] CDC. 2019. Center for Disease Control and Prevention: CDC Wonder. https: [36] Abby Gardner. [n.d.]. Black Women Are Dying During Childbirth. Sen. Kamala //wonder.cdc.gov/. Harris Is Working to Change That. https://www.glamour.com/story/senator- [15] CDC. 2019. Center for Disease Control and Prevention: Pregnancy kamala-harris-bill-maternal-mortality-crisis. Mortality Surveillance System. https://www.cdc.gov/reproductivehealth/ [37] Alan E Gelfand and Adrian FM Smith. 1990. Sampling-based approaches to maternalinfanthealth/pregnancy-mortality-surveillance-system.htm. calculating marginal densities. Journal of the American statistical association 85, [16] Stevie Chancellor, Michael L Birnbaum, Eric D Caine, Vincent Silenzio, and 410 (1990), 398–409. Munmun De Choudhury. 2019. A taxonomy of ethical tensions in inferring [38] Joseph Gibbons, Robert Malouf, Brian Spitzberg, Lourdes Martinez, Bruce Apple- mental health states from social media. In Proceedings of the Conference on Fairness, yard, Caroline Thompson, Atsushi Nara, and Ming-Hsiang Tsou. 2019. Twitter- Accountability, and Transparency. ACM, 79–88. based measures of neighborhood sentiment as predictors of residential population [17] Irene Y Chen, Peter Szolovits, and Marzyeh Ghassemi. 2019. Can AI help reduce health. PloS one 14, 7 (2019), e0219550. disparities in general medical and mental health care? AMA journal of ethics 21, [39] Salvatore Giorgi, Daniel Preotiuc-Pietro, Anneke Buffone, Daniel Rieman, Lyle H. 2 (2019), 167–179. Ungar, and H. Andrew Schwartz. 2018. The Remarkable Benefit of User-Level [18] Raj Chetty, Michael Stepner, Sarah Abraham, Shelby Lin, Benjamin Scuderi, Aggregation for Lexical-based Population-Level Predictions. In Proceedings of the Nicholas Turner, Augustin Bergeron, and David Cutler. 2016. The association 2018 Conference on Empirical Methods in Natural Language Processing. between income and life expectancy in the United States, 2001-2014. Jama 315, [40] Amanda Michelle Gomez. 
[n.d.]. There’s finally a group of lawmak- 16 (2016), 1750–1766. ers focused on one of the widest racial disparities in health care. [19] Sheldon Cohen, Ronald C Kessler, and Lynn Underwood Gordon. 1997. Measuring https://thinkprogress.org/house-forms-first-ever-black-maternal-health- stress: A guide for health and social scientists. Oxford University Press on Demand. caucus-alma-adams-lauren-underwood-32791417ffd7/. [20] Mike Conway and Daniel O’Connor. 2016. Social media, big data, and mental [41] Howard S Gordon, Richard L Street Jr, Barbara F Sharf, P Adam Kelly, and Julianne health: current advances and ethical implications. Current opinion in psychology Souchek. 2006. Racial differences in trust and lung cancer patients’ perceptions 9 (2016), 77–82. of physician communication. Journal of clinical oncology 24, 6 (2006), 904–909. [21] Paul T Costa and Robert R McCrae. 1992. Normal personality assessment in [42] Xinning Gui, Yu Chen, Yubo Kou, Katie Pine, and Yunan Chen. 2017. Investigat- clinical practice: The NEO Personality Inventory. Psychological assessment 4, 1 ing Support Seeking from Peers for Pregnancy in Online Health Communities. (1992), 5. Proceedings of the ACM on Human-Computer Interaction 1, CSCW (2017), 50. [22] Mayara Costa Figueiredo, Clara Caldeira, Tera L Reynolds, Sean Victory, Kai [43] Sharath Chandra Guntuku, Anneke Buffone, Kokil Jaidka, Johannes C Eichstaedt, Zheng, and Yunan Chen. 2017. Self-tracking for fertility care: collaborative and Lyle H Ungar. 2019. Understanding and measuring psychological stress using support for a highly personalized problem. Proceedings of the ACM on Human- social media. In Proceedings of the International AAAI Conference on Web and Computer Interaction 1, CSCW (2017), 36. Social Media, Vol. 13. 214–225. [23] Andreea A Creanga and William M Callaghan. 2017. Recent increases in the US [44] Mark A Hall, Fabian Camacho, Elizabeth Dugan, and Rajesh Balkrishnan. 2002. maternal mortality rate: disentangling trends from measurement issues. Obstetrics Trust in the medical profession: conceptual and measurement issues. Health & Gynecology 129, 1 (2017), 206–207. services research 37, 5 (2002), 1419–1439. [24] Andreea A Creanga, Carla Syverson, Kristi Seed, and William M Callaghan. 2017. [45] Mark A Hall, Elizabeth Dugan, Beiyao Zheng, and Aneil K Mishra. 2001. Trust Pregnancy-related mortality in the United States, 2011–2013. Obstetrics and in physicians and medical institutions: what is it, can it be measured, and does it gynecology 130, 2 (2017), 366. matter? The milbank quarterly 79, 4 (2001), 613–639. [25] Aron Culotta. 2014. Estimating county health statistics with twitter. In Proceedings [46] Mark A Hall, Beiyao Zheng, Elizabeth Dugan, Fabian Camacho, Kristin E Kidd, of the 32nd annual ACM conference on Human factors in computing systems. ACM, Aneil Mishra, and Rajesh Balkrishnan. 2002. Measuring patientsâĂŹ trust in their 1335–1344. primary care providers. Medical care research and review 59, 3 (2002), 293–318. [26] Brenda Curtis, Salvatore Giorgi, Anneke EK Buffone, Lyle H Ungar, Robert DAsh- [47] Arthur E Hoerl and Robert W Kennard. 1970. Ridge regression: Biased estimation ford, Jessie Hemmons, Dan Summers, Casey Hamilton, and H Andrew Schwartz. for nonorthogonal problems. Technometrics 12, 1 (1970), 55–67. WWW ’20, April 20–24, 2020, Taipei, Taiwan Abebe and Giorgi, et al.

[48] Elizabeth A Howell. 2018. Reducing Disparities in Severe Maternal Morbidity arousal in facebook posts. In Proceedings of the 7th Workshop on Computational and Mortality. Clinical obstetrics and gynecology 61, 2 (2018), 387–399. Approaches to Subjectivity, Sentiment and Social Media Analysis. 9–15. [49] Elizabeth A Howell, Natalia Egorova, Amy Balbierz, Jennifer Zeitlin, and Paul L [72] Daniel Rieman, Kokil Jaidka, H Andrew Schwartz, and Lyle Ungar. 2017. Domain Hebert. 2016. Black-white differences in severe maternal morbidity and site of adaptation from user-level facebook models to county-level twitter predictions. care. American journal of obstetrics and gynecology 214, 1 (2016), 122–e1. In Proceedings of the Eighth International Joint Conference on Natural Language [50] Elizabeth A Howell, Natalia N Egorova, Amy Balbierz, Jennifer Zeitlin, and Processing (Volume 1: Long Papers). 764–773. Paul L Hebert. 2016. Site of delivery contribution to black-white severe maternal [73] Robin Fields. [n.d.]. New York City Launches Committee to Review Maternal morbidity disparity. American journal of obstetrics and gynecology 215, 2 (2016), Deaths. https://www.propublica.org/article/new-york-city-launches-committee- 143–152. to-review-maternal-deaths. [51] Kokil Jaidka, Sharath Chandra Guntuku, Anneke Buffone, H. Andrew Schwartz, [74] James A Russell. 1980. A circumplex model of affect. Journal of personality and and Lyle Ungar. 2018. Facebook versus Twitter: Cross-Platform Differences in social psychology 39, 6 (1980), 1161. Self-Disclosure and Trait Prediction. In Proceedings of the International AAAI [75] H Andrew Schwartz, Johannes Eichstaedt, Margaret L Kern, Gregory Park, Conference on Web and Social Media. Maarten Sap, David Stillwell, Michal Kosinski, and Lyle Ungar. 2014. Towards [52] Margaret L Kern, Gregory Park, Johannes C Eichstaedt, H Andrew Schwartz, assessing changes in degree of depression through facebook. In Proceedings of the Maarten Sap, Laura K Smith, and Lyle H Ungar. 2016. Gaining insights from Workshop on Computational Linguistics and Clinical Psychology: From Linguistic social media language: Methodologies and challenges. Psychological methods 21, Signal to Clinical Reality. 118–125. 4 (2016), 507. [76] H Andrew Schwartz, Johannes C Eichstaedt, Margaret L Kern, Lukasz Dziurzyn- [53] Michal Kosinski, Sandra C Matz, Samuel D Gosling, Vesselin Popov, and David ski, Richard E Lucas, Megha Agrawal, Gregory J Park, Shrinidhi K Lakshmikanth, Stillwell. 2015. Facebook as a research tool for the social sciences: Opportunities, Sneha Jha, Martin E P Seligman, and Lyle H Ungar. 2013. Characterizing geo- challenges, ethical considerations, and practical guidelines. American Psychologist graphic variation in well-being using tweets. In Proceedings of the 7th International 70, 6 (2015), 543. AAAI Conference on Weblogs and Social Media (ICWSM). [54] Katy Backes Kozhimannil, Connie Mah Trinacty, Alisa B Busch, Haiden A [77] H Andrew Schwartz, Johannes C Eichstaedt, Margaret L Kern, Lukasz Dziurzyn- Huskamp, and Alyce S Adams. 2011. Racial and ethnic disparities in postpartum ski, Stephanie M Ramones, Megha Agrawal, Achal Shah, Michal Kosinski, David depression care among low-income women. Psychiatric Services 62, 6 (2011), Stillwell, Martin EP Seligman, and Lyle H Ungar. 2013. Personality, gender, and 619–625. age in the language of social media: The Open-Vocabulary approach. PLoS ONE [55] Robert S Levine, James E Foster, Robert E Fullilove, Mindy T Fullilove, Nathaniel C (2013). 
Briggs, Pamela C Hull, Baqar A Husaini, and Charles H Hennekens. 2016. Black- [78] H Andrew Schwartz, Salvatore Giorgi, Maarten Sap, Patrick Crutchley, Lyle white inequalities in mortality and life expectancy, 1933–1999: implications for Ungar, and Johannes Eichstaedt. 2017. DLATK: Differential language analysis healthy people 2010. Public health reports (2016). ToolKit. In Proceedings of the 2017 Conference on Empirical Methods in Natural [56] Judette M Louis, M Kathryn Menard, and Rebekah E Gee. 2015. Racial and ethnic Language Processing: System Demonstrations. 55–60. disparities in maternal morbidity and mortality. Obstetrics & Gynecology 125, 3 [79] Michael E Sobel. 1982. Asymptotic confidence intervals for indirect effects in (2015), 690–694. structural equation models. Sociological methodology 13 (1982), 290–312. [57] Michael C Lu. 2018. Reducing maternal mortality in the United States. Jama 320, [80] VG Vinod Vydiswaran, Yang Liu, Kai Zheng, David A Hanauer, and Qiaozhu 12 (2018), 1237–1238. Mei. 2014. User-created groups in health forums: What makes them special?. In [58] Marco Lui and Timothy Baldwin. 2012. langid. py: An off-the-shelf language Eighth International AAAI Conference on Weblogs and Social Media. identification tool. In Proceedings of the ACL 2012 system demonstrations (ACL). [81] Sen. Elizabeth Warren. [n.d.]. Sen. Elizabeth Warren On Black Women Ma- 25–30. ternal Mortality: ’Hold Health Systems Accountable For Protecting Black [59] N Martin, E Cillekens, and A Freitas. 2017. Lost mothers. ProPublica. Moms’. https://www.essence.com/feature/sen-elizabeth-warren-black-women- [60] Andrew Kachites McCallum. 2002. Mallet: A machine learning for language mortality-essence/. toolkit. http://mallet. cs. umass. edu (2002). [82] Carla Abou Zahr, Tessa M Wardlaw, and Yoonjoung Choi. 2004. Maternal mor- [61] David Mechanic. 1996. Changing medical organization and the erosion of trust. tality in 2000: estimates developed by WHO, UNICEF and UNFPA. World Health The Milbank Quarterly (1996), 171–189. Organization. [62] Danielle Mowery, Albert Park, Mike Conway, and Craig Bryan. 2016. Towards [83] Mohammadzaman Zamani, Anneke Buffone, and H Andrew Schwartz. 2018. automatically classifying depressive symptoms from Twitter data for population Predicting Human Trustfulness from Facebook Language. arXiv preprint health. arXiv:1808.05668 (2018). [63] Donald Musa, Richard Schulz, Roderick Harris, Myrna Silverman, and Stephen B Thomas. 2009. Trust in the health care system and the use of preventive health services by older black and white adults. American journal of public health 99, 7 (2009), 1293–1299. [64] NYC Health. [n.d.]. Severe Maternal Morbidity: New York City, 2008-2012. https://www1.nyc.gov/assets/doh/downloads/pdf/data/maternal-morbidity- report-08-12.pdf. [65] Gregory Park, H Andrew Schwartz, Johannes C Eichstaedt, Margaret L Kern, Michal Kosinski, David J Stillwell, Lyle H Ungar, and Martin EP Seligman. 2015. Automatic personality assessment through social media language. Journal of personality and social psychology 108, 6 (2015), 934. [66] Michael J. Paul and Mark Dredze. 2011. You Are What You Tweet: Analyzing Twitter for Public Health. In International Conference on Weblogs and Social Media (ICWSM). 265–272. [67] Michael J Paul and Mark Dredze. 2017. Social monitoring for public health. Synthesis Lectures on Information Concepts, Retrieval, and Services 9, 5 (2017), 1–183. 
[68] Emily E Petersen, Nicole L Davis, David Goodman, Shanna Cox, Nikki Mayes, Emily Johnston, Carla Syverson, Kristi Seed, Carrie K Shapiro-Mendoza, William M Callaghan, et al. 2019. Vital Signs: Pregnancy-Related Deaths, United States, 2011–2015, and Strategies for Prevention, 13 States, 2013–2017. Morbidity and Mortality Weekly Report 68, 18 (2019), 423. [69] Daniel Preotiuc-Pietro, Jordan Carpenter, Salvatore Giorgi, and Lyle Ungar. 2016. Studying the Dark Triad of personality through Twitter behavior. In Proceed- ings of the 25th ACM international on conference on information and knowledge management. ACM, 761–770. [70] Daniel Preotiuc-Pietro, Sina Samangooei, Trevor Cohn, Nicholas Gibbins, and Mahesan Niranjan. 2012. Trendminer: An architecture for real time analysis of social media text. In In Proceedings of the 6th International AAAI Conference on Weblogs and Social Media, Workshop on Real-Time Analysis and Mining of Social Streams, ICWSM. [71] Daniel Preoţiuc-Pietro, H Andrew Schwartz, Gregory Park, Johannes Eichstaedt, Margaret Kern, Lyle Ungar, and Elisabeth Shulman. 2016. Modelling valence and Quantifying Community Characteristics of Maternal Mortality Using Social Media WWW ’20, April 20–24, 2020, Taipei, Taiwan

APPENDIX
We include further details on results and discussions from the main text below.

A SAMPLE TWEETS
Table 5 shows a sample of three tweets for each topic. To find these representative tweets, we extract topic loadings over a random set of 500,000 pregnancy-related tweets. We then order the tweets by topic loadings and hand-select three tweets (out of the top ten) that best describe the topic, ignoring noisy or uninformative tweets. For example, a tweet "teen rates!!!" would load extremely high in our first topic, but it doesn't capture any additional information over the list of the highest-weighted words within the topic. Note that all typos and emoticons in the tweets are included unchanged.

Table 5: Sample topics with sample tweets

Teen Pregnancy (1.34%):
  - teenage pregnancy #iblamedavidcameron
  - Decreasing infant mortality around the world
  - #BecauseOfYolo teenage pregnancy rate has risen
Morning Sickness (0.54%):
  - The purpose of our lives is to give birth to the best which is within us Marianne Williamson #spirituality
  - Ecotopic pregnancy diagnosis symptoms and complications
  - Getting a sickness that isn't morning sickness while #pregnant #sucks #cough #throathurts #stuffynose #blah
Celebrity Pregnancies (1.42%):
  - Amber rose is pregnant ? #damnwiz
  - hopefully kim k's pregnancy doesnt last 72 days
  - Taylor swift pregnant by harry
Abortion Rights (2.15%):
  - Lawmakers ban shackling of pregnant inmates
  - #SouthAfrica to care for all #HIV positive infants #worldaidsday #womensrights #children
  - Nebraska governor rejects prenatal care funding for illegal immigrants
Maternal Studies (2.56%):
  - Lower autism risk with folic acid supplements in pregnancy
  - Postpartum cardiovascular risk linked to glucose intolerance during pregnancy
  - Increased autism risk linked to hospital-diagnosed maternal infections
Congratulatory Remarks (3.06%):
  - Congrats to and on the birth of their baby
  - #5yearsago i gave birth to my wonderful daughter <3 <3 <3
  - Awwwwww my nephew's wife is pregnant <3 congrats!

B THEORETICAL MODELS
We present high-level details for each of the four models used in this paper. Detailed descriptions and evaluations can be found in the corresponding papers. Note that none of the models described below were developed for this paper.

Affect. An affect model was built using a set of 2,895 annotated Facebook posts. Each post was rated by two psychologists on a nine-point ordinal scale, based on the affective circumplex model introduced by Russell [74]. An ℓ2-penalized linear (ridge) regression was built using 1-2grams extracted from each message. Using a 10-fold cross-validation setup, the ngram model resulted in a prediction accuracy (Pearson r) of 0.65. Full details can be found in Preoţiuc-Pietro et al. [71].

Depression. The MyPersonality data set [53], which consisted of approximately 154,000 consenting users who shared Facebook statuses and completed a 100-item personality questionnaire, was used. The personality questionnaire is based on the International Personality Item Pool proxy for the NEO Personality Inventory [21]. This work then takes the average response to the seven depression-facet items (located within the larger Neuroticism scale) to estimate a user-level degree of depression. A ridge-penalized regression model was built [47] using a set of 2,000 LDA topics and 1-3grams extracted over 27,749 individuals and tested on 1,000 random individuals who used at least 1,000 words across all of their statuses. This resulted in a final prediction accuracy (Pearson r) of 0.39. Full details can be found in Schwartz et al. [75].

Trust. Similar to the depression model, the trust model was built using the MyPersonality Facebook data set [53]. Consenting individuals were asked to share their Facebook statuses and answer a Big-Five personality questionnaire. The average of three of the ten trust-facet items from the agreeableness domain – (1) "I believe that others have good intentions," (2) "I trust what people say," and (3) "I suspect hidden motives in others" (reverse-coded) – was used as a measure of trust. A predictive model was built on 26,243 users who answered the above questions and also shared Facebook statuses (with at least 1,000 words across all statuses) and evaluated on a smaller set of users (N = 621) who answered the ten trust-facet items. Using a set of 2,000 LDA topics and 1-3grams, this resulted in a prediction accuracy (Pearson r) of 0.35. Full details can be found in Zamani et al. [83].

Stress. Participants were recruited through Qualtrics (an online survey platform, similar to Amazon Mechanical Turk), where each participant answered a series of demographic questions and the 10-item Cohen's Stress scale [19] and consented to share their Facebook statuses. The analysis was then limited to those who self-reported age and gender (female/male) and who posted at least 500 words across all Facebook statuses, resulting in a final set of 2,749 participants. A set of 2,000 Facebook topics were used as features in a ridge-penalized regression model [47]. This resulted in a prediction accuracy of 0.32 (Pearson r), using a 10-fold cross-validation setup. Full details can be found in Guntuku et al. [43].
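Conceptually, each of these pre-trained models reduces to a set of weights over language features (topics or n-grams) that can be applied to aggregated county-level feature vectors. The sketch below illustrates that application step; the file names, column layout, and intercept handling are hypothetical and may differ from the released models' actual format.

    # Sketch of applying a pre-trained linear language model (e.g., the trust or
    # affect model) to county-level topic usage. CSV formats and column names
    # are hypothetical; the released models may be packaged differently.
    import pandas as pd

    weights = pd.read_csv("model_weights.csv", index_col="feature")["weight"]   # hypothetical
    county_topics = pd.read_csv("county_topic_usage.csv", index_col="county")   # counties x topics

    shared = county_topics.columns.intersection(weights.index)
    county_scores = county_topics[shared].dot(weights.loc[shared])   # linear score per county
    county_scores.to_csv("county_trust_scores.csv")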

C DOMAIN TRANSFER: APPLYING FACEBOOK MODELS TO TWITTER DATA
All four of our theoretical models were trained and evaluated on Facebook data in their original papers, whereas we applied the models to Twitter data. Some of the models have been shown to work in other domains (i.e., stress on Facebook vs. Twitter; Guntuku et al., 2019). Additionally, previous work has found that effect sizes tend to vanish without correcting for the domain transfer [72], which we argue makes our prediction task harder. Furthermore, Rieman et al. [72] showed that user-level Facebook models applied to county-level Twitter data are stable in terms of the direction of effect sizes.

D SPATIAL DISTRIBUTIONS
Figure 3 shows the relationship between maternal mortality rates (residualized on race, median age of females, socioeconomics, and risk factors) and the topic loadings for the Congratulatory Remarks topic. Markers in the scatter plot are colored according to U.S. Census regions (Midwest, Northeast, South, and West). We see that lower usage of this topic is associated with higher mortality rates. We also see spatial clustering across the regions. For example, the West tends to have lower rates of mortality but large variance in topic usage. The South has the most variation in mortality in addition to the largest outliers in topic usage. Figure 4 includes a similar set of plots for the theoretically relevant features, showing significant associations between affect and trust and maternal mortality.

Figure 3: Maternal mortality rate (residualized) vs. the Congratulatory Remarks topic loading. Dots are colored by the U.S. Census Region in which the county resides: Midwest, Northeast, South, and West.

Figure 4: Maternal mortality rate (residualized) vs. theoretically relevant features, with panels (a) Affect, (b) Depression, (c) Stress, and (d) Trust. Dots are colored by U.S. Census Region: Midwest, Northeast, South, and West.