Experiment Plan: The Effect of Sending Thanks to Others on

J. Nathan Matias and Julia Kamin July 28, 2019

Contents

Introduction
Scope of this Experiment Plan
Steps in this Experiment
Outcomes
    Main Outcome: Socially Supportive Actions
    Secondary Outcome: Positive Feelings About One’s Contributions to Wikipedia
Estimation Procedures and Assumptions
    Variables Used in Estimation Procedures
        Randomization Block
        Survey Compliance
    Variables To Report In Publications
        Language Wikipedia
        Previous Socially Supportive Actions on Wikipedia
        Previous and Subsequent Labor Hours on Wikipedia
        Year the account joined Wikipedia
Code for Estimation of Treatment Effect
    Main Outcome: Effect on Socially Supportive Actions
    Main Outcome: Effect on Positive Feelings About One’s Own Contributions to Wikipedia
Exploratory Analyses
Acknowledgments

Introduction

Supporting the social processes of Wikipedia by mentoring newcomers, monitoring the site for damaging contributions, and participating in content disputes requires substantial emotional labor [1]. A pre-survey of German, Persian, and Polish Wikipedia, conducted in July 2019 and used to inform this study design, found that Wikipedians who report doing more mentoring and monitoring also report that editing Wikipedia is more emotionally draining (r(440) = 0.26, p < 0.001). At the same time, Wikipedians who do more mentoring and monitoring also report feeling more positive about their contributions to the encyclopedia anyone can edit (r(440) = 0.11, p = 0.01). The socially supportive work of Wikipedia is burdensome but worthwhile to those who do it. Are people who carry out supportive activities those who already value social support, or does the work of supporting others on Wikipedia grow a person’s positive feelings about their contributions to Wikipedia? In this field experiment, co-designed with Wikipedia communities including the Persian and Polish ones, we test the hypothesis that engaging in supportive behavior on Wikipedia has a positive effect on how people feel about their contributions. We also test the hypothesis that such an effect, if it exists, is associated with an increase in supportive behavior over time.

Scope of this Experiment Plan

This document describes the research procedure, outcomes, adjustment variables, and estimation procedures.

Steps in this Experiment

The experiment has the following steps:

• We recruit a convenience sample of participants from the population of people with accounts on Wikipedia language editions including German, Polish, and Persian. Participants will be recruited based on the following criteria:
  – In German Wikipedia, accounts that have permission to flag revisions.
  – In , accounts registered for at least one year with at least 500 edits.
  – In Polish Wikipedia, accounts with permission to flag revisions.
• Participants will be recruited using the following methods:
  – Publish a Wikipedia banner ad to eligible accounts in a given language Wikipedia over a period of 16 days. The banner presents the study as related to improving the environment and outcomes of Wikipedia.
  – Liaisons affiliated with a given language Wikipedia will send emails and other messages to encourage other Wikipedians to participate.
• We present participants with a consent form that describes the study as related to helping learn how to improve Wikipedia. To consent, participants enter their email address and their Wikipedia username into a form and submit the form to us. Participants are assigned a study ID so that personally identifiable information is stored separately from data collected in the study. Participants are not compensated for the study, but they are offered a “barnstar” or other symbolic reward indicating that they participated in the research.
• Upon consent, participants are directed to a survey. We collect the following information from that survey and from observational data:
  – Observational data collected from administrative data on Wikipedia (defined below)
  – Survey questions:
    ∗ Their rating of their overall experience as an editor on Wikipedia (ordinal scale of 1 to 5 from poor to excellent)
    ∗ Agreement or disagreement with statements about other Wikipedians (ordinal scale from 1 to 5 from strong disagreement to strong agreement)
    ∗ Agreement or disagreement with statements on their own feelings about contributing to Wikipedia (ordinal scale from 1 to 5 from strong disagreement to strong agreement)
    ∗ How much time they contribute to Wikipedia in a range of activities (ordinal scale from 1 to 5 from never to all of my time)
• At the end of the recruitment period, if we haven’t received a survey response from a participant, we send them a message via email asking them to complete the survey as a pre-condition to participating.

• After receiving a full complement of volunteers, we allocate participants into a condition of the experiment. This process includes:
  – Compiling measures used for randomization:
    ∗ Year the account was created
    ∗ Self-reported positive feelings about their contributions to Wikipedia
    ∗ Supportive actions over 84 days: the number of contributions to Wikipedia talk pages, Wikipedia namespace pages, and thanks sent by an account in the 84-day period before enrolling
  – Setting aside volunteers whose observed behavior on Wikipedia falls outside the 99% confidence interval for their language Wikipedia on the following behavioral measures:
    ∗ Supportive actions
  – Randomizing participants into two conditions, blocking participants within their language Wikipedia in pairs using the blockTools R package, based on Mahalanobis distance [2, 3]. Matching will be conducted on the following characteristics:
    ∗ Account age in years
    ∗ Supportive actions
    ∗ Positive feelings about one’s contributions to Wikipedia

• We will email participants based on these randomizations to carry out a short task in one of the following conditions:
  – Mentoring condition: in this condition, participants are presented with a series of lists of Wikipedia edits, each list from a single contributor. All contributors eligible for thanks will have made at least four edits rated to be “non-damaging good faith” by the ORES machine learning system for that language (or in the case of German Wikipedia, edits by accounts that were approved by other Wikipedians). For each list, participants are asked to:
    ∗ Identify any Wikipedia edit that they would like to “thank” the contributor for
    ∗ Select the item to thank by clicking a button
    ∗ If the participant does not see anything they are willing to “thank,” they can click a “skip” button and view another list of edits from a different account
    ∗ With their knowledge, our software will act on their behalf to “thank” the account for the contribution. This message of appreciation uses Wikipedia’s existing thanking feature, which allows a person to thank another for a contribution they made
    ∗ Our software will notify the participant that the account was thanked and show them a list of the next person’s edits
    ∗ Once the participant has given out the full complement of thanks or after 6 hours have elapsed, the participant is shown a completion page and not allowed to send further thanks via the system.
  – Routine condition: in this condition, participants are prompted to spend 10 minutes carrying out one of three possible routine activities on Wikipedia, randomly selected from the list below:
    ∗ Categorize articles that have no or too few categories.
    ∗ Check for and note missing sources on article pages.
    ∗ Correct typos and grammar on article pages. (“Copyediting”)
    ∗ Improve layout, infoboxes, other templates. (“Wikifying”)
    ∗ Review new images and make sure they don’t violate copyright laws.
    ∗ Review articles to nominate for good article status.
    ∗ Add links to pages that have no or too few incoming links. (“Orphan pages”)
    ∗ Improve or create lead sections of articles.

• No more than 1 day after completing the mentoring or routine task, participants are asked to complete a survey with the same questions as the pre-survey.

• We also observe the following behavioral outcomes over the 56 days before starting and the 56 days after completing the intervention, based on previously validated measures of “social roles” on Wikipedia [4]:
  – Number of Supportive Actions
  – Labor Hours
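The pairing and assignment step described above can be illustrated with a small base-R sketch. This is not the blockTools implementation that the plan specifies: it uses a simple greedy nearest-partner rule, and the `volunteers` data frame, its values, and all column names are invented for illustration.

```r
# Hypothetical sketch: greedy pair blocking on Mahalanobis distance, followed
# by random assignment within each pair. NOT the blockTools algorithm.
set.seed(2019)  # arbitrary seed so the random assignment is reproducible

# Invented stand-in for the volunteers of one language Wikipedia
volunteers <- data.frame(
  id                 = 1:8,
  account.age        = c(2, 3, 10, 11, 5, 6, 14, 15),
  supportive.actions = c(5, 8, 40, 35, 12, 20, 60, 55),
  positive.feeling   = c(3, 4, 4, 5, 2, 3, 5, 4)
)
n <- nrow(volunteers)

covariates <- as.matrix(
  volunteers[, c("account.age", "supportive.actions", "positive.feeling")]
)
covariance <- cov(covariates)

# Greedy pairing: take the first unmatched volunteer and pair them with the
# closest remaining volunteer (row index equals id in this toy data).
unmatched <- volunteers$id
block <- integer(n)
block.number <- 0
while (length(unmatched) >= 2) {
  first  <- unmatched[1]
  others <- unmatched[-1]
  # mahalanobis() returns squared distances from each row of x to `center`
  distances <- mahalanobis(covariates[others, , drop = FALSE],
                           center = covariates[first, ],
                           cov = covariance)
  partner <- others[which.min(distances)]
  block.number <- block.number + 1
  block[c(first, partner)] <- block.number
  unmatched <- setdiff(unmatched, c(first, partner))
}
volunteers$block <- block

# Within each pair, one member is randomly assigned to treatment (1)
# and the other to control (0).
volunteers$TREAT <- ave(rep(0, n), volunteers$block,
                        FUN = function(pair) sample(c(0, 1)))
volunteers
```

In this sketch every block contains exactly one treated and one control participant, which is the property the matched-pair design relies on.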

Outcomes

The following variables will be used to estimate the effect of the intervention on participants’ reported feelings about their contributions to Wikipedia and on their behavior. The unit of observation is the individual participant.

Main Outcome: Socially Supportive Actions

Supportive actions are actions made to support the social endeavor of Wikipedia, including edits to Talk Pages, edits on the Wikipedia namespace, and Thanks sent to others from this account. The outcome is the difference between the number of supportive actions in the 56 days before starting and the 56 days after completing the intervention.

participant$diff.supportive.actions
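As a minimal sketch of how this difference could be computed from the two 56-day counts, assuming the sign convention is subsequent minus previous (the plan does not state the direction) and using invented values:

```r
# Invented example counts; column names follow the variables named in this plan.
participant <- data.frame(
  previous.supportive.actions   = c(12, 0, 30),
  subsequent.supportive.actions = c(15, 4, 22)
)

# Assumed convention: positive values mean more supportive actions afterwards.
participant$diff.supportive.actions <-
  participant$subsequent.supportive.actions -
  participant$previous.supportive.actions

participant$diff.supportive.actions
```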

Secondary Outcome: Positive Feelings About One’s Contributions to Wikipedia

This measure, used in an exploratory analysis, is inspired by a question from the Maslach Burnout Inventory and applied to a Wikipedia context [5].

How often do the following statements describe how you feel when contributing to Wikipedia? [1 to 5, 1 being Never, 5 being Always]

• I feel positive about the contributions I am making to [language] Wikipedia.

This measure is the difference between answers before and after the intervention.

participant$diff.positive.feeling

Estimation Procedures and Assumptions

We will use the procedures and assumptions below to calculate the average treatment effect estimate, standard errors / confidence intervals, and p-values.

Variables Used in Estimation Procedures

Randomization Block

The specific randomization block of a particular participant.

participant$block

Survey Compliance

Complier is a binary variable (0 or 1) that records whether a participant completed both surveys.

participant$complier
# Keep only participants who completed both surveys
complier.participants <- subset(participant, complier == 1)

Variables To Report In Publications

We will report the following variables in publications to show the balance of random assignment.

Language Wikipedia

This two-character string records the language Wikipedia that the account was associated with in this study. Potential values include DE, PL, AR, and FA.

participant$lang

Previous Socially Supportive Actions on Wikipedia

This integer is the total number of edits made to article talk pages, edits made to the Wikipedia namespace, and thanks sent by the account in the 56 days before enrolling. This variable is used in the experiment assignment and will be used to put into context the main outcome for the difference in the number of socially supportive actions.

participant$previous.supportive.actions
participant$subsequent.supportive.actions

Previous and Subsequent Labor Hours on Wikipedia

Labor hours is the estimated number of hours contributed by the account in the 56 days before enrolling and the 56 days after treatment [6]. This variable will be used to put into context the main outcome for the number of socially supportive actions. This measure omits the actions taken in the 6-hour period after a participant begins the assigned task.

participant$previous.labor.hours
participant$subsequent.labor.hours

Year the account joined Wikipedia

This integer records the year that the participant joined Wikipedia.

participant$year.joined
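One way the balance of random assignment might be summarized is a table of covariate means by condition. The sketch below is only an illustration under assumed data: the values are invented, and the layout of the balance table is our assumption rather than a reporting format fixed by this plan.

```r
# Invented example data; column names follow the variables named in this plan.
participant <- data.frame(
  TREAT                       = c(0, 1, 0, 1),
  previous.supportive.actions = c(10, 12, 40, 38),
  previous.labor.hours        = c(5, 6, 20, 18),
  year.joined                 = c(2006, 2007, 2015, 2014)
)

# Mean of each reported covariate within each condition
balance <- aggregate(
  cbind(previous.supportive.actions, previous.labor.hours, year.joined) ~ TREAT,
  data = participant,
  FUN = mean
)
balance
```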

Code for Estimation of Treatment Effect

The decision rule for all analyses will be α = 0.05. The two main analyses will be adjusted for multiple comparisons using the Holm method [7].
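As an illustration of the Holm adjustment, the sketch below applies base R's `p.adjust()` to two placeholder p-values. The numbers are invented and are not results from this study.

```r
# Placeholder p-values for the two main analyses (invented, not study results)
p.values <- c(supportive.actions = 0.040, positive.feeling = 0.012)

# Holm: the smallest p-value is multiplied by 2, the next by 1, and the
# adjusted values are forced to be non-decreasing in that order.
adjusted <- p.adjust(p.values, method = "holm")
adjusted

# Decision rule from this plan: compare adjusted p-values against 0.05
reject <- adjusted < 0.05
reject
```

With these placeholder values the adjusted p-values are 0.040 and 0.024, so both hypotheses would be rejected at α = 0.05.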

Main Outcome: Effect on Socially Supportive Actions

We estimate the average treatment effect on the difference in supportive actions on Wikipedia.

library(estimatr)  # provides difference_in_means()
difference_in_means(diff.supportive.actions ~ TREAT, blocks = block, data = participant)
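For intuition about what a blocked difference in means computes when every block is a matched pair, the following base-R sketch takes the treated-minus-control difference within each pair and averages those differences. It is a simplified illustration on invented data with the usual paired standard error, not a substitute for the estimation call above.

```r
# Invented example: three matched pairs, one treated and one control in each
participant <- data.frame(
  block                   = c(1, 1, 2, 2, 3, 3),
  TREAT                   = c(1, 0, 0, 1, 1, 0),
  diff.supportive.actions = c(5, 2, 1, 3, 0, 4)
)

# Line up the treated and control member of each pair by block
treated <- participant[participant$TREAT == 1, ]
control <- participant[participant$TREAT == 0, ]
treated <- treated[order(treated$block), ]
control <- control[order(control$block), ]

# Within-pair differences, their mean, and the paired standard error
pair.differences <- treated$diff.supportive.actions -
  control$diff.supportive.actions
estimate  <- mean(pair.differences)
std.error <- sd(pair.differences) / sqrt(length(pair.differences))

c(estimate = estimate, std.error = std.error)
```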

Main Outcome: Effect on Positive Feelings About One’s Own Contributions to Wikipedia

We estimate the complier average treatment effect on the difference in reported positive feelings about one’s contributions to Wikipedia, within the subset of participants that completed both surveys.

difference_in_means(diff.positive.feeling ~ TREAT, data = complier.participants)

Exploratory Analyses

In addition to the above confirmatory analyses, we also expect to carry out a range of exploratory analyses, including:

• Testing the effect of the intervention on the difference in overall labor hours
• Summaries of within-language observations
• Tests of variation in treatment effects by the variables used in matching (previous supportive actions, etc.)
• Tests of correlations between survey items and behavioral outcomes

Acknowledgments

We are grateful to our community liaisons with the Arabic, German, Persian, and Polish Wikipedias, who facilitated this research and provided feedback on this experiment plan.

References

[1] Amanda Menking and Ingrid Erickson. The heart work of Wikipedia: Gendered, emotional labor in the world’s largest online encyclopedia. In Proceedings of the 33rd annual ACM conference on human factors in computing systems, pages 207–210. ACM, 2015.

[2] Ryan T. Moore. Multivariate continuous blocking to improve political science experiments. Political Analysis, 20(4):460–479, 2012.

[3] Ryan T. Moore and Keith Schnakenberg. blockTools: Blocking, assignment, and diagnosing interference in randomized experiments. Version 0.6-3, December, 2016.

[4] Howard T. Welser, Dan Cosley, Gueorgi Kossinets, Austin Lin, Fedor Dokshin, Geri Gay, and Marc Smith. Finding social roles in Wikipedia. In Proceedings of the 2011 iConference, pages 122–129. ACM, 2011.

[5] Christina Maslach, Susan E. Jackson, Michael P. Leiter, Wilmar B. Schaufeli, and Richard L. Schwab. Maslach burnout inventory, volume 21. Consulting Psychologists Press, Palo Alto, CA, 1986.

[6] R. Stuart Geiger and Aaron Halfaker. Using edit sessions to measure participation in Wikipedia. In Proceedings of the 2013 conference on Computer supported cooperative work, pages 861–870. ACM, 2013.

[7] Sture Holm. A simple sequentially rejective multiple test procedure. Scandinavian journal of statistics, pages 65–70, 1979.
