
A report on the Account Creation Improvement Project and the Fellowship

By Lennart Guldbrandsson

Timeline
Before the Fellowship
The Fellowship
After the Fellowship
Participants and method

Interpreting the results
List of MediaWiki messages involved in the account creation process
Log In / Create Account page
User Create Page
Confirmation page
Function that checks for available usernames
The code
Implementation
Thanks

Introduction

We made a bet after a meeting in a room on Floor 6 of the Wikimedia Foundation's offices in San Francisco. It was me and Frank Schulenburg, plus Winifred Olliff. The bet was for two bottles of good champagne. The question was how much we would manage to increase the share of people who started to edit after we had changed the account creation process. For the life of me, I can't remember what Winifred said before we left our notes for safekeeping with Bryony Jones, but I remember distinctly that Frank surprised us by saying that we would go up to 42%. Since the current number was under 30% on most days, that was a bold guess.

But in my hand I held a note with another number on it. It was 47%...

This is the story of who won that bet. Or, if you will, a report on the Account Creation Improvement Project and my, i.e. Lennart Guldbrandsson's, Wikimedia Foundation community Fellowship during the first half of 2011. After reading it, you'll know what we recommend for the future. You'll also know how to implement the suggested changes on your wiki.

The Account Creation Improvement Project was started in the Wikimedia Foundation community department by Frank Schulenburg. Lennart's Fellowship at first consisted of working on two simultaneous projects: the Bookshelf project and the Account Creation Improvement Project. At the end of May, though, the Bookshelf part ended, allowing Lennart to focus on the Account Creation Improvement Project. Throughout this period, Lennart reported regularly to Frank Schulenburg and Zack Exley, as well as to the Wiki/pm/edia community.

During the six months of the Fellowship, Lennart conducted two surveys, led a series of low-quality tests, oversaw the design of a tracking system and the development of two high-quality test versions, helped deploy them, and finally worked on gathering and analyzing the data from the tests.

Overview of the project and the Fellowship

Timeline


Before the Fellowship
September 18, 2010 – Frank starts the Account Creation Improvement Project (Lennart signed on as interested in participating on the same day)

The Fellowship
December 16, 2010 – Lennart's Fellowship starts
January 2-9, 2011 – Lennart to San Francisco
January 25-February 4, 2011 – Survey rolled out on six languages
February 2-March 30 – Low-quality tests live on English Wikipedia
March 28-April 1, 2011 – Lennart to San Francisco, again
April 24, 2011 – Tracking system up and running
June 9, 2011 – High-quality tests live on English Wikipedia
June 16, 2011 – Lennart's Fellowship ends

After the Fellowship
July 1, 2011 – Statistics on how many make 5 edits after the account creation tests
July 20-30, 2011 – Report being compiled
August 5, 2011 – Lennart presents Account Creation Improvement Project at Wikimania

Participants and method

For the most part, Lennart has been working with volunteers during the Account Creation Improvement Project. The main workspace has been the Outreach wiki at http://outreach.wikimedia.org/wiki/Account_Creation_Improvement_Project, but some work has been conducted on different language versions of Wikipedia, the Test wiki, and the MediaWiki wiki.

Besides the volunteers, the project has had some Wikimedia Foundation staff support, primarily in the form of ideas and technical support.

For brevity's sake, this report uses the pronoun “we” to signify the work of any of these three groups of people (Lennart, volunteers, staff), instead of constantly naming who did what, the focus being on the results.

Note: most of the links in this report lead to English Wikipedia. This is to keep things brief. If you change the “en” in the URL to the ISO code for your language, all MediaWiki pages, for instance, go to their counterparts on the Wikipedia in that language.

The lay of the land

Before we started to change anything in the account creation process, we wanted to understand what we had. That meant that we had to do a bit of documentation and research. There wasn't much before, at least not collected anywhere.

The original account creation process

The account creation process as it was before the Account Creation Improvement Project started had been created by a) the tech staff of the Wikimedia Foundation, and b) Wikipedia volunteers.

The technical aspect

Since the project had very limited support in terms of full-time tech staff help, the existing account creation process would have to remain relatively unchanged. That was not a major problem. Compared to many other websites, Wikipedia has a rather simple and clear account creation process, with only a few fields to fill in. In contrast to websites such as Facebook, Wikipedia will of course let you use the service (both read and contribute) without an account.

The process to create an account in Wikipedia looks like this:


Steps for account creation:

0: User clicks on the Log in / create account link in the navigation
1: User is taken to the Login page, which asks the user to either log in or to create an account
2: User is taken to the Account Creation page, which asks the user to create an account by filling in a form
3: User is taken to a Confirmation page
(Source)

Each of these pages consists of a number of different MediaWiki messages that can be edited by administrators only, and some system messages that only programmers can change. For a complete list of the MediaWiki messages that administrators can edit, see the Appendix.

The content aspect

Whereas the technical aspect of the account creation process is controlled by the Wikimedia Foundation staff, the content of the pages in the process is decided by each Wikipedia language community. High-traffic pages of that kind are normally built up from experience, such as mistakes made by newcomers and frequently asked questions. In the account creation process for English Wikipedia, this arguably led to instruction creep, and even to a process that was unwelcoming and daunting for newcomers. This is how one of the pages in the original account creation process looked before the project started:


A comparison between Wikipedia's and Facebook's account creation processes makes this instruction creep very clear: Facebook's 54 words against Wikipedia's 681 words, including a few warning signs and phrases marked in bold red!

The ”before” statistics

We believed that the existing account creation process was working fine, but needed some statistics to back that up. During September 2010, Frank Schulenburg wrote several queries for the Toolserver. The goal was to understand how many people created accounts per day, and how many of them started to edit. Here's the result for August 2010:

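Project                     Number of new user    New user accounts:
                            accounts overall      average per day
English Wikipedia           169,706               5,474
Spanish Wikipedia           27,102                874
German Wikipedia            17,963                579
(name not preserved)        16,712                539
Russian Wikipedia           12,653                408
Dutch Wikipedia             5,743                 185
Polish Wikipedia            5,393                 174
Swedish Wikipedia           3,992                 129
English Wikinews            2,801                 90
(name not preserved)        2,686                 87
(name not preserved)        2,602                 84
Hindi Wikipedia             1,293                 42
Tamil Wikipedia             807                   26
Bengali Wikipedia           677                   22
Malayalam Wikipedia         616                   20

Number of new accounts per month.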

Project                     Number of new    New accounts with at least one edit    Percentage
                            user accounts    within the first 10 days after
                                             creation
English Wikipedia           169,706          51,864                                 31%
Spanish Wikipedia           27,102           6,769                                  25%
German Wikipedia            17,963           6,221                                  34%
Russian Wikipedia           12,653           3,335                                  26%
Dutch Wikipedia             5,743            1,413                                  24%
Polish Wikipedia            5,393            1,255                                  23%
Swedish Wikipedia           3,992            1,190                                  30%
English Wikinews            2,801            2,798                                  100%
Hindi Wikipedia             1,293            121                                    9%
Tamil Wikipedia             807              82                                     10%
Bengali Wikipedia           677              74                                     11%
Malayalam Wikipedia         616              70                                     11%

Number of new accounts that are used in the first 10 days.
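The percentage column can be recomputed from the two count columns. The Python sketch below does that for the figures quoted in the table above. The name `activation_rate` is only illustrative, and the recomputed values can differ from the published column by up to about one percentage point, presumably because of how the original figures were rounded.

```python
# Counts copied from the August 2010 table:
# project -> (new accounts, accounts with >= 1 edit within 10 days, published %)
BEFORE_STATS = {
    "English Wikipedia": (169_706, 51_864, 31),
    "Spanish Wikipedia": (27_102, 6_769, 25),
    "German Wikipedia": (17_963, 6_221, 34),
    "Russian Wikipedia": (12_653, 3_335, 26),
    "Dutch Wikipedia": (5_743, 1_413, 24),
    "Polish Wikipedia": (5_393, 1_255, 23),
    "Swedish Wikipedia": (3_992, 1_190, 30),
    "English Wikinews": (2_801, 2_798, 100),
    "Hindi Wikipedia": (1_293, 121, 9),
    "Tamil Wikipedia": (807, 82, 10),
    "Bengali Wikipedia": (677, 74, 11),
    "Malayalam Wikipedia": (616, 70, 11),
}


def activation_rate(new_accounts, accounts_with_edit):
    """Return the share of new accounts (in percent) that made an edit."""
    return 100.0 * accounts_with_edit / new_accounts


for project, (created, edited, published) in BEFORE_STATS.items():
    rate = activation_rate(created, edited)
    print(f"{project:20s} {rate:5.1f}%  (published: {published}%)")
```

Running the same function on later months would give the "after" numbers to compare against.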

(The rest of the results of the ”before” statistics are here.)

Is this result good or bad? According to a paper on Facebook, 44.6% of the new users drop off in the first 3 months, but since Facebook's system is very different, and most other websites don't publish data such as this, we concluded that the only data we could compare ourselves to was our own earlier data, i.e. whether we could increase the numbers in the table above.

The surveys

Around January 25, we asked administrators on 7 language versions of Wikipedia (English, German, French, Spanish, Swedish, Bengali and Bahasa Melayu) to put a link to a very short survey on the page that new users "land" on after they have created their new accounts. That survey was the next step in understanding what we needed to change. The goal was not to have a statistically perfect survey, but to get some inkling of what new editors wanted out of their account and what we needed to explain to them. The survey asked them what they expected when they created the account, or in essence, “why create an account on Wikipedia?”.

This is the English version:

Why did you create your account?

Answer choices:

• I wanted to be part of Wikipedia
• I was just curious about what would happen if I created an account
• I thought an account is required to edit articles
• I thought I would get some extra features
• I usually register for websites I like [1]
• I tried to start an article, but I couldn't do that without a user account
• I wanted to upload images
• Other: …
(Source and other language versions)

After a few days, we started a new survey on English, German and Spanish Wikipedia, with the following question:

Congratulations! You just created your own Wikipedia user account. If you had to create a user account again, which of the following statements would convince you more?

Answer choices:

• Having a user account under my real name means that everybody will know how I helped improve Wikipedia.
• Having a user account will enable me to customize the appearance and behavior of Wikipedia to my own preference.
• Having a user account means that I can follow all my favorite topics on Wikipedia.
• Having a user account means that I will get my own user page on Wikipedia. Other users will be able to send me messages.
• Having a user account means that my contributions to Wikipedia are not connected to an IP-address, but to a nickname that can't be tracked back to my physical address.
• Other (please specify)
(Source)

In all, roughly 9,545 people answered our surveys (the French and Bahasa Melayu versions were never started, despite being translated). In order from the fewest to the most responses:

• Bengali - 112
• Swedish - 155
• German version 1 - 255
• Polish - 359
• Spanish version 1 - 881
• German version 2 - 932
• Spanish version 2 - 1050
• English version 1 - 1412
• English version 2 - 1200 (roughly [2])
• English version 3 - 3189
(Source)

[1] This idea is humorously sketched out as a graph here. This is probably the main reason why people with accounts don't start to edit Wikipedia.
[2] Due to a mistake, the specific result is swimming around somewhere on the LimeSurvey, but we have a screenshot of the results: http://upload.wikimedia.org/wikipedia/commons/6/65/Account_creation_survey_2_en.png

Results

The most common answer across all language versions we surveyed was "I wanted to be part of Wikipedia". In fact, the stable results on that point were the reason why we did another survey to understand what "part of Wikipedia" meant.

“Following favorite articles on Wikipedia” was the most common answer in the second survey, with the notable exceptions of German Wikipedia, where anonymity is a bigger reason, and Spanish Wikipedia, where getting a user page and being able to send and receive messages are the most important features. When forced to choose one alternative, newcomers to English Wikipedia think that being able to follow articles is the most important reason to create an account. It is also an important reason when they are free to choose.

Full results of the surveys can be found in the Appendix.

Comments from people taking the survey

In the first and second surveys, we gave people a free text area. People used that option a lot. The most common responses were along the lines of:

• "I don't want to edit articles anonymously."
• "Class assignment"
• ”want to create a book I can same and print” [or other function related answers]
• "I LOVE WIKIPEDIA"

Some gave responses that showed that we still have some outreach work to do:

• ”i thought i would be able to reazd alot more arlicals”
• ”chat with online friends”
• ”To give my forum a page”

And some gave cryptic responses:

• ”I wanted to vandalize!”
• ”Am an artist”
• "Dunno. I was bored."

Testing it live

During Lennart's trip to San Francisco in January, we decided that surveys and statistics are but the first step, and that we needed to test out different things live on Wikipedia.

The low-quality tests

Step 1 in the testing process was to gather as many different ideas as possible without putting too much thought into design, deploy them, and measure how many new users started editing. We called these the “low-quality tests”, in contrast to the more polished versions that we would work on later, once we knew more about what worked and what didn't. We were well aware that the versions in many cases were crude and simplistic, but the tests could show which ones deserved more work.

Starting February 10, the community was invited to contribute their designs of the three pages in the account creation process: the login page, the account creation page and the confirmation page. To help people out, we created a list of assumptions that they could use when designing their own versions. For example, we believed that showing the newcomers images of people may inspire more confidence. Read the full list here. Among the suggestions was one that was used on Polish Wikipedia, where the newcomer gets the “rulebook” on their user page. Other designs had videos and links to reading materials.

On February 23, the first test was performed on English Wikipedia. In all, 10 low-quality designs were tested, the last test ending on March 30. Most of the tests were focused on the last page in the account creation process, the confirmation page. The original version looked like this:

(Image source)

[3] This version became known in the project as “Commons sallad”, because of all the images and extra ingredients.

Samples of versions that were tested:

(version 2)

(version 3)

(version 4)

(version 7)

(version 9)

Results of the first tests

Test version    Number of new accounts    New accounts with at least one edit on the      Best
                (best day)                day of account creation (best day)              percentage
Original        7,180                     2,177                                           30% [4]
1               7,341                     2,151                                           33%
2               6,574                     2,328                                           35%
3               7,266                     2,364                                           32%
4               7,267                     2,433                                           33%
5               6,935                     2,447                                           35%
6               6,080                     1,917                                           32%
7               7,151                     2,183                                           31%
8               6,066                     1,879                                           31%
9               6,697                     2,124                                           32%
10              6,448                     1,978                                           31%
(Source and all results)

Versions 2 and 5 had a few percentage points more people who started to edit than the others, but the remarkable thing is that all versions were better than the original one. Versions 2 and 5 gave 5 percentage points more editing newcomers than the standard one, which on a day with 6,000 new accounts means 300 more potential Wikipedia contributors – per day! These versions were certainly worth exploring more thoroughly.
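The arithmetic behind the "300 more per day" figure is simply the uplift in percentage points applied to the daily number of new accounts. A minimal Python sketch, with a purely illustrative function name:

```python
def extra_editors_per_day(new_accounts_per_day, uplift_percentage_points):
    """Additional newcomers who start editing, given an uplift in percentage points."""
    return new_accounts_per_day * uplift_percentage_points / 100


# The example from the text: 6,000 new accounts and a 5 percentage point uplift.
print(extra_editors_per_day(6000, 5))  # 300.0
```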

The backlash with version 2

We announced the changes before we implemented them, on the Village pump of English Wikipedia, as well as the Administrators' Noticeboard. Most who responded were interested in the tests, but the activity was not very high.

However, when we rolled out version 2 (see picture above) there were outcries in several places, and the page was reverted. To understand why, let's look at how version 2 works, since this will become important later on. Never mind that the design lacks a certain style. The function was proposed by editors from Polish Wikipedia, based on the system that they had there. Once you have created an account, you're given one clear choice: to create a user page. When you click on the button, a template preloads information onto your user page and all you have to do is click the “save” button. The Polish version had a list of around 30 links to guides, policies, and other pages on Wikipedia. The English version looked like this.

What the community reacted to was that so many new users created user pages that looked exactly the same. This function also turned the newcomers' user pages into blue links instead of red links on, for instance, Recent changes, which made it more difficult to distinguish between newcomers (who according to Wikipedia wisdom are more prone to vandalism) and Wikipedians (who are less prone to vandalism).

[4] This percentage deviates a little bit from the percentage in the original statistics.

However, we reasoned that it could be a good thing if Wikipedians who patrolled Recent changes didn't automatically scrutinize new users harder. A test on English Wikipedia using “under cover” Wikipedians posing as newcomers showed that the edits they made were reverted more often than when they made the same type of edits with their usual accounts. This made us think harder about how we could have the best of both worlds: tools for the vandalism patrol on the one hand, and an easier time getting accepted as a Wikipedian on the other.


To get a better measuring rod than statistics from the Toolserver accounts – which had to be updated manually every time – the tech department was given the task of creating a tracking system with A/B testing capabilities as described here.

In essence, what we wanted was a method to randomly send those who wanted to create an account down one of several alternate routes, like this:


We also wanted to track how many went through the different routes – and most importantly, how many of them began to edit after creating their account. In other words: which route was the most effective way of getting people to edit?
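The idea can be sketched in a few lines of Python. This is not the code the tech department wrote. It is a minimal illustration that assumes three equally likely routes with hypothetical labels, and the visit outcomes generated at the end are made up purely to exercise the functions.

```python
import random
from collections import defaultdict

# Hypothetical route labels; the real system used its own identifiers.
ROUTES = ("route_a", "route_b", "route_c")


def assign_route(rng):
    """Pick one of the routes uniformly at random for a new visitor."""
    return rng.choice(ROUTES)


def edit_rate_per_route(visits):
    """Given (route, started_editing) pairs, return {route: percent who edited}."""
    totals = defaultdict(int)
    editors = defaultdict(int)
    for route, started_editing in visits:
        totals[route] += 1
        if started_editing:
            editors[route] += 1
    return {route: 100.0 * editors[route] / totals[route] for route in totals}


# Simulated visitors (made-up outcomes, not project data).
rng = random.Random(0)
visits = [(assign_route(rng), rng.random() < 0.3) for _ in range(3000)]
rates = edit_rate_per_route(visits)
best_route = max(rates, key=rates.get)
print(rates, best_route)
```

Because assignment is random, differences between the routes that hold up over enough visitors can be attributed to the routes themselves rather than to who happened to sign up.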

That task was done by an extension to the MediaWiki software, which placed a tracking cookie in the web browsers of the new users. The tech team wrote:

What data we are storing

We are storing a new cookie upon visiting the “Log in/create account” page, with a lifetime of three months. This cookie will be used to track the following information:
• Which account creation messaging group the user was placed in (identified as ACP1, ACP2 or ACP3 for now)
• What version of the account creation campaign they received
• Whether the particular user made it to the end of the account creation process, or whether they dropped off after reaching the login screen or the account creation screen
• If (and only if) the user creates a new account, the number of edits or previews during the course of the trial
The information is associated with browser sessions (each of which has an individual unique identifier), not with an individual user or user account.
(Source)
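To make the list above concrete, here is one way the stored information could be modelled in Python. The class and field names are assumptions made for illustration and are not taken from the extension's actual code. The sketch only mirrors the four items the tech team listed, plus the rule that edits are counted only once an account exists.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    """How far a browser session got in the account creation process."""
    LOGIN_SCREEN = 1
    ACCOUNT_CREATION_SCREEN = 2
    ACCOUNT_CREATED = 3


@dataclass
class TrackingRecord:
    session_id: str         # unique browser-session identifier, not a user name
    campaign: str           # messaging group: "ACP1", "ACP2" or "ACP3"
    campaign_version: str   # which version of the campaign was shown
    furthest_stage: Stage   # where the session ended up in the process
    edits_or_previews: int = 0

    def record_edit_or_preview(self):
        """Count an edit or preview, but only if an account was created."""
        if self.furthest_stage is not Stage.ACCOUNT_CREATED:
            raise ValueError("edits are only counted after account creation")
        self.edits_or_previews += 1


record = TrackingRecord("session-123", "ACP1", "v1", Stage.ACCOUNT_CREATED)
record.record_edit_or_preview()
print(record)
```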

The extension was deployed on April 27, which is when the more advanced statistics begin.

The tech team theorized that more outreach activities could use the CustomUserSignup extension to gauge how successful the activity was in getting more people to sign up as editors. [5]

The high-quality tests

After the second trip to San Francisco, Lennart and Frank set out to create two high-quality account creation processes. They were different from the low-quality tests in two fundamental ways:

1) they were created with more style in mind, over a longer period of time than the low-quality designs, which among other things meant that we designed whole processes [6] rather than individual pages
2) they were deployed with the CustomUserSignup extension in play, so that we could measure what happened to the people who went through the account creation process

The two new high-quality designs were based on the two most successful versions in the low-quality tests. Internally we called those versions “the user page creator” and “the Sparked.com model” (more details below). Concurrent with them, we would run a third process. The third one was the existing process, which we wanted to use as a baseline, in case we suddenly saw a spike or dive in the statistics of the new version.

The work to design and code the two high-quality versions proved to be harder than anticipated. There were technical challenges as well as content issues that needed to be addressed before they could be rolled out on English Wikipedia. Fortunately, even though Lennart was responsible for the user page creator and Frank took charge of the Sparked.com model, we had plenty of help from the communities of the Outreach wiki and other Wiki/pm/edians.

The user page creator

The user page creator was based on version 2 of the low-quality tests – the controversial one. However, we felt that once we could provide the vandalism patrollers with some tools to deal with the problems that the model created, this model would have a good chance of getting many newcomers to start editing, which was the goal. It would also introduce the newcomer to the rest of the community, which we hypothesized would make the Wikipedia community more inclined to welcome them into the fold. The various tools and features will be explained as we go through the steps in the process.

[5] This can be made easier by using such services as QRpedia, a website which turns URLs into codes that can be read by mobile cameras.
[6] For instance, we placed a progress bar at the top of both account creation processes.

This process was designated campaign=ACP1 in the CustomUserSignup extension, which means that you can go directly to this account creation process (instead of being randomly assigned to one of the three account creation processes) by clicking this link: http://en.wikipedia.org/w/index.php?title=Special:UserLogin&action=submitlogin&type=signup&campaign=ACP1
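If you want to generate such direct links for several campaigns, the query string can be assembled programmatically. The Python sketch below uses only the parameters visible in the link above. The helper name `signup_url` is an illustration, not part of the extension.

```python
from urllib.parse import urlencode

BASE_URL = "http://en.wikipedia.org/w/index.php"


def signup_url(campaign):
    """Build a direct link to the signup page for one campaign (e.g. "ACP1")."""
    params = {
        "title": "Special:UserLogin",
        "action": "submitlogin",
        "type": "signup",
        "campaign": campaign,
    }
    # safe=":" keeps the colon in "Special:UserLogin" readable in the URL.
    return BASE_URL + "?" + urlencode(params, safe=":")


print(signup_url("ACP1"))
```

For another language version, only the host name in `BASE_URL` would need to change, assuming the extension and campaign are set up there.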

This is how it looks:

Steps 0 and 1: Steps 0 and 1 in the account creation process are the same as in the normal one. This is also true for the Sparked.com model.

Step 2:


You'll note the differences from the original one: shorter text, the softened frame with shadows, and the colorful progress bar at the top.

We should also mention that much of the page consists of a form that we could not change, for technical reasons. It would have taken too much time for a six-month project, with very uncertain results. The original design called for an inspirational “You can also become a part of Wikipedia” sidebar, but we were unable to get this to work with the form part without changing the software.

Step 3:


Here, there are two main new features, beside the progress bar:

• the blue box to the right, which showcases an example of a user page [7] (including a photo of a Wikipedian)
• the “tell us about yourself” form [8]

The “tell us about yourself” form is a JavaScript-created function that is prefilled with the following text:

Replace this example text below with information about you:

Hello, my background is in biology, with a main interest in snakes.

I speak English and French. In my off-time, I listen to a lot of music, and I have discovered that Wikipedia is a very good source for information in that department. Hopefully I can help make it even better. [9]

This text is one of the tools we gave the vandalism patrollers: they can now check for user pages that include this exact wording.
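A patroller's check for leftover example text could look roughly like the Python sketch below. It only tests whether a page still contains sentences from the prefilled text quoted above. How the page text is fetched is left out, and the function name is hypothetical.

```python
# Sentences copied from the prefilled example text.
EXAMPLE_SENTENCES = (
    "Hello, my background is in biology, with a main interest in snakes.",
    "I speak English and French.",
)


def contains_example_text(page_text):
    """Return True if the page still includes a sentence from the example text."""
    # Collapse whitespace so line breaks in the wikitext don't hide a match.
    normalized = " ".join(page_text.split())
    return any(sentence in normalized for sentence in EXAMPLE_SENTENCES)


print(contains_example_text("Hello, my background is in biology,\nwith a main interest in snakes."))
print(contains_example_text("Hi, I'm a retired teacher who likes trains."))
```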

Step 4: Normally, step 3 is where we let the newcomer find his or her own way. In this version, we added a fourth step. Once you click “Create my user page for me now”, you will be taken to your new user page, with the text you added pasted into the edit area, so that all you have to do is scroll down and click “Save”. It looks like this:

[7] The user page belongs to Tim Vickers and was chosen after a thorough search among user pages for a simple, informational user page that a newcomer could imitate. It was a happy coincidence that the user turned out to be a scientist.
[8] My Swedish web browser underlines the words it doesn't understand, even after I switched it to English. In other words, it shouldn't look like that.
[9] This text is generated by one part of the code on this page: http://en.wikipedia.org/wiki/MediaWiki:Common.js, which you also have to change if you are to import this version of the account creation process to other wikis.

(Source)

There are a couple of things to note here [11]:
• there are two “hidden” comments, reminding the user to click “Save” and welcoming the user to Wikipedia
• the edit summary automatically adds the text “New user page through Outreach:ACIP” to this edit in Recent changes, which makes it very easy for vandalism patrollers to spot
• this is not the first thing the user sees after coming to this page. Above this is something quite interesting; see the next section.

The new user bar template

One thing that we knew that the new users needed, and we wanted to give them, was information about how Wikipedia works. But we needed to do it without giving them a list of 30 guidelines and manuals on their user page, as in the Polish version. Our idea was to use tabs, i.e. place the information they needed close at hand. The method we chose was to use a template that could be pasted onto every new user page during the account creation process.

That template is expanded on the user page to give the new user information and guidance, without taking up too much space on the page. You can actually see the code that produces the tabs template in the image above, right after the first line of hidden text: {{New user bar}}. If you didn't notice it the first time you saw the picture – good. That's the point.

Let's take a look at the template that sits on every new user page created through this account creation process:

[10] This page cannot be easily sourced, since it's session-based. So you'll have to try it yourself.
[11] Besides the red-link problem I told you about before.

(Source)

One feature we have in the New user bar template is the prominent tab “Getting started”, which links to a page we created, called Starting editing. This page is a very short introduction to editing Wikipedia – much shorter, less technical and more focused on inspiration than almost any other manual on Wikipedia:


[12] The links in the tab are: About me, Getting started, My test area, Get help and Inbox.

But back to the New user bar template: Note also the placeholder image to the right, with a link to a special page about uploading images of yourself. We reasoned that it would be easier for new users to enter the community if their user pages had pictures – and perhaps increase their chance of avoiding frivolous reverting by vandalism patrollers.

At this stage, you can still edit the text in the edit area, since the first edit is when you click “Save” at the bottom of the page – and this is one of the weak spots in this approach [13]. We assume that many newcomers believe that the process is finished by this stage, that they shouldn't have to “create their user page twice”, so to speak, although we have no direct data, such as interviews, to support this hypothesis.

Category:New Wikipedians

One other tool that we wanted to give vandalism patrollers is a quick way to find as many new Wikipedians as possible, so in the code for the New user bar template, we added code that automatically places anyone who has the New user bar template on their user page in a special category for new Wikipedians.

The Sparked.com model

The Sparked.com model was based on version 5 of the low-quality tests – essentially a box divided into four sections with the text “We need your help in making Wikipedia better – here are some tasks where you can help out. It's easy!”. The name, however, came from a website for microvolunteering, unsurprisingly called Sparked.com, and especially from its flow starting on page 2: “Which causes really get you fired up?” Once the new user has chosen at least one cause, he or she moves on to a page where another question is asked: “Which skills do you have to offer?” The last step matches a cause with the chosen skill to produce a list of tasks where the new user could volunteer. That list should ideally contain at least some articles that the new user finds interesting, much like Amazon.com's Recommendations include some items that the person would perhaps want to buy. [14]

This idea was conceived for new users who didn't know what tasks still needed to be done on Wikipedia (some readers believe Wikipedia to be finished, partly because they go to a highly visible page like World War II, and never see the stubs and the other problematic pages). Once we can direct new users to articles in need, we can perhaps inspire the new users to start editing.

Wikipedia has lists similar to those on Sparked.com, but most of them are too unsorted and long to be practical for newcomers. That's why Frank first had to create a list of problematic areas to which we could reasonably expect newcomers to contribute [15], and a set of topics that could inspire people. We decided on these six topics and these four skills:

• History, Biology, Technology, Geography, Mathematics, and Arts [16]
• Copyediting, Research & Writing, Searching the web, and Organizing
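The matching step amounts to a lookup keyed by one topic and one skill. The Python sketch below shows that idea with the topics and skills listed above. The page-naming scheme returned by `task_list_page` is invented for illustration and does not reflect the actual page titles used in the test.

```python
TOPICS = ("History", "Biology", "Technology", "Geography", "Mathematics", "Arts")
SKILLS = ("Copyediting", "Research & Writing", "Searching the web", "Organizing")


def task_list_page(topic, skill):
    """Return a (hypothetical) page name for the task list matching the choices."""
    if topic not in TOPICS:
        raise ValueError(f"unknown topic: {topic!r}")
    if skill not in SKILLS:
        raise ValueError(f"unknown skill: {skill!r}")
    # Hypothetical naming scheme: one manually maintained list per combination.
    return f"Task list/{topic}/{skill}"


print(task_list_page("History", "Copyediting"))
```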

[13] And indeed in Wikipedia itself: a more prominent “Save” button than the one below the edit area could potentially make it easier for some people to find it.
[14] One difference between our version and Sparked.com's version is that in ours you can only pick one skill and one topic. Part of the reason is technical, as allowing more would take more software work than we had time for. A difference between our version and Amazon.com's version is that their recommendations are also sent out regularly via email, based on previous purchases. On Wikipedia we have SuggestBot, which functions in much the same way, albeit not by sending out emails – yet.
[15] Asking newcomers for instance to wikify an article is probably an exercise in futility.
[16] We got a suggestion to change these to Arts & entertainment, People, Places, Science & technology, History, and Current events, and we would have liked to test that version as well. However, there was no time.

These lists have been manually created and would have to be manually updated, as the tasks are being dealt with, but with crowdsourcing, this should be manageable.

This process was designated campaign=ACP2 in the CustomUserSignup extension, which means that you can go directly to this account creation process (instead of being randomly assigned to one of the three account creation processes) by clicking this link: http://en.wikipedia.org/w/index.php?title=Special:UserLogin&action=submitlogin&type=signup&campaign=ACP2

This is how the process looks:

Steps 0 and 1: Steps 0 and 1 in the account creation process are the same as in the normal one.

Step 2:


The header was consciously chosen to project the feeling that we want to know more about the new user's interests.

By making the newcomer choose a topic during the account creation process, we hoped to focus them on contributing to Wikipedia, instead of reading only. We also deliberately used pictures as links on this page, to give the image of Wikipedia as “more than just text”. [17]

Finally, note the blue box with instructions on the right.

[17] One comment we have received about these versions is that they are not 100% compatible with the CC-by-sa license, since we haven't provided information about the sources for the images. We chose not to, since it would detract from the “straight” flow from account creation to starting to edit. The annual fundraiser has had the same discussions.

Step 3:


Here we have tried to focus the new user on one of four tasks that we need volunteers to help out with. The blue box with instructions is present again.

Step 4:


This is one of the 30 pages with similar lists based on skill and interest. All of them have been manually created, which is one of the drawbacks in this method. We have been in preliminary contact with WikiProjects that could help out with updating these lists.


We deployed the three account creation processes (ACP1 – the user page creator, ACP2 – the Sparked.com model, and ACP3 – the standard one) on June 9, 2011. The first results came the next day, and they were very encouraging in some respects.

What we measure and why

The data we sought was how many people start to edit after going through a particular account creation process. If, for instance, ACP2 (the Sparked.com model) manages to attract more editors in the long run than any of the other processes, and the numbers are conclusive, we should implement that model on English Wikipedia.18 An important angle we looked at was how many make 1 edit versus how many make 5 edits. This distinction is crucial, because we want the newcomers to stay as editors, and not just make a random edit to test out Wikipedia.

As we stated above, the data is built upon a browser-based cookie. That means that if a person creates an account on her own computer and then starts editing with her new account on another computer, she would not be counted here. Currently, we have no method of getting data for those persons, but on the other hand, they will be present in approximately the same numbers in the three different account creation processes, so in the end it will still be fair.

The raw data

The tech department wrote a script on the Toolserver that automatically updated the statistics for the three account creation processes every day. It deposits the raw numbers in this text file. The data there is a little hard to understand, so here is a short legend:

18 We haven't run these tests on any other Wikipedia. Our intention at the beginning was to have these kinds of statistics for several language versions of Wikipedia, but time constraints became a big factor. That's why we hope that volunteers will run these experiments on their own wikis in the future.

What it says in the text file – what it means

/home/catrope/acp/perday/lennart-log-20110427-ACP1-final – date and ACP (in this case it's ACP1, the user page creator)19
WITH EDITS: 45 – of everyone who went through this ACP, how many made 1 edit?
WITH 5 EDITS: 9 – how many made 5 edits?
TOTAL: 110 – how many can we match to a specific user name? (ignore this)
TOTAL WITH UNMATCHED: 119 – how many created an account through this process?
Percentage with 1 edit: 37.8151 – how big a share of everyone who went through this ACP made 1 edit?20
Percentage with 5 edits: 7.56303 – how big a share of everyone who went through this ACP made 5 edits?
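The two percentages in the legend can be recomputed from the raw counts: both use TOTAL WITH UNMATCHED as the base, not TOTAL. The sketch below only illustrates that arithmetic; the object shape and field names are assumptions made for this example, not the format used by the Toolserver script.

```javascript
// Recomputes the legend's percentages from its raw counts.
// The field names are hypothetical; only the numbers come from the legend.
function editShares(counts) {
  const base = counts.totalWithUnmatched; // everyone who created an account
  return {
    pctWith1Edit: (counts.withEdits / base) * 100,
    pctWith5Edits: (counts.with5Edits / base) * 100,
  };
}

const legendDay = { withEdits: 45, with5Edits: 9, total: 110, totalWithUnmatched: 119 };
const shares = editShares(legendDay);

console.log(shares.pctWith1Edit.toFixed(4));  // 37.8151
console.log(shares.pctWith5Edits.toFixed(5)); // 7.56303
```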

Crunching the numbers Sorting through the raw data, we can find these numbers:

All account creation processes

             With 1 edit  With 5 edits  Accounts created  Percentage with 1 edit  Percentage with 5 edits
Average day  337.87       63.96         967.38            34.83                   6.90
Best day     533          103           1166              48.93                   10.61
Worst day    216          28            733               25.58                   3.58

ACP1

             With 1 edit  With 5 edits  Accounts created  Percentage with 1 edit  Percentage with 5 edits
Average day  437.66       69.90         907.49            44.58                   7.02
Best day     533          103           1059              48.93                   10.61
Worst day    336          36            733               40.97                   4.33

ACP2

             With 1 edit  With 5 edits  Accounts created  Percentage with 1 edit  Percentage with 5 edits
Average day  287.61       63.07         881.27            30.96                   6.66
Best day     382          96            1062              34.98                   9.06
Worst day    216          28            703               25.58                   3.58

19 In the text file, the first day is April 27, which is when the extension was deployed. But we didn't roll out the three different account creation processes until June 9. The data before that is based on three identical account creation processes. We have therefore excluded any data from before June 10, the first full day of testing. The last day included is July 20.

20 Remember, the original statistics said that 30% (on the best days) create accounts and then make one edit. This method of measuring is a little different, so any numbers will deviate from the original statistics. The figure in the legend is much higher, but it seems that it was a peak day. The normal level is much closer to the original statistics.

ACP3

             With 1 edit  With 5 edits  Accounts created  Percentage with 1 edit  Percentage with 5 edits
Average day  288.34       59.90         908.51            29.54                   6.12
Best day     354          86            1075              33.49                   7.84
Worst day    217          36            699               26.79                   3.91

(Source: see separate file)

Total scores

In total there were 118988 accounts created during the test period. Of these, 41558 made at least 1 edit, and 7867 made at least 5 edits.

Which account creation process is better?

To determine which account creation process is better, let's look at the important aspects:

• it has the highest percentage of people that make at least 1 edit on the best day
• it has the highest percentage of people that make at least 5 edits on the best day
• it performs well even on the “worst day” in the percentage that make at least 1 edit
• it performs well even on the “worst day” in the percentage that make at least 5 edits
• it has a high average percentage of people that make at least 1 edit
• it has a high average percentage of people that make at least 5 edits

The one that is best in most of these is clearly better at the job. Here are the scores for the three account creation processes compared on these six aspects. The winner of each category is marked with an asterisk.

                    ACP1    ACP2    ACP3
Highest 1 edit      48.93*  34.98   33.49
Highest 5 edits     10.61*  9.06    7.84
Worst day 1 edit    40.97*  25.58   26.79
Worst day 5 edits   4.33*   3.58    3.91
Average 1 edit      44.58*  30.96   29.54
Average 5 edits     7.02*   6.66    6.12
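The comparison can be sketched as a small tally: for each category, pick the process with the highest score, then count the wins. The numbers are copied from the table above; the data structure and the function name `winner` are choices made for this illustration only.

```javascript
// Scores copied from the comparison table (all values are percentages).
const scores = {
  "Highest 1 edit":    { ACP1: 48.93, ACP2: 34.98, ACP3: 33.49 },
  "Highest 5 edits":   { ACP1: 10.61, ACP2: 9.06,  ACP3: 7.84 },
  "Worst day 1 edit":  { ACP1: 40.97, ACP2: 25.58, ACP3: 26.79 },
  "Worst day 5 edits": { ACP1: 4.33,  ACP2: 3.58,  ACP3: 3.91 },
  "Average 1 edit":    { ACP1: 44.58, ACP2: 30.96, ACP3: 29.54 },
  "Average 5 edits":   { ACP1: 7.02,  ACP2: 6.66,  ACP3: 6.12 },
};

// Returns the name of the process with the highest score in one category.
function winner(row) {
  return Object.keys(row).reduce((best, acp) => (row[acp] > row[best] ? acp : best));
}

// Counts how many categories each process wins.
const wins = {};
for (const category of Object.keys(scores)) {
  const best = winner(scores[category]);
  wins[best] = (wins[best] || 0) + 1;
}

console.log(wins); // { ACP1: 6 }
```

In every category a higher percentage is better, including on the worst day, so a single "highest wins" rule covers all six rows.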

These figures may seem very abstract, but by increasing the share of new Wikipedians who make 5 edits from 6.12% to 7.02% on an average day, we gain 9 persons per day: from 59.90 persons (with the present model) to 68.90 persons (with ACP1).
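The arithmetic behind that gain, as a short sketch using only the figures quoted in the paragraph above (the variable names are made up for the example):

```javascript
// Average-day figures quoted in the paragraph above.
const presentModel = { with5Edits: 59.90, pct5Edits: 6.12 }; // ACP3, the standard process
const userPageCreator = { with5Edits: 68.90, pct5Edits: 7.02 }; // ACP1

// Absolute gain in people per day who reach 5 edits.
const gainPerDay = userPageCreator.with5Edits - presentModel.with5Edits;

// Relative increase of the 5-edit share, in percent.
const relativeIncrease = (userPageCreator.pct5Edits / presentModel.pct5Edits - 1) * 100;

console.log(gainPerDay.toFixed(2));       // 9.00
console.log(relativeIncrease.toFixed(1)); // 14.7
```

Put differently, a difference of less than one percentage point corresponds to roughly 15% more newcomers reaching their fifth edit.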

To sum up: none of the other account creation processes that we tested can measure up, in any category, to the user page creator. This is the account creation process that we recommend.

And, it should be added, that version also scored above 47%. So unless Winifred's note said 48%, 49% or 50%, those two champagne bottles are mine.

Other results and aspects

While the statistics are an important part of choosing a new account creation process, there are other concerns and aspects to keep in mind. To the six categories above, we should add that the account creation process we choose should be effective in terms of community effort. ACP2, the Sparked.com model, has an important drawback in that the community would have to keep the lists of tasks constantly updated.

ACP1 and ACP2 have both been complimented on looking nice, which makes it easier to convince the community to accept one of them as the default account creation process.

That does not mean that ACP1 is without its challenges. Here are some of the things we need to take note of, should we decide to go with ACP1:

• we should ask the community to keep an eye on the category for new Wikipedians, and use it as a place to recruit new members of the WikiProjects based on their descriptions of themselves
• some Wikipedians have expressed concern that ACP1 increases the number of pictures of people on Wikimedia Commons. Since Commons is not a photo album, and its pictures should have educational value, they are likely to be deleted unless some thinking is done on how to approach this
• we would need to discuss with the community the issue of users with formulaic user pages, many of whom are inactive
• there has been some talk about browser-compatibility issues, especially when it comes to the JavaScript-based “tell us about yourself” form. Those would have to be resolved
• this is a big one: when we ask the newcomers to add information about themselves, we need to be very careful about what types of information we ask them for. Several Wikipedians have very strong objections to including data such as name, where you live, what age you are and other identifying information. Wiki-stalking is one reason. Another is that if a new user adds a university degree or another expertise to their profile, this may lead that person to expect special treatment
• the deployment should be announced on both the Village pump (proposal) and the Administrators' noticeboard, plus the Wikimedia Foundation blog, with enough numbers to convince people
• other language versions of Wikipedia may discover that their numbers differ from these. Before they deploy ACP1 uncritically, they should test it for a month and measure the difference. Of note is that some already have other account creation processes that could be fine-tuned as well

As some Wikimedia Foundation staffers have noted, ACP1 is an amalgamation of several pieces: the user page creator, the new user bar, the new Wikipedians category, etc. If there is time, it would probably be a good idea to test the various parts further. We may yet increase the numbers to above 50%!

Final thoughts

Although we didn't intend it, the new user bar has already started to make its way around Wikipedia. On Chinese Wikipedia, some have begun to use it on their user pages; see for example: http://zh.wikipedia.org/wiki/User:Tommyang.

The extension CustomUserSignup has already been used for a special outreach project: http://hciresearch2.hcii.cs.cmu.edu/~rfarzan/wikipedia/tool/wsignup.php. We should use it more.

While we ended up with two designs, 19 different designs were collected. Some of those who contributed were very skillful and their talents should be harnessed.

Technically, Wikipedia's account creation process is really only lacking in one respect: it does not use AJAX when you choose a username, which means that if the name is taken, the new user may have to redo the entire form instead of getting a warning right away. This function exists on Swedish Wikipedia; see the Appendix.

If you are thinking about implementing any of these account creation processes, there is a guide for it here: http://outreach.wikimedia.org/wiki/User:Hannibal/guide_for_the_next_person. That guide will probably be moved somewhere else, and be continuously updated, as more challenges are discovered and new hacks are being implemented.

Further research

Of course, there is plenty left to do. Some of the more important things are these:

• “how do we increase the number of people who create accounts?”
• “how do we increase the number of people who start to edit?”
• “how do we make sure they continue to edit?”21

We should also develop the email that is sent to anyone who creates an account, and include parts that use the same types of algorithms as SuggestBot and Amazon.com. We should also think about some sort of “We miss you” email to accounts that have been inactive for a period of time.

There has been some talk about adding a flag for new users, so that newcomers are easily spotted in Recent changes. That is a complicated topic that should be brought up at Wikimania and other meetups between Wikipedians.

More about creating formulaic user pages: Structured profile

During the Summer of Code of 2011, Akshay Agarwal worked on the Account Creation code.

21 In my experience, some Wikipedians aren't sure they want a big influx of new Wikipedians, since it takes a lot of effort to train them to become Wikipedians. If they did want them, they would treat them better. This is an important factor in this change in direction to attract more Wikipedia editors.

Appendices

Survey 1

Alternatives (abbreviated “Alt.” in the table):

Alternative 1: I wanted to be part of Wikipedia
Alternative 2: I was just curious about what would happen if I created an account
Alternative 3: I thought an account is required to edit articles
Alternative 4: I thought I would get some extra features
Alternative 5: I usually register for websites I like
Alternative 6: I tried to start an article, but I couldn't do that without a user account22
Alternative 7: I wanted to upload images
Alternative 8: Other

Language version  Alt. 1       Alt. 2       Alt. 3       Alt. 4       Alt. 5       Alt. 6       Alt. 7       Alt. 8       People responding
Bengali           71.8% (79)   24.5% (27)   40.0% (44)   25.5% (28)   30.9% (34)   24.5% (27)   19.1% (21)   15.5% (17)   112
Swedish           60.4% (93)   17.5% (27)   21.4% (33)   9.1% (14)    24.7% (38)   22.7% (35)   9.1% (14)    11.7% (18)   155
German            66.7% (168)  6.7% (17)    28.6% (72)   6.7% (17)    17.5% (41)   15.9% (40)   3.6% (9)     15.9% (40)   255
Polish            45.5% (161)  13.8% (49)   17.8% (63)   20.1% (58)   16.4% (58)   30.2% (107)  7.9% (28)    13.0% (46)   359
Spanish           71.6% (618)  12.2% (105)  28.3% (244)  21.8% (188)  35.5% (306)  15.8% (136)  9.0% (78)    11.0% (95)   881
English           68.1% (952)  18.0% (252)  27.1% (379)  22.8% (319)  23.9% (334)  18.9% (264)  15.6% (218)  16.1% (225)  1412
Mean              64.02%       15.45%       27.77%       17.67%       24.82%       21.33%       10.72%       13.87%       Total: 3174

(Means are rounded off to two decimals.)
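The Mean row appears to be an unweighted average of the per-language percentages, so each language version counts equally regardless of how many people responded. A minimal sketch that reproduces the first figure (the function name `meanPercentage` is made up for this example):

```javascript
// Unweighted mean of per-language percentages, rounded to two decimals.
function meanPercentage(values) {
  const sum = values.reduce((a, b) => a + b, 0);
  return Math.round((sum / values.length) * 100) / 100;
}

// First answer column of Survey 1, in table order:
// Bengali, Swedish, German, Polish, Spanish, English.
const firstColumn = [71.8, 60.4, 66.7, 45.5, 71.6, 68.1];

console.log(meanPercentage(firstColumn)); // 64.02
```

A mean weighted by the number of respondents would lean heavily toward English and Spanish, which together account for most of the answers.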

22 In many language versions you can create articles without accounts, so any answers here are indicative of what the new users think they need.

Survey 2

Alternatives (abbreviated “Alt.” in the table):

Alternative 1: Everybody will know that I helped improve Wikipedia
Alternative 2: It will enable me to customize the appearance and behavior of Wikipedia to my own preferences
Alternative 3: I will get my own user page on Wikipedia. Other users will be able to send me messages.
Alternative 4: I can follow all my favorite topics on Wikipedia.
Alternative 5: My contributions to Wikipedia are not connected to an IP-address, but to a nickname that can't be tracked back to my physical address.

Language version      Alt. 1         Alt. 2         Alt. 3         Alt. 4         Alt. 5         Other/Comments  People responding
German                45.9% (406)    30.3% (268)    24.7% (219)    32.0% (283)    53.0% (469)    (101)[3]        932
Spanish               34.3% (347)    31.8% (322)    60.8% (615)    36.5% (369)    27.7% (280)    (129)[4]        1050
English variant 1[5]  21% (244)      13% (151)      13% (151)      36% (408)      15% (175)      1% (13)         1200
English variant 2[6]  45.16% (1459)  42.25% (1365)  41.91% (1354)  60.04% (1940)  38.13% (1232)  8.08% (261)     3189
Total                 36.59%         29.34%         35.10%         41.13%         33.46%                         6371

Interpreting the results

What conclusions can we draw from these survey results? I think there are a number of conclusions that are interesting.

• The number of participants over a short period (they volunteered by clicking on an extra link after creating an account) makes the numbers relatively significant. Of course a longer survey would mean more certainty, but the stable results across the survey period suggest that we would only get a variation of these results. I am very happy that this many took the time to answer the questions we put to them. Perhaps the short surveys (only one question at a time) are key here. Also, we can very quickly get big groups of people to answer our surveys, which I feel we should use in the future.
• The most common answer across all language versions we surveyed was "I want to be part of Wikipedia". In fact, the stable results on that point were the reason why we did another survey to understand what "part of Wikipedia" meant.
• "I thought an account is required to edit articles" and "I usually register for websites I like" are the next most common answers. "Usually register" probably means that they do not intend to edit very much, while the other suggests the opposite.
• Not many create accounts because they want to upload pictures.
• A fair share of people thought they needed accounts to create articles (which is true for some but not all Wikipedias). On German and Spanish Wikipedia only about 15% thought that.
• About a sixth of the new users thought they would get some more features. Whether they know what those features are is not certain. That's why we added the alternative about watchlists ("follow articles") in the second survey. On Swedish and German Wikipedia, this was not a big deal.
• Following favorite articles on Wikipedia was the most common answer in the second survey, with the notable exceptions of German Wikipedia, where anonymity is a bigger reason, and Spanish Wikipedia, where getting a user page and being able to send and receive messages are the most important features.
• Not many Germans were curious about what would happen when they created their accounts. Maybe they had enough information already.
• When forced to choose one alternative, newcomers to English Wikipedia think that being able to follow articles is the most important reason to create an account. That is also important when they can choose several.
• When not forced to choose one alternative, newcomers to English Wikipedia, more than the others that we surveyed, like to tinker with the look and feel of Wikipedia.

List of MediaWiki messages involved in the account creation process

This is a list of MediaWiki messages (pages) that administrators can edit, and that are involved in the account creation process:

Log In / Create Account page

• http://en.wikipedia.org/wiki/MediaWiki:Loginerror Default value: “Login error” (just the title of the error, when you fail to log in correctly)
• http://en.wikipedia.org/wiki/MediaWiki:Loginstart Default value: empty. (In the loginstart div, located between the page heading and the user login form)
• http://en.wikipedia.org/wiki/MediaWiki:Login Default value: “Log in”
• http://en.wikipedia.org/wiki/MediaWiki:Loginprompt Default value: empty. (In the userloginprompt div, just below the “Don't have an account? Create one.” line)
• http://en.wikipedia.org/wiki/MediaWiki:Yourname Default value: “Username:”
• http://en.wikipedia.org/wiki/MediaWiki:Yourpassword Default value: “Password:”
• http://en.wikipedia.org/wiki/MediaWiki:Loginend Default value: appears to contain all the “Secure your login” text. (In the loginend div, just below the login and email new password buttons)

The page that is assembled from these messages can be seen here: http://en.wikipedia.org/w/index.php?title=Special:UserLogin

User Create Page

• http://en.wikipedia.org/wiki/MediaWiki:Signupstart Default value: empty. (In the signupstart div, located between the page heading and the user login form)
• http://en.wikipedia.org/wiki/MediaWiki:Createaccount Default value: “Create account” (page heading AND the submit button)
• http://en.wikipedia.org/wiki/MediaWiki:Yourname Default value: “Username:”
• http://en.wikipedia.org/wiki/MediaWiki:Yourpassword Default value: “Password:”
• http://en.wikipedia.org/wiki/MediaWiki:Yourpasswordagain Default value: “Retype password:”
• http://en.wikipedia.org/wiki/MediaWiki:Youremail Default value: “E-mail (optional)*”
• http://en.wikipedia.org/wiki/MediaWiki:Prefs-help-email Default value: starts with “* You do not have to provide an e-mail address, but if you forget your password”
• http://en.wikipedia.org/wiki/MediaWiki:Signupend Default value: see section on low-quality versions. (In the signupend div)

The page that is assembled from these messages can be seen here: http://en.wikipedia.org/w/index.php?title=Special:UserLogin&type=signup.

Note: if you go to that URL while you are logged in, that page will also contain a field asking for a reason (for why you are creating another account).

Confirmation page

This page consists of: http://en.wikipedia.org/wiki/MediaWiki:Welcomecreation

Function that checks for available usernames

This JavaScript checks for available usernames, and gives an error message if you try to create a new account with a username that is already taken. It was written by User:Fluff on Swedish Wikipedia.

The code

/* Making AJAX calls when a user wants to register a new username ***********************
 *
 * Description: Displays a text telling the user whether the target username is available or not.
 * Maintainers: [[User:Fluff]]
 * Link for stats: [[User:Fluff/cfau.js]]
 */
var cfauTimer = false;

// Adds the availability check to the username field of the account creation form.
function CFAUloader () {
    var nameField = document.getElementById('wpName2');
    if (!nameField) {
        return;
    }

    if ((wgPageName == "Special:Inloggning" || wgPageName == "Special:UserLogin") &&
            (nameField.className == "loginText")) {
        addHandler(nameField, "keyup", CFAUtimerstart);
        var tdNode = nameField.parentNode;

        var CFAUspan = document.createElement('span');
        CFAUspan.setAttribute('id', 'CFAUresult');
        tdNode.appendChild(CFAUspan);
    }
}

// Restarts a 500 ms timer on every keystroke, so the check runs once typing pauses.
function CFAUtimerstart () {
    if (cfauTimer) {
        window.clearTimeout(cfauTimer);
    }
    if (document.getElementById('wpName2').value.length > 0) {
        // Pass the function itself rather than a string to be evaluated.
        cfauTimer = window.setTimeout(CheckForAvaibleUsername, 500);
    }
}

// Asks the API for the first username at or after the typed one and compares the two.
function CheckForAvaibleUsername () {
    var resultNode = document.getElementById('CFAUresult');
    var cfauUsername = document.getElementById('wpName2').value;
    cfauUsername = cfauUsername.charAt(0).toUpperCase() + cfauUsername.substr(1);

    if (!cfauUsername) {
        alert('Enter the username you want, to see if it\'s available.');
        return;
    }

    resultNode.innerHTML = '';
    injectSpinner(resultNode, 'cfau');

    var aobject = sajax_init_object();
    aobject.open('GET', wgServer + wgScriptPath +
        '/api.php?format=xml&action=query&list=allusers&aulimit=1&aufrom=' +
        encodeURIComponent(cfauUsername), true);
    aobject.onreadystatechange = function () {
        if (aobject.readyState != 4) {
            return;
        }
        removeSpinner('cfau');
        if (aobject.status != 200) {
            // If we get an HTTP error, we won't bother showing an error message.
            return;
        }
        // There is no <u> element when no username sorts at or after the typed one.
        var u = aobject.responseXML.getElementsByTagName('u')[0];
        var res = u ? u.getAttribute('name') : null;
        if (res == cfauUsername) {
            resultNode.style.color = 'red';
            resultNode.innerHTML = ' The username ' + cfauUsername + ' is already taken.';
        } else {
            resultNode.style.color = '';
            resultNode.innerHTML = ' The username ' + cfauUsername + ' is available.';
        }
    };
    aobject.send(null);
}

addOnloadHook(CFAUloader);
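The decision logic of the script can also be separated from the page, which makes it easier to try out before deploying. The following is only a sketch under stated assumptions: the helper names are hypothetical, `apiBase` stands for the wiki's api.php address, and the response is assumed to be the API's JSON output (`format=json`) with the matching users listed under `query.allusers`, instead of the XML that the script above reads.

```javascript
// Capitalizes the first letter, as the script above does before querying.
function normalizeUsername(name) {
  return name.charAt(0).toUpperCase() + name.slice(1);
}

// Builds the query URL. apiBase is a hypothetical parameter,
// e.g. wgServer + wgScriptPath + "/api.php" on the wiki itself.
function buildAllUsersUrl(apiBase, name) {
  const params = new URLSearchParams({
    format: "json",
    action: "query",
    list: "allusers",
    aulimit: "1",
    aufrom: normalizeUsername(name),
  });
  return apiBase + "?" + params.toString();
}

// Decides from a parsed JSON response whether the name is already taken.
// The name is taken only if the first user returned is an exact match.
function isUsernameTaken(response, name) {
  const users = (response && response.query && response.query.allusers) || [];
  return users.length > 0 && users[0].name === normalizeUsername(name);
}

// Example with hand-written responses of the assumed shape.
const taken = { query: { allusers: [{ userid: 1, name: "Fluff" }] } };
const nextInOrder = { query: { allusers: [{ userid: 2, name: "Fluffy" }] } };
console.log(isUsernameTaken(taken, "fluff"));       // true
console.log(isUsernameTaken(nextInOrder, "fluff")); // false
```

Because `aufrom` returns the first username at or after the requested one, the exact comparison is what distinguishes a taken name from the next name in alphabetical order.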

Implementation

To implement this function on other language versions of Wikipedia, copy the code above, paste it into the JavaScript page of the wiki, e.g. http://en.wikipedia.org/wiki/MediaWiki:Common.js, and translate the phrases that are shown to the user (the alert text and the two result messages).

Thanks

In many respects, I, Lennart, was not the ideal person to handle the Account Creation Improvement Project. However, I had many people to help me. Some of them were volunteers, such as Fetchcomms, Rock drum, Mono, Sertion, Ainali, and everybody who made suggestions. Special thanks to those who helped out with the surveys.

There has also been a lot of help from Wikimedia Foundation staffers, such as Frank Schulenburg, Roan Kattouw, Nimish Gautam, Katie Horn, Howie Fung, Winifred Olliff, Erik Möller, Zack Exley, Brandon Harris, Alolita Sharma, Philippe Beaudette, and Steven Walling.

This report is licensed CC-BY-SA 3.0. This means that you may copy, distribute and remix it freely, as long as you attribute the author and as long as you distribute it under the same/a similar license.

If you want to contact Lennart Guldbrandsson, try emailing him at [email protected].
