Introducing New Features to Wikipedia: Case Studies for Web Science

Society Online, Part 2

Mathias Schindler, Wikimedia Deutschland
Denny Vrandečić, Karlsruhe Institute of Technology

IEEE Intelligent Systems, January/February 2011. 1541-1672/11/$26.00 © 2011 IEEE. Published by the IEEE Computer Society. www.computer.org/intelligent

Introducing new features to Wikipedia is a complex sociotechnical process. The authors compare the Web science process to the previous introduction of new features and suggest how to use it as a model for the future development of Wikipedia.

Wikipedia is a free Web-based encyclopedia produced by hundreds of thousands of online contributors.1 Today, Wikipedia offers more than 10 million articles and is available in more than 250 languages. For some of these languages, Wikipedia is not just free, but the only encyclopedia available. Currently, it is among the 10 most-visited websites in the world, with more than 50,000 hits per second during peak times.

Introducing new technical features to Wikipedia is thus a complex sociotechnical process. Wikipedia runs on the open source MediaWiki wiki engine. Because Wikipedia's technical implementation and content development are both done in free, open projects that include complete change logs, we can trace and analyze the codevelopment of the technical and social aspects. (MediaWiki's source code repository is accessible at http://svn.wikimedia.org, and the database dumps of Wikipedia's complete log history are available at http://download.wikipedia.org.)

Because of Wikipedia's open nature and abundant and freely available project and history data, researchers often use it to study collaborative content creation. Both available research and the examples of previous feature implementations suggest that the better we understand the complex interaction between technical progress and social adaption, the more likely the introduction of a new technology to Wikipedia will be successful.

Based on previous implementations of new features (including the category system in 2004, parser functions in 2006, and flagged revisions in 2008), this article discusses the impact of introducing a further new feature: creating semantic annotations. We show how the interaction between the technical features and the community is an instantiation of the Web science process, a model that describes the evolution of the Web and its systems.

Previous Wikipedia Upgrades

According to the Web science research process, to resolve issues, engineers come up with new ideas and then implement them. Because the Web is an inherently social infrastructure, an idea's technical implementation will necessarily involve social aspects. If done well, the idea's sociotechnical implementation can resolve the original issue on a micro level. However, because of the Web's size and complexity, such solutions often have unexpected implications on the macro level. Engineers must in turn analyze these implications to identify new issues (such as situations that do not conform to certain values), thus starting the process anew. (A detailed description of the steps in the process was presented in an earlier work.2) Tim Berners-Lee has called the complexity step from the micro solution to the macro solution one of the magics of Web science, and he called the creativity to get from issues to ideas the second magic of Web science.3

To illustrate this process, we describe examples from the history of Wikipedia, tracing and analyzing them from the available data about the Wikipedia project and its history.

Category System

Following Wikipedia's initial growth, it became necessary to introduce features to help users explore and navigate the site's content. Early versions of the MediaWiki software offered only the manually created links, a backlinks feature (which displays all pages that link to a specific page), and a full-text search to help users discover content in the encyclopedia. In many other wikis, the backlinks feature was used to implement a rudimentary tagging system: once a link was created to a page such as "Greece-related topic" on all relevant pages, the list of backlinks on the page "Greece-related topic" was a list of all pages on that topic. This approach has several disadvantages:

• the links to the topic page are displayed in the text like any other link,
• the linked page itself behaves like a normal article and displaying the backlinks requires a second click on the appropriate link in the toolbar, and
• normal articles (such as the one for Greece) cannot be used as category pages because all normal links then also appear in the list, not just those that were introduced for categorizing. (That is, Albania would be in the list of pages linking to Greece because it's a bordering country.)

Wikipedia often avoided such wiki-specific idiosyncrasies. For example, Wikipedia omitted camel-case syntax in favor of free links, so it is often described as an atypical wiki.

The category system was introduced to MediaWiki in 2004 to solve this problem. Each article could be put into an arbitrary number of categories that are identified by freely chosen names. Adding a page to the category "Greece" is done by adding [[Category:Greece]] to the wiki text. Category links themselves are not displayed in the article text, but in separate visual elements on page rendering. The category page is a distinct page in the newly introduced category namespace and thus separate from the article space. Category pages list all pages belonging to a specific category. The new category system resolved all the identified technical issues of using backlinks for categorizing articles.

The category system was activated in all language editions at the same time without consulting the respective Wikipedia communities. For most of the languages, it was quickly applied to categorize a majority of the existing pages. Soon new issues were discovered, however, such as how to organize categories themselves. This led to further new features such as category tree extensions, which render a tree of subcategories on a category page. The introduction and application of the categories were heatedly debated in some language communities, a few of which enforced restrictions on the use of categories with manual effort. The German Wikipedia community agreed on a moratorium on category use to first define some guidelines for their application. The community members manually enforced the guidelines and the moratorium.

At the time, the power to develop, enable, and disable new features lay basically with the software developers. The communities around the different language Wikipedias had no option but to accept these changes and deal with the aftermath. Technical details, such as the fact that redirects on categories did not mean that a contributor could use the redirecting category as a synonym for categorization, had enormous social impact because they formed the patterns of interaction with the software. (A more thorough analysis of the category system is available elsewhere.4)

Parser Functions

Developers use templates to include the text of another page (usually in the separate template namespace) where the template is being called. This allows for higher consistency within the wiki, allowing some elements to be written only once and reused in several articles. Templates can also feature parameters; that is, parametrized template calls will insert the parameter values in the replacing text (see Figure 1). Such templates are widely used in Wikipedia; for example, the English Wikipedia offers more than 200,000 templates.

(a) Greece is {{Country | Continent=Europe | Capital=Athens}}

(b) a country in {{{Continent}}}. Its capital is {{{Capital}}}.
    [[Category:Country]]

(c) Greece is a country in Europe. Its capital is Athens.
    [[Category:Country]]

Figure 1. Template use in Wikipedia. (a) The source of a page about Greece calling the country template, (b) the country template source, and (c) the text of the page about Greece after template expansion.

In 2006, some Wikipedians discovered that through an intricate and complicated interplay of templating features and cascading style sheets (CSSs), they could create conditional wiki text, that is, text that was displayed if a template parameter had a specific value. This included repeated calls for templates within templates, which bogged down the whole system's performance. As a result, the developers ...

... simple writing of extension functions to add arbitrary functionalities, such as geocoding services or widgets.

The developers were clearly reacting to the community's demands, being forced either to fight the solution to the community's issue or offer an improved technical implementation ...

... were assigned at least one flagged revision. The remaining work is to keep up with edits by unregistered or new users.

Any Wikipedia author who is a member of the editor group can mark a revised article as compliant with a set of conditions on content quality. The current condition for such a flag is the "lack of obvious forms of vandalism," a deliberately low threshold. Unregistered Wikipedia visitors (the vast majority of Wikipedia visitors) are shown the latest flagged revision instead of the most recent article version.

Both prior to implementation and after the initial start of this feature, a lengthy debate was held within the author community, resulting in a straw poll. The preliminary result was a rather strong mandate to keep the feature.
