Patterns of Content Transclusion in Wikipedia

There and Here: Patterns of Content Transclusion in Wikipedia

Mark Anderson, Leslie Carr, David E. Millard
Electronics and Computer Science, Southampton University, Southampton SO17 1BJ, UK

ABSTRACT

As large, collaboratively authored hypertexts such as Wikipedia grow, so does the requirement both for organisational principles and methods to provide sustainable consistency and to ease the task of contributing editors. Large numbers of (potential) editors are not necessarily a sufficient bulwark against loss of coherence amongst a corpus of many discrete articles. The longitudinal task of curation may benefit from deliberate curatorial roles and techniques.

A potentially beneficial technique for the development and maintenance of hypertext content at scale is hypertext transclusion, by offering controllable re-use of a canonical source. In considering issues of longitudinal support of web collaborative hypertexts, we investigated the current degree and manner of adoption of transclusion facilities by editors of Wikipedia articles. We sampled 20 million articles from ten discrete language wikis within Wikipedia to analyse behaviour both within and across the individual Wikipedia communities.

We show that Wikipedia (as at February 2016) makes limited, inconsistent use of transclusion. Use is localised to subject areas, which differ between sampled languages. A limited number of patterns were observed including: Lists from transclusion, Lists of Lists, Episodic Media Listings, Tangles, Articles as Macros, and Self-Transclusion. We find little indication of deliberate structural maintenance of the hypertext.

CCS CONCEPTS

• Information systems → Wikis; Document structure;
• Human-centered computing → Collaborative content creation; Computer supported cooperative work;

KEYWORDS

Hypertext, Transclusion, Collaboration, Wikis, Wikipedia, Digital Curation

ACM Reference format:
Mark Anderson, Leslie Carr, and David E. Millard. 2017. There and Here: Patterns of Content Transclusion in Wikipedia. In Proceedings of The 28th ACM Conference on Hypertext and Social Media, Prague, Czech Republic, 4-7 July 2017 (HT17), 10 pages.
DOI: 10.475/123_4

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). HT17, Prague, Czech Republic. © 2017 Copyright held by the owner/author(s). 123-4567-24-567/08/06...$15.00

1 INTRODUCTION

A large public collaborative hypertext gives free access to allow any person both to read its content and to add to, or improve, the hypertext's data and structure. The hypertext may thus contain the work of many authors, spread across discrete pages. Their varying editing skills can pose a challenge for those trying to maintain the overall coherence and accuracy of the hypertext's content as a whole, as opposed to activity revising individual articles or generating new content. In wikis, where focus is on the rendered page, incremental edits can lead to unseen structural issues. For instance, under 50% of 'articles' in the English Wikipedia are actually content articles; the remainder are redirection stubs (see Table 2).

The same information may need to be repeated within different articles across a large hypertext. If text is copied, potential exists for thematic drift between different articles through subsequent edits by different authors. Ideally, in order to retain coherence of the hypertext over time, what we call longitudinal coherence, content duplication needs to be identified and consistency maintained.

Transclusion [17] offers one means of avoiding duplication. Deliberate and considered transclusional re-use of canonical sources throughout the hypertext can potentially assist with maintaining coherence and avoiding divergent copy, for example by re-using text summarising a subject in articles referring to that subject. Furthermore, transclusion, if identified as such, also offers the potential to indicate provenance of re-used text.

It therefore follows that the use of transclusion within a large Web hypertext should increase longitudinal coherence, but it is unclear how widely and how effectively these techniques are used in existing examples such as Wikipedia. Wikipedia's MediaWiki software does support transclusion (see Section 3), but Wiki studies appear to ignore the implied linkage created by transclusion. Despite some analysis as to the functional nature of edits made in Wikipedia [5], no study has been made of the nature of editing as relating specifically to transclusional (re-)use of content. Built-in Wikipedia queries ('special' pages; see https://en.wikipedia.org/wiki/Special:WhatLinksHere, on all article pages) and API methods can give some indication of transclusion use, but the reports are opaque and do not lend themselves to further exploration, especially as to how or why editors implemented their ideas. Thus more focused study of transclusion is needed.

By analysing the occurrence and nature of Wikipedia content transclusion, the study set out to investigate these questions:

• Does Wikipedia show evidence of deliberate use of transcluded article content? If transclusion is used in Wikipedia, then at minimum transclusion mark-up should be detected in the source code of articles using transclusion, and disparity in usage should become apparent, either within discrete per-language wikis or between different wikis.
• Does the nature of transclusion vary between discrete areas within per-language wikis, or between different languages? By categorising the subject area of any transclusion activity, disparity in use of transclusion should become apparent, both within discrete per-language wikis and between different wikis.
• Does article content show distinct patterns of transclusion? If common, transclusion link patterns may be identified which aid those maintaining the hypertext.

2 BACKGROUND

Transclusion, as coined by Nelson in his Literary Machines [17], referred originally to a single hypermedia source occurring in multiple places: "Transclusion means that part of a document may be in several places, in other documents beside the original, without actually being copied there" [18, preface footnote]. Subsequently, he re-defined transclusion as "reuse with original context available, through embedded shared instancing" [19, p32], tying it more closely to ideas expressed in his Xanadu system with its 'transpointing' windows.

Besides giving a canonical source, the inherent transclusion linkage can help establish provenance and copyright. Nelson held that indication of transclusion is a front-end function of the hypertext's reader (renderer) [18, footnote p2/37]. The technique does not preclude changes in transcluded sources; it is left to the user to select which version to link: if the system holds past version(s) of the source, these may be linked [18, p2/26]. Web transclusion, e.g. for image placement, generally draws material directly from its source, meaning that the transcluding document will reflect any change [...]

[...] transclusion still remains atypical for hypertextual writing for the Web. Research interest tends to focus on either the technical implementation or the social aspect of use. Consideration of the writing of hypertext, in a non-fiction context, can fall between these stools.

Halasz's 'Reflections on "Seven Issues"' [8, p.112] noted that the versioning 'issue' was not fully resolved. In a wiki system [14], the default is to render the current edit of the requested page. All past edits can be rendered by furnishing the UID of the desired edit. However, links, including transclusions, are not tied to a target edit; thus rendered content may change if the transcluded source is edited. For a web-based hypertext wiki supporting transclusion this means, in simplest terms, that the rendered article content (the body copy) of a page is able dynamically to include content not present in the article's own source code. Further indication of transclusion, or ability to traverse such implied links, is left to individual implementation.

Transclusion, applied appropriately, could help Wikipedia's many editors maintain cohesion. A precept of Wikipedia quality is the 'many eyes' theory [15]: that many people have looked at any given fact. However, Wikipedia's Manual of Style makes no mention of transclusion (or transcluding from Wikidata), effectively blinding the 'many eyes' to the concept.

Halfaker et al. [9] find that there is a plateauing in numbers of active editors of Wikipedia, with the suggestion that there may be a natural equilibrium in levels of active editors in collaborative wikis. Wikipedia has a very flat hierarchy of administrators and users, although either of those may have extra roles [1]. There is a notion of a 'quality assurance' role but this seems to apply more to anti-vandalism than hypertextual coherence. For Wikipedia editors kudos is most easily acquired, and thus promoted, by concentration on the 'quality' of individual rendered articles. There appears to be [...]
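
The wiki behaviour described in the Background (the current edit is rendered by default, past edits can be rendered on request, and transclusions are not tied to a target edit) can be illustrated with a minimal sketch. It is not the authors' method or MediaWiki's implementation: `ToyWiki` and its methods are hypothetical names, a list index stands in for the edit UID, and only the MediaWiki-style `{{:Title}}` form for transcluding another article is recognised, with templates, parameters and partial transclusion ignored.

```python
import re

# Assumption: MediaWiki-style markup in which {{:Title}} transcludes the
# article "Title". Templates, parameters and partial transclusion are ignored.
TRANSCLUSION = re.compile(r"\{\{:([^{}|]+)\}\}")


class ToyWiki:
    """Hypothetical in-memory wiki that keeps every edit of every page."""

    def __init__(self):
        self.history = {}  # title -> list of source texts, oldest first

    def edit(self, title, source):
        """Store a new edit of a page; earlier edits are retained."""
        self.history.setdefault(title, []).append(source)

    def source(self, title, edit=None):
        """Return the current source, or a past edit selected by its index."""
        revisions = self.history[title]
        return revisions[-1] if edit is None else revisions[edit]

    def transcluded_titles(self, title):
        """List the titles named by transclusion mark-up in the current source."""
        return [t.strip() for t in TRANSCLUSION.findall(self.source(title))]

    def render(self, title, edit=None, depth=0):
        """Expand transclusions. Targets always resolve to their current edit,
        even when a past edit of the transcluding page is requested."""
        if depth > 5:  # crude guard against self-transclusion loops
            raise RecursionError("transclusion nested too deeply")
        text = self.source(title, edit)
        return TRANSCLUSION.sub(
            lambda m: self.render(m.group(1).strip(), depth=depth + 1), text
        )


wiki = ToyWiki()
wiki.edit("Summary", "Paris is the capital of France.")
wiki.edit("Travel", "Intro. {{:Summary}}")
before = wiki.render("Travel")

# Only the transcluded source is edited; the source of "Travel" is untouched.
wiki.edit("Summary", "Paris is the capital and largest city of France.")
after = wiki.render("Travel")

# A later edit to "Travel", then a request for its first edit: the old page
# source is used, but the transclusion still pulls the current "Summary".
wiki.edit("Travel", "Updated intro. {{:Summary}}")
old_page = wiki.render("Travel", edit=0)
```

In this sketch `before` and `after` differ although the source of "Travel" did not change between the two renders, and `old_page` combines an old edit of "Travel" with the current edit of "Summary". `transcluded_titles` shows the kind of mark-up detection that the first research question relies on, in a much simplified form.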
