Lurking? Cyclopaths? a Quantitative Lifecycle Analysis of User Behavior in a Geowiki
Total Page:16
File Type:pdf, Size:1020Kb
Lurking? Cyclopaths? A Quantitative Lifecycle Analysis of User Behavior in a Geowiki Katherine Panciera1, Reid Priedhorsky1, Thomas Erickson2, Loren Terveen1 1GroupLens Research 2IBM T.J. Watson Research Center Department of Computer Science and PO Box 704 Engineering Yorktown Heights, NY 10598 USA University of Minnesota {snowfall}@acm.org Minneapolis, Minnesota, USA {katpa,reid,terveen}@cs.umn.edu ABSTRACT Online communities produce rich behavioral datasets, e.g., Usenet news conversations, Wikipedia edits, and Facebook friend networks. Analysis of such datasets yields important insights (like the “long tail” of user participation) and sug- gests novel design interventions (like targeting users with personalized opportunities and work requests). However, certain key user data typically are unavailable, specifically viewing, pre-registration, and non-logged-in activity. The absence of data makes some questions hard to answer; ac- cess to it can strengthen, extend, or cast doubt on previous results. We report on analysis of user behavior in Cyclopath, a geographic wiki and route-finder for bicyclists. With ac- cess to viewing and non-logged-in activity data, we were able to: (a) replicate and extend prior work on user lifecycles Figure 1. Screenshot of the Cyclopath application. in Wikipedia, (b) bring to light some pre-registration activ- ity, thus testing for the presence of “educational lurking,” and (c) demonstrate the locality of geographic activity and how editing and viewing are geographically correlated. INTRODUCTION The recent explosionof open content systems like Wikipedia, Flickr, and Stack Overflow has led to a new industry of on- Author Keywords line knowledge production and organization, carried out by Wiki, geowiki, open content, geographic volunteer work, distributed volunteers. The value of this new way of work is volunteered geographic information, lurking clear, and we, as many other researchers do, seek to under- stand how these systems work and how they can be nurtured. General Terms This paper is concerned with one such open content system. Human Factors, Design Cyclopath (shown in Figure 1) is a web-based mapping ap- plication for bicyclists in the Minnesota cities of Minneapo- 2 ACM Classification Keywords lis and St. Paul, a metro area of roughly 8,000km and 2.3 H.5.3 [Group and Organization Interfaces]: Collabora- million people. It serves as a standard web map, offering tive computing, computer-supportedcooperative work, web- cyclist-centric route-finding, but as a geowiki, it goes be- based interaction. yond, offering wiki editing of the entire map, from the ge- ometry and topology of the road/trail network to points of interest and annotations like notes and tags. Cyclopath is of interest to researchers for three reasons. First, as an open content system it furthers our understand- ing of this class of system by allowing us to compare and contrast it with other systems like Wikipedia (which is well- studied). Second, its design enables the study of phenomena – for example, viewing behavior by “lurkers” – which can- c ACM, (2010). This is the author’s version of the work. It is posted here not be studied with other open content systems. Third, Cy- by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of CHI 2010, (April 10–15, clopath is an example of a relatively new class of online sys- 2010) tem, geo-communities, that offers a chance to explore how pate in this space; for example, Google My Maps lets users users consume, explore, and edit geographic data. create and share arbitrary maps with low process overhead. Previous work has described Cyclopath’s design and ratio- Analyzing traces of online behavior. Internet-based so- nale [23], demonstrated its effectiveness under laboratory cial media has been much studied because it creates rich conditions [21], and discovered mechanisms for encourag- activity traces – messages posted and replied to, user pro- ing work [24]. These enabling efforts have established Cy- file settings, Wikipedia policies debated and articles edited, clopath as a viable research platform. This is the first work etc. Researchers have analyzed data from many systems to to report on Cyclopath’s use “in the wild”; as of this writing, draw compelling pictures of life online. For example, Whit- Cyclopath has been operational for 16 months and has suffi- taker et al. [31] and Jones et al. [14] revealed the dynamics cient users and data to provide a portrait of its use and some of large scale discussions; Lampe and Resnick investigated lessons learned. the effectiveness of distributed moderation for information filtering [16]. We offer two key contributions. First, we have quantita- tively analyzed the lifecycles of users in an open-contentsys- Types and lifecycles of users. Of most relevance to this tem, addressing in particular the question of pre-registration paper is work examining the types and lifecycles of online anonymous lurking. Second, and again quantitatively, we community members. One prominent line of research seeks analyze the kind of geographic work that is being done, fo- to assign users to different interaction roles, often through cusing on how the geographic nature of the system affects applying social network analysis [1, 7, 30]. what work is done and how public and hidden actions relate. Closer to the current work, other research analyzes users by The remainder of this paper is organized as follows. We first their level of participation. Famously, when the level of par- present related work, followed by data sources and methods. ticipation is visualized, it typically exhibits a “long tail”[2], We then present the bulk of our two contributions: first an such as by following a power law [1]. Such relationships analysis of the lifecycle of Cyclopath users, and then of ge- have been observed in many different kinds of online com- ographic viewing and editing. We close with some lessons munities, including Usenet discussions [31], tagging [10], learned and conjectures for other systems. and Wikipedia edits [22, 15]. Users near the top of the curve often are called “power users” or “elite editors.” In Wikipedia, a small group of elite editors does the majority RELATED WORK of the work [15] and produces the majority of value [22]. Geo-community networks. Online communities have a long history in CHI and CSCW. One strand of that his- Several researchers have studied how user behavior in on- tory concerns community networks – online communities line communities changes over time, especially among the focused on a particular city or region, such as Berkeley’s power users. Bryant et al. [3] interviewed a number of expe- Community Memory Project in the 1970s or Montana’s Big rienced Wikipedia editors and identified a broadening of in- Sky Telegraph in the 1980s [25]. This interest has continued terests and gradual assumption of community maintenance in sites like Blacksburg Electronic Village [4] and is today activities. Our earlier work [20] extended this work with manifested in Web 2.0-style sites such as EveryBlock.com quantitative analyses investigating the lifecycles of elite and and FourSquare.com that provide services for and tap into non-elite Wikipedia editors. Specifically, we found that elite the local knowledge of the inhabitants of a particular place. editors edited more than non-elites immediately upon ap- pearing, and that all editors’ activity was characterized by One interesting trend is that geographic information has be- an initial burst of intense activity followed by gradual de- gun to be incorporated into community network-style sites. cline to a fairly low, constant level of activity. This draws on geographic information systems (GIS), the field concerned with visualizing, analyzing, and manipulat- Activity spectrum: identified, anonymous, hidden. A ing geographic and spatial data [17]. While GIS has tradi- key difficulty of online community research is that it is lim- tionally been the province of experts using specialized (and ited by what researchers can see, which is not everything. expensive) software, more recently, the field has expanded Acts of “participation” or “editing” are traditionally visible, to consider what it refers to as volunteered geographic infor- while acts of “viewing” or “consumption” are not. Thus, mation [11, 6]. in discussion groups, posted messages, replies, and threads are visible, but reading of messages is hidden; similarly, in Within the CHI and CSCW communities, there has also been Wikipedia, edits to articles (and talk pages, user pages, etc.) growing interest in geographic-based collaboration and on- are visible, article reading is hidden, etc. The vast major- line communities, such as the systems for supporting map- ity of research, e.g. [31], analyzes only visible data, while based collaborative planning developed and studied at Penn pointing out the problem of “hidden” data. Those who do State [5]. Similarly, in the Web 2.0 world, we see open not participate but only view are called lurkers [19]. mapping sites like OpenStreetMap.org, which aims to build a global street map from scratch, and community-oriented A few research efforts have been able to study lurkers. Non- sites such as FixMyStreet.com and SeeClickFix.com that en- necke and Preece [19] quantified the amount of lurking in able locals to plot the location of potholes and similar prob- discussion lists. Soroka et al. [27] studied the amount of lems on a map. Major online mapping vendors also partici- lurking within a corporate discussion system and whether an example, creating a new block, changing the geometry of a initial lurking period served to educate users about system block, or changing the name and speed limit of a block each conventions. Harper et al. [12] analyzed how participation constitute one edit action; one edit action corresponds very invitations influenced users’ message reading behavior in a loosely to changing one word in Wikipedia. For each edit movie discussion system.