<<

132

Chapter 13 The Collaborative Organization System

Katrin Weller Heinrich Heine University of Düsseldorf, Germany

Isabella Peters Heinrich Heine University of Düsseldorf, Germany

Wolfgang G. Stock Heinrich Heine University of Düsseldorf, Germany

ABSTRACT This chapter discusses as a novel way of indexing and locating based on generated keywords. Folksonomies are considered from the point of view of knowledge organization and representation in the context of user collaboration within the Web 2.0 environments. Folksonomies provide multiple benefits which make them a useful indexing method in various contexts; however, they also have a number of shortcomings that may hamper precise or exhaustive retrieval. The position maintained is that folksonomies are a valuable addition to the traditional spectrum of knowledge organization methods since they facilitate user input, stimulate active language use and timeliness, create opportunities for processing large sets, and allow new ways of within document collections. Applications of folksonomies as well as recommendations for effective information indexing and retrieval are discussed.

INTRODUCTION of science and . In this context, studies focus A key problem facing today’s is on methods and to enable precise and how to find and retrieve information precisely and comprehensive searching of document collections effectively. Substantial research efforts concentrate (Frakes & Baeza-Yates, 1992; Stock, 2007a). In on the challenges of information structuring and addition, techniques of knowledge representation storing, particularly within different sub-disciplines have been established (Cleveland & Cleveland, 2001; Lancaster, 2003; Stock & Stock, 2008). Most prominent are approaches of document indexing: DOI: 10.4018/978-1-60566-368-5.ch013

Copyright © 2010, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited. Folksonomy

Figure 1. An exemplary cloud. Tag clouds by Gene Smith (2004). Smith uses the term display the most popular tags within a folksonomy “classification” for paraphrasing folksonomies. based system. The bigger the font size, the more This term arouses a misleading and faulty con- documents have been indexed with a tag. notation. The same holds for the term “.” Folksonomies are not classifications or taxono- mies, since they work neither with notations nor with semantic relations. They are, however, a new type of knowledge organization system, with its own advantages and disadvantages. i.e., assigning content-descriptive keywords to documents. This enhances retrieval techniques and BACKGROUND aids users in deciding on a document’s relevance. Knowledge Organization Systems Different knowledge organization systems (KOS) are developed to support sophisticated document Knowledge representation methods are ap- indexing. Common examples of KOS include plied to provide a better basis for information classification systems (), thesauri, and retrieval tools. This may basically be done in controlled keywords (nomenclatures). two ways: by abstracting the topics of a docu- Recently, a well-known problem of indexing ment and by indexing a document, i.e., assigning documents with content-descriptive has content-descriptive keywords or placing it into a been addressed from a new, user centered perspec- concept scheme (Cleveland & Cleveland, 2001; tive. Within the so-called “Web 2.0” (O’Reilly, Lancaster, 2003). For indexing documents with 2005), web users have begun publishing their own content-descriptive keywords, different types of content on a large scale and started using social knowledge organization systems (KOS) have to store and share documents, such as been developed. The most important methods photos, or bookmarks (Gordon-Murnane, – classifications, thesauri and nomenclatures – 2006; Hammond, Hannay, Lund, & Scott, 2005). comprise a , which is used And they have also begun to these documents for indexing. The vocabulary of classifications with their own keywords to make them retrievable. and thesauri usually has the of a structured In this context, the assigned keywords are called concept hierarchy, which may be enriched with tags. The indexing process is called (social) tag- further semantic relations, e.g., relations of equiva- ging, the totality of tags used within one platform lence and concept associations (Peters & Weller, is called folksonomy. A is a popular 2008; Weller & Peters, 2007). method for displaying most frequently applied Recently, two new developments have en- tags of a folksonomy visually (Figure 1). tered the spectrum of KOS: folksonomies and Thus, a folksonomy is an indexing method open ontologies (Weller, 2007). They complement for users to apply freely chosen index terms. Peter traditional techniques in different ways. Folk- Merholz (2004) entitles this method “metadata for sonomies include novel social dimensions of user the masses”; the James Surowiecki (2004) involvement; ontologies extend the possibilities refers to it as one example of “the wisdom of of formal vocabulary structuring (e.g., Alexiev crowds.” The term “folksonomy”, as a combina- et al., 2005; Davies, Fensel, & van Harmelen, tion of “folk” and “taxonomy”, was introduced in 2003; Staab & Studer, 2004). Both have revived 2004 by Thomas Vander Wal and cited in a discussions about metadata on the web (Madhavan et al., 2006; , 2004) and have increased the

133 Folksonomy

Figure 2. Classification of KOS according to since been widely assumed, yet definitions still complexity and broadness of the domain vary slightly. Generally, it describes a new era of the , in which the users are in the spotlight and can to easily contribute to the creation of new web-content. The borders between “consumers” and “producers” of content are blur- ring; we may talk of a new type of web user: the “prosumer” as envisioned by Toffler (1980). Furthermore, the focus is on many-to-many relationships. The interrelation of groups of us- ers (namely, communities) is emphasized. The collaboration of large communities enables the creation of content in new formats and of enor- mous scale. Thus, besides social networking and personal interconnections, the interlinking of top- ics and discussions plays a decisive role. Various awareness of knowledge representation issues new channels create a “matrix of in scientific areas and even within the common dialogues” (Maness, 2006) across different types web-user community. of content and different data formats (e.g., , We may classify different KOS according to , , content, discussions, the complexity of their formal structure (mainly forums, personal profiles). defined by the number of specified semantic With this enormous growth of user-created relations in use for structuring the vocabulary) content, new ways of navigating through it are and the extent of the captured domain (Figure 2). needed. It was within platforms, Both aspects are inversely proportional: the more that folksonomies have been introduced as an easy complex the structure, the smaller the captured way to let users organize their data and make it domain will have to be, due to feasibility reasons. accessible and retrievable: Everyone was allowed Folksonomy is a completely unstructured method to tag documents with freely chosen keywords. of document indexing. While in most other cases With the success of the photo- platform trained indexers or other experts are responsible Flickr1, the -community YouTube2, the for indexing documents, folksonomies allow the tool Del.icio.us3, and the producers or the users of certain content to take blog Technorati4 the principles of over this task. There is no authority which con- searching documents by assigned tags became trols the terminology in use. This also means that widely known. folksonomies are in no way limited to a certain domain of interest. They can be easily applied to all given contexts, as long as a community of FOLKSONOMIES AND SOCIAL interest exists. TAGGING APPLICATIONS Characteristics of Folksonomies Web 2.0 and User Collaboration In folksonomies, we are confronted with three The term Web 2.0 was coined during a discus- different elements (Marlow, Naaman, Boyd, & sion by Tim O’Reilly and Dale Dougherty from Davis, 2006): the documents to be described, O’Reilly Media (O’Reilly, 2005). The phrase has the tags which are used for description, and the

134 Folksonomy

Figure 3. Interrelation of tags, users, and docu- links to documents, which related users have ments (adapted fromPeters & Stock, 2007). tagged or by using the tags that they also use. This can be described as one type of social navigation. The most popular tags of a folksonomy (via tag clouds) are another way for entering document collections and browsing for content (Sinclair & Cardew-Hall, 2008). According to Vander Wal (2005), one can distinguish between two types of folksonomies: broad folksonomies, where one document can be tagged by several users, so that tags can be assigned to each document more than once (e.g., Del.icio. us); and narrow folksonomies, where each tag is recorded for a document only once (e.g., , Technorati, YouTube) (Vander Wal, 2005; see also Peters & Stock, 2007). Usually the document’s author provides the tags, although occasionally other users are also allowed to add tags. users who are indexing the content. These three The basis for tag clouds are platform-specific elements enable different dimensions of inter- or resource-specific tag distributions which could connections that can be used for browsing and be also represented by graphs (see Figures 4 & navigation (Figure 3). 5). Vander Wal (2005), Shirky (2005) and others Users as well as documents are interconnected that in broad folksonomies the distribution with each other in a social network environment, of tags given to a document follows a Lotka-like in which the paths run along the tags. On the one power law (Egghe & Rousseau, 1990, p. 293; hand, documents are linked “thematically” with Egghe, 2005). If this assumption is true, we each other when they have been indexed with see a curve with only few tags at the top of the the same tags. On the other hand, documents are distribution, and a “long tail” of numerous tags related via users, so-called shared users. Finally, on the lower ranks on the right-hand side of the users are linked with each other when they use curve (see Figure 4). Investigations of document- the same tags for indexing or when they index specific tag distributions demonstrate that another the same documents. Users are thematically re- prototypic tag distribution may appear as well. lated when they index with the same tags; they This inverse logistic distribution (Stock, 2006) are coupled with shared documents when they shows a lot of relevant tags on the curve’s left- index the same documents. Thus, tags can help hand side (the “long trunk”) and the known “long to identify communities of interest. tail” (see Figure 5). As both distributions share The extent of commonality may be illustrated the characteristic “long tail,” they are difficult to quantitatively with similarity rates such as Co- differentiate. Therefore, they are both often called sinus, Jaccard-Sneath or Dice (Stock & Stock, “power law.” 2008, p. 373), while communities of similar us- The evolution of tag distributions with “long ers can be detected by cluster analysis. All these tail” characteristic is commonly explained by the interrelations can be used to browse a document known “rich gets richer” or “success breeds suc- collection. One may find interesting documents cess” phenomena. Popular tags will be used more not only via tag searches, but also by following often because of their better visibility whereas

135 Folksonomy

Figure 4. Power law distribution of tags (based to regard all tags of the long trunk up to the turn- on tagging data from Del.icio.us retrieved May ing point of the curve (Peters & Stock, 2007). In 15, 2008, http://www.go2web20.net). most cases the “long trunk” will be shorter than the “long tail.” Searches using the option “power tags only” will enhance precision of search results due to the reciprocal relationship of recall and precision. As power tags prune the tag distribution at a certain threshold, the amount of searchable and retrievable resources will decrease and recall will diminish along with it.

Applications of Folksonomies unpopular tags may be used seldom and will form the “long tail” (Cattuto, Loreto, & Pietronero, By now, folksonomies are an essential part of many 2007; Halpin, Robu, & Shepherd, 2007). Several social software and Web 2.0-based applications. studies showed that the characteristic shape of the Users can tag various types of data, including distributions of document-specific tags (not the scientific articles, references, bookmarks, pic- absolute number of tags) will remain stable at a tures, videos, audio files, blog posts, discussions, certain point in time (Kipp & Campbell, 2006; events or even other users. The emergent content- Golder & Huberman, 1006; Halpin, Robu, & descriptive tags can be used as an additional Shepherd, 2007; Maass, Kowatsch, & Münster, access-point to data collections besides traditional 2007). folder structures. They are particularly needed to To enhance precision in folksonomy-based improve retrievability of non-textual documents, information retrieval systems, we may profit from such as videos and photos. Based on an idea by this knowledge and work with power tags as an Luis von Ahn (2006), uses a game-like additional search feature (assuming that the most application to incite users to tag pictures on the frequent tags are the most relevant ones and that Web. With the Google Image Labeler5 metadata less frequent tags can be neglected). In case of for large collections of images on the Web are a power-law distribution we may consider only collected to improve Google’s image search. the first n tags (e.g., the first three tags) and in Besides the various Web 2.0 applications, it is case of an inverse-logistic distribution we have also possible to work with folksonomies in other contexts, e.g., in of companies (Fichter, 2006), for indexing corporate blogs, podcasts and Figure 5. Inverse-logistic distribution of tags vodcasts (Peters, 2006), for corporate - (based on tagging data from Del.icio.us retrieved ing services (Millen et al., 2006) and message May 15, 2008, from http://www.readwriteweb. boards (Murison, 2005). com). Commercial online information suppliers have started to work with folksonomies as well (e.g., Engineering Village6 by Reed Elsevier and WISO7 by GENIOS). Folksonomies are suggested for broader use within professional (Stock, 2007b) as well as (Kroski, 2005; Spiteri, 2006). Single libraries have begun to implement social tagging applications for their catalogue

136 Folksonomy

Table 1. Benefits and problems with folksono- 2007; Smith, 2008). Discussion on the quality mies of folksonomies often focuses on a comparison

Benefits Problems with other KOS. While traditional techniques Folksonomies • rep- Folksonomies • have are based on elaborated knowledge representa- resent an authentic use of no vocabulary control and do tions techniques, controlled vocabularies, and language • allow multiple not recognize and expert skills in indexing, folksonomies rely on interpretations • recognize homonyms • do not make use neologisms • are cheap meth- of semantic relations between the principles of “” (Weiss, ods of indexing • are the only tags • mix up different basic 2005) and wisdom of the crowds. This leads to way to index mass information levels • merge different lan- on the Web • give the quality guages • do not distinguish the following key aspects of a critical reflection “control“ to the masses • allow formal from content-descrip- on folksonomies, which will be discussed in searching and – perhaps even tive tags • include spam-tags, better – browsing • can help user-specific tags, and other more detail (Weller, 2007): (a) the confrontation to identify communities • misleading keywords of user’s language versus vocabulary control; are sources for collaborative recommender systems • are (b) the social and personal objectives in tagging sources for the development behavior; and (c) the contrast between retrieval of ontologies, thesauri or clas- sification systems • make and exploration. 1 summarizes the main people sensitive to information benefits and problems with folksonomies. indexing issues The main property of a folksonomy is that it authentically captures the language-use of its user community and reflects the prosumers’ conceptual (e.g., the University of Pennsylvania with its model of information (Quintarelli, 2005). People 8 system PennTags ). are free to use whichever tags they want and do not Trant (2006) analyzes folksonomies as a user depend on a predefined set of terms. This freedom centered access point to art collections. in the choice of tags however means that folksono- One can furthermore envision social tagging as mies are entirely uncontrolled vocabularies, which an addition to classificatory folder approaches, leads to the well known “vocabulary problem” e.g., for portals. (Furnas et al., 1987; Furnas et al., 2006; Golder & Some software developers have also inte- Huberman, 2006; Mathes, 2004): Different people grated the tagging principle into their products. use different words to describe the same object. With , enables tagging Synonyms, trans-language synonyms, spelling of several data formats, such as pictures, videos variants and abbreviations are not bound together. and Office files. A similar approach can be found Thus, someone searching for “ of with Apple’s iPhoto. Yet, as long as tagging is America” will not find documents tagged with performed by single users within their personal “US”, “USA”, “United States” or “America.” workspace, the social component is lacking and Homonyms and polysems are not distinguished, we cannot speak of folksonomies in a strict sense, thus for example searching for “trunk” in Flickr’s but of personomies (Hotho et al., 2006). folksonomy will retrieve photos of trees as well as suitcases and probably also elephants. As different Benefits and Problems languages are used within most folksonomies, with Folksonomies additional trans-language homonyms may occur. Misspellings and encoding limitations are serious Metadata produced by broad communities are cost- problems for folksonomies (Guy & Tonkin, 2006). efficient and can easily be applied to large data All these peculiarities have to be kept in mind collections. The pros and cons of folksonomies are by the user when searching in folksonomy based well discussed (e.g., Kroski, 2005; Peters & Stock, systems. Alternatively, additional techniques of

137 Folksonomy

vocabulary control may be applied to avoid some describe or evaluate a document only from the of these problems. user’s very own perspective so that some tags But still the flexibility in the choice of tags is “are virtually meaningless to anybody except their probably also the greatest advantage of folksono- creators” (Pluzhenskaia, 2006, p. 23). Some other mies: It enables timeliness and multiple perspec- tags can be called “performative”: Often a planned tives. A controlled vocabulary is always bound to or done activity is tagged, for example “toread” on a certain point in time and to a certain point of Del.icio.us (Kipp, 2006a). Additionally, there are view. Folksonomy users can create tags quickly syncategorematic tags – terms which can only be in response to new developments and changes in understood in the specific context. A good example terminologies (Kroski, 2005). of this type of tag is the term “me” on Flickr, which Some tags may be neologisms. Mathes (2004) describes a photo of the document’s author. Some discusses the words “sometaithurts” (for “so meta keywords are even mere spam-tags. it hurts”) and “flicktion” on Flickr. “Although Overall, research has analyzed the nature of small, there is a quick formation of new terms to tags as well as the different functions of tags (see describe what is going on, and others adopting Al-Khalifa & Davis, 2007; Golder & Huberman, that term and the activity is describes” (Mathes, 2006; Kipp, 2006b). The strength of folksonomies 2004). Such an unanticipated and unexpected lies in “” (Mathes, 2004), in discovering use of tags reflects a “communication and ad-hoc information via different paths, and in easy to handle group formation facilitated through metadata” search mechanisms (Quintarelli, 2005). “The long (Mathes, 2004). tail paradigm is also about discovery of informa- Collaborative tagging of documents leads to tion, not just about finding it”, Quintarelli (2005) “multiple interpretations”, different and some- adds. Folksonomies provide different entry points times disparate opinions and “multicultural to document collections; as described above, users views” of the same piece of information (Peter- may browse along relations between tags, users and son, 2006). Folksonomies “include everyone’s documents. Searching with tags is much easier for vocabulary and reflect everyone’s needs without non-information professionals than searching with cultural, social, or political bias” (Kroski, 2005), elaborated retrieval tools such as, for example, the even niche interests can be represented. “Shared International Patent Classification system. On the intersubjectivities” enable the users “to benefit, other hand, professionally generated metadata are not just from their own discoveries, but from those usually segmented into different fields, such as of others” (Campbell, 2006, p. 10). Tags can be the document type and the notations of classifica- used as basis for recommender systems (Szomszor tion systems. Here indexing distinguishes formal et al., 2007). Yet, the intentions of tags in social aspects from content-descriptive information tagging systems are not always social. Users who (“aboutness”). In folksonomies a strict boundary tag documents do not necessarily do this with between different metadata is lacking. There are the objective of helping a community in finding tags that identify what a document is about. At the relevant documents. Many users simply use tags same time, one can find tags referring to formal to organize their own private documents. Vander descriptions at the same level: i.e., tags identify- Wal (2008) describes social tagging as being “col- ing the owner of the document or tags referring to lective” work rather than “collaborative.” Thus, (Golder & Huberman, 2006, p. 203). many tags in use are personal rather than social Within a catalog, this can cause problems, (Guy & Tonkin, 2006). as one could not, for example, clearly distinguish Some tags do not describe the document, but between written by William Shakespeare give a judgment (“stupid”). User-specific tags and books about him.

138 Folksonomy

SOLUTIONS AND The development and updating of structured RECOMMENDATIONS KOS can profit from folksonomies (Aurnham- mer et al., 2006; Christiaens, 2006; Gendarmi There are basically three different approaches aim- & Lanubile, 2006; Macgregor & McCulloch, ing to solve the present problems of folksonomies. 2006; Mika, 2005; Spyns et al., 2006; Zhang et All approaches complement each other. First, one al., 2006), because the tags, their frequency and can focus on the actors and try to educate users their distribution are sources for new controlled to improve “tag literacy” (Guy & Tonkin, 2006). terms, for modifications of terms and perhaps for The second approach comprises combinations of deleting concepts in the sense of a “bottom-up social tagging with other knowledge organization categorization” (Vander Wal, 2005). In this way systems (Weller, 2007). And finally we may gener- tags guarantee a fast response to changes and ally consider tags as elements of natural language in the knowledge domain. Gener- and treat them by means of automatic methods ally, folksonomies should not be regarded as of natural language processing (NLP) for better competitors for classical KOS but rather as a retrieval results (Peters & Stock, 2007; Stock, complement. 2007a, chapter 13-18). According to Peters (2006) it is not advisable Improvement of tag literacy would require to work exclusively with folksonomies in profes- a broader understanding of indexing principles sional environments (e.g., intranets, commercial within the folksonomy community. For training the online services), but to mix them with other index- user in selecting “good” tags, systems that suggest ing methods. Here, a layer model (Krause, 1996) tags to the users (based on co-occurrences, lexical for the combined use of folksonomies, thesauri, similarity or semantic relations) may be useful (Ma- classification systems, etc. will work well. cLaurin, 2005; Xu et al., 2006). Yet, providing tag To revise applied tags for effective information suggestions has influences on the classical “wisdom indexing and retrieval, it is useful to treat them of the crowds” approach; the social component of by means of NLP. After language identification folksonomies may get lost and the “success breeds and of tags, typical NLP-tasks, including success”-effect (Egghe & Rousseau, 1995) may error detection, word form conflation, identifica- adulterate the tag distributions. tion of named entities, phrase recognition, and Combinations of folksonomies and other KOS decompounding, can be executed. In this way, are very promising: e.g., approaches of using the variety and ambiguity of tags can be reduced clustering mechanisms to apply some structure for considerably. search result presentations or methods of automatic query expansion or query (Grahl et al., 2007; Gruber, 2007; Kolbitsch, 2007). Some FUTURE TRENDS research is done in the field of emergent : i.e., gradually growing semantic structures from The topics of information retrieval and relevance folksonomies to more complex KOS (see Zacha- ranking within folksonomies have not yet been rias & Braun, 2007) for instance by identifying discussed exhaustively. First approaches try to existing semantic interrelations between concepts implement a PageRank-like relevance , (Angeletou et al., 2007; Peters & Weller, 2008). the FolkRank, for the ranking of tagged documents Related approaches of editing and improving (Hotho et al., 2006): “The basic notion is that a unstructured folksonomies with basic vocabulary resource which is tagged with important tags by control are discussed as tag gardening (Governor, important users becomes important itself” (Hotho 2006; Weller & Peters, 2008). et al., 2006, 417). A patent application by Yahoo!

139 Folksonomy

for its photo-sharing service Flickr proposes an not to be viewed as rivalling systems; addition- “interestingness” ranking which takes into ac- ally, new options for combinations of different count, for instance, the user’s behavior in click- techniques will be designed. This will also be ing and tagging or the number of assigned tags particularly beneficial in specialized contexts, (Butterfield et al., 2006). All in all, three sets of since the number of professional pro- applicable ranking factors can be determined: (1) viders, libraries, and museums that have adapted tags, (2) collaboration, and (3) prosumers (Peters folksonomies continues to grow. & Stock, 2007). Besides research efforts on improving the qual- ity of folksonomies, some work is also done to use REFERENCES them as a basis for new applications (or as a source for ). For example, different methods Al-Khalifa, H. S., & Davis, H. C. (2007). Towards for identifying communities of interest with the better understanding of folksonomic patterns. In help of folksonomies are considered (Diederich S. Harper et al. (Eds.), Proceedings of the 18th & Iofciu, 2006; Wu et al., 2006), and analyses on Conference on and (pp. how people tag documents on the web might lead 163-166). New York: ACM Press. to a better understanding of how humans organize Alexiev, V., Breu, M., de Bruijn, J., Fensel, D., and process information (Lodwick, 2005). Lara, R., & Lausen, H. (Eds.). (2005). Information integration with ontologies: Experiences from an industrial showcase. Chichester, : Wiley CONCLUSION & Sons.

Folksonomies present a valuable addition to the Angeletou, S., Sabou, M., Specia, L., & Motta, E. spectrum of knowledge representation methods. (2007). Bridging the gap between folksonomies They appear in the context of user collaboration and the : An experience report. In in Web 2.0 environment and provide easy and B. Hoser & A. Hotho (Eds.), Bridging the gap comprehensive access to large data collections. between Semantic Web and Web 2.0 (pp. 30-43). With web users taking control over document Innsbruck, Austria: International Workshop at the indexing, folksonomies offer an inexpensive 4th European Semantic Web Conference (SemNet way of processing large data sets. User centered 2007). approaches to tagging have multiple benefits, as Aurnhammer, M., Hanappe, P., & Steels, L. they can actively capture the authentic language (2006). Augmenting navigation for collabora- of the user, are flexible and allow new ways of tive tagging with emergent semantics. Lecture social navigation within document collections. Notes in , 4273, 58–71. Yet some problems derive from the unstructured doi:10.1007/11926078_5 nature of tags which may be solved by improv- ing the users’ tag literacy, by (automatic) query Butterfield, D. S., Costello, E., Fake, C., Hen- refinements, or by processing tags through natural derson-Begg, C. J., & Mourachow, S. (2006). language processing. Interestingness ranking of media objects (Patent In the future, the advantages and shortcomings Application No. US 2006/0242139 A1). Washing- of folksonomies will be considered more closely ton, DC: U.S. Patent and Trademark Office. as advanced approaches to the use of social tag- ging applications are emerging. Folksonomies and traditional knowledge representation methods are

140 Folksonomy

Campbell, D. G. (2006). A phenomenological Egghe, L., & Rousseau, R. (1995). Generalized framework for the relationship between the Se- success-breeds-success principle leading to time- mantic Web and user-centered tagging systems. In dependent informetric distributions. Journal of Proceedings of the 17th Workshop of the American the American Society for Information Science Society for Information Science and , and Technology, 46(6), 426–445. doi:10.1002/ Special Interest Group in Classification Research, (SICI)1097-4571(199507)46:6<426::AID- Austin, TX. Retrieved June 23, 2008, from http:// ASI3>3.0.CO;2-B dlist.sir.arizona.edu/1838/ Fichter, D. (2006). applications for tagging Cattuto, C., Loreto, V., & Pietronero, L. (2007). and folksonomies. Online, 30(3), 43–45. Semiotic dynamics and collaborative tagging. Pro- Frakes, W. B., & Baeza-Yates, R. (Eds.). (1992). ceedings of the National Academy of Sciences of Information retrieval: Data structure and algo- the United States of America, 104(5), 1461–1464. rithms. Upper Saddle River, NJ: Prentice Hall. doi:10.1073/pnas.0610487104 Furnas, G. W., Fake, C., von Ahn, L., Schachter, Christiaens, S. (2006). Metadata mechanisms: J., Golder, S., Fox, K., et al. (2006). Why do tag- From ontology to folksonomy… and back. Lec- ging systems work? CHI’06 Extended Abstracts ture Notes in Computer Science, 4277, 199–207. on Human Factors in Systems (pp. doi:10.1007/11915034_43 36-39). New York: ACM. Cleveland, D. B., & Cleveland, A. (2001). Intro- Furnas, G. W., Landauer, T. K., Gomez, L. M., duction to indexing and abstracting. Englewood, & Dumais, S. T. (1987). The vocabulary problem CO: Greenwood Press. in human-system communication: An analysis Davies, J., Fensel, D., & van Harmelen, F. (Eds.). and a solution. of the ACM, 30, (2003). Towards the Semantic Web: Ontology- 964–971. doi:10.1145/32206.32212 driven . Chichester, Gendarmi, D., & Lanubile, F. (2006). Community- England: Wiley & Sons. driven ontology evolution based on folksono- Diederich, J., & Iofciu, T. (2006). Finding commu- mies. Lecture Notes in Computer Science, 4277, nities of practice from user profiles based on folk- 181–188. doi:10.1007/11915034_41 sonomies. In Proceedings of the 1st International Golder, S. A., & Huberman, B. A. (2006). Us- Workshop on Building Technology Enhanced age patterns of collaborative tagging systems. Learning Solutions for Communities of Practice Journal of Information Science, 32(2), 198–208. (TEL-CoPs’06). Retrieved June 23, 2008, from doi:10.1177/0165551506062337 http://palette.cti.gr/workshops/telcops06.htm Gordon-Murnane, L. (2006). Social bookmarking, Egghe, L. (2005). Power laws in the informa- folksonomies, and Web 2.0 tools. Searcher - The tion production process: Lotkaian informetrics. for Database Professionals, 14(6), Amsterdam: Elsevier Academic Press. 26-38. Egghe, L., & Rousseau, R. (1990). Introductions Governor, J. (2006, October 1). On the emer- to informetrics. Amsterdam: Elsevier. gence of professional tag gardeners. Retrieved June 23, 2008, from http://www.redmonk.com/ jgovernor/2006/01/10/on-the-emergence-of- professional-tag-gardeners/

141 Folksonomy

Grahl, M., Hotho, A., & Stumme, G. (2007). Kipp, M. E. I. (2006b). Exploring the context of Conceptual clustering of social bookmarking user, creator and intermediate tagging. In Proceed- sites. In Proceedings of I-Know’07, Graz, Austria ings of the IA Summit 2006, Canada. Retrieved June (pp. 356-364). 23, 2008, from http://www.iasummit.org/2006/ files/109_Presentation_Desc.pdf Gruber, T. (2007). Ontology of folksonomy: A mash-up of apples and oranges. International Kolbitsch, J. (2007). WordFlickr: A solution to Journal on Web Semantics and Information Sys- the vocabulary problem in social tagging systems. tems, 3(1), 1–11. In Proceedings of I-MEDIA ‘07 and I-SEMAN- TICS’07, Graz, Austrai (pp. 77-84). Guy, M., & Tonkin, E. (2006). Folksonomies: Ti- dying up tags? D-Lib Magazine, 12(1). Retrieved Krause, J. (1996). Informationserschließung und June 23, 2008, from http://www.dlib.org/dlib/ -bereitstellung zwischen Deregulation, Kom- january06/guy/01guy. merzialisierung und weltweiter Vernetzung. IZ- Arbeitsbericht, 6. Halpin, H., Robu, V., & Shepherd, H. (2007). The complex dynamics of collaborative tagging. Kroski, E. (2005). The hive mind: Folksonomies In Proceedings of the 16th Conference on World and user-based tagging. Retrieved June 23, 2008, Wide Web (pp. 211-220). New York: ACM. from http://infotangle.blogsome.com/2005/12/07/ the-hive-mind-folksonomies-and-user-based- Hammond, T., Hannay, T., Lund, B., & Scott, tagging J. (2005). Social bookmarking tools: A general review. Part I. D-Lib Magazine, 11(4). Retrieved Lancaster, F. W. (2003). Indexing and abstract- June 23, 2008, from http://www.dlib.org/dlib/ ing in theory and practice (3rd ed.). Urbana, IL: april05/hammond/04hammond.html University of Illinois Urbana. Hotho, A., Jäschke, R., Schmitz, C., & Stumme, Lodwick, J. (2005). Tagwebs, Flickr, and the G. (2006). Information retrieval in folksonomies: human brain. Retrieved January 10, 2008, from Search and ranking. Lecture Notes in Computer Sci- http://www.blumpy.org/tagwebs/ ence, 4011, 411–426. doi:10.1007/11762256_31 Maass, W., Kowatsch, T., & Münster, T. (2007). Kipp, M., & Campbell, D. (2006). Patterns and Vocabulary patterns in free-for-all collaborative inconsistencies in collaborative tagging systems: indexing systems. In Proceedings of International An examination of tagging practices. In Proceed- Workshop on Emergent Semantics and Ontology ings of the 17th Annual Meeting of the American Evolution, Busan, Korea (pp. 45-57). Society for Information Science and Technology, Macgregor, G., & McCulloch, E. (2006). Collab- Austin, TX. orative tagging as a knowledge organisation and Kipp, M. E. I. (2006a). @toread and cool: Tagging resource discovery tool. Library Review, 55(5), for time, task and emotion. In 17th ASIS&T SIG/ 291–300. doi:10.1108/00242530610667558 CR Classification Research Workshop: Abstracts MacLaurin, M. B. (2005). Selection-based item of Posters, Austin, TX (pp. 16-17). tagging (Patent Application No. US 2007/0028171 A1). Washington, DC: U.S. Patent and Trademark Office.

142 Folksonomy

Madhavan, J., Halevy, A., Cohen, S., Dong, X., O’Reilly, T. (2005). What is Web 2.0: Design pat- Jeffery, S. R., Ko, D., & Yu, C. (2006). Struc- terns and business models for the next generation tured data meets the Web: A few observations. A of software. Retrieved January 10, 2008, from Quarterly Bulletin of the Computer Society of the http://www.oreillynet.com/pub/a/oreilly/tim/ IEEE Technical Committee on Data Engineering, news/2005/09/30/what-is-web-20.html 29(4), 19–26. Peters, I. (2006). Against folksonomies: Index- Maness, J. M. (2006). Library 2.0 theory: Web ing blogs and podcasts for corporate knowledge 2.0 and its implications for libraries. Webology, management. In H. Jezzard (Ed.), Proceedings of 3(2). Retrieved January 10, 2008, from http:// Preparing for Information 2.0, Online Information www.webology.ir/2006/v3n2/a25.html 2006 (pp. 93-97). London: Learned Information Europe. Marlow, C., Naaman, M., Boyd, D., & Davis, M. (2006). HT06, tagging paper, taxonomy, Flickr, Peters, I., & Stock, W. G. (2007). Folksonomy academic article, to read. In Proceedings of the and information retrieval. In Proceedings of the 17th Conference on Hypertext and Hypermedia 70th Annual Meeting of the American Society for (pp. 31-40). New York: ACM. Information Science and Technology, 45, 1510- 1542. Mathes, A. (2004). Folksonomies – cooperative classification and communication through shared Peters, I., & Weller, K. (2008). Paradigmatic and metadata. Urbana, IL: University of Illinois Ur- syntagmatic relations in ontologies and folk- bana-. Retrieved June 23, 2008, from http://www. sonomies. Information . Wissenschaft und Praxis, adammathes.com/academic/computer-mediated- 59(2), 100–107. communication/folksonomies.html Peterson, E. (2006). Beneath the metadata: Some Merholz, P. (2004, October 19). Metadata for the philosophical problems with folksonomies. D-Lib masses. Retrieved January 10, 2008, from http:// Magazine, 12(11). doi:10.1045/november2006- www.adaptivepath.com/publications/essays/ peterson /000361. Pluzhenskaia, M. (2006). Folksonomies or faux- Mika, P. (2005). Ontologies are us: A unified sonomies: How social is social bookmarking? In model of social networks and semantics. Lec- 17th ASIS&T SIG/CR Classification Research ture Notes in Computer Science, 3729, 522–536. Workshop: Abstracts of Posters, Austin, TX (pp. doi:10.1007/11574620_38 23-24). Millen, D. R., Feinberg, J., & Kerr, B. (2006). Quintarelli, E. (2005). Folksonomies: Power to DOGEAR: Social bookmarking in the enterprise. the people. Paper presented at the ISKO Italy In Proceedings of the SIGCHI Conference on Hu- UniMIB Meeting, Milan. man Factors in Computing Systems (pp. 111-120). Safari, M. (2004): Metadata and the Web. Webol- New York: ACM. ogy, 1(2). Retrieved January 10, 2008, from http:// Murison, J. (2005). Messageboard topic tagging: www.webology.ir/2004/v1n2/a7.html User tagging of collectively owned community Shirky, C. (2005). Ontology is overrated: Catego- content. In Proceedings of the 2005 Conference ries, links, and tags. Retrieved January 10, 2008, on Designing for User eXperience (Article no. 5). from www.shirky.com/writings/ontology_over- New York: American Institute of Graphic Art. rated.html

143 Folksonomy

Sinclair, J., & Cardew-Hall, M. (2008). The Szomszor, M., Cattuto, C., Alani, H., O’Hara, K., folksonomy tag cloud: When is it useful? Baldassarri, A., Loreto, V., et al. (2007, June). Journal of Information Science, 34(1), 15–29. Folksonomies, the Semantic Web, and movie doi:10.1177/0165551506078083 recommendation. In B. Hoser & A. Hotho (Eds.), Bridging the gap between Semantic Web and Web Smith, G. (2004, August 3). Folksonomy: Social 2.0 (pp. 71-84). Innsbruck, Austria: International classification. Retrieved January 10, 2008, from Workshop at the 4th European Semantic Web http://atomiq.org/archives/2004/08/folksonomy_ Conference (SemNet 2007). social_classification.html Toffler, A. (1980). The third wave. New York: Smith, G. (2008). Tagging: People-powered Morrow. metadata for the . Berkeley, CA: New Riders. Trant, J. (2006). Exploring the potential for social tagging and folksonomy in art muse- Spiteri, L. F. (2006). The use of folksonomies in ums: Proof of concept. New Review of Hy- public library catalogues. The Serials , permedia and Multimedia, 12(1), 83–105. 51(2), 75–89. doi:10.1300/J123v51n02_06 doi:10.1080/13614560600802940 Spyns, P., de Moor, A., Vandenbussche, J., Vander Wal, T. (2005, February 21). Explaining & Meersman, R. (2006). From folksonomies and showing broad and narrow folksonomies. to ontologies: How the twain meet. Lecture Retrieved January 10, 2008, from http://www. Notes in Computer Science, 4275, 738–755. vanderwal.net/random/entrysel.php?blog=1635 doi:10.1007/11914853_45 Vander Wal, T. (2008). Keeping up with social tag- Staab, S., & Studer, R. (Eds.). (2004). Handbook ging. In Workshop Good Tags – Bad Tags, Social on ontologies. Berlin, Germany: Springer. Tagging in der Wissensorganisation, Institut für Stock, W. G. (2006). On relevance distributions. Wissensmedien, Tübingen, Germany. Retrieved Journal of the American Society for Informa- June 23, 2008, from http://www.e-teaching.org/ tion Science and Technology, 57(8), 1126–1129. community/taggingcast doi:10.1002/asi.20359 von Ahn, L. (2006). Games with a purpose. IEEE Stock, W. G. (2007a). Information retrieval: Computer Magazine, 96-98. Informationen suchen und finden. München, Weiss, A. (2005). The power of collective intel- Germany: Oldenbourg. ligence. netWorker, 9(3), 16-23. Stock, W. G. (2007b). Folksonomies and science Weller, K. (2007). Folksonomies and ontologies: communication. A mash-up of professional sci- Two new players in indexing and knowledge repre- ence databases and Web 2.0 services. Information sentation. In H. Jezzard (Ed.), Online Information Services & Use, 27(3), 97–103. Conference Proceedings (pp. 108-115). London: Stock, W. G., & Stock, M. (2008). Wissen- Learned Information Europe. srepräsentation: Informationen auswerten und Weller, K., & Peters, I. (2007). Reconsidering bereitstellen. München, Germany: Oldenbourg. relationships for knowledge representation. In Surowiecki, J. (2004). The wisdom of crowds: Proceedings of I-KNOW ‘07, Graz, Austria (pp. Why the many are smarter than the few. New 493-496). York: Anchor Books.

144 Folksonomy

Weller, K., & Peters, I. (2008). Seeding, weed- Thomas Vander Wal as a combination of “folk” ing, fertilizing – different tag gardening activities and “taxonomy.” for folksonomy maintenance and enrichment. In Knowledge Representation and Indexing: Triple-I Conference, Proceedings of I-Semantics, In the context of information storage and retrieval Graz, Austria (pp. 110-117). techniques, knowledge representation is con- cerned with providing methods for organizing Wu, H., Zubair, M., & Maly, K. (2006). Harvest- and representing knowledge domains and sorting ing social knowledge from folksonomies. In Pro- documents accordingly. A traditional way to do ceedings of the 17th Conference on Hypertext and this is by document indexing: i.e., by assigning Hypermedia (pp. 111-114). New York: ACM. keywords or notations (usually taken from a Xu, Z., Fu, Y., Mao, J., & Su, D. (2006). Towards controlled vocabulary or classification scheme) the Semantic Web: Collaborative tag suggestions. to a document to describe its content. In Proceedings of the 15th International WWW Knowledge Organization Systems (KOS): Conference. Knowledge Organization Systems are (structured) representations of a knowledge domain, used for Zacharias, V., & Braun, S. (2007). SOBOLEO. and indexing. Common Social bookmarking and lightweight ontology classical knowledge organization systems in- engineering. In Proceedings of the Workshop clude classifications (taxonomies), thesauri, and on Social and Collaborative Construction of nomenclatures. Folksonomies and ontologies are Structured Knowledge (CKC), 16th International new forms of KOS. World Wide Web Conference (WWW 2007), Banff, Tag: Within a given context, a tag is a keyword Alberta, Canada. assigned to a document to describe it. Tags can be Zhang, L., Wu, X., & Yu, Y. (2006). Emergent used for document retrieval. Folksonomy tags can semantics from folksonomies: A quantitative be freely chosen by the users of a folksonomy- study. Lecture Notes in Computer Science, 4090, based system. 168–186. doi:10.1007/11803034_8 Tag Cloud: A tag cloud displays the popular- ity of tags, either for tags assigned to one single document or for all tags within a complete folk- sonomy-based platform. The bigger and broader KEY TERMS AND DEFINITIONS a tag is displayed in a tag cloud, the more often has it been used. Broad and Narrow Folksonomies: Broad and Tag Distribution: The frequency of tags as- narrow folksonomies differ in whether multiple signed to one document (or within a platform) assignments of identical tags are possible or not. can be counted and visualized as a tag distribution Systems with broad folksonomies allow to assign graph. Some specific forms of tag distributions the same tag to one document several times (thus are dominant within folksonomies: for example, the tag frequency can be counted), whereas narrow the of a “long tail”, which reacts to folksonomies record every tag only once. the rules of the power law. A “long trunk” may Folksonomy: An indexing method open for appear as well; the curve then follows an inverse- users to apply freely chosen index terms. The logistic distribution. term “folksonomy” was introduced in 2004 by

145 Folksonomy

ENDNOTES 6 Engineering Village: http://www.engineer- ingvillage.com 1 Flickr: http://www.flickr.com 7 WISO: http://www.wiso-net.de 2 You Tube: http://www.youtube.com 8 PennTags: http://tags.library.upenn.edu/ 3 Del.icio.us: http://del.icio.us 4 Technorati: http://www.technorati.com 5 Google Image Labeler: http://images.google. com/imagelabeler/

146