Learning to Explain Non-Standard English Words and Phrases

Total Page:16

File Type:pdf, Size:1020Kb

Learning to Explain Non-Standard English Words and Phrases Learning to Explain Non-Standard English Words and Phrases Ke Ni and William Yang Wang Department of Computer Science University of California, Santa Barbara Santa Barbara, CA 93106 USA fke00@umailg,[email protected] Abstract We describe a data-driven approach for au- tomatically explaining new, non-standard English expressions in a given sentence, building on a large dataset that includes Figure 1: An example Tweet with a non-standard 15 years of crowdsourced examples from English expression. Our model aims at automati- UrbanDictionary.com. Unlike prior stud- cally explaining any newly emerged, non-standard ies that focus on matching keywords from expressions (generating the blue box). a slang dictionary, we investigate the pos- sibility of learning a neural sequence-to- sequence model that generates explana- tions from Wiktionary to detect satirical news ar- tions of unseen non-standard English ex- ticles. Wang and McKeown (2010) show that us- pressions given context. We propose a ing a slang dictionary of 5K terms can help detect- dual encoder approach—a word-level en- ing vandalism of Wikipedia edits. Han and Bald- coder learns the representation of context, win (2011) make use of the same slang dictionary, and a second character-level encoder to and achieve the best performance by combining learn the hidden representation of the tar- the dictionary lookup approach with word simi- get non-standard expression. Our model larity and context for Twitter and SMS text nor- can produce reasonable definitions of new malization. However, using a 5K slang dictionary non-standard English expressions given may suffer from the coverage issue, since slang their context with certain confidence. is evolving rapidly in the social media age1. Re- cently, Thanapon Noraset (2016) shows that it is 1 Introduction possible to use word embeddings to generate plau- sible definitions. Nonetheless, one weakness is In the past two decades, the majority of NLP re- that definition of a word may change within dif- search focused on developing tools for the Stan- ferent contexts. dard English on newswire data. However, the non- standard part of the language is not well-studied In contrast, we take a more radical approach: in the community, even though it becomes more we aim at building a general purpose sequence-to- and more important in the era of social media. sequence neural network model (Sutskever et al., While we agree that one must take a cautious ap- 2014) to explain any non-standard English expres- proach to automatic generation of non-standard sion, which can be used in many NLP and social language (Hickman, 2013), but for many practical media applications. More specifically, given a sen- purposes, it is also of crucial importance for ma- tence that includes a non-standard English expres- chines to be able to understand and explain this sion, we aim at automatically generating the trans- important subversion of the language. lation of the target expression. Previously, this In the NLP community, using dictionaries of is not possible because the resources of labeled non-standard language as an external knowledge 1For example, more than 2K entries are submitted daily to source is useful for many tasks. For example, Bur- Urban Dictionary (Kolt and Lazier, 2009), the largest online foot and Baldwin (2009) consult the slang defini- slang resource. non-standard expressions are not available. In this and McKeown, 2011). However, we argue that us- paper, we collect a large corpus of 15 years of ing a small, fixed-size dictionary approach may crowdsourced examples, and formulate the task as be suboptimal: it suffers from the low coverage a sequence-to-sequence generation problem. Us- problem, and to keep the dictionary up to date, ing a word-level model, we show that it is pos- maintaining such dictionary is also expensive and sible to build a general purpose non-standard En- time-consuming. To the best of our knowledge, glish words and phrases explainer using neural se- our work is the first to build a general purpose ma- quence learning techniques. To summarize our chine learning model for explaining non-standard contributions: English terms, using a large crowdsourced dataset. • We present a large, publicly available corpus 3 Our Approach of non-standard English words and phrases, including 15 years of definitions and exam- 3.1 Sequence-to-Sequence Model ples for each entry via crowdsourcing; Since our goal is to automatically generate expla- nations for any non-standard English expressions, • We present a hybrid word-character we select sequence-to-sequence models with at- sequence-to-sequence model that directly tention mechanisms as our fundamental frame- explains unseen non-standard expressions work (Bahdanau et al., 2014), which can pro- from social media; duce abstractive explanations, and assign different • Our novel dual encoder LSTM model outper- weights to different parts of a sentence. To model forms a standard attentive LSTM baseline, both the context words and the non-standard ex- and it is capable of generative plausible ex- pression, we propose a hybrid word-character dual planation for unseen non-standard words and encoder. An overview of our model is shown in phrases. Figure2. In the next section, we outline related work on 3.2 Context Encoder non-standard expressions and social media text Our context encoder is basically a recurrent processing. We will then introduce our dual en- neural network with long-short term memory coder based attentive sequence-to-sequence model (LSTM) (Hochreiter and Schmidhuber, 1997). for explaining non-standard expressions in Sec- LSTM consists of input gates, forget gates and tion3. Experimental results are shown in Sec- output gates, together with hidden units as its in- tion4. And finally, we conclude in Section5. ternal memory. Here, i controls the impact of new input, while f is a forget gate, and o is an output 2 Related Work gate. C~t is the candidate new value. h is the hidden The study of non-standard language is of inter- state, and m is the cell memory state. “ ” means ests to many researchers in the social media and element-wise vector product. The definition of the NLP communities. For example, Eisenstein et gates, memory, and hidden state is: al. (2010) propose a latent variable model to study i = σ(W [x ; h ]) the lexical variation of the language on Twitter, t i t t−1 where many regional slang words are discovered. ft = σ(W [xt; ht−1]) Zappavigna (2012) identifies the Internet slang as f an important component in the discourse of Twit- ot = σ(Wo[xt; ht−1]) ter and social media. Gouws et al. (2011) provide C~ = tanh(W [x ; h ]) a contextual analysis of how social media users t c t t−1 shorten their messages. Notably, a study on Tweet mt = mt−1 ft + it C~t normalization (Han and Baldwin, 2011) finds that, h = m o even when using a small slang dictionary of 5K t t t words, Slang makes up 12% of the ill-formed At each step, RNN is given a vector as input, words in a Twitter corpus of 449 posts. In the changes its hidden states and produces outputs NLP community, slang dictionary is widely used from its last layer. Hidden states and outputs are in many tasks and applications (Burfoot and Bald- stored and later passed to a decoder, which pro- win, 2009; Wang and McKeown, 2010; Rosenthal duces final outputs based on hidden states, outputs, it confuses the decoder on which one it should ex- plain. In practice, the user can often pin point to the exact expression she or he does not know, so in this work, we design a second encoder to learn the representation of the target non-standard expres- sion. Since our goal is to explain any non-standard English words and phrases, we consider a character-level encoder for this purpose. This sec- Figure 2: Dual Encoder Structure. Top left: a ond encoder reads the embedding vector of each word-level LSTM encoder for modeling context. character at each time step, and produces an out- Bottom left: a character-level LSTM encoder for put vector, and hidden states. Our dual encoder modeling target non-standard expression. Right: model linearly combines the hidden states of the an LSTM decoder for generating explanations. character-level target expression encoder with the hidden states of the word-level encoder with the and the decoder’s own parameters.The context en- following equation: coder learns and encodes sentence-level informa- tion for the decoder to generate explanations. hnew = h1W1 + h2W2 + B 3.3 Attention Mechanism Here, h1 is the context representation, and h2 is The idea of designing an attention mechanism is the target expression representation. The Bias to select key focuses in the sentence. More specif- B and the combination weights W1 and W2 are ically, we would like to pay attention to specific learnable. parts of encoder outputs and states. We follow a 4 Experiment recent study (Vinyals et al., 2015) to setup an at- tention mechanism. We have a separate LSTM for 4.1 Dataset decoding. We collect the non-standard English corpus2 from Here we briefly explain the method. When the Urban Dictionary (UD)3—the largest online slang decoder starts its decoding process, it has all hid- dictionary, where terms, definitions and examples den units from the encoder, denoted as (h ::h ). 1 T1 are submitted by the crowd. UD is made a reliable Also we denote the hidden state of the decoder resource, due to the quality control of its publish- as (d ::d ). T and T are input and output 1 T2 1 2 ing procedure.
Recommended publications
  • Online Dictionaries As Emergent Archives of Contemporary Usage and Collaborative Codification
    Online Dictionaries as Emergent Archives of Contemporary Usage and Collaborative Codification Colleen Cotter Queen Mary, University of London John Damaso Arizona State University February 2007 Abstract Within the history of modern English lexicography, individual dictionary editors have had ultimate control over the selection, meaning, and illustration of words; extensive collaboration with contributors has been limited. However, Internet technologies that easily permit exchanges between a user and a database have allowed a new type of dictionary online, one that is built by the collaboration of contributing end-users, allowing ordinary users of dictionaries who are not trained lexicographers to engage in dictionary-making. We discuss a popular online slang dictionary called UrbanDictionary.com (UD) to illustrate how lexicographic principles are joined with Web-only communication technologies to provide a context for collaborative engagement and meaning-making; and to note the many characteristics and functions shared with the traditional print dictionary. Significantly, UD captures what most traditional English dictionaries fall short of: recording ephemeral quotidian spoken language and representing popular views of meaning. By relying on the users of language to select and define words for a dictionary, UD, which defines more than 1 million words, has in effect influenced access to and formulation of the lexis. Keywords computer-mediated communication, lexicography, slang, youth language; English Queen Mary’s OPAL #9 Occasional Papers Advancing Linguistics 1 Introduction English lexicography stems from a tradition of relatively limited functional collaboration, beginning with Samuel Johnson’s dictionary in 1755, in which editors overseeing numerous contributors held the ultimate authority over the selection, meaning, and illustration of words.
    [Show full text]
  • Humping a Grouse Cobber Dymphna Lonergan
    Humping a grouse cobber Dymphna Lonergan When I arrived in Adelaide in the 1970s, they were called cobbers. Those small, chocolate covered squares of hard caramel now go by the name mates. The word mate has nudged ahead of cobber in popularity, probably since the republican debate in the 1990s when mateship was widely discussed as a proposed term for inclusion in the Constitution of Australia. By the time of the republican debate, the word cobber had been in print for just over one hundred years. Appearing first in the Worker in Sydney in 1893, it points to a rural origin. The Australian National Dictionary suggests the British dialectal word cob, ‘to take a liking to’ as a probable origin of cobber, and the use of the word seems to support the need for friendship in the harsh environment of the Australian bush. If we no longer hear cobber used much, it may simply have outlived its usefulness because the social conditions that created it have changed (and because mate is shorter, and more suited to these downsized and core business times). On the other hand, the word cobber appears in a current online urbandictionary website with the claim that it is not an archaic word, but simply little understood as an equivalent to mate. Therefore, while cobber seems to have been superseded by mate as a more commercially viable term for chocolate-coated caramel squares, cobber still has currency for technology-focused young Australians The English Dialect Dictionary cautions that cob ‘to like’ is not widely known. It certainly seems curious that this apparently rare verb could evolve quickly into an Australian noun widely used in the 1890s.
    [Show full text]
  • The Wiki-Fication of the Dictionary: Defining Lexicography in the Digital Age
    THE WIKI-FICATION OF THE DICTIONARY: DEFINING LEXICOGRAPHY IN THE DIGITAL AGE Darrell J. Penta College of Professional Studies Northeastern University [email protected] Abstract The future of lexical reference books, such as the 20-volume Oxford English Dictionary (OED), is going to be determined, in part, by the emergence of free on-line dictionaries, such as Wiktionary and Urban Dictionary. Specifically, we are witnessing a paradigmatic shift of authority in which users, rather than editorial boards, are making decisions concerning the content associated with a lexical entry’s definition. In effect, an exclusive privilege formerly enjoyed by professional lexicographers is now being extended unequivocally to laypersons. It is pertinent to ask, therefore, what effect this state of affairs will have on the ways that dictionaries are compiled and used. For some, including Jill Lepore of the New Yorker Magazine, online collaborative lexical references are “Maoist” resources, “cobble[d]…together” by non-experts who “pilfer” definitions (79). This paper rejects such a characterization and seeks, instead, to provide a description more suitable for critical inquiry. By contrasting the entry “bomb” as it appears in the OED, Wiktionary, and Urban Dictionary, and by making use of contemporary linguistic theory, the author posits that: word meanings are highly constrained by popular usage; and, users regularly provide semantically and pragmatically significant, and grammatical accurate, definitions; and, in providing users the flexibility to modify entries in real-time, user-generated dictionaries are uniquely practical as catalogues of the current state of language. It is concluded that, whereas traditional dictionaries may be the better resource for diachronic analyses of words, Wiktionary and the like may prove better for synchronic analyses.
    [Show full text]
  • Urban Dictionary : Fularious Street Slang Defined Pdf, Epub, Ebook
    URBAN DICTIONARY : FULARIOUS STREET SLANG DEFINED PDF, EPUB, EBOOK Urbandictionary.com | 352 pages | 02 Oct 2006 | Andrews McMeel Publishing | 9780740751431 | English | Kansas City, United States Urban Dictionary : Fularious Street Slang Defined PDF Book Because of the inconsistent capitalization in UD, we experiment with three approaches to match the headwords between both dictionaries: no preprocessing, lower casing of all characters, and mixed. Friend Reviews. Pages are unmarked. Online: UrbanDictionary. Book is in NEW condition. Its first recorded use dates back to , 1 but was only added to OxfordDictionaries. Average rating 3. Drug Addict says: For what? Published on. Competing interests We declare we have no competing interests. Get A Copy. Notify me of new comments via email. Lily rated it really liked it May 19, Within urbandictionary. Backpedaling This paper presents an explorative study of Urban Dictionary UD , an online crowd- sourced dictionary founded in December Figure 7. Jan 25, Jill rated it really liked it Shelves: humor , non-fiction , reference. Lulu rated it it was amazing Apr 10, Within urbandictionary. Shipped within 24 hours from our UK warehouse. In total, we obtained annotations for headwords and thus definitions. To investigate how offensive content is distributed in UD, we ran a crowdsourcing task on CrowdFlower see Data and methods for more details. Please follow the detailed Help center instructions to transfer the files to supported eReaders. Sort order. Abstract access only. As a response, various online platforms have integrated different mechanisms to detect, report and remove inappropriate content. This book was given to me as a gift back in ish when it was released.
    [Show full text]
  • Online English Dictionaries: Friend Or Foe?
    Online English Dictionaries: Friend or Foe? Gao Yongwei Keywords: online dictionary, e-lexicography, English-Chinese lexicography. Abstract The emergence of online English dictionaries in the past two decades has not only changed the lookup habit of many people and but also influenced the way dictionaries are compiled and presented. The traditional role played by paper dictionaries has been challenged, as witness the sharp decrease of the sales of the so-called “dead-tree” dictionaries and the steady diminishing in their readership. In consequence, many paper dictionaries have been gathering dust on bookshelves in bookstores, libraries or private studies. The ever-increasing popularity of online dictionaries has even made some alarmists suggest the possible demise of paper dictionaries. However, the future of dictionary-making and that of bilingual lexicography in particular is not as dismal as what people usually think. The lexicographical information presented in online dictionaries may prove to be a bonanza for bilingual lexicographers. This paper attempts to research into the major online English dictionaries that are available today, and their advantages and disadvantages will also be discussed. The scene of online English-Chinese dictionaries will also be investigated, and opportunities presented to English-Chinese dictionary-makers in the digital era will be explored. According to most current English reference books, a dictionary is usually defined as “a book that gives a list of words in alphabetical order and explains what they mean”. However, technological advances have already redefined what a dictionary is. Many dictionaries that are being used today are no longer “books” in the traditional sense of the word as they can be found in electronic devices (e.g.
    [Show full text]
  • Googling for Meaning: Statutory Interpretation in the Digital Age Alice A
    THE YALE LAW JOURNAL FORUM F EBRUARY 15, 2016 Googling for Meaning: Statutory Interpretation in the Digital Age Alice A. Wang introduction Dictionary use has become a common practice in modern statutory interpretation at the Supreme Court.1 With the rise of the “new textualism,”2 Justices increasingly rely on dictionaries to shed light on the plain meaning of statutes—that is, the understanding that an ordinary English speaker would draw from the text.3 This trend is not limited to textualists: Justices who favor purposivist analyses of legislative intent cite dictionaries just as often.4 Scholars and practitioners have criticized this development in two main ways.5 The first critique questions the soundness of the interpretive 1. See, e.g., James J. Brudney & Lawrence Baum, Oasis or Mirage: The Supreme Court’s Thirst for Dictionaries in the Rehnquist and Roberts Eras, 55 WM. & MARY L. REV. 483 (2013); John Calhoun, Note, Measuring the Fortress: Explaining Trends in Supreme Court and Circuit Court Dictionary Use, 124 YALE L.J. 484 (2014); Jeffrey L. Kirchmeier & Samuel A. Thumma, Scaling the Lexicon Fortress: The United States Supreme Court’s Use of Dictionaries in the Twenty-First Century, 94 MARQ. L. REV. 77 (2010). 2. The new textualism, most closely associated with Justice Scalia, focuses statutory interpretation on the text of the statute in question and the larger body of surrounding law. Rickie Sonpal, Old Dictionaries and New Textualists, 71 FORDHAM L. REV. 2177, 2192 (2003). 3. Plain meaning can diverge from ordinary meaning when a statute includes a specific definition for a term or when a statute uses a term that has a specialized meaning within a particular field.
    [Show full text]
  • Read Book Collins Cobuild Advanced Learners Dictionary
    COLLINS COBUILD ADVANCED LEARNERS DICTIONARY OF AMERICAN ENGLISH : MOBIPOCKET E-DICTIONARY BOX PDF, EPUB, EBOOK none | none | 05 Jan 2009 | HarperCollins Publishers | 9780007303076 | English | London, United Kingdom Collins Cobuild Advanced Learners Dictionary of American English : Mobipocket e- Dictionary Box PDF Book Diacritization in Arabic is employed in WSD in such systems. As Stein and Sterkenburg explained, such a dictionary does not describe restricted or marked language varieties but rather describes a standard language, and the data it contains is more semantic than non-semantic. However, advanced language learners encounter problems when they attempt to deal with the information located within the dictionary concerning idioms, especially when they try to incorporate it with the original context. A large lexical database of English with , entries and 1. Sam Vaknin. Strong's Concordance - King James Version Similarly, Shizuka also compared between the effect of using ELDs and PRDs on learners' efficient retrieval of word meanings and usage examples. Look up: Urban Dictionary is a slang dictionary with your definitions. Search for "synonymous" in all of MSN Encarta. Release notes: learn about the work of our editors in updating the etymology and variant forms section in OED entries that have yet to be fully revised in an article by OED Deputy Chief Editor, Phillip Durkin. Site History. Sterkenburg further pointed out why the monolingual general- purpose dictionary is considered the prototypical dictionary. The Reviewer David Shaffer holds a Ph. Ronowicz, Hehir, Kaimi, Kojima, and Lee investigated the presence of a relationship that connects dictionary use to lexical proficiency, comprehension of the source language SL text, and translation speed.
    [Show full text]
  • Making Sense of Urban Dictionary, a Crowd-Sourced Online Dictionary
    Emo, Love, and God: Making Sense of Urban Dictionary, a Crowd-Sourced Online Dictionary Dong Nguyen1;2, Barbara McGillivray1;3, Subject Areas: 1;4 human-computer interaction and Taha Yasseri 1 Keywords: The Alan Turing Institute, London, UK. 2 Natural language processing, Institute for Language, Cognition and Computation, Linguistic innovation, Computational School of Informatics, University of Edinburgh, sociolinguistics, Human-computer Edinburgh, UK. interaction 3Theoretical and Applied Linguistics, Faculty of Modern and Medieval Languages, University of Author for correspondence: Cambridge, Cambridge, UK. Dong Nguyen 4Oxford Internet Institute, University of Oxford, Oxford, e-mail: [email protected] UK. The Internet facilitates large-scale collaborative projects and the emergence of Web 2.0 platforms, where producers and consumers of content unify, has drastically changed the information market. On the one hand, the promise of the "wisdom of the crowd" has inspired successful projects such as Wikipedia, which has become the primary source of crowd-based information in many languages. On the other hand, the decentralized and often un- monitored environment of such projects may make them susceptible to low quality content. In this work, we focus on Urban Dictionary, a crowd- sourced online dictionary. We combine computational methods with qualitative annotation and shed light on the overall features of Urban Dictionary in terms of growth, coverage and types of content. We measure a high presence of opinion-focused entries, as opposed to the meaning-focused entries that we expect from traditional dictionaries. Furthermore, arXiv:1712.08647v2 [cs.CL] 5 Apr 2018 Urban Dictionary covers many informal, unfamiliar words as well as proper nouns.
    [Show full text]
  • S: the Latest Slang Suffix, for Reals
    -s: The latest slang suffix, for reals Vanessa McCumber University of Victoria [email protected] This paper explores the recent use of the suffix -s in English slang formation, as found in such forms as “whatevs,” “totes,” and “for reals.” A morphological analysis contrasts -s with similar suffixes and describes related morphological processes, including clipping and reduplication. The history, productivity and usage distribution of the suffix are also examined. Keywords: -s, slang, English, suffixation 1 Introduction The suffix -s serves many purposes in English: it can mark plurality, possession, or grammatical person. Recently it is also becoming popular for creating informal, slang versions of common words and phrases. For example, “see you later” becomes “lates!” and “this is certainly my favourite song” becomes “this is totes my faves song”. Including “for reals” in a sentence expresses the truth or sincerity of the statement. This paper is exploratory and aims to describe the relatively new linguistic phenomenon of what will be referred to as “slang -s”, and will examine the suffix's morphology, history, and productivity, and its connections to other English morphological processes. 2 Data Examples of words and phrases using slang -s are presented in Table 1. These were collected from the author's personal lexicon, self-reported data from colleagues and acquaintances (both male and female native English speakers, ranging in age from 12 to 60), and online sources. As slang, these words do not appear in any standard dictionaries, and, presumably because of their recency, only two were found in published slang dictionaries. Many words with slang -s can be found in Urban Dictionary, an online resource that is wholly user-edited and is thus much more current than most published works, but which lacks 125 professional standards and official citations.
    [Show full text]
  • Issue 2 Urban Dictionary Final Manuscript.Docx
    URBAN DICTIONARY: CROWDSOURCED APPROPRIATION Emma Rashes Urban Dictionary began as a joke among friends, but it crept into broader culture and became a major propagator of popular language. Urban Dictionary’s success and widespread use have many consequences and are illustrative of some of the large problems on the internet today. This paper will discuss the role of Urban Dictionary in the integration and appropriation of words from marginalized dialects, specifically African American Vernacular English (AAVE), into mainstream culture and media. According to Jenna Wortham of The New York Times, Urban Dictionary “has become the anthropologist of the internet.” It documents cultural occurrences and phenomena through the new terms that circulate online. Many of the terms defined on Urban Dictionary originate from AAVE. In most cases, the way in which these words are defined and treated on the website reflects blatant cultural insensitivity and appropriation. The definitions often portray Black people as being less intelligent and Black culture as unrefined. The first sign that Urban Dictionary is not focused on cultural sensitivity or humility is the use of the word “urban” in its name. The word urban has been commonly misused to describe Black people (Nolan). Black people can be urban, that is, live in cities, but not all urban residents are Black. The term is also frequently urban music is also frequently used as a synonym for Black music, like hiphop, even if the music did not originate from an urban area (Nolan). As New York City freelance writer Clio Chang points out, Urban Dictionary serves to disseminate prejudiced views about the Black community, rather than define their language in a productive way (Chang).
    [Show full text]
  • Urban Dictionary Embeddings For
    Edinburgh Research Explorer Urban Dictionary Embeddings for Slang NLP Applications Citation for published version: Wilson, S, Magdy, W, McGillivray, B, Garimella, K & Tyson, G 2020, Urban Dictionary Embeddings for Slang NLP Applications. in Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020). European Language Resources Association (ELRA), pp. 4764–4773, 12th Language Resources and Evaluation Conference, Marseille, France, 11/05/20. <https://www.aclweb.org/anthology/2020.lrec-1.586> Link: Link to publication record in Edinburgh Research Explorer Document Version: Publisher's PDF, also known as Version of record Published In: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020) General rights Copyright for the publications made accessible via the Edinburgh Research Explorer is retained by the author(s) and / or other copyright owners and it is a condition of accessing these publications that users recognise and abide by the legal requirements associated with these rights. Take down policy The University of Edinburgh has made every reasonable effort to ensure that Edinburgh Research Explorer content complies with UK legislation. If you believe that the public display of this file breaches copyright please contact [email protected] providing details, and we will remove access to the work immediately and investigate your claim. Download date: 29. Sep. 2021 Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), pages 4764–4773 Marseille, 11–16 May
    [Show full text]
  • User-Generated Content (UGC) in Online English Dictionaries
    User-generated content (UGC) in online English dictionaries Robert Lew, [email protected] 1. Introduction User-generated content rests on two pillars: the philosophy of the role of the collective of ordinary people, and the technical capability to realize this philosophy in practice. The philosophy behind UGC is actually anything but new. It is based on the belief in the wisdom of crowds (Surowiecki 2005), which in essence puts the collective opinion of a group of people above that of a single expert. A practical and important social application of this principle is trial by jury , which functions in numerous societies and has many advocates. The technological underpinnings of the concept of user-generated content have to be sought in the rise of Web 2.0 . This playful name (which draws on the convention of referring incrementally to an improved version of a software application) stands for the advanced interactive stage in internet experience, and it is the modern affordances that come with it which make user-generated content happen. Users of Web 2.0 are no longer passive recipients of packaged content. Increasingly, they actively contribute to the creation and provision of self-made content. This double capacity of newly empowered users can be aptly captured in the neologism prosumer , which is a blend of pro ducer and con sumer . In this way, web users’ social roles become blurred. In the context of online lexicography, the new model puts dictionary users in the shoes of lexicographers. Another concept relevant to UGC is new media . In opposition to traditional media such as newspapers or motion pictures, new media involve a two-way exchange of information: it is interactive or conversational .
    [Show full text]