Automatic Idiom Identification in Wiktionary


Grace Muzny and Luke Zettlemoyer
Computer Science & Engineering
University of Washington
Seattle, WA 98195

Abstract

Online resources, such as Wiktionary, provide an accurate but incomplete source of idiomatic phrases. In this paper, we study the problem of automatically identifying idiomatic dictionary entries with such resources. We train an idiom classifier on a newly gathered corpus of over 60,000 Wiktionary multi-word definitions, incorporating features that model whether phrase meanings are constructed compositionally. Experiments demonstrate that the learned classifier can provide high quality idiom labels, more than doubling the number of idiomatic entries from 7,764 to 18,155 at precision levels of over 65%. These gains also translate to idiom detection in sentences, by simply using known word sense disambiguation algorithms to match phrases to their definitions. In a set of Wiktionary definition example sentences, the more complete set of idioms boosts detection recall by over 28 percentage points.

1 Introduction

Idiomatic language is common and provides unique challenges for language understanding systems. For example, a diamond in the rough can be the literal unpolished object or a crude but lovable person. Understanding such distinctions is important for many applications, including parsing (Sag et al., 2002) and machine translation (Shutova et al., 2012).

We use Wiktionary as a large, but incomplete, reference for idiomatic entries; individual entries can be marked as idiomatic but, in practice, most are not. Using these incomplete annotations as supervision, we train a binary Perceptron classifier for identifying idiomatic dictionary entries. We introduce new lexical and graph-based features that use WordNet and Wiktionary to compute semantic relatedness. This allows us to learn, for example, that the words in the phrase diamond in the rough are more closely related to the words in its literal definition than the idiomatic one. Experiments demonstrate that the classifier achieves precision of over 65% at recall over 52% and that, when used to fill in missing Wiktionary idiom labels, it more than doubles the number of idioms from 7,764 to 18,155.

These gains also translate to idiom detection in sentences, by simply using the Lesk word sense disambiguation (WSD) algorithm (1986) to match phrases to their definitions. This approach allows for scalable detection with no restrictions on the syntactic structure or context of the target phrase. In a set of Wiktionary definition example sentences, the more complete set of idioms boosts detection recall by over 28 percentage points.

2 Related Work

To the best of our knowledge, this work represents the first attempt to identify dictionary entries as idiomatic and the first to reduce idiom detection to identification via a dictionary.

Previous idiom detection systems fall in one of two paradigms: phrase classification, where a phrase p is always idiomatic or literal, e.g. (Gedigian et al., 2006; Shutova et al., 2010), or token classification, where each occurrence of a phrase p can be idiomatic or literal, e.g. (Katz and Giesbrecht, 2006; Birke and Sarkar, 2006; Li and Sporleder, 2009). Most previous idiom detection systems have focused on specific syntactic constructions. For instance, Shutova et al. (2010) consider subject/verb (campaign surged) and verb/direct-object idioms (stir excitement) while Fazly and Stevenson (2006), Cook et al. (2007), and Diab and Bhutada (2009) detect verb/noun idioms (blow smoke). Fothergill and Baldwin (2012) are syntactically unconstrained, but only study Japanese idioms. Although we focus on identifying idiomatic dictionary entries, one advantage of our approach is that it enables syntactically unconstrained token-level detection for any phrase in the dictionary.

3 Formal Problem Definitions

Identification. For identification, we assume data of the form $\{(\langle p_i, d_i \rangle, y_i) : i = 1 \ldots n\}$ where $p_i$ is the phrase associated with definition $d_i$ and $y_i \in \{\text{literal}, \text{idiomatic}\}$. For example, this would include both the literal pair ⟨"leave for dead", "To abandon a person or other living creature that is injured or otherwise incapacitated, assuming that the death of the one abandoned will soon follow."⟩ and the idiomatic pair ⟨"leave for dead", "To disregard or bypass as unimportant."⟩. Given $\langle p_i, d_i \rangle$, we aim to predict $y_i$.

Detection. To evaluate identification in the context of detection, we assume data $\{(\langle p_i, e_i \rangle, y_i) : i = 1 \ldots n\}$. Here, $p_i$ is the phrase in example sentence $e_i$ whose idiomatic status is labeled $y_i \in \{\text{idiomatic}, \text{literal}\}$. One such idiomatic pair is ⟨"heart to heart", "They sat down and had a long overdue heart to heart about the future of their relationship."⟩. Given $\langle p_i, e_i \rangle$, we again aim to predict $y_i$.

4 Data

We gathered phrases, definitions, and example sentences from the English-language Wiktionary dump from November 13th, 2012.[1]

Identification. Phrase, definition pairs $\langle p, d \rangle$ were gathered with the following restrictions: the title of the Wiktionary entry must be English, p must be composed of two or more words w, and $\langle p, d \rangle$ must be in its base form, that is, the sense must not be defined as a different tense of a phrase. For example, the pair ⟨"weapons of mass destruction", "Plural form of weapon of mass destruction"⟩ was removed while the pair ⟨"weapon of mass destruction", "A chemical, biological, radiological, nuclear or other weapon that ..."⟩ was kept.

Each pair $\langle p, d \rangle$ was assigned label y according to the idiom labels in Wiktionary, producing the Train, Unannotated Dev, and Unannotated Test data sets. In practice, this produces a noisy assignment because a majority of the idiomatic senses are not marked. The development and test sets were annotated to correct these potential omissions. Annotators used the definition of an idiom as a "phrase with a non-compositional meaning" to produce the Annotated Dev and Annotated Test data sets. Figure 1 presents the data statistics.

Data Set            Literal   Idiomatic   Total
All                 56,037    7,764       63,801
Train               47,633    6,600       54,233
Unannotated Dev      2,801      388        3,189
Annotated Dev        2,212      958        3,170
Unannotated Test     5,603      776        6,379
Annotated Test       4,510    1,834        6,344

Figure 1: Number of dictionary entries with each class for the Wiktionary identification data.

We measured inter-annotator agreement on 1,000 examples. Two annotators marked each dictionary entry as literal, idiomatic, or indeterminable. Less than one half of one percent could not be determined;[2] the computed kappa was 81.85. Given this high level of agreement, the rest of the data were only labeled by a single annotator, following the methodology used with the VNC-Tokens Dataset (Cook et al., 2008).

Detection. For detection, we gathered the example sentences provided, when available, for each definition used in our annotated identification data sets. These sentences provide a clean source of development and test data containing idiomatic and literal phrase usages. In all, there were over 1,300 unique phrases, half of which had more than one possible dictionary definition in Wiktionary. Figure 2 provides the complete statistics.

Data Set   Literal   Idiomatic   Total
Dev        171       330           501
Test       360       695         1,055

Figure 2: Number of sentences of each class for the Wiktionary detection data.

[1] We used the Java Wiktionary Library (Zesch et al., 2008).
[2] The indeterminable pairs were omitted from the data.

5 Identification Model

For identification, we use a linear model that predicts class $y^* \in \{\text{literal}, \text{idiomatic}\}$ for an input pair $\langle p, d \rangle$ with phrase p and definition d. We assign the class:

$$y^* = \arg\max_{y} \; \theta \cdot \phi(p, d, y)$$

given features $\phi(p, d, y) \in \mathbb{R}^n$ with associated parameters $\theta \in \mathbb{R}^n$.

Learning. In this work, we use the averaged Perceptron algorithm (Freund and Schapire, 1999) to perform learning, which was optimized in terms of iterations T, bounded by range [1, 100], by maximizing F-measure on the development set.

The models described correspond to the features they use. All models are trained on the same, unannotated training data.

Features. The features that were developed fall into two categories: lexical and graph-based features. The lexical features were motivated by the intuition that literal phrases are more likely to have closely related words in d to those in p because literal phrases do not break the principle of compositionality. All words compared are stemmed versions. Let count(w, t) = number of times word w appears in text t.

Graph-based features use the graph structure of WordNet 3.0 to calculate path distances. Let distance(w, v, rel, n) be the minimum distance via links of type rel in WordNet from a word w to a word v, up to a threshold max integer value n, and 0 otherwise. The features compute:

• closest synonym: $\min_{w \in p,\, v \in d} \text{distance}(w, v, \text{synonym}, 5)$
• closest antonym: $\min_{w \in p,\, v \in d} \text{distance}(w, v, \text{antonym}, 5)$
• average synonym distance: $\frac{1}{|p|} \sum_{w \in p,\, v \in d} \text{distance}(w, v, \text{synonym}, 5)$
• average hyponym: $\frac{1}{|p|} \sum_{w \in p,\, v \in d} \text{distance}(w, v, \text{hyponym}, 5)$
• synsets connected by an antonym: This feature indicates whether the following is true. The set of synsets $Syn_p$, all synsets from all words in p, and the set of synsets $Syn_d$, all synsets from all words in d, are connected by a shared antonym. This feature follows an approach described by Budanitsky et al. (2006).

6 Experiments

We report identification and detection results, varying the data labeling and choice of feature sets.