Extending Wordnet Bahasa with External Resources
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Universality, Similarity, and Translation in the Wikipedia Inter-Language Link Network
In Search of the Ur-Wikipedia: Universality, Similarity, and Translation in the Wikipedia Inter-language Link Network Morten Warncke-Wang1, Anuradha Uduwage1, Zhenhua Dong2, John Riedl1 1GroupLens Research Dept. of Computer Science and Engineering 2Dept. of Information Technical Science University of Minnesota Nankai University Minneapolis, Minnesota Tianjin, China {morten,uduwage,riedl}@cs.umn.edu [email protected] ABSTRACT 1. INTRODUCTION Wikipedia has become one of the primary encyclopaedic in- The world: seven seas separating seven continents, seven formation repositories on the World Wide Web. It started billion people in 193 nations. The world's knowledge: 283 in 2001 with a single edition in the English language and has Wikipedias totalling more than 20 million articles. Some since expanded to more than 20 million articles in 283 lan- of the content that is contained within these Wikipedias is guages. Criss-crossing between the Wikipedias is an inter- probably shared between them; for instance it is likely that language link network, connecting the articles of one edition they will all have an article about Wikipedia itself. This of Wikipedia to another. We describe characteristics of ar- leads us to ask whether there exists some ur-Wikipedia, a ticles covered by nearly all Wikipedias and those covered by set of universal knowledge that any human encyclopaedia only a single language edition, we use the network to under- will contain, regardless of language, culture, etc? With such stand how we can judge the similarity between Wikipedias a large number of Wikipedia editions, what can we learn based on concept coverage, and we investigate the flow of about the knowledge in the ur-Wikipedia? translation between a selection of the larger Wikipedias. -
Taxonomic Data Integration from Multilingual Wikipedia Editions
Noname manuscript No. (will be inserted by the editor) Taxonomic Data Integration from Multilingual Wikipedia Editions Gerard de Melo · Gerhard Weikum Received: date / Accepted: date Abstract Information systems are increasingly making use of taxonomic knowl- edge about words and entities. A taxonomic knowledge base may reveal that the Lago di Garda is a lake, and that lakes as well as ponds, reservoirs, and marshes are all bodies of water. As the number of available taxonomic knowledge sources grows, there is a need for techniques to integrate such data into combined, unified taxonomies. In particular, the Wikipedia encyclopedia has been used by a num- ber of projects, but its multilingual nature has largely been neglected. This paper investigates how entities from all editions of Wikipedia as well as WordNet can be integrated into a single coherent taxonomic class hierarchy. We rely on link- ing heuristics to discover potential taxonomic relationships, graph partitioning to form consistent equivalence classes of entities, and a Markov chain-based ranking approach to construct the final taxonomy. This results in MENTA (Multilingual Entity Taxonomy), a resource that describes 5.4 million entities and is one of the largest multilingual lexical knowledge bases currently available. Keywords Taxonomy induction · Multilingual · Graph · Ranking 1 Introduction Motivation. Capturing knowledge in the form of machine-readable semantic knowl- edge bases has been a long-standing goal in computer science, information science, and knowledge management. Such resources have facilitated tasks like query ex- pansion [34], semantic search [48], faceted search [8], question answering [69], se- mantic document clustering [19], clinical decision support systems [14], and many more. -
Wikipedia Depicting Mohammed Different Approaches in Different Environments
Wikipedia depicting Mohammed Different approaches in different environments Man77, Wikimania 2016, Esino Lario أهلً وسهلً Benvenuti I am ● From Vienna, Austria ● Speaker at WikiCon 2015 ● Wikipedian since 2006 ● Mainly active in de.wikipedia.org Topic: Illustration of the ● Student of article on Mohammed in anthropology de.wikipedia.org ● First-time speaker at Debate on other Wikimania Wikipedias Why is this a topic of interest? ● Wikipedistically: – Diversity of visual representations available to Wikipedia – Wikipedias with different communities and different style guidelines – Principles, in particular NPOV Why is this a topic of interest? ● Islamically: – Aniconism in Islam – Depictions of Mohammed In the tiniest of nutshells Editor at Large, Pistachio shells, CC BY-SA 2.5 In the tiniest of nutshells ● It's not a Qur'an thing, but a hadith thing ● Verbal descriptions of Mohammed's appearance ● Islamic schools of thought are devided ● Calligraphy as classic Islamic art ● There are even Muslim paintings of Mohammed ● In recent years, the „Westerners“ experimented a little... What kinds of depictions do we have? ● Figurative depictions – Islamic tradition – „Western“ depictions – chris cosco, Everybody Draw Mohammed Day - Mohammed at night by Ysterius, CC BY-SA 2.0 What kinds of depictions do we have? ● Calligraphic representations What kinds of depictions do we have? ● Associative illustrations So there are options to choose from Thomas Rosenau, Gummy bears, CC BY-SA 2.5 Arabic Wikipedia Mohammed with face Mohammed w/o face Calligraphies -
UNIVERSITY of CALIFORNIA SANTA CRUZ PRINTED TEXTS and DIGITAL DOPPELGANGERS: READING LITERATURE in the 21 CENTURY a Dissertation
UNIVERSITY OF CALIFORNIA SANTA CRUZ PRINTED TEXTS AND DIGITAL DOPPELGANGERS: READING LITERATURE IN THE 21ST CENTURY A dissertation submitted in partial satisfaction of the requirements for the degree of DOCTOR OF PHILOSOPHY in LITERATURE By Jeremy Throne December 2018 The Dissertation of Jeremy Throne is approved: _________________________________ Professor Susan Gillman, chair _________________________________ Professor Kirsten Silva Gruesz _________________________________ Professor Johanna Drucker _________________________________ Lori Kletzer Vice Provost and Dean of Graduate Studies Table of Contents Abstract……………………………………………………………………………….iv Dedications……..……………………………………………………………………..v Introduction……………………………………………………………………………1 Chapter 1: Mark Twain’s Literary Legacies: A Digital Perspective………...………23 Chapter 2: “Mark Twain” in Chronicling America: Contextualizing Mark Twain’s Autobiography……………………………………………………………………….63 Chapter 3: From Mining to Modeling: An Agent-Based Approach to Mark Twain’s Autobiography…………………………………………………...…………………108 Conclusion………………………………………………………………………….146 Appendix for Chapter 1…………………………………………………….………158 Appendix for Chapter 2…………………………………………………….………178 Appendix for Chapter 3………………………………………………………….…204 List of Supplemental Files………………………………………………………….219 Bibliography………………………………………………………………..………225 iii Abstract Printed Texts and Digital Doppelgangers: Reading Literature in the 21st Century Jeremy Throne Much ink has been spilt worrying over the death of the book. It may be, however, that we find ourselves -
Graph-Based Methods for Large-Scale Multilingual Knowledge Integration
demelo_cover:Layout 1 13.01.2012 10:35 Seite 1 Dissertationen aus der Naturwissenschaftlich- Technischen Fakultät I der Universität des Saarlandes Graph-based Methods for Large-Scale n o i t a r Multilingual Knowledge Integration Given that much of our knowledge is expressed in textual form, infor - g e mation systems increasingly depend on knowledge about words and t n the entities they represent. This book investigates novel methods for I e automatically building large repositories of knowledge that capture g d semantic relationships between words, names, and entities, in many e l different languages. Three major new contributions are presented, w o each involving graph algorithms and statistical techniques that com - n bine evidence from multiple sources of information. K Gerard de Melo l a u The lexical integration method involves learning models that disam - g n i biguate word meanings based on contextual information in a graph, l i t thereby providing a means to connect words to the entities that they l denote. The entity integration method combines semantic items from u M different sources into a single unified registry of entities by reconciling e l equivalence and distinctness information and solving a combinatorial a c optimization problem. Finally, the taxonomic integration method adds S - a comprehensive and coherent taxonomic hierarchy on top of this e g registry, capturing how different entities relate to each other. r a L Together, these methods can be used to produce a large-scale multi - r o lingual knowledge base semantically describing over 5 million entities f s and over 16 million natural language words and names in more than d o 200 different languages. -
Geography of Social Ontologies: Testing a Variant of the Sapir-Whorf Hypothesis in the Context of Wikipedia Alexander Mehler, Olga Pustylnikov, Nils Diewald
Geography of Social Ontologies: Testing a Variant of the Sapir-Whorf Hypothesis in the Context of Wikipedia Alexander Mehler, Olga Pustylnikov, Nils Diewald To cite this version: Alexander Mehler, Olga Pustylnikov, Nils Diewald. Geography of Social Ontologies: Testing a Variant of the Sapir-Whorf Hypothesis in the Context of Wikipedia. Computer Speech and Language, Elsevier, 2011, 25 (3), pp.716. 10.1016/j.csl.2010.05.006. hal-00730282 HAL Id: hal-00730282 https://hal.archives-ouvertes.fr/hal-00730282 Submitted on 9 Sep 2012 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Accepted Manuscript Title: Geography of Social Ontologies: Testing a Variant of the Sapir-Whorf Hypothesis in the Context of Wikipedia Authors: Alexander Mehler, Olga Pustylnikov, Nils Diewald PII: S0885-2308(10)00043-4 DOI: doi:10.1016/j.csl.2010.05.006 Reference: YCSLA 459 To appear in: Received date: 31-10-2009 Revised date: 10-3-2010 Accepted date: 7-5-2010 Please cite this article as: Mehler, A., Pustylnikov, O., Diewald, N., Geography of Social Ontologies: Testing a Variant of the Sapir-Whorf Hypothesis in the Context of Wikipedia, Computer Speech & Language (2008), doi:10.1016/j.csl.2010.05.006 This is a PDF file of an unedited manuscript that has been accepted for publication. -
Wikipedia from Wikipedia, the Free Encyclopedia This Article Is About the Internet Encyclopedia
Wikipedia From Wikipedia, the free encyclopedia This article is about the Internet encyclopedia. For other uses, see Wikipedia ( disambiguation). For Wikipedia's non-encyclopedic visitor introduction, see Wikipedia:About. Page semi-protected Wikipedia A white sphere made of large jigsaw pieces, with letters from several alphabets shown on the pieces Wikipedia wordmark The logo of Wikipedia, a globe featuring glyphs from several writing systems, mo st of them meaning the letter W or sound "wi" Screenshot Main page of the English Wikipedia Main page of the English Wikipedia Web address wikipedia.org Slogan The free encyclopedia that anyone can edit Commercial? No Type of site Internet encyclopedia Registration Optional[notes 1] Available in 287 editions[1] Users 73,251 active editors (May 2014),[2] 23,074,027 total accounts. Content license CC Attribution / Share-Alike 3.0 Most text also dual-licensed under GFDL, media licensing varies. Owner Wikimedia Foundation Created by Jimmy Wales, Larry Sanger[3] Launched January 15, 2001; 13 years ago Alexa rank Steady 6 (September 2014)[4] Current status Active Wikipedia (Listeni/?w?k?'pi?di?/ or Listeni/?w?ki'pi?di?/ wik-i-pee-dee-?) is a free-access, free content Internet encyclopedia, supported and hosted by the non -profit Wikimedia Foundation. Anyone who can access the site[5] can edit almost any of its articles. Wikipedia is the sixth-most popular website[4] and constitu tes the Internet's largest and most popular general reference work.[6][7][8] As of February 2014, it had 18 billion page views and nearly 500 million unique vis itors each month.[9] Wikipedia has more than 22 million accounts, out of which t here were over 73,000 active editors globally as of May 2014.[2] Jimmy Wales and Larry Sanger launched Wikipedia on January 15, 2001. -
In Search of the Ur-Wikipedia: Universality, Similarity, and Translation in the Wikipedia Inter-Language Link Network
In Search of the Ur-Wikipedia: Universality, Similarity, and Translation in the Wikipedia Inter-language Link Network Morten Warncke-Wang1, Anuradha Uduwage1, Zhenhua Dong2, John Riedl1 1GroupLens Research Dept. of Computer Science and Engineering 2Dept. of Information Technical Science University of Minnesota Nankai University Minneapolis, Minnesota Tianjin, China {morten,uduwage,riedl}@cs.umn.edu [email protected] ABSTRACT 1. INTRODUCTION Wikipedia has become one of the primary encyclopaedic in- The world: seven seas separating seven continents, seven formation repositories on the World Wide Web. It started billion people in 193 nations. The world's knowledge: 283 in 2001 with a single edition in the English language and has Wikipedias totalling more than 20 million articles. Some since expanded to more than 20 million articles in 283 lan- of the content that is contained within these Wikipedias is guages. Criss-crossing between the Wikipedias is an inter- probably shared between them; for instance it is likely that language link network, connecting the articles of one edition they will all have an article about Wikipedia itself. This of Wikipedia to another. We describe characteristics of ar- leads us to ask whether there exists some ur-Wikipedia, a ticles covered by nearly all Wikipedias and those covered by set of universal knowledge that any human encyclopaedia only a single language edition, we use the network to under- will contain, regardless of language, culture, etc? With such stand how we can judge the similarity between Wikipedias a large number of Wikipedia editions, what can we learn based on concept coverage, and we investigate the flow of about the knowledge in the ur-Wikipedia? translation between a selection of the larger Wikipedias. -
Wikimedia Foundation Board of Trustees Retreat November 2018 Operations Update
Wikimedia Foundation Board of Trustees Retreat November 2018 Operations update 2 Revenue & Fundraising FY18-19 Q1 3 FY18-19 Q1 revenue $11.6m +17% Budget Other $ 550k Q1 Income Total revenue of $13.6M $13.6m includes realized interest Actual and dividend income Notes: ● Major gifts raised $1.5M more funds than budget in Q1, including $1M from Brin Wojcicki and $650K(1) from Siegel Family Endowment. $xx.xm ● Interest and dividend income in our investment portfolio was higher than expected by $350k. (1) This grant is paid over 3 years and we confirmed with KPMG that recording the full grant amount as revenue is appropriate and consistent with the accounting guidance. 4 Q1 revenue by team Endowment 10.0% Major Gifts 14.9% Online Donations 75.1% FY18-19 Q2 projections* PROGRAM TARGET PROBABILITY PROJECTION Country Campaigns United States, Canada, $58.5m 90% $52.65m Ireland, United Kingdom, Australia, New Zealand, France, and Latin America Recurring Donations $2.2m 95% $2.1m Major Gifts Pipeline (that may come by December 30) $6m 50% $3m TOTAL: $57.75m *Projections are for online fundraising only. Total Q2 revenue budget is $64.2M. FY17-18 by geography *https://meta.wikimedia.org/wiki/Fundraising/2017-18_Report FY17-18 by grants World Bank country categories Funds raised % of total Grants % of total High Income Economies $96.8M 97.7% $5.8M 76.3% Upper Middle Income Economies $1.6M 1.6% $718K 9.4% Lower Middle Income Economies $580K .006% $1.05M 13.8% Low Income Economies $48K .0005% $34K .005% Total $99.03M $7.602M *Country\Economy categories: https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups Financial overview FY18-19 Q1 9 YoY increase drivers $2.4M spending increase year over year[1] +$1.4M Staffing FY17-18 Notes: +$0.5M Contract Services YTD [1] Funds available for a specific purpose in +$0.4M Other expenses FY17-18 are excluded in this comparison +$0.2M Grants [2] Data Center expenses have been revised +$0.1M [2] inFY16-17 FY18-19 to conform with GAAP standard.