The case of BabelNet 2.0
Roberto Navigli
http://lcl.uniroma1.it
Athens, Greece, March 21, 2014
BabelNet (http://babelnet.org)
• A wide-coverage multilingual semantic network including both encyclopedic (from Wikipedia) and lexicographic (from WordNet and OmegaWiki) entries
[Figure labels: NEs and specialized concepts; Concepts; Concepts integrated from both resources]
BabelNet & friends, 21/03/2014, Roberto Navigli
BabelNet: a multilingual encyclopedic dictionary!
• Available online: http://babelnet.org
BabelNet 2.0 is online: http://babelnet.org
Anatomy of BabelNet 2.0
• 50 languages covered (including Latin!)
• Full list at http://babelnet.org/stats.jsp
• 9.3M Babel synsets (concepts and named entities)
• 50M word senses
• 262M semantic relations (28 relations per synset on avg.)
• 7.7M synset-associated images
• 18M textual definitions
• Integrates:
– WordNet 3.0
– Wikipedia (October 2012 dumps)
– OmegaWiki: a collaborative multilingual dictionary
– Open Multilingual WordNet [Bond and Foster, 2013]
• Translations for all open-class parts of speech
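As a quick sanity check on the Anatomy figures, the quoted average of 28 relations per synset follows from dividing the number of semantic relations by the number of Babel synsets. The snippet below is only a back-of-the-envelope sketch using the rounded figures from the slide; the variable names are illustrative, and the senses-per-synset ratio is derived here rather than stated in the talk.

```python
# Rounded figures quoted on the slide (not exact counts).
babel_synsets = 9.3e6
word_senses = 50e6
semantic_relations = 262e6

# Average number of relations attached to one Babel synset.
relations_per_synset = semantic_relations / babel_synsets

# Derived ratio (an illustration, not a figure from the slides).
senses_per_synset = word_senses / babel_synsets

print(f"relations per synset: {relations_per_synset:.1f}")
print(f"word senses per synset: {senses_per_synset:.1f}")
```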
WordNet+Open Multilingual WordNet+Wikipedia+…
+OmegaWiki+automatic translations…
+textual definitions
+Wikipedia categories
+images
In the Linguistic Licensed Linked Data cloud…
BabelNet goes to the (Multilingual) Semantic Web
RDF-Lemon encoding of BabelNet
• RDF representation based on:
– lemon, a Lexicon Model for Ontologies
– SKOS, Simple Knowledge Organization System
– LexInfo Ontology 2.0
• 1,026,780,575 triples
• Interlinking: DBpedia, Wikipedia, lemon WordNet and lemon OmegaWiki (English)
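To make the lemon + SKOS combination more concrete, here is a minimal sketch of how one lexical entry could be linked to a concept and serialized as N-Triples. The `lemon` and SKOS namespaces and property names are those of the published vocabularies; the base URI, the identifiers, and the helper names (`encode_entry`, `uri`, `literal`) are hypothetical and do not reflect BabelNet's actual URI scheme or encoding choices.

```python
from typing import List

# Namespaces of the vocabularies named on the slide.
LEMON = "http://lemon-model.net/lemon#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical base URI for illustration only; NOT BabelNet's URI scheme.
BASE = "http://example.org/resource/"


def uri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text: str, lang: str) -> str:
    """Build a language-tagged literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def encode_entry(lemma: str, lang: str, concept_id: str) -> List[str]:
    """Return N-Triples lines linking a lexical entry to a SKOS concept.

    Assumes `lemma` and `concept_id` contain only URI-safe characters.
    """
    entry = uri(f"{BASE}{lemma}_{lang}")
    form = uri(f"{BASE}{lemma}_{lang}/form")
    sense = uri(f"{BASE}{lemma}_{lang}/sense")
    concept = uri(f"{BASE}{concept_id}")
    triples = [
        (entry, uri(RDF + "type"), uri(LEMON + "LexicalEntry")),
        (entry, uri(LEMON + "canonicalForm"), form),
        (form, uri(LEMON + "writtenRep"), literal(lemma, lang)),
        (entry, uri(LEMON + "sense"), sense),
        (sense, uri(LEMON + "reference"), concept),
        (concept, uri(RDF + "type"), uri(SKOS + "Concept")),
    ]
    return [f"{s} {p} {o} ." for s, p, o in triples]


for line in encode_entry("balloon", "en", "concept_0001"):
    print(line)
```

The entry-to-sense-to-reference chain is the core lemon pattern: the lexical side (entry, form) stays separate from the conceptual side (the SKOS concept), which is what lets entries in many languages point at the same concept.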
License issues and current solution
• Create an onion-like distribution with:
1. BabelNet core, including unrestricted data and the alignments between resources
2. CC-BY 3.0 data
3. CC-BY-SA 3.0 data
4. CC-BY-SA-NC 3.0 data (only if BabelNet is used for non-commercial purposes)
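The onion-like distribution can be pictured as an ordered list of layers, where the outermost layer is only available for non-commercial use. The following is a minimal sketch under that reading of the slide; the `Layer` class and the `available_layers` function are hypothetical names for illustration, not part of any BabelNet tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Layer:
    name: str
    non_commercial_only: bool


# Layers in the order given on the slide, innermost first.
LAYERS = [
    Layer("BabelNet core (unrestricted data and alignments)", False),
    Layer("CC-BY 3.0 data", False),
    Layer("CC-BY-SA 3.0 data", False),
    Layer("CC-BY-SA-NC 3.0 data", True),
]


def available_layers(commercial_use: bool) -> List[str]:
    """Return the names of the layers usable for the given kind of use."""
    return [
        layer.name
        for layer in LAYERS
        if not (commercial_use and layer.non_commercial_only)
    ]


print(available_layers(commercial_use=True))
print(available_layers(commercial_use=False))
```

Note that this only models which layers can be included; the attribution and share-alike obligations attached to each license still apply to whatever is redistributed.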
Babelfy: Bringing together Entity Linking and Word Sense Disambiguation (TACL, 2014)
• «Thomas and Mario are strikers playing in Munich»
[Figure labels: entity mentions; words denoting senses; "pallone aerostatico" (Italian: hot-air balloon)]
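To illustrate what a joint annotation of the example sentence might look like, the sketch below marks each fragment either as an entity mention or as a word denoting a sense, with character offsets. The labels are assigned by hand here as one plausible reading of the slide; this is not Babelfy's algorithm, output format, or API, and the names `Fragment`, `LABELS`, and `annotate` are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Fragment(NamedTuple):
    start: int  # character offset where the fragment begins
    end: int    # character offset just past the fragment
    text: str
    kind: str   # "entity" (entity linking) or "word-sense" (WSD)


SENTENCE = "Thomas and Mario are strikers playing in Munich"

# Hand-assigned labels: an assumption for illustration, not system output.
LABELS = {
    "Thomas": "entity",
    "Mario": "entity",
    "Munich": "entity",
    "strikers": "word-sense",
    "playing": "word-sense",
}


def annotate(sentence: str, labels: Dict[str, str]) -> List[Fragment]:
    """Locate each labelled fragment and return them in sentence order."""
    fragments = []
    for text, kind in labels.items():
        start = sentence.find(text)
        if start == -1:
            continue  # fragment does not occur in the sentence
        fragments.append(Fragment(start, start + len(text), text, kind))
    return sorted(fragments)  # tuples sort by start offset first


for fragment in annotate(SENTENCE, LABELS):
    print(fragment)
```

Keeping both kinds of fragment in a single list mirrors the point of the slide: entity mentions and sense-bearing words are treated within one annotation of the same sentence rather than by two separate pipelines.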
Roberto Navigli
Linguistic Computing Laboratory
http://lcl.uniroma1.it