The case of BabelNet 2.0

Roberto Navigli

http://lcl.uniroma1.it

Athens, Greece, March 21, 2013

BabelNet (http://babelnet.org)

• A wide-coverage multilingual semantic network including both encyclopedic (from Wikipedia) and lexicographic (from WordNet and OmegaWiki) entries

[Figure labels: "NEs and specialized concepts"; "Concepts"; "Concepts integrated from both resources"]

BabelNet & friends 21/03/2014 2 Roberto Navigli

BabelNet: a multilingual encyclopedic dictionary!

• Available online: http://babelnet.org

BabelNet 2.0 is online: http://babelnet.org

Anatomy of BabelNet 2.0

• 50 languages covered (including Latin!)
• Full list at http://babelnet.org/stats.jsp
• 9.3M Babel synsets (concepts and named entities)
• 50M senses
• 262M semantic relations (28 relations per synset on avg.)
• 7.7M synset-associated images
• 18M textual definitions
• Integrates:
  – WordNet 3.0
  – Wikipedia (October 2012 dumps)
  – OmegaWiki: a collaborative multilingual dictionary
  – Open Multilingual WordNet [Bond and Foster, 2013]
• Translations for all open-class parts of speech
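The per-synset average quoted above can be checked directly from the slide's own totals. The short sketch below does only that arithmetic; the variable names are illustrative, and the senses-per-synset ratio is derived here rather than stated on the slide.

```python
# Totals quoted on the slide, in millions.
synsets_m = 9.3      # Babel synsets (concepts and named entities)
senses_m = 50.0      # senses
relations_m = 262.0  # semantic relations

# Average number of relations attached to one synset.
relations_per_synset = relations_m / synsets_m

# Derived figure (not on the slide): average senses per synset.
senses_per_synset = senses_m / synsets_m

print(f"relations per synset: {relations_per_synset:.1f}")
print(f"senses per synset:    {senses_per_synset:.1f}")
```

The first ratio rounds to the 28 relations per synset reported on the slide.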

BabelNet & friends 21/03/2014 7 Roberto Navigli WordNet+Open Multilingual WordNet+Wikipedia+…

BabelNet & friends 21/03/2014 8 Roberto Navigli +OmegaWiki+automatic translations…

BabelNet & friends 21/03/2014 9 Roberto Navigli +textual definitions

BabelNet & friends 21/03/2014 10 Roberto Navigli +Wikipedia categories

BabelNet & friends 21/03/2014 11 Roberto Navigli +images

In the Linguistic Licensed cloud…

BabelNet goes to the (Multilingual) Semantic Web 21/03/2014 14 Roberto Navigli

RDF-Lemon encoding of BabelNet

• RDF representation based on:
  – lemon, a Lexicon Model for Ontologies
  – SKOS, Simple Knowledge Organization System
  – LexInfo Ontology 2.0

• 1,026,780,575 triples

• Interlinking: DBpedia, Wikipedia, lemon WordNet and lemon OmegaWiki (English), Wikipedia
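To make the combination of vocabularies concrete, here is a minimal sketch of how one lexical entry might be written as triples using lemon, SKOS and LexInfo terms. It is not the actual BabelNet RDF schema: the base URI `http://example.org/resource/`, the identifier `synset-0001`, the URI layout and the fixed noun part of speech are all assumptions made for illustration. Only the standard library is used, and the output is serialized as N-Triples.

```python
from urllib.parse import quote

# Vocabulary namespaces (published by the respective vocabularies).
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
LEMON = "http://lemon-model.net/lemon#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
LEXINFO = "http://www.lexinfo.net/ontology/2.0/lexinfo#"

# Hypothetical base URI; the real BabelNet URIs are not given in the slides.
EX = "http://example.org/resource/"


def iri(namespace, local_name):
    """Return an N-Triples IRI term."""
    return f"<{namespace}{local_name}>"


def literal(text, lang):
    """Return an N-Triples language-tagged literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def entry_triples(lemma, lang, synset_id):
    """Build triples linking a lexical entry to a concept via a sense.

    Assumption: every entry is treated as a noun to keep the sketch short.
    """
    key = f"{quote(lemma, safe='')}_{lang}"
    entry = iri(EX, key)
    form = iri(EX, key + "/canonicalForm")
    sense = iri(EX, key + "/sense")
    concept = iri(EX, synset_id)
    return [
        (entry, iri(RDF, "type"), iri(LEMON, "LexicalEntry")),
        (entry, iri(LEXINFO, "partOfSpeech"), iri(LEXINFO, "noun")),
        (entry, iri(LEMON, "canonicalForm"), form),
        (form, iri(LEMON, "writtenRep"), literal(lemma, lang)),
        (entry, iri(LEMON, "sense"), sense),
        (sense, iri(LEMON, "reference"), concept),
        (concept, iri(RDF, "type"), iri(SKOS, "Concept")),
    ]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) terms as N-Triples lines."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = entry_triples("striker", "en", "synset-0001")
print(to_ntriples(triples))
```

In this layout the lexical side (entry, form, sense) uses lemon, the conceptual side uses SKOS, and LexInfo supplies the linguistic category, which mirrors the three-way split listed above.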

License issues and current solution

• Create an onion-like distribution with:

1. BabelNet core, including unrestricted data and the alignments between resources
2. CC-BY 3.0 data
3. CC-BY-SA 3.0 data
4. CC-BY-NC-SA 3.0 data (only if BabelNet is used for non-commercial purposes)
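The onion-like distribution can be pictured as an ordered list of layers, each carrying its own license, from which a user takes only the layers compatible with the intended use. The sketch below is one possible reading, not the actual packaging: the `Layer` class, the layer names and the `usable_layers` helper are hypothetical, and the non-commercial layer is written with the standard Creative Commons name CC BY-NC-SA.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Layer:
    name: str             # hypothetical layer identifier
    license: str          # license governing the data in this layer
    commercial_use: bool  # whether the license permits commercial use


# Layers ordered from the core outwards, as on the slide.
LAYERS = [
    Layer("core", "unrestricted", True),
    Layer("cc-by", "CC BY 3.0", True),
    Layer("cc-by-sa", "CC BY-SA 3.0", True),
    Layer("cc-by-nc-sa", "CC BY-NC-SA 3.0", False),
]


def usable_layers(commercial: bool) -> List[Layer]:
    """Return the layers that may be used for the given kind of use."""
    return [layer for layer in LAYERS if layer.commercial_use or not commercial]


print([layer.name for layer in usable_layers(commercial=True)])
print([layer.name for layer in usable_layers(commercial=False)])
```

A commercial user would get the first three layers, while a non-commercial user would get all four.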

Babelfy: Bringing together Entity Linking and Word Sense Disambiguation (TACL, 2014)

• «Thomas and Mario are strikers playing in Munich»

[Figure labels: "Entity mentions"; "denoting senses"; "pallone aerostatico" (Italian: hot-air balloon)]

Roberto Navigli
Linguistic Computing Laboratory
http://lcl.uniroma1.it