<<

Improving Sense Disambiguation with Linguistic Knowledge from a Sense Annotated

Kiril Simov Alexander Popov Petya Osenova Linguistic Modeling Department IICT-BAS kivs alex.popov petya @bultreebank.org { | | }

Abstract (Sensed BulTreeBank) in order to extract semantic relations and add them into the lexical resources. In this paper we present an approach The hypothesis that this enrichment would lead to for the enrichment of WSD knowledge better WSD for Bulgarian was tested in the bases with data-driven relations from a of the Personalized PageRank algorithm. gold standard corpus (annotated with word The structure of the papers is as follows: the senses, valency information, syntactic next section discusses the related work on the analyses, etc.). We focus on Bulgarian as topic. Section 3 presents the Bulgarian sense an- a use case, but our approach is scalable to notated treebank. Section 4 focuses on the Bul- other as well. For the purpose garian Syntactic and Lexical Resources. Section 5 of exploring such methods, the Personal- introduces the WSD experiments and results. Sec- ized Page Rank algorithm was used. The tion 6 concludes the paper. reported results show that the addition of new knowledge improves the accuracy of 2 Related Work WSD with approximately 10.5%. Knowledge-based systems for WSD have proven 1 Introduction to be a good alternative to supervised systems, which require large amounts of manually anno- Solutions to WSD-related tasks usually employ tated training data. In contrast, knowledge-based lexical databases, such as and ontolo- systems require only a knowledge base and no gies. However, lexical databases suffer from additional corpus-dependent information. An es- sparseness in the availability and density of rela- pecially popular knowledge-based disambiguation tions. One approach towards remedying this prob- approach has been the use of popular graph-based lem is the BabelNet (Navigli and Ponzetto, 2012), algorithms known under the name of ”Random which relates several lexical resources — Word- Walk on Graph” (Agirre et al., 2014). Most ap- Net1, DBpedia, , etc. Although such a proaches exploit variants of the PageRank algo- setting takes into consideration the role of lexical rithm (Brin and Page, 2012). Agirre and Soroa and world knowledge, it does not incorporate con- (2009) apply a variant of the algorithm to Word textual knowledge learned from actual texts (such Sense Disambiguation by translating WordNet as collocational patterns, for example). This hap- into a graph in which the synsets are represented pens because the knowledge sources for WSD sys- as vertices and the relations between them are rep- tems usually capture only a fraction of the rela- resented as edges between the nodes. The result- tions between entities in the world. Many im- ing graph is called a knowledge graph in this pa- portant relations are not present in ontological re- per. Calculating the PageRank vector Pr is accom- sources but could be learned from texts. plished through solving the equation: One possible approach to handling this sparse- ness issue is the incorporation of relations from Pr = cMPr + (1 c)v (1) sense annotated corpora into the lexical databases. − We decided to focus on this line of research, where M is an N x N transition probability matrix by using the Bulgarian sense annotated treebank (N being the number of vertices in the graph), c 1In this work we used version 3.0 of Princeton WordNet: is the damping factor and v is an N x 1 vector. In https://wordnet.princeton.edu/. the traditional, static version of PageRank the val-

596 Proceedings of Recent Advances in Natural Processing, pages 596–603, Hissar, Bulgaria, Sep 7–9 2015. ues of v are all equal (1/N), which means that in knowledge graph – whether the knowledge repre- the case of a random jump each vertex is equally sented in terms of nodes and relations (arcs) be- likely to be selected. Modifying the values of tween them is sufficient for the algorithm to pick v effectively changes these probabilities and thus the correct senses of ambiguous . Several makes certain nodes more important. The version extensions of the knowledge graph constructed on of PageRank for which the values in v are not uni- the basis of WordNet have been proposed and im- form is called Personalized PageRank. plemented. An approach similar to the one pre- The words in the text that are to be disam- sented here is described in Agirre and Martinez biguated are inserted as nodes in the knowledge (2002), which explores the extraction of syntacti- graph and are connected to their potential senses cally supported semantic relations from manually via directed edges (by default, a context window of annotated corpora. In that piece of research, Sem- at least 20 words is used for each disambiguation). Cor, a semantically annotated corpus, was pro- These newly introduced nodes serve to inject ini- cessed with the MiniPar dependency parser and tial probability mass (via the v vector) and thus to the subject-verb and object-verb relations were make their associated sense nodes especially rel- consequently extracted. The new relations were evant in the knowledge graph. Applying the Per- represented on several levels: as word-to-class sonalized PageRank algorithm iteratively over the and class-to-class relations. The extracted selec- resulting graph determines the most appropriate tional relations were then added to WordNet and sense for each ambiguous word. The method has used in the WSD task. The chief difference with been boosted by the addition of new relations and the presently described approach is that the set by developing variations and optimizations of the of relations used here is bigger (it includes also algorithm (Agirre and Soroa, 2009). It has also indirect-object-to-verb relations, noun-to-modifier been applied to the task of Disam- relations, etc.). Another difference is that the new biguation (Agirre et al., 2015). relations in the present piece of research are not added as selectional relations, but as semantic re- Montoyo et al. (2005) present a combina- lations between the corresponding synsets. This tion of knowledge-based and supervised systems means that the specific syntactic role of the partic- for WSD, which demonstrates that the two ap- ipant is not taken into account, but only the con- proaches can boost one another, due to the funda- nectedness between the participant and the event mentally different types of knowledge they utilise is registered in the knowledge graph. (paradigmatic vs. syntagmatic). They explore a knowledge-based system that uses heuristics for 3 The Bulgarian Sense Annotated WSD depending on the position of word poten- Treebank tial senses in the WordNet knowledge base. In terms of supervised machine learning based on The sense annotation process over BulTreeBank an annotated corpus, it explores a Maximum En- (BTB) was organized in three layers: verb valency tropy model that takes into account multiple fea- frames (Osenova et al., 2012); senses of verbs, tures from the context of the to-be-disambiguated nouns, adjectives and adverbs; DBpedia URIs over word. This earlier line of research demonstrates named entities. However, in the experiment pre- that combining paradigmatic and syntagmatic in- sented here, we used mainly the annotated senses formation is a fruitful strategy, but it does so by of nouns and verbs (together with the valency in- doing the combination in a postprocessing step, formation), as well as the concept mappings to i.e. by merging the output of two separate sys- WordNet. For that reason we do not discuss the tems; also, it still relies on manually-annotated DBpedia annotation here. A brief outline can be data for the supervised disambiguation. Building found in Popov et al. (2014). on the already mentioned work on graph-based ap- The sense annotation was organized as follows: proaches, it is possible to combine paradigmatic the lemmatized words per part-of-speech (POS) and syntagmatic information in another way – by from BTB received all their possible senses from incorporating both into the knowledge graph. This the explanatory of Bulgarian and from 2 approach is described in the current paper. our Core WordNet . When two competing defi- nitions came from both resources, preference was The success of knowledge-based WSD ap- proaches apparently depends on the quality of the 2Available at http://compling.hss.ntu.edu.sg/omw/

597 given to the one that was mapped to the WordNet. open for developing a language-specific hierarchy. In the ambiguous cases the correct sense was se- The valency lexicon consists of around 18,000 lected according to the context of usage. For the verb frames extracted from the BTB. The partici- purposes of the evaluation, some of the files were pants in these frames have ontological constraints. independently manually checked by two individ- At the moment, the verb senses are mapped to ual annotators. In total, 92,000 running words WordNet, but the constraints over arguments are have been mapped to word senses. Thus, about 43 not synchronized with the WordNet concepts in % of all the treebank tokens have been associated their levels of granularity and specificity. This with senses. syncronization is planned as a next step in our The word forms annotated with senses mapped work, in order to further enrich the knowledge to WordNet synsets are 69,333, consisting of graph. nouns and verbs. From these POS, 12,792 word forms have been used for testing, and the rest 5 Experiments have been used for relation extraction. About 20,000 word forms are now in the process of be- 5.1 Description of the WSD tool ing mapped to WordNet synsets. Most of them are The experiments that serve to illustrate the out- adjectives and adverbs. They will be included in lined approaches were carried out with the UKB3 the next round of experiments, which will result in tool, which provides graph-based methods for an increase the sense density of the graph. Word Sense Disambiguation and measuring lex- ical similarity. The tool uses the Personalized 4 Bulgarian Lexical and Syntactic PageRank algorithm, described in Agirre and Resources: BTB-Wordnet and Valency Soroa (2009). It can be and has been used to Lexicon perform Named Entity Disambiguation as well (Agirre et al., 2015). The tool builds a knowledge The BTB-Wordnet has been compiled in several graph over a set of relations that can be induced steps. Initially, the Core WordNet was created for from different types of resources, such as WordNet Bulgarian, which covered 4,999 synsets. Then, or DBPedia; then it selects a context window of nearly the same number of new synsets were open class words and runs the algorithm over the added to the WordNet (now we have 9,000 synsets graph. There is an additional module called NAF or so). We tried to map the Bulgarian senses to the UKB4 that can be used to run UKB with input in English ones as faithfully as possible, respecting the NAF format5 and to obtain output structured in the Princeton WordNet hierarchy. the same way, only with added word sense infor- Although connectivity was very important for mation. For compatibility reasons, NAF UKB was the experiments, we also mapped specific concepts used to perform the experiments reported here; the to more general ones in both directions (English to input NAF document contains in its ”term” nodes Bulgarian and Bulgarian to English). New defini- lemma and POS information, which is necessary tions for concepts which did not have a counterpart for the running of UKB. We have used the UKB in the Princeton Wordnet have been introduced. In default settings, i.e. a context window of 20 words this way, we established a language specific hier- that are to be disambiguated together, 30 iterations archy for Bulgarian. of the Personalized PageRank algorithm. The ongoing mapping of word senses in the The UKB tool requires two resource files to pro- treebank to the WordNet is thus complicated by cess the input file. One of the resources is a dic- the fact that the available resources are not directly tionary file with all lemmas that can be possibly comparable. These are: the Treebank, where linked to a sense identifier. In our case WordNet- words were annotated with definitions from an ex- derived relations were used for our knowledge planatory dictionary of Bulgarian (dictionary en- base; consequently, the sense identifiers are Word- tries), and the Princeton WordNet, which contains Net IDs. For instance, a line from the dictionary whole groups of ( sets) uni- extracted from WordNet looks like this: fied by common definitions of the concepts. At 3 the same time, such an approach makes it possi- http://ixa2.si.ehu.es/ukb/ 4https://github.com/asoroa/naf_ukb ble to easily structure the resource via the Prince- 5http://www.newsreader-project.eu/ ton Wordnet hierarchy, and it also leaves the door files/2013/01/techreport.pdf

598 predicate 06316813-n:0 06316626-n:0 from the annotations in BTB. These additional re- 01017222-v:0 01017001-v:0 00931232-v:0 sources are: First comes the lemma associated with the rele- Inferred hypernymy relations vant word senses, after the lemma the sense iden- • tifiers are listed. Each ID consists of eight digits Syntactic relations from the golden corpus followed by a and a label referring to the • POS category of the word. Finally, a number fol- Extended syntactic relations • lowing a colon indicates the frequency of the word Domain relations from WordNet sense, calculated on the basis of a tagged corpus. • When a lemma from the dictionary has occurred The phrase ”inferred hypernymy relations” in the analysis of the input text, the tool assigns means the transitive closure of the hypernymy re- all associated word senses to the word form in lation type. That is, if A is a hyponym of B and the context and attempts to disambiguate its mean- B is a hyponym of C, it is inferred that A is a hy- ing among them. The Bulgarian dictionary com- ponym of C. This type of inference has been done prises of all the lemmas of words annotated with for all synset IDs that participate in hypernymy re- WordNet senses in the BTB. It has 8,491 lemmas lations in the WordNet hierarchy and are found in mapped to 6,965 unique word senses. Currently the Bulgarian dictionary. 590,272 new relations we have opted to copy over the frequencies from have been generated in this way. the English corpus, but they are not actually used All relations described up until now are of a lex- in the experiments. ical nature, therefore essentially paradigmatic and The second resource file required for running providing information about an idealized model the tool is the set of relations that is used to con- of the world. The work presented here enriches struct the knowledge graph over which Person- further the knowledge graph by adding syntag- alized PageRank is run. The distribution of the matic information, i.e. contextual knowledge tool provides data (dictionary and relation files) about words and word senses. This has been done for WordNet 1.7 and 3.0. Since the BTB has been by extracting the intersection of the syntactic de- annotated with word senses from WordNet 3.0, the pendency relations from the BTB corpus and the resource files for version 3.0 were used for our ex- WordNet sense annotations in the same resource. periments. The distribution of UKB comes with In this way dependency relations between specific a file containing the standard lexical relations de- words in the text that also have attached Word- fined in WordNet, such as hypernymy, meronymy, Net identifiers have been transformed into graph etc., as well as with a file containing relations de- relations of the kind described above. The tar- rived on the basis of common words found in the geted dependency relations are of the types: nsubj, synset glosses, which have been manually disam- nmod, amod, iobj, dobj; for more information biguated. As the Bulgarian lemmas in the gener- about the set of relations ated dictionary are mapped to the English Word- that we have used, see the documentation of the Net and the specific Bulgarian WordNet hierarchy UD project6, which includes contribution from the is not exploited in this phase, we have used the BulTreeBank group for the . same file with the relations for English. Because These syntactic relations have been extended in the generation of gloss-based relations is a time- a similar way as the hypernymy relations. For ex- consuming task, we have used the relations for ample, in the case of the nsubj relation, the hy- the English glosses, on the assumption that they ponyms of the dependent node have been repli- should capture to a significant degree the related- cated in new relations of the same kind, for all hy- ness between Bulgarian word senses as well. The ponyms of that particular word sense encountered WordNet ontological relations are 252,392 and the in the golden corpus. Thus, the relation relations from the glosses are 419,387. u:00118523-v v:00510189-n is derived from an nsubj relation, where 5.2 Additional Relations in the Knowledge 00118523-v stands for a sense of the Bulgar- Graph ian verb ”prodalzha” (continue) and is the head

In addition to these available relations, we have 6http://universaldependencies.github. utilized further resources from WordNet itself and io/docs/

599 node (the predicate in nsubj), and 00510189-n, merrymaking corresponding to a particular word sense of ”veselba” (revelry), is the dependent node (the subject). The dependent node has a number of revelry jinx/high jinx hyponyms in the WordNet hierarchy, therefore all these (and their hyponyms, too) have been added into a relation with the node 00118523-v. For binge/bout/tear orgy/riot/debauchery instance, 00510723-n (the synset for particular Figure 1: Traversing the hypernymy hierarchy, an word senses of the words ”binge”, ”bout” and example. ”tear”) has been entered analogously in the same slot as 00510189-n. The open class word forms in the BTB are all the relevant relation attested in the golden corpus. tagged with their respective word senses, but a big Returning to the example from above in order to portion of those senses are yet to be mapped to illustrate this strategy, we identify the ”revelry” WordNet identifiers. Thus, only a part of the de- node (”unrestrained merrymaking”) as subject of pendency relations from the corpus have been ex- the ”continue” node, then we go one level up to tracted for the purpose of these experiments (be- its hypernym, which is ”merrymaking” (”a bois- cause both nodes in a relation must have WordNet terous celebration; a merry festivity”), and extend IDs). More specifically, for 15,675 dependency re- the nsubj relation from there downwards the hier- lations, the numbers for the extracted relations are archy. Thus, the hyponym sense ”jinks” (”noisy as follows: 1,844 nsubj, 3,875 nmod, 1,025 amod, and mischievous merrymaking”) is also inserted 716 iobj, and 1,312 dobj relations. The num- in the nsubj relation with the relevant sense of the bers for the extended relations are: 372,247 nsubj, verb ”continue”. This extension leads to an addi- 1,125,823 nmod (note that there are two cases with tional significant increase in the size of the knowl- nmod: once we extend along the chain of descen- edge base. dants of the dependent element, and once along Figure 1 illustrates the described hierarchy as the chain of those of the head), 377,577 amod, a simple tree. The bolded term (”revelry”) is the 114,760 iobj, and 292,202 dobj relations. node we want to use to expand the nsubj rela- Our motivation for using the hyponyms to infer tion. The expanding procedure finds the hyper- new relations is based on the intuition that these nym of that node (”merrymaking”), then takes all syntactic relations connect an entity to an event7 in the nodes below it and inserts them in the same which the entity participates or connects two par- type of relation, in place of ”revelry”. In this way, ticipants of an event. We assume that if a class of multiple relations can be derived from the initial entities contains possible participants in an event, nsubj relation. then the instances of all sub-classes are possible Finally, we have used information about participants in the same kind of event. The origi- WordNet domains, e.g. biology, , nal relations are trusted to be valid, because they time period, etc. An initial experiment was run were annotated manually in the semantically an- whereby all synsets in a given domain were en- notated treebank. Another important assumption tered in a relation with the domain. Unique is that the relations found in the treebank are not WordNet-style IDs were generated for all domains the most general ones, which means that there is and the relevant synsets were connected to those room for generalization over the participants in nodes. This approach yielded poor results, possi- these events. bly due to the fact that in the PageRank algorithm the contribution of a node weakens the more out- Thus, in addition to the extension of the depen- going edges it has, and the artifical domain nodes dency relations outlined above, we did a further have hundreds of outgoing links. Thus, an alterna- enrichment of the knowledge base by taking the tive strategy was adopted of connecting all synsets hypernym of the node of interest in the syntac- within a domain to each other. In order to avoid tic relation and then taking all nodes beneath it generating many millions of new relations, only in the hypernym hierarchy, and inserting them in the synsets in the Bulgarian dictionary were con- 7Here we interpret the concept of ”event” in a wider sense nected in this fashion. This resulted in 132,596 that also includes states. new relations. The hierarchical relations between

600 domains were also added to the graph, e.g. ”gram- golden corpus + extended dependency rela- mar” is a hyponym of ”linguistics”. tions + domain relations of the kind synset- to-domain and domain hierarchy relations 5.3 Experimental Setup and Results WNGISED2: WordNet relations + relations Several different versions of the relations graph • were used in the experiments with the UKB tool. from the glosses + inferred hypernymy rela- Those configurations that use relations indepen- tions + dependency relations from the golden dently of the corpus (i.e. ontological and defini- corpus + extended dependency relations + tional) were tested on the full corpus of 40 files. domain relations of the kind synset-to-synset Most of the texts in the corpus are journalistic ar- and domain hierarchy relations ticles, but there are a number of texts from literary, WNGISEUD2: WordNet relations + rela- academic, legal and other sources. Those config- • tions from the glosses + inferred hypernymy urations that include context-dependent relations relations + dependency relations from the were tested on a test portion of the corpus com- golden corpus + extended dependency rela- prising of 3 large files with journalistic articles. tions starting from one level up + domain re- The syntactic depedency relations and their exten- lations of the kind synset-to-synset and do- sions used in these configurations were extracted main hierarchy relations and constructed from the development portion of the corpus, i.e. the remaining 37 files. Table 1 shows the results obtained after run- This is a short description of the different con- ning the UKB tool on all texts in the corpus and figurations for the graph: only with WordNet-induced relations, while table WN: WordNet relations 2 shows the results on the test set and with all • relations (WordNet-induced and corpus-induced). WNG: WordNet relations + relations from The ”Recall” column presents results according to • the glosses the formula: (CORRECT DECISIONS + INCORRECT DE- WNI: WordNet relations + inferred hyper- • CISIONS) / (ALL DECISIONS + FALSE NEGA- nymy relations TIVES) WNGI: WordNet relations + relations from As evidenced by the ”Recall” column, about 6% • the glosses + inferred hypernymy relations of the word forms with gold senses are not tagged at all by the UKB tool (which results in the false WNGID1: WordNet relations + relations negatives). The reasons for this are not completely • from the glosses + inferred hypernymy rela- clear at this moment; possible culprits could be in- tions + domain relations of the kind synset- consistencies between the lemmatizer and the dic- to-domain and domain hierarchy relations tionary or some option of the tool not to output decisions for words that cannot be disambiguated. WNGID2: WordNet relations + relations • We are currently working to solve this issue; the from the glosses + inferred hypernymy rela- solution would possibly lead to a further increase tions + domain relations of the kind synset- in accuracy (e.g. decision making based on fre- to-synset and domain hierarchy relations quency counts can be used as a fall-back disam- WNGIS: WordNet relations + relations from biguation mechanism). • the glosses + inferred hypernymy relations + dependency relations from the golden corpus Config Accuracy Recall WN 0.516 0.942 WNGISE: WordNet relations + relations • from the glosses + inferred hypernymy rela- WNG 0.542 0.942 tions + dependency relations from the golden WNI 0.537 0.942 corpus + extended dependency relations WNGI 0.549 0.942 WNGID1 0.549 0.942 WNGISED1: WordNet relations + rela- WNGID2 0.551 0.942 • tions from the glosses + inferred hypernymy relations + dependency relations from the Table 1: Results on the full corpus

601 Config Accuracy Recall WSD module for Bulgarian. The incorporation of additional hypernymy and domain relations from WN 0.517 0.940 WordNet, as well as syntactic Universal Depen- WNG 0.538 0.940 dency relations from the BulTreeBank, improves WNI 0.535 0.940 WSD significantly. However, the algorithm per- WNGI 0.537 0.940 formance drops in terms of speed with the addition WNGID1 0.538 0.940 of links to the graph, and optimization is needed in WNGID2 0.550 0.940 order to handle the increased space of relations. WNGIS 0.565 0.941 The experiments also demonstrate that, given WNGISE 0.616 0.941 the availability of appropriate language resources, WNGISED1 0.617 0.941 a graph model for one language (in our case En- WNGISED2 0.624 0.941 glish) can be successfully adapted to another lan- WNGISEUD2 0.656 0.941 guage (in our case Bulgarian). Table 2: Results on the test portion of the corpus Our future work on WSD for Bulgarian will be focused on: adding more syntactic relations to the setting, adding the information from the mapped- Several interesting facts can be observed from to-WordNet adjectives and adverbs, adding more the two tables. With regards to just the context- context related features, trying to link WordNet re- independent configurations, it is evident that the lations with additional resources (e.g. Wikipedia, inferred hypernymy relations help increase accu- FrameNet, etc.), experimenting with the fine op- racy when added on top of the WordNet ontologi- tions of the UKB tool. cal relations alone; however, the relations derived from the glosses are more effective and the two Acknowledgements sets of relations do not seem to complement each other, i.e. the addition to inferred hypernymies to This research has received partial support by the the gloss similarity relations does not improve the EC’s FP7 (FP7/2007-2013) project under grant results. agreement number 610516: “QTLeap: Quality Secondly, the addition of domain relations does Translation by Deep Language Engineering Ap- not contribute significantly when all synsets are proaches” and FP7 grant 316087 AComIn ”Ad- linked to the domain nodes. Linking all synsets vanced Computing for Innovation”, funded by the in a domain with each other, however, causes sig- European Commission in 20122016. nificant improvement, both in the case of context- We are grateful to the three anonymous review- independent configurations, and when combined ers, whose remarks, comments, suggestions and with dependency relations (one such configuration encouragement helped us to improve the initial gives the highest accuracy for all experiments). variant of the paper. All errors remain our own The last and perhaps most important insight responsibility. concerns the impact of syntactic information on WSD. Adding the dependency relations extracted from the golden corpus results in close to 3% im- References provement, while the addition of the downwards Eneko Agirre and David Martinez. 2002. Integrating extended set adds a further improvement of 5%; selectional preferences in . In Proceedings extending the set by starting from one level above of First International WordNet Conference. the original nodes in the dependency relations Eneko Agirre and Aitor Soroa. 2009. Personalizing helps even more. Contextual information accounts PageRank for word sense disambiguation. In Pro- for about 10% higher accuracy in the experiment ceedings of the 12th Conference of the European done with the last configuration. Chapter of the ACL (EACL 2009), pages 33–41, Athens, Greece, March. Association for Computa- tional Linguistics. 6 Conclusion Eneko Agirre, Oier Lopez´ de Lacalle, and Aitor Soroa. The paper demonstrates that the inclusion of addi- 2014. Random walks for knowledge-based word tional linguistic knowledge to a graph-based ex- sense disambiguation. Comput. Linguist., 40(1):57– perimental setting increases the accuracy of the 84, March.

602 Eneko Agirre, Ander Barrena, and Aitor Soroa. 2015. Studying the wikipedia hyperlink graph for re- latedness and disambiguation. arXiv preprint arXiv:1503.01655. Sergey Brin and Lawrence Page. 2012. Reprint of: The anatomy of a large-scale hypertextual web search engine. Computer networks, 56(18):3825– 3833. Andres Montoyo, Armando Suarez,´ German Rigau, and Manuel Palomar. 2005. Combining knowledge- and corpus-based word-sense-disambiguation meth- ods. J. Artif. Intell. Res.(JAIR), 23:299–330. Roberto Navigli and Simone Paolo Ponzetto. 2012. BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual se- mantic network. Artificial Intelligence, Elsevier, 193:217–250. Petya Osenova, Kiril Simov, Laska Laskova, and Stanislava Kancheva. 2012. A treebank-driven creation of an ontovalence verb lexicon for Bul- garian. In Proceedings of the Eight International Conference on Language Resources and Evalua- tion (LREC’12), pages 2636–2640, Istanbul, Turkey. LREC 2012. Alexander Popov, Stanislava Kancheva, Svetlomira Manova, Ivaylo Radev, Kiril Simov, and Petya Osenova. 2014. The sense annotation of bultree- bank. In Proceedings of the Thirteenth Interna- tional Workshop on and Linguistic Theo- ries (TLT13), pages 127–136, Tuebingen, Germany. TLT 2014.

603