The Resources for Processing Bulgarian and Serbian – the brief overview of their Completeness, Compatibility, and Similarities Svetla Koeva, Cvetana Krstev, Ivan Obradovi, Duško Vitas Department of Computational Faculty of Philology Faculty of Geology and Faculty of Mathematics Linguistics – IBL, BAS University of Belgrade Mining, U. of Belgrade University of Belgrade 52 Shipchenski prohod, Bl. 17 Studentski trg 3 ušina 7 Studentski trg 16 Sofia 1113 11000 Belgrade 11000 Belgrade 11000 Belgrade
[email protected] [email protected] [email protected] [email protected] furthermore presupposes the successful Abstract implementation in different application areas as cross- Some important and extensive language resources lingual information and knowledge management, have been developed for Bulgarian and Serbain cross-lingual content management and text data that have similar theoretical background and mining, cross-lingual information retrieval and structure. Some of them were developed as a part information extraction, multilingual summarization, of a concerted action (wordnet), the others were multilingual language generation etc. developed independently. The brief overview of these resources is presented in this paper, with the 2. Electronic dictionaries emphasis on the similarities and differences of the information presented in them. The special 2.1 Bulgarian Grammatical Dictionary attention is given to the similarities of problems The grammatical information included in the encountered in the course of their development. Bulgarian Grammatical Dictionary (BGD) is divided into three types [Koeva, 1998]: category information 1. Introduction that describes lemmas and indicates the words clustering into grammatical classes (Noun, Verb, Bulgarian and Serbian as Slavonic languages show Adjective, Pronoun, Numeral, and Other); similarities in their lexicons and grammatical paradigmatic information that also characterizes structures.